Journal of Automated Reasoning 21: 327–355, 1998. © 1998 Kluwer Academic Publishers. Printed in the Netherlands.
327
Integrating Computer Algebra into Proof Planning MANFRED KERBER School of Computer Science, The University of Birmingham, Birmingham B15 2TT, England, e-mail:
[email protected] URL: http://www.cs.bham.ac.uk/˜mmk
MICHAEL KOHLHASE and VOLKER SORGE Fachbereich Informatik, Universität des Saarlandes, D-66141 Saarbrücken, Germany, e-mail: {kohlhase|sorge}@ags.uni-sb.de URL: http://jswww.ags.uni-sb.de/{˜kohlhase|˜sorge} (Accepted: September 1997) Abstract. Mechanized reasoning systems and computer algebra systems have different objectives. Their integration is highly desirable, since formal proofs often involve both of the two different tasks proving and calculating. Even more important, proof and computation are often interwoven and not easily separable. In this article we advocate an integration of computer algebra into mechanized reasoning systems at the proof plan level. This approach allows us to view the computer algebra algorithms as methods, that is, declarative representations of the problem-solving knowledge specific to a certain mathematical domain. Automation can be achieved in many cases by searching for a hierarchic proof plan at the method level by using suitable domain-specific control knowledge about the mathematical algorithms. In other words, the uniform framework of proof planning allows us to solve a large class of problems that are not automatically solvable by separate systems. Our approach also gives an answer to the correctness problems inherent in such an integration. We advocate an approach where the computer algebra system produces high-level protocol information that can be processed by an interface to derive proof plans. Such a proof plan in turn can be expanded to proofs at different levels of abstraction, so the approach is well suited for producing a high-level verbalized explication as well as for a low-level, machine-checkable, calculus-level proof. We present an implementation of our ideas and exemplify them using an automatically solved example. Changes in the criterion of ‘rigor of the proof’ engender major revolutions in mathematics. H. Poincaré, 1905 Key words: mechanized reasoning, computer algebra, hierarchical proof planning, proof checking.
1. Introduction The computer and the development of high-level programming languages made possible the mechanization of logic as well as the realization of mechanical symbolic calculations we have witnessed in the past forty years. This has led to two rather disjoint academic fields, mechanized reasoning and computer algebra, each
VTEX(P) PIPS No. 181208 (jarskap:mathfam) v.1.15 JADKS02.tex; 23/09/1998; 9:44; p.1
328
M. KERBER, M. KOHLHASE AND V. SORGE
of which has its own methods, interests, and traditions, even though they share common roots: neither of the two fields is imaginable without the underlying foundation of mathematical logic or the mathematical study of symbolic calculations (leading to such algorithms and methods as the determination of the GCD or the Gaußian elimination). Only in the last decade we have seen a move toward an integration of the fields, driven by the insight that real-world formal problems often involve a mixture of both computation and reasoning and, hence, that an integration of mechanized reasoning systems and computer algebra systems is highly desirable (see [8]). This is the case in particular, since deduction systems are very weak, when it comes to computation with mathematical objects, and computer algebra systems manipulate highly optimized representations of these objects, but do not yield any formally checkable proof information (if they give any explanation at all). In the remainder of this introduction we briefly summarize key points of mechanized reasoning systems as well as of computer algebra systems and then give a short preview on the integration approach advocated in this paper. By its nature, such a short description has to abstract from many details and to simplify considerably.
1.1.
MECHANIZED REASONING SYSTEMS
Mechanized reasoning systems (for short, MRS in the following) are built with various purposes in mind. One goal is the construction of an autonomous theorem prover, whose strength achieves or even surpasses the ability of human mathematicians. Another is to build a system where the user derives the proof, with the system guaranteeing its correctness. A third purpose consists in modeling human problem-solving behavior on a machine; that is, cognitive aspects are the focus. Advanced theorem-proving systems often try to combine the different goals, since they can complement each other in an ideal way. Let us roughly divide existing theorem-proving systems into three categories: machine-oriented theorem provers, proof checkers, and human-oriented (plan-based) theorem provers. Normally all these systems do not exist in a pure form anymore, and in some systems like our own MEGA system [5] it is explicitly tried to combine the reasoning power of automated theorem provers as logic engines, the specialized problem-solving knowledge of the proof planning mechanism, and the interactive support of tactic-based proof development environments. We think that the combination of these complementary approaches inherits more advantages than drawbacks, because for most tasks domain-specific as well as domain-independent problem-solving know-how is required and for difficult tasks, more often than not, explicit user-interaction should be provided. While such an approach seems to be general enough to cope with any kinds of logic-level proofs, it neglects the fact that for many mathematical fields, the everyday work of mathematicians only partially consists in proving or verifying theorems. Calculation plays an equally important
JADKS02.tex; 23/09/1998; 9:44; p.2
INTEGRATING COMPUTER ALGEBRA INTO PROOF PLANNING
329
rôle. In some cases the tasks of proving theorems and calculating simplifications of certain terms can be separated from each other, but very often the tasks are interwoven and inseparable. In such cases an interactive theorem-proving environment will provide only poor support to a user. Although theoretically any computation can be reduced to theorem proving, this is not practical for nontrivial cases, since the search spaces are intractable. For many of these tasks, however, no search is necessary at all, since there are numerical or algebraic algorithms that can be used. If we think of Kowalski’s equation “Algorithm = Logic + Control” [31], generalpurpose procedures do not (and cannot) provide the control for doing a concrete computation.
1.2.
COMPUTER ALGEBRA SYSTEMS
Early computer algebra systems (CAS for short) developed from collections of algorithms and data structures for the manipulation of algebraic expressions like the multiplication of polynomials, or the derivation and integration of functions [22]. Abstractly, the main objective of a CAS can be viewed in the simplification of an algebraic expression or the determination of a normal form. Today there is a broad range of such systems, from very generally applicable systems to a multitude of systems designed for specific applications. Unlike MRS, CAS are used by many mathematicians as a tool in their everyday work; they are even widely applied in sciences, engineering, and economics. Their high academic and practical standard reflects the fact that the study of symbolic calculation has long been an established and fruitful subfield of mathematics that has developed the mathematical theory and tools. Most modern systems [34, 12, 28] have in common that the algebraic algorithms are integrated in a very comfortable graphical user interface that includes formula editing, visualization of mathematical objects, and even an interface to programming languages. As in the case of MRS the representation languages of CAS differ from system to system, thereby complicating the integration of such systems as well as the cooperation between them. This deficiency has been attacked in the OpenMath initiative [1], which strives for a standard CAS communication protocol. Currently the main emphasis is on standardizing the syntax and the computational behavior of the mathematical objects, while their properties or semantics are not considered. That means there is no explicit representation format for theorems, lemmata, and proofs. Some specific systems allow one to specify mathematical domains and theories. For instance, in systems like M U PAD [18] or A XIOM [28], computational behavior can be specified by attaching types and axiomatizations to mathematical objects; but this also falls short of a comprehensive representation of all relevant mathematics. Furthermore, almost all CAS fail to give an explanation or proof of their solution to the problem at hand, even though some mathematical theories like that of Gröbner bases can be successfully applied to theorem proving in elementary geometry [13, 29, 14, 35].
JADKS02.tex; 23/09/1998; 9:44; p.3
330 1.3.
M. KERBER, M. KOHLHASE AND V. SORGE
CONTRIBUTIONS OF THIS PAPER
Not only can a mutual simulation of the tasks of an MRS and a CAS be quite inefficient, but more important, the daily work of mathematicians is about proving and calculating. This points to the integration of such systems, since mathematicians want to have support in both of their main activities. Indeed, two independent systems can hardly cover their needs, since in many cases the tasks of proving and calculating are hardly separable. As pointed out by Buchberger [7] the integration problem is still unsolved, but it can be expected that a successful combination of these systems will lead to “a drastic improvement of the intelligence level” of such support systems. Our paper addresses two immediate questions occurring in the integration of automated reasoning and computation systems. – How can the algorithms be integrated so that the underlying mathematical knowledge is mutually respected and a synergy effect is achieved? – How can the correctness problem inherent in any such combination be addressed? In particular, how can results from the CAS be integrated into a proof without having to completely trust the CAS? We advocate an integration of computer algebra into mechanized reasoning systems using the proof planning paradigm. This approach allows one to encapsulate the computer algebra algorithms into methods, that is, declarative representations of the problem-solving knowledge specific to a certain mathematical domain. The proof planning paradigm enables a user to guide a proof or to fully hand over the control to a planner, which in turn can use computer algebra systems, if the specifications for the corresponding algorithms are met. The use of hierarchic proof plans at the method-level gives a suitable granularity of integration, since it allows one to directly use existing (human) control knowledge about the interplay of computation and reasoning. A proper integration into the proof planning approach answers the question about the correctness automatically, since the corresponding questions are solved for proof planning. In this area a proof plan can either be rejected (since the tactics are not executable, the plan cannot be used to build a proof) or be executed. The latter results either in a further planning phase to fill in possible gaps or in an accepted machine-checkable proof. Hence a proper integration requires that the computer algebra system produces high-level protocol information that can be processed by an interface to derive proof plans that themselves can be seamlessly integrated into the overall proof plan generated in the problem-solving attempt. Since this can be expanded into an explicit, checkable proof in order to obtain a correctness guarantee for the combined solution, we have also given a principled answer to the correctness problem. The feasibility of the approach advocated in the sequel has been verified by integrating a simple CAS into the MEGA proof planning system. Therefore, we
JADKS02.tex; 23/09/1998; 9:44; p.4
INTEGRATING COMPUTER ALGEBRA INTO PROOF PLANNING
331
organize the paper around this experiment and describe the relevant features with a system perspective. Our approach requires a mode of the CAS that generates information from which it is possible to generate a proof plan. For that reason the integration of a standard CAS makes major adaptations unavoidable (in particular it is necessary to change the source code of these systems). Our approach is not committed to the particular systems involved; in particular, the work reported here should be understood as a proof of principle rather than as the development of a state-of-the-art integrated system. Moreover, we will make the details of the approach more concrete by explaining them by means of an example that cannot easily be solved by either a mechanized reasoning system or a computer algebra system alone, but that needs the combined efforts of systems of each kind.
2. Related Work We give a short description of some of the experiments to combine MRS and CAS and roughly categorize them into three classes with respect to the treatment of proofs that is adopted, that is, with respect to the correctness issue. In doing so we describe in detail only the approaches of integrating CAS into MRS; that is, essentially the MRS is the master and the CAS the slave, since our approach is also of this kind. With the same right, one can of course follow the converse direction, namely, to approach the integration from the point of the CAS; and indeed such approaches are also successfully undertaken (see, e.g., [15, 9]). The question about the granularity of integration is treated uniformly by all these experiments. The application of the CAS is treated as another (derived) rule of inference at the level of the (tactic) calculus, so the granularity of integration depends on the granularity of the calculus or the tactics involved. In the first category of attempts (see, e.g., [21, 3]) one essentially trusts that the CAS properly work; hence, their results are directly incorporated into the proof. All these experiments are at least partly motivated by achieving a broader applicability range of formal methods; and this objective is definitively achieved, since the range of mathematical theorems that can be formally proved by the system combinations is much greater than that provable by MRS alone. However, CAS are very complex programs and therefore trustworthy only to a limited extent, so that the correctness of proofs in such a hybrid system can be questioned. This is not only a minor technical problem, but will remain unsolved for the foreseeable future, since the complexity (not only the code complexity, but also the mathematical complexity) of a CAS does not permit verification of the program itself with currently available program verification methods. Conceptually, the main contribution of such an integration is the solution of the software-engineering problem of how to pass the control between the programs and translate results forth and back. While this is an important subproblem, it does not seem to cover the full complexity of the interaction of reasoning and computation found in mathematical theorem proving.
JADKS02.tex; 23/09/1998; 9:44; p.5
332
M. KERBER, M. KOHLHASE AND V. SORGE
In an alternative approach that formally respects correctness, but essentially trusts CAS, an additional assumption standing for the CAS is introduced, so that essentially formulae are derived that are proved modulo the correctness of the computer algebra system at hand (see, e.g., [21]). The second category (for which [20] is paradigmatic) is more conscious about the rôle of proofs and uses the CAS only as an oracle, receiving a result, whose correctness can then be checked deductively. While this certainly solves the correctness problem, this approach has only limited coverage, since even checking the correctness of a calculation may be beyond the scope of most MRS, when they don’t have additional information. Indeed, from the point of applicability, the results of the CAS help only in cases where the verification of a result is simpler than its discovery, such as prime factorizations, solving equations, or symbolic integration. For other calculations, such as symbolic addition or multiplication of polynomials and differentiation, the verification is just as complex as the calculation itself, so that employing the CAS does not speed the proof construction. Typically, in longer calculations, both types of subcalculations are contained. A third approach of integrating computer algebra systems into a particular kind of mechanized reasoning system consists in the meta-theoretic extension of the reasoning system as proposed, for instance, in [6, 24] and been realized in N UPRL [17]. In this approach a constructive mechanized reasoning system is basically used as its own meta-system. The constructive features are exploited to synthesize a correct computer algebra system; and because of bridge rules between ground and meta-system, it is possible to integrate the so-built CAS that it can be directly used as a component. The theoretical properties of the meta-theoretic extension guarantee that if the original system was correct, then the extended system is correct too. This method is the most appealing one from the viewpoint of correctness, although the assumption that the original (also rather complex) system must be correct can hardly be expected to be self-evident for any nontrivial system. A disadvantage compared with the other two approaches is that it is not possible to employ an existing CAS, but that it is necessary to (re)implement one in the strictly formal system given by the basic MRS. Of course, this is subject to the limitations posed by the (mathematical and software engineering) complexities mentioned above. The main problem of integrating CAS into MRS without violating correctness requirements is that CAS are generally highly optimized toward maximal speed of computation but not toward generating explanations of the computations involved. In most cases, this is dealt with by meta-theoretic considerations about why the algorithms are adequate. This lack of explanation not only makes it impossible for the average user to understand or convince himself of the correctness of the computation, but leaves any MRS essentially without any information why two terms should be equal. This is problematic, since computational errors have been reported even for well-tested and well-established CAS. From the reported categories of approaches, only the last one seriously addresses this problem.
JADKS02.tex; 23/09/1998; 9:44; p.6
INTEGRATING COMPUTER ALGEBRA INTO PROOF PLANNING
333
3. MEGA as an Open System for Integrating Computation MEGA is a proof development system, based on the proof planning paradigm. In this section we describe its architecture and components and show how this supports the integration of computer algebra systems. Since the goal of this paper is not to present a system description of MEGA, but to document the integration of computer algebra into it, we try to be as concise as possible and introduce the relevant parts only: the general architecture, the proof planner, and the integration possibilities for external reasoners. 3.1.
THE PROOF DEVELOPMENT ENVIRONMENT
MEGA
The entire process of theorem proving in MEGA can be viewed as an interleaving process of proof planning, execution, and verification centered on a hierarchical proof plan data structure. Several integrated tools support the user in interacting with the system. Some of them are also available to the proof planner. Theory Database Since methods and control knowledge used in proof planning are mostly domainspecific, MEGA organizes the mathematical knowledge in a hierarchy of theories. Theories represent signature extensions, axioms, definitions, and methods that make up typical established mathematical domains. Each theorem has its home theory and therefore has access to the theory’s signature extensions, axioms, definitions, and lemmata without explicitly introducing them. A simple inheritance mechanism allows one to incrementally build larger theories from smaller parts. We give an overview of the part of MEGA’ S theory database that is necessary for solving our extended example in Figure 1. Proof Explanation Proof presentation is one important feature of a mathematical assistant that has been neglected by traditional deduction systems. MEGA employs an extension of the P ROVERB system [27] developed by our group that allows for presenting proofs and proof plans in natural language. In order to produce coherent texts that resemble those found in mathematical textbooks, P ROVERB employs state-of-theart techniques of natural language processing. Because of the possibly hierarchical nature of MEGA proofs, these can be verbalized at more than one level of abstraction, which can be selected by the user. To summarize our view of proofs, for every theorem an explicit proof has to be constructed so that on the one hand it can be checked by a proof checker and on the other hand the system provides support to represent this proof in a high-level form that is easily readable by humans [27]. Neither the process of generating proofs nor that of checking them is fully replaced by the machine but only supported. If a
JADKS02.tex; 23/09/1998; 9:44; p.7
334
M. KERBER, M. KOHLHASE AND V. SORGE
human mathematician wants to see a proof, he/she can do so at an appropriate level of abstraction. 3.2.
PROOF PLANNING
The central data structure for the overall process is the Proof plan Data Structure (P DS). This is a hierarchical data structure that represents a (partial) proof at different levels of abstraction (called proof plans). It is represented as a directed acyclic graph, where the nodes are justified by (LCF-style) tactic applications. Conceptually, each such justification represents a proof plan (the expansion of the justification) at a lower level of abstraction that is computed when the tactic is executed.? In MEGA, we explicitly keep the original proof plan in an expansion hierarchy. Thus, the P DS makes the hierarchical structure of proof plans explicit and retains it for further applications such as proof explanation or analogical transfer of plans. Once a proof plan is completed, its justifications can successively be expanded to verify the well-formedness of the corresponding P DS. This verification phase is necessary, since the correctness of the different components (in particular, that of external ones like automated theorem provers or computer algebra systems) cannot be guaranteed. When the expansion process is carried out down to the underlying ND-calculus (natural deduction), the soundness of the overall system relies solely on the correctness of the verifier and of ND. This also provides a basis for the controlled integration of external reasoning components if each reasoner’s results can (on demand) be transformed into a sub-P DS. The level to which the proofs have to be expanded depends on the sophistication of the proof checker. As pointed out by Barendregt [4], a more complex proof-checker that accepts proofs in a more expressive formalism may drastically reduce the length of the communicated proofs. If the high-level justifications are not expanded but accepted as they are, our approach reduces to one in which the computer algebra system is fully trusted. In short, the hierarchical nature of the P DS supports the full spectrum of user preferences, from total trust in the CAS, over partial trust in certain levels to full expansion of the proofs in a detailed calculus level description that is machine checkable. A P DS can be constructed by automated or mixed-initiative planning, or pure user interaction that can make use of the integrated tools. In particular, new pieces of P DS can be added by directly calling tactics, by inserting facts from a database, or by calling some external reasoner (cf. Section 3.3) such as an automated theorem prover or a computer algebra system. Automated proof planning is adequate only for problem classes for which methods and control knowledge have already been established. ? This proof plan can be recursively expanded, until we have reached a proof plan that is in fact
a fully explicit proof, since all nodes are justified by the inference rules of a higher-order variant of Gentzen’s calculus of natural deduction (ND).
JADKS02.tex; 23/09/1998; 9:44; p.8
INTEGRATING COMPUTER ALGEBRA INTO PROOF PLANNING
335
The goal of proof planning is to fill gaps in a given P DS by forward and backward reasoning [26] (proof plans were first introduced by Bundy; see [10, 11]). Thus, from an abstract point of view the planning process is the process of exploring the search space of planning states that is generated by the plan operators in order to find a complete plan from a given initial state to a terminal state. MEGA’ S proof planner is an extension of the well-known STRIPS algorithm that can be evoked to construct a proof plan for a node g (the goal node) from a set I of supporting nodes (the initial state) by using a set Ops of proof planning operators, here called methods. A method is a (partial) specification of a tactic in a meta-level language. In MEGA, planning is combined with hierarchical expansion of methods and precondition abstraction. The plans found by this procedure are directly incorporated into the P DS as a separate level of abstraction. In this model, the actual reasoning competence of the planner and the user builds upon the availability of appropriate methods together with meta-level control knowledge that guides the planning. At the moment, MEGA provides user-defined method ratings as a means of control and can use analogy as a control strategy of the planner. Two examples of methods are displayed in Section 3.4 on the extended example.
3.3.
INTEGRATION OF COMPUTER ALGEBRA SYSTEMS AS EXTERNAL REASONERS
According to the different modes of MEGA there are different levels on which an external reasoning system, RS YS, can be integrated: – Interactive calls: RS YS is represented as a command call-RSys that invokes the reasoner on a particular subproblem and returns the result. – Proof planning: RS YS is represented as a method whose specification contains knowledge about the problem-solving behavior and option settings for RS YS. – Justifications: RS YS can serve as a justification of a declaratively given subgoal that is left to be proved by RS YS. In any case, the proof found by RS YS must eventually be transformed into a P DS, since this is the proof-theoretic basis of MEGA. For automated theorem provers like OTTER [32], we described the integration in [25] and the necessary proof transformation to P DS in [27], so we will not pursue this matter here. The integration of CAS follows the same paradigm and is the main topic of this paper, so we will develop the paradigm for the case of external computations in MEGA. We will see examples for the three different levels of integrations of a CAS into MEGA in the example in the next section, so we will not go into that here. This leaves us with the question of the transformation of the CAS results into P DS.
JADKS02.tex; 23/09/1998; 9:44; p.9
336
M. KERBER, M. KOHLHASE AND V. SORGE
If we take the idea of generating explicit P DS seriously also for computations, we can neither just take existing systems nor follow the approach of meta-theoretic extensions, since MEGA is a classical proof system and does not use constructive logic. On the other hand, we cannot forgo using them even in cases where the verification of a calculation is much easier than the calculation itself (e.g., integration of functions); the computation needed for verifying alone is in many cases still much too complicated to be automatically checked without any guidance. For instance, even the proof for the binomial formula (x +y)2 = x 2 +2xy +y 2 (a trivial problem for any computer algebra system) needs more than 70 single steps in the natural deduction calculus.? Thus, using theorem provers or rewriting systems to find such proofs can produce unnecessarily large search spaces and absorb valuable resources. On the other hand, such proofs show a remarkable resemblance to algebraic calculations themselves and suggest the use of the CAS not only to instantly compute the result of the given problem, but also to guide a proof in the way of exploiting the implicit knowledge of the algorithms. We propose to do this extraction of information not by trying to reconstruct the computation in the MRS after the result is generated – as we have seen, even in case of a trivial example for a CAS, this may turn out to be a very hard task for an MRS – but rather by extending the CAS algorithm itself so that it produces some logically usable output alongside the actual computation. Surely in most cases a user would not like to see proofs at a level where the binomial formula is explained (although a novice might want to). This means that a hierarchical approach to proof generation is appropriate, in which the abstraction level of the proof presentation can be chosen by the user. Our approach is to use the mathematical knowledge implicit in the CAS to extract proof plans that correspond to the mathematical computation in the CAS. So, essentially the output of a CAS should be transferable into a sequence of tactics, which presents a high-level description for the proof of correctness of the computation the CAS has performed. Note that this does not prove general correctness of the algorithms involved; instead it gives a proof only for a particular instance of computation. The high-level description can then be used to produce a readable explanation or further expanded to a level that can be automatically checked by proof checkers. The level of abstraction on which the checking can take place depends on the level of sophistication of the proof checker. For a naive proof checker, the proof must be expanded to an explicit calculus level. The decision to extract proof plans rather than concrete proofs from the CAS is essential to the goal of being verbose without transmitting too much detail. For our purpose, we need different modes, in which we can use the CAS. Normally, during a proof search, we are only interested in the result of a computation, since the assumption that the computation is correct is normally justified for established CAS. When we want to understand the computation – in particular, in a successful proof – we need a mode of the CAS that gives enough information to ? Proofs of this length are among the hardest ever found by totally automatic theorem provers without domain-specific knowledge.
JADKS02.tex; 23/09/1998; 9:44; p.10
INTEGRATING COMPUTER ALGEBRA INTO PROOF PLANNING
337
generate a high-level description of the computation in terms of the mathematics involved. This is described in the next section in detail. First, however, we describe how the integrated system automatically solves an extended example from an economics examination. 3.4.
EXTENDED EXAMPLE
The concrete task at hand is to minimize the costs for running a machine while producing a certain product. PROBLEM. The output of a machine can range over a certain interval, I = [1, 7]. The cost of the product prod is determined by the costs of water and electricity for producing prod, which are given by the functions • r1 = (0.5d 2 + 3)
m3 , prod
• r2 = (4d 2 − 24d + 6)
kWh , prod
and the prices for water and electricity, • p1 = 2 DM , m3 DM • p2 = 0.5 kWh . Determine the output d in I of the machine such that the total costs are minimal. This example serves our purposes for several reasons. First, it allows us to show the interaction of proof planning with symbolic computation and the extraction of proof plans from calculations. Second, the mathematics involved is simple enough to be fully explained (only simple polynomial manipulations are necessary). Third, it is not an example we created, but the problem is a slightly varied version of a minimization problem from a master’s examination in economics at the Universität des Saarlandes, Saarbrücken [33]. To solve problems like this, we have integrated a simple CAS into MEGA, called µ-C AS.? The µ-C AS-system is very simple and can at the moment perform only basic polynomial manipulations and differentiation, but it suffices for automatically solving the example at hand. Clearly, for a practical system for mathematical reasoning, a much more developed system like Maple [12], Reduce [22], A XIOM [28], or Mathematica [34] has to be integrated. The technicalities of the integration will be described in Section 4. ? The µ-C AS system is part of the standard distribution of MEGA, which can be obtained
from http://www.ags.uni-sb.de/software/deduktion/omega. The example is accessible as WiWi-Exam in the theory economy.
JADKS02.tex; 23/09/1998; 9:44; p.11
338
M. KERBER, M. KOHLHASE AND V. SORGE
For the formalization of the example, we use the theory mechanism of MEGA to create a theory economy (see Figure 1) that contains the domain-specific knowledge (both the factual and the method knowledge) needed for the problem solution. Obviously, we need a background theory of costs in economics (that handles both numerical parts and denomination of cost functions) and one of minimization of real functions. Therefore, our theory inherits material from the theories costs and calculus. The calculus theory is provided by MEGA and contains relevant parts of the knowledge of an elementary calculus textbook. For instance, the real numbers are introduced as a complete, dense Archimedian field (based on elementary algebraic notions such as groups and rings defined in the respective theories). The set of real numbers (showing the existence of such a complete, dense Archimedian field) are constructed as the quotient field of the ring of sequences of rational numbers over the ideal of null-sequences. The rational numbers in turn are constructed as signed fractions of natural numbers that are defined from the Peano axioms in theory natural. All of these mathematical theories are based on the theories function, set, and relation that specify naive (simply typed) set theory and the properties of functions and relations on such sets. Finally, the whole hierarchy builds on the theory base, which declares the underlying logic by providing the logical connectives and quantifiers and the basic ND inference rules. The theory economy provides a type υ of units that covers the different units of denominations – in our example m3 (for volume), kWh (for work), prod (for product), and DM (for the price). We then formalize prices as triples consisting of one real number and two units and cost functions as a real function together with two units (read as input/output units). Note that just as in the real world, addition (⊕), multiplication (⊗), and comparison of costs and cost functions are defined as that of their real parts with respect to the denominations. For these calculations we have the axioms CF1 and CF2. If two denominations differ, we can relate them by their prices. For this purpose we use axiom Pr. CF1 CF2 Pr
cf (f, u, v) ⊕ cf (g, u, v) cf (f, u, v) ⊗ cf (g, v, w) price(f, u, v) ⇒ cf (g, v, w)
= = =
cf (f + g, u, v), cf (f · g, u, w), cf (f · g, u, w).
Optimization in economy is formalized by a predicate Opt on a cost function cf (f , DM, prod) and an interval I that is true whenever f has a total minimum? on I . O Opt(cf (f, DM, prod), I ) ⇔ ∃x TotMin(x, f, I ). Thus, we can state the problem as the following formula:?? ? The predicate TotMin and the problem-solving knowledge related to it are inherited from the
theory calculus. ?? Actually the formalization of the problem is not fully correct, since the examiner not only is interested in the proof that there exists such an x, but he/she wants to know the value of x as well as a proof that this value fits the requirements. Obviously, such an answer cannot be obtained from the formula here, but only from a proof that is constructive for the variable x, where we can extract
JADKS02.tex; 23/09/1998; 9:44; p.12
INTEGRATING COMPUTER ALGEBRA INTO PROOF PLANNING
339
Figure 1. Theory hierarchy in MEGA’s knowledge base.
THM H ` Opt([cf (λd 0.5d 2 + 3, m3 , prod)⊕ cf (λd 4d 2 − 24d + 6, kWh, prod)], [1, 7]), where H is a set of hypotheses that are needed for the complete proof, for instance the price axioms price(2, DM, m3 ), Pm3 PkWh price(0.5, DM, kWh). The planner solves the problem by generating a high-level proof plan consisting of methods from its domain specific method base on economics exam questions.? a witness term. This is no problem for a CAS nor for an MRS based on constructive logic, but for a traditional MRS based on classical logic, the proof construction process has to be refined to guarantee constructivity for x. Note that the arguments why the witness for x meets the requirements can still be classical and nonconstructive. For MEGA this means that the proof planner may only use methods in our proof plan that are constructive to get the wanted answer as presented here and not a nonconstructive abstract argument. Finally, note that this phenomenon is another argument in favor manipulating explicit proofs. Without this, one may find oneself in the position that one is convinced (by meta-theoretic arguments) of the existence of a (constructive) proof, but in fact without one from which to extract a term witness to answer the exam question. ? Questions for certain standard exams are a good example for a very restricted mathematical domain, since the proofs and calculations involved are highly standardized. Therefore, finding the proof plan in this example is not a big problem for MEGA.
JADKS02.tex; 23/09/1998; 9:44; p.13
340
M. KERBER, M. KOHLHASE AND V. SORGE
We are going to outline this process by describing its major steps. In particular, we will demonstrate how the proof planner of MEGA and the µ-C ASsystem interact, and make explicit on which entries of a mathematical database this interaction depends. The planner finds the following simple proof plan: 1 2 3 4 5
Mult-by-Price Mult-by-Price Add-by-Denom Optimize TotMin-Rolle
where the first three methods compute the actual cost function by adjusting the denominations and adding. Method 4 uses Axiom O for optimization. As the example contains only polynomials of degree two, the planner selects a method TotMin-Rolle (cf. Figure 3) for finding total minima that makes implicit use of Rolle’s theorem from the calculus theory: Let f be a polynomial of degree two. Then f has a total minimum at x ∈ [a, b] iff f has a minimum at x and f (a) ≥ f (x) ≤ f (b). Formally we get the following equivalence: TotMin
TotMin(x, f, [a, b]) ⇔ x ∈ [a, b] ∧ Min(x, f ) ∧ f (x) ≤ f (a) ∧ f (x) ≤ f (b).
Note that Rolle’s theorem is accessible in the current theory and, to ensure correctness, the database has to contain its formal proof. Now let us take a closer look at some of the methods in order to get a feeling of how this initial proof plan can be expanded. In Figures 2 and 3 we have given slightly simplified presentations of the Mult-by-Price and TotMin-Rolle method.? The declaration slot of the method simply defines the meta-variables used in the body of the method. The premises, conclusions, and the constraint describe the applicability of the method. In the example of Mult-by-Price, for instance, line L4 has to be present and to be an open subgoal, while L1 and L3 are lines that can be used in order to infer L4 . L1 has to be given already, whereas L3 is generated by the application of the method (indicated by the ⊕). Since the method is intended to prove L4 , after the application of the method, this line can be deleted from the current planning state (we indicate this by the ). In the constraint slot further applicability criteria are described, which cannot be formulated in terms of proof line schemata. Declarations, premises, constraints, and conclusions form the specification part of the method. In order to be able to mechanically adapt methods, ? We have especially adjusted the syntax of the constraint in a way that is more comprehensive for the reader.
JADKS02.tex; 23/09/1998; 9:44; p.14
INTEGRATING COMPUTER ALGEBRA INTO PROOF PLANNING
Declarations Premises Constraint Conclusions Declarative Content Procedural Content
341
Method : Mult-by-Price L1 , L2 , L3 , L4 : prln H1 , H2 , H3 : list(prln) J1 : just 0 0 f, g, v, w, φ, φ , ψ, ψ : variable L1 , ⊕L3 ψ ← (2ndarg(termocc(cf, φ)) 6= DM → termocc(cf, φ)) g ← 1starg(ψ) v ← 2ndarg(ψ) w ← 3rdarg(ψ) ψ 0 ← cf (g · f, DM, w) φ 0 ← replace(ψ 0 , ψ, φ) L4 (L1 ) H2 ` price(f, DM, v) (J1 ) (L2 ) H1 , H2 ` cf (g, v, w) = ψ 0 (Pr L1 ) (L3 ) H3 ` φ0 (Call-CAS) (L4 ) H1 , H2 , H3 ` φ (=subst L3 L2 ) schema − interpreter
Figure 2. The Mult-by-Price method from theory cost.
the tactic part is further subdivided into the declarative content and the procedural content. (However, this particular feature is not important for the purpose of this paper.) In our examples the procedural content consists of a schema-interpreter, which essentially inserts the declarative content (using the bindings made in the planning phase) at the correct place in the current partial proof tree. In the concrete example the lines L1 through L4 are inserted (Note that we adopted a linearized version of ND proofs as introduced in [2].) In order to understand to which piece of actual proof these methods evaluate, we have to examine the declarative content and the bindings performed in particular in the constraint. The constraint of the Mult-by-Price-method states a rather simple computation: if there is a cost function in the given open line which has a denomination other than DM, it is multiplied with the appropriate price. The multiplication of the real parts is carried out by the CAS, and the corresponding cost function is constructed. As this point is crucial for understanding the working scheme of a method, we will view the bindings in the constraint step by step: When applied to the current plan the method is matched with the open goals of the planning state. The first pass of the planner yields that L4 can be matched with our theorem THM. Thus, its formula Opt([cf (λd 0.5d 2 + 3, m3 , prod) ⊕ cf (λd 4d 2 − 24d + 6, kWh, prod)], [1, 7]) is bound to the meta-variable φ. It is then examined to find an occurrence of a cost function. If such a subterm exists, its arguments are bound to g, v, w; and by matching line L1 , we receive the numerical part of price in f (if the appropriate price is not provided, the application of the method would fail here). Afterwards the new cost function is computed (according to axiom Pr) in ψ 0 , and finally φ 0 contains the result of replacing the old cost function in φ by ψ 0 .
JADKS02.tex; 23/09/1998; 9:44; p.15
342
M. KERBER, M. KOHLHASE AND V. SORGE
Hence in the first plan step the optimization formula stored in φ 0 contains the cost function cf (λd 1d 2 + 6, DM, prod) as a subterm. With all these meta-variables instantiated the subproof contributed by the Multby-Price method consists of lines L2 and L3 in the declarative content. Here we observe that L2 results from applying the price-axiom Pr (which is fetched from the database) to line L1 . Furthermore note that in L3 we have a call to the CAS as a justifying method for the line. This means that at this point in the proof planning procedure, the CAS is called in order to compute the product of price and original cost function. The line resulting from this calculation is then used as the new open subgoal in the planning state. Summarizing, the effects of the method Mult-by-Price can be observed in two steps. First the goal line THM is justified with the method, yielding the following subproof: H ` Opt ([cf (λd cf (λd THM H ` Opt ([cf (λd cf (λd L1
1d 2 + 6, DM, prod) ⊕ (Open) 4d 2 − 24d + 6, kWh, prod)], [1, 7]), 0.5d 2 + 3, m3 , prod) ⊕ (MbP L1 ) 4d 2 − 24d + 6, kWh, prod)], [1, 7]).
Then the method in the justification of line THM (which has been abbreviated because of lack of space) could be expanded, thereby inserting the intermediate steps as described above by instantiating the macro steps of the method. Note that the following expanded subproof is at a more detailed level of abstraction in the P DS. In particular, the justification of THM itself is different at this level. (HYP) Pm3 Pm3 ` price(2, DM, m3 ), 2 3 (Pr Pm3 ) L2 H ` cf (λd 0.5d + 3, m , prod) = cf (λd 1d 2 + 6, DM, prod), (Open) L1 H ` Opt ([cf (λd 1d 2 + 6, DM, prod)⊕ cf (λd 4d 2 − 24d + 6, kWh, prod)], [1, 7]), THMH ` Opt ([cf (λd 0.5d 2 + 3, m3 , prod)⊕ (=subst L1 L2 ) 2 cf (λd 4d − 24d + 6, kWh, prod)], [1, 7]). In the proof of THM, the method Mult-by-Price is applied twice in order to normalize both summands. To preserve space, we will not present the next two methods of our proof plan as extensively as the Mult-by-Price-method. Addby-Denom is very similar to Mult-by-Price and applies axiom CF1 inside the optimization function Opt to compute the final cost function. In its course the CAS is called once to perform a polynomial addition. Then the Optimize-method simply introduces the definition for the Opt function of axiom O. Far more interesting than these two methods is the TotMin-Rolle method, as it contains a different example for the use of a CAS in MEGA. Again the presentation of the method in Figure 3 is simplified. The TotMin-Rolle method is applied at a stage of the proof where the actual minimum of the cost function has to be introduced. This task is fulfilled within
JADKS02.tex; 23/09/1998; 9:44; p.16
INTEGRATING COMPUTER ALGEBRA INTO PROOF PLANNING
Declarations Premises Constraint Conclusions
Declarative Content
Procedural Content
343
Method : TotMin-Rolle L1 , L2 , L3 , L4 , L5 , L6 , L7 , L8 , L9 , L10 , L11 : prln H1 , H2 , H3 : list(prln) J1 , J2 : just a, b, f, x: variable y, φ, α, β: term L1 , L2 . degree(φ) = 2 y ← compute_with_CAS(minimum, φ) L12 (L1 ) H1 ` ∀f ∀x (f 0 (x) = 0 ∧ (J1 ) f 00 (x) > 0) ⇒ Min(x, f ) (L2 ) H2 ` ∀a ∀b ∀x x ∈ [a, b] ⇔ (J2 ) (a ≤ x ∧ x ≤ b) (Call-CAS) (L3 ) H3 ` φ 0 (y) = 0 (L4 ) H3 ` φ 00 (y) > 0 (Call-CAS) (L5 ) H3 ` α ≤ y (Simplify) (Simplify) (L6 ) H3 ` y ≤ β (L7 ) H3 ` φ(y) ≤ φ(α) (Simplify) (Simplify) (L8 ) H3 ` φ(y) ≤ φ(β) (L9 ) H3 ` Min(y, φ) (L1 L3 L4 ) (L10 ) H3 ` y ∈ [α, β] (L2 L5 L6 ) (TotMinL7 L8 L9 L10 ) (L11 ) H3 ` TotMin(y, φ, [α, β]) (L12 ) H3 ` ∃x TotMin(x, φ, [α, β]) (∃I L11 ) schema − interpreter
Figure 3. The TotMin-Rolle method from theory calculus.
the constraint of the method. The compute_with_CAS statement actually calls the CAS in quiet mode to compute the minimum of the function φ and store it in the meta-variable y. At this stage, the CAS is used as an oracle here, just as in [20]. In our example the minimum of the cost function is at y = 2, and the ND-line of the form ∃xTotMin(x, λx (3x 2 + (−12x + 9)), [1, 7]) will be transformed by eliminating the existentially quantified variable: TotMin(2, λx (3x 2 + (−12x + 9)), [1, 7]). The rest of the proof plan is devoted to proving that the result is actually a total minimum. This is done by using the definition for TotMin from the database and furthermore by using the definitions for minimum and interval that correspond to line L1 and L2 in the method TotMin-Rolle. These definitions are introduced in lines L9 through L11 by applying them to the correct assertions given in lines L3 through L8 . This is expressed by the justifications in the corresponding lines; for instance, the justification of line L10 states that we can infer y ∈ [α, β] from the lines L5 and L6 with the definition of interval in line L2 . A closer look at the justifications of lines L3 through L8 reveals that these contain methods themselves. Lines L3 and L4 again depend on calculations of
JADKS02.tex; 23/09/1998; 9:44; p.17
344
M. KERBER, M. KOHLHASE AND V. SORGE
the CAS which computes the first and second derivative of our cost function. The justifications Simplify correspond to a method performing basic arithmetic simplifications and comparisons. Consisting of only five methods, the above proof plan gives the impression of a small proof, and on an abstract level it is indeed; an experienced mathematician might not want to see more. But expanding the plan into a partially grounded ND proof gives it a length of 90 lines, containing lines justified by the CAS. The proof on this level may roughly correspond to a proof that a novice would like to see and that would form a reasonable solution of the exam problem once it is presented in natural language by the P ROVERB system. By rerunning the CAS in a proof plan generating mode on the CAS-justifications and extracting proof plans, the proof can be expanded to a more detailed proof plan containing an account of the mathematics behind the calculations. This proof plan already contains 135 plan steps and – if the user does not feel comfortable with the level of detail yet – can then be expanded to a calculus-level ND proof of length 354. Note that even this proof is not a stand-alone proof of the minimization theorem, but depends on the proofs of a number of lemmata from a database. Furthermore, in these proofs the simplification of ground arithmetic expressions is not expanded, for instance, into a representation involving zero and the successor function either, which would be necessary to obtain a detailed logic-level proof.
4. Integrating Computations into Explicit Proofs In this section we describe S APPER (System for Algorithmic Proof Plan Extraction and Reasoning), which generates proof plans from CAS output. As mentioned in Section 3.3, for the intended integration it is necessary to augment the CAS with mathematical information for a proof plan generating mode in order to achieve the proposed integration at the level of proofs. For the µ-CAS system, which we have developed to demonstrate the feasibility of the approach, this was rather simple, as we will demonstrate below. Enriching a state-of-the-art CAS with such a mode for producing the necessary additional protocol information would, of course, require a considerable amount of work.
4.1.
ARCHITECTURE
The S APPER system can be seen as a generic interface for connecting MEGA (or another proof plan-based mechanized reasoning system) with one or several computer algebra systems (see Figure 4). An incorporated CAS is treated as a slave to MEGA, which means that only the latter can call the first one and not vice versa. From the software engineering point of view, MEGA and the CAS are two independent processes while the interface is a process providing a bridge for com-
JADKS02.tex; 23/09/1998; 9:44; p.18
INTEGRATING COMPUTER ALGEBRA INTO PROOF PLANNING
345
Figure 4. Interface between MEGA and computer algebra systems.
munication. Its rôle is to automate the broadcasting of messages by transforming output of one system into data that can be processed by the other.? Unlike other approaches (see [23, 19], for example), we do not want to change the logic inside our MRS. In the same line, we do not want to change the computational behavior of the computer algebra algorithms. In order to achieve this goal, the trace output of the algorithm is kept as short as possible. In fact, most of the computations for constructing a proof plan is left to the interface. The proof plans can directly be imported into MEGA. This approach makes the integration independent of the particular systems, and indeed all the results below are independent of the CAS employed and make only some general assumptions about the MRS (such as being proof plan-based). Moreover, the interface approach helps us to keep the CAS free of any logical computation, for which such a system is not intended anyway. Finally, the interface minimizes the required changes to an existing CAS, while maintaining the possibility of using the CAS stand-alone. The only requirement we make for integrating a particular CAS is that it has to produce enough protocol information so that a proof plan can be generated from this information. The proof plan in turn can be expanded by the MRS into a proof verifying the concrete computation. ? This is an adaptation of the general approach on combining systems in [16].
JADKS02.tex; 23/09/1998; 9:44; p.19
346
M. KERBER, M. KOHLHASE AND V. SORGE
The interface itself can be roughly divided into two parts: the translation part and the plan generator. The first performs syntax translations between MEGA and a CAS in both directions, while the latter transforms only verbose output of the CAS to MEGA proof plans. Clearly only the translation part depends on the particular CAS that is invoked. For the translations a collection of data structures – called abstract CAS? – is provided each one referring to a particular connected CAS (or just parts of one). The main purpose of these structures is to specify function mappings, relating a particular function of MEGA to a corresponding CAS-function and the type of its arguments. Furthermore, it provides functionality to convert the given arguments of the mapped MEGA function to CAS input. In the same fashion it transforms results of algebraic computations back into data that can be further processed by MEGA. The functionality in this part of our interface offers us the possibility of connecting any CAS as a black box system, as in the first approach we have described in Section 2. For instance, we may want to use a very efficient system without a mode for generating proof plans in proof search as a black box system, and then another less efficient system with such a mode for the actual proof construction, once it is clear what the proof should look like. This corresponds to recent techniques used in knowledge-based systems, where the explanation component is not just a trace of the rules applied during the search, but the explanation is reconstructed by an independent component. The plan generator solely provides the machinery for our main goal, the proof plan extraction. Equipped with supplementary information on the proof by MEGA, it records the output produced by the particular algebraic algorithm and converts it into a proof plan. Here the requirements of keeping the CAS side free of logical considerations and, on the other hand, of keeping the interface generic seem conflicting at first glance. However, this conflict can be solved by giving both sides of the interface access to a database of mathematical facts formalizing the mathematics behind the particular CAS algorithms. Conceptually, this database, together with the mappings governing the access, provides the semantics of the integration of MEGA with a particular CAS. Thus, expanding the plan generator is simply done by expanding the theory database by adding new tactics. While MEGA itself can access the complete database, S APPER ’ S plan generator in the interface is only able to use tactics and lookup hypotheses of a theory (cf. Figure 4). The CAS does not interact with the database at all: it only has to know about it and references the logical objects (methods, tactics, theorems, or definitions) in the proof plan generating mode. Thus, knowledge about the database is compiled a priori into the algebraic algorithms in order to document their calculations.
? In a reimplementation of S APPER we would probably use the OpenMath protocol [1] as a lingua franca on the CAS side.
JADKS02.tex; 23/09/1998; 9:44; p.20
INTEGRATING COMPUTER ALGEBRA INTO PROOF PLANNING
4.2.
347
PROOF PLAN EXTRACTION
Let us now take a closer look at the implementation of the proof plan generation in µ-CAS and at the expansion process of its output. This should demonstrate how proofs can be extracted from computer algebra calculation and provide an intuition on the requirements that our approach poses on the CAS side. As an example we will consider a polynomial addition from the example above. Normally, an experienced mathematician would not like to see any proof at all for that, while a high-school student would like to. As we have seen in our example, the main purpose of the Add-by-Demon-method is to compute the final cost function cf (λd (3d 2 − 12d + 9), DM, prod). This is done in µ-C AS by adding the two polynomials λd d 2 + 6 and λd 2d 2 − 12d + 3. In the remainder of this subsection we will expand this addition in several steps and thereby obtain a calculus-level proof for the computation. Before examining this example in detail, let us consider the general scheme of the proof plan generation inside the polynomial addition algorithm of µ-C AS. We first take a look at the different representations of a polynomial p in the variables P e1 er x1 , . . . , xr : p = ni=1 αi x1 i · · · xr i . The logical language of MEGA is a variant of the simply typed λ-calculus (indeed we use a stronger type system, but here we want to keep things as simple as possible), so the polynomials are represented as polynomial functions, that is, as λ-expressions, where the formal parameters x1 , . . . , xr are λ-abstracted (mathematically, p is a function of r arguments): p:
λx1 · · · λxr (+ (∗ αn (∗ (↑ x1 e1n ) · · ·)) · · · (∗ α1 (∗ (↑ x1 e11 ) · · ·)).
For the notation, we use a prefix notation; the symbols +, ∗, and ↑ denote binary functions for addition, multiplication, and exponentiation on the reals. In this representation, we can use β-reduction for the evaluation of polynomials. In µ-C AS, we use a variable dense, expanded representation as an internal data structure for polynomials (as described in [36], for instance). Thus, every monomial is represented as a list containing its coefficient together with the exponents of each variable. Hence we get the following representation for p: p:
((αn e1n · · · ern ) · · · (α1 e11 · · · er1 )).
Let us now turn to the actual µ-C AS algorithm for polynomial addition. This simple algorithm adds polynomials p and q by a case analysis on the exponents? P P e1 f1i er with recursive calls to itself. So let p = ni=1 αi x1 i · · · xr i and q = m i=1 βi x1 fr · · · xr i . We have presented the algorithm in the j th component of p and the kth component of q in a L ISP-like pseudo-code in Figure 5. Intuitively, the algorithm proceeds by ordering the monomials, advancing the leading monomial either of the ? We assume a lexicographic monomial order and employ it for ordering the exponents. Thus we make use of the operators >, (e1j · · · erj )(f1k · · · frk )) (tactic "pop-first") e1
er
(cons-poly αj x1 j · · · xr j m n P e1 f1 er P fr (poly-add αi x1 i · · · xr i βi x1 i · · · xr i )) i=j +1
i=k
(< (e1j · · · erj )(f1k · · · frk )) (tactic "pop-second") f1
fr
(cons-poly βk x1 k · · · xr k n P e1 er (poly-add αi x1 i · · · xr i i=j
m P i=k+1
f1
fr
βi x1 i · · · xr i )))
Figure 5. Polynomial addition in µ-C AS.
Obviously, the only expansions of the original algorithm needed for the proof plan generation are the additional (tactic...) statements.? They just produce the additional output by returning keywords of tactic names to the plan generator and do not have any side effects. In particular, the computational behavior of the algorithm does not have to be changed at all. If we now apply this algorithm to the two polynomials p := x 2 + 6
q := 2x 2 − 12x + 3,
we obtain the following proof plan: (mono-add, pop-second, mono-add). First the two quadratic monomials from p and q are added; then the linear term of q (the second argument) is raised, since it only appears in one argument; and finally the remaining monomials are added. In the case of the polynomial addition, each of the methods (proof plan operators) directly corresponds to a tactic with the same name; that is, the list of the three ? Observe that in this case, the called tactics do not need any additional arguments, since our plan generator in the interface keeps track of the position in the proof and thus knows on which monomials the algorithm works when returning a tactic. This way we need not be concerned which form a monomial actually has during the course of the algorithm.
JADKS02.tex; 23/09/1998; 9:44; p.22
INTEGRATING COMPUTER ALGEBRA INTO PROOF PLANNING
349
methods above directly represents a concrete proof plan for polynomial addition of the concrete polynomials p and q. (In the following representation we omitted the context in which the polynomials are embedded in the actual proofs.) ((x 2 + 6) + (2x 2 − 12x + 3)) (3x 2 + (6 + (−12x + 3))) (3x 2 − 12x + (6 + 3)) (3x 2 − 12x + 9)
(mono-add) (pop-second) (mono-add)
These four lines correspond to a step-by-step version of the basic high-school algorithm. So far the expansion of the call-cas-method has been exclusively done by µ-C AS proof plan generation mode. But at this stage µ-C AS cannot provide us with any more details about the computation, and the subsequent expansion of the next hierarchic level can be achieved without further use of a CAS. Let us, for instance, take a look at the pop-second tactic to understand its logical content. The tactic itself describes a reordering in a sum that looks in the general case as follows: (a + (b + c)) = (b + (a + c))
(1)
For the current example we can view a and c as arbitrary polynomials and b as a monomial of rank greater than that of the polynomial a. It is now obvious that the behavior of pop-second is determined by the pattern of the sum it is applied to. If in Equation (1) the polynomial c does not exist, pop-second is equivalent to a single application of the law of commutativity. Otherwise, as in our example, the tactic performs a series of commutativity and associativity steps. The pop-second step above can thus be expanded in a plan that reflects the single step applications of the laws of commutativity and associativity. (3x 2 + (6 + (−12x + 3))) (3x 2 + ((6 − 12x) + 3)) (3x 2 + ((−12x + 6) + 3)) (3x 2 − 12x + (6 + 3))
(associativity) (commutativity) (associativity)
Assuming we have expanded the two mono-add tactics as well, we have constructed a representation of the proof at a level where it needs only the axioms in the polynomial ring. To finally expand this to a fully explicit calculus-level proof, we further expand all three justifications of the above lines. This process leads to a sequence of eliminations of universally quantified variables in the corresponding hypothesis, the axioms of commutativity and associativity. In our example the commutativity axiom would be transformed in the following fashion: ∀a∀b (a + b) = (b + a) ∀b (6 + b) = (b + 6) (6 − 12x) = (−12x + 6)
(THM) (∀E 6) (∀E −12x).
JADKS02.tex; 23/09/1998; 9:44; p.23
350
M. KERBER, M. KOHLHASE AND V. SORGE
Here, the justification (THM) in the first proof line indicates that the commutativity of + was imported from the theory real in MEGA’s mathematical database, where it was established as a theorem. The remaining lines are natural deduction inferences: universal eliminations that instantiate a with the number 6 and b with the term −12x. Altogether this single application of the pop-second-tactic is equivalent to a calculus-level proof of 11 inference steps. The length of the subproof for this trivial polynomial addition is 43 single steps. This example shows how it is possible to mechanically construct a proof verifying the correctness of any particular CAS computation without verifying the CAS algorithm (or their implementation) in the general case. However, the calculus-level proofs for the computations are very long and rather boring, and therefore hardly any human user might actually want to see, much less read, them. Therefore, the P ROVERB proof explanation system in MEGA provides a more realistic alternative, since it gives the user access to representations of the parts of the proof on various levels of abstractions making use of the hierarchical structure of the underlying P DS. For instance, it is then possible to present the computations with some intermediate steps, as it is customary in textbooks. For example, we could include the three steps of the high-school algorithm mentioned above, to illustrate the polynomial addition. (The decision which steps should be included and which omitted depends, of course, on the expertise of readers for which a particular proof presentation is intended.) Despite all these abstractions in both developing and presenting the proof, we can still use any proof checker for ND-calculus to verify all steps including computations. Furthermore, if we assume we have a more sophisticated proof checker, for example one that works modulo the axioms of polynomial rings, it is also possible to check the proof on an abstract level. As already mentioned, the more sophisticated the proof checker is, the more concise the communicated proofs can be. We have tested proof plan extraction from simple recursive and iterative CAS algorithms, where it works quite well, since these algorithms closely correspond to the mathematical definitions of the corresponding concepts. However, more complicated schemes like divide-and-conquer algorithms (for instance, the polynomial multiplication of Karatsuba and Ofman [30]) cannot be adapted to our approach so easily without extending the mathematical knowledge base by corresponding lemmata. The example of the polynomial addition is surely a trivial one; we have chosen it solely for presentation reasons. In particular, it is very likely to be correct in any real-world implementation, since it is well tested and does not depend on sophisticated mathematical theorems for which fuzzy boundary cases must be considered. For the sake of argument, let us assume an error in the implementation. For instance, in the second case of the polynomial addition algorithm in Figure 5 the cons-poly statement was forgotten, so that the algorithm has the following
JADKS02.tex; 23/09/1998; 9:44; p.24
INTEGRATING COMPUTER ALGEBRA INTO PROOF PLANNING
351
(incorrect) form: (> (e1j · · · erj )(f1k · · · frk )) (tactic "pop-first") m n P e1 f1 er P fr (poly-add αi x1 i · · · xr i βi x1 i · · · xr i ). i=j +1
i=k
In the computation of ((x + 6) + (2x − 12x + 3)) that we have discussed above, the second case is never used, and the computation would be correct although the program is not. If we now change the order of addition of our polynomials p and q to q + p, we get the following incorrect result from the changed algorithm: 2
2
((x 2 + 6) + (2x 2 − 12x + 3)) = (3x 2 + 9). Inserting the proof plan generated by the faulty algorithm then yields ((2x 2 − 12x + 3) + (x 2 + 6)) (3x 2 + ((−12x + 3) + 6)) (3x 2 + (3 + 6)) (3x 2 + 9)
(mono-add) (pop-first) (mono-add)
In checking, the proof checker would see that the pop-first step is not justified, since the expansion corresponds to the application of the law of associativity. This would yield ((−12x + 3) + 6) = (−12x + (3 + 6)) and thus would not be applicable during the expansion. Thus, the proof plan and consequently the calculation would be rejected by MEGA. Note that in a large system with literally millions of possible cases, the correctness of a calculation like (x 2 + 6) + (2x 2 − 12x + 3) depends only on a tiny subset of the whole program. It is a strength of our approach that only the calculations that are necessary for a given proof would be checked. This has the advantage that errors on different levels can be detected (in particular, on the levels of algorithms, of compilers, and of processors). Of course, for very long computations, checking can be pretty expensive. Moreover, highly elaborated and efficient algorithms in state-of-the-art CAS might be hard to augment with proof plan generation modes. As we have seen in the example above, the mathematical knowledge in the database has to reflect the mathematical knowledge in the algorithm in order to easily decorate the algorithms by a proof plan generation mode. However, to extend and prove corresponding lemmata is not a trivial task for sophisticated algorithms. In particular, such an approach would go very much in the direction of program verification. Even if it proves practically impossible to extract the information that is valuable at the conceptual, mathematical level, it is always possible to reserve these elaborated techniques for the quiet mode used in proof discovery, and use more basic algorithms, for which the mathematics is easier and that are more easily
JADKS02.tex; 23/09/1998; 9:44; p.25
352
M. KERBER, M. KOHLHASE AND V. SORGE
decorated by a proof plan generation mode, for the proof extraction phase. Systems like Axiom [28] or MuPAD [18] seem to come closest among standard CAS to the needs for a proof plan generation, since one can already attach axiomatizations to algorithms.
5. Conclusion In this work we have reported on an experiment of integrating a computer algebra system into the interactive proof development environment MEGA, not only at the system level, but also at the level of proofs. The motivation for such an integration is the need for support of a human user when his/her proofs contain nontrivial computations. We have shown that the proof planning paradigm in general and the MEGA system in particular provide an open environment for such an extended integration that supports different integration levels. In our approach it is not possible to use a standard CAS for the integration as it is, since such a system provides answers, but no directly usable justifications from which proof plans can be extracted. This, however, turned out to be essential in an environment that is built to construct communicable and checkable proofs. To achieve a solution that is compatible with such a strong requirement, we have adopted a generic approach, where the only requirement for the CAS is that it has a proof plan generation mode for the generation of communicable and checkable proofs. Since we want to achieve the two goals simultaneously, namely, to have high-level descriptions of the calculations of the CAS for communicating them to human users as well as low-level ones for mechanical checking, we represent the protocol information in form of high-level hierarchical proof plans, which can be expanded to the desired detail. Fully expanded proof plans correspond to natural deduction proofs that can be mechanically checked by a simple proof checker. In the case that the CAS has made a mistake, the proof checker will detect it. The general idea and the fundamentals of the integration of a CAS into an MRS are independent from the concrete proof development environment MEGA and the concrete computer algebra system µ-C AS. It can be realized in any plan-based theorem prover. Proof extraction can even be realized on any tactic-based system and with any CAS that can protocol its calculations in form of tactics. Axiom [28] and MuPAD [18] seem to be best suited for a corresponding extension, since one can already attach axiomatizations to algorithms. If in addition the algorithms could be enriched in a way that they produce protocol information in every computation step, that is, state which of the attached axioms are used and what the particular instantiations are, the systems would probably fit in with our approach pretty well. A useful extension of our approach would consist in the usage of various algorithms for the same computation, for instance, one as a fast and efficient algorithm that is not suitable for knowledge extraction while searching for a proof. Afterwards, when actually documenting the whole proof, a less efficient algorithm, which is optimized to find short proofs, can provide a complete proof plan.
JADKS02.tex; 23/09/1998; 9:44; p.26
INTEGRATING COMPUTER ALGEBRA INTO PROOF PLANNING
353
Although the correctness issue can be achieved by a tactic-based approach as well and does not need the specifications that are used in proof planning, the full strength of an integration where considerable automated support is provided cannot be achieved on this level, since it is not possible to perform mechanical reasoning about the tactics. Such an automation can, however, be achieved by the proof planning approach, where the proof planner can automatically call a CAS procedure, when the conditions in the corresponding method are met. The usefulness of an integration on this level can already be seen in the case of our simple µ-C AS: After the integration we are able to prove optimization problems that were out of reach without such a support. On the other hand, the system is able to give explanations of the involved computations at various levels of abstraction, a feature that is missing from today’s CAS. From our experiments we expect that the successful integration of any powerful computer algebra systems would considerably enhance the reasoning power of any mechanized reasoning system.
Acknowledgments The work presented in this paper was supported by the “Deutsche Forschungsgemeinschaft” in SFB 378, project OMEGA. It benefited a lot from discussions in the Calculemus interest group. We thank Lassaad Cheikhrouhou for his help with MEGA’s proof planner and with coding the methods for our examples. Furthermore, we thank Deepak Kapur and the anonymous referees for carefully reading earlier versions of the article and for their detailed comments that helped us to improve the presentation considerably.
References 1. 2.
3.
4.
5.
Abbot, J., van Leeuwen, A., and Strotmann, A.: Objectives of OpenMath, Technical Report 12, RIACA, Eindhoven, June 1996. Andrews, P. B.: Transforming matings into natural deduction proofs, in W. Bibel and R. Kowalski (eds), Proceedings of the 5th CADE, Les Arc, France, 1980, LNCS 87, Springer-Verlag, 1980, pp. 281–292. Ballarin, C., Homann, K., and Calmet, J.: Theorems and algorithms: An interface between Isabelle and Maple, in A. H. M. Levelt (ed.), Proceedings of International Symposium on Symbolic and Algebraic Computation (ISSAC’95), ACM Press, 1995, pp. 150–157. Barendregt, H.: Computations and formal proofs in type theory, talk at the 2nd Meeting of the CALCULEMUS Project, Schloß Dagstuhl, Germany, 18.11.–20.11.1996. See also The quest for correctness, available as URL: ftp://ftp.cs.kun.nl/pub/ CompMath.Found/quest.ps.Z, 1996. Benzmüller, C., Cheikhrouhou, L., Fehrer, D., Fiedler, A., Huang, X., Kerber, M., Kohlhase, M., Melis, E., Meier, A., Schaarschmidt, W., Siekmann, J., and Sorge, V.: MEGA: Towards a mathematical assistant, in W. McCune (ed.), Proceedings of the 14th CADE, Townsville Australia, 1997, LNAI 1249, Springer-Verlag, 1997, pp. 252–255.
JADKS02.tex; 23/09/1998; 9:44; p.27
354 6. 7. 8. 9. 10.
11. 12. 13. 14. 15.
16.
17. 18. 19.
20.
21.
22. 23.
24.
25.
26.
M. KERBER, M. KOHLHASE AND V. SORGE
Boyer, R. S. and Moore, J S.: Metafunctions, in R. S. Boyer and J S. Moore (eds), The Correctness Problem in Computer Science, Academic Press, 1981, pp. 103–184. Buchberger, B.: Mathematische Software-Systeme: Drastische Erweiterung des “Intelligenzniveaus” entsprechender Programme erwartet, Informatik Spektrum 19/2 (1996), 100–101. Buchberger, B.: Symbolic computation (an editorial), J. Symbolic Comput. 1 (1985), 1–6. Buchberger, B.: Using Mathematica for doing simple mathematical proofs, invited talk at the 4th Tokyo Mathematica Users’ Conference, November 2–3, 1996. Bundy, A.: The use of explicit plans to guide inductive proofs, in E. L. Lusk and R. A. Overbeek (eds), Proceedings of the 9th CADE, Argonne, Illinois, USA, 1988, LNCS 310, Springer-Verlag, 1988, pp. 111–120. Bundy, A., Stevens, A., van Harmelen, F., Ireland, A., and Smaill, A.: Rippling: A heuristic for guiding inductive proofs, Artif. Intell. 62 (1993), 185–253. Char, B. W., Geddes, K. O., Gonnet, G. H., Leong, B. L., Monagan, M. B., and Watt, S. M.: First Leaves: A Tutorial Introduction to Maple V, Springer-Verlag, 1992. Chou, S.-C.: Mechanical Geometry Theorem Proving, Mathematics and Its Applications, D. Reidel Publishing Company, Dordrecht, 1988. Chou, S.-C., Gao, X.-S., and Zhang, J.-Z.: Machine Proofs in Geometry: Automated Production of Readable Proofs for Geometry Theorems, World Scientific, Singapore, 1994. Clarke, E. and Zhao, X.: Analytica – A theorem prover in Mathematica, in Automated Deduction, 11th International Conference on Automated Deduction, Saratoga Springs, New York, 15–18 June 1992, pp. 761–763. Clément, D., Montagnac, F., and Prunet, V.: Integrated software components: A paradigm for control integration, in Proceedings of the European Symposium on Software Development Environments and CASE Technology, LNCS 509, Springer-Verlag, 1991. Constable, R. L. et al.: Implementing Mathematics with the Nuprl Proof Development System, Prentice-Hall, 1986. Fuchssteiner, B. et al. (The MuPAD Group): MuPAD User’s Manual, Wiley, 1996. Giunchiglia, F., Pecchiari, P., and Talcott, C.: Reasoning theories – towards an architecture for open mechanized reasoning systems, in F. Baader and K. U. Schulz (eds), Frontiers of Combining Systems (FroCoS-1): 1st International Workshop, Munich, Germany, 1996, Kluwer Acad. Publ., 1996, pp. 157–174 Harrison, J. and Théry, L.: Extending the HOL theorem prover with a computer algebra system to reason about the reals, in C.-J. H. Seger and J. J. Joyce (eds), Higher Order Logic Theorem Proving and its Applications (HUG’93), LNCS 780, Springer-Verlag, 1993, pp. 174–184. Harrison, J. and Théry, L.: Reasoning about the reals: The marriage of HOL and Maple, in A. Voronkov (ed.), Proceedings of the 4th International Conference on Logic Programming and Automated Reasoning (LPAR’93), St. Petersburg, Russia, LNAI 698, Springer-Verlag, 1993, pp. 351–353. Hearn, A. C.: Reduce user’s manual: Version 3.6, Technical Report, Rand Corporation, Santa Monica, CA, USA, 1995. Homann, K. and Calmet, J.: An open environment for doing mathematics, in M. Wester, S. Steinberg, and M. Jahn (eds), Proceedings of 1st International IMACS Conference on Applications of Computer Algebra, Albuquerque, USA, 1995. Howe, D. J.: Computational metatheory in Nuprl, in E. Lusk and R. Overbeek (eds), Proceedings of the 9th CADE, Argonne, Illinois, USA, 1988, LNCS 310, Springer-Verlag, 1988, pp. 238–257. Huang, X., Kerber, M., Kohlhase, M., Melis, E., Nesmith, D., Richts, J., and Siekmann, J.: M KRP: A proof development environment, in A. Bundy (ed.), Proceedings of the 12th CADE, Nancy, 1994, LNAI 814, Springer-Verlag, 1994, pp. 788–792. Huang, X., Kerber, M., Kohlhase, M., and Richts, J.: Adapting methods to novel tasks in proof planning, in B. Nebel and L. Dreschler-Fischer (eds), KI-94: Advances in Artificial Intelli-
JADKS02.tex; 23/09/1998; 9:44; p.28
INTEGRATING COMPUTER ALGEBRA INTO PROOF PLANNING
355
gence – Proceedings of KI-94, 18th German Annual Conference on Artificial Intelligence, Saarbrücken, Germany, LNAI 861, Springer-Verlag, 1994, pp. 379–380. 27. Huang, X. and Fiedler, A.: Presenting machine-found proofs, in M. A. McRobbie and J. K. Slaney (eds), Proceedings of the 13th CADE, New Brunswick, NJ, USA, 1996, LNAI 1104, Springer-Verlag, 1996, pp. 221–225. 28. Jenks, R. D. and Sutor, R. S., AXIOM: The Scientific Computation System, Springer-Verlag, 1992. 29. Kapur, D.: A refutational approach to theorem proving in geometry, Artif. Intell. 37 (1988), 61–93. 30. Karatsuba, A. and Ofman, Y.: Multiplication of multidigit numbers by automata, Soviet Phys. Dokl. 7 (1963), 595–596. 31. Kowalski, R.: Algorithm = Logic + Control, Comm. Assoc. Comput. Machinery 22 (1979), 424–436. 32. McCune, W. W.: Otter 3.0 reference manual and guide, Technical Report ANL-94-6, Argonne National Laboratory, Argonne, IL, USA, 1994. 33. Universität des Saarlandes, Diplomthemen SS-89 Nr. 35, Fachschaft Wirtschaftswissenschaften, Saarbrücken, Germany, 1989. 34. Wolfram, S.: The Mathematica Book: Version 3.0, 3rd edition, Wolfram Media, Inc., Champaign, IL, 1996. 35. Wu, W.: Mechanical Theorem Proving in Geometries: Basic Principles, Texts and Monographs in Symbolic Computation, Springer, Wien, 1994. 36. Zippel, R.: Effective Polynomial Computation, Kluwer Acad. Publ., 1993.
JADKS02.tex; 23/09/1998; 9:44; p.29