Combining Automated Reasoning Systems using Global Workspace Architectures

Simon Colton, Murray Shanahan and John Charnley
Department of Computing, Imperial College, 180 Queens Gate, London SW7 2RH, United Kingdom
[email protected], [email protected], [email protected]

Abstract

Artificial Intelligence techniques from areas such as theorem proving, planning, machine learning, constraint solving, etc., have been implemented in powerful software packages and applied successfully in various domains. We believe that it is time to fully harness the power of individual AI techniques by considering how to combine various reasoning methods so that the whole is more than a sum of the parts. In a number of studies, we have constructed combined reasoning systems and successfully employed them for various discovery tasks in domains of pure mathematics such as finite algebras, number theory and graph theory. We have similarly shown that combined reasoning systems can be fruitfully employed for constraint solving and automated theorem proving tasks. Given the success of these combined reasoning systems, the next step is to move from ad-hoc systems to configurations of a generic framework which can perform the same tasks. We develop here such a framework based on a global workspace architecture, which is essentially a model of combined serial and parallel information flow, wherein specialist processes compete and co-operate for access to a global workspace. Having developed the framework, we configure it into three different combined reasoning systems, each of which can in theory produce results that we previously achieved with an ad-hoc combined system. We argue that such an architecture has many attractive features for combining reasoning systems. In particular, modelled in this way, the combined reasoning system reports the reasoning process in a serial manner, but takes advantage of massive parallelism to determine what to report. Moreover, the component sub-systems have no need to communicate with each other and require no knowledge of how the other sub-systems reason.

Key words: Combined Reasoning, Global Workspace Architectures, Automated Reasoning, Constraint Solving, Machine Learning, Automated Theorem Proving

Preprint submitted to Elsevier

28 November 2006

1 Introduction

In general, AI researchers and practitioners tend to work within a problem solving paradigm. In particular, intelligent tasks to automate are interpreted as problems to be solved, usually with a clearly defined goal which can direct the search for a solution. Once the problem has been defined, it is categorised as a classification problem, a theorem proving problem, a planning problem, etc., and suitable techniques from machine learning, automated theorem proving, planning, etc. are applied. Rarely is a task re-interpreted in a different category, or techniques from more than one category applied to the same problem simultaneously. In a series of studies, we have combined machine learning, automated theorem proving, model generation and constraint solving systems so that the whole is more than a sum of the parts. Via these studies, we have demonstrated that combined reasoning systems can be used to:

• solve standard problems more efficiently;
• enable more flexibility in the usage of AI systems;
• automate tasks which are beyond the capabilities of single systems.

It is fair to say that the majority of approaches to combining reasoning systems have been fairly ad-hoc. This is mainly because the studies have been application driven, i.e., automating a particular task has necessitated both the choice of particular systems and the specifics of the integration scheme. We approach the issue here from a different perspective, and ask whether there is a framework which is generic and flexible enough to combine different sets of systems so that they can solve a range of problems. We propose and develop such a framework based on global workspace architectures.

A global workspace architecture is essentially a model of combined serial and parallel information flow, wherein specialist sub-processes compete and co-operate for access to a global workspace [1]. A specialist sub-process can perform any cognitive process, such as perception, planning, problem solving, etc. If allowed access to the global workspace, the information generated by such a specialist is broadcast to the entire set of specialists, thereby updating their knowledge of the current situation. In recent years, a substantial body of evidence has been gathered to support the hypothesis that the mammalian brain is organised via such global workspaces [2]. Moreover, this theory can be used to explain aspects of human cognition such as conscious and unconscious information processing, and can be applied to anomalies such as the frame problem [46]. From an engineering point of view, the global workspace architecture has much in common with Newell's blackboard architectures for problem solving, which are now an established AI technology [42].

Moreover, AI software agents based on global workspace theory have been successfully implemented and utilised [23].

Our framework starts with a global workspace architecture and makes two important alterations, namely that a specialist sub-process can detach itself from the workspace, and can attach new sub-processes to the workspace. Configuring the framework involves specifying the types of information which can be proposed for broadcast by the global workspace, how sub-processes reason about this information to produce new information, and the meta-heuristics for controlling which information is broadcast at any round. Such meta-heuristics are based on the individual sub-processes being able to approximate the value of the information they propose for broadcast.

In order to ground the specification of the framework, in section 2 we describe some individual reasoning systems which could be combined using it. This is by no means an exhaustive list – in fact the framework is meant to enable combinations of any AI reasoning systems – but it does enable us to ground the discussion of the configurations. Following the development of the framework in section 3, we look at three case studies. In each case, we provide details of an existing combined reasoning system which we have successfully used to automate an intelligent task. We then describe a configuration of the framework which produces a combined reasoning system that, in theory, can produce the same results as the ad-hoc system. In section 4, we look at automating mathematical theory formation for discovery purposes in algebraic domains of pure mathematics. In section 5, we look at the task of suggesting modifications to non-theorems, such that the modifications can be proved true. Finally, in section 6, we look at the task of discovering constraints which are implied by a basic model of a constraint satisfaction problem (CSP), in such a way that a reformulation of the CSP which includes the implied constraints is solved faster than the basic model. To conclude, we survey some existing work in section 7 and relate our research to that of John McCarthy. We then discuss the advantages of using such a framework for future implementations of combined reasoning systems.

2 Component Systems

We assume general familiarity with methods from machine learning, constraint solving, automated theorem proving and model generation. In order to fully describe our previous ad-hoc combined reasoning systems and the three configurations of the generic framework developed in the next section, it will be helpful to have an overview of the Otter, MACE and CLPFD systems, as described in section 2.1.

We also make extensive use of the HR program in our combined reasoning systems. As HR performs a non-standard machine learning routine, we give a more detailed overview of this system in section 2.2.

2.1 Automated Reasoning Tools

• The Otter program [39] is a first order automated theorem prover which uses the resolution method [44]. Resolution is refutation complete, and so if a first order theorem is true, resolution is guaranteed to prove it – although the time taken to do so may be prohibitively high. Given a set of axioms and a theorem which we want to prove follows from the axioms, Otter performs proof by contradiction, hence the theorem to be proved has to be negated. Otter then shows that the negation is inconsistent with the axioms, hence the negation is false, and the theorem must be true.
• The MACE program [40] is a model generator which can take first order axioms and find ground instantiations which satisfy the conditions of the axioms. MACE uses the Davis-Putnam method [21] variant of resolution. The input to MACE is in the same format as for Otter.
• The CLPFD solver [7] is a package in the SICStus Prolog programming environment which enables constraint logic programming over finite domains. Constraint solvers such as CLPFD require the specification of a constraint satisfaction problem (CSP). A CSP consists of a set of variables, a set of domains from which each variable can be assigned values, and a set of constraints which define which simultaneous assignments of values to variables are and are not allowed. The purpose of a CSP solver is to find an assignment of values to variables which doesn't break any constraints. (We give a toy example in code below.)
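To make the CSP terminology concrete, the following is a minimal, hand-rolled sketch in Python (ours, purely illustrative – it is not how CLPFD works internally, and the toy problem is invented): it enumerates assignments by brute force, where a real solver would use constraint propagation and intelligent search.

from itertools import product

# A toy CSP in the sense just defined: three variables over the
# domain {1..5}, constrained by x < y, y < z and x + y = z.
variables = ["x", "y", "z"]
domains = {v: range(1, 6) for v in variables}
constraints = [
    lambda a: a["x"] < a["y"],
    lambda a: a["y"] < a["z"],
    lambda a: a["x"] + a["y"] == a["z"],
]

def solutions():
    # Brute force: try every simultaneous assignment of values to
    # variables and keep those which break no constraint.
    for values in product(*(domains[v] for v in variables)):
        assignment = dict(zip(variables, values))
        if all(c(assignment) for c in constraints):
            yield assignment

print(next(solutions()))  # {'x': 1, 'y': 2, 'z': 3}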

2.2 The HR System

The HR system comprises both a set of descriptive machine learning tools which can form concepts and make conjectures about background information, and a set of techniques for combining reasoning techniques in a routine which we call automated theory formation (ATF). For the purposes of this note, we separate the combined reasoning aspects into the discussion of ATF presented in section 4, and we concentrate here on HR's machine learning tools. The implementation of HR is described in detail in [10], and a more formal representation of the underlying machine learning methods is presented in [18].

HR works with concepts, which are multi-faceted objects, but for our purposes we can think of a concept as a pair ⟨D, T⟩, where D is a definition and T is the set of known tuples which satisfy the definition. We call T the datatable of the concept; in logic programming terms, T is the success set for D (note that HR is not restricted to working in first order logic). The user supplies a set of concepts as the background information to the theory, and HR's task is to add further – hopefully interesting – concepts to this set, as well as conjectures (described later).

As described in [13] and [14], HR forms new concepts from old ones by using one of a number of production rules. We say a production rule is unary if it produces a new concept from a single existing concept, and binary if it uses two existing concepts. One such unary production rule is called match. Given a concept which defines a relationship between elements in a tuple, match will alter the definition by merging two variables. For instance, if given the concept of triples of integers related by the multiplication operation, which we would write thus:

[a, b, c] : a, b, c ∈ N ∧ a = b ∗ c

the match rule might produce the concept of square numbers and their square roots, by merging the b and c variables:

[a, b] : a, b ∈ N ∧ a = b ∗ b

(We give a code sketch of this rule below.) Note that each production rule has an associated set of parameterisations, which fine-tune the application of the production rule for a particular concept. The current version of HR has 17 production rules, but in a particular session only around 5 or 6 will generally be used. Recently, we have enabled the user to have more control over the nature of HR's production rules. In particular, the first-order generic production rule (FOGPR) enables the user to supply first order sentences which FOGPR uses for concept formation [49]. To do this, we interface HR with MACE at the concept formation level: HR uses MACE to construct the examples for a concept it invents. Similarly, HR can interface with the Gap [26] and Maple [50] computer algebra systems to enable the calculation of complex mathematical functions. This has been used to good effect for discovery tasks in number theory [12] and graph theory [41].

In the process of inventing new concepts, HR also makes conjectures empirically, using the datatables of concepts to relate them. In particular, HR checks that every new concept has a non-empty datatable, and if not, it makes a non-existence conjecture. Similarly, if HR finds an existing concept that has exactly the same datatable as a newly invented concept, HR makes the conjecture that the definitions of the old and new concepts are logically equivalent. HR also looks for old concepts where the datatable is a subset/superset of the datatable of the new concept, and makes implication conjectures accordingly.
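The promised sketch of the match rule, acting directly on datatables (our own illustrative reconstruction; HR's actual implementation, with its parameterisations, is described in [13] and [14]):

def match(datatable, i, j):
    # Merge variable positions i and j: keep the tuples on which the
    # two positions agree, then drop the now-duplicated position j.
    return {t[:j] + t[j + 1:] for t in datatable if t[i] == t[j]}

# Triples [a, b, c] with a = b * c, over a small sample of integers.
mult = {(b * c, b, c) for b in range(1, 6) for c in range(1, 6)}

# Merging b and c gives pairs [a, b] with a = b * b: squares and roots.
print(sorted(match(mult, 1, 2)))
# [(1, 1), (4, 2), (9, 3), (16, 4), (25, 5)]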

Moreover, HR will extract simpler conjectures from those it makes empirically, as described in [11]. Finally, as discussed in section 4, HR will appeal to Otter and MACE to prove/disprove the conjectures it makes.

As an example of some historical importance,¹ we consider the case of refactorable numbers. Starting with only the background concept of how to multiply two integers to produce a third, HR invents the concept of divisors of an integer using the exists production rule, followed by the number of divisors of an integer, using the size production rule. Following this, using the compose production rule, HR invents the concept of integers where the number of divisors is itself a divisor (such as 9, which has three divisors – 1, 3 and 9 – and three is itself a divisor of 9). We call these integers refactorable² numbers. In a separate line of reasoning, HR invents the concept of odd numbers, by using the split production rule to define integers divisible by 2 (even numbers), and the negate production rule to invent the concept of non-even numbers (odd numbers). HR then invents the concept of square numbers using the match and exists production rules sequentially. Finally, HR invents the concept of odd refactorable numbers, using the compose production rule again. At this stage, it notices empirically that all odd refactorable numbers are square numbers. This result is fairly easy to prove using some theorems from [27], and we published it, along with some other results about refactorable numbers, in [9]. (This example is reconstructed in code below.)

HR also approximates the value of the concepts it produces using a weighted sum of measures of interestingness, as described in [16]. The measures can be intrinsic, such as the complexity of the definition or the size of the datatable, or they can be relational, such as how novel the datatable is in comparison with the others in the theory. The measures can also be related to the conjectures about the concepts: if a concept is involved in a number of interesting conjectures, it is deemed interesting itself. Hence, HR also has measures of interestingness for the conjectures, such as the surprisingness measure, which – given a conjecture relating two concepts – counts the number of concepts in the construction history of one but not both of the related concepts. If a conjecture relates two concepts, it is deemed surprising if they are not related by their constructions. The measures of interestingness drive a heuristic search: HR builds new concepts from those deemed most interesting by the weighted sum.
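A minimal reconstruction (ours) of the refactorable example, building the concepts directly rather than via HR's production rules, and checking the conjecture on an initial segment of the integers:

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

# Refactorables: the number of divisors is itself a divisor.
refactorable = [n for n in range(1, 500) if n % len(divisors(n)) == 0]
odd_refactorable = [n for n in refactorable if n % 2 == 1]

# The empirical observation HR made: every odd refactorable is square.
assert all(int(n ** 0.5) ** 2 == n for n in odd_refactorable)
print(odd_refactorable)  # [1, 9, 225, 441]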

¹ This was the first of HR's results to be interesting enough to publish in the mathematical literature.
² See sequence A033950 in Sloane's Encyclopedia of Integer Sequences: www.research.att.com/~njas/sequences.

3 A Global Workspace Framework for Combined Reasoning

Our aim here is to undertake the theoretical development of a simple framework which can be configured by a system designer to combine disparate reasoning systems, and then employed by a user to fruitfully undertake intelligent tasks. We start with a basic global workspace architecture as follows:

• A blackboard-style workspace has a number of sub-processes attached. For our purposes, each sub-process performs some form of reasoning.
• Reasoning proceeds sequentially in rounds, but at each round every sub-process is activated in parallel.
• At each round, some – but not necessarily all – of the sub-processes propose a result to be broadcast to all the sub-processes still attached to the workspace in the next round.
• Each sub-process also provides a numerical value for its proposal, which represents the process's own estimation of the worth of the result.
• The proposal with the highest value is chosen for broadcasting in the next round, which effectively drives the reasoning process. Given that each sub-process might have its own heuristics for searching, we use the term 'meta-heuristics' to describe the ways in which sub-processes approximate the value of their proposals.
• Each sub-process reacts to the broadcast result and possibly proposes a result, which starts the next round of reasoning. (We give a code sketch of this round regime at the end of this section.)

To configure the framework, the system designer must supply:

• A set of artefact types – these are strings which represent the names of the types of various ground instances.
• A set of broadcastable relationship templates, which are predicates that relate variables typed with artefact types.
• A set of fixed sub-processes which are attached to the global workspace.

We say a ground relationship is a relationship template for which all the variables have been ground to instances of artefacts of the correct artefact type. Given such a configuration, we can be more precise about the nature of a sub-process: a sub-process is a method which embeds some reasoning procedure. It takes a single ground relationship as input. In response, it either does nothing, or does one or more of the following:

(a) proposes a new ground relationship for the architecture, and supplies a numerical value which approximates the worth of the proposal;
(b) detaches itself from the global workspace;

(c) attaches one or more new sub-processes to the global workspace.

Note that we have deviated from the basic global workspace model in two fundamental ways: our sub-processes can spawn their own sub-processes, and our sub-processes can terminate themselves.

Having configured the framework into a combined reasoning system, the user can employ the system with various choices of background knowledge. To express the background knowledge, a user must supply a set of relationships of a form that can be broadcast. Each configuration of the framework must include sub-process generating mechanisms which take the background knowledge and attach sub-processes to the global workspace which employ that background knowledge. At the start of a reasoning session, the fixed sub-processes and the background sub-processes are attached to the global workspace. To start the session running, the workspace broadcasts a dummy relationship to the entire set of attached sub-processes.

In the next sections, we look at three successful combined reasoning systems that we have built on an ad-hoc basis. After discussing the background to each project and how the systems combine reasoning techniques, we configure the global workspace framework into a new combined reasoning system which is able to produce equivalent output. For each configuration, we supply a worked example comprising the background information supplied and the reasoning rounds which would occur given that information. To describe the configurations and worked examples, we distinguish between three types of sub-process, in terms of how they become attached to the global workspace:

• Fixed sub-processes: these are supplied by the system designer as part of the configuration of the framework.
• Background sub-processes: these are sub-processes generated to handle the background knowledge supplied by the user. The generation mechanism is also specified by the system designer.
• Spawned sub-processes: these are new sub-processes which are attached to the global workspace at run-time.

Note that once attached to the workspace, each sub-process is treated the same.
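As promised above, here is a minimal sketch of the round regime in Python. It is ours, written purely for illustration: the react method, its 4-tuple return value, and the representation of relationships as tuples are assumptions made for the sketch, not part of any implementation described in this paper.

def run_workspace(subprocesses, rounds):
    # Each sub-process is assumed to have a react(broadcast) method
    # returning a 4-tuple:
    # (proposal_or_None, value, detach_self, newly_spawned_subprocesses).
    broadcast = ("dummy",)  # a session starts with a dummy relationship
    for _ in range(rounds):
        proposals, survivors = [], []
        for sp in subprocesses:  # conceptually, these react in parallel
            proposal, value, detach, spawned = sp.react(broadcast)
            if proposal is not None:
                proposals.append((value, proposal))
            if not detach:
                survivors.append(sp)
            survivors.extend(spawned)  # spawned processes join the workspace
        subprocesses = survivors
        if not proposals:
            break
        # The proposal with the highest meta-heuristic value wins the
        # round and is broadcast to all attached sub-processes.
        _, broadcast = max(proposals, key=lambda p: p[0])
    return broadcast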

4 Automated Theory Formation

Automated Theory Formation (ATF) is essentially a descriptive machine learning algorithm, which appeals to third-party reasoning systems for theorem proving, model generation and the calculation of mathematical functions.

This approach has been described extensively in [10] and has been presented as an Inductive Logic Programming algorithm in [18]. The approach has been adapted and altered many times, but the core methods are concept formation and conjecture making (via the HR program), first order theorem proving (via the Otter program), and Davis-Putnam model generation (via the MACE program). ATF has been used for numerous discovery tasks in domains of pure mathematics, with particular success in number theory [9], [12], [15] and graph theory [41], as well as algebraic domains [17], [20].

To give a flavour of the power of the ATF approach, our most recent application to mathematical discovery – as described in [20] – was to an algebraic domain with a single axiom, namely: ∀ x, y, z ((x ∗ y) ∗ z = y ∗ (z ∗ x)). The algebras prescribed by this axiom are called star algebras. Starting with only this single axiom, the ATF approach discovered a number of surprising theorems in a fully automatic way. For instance, some theorems involving idempotent elements³ were discovered, e.g., that in star algebras, idempotent elements are closed under multiplication, i.e.,

∀ a, b (idempotent(a) ∧ idempotent(b) → idempotent(a ∗ b))

(This theorem is easy to verify by brute force on small carriers; see the sketch below.) We also used the system in a semi-automated way, by performing numerous guided theory formation sessions, as described in [20]. We eventually focused on the question of finding a canonical example: a star algebra which satisfies the star algebra axiom, but not merely by virtue of other axioms which hold of it. For instance, commutative and/or associative algebras all satisfy the star algebra axiom, hence to find a canonical example, we first had to find a non-commutative, non-associative example. After some semi-automated investigation, we eventually proved that there are no interesting canonical examples. Specifically, we showed that any canonical example must have a repeated row and column in its multiplication table, and that removal of these duplications results in a commutative and associative algebra. Hence, our main result was negative, but it was surprising to find that star algebras – specified by a single axiom – have associativity and commutativity embedded in them.

In section 4.1 we describe the ATF algorithm in the mode where it starts from just the axioms of an algebraic domain. This algorithm proceeds with linear sequences of reasoning. In section 4.2, we re-model the algorithm by configuring the global workspace framework described above, and we provide a worked example of the resulting combined reasoning system in section 4.3.

³ Idempotent elements, a, are such that a ∗ a = a.

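As an aside, the idempotent closure theorem mentioned above can be verified by brute force on small carriers; a minimal sketch (ours) which enumerates every 2-element multiplication table satisfying the star algebra axiom and checks closure:

from itertools import product

def star_algebras(n):
    # Every multiplication table m on {0..n-1} with (x*y)*z = y*(z*x).
    elems = range(n)
    for flat in product(elems, repeat=n * n):
        m = [flat[i * n:(i + 1) * n] for i in range(n)]
        if all(m[m[x][y]][z] == m[y][m[z][x]]
               for x in elems for y in elems for z in elems):
            yield m

# In every 2-element star algebra, idempotents are closed under *.
for m in star_algebras(2):
    idempotents = [a for a in range(2) if m[a][a] == a]
    assert all(m[a][b] in idempotents
               for a in idempotents for b in idempotents)
print("idempotent closure verified on all 2-element star algebras")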

1. A new concept C is invented by using one of HR's production rules to produce a new concept from the most interesting existing concepts in the theory (as discussed in section 2.2 above).
2. A check is made to see whether C has no tuples in its datatable. This fails.
3. A check is made to see whether C has exactly the same datatable as an existing concept. This fails.
4. C is added to the theory.
5. Each existing concept, D, is examined, and if the datatable of D is a subset of the datatable of C, then an implication conjecture stating that D → C is made. If instead the datatable of D is a superset of the datatable of C, then the conjecture C → D is made. Each conjecture is added to the theory.
6. For each conjecture identified in part 5, Otter is used to attempt to prove the conjecture. Any proof found is added to the theory.
7. For every conjecture not proved in part 6, MACE is used to attempt to find a counterexample. Every counterexample found is added to the theory as a new object of interest embedded in the counterexample.
8. For every counterexample added to the theory in part 7, the datatable of each concept in the theory is recalculated in light of the new object of interest.
9. For every concept E whose datatable changed under recalculation in part 8, all the conjectures in the theory which involve E and have not been proven or disproven are checked in the light of the new datatable. Any which are no longer true are removed from the theory.
10. The interestingness of C is evaluated, and the interestingness of all the other concepts is re-assessed in light of this (as some measures of interestingness are relative rather than absolute).

Fig. 1. An example linear theory formation step

4.1 Linear-ATF

The purpose of the ATF algorithm is to build up a theory by successively adding theory constituents, which are either:

• A ground instance, which is a string representing a fundamental object in the domain. For instance, in the star algebra domain, the star algebras themselves are ground instances.
• A concept, which is a pair ⟨definition, datatable⟩, where the definition specifies a membership relation for an arbitrary tuple of ground instances, and the datatable contains all the known tuples which satisfy the membership condition. For instance, in the star algebra domain, one concept describes elements which are idempotent;

its datatable is the following set of tuples:

{[S, a] : star_algebra(S) ∧ a ∈ S ∧ a ∗ a = a}

• A conjecture, which is an empirically observed property of a concept, or an empirically observed relationship between two concepts. For instance, in star algebras, the conjecture that idempotent elements are closed under multiplication can be observed empirically.
• A proof, which is a deductive proof of an empirically observed conjecture. For instance, in star algebras, Otter can prove the closure of idempotent elements under multiplication.

The algorithm proceeds in theory formation steps. Each step may add multiple theory constituents of different types to the theory, depending on the success or failure of parts of the step. In figure 1, we present an example step which uses HR's concept formation and conjecture making techniques, Otter's theorem proving and MACE's model generation, and which performs certain administrative tasks in order to keep the theory consistent and correct.

The step portrayed in figure 1 is distinctly linear: one reasoning method follows another. Parts 2 and 3 of the step aim to reduce redundancy in the theory: if a concept has no tuples, then there is a chance that its definition is inconsistent with the axioms of the theory. Hence, if Otter can prove this inconsistency, the concept should not be allowed into the theory, as any concepts built from it would be similarly inconsistent. Likewise, if a concept has exactly the same datatable as an existing concept, then there is a chance that the definitions of the old and the new concept are logically equivalent. Hence, if Otter can prove this equivalence, the concept is not added to the theory, as concept formation processes would be duplicated if it were. (The empirical checks which trigger these proof attempts are sketched in code below.)
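The empirical checks behind parts 2, 3 and 5 reduce to set comparisons on datatables, as the following sketch (ours, with datatables represented as Python sets) makes explicit:

def empirical_conjectures(new_def, new_table, theory):
    # theory maps existing definitions to their datatables (as sets);
    # yields the conjectures the step would hand to Otter and MACE.
    if not new_table:                       # part 2: empty datatable
        yield ("non-exists", [new_def])
    for old_def, old_table in theory.items():
        if new_table == old_table:          # part 3: identical datatables
            yield ("equivalent", [old_def, new_def])
        elif old_table < new_table:         # part 5: D subset of C
            yield ("implies", [old_def, new_def])
        elif new_table < old_table:         # part 5: D superset of C
            yield ("implies", [new_def, old_def])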

4.2 GW-ATF

The linear approach to automated theory formation is rather rigid in a number of ways. Firstly, the linear approach requires processes to be undertaken in the correct order, so it would be difficult to distribute it over multiple processors. Also, the ordering of the production rules, and the ordering of the background knowledge can have a great influence on the theory which is formed. A global workspace approach opens up the possibility of designing an ATF configuration which is massively parallel, and more robust to the order in which processes are carried out. To configure the global workspace framework to perform automated theory formation, we specify below the artefact types, relationships, background sub-process generating mechanism and fixed sub-processes. Following this, we will suggest a meta-heuristic able to simulate the linear-ATF model.


Artefact types

• Datatable: a set of tuples of strings, where each tuple is of the same arity.
• Definition: a string definition which can be interpreted for membership, i.e., a tuple of strings from a datatable can be checked to see whether or not it conforms to (satisfies) the definition.
• Proof: a string which proves the truth of some statement.
• DefinitionList: a list where all the entries are definitions.
• Keywords: reserved strings which are used to specify particular relationships.

Broadcastable relationship templates

• concept(D : DefinitionList, T : Datatable)
This is read as: the set of tuples, T, all satisfy definition D, and there is currently no other known tuple that satisfies D.
• conjecture(L : DefinitionList, C : Keyword), where C is either non-exists, equivalent or implies.
This is read as: the definitions in L are conjectured to be logically related in the manner specified by C.
• explanation(L : DefinitionList, C : Keyword, E : Keyword, P : Proof, D : Datatable), where C is either non-exists, equivalent or implies, and E is either proved, disproved or open.
· If E is proved, this is to be read as: the conjecture that the definitions in L are logically related in the manner specified by C is proved true by P.
· If E is disproved, this is to be read as: D provides a set of counterexamples to the conjecture that the definitions in L are logically related in the manner specified by C.
· If E is open, this is to be read as: an attempt to prove or disprove the conjecture that the definitions in L are logically related in the manner specified by C has failed.

(These templates are sketched as plain records in code below.)

Fixed sub-processes

To finish the configuration of the framework, we specify below the set of fixed sub-processes which will be attached to the global workspace at the start of the session. Note that we use the following conventions: (i) we use R to represent the broadcast relationship to which a sub-process is exposed, (ii) if the reaction is to attach a new sub-process to the global workspace, we simply say 'attach', and we use R to represent the relationship that the new sub-process will itself be exposed to, (iii) if a sub-process S0 spawns a new sub-process S1, then S1 must be independent of S0, and (iv) by the term 'detach', we mean that a sub-process detaches itself from the global workspace.
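The relationship templates above can be represented as plain records; a minimal sketch (ours) in Python, flattening the artefact types into strings, tuples and frozen sets – the field names are our own:

from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

Datatable = FrozenSet[Tuple[str, ...]]

@dataclass(frozen=True)
class Concept:
    definitions: Tuple[str, ...]
    datatable: Datatable

@dataclass(frozen=True)
class Conjecture:
    definitions: Tuple[str, ...]
    kind: str                 # "non-exists" | "equivalent" | "implies"

@dataclass(frozen=True)
class Explanation:
    definitions: Tuple[str, ...]
    kind: str                 # as for Conjecture
    status: str               # "proved" | "disproved" | "open"
    proof: Optional[str] = None                   # filled when proved
    counterexamples: Optional[Datatable] = None   # filled when disproved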

BG (Proposing background relationships)
If R = concept(D, _), then detach, else propose concept(D, T).

UD (Updating datatable)
If R = explanation(_, _, _, _, N), then let T′ = update_datatable(D, T, N) and X = concept(D, T′), replace T by T′, then propose X, and attach:
  If R = X or R = concept(D, _), then detach, else propose X.

Fig. 2. Background sub-processes for the GW-ATF configuration

For convenience, we use the shows_redundancy(R, D) predicate, where R is a ground relationship and D is a definition. This is defined as follows:

shows_redundancy(R, D) if and only if
  R = explanation([D], non-exists, proved, _, _), or
  R = explanation([_, D], equivalent, proved, _, _)

(This predicate is transcribed into code below.) As in the linear ATF algorithm, if a new concept is proved to be equivalent to a previous concept, or is proved to have no examples, then this concept should not be allowed to have further concepts developed from it; the shows_redundancy predicate is used in many sub-processes of the ATF configuration to identify such cases.

Background sub-processes

In this configuration, the user is only allowed to supply concepts as background information. They are not obliged to supply examples for any background concept, but they do need to specify how to extract the data for a model generated by the MACE model generator and update a given background datatable with the new data. Specifying such an update_datatable function is usually a trivial matter. For each concept relationship, concept(D, T), given as background information, the sub-processes described in figure 2 will be attached to the global workspace. The BG sub-process simply proposes the concept until it is eventually broadcast, at which stage BG detaches itself from the global workspace.

We need to cater for the scenario where the system finds new data and the datatables of the background knowledge need to be updated. This can only happen if a counterexample to a false conjecture is produced (by MACE), and the UD sub-process reacts to this scenario. UD uses the update_datatable function to extract from a counterexample datatable N the new data for the background concept, and produces a new datatable. The background concept with the new datatable is then proposed for broadcast.
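In terms of the Explanation record sketched above, shows_redundancy is a direct transcription (ours):

def shows_redundancy(r, d):
    # True when broadcast r proves that d has no examples, or that d
    # is equivalent to an earlier definition (the second entry of the
    # definition list names the newer concept); in either case d
    # should not be developed further.
    return (isinstance(r, Explanation)
            and r.status == "proved"
            and ((r.kind == "non-exists" and r.definitions == (d,))
                 or (r.kind == "equivalent" and r.definitions[-1] == d)))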

Concept formation sub-processes

To specify the sub-processes associated with concept formation, we assume that the system designer has production rules available, like those in the HR system, which can transform a list of definitions into a single new definition, and similarly can transform the respective list of datatables into a single new datatable. We say that a production rule is unary if it forms a new concept from one old concept, and binary if it forms a new concept from two old concepts. In the HR system, each production rule has a set of parameterisations, and each parameterisation potentially produces a different concept from the same input concept(s). For our purposes here, we flatten this and state that a production rule has a fixed parameterisation. Moreover, we note that a binary rule can take the same concept as both inputs, and in this case, we define a unary version of the production rule. Hence, a binary rule has strictly different input concepts. Given a production rule, PR, a list of definitions, LD, and a list of datatables, LT, we use the notation D_PR(LD) for the definition produced by PR from LD, and T_PR(LT) for the datatable produced by PR from LT.

We define two fixed sub-processes for the GW-ATF configuration of the workspace, which are given in figure 3. Looking at the UCI fixed sub-process (for a specific unary production rule U), we see that, every time a concept C is broadcast, UCI calculates the result of the application of U on C and attaches a new sub-process, S1, to the global workspace which continually proposes this result until either C is shown to be redundant, the proposal is broadcast, or the parent, P, of C is re-broadcast. In the latter case – as we shall see later – a re-broadcast indicates that the datatable of concept P has been altered, which in turn means that the concept being proposed by S1 is out of date; hence the proposal should be dropped in favour of a new proposal which constructs C from the altered parent concept, P′. The proposing of the new concept is taken care of by the second sub-process, S2, which UCI spawns in light of a broadcast concept. Note that S2 is never detached from the workspace, as P′ may itself be changed in future reasoning rounds.

The BCI fixed sub-process (for a specific binary production rule B) attaches a spawned sub-process for each broadcast concept, P1. This doesn't propose a new concept itself, but is designed to wait for the broadcast of a concept P2 in the future, and then use B to construct a new concept, N, from parents P1 and P2. This is done similarly to the UCI sub-process, i.e., by spawning a sub-process to continually propose N, and a sub-process to continually check whether either parent of N has been altered, and react accordingly.

UCI[U] (Unary concept invention using U)
If R = concept(D, T), then calculate X = concept(D_U([D]), T_U([T])), propose X, and attach both this sub-process:
  If shows_redundancy(R, D) or R = X or R = concept(D, _), then detach, else propose X.
and this sub-process:
  If shows_redundancy(R, D), then detach, else if R = concept(D, T′) and T ≠ T′, then calculate X′ = concept(D_U([D]), T_U([T′])), propose X′, and attach:
    If shows_redundancy(R, D), then detach, else propose X′.

BCI[B] (Binary concept invention using B)
If R = concept(D1, T1), then attach:
  If shows_redundancy(R, D1), then detach, else if R = concept(D2, T2), then calculate X = concept(D_B([D1, D2]), T_B([T1, T2])), propose X, and attach both this sub-process:
    If shows_redundancy(R, D1) or shows_redundancy(R, D2), or R = X, or R = concept(D1, _) or R = concept(D2, _), then detach, else propose X.
  and this sub-process:
    If shows_redundancy(R, D1) or shows_redundancy(R, D2), then detach, else if R = concept(D1, T3) and T1 ≠ T3, then calculate X′ = concept(D_B([D1, D2]), T_B([T3, T2])), propose X′, and attach the sub-process below, else if R = concept(D2, T4) and T2 ≠ T4, then calculate X′ = concept(D_B([D1, D2]), T_B([T1, T4])), propose X′, and attach:
      If shows_redundancy(R, D1) or shows_redundancy(R, D2), or R = X′, or R = concept(D1, _) or R = concept(D2, _), then detach, else propose X′.

Fig. 3. Concept formation fixed sub-processes for the GW-ATF configuration


NE (Non-existence conjectures)
If R = concept(D, {}), then let X = conjecture([D], non-exists), propose X, and attach:
  If shows_redundancy(R, D), or R = X, or R = concept(D, _), then detach, else propose X.

EQ (Equivalence conjectures)
If R = concept(D1, T1), then attach:
  If shows_redundancy(R, D1) or R = concept(D1, _), then detach, else if R = concept(D2, T2) and T1 = T2, then let X = conjecture([D1, D2], equivalent), propose X, and attach:
    If shows_redundancy(R, D1) or R = concept(D1, _) or shows_redundancy(R, D2) or R = concept(D2, _) or R = X, then detach, else propose X.

L-IMP (Left implication conjectures)
This is the same as EQ, but with T1 = T2 replaced by T1 ⊂ T2, and conjecture([D1, D2], equivalent) replaced by conjecture([D1, D2], implies).

R-IMP (Right implication conjectures)
This is the same as EQ, but with T1 = T2 replaced by T2 ⊂ T1, and conjecture([D1, D2], equivalent) replaced by conjecture([D2, D1], implies).

Fig. 4. Conjecture making fixed sub-processes for the GW-ATF configuration

Conjecture making sub-processes

The four fixed sub-processes detailed in figure 4 implement the conjecture making methods required for automated theory formation. With the NE fixed sub-process, whenever a new concept, C, is broadcast, if it has an empty datatable, then a new sub-process is attached to the workspace which continually proposes the conjecture that the definition of the concept is logically inconsistent. This continues until either the conjecture is broadcast or C becomes redundant.

Whenever a new concept, C1, is broadcast, the EQ, L-IMP and R-IMP sub-processes each spawn a new sub-process, S, which will endeavour to find an empirical relationship between C1 and any concept, C2, broadcast in the future. To do this, S reacts to a new concept by checking a relationship between the datatables of C1 and C2 (equality, subset and superset for EQ, L-IMP and R-IMP respectively). If the relationship holds, S attaches a sub-process which continually proposes the appropriate conjecture.

PC[Otter] (Proving conjectures)
If R = conjecture(L, K), then use Otter to attempt to prove the logical relationship on the definitions in L prescribed by K, then:
  If Otter successfully produces proof P, then let X = explanation(L, K, proved, P, _), propose X, and attach:
    If R = X, then detach, else propose X.
  else let X = explanation(L, K, open, _, _), propose X, and attach:
    If R = X or R = explanation(L, K, disproved, _, _), then detach, else propose X.

DC[MACE] (Disproving conjectures)
If R = conjecture(L, K), then use MACE to attempt to find a counterexample datatable which disproves the logical relationship on the definitions in L prescribed by K, then:
  If MACE successfully produces counterexample T, then let X = explanation(L, K, disproved, _, T), propose X, and attach:
    If R = X, then detach, else propose X.
  else let X = explanation(L, K, open, _, _), propose X, and attach:
    If R = X or R = explanation(L, K, proved, _, _), then detach, else propose X.

Fig. 5. Conjecture handling fixed sub-processes for the GW-ATF configuration

As usual, if either C1 or C2 is shown to be redundant, then S and all the sub-processes spawned by S are detached from the workspace. Moreover, once the proposal of any sub-process, X, spawned by S is broadcast, X is detached.

Conjecture handling sub-processes

The two fixed sub-processes presented in figure 5 model the conjecture proving and disproving aspects of automated theory formation. We see that the PC and DC sub-processes react to a conjecture of any type by attempting to prove/disprove the conjecture with Otter/MACE. If either is successful, then a sub-process is attached to the workspace which proposes this result persistently until it is broadcast. In both cases, if they are unsuccessful, then a new sub-process, S, will be attached to the workspace which continually proposes an open explanation. S will detach itself from the workspace if its proposal is broadcast. For PC spawned sub-processes, S will also detach itself if another sub-process reports that the conjecture has been disproved.

Likewise, for DC spawned sub-processes, S will detach itself if another sub-process reports that the conjecture has been proved. This enables proving and disproving to be carried out in parallel, and also models the fact that in the linear-ATF approach only one attempt to prove/disprove each conjecture is made. This restriction could, of course, be relaxed, and in that case, rather than proposing an open explanation, the original conjecture could be re-proposed, so that future sub-processes might try harder to solve the open conjecture.

Meta-heuristics

Automated theory formation performed by the framework configured in the above way is massively parallel: every time a concept is broadcast, in that round every possible concept derivable from it is calculated, and every possible conjecture about it is calculated, including all the relationships to any previously broadcast concept. In the next round, the results from these calculations are proposed, and the sub-processes making those proposals persist on the global workspace until their proposals are broadcast. Hence, in addition to the benefits of parallelism, the new ATF approach is not sensitive to the order in which processes – in particular the production rules – are applied. The order in which relationships are broadcast does, however, alter the theory which is formed, especially in a limited time frame. Therefore, we need to carefully define the way in which sub-processes assess the value of the proposals they make, i.e., the meta-heuristics we specify.

Given that only one proposal is broadcast at every round, yet in each round multiple sub-processes can detach themselves, an appropriate meta-heuristic would be to attach a higher value to the results from sub-processes which cause the detachment of many sub-processes. This would keep the workspace as free as possible from superfluous sub-processes and their proposals. Hence a suitable ordering of the values associated with the different relationships would be to value explanations more than conjectures, which are themselves valued more than concepts. A sensible exception to this is that we should value a concept output by the UD spawned processes higher than any conjecture, because the new concept will make the previous one redundant, hence causing many sub-processes to detach themselves from the workspace. Another sensible exception is to place a higher value on the artefacts produced by background knowledge sub-processes than on anything else. This will ensure that the background knowledge is broadcast at the start of the session. In addition, we could use the measures of interestingness mentioned in section 2.2 to further order the concepts output by the concept formation sub-processes.
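A valuation implementing this ordering might look as follows; this is our own sketch, using the record types from the sketch in section 4.2 and the numerical values chosen for the worked example in section 4.3, with an arbitrary high value for background knowledge:

def value(proposal, proposed_by_bg=False, proposed_by_ud=False,
          interestingness=0.0):
    # Background knowledge first, then UD updates, then explanations,
    # conjectures and concepts. interestingness (section 2.2) is
    # assumed to lie in [0, 1) so that it only breaks ties between
    # concept proposals.
    if proposed_by_bg:
        return 100
    if proposed_by_ud:
        return 4
    if isinstance(proposal, Explanation):
        return 0 if proposal.status == "open" else 3
    if isinstance(proposal, Conjecture):
        return 2
    return 1 + interestingness  # a Concept proposal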

4.3 Worked Example

Using the configuration and meta-heuristics above, we claim that the ATF model presented here produces results which are equivalent to those produced by the linear ATF routine. To add weight to this claim, we note that:

• The two models produce the same kinds of theory constituents.
• The persistence built into the fixed sub-processes (i.e., they attach sub-processes which continually propose the same relationship until it is finally broadcast) means that no artefacts will be lost, just as in the linear model.
• The concept formation in both models is driven by the same production rules.
• The conjecture making of the linear model is catered for by the NE, EQ, L-IMP and R-IMP fixed sub-processes.
• The proving and disproving in both models is performed by Otter and MACE respectively, and the fact that PC and DC spawned artefacts score higher than the others (except those from UD spawned sub-processes) guarantees that proof and disproof attempts will be carried out for every conjecture that is made.
• The redundancy checking of the linear model is simulated by the shows_redundancy predicate which occurs in the sub-processes spawned by most of the fixed sub-processes.
• The updating of the theory in light of a counterexample is taken care of by the UD sub-process and the sub-processes spawned by UCI and BCI, which check for the broadcasting of altered parent concepts and propose an updated child concept.

To further support our claim, we present a worked example. As previously mentioned, in the star algebras application described in [20], HR made the conjecture – and Otter proved – that idempotent elements are closed under multiplication. We demonstrate below how the combined reasoning system gained from configuring the global workspace framework as above could also discover this result. For simplicity, we will use only a specific instance of the match production rule and a specific instance of the FOGPR production rule, namely one which is able to check whether an element type is closed under multiplication. The match and FOGPR production rules are described in section 2.2. To further simplify matters, we will omit the L-IMP and R-IMP sub-processes, as they don't contribute to this worked example. The global workspace would therefore begin with six fixed sub-processes attached, namely: UCI[Match], UCI[Closure], NE, EQ, PC[Otter] and DC[MACE].

Suppose further that we specified a meta-heuristic which makes all sub-processes assign concept proposals a value of 1, conjecture proposals a value of 2, open explanation proposals a value of 0, and proved/disproved explanation proposals a value of 3. We will make one exception to this by specifying that the UD sub-process should assign a value of 4 to its concept proposals. We supply the configured system with just two background concept relationships, neither of which is supplied with examples:

C1 = concept([S] : star_algebra(S), {})
C2 = concept([S, x, y, z] : star_algebra(S) ∧ x, y, z ∈ S ∧ x ∗ y = z, {})

These relate to the concept of being a star algebra, and to the multiplication table of a star algebra, respectively. Note that we have used the predicate star_algebra(S) as a shorthand for the actual axiom which defines star algebras:

star_algebra(S) ↔ ∀ x, y, z ∈ S ((x ∗ y) ∗ z = y ∗ (z ∗ x))

The BG and UD sub-processes generated by the background handling mechanism would take the background knowledge and together attach four sub-processes, namely:

B1: this proposes the broadcast of C1.
B2: this proposes the broadcast of C2.
UD1: this checks for the broadcast of a new star algebra and updates the datatable of C1 accordingly.
UD2: this checks for the broadcast of a new star algebra and updates the datatable of C2 accordingly.

In appendix 1, we present the reasoning rounds undertaken by the configured system, in terms of the existing spawned sub-processes, the relationship which was broadcast in that round, the proposals for broadcasting which the sub-processes returned, the new sub-processes that were spawned, and the sub-processes which detached themselves from the workspace. In round 10, we see that the proof that idempotent elements are closed under multiplication is broadcast.

5 Non-Theorem Modification

We believe that combined reasoning systems have much potential for making AI systems more flexible in their usage. One obvious inflexibility is that the majority of AI systems are problem solvers, hence an intelligent task to automate may have to be shoe-horned into a problem specification.

Moreover, the problem has to be correctly specified in order for the AI system to work properly. For instance, many automated theorem provers work very well if they are given a theorem to prove, yet are quite useless if they are given a non-theorem. The situation may be helped by using a model generator to find a counterexample to the non-theorem, but this is often not particularly enlightening, as it serves only to validate the fact that the theorem is wrong, i.e., little can be learned from examination of the counterexample. A much more flexible system would take a non-theorem and suggest ways to modify it in order to make the modification provably true. We implemented such a system with the Theorem Modifier (TM) program, the algorithm for which is given in section 5.1 below. To test this approach, we gave TM 98 non-theorems which were taken from the TPTP library [48] directly or constructed from TPTP library theorems by introducing errors. As described in [19], TM produced meaningful modifications to 83% of the non-theorems, and in some cases made some interesting discoveries. For instance, TM took the following non-theorem⁴ about rings:

∀ w, x ((((w ∗ x) ∗ x) ∗ (w ∗ w)) = id)

where id is the additive identity element. Naturally, TM couldn't prove this non-theorem, but it did discover and prove the non-obvious fact that the theorem is true in rings where ∀ x (x ∗ x = x + x).

5.1 Linear-TM

The TM system combines the concept formation and conjecture making abilities of HR, theorem proving by Otter, and model generation by MACE. TM works in the following way (the pipeline is sketched in code at the end of this subsection):

1. TM is given a conjecture by the user, in Otter syntax, and of the form Axioms → Result. TM first attempts to prove this using Otter. If this is successful, TM reports this and the process terminates.
2. TM then tries to prove the negation of the conjecture using Otter, i.e., Axioms → ¬Result. If this is successful, then TM will not be able to modify the non-theorem, so it reports this result and terminates.
3. TM calls MACE to generate examples E_S which support the conjecture, and examples E_F which falsify it.
4. TM then uses HR to form a theory, starting with the Axioms that the user supplied.

⁴ This is problem number RNG031-6 in the TPTP library.


5. After the theory is formed, TM extracts into a set C_S any concepts for which the set of examples that satisfy the definition of the concept is a subset of E_S. We call these concepts specialisation concepts.
6. For every C in C_S, TM uses Otter to attempt to prove that Axioms + C → Result and that C → Result. If the second theorem is true, then the modification is highly likely to be trivially true, because C is just a restatement of Result. Hence, any case where the first theorem is proved but the second one is not is reported to the user as a modified theorem.

Note that TM could similarly extract a set of concepts where the examples which satisfy the definition of the concept are a subset of E_F. It could then attempt to prove conjectures of the form Axioms + ¬C → Result. However, this way of modifying the non-theorem is taken care of by the concept formation that HR performs in step 4, which includes the negating of concepts. Hence, if the examples satisfying the definition of a concept were a subset of E_F, then the negation of the concept would have examples which were a subset of E_S, and the modification would not be missed.

As an example of this process in action, given the above non-theorem about rings, MACE generated 7 supporting examples and 6 falsifying examples, and HR found a single specialisation concept which was true of only supporting examples. When TM used the specialisation in a modified theorem, Otter proved the result as given above.
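As flagged above, the control flow of linear-TM is a short pipeline; a minimal sketch (ours), in which prove, find_models and form_theory are stubs standing in for the calls to Otter, MACE and HR – their names and signatures are our own, not those of the actual systems:

def theorem_modifier(axioms, result, prove, find_models, form_theory):
    # Step 1: the conjecture may simply be a theorem.
    if prove(axioms, result):
        return ("theorem", result)
    # Step 2: if its negation is provable, it cannot be modified.
    if prove(axioms, ("not", result)):
        return ("unmodifiable", result)
    # Step 3: split MACE's examples into supporting and falsifying sets.
    supporting, falsifying = find_models(axioms, result)
    # Steps 4-6: form a theory with HR, keep the specialisation
    # concepts, and test each candidate modification with Otter.
    modifications = []
    for definition, examples in form_theory(axioms):
        if examples <= supporting:  # a specialisation concept
            if prove(axioms + [definition], result) and \
               not prove([definition], result):  # not trivially true
                modifications.append(definition)
    return ("modified", modifications)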

5.2 GW-TM

Our aim for the GW-TM configuration was to base it as closely as possible on the configuration for automated theory formation prescribed in section 4.2. In particular, it seemed natural to think of the modifications of the non-theorem as conjectures which have to be proved or disproved like any other conjecture. In order to keep the configuration as lightweight as possible, we don't cater for the initial check that the non-theorem is actually true, or the check that the negation of the non-theorem can be proved. Assuming therefore that the non-theorem is indeed false and fixable, to specify the configuration, we need to introduce two new relationships:

trivial_fix(L : DefinitionList, C : Keyword)
non_theorem(L : DefinitionList, C : Keyword)

The first of these is to be read as: the proved theorem relating the definitions in L in the way specified by C is trivially true.

If R = N, then detach, else propose N.

Fig. 6. The background sub-process attached for the GW-TM configuration

The second new relationship is to be read as: the list of definitions in L are conjectured to be related in the way specified by C, and this is the conjecture that the user wants to be modified. Note that only implication conjectures can be modified, so C must be implies. The non_theorem relationship is supplied by the user as background information, N, along with the relevant background concepts for the GW-ATF configuration. Taking N, the background sub-process generating mechanism will attach the sub-process in figure 6 to the workspace, which persistently proposes the non-theorem.

To perform the modifications of the given non-theorem, we add to the GW-ATF configuration a single fixed sub-process, called NF, which is given in figure 7. When initially attached to the workspace, NF creates an empty set, called datatables. Whenever a counterexample to a conjecture is broadcast as part of an explanation relationship, the counterexample is added to this set. If instead a non_theorem relationship is broadcast which states that definition A implies definition J, then – using the supporters function, which employs MACE – NF calculates the subset, Positives, of the datatables set which support the non-theorem, and attaches two new sub-processes:

• The first new sub-process waits for the broadcast of a concept where the set of examples satisfying the concept definition D is a subset of Positives. In this case, the new sub-process constructs the modified conjecture that definition D implies definition J. This is proposed for broadcast like any other conjecture. Note that the new sub-process also reacts to a new example being introduced as part of a broadcast explanation relationship by: (a) adding it to Positives if appropriate, and (b) detaching any sub-process which is proposing a modified conjecture if that conjecture is shown to be false by the new example – this check is performed by using the supporters function again.
• The second new sub-process waits for the broadcast of an explanation of the conjecture made by the first new sub-process. If such an explanation is broadcast, the check_trivial function uses Otter to determine whether the modification has made the conjecture trivially true, i.e., of the form A ∧ B → C where B alone implies C. If the explanation is trivial, a trivial_fix relationship is proposed for broadcast – to warn the user to probably ignore the previously broadcast modified theorem. (A sketch of these two checks in code follows figure 7.)

With this configuration, the non-theorem modification fits in seamlessly with the ATF process. In particular, it would be easy to enable any conjecture which neither Otter could prove nor MACE could disprove to be broadcast as a non-theorem, and theorem modification could become part of the standard automated theory formation routine.

NF (Non-theorem fix conjecturing)
Initialise: datatables = {}.
If R = explanation(_, _, _, _, T), then datatables = datatables ∪ [T],
else if R = non_theorem([A, J], implies), then let Positives = supporters([A, J], datatables), and attach:
  If R = explanation(_, _, _, _, T), then Positives = Positives ∪ supporters([A, J], [T]).
  If R = concept(D, T) and T ⊆ Positives, then let X = conjecture([D, J], implies), propose X, and attach these two sub-processes:
    If R = X or (R = explanation(_, _, _, _, T) and supporters([D, J], [T]) = {}), then detach, else propose X.
    If R = explanation([D, J], implies, proved, _, _), then:
      if check_trivial([A, D, J]) = true, then let X = trivial_fix([D, J], implies), propose X, and attach:
        If R = X, then detach, else propose X.
      then detach,
    else detach.

Fig. 7. Additional fixed sub-process for the GW-TM configuration
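To make the behaviour of NF concrete, the following is a minimal, self-contained sketch of the non-theorem fixing idea in Python. It is illustrative only: concept definitions are modelled as Python predicates over example objects, whereas the actual system evaluates support with MACE via the supporters function, and all names here (modify_non_theorem, the toy group records) are our own, not part of the original systems.

    # A sketch only: predicates stand in for concept definitions, and list
    # membership stands in for the MACE-backed supporters computation.

    def supporters(antecedent, consequent, examples):
        # Examples which support (do not falsify) "antecedent implies consequent".
        return [e for e in examples if not antecedent(e) or consequent(e)]

    def modify_non_theorem(antecedent, consequent, examples, concepts):
        # Positives: the examples supporting the broadcast non-theorem.
        positives = supporters(antecedent, consequent, examples)
        # Any concept D whose satisfying examples all lie within Positives
        # yields the modified conjecture "D implies consequent".
        for name, defn in concepts.items():
            satisfying = [e for e in examples if defn(e)]
            if satisfying and all(e in positives for e in satisfying):
                yield name

    # Toy usage: two "groups" tagged with the properties we care about.
    groups = [{"abelian": True, "self_inverse": True},
              {"abelian": False, "self_inverse": False}]
    is_group = lambda g: True                 # antecedent of the non-theorem
    is_abelian = lambda g: g["abelian"]       # consequent of the non-theorem
    concepts = {"self-inverse": lambda g: g["self_inverse"]}
    print(list(modify_non_theorem(is_group, is_abelian, groups, concepts)))
    # -> ['self-inverse'], i.e., propose "self-inverse implies Abelian"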

5.3 Worked Example

We present here an example of how the combined reasoning system obtained by configuring the framework as above could be used to modify a non-theorem. We start with the false conjecture that all groups are Abelian. A group is a finite algebra which is associative (∀ a b c ((a ∗ b) ∗ c = a ∗ (b ∗ c))), has an identity element (∃ e ∀ a (a ∗ e = e ∗ a = a)), and in which each element has an inverse (∀ a ∃ b (a ∗ b = b ∗ a = e)). The non-theorem states that ∀ a b (a ∗ b = b ∗ a), which is not true of all groups. We supply the configured system with the following background concept relationships:

• C1 = concept([G] : group(G), _)
• C2 = concept([G, x, y, z] : group(G) ∧ x, y, z ∈ G ∧ x ∗ y = z, _)
• C3 = concept([G, x, y] : group(G) ∧ x, y ∈ G ∧ x = y, _)

These refer, respectively, to the concept of being a group, the multiplication table of a group, and the concept of equality between elements.

Background sub-processes are configured for each of these. In addition, a background sub-process, BNT1, for the non-theorem is configured. This sub-process will propose, until broadcast, the relationship non_theorem([D1, D2], implies), where the definitions D1 and D2 define being a group and being Abelian respectively.

In the example, there are several additional definitions. Of these, D3, D4, D5 and D6 need not be elaborated upon: they merely serve to illustrate how the Positives set of the non-theorem evolves over time. The definition of D7 is [G] : group(G) ∧ ∀ x ∈ G (x ∗ x = e), which states that all members of the group are self-inverse. This is important because it can be shown that all self-inverse groups are Abelian, and the production of this concept in the worked example shows how the theorem modification operates.

In overview, the worked example shows the broadcast of the non-theorem in round 1. This spawns a single monitor sub-process, NF1, which updates the Positives set in rounds 6 and 10 in response to new group examples. When a specialisation concept is identified in round 15, NF1 begins the non-theorem modification process: it suggests a specialisation of the non-theorem and spawns NF12 to check that the new conjecture is non-trivial, should it be proved true. For clarity, only those sub-processes that relate directly to the non-theorem modification are shown. The reasoning rounds for this worked example are given in appendix 2.
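The reasoning rounds traced in the appendices all follow the same propose-select-broadcast-react cycle. As an aid to reading those traces, here is a minimal sketch of one such cycle in Python; it is a sketch under our own simplifying assumptions (sub-processes as objects with propose and react methods, the meta-heuristic reduced to a maximum over proposal scores), not the implementation.

    class Proposer:
        # Toy sub-process: proposes one relationship until it is broadcast.
        def __init__(self, relationship, score):
            self.relationship, self.score = relationship, score
        def propose(self):
            return [(self.relationship, self.score)]
        def react(self, broadcast):
            # (new sub-processes to attach, whether to detach)
            return [], broadcast == self.relationship

    def run_rounds(subprocesses):
        while subprocesses:
            proposals = [p for sp in subprocesses for p in sp.propose()]
            winner = max(proposals, key=lambda p: p[1])[0]  # meta-heuristic
            survivors = []
            for sp in subprocesses:                         # broadcast to all
                new, detach = sp.react(winner)
                if not detach:
                    survivors.append(sp)
                survivors.extend(new)
            subprocesses = survivors
            yield winner

    print(list(run_rounds([Proposer("C1", 1), Proposer("J1", 2)])))
    # -> ['J1', 'C1']: higher-scoring proposals are broadcast first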

6 Reformulation of Constraint Satisfaction Problems

We mentioned in section 5 that AI systems usually require the correct specification of the problem they are being asked to solve. Related to this is another inflexibility: users of AI systems must supply problem specifications which make the best use of the solver, or be prepared to experiment with various solver settings. Constraint solvers are particularly sensitive to the way in which constraint satisfaction problems (CSPs) are specified to them: one CSP model could take the solver hours to solve, while another model of the same problem could take just a few minutes. Given this sensitivity, as described in section 7, there has been much research into the automatic reformulation of CSPs. Our contribution has been to enable the automatic discovery of additional constraints which are implied by the constraints in the CSP, and hence can be added without loss of solutions. To do this, we treated a CSP as the seed for a theory formation session, extracted any proved theorems, translated them into implied constraints, and reformulated the CSP by adding them singly, then in pairs, triples, and so on. This approach was implemented in the ICARUS combined reasoning system, which is described in section 6.1 below.

To test ICARUS, we used finite algebras, and in particular some benchmark CSPs taken from the CSPLib library of constraint problems (www.csplib.org), namely QG-quasigroups. These are finite algebras which satisfy the Latin square property – that each element appears in every row and column of the multiplication table – in addition to other axioms. For instance, QG7-quasigroups are Latin squares with the additional axiom that ∀ a b ((b ∗ a) ∗ b = a ∗ (b ∗ a)). As described in [8], our experiments showed that ICARUS was able to reformulate the basic CSP model for 8 of 14 finite algebras in a way which helped the solver to perform more efficiently. Often, the speed-up was significant. For example, the system automatically reformulated the basic solver model for QG7-quasigroups, resulting in a constraint program that was solved 85% faster. In the case of QG7-quasigroups of size 9, this reduced the time to exhaust the search space from 27 to 4 hours. ICARUS took just 16 minutes to discover the reformulation, which was clearly time well spent.

6.1 Linear-ICARUS

The ICARUS system operates in a linear manner to combine the HR machine learning system, the CLPFD constraint solving package of Sicstus Prolog, and the Otter first order theorem prover. The starting point for the process is a CSP for generating all possible members of a particular problem family, parameterised by an integer, e.g., the size of a QG7-quasigroup. The background knowledge comprises only the axioms for that particular family, which are common to all members of the problem family. ICARUS works as follows:

1. The user supplies information about a problem class, including the core constraints. We call the formulation of the CSP which includes only the core constraints the basic model.

2. The constraint solver generates solutions to small problem instances.

3. The solutions are given, along with the core model, to the HR system, which generates empirically true conjectures relating concepts expressed in the core constraints.

4. Each conjecture is passed to Otter, and an attempt is made to show that the conjecture is implied by the core model.

5. Each proved conjecture is interpreted as an implied constraint and translated into the syntax of the constraint solver.



6. Each implied constraint is added to the basic model to produce a reformulated model. The small problem instances are then solved again using the reformulated model. Every implied constraint used in a reformulation which improves – in terms of efficiency – upon the basic model is recorded in a set E.

7. Every pair of constraints from E is added to the basic model and tested for an efficiency increase. Every triple of constraints is then tested, and so on, until a user-defined limit is reached, or no reformulation in a particular round improves upon the best model so far.

8. All reformulations leading to a speed-up for the small problem instances are presented to the user in decreasing order of the efficiency gained from the reformulation.
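As a sketch of steps 6 to 8 above, the following Python fragment shows the shape of the combination search, assuming a hypothetical solve_time(basic_model, constraints, size) helper that runs the solver and returns milliseconds; it is our own illustration, not ICARUS itself.

    from itertools import combinations

    def reformulate(basic_model, implied, sizes, solve_time, max_arity=3):
        def total(constraints):
            return sum(solve_time(basic_model, constraints, s) for s in sizes)
        baseline = total([])
        # Step 6: keep every single implied constraint that beats the basic model.
        E = [c for c in implied if total([c]) < baseline]
        results = []
        # Step 7: re-test singles, then pairs, triples, ... up to a limit.
        for k in range(1, max_arity + 1):
            for combo in combinations(E, k):
                t = total(list(combo))
                if t < baseline:
                    results.append((t, list(combo)))
        # Step 8: present reformulations in decreasing order of efficiency gain.
        return sorted(results, key=lambda r: r[0])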

6.2 GW-ICARUS

We can configure the global workspace framework to perform automated reformulation of constraint satisfaction problems, producing the same results as the ICARUS system described above. We build this configuration on top of the GW-ATF configuration detailed in section 4.2. Consequently, we only describe here the additional artefact types, broadcastable relationship templates and sub-processes that are necessary for the system to perform automated CSP reformulation. We assume that the problem family is a finite algebra, with a problem instance from the family being parameterised by an integer giving the fixed size of algebra to find. We also assume that the automated theory formation part of the configuration can start with just the axioms of the algebra and produce a theory containing multiple proved theorems. When broadcast, these theorems are used to reformulate a user-given CSP. This CSP is the basic model from which all others will be built, and solutions to the CSP are examples of the finite algebra under consideration.

Artefact types

In addition to the artefact types described in §4.2, we use the following artefact types:

• CSP: a string which can be interpreted as a constraint satisfaction problem by the CLPFD solver.
• Constraint: a string which can be interpreted as a constraint within a CSP by the CLPFD solver.

BMP (Background model proposing)
If R = M, then detach, else propose M.

Fig. 8. Additional background sub-process for the GW-ICARUS configuration

• ConstraintList: an ordered list of Constraints.
• Integer: for representing problem instance sizes and solver times.

Broadcastable relationship templates

In addition to the broadcastable relationship templates described in §4.2, we use the following relationship templates:

• model(B : CSP, I : ConstraintList, S : Integer)
This is to be read as: in this model of the CSP, B is the basic CSP specification, I is the list of additional constraints, to be given to the solver in the list order, and to test the model, the search spaces for size 1, size 2, . . ., size S will be exhausted.

• implied_constraint(N : Constraint)
This is to be read as: the constraint N has been generated by translating a proved theorem.

• solving_time(C : CSP, I : ConstraintList, S : Integer, T : Integer)
This is to be read as: solving CSP C with the additional constraints in the list I for problem instance size S takes T milliseconds.

Background sub-process generating mechanism

This configuration inherits the background sub-process generating mechanism from the GW-ATF configuration. In addition, suppose that the user supplies a basic CSP model, M, as the relationship model(B, [], S), where S is the largest size of the finite algebra which will be tested. Then a simple sub-process will be attached to the workspace which persistently proposes the model for broadcast, as portrayed in figure 8.

Fixed sub-processes

We use the three sub-processes given in figure 9 to take a theorem produced by the ATF sub-processes and use it to reformulate a CSP model. It is important to note that the order in which constraints are given to the solver can change the efficiency of the solver for the CSP. Hence, given a set of implied constraints, we must enable the system to test the efficiency of the solver for every CSP which comprises the basic model and any combination of implied constraints in any order. To achieve this, we introduce the CT, CR and CE sub-processes.

Given a user-supplied CSP model, the CT sub-process starts the reformulation process by reacting to any proved theorem which is produced by the ATF sub-processes. CT translates the theorem into an implied constraint, N, and proposes this for broadcast. At the same time, CT also attaches a sub-process, P, to the workspace, which waits for different CSP models to be broadcast in the future. In response to such a broadcast, P prefixes N to the implied constraints of the model and proposes the new model for broadcast. To complement CT, the CR sub-process takes any broadcast model and waits for a new implied constraint to be broadcast. In response, CR adds the constraint to the end of the implied constraints in its CSP model and proposes the new model for broadcast. Working in this manner, CT and CR guarantee that every combination of implied constraints in every order will be tested. This testing is performed by the CE sub-process. To keep to the maxim of making every reasoning task occupy a single sub-process, CE is designed to split the testing of the model into sub-processes, one for each size of finite algebra. Each of these sub-processes reports back the time taken by the solver to exhaust the search space using its CSP model.

To describe the three new fixed sub-processes, we use the following functions:

• translate(L, T): this takes a list of concept definitions, L, which are proved to be related by conjecture type T, and produces a constraint in the syntax of the Sicstus CLPFD solver.

• solver_time(C, I, S): this takes a CSP C and passes it to the CLPFD solver along with a list of implied constraints I (given in the list order). It then calls the solver to exhaust the search for solutions of size S, and outputs the time taken in milliseconds.

Meta-heuristics

Note that, with the exception of the CSP reformulation sub-process which waits for proved theorems, CT, there is no other interaction between the ATF process and the CSP reformulation process. Hence, it would be possible to have two global workspaces running in parallel, with the conjectures from the ATF workspace being proposed for broadcast on the CSP reformulation workspace. We do not consider that possibility here, but it does suggest a meta-heuristic which assigns higher values to relationships broadcast by the BMP, CT, CR and CE sub-processes than to the relationships broadcast by any of the ATF sub-processes. We would also suggest that, as before, the explanation relationships score higher than conjecture relationships, which themselves score higher than concept relationships. Also, the solving_time relationship should score higher than the implied_constraint and model relationships, so that the user is made aware of the results of testing as soon as that testing has been completed.

CT (CSP Translator)
If R = explanation(L, T, proved, _, _) then let X = implied_constraint(N), where N = translate(L, T), propose X, and attach these:
    If R = X, then detach, else propose X.
    If R = model(C, I, S), and N ∉ I, then let Y = model(C, [N] :: I, S), propose Y, and attach:
        If R = Y, then detach, else propose Y.

CR (CSP Reformulator)
If R = model(C, I, S) then attach:
    If R = implied_constraint(N) and N ∉ I, then let X = model(C, I :: [N], S), propose X, and attach:
        If R = X then detach, else propose X.

CE (CSP Evaluator)
If R = model(C, I, S) then for each Si ∈ {1, . . . , S} attach:
    Let X = solving_time(C, I, Si, T), where T = solver_time(C, I, Si), propose X, and attach:
        If R = X then detach, else propose X.

Fig. 9. Additional fixed sub-processes for the GW-ICARUS configuration. Note that X :: Y represents the list resulting from appending list Y to the end of list X.
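To see that CT and CR jointly cover every ordering, consider the following sketch (our own illustration, not part of the configuration): prefixing a new constraint to every broadcast model while also appending it to every broadcast model generates every permutation of every subset of constraints.

    def broadcast_constraint(models, c):
        # models: constraint lists broadcast so far (incl. the basic model []).
        fresh = []
        for m in models:
            if c not in m:
                fresh.append([c] + m)   # CT-style: prefix to the list
                fresh.append(m + [c])   # CR-style: append to the list
        # Collapse duplicates (prefix == append on the empty model).
        return [list(t) for t in {tuple(f) for f in fresh}]

    models = [[]]                       # the basic model, no implied constraints
    for c in ["N1", "N2"]:              # constraints arriving from CT broadcasts
        models += broadcast_constraint(models, c)
    print(sorted(map(tuple, models)))
    # -> (), ('N1',), ('N1','N2'), ('N2',), ('N2','N1')
    # cf. the models M1-M5 in the worked example of section 6.3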

Because the CE sub-process splits the testing of the CSP models, the user has to check all the broadcast solving_time relationships in order to find the CSP model which takes the least time on average over all the sizes. Alternatively, the user may use different criteria for deciding which reformulation is best: for example, they may be interested in those which perform best on the largest problem sizes, or those which retain their efficiency-increasing ability as the problem size increases.

We could also use the solving times to build reformulations of the given CSP more intelligently. In particular, the sub-processes could be employed in such a way that only models which improve upon the model from which they are built are proposed for broadcast.

process were broadcast for a model with implied constraints [I1 , . . . , Ik ] but this didn’t improve upon the model with implied constraints [I1 , . . . , Ik−1 ], then any sub-process involving implied constraint Ik could be killed. This might be problematic, however, because it is possible in theory to find a pair of implied constraints which, taken together, produce a very efficient model, but neither of which taken alone improve upon the basic model. Hence, metaheuristics that take solving times into account would have to be carefully controlled to avoid this situation as much as possible.

6.3 Worked Example

We demonstrate here how the combined reasoning system we obtain by configuring the framework as above could reproduce results equivalent to those obtained by the ICARUS system in [8]. In this example, we consider the problem of finding better reformulations of an initial constraint program for solving problems in the QG3-quasigroup family of finite algebras. In particular, we show how the system would process two implied theorems, namely: ∀ a b c (a ∗ a = c ∧ b ∗ b = c → a = b), which states the uniqueness of elements on the diagonal of the Latin square, and ∀ a b c (a ∗ b = c ∧ b ∗ a = c → a = b), which states an inequality of elements across the diagonal. In [8], we showed that adding these constraints as a pair causes an increase in efficiency when solving QG3-quasigroups. For conciseness, we show only the reaction to the broadcast of the basic model, and the system at work after the above theorems have been broadcast. The background information for the automated theory formation sub-processes is as follows:

• C1 = concept([Q] : qg3(Q), QS = [q1, q2, . . ., qn])
• C2 = concept([Q, x, y, z] : qg3(Q) ∧ x, y, z ∈ Q ∧ x ∗ y = z, DT1)
• C3 = concept([Q, x, y] : qg3(Q) ∧ x, y ∈ Q ∧ x = y, DT2)

Concept C1 represents the concept of being a QG3-quasigroup, and concept C2 refers to its multiplication table. The set QS is the set of labels for an initial set of positive examples of QG3-quasigroups. The values in the multiplication tables for these initial quasigroups are given in the datatable DT1, which is a set of tuples. Concept C3 is the concept of two elements being equal, with associated datatable DT2. The background information for the constraint reformulation process is the following relationship:

• M1 = model(CSP, [], 3)

This relationship describes the basic solver model, to be run over problem instance sizes 1 through 3. In a model relationship, the second argument is a list of implied constraints; as this is the basic solver model, there are no implied constraints and so this is the empty list. This relationship is continually proposed by BMP1 until broadcast. For additional clarity, we have only shown the broadcast of the solving_time relationships for one of the sizes, and have not referred to any of the ATF sub-processes. Note also that all solving times are purely illustrative. In appendix 3, we present the reasoning rounds which would take place in this worked example. For brevity, we use the following concept definitions:

• D1 = [Q, x, y] : qg3(Q) ∧ x, y ∈ Q ∧ x = y
• D2 = [Q, x, y, z] : qg3(Q) ∧ x, y, z ∈ Q ∧ x ∗ x = z ∧ y ∗ y = z
• D3 = [Q, x, y, z] : qg3(Q) ∧ x, y, z ∈ Q ∧ x ∗ y = z ∧ y ∗ x = z

We see that, in the session presented in appendix 3, two theorems are discovered, and both are translated and used as implied constraints in the most successful reformulation.
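To illustrate what the translate function has to achieve here, the following sketch expresses the two proved theorems as checks over a multiplication table t (a size-n list of lists). This is our own illustration: ICARUS actually emits constraints in Sicstus CLPFD syntax rather than Python.

    def diagonal_distinct(t):
        # forall a, b, c: a*a = c and b*b = c  ->  a = b
        diag = [t[a][a] for a in range(len(t))]
        return len(set(diag)) == len(diag)

    def off_diagonal_unequal(t):
        # forall a, b, c: a*b = c and b*a = c  ->  a = b
        n = len(t)
        return all(t[a][b] != t[b][a]
                   for a in range(n) for b in range(n) if a != b)

On our reading of the theorems (not the exact output of translate), the first corresponds in CLPFD terms to an all-distinct constraint over the diagonal variables, and the second to pairwise disequalities across the diagonal.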

7 Related Work

7.1 Additional Combined Reasoning Projects

The combined reasoning applications presented above are complemented by a number of other projects we have undertaken in this area. In particular, in [17], we addressed the problem of automatically generating isomorphic classification theorems for finite algebras. Such a qualitative approach was in contrast to the more quantitative ways in which AI systems are usually used in pure mathematics. In order to push the boundaries of the algebra sizes that the system could work with, we experimented with 16 different reasoning systems, including theorem provers, model generators, computer algebra systems, SAT-solvers, and descriptive and predictive machine learning systems. In [47], we addressed the question of producing isotopic classification theorems, which further pushed the boundaries of our system and required the implementation of some bespoke reasoning procedures.

In [12], we combined HR, Otter and the Maple computer algebra system [50] into the HOMER system, which worked in number theory. Here, Otter was used differently from normal: any conjecture that Otter could prove was discarded, as it must be provable from first principles (i.e., the definitions of the number theory concepts), and hence was unlikely to be of interest to the user, given that first order theorem proving is not particularly effective in number theory.

In order to weed out provable theorems most effectively, HOMER built up an axiom set consisting of every conjecture Otter ever proved. In this way, it was able to use the axioms to prove – and hence prune – more uninteresting conjectures. Any remaining conjectures were, on average, far more likely to be worthy of study, and HOMER enabled us to find some interesting number theory conjectures which we proved. For instance, HOMER discovered (and we proved) that if the sum of divisors of an integer N is a prime number, then the number of divisors of N will also be a prime number, which is not obvious. We similarly used Maple in an application to graph theory [41], where the aim was to discover conjectures in graph theory similar to those of the Graffiti program [22]. We were successful in this, and we have also proved some of these conjectures automatically using the AutoGraphiX program [6].

Our approach to modifying non-theorems arose as a spin-off of the PhD project of Alison Pease [43], which was to implement a multi-agent system to simulate the classroom-style dialogue proposed in the philosophy of mathematics put forward by Lakatos [29]. Lakatos suggests certain techniques for the modification of non-theorems in the light of successive counterexamples, using Euler's theorem as a case study. These techniques informed the routines we implemented in the TM system. In a similar project [28], we looked at the question of case-splitting theorems, so that the time taken to prove the set of cases was less than the time taken to prove the theorem as a whole. Such an approach has much value because theorem provers tend to be subject to a Peter Principle point: after a certain level of difficulty, the time taken to prove a theorem sharply becomes prohibitively high [25]. Hence, if case-splitting a theorem can bring each case to the efficient side of the Peter Principle point, as we showed in this project, the time taken to suggest and prove all the case splits can be an improvement on the time taken to prove the whole theorem. Our approach combined the HR and Otter systems with a bespoke system which performed hill-climbing to discover better ways to use HR's concepts to propose case-split trees.

7.2 Reformulating Constraint Satisfaction Problems

Our approach to reformulating constraint satisfaction problems is distinct from, and complementary to, existing automated constraint reformulation methods. For example, the Cgrass system [24] captures common patterns in the hand-transformation of constraint satisfaction problems as transformation rules. Given an input model, Cgrass applies the rules in a forward chaining manner to improve the model gradually. A key difference from the approach presented here is that Cgrass is applicable to single problem instances only, rather than problem classes.

The O'CASEY system [30,31] employs case-based reasoning to capture constraint modelling expertise. Each case records a problem/model pair. Given a new problem, the most similar problem is retrieved from the case base, and a model for the new problem is then constructed by following the modelling steps used in the retrieved case. Bessiere et al. present a method for learning the parameters of certain types of implied constraint by considering solutions and non-solutions of relaxations of the original problem [5]. The conacq system [3,4] uses a version space approach to learn a CSP model from scratch, given only positive and negative example solutions to the CSP.

7.3 Relation to McCarthy’s Research

In [33], John McCarthy proposes the development of, and outlines the design for, a computer program that automatically checks mathematical proofs, the "creative parts" of which are supplied by the human, leaving the computer to fill in the more mechanical portions. But already implicit in his earlier 1959 paper on the ADVICE TAKER [32] is the idea that even the creative parts of finding a mathematical proof might be carried out by computer. In the course of inspiring us with the vision of a computer program that has "common sense" because it "automatically deduces for itself a sufficiently wide class of immediate consequences of anything it is told and what it already knows", he points out that no program of the time was capable of discovering for itself such an abstract phenomenon as "the principle of opposition" in checkers. To be capable of discovering such abstract principles, McCarthy asserts, a system needs to be able to reason with declarative representations.

The work described in this paper can be considered as directly descending from these early thoughts of McCarthy. At the time McCarthy wrote "Programs with Common Sense", it was science fiction to think of describing the properties of a game such as checkers in mathematical logic and submitting the description to a computer that might not only prove things about checkers but also discover new concepts about checkers for itself. Today, systems such as HR have gone some way to turning McCarthy's speculations into reality. But McCarthy's 1950s dreams are still a long way from being fully realised, and in the context of automated theorem proving, the need to creatively define and apply new concepts is at the crux. McCarthy, for example, invites us to consider the so-called mutilated checker board problem. The challenge is to prove that it is impossible to exactly cover a checker board with dominoes if two diagonally opposite corners of the board have been removed. In 1964, McCarthy formalised this problem and proposed it as a benchmark for automated theorem provers [34]. Although automated theorem provers can now solve this problem, none do so by inventing for themselves the idea of colouring the squares of the board black and white, and then discovering the lemma that any set of dominoes must cover an equal number of black and white squares.

This, as McCarthy points out, is a paradigm example of mathematical creativity [35,37]. One of the key difficulties in replicating this sort of creativity in a machine is the problem of deciding which concept is most relevant to apply next in order to advance an automated theorem prover towards a solution. This is especially true when the most relevant concept is not part of the statement of the problem and is yet to be discovered. Many cognitive scientists today would cite this as an instance of what they call the "frame problem". Of course, the frame problem, in its original guise, was also one of McCarthy's discoveries [38]. The frame problem as McCarthy first conceived it, namely the challenge of describing the effects of actions in logic without having explicitly to describe their non-effects, has largely been solved [45]. But the frame problem in the wider sense of delimiting relevance lives on. And it is just this sense of the frame problem that is addressed by the use of the global workspace architecture, as described in this paper [46].

The global workspace architecture adopted here is drawn from contemporary thinking in the scientific study of consciousness, but its pedigree goes back to the blackboard architectures of the 1980s, which are in turn descended from Selfridge's pandemonium architecture. Interestingly, at the same conference at which McCarthy presented "Programs with Common Sense", he also anticipated Baars's global workspace theory. Responding to Selfridge's paper, he writes:

"I would like to speak briefly about some of the advantages of the pandemonium model as an actual model of conscious behaviour. In observing a brain, one should make a distinction between that aspect of the behaviour which is available consciously, and those behaviours, no doubt equally important, but which proceed unconsciously. If one conceives of the brain as a pandemonium – a collection of demons – perhaps what is going on within the demons can be regarded as the unconscious part of thought, and what the demons are publicly shouting for each other to hear, as the conscious part of thought."

McCarthy's more recent speculations on consciousness take a different tack, but are equally pertinent. In [36], he proposes a form of mental situation calculus that would allow a computer to observe and reason about its own mental states. Such an ability might be exactly what is required to enable an automated theorem prover based on the proposed architecture to reflexively refine its own reasoning, with the process most relevant to the present broadcast state being decided by meta-heuristics. And this takes us right back to one of the most powerful, yet still largely unrealised, ideas of the ADVICE TAKER program, namely the potential of being able to express heuristics for reasoning in the same formal language as the object-level theory being reasoned about.

8 Conclusions and Future Work

By presenting numerous projects where combined reasoning systems have been used fruitfully, we hope to have demonstrated that there is much potential for the combination of AI systems to become more than the sum of their parts. In particular, we have shown that combined reasoning systems can perform intelligent tasks more efficiently than stand-alone systems; they can be more flexible in their application; and they can undertake complex tasks that individual systems are not designed for. Moreover, by developing and configuring a global workspace framework for the combination of reasoning systems, we hope to have demonstrated that it is possible to combine systems in a generic rather than ad-hoc fashion. The three configurations we have presented show that the framework is flexible enough to enable systems that undertake a variety of tasks, yet simple enough to enable easy extension of a configuration to a new task.

There are various benefits to basing the framework on a global workspace architecture, including the following:

• There is a serial thread of computation which is revealed in the sequence of broadcast relationships, and yet the state-to-state function defining this serial thread is the product of massive parallelism.

• The sub-processes are largely independent of one another. In particular, they do not need to communicate with each other, and they have no need to understand how the other sub-processes work.

These two points will be important when we implement and test the framework, which is the next step for our research. In particular, we will be able to use a highly distributed approach to performing tasks, yet the reasoning process will be reported in a simple linear fashion. Also, it has been our experience that getting one reasoning system to communicate with another is quite problematic, and that introducing a new system to an existing combination is a very time-consuming process. This difficulty will be greatly reduced by using the framework, and we envisage being able to quickly configure the framework for new tasks by choosing reasoning sub-processes from a library.

There are a number of open questions which this research raises, not least of which is whether a simple meta-heuristic driven by the sub-processes assessing their own proposals will be able to efficiently guide the reasoning process. It will be interesting to experiment with meta-heuristics which are less rigid than the ones we have proposed here. We are keenly interested in the question of creativity in AI systems, and we hope to show that the combined reasoning systems we develop can surprise us with truly creative solutions to tough-nut problems like those proposed by McCarthy.

References

[1] B Baars. A Cognitive Theory of Consciousness. Cambridge University Press, 1988.
[2] B Baars. The conscious access hypothesis: Origins and recent evidence. Trends in Cognitive Science, 6(1):47–52, 2002.
[3] C Bessiere, R Coletta, E C Freuder, and B O'Sullivan. Leveraging the learning power of examples in automatic constraint acquisition. In Proceedings of the 10th International Conference on Principles and Practice of Constraint Programming, pages 123–137, 2004.
[4] C Bessiere, R Coletta, F Koriche, and B O'Sullivan. A SAT-based version space algorithm for acquiring constraint satisfaction problems. In Proceedings of ECML'05, pages 23–34, 2005.
[5] C Bessiere, R Coletta, and T Petit. Acquiring parameters of implied global constraints. In Proceedings of the 11th International Conference on Principles and Practice of Constraint Programming, pages 747–751. Springer, 2005.
[6] G Caporossi and P Hansen. Finding relations in polynomial time. In Proceedings of the 16th International Joint Conference on Artificial Intelligence, 1999.
[7] M Carlsson, G Ottosson, and B Carlson. An open-ended finite domain constraint solver. In Proceedings of Programming Languages: Implementations, Logics, and Programs, 1997.
[8] J Charnley, S Colton, and I Miguel. Automatic generation of implied constraints. In Proceedings of the 17th European Conference on AI, 2006.
[9] S Colton. Refactorable numbers – a machine invention. Journal of Integer Sequences, 2, 1999.
[10] S Colton. Automated Theory Formation in Pure Mathematics. Springer-Verlag, 2002.
[11] S Colton. The HR program for theorem generation. In Proceedings of the Eighteenth Conference on Automated Deduction, 2002.
[12] S Colton. Automated conjecture making in number theory using HR, Otter and Maple. Journal of Symbolic Computation, 2004.
[13] S Colton, A Bundy, and T Walsh. HR: Automatic concept formation in pure mathematics. In Proceedings of the 16th International Joint Conference on Artificial Intelligence, 1999.
[14] S Colton, A Bundy, and T Walsh. Automatic identification of mathematical concepts. In Machine Learning: Proceedings of the 17th International Conference, 2000.


[15] S Colton, A Bundy, and T Walsh. Automatic invention of integer sequences. In Proceedings of the Seventeenth National Conference on Artificial Intelligence, 2000.
[16] S Colton, A Bundy, and T Walsh. On the notion of interestingness in automated mathematical discovery. International Journal of Human Computer Studies, 53(3):351–375, 2000.
[17] S Colton, A Meier, V Sorge, and R McCasland. Automatic generation of classification theorems for finite algebras. In Proceedings of the International Joint Conference on Automated Reasoning, pages 400–414, 2004.
[18] S Colton and S Muggleton. Mathematical applications of Inductive Logic Programming. Machine Learning, 64:25–64, 2006.
[19] S Colton and A Pease. The TM system for repairing non-theorems. In Selected papers from the IJCAR'04 disproving workshop, Electronic Notes in Theoretical Computer Science, volume 125(3). Elsevier, 2005.
[20] S Colton, P Torres, P Cairns, and V Sorge. Managing automatically formed mathematical theories. In Proceedings of the Mathematical Knowledge Management Conference, 2006.
[21] M Davis and H Putnam. A computing procedure for quantification theory. Journal of the ACM, 7(3):201–215, 1960.
[22] S Fajtlowicz. On conjectures of Graffiti. Discrete Mathematics, 72:113–118, 1988.
[23] S Franklin and A Graesser. A software agent model of consciousness. Consciousness and Cognition, 8:285–301, 1999.
[24] A Frisch, I Miguel, and T Walsh. Cgrass: A system for transforming constraint satisfaction problems. In B. O'Sullivan, editor, Proceedings of the Joint Workshop of the ERCIM Working Group on Constraints and the CologNet area on Constraint and Logic Programming on Constraint Solving and Constraint Logic Programming (LNAI 2627), pages 15–30, 2002.
[25] G Sutcliffe, M Fuchs, and C Suttner. Progress in automated theorem proving, 1997–2001. In Workshop on Empirical Methods in Artificial Intelligence, 17th International Joint Conference on Artificial Intelligence, 2001.
[26] The GAP Group. GAP Reference Manual. School of Mathematical and Computational Sciences, University of St. Andrews, 2000.
[27] G Hardy and E Wright. An Introduction to the Theory of Numbers. Oxford University Press, 1938.
[28] F Hoermann. Machine learning case splits for theorem proving. Master's thesis, Department of Computing, Imperial College, London, 2005.
[29] I Lakatos. Proofs and Refutations: The logic of mathematical discovery. Cambridge University Press, 1976.


[30] J Little, C Gebruers, D Bridge, and E Freuder. Capturing constraint programming experience: A case-based approach. In A. Frisch, editor, International Workshop on Reformulating Constraint Satisfaction Problems, Workshop Programme of the Eighth International Conference on Principles and Practice of Constraint Programming, 2002.
[31] J Little, C Gebruers, D Bridge, and E Freuder. Using case-based reasoning to write constraint programs. In F. Rossi, editor, Principles and Practice of Constraint Programming. Springer, 2003.
[32] J McCarthy. Programs with common sense. In Proceedings of the Teddington Conference on the Mechanization of Thought Processes, 1959.
[33] J McCarthy. Computer programs for checking mathematical proofs. Recursive Function Theory, Proceedings of the American Mathematical Society Symposia on Pure Mathematics, 5, 1962.
[34] J McCarthy. A tough nut for proof procedures. Technical Report AI memo no. 16, Stanford, 1964.
[35] J McCarthy. Ascribing mental qualities to machines. In Philosophical Perspectives in Artificial Intelligence. Humanities Press, 1979.
[36] J McCarthy. Making robots conscious of their mental states. In Proceedings of Machine Intelligence 15, 1995.
[37] J McCarthy. Creative solutions to problems. In Proceedings of the AISB'99 Symposium on AI and Scientific Creativity, 1999.
[38] J McCarthy and P Hayes. Some philosophical problems from the standpoint of artificial intelligence. In Proceedings of Machine Intelligence 4, 1969.
[39] W McCune. The OTTER user's guide. Technical Report ANL/90/9, Argonne National Laboratories, 1990.
[40] W McCune. A Davis-Putnam program and its application to finite first-order model search. Technical Report ANL/MCS-TM-194, Argonne National Laboratories, 1994.
[41] N Mohamadali. A rational reconstruction of the Graffiti program. Master's thesis, Department of Computing, Imperial College, London, 2003.
[42] A Newell. Some problems of basic organization in problem-solving systems. In Second conference on self-organizing systems, pages 393–342, 1962.
[43] A Pease and S Colton. Modelling Lakatos's philosophy of mathematics. In Proceedings of the Second European Conference on Computing and Philosophy, 2004.
[44] J Robinson. A machine-oriented logic based on the resolution principle. Journal of the ACM, 12(1):23–41, 1965.
[45] M Shanahan. Solving the Frame Problem: A Mathematical Investigation of the Common Sense Law of Inertia. MIT Press, 1997.


[46] M Shanahan and B Baars. Applying global workspace theory to the frame problem. Cognition, 98(2):157–176, 2005.
[47] V Sorge, A Meier, R McCasland, and S Colton. The automatic construction of isotopy invariants. In Proceedings of the International Joint Conference on Automated Reasoning, 2006.
[48] G Sutcliffe and C Suttner. The TPTP problem library: CNF release v1.2.1. Journal of Automated Reasoning, 21(2):177–203, 1998.
[49] P Torres and S Colton. Applying model generation to concept formation. In Proceedings of the Automated Reasoning Workshop, 2006.
[50] Waterloo Maple. Maple Manual at http://www.maplesoft.on.ca.

Appendix 1. Reasoning Rounds for the GW-ATF example

Fixed sub-processes: UCI[Match], UCI[Closure], NE, EQ, PC[Otter], DC[MACE].

Round 1
Spawned sub-processes: B1, B2, UD1, UD2
Broadcast relationship: A dummy relationship
Proposals: ⟨C1 = concept([S] : star_algebra(S), {}), 1⟩, ⟨C2 = concept([S, x, y, z] : star_algebra(S) ∧ x, y, z ∈ S ∧ x ∗ y = z, {}), 1⟩
New sub-processes: None
Detached sub-processes: None

Round 2
Spawned sub-processes: B1, B2, UD1, UD2
Broadcast relationship: C1 (chosen randomly over C2)
Proposals: ⟨C2, 1⟩, ⟨J1 = conjecture([C1], non-exists), 2⟩
New sub-processes: NE1: propose J1 (spawned by NE), EQ1: future equivalence check against C1 (spawned by EQ)
Detached sub-processes: B1 (due to broadcast)

Round 3
Spawned sub-processes: B2, UD1, UD2, NE1, EQ1
Broadcast relationship: J1
Proposals: ⟨C2, 1⟩, ⟨E1 = explanation([C1], non-exists, disproved, _, {[star0]}), 3⟩, ⟨E2 = explanation([C1], non-exists, open, _, _), 0⟩
New sub-processes: EX1: propose E1 (spawned by DC[MACE]), EX2: propose E2 (spawned by PC[Otter])
Detached sub-processes: NE1 (due to broadcast)

Round 4
Spawned sub-processes: B2, UD1, UD2, EQ1, EX1
Broadcast relationship: E1
Proposals: ⟨C2, 1⟩, ⟨C3 = concept([S] : star_algebra(S), {[star0]}), 4⟩, ⟨C4 = concept([S, x, y, z] : star_algebra(S) ∧ x, y, z ∈ S ∧ x ∗ y = z, {[star0, 0, 0, 0]}), 4⟩
New sub-processes: B3: propose C3 (spawned by UD1), B4: propose C4 (spawned by UD2)
Detached sub-processes: EX1 (due to broadcast), EX2 (due to broadcast of E1)

Round 5
Spawned sub-processes: UD1, UD2, EQ1, B3, B4
Broadcast relationship: C3 (chosen randomly over C4)
Proposals: ⟨C4, 4⟩
New sub-processes: EQ2: future equivalence check against C3 (spawned by EQ), UD3: check for future updates of the C3 datatable
Detached sub-processes: EQ1 (redundant, due to re-broadcast of C1's definition), B3 (due to broadcast)


Round 6
Spawned sub-processes: UD1, UD2, B4, EQ2, UD3
Broadcast relationship: C4
Proposals: ⟨C5 = concept([S, x] : star_algebra(S) ∧ x ∈ S ∧ x ∗ x = x, {[star0, 0]}), 1⟩
New sub-processes: EQ3: future equivalence check against C4 (spawned by EQ), UD4: check for future updates of the C4 datatable, UCI1: propose C5 (spawned by UCI[Match])
Detached sub-processes: B4 (due to broadcast)

Round 7
Spawned sub-processes: UD1, UD2, EQ2, UD3, EQ3, UD4, UCI1
Broadcast relationship: C5
Proposals: ⟨C6 = concept([S] : star_algebra(S) ∧ ∀ x, y ∈ S (x ∗ x = x ∧ y ∗ y = y → (x ∗ y) ∗ (x ∗ y) = x ∗ y), {[star0]}), 1⟩
New sub-processes: EQ4: future equivalence check against C5 (spawned by EQ), UD5: check for future updates of the C5 datatable, UCI2: propose C6 (spawned by UCI[Closure])
Detached sub-processes: UCI1 (due to broadcast)

Round 8
Spawned sub-processes: UD1, UD2, EQ2, UD3, EQ3, UD4, EQ4, UD5, UCI2
Broadcast relationship: C6
Proposals: ⟨J2 = conjecture([D3, D6], equivalent), 2⟩ [note that D3 and D6 are the definitions of concepts C3 and C6]
New sub-processes: EQ5: future equivalence check against C6 (spawned by EQ), UD6: check for future updates of the C6 datatable, EQJ1: propose J2
Detached sub-processes: UCI2 (due to broadcast)

Round 9
Spawned sub-processes: UD1, UD2, EQ2, UD3, EQ3, UD4, EQ4, UD5, EQ5, UD6, EQJ1
Broadcast relationship: J2
Proposals: ⟨E3 = explanation([D3, D6], equivalent, proved, otter_proof, _), 3⟩, ⟨E4 = explanation([D3, D6], equivalent, open, _, _), 0⟩
New sub-processes: EX3: propose E3 (spawned by PC[Otter]), EX4: propose E4 (spawned by DC[MACE])
Detached sub-processes: EQJ1 (due to broadcast)

Round 10
Spawned sub-processes: UD1, UD2, EQ2, UD3, EQ3, UD4, EQ4, UD5, EQ5, UD6, EX3, EX4
Broadcast relationship: E3
Proposals: None
New sub-processes: None
Detached sub-processes: EX3 (due to broadcast), EQ5 (due to redundancy of C6), UD6 (due to redundancy of C6), EX4 (due to broadcast of E3)

Appendix 2. Reasoning Rounds for the GW-TM example

Fixed sub-processes: UCI[Match], UCI[Closure], NE, EQ, PC[Otter], DC[MACE], NF.

Round 1
Spawned sub-processes: BNT1
Broadcast relationship: NT1
Proposals: None
New sub-processes: NF1: future explanation checks to update Positives, and future concept checks for non-theorem modification
Detached sub-processes: BNT1 (due to broadcast)

...

Round 6
Spawned sub-processes: NF1
Broadcast relationship: EX1 = explanation([D3, D4], implies, disproved, _, T1)
Proposals: None
New sub-processes: None. NF1 updates Positives = Positives ∪ supporters([D1, D2], [T1])
Detached sub-processes: the proposer of EX1

...

Round 10
Spawned sub-processes: NF1
Broadcast relationship: EX2 = explanation([D5, D6], implies, disproved, _, T2)
Proposals: None
New sub-processes: None. NF1 updates Positives = Positives ∪ supporters([D1, D2], [T2])
Detached sub-processes: the proposer of EX2


...

Round 15
Spawned sub-processes: NF1
Broadcast relationship: C9 = concept(D7, T3)
Proposals: ⟨J1, 2⟩
New sub-processes: NF11: since T3 ⊂ Positives, propose ⟨J1 = conjecture([D7, D2], implies), 2⟩ until broadcast or until a counterexample is identified; NF12: future proofs of the conjecture trigger a triviality check
Detached sub-processes: the proposer of C9

Round 16
Spawned sub-processes: NF1, NF11, NF12
Broadcast relationship: J1
Proposals: ⟨EX3, 3⟩
New sub-processes: PC1: propose ⟨EX3 = explanation([D7, D2], implies, proved, proof1, _), 3⟩
Detached sub-processes: NF11 (due to broadcast)

Round 17
Spawned sub-processes: NF1, NF12, PC1
Broadcast relationship: EX3
Proposals: None
New sub-processes: None. NF12 finds check_trivial([D1, D7, D2]) = false, so no trivial_fix is required
Detached sub-processes: PC1 (due to broadcast), NF12 (following the check_trivial test)

Appendix 3. Reasoning Rounds for the GW-ICARUS example

Fixed sub-processes: UCI[Match], UCI[Closure], NE, EQ, PC[Otter], DC[MACE], CR, CT, CE.

Round 1
Spawned sub-processes: BMP1
Broadcast relationship: M1
Proposals: ⟨ST1.1, 7⟩, ⟨ST1.2, 7⟩, ⟨ST1.3, 7⟩
New sub-processes: CR1: future implied constraint check for []; CE11: propose ⟨ST1.1 = solving_time(CSP, [], 1, 20), 7⟩; CE12: propose ⟨ST1.2 = solving_time(CSP, [], 2, 130), 7⟩; CE13: propose ⟨ST1.3 = solving_time(CSP, [], 3, 1100), 7⟩
Detached sub-processes: BMP1 (due to broadcast)

Round 2
Spawned sub-processes: CR1, CE11, CE12, CE13
Broadcast relationship: ST1.1
Proposals: ⟨ST1.2, 7⟩, ⟨ST1.3, 7⟩
New sub-processes: None
Detached sub-processes: CE11 (due to broadcast)

...

Round 17
Spawned sub-processes: CR1
Broadcast relationship: EX1 = explanation([D2, D1], implies, proved, proof1, _)
Proposals: ⟨CS1, 5⟩
New sub-processes: CT11: propose ⟨CS1 = implied_constraint(N1), 5⟩, where N1 = translate([D2, D1], implies); CT12: check future models to introduce N1
Detached sub-processes: the sub-process that proposed EX1

Round 18
Spawned sub-processes: CR1, CT11, CT12
Broadcast relationship: CS1
Proposals: ⟨M2, 6⟩
New sub-processes: CR11: propose ⟨M2 = model(CSP, [N1], 3), 6⟩
Detached sub-processes: CT11 (due to broadcast)


Round 19
Spawned sub-processes: CR1, CT12, CR11
Broadcast relationship: M2
Proposals: ⟨ST2.1, 7⟩, ⟨ST2.2, 7⟩, ⟨ST2.3, 7⟩
New sub-processes: CR2: future implied constraint check for [N1]; CE21: propose ⟨ST2.1 = solving_time(CSP, [N1], 1, 15), 7⟩; CE22: propose ⟨ST2.2 = solving_time(CSP, [N1], 2, 100), 7⟩; CE23: propose ⟨ST2.3 = solving_time(CSP, [N1], 3, 910), 7⟩
Detached sub-processes: CR11 (due to broadcast)

Round 20
Spawned sub-processes: CR1, CT12, CR2, CE21, CE22, CE23
Broadcast relationship: ST2.1
Proposals: ⟨ST2.2, 7⟩, ⟨ST2.3, 7⟩
New sub-processes: None
Detached sub-processes: CE21 (due to broadcast)

...

Round 30
Spawned sub-processes: CR1, CT12, CR2
Broadcast relationship: EX2 = explanation([D3, D1], implies, proved, proof2, _)
Proposals: ⟨CS2, 5⟩
New sub-processes: CT21: propose ⟨CS2 = implied_constraint(N2), 5⟩, where N2 = translate([D3, D1], implies); CT22: check future models to introduce N2
Detached sub-processes: the sub-process that proposed EX2

Round 31
Spawned sub-processes: CR1, CT12, CR2, CT21, CT22
Broadcast relationship: CS2
Proposals: ⟨M3, 6⟩, ⟨M4, 6⟩
New sub-processes: CR12: propose ⟨M3 = model(CSP, [N2], 3), 6⟩; CR21: propose ⟨M4 = model(CSP, [N1, N2], 3), 6⟩
Detached sub-processes: CT21 (due to broadcast)

Round 32
Spawned sub-processes: CR1, CT12, CR2, CT22, CR12, CR21
Broadcast relationship: M3
Proposals: ⟨ST3.1, 7⟩, ⟨ST3.2, 7⟩, ⟨ST3.3, 7⟩, ⟨M4, 6⟩, ⟨M5, 6⟩
New sub-processes: CR3: future implied constraint check for [N2]; CT121: propose ⟨M5 = model(CSP, [N2, N1], 3), 6⟩ (N2 ∈ [N2], so no action by CT22); CE31: propose ⟨ST3.1 = solving_time(CSP, [N2], 1, 17), 7⟩; CE32: propose ⟨ST3.2 = solving_time(CSP, [N2], 2, 110), 7⟩; CE33: propose ⟨ST3.3 = solving_time(CSP, [N2], 3, 980), 7⟩
Detached sub-processes: CR12 (due to broadcast)

Round 33
Spawned sub-processes: CR1, CT12, CR2, CT22, CR3, CR21, CT121, CE31, CE32, CE33
Broadcast relationship: ST3.1
Proposals: ⟨ST3.2, 7⟩, ⟨ST3.3, 7⟩, ⟨M4, 6⟩, ⟨M5, 6⟩
New sub-processes: None
Detached sub-processes: CE31 (due to broadcast)

...

Round 36
Spawned sub-processes: CR1, CT12, CR2, CT22, CR3, CR21, CT121
Broadcast relationship: M4
Proposals: ⟨ST4.1, 7⟩, ⟨ST4.2, 7⟩, ⟨ST4.3, 7⟩, ⟨M5, 6⟩
New sub-processes: CR4: future implied constraint check for [N1, N2] (N1 ∈ [N1, N2], so no action by CT12; N2 ∈ [N1, N2], so no action by CT22); CE41: propose ⟨ST4.1 = solving_time(CSP, [N1, N2], 1, 13), 7⟩; CE42: propose ⟨ST4.2 = solving_time(CSP, [N1, N2], 2, 90), 7⟩; CE43: propose ⟨ST4.3 = solving_time(CSP, [N1, N2], 3, 850), 7⟩
Detached sub-processes: CR21 (due to broadcast)

Round 37
Spawned sub-processes: CR1, CT12, CR2, CT22, CR3, CR4, CT121, CE41, CE42, CE43
Broadcast relationship: ST4.1
Proposals: ⟨ST4.2, 7⟩, ⟨ST4.3, 7⟩, ⟨M5, 6⟩
New sub-processes: None
Detached sub-processes: CE41 (due to broadcast)

...


Round 40
Spawned sub-processes: CR1, CT12, CR2, CT22, CR3, CR4, CT121
Broadcast relationship: M5
Proposals: ⟨ST5.1, 7⟩, ⟨ST5.2, 7⟩, ⟨ST5.3, 7⟩
New sub-processes: CR5: future implied constraint check for [N2, N1] (N1 ∈ [N2, N1], so no action by CT12; N2 ∈ [N2, N1], so no action by CT22); CE51: propose ⟨ST5.1 = solving_time(CSP, [N2, N1], 1, 10), 7⟩; CE52: propose ⟨ST5.2 = solving_time(CSP, [N2, N1], 2, 80), 7⟩; CE53: propose ⟨ST5.3 = solving_time(CSP, [N2, N1], 3, 800), 7⟩
Detached sub-processes: CT121 (due to broadcast)

Round 41
Spawned sub-processes: CR1, CT12, CR2, CT22, CR3, CR4, CR5, CE51, CE52, CE53
Broadcast relationship: ST5.1
Proposals: ⟨ST5.2, 7⟩, ⟨ST5.3, 7⟩
New sub-processes: None
Detached sub-processes: CE51 (due to broadcast)

