Furse E., and Nicolson R.I. (1995), The Contextual Memory System: An architecture for Declarative Learning, First European Cognitive Science Conference, INRIA Press.

The Contextual Memory System: an architecture for declarative learning

Edmund Furse
Department of Computer Studies
University of Glamorgan
Pontypridd, Mid Glamorgan CF37 1DL, UK
[email protected]

Roderick I. Nicolson
Department of Psychology
University of Sheffield
Sheffield S10 2TN, UK
[email protected]

Abstract
Declarative learning, the ability to learn facts and concepts by experience and to adapt the declarative memory structures appropriately, forms the basis of human competencies. We argue that a viable architecture for declarative learning must be able to cope with novel information, and must demonstrate four core competencies: knowledge comprehension, assimilation, utilisation and accommodation. The Contextual Memory System (CMS) is put forward as a model of declarative learning which satisfies these criteria, and is illustrated in the domain of pure mathematics. No other cognitive architecture satisfies these criteria. A declarative learning capability provides an important constraint upon cognitive science modelling.

1. Introduction: Declarative Learning and Cognitive Science

Probably the first 'cognitive' analysis of declarative knowledge acquisition was made by Piaget (e.g. 1952). He made the crucial observation that knowledge is organised loosely into schemas, and that new knowledge might either be incorporated naturally into existing schemas ('assimilation') or might require the reorganisation and extension of existing schemas ('accommodation'). Ironically, despite the profound influence of Piaget's stage analysis of development, these deeper insights were ignored by his contemporaries, and it was over 20 years before they surfaced in the cognitive literature in Rumelhart and Norman's (1978) analysis of learning (where assimilation and accommodation are labelled accretion and restructuring, respectively).

The first effort at computational simulation of knowledge acquisition was Quillian's (1969) Teachable Language Comprehender (TLC), in which he acknowledged explicitly the need to build up semantic knowledge by being told facts such as 'a canary sings'; in the same paper, however, he admitted his failure to achieve this objective. Consequently, TLC's influence on cognitive science was in terms of knowledge representation rather than declarative learning. Interestingly, the issue of modelling semantic memory appears to have been abandoned: the handbook Foundations of Cognitive Science (Posner, 1989) devotes only three lines out of 888 pages to semantic and declarative memory.

Recently, however, learning has been re-assigned its central role in higher cognition. The reason is stated clearly by VanLehn (1989, p529): "Expert behaviour, whether generated by people or programs, is a product of their knowledge, so any explanation of that behaviour must rest on postulating a certain base of knowledge. But what explains that knowledge? ... the ultimate explanation for the form and content of the human experts' knowledge is the learning processes they went through in obtaining it. Thus the best theory of expert problem solving is a theory of learning. Indeed, learning theories may be the only scientifically adequate theories of expert problem solving." Lenat and Feigenbaum (1991) present as one of the guiding principles of AI the 'Knowledge Principle': "If a program is to perform a complex task well, it must know a great deal about the world in which it operates. In the absence of knowledge, all you have left is search and reasoning, and that isn't enough."

It is little surprise, therefore, that learning mechanisms lie at the heart of the three major architectures for cognition. The two major symbolic architectures are ACT*/PUPS (Anderson, 1983; Anderson & Thompson, 1988) and Soar (Laird, Newell & Rosenbloom, 1986). Both are expressed in terms of production systems, though Anderson makes an explicit distinction between declarative and procedural knowledge, whereas Soar adopts a uniform production rule representation. By contrast, much recent research has been phrased at a sub-symbolic level in terms of neural networks (Rumelhart et al., 1986). ACT*, Soar and connectionist approaches all assign a central role to learning in the acquisition of their knowledge. ACT*/PUPS suggests that declarative knowledge must be acquired initially; this declarative knowledge is then 'proceduralised' by a 'knowledge compilation' process consequent upon successful performance, turning it into production rule format. The production rules may subsequently be tuned by extended practice.


Soar learns primarily by a process of problem-space search, with learning taking place through the automatic processes of 'subgoaling' following a failure, and of 'chunking'. Though space precludes a discussion here (see Nicolson and Furse, 1992, for a detailed analysis), not one of the above cognitive architectures is able to cope with declarative learning. ACT*/PUPS is specialised to learn procedures once the declarative information has been acquired, but has no mechanism for the initial declarative learning. Therefore, whenever Anderson wishes to apply his theory of learning to a new domain, he first has to create the requisite naive declarative knowledge, which may then be proceduralised and tuned. Soar requires a pre-defined problem space to function at all, and is therefore particularly ill-suited to learning arbitrary declarative knowledge. Connectionist models are specialised for incremental learning and thus appear to have too fine a 'grain size' for learning declarative facts. Furthermore, common connectionist techniques such as gradient descent typically require thousands of trials before they can 'learn' routine discriminations, whereas much declarative learning takes place in one or two trials.

One might expect that the problem of declarative learning would have been extensively studied in the machine learning literature, but, bafflingly, this is not so. Space again precludes a detailed analysis of the machine learning literature here (see Ellman, 1989, for a useful review), and, again following Nicolson and Furse (1992), we suggest four major critiques of the machine learning research:
(1) Most of the models lack psychological plausibility.
(2) The declarative knowledge being acquired is relatively unstructured and semantically arid.
(3) The learning tasks are too simple to be ecologically valid.
(4) All the models operate within a closed world in which the knowledge representations are prespecified.
VanLehn (1989, p553) has made the point clearly: "No computer program yet exists that can start off as a novice in some knowledge-rich domain and slowly acquire the knowledge needed for expert performance. Thus we lack even a computationally explicit account of the novice-expert transition, let alone one that compares well with performance of human learners." In summary, declarative, symbolic learning is of major theoretical and applied importance, yet no established cognitive architecture offers any mechanism for it, and no current machine learning approach offers a principled and psychologically plausible account of the processes involved.

2. Requirements for declarative learning

In this section we develop an initial specification for any testable, psychologically plausible architecture for declarative learning.

Processes for Declarative Learning
(i) Comprehension (cf. encoding). The system must be able to encode novel information so as to cause it to enter working memory. Following Quillian (1969), we use the term comprehension to emphasise the need to rewrite the input into the format used in the declarative memory structures.
(ii) Assimilation (cf. storage). The system must be able to create a long-term memory entry for information in its working memory. Furthermore, the system must be able to incorporate the new information into its memory structures in such a way as to facilitate the adaptive use of that information in other contexts. It is worth noting the need to store not only the new information but also the context in which it occurs; Tulving (1983) refers to this as the 'cognitive environment'.
(iii) Utilisation (cf. retrieval). The system must be able to retrieve the stored information given an appropriate cue or an appropriate context. For many theorists the term retrieval suggests too automatic a process, and terms such as reconstruction (Bartlett, 1932) or ecphory (Tulving, 1983) better capture the complex search and matching processes involved. We adopt the neutral term utilisation to indicate the adaptive use of existing knowledge to satisfy the person's requirements.
(iv) Accommodation. Regardless of how broadly one interprets the above three processes, we believe that they are incomplete for a viable declarative learner, and it is the failure to consider the need for adaptation of declarative memory structures that resulted in the eventual aridity for declarative learning of the approaches described above. The system must be able to modify the information in the light of subsequent experience so as to improve its adaptivity of use. This includes both the strengthening of salient links and the forgetting of useless ones. As Baddeley and Patterson (1971) noted, forgetting is at least as useful a skill as remembering!


The above processes of declarative learning are uncontroversial, reflecting what is probably the current consensus view of cognitive psychologists. We argue that they constitute four core competencies for any declarative learner. One key distinction which is not made explicit by the above analysis is that between the processing of new and of old material. It is sufficiently important to be worth highlighting.

Coping with novel information
It is possible to devise many architectures which can show comprehension, assimilation, utilisation and accommodation when presented with familiar material, though even so none of the established approaches seems able to achieve this naturally. Much more important for the learner is the ability to show the four competencies for unfamiliar, novel material which cannot easily be assimilated into existing structures but requires restructuring of the system. Even comprehending unfamiliar material is difficult, in that it involves first recognising that one's existing knowledge is not sufficient, and then identifying new features that specify the new object. Assimilating a novel object involves registering its label and its identifying features, and patching it into the existing memory structures. Utilisation presupposes that some retrieval route can be set up, presumably involving its label (if stored) and some subset of its features. Accommodation over time will result in the tuning of the memory structures so as to represent the salient features of the object and its salient links to other memory objects. In learning something novel it is essential not to represent the new knowledge entirely within one's existing representations, otherwise one has a closed system (see Furse, 1993a). However, whilst an important step in itself, specifying a set of desiderata for declarative learning is of limited utility if one has no method for testing a system, and we argue that the development of a complex but objective test-bed for declarative learners would be a major contribution in its own right. In the following section we offer the domain of Pure Mathematics as a plausible contender.
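Viewed computationally, the four competencies amount to an interface that any declarative learner must implement. The sketch below is a minimal illustration of that reading; it is our own gloss, not code from any existing architecture, and all names are illustrative.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional

class DeclarativeLearner(ABC):
    """The four core competencies, read as an abstract interface."""

    @abstractmethod
    def comprehend(self, raw_input: Any) -> Any:
        """Encode novel input into the internal representation (cf. encoding)."""

    @abstractmethod
    def assimilate(self, encoded: Any, context: Any) -> None:
        """Store the encoded input together with its context
        (Tulving's 'cognitive environment'), linking it into memory."""

    @abstractmethod
    def utilise(self, cue: Any, context: Any) -> Optional[Any]:
        """Adaptively retrieve stored knowledge from a cue or a context."""

    @abstractmethod
    def accommodate(self, item: Any, outcome: Any) -> None:
        """Revise memory in the light of experience: strengthen salient
        links and let useless ones atrophy (forgetting included)."""
```

Crucially, on the argument above, each method must also work for novel material: comprehend must cope with input that existing features do not describe, and assimilate must be free to extend the representation rather than merely file the input under it.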

3. Pure Mathematics as a Proving Domain for Declarative Learning

One major reason for the failure of existing architectures to accommodate declarative learning is that, unlike procedural learning, it is determined as much by the existing contents of semantic memory as by the new facts to be acquired (cf. Piaget, 1952). Given our inability to specify the contents of semantic memory either before or after a learning experience, it has proved frustratingly difficult to devise an environment allowing empirical tests which might inform the development of models of declarative learning. The major problem in identifying a domain for the investigation of 'pure' declarative learning, unsullied by the application of prior knowledge, is that humans attempt wherever possible to adapt their existing huge knowledge base to each new problem. Anderson and his co-workers (e.g. Anderson et al, 1989) have addressed the prior-knowledge problem by selecting domains such as LISP programming, text editing, geometry, calculus and logic, which involve artificial tasks and artificial languages, thus hoping to minimise intrusions from established knowledge. In a related approach, Furse (1993b, 1994) has investigated pure mathematics, and in particular group theory. Unlike LISP, which has a rich procedural semantics but relatively slight declarative semantics, and unlike text editing and geometry, which require rich but unspecifiable prior knowledge in the form of an understanding of spatial layout, pure mathematics provides a domain with both a rich declarative semantics and a rich procedural semantics, yet one which requires little prior knowledge.

Because mathematics provides such rich declarative and procedural semantics, skilled mathematical performance involves the interplay of a range of knowledge and skills: declarative knowledge, both conceptual (definitions and the like) and structural (the links between concepts), and procedural knowledge (how to access relevant concepts, how to check proofs line by line and, finally, how to generate proofs unaided). A further advantage of pure mathematics as a domain is that, because of its non-intuitive nature and its long and stable history, a consistent and effective pedagogical strategy has evolved, with textbooks written in a careful, cumulative style that introduces one concept or theorem at a time - an ideal situation for learning by person or machine. In short, a good mathematical textbook is engineered to provide the 'felicity conditions' (VanLehn, 1989) necessary for human learning of mathematics.


Pure mathematics, therefore, represents a valuable proving domain for theories of learning - tractable and self-contained yet semantically rich. Furthermore, pure mathematics textbooks provide a structured learning environment together with objective, built-in empirical tests of understanding (via proof checking) and of problem solving. A critical issue, however, is whether pure mathematics learning is representative of everyday declarative learning. We argue that the same processes are involved, the major difference being that the learner has no opportunity to use his/her vast existing knowledge base. Consequently the domain provides a complex yet tractable, artificial but complete setting for the investigation of pure declarative learning, one which is not only vast but also open. Pure mathematics therefore offers an opportunity to provide an objective test of a model's capability in the four key competencies of declarative learning, namely knowledge comprehension, assimilation, utilisation and accommodation. It also offers the further opportunity to test generative, procedural learning, via the need to solve new problems in addition to checking proofs.

4. The Contextual Memory System: an architecture for declarative learning

The CMS introduces the notion of dynamic features, which are built by the system from the environment. These features model the processes of attention and perception that form the front end of the memory model, and they index the resulting items in the CMS. The essential notion is that an intelligent system, in making sense of its environment, uses a large number of features to represent the input item. The CMS has no built-in features, but has built-in feature-creating mechanisms which generate a large number of features from an input item. However, many of these features will be useless in understanding the meaning of the input; it is only through subsequent experience in other contexts that it is possible to determine which features are salient. The CMS therefore encodes input material in terms of both old and new features, and with subsequent experience new features are added to distinguish between similar items. This network architecture is similar to connectionist models and to ACT*, but with the crucial enhancement that the nodes and links are created dynamically, from scratch.

Architecture
The CMS is based on a network architecture of features and items. In Figure 1 an item is represented by a square and features by circles. The crucial difference between this architecture and other cognitive architectures such as ACT*, Soar and neural networks is that the configuration is completely dynamic. The CMS starts as a tabula rasa, with no items and no features. When a new item is remembered, new features are created dynamically from the object presented. Subsequent items will be stored with a combination of new and old features. This is illustrated in Figure 1, with the new features occupying the positions immediately surrounding an item: the right-hand picture shows a new item being added to the CMS with five new features and three old features that it shares with an older item. The actual configuration of the network is entirely dependent upon the history of memory processes, and can become very complex once a few hundred items have been learned.

Figure 1. The CMS learning a new item. The squares represent items and the circles features. The new item is represented in terms of five new features and three old ones.
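A minimal sketch of this dynamic network may make the picture concrete. The code below is our illustration, not the original implementation (which the paper does not give); the field names and the use of Python dictionaries are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """An item or a feature; energy decays with time and is boosted by use."""
    name: str
    energy: float = 1.0

@dataclass
class CMS:
    """A tabula rasa network: no items, features or links are predefined."""
    items: dict = field(default_factory=dict)      # item name -> Node
    features: dict = field(default_factory=dict)   # feature name -> Node
    links: dict = field(default_factory=dict)      # (feature, item) -> link energy

    def remember(self, item_name: str, feature_names: list) -> None:
        """Store a new item, creating dynamically any features not yet present."""
        self.items.setdefault(item_name, Node(item_name))
        for f in feature_names:
            self.features.setdefault(f, Node(f))
            self.links.setdefault((f, item_name), 1.0)

    def decay(self, rate: float = 0.99) -> None:
        """Energies of items and features decay with time."""
        for node in list(self.items.values()) + list(self.features.values()):
            node.energy *= rate
```

On this reading, the situation of Figure 1 is simply a remember() call whose feature list mixes three names already present in the network with five previously unseen ones.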


The items and features have energy values which decay with time. The links between features and items also have a strength, or energy, similar in principle to ACT*. The current state of attention can be thought of as the set of features with the highest energy, which partly determines what features of an input object will be observed. Furthermore, the items with the highest energy are those most likely to be recalled at a given time. Recall increases the energy of an item, and the features are reorganised.

Comprehension
The comprehension, assimilation and accommodation processes in the CMS are based closely on an analysis of the corresponding processes in people. In common with most symbolic and connectionist architectures, the CMS assumes that on presentation of an object it is encoded into a large number of features. By contrast with other approaches, however, the features are affected not only by the object but also by the observer. The likely focus of the observer's attention will be affected by his/her previous experience, in that s/he will already have acquired certain features which are important and therefore likely to be used in making sense of the object. Of course the observer's perceptual system will also provide masses of features of the new object, many of which are completely novel. The initial registration, which we termed comprehension in our earlier analysis, may be thought of as the process of identifying the set of features in the input.

Assimilation
In storing a new fact, some balance needs to be achieved between the use of old and new features. If the object can be uniquely distinguished from similar objects in memory using one's existing features, then it is probably not necessary to encode many new features. But some of the new features may prove significant in future classifications, and so it may be dangerous to represent the object entirely with the old features. Conversely, an object may be confusable with a number of different existing items in memory if only the old features are used, and this places greater importance on the new features. The key problem is that at the time of encoding the observer does not know which of the new features are salient. A range of different new features therefore needs to be encoded, and the objective of the subsequent accommodative processes, with further experience, is to tune these features so that the salient ones are strengthened and the non-salient ones atrophy. In representing the presented object in terms of the old features, some mechanism is needed to ensure that features relevant in the current context are used in preference to features which have atrophied. This is achieved by giving all features an energy value; the current attentional state can then be modelled by the features with the highest energy. When a new object is stored in the CMS, the old features used are the ones with the highest energy. Nevertheless, some features which have high energy because they are useful at finding other items may not be so useful at finding this object. Only experience will show which features are actually useful.

Utilisation
Interestingly, the CMS exploits the opportunity to accommodate the memory structures whenever a stimulus is presented: the process of retrieving an object already stored as an item in memory, when it is encountered in a new context, also refines the features.
Upon presentation of a probe (an object to recognise), the usual comprehension process, including feature analysis, takes place. Although new features may be created at this stage, they are of no value in retrieving the item from memory, since new features cannot index existing items. If the object is a familiar one, the features should be adequate to distinguish it rapidly and uniquely, using the processes described below. More commonly the features used for recall will retrieve a number of different items, and it is necessary to learn to distinguish them. Ultimately, extensive feature analysis of the object and the found items will determine which is the desired item; alternatively, other forms of pattern matching may be used. At this stage it is necessary to learn, so that the item can be found more easily in future. The features used to retrieve the item from memory are activated in order of decreasing energy. When a feature is activated, it in turn activates all the items to which it points, and increases the transient energy of each of these items by an amount determined by the energy of the feature, the energy of the link and the energy of the item. If an item's transient energy exceeds a certain threshold, the item is 'activated' and tested to see whether it corresponds to the desired item. This testing is currently implemented by pattern matching, but could be done by more detailed feature analysis. If the item does not match, it is added to the list of 'failures' - items which are similar to the desired item; otherwise the item has been found, and the CMS is accommodated to ensure that the item can be found more easily in future, using the processes described below.


Accommodation: Learning by Experience
The CMS learns by experience through the process of adjusting the features used in the search, so that future access to the item is easier. Three processes are used: storing 'uncomputed' features, adjusting the energy of existing features, and creating new features. Not all the features found in the probe may already be stored as indexing the recalled item: the context of recall may be very different from the context in which the item was first stored, or from other contexts in which it was recalled, so at the time of recall it may not be known whether or not the item has a given probe feature. If these previously 'uncomputed' features are found to be positive for the item, they are given explicit new links to the item. Features are deemed useful in the search if they index the found item but not the failed items; these features have their energies increased, as does the energy of the link from the feature to the item. Conversely, features are considered unhelpful if they index the failed items but not the desired item, and they have their energies decreased. The final process of learning from the experience of recall is to create new features dynamically to distinguish the found item from the failures. In the CMS this is currently done only when no existing features succeed in achieving this discrimination. New features can be created by two different approaches. In the first, the found item and a failure are analysed to discover differences, from which a feature can be derived. In the second, both the found item and a failure are broken up into a large number of features, and a feature present in the found item's set but not in the failure's is chosen. The CMS currently uses the latter approach.
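The retrieval and accommodation cycle just described can be sketched as follows, building on the network structure from the earlier sketch. This is again our reconstruction: the threshold and the sizes of the energy adjustments are assumptions (the paper gives no values), matches() stands for the pattern-matching test, and the storing of 'uncomputed' features and the creation of new features are omitted for brevity.

```python
def recall(cms, probe_features, matches, threshold=1.0):
    """Activate probe features in order of decreasing energy; items whose
    transient energy crosses the threshold are tested against the probe."""
    transient = {}        # item name -> accumulated transient energy
    failures = []         # activated items that failed the match test
    known = [f for f in probe_features if f in cms.features]
    for f in sorted(known, key=lambda f: cms.features[f].energy, reverse=True):
        for (feat, item), link_energy in list(cms.links.items()):
            if feat != f or item in failures:
                continue
            # Contribution depends on feature, link and item energies.
            transient[item] = transient.get(item, 0.0) + (
                cms.features[f].energy * link_energy * cms.items[item].energy)
            if transient[item] > threshold:
                if matches(item):
                    accommodate(cms, item, failures, probe_features)
                    return item
                failures.append(item)
    return None

def accommodate(cms, found, failures, probe_features, step=0.1):
    """Strengthen features that discriminate the found item from the
    failures; weaken features that index only the failures."""
    for f in probe_features:
        if f not in cms.features:
            continue
        hits_found = (f, found) in cms.links
        hits_failure = any((f, x) in cms.links for x in failures)
        if hits_found and not hits_failure:
            cms.features[f].energy += step
            cms.links[(f, found)] += step      # the link energy rises too
        elif hits_failure and not hits_found:
            cms.features[f].energy -= step
```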

5. MU, the Mathematics Understander

The CMS is intended as a general architecture for declarative learning, with applicability in a range of areas such as game playing, computer programming, learning about architectural design, and so on. The above overview has necessarily been abstract, and in order to illustrate it more concretely we present examples of the CMS in action in the mathematics domain. MU is a large computer program which models students' reading of mathematics texts. The nature of pure maths makes it possible to devise a 'formal expression language' (FEL; Furse, 1990) which captures the essential semantics of the domain, thus allowing maths texts to be translated into an artificial, unambiguous language, as shown in Figure 2. The FEL version of a maths textbook forms the input to MU, and MU reads in the theorems, proofs and problems line by line, with the CMS building up a complex form of knowledge representation through the experience of reading the definitions, checking the proofs and attempting to solve the problems. MU learns from scratch (except that it must first read an FEL file of logic rules for group theory, and of arithmetic rules for classical analysis), and it has succeeded in reading and explaining several chapters of both group theory and classical analysis texts. Because of MU's extremely rich knowledge, all of which is learned, MU demonstrates much more complex mathematical understanding than nearly all programs derived from the theorem-proving tradition of AI - a paradigm case of Lenat and Feigenbaum's Knowledge Principle.

The crucial expertise to capture is the ability to find the appropriate mathematical result when checking a proof or solving a problem. In a complex subject such as classical analysis there are hundreds of possible results, but the expert mathematician has little difficulty finding the right one. In some sense the mathematician can "see" the appropriate result, and it is this aspect of mathematical expertise which the CMS models. The expertise is modelled by the use of features; moreover, not only is the skill itself modelled, but also how it is acquired. Within the CMS, mathematical results are stored as items, with features used to index them. New results are stored when a definition or theorem statement is read, and they are recalled in the process of checking proofs and solving problems. Since the CMS is adjusted during recall, the mathematical knowledge is continually refined and reorganised during proof checking and problem solving, so that finding the appropriate mathematical result becomes easier with experience.
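Schematically, the reading cycle couples an FEL text to the CMS. The sketch below is our reconstruction of that loop, not MU's actual code; parse_fel(), label(), make_features() and justifies() are hypothetical helpers standing in for MU's parser, its naming of results, its feature generation and its proof-step checking.

```python
def read_fel_text(cms, fel_lines):
    """MU's reading cycle, schematically: definitions and theorem statements
    become new CMS items; proof steps drive recall, and the recall itself
    accommodates (refines) the memory as a side effect."""
    for line in fel_lines:
        kind, expr = parse_fel(line)                  # hypothetical FEL parser
        if kind in ("definition", "theorem"):
            cms.remember(label(expr), make_features(expr))
        elif kind == "proof-step":
            # Search memory for a result justifying this step; justifies()
            # returns the pattern-matching predicate used by recall().
            recall(cms, make_features(expr), matches=justifies(expr))
```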


Features in MU
The features used in representing a mathematical result can be generated by a number of different methods. The fundamental notion currently used in the CMS is that a feature captures the input at a varying degree of specificity, together with the location of the focus of attention. Formally, a feature name is defined by:

<feature-name> ::= <position> <relation> <type> <form>
<position> ::= <side> <position> | null
<side> ::= LHS- | RHS-
<relation> ::= IS- | HAS-
<type> ::= FORM- | TERM-

For example the proposition

x ∈ G => x⁻¹ ∈ G

can be parsed as

(=> (member x (cap g)) (member (inv x) (cap g)))

and has features including:

HAS-FORM-[MEMBER_[INV_A]_B]
LHS-IS-FORM-[MEMBER_A_B]
LHS-RHS-HAS-FORM-[INV_A]
IS-FORM-[=>_[MEMBER_A_[CAP_B]]_[MEMBER_[INV_A]_[CAP_B]]]

A form is a canonical representation of a proposition, using the letters a, b, c, ... in place of the variables used in the proposition; thus x ∈ G is replaced by a ∈ b. A form can be as specialised or as general as required: for example, the feature IS-FORM-[=>_[MEMBER_A_[CAP_B]]_[MEMBER_[INV_A]_[CAP_B]]] is as specialised as possible, being identical to the original, whilst HAS-FORM-[=>_A_B] is the most general. Position is indicated by a sequence of LHS (left-hand side) and RHS (right-hand side) markers. For example, RHS-IS-FORM-[MEMBER_[INV_A]_[CAP_B]] indicates that the right-hand side of the proposition is the form a⁻¹ ∈ b, whilst, focusing attention on an even smaller part, LHS-RHS-IS-FORM-[INV_A] indicates that the left-hand side of the right-hand side is a⁻¹. HAS indicates that the form occurs somewhere within the focus of attention.

Understanding Mathematics
In checking a proof, MU needs to find the one or more inferences used in deriving one step from the previous step. Similarly, in solving problems it is necessary to choose an appropriate inference to apply to a step to reach a desired goal. This search is essentially a retrieval from the CMS as described in the previous section, and since learning takes place as part of the memory processes, the result becomes easier to find in future situations. For example, consider the following two steps from a proof:


(1/2)b + (1/2)a < sₙ
=> (1/2)(b + a) < sₙ

Here a bottom-up comparison of the two steps indicates that the connecting inference is Lemma 2.1.1: (1/2)b + (1/2)a = (1/2)(b + a). This provides the probe with which to search the CMS for this result. Features are used in order of decreasing energy, including:

HAS-FORM-[=_A_B]
HAS-FORM-[+_A_B]
LHS-HAS-FORM-[*_A_B]

and these find the items:

a + (ab) = a(1 + b)
(ab) + b = (a + 1)b

but these fail to verify the step. Once the more specific feature LHS-HAS-FORM-[+_A_B] is used, this finds

(ab) + (ac) = a(b + c)

which succeeds. Learning now takes place from this experience to distinguish the found item from the failures. In particular, the following features are found to distinguish the item from the failures, and so have their energy increased or are created from scratch:

HAS-FORM-[+_[*_A_B]_C]
HAS-FORM-[*_A_[+_B_C]]

Definition 2.4 of converges
«sₙ» converges to α iff ∀ε > 0 ∃m ∀n > m |sₙ - α| < ε

not («(-1)ⁿ» converges to 1)
Proof
1  RTP not («(-1)ⁿ» converges to 1)
2  Suppose to the contrary «(-1)ⁿ» converges to 1
3  ⇒ ∀ε > 0 ∃m ∀n > m |(-1)ⁿ - 1| < ε   by definition of converges
4  ⇒ ∃m ∀n > m |(-1)ⁿ - 1| < 1   by substituting ε = 1
5  let n = 2m + 1
6  ⇒ n is odd
7  ⇒ (-1)ⁿ = -1
8  ⇒ |(-1)ⁿ - 1| = |-2|   since a = b ⇒ |a - 1| = |b - 1|
9     = 2
10 ⇒ 2 < 1   by combining steps 4 and 9
11 ⇒ contradiction
12 QED

Figure 2. A textbook extract in FEL: Definition 2.4 of convergence, and a proof that «(-1)ⁿ» does not converge to 1.
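The feature vocabulary is mechanical enough to sketch. The generator below is our illustration, not MU's code: it produces only IS-FORM features (with position markers prepended innermost-first, as in the paper's notation) and a fully specialised HAS-FORM feature per subexpression, omitting the intermediate levels of generality. Propositions are assumed to be parsed into nested tuples as in the FEL example above.

```python
import string

def form(expr, mapping=None):
    """Canonical form: symbols renamed to A, B, C, ... in order of first
    occurrence, rendered in the paper's bracket notation."""
    if mapping is None:
        mapping = {}
    if isinstance(expr, str):
        if expr not in mapping:
            mapping[expr] = string.ascii_uppercase[len(mapping)]
        return mapping[expr]
    op, *args = expr
    return "[" + "_".join([op.upper()] + [form(a, mapping) for a in args]) + "]"

def features(expr, position=""):
    """IS-FORM feature for each focus of attention, with LHS-/RHS- markers
    prepended innermost-first, plus a HAS-FORM feature per subexpression."""
    if isinstance(expr, str):
        return []
    out = [position + "IS-FORM-" + form(expr), "HAS-FORM-" + form(expr)]
    _, *args = expr
    for side, arg in zip(("LHS-", "RHS-"), args):   # first two arguments only
        out += features(arg, side + position)
    return out

prop = ("=>", ("member", "x", ("cap", "g")),
              ("member", ("inv", "x"), ("cap", "g")))
for f in features(prop):
    print(f)       # e.g. RHS-IS-FORM-[MEMBER_[INV_A]_[CAP_B]]
```

Run on the proposition x ∈ G => x⁻¹ ∈ G, this produces, among others, IS-FORM-[=>_[MEMBER_A_[CAP_B]]_[MEMBER_[INV_A]_[CAP_B]]] and LHS-RHS-IS-FORM-[INV_A], matching the examples given above.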

6. Conclusions

We have argued that declarative learning is a crucial component of human learning. An analysis of the requirements for a declarative learner suggests that, at the very least, any viable architecture must be able to support the four processes of declarative learning, namely comprehension, assimilation, utilisation and accommodation; not only for knowledge that can be pigeonholed into existing structures, but also for knowledge that has to be acquired in an open-ended fashion. Pure mathematics provides a rich, challenging domain for declarative learners, offering a self-contained environment with unambiguous, parsed data, which permits objective and rigorous testing of each of the four declarative learning processes. The CMS demonstrates all four competencies in this domain, both for knowledge that it has already learned and, crucially, for new knowledge for which there is no existing structure. We believe that the CMS is unique amongst cognitive architectures in possessing these capabilities.

In conclusion, there is a pressing need for guidance in the construction of cognitive architectures. Empirical data appear insufficient to provide this guidance, with diverse models able to provide a good fit to the 'classic' findings of verbal learning and memory. By contrast, none of these architectures is able to cope, even in principle, with all four competencies of declarative learning. One of the major computational problems facing the human infant, and also the human adult, is how to make sense of new situations. This is surely achieved by dynamic, adaptive learning. We argue, therefore, that a dynamic, declarative learning capability should be the cornerstone of architectures for cognition. The CMS provides an existence proof that a declarative learner can be constructed for the mathematics domain. We hope that this demonstration will prove the catalyst for the construction of a new generation of cognitive architectures.


References

Anderson, J.R. (1983). The Architecture of Cognition. Cambridge, MA: Harvard University Press.
Anderson, J.R. & Thompson, R. (1988). Use of analogy in a production system architecture. In S. Vosniadou & A. Ortony (eds), Similarity and Analogical Reasoning. New York: Cambridge University Press.
Anderson, J.R., Conrad, F.G. & Corbett, A.T. (1989). Skill acquisition and the LISP Tutor. Cognitive Science, 13, 467-505.
Bartlett, F.C. (1932). Remembering. Cambridge: Cambridge University Press.
Ellman, T. (1989). Explanation-based learning: a survey of programs and perspectives. ACM Computing Surveys, 21, 163-221.
Furse, E. (1990). A Formal Expression Language. Technical Report CS-90-2, Department of Computer Studies, University of Glamorgan.
Furse, E. & Nicolson, R. (1992). Declarative learning: cognition without primitives. Proceedings of the 14th Annual Conference of the Cognitive Science Society, Lawrence Erlbaum, pp. 832-837.
Furse, E. (1993a). Escaping from the box. In A. Sloman, D. Hogg, G. Humphreys, A. Ramsay & D. Partridge (eds), Prospects for Intelligence: Proceedings of AISB93. Amsterdam: IOS Press, pp. 71-80.
Furse, E. (1993b). Perception and experience in problem solving. Proceedings of the 13th International Joint Conference on Artificial Intelligence, Morgan Kaufmann, pp. 181-186.
Furse, E. (1994). The Mathematics Understander. In J.H. Johnson, S. McKee & A. Vella (eds), Artificial Intelligence in Mathematics. Oxford: Clarendon Press (in press).
Laird, J.E., Newell, A. & Rosenbloom, P.S. (1986). Soar: an architecture for general intelligence. Artificial Intelligence, 33, 1-64.
Lenat, D.B. & Feigenbaum, E.A. (1991). On the thresholds of knowledge. Artificial Intelligence, 47, 185-250.
Nicolson, R.I. & Furse, E. (1992). Declarative learning: a challenge for cognitive science. Report LRG 92/12, Department of Psychology, University of Sheffield.
Norman, D.A. (1991). Approaches to the study of intelligence. Artificial Intelligence, 47, 327-346.
Piaget, J. (1952). The Origins of Intelligence in Children. New York: International Universities Press.
Quillian, M.R. (1969). The teachable language comprehender: a simulation program and theory of language. Communications of the ACM, 12, 459-476.
Rumelhart, D.E., McClelland, J.L. & the PDP Research Group (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations. Cambridge, MA: MIT Press.
Tulving, E. (1983). Elements of Episodic Memory. Oxford: Oxford University Press.
VanLehn, K. (1989). Problem solving and declarative skill acquisition. In M.I. Posner (ed), Foundations of Cognitive Science. Cambridge, MA: MIT Press.
Winston, P.H. (1974). Learning structural descriptions from examples. In P.H. Winston (ed), The Psychology of Computer Vision. New York: McGraw-Hill.
