An Actor-based Architecture for Intelligent Tutoring Systems Claude Frasson*, Thierry Mengelle*, Esma Aïmeur*, Guy Gouardères**
*Université de Montréal, Département d'informatique et de recherche opérationnelle, 2920 Chemin de la Tour, Montréal, H3C 3J7, Québec, Canada
**Université de Pau, IUT Informatique, Bayonne, France
E-mail: {frasson,mengelle,aimeur}@iro.umontreal.ca
Tel: 1-514-343 7019 Fax: 1-514-343 5834
Abstract: The evolution of intelligent tutoring systems (ITS) toward the use of multiple learning strategies calls for a multi-agent architecture. We show how such an architecture can be defined for the pedagogical component of an ITS. After considering the evolution of intelligent agents, we observe that ITS need cognitive agents able to model human behavior in learning situations. These agents are what we call actors, a category of reactive, adaptive, instructable and cognitive agents. To support these properties we present an actor architecture with different layers of cognition: a reactive layer, a control layer, and a cognitive layer which contains learning capabilities. We provide a detailed view of this architecture and show how it functions with an example involving the different actors of a new learning strategy, the learning by disturbing strategy.
Key-words: architecture, agents, actors, cognition, learning-by-disturbing, behavior, troublemaker
1. Introduction
Learning in intelligent tutoring systems (ITS) has evolved during the last two decades. The goal of an ITS was to reproduce the behavior of an intelligent (competent) human tutor who can adapt his teaching to the learning rhythm of the learner. Initially, the control of the training was assumed by the tutor (prescriptive approach), not the learner. More recent ITS developments consider a co-operative approach between the learner and the system [8] using a co-learner. The system participates with the learner in the learning process and facilitates knowledge acquisition through interactions under the control of the learner. An extension of this approach was presented by Chan [6] with the learning companion, which simulates the behavior of a second learner (the companion) who learns together with the human learner. Various alternatives to this co-operative approach were then conceived, leading more recently [14, 16] to an inverted model of ITS called "learning by teaching" in which the learner teaches the learning companion by giving explanations. This evolution progressively highlighted several fundamental characteristics: (1) learning in ITS is a constructive process involving several partners, and (2) various learning strategies can be used to improve the process, such as one-on-one tutoring, learning with a co-learner [6], learning by teaching [14], and learning by disturbing [2]. The consequence in terms of ITS architecture is that we need to resort to a distributed (multi-agent) architecture in which various partners can play a role. Flexibility in learning requires that the ITS be based on multiple strategies. To fulfill these goals we need to specify a multi-agent architecture using intelligent agents with specific characteristics. The purpose of this paper is to define the components of such an architecture and show how it functions.
This paper is organized as follows. We first show how an ITS architecture can be structured to distribute pedagogical expertise among various agents. Then, we examine the evolution of agent characteristics in order to situate our definition of an intelligent agent. This evolution, and the cognitive tasks we intend to attribute to an agent in an ITS, lead to the notion of an actor able to play a role in specific conditions, co-operating with other actors. We present an actor architecture with different layers of cognition: a reactive layer, a control layer, and a cognitive layer which contains learning capabilities. We provide a detailed view of this architecture and show how it functions with an example involving the different actors of a new learning strategy, the learning by disturbing strategy [2].
2. Toward a new agent-based ITS architecture
The fundamental elements of an ITS architecture generally include a curriculum, a learner model and a pedagogical module. However, as mentioned above, we need more flexible communication between the learner and the system, allowing various pedagogical interactions [12]. This flexibility can be supported by several agents which assume a pedagogical role (as indicated in Figure 1) and may intervene in several strategies with different roles. This is the case of the tutor, which appears in the one-on-one, the companion and the learning by disturbing strategies. The last of these involves three agents: the tutor, who supervises the learning session; the troublemaker, a "particular" companion who can decide to give correct solutions or wrong information in order to check, and improve, the learner's self-confidence; and the artificial learner. This last agent is intended to support the dialogue between the system and the human learner, and also to synchronize the human learner's activity with the different agents.
Figure 1: An agent-based ITS architecture.
The role of the metatutor is to select the best strategy according to the learning objectives to be reached (in relation to the curriculum) and the characteristics of the learner (learner model) [1]. All the agents interact according to an intelligent behavior that we specify in the following sections.
3. The evolution of intelligent agents
The question "what is an agent?" certainly has many different answers, as a large part of the research community uses the word for various purposes. One reason is that if an agent is supposed to exhibit human capabilities it must have a large number of properties. Let us examine some of these properties. Agents generally have autonomy: they can operate without human control [5] and interact with other agents (they have social ability) using an agent-communication language. They show reactivity to changes in the environment but also act according to a goal-directed behavior. This last point converges toward a set of additional, humanlike properties of an agent such as beliefs, desires and intentions [15]. If we consider the evolution of intelligent agents we can distinguish a first period of research in which agents are defined with basic capabilities such as reactivity, planning, prediction and diagnosis. Reactive agents have only immediate responses to stimuli, without reasoning. Planning consists in finding a sequence of actions that will achieve a desired goal (sometimes by matching graphs or post-conditions of actions against the desired goal). Planning systems evolved from linear to non-linear planners [4, 17]. Predictions follow from planning with hypotheses. Agents with diagnosis capabilities are intended to search for a conclusion through a set of hypotheses [13]. However, these capabilities are insufficient to take on tutoring functions. We need a reasoning level and a dynamic improvement of the agent's knowledge. The second category of agents consists of instructable agents [11, 3], which can dynamically receive instructions or new algorithms (for instance new pattern matching algorithms), and agents that can learn or improve their learning according to a history of actions.
Instructable agents can receive new instructions to perform new tasks, directly from the users or from the system (which performs an analysis of a sequence of actions). A third category of agents consists of adaptive agents [9]. They can adapt their perception of situations and modify their decisions by choosing new reasoning methods. In particular, they adapt their perceptual strategy to dynamic information requirements and resource limitations, they adapt their control mode to dynamic goal-based constraints on their actions, they adapt their reasoning choices among potential reasoning tasks to dynamic local and global objectives, they adapt their reasoning methods to the currently available information and performance criteria, and finally they adapt their meta-control strategy to the dynamic configuration of demands, opportunities and resources for behavior. These properties are very important for educational purposes but are still insufficient for ITS requirements. Indeed, we need agents which model human behavior in learning situations. Agents of this last type are cognitive and seem to be the most suitable for the problems raised by ITS. The cognitive aspect of an agent relies upon its capability to learn and discover new facts or improve its knowledge for a better use. Learning can be achieved by a variety of methods. Considering the ITS components, we need an architecture with multiple cognitive agents able to model not only the learner but also different behaviors corresponding to various pedagogical situations that may occur with a companion, a co-learner, etc. This extended notion of intelligent agent is viewed as an actor able to play different roles according to the learning situation (state of knowledge acquired by the learner, conditions of learning, desirable strategy, ...) in a co-operative environment.
4. A new actor paradigm
4.1 Properties
The properties we need in an ITS environment lead to the following definition of an actor: An actor is an intelligent agent which is reactive (basic property), adaptive, instructable and cognitive.
• The first level of actions required by an ITS concerns reactive capabilities such as determining the sequence of knowledge to be presented, diagnosing misconceptions or missing conceptions, and matching situations of the learner with typical situations according to a case-based approach.
• An adaptation of perception and control is necessary when, for instance, a new learning strategy needs to be activated to cope with a new state of the learner.
• To improve the behavior of the ITS the system needs to learn by experience or acquire new strategies.
• Finally, the system must be able to generate new control tasks or adapt the reasoning strategy according to a more in-depth analysis of the learner's behavior.
The use of multiple strategies makes it possible to involve the learner actively with, for instance, a co-learner, a companion, a troublemaker, ... In the same way, an actor must encompass a multistrategy-based behavior. In that sense, an actor must have the capability to co-operate in multiple learning strategies and learn from this co-operation, instead of aiming at multiple goals to satisfy the intentions of each actor. Actors improve their own behavior by co-operating with other actors.
4.2 Global architecture
The architecture we need (Figure 2) contains four modules (Perception, Action, Control and Cognition) distributed in three layers (Reactive, Control and Cognitive). It is similar to (but extends) TouringMachines [7], which consist of perception and action subsystems that interface with the environment of the agent, and three control layers (reactive, planning and modelling layers). Similarly, our architecture contains a reactive layer, with direct association between perception and action modules, and a control layer which corresponds to the planning layer. However, the third layer (cognitive layer) is not restricted to solving goal conflicts (as in TouringMachines) but supports the ability to learn from experience, which is an important element of an intelligent entity [10].
Figure 2: Conceptual architecture of an actor.
An original point of this architecture is that each actor can observe the previous behavior of the other actors, that is, their trace of actions (and not only the results of actions). To allow this capability each actor has an external view on the other actors and an internal view allowing other actors to consult its own behavior. In addition, an actor can decide to hide part of its behavior from the other actors. The role of each module is the following.
Perception: Detects a change in the environment of the actor and identifies the situations in which the actor may intervene. The environment of an actor is composed of all the other actors and a common memory. Changes in the environment result from the activity of the other actors (for instance, the troublemaker has just given wrong information, or an answer of the learner becomes available in common memory).
Action: Groups all the actions (functions) which allow the actor to operate on the environment, for instance display an answer, submit a problem, congratulate, mislead, ...
Control: Handles situations which imply a planning aspect in order to determine the actions to be activated. For instance the tutor can decide to stop or continue the tutoring session, the troublemaker may give the right or wrong solution, ...
Cognition: Allows the actor to improve its performance according to several aspects: improve the actor's perception, expand the control, complete the set of actions, modify the view of the other actors on its internal activity, and finally improve its own reasoning strategies (improvement of cognition). This module includes several reasoning strategies and learning mechanisms intended to modify or create new reasoning strategies.
These modules can be used according to different functioning modes which can involve one or several modules. We distinguish four functioning modes: reflex, control, reasoning, and learning mode.
• Reflex mode: it involves the perception and action modules; in this case there is a direct association between an identified situation and a specific action (spontaneous action) without reasoning capabilities. There is only a kind of pattern matching of behaviors, checking the conditions in which each action must be executed. This mode avoids calling a higher-level layer to generate control actions.
• Control mode: this mode involves the perception, action and control modules. Starting from a given situation the control module takes a decision among possible alternatives; it constructs plans and selects actions to execute.
• Reasoning mode: it involves the four modules. This mode intervenes when the actor is unable to take a decision at the control level (contradiction, conflicts or uncertainty between different decisions) or when, after observation of the control mode, the cognition module decides to override some knowledge in the control layer in order to influence the decision at the control level. In this mode the influence of the cognitive layer concerns improvement of the control layer.
• Learning mode: it involves the perception and cognition modules. In this case, the objective of the actor is to learn and modify its reasoning mode and consequently all the corresponding levels. In this mode the actor has no social behavior (interaction with other actors) but only internal modification as an objective.
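The four functioning modes and the modules each one involves can be summarized in a small sketch. This is only an illustration under stated assumptions: the `select_mode` function and its boolean parameters are hypothetical and not part of the architecture as described; only the module sets per mode are taken from the text.

```python
from enum import Enum


class Module(Enum):
    PERCEPTION = "perception"
    ACTION = "action"
    CONTROL = "control"
    COGNITION = "cognition"


class Mode(Enum):
    REFLEX = "reflex"
    CONTROL = "control"
    REASONING = "reasoning"
    LEARNING = "learning"


# Modules involved in each functioning mode, as listed in the text.
MODULES_BY_MODE = {
    Mode.REFLEX: {Module.PERCEPTION, Module.ACTION},
    Mode.CONTROL: {Module.PERCEPTION, Module.ACTION, Module.CONTROL},
    Mode.REASONING: set(Module),
    Mode.LEARNING: {Module.PERCEPTION, Module.COGNITION},
}


def select_mode(needs_decision: bool, control_can_decide: bool,
                cognition_overrides: bool = False) -> Mode:
    """Hypothetical selection rule for the three interactive modes.

    Learning mode is not selected here: the text describes it as an
    internal activity without interaction with other actors.
    """
    if not needs_decision:
        return Mode.REFLEX       # situation maps directly to an action
    if control_can_decide and not cognition_overrides:
        return Mode.CONTROL      # control module chooses among alternatives
    return Mode.REASONING        # conflict, uncertainty, or cognitive override
```

The rule is deliberately simple: it only encodes that the reasoning mode is reached when the control level cannot decide or when the cognition module chooses to override it.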
5. Functional design
We can now give more details on each module and specify the interactions between the different layers (Figure 3). An example of the whole behavior of actors is given in section 6.
The perception module
It contains a set of typical situations. A typical situation corresponds to a condition of activation according to characteristics of the environment. Each external actor is observable through the external view, and actions can update the common memory. Most of these typical situations are defined by the pedagogical expert for each actor (for instance all the conditions of activation for the troublemaker). However, a typical situation can also be induced by the cognitive layer (Figure 3).
Each typical situation is described by an object with three parts: a focus, a condition, and a conclusion. The focus restricts the view of the environment in order to consider only the information that is relevant for the evaluation of the condition. The condition is a logical proposition. The conclusion refers to the task to be activated when the condition is true. This task can be situated either at the action level or at the control level, as indicated in Figure 3. An example of a typical situation is given in section 6.
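The three-part structure of a typical situation can be sketched as a small data type. This is a minimal illustration, not the SAFARI implementation: the `Environment` dictionary, the `evaluate` method and the `answer_available` instance are assumptions introduced here for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

Environment = Dict[str, Any]   # hypothetical: behaviors plus common memory


@dataclass
class TypicalSituation:
    name: str
    focus: Callable[[Environment], Environment]   # restricts the view
    condition: Callable[[Environment], bool]      # logical proposition
    conclusion: str                               # name of the task to activate

    def evaluate(self, environment: Environment) -> Optional[str]:
        """Return the task to activate, or None if the condition is false."""
        view = self.focus(environment)
        return self.conclusion if self.condition(view) else None


# Illustrative situation: react once a learner answer is in common memory.
answer_available = TypicalSituation(
    name="answer-available",
    focus=lambda env: {"common_memory": env.get("common_memory", {})},
    condition=lambda view: "learner-answer" in view["common_memory"],
    conclusion="React-To-Answer",
)
```

Keeping the focus separate from the condition means the condition only ever sees the restricted view, which mirrors the role the text gives to the focus part.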
Figure 3: Detailed architecture of an actor.
All the activities that the actor can perform (actions, control or cognitive decisions) are supported by tasks situated in the action, control and cognition modules. Four categories of tasks are distributed among these modules: operating, abstract, control and cognitive tasks.
Action module
The action module consists of a set of tasks that the actor can perform to act on the environment. There are two types of action tasks: operating tasks, which are perceptible by the other actors, and abstract tasks, which are not. Operating tasks (grey boxes in Figure 3) are elementary tasks (for instance Give-Problem, Display-Answer). Abstract tasks (white boxes in Figure 3) include both elementary tasks (e.g. Find-Right-Answer, which allows the tutor to get the solution of a problem) and tasks that lead to the activation of an operating task (for instance, in the case of the troublemaker, Mislead calls the Display-Answer operating task with a wrong answer as a parameter).
Control module
The control module contains several control tasks which are activated by typical situations and can trigger action tasks. The goal of a control task is to participate in a decision process which aims to select and activate a sequence of action tasks. A control task can also call another control task before activating an action task. For instance, regarding the troublemaker (see section 6), the Scene2: React-To-Answer control task calls another control task (Choose-An-Attitude) which sends back a decision. According to this decision, Scene2: React-To-Answer finally calls a specific action task (e.g. Tell-lies, Remain-silent).
Cognition module
The cognition module consists of several cognitive tasks and a single cognitive control task.
• In order to ensure the various functions of the cognitive layer (as mentioned in section 3), each cognitive task is in charge of (specialized in) improving a specific aspect of the actor (improving a specific module or modifying the view grants).
Cognitive tasks are not activated from other components (typical situations and tasks) but are permanently running; they possess two distinct parts: a learning algorithm and an action part. The learning algorithm can use one of three learning mechanisms: analogy, deduction and induction. These mechanisms make it possible to analyze the results of the actor's previous performance and decide how to improve it (learning by experience).
• The role of the cognitive control task is to modify the expertise of the cognitive tasks (actions and learning mechanisms).
Let us take two examples of cognitive tasks. A cognitive task specialized in improving the control module can observe the previous decisions of a control task that has to choose among several alternatives (see section 6: Choose-An-Attitude). The learning algorithm of the task (for example an induction mechanism) can infer that these decisions are not justified and decide to create a new control task. When the conditions of activation (typical situation) of the control task are satisfied, the cognitive task will dynamically replace the old control task by the new one (in fact the old control task can remain valid in another context, typical situation or control task). Similarly, a cognitive task intended for improvement of perception will be able to modify typical situations (focus, condition or conclusion) or to infer new ones.
Views on actor behavior
To keep a trace of the activity of the actor (also called behavior), each activation of a task is stored in the previous behavior area indicated on the right side of Figure 3. As tasks are classified according to four categories, the actor's behavior can be observed according to several levels of abstraction or views. Thus, it is possible to observe the actor's behavior within four views: operating view (operating tasks only), abstract view (operating and abstract tasks), tactical view (operating, abstract and control tasks), and strategic view (all the tasks). Schematically, the operating view of an actor shows what the actor has done while the other views explain why. Actors can thus have a more reliable behavior, knowing the reasons for the activities of the other actors. By default, an actor has only an operating view on the others. Moreover, an actor can still decide to restrict the internal view on its behavior.
6. Design and implementation of the learning by disturbing strategy
We will illustrate the functioning of the actor-based architecture with the 'learning by disturbing' strategy [2]. This strategy involves three actors (a tutor, a troublemaker, and an artificial learner) in order to strengthen the learner's self-confidence, as indicated in section 2.
6.1. The strategy
The competence level of the troublemaker is superior to that of the learner, in order to provide a profitable competition with him/her. A problem is submitted both to the learner and to the troublemaker. The troublemaker can have different behaviors: give a wrong answer to the problem in order to force the learner to react and propose the right solution, or wait for the solution of the learner and give a wrong suggestion or solution or a counter-example. If the learner is unable to give a correct solution the tutor finally gives him/her the right solution. The troublemaker can react only once to each request of the tutor.
6.2. Design and implementation
The implementation of this strategy requires defining a set of tasks and a list of typical situations for each of the three actors mentioned above. Figure 4 shows all the typical situations and tasks for the troublemaker. The perception module involves two typical situations, TM-TS1 and TM-TS2, which respectively allow the actor to intervene before and after the learner's answer. We describe below the implementation of the typical situation TM-TS2:
Focus:
    Access to behaviors: Yes
        From: TUTOR last Operating Task
        Only: Operating-View
    Common memory: Yes
Condition:
    Learner-answer is-in Common Memory
    (Behavior (TM)) = ∅
Conclusion:
    Scene2: React-To-Answer
The focus part restricts the view of the environment to the information that is relevant for the evaluation of the condition: the behavior of all actors since the tutor gave the problem, and the common memory. According to the previous description of the strategy, the condition part checks the presence of the learner's answer in the common memory, and the fact that the troublemaker has not already reacted to the current problem (last proposition). When the condition is true, the conclusion part calls the Scene2: React-To-Answer control task. The algorithm of the Scene2: React-To-Answer control task (described below) calls another control task, Choose-An-Attitude, whose decision determines which action task is activated:
    attitude := Choose-An-Attitude
    if (attitude = be negative) then Mislead
    else if (attitude = be positive) then
        Get the learner's answer in the Common Data Area.
        Analyse the answer.
        if (right answer) then Approve else Give-Solution
    else if (attitude = be neutral) then Remain-Silent
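The Scene2: React-To-Answer algorithm can be transcribed into a short runnable sketch. The function signature, the attitude constants, the `"learner-answer"` key and the comparison against a known right answer are assumptions made for illustration; the paper does not specify how the answer is analysed. The final `ValueError` is also an addition, since the original algorithm leaves the case of an unknown attitude unspecified.

```python
from typing import Callable, Dict

# Hypothetical attitude labels mirroring "be negative/positive/neutral".
NEGATIVE, POSITIVE, NEUTRAL = "negative", "positive", "neutral"


def react_to_answer(choose_an_attitude: Callable[[], str],
                    common_memory: Dict[str, str],
                    right_answer: str) -> str:
    """Return the name of the action task selected by Scene2: React-To-Answer."""
    attitude = choose_an_attitude()
    if attitude == NEGATIVE:
        return "Mislead"
    if attitude == POSITIVE:
        # Get the learner's answer from the shared area and analyse it.
        learner_answer = common_memory.get("learner-answer")
        return "Approve" if learner_answer == right_answer else "Give-Solution"
    if attitude == NEUTRAL:
        return "Remain-Silent"
    raise ValueError(f"unknown attitude: {attitude!r}")
```

Passing Choose-An-Attitude as a callable keeps the decision separate from the dispatch, so a different decision procedure can be supplied without touching the dispatch itself.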
6.3. Example of functioning
To illustrate the functioning of the 'learning by disturbing' strategy, and especially of the troublemaker, Figure 4 considers the following situation: at time t4, none of the actors is active and, since the beginning of the session,
- the tutor has submitted a first problem to the learner (time t1),
- the troublemaker has decided not to react before the learner's answer (time t2),
- and the learner has given the right answer (time t3), which is now available in common memory.
Figure 4 presents two possible scenarios: the first one (arrows labelled with white circles) gives an example of the control mode, while the second one (arrows labelled with black circles) concerns the reasoning mode.
Figure 4: Implementation and example of functioning of the troublemaker.
First scenario: In its attempt to rebuild, step by step, the behavior of all the actors, the troublemaker accesses the previous behavior area according to its view grant. This explains why the result of this operation (the behavior of the actors indicated on the left side of the troublemaker) contains only the operating tasks of the tutor and the artificial learner. This view on the environment makes the TM-TS2 typical situation triggerable, so the Scene2: React-To-Answer control task (which is linked with TM-TS2) is activated. As previously mentioned, this task calls another control task, Choose-An-Attitude, which returns a negative position to Scene2: React-To-Answer. The latter activates the Mislead abstract task; this task calls the Display-Answer operating task with a wrong solution as a parameter. Consequently, a wrong solution is displayed on the learner's screen.
Second scenario: This second scenario begins like the first one; however, when the Choose-An-Attitude control task is activated, a cognitive task (Improve-Decision) intervenes to change the expertise for selecting an attitude. It stops the current control task, creates a new control task that takes the same kind of decision but with a new expertise, and activates this new task. Unlike the first scenario, the decision is now to be positive; so, because the learner's answer is right, Scene2: React-To-Answer calls the Approve operating task.
Finally, the troublemaker's behavior will be updated with the result of all these operations, as follows:
- First scenario: t4. (Scene2: React-To-Answer (Choose-An-Attitude) (Mislead (Display-Answer)))
- Second scenario: t4. (Scene2: React-To-Answer (Choose-An-Attitude [Cancelled]) (Improve-Decision) (Choose-An-Attitude-New) (Approve))
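The first-scenario trace, combined with the four views of section 5, can be sketched as a filter over recorded task activations. This is a simplified illustration: the nested trace is flattened into a list, and the `Category` ordering, the view constants and the `observe` function are hypothetical names introduced here.

```python
from enum import IntEnum
from typing import List, Tuple


class Category(IntEnum):
    """Task categories, ordered from most to least visible."""
    OPERATING = 1
    ABSTRACT = 2
    CONTROL = 3
    COGNITIVE = 4


# Each view shows every task whose category is at or below its level.
OPERATING_VIEW = Category.OPERATING
ABSTRACT_VIEW = Category.ABSTRACT
TACTICAL_VIEW = Category.CONTROL
STRATEGIC_VIEW = Category.COGNITIVE

Trace = List[Tuple[str, Category]]

# Flattened trace of the troublemaker at t4 in the first scenario.
first_scenario: Trace = [
    ("Scene2: React-To-Answer", Category.CONTROL),
    ("Choose-An-Attitude", Category.CONTROL),
    ("Mislead", Category.ABSTRACT),
    ("Display-Answer", Category.OPERATING),
]


def observe(trace: Trace, view: Category) -> List[str]:
    """Return the task names visible through the given view."""
    return [name for name, category in trace if category <= view]
```

With this encoding, an actor holding only the default operating view on the troublemaker would see the displayed answer but not the decision that led to it, which matches the distinction between what an actor has done and why.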
7. Conclusion
We have presented an ITS architecture based on actors, a type of intelligent agent with suitable properties for ITS. The characteristics of actors have been detailed in particular for the pedagogical component of the ITS architecture, in which various actors can interact dynamically. This architecture has multiple advantages. First, it provides a high degree of flexibility in terms of interaction between the learner and actors in various strategies. This allows a co-operative approach in which the learner is involved in a constructive knowledge elaboration. Second, the actor improves itself by interacting with the other actors. The learner is not the only participant who learns: so does the community of actors, which is attentive to the behavior of the learner and learns by experience. Third, the cognitive layer has learning mechanisms to cope with new situations that cannot be processed at lower levels. Learning by experience allows the actors to evolve from the reasoning mode to the control mode and even to the reflex mode, using a permanent learning process. This architecture is currently implemented in Smalltalk in the SAFARI project (a multidisciplinary project aiming at developing various ITS). The next step is to test the learning by disturbing strategy in an Intensive Care Unit prototype whose curriculum is already developed.
ACKNOWLEDGMENTS
This work has been supported by the Ministry of Industry, Trade, Science, and Technology (MICST) under the Synergy program of the Government of Québec.
REFERENCES
1. Aïmeur, E. & Frasson, C. (1995). Eliciting The Learning Context. In Co-Operative Tutoring Systems, IJCAI-95 Workshop on Modelling Context in Knowledge Representation and Reasoning, (pp. 1-11).
2. Aïmeur, E., Frasson, C. & Sthiaru-Alexe, C. (1995). Towards New Learning Strategies In Intelligent Tutoring Systems, Brazilian Conference of Artificial Intelligence SBIA'95.
3. Altermann, R. & Zito-Wolf, R. (1993). Agents, Habitats and Routine Behavior, Thirteenth International Conference On Artificial Intelligence.
4. Ambros-Ingerson, J. & Steel, S. (1988). Integrating planning execution and monitoring. In Proceedings of the seventh national conference on artificial intelligence (AAAI 88), Saint Paul, MN, (pp. 83-88).
5. Castelfranchi, C. (1995). Guarantees for autonomy in cognitive agent architecture. In Wooldridge, M. & Jennings, N.R. (Eds.), Intelligent Agents: Theories, Architectures and Languages (LNAI vol 890), Springer Verlag: Heidelberg, Germany, (pp. 56-70).
6. Chan, T.W. & Baskin, A.B. (1990). Learning Companion Systems. In C. Frasson & G. Gauthier (Eds.), Intelligent Tutoring Systems: At the Crossroads of Artificial Intelligence and Education, Chapter 1, New Jersey: Ablex Publishing Corporation.
7. Ferguson, I. A. (1992). TouringMachines: An Architecture for Dynamic, Rational, Mobile Agents. PhD Thesis, Clare Hall, University of Cambridge.
8. Gilmore, D. & Self, J. (1988). The application of machine learning to intelligent tutoring systems. In J. Self (Ed.), Artificial Intelligence and Human Learning, Intelligent computer-assisted instruction, New York: Chapman and Hall, (pp. 179-196).
9. Hayes-Roth, B. (1995). An architecture for adaptive intelligent systems. Artificial Intelligence: special issue on agents and interactivity, (pp. 327-365).
10. Honavar, V. (1994). Toward learning systems that integrate different strategies and representations. In Honavar, V. & Uhr, L. (Eds.), Symbol Processors and Connectionist Networks for Artificial Intelligence and Cognitive Modelling: Steps toward Principled Integration, New York: Academic Press.
11. Huffman, S. B. (1994). Instructable Autonomous Agents. PhD Thesis, University of Michigan, Dept of Electrical Engineering and Computer Science.
12. Mengelle, T. (1995). Etude d'une architecture d'environnements d'apprentissages basés sur le concept de préceptorat avisé. PhD Thesis, University of Toulouse III.
13. Morignot, P. & Hayes-Roth, B. (1995). Why does an agent act? In M.T. Cox & M. Freed (Eds.), Proceedings of the AAAI Spring Symposium on Representing Mental States Mechanisms. Menlo Park, AAAI (in press).
14. Palthepu, S., Greer, J. & McCalla, G. (1991). Learning by Teaching. The Proceedings of the International Conference on the Learning Sciences, AACE.
15. Rao, A. S. & Georgeff, M. P. (1991). Modelling rational agents within a BDI-architecture. In Fikes, R. & Sandewall, E. (Eds.), Proceedings of knowledge representation and reasoning (KR&R-91), Morgan Kaufmann Publishers: San Mateo, CA, (pp. 473-484).
16. Van Lehn, K., Ohlsson, S. & Nason, R. (1994). Application of simulated students: an exploration. Journal of artificial intelligence in education, vol 5, no 2, (pp. 135-175).
17. Wood, S. (1993). Planning and Decision Making in Dynamic Domains. Ellis Horwood: Chichester, England.