Machine Learning for User Modeling and Plan Recognition

Mathias Bauer

German Research Center for Artificial Intelligence (DFKI), Stuhlsatzenhausweg 3, 66123 Saarbrücken, GERMANY

Abstract

Whenever a system has to take into account an agent's actions or utterances in order to produce a certain kind of cooperative (or competitive) behavior, plan recognition becomes an important task. In many cases it is not sufficient to merely maintain a list of plan hypotheses all of which proved compatible with the observations made so far. If a decision has to be made among a number of competing alternatives, a quality criterion is needed. This paper presents a method that allows plan hypotheses to be assessed on the basis of information about a user's typical behavior. Machine learning techniques are used to come up with a model of the user's preferences reflecting the regularities in his behavior.

1 Introduction

Whenever a system has to take into account an agent's actions or utterances in order to produce a certain kind of cooperative (or competitive) behavior, plan recognition becomes an important task. For example, an intelligent help system has to detect what the user of an application system is doing in order to be able to offer adequate help in accomplishing his goals or to adapt the system to the user's particular needs in a given situation. Among others, [GL92] and [BP93] demonstrate the usefulness of incorporating a plan recognition component into an interactive software system.

Often, such a system is required to come up with a unique plan hypothesis (e.g., when a user is to be offered a service like semantic plan completion) instead of only maintaining a set of equally plausible alternatives all of which are compatible with the user's observed actions. To do so, a quality criterion is needed, on the basis of which a ranking of all hypotheses can be computed. Additionally, such a criterion can be used for the early exclusion of "bad" hypotheses.

Grounding such a hypothesis assessment on user-specific information contained in a user model allows the plan recognizer to focus on patterns of behavior that are typical of a particular person. As a consequence, the plan recognition process becomes more reliable, and the fast recognition of the user's actual plan enables an adequate reaction at an early stage during his interaction with the application system (see also [Bau96a]).

In this paper I will describe how machine learning techniques can be applied to establish and maintain such a user model and how the plan recognition and other components of an application system can benefit from doing so. The following section will introduce a sample scenario that will be used throughout this paper to illustrate the methods suggested. Section 3 will describe the exploitation of machine learning in building a user model especially tailored for plan recognition purposes. Finally, Section 4 will summarize the results and give an outlook on future work.

2 A Scenario and Basic Notions

Besides describing a sample scenario, this section will present a general notion of plans which does not refer to any particular representation language. The agent to be observed is the user of an email system coupled with a plan recognition component.¹ The latter works on the basis of a plan hierarchy like the one depicted in Figure 1. Such a plan hierarchy is meant to capture the standard repertoire of action sequences of a system user. Given by a domain expert, it reflects the decomposition of plans into actions and the existing abstraction relations among its entries.

¹ While this "toy domain" allows the basic principles of this approach to be presented, it certainly does not constitute a serious application. Currently the techniques presented are investigated in the context of more challenging domains like preparing a document using various tools like editors, text formatting programs, previewers etc. Services offered by the system include supporting the choice of a currently available printer or the automatic execution of subtasks like running latex and transforming its output into the PostScript format.

[Figure 1: A sample plan hierarchy. Node labels: Process_mail, Read_mail, Send_mail, Store_messages, Read_delete_msgs, Read_write_msgs, Read_save_msgs, store_message, read_message, quit, delete, write, read, save, type. Arc types: abstraction, decomposition.]

In general, a plan hierarchy is a tuple $H = \langle P, Dec, Ab \rangle$ where $P$ is the set of plans represented in this hierarchy. $Dec$ and $Ab$ are called the decomposition and abstraction hierarchy, respectively.

$Dec$ consists of a set of plan decompositions, which are tuples $\langle p, A_p, C_p \rangle$, where $p \in P$ is the name of a plan (or simply a plan), $A_p$ is a set of actions (either primitive or abstract), and $C_p$ is a set of constraints concerning these actions, such as conditions about temporal relationships among various elements of $A_p$ or restrictions of the objects manipulated by these actions. So, $Dec$ contains the complete information about how a given plan can be executed without restricting it to one possible action sequence: $p$ might, for example, describe a non-linear plan where the temporal ordering of the actions is only partial and thus allows for a number of concrete execution instances.

$Ab$ is a collection of pairs $\langle p_1, p_2 \rangle$ with $p_1, p_2 \in P$ representing the fact that $p_1$ is being abstracted by $p_2$. Plans which never appear as the second element of such an abstraction pair are called basic plans, the others abstract plans. These are intended to represent concepts which describe classes of basic plans sharing some common property like reaching the same goal or containing the same actions.² In the same way, $Ab$ also encodes information about basic and abstract actions.

Remark: Adopting a closed-world assumption, abstract plans are identified with the set of all basic plans they subsume. Analogously, each non-singleton set of basic plans can be considered an abstract plan. Given these notions, it is possible to represent all relevant aspects of plan hierarchies in a uniform way. Note that no particular language for actions and constraints was specified.

Example: Figure 1 shows part of the plan hierarchy for the given scenario with solid abstraction arcs and dashed decomposition arcs pointing from plans to actions. If an abstract plan contains an action that also occurs in all more concrete plans, the corresponding decomposition arc is not repeated. E.g., the arcs pointing from Read_write_msgs and Read_save_msgs to the action read_message are left out. The names of abstract plans are underlined. The constraints relating actions and plans are not depicted here. The abstraction hierarchy contains, e.g., the pair $\langle \mathit{Store\_messages}, \mathit{Read\_mail} \rangle$, and a typical plan decomposition looks like $\langle \mathit{Read\_save\_msgs}, \{\mathit{read\_message}, \mathit{save}, \mathit{quit}\}, \{\ldots\} \rangle$.
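The tuple notation above can be made concrete with a small data structure. The following Python sketch is only an illustration under stated assumptions: the class and method names (`Decomposition`, `PlanHierarchy`, `subsumed_basic_plans`) are hypothetical, constraints are left as an empty placeholder because no constraint language is specified, and apart from the pair (Store_messages, Read_mail) and the decomposition of Read_save_msgs, the abstraction pairs are guesses made for the sake of the example, not a transcription of Figure 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decomposition:
    """A plan decomposition <p, A_p, C_p>."""
    plan: str
    actions: frozenset
    constraints: tuple = ()  # placeholder: the text fixes no constraint language


@dataclass
class PlanHierarchy:
    """A plan hierarchy H = <P, Dec, Ab>."""
    plans: set
    dec: list  # list of Decomposition
    ab: set    # pairs (p1, p2) meaning "p1 is abstracted by p2"

    def abstract_plans(self):
        # Abstract plans are those appearing as second element of a pair.
        return {p2 for _, p2 in self.ab}

    def basic_plans(self):
        return self.plans - self.abstract_plans()

    def subsumed_basic_plans(self, plan):
        """Closed-world reading: an abstract plan is the set of basic plans below it."""
        if plan not in self.abstract_plans():
            return {plan}
        result = set()
        for p1, p2 in self.ab:
            if p2 == plan:
                result |= self.subsumed_basic_plans(p1)
        return result


# Hypothetical fragment of the email hierarchy (abstraction pairs are assumed).
hierarchy = PlanHierarchy(
    plans={"Process_mail", "Read_mail", "Store_messages",
           "Read_save_msgs", "Read_delete_msgs"},
    dec=[Decomposition("Read_save_msgs",
                       frozenset({"read_message", "save", "quit"}))],
    ab={("Read_mail", "Process_mail"),
        ("Store_messages", "Read_mail"),      # pair given in the text
        ("Read_save_msgs", "Store_messages"),  # assumed
        ("Read_delete_msgs", "Read_mail")},    # assumed
)

print(sorted(hierarchy.basic_plans()))
print(sorted(hierarchy.subsumed_basic_plans("Read_mail")))
```

Representing an abstract plan by the set of basic plans it subsumes mirrors the closed-world remark and makes set operations on hypotheses straightforward.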

² The decision of which basic plans are grouped this way depends on the designer of the plan hierarchy.

3 Machine Learning for Plan Recognition

This section will briefly discuss the motivation for using ML techniques and their actual application in the context of plan recognition and user modeling.

3.1 Why Machine Learning?

When trying to adapt a plan recognizer (and, as a consequence, also the application system using its results) to a particular user's behavior, a proper model of his preferences is crucial. Preferences are the reasons why a user acts in a particular way when facing a certain situation. Here a situation is meant to be a (partial) snapshot of the current state of the world, i.e. an enumeration of some of its most relevant properties. In the context of a software application like the email system this amounts to listing the current values of the most important system parameters, e.g. the sender and length of the current message, its reply-status, subject and so on.

In many cases the user will not be willing to actively convey these preferences to the system. An intelligent help system, for example, is typically expected to automatically adapt to the peculiarities in a user's behavior. However, it is also possible that the user is not even aware of such recurring patterns of behavior himself. For example, a suboptimal behavior in a certain class of situations often cannot be detected by the user. As a consequence there is no way for him to tell the system about this aspect of his behavior.

An (inductive) machine learning approach can detect the regularities behind the user's decisions to act in a particular way and describe them in terms of the situation parameters, his respective action being one of the most important ones. This way the system's dependency on the user's willingness (and ability) to actively support it during the knowledge acquisition phase can be drastically reduced.

3.2 The Training Phase

The regularities in a user's behavior can be captured by classifying the various situations he is confronted with according to his respective reactions. In terms of machine learning, this means the training examples are situation descriptions in terms of various attribute values including the user's actions. The aim is to form (abstract) classes of situations which are similar in the sense that a given user action was executed as part of a given plan. For example, one such class would comprise the set of situations in which the user executed read_message while pursuing plan Read_delete_msgs.

sender
  Manager:    (#=5) [Manager, *]: Read_delete_msgs
  Co-worker:  length
    < 50:  (#=48) [Co-Worker, 1-49]: Store_messages
    > 50:  (#=2) [Co-Worker, 50+]: Read_delete_msgs
           (#=8) [Co-Worker, 50+]: Store_messages

Figure 2: The decision tree for action read_message.

Whenever the user executes some action, this action along with the current values of a predefined set of attributes describing the current application context is added to a database of examples. When the current plan-recognition session is successfully finished,³ all of these new database entries are marked to belong to class $p$, where $p$ is assumed to be the currently recognized user plan. After a number of sessions (the "training phase" of the system) all database entries with identical user actions are forwarded to a derivative of the ID3 algorithm [Qui83] which computes the decision trees required for the classification of situations (see also [DM83, Qui86]). In addition to the original formulation of this algorithm, the frequencies of situations in the tree leaves are also recorded.

Remark: In case the user's plan could not be uniquely identified, $p$ corresponds to the set of plan hypotheses that proved compatible with all observations made. According to a remark made in Section 2 this set can be considered an abstract plan.
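The bookkeeping of the training phase can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not the system's actual implementation: all function names are hypothetical, the length attribute is pre-discretized into the classes "1-49" and "50+", and the manager messages are arbitrarily given the class "1-49". The information-gain function shows the attribute-selection criterion used by standard ID3; how the ID3 derivative mentioned above differs is not specified here. The example counts are those of Figure 2, with all observations of one kind bundled into a single labeling call for brevity.

```python
from collections import Counter, defaultdict
from math import log2


def label_session(observations, plan):
    """Mark every observation of a finished session with the recognized plan."""
    return [dict(obs, plan=plan) for obs in observations]


def group_by_action(database):
    """Collect the examples that share the same user action."""
    grouped = defaultdict(list)
    for example in database:
        grouped[example["action"]].append(example)
    return grouped


def entropy(examples):
    """Entropy of the plan labels in a set of examples."""
    counts = Counter(ex["plan"] for ex in examples)
    total = len(examples)
    return -sum(c / total * log2(c / total) for c in counts.values())


def information_gain(examples, attribute):
    """Standard ID3 criterion: entropy reduction achieved by splitting on attribute."""
    partitions = defaultdict(list)
    for ex in examples:
        partitions[ex[attribute]].append(ex)
    remainder = sum(len(part) / len(examples) * entropy(part)
                    for part in partitions.values())
    return entropy(examples) - remainder


def leaf_frequencies(examples, attributes):
    """Record how often each plan was observed for each attribute combination."""
    leaves = defaultdict(Counter)
    for ex in examples:
        leaves[tuple(ex[a] for a in attributes)][ex["plan"]] += 1
    return leaves


def observations(n, sender, length):
    return [{"action": "read_message", "sender": sender, "length": length}
            for _ in range(n)]


# Example counts taken from Figure 2; manager lengths are an assumption.
database = (
    label_session(observations(5, "Manager", "1-49"), "Read_delete_msgs")
    + label_session(observations(48, "Co-worker", "1-49"), "Store_messages")
    + label_session(observations(2, "Co-worker", "50+"), "Read_delete_msgs")
    + label_session(observations(8, "Co-worker", "50+"), "Store_messages")
)

read_examples = group_by_action(database)["read_message"]
leaves = leaf_frequencies(read_examples, ["sender", "length"])
print(dict(leaves[("Co-worker", "50+")]))
print(information_gain(read_examples, "sender"))
```

Keeping the per-leaf plan counts, rather than only a majority label, is what later allows a leaf with mixed plans to yield a graded assessment.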

Example (cont): In the given scenario, situations are represented as records with attributes action, sender, and length containing as values the user's recent action and several data regarding the current message.⁴ Figure 2 depicts the decision tree for the case of user action read_message (entries * stand for arbitrary attribute values, while (#=n) indicates that a situation with this description was encountered n times during the sessions observed so far). It represents the fact that the user always pursued plan Read_delete_msgs when the sender of the currently read message was his manager. In case the message came from a co-worker, he saved it (i.e. his plan belonged to Store_messages) unless its length was greater than 50 lines. In that case, the attributes were insufficient to allow an exact discrimination of these situations w.r.t. the plans being pursued.

³ A plan-recognition session is meant to describe a complete cycle of user actions and the corresponding reactions of the plan recognizer, such as discarding incompatible hypotheses, until the user stops interacting with the application system or his plan has been recognized.

⁴ While this is sufficient for this example, a detailed description definitely requires more attributes to be taken into consideration (see [Mae94]).

If an action $a$ never occurs during the training phase, the corresponding decision tree consists of only one leaf mapping all possible situations to the set $P_a$ of all plans containing $a$ as a substep. This information can be directly extracted from the decomposition hierarchy by simply collecting all plans with a decomposition link to $a$. In the case of action quit, for example, this set comprises all the plans contained in the plan hierarchy. As a consequence, observing this action without further information does not allow the set of plan hypotheses to be restricted in any way.

After the training phase the system is equipped with a set of decision trees $T(a)$, one for each possible action $a$. The collection of all these trees can be considered an uncertain and incomplete model of the user's preferences.

3.3 The Plan-Recognition Process

Given a plan hierarchy and the decision trees $T(a)$, the plan-recognition process itself takes place in four steps:

1. Whenever the user executes a new action $a_0$, the relevant attribute values of the current situation $s_0$ are determined. In the given example these are the sender and length of the current message. These values are obtained by querying the application system using process communication to link it to the plan recognition component.

2. This situation $s_0$ is classified using the decision tree $T(a_0)$ associated with the given action. On the basis of the result of this classification and additional information provided by the plan recognizer, a numerical hypothesis assessment (the so-called hypothesis strength $\hat{r}$ induced by the recent observation) is computed.

3. $\hat{r}$ is combined with the existing hypothesis assessment $m_i$, resulting from the last observation, thus yielding the updated assessment $m_{i+1}$. The latter forms the basis for the computation of a ranking of all hypotheses and the eventual selection of the "best" one if this is required (e.g. if the user explicitly asks for help).

4. After the end of a plan-recognition session, the data collected are marked with the recognized plan and the decision trees can be updated in order to cover also the experience gained during this last session (see the description of the "training phase" in Section 3.2).

Point 2 requires further explanation. To that end, assume the user's first action was a read_message, the current message was from his manager and had a length of 30 lines (i.e. $s_0$ = [Manager, 30] in terms of these attributes). Using the decision tree from Figure 2, the classification of $s_0$ leads to the assumption that the user's current action definitely belongs to the plan Read_delete_msgs. However, as the decision trees were constructed from a limited number of observed interactions only, the reliability of these data is restricted. This is taken into account by assigning a certain amount $v_0$ of the numerical evidence mass to the hypothesis set containing all basic plans that do not explicitly exclude an occurrence of this action.⁵ Assuming a value of 0.1 for $v_0$, this yields a hypothesis assessment

$$ r(X) = \begin{cases} 0.9, & X = \mathit{Read\_delete\_msgs} \\ 0.1, & X = \mathit{Read\_mail} \end{cases} \qquad (1) $$

which constitutes a so-called basic probability assignment (bpa) from the Dempster-Shafer theory (DST) [Sha76] on the set of all plan hypotheses (the values for all other plans are zero). That means the instruction on how to assess the various plan hypotheses after a given observation is "If an occurrence of read_message is observed that fits into the logical and temporal structure of Read_delete_msgs and the currently read message was from the user's manager, then assign a degree of confidence of 0.9 to this hypothesis" (and analogously for the other plans, see Figure 2 and (1)).

⁵ The value of $v_0$ decreases stepwise as soon as the decision trees are updated in the light of new information, i.e. as soon as the data become more reliable.

The plan recognizer itself then checks whether the current occurrence of read_message satisfies the structural constraints imposed by the various plans with positive $r$-values. If some constraint is violated, the corresponding $r$-value is instead assigned to the most unspecific hypothesis Process_mail, thus representing the fact that the system does not know exactly how to deal with the current observation (see [Bau94]). The outcome of this step is denoted $\hat{r}$. Assuming $\hat{r} = r$, i.e. all constraints are satisfied, the recent observation induces a very high degree of confidence in the fact that the user is currently pursuing plan Read_delete_msgs while also leaving a certain plausibility to other plans (i.e. other plans belonging to the abstract Read_mail are not excluded for the rest of this session) and thus produces the desired result. Subsequent observations will confirm or refute this assessment. The combination of $\hat{r}$ with their results as mentioned in the third step can be done using Dempster's rule [Dem67], the basic reasoning mechanism of DST.
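To make the combination step concrete, here is a minimal Python sketch of Dempster's rule applied to two bpas. It is an illustration under assumptions: hypotheses are represented as frozensets of basic plans (following the closed-world remark in Section 2), the membership of the abstract plans (`READ_MAIL`, `STORE`, `PROCESS_MAIL`) is hypothetical, and the second bpa `r2` is an invented follow-up observation used only to show conflicting evidence being renormalized.

```python
def combine(m1, m2):
    """Dempster's rule of combination for two bpas keyed by frozensets."""
    combined = {}
    for a, x in m1.items():
        for b, y in m2.items():
            intersection = a & b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + x * y
            # Products with an empty intersection are conflict mass and are dropped.
    if not combined:
        raise ValueError("totally conflicting evidence cannot be combined")
    total = sum(combined.values())  # equals 1 minus the conflict mass
    return {hypothesis: mass / total for hypothesis, mass in combined.items()}


# Hypothetical frame of basic plans; abstract plans are sets of basic plans.
RD = frozenset({"Read_delete_msgs"})
STORE = frozenset({"Read_save_msgs"})  # assumed content of Store_messages
READ_MAIL = frozenset({"Read_delete_msgs", "Read_save_msgs", "Read_write_msgs"})
PROCESS_MAIL = READ_MAIL | {"Send_mail"}

m0 = {PROCESS_MAIL: 1.0}            # no evidence yet: all mass on the root
r1 = {RD: 0.9, READ_MAIL: 0.1}      # assessment (1) from the text
m1 = combine(m0, r1)

r2 = {RD: 0.18, STORE: 0.72, READ_MAIL: 0.10}  # invented second observation
m2 = combine(m1, r2)

for hypothesis, mass in sorted(m2.items(), key=lambda item: -item[1]):
    print(sorted(hypothesis), round(mass, 3))
```

Combining with the vacuous assignment `m0` leaves `r1` unchanged, which is why the first observation can be processed in the same way as every later one.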

Remarks:

1. If the current situation cannot be uniquely classified (as is the case for the user reading a message of length 60 from a co-worker, i.e. $s_0$ = [Co-Worker, 60]), the frequency information stored in the corresponding leaf of the decision tree is used to compute the resulting hypothesis assessment $r$. In the given case, the values are 0.18 for Read_delete_msgs, 0.72 for Store_messages, 0.10 for Read_mail, and 0.00 otherwise. To obtain these values, compute the relative frequencies of all plans in the corresponding leaf (0.8 for Store_messages and 0.2 for Read_delete_msgs) and discount these values by $v_0$ (in this case 0.1).

2. In case the current situation cannot be classified at all using $T(a_0)$ (e.g. if the sender is neither a co-worker nor a manager), the statistical information contained in all the leaves of $T(a_0)$ is used to assess the various plan hypotheses. For the decision tree of Figure 2, this yields the values 0.1 for Read_delete_msgs and Read_mail and 0.8 for Store_messages (after discounting by $v_0 = 0.1$). This means that such cases are assessed according to the user's long-term behavior without considering the particular situation encountered.
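The numbers in both remarks can be reproduced with a short Python sketch. The function names and the dictionary encoding of the tree are hypothetical; the leaf counts are those of Figure 2. Two assumptions are made explicit in the code: a length of exactly 50 is treated as belonging to the "50+" leaf, and the reserved mass $v_0$ is always assigned to Read_mail, as in the example of equation (1).

```python
from collections import Counter

# Leaf counts of the decision tree for read_message (Figure 2).
TREE = {
    ("Manager", "*"): Counter(Read_delete_msgs=5),
    ("Co-worker", "1-49"): Counter(Store_messages=48),
    ("Co-worker", "50+"): Counter(Read_delete_msgs=2, Store_messages=8),
}


def classify(tree, sender, length):
    """Return the plan counts of the matching leaf, or of all leaves pooled."""
    if sender == "Manager":
        return tree[("Manager", "*")]
    if sender == "Co-worker":
        # Assumption: a length of exactly 50 falls into the "50+" leaf.
        return tree[("Co-worker", "1-49" if length < 50 else "50+")]
    # Unclassifiable situation: fall back to the statistics of all leaves.
    return sum(tree.values(), Counter())


def assess(leaf_counts, fallback, v0):
    """Relative plan frequencies discounted by v0; v0 goes to the fallback set."""
    total = sum(leaf_counts.values())
    r = {plan: (1 - v0) * n / total for plan, n in leaf_counts.items()}
    r[fallback] = r.get(fallback, 0.0) + v0
    return r


r_manager = assess(classify(TREE, "Manager", 30), "Read_mail", 0.1)
r_coworker = assess(classify(TREE, "Co-worker", 60), "Read_mail", 0.1)
r_unknown = assess(classify(TREE, "Customer", 20), "Read_mail", 0.1)

for name, r in [("manager", r_manager), ("co-worker", r_coworker),
                ("unknown sender", r_unknown)]:
    print(name, {plan: round(value, 2) for plan, value in r.items()})
```

For the unknown sender the pooled counts are 7 observations of Read_delete_msgs and 56 of Store_messages out of 63, which after discounting gives the values 0.1 and 0.8 stated in the second remark.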

4 Conclusion and Future Work

The techniques presented make the user modeling component, which has to provide a model of the user's preferences as the basis for plan recognition, largely independent of the user's cooperation. As a consequence they allow the regularities behind a user's behavior (and thus his motivations for acting the way he does) to be recognized and recorded even in non-cooperative settings, i.e. in domains where the user is not necessarily willing or able to convey his intentions to the system. However, the integration of a phase where the user explicitly teaches his preferences to the system is straightforward. The structure of the user model allows the plan recognition results and the system reactions based on them to be justified to the user in a concise way [Bau96b].

The integration of machine learning and the numerical formalism provided by the DST proved to make plan recognition reliable in the sense that large portions of the typical behavior of users in a given application can be quickly identified [Bau96a]. However, the feasibility of the proposed methods is limited to applications where users display a "substantial amount of repetitive behavior" (cf. [Mae94]). Additionally, a "training phase" is required to build an adequate model of a user's preferences. However, the combination with DST as the basic numerical representation formalism enables the system to deal even with significantly underspecified situations, i.e. with incomplete statistical data resulting from a small number of observations only.

Future work will include the investigation of alternative learning algorithms like C4.5 [Qui93] and empirical evaluation of the techniques presented in various application domains. Here the reliability of the system's prediction of the user's future behavior and the question of how to come up with an appropriate feature set for the description of situations will play an important role. Finally, the possibility of partial matches will be investigated for the case that a given situation cannot be classified at all using the information gained so far.

References

[Bau94] M. Bauer. Quantitative Modeling of User Preferences for Plan Recognition. In B. Goodman, A. Kobsa, and D. Litman, editors, Proceedings of the Fourth International Conference on User Modeling (UM94), pages 73-78, Hyannis, MA, USA, 1994.

[Bau96a] M. Bauer. Acquisition of User Preferences for Plan Recognition. In D. Chin, editor, Proceedings of the Fifth International Conference on User Modeling (UM96), pages 105-112, Kailua-Kona, HI, USA, 1996.

[Bau96b] M. Bauer. Justification of Plan Recognition Results. In W. Wahlster, editor, Proceedings of the 12th European Conference on Artificial Intelligence, pages 647-651, Budapest, Hungary, 1996.

[BP93] M. Bauer and G. Paul. Logic-based Plan Recognition for Intelligent Help Systems. In C. Bäckström and E. Sandewall, editors, Current Trends in AI Planning: Proceedings of the Second European Workshop on Planning, Frontiers in Artificial Intelligence and Applications, pages 60-73, Vadstena, Sweden, 1993. IOS Press.

[Dem67] A.P. Dempster. Upper and lower probabilities induced by a multivalued mapping. Annals of Mathematical Statistics, 38:325-339, 1967.

[DM83] T. Dietterich and R. Michalski. Machine Learning: An Artificial Intelligence Approach. In Michalski et al. [MCM83], chapter 3.

[GL92] B.A. Goodman and D.J. Litman. On the Interaction between Plan Recognition and Intelligent Interfaces. User Modeling and User-Adapted Interaction, 2(1-2):83-116, 1992.

[Mae94] P. Maes. Agents that Reduce Work and Information Overload. Communications of the ACM, 37(7):31-40, 1994.

[MCM83] R. Michalski, J. Carbonell, and T. Mitchell, editors. Machine Learning: An Artificial Intelligence Approach. Springer-Verlag, Berlin, Heidelberg, New York, 1983.

[Qui83] J.R. Quinlan. Learning Efficient Classification Procedures and Their Application to Chess End Games. In Michalski et al. [MCM83], chapter 15.

[Qui86] J.R. Quinlan. Induction of Decision Trees. Machine Learning, 1:81-106, 1986.

[Qui93] J.R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, Los Altos, CA, 1993.

[Sha76] G. Shafer. A Mathematical Theory of Evidence. Princeton University Press, Princeton, NJ, 1976.