Recommendation System Using Multistrategy Inference and Learning
Bartłomiej Śnieżyński
AGH University of Science and Technology, Institute of Computer Science, Kraków, Poland; e-mail: [email protected]
Abstract. This paper presents a new approach to building recommendation systems. A Multistrategy Inference and Learning System based on the Logic of Plausible Reasoning (LPR) is proposed. Two groups of knowledge transmutations are defined: inference transmutations, which are formalized as LPR proof rules, and complex ones, which can use machine learning algorithms to generate intrinsically new knowledge. All operators are used by the inference engine in a uniform manner. This paper describes the necessary formalism and the system architecture. Preliminary experimental results of an application of the system conclude the work. Keywords: Recommendation system, adaptive web sites, multistrategy learning, inferential theory of learning, logic of plausible reasoning.
1 Introduction
Developing adaptive web systems has been a challenge for the AI community for several years [1]. A broad range of soft computing methods is used in this domain: neural networks, fuzzy logic, genetic algorithms, clustering, fuzzy clustering, and neuro-fuzzy systems [2]. This paper presents a completely new approach: a knowledge representation and inference technique that is able to perform multi-type plausible inference and learning. It is used to build a model of a recommendation system.

The Multistrategy Inference and Learning System (MILS) proposed below is an attempt to implement the Inferential Theory of Learning [3]. In this approach, learning and inference can be presented as a goal-guided exploration of the knowledge space using operators called knowledge transmutations. MILS combines many knowledge manipulation techniques to infer a given goal. It is able to use background knowledge or machine learning algorithms to produce information that is not contained in the data. The Logic of Plausible Reasoning (LPR) [4] is used as a base for knowledge representation.

In the following sections LPR and MILS are presented. Next, preliminary results of experiments are described.
2 Outline of the Logic of Plausible Reasoning
The core part of MILS is the Logic of Plausible Reasoning, introduced by Collins and Michalski [4]. It can be defined as a labeled deductive system [5] in the following way. The language consists of a finite set of constant symbols C, a countable set of variable names X, five relational symbols, and the logical connectives →, ∧. The relational symbols are V, H, B, S, and E. They are used to represent statements (V), hierarchy (H, B), similarity (S), and dependency (E).

Statements are represented as object-attribute-value triples: V(o, a, v), where o, a, v ∈ C. Such a triple represents the fact that object o has an attribute a equal to v. Value v should be a subtype of attribute a. If object o has several values of a, there should be several corresponding statements in the knowledge base. To represent vagueness of knowledge, it is possible to extend this definition and allow a composite value [v1, v2, ..., vn], a list of elements of C, interpreted as: object o has attribute a equal to v1 or v2, ..., or vn. If n = 1, the notation V(o, a, v1) is used instead of V(o, a, [v1]).

Relation H(o1, o, c), where o1, o, c ∈ C, means that o1 is a type of o in context c. The context is used to specify the range of inheritance: o1 and o have the same value for all attributes which depend on attribute c of object o. To state that one object is below another in any hierarchy, the relation B(o1, o), where o1, o ∈ C, should be used.

Relation S(o1, o2, c) represents the fact that o1 is similar to o2, where o1, o2, c ∈ C. The context, as above, specifies the range of similarity: only those attributes of o1 and o2 which depend on attribute c have the same value.

The dependency relation E(o1, a1, o2, a2), where o1, a1, o2, a2 ∈ C, means that values of attribute a1 of object o1 depend on attribute a2 of the second object. Using the relational symbols, the formulas of LPR can be defined.
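As an illustration, the relation types described above could be encoded as plain records. This is a hedged sketch only; all type and field names are illustrative and not part of the paper's formalism.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Statement:      # V(o, a, v): object o has attribute a equal to v
    obj: str
    attr: str
    values: tuple     # a composite value (v1, ..., vn) means "v1 or ... or vn"

@dataclass(frozen=True)
class Hierarchy:      # H(o1, o, c): o1 is a type of o in context c
    child: str
    parent: str
    context: str

@dataclass(frozen=True)
class Similarity:     # S(o1, o2, c): o1 is similar to o2 in context c
    left: str
    right: str
    context: str

@dataclass(frozen=True)
class Dependency:     # E(o1, a1, o2, a2): a1 of o1 depends on a2 of o2
    obj1: str
    attr1: str
    obj2: str
    attr2: str

# V(a1, importance, high), a statement used later in the experiments:
fact = Statement("a1", "importance", ("high",))
```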
If o, o1, ..., on, a, a1, ..., an, v, c ∈ C and v1, ..., vn are lists of elements of C, then V(o, a, v), H(o1, o, c), B(o1, o), S(o1, o2, c), E(o1, a1, o2, a2), and V(o1, a1, v1) ∧ ... ∧ V(on, an, vn) → V(o, a, v) are formulas of LPR. To represent general rules, it is possible to use variables instead of constant symbols at the object and value positions in implications. To manage uncertainty the following label algebra is used:

A = (A, {fri}). (1)
A is a set of labels which estimate the uncertainty of formulas. A labeled formula is a pair f : l, where f is a formula and l ∈ A is a label. A set of labeled formulas can be considered a knowledge base.

LPR inference patterns are defined as classical proof rules. Every proof rule ri has a sequence of premises (of length pri) and a conclusion. {fri} is a set of functions which are used in proof rules to generate the label of a conclusion: for every proof rule ri an appropriate function fri : A^pri → A should be defined. For a rule ri with premises p1 : l1, ..., pn : ln, the plausible label of its conclusion equals fri(l1, ..., ln). An example of a plausible label algebra can be found in [6].
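The paper's actual label algebra is not given here (see [6]); as a hedged numeric stand-in, one could take labels to be certainty factors in [0, 1] and define a combination function fri for each proof rule, for example:

```python
# Hypothetical label algebra: labels are certainty factors in [0, 1].
# The combination functions below are illustrative choices, not the
# algebra actually used by the system.

def f_mp(rule_label, *premise_labels):
    # One plausible choice for modus ponens: the conclusion is no more
    # certain than the rule or any matched premise.
    return min(rule_label, *premise_labels)

def f_sim(stmt_label, sim_label):
    # Similarity transformation: discount the statement's label by the
    # strength of the similarity formula.
    return stmt_label * sim_label

label = f_mp(0.9, 0.8, 0.7)   # rule labeled 0.9, premises 0.8 and 0.7
```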
There are five main types of proof rules: GEN, SPEC, SIM, TRAN and MP. They correspond to the following inference patterns: generalization, specialization, similarity transformation, transitivity of relations, and modus ponens. Some transformations can be applied to different types of formulas; therefore, indexes are used to distinguish different versions of the rules. Formal definitions of these rules can be found in [4, 7].
3 Multistrategy Inference and Learning System
The core element of MILS is the inference engine. Its input is an LPR formula that serves as the inference goal. The algorithm builds an inference chain, using knowledge transmutations to infer the goal. Two types of knowledge transmutations are defined in MILS: simple (LPR proof rules) and complex (using complex computations, e.g. rule induction algorithms or procedures changing the representation form).

A knowledge transmutation can be represented as a triple (p, c, a), where p is a (possibly empty) list of premises or preconditions, c is a consequence (a pattern of the formula(s) that can be generated), and a is an action (empty for simple transmutations) that should be executed to generate the consequence if the premises are true according to the knowledge base. Every transmutation has an assigned cost, which should reflect the transmutation's computational complexity and/or other important resources that are consumed. Usually, simple transmutations have a low cost and complex ones a high cost.

The MILS inference algorithm is a backward chaining procedure that can be formalized as tree search. It is a strict adaptation of the LPR proof algorithm [7], where proof rules are replaced by more general knowledge transmutations, and it is based on the AUTOLOGIC system developed by Morgan [8]. To limit the number of nodes and to generate optimal inference chains, the A* algorithm is used.
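Under strong simplifying assumptions (goals are ground, each transmutation is a fixed premises/consequence/cost record with no unification, no labels, and a zero A* heuristic, so the search degenerates to uniform-cost search), the backward-chaining loop could be sketched as:

```python
import heapq

# Hedged sketch of cost-ordered backward chaining over transmutations.
# Facts and transmutations are ground stand-ins for (p, c, a) triples.

FACTS = {("a1", "importance", "high"), ("u1", "jobIndustry", "finance")}

# consequence -> (premises, cost); actions are omitted in this sketch
TRANSMUTATIONS = {
    ("u1", "article", "a1"):
        ([("a1", "importance", "high"), ("u1", "jobIndustry", "finance")], 0.2),
}

def prove(goal):
    """Return the cheapest cost of an inference chain for goal, or None."""
    frontier = [(0.0, [goal])]          # (cost so far, open subgoals)
    while frontier:
        cost, goals = heapq.heappop(frontier)
        if not goals:
            return cost                  # all subgoals discharged
        g, rest = goals[0], goals[1:]
        if g in FACTS:
            heapq.heappush(frontier, (cost, rest))
        elif g in TRANSMUTATIONS:
            premises, c = TRANSMUTATIONS[g]
            heapq.heappush(frontier, (cost + c, premises + rest))
    return None

print(prove(("u1", "article", "a1")))    # -> 0.2
```

The real engine additionally unifies goal patterns containing variables, propagates labels, and uses a heuristic to prune the search tree.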
4 Preliminary Experimental Results
In the experiments, a model of a web version of a newspaper is considered. Its aim is to present articles to users. Users must register before they can read articles. During registration they fill in a preferences form, but they are not forced to answer all the questions. As a result, the user model is not complete, but the level of noise is low. Missing values can be inferred, when necessary, using machine learning algorithms.

The architecture of the system is presented in Fig. 1. Users' preferences and background knowledge are stored in the KB. When a user requests an index page, all articles are evaluated using MILS to check whether they are interesting to the user. If new knowledge is learned using complex operators during this evaluation, it is stored in the KB. Knowledge generated by complex transmutations can also be used for other purposes, e.g. to select appropriate advertisements for the current user.
Fig. 1. System architecture
In the current version of the software, only one complex and several simple knowledge transmutations are implemented. The complex one is a rule generation transmutation based on Michalski's AQ algorithm [9]. All derived and stored formulas have uncertainty and other factors assigned. The label algebra used is very simple and, because of the lack of space, it is not presented here. The cost of the MP transmutation is 0.2, and the cost of SPECo→ is 0.3. The rest of the simple transmutations have cost 0.1. The complex transmutation has the highest cost: 10.

Background knowledge consists of hierarchies used to describe articles (their topic, area, importance, and type) and users (gender, job industry, job title, primary responsibility, company size, area of interest, topic of interest), and several similarity formulas, e.g.

S(computerScience, telecommunication, topic). (2)
There are also implications that are used to recommend articles. Three of them are presented below (U, A, R, T are variables):

V(U, article, A) ← V(A, importance, high), V(U, jobIndustry, finance) (3)

V(U, article, A) ← V(U, jobIndustry, it), V(A, topic, computerScience) (4)

V(U, article, A) ← V(A, area, R), V(U, area, R), V(A, topic, T), V(U, topic, T) (5)
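As a hedged sketch, such implications could be stored as premise lists over variables, together with a small matcher that binds variables against statements. All names here are illustrative; in this sketch, capitalized terms are variables and constants are lowercase.

```python
# Implications (3)-(5) as head/body records over variables U, A, R, T.
RULES = [
    # (3) V(U, article, A) <- V(A, importance, high), V(U, jobIndustry, finance)
    {"head": ("U", "article", "A"),
     "body": [("A", "importance", "high"), ("U", "jobIndustry", "finance")]},
    # (4) V(U, article, A) <- V(U, jobIndustry, it), V(A, topic, computerScience)
    {"head": ("U", "article", "A"),
     "body": [("U", "jobIndustry", "it"), ("A", "topic", "computerScience")]},
    # (5) V(U, article, A) <- V(A, area, R), V(U, area, R),
    #                         V(A, topic, T), V(U, topic, T)
    {"head": ("U", "article", "A"),
     "body": [("A", "area", "R"), ("U", "area", "R"),
              ("A", "topic", "T"), ("U", "topic", "T")]},
]

def is_var(term):
    # Convention of this sketch only: variables are capitalized.
    return term[0].isupper()

def match(pattern, statement, binding):
    """Extend binding so that pattern equals statement, or return None."""
    b = dict(binding)
    for p, s in zip(pattern, statement):
        if is_var(p):
            if b.setdefault(p, s) != s:
                return None
        elif p != s:
            return None
    return b

binding = match(RULES[0]["body"][0], ("a1", "importance", "high"), {})
# binding -> {'A': 'a1'}
```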
The first rule can be interpreted as follows: user U is interested in article A if the article is important and the user's job industry is finance. The second rule says that users working in IT are interested in articles about computer science. The third rule checks whether the article's topic and area are equal to the user's topic and area of interest.

Data about twenty users and ten articles are stored in the KB. The attributes of three chosen articles are presented in Table 1. A question mark means that the value of the attribute is not known and the corresponding statement is not present in the KB.

Let us trace the reasoning of the system for user u1, described by the following attributes: gender = female, job title = manager, job industry = banking, primary responsibility = human resources, area = North America. Other user preferences are unknown. The status of article a1, represented by the goal formula V(u1, article, a1), is inferred using implication (3). The first premise is matched to the statement V(a1, importance, high) from the KB. The second premise, V(u1, jobIndustry, finance), is
Table 1. Chosen article attributes

Article  Topic              Area  Importance  Type    Recommended
a1       politics           ?     high        news    yes
a7       telecommunication  ?     low         news    no
a9       economy            USA   medium      report  yes
generated using the value generalization transmutation (GENv), because banking is below finance in the hierarchy.

Article a7 would be recommended if u1's job industry were IT. In that case implication (4) would be used, and the statement V(a7, topic, computerScience) would be derived from V(a7, topic, telecommunication) and similarity formula (2) using the value similarity transmutation (SIMv).

The proof for the third goal, V(u1, article, a9), is generated using implication (5). The article attributes are stored in the KB. The statement V(u1, area, USA) is inferred using value specialization (SPECv), because USA is below North America in the hierarchy. The derivation of V(u1, topic, economy) is more complicated. This statement is not supplied in user u1's preferences, and it is not possible to derive it using simple transmutations. This is why the complex transmutation AQ is used. It generates rules that allow the system to predict the topic using other user attributes. One of the generated rules is presented below:

V(user, topic, economy) ← V(user, jobIndustry, finance) (6)
It can be applied to user u1 after specialization (application of SPECo→), which replaces the symbol user with its descendant u1. Its premise is inferred using GENv, as above. As we can see, the system is able to infer plausible conclusions, automatically applying machine learning when necessary.

Because the system consists of an on-line module only, the learning process can cause delays in responses to user requests. When such delays cannot be accepted, the inference engine should be modified to record that some learning process has to be performed, so that this process can be executed later.
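The hierarchy-based steps of this trace (GENv moving a value up the hierarchy, SPECv moving it down) can be sketched as follows; the hierarchy entries and function names are illustrative, and labels are ignored.

```python
# Hypothetical hierarchy fragment: child -> parent, e.g. banking lies
# below finance, and USA below North America.
HIERARCHY = {
    "banking": "finance",
    "USA": "northAmerica",
}

def below(child, ancestor):
    """True if child is (transitively) below ancestor in the hierarchy."""
    while child in HIERARCHY:
        child = HIERARCHY[child]
        if child == ancestor:
            return True
    return False

def gen_v(stmt, target_value):
    """GENv sketch: V(o, a, v) yields V(o, a, v') when v is below v'."""
    o, a, v = stmt
    if below(v, target_value):
        return (o, a, target_value)
    return None

def spec_v(stmt, target_value):
    """SPECv sketch: V(o, a, v) yields V(o, a, v') when v' is below v."""
    o, a, v = stmt
    if below(target_value, v):
        return (o, a, target_value)
    return None

# Deriving V(u1, jobIndustry, finance) from V(u1, jobIndustry, banking):
print(gen_v(("u1", "jobIndustry", "banking"), "finance"))
# -> ('u1', 'jobIndustry', 'finance')
```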
5 Conclusions and Further Works
A Multistrategy Inference and Learning System based on LPR can be used as a tool to build recommender systems. Such an approach has several advantages. A single common knowledge representation and one KB are used to store user models, background knowledge, and user access history, and all inference processes can be carried out by the same inference engine. This technique seems promising for the construction of adaptive web sites.

Further work will concern adding other simple and complex transmutations, such as other rule induction algorithms and clustering methods that can be used to generate similarity formulas. On the other hand, a simplification of the LPR formalism is considered (e.g. the dependency relation will probably be omitted). To extend the system's capabilities, user activity (attributes of read articles) will be used to build user models. The current system is written in Prolog, which causes problems in debugging and further development; it will be rewritten in Java or C++. Next, it will be tested more intensively, and other applications in the adaptive web domain will be examined.
References

1. Perkowitz, M., Etzioni, O.: Adaptive web sites: an AI challenge. In: Proceedings of IJCAI-97. (1997) 16–23
2. Frias-Martinez, E., Magoulas, G., Chen, S., Macredie, R.: Recent soft computing approaches to user modeling in adaptive hypermedia. In Bra, P.D., Nejdl, W., eds.: Adaptive Hypermedia and Adaptive Web-Based Systems, Proceedings of 3rd Int. Conf. Adaptive Hypermedia (AH 2004). Volume 3137 of Lecture Notes in Computer Science, Springer (2004) 104–113
3. Michalski, R.S.: Inferential theory of learning: Developing foundations for multistrategy learning. In Michalski, R.S., ed.: Machine Learning: A Multistrategy Approach, Volume IV. Morgan Kaufmann Publishers (1994)
4. Collins, A., Michalski, R.S.: The logic of plausible reasoning: A core theory. Cognitive Science 13 (1989) 1–49
5. Gabbay, D.M.: LDS – Labeled Deductive Systems. Oxford University Press (1991)
6. Śnieżyński, B.: Probabilistic label algebra for the logic of plausible reasoning. In Klopotek, M., et al., eds.: Intelligent Information Systems 2002. Advances in Soft Computing, Physica-Verlag, Springer (2002)
7. Śnieżyński, B.: Proof searching algorithm for the logic of plausible reasoning. In Klopotek, M., et al., eds.: Intelligent Information Processing and Web Mining. Advances in Soft Computing, Springer (2003) 393–398
8. Morgan, C.G.: Autologic. Logique et Analyse 28 (110–111) (1985) 257–282
9. Michalski, R.S.: AQVAL/1 – computer implementation of a variable-valued logic system VL1 and examples of its application to pattern recognition. In: Proc. of the First International Joint Conference on Pattern Recognition. (1973)