A Rule-Based Recommender System for Online Discussion Forums ...

3 downloads 6906 Views 157KB Size Report
Online discussion forums allow people to discuss different topics using the .... and IT class, (4th year Computer Science students), the Comtella-D system was.
A Rule-Based Recommender System for Online Discussion Forums Fabian Abel2 , Ig Ibert Bittencourt1 , Nicola Henze2 , Daniel Krause2 , Julita Vassileva3 1

Federal University of Alagoas, Computer Science Institute Maceio, AL, Brazil [email protected] 2 IVS - Semantic Web Group, Leibniz University of Hannover Appelstr. 4, D-30167 Hannover, Germany {abel, henze, krause}@kbs.uni-hannover.de 3 Department of Computer Science, University of Saskatechwan 110 Science Place, Saskatoon SK S7N 5C9, Canada [email protected] Abstract. In this paper we present a rule-based personalization framework for encapsulating and combining personalization algorithms known from adaptive hypermedia and recommender systems. We show how this personalization framework can be integrated into existing systems by example of the educational online board Comtella-D, which exploits the framework for recommending relevant discussions to the users. In our evaluations we compare different recommender strategies, investigate usage behavior over time, and show that a small amount of user data is sufficient to generate precise recommendations.

1

Introduction

Online discussion forums allow people to discuss different topics using the World Wide Web. While a discussion forum has often one large overall topic, it is normally divided into subtopics, called subforums or topics. The subforums are further divided into single threads. In these threads, one specific question, defined by the thread opener, is discussed by several users. Every user who wants to contribute can create text snippets, called posts which are ordered by the time they have been created. Posts can be displayed as a list and enable other users to follow the discussion easily. The tree-like structure of the discussion forums enables users to navigate quickly to the topics which they are interested in. A drawback of the structure is that it is hard to find interesting threads if the thread is either not classified correctly by the thread opener or the user does not know how a specific topic of his interest is classified into the static discussion forum’s hierarchy. Another drawback is that every thread can be assigned to one category, making threads matching to multiple categories hard to find. A commonly used approach to handle the problems caused by the described classification is to display a flattened list of all topics which encountered recent changes as a starting point. However, when forums become popular and hence large or the community discusses various topics, these lists contain a high percentage of threads which are not relevant

for the user. This results in the situation that interested users encounter serious problems to find relevant threads when the forum grows. Collaborative recommender systems can be used to cope with the issue of bringing users and relevant tasks together. In an E-Learning Online Discussion Forum like Comtella-D there are different kinds of input data which can be used to create recommendations. In this paper we evaluate a) which kind of input data fits best and b) how much input data is required to generate appropriate recommendations. Based on the results of this evaluation, we propose a rule based framework which chooses the optimal input data source: Recommender systems based on different input data sources are implemented as Web Services of the Personal Reader Framework [1]. A rule layer enables on the one hand to pass parameters to single personalization Web Services, PServices for short, and on the other hand to combine the results of different PServices. Default rules allow Comtella-D users to use this personalization framework immediately without any interaction. Moreover, user adjustable parameters in the rule allow for fine tuning of the rules if a user is not satisfied with the recommendations. The paper is structured as follows: In Section 2 we describe our rule-based personalization system. In Section 3 we describe the Comtella-D system and outline the need for personalization in this system. Afterwards, by evaluating different recommender strategies, we define a flexible personalization rule in Section 4 that performs best in creating recommendation in different scenarios. Section 5 contains related work and Section 6 gives a conclusion and some further ideas to be exploited.

2

Rule-based Personalization System

Personalization techniques have been investigated extensively in different areas of computer science. Especially in the domains of recommender systems [2] and adaptive hypermedia [3], personalization algorithms have been developed and deployed in various systems. These personalization techniques are generic as the algorithms can be deployed in different domains, changing the domain specific input data without the need of modifying the algorithm itself. Hence, personalization algorithms are perfect candidates for being encapsulated to become reusable. However, in current systems these algorithms are often strongly coupled with the system as the data is often domain specific pre- or postprocessed or combined with other algorithms. In our system we decouple personalization algorithms, data sources and pre- and postprocessing from each other and allow the creation of rules which describe the interaction of the single components. Figure 1 shows the architecture of the rule-based recommender system. It emphasizes two aspects. First, it assures the integration of different recommendation algorithms based on the use of Web Services. Second, show how to integrate external personalization functionality like recommendations. A description of the components of the architecture is listed below. However, it is not the focus of this paper to describe each component in detail. The technologies used in the

development of the system were Java, Prot´eg´e 4 , SWRL5 , OWL-S Editor 6 , and MindSwap 7 .

Fig. 1. Architecture of the System

DB represents all databases that can be used for personalization, for e.g. user profiles or data provided in the Web DS each data source represents an encapsulated personalization algorithm like a collaborative recommender system SWS (Web Services) – each data source can be accessed as Web Service Comtella Application it represents the Comtella application (more details are described in the next section) Rule-based Recommendation Interface this interface is used to specify personalization rules. Section 4.5 gives an example of such a rule. SWRL (Semantic Web Rule Language) it is used to specify the conversion of information between the Comtella-D application and the data sources

3

The Comtella-D System

Comtella Discussions (Comtella-D) [4] is an online community for discussing the social, ethical, legal and managerial issues associated with information technology and biotechnology. It was used to support the coursework related to a 4th year undergraduate class on Ethics and IT taught in the spring of 2006 at the University of Saskatchewan. Access to content is restricted to registered members, but anyone may create an account at http://fire.usask.ca after consenting to release their access data for research purposes. A nickname/alias, e-mail address, and password are required to create an account. Members are relatively anonymous because they are identified just by their alias. The purpose of using Comtella-D in the class was sharing and discussing information (Internet publications, popular magazine, articles, etc.) related to the course topics. The students had to share at least one link to an online article related to the weekly topic and summarize the article in a way that it stimulates discussion. 4 5 6 7

http://protege.stanford.edu/ http://www.w3.org/Submission/SWRL/ http://owlseditor.semwebcentral.org/ http://www.mindswap.org/

As a part of their coursework, the students also had to reply/discuss two of their colleagues’ postings each week. In parallel with the students of the Ethics and IT class, (4th year Computer Science students), the Comtella-D system was used in a class on Ethics and Technology offered by the Philosophy department. These students used the system only as an additional resource, recommended by the instructor. The system was not related to their coursework and it was used entirely voluntary. In the context of Comtella-D, a ’forum is an initial theme related to a course topic (usually weekly), defined and created by the instructor. A ’thread is started when a student contributes a link (URL) of a paper related to the topic of the forum. The first ’post in a new thread contains the URL and a summary of the paper (usually half a page). Further ’posts in the thread are added as other students respond to/discuss the first post of the thread. Each post can be commented. A ’comment is usually a very specific local comment to the post rather than to the entire thread. In Comtella-D comments were used mostly by the marker to give feedback on the quality of arguments raised in the students posts. Comtella-D allows students to rate posts by adding or removing ’energy to or from it. A user can rate every post once, but only if there is free energy in the system available. The system provides a limited number of energy units, depending on the level of activity in the system. The number of energy units in the system increases every time when a new post is created (2 new units are added), and it decays with time. In this way, the scarcity of energy in the system prevents users from overrating their colleagues posts, and encourages them to carefully read a post before assigning energy to it. This mechanism is described in [4]. As every week several new threads are started and popular threads attract many posts, keeping an overview of the discussion is a time consuming task. A student who does not spend the time to read all new posts could easily miss important topics of his/her interest. Hence, a recommender system is needed which points the student to relevant posts. Using our rule-based personalization framework, we can utilize collaborative recommender services to solve this task. Based on the features of Comtella-D, there are different possibilities on which input data such a collaborative recommender can perform: a) recommendations based on explicit feedback gained from the user’s energy rating and b) recommendations based on implicit feedback gained from co-posting in the same thread. In the following section we will evaluate which kind of user feedback fits better to recommend threads a user might be interested in. Therefore, we also evaluate how much input data is required and over which time frame this input data has to be provided to generate high quality recommendations.

4

Evaluation

For the evaluation we took a snapshot of the Comtella-D system of the Ethics and Computer Science course 2006. Overall, there were 110 registered users. From these users only 36 contributed actively by posting a least one message in

the discussion forum. Users rated other users 183 time and posted 756 messages in 173 threads over a time period of approximately 3 months. In these three months, the lectures deal every week with a new topic. To define a personalization rule which recommends threads a user could be interested in, we use the existing user interaction with the system. Before creating this rule, we have to examine different questions: a) How much training data is required to generate precise recommendations (Section 4.1)? b) What kind of input data (explicit or implicit) gives the best quality to recommend threads (Section 4.2)? c) Does the behavior of users in the discussion forum change over time (Section 4.3)? d) Are active users, i.e. users who have posted frequently and hence are more experienced, more reliable as source for recommendations (Section 4.4)?. For all of the following measurements, we used a recommender library8 which implemented the collaborative recommender algorithm described in [5]. 4.1 Required Amount of Training Data According to the first question we divided our data set into weeks corresponding to the different topics of the lectures. Afterwards, we iterated over the weeks, selecting every week x as training set and tried to calculate the posts a specific user will create in week x + 1. Therefore, we classified the users into different classes, these classes contain sets of users who have posted at least y posts in different threads and at least 1 post in the test set. Furthermore, as a nonpersonalized baseline algorithm, we recommend the top-k threads having the most posts. Our hypothesis is that the more data from a user is available in the training set, the more precise the recommendation for the test set are. The precision-recall distribution is build by iterating over all users in the class and calculating the top-k recommendations for these users. k is chosen from 1 to the number of all posts. For every k, the precision and recall is calculated as the average mean of all precision and recall values of all users in the class. Therefore, the recommendation system is invoked as follows: First, the posts generated in the training set are passed to the recommender system to determine the similarity between the users. Afterwards, the recommendations are calculated by passing all posts to the recommender system which were created in the test set. Figure 2 displays the precision-recall distribution for the non-personalized baseline algorithm and the personalized recommendations based on users who have contributed at least 2, 3, 4, or 5 posts in the training set. While for k 4. However, none of the different classes results in significantly better results than the other classes. Furthermore, all approaches are able to retrieve not more than 80% of the threads the users have contributed to. This can be explained by the characteristics of the recommendation process: When a thread is recommended, a user who is similar to the current user must have contributed to this thread. Hence, threads which are discussed by only a few users are recommended rarely. 8

http://www.l3s.de/˜diederich/SW/renkground-2006-09-07-1030.zip

1

min posts = 5 min posts = 4 min posts = 3 min posts = 2 baseline

Precision

0.8

0.6

0.4

0.2

0 0

0.2

0.4

0.6

0.8

1

Recall

Fig. 2. The precision-recall diagram based on implicit user feedback for users who have posted at least 2, 3, 4, or 5 times in the training set week.

This issue is known as new item problem in collaborative recommender systems [6]. Overall, the results imply that a) the non-personalized baseline algorithm is outperformed by the personalized algorithm and that b) two posts in a week are sufficient to generate precise personalized recommendations while more posts do not improve this the results significantly. 4.2

Implicit vs. Explicit User Feedback

Based on the classes defined in the previous section which used implicit user feedback by engaging the posts a user created, we define equivalent classes of explicit user feedback: These classes contain users who have at least added or removed x energy points to posts from other users in the training set week and have at least posted once in the test set week. To recommend posts by using user ratings we modified the similarity function of the recommender system. Instead of comparing the similarity of user vectors containing threads a user has posted in, we use vectors containing the energy distribution. Two users are considered as similar when they gave energy to the same post, hence expressing interest in the same post. We did not take into account if users added or removed energy as we interpreted every form of energy assignment as interest in a post. The recommender algorithm itself was not modified. Figure 3 gives an overview of the precision-recall ratio of recommendations based on explicit feedback for the classes of users having rated at least 2, or 3 other users in the training set period. The class with 5 energy assignments was omitted as it contained not enough users to deliver reliable results. The graph outlines that – like in the previous section – a comparable small amount of input data, namely two energy assignments, are sufficient to create appropriate

1

min energy = 2 min energy = 3 min energy = 4 baseline

Precision

0.8

0.6

0.4

0.2

0 0

0.2

0.4

0.6

0.8

1

Recall

Fig. 3. The precision-recall diagram based on explicit user feedback for users who have rated at least 2, 3, or 4 posts of other users in the training set week.

recommendations and that increasing the amount of input data does not increase the precision or recall of the recommendations significantly. Compared to the precision-recall distribution generated by implicit user feedback, the quality of the results generated by explicit feedback, in respect of both, precision and recall, are lower. We also tried to combine explicit feedback and implicit feedback as we expected that input from different sources could improve the overall performance. We used the average mean to combine the weighted result sets of the recommendations based on explicit feedback and implicit feedback. We examined that the more we increased the weight of the explicit user feedback, the worse our recommender system performed. Our conclusion for the given setting is that explicit feedback performs always worse than implicit feedback and cannot be used to improve recommendation based on implicit feedback. However, if no implicit feedback is given for a specific user, explicit feedback performs better than the non-personalized baseline algorithm. Hence, explicit feedback based recommendations can be used as a fallback if no implicit feedback is available. Based on these results we used implicit user feedback as source for the recommendations applied in the following evaluations. 4.3 User Behavior The Comtella-D system was strongly coupled with the timeline of the lectures. This means that the users discussed every week a new topic. We assume that the behavior of users changes over time (and over different topics) which means that the more weeks ahead recommendations are created, the more imprecise they are. Furthermore, as topics discussed in a given week should be still somewhat fresher in the memory of the students, we assume that the forecast for the next week would be more precise than forecasts for two or more weeks ahead.

To verify our assumptions, we iterated over all weeks and used them as training data. We calculate the recommendations for n weeks ahead, where n = 1, 2, .., 7 and compared them with the test data. Afterwards, we created the precision-recall diagram displayed in Figure 4.3.

1

1 week ahead 2 weeks ahead 3 weeks ahead 4 weeks ahead 5 weeks ahead 6 weeks ahead 7 weeks ahead

Precision

0.8

0.6

0.4

0.2

0 0

0.2

0.4

0.6

0.8

1

Recall

Fig. 4. User behavior over time

The figure displays a result which does not comply with our assumptions: The one week ahead precision-recall values for small top-k result sets are worse than all other forecasts. Furthermore, the forecasts for more weeks ahead do not comply to any rule or trend. This means that the behavior of the users indeed changes over time and topic, but that the change of behavior is not monotonic and cannot be forecasted. However, we have to remark that our dataset covers only three months of data. Thus, we can only infer about the short time behavior of users but cannot conclude that there is not a long time trend in user behavior. 4.4

Size of the Time frame

In the previous section we have shown that the user behavior changes over the weeks making a constantly high forecast for several weeks ahead impossible. To lower this effect, we increase the input data timeframe by aggregating several weeks as training set and creating recommendations for one week ahead. We expect that aggregating several weeks of input data normalizes the behavior of a user on one hand and increases the amount of input data one the other. Both effects should result in an increased quality of the recommendations. Figure 5 displays the measurement aggregating one to five weeks of input data and calculating the precision and recall of the recommendations for the following week. All input periods result in similar results. Our expectation that more input weeks could improve the result could not be proven. This also underlines our

1

1 input week 2 input weeks 3 input weeks 4 input weeks 5 input weeks

Precision

0.8

0.6

0.4

0.2

0 0

0.2

0.4

0.6

0.8

1

Recall

Fig. 5. Variation of the amount of weeks used as training data

previous observation that the changes of quality regarding precision and recall seem to follow no rule or trend. 4.5 Results The results show that a small amount of input data (two posts or energy assignments) is enough to generate precise information. Furthermore, we have shown that the implicit user feedback, given by the posting behavior of users gives much better recommendations than explicit user feedback given by the energy assignment of the users. Also we have shown that more input data does not automatically result in better recommendations. According to these observations, an optimal personalization strategy to recommend threads in the Comtella-D system is the following: if exist 2 or more posts of the user: -> recommendation based on implicit feedback else if exist 2 or more energy assignments of the user: -> recommendation based on explicit feedback else use the non-personalized baseline algorithm

5

Related Work

Our framework aims on decoupling personalization functionality from a specific application. For single domains, for e.g. the e-learning domain, there already exist approaches that realize such a decoupling [7]. However, to the best of our knowledge there exists no generic approach describing such an encapsulation of personalization functionality. Also different personalization techniques are already combined to overcome the disadvantages of single personalization techniques. In the domain of recommender systems, these combination techniques are known as hybrid recommender systems systems [6], utilizing for example both, collaborative and

content based recommender systems to overcome the new item or new user problem. The rule-based framework, however, does not only allow for a static combination of different personalization techniques. Instead, every rule can combine arbitrary personalization techniques.

6

Conclusion and Future Work

In this paper we presented a rule-based framework to combine arbitrary personalization techniques. Therefore, personalization techniques are encapsulated and separated from their input data to be reusable in different applications. We used the Comtella-D system to outline how the framework could be used to recommend forum threads. We specified a rule which selects the optimal recommendation technique according to the existing user information. To determine the best strategies, we evaluated which kinds of input data and which quantity is required to provide accurate recommendations. In the future, we plan to make the rule user-adjustable. This can be done by introducing user adjustable weights in the rule which enable the combination of different techniques according to a user’s preferences. Additionally, we plan to engage content-based recommender systems which take the text of the posts into account to improve the quality of recommendations further.

References 1. Abel, F., Baumgartner, R., Brooks, A., Enzi, C., Gottlob, G., Henze, N., Herzog, M., Kriesell, M., Nejdl, W., Tomaschewski, K.: The personal publication reader, semantic web challenge 2005. In: 4th International Semantic Web Conference. (nov 2005) 2. Adomavicius, G., Tuzhilin, A.: Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering 17(6) (2005) 734–749 3. Brusilovsky, P.: Adaptive Hypermedia. User Modeling and User-Adapted Interaction 11 (2001) 87–110 4. Webster, A., Vassileva, J.: Visualizing personal relations in online communities. In Wade, V.P., Ashman, H., Smyth, B., eds.: AH. Volume 4018 of Lecture Notes in Computer Science., Springer (2006) 223–233 5. Shardanand, U., Maes, P.: Social information filtering: Algorithms for automating “word of mouth”. In: Proceedings of ACM CHI’95 Conference on Human Factors in Computing Systems. Volume 1. (1995) 210–217 6. Burke, R.: Hybrid recommender systems: Survey and experiments. User Modeling and User-Adapted Interaction 12(4) (2002) 331–370 7. Brusilovsky, P., Henze, N.: Open corpus adaptive educational hypermedia. In Brusilovsky, P., Kobsa, A., Nejdl, W., eds.: The Adaptive Web. Volume 4321 of Lecture Notes in Computer Science., Springer (2007) 671–696

Suggest Documents