Int. J. Web Engineering and Technology, Vol. X, No. X, 2004

Agent-based collaborative filtering based on fuzzy recommendations

Javier Carbo and Jose M. Molina
Complex Adaptive Systems Laboratory, Comp. Science Dept., Univ. Carlos III of Madrid, Spain
E-mail: [email protected]
E-mail: [email protected]

Abstract: Recommender systems intend to provide suggestions based on the opinions of several sources of information. But personalised suggestions based on a user’s past likes and dislikes require a distributed approach. In this way, agents may automatically collect recommendations from other agents, applying personal criteria in order to determine whether an item is recommended to the user or not. The application of agent technology to the recommendation problem has been tested before by researchers from M.I.T., the Univ. of North Carolina, and the Spanish Research Institute. In this paper, we present a new, elegant and effective way to combine vague and subjective opinions to make recommendations using fuzzy logic. We have adapted real data on evaluations of movies (from the site MovieLens) to compare our proposal with its predecessors. The experimental results show, using ROC curves and cost analysis, that our approach performs better than some other distributed collaborative filtering methods applied to provide personalised recommendations.

Keywords: autonomous agents; collaborative filtering; recommendation systems; fuzzy logic.

Reference to this paper should be made as follows: Carbo, J. and Molina, J.M. (2004) ‘Agent-based collaborative filtering based on fuzzy recommendations’, Int. J. Web Engineering and Technology, Vol. X, No. Y, pp.000–000.

Biographical notes: Javier Carbo is currently senior lecturer of the Computer Science Dept. of the Carlos III University of Madrid (Spain). He belongs to the Group of Applied Artificial Intelligence in this department. He has previously worked at the Artificial Intelligence Laboratory in the University of Savoie (France).
He obtained his PhD degree from the Carlos III University in 2002. He obtained a Computer Science degree in the Politecnica University of Madrid in 1997. Dr. Carbo has over 20 publications and conference contributions. He has acted as a reviewer of several international conferences and has also organised several international workshops. He took part in a U.N.-funded research project, two ESPRIT programs and other national projects. His interests are in automated negotiations, multiagent systems, electronic payments and fuzzy logic.

José M. Molina received a degree in Telecommunication Engineering from the Universidad Politecnica de Madrid in 1993 and a PhD degree from the same university in 1997. He has worked in the Signal Processing and Simulation Group of the same university since 1992, participating in several national and European projects related to Radar Processing. He also joined the Computer Science Department of the University Carlos III of Madrid in 1993, being enrolled in the Group of Applied Artificial Intelligence working in soft computing techniques (NN, Evolutionary Computation, Fuzzy Logic and Multiagent Systems). He is author of up to ten journal papers and 70 conference papers. His current research focuses on the application of soft computing techniques to security in e-commerce, information retrieval from web, problem solving in web domains, radar processing, air traffic control, etc.

Copyright © 2004 Inderscience Enterprises Ltd.

1 Introduction

Through the Internet we can reach a great number of products, commercial sites and people, but choosing among so many options is not an easy task. Recommender systems have emerged to solve this problem. In fact some online sites play such a role, collecting user feedback in order to recommend items. Amazon, IMDb, BarnesAndNoble, CDNOW, etc. are examples of recommender sites.

There are mainly two types of recommender systems [1]: content-based recommending and collaborative filtering. The first of them compares a representation of the content contained in an item to a representation of the content that interests the user. But in domains where content is difficult for a computer to analyse (e.g. ideas, opinions, etc.), collaborative filtering is the most promising approach. Collaborative filtering systems face two limitations [2], usually noted as sparsity and the first-rater problem. Sparsity stands for the low probability of finding a set of users with significantly similar ratings in a very populated system. The first-rater problem stands for the handicap suffered by new items when the system is not in an initial stage of use. Neither problem has yet been completely solved.

The classic view of collaborative filtering recommenders [3] consists of a database of preferences for items by users (see Figure 1). A new user matches his preferences against the database to find neighbours (other users with similar likes and dislikes). Then neighbours would recommend the items that they liked. However, such recommenders suffer from two fundamental problems: low scalability of the system and low quality of the recommendations. The first of them is tackled through a distributed view of the recommendation task, since the central entity could become a bottleneck of the system. A distributed approach is also more adequate for privacy issues, since the user’s profile should be protected. A third motivation to use a distributed approach relies on the different criteria that users may apply to evaluate an item. Only the recommendations from those neighbours who share the same evaluation criteria should be considered.

Therefore, no central entity should collect the ratings from all users, and each user has its own view of the world, cooperates with some other agents (neighbours) and acts by himself (autonomously). Both characteristics are often applied to define ‘agents’. These autonomous programs intend to make real-time intelligent decisions on behalf of human users [4]. The use of agents may improve the efficiency, customisation and competence of many selection decisions. Despite the very remarkable potential advantages of using autonomous agents, their level of success is far from expectations. Many factors may be limiting their use, but one of the most argued reasons is the lack of trust of human users in the automatic decision-making process involved. In order to overcome this handicap, agents should make intelligent decisions, but in a human-like way. These final aims are also applicable to the generation of recommendations [5,6].

Figure 1 A database of ratings acting as a central recommender

One of the main attributes of agents is their ability to cooperate, forming coalitions of agents with common interests (so-called neighbours). The agents who belong to the same coalition, or cluster, share information about third parties with their neighbours. This cooperative attitude of agents reflects human collaborative filtering methods and has been successfully applied in several domains [7]. The formation process of these clusters has been studied extensively in the literature, and is not the focus of this paper [8].

Figure 2 Agent-based collaborative filtering

The second of the challenges faced by collaborative filtering, namely the low quality of recommendations, also plays an important role. Recommender systems would not succeed if recommendations were poor. Recommender systems, like other search systems, have two types of characteristic error: false negatives (items not recommended but users would like them), and false positives (recommended items that did not please the users). In electronic commerce scenarios the most important errors to avoid are false positives, since this kind of error generates distrust towards recommender systems and leads to the failure of their acceptance.

Both characteristics are strongly linked, since the distributed nature of the multi-agent system has influence over the quality of recommendations. In centralised approaches, all opinions (collected by the central entity) are taken into account, but in a distributed system of agents, each agent has to make recommendations with incomplete information (only its own opinions and the opinions of its neighbours are known). So the quality of both types of approaches cannot be compared on a fair basis. These characteristics are the main focus of our work, and define the final goal of the research: designing and testing a distributed (agent-based) collaborative filtering method that provides high-quality recommendations.

The rest of the paper is organised as follows. Section 2 provides an overview of the most relevant agent-based collaborative filtering algorithms. In section 3, we describe in detail our proposal using fuzzy recommendations. Section 4 presents the experimental setup, the dataset used, and the results obtained. Finally, section 5 concludes with some future extensions of our work.

2 Distributed collaborative filtering algorithms

The general problem faced by collaborative filtering algorithms consists of how to combine the recommendations from different sources of information. If a distributed approach is assumed, a personalised way to combine such recommendations is required. One of the simplest ways to combine the recommendations from different agents is to use the sum of all recommendations (which may be positive or negative), or the average of the positive recommendations. The different points of view of each agent depend on which sources of information are selected to be taken into account. These computations were much too simple to provide high-quality recommendations, and more sophisticated proposals emerged from the academic community. In general they compute the final opinion about a given item i by aggregating recommendations through a weighted mean, where the differences between the proposals come from the definition of the weights applied in the formula.

One of them, called SPORAS, was proposed by M.I.T. researchers [9]. It computes the weight of each recommendation (valued between 0 and 100) guided by the following aims:

• If a high number of recommendations were taken into account in the evaluation of the given item i, the contribution of a newly received recommendation would decrease until a certain level of confidence is reached.

• If a given item i was strongly recommended, future recommendations of this item i would be considered with a lower weight.

• Only the most recent recommendations would be taken into account.

Based on these principles, a new recommendation $R_t$ received at time $t$ would be weighted using equation 1.

$$w = \frac{1}{\theta} \cdot \Phi(R_{t-1})$$

Eq. 1 SPORAS weight

where $R_{t-1}$ is the agent's own opinion, in other words, the aggregation of previous recommendations (up to time $t-1$), and $\theta$ is the effective number of ratings taken into account in the evaluation ($\theta > 1$). The larger the number of recommendations considered, the smaller the change in the resulting opinion about item i. Furthermore, the function $\Phi$ is defined in order to slow down the incremental changes for strongly recommended items. It is computed from equation 2.

$$\Phi(R_{t-1}) = 1 - \frac{1}{1 + e^{-(R_{t-1} - D)/\sigma}}$$

Eq. 2 SPORAS damping function

where the domain $D$ is the maximum possible recommendation value and $\sigma$ is chosen so that the resulting $\Phi$ remains above 0.9 when recommendation values are below ¾ of $D$.

Another model for aggregating positive recommendations, called REGRET, was proposed by researchers from the Artificial Intelligence Research Institute of the Spanish Council for Scientific Research. The core idea of REGRET [10] is to emphasise the freshness of the information. Computations in REGRET give a fixed high relevance to recent ratings over older ones, according to a time-dependent function $\rho$ that gives higher values to recommendations received at times closer to the current time. It also considers only the last $\theta$ ratings in the weighted mean computation, where $\rho(t_c, t)$ is a normalised value calculated from equation 3:

$$w = \rho(t_c, t) = \frac{f(t, t_c)}{\sum_{k=c-\theta}^{c} f(t_k, t_c)}$$

Eq. 3 REGRET weight function in time $t_i$ for a recommendation received at time $t$

where $t_c$ stands for the current time, and $t_k$ represents the instant when recommendation k was received. An alternative interpretation treats $t$ as the ordinal number of a rating rather than a temporal measure. For instance, the example of function $f$ provided by the authors is equation 4.

$$f(t, t_c) = \frac{t}{t_c}$$

Eq. 4 REGRET example of a time-dependent function

The last collaborative filtering method studied in this paper is the one proposed by researchers Singh and Yu from the University of North Carolina [11]. This algorithm accepts recommendations rated from –1 to 1, and it behaves in different ways according to the sign of the recommendation. If the given item i received a recommendation $R_t$ at time $t$, and the agent's own previous opinion about the item up to the reception of that recommendation, $R_{t-1}$, has the same sign, then the formula to be applied is equation 5.

$$w = (1 - R_{t-1})$$

Eq. 5 Singh and Yu weight when $R_t$ and $R_{t-1}$ have the same sign

Otherwise, equation 6 should be considered:

$$w = \frac{1}{1 - \min\{|R_{t-1}|, |R_t|\}}$$

Eq. 6 Singh and Yu weight when $R_t$ and $R_{t-1}$ have opposite sign

These equations aim to give more weight to those recommendations with opposite sign to the aggregation of the rest of the recommendations. In this way the authors implement a collaborative filtering method that intends to minimise deception through an extreme sensitivity to disagreement among the recommendations from the neighbours.
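The three weighting schemes of equations 1 to 6 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the choice of σ (so that Φ equals exactly 0.9 at ¾ of D) is one assumption that satisfies the stated condition, and treating a zero opinion as "same sign" in the Singh and Yu weight is also an assumption not stated in the text.

```python
import math


def sporas_weight(prev_opinion, theta, domain_max=100.0, sigma=None):
    """Eq. 1 and Eq. 2: w = (1 / theta) * Phi(R_{t-1})."""
    if sigma is None:
        # Assumption: choose sigma so that Phi is exactly 0.9 at 3/4 of D,
        # which keeps Phi above 0.9 for all smaller opinion values.
        sigma = (domain_max / 4.0) / math.log(9.0)
    phi = 1.0 - 1.0 / (1.0 + math.exp(-(prev_opinion - domain_max) / sigma))
    return phi / theta


def regret_weights(times, current_time):
    """Eq. 3 with the example f(t, tc) = t / tc of Eq. 4.

    `times` holds the reception times of the last theta ratings.
    The returned weights are normalised, so they sum to one.
    """
    raw = [t / current_time for t in times]
    total = sum(raw)
    return [value / total for value in raw]


def singh_yu_weight(prev_opinion, new_rec):
    """Eq. 5 (same sign) and Eq. 6 (opposite sign), opinions in [-1, 1]."""
    if prev_opinion * new_rec >= 0:  # assumption: zero counts as same sign
        return 1.0 - prev_opinion
    # Undefined when both absolute values equal 1 (division by zero).
    return 1.0 / (1.0 - min(abs(prev_opinion), abs(new_rec)))


if __name__ == "__main__":
    print(sporas_weight(75.0, theta=1.0))
    print(regret_weights([1, 2, 3], current_time=3))
    print(singh_yu_weight(0.5, -0.2))
```

Note how the opposite-sign branch of the Singh and Yu weight always returns a value of at least one, which is what makes a disagreeing recommendation count more heavily.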

3 Agent-based collaborative filtering through fuzzy recommendations (ACF-FR)

We assumed previously that collaborative filtering is preferred to content-based recommending when the content of the items to evaluate is difficult to analyse. In the introduction, we also considered the different evaluation criteria of users among the reasons motivating a distributed approach. Beyond this consideration, we can also assume that these differences come from the vague and uncertain nature of subjective interpretations of the quality of a given item. Therefore human users might feel more comfortable using vague terms rather than numeric values. These linguistic labels are then mapped to pre-defined fuzzy sets. So fuzzy sets may be appropriate to represent recommendations derived from personalised criteria.

Fuzzy sets usually assign a truth-value in the range [0,1] to each possible value of the domain. These values form a possibility distribution over a continuous or discrete space. If we considered recommendations in the ad-hoc continuous space [0,100], we could graphically represent vague terms such as ‘highly recommended’ with possibility distributions.

Once we have recommendations expressed as fuzzy sets, the corresponding agent-based collaborative filtering algorithm becomes an aggregation of fuzzy sets. Many aggregation operators have been proposed to combine different fuzzy sets: conjunctive, disjunctive, compensative, non-compensative, and weighted. When the elements to be combined have different importance, a weighted aggregation has to be used. The most common weighted operator is the weighted arithmetic mean, where 1 ≥ w ≥ 0. The weight w determines the contribution of the last recommendation to the agent's own opinion about a given item i.

Since this aggregation operation would be computed by an agent, and agents intend to show intelligent behaviour, we have applied an evaluation method which adapts the computations according to the circumstances faced in the past by the agent. In this way, the expected contribution of a new recommendation (weight w) will change dynamically according to the overall level of success/failure of predictions.
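As an illustration of how vague terms might be represented over the space [0,100], the sketch below builds uniformly spaced fuzzy sets and evaluates the membership degree of a numeric value. The triangular shape and the helper names are assumptions made for this example only; the text does not specify the shape of the possibility distributions used.

```python
def triangular(x, left, peak, right):
    """Membership degree in [0, 1] of x in a triangular fuzzy set."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def uniform_partition(n_sets, low=0.0, high=100.0):
    """Return (left, peak, right) triples with evenly spaced peaks.

    Assumption: neighbouring sets overlap so that memberships of any
    value inside [low, high] add up to one.
    """
    step = (high - low) / (n_sets - 1)
    return [
        (low + (i - 1) * step, low + i * step, low + (i + 1) * step)
        for i in range(n_sets)
    ]


if __name__ == "__main__":
    sets = uniform_partition(6)  # six labels, peaks at 0, 20, ..., 100
    value = 30.0
    degrees = [triangular(value, *fuzzy_set) for fuzzy_set in sets]
    print(degrees)
```

With six sets, a value of 30 belongs partially to the two labels whose peaks are at 20 and 40, which is the kind of vagueness a single numeric rating cannot express.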


From the observation of the equations proposed by other authors, we can state that the last recommendation would at most count as much as the previous opinion held about the item from past recommendations (when w ≈ 1). At the other extreme, a new recommendation weighted with w ≈ 0 would not modify the previous opinion at all. Next, we focus on the meaning of this weight w in order to satisfy the aim of agents to reflect human reasoning and behaviour.

The value of the weight w determines how much a new recommendation is taken into account. We can interpret this value in terms of the remembrance of the agent: low remembrance means that old recommendations are nearly forgotten, whereas high remembrance means that old recommendations are well remembered. So if we denote memory or remembrance at time $t$ as $\rho_t$, the weight of a new recommendation is given by equation 7.

$$w = (1 - \rho_t)$$

Eq. 7 Weight of new recommendations in our proposal

Furthermore, the adaptive behaviour of agents depends on how remembrance grows or decreases. It should change according to the success/failure of the last prediction. This level of success/failure can be obtained from the similarity between the final opinion about the item (obtained from the aggregation of the recommendations received from the neighbours) and the result of the subsequent human evaluation of the item. We estimate this similarity through the support of the matching between both fuzzy sets. We compute their semantic unification using the mass assignment theory developed by J.F. Baldwin [12]. The instantiation of a variable to a fuzzy set is interpreted as a possibility distribution, which is equivalent to a mass assignment. The mass assignment algebra applied to fuzzy sets provides a justification for the rules of intersection, complementation and union used in Fuzzy Set Theory. This approach to interpreting fuzzy sets underlies ‘Fril’, the Artificial Intelligence logic programming language used to implement and test our proposal.

The output value of the matching operation is then used to decide how far the fuzzy set moves towards the new recommendation received, and how much the shape of the fuzzy set is modified to represent an increasing/decreasing level of cumulative doubt about the item. A detailed description of the mathematical transformation (applied to the problem of inferring trust from commercial activities) can be found in previous works of the authors [13,14].

Once similarity is computed in this way, the average of the previous remembrance $\rho_{t-1}$ and the similarity $\Delta(R_{t-1}, S_t)$ gives the new remembrance $\rho_t$ in equation 8.

$$\rho_t = \frac{\rho_{t-1} + \Delta(R_{t-1}, S_t)}{2}$$

Eq. 8 Remembrance updating equation

The output of this equation follows these simple principles:

• If the recommendation was accurate (∆ ≈ 1), then remembrance would increase to 1/2 + ρ/2.

• If the recommendation was not similar at all to the subsequent evaluation of the item (∆ ≈ 0), the relevance of remembrance in the future would be halved.


These properties keep ρ from falling below zero or rising above one. The initial value of remembrance associated with any agent joining the system should be the minimum (zero), although it would increase as the level of relative success in past recommendations became relevant.
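The remembrance update of equation 8 and the weight of equation 7 can be traced with a short Python sketch. The function names are hypothetical, and the similarity values are supplied directly as numbers in [0,1]; the semantic unification that produces them in the paper (implemented in Fril) is not reproduced here.

```python
def update_remembrance(rho_prev, similarity):
    """Eq. 8: average of the previous remembrance and the similarity."""
    return (rho_prev + similarity) / 2.0


def recommendation_weight(rho):
    """Eq. 7: weight given to a newly received recommendation."""
    return 1.0 - rho


if __name__ == "__main__":
    rho = 0.0  # a new agent starts with minimum remembrance
    # Illustrative similarity values: three accurate predictions, one miss.
    for similarity in [1.0, 1.0, 1.0, 0.0]:
        rho = update_remembrance(rho, similarity)
        print(rho, recommendation_weight(rho))
```

Because each update is an average of two values in [0,1], ρ can never leave that interval, and a single complete miss halves whatever remembrance had been accumulated.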

4 Evaluation

4.1 Experimental setup

In order to assess the accuracy of recommendations, we use the Relative Operating Characteristic (ROC) [15] technique. This technique is often applied to test the accuracy of predictors. Since we are working with recommendations, we consider that an event e is predicted when an item i is recommended. The application of this measure to collaborative filtering was also tested in previous works [16,17].

The ROC is based on the notion that a prediction of an event e would be assumed if event e was predicted by at least a fraction p of ensemble members, where the threshold p is defined a priori. Applied to our domain, an item i would be recommended if item i was recommended by the neighbours and by ourselves with a final rate greater than a given threshold pt.

Let us consider first a deterministic (single model) recommendation of item i (either that it will please the user or that it will not satisfy the user). Over a sufficiently large sample of independent recommendations, we can form a contingency matrix (see Table 1) giving the frequency with which i pleased the user or not, and whether it was recommended or not.

Table 1 Contingency matrix of a recommender system

| Item i recommended | Satisfies the user: No | Satisfies the user: Yes |
|---|---|---|
| No | α | β |
| Yes | γ | δ |

Based on these values, the ‘hit-rate’ (H) and ‘false alarm rate’ (F) for item i are given by equation 9:

$$H = \frac{\delta}{\beta + \delta}, \qquad F = \frac{\gamma}{\alpha + \gamma}$$

Eq. 9 Hit-rate and false alarm rate according to the contingency matrix

Hit and false alarm rates for a probabilistic prediction can be defined as follows [18]. Suppose it is assumed that item i will be recommended if the final rate of the item, pi, is greater than p (and will not if pi < p). By varying p between 0 and 1 we can define H = H(p), F = F(p).

The value of H stands for the ROC sensitivity, where sensitivity is defined as the probability that a good item is recommended. On the other hand, F = 1 – specificity, where specificity is defined as the probability that a bad item is not recommended. Since we are interested in avoiding deceptions or false positives, γ (recommended items that did not please the users), and in avoiding false negatives, β (items not recommended but users would like them), we will study both rates, F and H, of the ROC applied to different agent-based collaborative filtering algorithms. The accuracy of the recommendations is given by the area under the ROC curve (so-called AROC), where the ROC curve is a plot of H(p) against F(p). A perfect deterministic forecast will have AROC = 1, whilst a no-skill forecast, for which the hit and false alarm rates are equal, will have AROC = 0.5.

Although AROC gives an objective measure of skill for ensemble predictions, it is difficult to determine which threshold is more suitable. Cost analysis is often used to fix such a threshold. This analysis evaluates two kinds of cost:

• When a certain event e is predicted, it causes some action of cost C irrespective of whether or not event e occurs.

• If event e occurs and no action has been taken, a loss L is incurred (see Table 2).

So it is desirable to minimise both costs.

Table 2 Associated costs contingency matrix

| Item i recommended | Satisfies the user: No | Satisfies the user: Yes |
|---|---|---|
| No | 0 | L |
| Yes | C | C |
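The two kinds of cost described above can be combined into a total expense for a given threshold, and the threshold with the lowest expense can then be selected. The sketch below is a minimal illustration under stated assumptions: the function names, the per-threshold counts and the values chosen for C and L are all hypothetical, not data from the experiments.

```python
def total_cost(hits, false_alarms, misses, action_cost, loss):
    """Cost C for every recommended item, loss L for every missed good item.

    hits:         recommended items that satisfied the user
    false_alarms: recommended items that did not satisfy the user
    misses:       satisfying items that were not recommended
    """
    return action_cost * (hits + false_alarms) + loss * misses


def cheapest_threshold(counts_by_threshold, action_cost, loss):
    """Return the threshold whose contingency counts give the lowest cost."""
    return min(
        counts_by_threshold,
        key=lambda p: total_cost(*counts_by_threshold[p], action_cost, loss),
    )


if __name__ == "__main__":
    # Hypothetical (hits, false_alarms, misses) counts per threshold.
    counts = {0.2: (3, 2, 0), 0.5: (2, 1, 1), 0.8: (1, 0, 2)}
    for p, c in counts.items():
        print(p, total_cost(*c, action_cost=1.0, loss=3.0))
    print(cheapest_threshold(counts, action_cost=1.0, loss=3.0))
```

When the loss L is large relative to the action cost C, low thresholds (recommending more items) tend to be cheaper; the reverse holds when actions are expensive.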

4.2 Dataset

We use a dataset of movie recommendations from the MovieLens recommender system provided by Compaq Systems Research Center [19]. MovieLens is a web-based research recommender system that debuted in Fall 1997 and lasted 18 months. Although information about the title and genre of each movie and the age and gender of voters is also provided, we focus only on ratings and movie identifiers. Opinions of users are expressed as zero-to-five star ratings. These ratings were mapped linearly to the interval [0,1]. Table 3 shows a histogram of the ratings:

Table 3 Histogram of ratings in MovieLens dataset

| Star rating | Numerical rating | % Count |
|---|---|---|
| – | 0 | 12.34 |
| * | 0.2 | 5.35 |
| ** | 0.4 | 12.08 |
| *** | 0.6 | 24.93 |
| **** | 0.8 | 27.08 |
| ***** | 1 | 18.19 |

The information gathered by this site consists of 74,424 users who have expressed opinions on 1649 different movies. We randomly selected users among those who had rated 70 or more movies, and we also selected the movies that were evaluated more than 35 times in order to avoid the sparsity problem. Finally we had 53 users and 28 movies. The average number of votes per user is approximately 18, so the sparsity of the selected set of users and movies is under 35% (see equation 10).

$$\text{sparsity} = 1 - \frac{\text{nonzero entries}}{\text{total entries}} = 1 - \frac{970}{53 \cdot 28} \approx 0.34$$

Eq. 10 Sparsity of data subset on movies recommendation
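The linear mapping of star ratings and the sparsity of equation 10 are simple enough to check directly. The following Python sketch uses hypothetical function names and the figures quoted in the text (970 ratings, 53 users, 28 movies).

```python
def stars_to_unit(stars, max_stars=5):
    """Map a zero-to-five star rating linearly onto [0, 1]."""
    return stars / max_stars


def sparsity(nonzero_entries, n_users, n_items):
    """Eq. 10: fraction of the user-by-item matrix that is empty."""
    return 1.0 - nonzero_entries / (n_users * n_items)


if __name__ == "__main__":
    print([stars_to_unit(s) for s in range(6)])
    print(sparsity(970, 53, 28))   # selected subset of users and movies
    print(970 / 53)                # average number of votes per user
```

The same computation confirms the average of roughly 18 votes per user quoted for the selected subset.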

4.3 Results

Among the 53 users, we select the user who rated the most movies (22), and this user is represented by an agent who asks the other 52 users about each of those 22 movies. The recommendations from the other users are combined to generate the prediction about the 22 movies. The zero-to-five star ratings were mapped to six normalised, uniformly distributed fuzzy sets in order to apply ACF-FR. Each of these predictions is considered positive if the resulting recommendation is greater than a certain threshold; otherwise the agent does not recommend that movie. As the opinion of the user about those 22 movies is already known, we can compute hit (H) and false-alarm (F) rates considering the same threshold. The rates computed for thresholds p = 0.2, 0.4, 0.6 and 0.8 with the described recommending algorithms are plotted as the ROC curves of Figure 3. The area covered by the ROC curves gives a measure of the accuracy of such predictions.

Figure 3 ROC curve of different agent-based collaborative filtering algorithms on real data
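The computation of hit and false alarm rates for each threshold can be sketched as follows. This is a minimal illustration: the function name and the sample ratings are hypothetical, and applying the same threshold to the known user rating to decide whether a movie satisfied the user is one reading of the procedure described above, taken here as an assumption.

```python
def hit_and_false_alarm(predicted, actual, threshold):
    """Return (H, F) for one threshold over paired ratings in [0, 1]."""
    hits = misses = false_alarms = correct_rejections = 0
    for pred, real in zip(predicted, actual):
        recommended = pred > threshold
        satisfied = real > threshold  # assumption: same threshold on both
        if recommended and satisfied:
            hits += 1
        elif recommended:
            false_alarms += 1
        elif satisfied:
            misses += 1
        else:
            correct_rejections += 1
    positives = hits + misses
    negatives = false_alarms + correct_rejections
    hit_rate = hits / positives if positives else 0.0
    false_alarm_rate = false_alarms / negatives if negatives else 0.0
    return hit_rate, false_alarm_rate


if __name__ == "__main__":
    # Made-up aggregated predictions and known user ratings.
    predicted = [0.9, 0.7, 0.45, 0.3, 0.85, 0.1]
    actual = [1.0, 0.8, 0.6, 0.2, 0.4, 0.0]
    for p in (0.2, 0.4, 0.6, 0.8):
        print(p, hit_and_false_alarm(predicted, actual, p))
```

Each threshold yields one (F, H) point, and the set of points for a given algorithm traces its ROC curve.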

In the figure we can observe that the areas covered by SPORAS and by Singh and Yu’s proposal are the lowest ones, and both behave more or less like a random decision. On the other hand, the ROC curves of REGRET and of our proposal (ACF-FR) cover a greater area than the others. We can observe in more detail the difference among the areas covered by these ROC curves in Table 4.

Table 4 Comparison of recommending algorithms using AROC

| Algorithm | Area |
|---|---|
| Random | 0.5 |
| SPORAS | 0.528 |
| REGRET | 0.722 |
| Singh and Yu | 0.573 |
| ACF-FR | 0.7845 |
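An area such as those listed in Table 4 can be approximated from a handful of (F, H) points. The sketch below uses the trapezoidal rule and closes the curve at (0,0) and (1,1); both choices are assumptions made for illustration, since the text does not state how the areas were integrated, and the sample points are made up.

```python
def area_under_roc(points):
    """Trapezoidal area under a ROC curve given as (F, H) pairs.

    Assumption: the curve is anchored at (0, 0) and (1, 1), the points
    obtained when nothing or everything is recommended.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (f0, h0), (f1, h1) in zip(pts, pts[1:]):
        area += (f1 - f0) * (h0 + h1) / 2.0
    return area


if __name__ == "__main__":
    print(area_under_roc([(0.5, 0.5)]))              # no-skill diagonal
    print(area_under_roc([(0.0, 1.0)]))              # perfect predictor
    print(area_under_roc([(0.2, 0.6), (0.5, 0.9)]))  # made-up curve
```

A curve lying on the diagonal gives 0.5, matching the ‘Random’ row of the table, while a curve passing through the top-left corner gives 1.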

From the results shown in Table 4, we can conclude that our proposal achieves a better global performance than the other recommending algorithms. But this does not mean that our proposal would obtain better results with a particular threshold. In order to study the behaviour of the recommending algorithms studied, we compute a cost analysis. Since we are predicting the satisfaction provided by a given movie from the recommendations of other users, we can consider the cost of the opportunities lost (L) from the well-rated movies that were not recommended. This cost is computed from the difference between the rating and the threshold used. On the other hand, the consequences of taking an action (viewing a given movie) from the ensemble recommendation can be considered negative if the final rating of the movie was below the threshold used. So we compute such cost (C) as the negative difference between the threshold and the rating. A more detailed description of the computations can be observed in Table 5.

Table 5 Costs of a rating v (0
