
How SVD and demographic data can be used to enhance generalized Collaborative Filtering

Manolis Vozalis and Konstantinos G. Margaritis
Parallel Distributed Processing Laboratory
Department of Applied Informatics, University of Macedonia
Egnatia 156, P.O. 1591, 54006, Thessaloniki, Greece
E-mail: {mans,kmarg}@uom.gr
URL: http://macedonia.uom.gr/~{mans,kmarg}

Abstract

In this paper we examine how Singular Value Decomposition (SVD), along with demographic information, can enhance plain Collaborative Filtering (CF) algorithms. After a brief introduction to SVD, in which some of its previous applications in Recommender Systems are revisited, we give a full description of our proposed method: it applies SVD at different stages of an algorithm that can be described as CF enhanced by demographic data. We test the efficiency of the resulting approach on two commonly used CF approaches (User-based and Item-based CF). The experimental part of this work involves a number of variations of the proposed approach. The results show that the combined use of SVD and demographic data is promising, since it not only tackles some of the recorded problems of Recommender Systems, but also helps increase the accuracy of systems employing it.

Keywords: Recommender Systems, Collaborative Filtering, Personalization, Singular Value Decomposition (SVD), demographic data

1 Introduction

Recommender Systems were introduced as a computer-based intelligent technique that assists people with the problem of information and product overload. Their aim is to provide efficient personalized solutions in e-business domains, benefiting both the customer and the merchant. Recommender Systems feature two basic entities: the user and the item. A user who utilizes the Recommender System provides his opinion about past items. The purpose of the Recommender System is to generate suggestions about new items for that particular user. The process is based on the input provided, which is usually expressed in the form of ratings from that user, and on the filtering algorithm, which is applied to that input.

Recommender Systems suffer from a number of fundamental problems that reduce the quality of the generated predictions. Among them are sparsity, which makes it difficult for algorithms to locate successful neighbors; scalability, which refers to the performance degradation that follows an increase in the amount of data; and synonymy, which is caused by the fact that similar products have different names and cannot be easily linked. A number of solutions have been introduced, intending to solve those problems [7] [3] [10]. We will focus on the case of Singular Value Decomposition, which comes from the area of Linear Algebra, was successfully employed in the domain of Information Retrieval, and only recently was proposed by various Recommender Systems researchers as a possible means of alleviating the aforementioned problems. The algorithm we present in this paper utilizes SVD as a technique that is able to effectively reduce the dimension of the user-item data matrix, and then executes Item-based Filtering with this low-rank representation to generate its predictions.

The remainder of the paper is structured as follows: Section 2 introduces the concept of Singular Value Decomposition. Section 3 presents some Recommender Systems which have employed SVD in order to improve their performance. Section 4 provides a brief analysis of our experimental methodology. Section 5 gives a step-by-step description of the proposed filtering method and then applies it to User-based and Item-based Collaborative Filtering. The experimental part, included in the same section, tests a number of variations of the main algorithm, contrasting them with each other. Finally, Section 6 summarizes the essence of this work, providing directions for future research.

2 Singular Value Decomposition (SVD)
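As a numerical companion to the definitions given in this section, the following is a minimal sketch, not part of the original method, that computes an SVD with NumPy and restricts it to the r nonzero singular values. The helper name `reduced_svd`, the tolerance `tol`, and the small 4 × 3 example matrix are assumptions introduced purely for illustration.

```python
import numpy as np

def reduced_svd(A, tol=1e-10):
    """Return (U_r, s_r, Vt_r), keeping only the r nonzero singular values of A.

    `tol` is an illustrative threshold below which a singular value is
    treated as zero; it is an assumption, not a value from the paper.
    """
    # full_matrices=True yields U (m x m) and V^T (n x n)
    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    r = int(np.sum(s > tol))  # numerical rank of A
    return U[:, :r], s[:r], Vt[:r, :]

# Hypothetical 4 x 3 matrix of rank 2 (third column = first + second)
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [2.0, 1.0, 3.0],
              [1.0, 1.0, 2.0]])

# Full decomposition: orthogonal U and V, singular values in descending order
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Reduced decomposition: m x r, r, and r x n factors
U_r, s_r, Vt_r = reduced_svd(A)

# The reduced factors still reproduce A (up to floating-point error)
A_rebuilt = U_r @ np.diag(s_r) @ Vt_r
```

In this sketch `U_r` has shape m × r and `Vt_r` (that is, V transposed) has shape r × n; the columns of `U_r` can also be checked to be eigenvectors of A·Aᵀ with eigenvalues equal to the squared singular values.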

Singular Value Decomposition (SVD) is a matrix factorization technique which takes an $m \times n$ matrix $A$, with rank $r$, and decomposes it as follows:

$$\mathrm{SVD}(A) = U \times S \times V^T$$

$U$ and $V$ are two orthogonal matrices with dimensions $m \times m$ and $n \times n$ respectively. $S$, called the singular matrix, is an $m \times n$ diagonal matrix whose diagonal entries are non-negative real numbers. The first $r$ diagonal entries of $S$ $(s_1, s_2, \ldots, s_r)$ have the property that $s_i > 0$ and $s_1 \geq s_2 \geq \ldots \geq s_r$. Accordingly, the first $r$ columns of $U$ are eigenvectors of $AA^T$ and represent the left singular vectors of $A$, spanning the column space. The first $r$ columns of $V$ are eigenvectors of $A^T A$ and represent the right singular vectors of $A$, spanning the row space. If we focus only on these $r$ nonzero singular values, the effective dimensions of the SVD matrices $U$, $S$ and $V^T$ become $m \times r$, $r \times r$ and $r \times n$ respectively.

An important attribute of SVD, particularly useful in the case of Recommender Systems, is that it can provide the best low-rank approximation of the original matrix $A$. By retaining the first k