Multimed Tools Appl (2015) 74:11399–11428 DOI 10.1007/s11042-014-2236-3
Towards context-sensitive collaborative media recommender system
Mohammed F. Alhamid & Majdi Rawashdeh & Hussein Al Osman & M. Shamim Hossain & Abdulmotaleb El Saddik
Received: 25 January 2014 / Revised: 16 June 2014 / Accepted: 14 August 2014 / Published online: 3 September 2014
© Springer Science+Business Media New York 2014
Abstract With the rapid increase of social media resources and services, Internet users are overwhelmed by the vast quantity of social media available. Most recommender systems personalize multimedia content to the users by analyzing two main dimensions of input: content (item), and user (consumer). In this study, we address the issue of how to improve the recommendation and the quality of the user experience by analyzing the contextual aspect of the users, at the time when they wish to consume multimedia content. Mainly, we highlight the potential of including a user’s biological signal and leveraging it within an adapted collaborative filtering algorithm. First, the proposed model utilizes existing online social networks by incorporating social tags and rating information in ways that personalize the search for content in a particular detected context. Second, we propose a recommendation algorithm to improve the user experience and satisfaction with the use of a biosignal in the recommendation process. Our experimental results show the feasibility of personalizing the recommendation according to the user’s context, and demonstrate some improvement on cold start situations where relatively little information is known about a user or an item.
M. F. Alhamid · H. Al Osman · A. El Saddik
Multimedia Communications Research Laboratory (MCRlab), School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada
M. F. Alhamid (*) · M. S. Hossain · A. El Saddik
College of Computer and Information Sciences (CCIS), King Saud University, Riyadh, Saudi Arabia
M. Rawashdeh
Division of Engineering, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
Keywords Personalized search · Context media search · Context-aware recommendation · Collaborative context · Retrieval model · Information filtering
1 Introduction
The prevalence of smart devices has made multimedia content on the social web more accessible. In addition, the ability to embed small, inexpensive sensors in our living environment enables and accelerates the development of context-aware recommendation systems that can retrieve the most desirable content for a specific context. The combination of real-time access to multimedia content and the overwhelming number of content options available on the social web makes building a personalized search model more challenging. For these reasons, helping users browse and consume multimedia content has been a challenging research problem over the past few years [33, 40]. Different users have different personalities [26]; likewise, a user’s choice of multimedia content varies with parameters that include emotions and various health conditions [23]. This necessitates integrating user preferences and physiological context into the recommendation algorithm. With today’s technology, a user’s physiological functions can be captured and monitored using biomedical sensors and computer systems. In this paper, we focus on recommending multimedia content to users based on their instantaneous requirements, which we represent by categorizing their contexts into physiological and environmental parameters, in order to enhance their experience and comfort level. Although a broad range of contextual parameters could be considered, we cover only certain context conditions in this paper. In particular, we highlight the importance of the physiological aspect of the user’s context during the recommendation process. However, the proposed model can also accommodate other contextual information that helps personalize the search.
As people share their preferences and information on different popular social media sites, they share important information about their consumed multimedia contents including their descriptive experience. A huge number of users on Last.fm, Youtube, and Facebook upload media content with annotation tags that describe their uploaded type of media. Our method benefits from the availability of such social tags by exploring context-based item recommendations and by avoiding the unrealistic need to analyze innumerable items on the Internet in order to detect the context. Accordingly, our recommendation technique offers a feasible solution to find relevant media content, based on the detected contextual information, and benefiting from the large number of existing online users who contribute to the classification and annotation of different media, which is more effective than just relying on feature extraction. Inspired by our earlier work [2], our proposed recommendation model makes the following contributions toward recommendation techniques. First, we utilize existing online social networks by incorporating social tags and rating information in ways that personalize the search for the right content, for a particular detected context. Second, we propose a recommendation algorithm to improve the user experience and satisfaction by the use of a biosignal in the recommendation process. The rest of this paper is organized as follows: Section 2 briefly reviews the existing work related to context-aware recommendations. Section 3 presents the detection of the user physiological context. Section 4 provides details concerning the proposed context-aware recommendation model. Section 5 highlights some scenarios and the usability of the proposed model by presenting an implemented recommender application. Finally, we present the experimental method and the related results and conclusion in Sections 6 and 7.
2 Related work
In this section, we briefly review existing recommendation techniques, particularly those based on context-awareness. We also review existing approaches that explore context detection and context-aware computing during the recommendation process.
2.1 Recommendation techniques
In recent years, the development of recommender systems has attracted considerable research interest in an attempt to discover the relation between users and the items they consume. These studies analyzed the hidden relationships between users and items, and proposed several methodologies that have become today’s standard ways of producing recommendations. The first traditional recommendation approach is Content-Based Filtering (CBF). The CBF approach analyzes item content and creates a profile for each item by assigning features such as type, category, and content-specific attributes. It then relies on matching these profiles against the profiles of other items that the user has explicitly rated. Such an approach is commonly used in building music or movie recommender systems, and needs the user’s historical selection of content in order to predict the ranking of new items. Another traditional method is Collaborative Filtering (CF), which recommends items based on the relationships between users rather than only between items. This approach relies on identifying similarities between users who share the same interests. As opposed to CBF, CF gives more attention to social aspects and uses the feedback of similar users to predict unknown user-item ratings. Later, researchers combined CBF and CF to build improved, hybrid versions of these traditional methods. The hybrid method was proposed to overcome a problem in CF, which considers users to be close to each other based only on a calculated distance, when in reality they might have completely different tastes.
Another problem is that new users do not have enough history to feed the recommender system and allow it to find relevant items. This is a well-documented problem known as the cold-start problem [37]. Contextual information boosts the effectiveness of the traditional recommender systems, which consider only the user, items, or their related information to find the next expected item to be selected. For instance, the user may like to listen to a particular type of music while running in the morning and not at other times of the day. Accordingly, if the context of the user has been well detected, the recommendation results might be completely different from the traditional ones, and better suited to the user’s needs [33]. Some studies combine the existing recommendation approach with some contextual information. For instance, Woerndl et al. [36] apply a combination of content-based filtering, collaborative filtering, and a hybrid method on the dataset of mobile application recommender systems. The authors present the effectiveness of applying the hybrid approach as a successful way of accommodating a contextual type of information. Campos et al. [9] presented the recommendation model that combines CBF and CF. Context is also considered as a content feature and it is assumed that context elements are independent, so they can apply the Bayesian network algorithm on them. Another hybrid approach is presented by Lekakos and Caravelas [22], which is to recommend movies by analyzing the content and the collaborative relation between users. Bogers [4] proposed a Markov random walk algorithm over a contextual graph on a movie dataset. The user rating and tags, the movie genre, and the actors are all considered as contextual information in the graph. Cai et al. [6] extracted emotion tags available on a web document in order to match them with music lyrics. In a similar focus, Hyung et al. [18] use
textual inputs from the user to find similar documents using Latent Semantic Analysis (LSA) and Probabilistic Latent Semantic Analysis (PLSA), which leads to a better recommendation of music. Another study, by Yoon et al. [38], uses features extracted from music as a context for music recommendation. Han et al. [13] propose a music recommendation model that transitions the user’s emotional state from one state to another using the recommended music. The authors use a proposed ontology to model the relationship between music items and the related emotion. More recently, recommender systems have relied on social networks to collect vast amounts of information about users, their multimedia choices, and their behavior. For instance, Zangerle et al. [42] present an approach that facilitates collecting a user’s music dataset from Twitter, assuming the user’s music player shares such information on Twitter. Pessemier et al. [29] discussed activity detection in a mobile environment as supporting information for the recommendation process. By using an accelerometer attached to a mobile device, the authors were able to detect four basic activities: running, walking, standing, and cycling. Based on the detected activity and on other contextual information such as location and weather, the recommender system can fetch different categories of information such as train schedules and restaurants in the surrounding area. In another study, Kim et al. [20] employed a Hidden Markov Model on a contextual information dataset to recommend the most appropriate menu for healthcare services.
2.2 Physiological-related recommendation
In their study, Liu et al. [23] addressed the effects of music playlist recommendations on the heart while flying on an airplane. Since the airplane environment causes discomfort and stress, the authors used a heart sensor embedded in the aircraft’s seat to monitor heart activity and, in particular, to measure stress.
This study [23] focuses on analyzing the airplane environment and the effects of music on the user’s level of stress, which differs from the focus of this paper. Oliver and Flores-Mangas [28] presented MPTrain system that uses a smartphone-based accelerometer and heart sensor to analyze the user activities to select appropriate music by changing the user selection based on the music rhythm. MPTrain is proposed to help users achieve their exercise goals by speeding up or slowing down their activity. In another study, Nirjon et al. [27] proposed a context-aware recommendation system that uses an earphone device equipped with a heart rate detection sensor. The sensor collects continuous Electrocardiography (ECG) signals, and extracts the peaks in order to calculate the average heart rate instantaneously. Similarly, the system extracts offline beats from the music signal or what is known as the tempo of the music, as well as the frequency of the sound, and the energy of the signal. Our work differs from all the presented studies in this section in such a way that our proposed context-aware recommendation model considers the physiological condition of the user as a useful context parameter to enhance the recommendation results. Another difference is that our aim is to recommend items that match the detected context and not to transfer the physiological condition from one user to the other. In addition, our model does not require the analysis of the content of the media item itself, for example the content of the image or the video. Instead, we rely on the user’s available context.
3 Detecting a user’s physiological context
In order to detect the user’s physiological context, we have to record and monitor the user’s biological signals to determine their physiological or emotional (e.g. mental stress) situation.
Even though we collect biological signals, we are not concerned with diagnosing illnesses. Our goal is to use the collected physiological data to better understand the physiological status of users. In fact, in this paper we focus on one particular emotion: mental stress. The best way to detect the targeted physiological conditions is to monitor the heart using an ECG signal, from which the Heart Rate Variability (HRV) can be measured [7]. Some of the least obtrusive commercial ECG devices take the form of a chest strap fitted with two electrodes and an electronic circuit. This obliges the user to wear the ECG sensor at all times in order for the application to benefit from physiological or psychological contextual information. Since these devices are wearable, they might be associated with a certain level of discomfort over prolonged periods of use. In this paper, we used a much more convenient and non-invasive device, as shown in Fig. 1. The ECG sensor is a product of the AliveCor company and is connected to a “Samsung Galaxy III” smartphone. The sensor looks just like a protective cover attached to the smart device, and an installed Android application collects the ECG signal. We use the ECG signal to extract the HRV information. Since the ECG signal represents the electrical activity of the heart, we can extract the HRV signal by calculating the series of time intervals between consecutive heart beats, or R-waves [27]. Fig. 2 shows a portion of an ECG signal recorded over a couple of minutes. The HRV is known to shed light on mental stress and to support conclusions about the condition of monitored individuals [3, 11, 14, 17]. To detect mental stress, the latest measurement collected is compared to a previously recorded benchmark. The benchmarks are typically previous measurements taken during a neutral stress state.
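The interval computation described above can be sketched as follows. This is a minimal illustration, assuming the R-peaks have already been located in the ECG signal; the function names and the sample peak times are hypothetical and are not taken from the prototype application.

```python
def rr_intervals_ms(r_peak_times_s):
    """Return the successive R-R intervals in milliseconds.

    r_peak_times_s: R-peak timestamps in seconds, in increasing order
    (peak detection itself is assumed to have been done elsewhere).
    """
    return [(later - earlier) * 1000.0
            for earlier, later in zip(r_peak_times_s, r_peak_times_s[1:])]


def mean_heart_rate_bpm(rr_ms):
    """Average heart rate (beats per minute) implied by a list of R-R intervals."""
    if not rr_ms:
        raise ValueError("at least one R-R interval is required")
    return 60000.0 / (sum(rr_ms) / len(rr_ms))


# Hypothetical R-peak times (seconds) from a short recording.
peaks = [0.0, 0.8, 1.6, 2.5, 3.3]
rr = rr_intervals_ms(peaks)          # the HRV series: one value per beat pair
heart_rate = mean_heart_rate_bpm(rr)
```

The resulting interval series is the input from which frequency-domain HRV parameters are typically derived; that spectral step is not shown here.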
The length of the measurement is a key constraint here, since the longer the measurement is, the more reliable the conclusions are. Therefore, it is widely recommended to use ECG measurement records of at least three minutes [14]. Accordingly, the stress detection algorithm assesses the recorded ECG signal and compares the HRV calculated values to the benchmark measurement. Nonetheless, since the reason for measuring the physiological parameters in this study is to recommend suitable multimedia content, such a long measurement reduces the system’s usability and applicability. In other words, asking a user to have a three minute ECG measurement before the system can recommend a song is unrealistic. Consequently, we decided to minimize the ECG recording time needed to evaluate the physiological parameters in order to increase the application’s practicability. The use of shorter ECG time measurements when analyzing mental stress is still a relatively new concept. Therefore, more research is needed in order to assess the accuracy of such an approach compared to longer accurate
measurement methods [30]. As we have previously argued, the proposed model does not target medical applications, and therefore we are more at liberty to use such an approach. As a result, we are willing to potentially sacrifice some accuracy with the goal of increasing usability. Therefore, for the physiological context detection, we simply use the short-term analysis of heart rate variability proposed by [30] to determine whether the user is potentially in an elevated stress state, a neutral state, or a relaxed state, as in Fig. 3. After collecting the required biosignal, the extracted physiological information is sent along with other contextual parameters to the recommender system. The recommender system then recommends content, such as music and movies, that takes the detected physiological state of the user into account.
Fig. 1 An image of the single-lead ECG sensor attached to an Android device in our developed prototype application (labeled parts: multimedia recommendation application, single-lead ECG)
Fig. 2 A portion of the original ECG signal recorded using the single-lead ECG sensor
3.1 Detecting physiological context experiment
We attempted to determine how well the short ECG measurement can distinguish between the three main physiological contexts: elevated stress, neutral, and relaxed states. We invited subjects to have their HRV parameters measured while performing three types of activities. The sessions were composed of stress, neutral, and relax exercises. For the stress exercise, subjects were asked to perform a Stroop color-word test. The Stroop test has been used in different physiological and psychological studies [21, 25, 34] since it involves sympatho-adrenal activation that is reflected in the subject’s heart and respiration rates [34]. For the neutral exercise, subjects were asked to sit comfortably and try to read an article. We will use this
exercise as a benchmark to analyze stress and relaxation situations. For the relax exercise, subjects were asked to sit comfortably, close their eyes, and listen to relaxing music. Each exercise lasted three minutes and produced three minutes of ECG data, but only the last 30 s were fed to the physiological detection algorithm to evaluate the performance of the short ECG measurement. Ten adult subjects participated in the physiological context detection experiment: 6 males and 4 females. The average age was 28.7 years. The experiment was carried out in an office space in our laboratory. An office desk, a laptop, and headphones were provided for each participant. Subjects were seated in front of the laptop while the ECG sensor sent the recorded data to a Java-based computer program. After collecting the ECG signal, three HRV parameters of interest were extracted to measure the mental stress:
• Low frequency band of the HRV (LF).
• High frequency band of the HRV (HF).
• The ratio of low frequency to high frequency HRV (LF/HF).
Fig. 3 Three different physiological states are proposed as possible user physiological contexts: elevated stress state, neutral state, and relaxed state
Numerous related studies have shown the effectiveness of these HRV parameters for measuring mental stress [1, 30]. In particular, during a stressful situation, the HF component is observed to decrease while the LF/HF ratio increases. The resulting HRV parameters for the 10 subjects during the last 30 s of the three exercises are presented in Table 1. On average for the 10 subjects, the results show an expected increase in the LF/HF component triggered by the stressor activity (with respect to the neutral state). Such an increase in LF/HF was also observed in earlier studies using a similar Stroop test, such as [32]. In addition, for the same stressor activity, the results show a decrease in the HF component (with respect to the neutral state) on average for the 10 subjects. Such a decrease has also been confirmed in [8, 17]. On the other hand, for the relaxation activity, we noticed, on average, an expected decrease in the LF/HF component with respect to the neutral state. Nonetheless, we did not find a significant change in the HF component with respect to the neutral state. Therefore, we will use the LF/HF ratio exclusively to differentiate between the physiological states (stressed, neutral, or relaxed).

Table 1 The results of detecting the physiological context of the users (N: neutral exercise session, S: stress exercise session, R: relax exercise session)

| Subject number | LF (N)¹ | HF (N)¹ | LF/HF (N) | LF (S)¹ | HF (S)¹ | LF/HF (S) | LF (R)¹ | HF (R)¹ | LF/HF (R) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 2,179.69 | 605.20 | 3.60 | 2,884.29 | 566.49 | 5.09 | 1,421.07 | 906.34 | 1.57 |
| 2 | 5,457.56 | 4,715.32 | 1.16 | 2,871.76 | 2,041.03 | 1.41 | 2,755.02 | 6,388.11 | 0.43 |
| 3 | 2,548.07 | 3,136.71 | 0.81 | 2,467.03 | 1,443.93 | 1.71 | 2,976.48 | 4,119.28 | 0.72 |
| 4 | 1,648.41 | 922.55 | 1.79 | 1,939.05 | 864.21 | 2.24 | 1,399.98 | 563.22 | 2.49 |
| 5 | 4,972.69 | 964.01 | 5.16 | 3,898.86 | 863.92 | 4.51 | 1,579.52 | 1,128.10 | 1.40 |
| 6 | 1,646.78 | 267.89 | 6.15 | 2,468.13 | 251.07 | 9.83 | 1,340.22 | 263.30 | 5.09 |
| 7 | 6,518.59 | 2,158.91 | 3.02 | 12,489.29 | 1,642.77 | 7.60 | 1,792.80 | 1,577.55 | 1.14 |
| 8 | 25,890.21 | 150,127.87 | 0.17 | 49,184.93 | 108,881.39 | 0.45 | 66,522.72 | 142,255.49 | 0.47 |
| 9 | 5,306.72 | 2,970.81 | 1.79 | 11,629.48 | 4,229.37 | 2.75 | 9,693.36 | 7,317.30 | 1.32 |
| 10 | 3,164.43 | 1,854.81 | 1.71 | 4,436.91 | 1,791.26 | 2.48 | 5,809.44 | 3,210.19 | 1.81 |
| Average | 5,933.32 | 16,772.41 | 2.53 | 9,426.97 | 12,257.54 | 3.81 | 9,529.06 | 16,772.89 | 1.64 |
| STDEV | 7,227.36 | 46,876.27 | 1.93 | 14,487.92 | 33,968.30 | 2.97 | 20,201.45 | 44,158.70 | 1.36 |

¹ Values are expressed in milliseconds squared (ms²)

Subjects were asked to evaluate the three exercises subjectively by giving a rating that indicates how stressful each exercise was. A rating range from 0 to 10 was used, with 10 indicating that the exercise was stressful and 0 indicating that it was relaxing. The results reported in Table 2 reflect our experimental design of three exercises that distinguish the three required physiological states (stressed, neutral, and relaxed). Accordingly, the HRV analysis results support the correlation between the physiological status of subjects performing the three experimental exercises and their subjective assessments.
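As a rough illustration of how an LF/HF measurement can be compared against the neutral benchmark, the sketch below labels a measurement as elevated stress, neutral, or relaxed. The function name `classify_state` and the 10 % tolerance band are assumptions made for this example only; the decision rule actually used follows the short-term analysis of [30] and its thresholds are not specified in this section. The sample values are the LF/HF ratios of subject 1 in Table 1.

```python
def classify_state(lf_hf, neutral_lf_hf, tolerance=0.10):
    """Label an LF/HF ratio relative to a neutral benchmark.

    tolerance is a hypothetical relative band around the benchmark inside
    which the measurement is still considered neutral.
    """
    if neutral_lf_hf <= 0:
        raise ValueError("the neutral benchmark must be positive")
    if lf_hf > neutral_lf_hf * (1 + tolerance):
        return "elevated stress"   # LF/HF rose with respect to the benchmark
    if lf_hf < neutral_lf_hf * (1 - tolerance):
        return "relaxed"           # LF/HF dropped with respect to the benchmark
    return "neutral"


# Subject 1 in Table 1: neutral 3.60, stress 5.09, relax 1.57.
neutral_benchmark = 3.60
stress_label = classify_state(5.09, neutral_benchmark)
relax_label = classify_state(1.57, neutral_benchmark)
```

A per-user benchmark is used here because Table 1 shows that absolute LF/HF values differ widely between subjects.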
4 Preliminaries
Before describing the details of the recommendation model, we introduce a set of definitions to formalize our recommendation problem. In this paper, bold uppercase letters, such as $\mathbf{A}$, denote matrices, whereas the corresponding lowercase italic letters with two subscript indices, such as $a_{x,y}$, represent the entries of the matrices. Capital italicized letters, such as $U$, represent sets, and an uppercase italic letter with one subscript index, such as $U_x$, represents element $x$ of set $U$. We also write the matrices in the recommendation model in the following format: Matrix = [(element entry)$_{\text{Set A, Set B}}$]$_{|A| \times |B|}$. For example, the context-item matrix in the model is represented in the following format: $\mathbf{A} = [a_{c,i}]_{|C| \times |I|}$. In addition, to simplify the recommendation problem, we use the term “users” ($U$) to represent the set of users, and “items” ($I$) to denote the set of media resources that can be recommended, such as music or movies. Table 3 summarizes the notation employed in the rest of this paper.
Table 2 The subjective results of user experience during the experiment

| Subject number | Neutral Exercise Session (N) | Stress Exercise Session (S) | Relax Exercise Session (R) |
|---|---|---|---|
| 1 | 4.00 | 7 | 2.00 |
| 2 | 6.00 | 9 | 3.00 |
| 3 | 3.50 | 6.5 | 3.00 |
| 4 | 4.00 | 7 | 0.00 |
| 5 | 6.00 | 9 | 1.00 |
| 6 | 5.00 | 9 | 2.00 |
| 7 | 4.00 | 7 | 2.00 |
| 8 | 4.00 | 6 | 3.00 |
| 9 | 2.50 | 8 | 0.50 |
| 10 | 3.00 | 9 | 0.00 |
| Average | 4.20* | 7.75* | 1.65* |

* (0: relaxed, 10: stressed)
Table 3 A summary of the notations and their meaning

| Notation | Meaning |
|---|---|
| $U$ | Set of users |
| $I$ | Set of items |
| $C$ | Set of contexts |
| $\mathbf{E}_{|C| \times |I|}$ | Context-item matrix |
| $\mathbf{T}_{|C| \times |U|}$ | Context-user matrix |
| $\mathbf{R}_{|U| \times |I|}$ | User-item matrix |
| $\bar{\mathbf{T}}$ | Normalized matrix T |
| $\bar{\mathbf{E}}$ | Normalized matrix E |
| $\mathbf{S}_{|U| \times |U|}$ | User-user similarity matrix |
| $\mathbf{B}_{|I| \times |I|}$ | Item-item similarity matrix |
| $\mathbf{S}^k$ | Top k most similar users |
| $\mathbf{B}^k$ | Top k most similar items |
4.1 Problem definition
Combining contexts is an important challenge in our recommendation model. Our research problem is to identify different contextual dimensions and deliver media content that best fits the user’s detected context. Suppose we have a group of users who share music content. Information about these users is available, including their rating history and the textual annotations they have given to certain tracks to describe their experience, such as “love”, “relaxing”, or “for gym”. We also have other users who have not previously recorded any annotations for any music or have not shared their ratings publicly on the Internet. Given a list of possible detected conditions represented as context dimensions $CN = \{n_1, n_2, n_3, \ldots, n_{|N|}\}$, a list of users $U = \{u_1, u_2, \ldots, u_{|U|}\}$, a list of context parameters $C = \{c_1, c_2, \ldots, c_{|C|}\}$ extracted from $CN$, and a list of available items $I = \{i_1, i_2, \ldots, i_{|I|}\}$, we can build a recommendation model that predicts the suitability of, or interest in, item $i$ for user $u$ given a set of contexts $c$. The context attributes are then used in the recommendation process to filter and uncover items that are probably of interest to the user in such a context, personalized according to the user’s preferences.
4.2 Constructing the required matrices
To build the matrices required by our recommendation model, we use the example illustrated in Fig. 4. As shown in Fig. 4, we have a list of users $U = \{u_1, u_2, \ldots, u_5\}$, a list of items $I = \{i_1, i_2, \ldots, i_{12}\}$, and a list of contexts $C = \{c_1, c_2, \ldots, c_5\}$. We construct three main matrices as the base of our recommendation model: the context-user matrix $\mathbf{T}_{|C| \times |U|}$, the user-item matrix $\mathbf{R}_{|U| \times |I|}$, and the context-item matrix $\mathbf{E}_{|C| \times |I|}$. To build the matrix $\mathbf{T}_{|C| \times |U|}$, let $t(c_y, u_x)$ be the number of times user $u_x$ consumed items in context $c_y$. Similarly, in the matrix $\mathbf{E}_{|C| \times |I|}$, let $e(c_y, i_x)$ be the number of times item $i_x$ has been consumed in context $c_y$.
If a user has not consumed any items in a given context, or if an item has never been consumed by any user in a particular context, then $t(c_y, u_x) = 0$ or $e(c_y, i_x) = 0$, respectively. If we considered only the raw frequency of usage of a particular context across users or items, the accuracy of the recommendation results might be affected by users who repeatedly consume items in a large variety of contexts. We would then neglect how many users have consumed items within that context, as opposed to a small number of users who consumed many items in that context. Therefore, we normalize the frequency values to the range between 0 and 1 using the following formulas:
$$\bar{t}(c_y, u_x) = \frac{n_{cu}(c_y, u_x)}{N_{c_y,u}} \tag{1}$$
$$\bar{e}(c_y, i_x) = \frac{n_{ci}(c_y, i_x)}{N_{c_y,i}} \tag{2}$$
where $n_{cu}(c_y, u_x)$ is the number of occurrences of context $c_y$ in the list of items consumed by $u_x$, and $n_{ci}(c_y, i_x)$ is the number of occurrences of context $c_y$ in the list of contexts in which an item has been consumed ($f_{y,x}$), as in Eq. 3. $N_{c_y,u}$ and $N_{c_y,i}$ represent the number of times the context $c_y$ is used by all users and the number of times the context $c_y$ is used with all items, respectively.
$$n_{ci}(c_y, i_x) = \sum_{y,x} \delta_{y,x} f_{y,x}, \qquad \delta_{y,x} = \begin{cases} 1 & c_y \text{ occurred in } i_x \\ 0 & \text{otherwise} \end{cases} \tag{3}$$
Note that we also examined binary values for T and E, but in that case we would not be able to capture how often a particular item is used in a specific context, because the binary values only indicate that an item has been consumed in that context. The construction of the three matrices (T, R, and E) leads to the discovery of the latent association of items with a particular context and the latent association of users with contexts, and accordingly surfaces relevant items for a user in a particular context.
Fig. 4 A representation of the three dimensions of the recommendation problem (legend: user, item, context of use; callout: items consumed in context c3)
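The counting and normalization steps of Eqs. 1 and 2 can be sketched as follows. This is a minimal illustration under stated assumptions: consumption is assumed to be available as a log of (user, item, context) events, the helper names `count_matrix` and `normalize_rows` are hypothetical, and the matrices are stored as nested dictionaries rather than dense arrays.

```python
from collections import defaultdict


def count_matrix(events, row_key, col_key):
    """Count how often each (row, column) pair occurs in the event log."""
    counts = defaultdict(lambda: defaultdict(int))
    for event in events:
        counts[event[row_key]][event[col_key]] += 1
    return counts


def normalize_rows(counts):
    """Divide every count by the total of its context row (Eqs. 1 and 2)."""
    normalized = {}
    for context, row in counts.items():
        total = sum(row.values())  # times this context was used overall
        normalized[context] = {key: value / total for key, value in row.items()}
    return normalized


# Hypothetical consumption log.
events = [
    {"user": "u1", "item": "i1", "context": "c1"},
    {"user": "u1", "item": "i4", "context": "c1"},
    {"user": "u2", "item": "i1", "context": "c1"},
    {"user": "u2", "item": "i2", "context": "c2"},
]

T = count_matrix(events, "context", "user")  # context-user counts t(c, u)
E = count_matrix(events, "context", "item")  # context-item counts e(c, i)
T_bar = normalize_rows(T)
E_bar = normalize_rows(E)
```

Pairs that never occur are simply absent from the dictionaries, which corresponds to the zero entries described in the text.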
5 The recommendation algorithm
In this section, we present the algorithm used to construct a context-aware recommendation model. The underlying idea is that users sharing an item, such as a piece of music, are likely to also share some hidden contextual information. Such contextual information can effectively describe the user’s preferences toward their selected items. Analyzing the available contextual information associated with consumed items enables the analysis of items consumed in similar contexts. In general, users who consume certain items in a given list of contexts are likely to form a contextual pattern that bridges the information gaps between users and new items.
5.1 User-based collaborative filtering
Before analyzing the user’s context, the proposed recommendation algorithm identifies the user’s neighbors using a collaborative filtering technique [5]. The idea behind detecting similar users who share some items is to exploit the list of items consumed by a given user to find other interesting items consumed by similar users (also called nearest neighbors). To determine the similarity between two users, we use cosine-based similarity, a widely used approach that takes the two vectors of shared items of users $u_x$ and $u_y$ and quantifies their similarity according to the angle between them, as in Eq. 4. To minimize the computational cost, we consider only the top k nearest neighbors of each user. Accordingly, we eliminate the computed similarities of users who share few items with others, and assign a zero similarity value if the similar user is not among the top k nearest neighbors. We employ the matrix S, where $\mathbf{S} = \mathbf{S}^k$, as the user-user similarity matrix. Figure 5 shows an example of constructing the similarity matrix S.
$$s_{u_x,u_y} = \cos(\vec{u}_x, \vec{u}_y) = \frac{\vec{u}_x \cdot \vec{u}_y}{\lVert \vec{u}_x \rVert_2 \, \lVert \vec{u}_y \rVert_2} \tag{4}$$
5.2 Item-based collaborative filtering
Just as we employ collaborative filtering to obtain the user-user similarities, we employ the user-item matrix R to obtain the item-item similarities. According to [10], the item similarity can be computed using collaborative filtering. The idea here is that a user is likely to consume items that are similar to some of what they have already seen in the past. The similarity values can be obtained by measuring the cosine angle
between the two column vectors in the matrix R, analogous to the user-user similarity, according to Eq. 5.
$$b_{i_x,i_y} = \cos(\vec{i}_x, \vec{i}_y) = \frac{\vec{i}_x \cdot \vec{i}_y}{\lVert \vec{i}_x \rVert_2 \, \lVert \vec{i}_y \rVert_2} \tag{5}$$
From the resulting similarities, we form the item-item similarity matrix B. The similarity value $b_{i_x,i_y}$ is retained only if $i_y$ is among the top k nearest neighbors of $i_x$; otherwise the similarity value is set to zero. Figure 6 shows an illustrated example of computing the item-item similarity.
Fig. 5 A representation of the user-user similarity matrix S (derived from the user-item matrix R)
5.3 Extracting latent context-item preferences
Before recommending items to a user, we need to find the latent preferences of that user toward their current context. This can be done by analyzing the hidden preferences of users toward items in a given context. By finding the latent context-item preferences, we capture how a particular context has occurred with the user’s selection of items that are similar to a given item. We utilize the normalized context-item matrix $\bar{\mathbf{E}}$ and the transpose of the item-item similarity matrix B, both constructed earlier, to form the new latent context-item matrix Z. Formally, the matrix Z is the product of these two matrices, as in Eq. 6:
$$\mathbf{Z} = \bar{\mathbf{E}} \, (\mathbf{B}^k)^T \tag{6}$$
where the matrix $\bar{\mathbf{E}}$ denotes the normalized version of the $|C| \times |I|$ context-item matrix E, as explained in Section 4.2, and the matrix $(\mathbf{B}^k)^T$ denotes the transposed top-k item-item similarity matrix. The multiplication of the c-th row by the i-th column gives the latent preference of context c for item i with respect to the item’s k nearest neighbors. Figure 7 shows the details of constructing the new matrix Z. The values in the matrix $\bar{\mathbf{E}}$ are normalized to reduce the effect of items that were consumed by many users. Without normalization, such items would contribute more to the context-based prediction score than items consumed by few users. Normalization minimizes the effect of those items with respect to the detected contexts.
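The item-item similarity of Eq. 5, the top-k filtering, and the product of Eq. 6 can be sketched together as follows. This is an illustrative sketch, not the authors’ implementation. It makes three assumptions: following Fig. 7, the left operand of Eq. 6 is taken to be the normalized $|C| \times |I|$ context-item matrix; each item keeps a self-similarity of 1 in addition to its k neighbors; and the small matrices `R` and `E_bar` are made-up data. All function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_item_similarity(R, k):
    """Build B^k from a users x items matrix R: Eq. 5 followed by top-k filtering."""
    n_items = len(R[0])
    columns = [[row[j] for row in R] for j in range(n_items)]
    B = [[1.0 if x == y else cosine(columns[x], columns[y]) for y in range(n_items)]
         for x in range(n_items)]
    Bk = []
    for x in range(n_items):
        neighbors = sorted((y for y in range(n_items) if y != x),
                           key=lambda y: B[x][y], reverse=True)[:k]
        keep = set(neighbors) | {x}  # assumption: an item always keeps itself
        Bk.append([B[x][y] if y in keep else 0.0 for y in range(n_items)])
    return Bk


def latent_context_item(E_bar, Bk):
    """Z = E_bar (B^k)^T, so Z[c][i] = sum over j of E_bar[c][j] * Bk[i][j]."""
    return [[sum(e * b for e, b in zip(e_row, b_row)) for b_row in Bk]
            for e_row in E_bar]


R = [[5, 3, 0], [4, 0, 0], [0, 2, 5]]       # hypothetical users x items ratings
E_bar = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]  # hypothetical normalized contexts x items
Bk = top_k_item_similarity(R, k=1)
Z = latent_context_item(E_bar, Bk)          # contexts x items
```

In this toy data the first context has never been observed with the third item, yet `Z[0][2]` is non-zero because the third item’s nearest neighbor has been consumed in that context.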
Fig. 6 A representation of the item-item similarity matrix B [panels: user-item matrix (R); item-item similarity matrix (B)]
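The item-item similarity of Eq. 5 and the top-k filtering can be illustrated with a minimal sketch. It is not the implementation used in the experiments: the function names are hypothetical, the rating matrix R is assumed to be a list of user rows, self-similarity is assumed to be excluded, and ties are assumed to be broken by item index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (Eq. 5)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero column has no defined direction
    return dot / (norm_a * norm_b)


def item_similarity_top_k(R, k):
    """Build an |I| x |I| matrix keeping only each item's k nearest neighbors.

    R is a user x item rating matrix; item vectors are its columns.
    Entries outside the top k (and the diagonal) are left at zero.
    """
    n_items = len(R[0])
    columns = [[row[j] for row in R] for j in range(n_items)]
    B_k = [[0.0] * n_items for _ in range(n_items)]
    for x in range(n_items):
        sims = [(cosine(columns[x], columns[y]), y)
                for y in range(n_items) if y != x]
        sims.sort(key=lambda s: (-s[0], s[1]))  # highest similarity first
        for sim, y in sims[:k]:
            B_k[x][y] = sim
    return B_k


# Toy data: items 0 and 1 are rated by the same two users, item 2 by a third.
R_example = [[5, 4, 0],
             [4, 5, 0],
             [0, 0, 3]]
B_example = item_similarity_top_k(R_example, k=1)
```

In this toy matrix, items 0 and 1 become each other's nearest neighbor, while item 2 shares no raters with them and therefore keeps only zero similarities.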
Fig. 7 An illustration of the process of computing the latent context-item matrix Z [panels: context-item matrix (E); item-item similarity matrix (B^k)^T; latent context-item matrix (Z)]
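The construction of Z in Eq. 6 can be sketched as follows. This is an illustration under stated assumptions rather than the original implementation: the helper names are hypothetical, the top-k similarity matrix is assumed to be stored as a full |I| x |I| matrix with zeros outside each item's neighborhood, and the normalization is assumed to divide each item column of T by its column sum (the text does not specify the exact scheme).

```python
def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Plain matrix product of A (n x m) and B (m x p)."""
    B_cols = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in B_cols]
            for row in A]


def normalize_columns(M):
    """Divide every column by its sum (assumed normalization scheme)."""
    sums = [sum(row[j] for row in M) for j in range(len(M[0]))]
    return [[v / sums[j] if sums[j] else 0.0 for j, v in enumerate(row)]
            for row in M]


def latent_context_item(T, B_k):
    """Z = normalized(T) x transpose(B_k), as in Eq. 6."""
    return matmul(normalize_columns(T), transpose(B_k))


# Toy data: 2 contexts x 3 items, and a 3 x 3 top-k similarity matrix.
T_example = [[2, 0, 1],
             [2, 4, 0]]
B_k_example = [[0.0, 0.8, 0.0],
               [0.8, 0.0, 0.0],
               [0.0, 0.5, 0.0]]
Z_example = latent_context_item(T_example, B_k_example)
```

With column normalization, item 1 (used four times in context 1) and item 2 (used once in context 0) carry the same maximum weight of 1, which reflects the stated goal of damping heavily consumed items.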
5.4 Extracting latent item-user preferences

In this step, we capture the latent preferences of users toward an item. The main idea is that users in a context consume certain items, and that when they are in the same context in the future, they will likely consume items that are either similar to their own preferences or similar to the choices of their nearest neighbors. We use the matrix Y to represent the latent item preferences of a given user $u_x$, which also include the item preferences of similar users. We build the matrix Y according to Eq. 7:

$$ Y = \bar{R}^{T} \, (S^{k})^{T} \tag{7} $$

where $\bar{R}^{T}$ is the transpose of the normalized rating matrix R (denoted as D) and $(S^{k})^{T}$ is the transposed top k user-user similarity matrix. The product $\bar{R}^{T} (S^{k})^{T}$ combines the preferences of a user and of their nearest neighbors for a given item. It is also important at this point to consider that some users are more active than others in rating and consuming items. As a result, active users contribute more to the recommendation model than less active users. Therefore, we normalize the values in the matrix D before the multiplication step in order to reduce this effect. Figure 8 shows the details of constructing the new matrix Y.
Fig. 8 An illustration of the process of computing the context-user matrix Y [panels: item-user matrix (D); user-user similarity matrix (S^k)^T; latent item-user matrix (Y)]
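A minimal sketch of Eq. 7 follows. It is illustrative only: the function names are hypothetical, each user's ratings are assumed to be normalized by that user's rating total (one plausible way to damp very active users; the text does not fix the scheme), and the top-k similarity matrix is assumed to keep a self-similarity of 1 on the diagonal so that the user's own preferences are included.

```python
def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Plain matrix product of A (n x m) and B (m x p)."""
    B_cols = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in B_cols]
            for row in A]


def normalize_user_rows(R):
    """Scale each user's ratings to sum to 1 (assumed normalization)."""
    normalized = []
    for row in R:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def latent_item_user(R, S_k):
    """Y = transpose(normalized R) x transpose(S_k), as in Eq. 7."""
    D = transpose(normalize_user_rows(R))  # item x user matrix
    return matmul(D, transpose(S_k))


# Toy data: 2 users x 2 items, and a 2 x 2 user-user similarity matrix.
R_example = [[4, 0],
             [1, 1]]
S_k_example = [[1.0, 0.6],
               [0.6, 1.0]]
Y_example = latent_item_user(R_example, S_k_example)
```

In the toy result, user 1 receives a non-zero preference for item 0 partly through the similar user 0, which is the neighbor effect the section describes.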
5.5 Employing the latent collaborative models for recommendation

We take into consideration the user's previously consumed items, their nearest neighbors, and the similarity of items that have previously been consumed in the detected context. Items with a high ranking score are recommended to the user. In addition, the recommended items reflect the context within which they are recommended. To compute the items' final ranking scores, we use the two previously described models: the latent context-item model and the latent item-user model. The association of these two models builds the required contextual bridge between users and items. Specifically, the proposed method produces item recommendations relevant to a given context by extracting latent preferences. The user-item ranking score is computed by:

$$ \mathrm{Rank}_{u,c}(i) = \sum_{c \in m} \alpha \, Z_{c,i} \, Y_{i,u} \tag{8} $$

where $Z_{c,i}$ is the entry in the c-th context row and the i-th item column of the Z matrix, and $Y_{i,u}$ is the entry in the i-th row and the u-th column of the Y matrix. The parameter α is an attenuation factor, where α ∈ (0…1), for reducing the weight of a less sensitive context. The value of α is set experimentally; details are presented in the parameter tuning part of the performance evaluation in Section 7.5. Items with higher ranking scores are recommended to the user. Figure 9 illustrates the user-item ranking score computation step. As Eq. 8 shows, the ranking score is not limited to a single context; if multiple contexts are in the user's query, then the sum of the products over those contexts represents the ranking score for that item. With regard to the example illustrated in Fig. 4, the final matrix does not have to be physically stored in the database; its ranking scores can be computed by executing a query on the two matrices Z and Y to reduce the model's complexity.

Fig. 9 The final user-item ranking score [panels: latent context-item matrix (Z); latent item-user matrix (Y); final user-item matrix, from which items with higher ranking scores are recommended to the query user]
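The ranking score of Eq. 8 can be sketched as below. This is a hedged illustration, not the original code: the function names are hypothetical, Z is assumed to be indexed as Z[context][item] and Y as Y[item][user], and a single α is assumed to be shared by all contexts. The scores are computed on demand from Z and Y, so no final user-item matrix is stored.

```python
def rank_scores(Z, Y, query_contexts, user, alpha):
    """Score every item for one user and one set of query contexts (Eq. 8)."""
    n_items = len(Y)
    return [sum(alpha * Z[c][i] * Y[i][user] for c in query_contexts)
            for i in range(n_items)]


def recommend(Z, Y, query_contexts, user, alpha, top_n):
    """Return the indices of the top_n highest-scoring items."""
    scores = rank_scores(Z, Y, query_contexts, user, alpha)
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return order[:top_n]


# Toy data: 2 contexts x 3 items (Z) and 3 items x 2 users (Y).
Z_example = [[0.0, 0.4, 0.0],
             [0.8, 0.4, 0.5]]
Y_example = [[1.3, 1.1],
             [0.3, 0.5],
             [0.2, 0.9]]
scores_example = rank_scores(Z_example, Y_example,
                             query_contexts=[0, 1], user=0, alpha=0.3)
top_example = recommend(Z_example, Y_example,
                        query_contexts=[0, 1], user=0, alpha=0.3, top_n=2)
```

Because the query holds two contexts, each item's score is the sum of its two per-context products, which matches the multi-context reading of Eq. 8.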
6 Architecture and system design

6.1 Application scenario

Today's smartphone ecosystem has accelerated the development of context-aware recommender systems, particularly in observing user contexts such as weather, location, activity, and time. Smartphones also provide access to enormous multimedia collections and act as a bridge between different sensors and the recommendation engine. Hence, we developed a smartphone application that can be effectively used in a home environment. Consider a user who usually listens to music while studying and who has an exam the next morning. At home, the user uses the application's ECG interface to capture their heart signal. Depending on the user's stress level, the application suggests relaxing music according to his/her preferences. Moreover, as we explained in the design of the recommendation algorithm, the application explores the preferences of the user's social friends as well. In this scenario, the application not only recommends music but also adapts the recommendation result to fit the user's context. The application prototype is developed in an Android environment. The application can detect the date, time, location, song being played, play count, and heart signal, and can add other custom parameters to the collection, as shown in Fig. 10.

Fig. 10 A screenshot of the context-aware recommendation prototype interface

6.2 Design and architecture

The architecture of the introduced recommendation system consists of four main layers: the input/output interface layer, the context management layer, the client-local resources layer, and the server-cloud resources layer, as shown in Fig. 11. The client side is separated into client interfaces and a user-local server. The identification of the user's context and the processing of all input parameters take place on the user-local side. In addition, the recommendation algorithm, including all required recommendation agents, runs on the user-local server. The server side has a cloud-based design to store additional multimedia resources and social profiles. Figure 12 shows the sequence of the main interactions for a user requesting a song. As a proof of concept, the prototype application synchronizes any media content stored within the user's Dropbox account. The input/output interface layer handles the collection of the required contextual data and interacts with the user, which includes delivering the recommendation results. The context management layer identifies the context of the user by analyzing the retrieved sensory data. The client-local resources layer stores the user behavior, and selects and evaluates the different parameters needed for the recommendation algorithm to function. The Entity Relationship Diagram (ERD) of the client-local resources layer is shown in Fig. 13. The resource contents and the available social profiles are stored in a cloud-based repository.
7 Experimental evaluation

In this section, we investigate whether the proposed recommendation technique improves the item prediction accuracy. The goal of the experiments is to evaluate the accuracy of the proposed approach, which uses the user's context, when recommending different numbers of items. We varied the number of items retrieved each time to measure their relevance to the user's request.

7.1 Dataset

Finding a publicly available dataset that carries contextual information is a major challenge. This lack of availability hampers the design of any context-aware recommendation algorithm [40]. Therefore, we crawled our dataset from an online social music database: last.fm. Specifically, music information and annotation data are extracted from the last.fm website. Last.fm is an online social music radio resource that enables its users to subscribe, listen to, and tag their favorite albums, tracks, and artists. The crawled dataset contains 164 users, 626 tracks, 251 contextual tags, and 10,711 overall item-context assignments. Users of last.fm annotate albums and tracks with textual tags. Because users can give any textual description to their favorite tracks and albums, different words can be used to express the same meaning. For instance, four different users may tag item i1 with four different tags that have the same meaning: "relaxing music", "for relax", "relax", and "relaxation". These tags can be grouped together under one annotation: "relaxing". We also tested the performance of our proposed system on a bigger dataset. We used the publicly available dataset from MovieLens¹ (www.movielens.org), which was used in [15]. The MovieLens dataset consists of 943 users, 1,682 movies, and 100,000 user-item ratings. This dataset does not have explicit contextual data, but it contains some information that we treat as context in a proof-of-concept experiment: genre (romance, action, adventure, etc.) and user information (age, gender, occupation, and zip code).
¹ The MovieLens dataset can be downloaded from: http://www.grouplens.org/node/73.
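The grouping of free-text Last.fm tags under one annotation, described in Section 7.1, could be approximated with a simple lookup. The sketch below is only an assumption about how such grouping might be done: the text does not say how the tags were merged, and the table contents and function names are hypothetical, built from the "relaxing" example alone.

```python
# Hypothetical synonym groups; only the "relaxing" group comes from the text.
TAG_GROUPS = {
    "relaxing": ["relaxing music", "for relax", "relax", "relaxation"],
}


def build_lookup(groups):
    """Map every raw tag variant (and the canonical tag) to its annotation."""
    lookup = {}
    for canonical, variants in groups.items():
        lookup[canonical] = canonical
        for variant in variants:
            lookup[variant] = canonical
    return lookup


def normalize_tag(raw_tag, lookup):
    """Lower-case and trim a tag, then map it to its group if one is known."""
    cleaned = raw_tag.strip().lower()
    return lookup.get(cleaned, cleaned)


lookup_example = build_lookup(TAG_GROUPS)
tags_example = [normalize_tag(t, lookup_example)
                for t in ["Relaxing music", "for relax", " relaxation ", "rock"]]
```

Tags without a known group are kept in their cleaned form, so unseen annotations are preserved rather than discarded.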
Fig. 11 The architecture of the context-aware recommendation model [layers: input/output interfaces layer; context management layer; client-local resources layer; server-cloud resources layer]
7.2 Comparison with other methods

In our evaluation, we present the detailed experimental results of our ranking method in comparison with the following benchmark methods.

• Popular Items (PopItems): As described in [19], the PopItems technique gives more prediction weight to items with a higher count value within a specific context c_y.
• Item Rank (ItemRank): This technique is a random walk scoring algorithm proposed in [12]. Using a user-item relationship graph, the algorithm estimates the probability of user u visiting item i in a random walk.
• uMender: This recommendation technique was proposed by J. Su et al. [33]. The algorithm first creates a sub-matrix to find users and items that are similar under the same context condition. It then obtains the positive and negative preferences based on the available rating values. Finally, it finds frequencies in the negative and positive item sets and computes the related user-item prediction.
• Collaborative and Content-based Technique (CCbT): Lops et al. [24] proposed a collaborative content-based tag recommendation algorithm. Since we model context through social tags, and this technique also analyzes tag annotations, we compare it with our proposed model.

Fig. 12 Interaction diagram of a user requesting a song using the proposed application
7.3 Evaluation methodology

The evaluation procedure is divided into two parts: the first part measures the accuracy of the context-based recommendation prediction, using offline experiments on different datasets crawled from online multimedia databases; the second part evaluates the user's satisfaction with the resulting context-based recommendations after using the proposed prototype application.

7.3.1 Offline experiment

For the experiments in this section, we follow the experimental procedure proposed in [41]. We randomly divided each of the two datasets into two groups: a training set that represents 80 % of the original dataset, and a test set containing the remaining 20 %. The training set is used to train the recommendation model, while the items in the test set are withheld randomly and their associated contexts are used as test queries for each user. Since the algorithm performance might be sensitive to the particular items chosen for the training or the test set, we repeated the run of the algorithms 5 times with different partitioning. The values reported in the experimental results section are the averages over those 5 runs, with the standard deviation.
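The repeated 80/20 protocol described above can be sketched as follows. This is an illustrative outline, not the experimental code: the function names are hypothetical, the seeds 0 to 4 are an arbitrary choice to obtain five different partitions, and `evaluate` stands for any routine that trains on one part and returns a single score on the other.

```python
import random
import statistics


def split_80_20(records, seed):
    """Shuffle the records with a fixed seed and cut them 80 % / 20 %."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = (len(shuffled) * 8) // 10
    return shuffled[:cut], shuffled[cut:]


def repeated_evaluation(records, evaluate, runs=5):
    """Run evaluate(train, test) on several partitions; return mean and std."""
    scores = []
    for seed in range(runs):
        train, test = split_80_20(records, seed)
        scores.append(evaluate(train, test))
    return statistics.mean(scores), statistics.stdev(scores)


# Toy usage: the "metric" here is just the test share, to show the plumbing.
records_example = list(range(100))
train_example, test_example = split_80_20(records_example, seed=0)
mean_example, std_example = repeated_evaluation(
    records_example, lambda train, test: len(test) / len(records_example))
```

Reporting the mean together with the standard deviation over the runs mirrors the "value ± deviation" format used in the result tables.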
Fig. 13 Client-Local Resources Layer Database ERD
7.3.2 Online experiment

In addition to the offline dataset experiments, we also conducted a subjective user evaluation using the Android prototype application mentioned earlier. Given some contextual knowledge about the users, we computed the items suitable for recommendation in the given context and displayed them to the user. To measure the effectiveness of our approach, the application also presents recommended items retrieved using another algorithm that is not personalized to the user's context. We used the collaborative filtering algorithm presented in [31] as this second recommendation technique. The goal of this experiment is to determine whether our context-aware prediction method captures items more relevant to the user's preferences than the non-context-aware technique. Each user in the online experiment is asked to evaluate the ten pieces of music recommended in a certain context. Subjects go through different context scenarios, and thus evaluate the recommendations under different context conditions. Subjects browse the recommended music and indicate whether they like or dislike the music in that context.

7.4 Evaluation metrics

To measure the algorithm's retrieval accuracy, we adopted precision and recall, which are widely used evaluation metrics for measuring the effectiveness of the retrieved recommendations, in our offline experiment. Precision is the fraction of the recommended items that are relevant to the user, as in Eq. 9 [35]. Recall is the fraction of the relevant items that appear among the recommended items, as in Eq. 10 [35].

$$ \text{precision} = \frac{|\text{recommended} \cap \text{relevant}|}{|\text{recommended}|} \tag{9} $$

$$ \text{recall} = \frac{|\text{recommended} \cap \text{relevant}|}{|\text{relevant}|} \tag{10} $$
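Eqs. 9 and 10 translate directly into set operations. The short sketch below uses hypothetical function names and assumes that a recommendation list is an ordered list of item identifiers, so that precision can also be read at a cut-off k (top 1, top 5, top 10).

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) as defined in Eqs. 9 and 10."""
    rec, rel = set(recommended), set(relevant)
    hits = len(rec & rel)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(rel) if rel else 0.0
    return precision, recall


def precision_at_k(ranked, relevant, k):
    """Precision computed over the first k items of a ranked list."""
    return precision_recall(ranked[:k], relevant)[0]


# Toy usage: four recommended items, three relevant items, two in common.
ranked_example = ["a", "b", "c", "d"]
relevant_example = {"b", "d", "e"}
precision_example, recall_example = precision_recall(
    ranked_example, relevant_example)
```

The empty-set guards return 0.0 instead of raising a division error, which is a convenience choice of this sketch rather than part of the definitions.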
For the comparison of our method to the other four benchmark algorithms, we report the Mean Average Precision (MAP) by using Eq. 11, in addition to precision and recall.

$$ \mathrm{MAP} = \frac{1}{|U|} \sum_{u=1}^{|U|} \frac{1}{t_u} \sum_{n=1}^{t_u} P_n \, R_n \tag{11} $$
Here, $t_u$ is the number of test cases for user u, $P_n$ is the precision at top n, and $R_n$ is a binary variable that equals 1 if the item at rank n is relevant [16]. The MAP reports the average precision at each top k result [39]. Note that we varied the number of items retrieved (top k values) to measure the ranking positions of each recommended item. For instance, the precision values are reported for each top k (k=1, k=5, and k=10), which show the number of relevant items at top 1, top 5, and top 10.

7.5 Parameter tuning

The proposed recommendation model uses α, an attenuation factor, where α ∈ (0…1), to reduce the weight of the contextual effects on the prediction scores. Prior to starting the experiments, we gave α the same value in all contexts in order to run an empirical study. By tuning this parameter, we can increase or decrease the influence of the context on the final score given to an item. Hence, it is critical to set the value of α correctly to improve the recommendation performance. We first measure the MAP and the Mean Reciprocal Rank (MRR) using different values of α on the Last.fm dataset and then on the Movielens dataset, as shown in Table 4. The MRR measure is computed using Eq. 12:

$$ \mathrm{MRR}_{(k=n)} = \frac{1}{|U|} \sum_{u=1}^{|U|} \sum_{i \in (t_u \cap R_n^u)} \frac{1}{r(i)} \tag{12} $$

where $t_u$ is the test case for user u, $R_n^u$ is the set of top n returned records, and the value r(i) ranges between 1 ≤ r(i) ≤ n. The results in Table 4 show that the best MAP and MRR values are obtained when α=0.3 on the Last.fm dataset and α=0.5 on the Movielens dataset.

7.6 The effect of normalization

Prior to running the comparison experiments, we investigated the impact of matrix normalization on the evaluation metrics. We measured MAP and MRR after running the algorithm on normalized and non-normalized matrices. Specifically, we employed two different versions of the item-user matrix D and the context-item matrix E. Tables 5 and 6 report the improvement in
Table 4 The different weights assigned to α to measure its sensitivity in each dataset

                    Number of Items
Dataset     α       MAP@10         MRR@20
Last.fm     0.1     0.120±0.007    0.214±0.008
            0.3     0.225±0.016    0.584±0.004
            0.5     0.153±0.004    0.412±0.008
Movielens   0.1     0.131±0.008    0.177±0.007
            0.3     0.106±0.011    0.325±0.016
            0.5     0.152±0.011    0.369±0.09
MAP and MRR using normalized matrices over non-normalized matrices of D and E in each dataset. In addition, we conducted a statistical analysis to measure the significance of the improvement of the normalized approach over the non-normalized one. Specifically, we conducted two-tailed paired t-tests under the same conditions of D and E and for the same dataset. The results confirmed that normalization had a significant positive impact on both the MAP and MRR values (Table 6).

7.7 Experimental results

7.7.1 Offline experiment

In this section, we present the results of the comparison between the performance of our approach and the recommendation techniques introduced in Section 7.2. As explained in Section 7.3, we measured how accurately each recommendation approach retrieves relevant items, as well as the ranking positions of those items in the recommendation list. First, we evaluated the recommendation performance by calculating the precision and recall obtained by our approach and the other four alternative approaches on the Last.fm and Movielens datasets, as in Figs. 14 and 15. Figures 14 and 15 depict the precision-recall curves, showing how our proposed approach outperforms the other baseline algorithms on both datasets. The number of retrieved items for a user's request is plotted as data points on the curves; each curve starts on the left at top k=1 and ends on the right at top k=10. Our approach obtained approximately a 2 %, 3 %, 10 %, and 12 % precision improvement on the Last.fm dataset compared to CCbT, uMender, ItemRank, and PopItems, respectively. Our approach also
Table 5 Effect of normalization on the Last.fm dataset

         Non-Normalization              Normalization
D, E     MAP@10         MRR@10         MAP@10           MRR@10
25       0.118±0.009    0.288±0.001    0.129±0.007**    0.294±0.008**
50       0.125±0.008    0.311±0.010    0.138±0.008**    0.299±0.009*
100      0.166±0.019    0.385±0.007    0.149±0.007*     0.302±0.004**
150      0.179±0.004    0.402±0.008    0.169±0.016**    0.335±0.008*
All      0.186±0.002    0.422±0.001    0.199±0.016**    0.579±0.004**
* Significant at p