Designing Personalized Recommender Systems
Dr. Satya Gautam Vadlamudi, Principal Data Scientist, Capillary Technologies

Outline
● How to build a ratings based personalized recommender
● How to build a top-N type of personalized recommender
● How to build a content based personalized recommender

Ratings based personalized recommender

Problem Statement: Given that a user has rated some products/movies (say, on a scale of 1 to 5), the objective is to predict the rating that the user would give to a new product.


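What do we have
[Table: sparse user-item ratings matrix over users U1, U2, U3, ..., Um and products P1, P2, P3, P4, P5, ..., Pn. Known cells hold ratings (for example 5, 3.5, 1); unknown cells are marked "?".]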


What else do we have
● Lot of content (entire movie!)
● Cast & Crew
● User reviews
● Critics reviews
● CONTEXT
● And so much more..

And let’s not forget the humans!
User profile:
● Thinks a lot
● Watches 2 movies a year
● May like or hate the same movie
● Depends on CONTEXT
& more such profiles of billions of reco-hungry humans

Don’t try this at home! (or even on supercomputers)
Make each pixel of every frame of each movie at 4K resolution a feature, and train a deep learning model on ratings data from billions of people on millions of movies (and use all available web/video data of each human too).

Is this the future though?

So how do we approach solving this problem?

Let’s get back to the present!

You guessed it right!
1. If someone likes a product line, say dramas (has given them high ratings), then give dramas high scores for that user (Content Filtering)
2. Predict a user's rating for a product/movie based on how users similar to her have rated it (User User Collaborative Filtering)
3. Predict a user's rating for a product/movie based on how the products/movies she rated in the past relate to the current one (Item Item Collaborative Filtering)

Collaborative Filtering
Two types:
1. Neighbourhood methods
2. Latent factor models

Neighbourhood methods:
1. User User CF
2. Item Item CF

Latent factor models:
1. Matrix factorization with explicit feedback/with implicit feedback
2. Restricted Boltzmann Machines

Matrix factorization
For user u and item i, predict the rating $\hat{r}_{ui}$ as the dot product of the user-to-latent-factor affinities ($p_u$) and the latent-factor-to-item affinities ($q_i$):

$$\hat{r}_{ui} = q_i^T p_u \qquad (1)$$

where $p_u$ and $q_i$ are real-valued vectors of size f (the number of latent factors).
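A minimal sketch of the prediction step, assuming f = 2 latent factors. The affinity values and the factor labels in the comments are made up for illustration.

```python
def predict_rating(p_u, q_i):
    """Predicted rating: dot product of user factors p_u and item factors q_i."""
    if len(p_u) != len(q_i):
        raise ValueError("p_u and q_i must have the same number of latent factors")
    return sum(p * q for p, q in zip(p_u, q_i))

# Hypothetical affinities for f = 2 latent factors (say, "drama" and "action").
p_u = [1.5, 0.5]   # the user cares a lot about factor 1, a little about factor 2
q_i = [2.0, 1.0]   # the item loads mostly on factor 1
rating = predict_rating(p_u, q_i)   # 1.5 * 2.0 + 0.5 * 1.0
```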


Latent Factors example


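How to learn the latent factor affinities
Given K, the set of records (u, i, r) with a known rating, learn p and q by solving:

$$\min_{q, p} \sum_{(u,i) \in K} \left( r_{ui} - q_i^T p_u \right)^2 + \lambda \left( \lVert q_i \rVert^2 + \lVert p_u \rVert^2 \right) \qquad (2)$$

where λ is the regularization parameter.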
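A small sketch of evaluating this objective, assuming it is the regularized squared error from Koren et al. (2009) in the references, with the L2 penalty added once per known rating. The ratings and factor values are hypothetical.

```python
def regularized_loss(ratings, P, Q, lam):
    """Squared error over known (u, i, r) records plus an L2 penalty on p_u and q_i."""
    total = 0.0
    for u, i, r in ratings:
        pred = sum(a * b for a, b in zip(P[u], Q[i]))
        norm_p = sum(a * a for a in P[u])
        norm_q = sum(b * b for b in Q[i])
        total += (r - pred) ** 2 + lam * (norm_q + norm_p)
    return total

# Hypothetical records and factors (f = 2).
ratings = [("U1", "P1", 5.0), ("U2", "P1", 1.0)]
P = {"U1": [1.0, 2.0], "U2": [0.5, 0.0]}
Q = {"P1": [2.0, 1.0]}
loss = regularized_loss(ratings, P, Q, lam=0.1)
```

A larger λ pulls the factors toward zero, trading training error for less overfitting on sparse data.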


How do we solve Equation 2
1. Stochastic Gradient Descent (SGD)
2. Alternating Least Squares (ALS)
3. Singular Value Decomposition (SVD)
4. And more..

SGD basic idea: for each known rating, compute the prediction error $e_{ui} = r_{ui} - q_i^T p_u$, then move $p_u$ and $q_i$ a small step in the direction opposite to the gradient of Equation 2.

SVD basic idea: M = UΣV*, where M is m×n, U is m×m, Σ is m×n, V is n×n, and V* is the conjugate transpose of V.
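The SGD idea can be sketched as below. This assumes the update rule from Koren et al. (2009): with error e = r − q_i·p_u, set p_u ← p_u + γ(e·q_i − λ·p_u) and q_i ← q_i + γ(e·p_u − λ·q_i). The learning rate, epoch count, and toy ratings are illustrative choices, not tuned values.

```python
import math
import random

def sgd_matrix_factorization(ratings, n_factors=2, lr=0.02, lam=0.02,
                             epochs=2000, seed=0):
    """Learn user factors P and item factors Q from (user, item, rating) records."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    # Small random starting values; sorted() keeps the run reproducible.
    P = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}
    samples = list(ratings)
    for _ in range(epochs):
        rng.shuffle(samples)
        for u, i, r in samples:
            err = r - sum(p * q for p, q in zip(P[u], Q[i]))
            for f in range(n_factors):
                p_f, q_f = P[u][f], Q[i][f]
                # Step against the gradient of the regularized squared error.
                P[u][f] += lr * (err * q_f - lam * p_f)
                Q[i][f] += lr * (err * p_f - lam * q_f)
    return P, Q

def train_rmse(ratings, P, Q):
    """Root mean squared error of the model on the given records."""
    sq = [(r - sum(p * q for p, q in zip(P[u], Q[i]))) ** 2 for u, i, r in ratings]
    return math.sqrt(sum(sq) / len(sq))

# Hypothetical ratings on a 1 to 5 scale.
ratings = [("U1", "P1", 5.0), ("U1", "P2", 3.0), ("U2", "P1", 1.0),
           ("U2", "P3", 4.0), ("U3", "P2", 2.0), ("U3", "P3", 5.0)]
P, Q = sgd_matrix_factorization(ratings)
fit_error = train_rmse(ratings, P, Q)
```

A low training error on six ratings only shows the updates work; real use needs a held-out set.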

Alternating Least Squares (ALS)
Equation 2 is not convex since both q and p are unknown.
Steps:
1. Fix p (Eq. 2 becomes convex), and optimize for q using Least Squares
2. Fix q (Eq. 2 becomes convex), and optimize for p using Least Squares
3. Repeat steps 1 & 2 until Eq. 2 converges

● Easy to massively parallelize
● Can handle implicit data better than SGD
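The three steps above can be sketched with a single latent factor (f = 1), a simplification chosen here so that each least-squares solve has a scalar closed form. It assumes the penalty λ is applied once per known rating. The data is hypothetical, and a fixed number of sweeps stands in for a convergence check.

```python
def als_rank1(ratings, lam=0.1, sweeps=10):
    """ALS with one latent factor; returns p, q and the loss after each sweep."""
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    p = {u: 1.0 for u in users}
    q = {i: 1.0 for i in items}

    def loss():
        return sum((r - p[u] * q[i]) ** 2 + lam * (p[u] ** 2 + q[i] ** 2)
                   for u, i, r in ratings)

    history = [loss()]
    for _ in range(sweeps):
        # Step 1: fix p and solve the regularized least squares for each q_i.
        for item in items:
            rows = [(p[u], r) for u, i, r in ratings if i == item]
            q[item] = sum(x * r for x, r in rows) / sum(x * x + lam for x, _ in rows)
        # Step 2: fix q and solve the regularized least squares for each p_u.
        for user in users:
            rows = [(q[i], r) for u, i, r in ratings if u == user]
            p[user] = sum(x * r for x, r in rows) / sum(x * x + lam for x, _ in rows)
        history.append(loss())
    return p, q, history

# Hypothetical ratings.
ratings = [("U1", "P1", 5.0), ("U1", "P2", 3.0),
           ("U2", "P1", 2.5), ("U2", "P2", 1.5)]
p, q, history = als_rank1(ratings)
```

Each half-step minimizes the objective exactly over one block of variables, so the recorded loss never increases. The per-item and per-user solves are independent of each other, which is what makes ALS easy to parallelize.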


Adding more to ALS
● Adding Biases
● Adding input sources (helps with cold start)
● Temporal Dynamics
● Inputs with varying confidence levels (some records are more reliable)
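As an illustration of the first extension, here is a sketch of a biased prediction. It assumes the formulation in Koren et al. (2009): global average μ, plus user bias b_u, plus item bias b_i, plus the factor interaction. All numbers are hypothetical.

```python
def predict_with_biases(mu, b_u, b_i, p_u, q_i):
    """Global mean + user bias + item bias + latent factor interaction."""
    return mu + b_u + b_i + sum(p * q for p, q in zip(p_u, q_i))

# Hypothetical values: a critical user (b_u < 0) rating a well-liked item (b_i > 0).
rating = predict_with_biases(mu=3.7, b_u=-0.3, b_i=0.5,
                             p_u=[0.5, 0.0], q_i=[0.2, 0.9])
```

The biases absorb effects that hold regardless of the user-item pairing, such as a user who rates everything low, leaving the factors to model the interaction itself.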

Sample Latent Factor model learned


Netflix Prize Competition (2006)
● Training data set of 100,480,507 ratings that 480,189 users gave to 17,770 movies
● ~1.4M quiz set, ~1.4M test set
● Winning solution: Linear combination of 100+ algos!


Evaluation metrics
● Dev/Test set setup: Use timeline information
● Coverage: For what percentage of users whose history is available are we able to generate ratings?
● Accuracy: RMSE between the test data ratings and the predicted ratings
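A minimal sketch of the two metrics. The function names and sample values are illustrative. Coverage is assumed here to be the share of users with history for whom the model produced at least one prediction.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between test ratings and predicted ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def coverage(users_with_history, predictions_by_user):
    """Percentage of users with history who received at least one prediction."""
    covered = [u for u in users_with_history if predictions_by_user.get(u)]
    return 100.0 * len(covered) / len(users_with_history)

# Hypothetical test ratings and model outputs.
error = rmse([3.0, 5.0], [4.0, 4.0])
share = coverage(["U1", "U2", "U3", "U4"],
                 {"U1": [4.1], "U2": [3.0, 2.5], "U3": [5.0], "U4": []})
```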

DEMO - Anshu Kumar


Top-N recommender system

Problem Statement: Given that a user has purchased some products/movies, the objective is to predict a list of the top N products that the user would be most interested in.


Neighbourhood Methods


User User Collaborative Filtering


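User User Collaborative Filtering
Input: User-item matrix with ratings/purchase history
Steps:
1. Fit: Learn user-user correlation matrix
2. Transform: Generate personalized top-N list

User-user correlation (normalize your data first): Pearson correlation can be used. For users u and v, with sums over the items i that both have rated and $\bar{r}_u$ the mean rating of user u:

$$\mathrm{sim}(u, v) = \frac{\sum_i (r_{ui} - \bar{r}_u)(r_{vi} - \bar{r}_v)}{\sqrt{\sum_i (r_{ui} - \bar{r}_u)^2} \, \sqrt{\sum_i (r_{vi} - \bar{r}_v)^2}}$$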
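A sketch of the user-user Pearson correlation. It assumes the means are taken over co-rated items only, which is one common convention among several. Pairs with fewer than two common items, or with no variance, get a similarity of 0. The ratings are hypothetical.

```python
import math

def pearson(ratings_a, ratings_b):
    """Pearson correlation of two users over the items both have rated."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    dev_a = [ratings_a[i] - mean_a for i in common]
    dev_b = [ratings_b[i] - mean_b for i in common]
    denom = (math.sqrt(sum(x * x for x in dev_a))
             * math.sqrt(sum(y * y for y in dev_b)))
    if denom == 0:
        return 0.0
    return sum(x * y for x, y in zip(dev_a, dev_b)) / denom

# Hypothetical ratings keyed by product id.
u1 = {"P1": 5, "P2": 3, "P3": 1}
u2 = {"P1": 4, "P2": 3, "P3": 2, "P4": 5}   # same ordering of tastes as u1
u3 = {"P1": 1, "P2": 3, "P3": 5}            # opposite tastes to u1
sim_12 = pearson(u1, u2)
sim_13 = pearson(u1, u3)
```

Subtracting each user's mean is the normalization step: a generous rater and a harsh rater with the same preferences still correlate strongly.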

Selecting Neighbourhoods
● All neighbours
● Random K neighbours
● Top K neighbours
● Neighbours who have min. threshold of similarity

Fewer neighbours -> lower coverage, but also less noise from dissimilar neighbours. Typically, about 50 neighbours are used.
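A sketch of the fit and transform steps with top-K neighbour selection on binary purchase data. Cosine similarity over purchase sets stands in for Pearson here, which is a swapped-in choice for 0/1 data. Candidate items are scored by the summed similarity of the neighbours who bought them. The purchase data is hypothetical.

```python
import math

def cosine(set_a, set_b):
    """Cosine similarity of two binary purchase vectors given as sets."""
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / math.sqrt(len(set_a) * len(set_b))

def uucf_top_n(purchases, user, k=2, n=3):
    """Rank unseen items by the summed similarity of the top-k neighbours who bought them."""
    own = purchases[user]
    sims = [(cosine(own, items), other)
            for other, items in purchases.items() if other != user]
    # Keep the k most similar users with positive similarity; ties broken by user id.
    neighbours = sorted((s for s in sims if s[0] > 0),
                        key=lambda s: (-s[0], s[1]))[:k]
    scores = {}
    for sim, other in neighbours:
        for item in purchases[other] - own:
            scores[item] = scores.get(item, 0.0) + sim
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]

# Hypothetical purchase history.
purchases = {
    "U1": {"P1", "P2"},
    "U2": {"P1", "P3", "P5"},
    "U3": {"P1"},
    "U4": {"P2", "P4"},
    "U5": {"P4", "P5"},
}
top_two_neighbours = uucf_top_n(purchases, "U3", k=2, n=3)
top_one_neighbour = uucf_top_n(purchases, "U3", k=1, n=3)
```

With k=1 only U1 contributes, so U3 can be offered just one item. This is the coverage cost of a small neighbourhood.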


Exercise - UUCF
Find ranked lists for U3 & U4
[Table: binary purchase matrix over users U1 to U5 and products P1 to P5; a 1 marks a purchase.]


Exercise - UUCF with 2 neighbours
Find ranked lists for U3 & U4
[Table: binary purchase matrix over users U1 to U5 and products P1 to P5; a 1 marks a purchase.]


Item Item Collaborative Filtering
UUCF drawbacks:
1. Users’ tastes change fast
2. Users watch relatively few movies of the whole movie set, leading to sparse data, and few or no recommendations for many users

● Item-item affinity/correlation is much more stable
● Item-item correlations can be learned even from sparse data


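IICF
Input: User-item matrix with ratings/purchase history
Steps:
1. Fit: Learn item-item correlation matrix
2. Transform: Generate personalized top-N list (use of neighbourhood similar to UUCF)

Item-item correlation (normalize your data first): Pearson correlation can be used. For items i and j, with sums over the users u who have rated both and $\bar{r}_i$ the mean rating of item i:

$$\mathrm{sim}(i, j) = \frac{\sum_u (r_{ui} - \bar{r}_i)(r_{uj} - \bar{r}_j)}{\sqrt{\sum_u (r_{ui} - \bar{r}_i)^2} \, \sqrt{\sum_u (r_{uj} - \bar{r}_j)^2}}$$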
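A sketch of IICF on binary purchase data. As an assumption for 0/1 data, cosine similarity over the sets of buyers stands in for Pearson. Each unseen item is scored by its summed similarity to the items the user already bought. The purchase data is hypothetical.

```python
import math
from collections import defaultdict

def iicf_top_n(purchases, user, n=3):
    """Rank unseen items by their summed similarity to the user's purchased items."""
    # Fit: invert the history into item -> set of buyers.
    buyers = defaultdict(set)
    for u, items in purchases.items():
        for item in items:
            buyers[item].add(u)

    def sim(a, b):
        return len(buyers[a] & buyers[b]) / math.sqrt(len(buyers[a]) * len(buyers[b]))

    # Transform: score every item the user has not bought yet.
    own = purchases[user]
    scores = {}
    for candidate in list(buyers):
        if candidate in own:
            continue
        score = sum(sim(candidate, item) for item in own)
        if score > 0:
            scores[candidate] = score
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]

# Hypothetical purchase history.
purchases = {
    "U1": {"P1", "P2"},
    "U2": {"P1", "P3", "P5"},
    "U3": {"P1"},
    "U4": {"P2", "P4"},
    "U5": {"P4", "P5"},
}
ranked = iicf_top_n(purchases, "U3")
```

The item-item similarities depend only on the catalogue-wide co-purchase counts, so they can be precomputed once and reused for every user.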

Exercise - IICF
Find ranked lists for U3 & U4
[Table: binary purchase matrix over users U1 to U5 and products P1 to P5; a 1 marks a purchase.]


Evaluation metrics
● Dev/Test set setup: Use timeline information
● Coverage: For what percentage of users whose history is available are we able to generate recommendations?
● Accuracy: Hitrate@n (say, n = 5)
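A sketch of Hitrate@n, assuming the following definition (others are in use): the fraction of evaluated users for whom at least one of their actual later purchases appears in their top-n recommendations. The data is hypothetical.

```python
def hit_rate_at_n(recommended, actual, n=5):
    """Fraction of users with at least one actual purchase in their top-n list."""
    users = [u for u in actual if u in recommended]
    if not users:
        return 0.0
    hits = sum(1 for u in users if set(recommended[u][:n]) & set(actual[u]))
    return hits / len(users)

# Hypothetical ranked recommendations and later purchases.
recommended = {"U1": ["P1", "P2", "P3"], "U2": ["P4", "P5", "P6"]}
actual = {"U1": {"P2"}, "U2": {"P9"}}
rate_at_3 = hit_rate_at_n(recommended, actual, n=3)
rate_at_1 = hit_rate_at_n(recommended, actual, n=1)
```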

Sanity Sample

User history for July’16:
● BREAKING BAD; S2: MA15+ 2009
● BREAKING BAD; S3: MA15+ 2010
● BREAKING BAD; S1: MA15+ 2008
● GRIMM; S2: MA15+ 2013
● GRIMM; S1: M15 2012
● GRIMM; S3: MA15+ 2014

Recommendations suggested by Capillary Tech. for Aug’16:
1. BREAKING BAD; S4: MA15+ 2011
2. BREAKING BAD; S5: MA15+ 2012
3. BREAKING BAD; FINAL SEASON
4. GRIMM; S4 MA15+ 2015
5. TEEN WOLF; S5 P2: MA15+ 2015

Actual user purchases in Aug’16:
● BREAKING BAD; S4: MA15+ 2011
● BREAKING BAD; S5: MA15+ 2012
● BREAKING BAD; FINAL SEASON


DEMO - Shashi Kumar


Content based recommender system

Problem Statement: Given that a user has purchased some products/movies, the objective is to predict a list of the top N products that the user would be most interested in.

Say, data of only a single user is available, along with product meta-data.

Content Filtering


TFIDF
Term Frequency (TF) = No. of occurrences of a term in the document/product description/user purchase history

Inverse Document Frequency (IDF) = log(#documents/#documents with the term) (how few documents contain this term)

TFIDF = TF * IDF
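A minimal sketch of these definitions, using raw counts for TF and the natural logarithm for IDF, with no smoothing. The two toy documents are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Map each document name to {term: TF * IDF}, with IDF = log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for terms in docs.values() for term in set(terms))
    return {
        name: {term: count * math.log(n_docs / doc_freq[term])
               for term, count in Counter(terms).items()}
        for name, terms in docs.items()
    }

# Hypothetical keyword lists for two documents.
docs = {
    "d1": ["drama", "comedy", "drama"],
    "d2": ["drama", "action"],
}
weights = tfidf(docs)
```

"drama" appears in every document, so its IDF is log(1) = 0 and its weight vanishes even though it occurs twice in d1.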

What does it do
● Automatically downgrades stopwords and common terms
● Promotes core terms over incidental ones

Drawback: If a core term is not used much in the document, it gets a low weight


Vector space model
Each keyword is a dimension
Steps:
1. Learn p_u using TFIDF
2. Learn q_i using TFIDF
3. (Normalize if needed)
4. Compute Pearson correlation/cosine to rank

Limitation: Cannot handle interdependencies, e.g. someone likes Shahrukh in romantic movies but Salman in action movies, and not the other way round
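Step 4 can be sketched as a cosine ranking over sparse keyword vectors. The user profile p_u and the item vectors q_i below are hypothetical weights, standing in for the TFIDF vectors from steps 1 and 2.

```python
import math

def cosine(vec_a, vec_b):
    """Cosine similarity of two sparse vectors stored as {keyword: weight}."""
    dot = sum(w * vec_b.get(k, 0.0) for k, w in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_items(p_u, item_vectors):
    """Order items by decreasing cosine similarity to the user profile."""
    return sorted(item_vectors, key=lambda i: (-cosine(p_u, item_vectors[i]), i))

# Hypothetical keyword weights.
p_u = {"drama": 1.0, "comedy": 2.0}
item_vectors = {
    "A": {"comedy": 1.0},
    "B": {"action": 1.0},
    "C": {"drama": 1.0, "comedy": 1.0},
}
ranked = rank_items(p_u, item_vectors)
```

Cosine divides by both vector lengths, so it covers the normalization in step 3.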

Exercise - TFIDF
Give recommendations for U3 & U4

User    Movie
U1      Dangal
U1      Hindi Medium
U2      Bahubali 2
U3      Bahubali 2
U3      Hindi Medium
U3      The Ghazi Attack
U4      Dangal
U4      Toilet - Ek Prem Katha
U5      Hindi Medium

Movie                     Keywords
Toilet - Ek Prem Katha    Comedy, Drama
Hindi Medium              Comedy, Drama
Bahubali 2                Action, Adventure, Drama
Jolly LLB 2               Comedy, Crime, Drama
The Ghazi Attack          Action, Drama, History
Dangal                    Action, Biography, Drama
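One possible way to work the exercise, offered as a sketch rather than the official answer. It reads the exercise data as two tables (user to movie, movie to keywords). It assumes the user profile p_u is the sum of the TFIDF vectors of the movies the user watched, that IDF is computed over the six movies, and that candidates are ranked by cosine similarity.

```python
import math
from collections import Counter

keywords = {
    "Toilet - Ek Prem Katha": ["Comedy", "Drama"],
    "Hindi Medium": ["Comedy", "Drama"],
    "Bahubali 2": ["Action", "Adventure", "Drama"],
    "Jolly LLB 2": ["Comedy", "Crime", "Drama"],
    "The Ghazi Attack": ["Action", "Drama", "History"],
    "Dangal": ["Action", "Biography", "Drama"],
}
history = {
    "U3": ["Bahubali 2", "Hindi Medium", "The Ghazi Attack"],
    "U4": ["Dangal", "Toilet - Ek Prem Katha"],
}

doc_freq = Counter(k for kws in keywords.values() for k in kws)
idf = {k: math.log(len(keywords) / c) for k, c in doc_freq.items()}
# q_i: each keyword occurs once per movie, so TF = 1 and the weight is the IDF.
q = {movie: {k: idf[k] for k in kws} for movie, kws in keywords.items()}

def profile(user):
    """p_u: sum of the TFIDF vectors of the movies in the user's history."""
    p = Counter()
    for movie in history[user]:
        for k, w in q[movie].items():
            p[k] += w
    return p

def cosine(a, b):
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(user):
    """Unseen movies ordered by similarity to the user's profile."""
    p = profile(user)
    unseen = [m for m in keywords if m not in history[user]]
    return sorted(unseen, key=lambda m: (-cosine(p, q[m]), m))

recs = {user: recommend(user) for user in history}
```

Under these assumptions "Drama" has zero weight because every movie carries it. U3's list starts with Toilet - Ek Prem Katha and U4's with Hindi Medium, since short keyword lists that match the profile score highest under cosine.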

DEMO - Sanket Sahu


References
1. https://www.coursera.org/specializations/recommender-systems - Univ. of Minnesota, Prof. Joseph A Konstan, Dr. Michael D Ekstrand
2. Koren, Yehuda, Robert Bell, and Chris Volinsky. "Matrix factorization techniques for recommender systems." Computer 42.8 (2009).
3. J. Bennett and S. Lanning, “The Netflix Prize,” KDD Cup and Workshop, 2007; www.netflixprize.com.
4. D. Goldberg et al., “Using Collaborative Filtering to Weave an Information Tapestry,” Comm. ACM, vol. 35, 1992, pp. 61-70.
5. Salakhutdinov, Ruslan, Andriy Mnih, and Geoffrey Hinton. "Restricted Boltzmann machines for collaborative filtering." Proceedings of the 24th international conference on Machine learning. ACM, 2007.

Thank You