Exploiting Similarity to Optimize Recommendations from User Feedback
Hasta Vanchinathan, Andreas Krause (Learning and Adaptive Systems Group, D-INF, ETHZ)
Collaborators: Isidor Nikolic (Microsoft, Zurich), Fabio De Bona (Google, Zurich)
A Recommendation Example
[Figure sequence: a worked recommendation example]
Many real world instances…
Disclaimer: All trademarks belong to respective owners
Common Thread
• To do well, we need a model, e.g., of how items map to rewards
• Popular techniques include
  – Content-based filtering
  – Collaborative filtering
  – Hybrid recommendation systems
• All aim to predict reward given a fixed data set
Challenges
• Many, dynamic!
• Preferences change
• Estimating all combinations is both hard and wasteful!
• We only need to identify high-reward items!
Multi-Armed Bandits
• Early approaches break down when there are more arms than rounds (k > T), since every arm must be tried separately
Learning meets bandits
[Figure: unknown reward function f(x) over items x]
• Exploit similarity information to predict rewards for new items
• Must make assumptions on the reward function f(x), e.g. (spelled out below):
  – Linear (LinUCB, Li et al. '10)
  – Lipschitz (Bubeck et al. '08)
  – Low RKHS norm (GP-UCB, Srinivas et al. '12)
• This is the approach we pursue in this work!
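In standard (illustrative) notation, these three assumptions read:

    Linear:         f(x) = \theta^\top x            for some unknown weight vector \theta
    Lipschitz:      |f(x) - f(x')| \le L \, d(x, x')  for a metric d and constant L
    Low RKHS norm:  \|f\|_{\mathcal{H}_K} \le B       for the RKHS of a kernel K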
Problem Setup
[Figure: in each round a user arrives and a list of items is recommended; the legend marks the user attributes]
• Each arriving user is described by a set of user attributes (the context)
• We want to maximize the cumulative reward collected over all rounds
• Equivalently, minimize the regret against the best possible recommendations (written out below)
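The objective formulas are elided in this export; in this list-recommendation setting they typically read as follows (notation illustrative, with b the list size and r_{t,l} the reward of the item shown at slot l in round t):

    maximize   \sum_{t=1}^{T} \sum_{l=1}^{b} r_{t,l}

    equivalently, minimize the regret
    R_T = \sum_{t=1}^{T} \Big( OPT_t - \sum_{l=1}^{b} r_{t,l} \Big)

where OPT_t denotes the expected reward of the best list for the round-t context.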
Our Approach
• We propose CGPRank, which uses a Bayesian model for the rewards
• CGPRank efficiently shares reward information across
  – Items
  – Users
  – Positions
‘Demux’ing Feedback
• We still need to predict the reward of showing an item at a given position
• Assume: items do not influence the reward of other items in the list
• The observed click signal then decomposes into the item's relevance and the position's CTR
CGPRank – Sharing across positions
Example from the slide: CTRs of 0.3, 0.17, 0.16, 0.08 are observed at positions 1–4. For a second ranking, the CTRs at positions 2 and 3 are unknown (??) and are filled in as 0.19 and 0.13 using the position weights.

    Position:          1      2          3          4
    Observed CTR:      0.30   0.17       0.16       0.08
    Second ranking:    0.30   ?? → 0.19  ?? → 0.13  0.08
    Position weight:   1.00   0.80       0.65       0.47

• Position weights are independent of items!
• They are estimated from logs
• A CTR observed at one position transfers to another by rescaling with the position weights (see the sketch below)
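A minimal sketch of the multiplicative position-bias calculation this example alludes to; the weights are the slide's example values, while the function name and the transfer formula are illustrative assumptions, not taken verbatim from the paper:

    # Position weights: relative CTR of each display position, independent of
    # the item shown there, estimated offline from logs.
    position_weights = [1.0, 0.8, 0.65, 0.47]

    def transfer_ctr(ctr_observed, pos_observed, pos_new):
        """Transfer a CTR observed at one position to a prediction for another
        position by rescaling with the position weights (0-indexed positions)."""
        relevance = ctr_observed / position_weights[pos_observed]  # item-only effect
        return relevance * position_weights[pos_new]

    # e.g. an item with CTR 0.16 observed at position 3 is predicted ~0.20 at position 2:
    print(transfer_ctr(0.16, 2, 1))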
CGPRank – Sharing across items/users
Sharing across items / users with Gaussian processes
Bayesian models for functions:
• Prior P(f)
• Likelihood P(data | f)
• Posterior P(f | data)
[Figure: reward f(x) over choice x, showing likely and unlikely functions under the prior, observed data points, and the resulting posterior]
• Closed-form Bayesian posterior inference is possible! (formulas below)
• Allows us to represent uncertainty in predictions
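For reference, the closed-form posterior referred to here is the standard Gaussian process update (a known result, not copied from the slide): given noisy observations y_i = f(x_i) + \varepsilon_i with \varepsilon_i \sim \mathcal{N}(0, \sigma^2), the posterior at any x is Gaussian with

    \mu_t(x)       = k_t(x)^\top (K_t + \sigma^2 I)^{-1} y_{1:t}
    \sigma_t^2(x)  = k(x, x) - k_t(x)^\top (K_t + \sigma^2 I)^{-1} k_t(x)

where K_t is the kernel matrix of the points observed so far and k_t(x) is the vector of kernel values between x and those points.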
Predictive confidence in GPs
[Figure: posterior over f(x), with the marginal distribution P(f(x')) highlighted at a query point x']
• Typically, we only care about the "marginals", i.e., the predictive distribution P(f(x')) at a query point x'
• The GP is parameterized by the covariance function K(x, x') = Cov(f(x), f(x'))
• Many recommendation tasks can be captured using an appropriate covariance function (example below)
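One common construction for recommendation tasks, given here as an illustrative example rather than the slide's own choice, is a product kernel over (item, user) pairs:

    K\big((x, z), (x', z')\big) = K_{items}(x, x') \cdot K_{users}(z, z')

so that feedback on one (item, user) pair informs predictions for similar items shown to similar users.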
Intuition: Explore-Exploit using GPs
Selection Rule: pick the item with the highest upper confidence bound on its reward (made explicit below).
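The rule itself is elided in this export; the GP-UCB rule cited earlier (Srinivas et al.), which this intuition follows, reads

    x_t = \arg\max_x \; \mu_{t-1}(x) + \beta_t^{1/2} \, \sigma_{t-1}(x)

where \beta_t is the time-varying exploration/exploitation tradeoff parameter that reappears on the CGPRank slides below.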
CGPRank – Selection Rule
• At t = 0, with no prior observations, uncertainty is uniformly high
• At t = 0, with some prior observations, uncertainty shrinks not just at the observation…
• …but also at other locations, based on similarity!
• If the list size is 2: the first item is selected by maximizing the upper confidence bound
• Secret sauce? A time-varying tradeoff parameter
• Hallucinate the mean at the selected item and shrink the uncertainties… (see the sketch below)
• Now update the model and pick the second item using the same rule
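A minimal sketch of this selection loop, assuming a finite candidate set with a precomputed kernel matrix; the helper names (gp_posterior, select_list) and the noise/beta parameters are illustrative, not taken from the paper:

    import numpy as np

    def gp_posterior(K, obs_idx, y, noise=0.1):
        """Posterior mean and variance at every item, given observations y at
        indices obs_idx. K is the (n x n) kernel matrix over all candidates."""
        if len(obs_idx) == 0:
            return np.zeros(K.shape[0]), np.diag(K).copy()
        K_oo = K[np.ix_(obs_idx, obs_idx)] + noise * np.eye(len(obs_idx))
        K_ao = K[:, obs_idx]
        alpha = np.linalg.solve(K_oo, np.asarray(y, dtype=float))
        mu = K_ao @ alpha
        var = np.diag(K) - np.einsum('ij,ij->i', K_ao @ np.linalg.inv(K_oo), K_ao)
        return mu, var

    def select_list(K, obs_idx, y, list_size, beta, noise=0.1):
        """Pick a list greedily: UCB selection, then hallucinate the posterior
        mean as feedback so uncertainty (not the mean) shrinks before the next pick."""
        obs_idx, y = list(obs_idx), list(y)
        chosen = []
        for _ in range(list_size):
            mu, var = gp_posterior(K, obs_idx, y, noise)
            ucb = mu + np.sqrt(beta) * np.sqrt(np.maximum(var, 0.0))
            ucb[chosen] = -np.inf              # do not repeat items within the list
            i = int(np.argmax(ucb))
            chosen.append(i)
            obs_idx.append(i)
            y.append(mu[i])                    # hallucinated (mean) observation
        return chosen

Hallucinating the posterior mean leaves the mean unchanged but shrinks the posterior variance at the chosen item and, through the kernel, at similar items, which is what spreads the remaining picks across the candidate set.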
CGPRank
[Figure sequence: CGPRank algorithm walkthrough]
CGPRank – Guarantees
Theorem 1: If we choose the tradeoff parameter appropriately, then running CGPRank for T rounds incurs a regret sublinear in T.
The bound grows strongly sublinearly for typical kernels (typical form below).
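The exact expression is elided in this export; bounds in this GP-UCB family typically take a form like

    R_T = O\big( \sqrt{T \, b \, \beta_T \, \gamma_T} \big)

where b is the list size, \beta_T the tradeoff parameter, and \gamma_T the maximum information gain of the kernel; \gamma_T grows only polylogarithmically in T for common kernels (e.g., the RBF kernel), which is what makes the regret strongly sublinear.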
Experiments – Datasets
• Google books store logs
  – 42 days of user logs
  – Given a key book, suggest a list of related books
  – Kernel computed from the "related" graph on books
• Yahoo! Webscope R6B
  – 10 days of user logs on the Yahoo! front page
  – Unbiased method to test bandit algorithms
  – 45 million user interactions with 271 articles
  – Feedback available for a single selection; we simulated list selection
Experiments – Questions
• How much does principled sharing of feedback help?
  – Across items/contexts?
  – Across positions?
• Can CGPRank outperform an existing, tuned recommendation system?
Sharing across items [results figure]
Sharing across contexts [results figure]
Effect of increasing list size [results figure]
Boost over existing approach [results figure; baseline: existing algorithm]
Conclusions
• CGPRank: an efficient algorithm with strong theoretical guarantees
• Can generalize from sparse feedback across
  – Items
  – Contexts
  – Positions
• Experiments suggest statistical and computational efficiency