Digital Enterprise Research Institute
www.deri.ie
An Open Framework for Multi-source, Cross-domain Personalisation with Semantic Interest Graphs
Benjamin Heitmann, Digital Enterprise Research Institute (DERI), National University of Ireland, Galway
© Copyright 2011 Digital Enterprise Research Institute. All rights reserved.
Enabling Networked Knowledge
Problem
Users expect personalised experiences
Preferences are:
– distributed
– spread over many domains
New site:
– no user profile
– no recommendations
Not just a cold-start problem:
– privacy
– interoperability
– recommendation approach
[Figure: domain-specific web sites (Travel, Politics, Sports, Food)]
Enabling Networked Knowledge Benjamin Heitmann, slide: 2 /18
Closed approach to cross-domain personalisation
➡ Centralised user profile
➡ Data sharing and aggregation
➡ Closed system
➡ No portability
➡ User trade-offs: privacy, trust, data ownership, control
➡ Examples: Facebook, Google+, Twitter
[Figure labels: express preference; authentication for user action; web site interaction; cross-domain data sharing if authorised by user; recommendations for external site provided by Facebook]
Killer application for the Semantic Web?
Significance
Problems with current recommendation process:
1. Only single-source recommendations
2. Only single-domain recommendations
3. No portability of user profile data
Who needs this? Sounds far-fetched?
Research questions
Architecture:
– Enabling of a decentralised ecosystem?
– Aggregation of user profile fragments?
– Privacy-enabling and interoperable at the same time?
User model:
– Data structure for merging?
– Background knowledge?
– Domain definitions?
Algorithm:
– Type of algorithm?
– Data sets and metrics for evaluation?
Alternative: an open framework for cross-domain recommendations
Architecture for privacy-enabled profile exchange:
– private and secure aggregation of profiles
Distributed and domain-agnostic user model:
– semantic interest graph
– merging of profiles
Cross-domain recommendation algorithm:
– spreading activation algorithm finds new recommendations
[Figure: new recommendations (travel destinations, movies)]
These correspond to the three main contributions of the PhD.
Architecture for privacy-enabled and portable user profiles
User profile:
– Profile data expressed using RDF (FOAF + SIOC)
– WebID provides identity (2 parts):
  – private SSL key in user agent
  – public SSL key in FOAF profile
Roles:
– user agents: manage user identities
– profile storage service: stores one or more profiles
– data consumers: provide services for users
[Figure: the user agent holds the WebID private key; the public key is stored in the FOAF profile on the profile storage site; the data consumer retrieves the user profile if the user authorises it]
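A minimal Python sketch of how the two parts of a WebID could fit together, assuming the usual WebID pattern in which a data consumer compares the public key presented by the user agent with the keys listed in the FOAF profile. The class names, the `fetch_profile` callback and the example URIs are illustrative assumptions, not part of the presented architecture, and no real cryptography is performed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(frozen=True)
class PublicKey:
    modulus: int
    exponent: int


@dataclass
class FoafProfile:
    """Hypothetical, heavily reduced view of a FOAF profile document."""
    webid: str
    public_keys: List[PublicKey] = field(default_factory=list)


@dataclass(frozen=True)
class ClientCertificate:
    """Stand-in for what the user agent presents: a WebID URI plus a public key.
    Proof that the agent owns the matching private key is out of scope here."""
    webid: str
    public_key: PublicKey


def webid_matches(cert: ClientCertificate,
                  fetch_profile: Callable[[str], Optional[FoafProfile]]) -> bool:
    # Look up the profile behind the WebID and check that it lists the same key.
    profile = fetch_profile(cert.webid)
    return profile is not None and cert.public_key in profile.public_keys


# Example data (assumed values).
ALICE = "https://storage.example/alice#me"
KEY = PublicKey(modulus=0xC0FFEE, exponent=65537)
PROFILES = {ALICE: FoafProfile(webid=ALICE, public_keys=[KEY])}
```

Calling `webid_matches(ClientCertificate(ALICE, KEY), PROFILES.get)` succeeds only because the profile stored under the WebID lists the same public key; a different key or an unknown WebID is rejected.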
Communication pattern of the proposed architecture
Scenario: restaurant recommendation
Assumption: user is logged into Openbook
1. User requests nice restaurants from Chow
2. Chow gets the profile storage URI via Firefox
3. Chow redirects Firefox to Openbook for authorisation
4. User authorises Openbook to show some profile parts to Chow
5. Openbook redirects back to Chow
6. Chow now accesses the authorised parts of the profile data on Openbook
[Figure: Firefox is the user agent (WebID private key, storage URI), Chow is the data consumer, and Openbook is the profile storage site (FOAF profile with public key)]
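The six steps of the scenario can be sketched as a small in-memory simulation. This is only an illustration under assumptions: `ProfileStorage`, `restaurant_scenario`, the profile part names and the example URI are hypothetical, and the HTTP redirects and WebID authentication are reduced to plain function calls.

```python
class ProfileStorage:
    """Hypothetical stand-in for the profile storage site (Openbook)."""

    def __init__(self, profiles):
        self.profiles = profiles      # user -> {profile part -> interests}
        self.grants = {}              # (user, consumer) -> set of allowed parts

    def authorise(self, user, consumer, parts):
        # Step 4: the user decides which profile parts the consumer may see.
        self.grants[(user, consumer)] = set(parts)

    def read(self, user, consumer):
        # Step 6: only the authorised parts are returned.
        allowed = self.grants.get((user, consumer), set())
        return {part: data for part, data in self.profiles[user].items()
                if part in allowed}


def restaurant_scenario(storage_uri, storages, approved_parts, consumer="chow"):
    # Step 1: the user asks the data consumer (Chow) for nice restaurants.
    # Step 2: the user agent (Firefox) tells Chow the storage URI.
    storage, user = storages[storage_uri]
    # Steps 3 and 4: redirect to Openbook, where the user authorises parts.
    storage.authorise(user, consumer, approved_parts)
    # Steps 5 and 6: redirect back to Chow, which reads the granted parts.
    return storage.read(user, consumer)


# Example data (assumed values).
openbook = ProfileStorage({
    "alice": {"food": ["dbpedia:Sushi"], "politics": ["dbpedia:Green_politics"]},
})
STORAGES = {"https://openbook.example/alice": (openbook, "alice")}
```

Before authorisation, `openbook.read("alice", "chow")` returns an empty dictionary. After the scenario runs with `["food"]` approved, Chow sees the food interests but not the politics ones.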
RDF is a graph!
Graph with typed edges and typed vertices
Expressed as triples of: subject, predicate, object
Entity types:
– URIs
– Strings, optionally with language tag XOR data type
– Blank nodes
Rules for the triples:
– subject can be a URI or blank node, predicate must be a URI, object can be anything.
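The triple rules can be written down as a small type check. The following Python sketch is illustrative only: the class names `URI`, `BlankNode` and `Literal` and the function `is_valid_triple` are assumptions made for this example and do not belong to any particular RDF library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    text: str
    language: Optional[str] = None
    datatype: Optional[URI] = None

    def __post_init__(self):
        # Language tag XOR data type: a string may carry one, never both.
        if self.language is not None and self.datatype is not None:
            raise ValueError("literal has both a language tag and a data type")


def is_valid_triple(subject, predicate, obj) -> bool:
    # Subject: URI or blank node. Predicate: URI only. Object: any term.
    return (isinstance(subject, (URI, BlankNode))
            and isinstance(predicate, URI)
            and isinstance(obj, (URI, BlankNode, Literal)))
```

For example, a triple with a `Literal` in subject position is rejected, while the same literal is accepted as an object.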
User model
Semantic interest graphs:
– interests represented as DBpedia URIs
– supports merging
– domain-agnostic
– transferable between systems
– independent of actual inventory
Unsuitable alternatives:
– item-rating vector
– lists of plain text items/tags
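One way to see why URI-based interest graphs support merging is to treat each profile fragment as a set of triples, so that merging becomes a set union. The Python sketch below is an illustration under assumptions: the predicate `foaf:topic_interest`, the user URI and the two example sites are chosen for this example and are not named in the slides.

```python
INTEREST = "foaf:topic_interest"  # assumed predicate, not named on the slide
DBP = "http://dbpedia.org/resource/"


def merge_interest_graphs(*graphs):
    """Merge profile fragments; identical triples collapse automatically."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def interests_of(graph, user):
    """Return the sorted DBpedia URIs a user is interested in."""
    return sorted(o for s, p, o in graph if s == user and p == INTEREST)


# Two profile fragments for the same user from two different sites (assumed data).
USER = "https://storage.example/alice#me"
book_site = {(USER, INTEREST, DBP + "Douglas_Adams")}
food_site = {(USER, INTEREST, DBP + "Sushi"),
             (USER, INTEREST, DBP + "Douglas_Adams")}
```

Because both sites identify the interest by the same DBpedia URI, the shared interest appears only once after merging. With plain text tags, two spellings of the same interest would remain separate entries.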
Semantic background knowledge + domain definitions
Background knowledge:
– DBpedia
– semantic graph
– indirect connections
– categories and properties for concepts
Domain definitions:
– using SKOS (Simple Knowledge Organization System)
– define recommendable entities of a domain
  – Music: artists, CDs, tracks, genre
  – Food: restaurant, dish, chef, ingredient
[Figure: example domains: food, movie, travel destination]
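A domain definition can be pictured as a mapping from a domain to the entity types that are recommendable in it. In the Python sketch below a plain dictionary stands in for the SKOS-based definitions; the dictionary layout, the function name `filter_to_domain` and the example candidates are assumptions for illustration.

```python
# Recommendable entity types per domain, taken from the two examples above.
DOMAINS = {
    "music": {"artist", "cd", "track", "genre"},
    "food": {"restaurant", "dish", "chef", "ingredient"},
}


def filter_to_domain(candidates, domain):
    """Keep only the candidates whose type is recommendable in the domain.

    candidates: list of (entity, entity_type) pairs.
    """
    allowed = DOMAINS[domain]
    return [entity for entity, entity_type in candidates if entity_type in allowed]


# Hypothetical candidates with hand-assigned types.
CANDIDATES = [
    ("dbpedia:Sushi", "dish"),
    ("dbpedia:Radiohead", "artist"),
    ("dbpedia:Jamie_Oliver", "chef"),
]
```

Filtering `CANDIDATES` to the food domain keeps the dish and the chef and drops the artist, which is the role a domain definition plays when results must come from a specified target domain.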
Cross-domain recommendation algorithm
Algorithm:
– Spreading activation
– graph-based
– uses semantic network
– able to provide results of specified target domain (or inventory)
Unconstrained SA == depth-first search (pictured)
Why does that already work?
– DBpedia is super dense!
– avg degree: ~20
[Figure: spreading activation over a DBpedia excerpt that connects the user profile to recommendable items; the start of spreading activation is marked. Nodes: Douglas Adams, Richard Dawkins, Kurt Vonnegut, Atheism Activists, Cambridge, United Kingdom, The Hitchhiker's Guide to the Galaxy (novel), The Restaurant at the End of the Universe, Macmillan. Edge labels: influencedBy, dc:subject, author, birthplace, subsequentWork, subdivisionName, publisher, country.]
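A rough Python sketch of spreading activation with a target-domain filter. It is a simplification under assumptions, not the implementation presented here: activation is spread in waves, level by level, rather than in the depth-first order described above; the `decay` factor and the equal split among neighbours are assumed; and the edges of the tiny example graph are guessed from the figure labels.

```python
from collections import defaultdict


def spread_activation(graph, seeds, waves=2, decay=0.5):
    """graph: node -> list of neighbours; seeds: node -> initial activation."""
    activation = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(waves):
        next_frontier = defaultdict(float)
        for node, value in frontier.items():
            neighbours = graph.get(node, [])
            if not neighbours:
                continue
            # Split the decayed activation equally among all neighbours.
            share = decay * value / len(neighbours)
            for neighbour in neighbours:
                next_frontier[neighbour] += share
        for node, value in next_frontier.items():
            activation[node] += value
        frontier = next_frontier
    return dict(activation)


def recommend(graph, seeds, target_items, top_n=3):
    """Rank activated nodes, keeping only items of the target domain."""
    scores = spread_activation(graph, seeds)
    ranked = sorted(((score, node) for node, score in scores.items()
                     if node in target_items and node not in seeds),
                    reverse=True)
    return [node for _, node in ranked[:top_n]]


# Tiny example graph (assumed edges).
GRAPH = {
    "Douglas_Adams": ["Hitchhikers_Guide", "Cambridge"],
    "Hitchhikers_Guide": ["Restaurant_at_the_End"],
    "Cambridge": ["United_Kingdom"],
}
BOOKS = {"Hitchhikers_Guide", "Restaurant_at_the_End"}
```

Starting from an interest in `Douglas_Adams`, `recommend(GRAPH, {"Douglas_Adams": 1.0}, BOOKS)` ranks the directly connected novel above its sequel, and the places never appear because they are outside the target set.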
SA configuration
Constraints:
– fan-out / distance
– path
Semantic configuration:
– type of activations (domain definition)
– activation type
– link type weights
Configuration:
– node/link blacklists
– number of target activations
– fan-out penalty
– distance penalty
– initial activation spread
– activation threshold
– maximum degree
Iterative algorithm:
– number of phases: re-activation after stabilising?
– number of waves per phase: how often to spread?
Implementation:
– HDT RDF store (in-house)
– Neo4j, Giraph/Hadoop unsuccessful
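The configuration options could be grouped into one settings object that decides how much activation travels along a single link. The slides only name the parameters, so everything about their semantics in this Python sketch is an assumption: the formula, the default values, and the names `SAConfig` and `outgoing_activation` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SAConfig:
    activation_threshold: float = 0.1   # nodes below this do not spread
    fanout_penalty: float = 1.0         # exponent applied to the node degree
    distance_penalty: float = 0.5       # factor applied once per hop
    maximum_degree: int = 100           # nodes with more links do not spread
    link_type_weights: dict = field(default_factory=dict)
    link_blacklist: set = field(default_factory=set)


def outgoing_activation(value, degree, distance, link_type, cfg):
    """Activation sent along one link, or 0.0 if a constraint blocks it."""
    if value < cfg.activation_threshold or degree > cfg.maximum_degree:
        return 0.0
    if link_type in cfg.link_blacklist or degree == 0:
        return 0.0
    weight = cfg.link_type_weights.get(link_type, 1.0)
    fanout = 1.0 / (degree ** cfg.fanout_penalty)
    return value * weight * fanout * (cfg.distance_penalty ** distance)


# Example configuration (assumed values).
CFG = SAConfig(link_type_weights={"dc:subject": 0.5}, link_blacklist={"country"})
```

With this configuration, a node with activation 1.0 and four links, one hop from the start, sends 0.125 along an `author` link, half of that along a down-weighted `dc:subject` link, and nothing along the blacklisted `country` link.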
Evaluation plan
Theoretical framework: link prediction problem
Metrics:
– AUC & precision
– diversity, novelty, personalisation, heterogeneity
Data sources:
– User profiles: StackExchange network (cross-domain user profiles)
– Background knowledge: DBpedia
Baseline algorithms:
– Linked Data Semantic Distance (LDSD)
– Random Walk with Restart (RWR)
– Collaborative Filtering (CF)
User study, depends on time constraints
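For the link prediction setting, AUC and precision can be computed from the scores an algorithm assigns to held-out links and to non-existent links. The Python sketch below uses the standard pairwise definition of AUC and precision at a cut-off k; the function names and the example numbers are assumptions, and the slides do not state which exact variants are planned.

```python
def auc(positive_scores, negative_scores):
    """Probability that a held-out link outscores a non-existent link.

    Ties count as half a win, following the usual pairwise definition.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def precision_at_k(ranked_items, relevant_items, k):
    """Fraction of the top-k recommendations that are relevant."""
    top = ranked_items[:k]
    return sum(1 for item in top if item in relevant_items) / k
```

As a small worked example, held-out links scored 0.9 and 0.4 against non-links scored 0.5 and 0.1 win three of the four pairwise comparisons, so the AUC is 0.75.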
Impact
Academia:
– Personalisation: hot topic in Semantic Web community
– Many (!) workshops
– Best paper award at I-Semantics 2012: “Linked Open Data to support Content-based Recommender Systems”, Di Noia et al.
Industry:
– current Cisco Ireland collaboration
– Hunch.com: bought by eBay in 2011
– StumbleUpon presentation (Wed, RecSys 2012)
– Personalisation has become a commodity
– Facebook approach requires multi-source, cross-domain recommendations
– Decentralised social networks like Diaspora*
Achievements and future plans
Finished:
– algorithm implementation
– architecture
Next steps:
– Off-line evaluation using all StackExchange data
– User study, depends on time constraints
Publications:
– “Personalisation of Social Web Services in the Enterprise Using Spreading Activation for Multi-Source, Cross-Domain Recommendations”, AAAI Spring Symposium on Intelligent Web Services Meet Social Computing, 2012.
– “An architecture for privacy-enabled user profile portability on the Web of Data”, HetRec Workshop at RecSys 2010.
– “An empirically-grounded conceptual architecture for applications on the Web of Data”, IEEE Transactions on Systems, Man and Cybernetics, Part C - Applications and Reviews, 2011.
Summary
Goal of research:
– alternative to current closed ecosystems
– open framework for cross-domain & multi-source recommendations
1.) Architecture for privacy-enabled profile exchange:
– mechanism for authorisation of data exchange through the user
– enables private and secure profile exchange
2.) Distributed and domain-agnostic user model:
– provides semantic graph as user model, background knowledge and domain definitions
– enables aggregating and merging of profiles
3.) Cross-domain recommendation algorithm:
– provides graph-based algorithm
– enables personalisation in a target domain using any interests
Characteristics of the Spreading Activation algorithm
SA is very different from e.g. PageRank
“SA is depth-first search, guided/interrupted by domain logic and algorithm conditions”
Challenges when implementing SA:
– requires semantic graph
– size of data (DBpedia: 11 million entities, 40 million edges)
– iterative algorithm
– embedding of domain logic
– stateful nodes
– execution speed for Cisco ADVANSSE use case