An Open Framework for Multi-source, Cross-domain ...

Digital Enterprise Research Institute

www.deri.ie

An Open Framework for Multi-source, Cross-domain Personalisation with Semantic Interest Graphs

Benjamin Heitmann, Digital Enterprise Research Institute (DERI), National University of Ireland, Galway

© Copyright 2011 Digital Enterprise Research Institute. All rights reserved.

Enabling Networked Knowledge

Problem

Users expect personalised experiences. Preferences are:
- distributed
- spread across many domains

New site:
- no user profile
- no recommendations

Not just a cold-start problem:
- privacy
- interoperability
- recommendation approach

[Figure: domain-specific web sites for Travel, Politics, Sports and Food]

Closed approach to cross-domain personalisation

➡ Centralised user profile
➡ Data sharing and aggregation
➡ Closed system
➡ No portability
➡ User trade-offs: privacy, trust, data ownership, control
➡ Examples: Facebook, Google+, Twitter

[Figure labels: express preference; authentication for user action; web site interaction; cross-domain data sharing if authorised by user; recommendations for external site provided by Facebook]

Killer application for the Semantic Web?

Significance

Problems with current recommendation process:
1. Only single-source recommendations
2. Only single-domain recommendations
3. No portability of user profile data

Who needs this? Sounds far-fetched?

Research questions

Architecture
- Enabling of a decentralised eco-system?
- Aggregation of user profile fragments?
- Privacy-enabling and interoperable at the same time?

User model
- Data structure for merging?
- Background knowledge?
- Domain definitions?

Algorithm
- Type of algorithm?
- Data sets and metrics for evaluation?

Alternative: an open framework for cross-domain recommendations

1. Architecture for privacy-enabled profile exchange: private and secure aggregation of profiles
2. Distributed and domain-agnostic user model: merging of profiles into a semantic interest graph
3. Cross-domain recommendation algorithm: a spreading activation algorithm finds new recommendations (pictured: travel destinations, movies)

These correspond to the three main contributions of the PhD.

Architecture for privacy-enabled and portable user profiles

User profile:
- Profile data expressed using RDF (FOAF+SIOC)
- WebID provides identity (2 parts)
  - private SSL key in user agent
  - public SSL key in FOAF profile

Roles:
- user agents: manage user identities
- profile storage service: stores 1 or more profiles
- data consumers: provide services for users

[Figure: the WebID consists of a private key in the user agent and a public key in the FOAF profile; the FOAF profile is stored in the profile storage site; the data consumer retrieves the user profile if the user authorises it]

Communication pattern of the proposed architecture

Scenario: restaurant recommendation
Assumption: user is logged into Openbook

1. User requests nice restaurants from Chow
2. Chow gets the profile storage URI via Firefox
3. Chow redirects Firefox to Openbook for authorisation
4. User authorises Openbook to show some profile parts to Chow
5. Openbook redirects to Chow
6. Now Chow accesses parts of profile data on Openbook

[Figure: WebID with private key and storage URI in the user agent (Firefox) and public key in the FOAF profile; the FOAF profile is stored in the profile storage site (Openbook); the data consumer (Chow) retrieves the authorised profile parts. Step annotations: "Any nice restaurants?"; "Firefox provides storage URI"; "redirect to Openbook for authorisation"; "User authorises Openbook to show parts of profile to Chow"; "redirect back to Chow"; "Chow retrieves profile parts now"]
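The six steps of the restaurant scenario can be sketched as plain in-memory objects. This is only an illustration under stated assumptions: the class names `ProfileStorage`, `UserAgent` and `DataConsumer`, the `authorise`/`read` methods and the example URIs are hypothetical, and the real architecture relies on WebID certificates and HTTP redirects, neither of which is modelled here.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileStorage:
    """Profile storage site (Openbook in the scenario); hypothetical in-memory model."""
    uri: str
    profiles: dict = field(default_factory=dict)  # web_id -> {part name: data}
    grants: dict = field(default_factory=dict)    # (web_id, consumer) -> allowed part names

    def authorise(self, web_id, consumer, parts):
        # Step 4: the user lets the storage site show some profile parts to the consumer.
        self.grants[(web_id, consumer)] = set(parts)

    def read(self, web_id, consumer):
        # Step 6: only the authorised parts are handed out.
        allowed = self.grants.get((web_id, consumer), set())
        profile = self.profiles.get(web_id, {})
        return {name: data for name, data in profile.items() if name in allowed}


@dataclass
class UserAgent:
    """User agent (Firefox in the scenario); knows the WebID and the storage URI."""
    web_id: str
    storage_uri: str


@dataclass
class DataConsumer:
    """Data consumer (Chow in the scenario)."""
    name: str

    def recommend(self, agent, storages, approved_parts):
        # Step 1 is this call: the user asks the consumer for recommendations.
        storage = storages[agent.storage_uri]  # Step 2: storage URI comes from the user agent
        # Steps 3-5 (redirect, user approval, redirect back) are collapsed into one call.
        storage.authorise(agent.web_id, self.name, approved_parts)
        return storage.read(agent.web_id, self.name)  # Step 6


openbook = ProfileStorage(uri="https://openbook.example/")
openbook.profiles["https://openbook.example/alice#me"] = {
    "food": ["http://dbpedia.org/resource/Sushi"],
    "politics": ["http://dbpedia.org/resource/Green_politics"],
}
firefox = UserAgent("https://openbook.example/alice#me", openbook.uri)
chow = DataConsumer("Chow")

visible = chow.recommend(firefox, {openbook.uri: openbook}, approved_parts=["food"])
print(visible)
```

The point of the sketch is that the data consumer never sees the whole profile: a consumer without a grant gets an empty result, and Chow only receives the parts the user approved.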

RDF is a graph!

- graph with typed edges and typed vertices
- expressed as triples of: subject, predicate, object
- entity types:
  - URIs
  - Strings, optionally with language tag XOR data type
  - Blank nodes
- Rules for the triples:
  - subject can be URI or blank node, predicate just URI, object anything.
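The triple rules above can be written down as a small type check. This is a minimal sketch using only the standard library; the class names `URI`, `BlankNode` and `Literal` and the function `is_valid_triple` are chosen for this illustration and do not come from any particular RDF library.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    text: str
    language: Optional[str] = None
    datatype: Optional[URI] = None

    def __post_init__(self):
        # A string may carry a language tag XOR a data type, never both.
        if self.language is not None and self.datatype is not None:
            raise ValueError("literal cannot have both a language tag and a data type")


Term = Union[URI, BlankNode, Literal]


def is_valid_triple(subject: Term, predicate: Term, obj: Term) -> bool:
    """Subject: URI or blank node; predicate: URI only; object: any term."""
    return (
        isinstance(subject, (URI, BlankNode))
        and isinstance(predicate, URI)
        and isinstance(obj, (URI, BlankNode, Literal))
    )


adams = URI("http://dbpedia.org/resource/Douglas_Adams")
label = URI("http://www.w3.org/2000/01/rdf-schema#label")
print(is_valid_triple(adams, label, Literal("Douglas Adams", language="en")))
print(is_valid_triple(Literal("Douglas Adams"), label, adams))
```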

User model

Semantic interest graphs:
- interests represented as DBpedia URIs
- supports merging
- domain-agnostic
- transferable between systems
- independent of actual inventory

Unsuitable alternatives:
- item-rating vector
- lists of plain text items/tags
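Why URI-based interests support merging can be shown with a short sketch. It assumes that each site exposes a profile fragment as a set of (subject, predicate, object) triples and that interests are linked with `foaf:topic_interest`; the slides do not say which predicate is used, and the function name `merge_interest_graphs` and the example data are made up for this illustration.

```python
DBR = "http://dbpedia.org/resource/"
TOPIC_INTEREST = "http://xmlns.com/foaf/0.1/topic_interest"  # assumed predicate
ALICE = "https://openbook.example/alice#me"                  # hypothetical WebID


def merge_interest_graphs(*fragments):
    """Merge profile fragments (sets of triples) by plain set union."""
    merged = set()
    for fragment in fragments:
        merged |= fragment
    return merged


# Two sites know different things about the same user.
music_site = {
    (ALICE, TOPIC_INTEREST, DBR + "Radiohead"),
    (ALICE, TOPIC_INTEREST, DBR + "Douglas_Adams"),
}
book_site = {
    (ALICE, TOPIC_INTEREST, DBR + "Douglas_Adams"),
    (ALICE, TOPIC_INTEREST, DBR + "Kurt_Vonnegut"),
}

merged = merge_interest_graphs(music_site, book_site)
interests = sorted(obj for _, _, obj in merged)
print(interests)
```

Because both sites name the shared interest with the same DBpedia URI, the duplicate collapses on union. Plain-text tags such as "Douglas Adams" and "douglas adams (author)" would not match without extra reconciliation.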

Semantic background knowledge + domain definitions

Background knowledge:
- DBpedia
- semantic graph
- indirect connections
- categories and properties for concepts

Domain definitions: using SKOS (Simple Knowledge Organisation System)
- define recommendable entities of domain
  - Music: artists, CDs, tracks, genre
  - Food: restaurant, dish, chef, ingredient

[Figure: example domains: food, movie, travel destination]
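A domain definition decides which entities count as recommendable for a target domain. The sketch below uses a plain dictionary instead of SKOS, purely to show the filtering idea; the `DOMAINS` table mirrors the two examples on the slide, while the function name `recommendable` and the sample entities are assumptions.

```python
# Hypothetical stand-in for SKOS-based domain definitions.
DOMAINS = {
    "music": {"artist", "cd", "track", "genre"},
    "food": {"restaurant", "dish", "chef", "ingredient"},
}


def recommendable(entities, domain):
    """Keep only entities whose type is listed in the target domain's definition."""
    allowed = DOMAINS[domain]
    return [name for name, entity_type in entities if entity_type in allowed]


entities = [
    ("Noma", "restaurant"),
    ("Radiohead", "artist"),
    ("Ramen", "dish"),
    ("Galway", "city"),
]
print(recommendable(entities, "food"))
```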

Cross-domain recommendation algorithm

Algorithm:
- Spreading activation
- graph-based
- uses semantic network
- able to provide results of specified target domain (or inventory)

Unconstrained SA == depth-first search (pictured)

Why does that already work?
- DBpedia is super dense!
- avg degree: ~20

[Figure: excerpt of the DBpedia graph connecting the user profile to recommendable items. Nodes: Richard Dawkins, Atheism Activists, Kurt Vonnegut, Douglas Adams, The Hitchhiker's Guide to the Galaxy (novel), The Restaurant at the End of the Universe, Cambridge, United Kingdom, Macmillan. Edge labels: dc:subject, influencedBy, author, birthplace, subsequentWork, subdivisionName, publisher, country. Annotations: User profile; Recommendable items; Start of spreading activation]

SA configuration

Constraints:
- fan-out / distance
- path
- activation type

Semantic configuration:
- type of activations (domain definition)
- link type weights
- node/link black-lists

Configuration:
- number of target activations
- fan-out penalty
- distance penalty
- initial activation spread
- activation threshold
- maximum degree

Iterative algorithm:
- number of phases: re-activation after stabilising?
- number of waves per phase: how often to spread?

Implementation:
- HDT RDF store (in-house)
- Neo4J, Giraph/Hadoop unsuccessful
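How these parameters might interact can be sketched with a small, single-phase spreading activation routine. It is a sketch under stated assumptions, not the implementation described in the slides: the update rule (energy multiplied by the distance penalty and divided by the fan-out), the parameter names, the default values and the toy graph are all chosen for this illustration, and edge directions in the toy graph are simplified.

```python
from collections import defaultdict


def spreading_activation(graph, seeds, targets, *, waves=3, threshold=0.01,
                         distance_penalty=0.5, link_weights=None,
                         max_degree=50, blacklist=frozenset()):
    """Spread activation from the seed nodes and rank activated target nodes.

    graph maps a node to a list of (link_type, neighbour) pairs.
    """
    link_weights = link_weights or {}
    activation = defaultdict(float)
    frontier = {seed: 1.0 for seed in seeds}  # initial activation spread
    for _ in range(waves):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            edges = [(t, n) for t, n in graph.get(node, []) if n not in blacklist]
            if not edges or len(edges) > max_degree:  # dead ends and hubs do not spread
                continue
            share = energy * distance_penalty / len(edges)  # fan-out penalty
            for link_type, neighbour in edges:
                next_frontier[neighbour] += share * link_weights.get(link_type, 1.0)
        # Only nodes at or above the activation threshold fire in the next wave.
        frontier = {n: e for n, e in next_frontier.items() if e >= threshold}
        for node, energy in frontier.items():
            activation[node] += energy
    ranked = [(n, a) for n, a in activation.items() if n in targets and n not in seeds]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Toy graph loosely based on the Douglas Adams example.
graph = {
    "Douglas_Adams": [("author", "Hitchhikers_Guide"), ("birthplace", "Cambridge"),
                      ("influencedBy", "Kurt_Vonnegut")],
    "Hitchhikers_Guide": [("subsequentWork", "Restaurant_at_the_End_of_the_Universe"),
                          ("publisher", "Macmillan")],
    "Cambridge": [("country", "United_Kingdom")],
}
books = {"Hitchhikers_Guide", "Restaurant_at_the_End_of_the_Universe"}  # target domain

ranking = spreading_activation(graph, ["Douglas_Adams"], books)
print([name for name, _ in ranking])
```

Raising `threshold` or lowering `distance_penalty` cuts the search off closer to the seed, which is the kind of constraint that keeps the traversal from degenerating into an exhaustive walk of a dense graph such as DBpedia.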

Evaluation plan

Theoretical framework: link prediction problem

Metrics:
- AUC & precision
- diversity, novelty, personalisation, heterogeneity

Data sources:
- User profiles: StackExchange network (cross-domain user profiles)
- Background knowledge: DBpedia

Baseline algorithms:
- Linked Data Semantic Distance (LDSD)
- Random Walk with Restart (RWR)
- Collaborative Filtering (CF)

User study, depends on time constraints
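For a link prediction setting, the two accuracy metrics named above can be computed as follows. This is a minimal sketch of the standard definitions (AUC as the probability that a held-out link scores higher than a non-existent one, with ties counted as 0.5, and precision over the top-k of a ranking); the function names and the toy scores are assumptions, not results from the evaluation.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for item in ranked[:k] if item in relevant) / k


def auc(scores, positives, negatives):
    """Probability that a held-out link outscores a non-existent one (ties count 0.5)."""
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if scores[pos] > scores[neg]:
                wins += 1.0
            elif scores[pos] == scores[neg]:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy scores for four candidate links; "a" and "b" are the held-out true links.
scores = {"a": 0.9, "b": 0.4, "c": 0.4, "d": 0.1}
print(auc(scores, positives=["a", "b"], negatives=["c", "d"]))  # 0.875
print(precision_at_k(["a", "c", "b", "d"], {"a", "b"}, k=2))    # 0.5
```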

Impact

Academia:
- Personalisation: hot topic in Semantic Web community
- Many (!) workshops
- Best paper award at I-Semantics 2012: “Linked Open Data to support Content-based Recommender Systems”, Di Noia et al.

Industry:
- current Cisco Ireland collaboration
- Hunch.com: bought by eBay in 2011
- StumbleUpon presentation Wed, RecSys 2012
- Personalisation has become a commodity
- Facebook approach requires multi-source, cross-domain recs.
- Decentralised SocNets like Diaspora*

Achievements and future plans

Finished:
- algorithm implementation
- architecture

Next step:
- Off-line evaluation using all StackExchange data
- User study, depends on time constraints

Publications:
- “Personalisation of Social Web Services in the Enterprise Using Spreading Activation for Multi-Source, Cross-Domain Recommendations”, AAAI Spring Symposium on Intelligent Web Services Meet Social Computing, 2012.
- “An architecture for privacy-enabled user profile portability on the Web of Data”, HetRec Workshop at RecSys 2010.
- “An empirically-grounded conceptual architecture for applications on the Web of Data”, IEEE Transactions on Systems, Man and Cybernetics, Part C - Applications and Reviews, 2011.

Summary

Goal of research:
- open framework for cross-domain & multi-source recommendations
- alternative to current closed ecosystems

1.) Architecture for privacy-enabled profile exchange:
- mechanism for authorisation of data exchange through user
- enables private and secure profile exchange

2.) Distributed and domain-agnostic user model:
- provides semantic graph as user model, background knowledge and domain definitions
- enables aggregating and merging of profiles

3.) Cross-domain recommendation algorithm:
- provides graph-based algorithm
- enables personalisation in a target domain using any interests

Characteristics of the Spreading Activation algorithm

- SA is very different from e.g. PageRank
- “SA is depth-first search, guided/interrupted by domain logic and algorithm conditions”
- Challenges when implementing SA:
  - requires semantic graph
  - size of data (DBpedia: 11 million entities, 40 million edges)
  - iterative algorithm
  - embedding of domain logic
  - stateful nodes
  - execution speed for Cisco ADVANSSE use case
