Characterising the Emergent Semantics in Twitter Lists - VideoLectures

Characterising the Emergent Semantics in Twitter Lists
Andrés García-Silva†, Jeon-Hyung Kang*, Kristina Lerman*, Oscar Corcho†
† {hgarcia, ocorcho}@fi.upm.es, Facultad de Informática, Universidad Politécnica de Madrid, Spain
* {jeonhyuk, lerman}@isi.edu, Information Sciences Institute, University of Southern California, USA

Introduction: Twitter Lists

Introduction: Curators and List Names

Introduction: Members and List Names

Introduction: Subscribers and List Names

Introduction

• Previous examples showed individual uses of lists
• Some list names were related to one another
• What if we group the lists?

Introduction

[Figure: lists where the Yahoo!Finance user is a member, grouped by frequency of membership]

[Figure: lists where the NASDAQ user is a member, grouped by number of subscriptions]

Introduction: Research questions

• Is it possible to identify related keywords from list names according to the use given by the different user roles?
   • Are two list names related if they have been used by a similar set of curators?
   • Are two list names related if a similar set of users have subscribed to the corresponding lists?
   • Are two list names related if their corresponding lists have a similar set of members?
• Which user roles generate more related keywords?
• What types of relations between keywords can we obtain?
   • Synonyms, is-a, siblings...?

[Diagram: list names (Investment, Stocks, Banks, PersonalBanking) linked to Curator 1, Curator 2, list members, and Subscriber 1]

Approach

Stages: (1) elicit related keywords from Twitter lists; (2) characterise the semantics of the relations.

Elicit related keywords from Twitter lists
• Input: Twitter Lists
• Schema representation of keywords: based on curators, based on subscribers, or based on members
• Model to identify similar keywords: Vector Space Model or Latent Dirichlet Allocation
• Output: pairs of related keywords per schema representation and model
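The vector space model named on this slide can be illustrated with a minimal sketch. It assumes that each keyword is represented as a vector of counts over the users in one role (curators here) and that two keywords are compared with cosine similarity. The slides do not state the weighting scheme, so raw counts are an assumption, and the curator names and data are made up for illustration. Latent Dirichlet Allocation is not covered by this sketch.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (dict-like)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy data: (curator, keyword taken from a list name).
curated = [
    ("curator1", "investment"), ("curator1", "stocks"),
    ("curator2", "investment"), ("curator2", "stocks"),
    ("curator2", "banks"), ("curator3", "comedy"),
]

# Schema representation based on curators: keyword -> counts per curator.
vectors = {}
for curator, keyword in curated:
    vectors.setdefault(keyword, Counter())[curator] += 1

# Keywords used by the same curators score high, disjoint ones score zero.
sim_inv_stocks = cosine(vectors["investment"], vectors["stocks"])
sim_inv_comedy = cosine(vectors["investment"], vectors["comedy"])
```

The same function applies unchanged to the subscriber-based and member-based representations; only the way the count vectors are built differs.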

Approach

Elicit related keywords from Twitter lists
• Output: pairs of related keywords per schema representation and model

Characterise the semantics of the relations
• Similarity based on WordNet
   • Measures: path length, Wu & Palmer (hierarchical information), Jiang & Conrath (distributional information)
   • Relations characterised: synonyms, is-a, siblings, indirect is-a
   • Specificity of relations
• SPARQL queries over general KBs published as Linked Data (DBpedia, OpenCyc, and UMBEL)
   • Synonyms (sameAs)
   • Binary relations (TypeOf, BT)
   • Object properties (Occupation)
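The two hierarchy-based WordNet measures can be made concrete with a small sketch. It uses a hypothetical toy is-a hierarchy rather than real WordNet data, counts path length in nodes (so identical concepts have length 1, a direct is-a has length 2, and siblings have length 3, matching the reading used later in the deck), and takes the usual Wu & Palmer definition, 2 * depth(LCS) / (depth(a) + depth(b)). Jiang & Conrath is left out because it needs corpus frequencies.

```python
# Hypothetical toy is-a hierarchy (child -> parent); not real WordNet data.
PARENT = {
    "entity": None,
    "institution": "entity",
    "bank": "institution",
    "school": "institution",
    "person": "entity",
}


def ancestors(node):
    """Return the chain from node up to the root, node first."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(ancestors(node))


def lcs(a, b):
    """Least common subsumer: the deepest ancestor shared by a and b."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    raise ValueError("no common subsumer")


def path_length(a, b):
    """Nodes on the shortest is-a path between a and b (1 if identical)."""
    common = depth(lcs(a, b))
    return (depth(a) - common) + (depth(b) - common) + 1


def wu_palmer(a, b):
    """Wu & Palmer similarity: 2 * depth(LCS) / (depth(a) + depth(b))."""
    return 2 * depth(lcs(a, b)) / (depth(a) + depth(b))
```

In this toy hierarchy, `bank` and `school` are siblings under `institution`, while `bank` and `person` only share the root, so the second pair has a longer path and a lower Wu & Palmer score.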


Experiment: Setup

• Data set
   • Total: 297,521 lists, 2,171,140 members, 215,599 curators, and 616,662 subscribers
   • We extracted 5,932 unique keywords from list names; 55% of them were found in WordNet.
   • We use approximate matching of the list names with dictionary entries
   • The dictionary was created from Wikipedia article titles
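The approximate matching of list names against dictionary entries might look like the following sketch. The slides do not say which matching technique was used, so the use of Python's `difflib`, the normalisation step, the cutoff value, and the four dictionary entries are all assumptions made for illustration.

```python
import difflib

# Hypothetical dictionary; the slides say the real one was created from
# Wikipedia article titles.
DICTIONARY = ["Biotechnology", "Comedy", "Investment", "Personal banking"]


def normalise(text):
    """Lowercase and turn common list-name separators into spaces."""
    return text.lower().replace("-", " ").replace("_", " ").strip()


def match_keyword(list_name, cutoff=0.8):
    """Return the closest dictionary entry, or None if nothing is close."""
    by_normal_form = {normalise(entry): entry for entry in DICTIONARY}
    hits = difflib.get_close_matches(
        normalise(list_name), list(by_normal_form), n=1, cutoff=cutoff
    )
    return by_normal_form[hits[0]] if hits else None
```

A list named `personal-banking` would map to the entry "Personal banking", and a misspelt `investmnt` would still reach "Investment", while an unrelated name returns `None`.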


Experiment: Execution

Elicit related keywords from Twitter lists
• Data set
• Vector Space Model based on curators, subscribers, and members
• Each keyword with its 5 most related keywords

Characterise the semantics of the relations
• Similarity based on WordNet (WordNet Similarity)
   • Path length
   • Wu & Palmer (hierarchical information)
   • Jiang & Conrath (distributional information)
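The step that pairs each keyword with its 5 most related keywords can be sketched as a simple top-k selection over similarity scores. The scores below are invented, and the helper name `top_related` is hypothetical.

```python
import heapq


def top_related(similarities, keyword, k=5):
    """Return the k keywords most similar to `keyword`, best first."""
    scores = similarities[keyword]
    return heapq.nlargest(k, scores, key=scores.get)


# Hypothetical similarity scores, e.g. cosine values from a VSM.
similarities = {
    "stocks": {
        "investment": 0.91, "finance": 0.85, "banks": 0.60,
        "trading": 0.77, "economy": 0.58, "comedy": 0.02, "money": 0.70,
    }
}

# Keyword pairs that would be passed on to the WordNet-based analysis.
pairs = [("stocks", other) for other in top_related(similarities, "stocks")]
```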


Experiment: Data Analysis

[Figure: Pearson's correlation coefficients (values from -1 to 1) for the average J&C distance and W&P similarity]
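For reference, Pearson's coefficient can be computed as in the sketch below. The three score series are invented to show the expected signs: a model's similarity should correlate positively with a similarity measure such as W&P and negatively with a distance measure such as J&C. The exact series that were correlated in the experiment are not spelled out on the slide.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of the same length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for four keyword pairs.
model_similarity = [0.9, 0.7, 0.4, 0.1]  # e.g. VSM cosine
wup_similarity = [0.8, 0.6, 0.5, 0.2]    # a similarity: expect r > 0
jcn_distance = [2.0, 5.0, 9.0, 14.0]     # a distance: expect r < 0

r_wup = pearson(model_similarity, wup_similarity)
r_jcn = pearson(model_similarity, jcn_distance)
```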


Experiment: Data Analysis
Path length in WordNet

% of relations found by each schema representation and model

Path length               Members           Subscribers       Curators
                          VSM      LDA      VSM      LDA      VSM      LDA
1 (synonyms)              8.58%    10.87%   3.97%    3.24%    1.24%    0.50%
2 (is-a)                  3.42%    3.08%    1.93%    0.47%    0.70%    0.00%
3 (siblings, ind. is-a)   2.37%    3.77%    2.96%    2.06%    2.38%    4.03%
>3                        67.61%   65.5%    67.2%    67.5%    77.8%    75.8%

On average, 97.65% of the relations with a path length greater than 3 involve a common subsumer.

Experiment: Data Analysis
Depth (LCS) and path length as indicators of specificity

[Figure: relations in WordNet, by depth of the least common subsumer]

[Figure: relations with depth(LCS) >= 5, by length of the path setting up the relation]

Experiment: Findings Summary

• Similarity models based on members
   • produce the results that are most correlated with the results of similarity measures based on WordNet
   • find more synonyms and direct is-a relations than the other models (path length)
• The majority of relations found by any model have a path length >= 3 and involve a common subsumer.
• Depth of LCS
   • VSM based on subscribers produces the highest number of specific relations (depth of LCS >= 5 or 6).
• Similarity models based on curators produce a lower number of relations.

Experiment: Execution

Elicit related keywords from Twitter lists
• Data set
• Vector Space Model based on curators, subscribers, and members
• Each keyword with its 5 most related keywords

Characterise the semantics of the relations
• Ontological relations between keywords
• SPARQL queries over general KBs published as Linked Data: DBpedia, OpenCyc, and UMBEL
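A direct-relation lookup between two anchored keywords could be expressed as a SPARQL query built like this. The slides do not show the actual queries, so the query shape and the helper name `direct_relation_query` are assumptions; the sketch only builds the query string and leaves out sending it to an endpoint.

```python
DBPEDIA = "http://dbpedia.org/resource/"


def direct_relation_query(subject, obj):
    """Build a SPARQL query for properties linking two DBpedia resources.

    Both directions are covered with UNION, since the keyword order says
    nothing about which resource is the subject of the triple.
    """
    s, o = f"<{DBPEDIA}{subject}>", f"<{DBPEDIA}{obj}>"
    return (
        "SELECT DISTINCT ?p WHERE {\n"
        f"  {{ {s} ?p {o} }}\n"
        "  UNION\n"
        f"  {{ {o} ?p {s} }}\n"
        "}"
    )


# Example pair of resources, assumed to be the anchors of two keywords.
query = direct_relation_query("Houston", "Texas")
```

Relations of length 3 would need additional triple patterns with intermediate variables, which this sketch does not attempt.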

Experiment

• We anchor 63.77% of the keywords extracted from Twitter Lists to DBpedia resources

Experiment
Vector-space model based on members (direct relations)

Relation type    %      Example of keywords
Broader Term     26%    life-science, biotech
subClassOf       26%    writers, authors
developer        11%    google, google_apps
genre            11%    funland, comedy
largest city     6%     houston, texas
Others           20%    -

Vector-space model based on subscribers (relations of length 3)
Linked data pattern (54.73%): x -> object
