Dynamic Prediction of Communication Flow using ... - Semantic Scholar

8 downloads 1089 Views 733KB Size Report
Nov 5, 2008 - kinds of information. • Weak tie and flow of new information. • Strong tie and flow of existing information. Topic: iPhone 3G [new information].
Dynamic Prediction of Communication Flow using Social Context

Munmun De Choudhury Hari Sundaram Ajita John Dorée Duncan Seligmann

@ACM HyperText and HyperMedia 2008, Pittsburgh, PA

Arts, Media & Engineering Arizona State University, Tempe Collaborative Applications Research Avaya Labs, New Jersey

November 5, 2008

1

Introduction A temporal representation framework using communication and social context to efficiently predict communication flow in social networks.

Bob Alice

 Why is the problem important? • Determine information propagation and the roles of people in the process. • Targeted advertising, spread of fashions and fads, innovations, consumer interests etc. • Determine community evolution. @ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

2

Our Approach 35

Bas eline M etho d OurAp p ro ach

30 25 20

Baseline Improvement in predicted error

Error in Prediction of Intent Error in Prediction (%)

 Estimate two properties of communication flow: the intent to communicate and associated delay [De Choudhury ’07].

15 10

Our Approach

• Design features to characterize communication and social Time (Wee ks) context. • Select a subset of optimal Error in Prediction of Intent to communicate contextual features using a voting strategy from five different feature selection  Experimental results on techniques. MySpace dataset with effective • Use features in a Support Vector prediction (error ~12-15%). Regression framework to predict the intent and the delay. 5 0

1

@ACM HyperText and HyperMedia 2008, Pittsburgh, PA

3

5

7

9

11 13 15 17 19 21 23 25 27 29 31 33 35 37 39

November 5, 2008

3

Related Work  Work on information diffusion [Gruhl, Tomkins ’04].  Early adoption based flow model for recommendation systems [Song et al. ’06].  Work on information roles of people [Nakajima et al. ‘05, Song et al. ‘06].  Limitations: • information flow estimated from indirect evidence. • context not been considered. • Roles of people with respect to long term habitual behavior not analyzed.

Agitators and Summarizers [Nakajima et al. ‘05] @ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

4

Outline Introduction / Related work

Problem Statement

• Two sub-problems: Intent to communicate Communication Delay • Role of context

Communication Context Social Context Prediction Experimental Results Conclusions @ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

5

What is Intent to Communicate?  The probability that a person will engage herself in some communication (given a particular topic and at a certain point of time) with another person. • It is contingent upon several factors or features defined by the communication context [De Choudhury et al. 07] and social context. Movie: 40% Sports: 40%

Bob

Ann Alice

@ACM HyperText and HyperMedia 2008, Pittsburgh, PA

Movie: 80% Dinner: 20%

November 5, 2008

6

What is Delay in Propagation?  The amount of time passed between the reception of a message (on a certain topic) and the corresponding response by a person. Movie: 4 hours Sports: 25 mins

Alice

@ACM HyperText and HyperMedia 2008, Pittsburgh, PA

Movie: 2 days Dinner: 15 hours

Bob

Ann

November 5, 2008

7

What is the role of context? Conventional view of communication flow in the community Gladwell’s approach: information flows through specific “hot” points

Watt’s approach: “hot” points are just coincidental; information flow depends on the vulnerability of people to communication Bob

Alice

Communication flow depends upon: how Alice and Bob communicate (communication context), and what intrinsic roles Alice and Bob play in their communication (social context) @ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

8

Outline Introduction / Related work Problem Statement

Communication Context Social Context Prediction Experimental Results Conclusions

• What is communication context? • Overview of Neighborhood, Topic and Recipient context

Neighborhood

Recipient

Topic @ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

9

Communication Context  Communication context [De Choudhury, Sundaram et al. ‘07] is the set of attributes that affect communication between two individuals.  Contextual attributes are dynamic [Dourish ’02]. • relationship between messages • past communication behavior of a person • response patterns from others

@ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

10

Features of Communication Context  Neighborhood context • It refers to the effect of the user’s social network on her communication. • Susceptibility • Backscatter

 Topic context • It refers to the effect of the semantics of a user’s past communication on a topic on her future communication. • • • •

Message Coherence Temporal Coherence Topic Relevance Topic Quantity

 Recipient context @ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

11

Outline Introduction / Related work Problem Statement Communication Context

Social Context

• What is social context? • Features of social context

Prediction Experimental Results Conclusions @ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

12

Social Context  Social context: • set of contextual attributes that refer to who is communicating with whom and what is the strength of relationship shared between them.  Two features: information roles and strength of ties. Star network with only out-going communication links: high tendency to send messages.

Alice

Bob

Alice

Bob

Dan

Emilie

Charlie

Same clique – strong tie – higher chance of communication. Different cliques – weak tie – lower chance of communication. Several in-coming links: receptor of messages.

@ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

13

Information Roles  Information role is a contextual attribute acquired over time which impacts a person’s communication behavior.  Three different categories of roles of people: • generators, people who generate information by themselves or from other sources, • mediators, people who act as transmitters of information between people, and • receptors, people who mostly receive messages.

(a)

(b)

(c)

Figure : (a) Generators (red) (b) Mediators (green) (c) Receptors (gray).

@ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

14

Strength of ties  The nature of relationship between two people affects communication [Granovetter ’73].  Strength of a tie between two people: • overlap of their friends’ circles.

 The pattern of relationships between actors reveals the likelihood that individuals will be exposed to particular kinds of information. • Weak tie and flow of new information. • Strong tie and flow of existing information.

@ACM HyperText and HyperMedia 2008, Pittsburgh, PA

Weak tie

Topic: iPhone 3G [new information]

Strong tie

Topic: Joe’s birthday party last week [existing information]

November 5, 2008

15

Outline Introduction / Related work Problem Statement Communication Context Social Context

Prediction

• SVR approach • Feature Selection

Experimental Results Conclusions

Support Vector Regression

Contextual features

@ACM HyperText and HyperMedia 2008, Pittsburgh, PA

Features in transformed kernel space

November 5, 2008

16

Prediction of Intent and Delay  Support Vector Regression framework • Use features of communication and social context and actual frequency of communication over N weeks for training a Support Vector Machine regressor. • SVR further used to predict intent and delay at the (N+1)th week.

 Feature Selection • Need to eliminate redundant features dynamically. • Several static and dynamic feature selection techniques: • • • • •

Correlation coefficient Mutual information Decision trees Principal Component Analysis (PCA) kNN based estimation

• Use of a ‘voting strategy’ to determine the best k features at the beginning of each time slice.

@ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

17

Outline Introduction / Related work Problem Statement Communication Context Social Context Prediction

Experimental Results

MySpace.com

• Prediction of intent and delay • Qualitative Evaluation

Conclusions @ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

18

Prediction Results Error in Pre diction of De lay

Error in Prediction of Intent Bas eline M et ho d OurAp p ro ach

30

25

Error in Prediction (%)

Error in Prediction (%)

35

Bas eline M etho d Our Ap p ro ach

20

25

15

20 15

10

10 5

5

0

0 1

3

5

7

9

11 13 15 17 19 21 23 25 27 29 31 33 35 37 39

Time (We eks)

1

3

5

7

9

11 13 15 17 19 21 23 25 27 29 31 33 35 37 39

Time (We e ks)

 Data description: • 20,000 users from MySpace, 1,425,010 messages from September 2005 to April 2007.

 Baseline Method: • Consider all features of communication context without feature selection [De Choudhury et al. 2007].

 The training set of 70 weeks.  Experiments over a period of 40 weeks, averaged over eight different contacts of a person Charlie and four topics. • Prediction error of 12-15 % compared to 25-30 % using the baseline method. @ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

19

Qualitative Evaluation of Predicted Intent Radar Plot of Predicted Intent 1

Direction of increasing time

1 10

2

0.8 0.6 0.4

9

3 0.2 0

8

4

7

Jason Susan Anna Jennifer Mike David Bill T om

5

6

Figure : Dynamics of predicted intent across time and contacts for a user. @ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

20

Qualitative Evaluation of Predicted Delay CHARLIE→ → JASON

CHARLIE→ → DAVID

CHARLIE→ → SUSAN

Figure : Dynamics in delay between Charlie and three of her contacts. The visualizations show dynamics for ten consecutive weeks (Y axis) and for four topics (X axis). The length of each horizontal line is proportional to the measure of delay.

@ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

21

Outline

Conclusions @ACM HyperText and HyperMedia 2008, Pittsburgh, PA

Error in Prediction of Intent 35

Error in Prediction (%)

Introduction / Related work Problem Statement Communication Context SVM Based prediction MySpace dataset Experimental Results

Baseline M etho d OurAp p ro ach

30 25 20 15 10 5 0 1

3

5

7

9

11 13 15 17 19 21 23 25 27 29 31 33 35 37 39

Time (We eks)

• Summary and Contributions • Future Work

November 5, 2008

22

Summary and Conclusions Error in Prediction of Intent 35

Error in Prediction (%)

 Predict communication flow in large scale social networks by incorporating social context.  Excellent results on a real world dataset MySpace.com

B as eline M etho d OurAp p ro ach

30 25 20 15 10 5 0 1

3

5

7

9

11 13 15 17 19 21 23 25 27 29 31 33 35 37 39

Time (We e ks)

• error ~12-15 % compared to 25-30 % using only communication context.

Radar Plot of Predicted Intent 1 1

 Conclusions • Social context is key to predicting communication flow. • Intent to communicate appears to be a property of context; delay seems to be a property of habitual characteristics of people.

@ACM HyperText and HyperMedia 2008, Pittsburgh, PA

10

2

0.8 0.6 0.4

9

3 0.2 0

8

4

7

Jason Susan Anna Jennifer Mike David Bill T om

5

6

November 5, 2008

23

Future Work

Communication Flow

 How does the ‘age’ and nature of information impact communication flow?  Model of collective user behavior.  Inherent property of a communication media that affects communication flow.

‘Age’ of information

@ACM HyperText and HyperMedia 2008, Pittsburgh, PA

Emergent collective user behavior

November 5, 2008

24

Thanks!

@ACM HyperText and HyperMedia 2008, Pittsburgh, PA

November 5, 2008

25

Suggest Documents