Dynamic Prediction of Communication Flow using Social Context
Munmun De Choudhury Hari Sundaram Ajita John Dorée Duncan Seligmann
@ACM HyperText and HyperMedia 2008, Pittsburgh, PA
Arts, Media & Engineering, Arizona State University, Tempe
Collaborative Applications Research, Avaya Labs, New Jersey
November 5, 2008
Introduction
A temporal representation framework using communication and social context to efficiently predict communication flow in social networks.
Why is the problem important?
• Determine information propagation and the roles of people in the process.
• Targeted advertising, spread of fashions and fads, innovations, consumer interests, etc.
• Determine community evolution.
Our Approach
Estimate two properties of communication flow: the intent to communicate and the associated delay [De Choudhury ’07].
• Design features to characterize communication and social context.
• Select a subset of optimal contextual features using a voting strategy from five different feature selection techniques.
• Use the features in a Support Vector Regression framework to predict the intent and the delay.
Experimental results on the MySpace dataset with effective prediction (error ~12-15%).
[Figure: Error in Prediction of Intent to communicate. Error in Prediction (%) against Time (Weeks), weeks 1 to 39, for the Baseline Method and Our Approach; the gap between the two curves is annotated as the improvement in predicted error.]
Related Work
Work on information diffusion [Gruhl, Tomkins ’04].
Early adoption based flow model for recommendation systems [Song et al. ’06].
Work on information roles of people [Nakajima et al. ’05, Song et al. ’06].
Limitations:
• Information flow is estimated from indirect evidence.
• Context has not been considered.
• Roles of people with respect to long-term habitual behavior have not been analyzed.
[Figure: Agitators and Summarizers [Nakajima et al. ’05].]
Outline
Introduction / Related work
Problem Statement
• Two sub-problems: intent to communicate, communication delay
• Role of context
Communication Context
Social Context
Prediction
Experimental Results
Conclusions
What is Intent to Communicate?
The probability that a person will engage in communication with another person, given a particular topic and a certain point in time.
• It is contingent upon several factors, or features, defined by the communication context [De Choudhury et al. ’07] and the social context.
[Figure: example intent values by topic among Alice, Bob and Ann. Movie: 40%, Sports: 40%; Movie: 80%, Dinner: 20%.]
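The definition above can be illustrated with a simple empirical estimate. This is only a sketch under stated assumptions: the slides predict intent from contextual features, whereas here intent is approximated as the fraction of observed weeks in which the sender messaged the recipient on a topic. The record layout `(sender, recipient, topic, week)`, the function name `empirical_intent` and the sample log are all hypothetical.

```python
def empirical_intent(messages, sender, recipient, num_weeks):
    """Estimate intent per topic as the share of weeks with at least one message.

    `messages` is a list of (sender, recipient, topic, week) tuples; this
    layout is an assumption made for the example, not the authors' data format.
    """
    active_weeks = {}
    for s, r, topic, week in messages:
        if s == sender and r == recipient:
            active_weeks.setdefault(topic, set()).add(week)
    return {topic: len(weeks) / num_weeks for topic, weeks in active_weeks.items()}


# Made-up log covering five weeks (weeks numbered 0 to 4).
log = [
    ("Alice", "Bob", "Movie", 0),
    ("Alice", "Bob", "Movie", 1),
    ("Alice", "Bob", "Movie", 2),
    ("Alice", "Bob", "Dinner", 2),
    ("Alice", "Bob", "Movie", 3),
    ("Alice", "Ann", "Sports", 4),
]
intent = empirical_intent(log, "Alice", "Bob", num_weeks=5)
print(intent)
```

Because each topic is scored independently, the values need not sum to 100%, which is consistent with the per-topic percentages shown on the slide.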
What is Delay in Propagation?
The amount of time that passes between a person's reception of a message (on a certain topic) and that person's corresponding response.
[Figure: example delays by topic among Alice, Bob and Ann. Movie: 4 hours, Sports: 25 mins; Movie: 2 days, Dinner: 15 hours.]
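As a rough sketch of how such delays could be measured from a message log, the code below pairs each message a person receives with that person's next reply to the same contact on the same topic. The pairing rule, the event layout `(timestamp, sender, recipient, topic)` and the name `response_delays` are assumptions for illustration; the slides do not specify how responses are matched to messages.

```python
from datetime import datetime, timedelta


def response_delays(events, person):
    """Return (contact, topic, delay) for each message `person` replied to.

    `events` holds (timestamp, sender, recipient, topic) tuples. A reply is
    assumed to be the next message from `person` to the same contact on the
    same topic; this matching rule is a simplification.
    """
    pending = {}  # (contact, topic) -> time of the earliest unanswered message
    delays = []
    for ts, sender, recipient, topic in sorted(events):
        if recipient == person:
            pending.setdefault((sender, topic), ts)
        elif sender == person and (recipient, topic) in pending:
            received_at = pending.pop((recipient, topic))
            delays.append((recipient, topic, ts - received_at))
    return delays


# Made-up events for one day.
day = datetime(2008, 11, 1)
events = [
    (day.replace(hour=10), "Bob", "Alice", "Movie"),
    (day.replace(hour=14), "Alice", "Bob", "Movie"),
    (day.replace(hour=15), "Bob", "Alice", "Sports"),
    (day.replace(hour=15, minute=25), "Alice", "Bob", "Sports"),
]
delays = response_delays(events, "Alice")
for contact, topic, delay in delays:
    print(contact, topic, delay)
```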
What is the role of context?
Conventional view of communication flow in the community:
• Gladwell’s approach: information flows through specific “hot” points.
• Watts’s approach: “hot” points are just coincidental; information flow depends on the vulnerability of people to communication.
Communication flow depends upon:
• how Alice and Bob communicate (communication context), and
• what intrinsic roles Alice and Bob play in their communication (social context).
Outline
Introduction / Related work
Problem Statement
Communication Context
• What is communication context?
• Overview of Neighborhood, Topic and Recipient context
Social Context
Prediction
Experimental Results
Conclusions
Communication Context
Communication context [De Choudhury, Sundaram et al. ’07] is the set of attributes that affect communication between two individuals. Contextual attributes are dynamic [Dourish ’02].
• relationship between messages
• past communication behavior of a person
• response patterns from others
Features of Communication Context
Neighborhood context
• It refers to the effect of the user’s social network on her communication.
• Susceptibility
• Backscatter
Topic context
• It refers to the effect of the semantics of a user’s past communication on a topic on her future communication.
• Message Coherence
• Temporal Coherence
• Topic Relevance
• Topic Quantity
Recipient context
Outline
Introduction / Related work
Problem Statement
Communication Context
Social Context
• What is social context?
• Features of social context
Prediction
Experimental Results
Conclusions
Social Context
Social context:
• set of contextual attributes that refer to who is communicating with whom and the strength of the relationship between them.
Two features: information roles and strength of ties.
[Figure: example networks of Alice, Bob, Charlie, Dan and Emilie, annotated as follows.]
• Star network with only out-going communication links: high tendency to send messages.
• Several in-coming links: receptor of messages.
• Same clique – strong tie – higher chance of communication.
• Different cliques – weak tie – lower chance of communication.
Information Roles
An information role is a contextual attribute acquired over time which impacts a person’s communication behavior. Three different categories of roles of people:
• generators, people who generate information by themselves or from other sources,
• mediators, people who act as transmitters of information between people, and
• receptors, people who mostly receive messages.
Figure: (a) Generators (red) (b) Mediators (green) (c) Receptors (gray).
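One simple way to make the three roles concrete is to compare how many messages each person sends with how many they receive. The sketch below is an assumption-laden illustration, not the authors' method: the slides do not say how roles are assigned, and the function name `information_roles` and the 2/3 and 1/3 thresholds are hypothetical choices.

```python
from collections import Counter


def information_roles(edges, high=2 / 3, low=1 / 3):
    """Label each person by the share of their messages that are outgoing.

    `edges` is a list of (sender, recipient) pairs, one per message. The
    thresholds `high` and `low` are illustrative assumptions.
    """
    sent, received = Counter(), Counter()
    for sender, recipient in edges:
        sent[sender] += 1
        received[recipient] += 1

    roles = {}
    for person in set(sent) | set(received):
        out_share = sent[person] / (sent[person] + received[person])
        if out_share >= high:
            roles[person] = "generator"   # mostly sends
        elif out_share <= low:
            roles[person] = "receptor"    # mostly receives
        else:
            roles[person] = "mediator"    # passes messages along
    return roles


# Made-up message graph: Alice broadcasts, Emilie only receives.
edges = [
    ("Alice", "Bob"), ("Alice", "Charlie"), ("Alice", "Dan"),
    ("Bob", "Emilie"), ("Charlie", "Emilie"), ("Dan", "Emilie"),
]
roles = information_roles(edges)
print(roles)
```

Since the slides describe a role as acquired over time, a fuller treatment would presumably compute these counts over a long history rather than a single snapshot.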
Strength of ties
The nature of the relationship between two people affects communication [Granovetter ’73].
Strength of a tie between two people:
• overlap of their friends’ circles.
The pattern of relationships between actors reveals the likelihood that individuals will be exposed to particular kinds of information.
• Weak tie and flow of new information. Example topic: iPhone 3G [new information].
• Strong tie and flow of existing information. Example topic: Joe’s birthday party last week [existing information].
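The overlap of two friends' circles can be quantified in several ways. The sketch below uses the Jaccard index (shared friends divided by all friends of either person), which is an assumption chosen for illustration; the slides only say "overlap". The function name `tie_strength` and the sample friend lists, including the names Frank and Grace, are hypothetical.

```python
def tie_strength(friends, a, b):
    """Jaccard overlap of the friend sets of `a` and `b`, excluding each other.

    `friends` maps a person to the set of their friends. Returns a value in
    [0, 1]; 0.0 when neither person has any other friend.
    """
    circle_a = friends.get(a, set()) - {b}
    circle_b = friends.get(b, set()) - {a}
    union = circle_a | circle_b
    if not union:
        return 0.0
    return len(circle_a & circle_b) / len(union)


# Made-up friend lists.
friends = {
    "Alice": {"Bob", "Charlie", "Dan", "Emilie"},
    "Bob": {"Alice", "Charlie", "Dan"},
    "Emilie": {"Alice", "Frank", "Grace"},
}
strong = tie_strength(friends, "Alice", "Bob")     # same clique
weak = tie_strength(friends, "Alice", "Emilie")    # different cliques
print(strong, weak)
```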
Outline
Introduction / Related work
Problem Statement
Communication Context
Social Context
Prediction
• SVR approach
• Feature Selection
Experimental Results
Conclusions
[Figure: Support Vector Regression. Contextual features mapped to features in transformed kernel space.]
Prediction of Intent and Delay
Support Vector Regression framework
• Use features of communication and social context, together with the actual frequency of communication over N weeks, to train a Support Vector Machine regressor.
• The trained SVR is then used to predict intent and delay at the (N+1)th week.
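For reference, the standard textbook ε-insensitive formulation of Support Vector Regression is given below. The slides do not state the kernel or the hyperparameters used, so the notation is an assumption: x_i is the contextual feature vector of the i-th training sample drawn from the N training weeks, y_i is the observed target (intent or delay), φ is the kernel feature map, and C and ε are hyperparameters.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*}\quad
  & \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^* \right) \\
\text{s.t.}\quad
  & y_i - \langle w, \phi(x_i) \rangle - b \le \varepsilon + \xi_i, \\
  & \langle w, \phi(x_i) \rangle + b - y_i \le \varepsilon + \xi_i^*, \\
  & \xi_i,\ \xi_i^* \ge 0, \qquad i = 1, \dots, n .
\end{aligned}
\qquad
\hat{y}_{N+1} = \langle w, \phi(x_{N+1}) \rangle + b
```

Errors smaller than ε are not penalized, and the prediction for week N+1 is obtained by evaluating the fitted function on that week's feature vector.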
Feature Selection
• Need to eliminate redundant features dynamically.
• Several static and dynamic feature selection techniques:
  • Correlation coefficient
  • Mutual information
  • Decision trees
  • Principal Component Analysis (PCA)
  • kNN based estimation
• Use of a ‘voting strategy’ to determine the best k features at the beginning of each time slice.
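The voting strategy could work roughly as sketched below. The slides do not spell out the voting rule, so this assumes simple plurality voting: each technique is assumed to have already produced a ranked list of features, each votes for its own top k, and the k features with the most votes are kept (ties broken alphabetically). The function name `vote_top_k` and the rankings are made up; only the feature names come from the slides.

```python
from collections import Counter


def vote_top_k(rankings, k):
    """Pick the k features that appear most often in the selectors' top-k lists.

    `rankings` maps a selector name to its features ordered best-first.
    Plurality voting with alphabetical tie-breaking is an assumption.
    """
    votes = Counter()
    for ranked_features in rankings.values():
        votes.update(ranked_features[:k])
    ordered = sorted(votes, key=lambda feature: (-votes[feature], feature))
    return ordered[:k]


# Hypothetical rankings for one time slice.
rankings = {
    "correlation": ["susceptibility", "topic_relevance", "backscatter", "message_coherence"],
    "mutual_information": ["topic_relevance", "susceptibility", "temporal_coherence", "backscatter"],
    "decision_tree": ["susceptibility", "backscatter", "topic_relevance", "topic_quantity"],
    "pca": ["topic_relevance", "message_coherence", "susceptibility", "backscatter"],
    "knn": ["backscatter", "susceptibility", "topic_quantity", "topic_relevance"],
}
selected = vote_top_k(rankings, k=2)
print(selected)
```

Re-running the vote at the beginning of each time slice lets the selected subset change as the contextual features evolve.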
Outline
Introduction / Related work
Problem Statement
Communication Context
Social Context
Prediction
Experimental Results
• Prediction of intent and delay
• Qualitative Evaluation
Conclusions
Prediction Results
[Figure: two plots, Error in Prediction of Intent and Error in Prediction of Delay. Each shows Error in Prediction (%) against Time (Weeks), weeks 1 to 39, for the Baseline Method and Our Approach.]
Data description:
• 20,000 users from MySpace, 1,425,010 messages from September 2005 to April 2007.
Baseline Method:
• Consider all features of communication context without feature selection [De Choudhury et al. 2007].
Training set of 70 weeks. Experiments run over a period of 40 weeks, averaged over eight different contacts of one user (Charlie) and four topics.
• Prediction error of 12-15% compared to 25-30% using the baseline method.
Qualitative Evaluation of Predicted Intent
Figure: Dynamics of predicted intent across time and contacts for a user. Radar plot of predicted intent (radial scale 0 to 1, time steps 1 to 10 in the direction of increasing time) for the contacts Jason, Susan, Anna, Jennifer, Mike, David, Bill and Tom.
Qualitative Evaluation of Predicted Delay
Figure: Dynamics in delay between Charlie and three of her contacts (panels: Charlie → Jason, Charlie → David, Charlie → Susan). The visualizations show dynamics for ten consecutive weeks (Y axis) and for four topics (X axis). The length of each horizontal line is proportional to the measure of delay.
Outline
Introduction / Related work
Problem Statement
Communication Context
SVM Based prediction
MySpace dataset
Experimental Results
Conclusions
• Summary and Contributions
• Future Work
Summary and Conclusions
Predict communication flow in large scale social networks by incorporating social context.
Excellent results on a real world dataset, MySpace.com:
• error ~12-15% compared to 25-30% using only communication context.
Conclusions
• Social context is key to predicting communication flow.
• Intent to communicate appears to be a property of context; delay seems to be a property of habitual characteristics of people.
[Figures repeated from the results slides: Error in Prediction of Intent; Radar Plot of Predicted Intent.]
Future Work
• How do the ‘age’ and nature of information impact communication flow?
• Model of collective user behavior.
• Inherent property of a communication medium that affects communication flow.
[Figure: Communication Flow, linked to the ‘age’ of information and emergent collective user behavior.]
Thanks!