Predicting How You Respond to Phone Calls: Towards Discovering Temporal Behavioral Rules
Iqbal H. Sarker School of Software and Electrical Engineering Swinburne University of Technology Melbourne, Australia
[email protected]
Muhammad Ashad Kabir School of Computing and Mathematics Charles Sturt University NSW, Australia
[email protected]
Alan Colman School of Software and Electrical Engineering Swinburne University of Technology Melbourne, Australia
[email protected]
Jun Han School of Software and Electrical Engineering Swinburne University of Technology Melbourne, Australia
[email protected]
ABSTRACT
Discovering temporal rules that capture an individual’s phone call response behavior is essential to building an intelligent, individualized call interruption management system. The key challenge in discovering such temporal rules is identifying, within a phone call log, the time boundaries that delineate periods when an individual user rejects or accepts phone calls. Moreover, potential data sparsity in phone call logs poses an additional challenge in discovering applicable rules. In this paper, we address the above issues and present a hybrid approach to identify the effective time boundaries for discovering temporal behavioral rules of individual mobile phone users utilizing calendar and mobile phone data. Our preliminary experiments on real datasets show that our proposed hybrid approach dynamically identifies better time boundaries based on like behavioral patterns and outperforms the existing calendar-based approach (CBA) and log-based approach (LBA) in discovering the temporal behavior rules of individual mobile phone users.
Author Keywords
Phone call logs; Data sparsity; Calendar; Call response behavior; Mobile data mining; Temporal rules; Personalization; Call interruptions.
ACM Classification Keywords
H.3.4 [Systems and Software]: Current awareness systems, User profiles and alert services.
OzCHI '16, November 29-December 02, 2016, Launceston, TAS, Australia. © 2016 ACM. ISBN 978-1-4503-4618-4/16/11. DOI: http://dx.doi.org/10.1145/3010915.3010979
INTRODUCTION
The explosion in the number of cell phones has made them the most ubiquitous communication devices. The number of mobile cellular subscriptions is almost equal to
the number of people on the planet (Pejovic, 2014). Mobile smartphones are also highly personal devices that users rely on in their daily activities. While mobile phones are considered to be ‘always on, always connected’, mobile users are not always willing or able to respond to incoming phone calls. People are often inconveniently interrupted by incoming phone calls, which disturb not only the users but also people nearby. Such interruptions may create embarrassing situations in a formal environment (e.g., a meeting) and may also affect other activities, such as a doctor examining patients or a person driving a vehicle. Sometimes these interruptions may reduce worker performance and increase errors and stress in a working environment (Pejovic, 2014). Several factors, such as time-of-day/day-of-week (temporal context), the social relationship between caller and callee, the user’s situation/activity, and location, can be utilized to discover individual phone call response behavior rules for managing such interruptions. However, Halvey et al. (Halvey, 2006) have shown through the analysis of a large sample of user data that temporal context is the most important factor that impacts user behavior in a mobile-Internet portal.
A number of authors have studied mobile phone interruption management using calendar information. For instance, Khalil et al. (Khalil, 2005) use calendar information to infer the user’s activity and to automatically configure cell phones accordingly to manage interruptions. Seo et al. (Seo, 2011) propose a rule-based context-aware configuration manager, PYP (Personalize Your Phone), for smartphones to block a phone call without bothering the user. Kabir et al. (Kabir, 2014) propose a novel rule-based technique for socially aware phone call applications on Android phones to minimize call interruptions. An intelligent context-aware interruption management system has been designed by Zulkernain et al. (Zulkernain, 2010).
These approaches use rules based on the user’s scheduled events in the calendar. However, the fixed scheduled times in the calendar may not always represent the actual schedule of events in the real world: e.g., events appear in the calendar but do not occur; the user does not record all their appointments in the calendar; or events do not follow the exact timing allocated in the calendar (Lovett, 2010).
In contrast, call activity records in mobile device logs are a rich resource for modeling individual user behavior (Sarker, 2016a). Mobile phones automatically record the users’ phone call activities in logs. This ability to log call activities offers the potential to understand the actual call response behavior of an individual. In particular, the calling information stored in call logs (e.g., call date, specific call time, call type, call duration) provides raw data about when the user accepts, rejects or misses incoming calls. Recently, Sarker et al. (Sarker, 2016b) presented an approach to identify dynamic time segments for mining individualized rules using such calling information from the phone call log. However, this log-based approach (LBA) cannot generate rules when the call log has no records for a period of time, i.e., when a data sparsity problem exists in the call log. As a consequence, meaningful rules are sometimes not produced for a period of time.
Let us consider an example: on Mondays a user attends a regular business meeting from 2:00PM to 5:00PM. The user rejects incoming calls while in the meeting and typically accepts calls at other times. Let us assume an approach that segments time by hour-of-the-day. In the user’s log history, there is evidence of rejected calls during the first and last hours of the meeting (i.e., [2:00PM-3:00PM] and [4:00PM-5:00PM]), and no data exists for [3:00PM-4:00PM]. As a result, no behavior can be identified between 3:00PM and 4:00PM and no rule is produced for that period, even though this period is also part of that event.
To address the above issues, in this paper, we propose a hybrid approach that utilizes both calendar and mobile phone data to identify the effective time boundaries for discovering temporal behavior rules of individual users. In our approach, an unsupervised bottom-up step performs behavior-oriented time segmentation, capturing the dominant call response behavior for time segments by analyzing the characteristics of the log data. However, viewed over a number of weeks, many of these segments will not have any logged activity. This problem of data sparseness is then addressed by aggregating, top-down, the produced segments of like behavior using calendar data. Temporal rules are then discovered for the aggregated time segments. As the behaviors of different individuals are not identical in the real world, such segmentation may differ from user to user according to their behavioral patterns over time. In particular, the contributions of this paper are threefold: (1) we identify and analyze the data sparseness problem in phone call log data as it relates to identifying effective time boundaries for discovering temporal behavioral rules of individual mobile phone users; (2) we propose a hybrid approach that utilizes the log-based approach and correlates this observed behavior with calendar data to improve the definition of temporal boundaries that delineate differing response behavior; and finally (3) we present a preliminary quantitative evaluation of our approach using the real datasets of individual users.
HYBRID APPROACH
In this section, we present our hybrid approach for discovering the temporal behavior rules of individuals utilizing both calendar and mobile phone data.
Pre-processing
Mobile Phone Data Pre-processing
In this step, we pre-process the call log data to extract and differentiate the user’s call responses into three categories, i.e., ACCEPT, REJECT and MISSED, which are the user behaviors we are interested in. However, the user’s actions of accepting and rejecting calls are not directly distinguishable among the INCOMING calls in the log. As such, we derive ACCEPT and REJECT calls from the call duration. If the call duration is greater than 0, the call has been ACCEPTED; if it is equal to 0, the call has been REJECTED (Sarker, 2016b).
Calendar Data Pre-processing
We utilize an individual’s iCalendar data (such as that provided by Google Calendar) for calendar information, because many of today’s calendar applications support iCalendar data for scheduling appointments (Lee, 2010). In pre-processing, we extract the temporal information (e.g., date, time) of the events from the iCalendar data. Figure 1 shows a snippet of iCalendar data for a single event ‘meeting’.
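The two pre-processing steps above can be illustrated with a minimal Python sketch. It is not the authors' Java implementation: the function names, the call-type labels and the sample event are hypothetical, and the parser assumes the basic iCalendar date-time form (YYYYMMDDTHHMMSS) while ignoring time zones and all-day events. The sample event is invented for illustration and is not the content of Figure 1.

```python
from datetime import datetime


def classify_call(call_type, duration_seconds):
    """Map a raw call-log record to ACCEPT, REJECT or MISSED.

    call_type is assumed to be "INCOMING" or "MISSED" (hypothetical labels).
    """
    if call_type == "MISSED":
        return "MISSED"
    if call_type == "INCOMING":
        # Duration rule from the text: > 0 means accepted, 0 means rejected.
        return "ACCEPT" if duration_seconds > 0 else "REJECT"
    return None  # other call types are not a response behavior


def parse_event_times(ics_text):
    """Extract (start, end) datetimes from one event's DTSTART/DTEND lines."""
    times = {}
    for line in ics_text.splitlines():
        name, _, value = line.strip().partition(":")
        key = name.split(";")[0]  # drop parameters such as ;TZID=...
        if key in ("DTSTART", "DTEND"):
            # Assumption: basic date-time form; a trailing "Z" is dropped
            # and time zones are ignored.
            times[key] = datetime.strptime(value.rstrip("Z"), "%Y%m%dT%H%M%S")
    return times.get("DTSTART"), times.get("DTEND")


# Hypothetical event used only for illustration.
SAMPLE_EVENT = """BEGIN:VEVENT
DTSTART:20160404T140000
DTEND:20160404T170000
SUMMARY:meeting
END:VEVENT"""

start, end = parse_event_times(SAMPLE_EVENT)
```

Here `start` and `end` carry the date and time of the event, from which the day of the week and the event's time boundaries can be read.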
Figure 1: Sample iCalendar data for a scheduled event
Behavior-Oriented Time Segmentation
In this section, we summarize the behavior-oriented time segmentation technique (Sarker, 2016b) that is used in our hybrid approach. This technique employs bottom-up processing of an individual’s mobile phone data. First, each day of the week is divided into relatively small time slices according to a given base period. For the purposes of this study, 5 minutes (a reasonable time gap) is assumed as the finest granularity required to distinguish day-to-day activities. These initial time slices are used to identify the dominant call response behavior of individuals in these time periods, because their daily behavior occurs in a time interval rather than at an exact time. After that, call instances from adjacent time slices of like dominant behavior are aggregated and the corresponding time slices are merged, to increase the temporal coverage and support value of the identified time segments (i.e., the merged time slices). These larger time segments have more support and are then used as the basis for mining rules pertinent to the individuals. The applicability of the discovered rules for that segmentation is then measured. After that, the base period is iteratively increased by a reasonable time gap, and the applicability of the generated rules is compared over each iteration. The time segmentation that yields the maximum applicability establishes the optimal time segmentation. The resulting segments are stored in an optimal time segments (OTS) list with their corresponding dominant behavior to output the rules for the users.
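The slicing and merging steps for a single base period can be sketched as follows. This is a simplified Python illustration rather than the published algorithm: the function name and data layout are hypothetical, ties between behaviors are broken by first occurrence (the text does not say how ties are handled), and the iterative search over base periods is omitted.

```python
from collections import Counter

MINUTES_PER_DAY = 24 * 60


def segment_day(calls, base_period=5):
    """Split one weekday into slices and merge adjacent like-behavior slices.

    calls: list of (minute_of_day, behavior) pairs for the same weekday,
           collected over the logged weeks.
    Returns (start_min, end_min, dominant_behavior, call_count) tuples;
    dominant_behavior is None for segments with no logged calls.
    """
    slices = []
    for start in range(0, MINUTES_PER_DAY, base_period):
        end = min(start + base_period, MINUTES_PER_DAY)
        counts = Counter(b for m, b in calls if start <= m < end)
        # Assumption: ties are broken by first occurrence in the log.
        dominant = counts.most_common(1)[0][0] if counts else None
        slices.append((start, end, dominant, sum(counts.values())))

    # Merge adjacent slices that share the same dominant behavior.
    segments = [slices[0]]
    for start, end, dominant, count in slices[1:]:
        p_start, _, p_dominant, p_count = segments[-1]
        if dominant == p_dominant:
            segments[-1] = (p_start, end, dominant, p_count + count)
        else:
            segments.append((start, end, dominant, count))
    return segments


# Hypothetical Monday calls, as minutes after midnight.
monday_calls = [(843, "REJECT"), (844, "REJECT"), (848, "REJECT"), (852, "ACCEPT")]
monday_segments = segment_day(monday_calls, base_period=5)
```

In this example the two 5-minute slices starting at 14:00 and 14:05 both have REJECT as their dominant behavior, so they merge into one segment [14:00-14:10] with three supporting calls, while the long stretches without data become no-data segments.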
Optimal Segments Aggregation
In our technique, once the optimal segments have been determined for an optimal base period, segments that exhibit similar dominant behavior, including segments G (in which no data exists), are dynamically aggregated into the longest possible time segments. We do this utilizing the scheduled events in the calendar. This is done to increase the support value and temporal coverage of any rules that are eventually extracted for these time segments. Figure 2 shows an example of producing an aggregated segment that combines three different segments; ‘Reject’ becomes the dominant behavior for the whole time boundary [10am - 1pm], according to the dominant behavior of the aggregated segment. Temporal behavior rules are then derived using these aggregated segments.
Figure 2: Segments Aggregation
Syntactically, a temporal phone call response behavior rule of an individual is a pair (T, B), where T is the temporal information and B is the phone call response behavior during that time period. We use the following information about the calendar events and optimal time segments. Event {Edate, Tstart, Tend}, where Edate represents the event date, Tstart is the start time of the event, and Tend is the end time of that event. OptimalTimeSegment {Stime, Sday, CallB}, where Stime is the time segment, Sday represents the associated day of that segment, and CallB represents the dominant phone call response behavior for that segment. As the events and optimal time segments contain temporal information, we aggregate the adjacent segments within a scheduled event based on this temporal information. The overall process is set out in Algorithm 1. The input data are the event list Elist and the optimal time segment list OTSlist generated by the behavior-oriented time segmentation approach; the output is the list of modified segments during the time period of a scheduled event in the calendar. First, we extract the temporal information Tevent for each event E from Elist. We then filter the time segments during the time period of that event by synchronizing the temporal information in OTSlist. After that, for each time segment during Tevent, we aggregate adjacent segments, including G, based on like dominant characteristic. Thus a modified segment list is generated that is more effective for producing an individual’s behavioral rules. For creating rules, our hybrid approach takes into account both the modified segments produced within calendar events and the remaining optimal segments in OTSlist that lie outside the scheduled events.
Rule Mining
We employ the well-known association rule mining algorithm Apriori (Agrawal, 1994) for mining the temporal rules of an individual user. The following are the two basic measures of the association rule mining algorithm.
Support: It measures the frequency of association, i.e., how many times the specific type of call has occurred in a dataset. Mathematically, an association rule is an implication expression of the form X ⇒ Y, where X and Y are disjoint data, i.e., X ∩ Y = ∅. If σ represents the support count and N is the total number of instances in the dataset, then the formal definition of ‘Support’ is:
$$s(X \Rightarrow Y) = \frac{\sigma(X \cup Y)}{N}$$
Confidence: It is a measure of the strength of the association rules. If σ represents the support count for the above expression of X and Y, then the formal definition of ‘Confidence’ is:
$$c(X \Rightarrow Y) = \frac{\sigma(X \cup Y)}{\sigma(X)}$$
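To connect the aggregation step described under Optimal Segments Aggregation with these two measures, the following Python sketch merges the segments that fall inside one calendar event and then computes support and confidence for the resulting rule. It rests on several assumptions that the text does not confirm: a no-data segment G is treated as compatible with any neighboring behavior, segments with different non-empty behaviors are never merged, X is read as "the call falls in time segment T", and Y as "the response is behavior B". All names and numbers are hypothetical.

```python
def aggregate_within_event(segments, event_start, event_end):
    """Merge adjacent segments that lie inside a calendar event.

    segments: time-ordered tuples (start_min, end_min, behavior,
              behavior_count, call_count); behavior is None for a
              no-data segment G.
    """
    merged = []
    for seg in segments:
        start, end, behavior, hits, calls = seg
        if start < event_start or end > event_end:
            continue  # outside the event: left to the original OTS list
        if merged:
            p_start, p_end, p_behavior, p_hits, p_calls = merged[-1]
            # Assumption: G (None) is compatible with any behavior.
            compatible = (behavior is None or p_behavior is None
                          or behavior == p_behavior)
            if p_end == start and compatible:
                merged[-1] = (p_start, end, p_behavior or behavior,
                              p_hits + hits, p_calls + calls)
                continue
        merged.append(seg)
    return merged


def support(behavior_count, total_calls):
    """s(X => Y) = sigma(X u Y) / N."""
    return behavior_count / total_calls


def confidence(behavior_count, calls_in_segment):
    """c(X => Y) = sigma(X u Y) / sigma(X)."""
    return behavior_count / calls_in_segment


# Monday meeting 14:00-17:00 with no data in the middle hour (hypothetical counts).
ots = [
    (840, 900, "REJECT", 3, 3),
    (900, 960, None, 0, 0),
    (960, 1020, "REJECT", 2, 3),
]
aggregated = aggregate_within_event(ots, event_start=840, event_end=1020)
start, end, behavior, hits, calls = aggregated[0]
rule_support = support(hits, total_calls=20)  # N = 20 is a made-up log size
rule_confidence = confidence(hits, calls)
```

In this example the three hourly segments collapse into a single REJECT segment covering the whole meeting, so the resulting rule spans the hour that had no data and pools the support of the first and last hours.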
EXPERIMENTAL EVALUATION
We have conducted experiments on real datasets of both the calendar schedules and the mobile phone data of different users. We have implemented our method in the Java programming language and executed it on a Windows PC with an Intel Core i5 CPU (3.20GHz) and 8GB memory.
Data Collection
We have collected both calendar data and mobile phone data from 15 individual users of different professions, such as undergraduate students, postgraduate students, university lecturers and industry professionals. To collect mobile phone data, we first developed an Android mobile app that collects the user's real call log data from their mobile phone. For calendar data, we have collected each individual’s iCalendar data from Google Calendar. As our approach is fully individualized, in this paper we illustrate it with the detailed experimental results of two different users.
Evaluation Metrics
To evaluate our approach for discovering the effective temporal rules of individuals, we use the metric ‘Applicability’ (Sarker, 2016b) that measures the applicability of time-dependent rules.
Applicability: Applicability is a descriptive statistic that (1) measures how much of the week is covered by rules and (2) takes into account the support of rules, for a particular confidence threshold. It is defined as the product of aggregate support and aggregate temporal coverage, where aggregate support is the sum of the support counts of all rules that satisfy the confidence threshold, as a fraction of the maximum possible support, and aggregate temporal coverage is the proportion of time covered by those rules. Formally, the applicability Arule is defined as:
$$A_{rule} = \sum_{n=1}^{N} \left( \frac{R_{support}}{S_{max}} \times \frac{R_{coverage}}{C_{max}} \right)$$
where Rsupport is the support count of a rule, Rcoverage is the temporal coverage of the rule, Smax is the maximum possible support in a dataset, Cmax is the maximum possible temporal coverage in a week, and N is the number of rules that satisfy the confidence threshold.
Effectiveness of Hybrid Approach
To show the effectiveness of our hybrid approach, we compare the applicability of our approach with CBA and LBA. Figure 3 and Figure 4 show the applicability comparison for User 1 and User 2 (randomly chosen), respectively. For each approach, we set the minimum support value to 1 (one instance) because no rules are meaningful for a support value below 1. Moreover, we have explored different confidence thresholds, i.e., 100% (highest strength), 90%, and down to 70% (comparatively lower strength). The setting of this confidence threshold for creating rules will vary according to an individual’s preference on how interventionist they want the call-handling agent to be. For example, one person may want the agent to reject calls where in the past he/she has rejected calls more than, say, 80% of the time, that is, at a threshold of 80%. Another individual, on the other hand, may only want the agent to intervene if he/she has rejected calls in, say, 95% of past instances. Since, by definition, confidence is associated with rule strength, we do not consider confidence thresholds below 70%.
If we observe Figure 3 and Figure 4, we see that the applicability of the CBA approach (Khalil, 2005) is very low compared to the other approaches. The main reason is that the CBA approach strictly follows the time boundaries of the events that exist in the calendar, and does not verify whether the time boundaries in the calendar are associated with a real event or not. Another reason is that it assumes the existence of an event is associated with REJECT behavior. However, in the real world, users’ behavior changes over time and may differ from event to event and user to user. Therefore, CBA cannot properly capture an individual’s behavior, and as a result its applicability is low.
Figure 3: Applicability comparison (User 1)
Figure 4: Applicability comparison (User 2)
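The threshold sweep used in this comparison can be illustrated with a short Python sketch that evaluates the applicability formula from the Evaluation Metrics subsection at several confidence thresholds. The rule fields, the example numbers, the choice of minutes as the unit of temporal coverage, and the reading of Smax as the total number of logged calls are all assumptions made for illustration; they are not taken from the experiments.

```python
WEEK_MINUTES = 7 * 24 * 60  # C_max, assuming coverage is measured in minutes


def applicability(rules, min_confidence, s_max, c_max=WEEK_MINUTES, min_support=1):
    """Sum (R_support / S_max) * (R_coverage / C_max) over qualifying rules.

    rules: list of dicts with hypothetical keys "support" (count),
           "coverage" (minutes per week) and "confidence" (0..1).
    """
    total = 0.0
    for rule in rules:
        if rule["support"] >= min_support and rule["confidence"] >= min_confidence:
            total += (rule["support"] / s_max) * (rule["coverage"] / c_max)
    return total


# Made-up rules for one user.
rules = [
    {"support": 5, "coverage": 180, "confidence": 1.00},
    {"support": 8, "coverage": 240, "confidence": 0.80},
    {"support": 3, "coverage": 60, "confidence": 0.60},
]
s_max = 20  # assumed here to be the total number of logged calls

sweep = {t: applicability(rules, t, s_max) for t in (1.0, 0.9, 0.8, 0.7)}
```

Lowering the threshold can only admit more rules, so the applicability values in `sweep` never decrease as the threshold drops; the third rule, with 60% confidence, is excluded at every threshold considered.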
The LBA approach (Sarker, 2016b) produces greater applicability than the CBA approach. The main reason is that the LBA approach identifies the actual behavior of individual users at various times of the week. Finally, our hybrid approach consistently outperforms both the CBA and LBA approaches for a particular confidence threshold. The main reason is that the hybrid approach takes into account the issues addressed by each of these two approaches. Our hybrid approach identifies the users’ actual behavior at various times of the day and days of the week using the LBA approach and utilizes the calendar information to minimize the data sparseness. Thus our hybrid approach produces better time segments of like behavior for each user, yielding a set of meaningful temporal rules with high applicability for a particular confidence threshold.
CONCLUSION AND FUTURE WORK
In this paper, we have presented a hybrid approach to discovering temporal behavior rules for individual mobile users utilizing their phone call log and calendar data. Our approach assists developers in building intelligent behavior-oriented call interruption management systems for the benefit of end users. Our preliminary experiments on real datasets show that our approach is effective in identifying time segments for discovering better temporal behavior rules of individual mobile phone users. In future work, we plan to study and incorporate additional contexts, such as the social relationship between the caller and callee, to obtain user-specific rules. We also plan to conduct a qualitative evaluation of our approach.
REFERENCES
Agrawal, R. and Srikant, R. Fast algorithms for mining association rules. In proceedings of VLDB 1994, ACM Press (1994), 487-499.
Halvey, M. and Keane, M. Time based patterns in mobile-internet surfing. In proceedings of the 2006 SIGCHI conference on Human Factors in computing systems, ACM Press (2006), 31-34.
Kabir, M.A., Han, J., Yu, J. and Colman, A. User-centric social context information management: an ontology-based approach and platform. Personal and Ubiquitous Computing. Springer (2014), 1061-1083.
Khalil, A. and Connelly, K. Improving cell phone awareness by using calendar information. In proceedings of the IFIP Conference on Human-Computer Interaction, Springer (2005), 588-600.
Lee, A. Exploiting context for mobile user experience. In proceedings of the First Workshop on Semantic Models for Adaptive Interactive Systems, (2010).
Lovett, T., O'Neill, E., Irwin, J. and Pollington, D. The calendar as a sensor: analysis and improvement using data fusion with social networks and location. In proceedings of the 12th ACM international conference on Ubiquitous computing, ACM Press (2010), 3-12.
Pejovic, V. and Musolesi, M. InterruptMe: designing intelligent prompting mechanisms for pervasive applications. In proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing, ACM Press (2014), 897-908.
Sarker, I.H., Colman, A., Kabir, M.A. and Han, J. Phone call log as a context source to modeling individual user behavior. In proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing, ACM Press (2016a), 630-634.
Sarker, I.H., Colman, A., Kabir, M.A. and Han, J. Behavior-Oriented Time Segmentation for Mining Individualized Rules of Mobile Phone Users. In proceedings of the 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA’16), IEEE Press (2016b).
Seo, S., Kwon, A., Kang, J. and Strassner, J.A. PYP: design and implementation of a context-aware configuration manager for smartphones. In proceedings of the 1st International Workshop on Smart Mobile Applications, (2011).
Uhlmann, S. and Lugmayr, A. Personalization algorithms for portable personality. In proceedings of the 12th international conference on Entertainment and media in the ubiquitous era, ACM Press (2008), 117-121.
Zulkernain, S., Madiraju, P., Ahamed, S. and Stamm, A. A Mobile Intelligent Interruption Management System. Journal of Universal Computer Science (2010), 2060-2080.