
IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, VOL. 10, NO. 4, OCTOBER 2013

Predicting Stay Time of Mobile Users With Contextual Information

Sen Liu, Huanhuan Cao, Lei Li, and MengChu Zhou, Fellow, IEEE

Abstract—Mobile service providers and manufacturers continue to provide services and devices that take advantage of the location information associated with devices to provide a more personalized experience for users. For many such services, the user experience can be dramatically improved if a mobile device can predict how long a mobile user will stay at the current location. In this paper, we propose to take advantage of contextual information for predicting the stay time of mobile users. Specifically, we investigate two strategies for modeling the relevance between the stay time and contextual information, i.e., Stay Status Prediction (SSP) and Stay Time Prediction (STP). SSP predicts whether a mobile user will still be at the current location at a given future time point according to the contextual information at the current time point, while STP directly predicts how long a mobile user will stay at the current location. Moreover, we study several typical machine learning models that can be extended for implementing SSP and STP and evaluate their performance with respect to prediction accuracy. We also conduct extensive experiments on real data sets to evaluate several implementations of the proposed strategies in terms of both effectiveness and efficiency for STP.

Note to Practitioners—Automatically and accurately predicting the stay time of mobile users at a location is very important in enabling service providers to offer their customers the desired services. This work is the first to develop two prediction strategies based on historical contextual information, and it builds the resulting STP system with machine learning algorithms. Such a system can be easily implemented in a smart phone that has GPS and other sensors. This work also examines such issues as prediction performance, memory requirements, and energy consumption of the developed system. The research results show that it can be readily deployed for practical applications.
Index Terms—Contextual information, mobile users, stay time prediction (STP).

Manuscript received July 31, 2012; revised January 18, 2013; accepted April 10, 2013. Date of publication May 22, 2013; date of current version October 02, 2013. This paper was recommended for publication by Associate Editor E. Roszkowska and Editor H. Ding upon evaluation of the reviewers’ comments. This work was supported in part by the Beijing Laboratory, Nokia Research Center, the Special Fund for FSSP in Net Era by CSTD, the National Science Foundation of China (NSFC) under Grants 61202247 and 71231002, the 111 Project of China under Grant B08004, the Fundamental Research Funds for the Central Universities, the National Basic Research Program of China under Grant 2011CB302804, and the National Science Foundation (NSF) under Grant CMMI-1162482. (Corresponding author: L. Li.)

S. Liu and L. Li are with the Beijing University of Posts and Telecommunications, Haidian, Beijing 100876 (e-mail: [email protected]; leili@bupt.edu.cn).

H. Cao is with the Beijing Laboratory, Nokia Research Center, Daxing, Beijing 100023, China (e-mail: [email protected]).

M. C. Zhou is with the Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ 07102 USA, and also the Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai 201804, China (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TASE.2013.2259480

I. INTRODUCTION

PROVIDERS and manufacturers of mobile devices are continually challenged to deliver value and convenience to consumers by providing compelling network services. In particular, they strive to offer services and devices that take advantage of the location information associated with devices in order to provide a more personalized experience for users at various locations. For many such location-based personalized services, the user experience can be dramatically improved if a mobile device can predict how long the user will stay at the current location.

For example, suppose that service providers would like to provide a mobile service for recommending nearby deals. It is not practical to continuously monitor the locations of users because 1) GPS positioning consumes too much energy on mobile devices and 2) the amount of uploaded data is unacceptably high for users. Moreover, it may not be proper to always recommend deals to users when they are close to the deal providers, because many of them may be on the way to other places and would not like to stop. Instead, if the mobile device can effectively predict and report to the server that the user will probably stay at the current location for a while, the server can efficiently push recommendations if at the same time it discovers some suitable nearby deals. To be specific, if a mobile device detects that the user will stay at a shopping mall for a relatively long time and is thus likely to enjoy the recommended deals, it will report the information to the back-end server and request some good nearby deals.

It is observed that the context of mobile users may be relevant to their stay time. Hence, we may be able to build a stay time prediction (STP) model if we can learn such relevance from the historical logs in mobile devices, which contain rich context information and user trajectories [9]. For example, as illustrated in Fig. 1, suppose that on Saturday mornings Linda usually first telephones her friend Lily to arrange a time to meet, and then goes shopping in a mall. We may learn from her device logs that Linda is likely to stay for a while when this context recurs, where the cell ID 4500–2344 indicates the location of the shopping mall covered by the cell site 4500–2344.

Although the problem of predicting the stay time of mobile users by leveraging rich contextual information is both interesting and important in practical applications, we find that it has hardly been explored. One may argue for converting the problem of predicting the stay time of mobile users into that of predicting the next stay point of a mobile user after a predefined time length. There is indeed some prior work addressing the latter problem [2], [6]. However, first, most of these approaches cannot

1545-5955 © 2013 IEEE

LIU et al.: PREDICTING STAY TIME OF MOBILE USERS WITH CONTEXTUAL INFORMATION


TABLE I A TOY EXAMPLE OF CONTEXT LOG

Fig. 1. A motivating example to illustrate the relevance between the stay time of a mobile user and the corresponding context.

achieve high enough accuracy for practical applications, because predicting the next location of a mobile user is much more complex than predicting whether a user will stay at the current location after a predefined time length. Second, none of them leverages rich context information beyond the historical trajectories of users.

To this end, we fill this void by developing a context-aware framework to predict the stay time of mobile users. Along this line, we investigate two strategies for modeling the relevance between contextual information and the stay time, i.e., Stay Status Prediction (SSP) and Stay Time Prediction (STP). The former’s objective is to predict whether a mobile user will still be at the current location at a given future time point according to the contextual information at the current time point, while the latter aims to directly predict how long a mobile user will stay at the current location. Moreover, we study several representative machine learning models, i.e., support vector machine, decision tree, multiple linear regression, and principal component regression models, which can be extended for implementing both strategies. Their performance is evaluated in terms of prediction accuracy. Since STP is expected to be performed by mobile devices, the energy consumption and computational requirements of different implementations of STP systems are worth studying as well. Therefore, we also comprehensively evaluate them.

The contributions of this paper are summarized as follows. First, this paper is one of the first attempts to leverage rich contextual information for predicting how long a mobile user will stay at the current location, which is crucial for improving a wide range of location-based and personalized services. It presents STP models that are built by mining the historical device logs of mobile users. Second, two strategies of modeling the relationship between contextual information and the stay time of mobile users are

given. For each strategy, several typical machine learning models are extended for its implementation. Then, their performance is evaluated and compared. Finally, experiments are conducted on several real-world mobile device logs containing rich context data and user trajectories to evaluate the effectiveness of different implementations of context-aware STP systems based on different models in terms of prediction accuracy, energy efficiency, and computation cost. The results suggest some instructive conclusions for researchers and developers in the relevant fields.

The rest of this paper is organized as follows. In Section II, we give an overview of context-aware STP systems. Then, in Sections III and IV, we present two context-aware STP strategies and the corresponding models for each strategy. The experimental results on real data sets are reported and analyzed in Section V. Finally, we conclude this paper and introduce our future research plan in Section VI.

II. OVERVIEW

A. Preliminaries

Smart devices can capture the historical context data of users through multiple sensors and record them in context-rich device logs, or context logs for short [3], [7]. For example, Table I shows a toy example of a context log, which contains several context records, each consisting of a timestamp and the most detailed context captured by the device at that time. A context consists of several contextual features (e.g., Day name, Time range, and Coordinate) and their corresponding values (e.g., Saturday, AM8:00–9:00, and a coordinate value), which can be annotated as contextual feature-value pairs. The log system can predefine a set of contextual features whose values should be collected, but a context record may miss the values of some contextual features because these values are not always available. For example, when a user is indoors or the GPS module is disabled, the mobile device cannot offer its coordinate information. Moreover, user interactions do not happen in every sampling interval and can only be recorded in some context records. In practice, context logs record only the contextual feature-value pairs whose values are not missing [26].

From context logs, we can extract many stay records of mobile users associated with the corresponding contextual information. We argue that the stay time of mobile users is more or less relevant to their contexts, and that the stay records extracted from their context logs can be used for training the models for predicting how long they will stay at their current location


given the contextual information captured by their mobile devices. As pointed out by previous work on context log-based behavior mining [3], [7], [18], many user behaviors are relevant to their contexts, and context logs can be used effectively for mining context-aware behaviors. Some observations of mobile users show that their stay time seems to have a similar relevance to their context. For example, some users regularly go to and stay at shopping malls at a particular time and after particular activities, such as calling their friends to arrange a place to meet. If such typical contexts are detected, it is likely that they will stay at their current locations for a while. Therefore, it is possible to develop a context-aware STP system by mining the relevance between the stay time of users and their contexts.

B. Related Work

1) Trajectory-Based Location Prediction: The most related work is trajectory-based location prediction. It can be roughly classified into two categories. The first one aims to discover the motion pattern of many users from the aggregated trajectories. For example, Josh et al. [29] proposed to evaluate the next location of a mobile user based on the frequent behaviors of similar users in the same cluster, determined by analyzing users’ common behavior in semantic trajectories. Zheng et al. [31] aimed to mine the correlation between users and locations in terms of user-generated GPS trajectories. Reinaldo et al. [4] investigated ways of efficiently analyzing user trajectories by clustering algorithms and extracting user preferences from them. By contrast, the second category aims to discover the motion pattern of individual users from their personal trajectories. For example, Kim et al. [15] predicted a user’s specific routes by learning his/her route patterns from his/her histories. Calabrese et al. [6] proposed to combine the geographical information of cells with individual users’ trajectories for predicting the next location of a user. Anagnostopoulos et al. [2] treated location prediction as a classification problem and used decision trees to classify the next location of a user into a predefined location according to the previous trajectory.

As we pointed out in Section I, the problem of predicting whether mobile users will stay at the current location during a particular time in the future can be converted to the problem of predicting their next location after the same time length, and can to some extent be addressed by the second category of the above studies. However, the resulting prediction performance falls short of the proposed system for the following two reasons.

1) Intuitively, the problem of predicting the next location of a mobile user is much more difficult than our problem because it has a more complex output space that keeps changing in many cases. Our study shows that such approaches cannot achieve sufficiently good performance for practical applications.

2) None of them proposes a framework to incorporate rich contextual information beyond time and location, such as call logs, charging status, and user activities, which can be captured by mobile devices for improving the prediction accuracy.

By contrast, the proposed approach is developed specifically for the STP problem. Moreover, it can incorporate multiple kinds of contextual information seamlessly and is thus flexible enough to use more context data, which may be captured by the next generation of mobile devices with more advanced sensors than those mentioned in this paper.

2) Prediction of Time-Based Presence: Another category of the related work is the prediction of time-based presence. Krumm et al. [16] introduced a probabilistic home/away schedule computed from observed GPS data. The study [13] presents a prototype service, COORDINATE, which supports collaboration and communication by learning predictive models that provide forecasts of users’ presence and availability. Scott et al. [22] proposed to automatically control home heating by using occupancy sensing and occupancy prediction. Although the above application scenarios are different, they are all more or less related to time prediction and give us some ideas for our research.

3) Context-Aware Computing: With the popularity of smart mobile devices and their increasing ability to capture contextual information, studies on leveraging such information to improve existing services or enable novel services have attracted much attention. For example, Adomavicius et al. [1] argued the importance of relevant contextual information when providing recommendations, and introduced three different algorithmic paradigms for incorporating contextual information into the recommendation process. Riboni et al. [21] discussed the challenging problem of human activity recognition for context-aware systems and applications. Woerndl et al. [28] proposed a model for proactivity in mobile, context-aware recommender systems. The model relies on domain-dependent context modeling in several categories.
By contrast, in recent years, some researchers have proposed to leverage machine learning technologies for modeling the relevance between the context and user behavior from the historical device logs of mobile users. For example, Cao et al. [7] proposed an effective approach for mining the user behavior patterns that are relevant to particular contexts. Bao et al. [3] studied how to leverage extended topic models for mining the latent relevance between the context and user behavior. Li et al. [17] developed a framework that uses multiple machine learning technologies to infer the status of a GPS sensor according to other contextual information, which could be used to infer the location of a user without invoking the GPS sensor. These studies clearly demonstrate that relevance exists between the context and user behavior and that context-rich device logs can be used for mining it. The work reported in this paper is partially inspired by such works.

C. Context-Aware Stay Time Prediction (STP) Systems

A context-aware STP system can work as a middleware in mobile devices, which provides an Application Programming Interface (API) for third party stay-time-aware location-based services. Two constraints ensure that it works normally. First, the sampling intervals of all context sensors are the same. Second, the first invocations of all context sensors are synchronous when the system starts. This is because a context-aware


Fig. 2. The framework of a learning-based context-aware time prediction system.

STP system needs as many outputs of context sensors as possible at a particular time point to infer the stay time of mobile users. It is worth noting that a third party context-aware application can bypass the context-aware STP system and directly invoke a context sensor with a different sampling interval if it has a special need. However, this is beyond the scope of this paper and is thus left as future work.

The framework of a context-aware STP system is illustrated in Fig. 2. The system includes two components, i.e., the prediction model component and the STP component. The former is used for learning a context-aware STP model by mining the context logs of mobile users, while the latter is used to predict the stay time of the user given the current contextual information. The system works as follows. First, it collects the outputs of all context sensors at a fixed sampling interval over a predefined period, such as one week. The collected data are recorded as a context log. Second, the prediction model component learns a context-aware STP model from the context log. The model reflects the relevance between the stay time and the corresponding contextual information for a particular user. Finally, the STP component periodically uses the learned model to predict how long the mobile user will stay at the current location according to the latest contextual information captured by the mobile device.

In practice, we determine whether a mobile user stays at the same location between two time points by taking into account the positioning accuracy and the requirements of third party stay-time-aware location-based services. The positioning accuracy differs between approaches. For example, the positioning error of GPS sensors is usually less than ten meters and that of cell ID-based positioning is usually less than 50 meters. Thus, the definition of “stay” should differ across mobile devices with different positioning approaches.
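One way to make a positioning-dependent definition of “stay” concrete is a distance threshold that scales with the positioning error. The following is a minimal sketch under stated assumptions, not the authors' implementation: positions are planar (x, y) coordinates in meters, the function names are hypothetical, and the threshold values simply reuse the error figures quoted above for illustration.

```python
import math

# Hypothetical thresholds in meters, one per positioning approach. The values
# reuse the positioning errors quoted in the text and are assumptions only.
STAY_THRESHOLD_M = {"gps": 10.0, "cell_id": 50.0}


def distance_m(p, q):
    """Planar distance between two (x, y) positions given in meters."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def is_stay(positions, threshold_m):
    """Return True if every position lies within threshold_m of the first one."""
    start = positions[0]
    return all(distance_m(start, p) < threshold_m for p in positions[1:])


# A short trajectory sampled at consecutive time points.
track = [(0.0, 0.0), (3.0, 4.0), (20.0, 0.0)]
print(is_stay(track, STAY_THRESHOLD_M["gps"]))
print(is_stay(track, STAY_THRESHOLD_M["cell_id"]))
```

The same trajectory counts as a move under the tighter threshold and as a stay under the looser one, which is why the threshold has to be chosen together with the positioning approach.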
Moreover, some third party stay-time-aware location-based services do not need to know how long a mobile user will stay at an exact position, but rather at a meaningful location such as a shopping mall or tourist attraction. In this case, we can define a less strict metric of “stay” such as “the maximum distance between any position and the beginning position in the user trajectory over the time points considered is less than a given threshold.” In practice, the threshold can be set from tens of meters to hundreds of meters according to the application scenario.

One may argue for other approaches that predict the stay time of mobile users according to a single contextual feature such as time or location. Actually, the proposed framework of context-aware STP systems is more general and can outperform the approaches based on such single contextual features by comprehensively taking advantage of multiple contextual features, which has been demonstrated by our experiments on real data sets. Moreover, the generality of the framework makes it flexible to configure for different mobile devices with different context sensing abilities. For example, some of the latest high-end smart phones are equipped with more advanced sensors such as barometers and gyroscopes that can capture external contextual information more accurately, while there are still many low-end phones, some of which even lack GPS sensors and 3D accelerometers. The proposed framework can fit both of them to help provide more intelligent and personalized location-based services.

Learning a context-aware STP model plays an important role in a context-aware STP system. This work investigates two strategies of building context-aware STP models, i.e., Stay Status Prediction (SSP) and Stay Time Prediction (STP). For each strategy, we first introduce the basic idea and then propose some typical implementations.

III. STAY STATUS PREDICTION (SSP)

A. Overview

Given a set of contextual feature-value pairs at one time point, we can build a contextual feature vector $\mathbf{x}$. First, we divide all contextual features into two categories. One category has numeric values, such as background audio strength. The other category has nonnumeric values, such as Cell ID. For the contextual features with numeric values, we directly take each contextual feature as a feature and take its value as the corresponding feature value.
However, for the contextual features with nonnumeric values, this method is not applicable because the different values of one contextual feature are difficult to quantify. Instead, we take each particular feature-value pair of such contextual features as a separate feature and take the occurrence of the feature-value pair as a Boolean value, i.e., 1 indicates that the feature-value pair exists and 0 otherwise. Then, given $M$ features extracted from the training data, we can build an $M$-dimensional contextual feature vector from the set of contextual feature-value pairs.

Given a status variable $s$ to indicate a user’s stay status at a future time point and a contextual feature vector $\mathbf{x}$, an SSP model can be formally defined as a mapping function $f: \mathbf{x} \rightarrow s$, where $s$ indicates whether the user will stay at the current location after $n$ time points ($s = 1$) or not ($s = -1$). It predicts the stay status of a user at time point $t+n$ according to the contextual information at $t$.

Building an SSP model for a user can thus be transformed into the problem of training a binary classifier. The trained classifier takes the contextual information as observed features and classifies the future stay status of the user into a staying or moving category. Therefore, we need a training data set that contains enough samples of staying and moving statuses for a user. To prepare such a training data set, we first transform the raw context records into a context record sequence by sorting them in the order of timestamps. Then, we assign each context record a stay status label by checking whether the distance between any user position and the beginning position from time point $t$ to $t+n$ is less than a threshold $d$. Finally, for each labeled context record, we extract the corresponding contextual feature vector and the stay status label to build a training sample.

Given a set of training samples, multiple existing classification approaches can be applied to build an SSP model. In this paper, we adopt two typical classification approaches, i.e., support vector machine [10] and decision tree [20].

B. Building SSP Model Via Support Vector Machine

The basic idea of a support vector machine is to map the training samples into a high-dimensional space as training data points, and then find a set of hyperplanes that separate the training data points of different categories. Given $M$ features and $N$ training samples with category labels 1 (staying) or $-1$ (moving), we map each training sample into a training data point in an $M$-dimension space. To be specific, if a training sample has the value $v$ for the $j$th feature, the corresponding mapped data point has the value $v$ in the $j$th dimension. Then, the objective is to learn a hyperplane $\mathbf{w} \cdot \mathbf{x} - b = 0$, where $\mathbf{w}$ denotes an $M$-dimension vector and $\mathbf{x}$ indicates an $M$-dimension data point. The optimization problem is as follows:

$$\min_{\mathbf{w}, b, \boldsymbol{\xi}} \; \frac{1}{2}\|\mathbf{w}\|^{2} + C\sum_{i=1}^{N}\xi_{i} \tag{1}$$

subject to

$$s_{i}(\mathbf{w} \cdot \mathbf{x}_{i} - b) \ge 1 - \xi_{i}, \quad \xi_{i} \ge 0, \quad 1 \le i \le N \tag{2}$$

where $C$ is a constant value, $\xi_{i}$ denotes the degree of misclassification of the $i$th training data point $\mathbf{x}_{i}$, and $s_{i}$ indicates the category label of $\mathbf{x}_{i}$.

There exist several specialized algorithms for quickly solving the optimization problem, mostly relying on heuristics for breaking the problem down into smaller, more manageable chunks. In our work, we use the Sequential Minimal Optimization (SMO) type decomposition methods [12] to build SSP models.

C. Building SSP Model Via Decision Tree

It is desirable to build SSP models for mobile users in their mobile devices instead of on a remote server for the sake of privacy control. In this case, we have to take into account the complexity of training an SSP model because the computational resources of a mobile device are usually limited. Therefore, we also explore how to build SSP models with decision trees, which are also successfully applied to classification and need less computation than the support vector machine. In our problem, we need a decision tree to determine the most probable stay status of a mobile user during the following $n$ time points according to the current contextual information, where each node of the tree is a decision rule that determines the most probable future stay status by taking into account one feature.
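The feature construction of Section III-A and a single decision rule of the kind described above can be sketched as follows. This is an illustrative sketch only: the records, the second cell ID, the labels, and all function names are hypothetical, and the Gini impurity is assumed as the split criterion because the text does not name one at this point.

```python
def feature_key(name, value, numeric_features):
    """Numeric features keep one dimension; nonnumeric ones get one per value."""
    return name if name in numeric_features else (name, value)


def build_index(records, numeric_features):
    """Map every feature (or feature-value pair) seen in training to a dimension."""
    index = {}
    for record in records:
        for name, value in record.items():
            index.setdefault(feature_key(name, value, numeric_features), len(index))
    return index


def to_vector(record, index, numeric_features):
    """Numeric value as is; 1.0 if a nonnumeric feature-value pair occurs, else 0.0."""
    vector = [0.0] * len(index)
    for name, value in record.items():
        key = feature_key(name, value, numeric_features)
        if key in index:  # pairs never seen in training are ignored
            vector[index[key]] = float(value) if name in numeric_features else 1.0
    return vector


def gini(labels):
    """Gini impurity of a list of labels in {1, -1}."""
    if not labels:
        return 0.0
    p = labels.count(1) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_rule(X, y):
    """Best single-feature rule 'x[j] <= threshold' by weighted Gini impurity."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [label for row, label in zip(X, y) if row[j] <= threshold]
            right = [label for row, label in zip(X, y) if row[j] > threshold]
            if left and right:
                score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
                if best is None or score < best[0]:
                    best = (score, j, threshold)
    return best


# Hypothetical context records; the last one has no Cell ID, as in real logs.
NUMERIC = {"Battery level"}
records = [
    {"Day name": "Saturday", "Cell ID": "4500-2344", "Battery level": 80},
    {"Day name": "Saturday", "Cell ID": "4500-2344", "Battery level": 60},
    {"Day name": "Monday", "Cell ID": "4500-1111", "Battery level": 70},
    {"Day name": "Monday", "Battery level": 50},
]
labels = [1, 1, -1, -1]  # 1 = staying, -1 = moving

index = build_index(records, NUMERIC)
X = [to_vector(record, index, NUMERIC) for record in records]
print(best_rule(X, labels))  # (weighted impurity, feature dimension, threshold)
```

A full tree would apply `best_rule` recursively to the two partitions; the sketch stops at one node to keep the single-feature decision rule visible.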

Fig. 3. An example of how a perfect SSP model makes an error. A user stay starts and ends between prediction time points, so neither its start nor its end coincides with a prediction time point.

There exist several specialized algorithms for building decision trees. In our implementation, we use Classification And Regression Tree (CART) [5] and adopt a binary recursive partitioning technique. Our experiments on real data sets show that the trained decision trees have limited degrees and can thus be stored in mobile devices with a proper threshold of the target accuracy.

D. Selection of Prediction Window Size

An SSP model needs a parameter $n$ to indicate a time window with $n$ time points for predicting the future stay status of a user, which is called the prediction window. In practice, the size of a prediction window is defined by the third party stay-time-aware location-based services according to their particular applications. However, it is worth noting that its setting can affect the prediction accuracy of the model. Specifically, the smaller $n$ is, the more accurate the model’s predictions will be, and vice versa. This is because the beginning time point of a user stay probably lies between two time points at which contextual information is sampled for making predictions. In other words, the two predictions are made earlier and later than the real beginning time point of the user stay, respectively. As a result, even if the model can perfectly predict the real stay time of a user in the future, the practical prediction can still be wrong because it misses the beginning time point of the user stay.

For example, Fig. 3 illustrates how a perfect SSP model makes an error. Suppose that a user stay starts and ends between prediction time points, so that neither its start nor its end coincides with a prediction time point. The stay time of the user is longer than $n$ time points. However, at each prediction time point within the stay, the prediction model will infer that the user will stay less than $n$ time points. In the end, the model makes only negative predictions during the stay, while the user still stays until he or she really leaves the current location. In other words, the system wastes the opportunity of reporting the user stay to the third party services.

In the above example, even if the prediction window size is smaller than the user’s stay time, the opportunity of reporting a correct user stay may still be missed. Fig. 4 illustrates how a smaller prediction window size avoids the problem illustrated in Fig. 3. Intuitively, the smaller the prediction window size, the more likely it is to be smaller than the floor of a user stay time, and thus more user stays will be correctly predicted. This observation inspires us to ask the third party services to avoid setting an unnecessarily large prediction window when the STP system is implemented by an SSP model.
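The effect of the prediction window size can be checked with a small calculation. The sketch below assumes that predictions are made at integer time points, that a stay is described by real-valued start and end times in units of the sampling interval, and that a perfect SSP model reports a stay at time point t only if the user remains at the location for the whole span from t to t + n. The function name and the numbers are illustrative assumptions.

```python
import math


def first_detecting_point(stay_start, stay_end, window):
    """Return the first integer prediction time point t at which a perfect SSP
    model would report a stay, i.e. the user is at the location for the whole
    span [t, t + window]. Return None if no such time point exists."""
    t = math.ceil(stay_start)  # first prediction time point inside the stay
    # If the first point inside the stay fails, every later point fails too,
    # because less of the stay remains.
    return t if t + window <= stay_end else None


# A stay from 0.4 to 3.6 lasts 3.2 sampling intervals.
print(first_detecting_point(0.4, 3.6, 3))  # window of 3: checks 1 + 3 <= 3.6
print(first_detecting_point(0.4, 3.6, 2))  # window of 2: checks 1 + 2 <= 3.6
```

With a window of 3 the stay is longer than the window yet is never reported, because 1 + 3 exceeds 3.6. With a window of 2, which is smaller than the floor of the stay time, it is reported at time point 1.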


Given a new feature vector , the predicted staying status interval of must fall into the following range with a confidence : (5)

Fig. 4. An example of how a smaller prediction window size lets the model correctly predict a user stay.

IV. STAY TIME PREDICTION (STP)

where . The inferred interval range may not be bounded by integers. Thus, to map the interval into the discrete time points, we determine the practical interval range as where and represent the floor and ceiling functions, respectively.

A. Overview

C. Building STP Model Via Principal Component Regression

Given a variable for length of stay time and a contextual feature vector , an STP model can be formally defined as a , where indicates the number of intervals function before a user’s next moving status. It predicts the number of possible intervals between the current time point and the one when the next move occurs according to the current contextual information. Specially, if predicts the maximum likelihood for a time points, the system user to stop the staying status after will report the prediction to the third party services. The advantage of STP models is that they can provide a long-term prospect of the user stay beyond a predefined prediction window, which offers more useful information to the third party services for providing proper services based on such prediction. Training an STP model can be transformed to training a regression model from a training data set where each training sample contains a number of intervals before the next move with a feature vector extracted from collected context data [25]. To prepare such a training data set, similar to preparing it for SSP models, we first transform the raw context records to a context record sequence by sorting them in the order of timestamps. Then, we assign each context record a staying status interval number by checking the distance between each position and the beginning position of a user during the following time points. Finally, for each context record, we extract the corresponding contextual feature vector to build a training sample.

B. Building STP Model Via Multiple Linear Regression

By extracting $k$ features from the training samples as explanatory variables, we can implement an STP model by training a multiple linear regression [27] model in the following form:

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon} \qquad (3)$$

where $\mathbf{y}$ denotes the vector of staying status intervals for each training sample, $\mathbf{X}$ denotes a matrix whose rows are the mapped $k$-dimension training vectors, $\boldsymbol{\beta}$ denotes a $k$-dimension vector, and $\boldsymbol{\varepsilon}$ denotes the error term vector. Generally speaking, given $\mathbf{X}$, $\boldsymbol{\varepsilon}$ is assumed to follow a normal distribution with zero mean, i.e., $E[\boldsymbol{\varepsilon} \mid \mathbf{X}] = \mathbf{0}$ and $\operatorname{Var}[\boldsymbol{\varepsilon} \mid \mathbf{X}] = \sigma^2 \mathbf{I}$. The objective is to estimate $\boldsymbol{\beta}$ and $\sigma^2$. It is easy to prove that $\hat{\boldsymbol{\beta}} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}$ is an unbiased estimation of $\boldsymbol{\beta}$. Moreover, an unbiased estimator of $\sigma^2$ is

$$s^2 = \frac{\mathbf{e}^{\top}\mathbf{e}}{n - k} \qquad (4)$$

where $\mathbf{e} = \mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}$ and $n$ is the number of training samples.

It is well known that the general multiple linear regression approach may encounter the multicollinearity problem [14], which means that some explanatory variables are highly correlated. Consequently, the estimation of regression coefficients becomes unstable and may even mislead the estimation of the regression formulas. There are two widely used approaches for solving the multicollinearity problem. The first is stepwise regression [11], which selects a subset of explanatory variables that are not highly correlated for the regression analysis, thereby avoiding the problem. Its drawback is that it may miss some useful information captured by the omitted explanatory variables. The second is Principal Components Regression (PCR) [14], which uses the principal components of the explanatory variables instead of the raw explanatory variables for the regression analysis. As principal components are uncorrelated, the multicollinearity problem is avoided, while the information relevant for regression is retained as much as possible. Our implementation takes advantage of the PCR approach to solve the multicollinearity problem. To be specific, we first express the training samples in the form of weighted vectors of principal components by a matrix $\mathbf{A}$ as follows:

$$\mathbf{Z} = \mathbf{X}\mathbf{A} \qquad (6)$$

where the $(i, j)$th element of $\mathbf{Z}$ indicates the weight of the $i$th training sample for the $j$th principal component, and $\mathbf{A}$ is a matrix whose $j$th column is the $j$th eigenvector of $\mathbf{X}^{\top}\mathbf{X}$, i.e., a principal component. Therefore, the new form of the regression problem becomes

$$\mathbf{y} = \mathbf{Z}\boldsymbol{\alpha} + \boldsymbol{\varepsilon} \qquad (7)$$

where $\boldsymbol{\alpha}$ is a $k$-dimension vector and one of its unbiased estimations is $\hat{\boldsymbol{\alpha}} = (\mathbf{Z}^{\top}\mathbf{Z})^{-1}\mathbf{Z}^{\top}\mathbf{y}$. In the prediction stage, the PCR-based STP model also needs to transform the contextual feature vector $\mathbf{x}$ to $\mathbf{z}$, where $\mathbf{z} = \mathbf{A}^{\top}\mathbf{x}$. Then, the staying status interval is predicted, with a given confidence, to fall into the range given by (8). Similar to the prediction from a general multiple regression-based STP model, the inferred range of staying status intervals should also be mapped into discrete time points.
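As a brief worked step, the following shows why regressing on principal component scores sidesteps multicollinearity. The notation is an assumption introduced here for illustration and may differ from the authors' symbols: $X$ is the matrix of training vectors, $y$ the vector of staying status intervals, $A$ the orthogonal matrix whose columns are the eigenvectors of $X^{\top}X$ with eigenvalues $\lambda_1, \dots, \lambda_k$, and $Z = XA$ with columns $z_j$.

```latex
\begin{align*}
Z^{\top} Z &= A^{\top} X^{\top} X A = \Lambda
            = \operatorname{diag}(\lambda_1, \dots, \lambda_k), \\
\hat{\alpha} &= (Z^{\top} Z)^{-1} Z^{\top} y = \Lambda^{-1} Z^{\top} y,
\qquad
\hat{\alpha}_j = \frac{z_j^{\top} y}{\lambda_j}, \\
\hat{y}(x) &= (A^{\top} x)^{\top} \hat{\alpha}.
\end{align*}
```

Because $Z^{\top}Z$ is diagonal, each coefficient is estimated independently of the others. A component with a very small $\lambda_j$ corresponds to a nearly collinear direction in the original features, and its coefficient estimate is the least stable one.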


TABLE II THE COLLECTED CONTEXT DATA

TABLE III DETAILS OF EACH DATA SET

V. EXPERIMENTS

To evaluate the effectiveness and efficiency of the proposed approach for SSP and STP, we conduct the following experiments to compare several implementations and two baseline methods from multiple perspectives on real data sets collected from five mobile users. In this section, the detailed experimental results are reported and discussed.

A. Data Set

We collect the context data of five college student volunteers through their smart phones over one month. The collected data include both indoor and outdoor records, and the geographical area of the data collection basically covers the three campuses of the university. All of the smart phones are installed with the Symbian60v3 operating system and equipped with multiple sensors such as GPS, application monitoring sensors, and 3D accelerometers. The collected data include rich types of context data, as listed in Table II. Among all context sensors used in the experiments, only the battery level sensor and the inactive time sensor have numeric outputs. Note that the outputs of the time sensor are converted to numeric values to reflect the continuity of time information, though their raw format is strings. Specifically, we convert the time to an integer value. During data collection, the sampling interval of all sensors is set to 1 min. It is worth noting that, although we set a unified sampling interval, the numbers of context records collected from the volunteers may differ because they can manually turn off the context data collection software when the battery level is low. Moreover, some context records lack GPS coordinates because the GPS sensor cannot find the satellite signal when a user is indoors. For such a context record, we use the GPS coordinates of the nearest context record that has them. This approach is reasonable because in this way we take the indoor status as a stay.

Table III shows the quantity information of each volunteer's context data set, where Owner ID indicates the owner of the data set, #Context records indicates the number of context records the data set contains, #Features indicates the number of features extracted from the data set by the method described in Section III, and #Stay indicates the number of stays reflected in the data. In the experiments, we consider that users stay at the current location if and only if, during at least 10 min, they have not moved more than 100 meters from their beginning position.

B. Baseline Methods

For the proposed context-aware approach for STP, we evaluate four implementations based on different machine learning technologies. Specifically, we implement SSP models via Support Vector Machine (SVM-SSP) and Decision Tree (DT-SSP), and STP models via Multiple Linear Regression (MLR-STP) and Principal Component Regression (PCR-STP). Moreover, for each volunteer's data set, we use the data of the first three weeks as the training set and those of the last week as the testing set. Since there is no prior art for directly predicting the stay time of mobile users, to evaluate the performance of the proposed context-aware approach, we select two baselines extended from approaches for location prediction, as follows.

Pre-MP: The approach of Previous trajectory-based Mobility Prediction is proposed by Anagnostopoulos et al. [2]. Its basic idea is to extract multiple features from the previous Cell ID trajectories of mobile users and to take advantage of decision trees for predicting the user location at the next time point. It can be extended to an SSP model, i.e., we change the training objective from predicting the user location at the next time point to predicting whether the user will stay at the current location during the prediction window.

ICG-MP: The approach of Individual and Collective Geographical features-based Mobility Prediction is proposed by Calabrese et al. [6]. Its basic idea is to first partition the visiting areas of mobile users into geo-grids and then take advantage of both the mobility pattern reflected in the trajectories of individual mobile users and the collective geographical information of the candidate next location, such as nearby Point of Interest information, to build a probabilistic model of user mobility in these geo-grids. Similar to Pre-MP, this approach can be extended to an SSP model as well.

C. Effectiveness

1) Overall Results: To compare the performance of the two STP models with the other approaches, we do not evaluate the performance of predicting stay time directly but check whether they successfully predict the stay status during a given prediction window. For example, if an STP model predicts that the user will stay at the current location for 45 min, then, given a prediction window of 30 min, the model indicates that the user will stay during the prediction window. If the user indeed stays for 40 min, we regard this as a correct prediction with respect to the prediction window and do not take the 5 min error into account. From the perspective of mobile service providers, however, only a stay prediction of at least an hour is effective. According to the analysis of selection of prediction window


Fig. 5. The precision of stay predictions for all the approaches on all data sets.

Fig. 6. The recall of stay predictions for all the approaches on all data sets.

size in Section III-D, we choose a set of round numbers from 10 to 60 min as the prediction windows. Fig. 5 compares the precision of stay predictions for all the approaches on all data sets with a 50 min prediction window. Consider data set E first. All four proposed models dramatically outperform the two baselines in terms of precision. By analyzing the data set, we find that it contains a large ratio of positions where the user stays at some times but not at others. This property amplifies the drawbacks of the baselines, since neither of them considers contextual information, especially time information. For the other four data sets, the two STP models, i.e., MLR-STP and PCR-STP, still clearly outperform the baselines in terms of precision (by more than 40% on average). Note that STP is better than SSP in all cases except data set B, while the two baseline methods differ insignificantly. Define the recall of stay predictions as the ratio of correct predictions to all correct and incorrect predictions. Fig. 6 compares

Fig. 7. The F-score of stay predictions for all the approaches on all data sets.

it for all the approaches on all data sets with the same prediction window. From it we can see that they generally have comparable performance, while on particular data sets one of them may perform better than the others. Fig. 7 compares the F-score of stay predictions for all the approaches on all data sets with the same prediction window. Clearly, the two STP models outperform the two baselines on all data sets, while the two SSP models dramatically outperform the baselines on data set E but are comparable to the baselines on the rest. From the above experiments, we can see that if a user's stay pattern mainly relies on the current location, the advantage of the proposed models is observable but not significant. Otherwise, if the pattern also relies on other contextual information, as in data set E, the advantage of the proposed approaches is dramatic. Therefore, the proposed approaches are preferable since they perform well across different data sets.

We also study the impact of prediction window size on the stay prediction performance of all the approaches. From the experiments, we find that it indeed influences the stay prediction performance. To be specific, on all data sets, performance drops as the prediction window size increases, while the drop rate varies with the data set. Moreover, the two baselines are more sensitive to such change, and their performance may drop sharply when the window size exceeds a particular value. For example, Fig. 8 shows this impact for all approaches in terms of F-score on data sets A and E. Limited by space, we do not show the experimental results on all data sets since they exhibit a similar phenomenon.

D. Efficiency

Beyond the effectiveness of predicting the stay time of mobile users, the efficiency of an STP approach is important as well, since both the computing resources and the battery capacity of mobile devices are relatively limited.
Therefore, we evaluate the training memory cost, testing memory cost, and testing energy cost of all approaches. It is worth noting that we do not evaluate the training energy cost because we assume that the


Fig. 8. The impact of prediction window size on the stay prediction performance for all test approaches in terms of F-score on (a) data set A and (b) data set E.

Fig. 9. The memory cost of training each approach.

Fig. 10. The memory cost of testing each approach.

training stage can be performed while the mobile device is charging, and thus the training energy cost is not a significant metric. Ideally, all approaches should be implemented on mobile devices to simulate the practical training and testing costs. However, in our experiments, the training stage of all approaches is performed on a PC with a 2.0 GHz Core 2 CPU and 2 GB of main memory, while only the trained models are deployed on a Nokia C7-00 smart phone with 256 MB of main memory and a 600 MHz CPU. This is because, for some approaches, training cannot be performed on the test phone due to its limited memory. By using a PC to train all approaches, we can compare their training memory costs. The training algorithm of SVM-SSP is implemented in C++ based on the widely used SVM implementation LIBSVM [8], that of DT-SSP is implemented in C++ based on CART [23], those of MLR-STP and PCR-STP are implemented in MATLAB 7.0, both baselines are implemented in standard C++, and all of the trained models are implemented with Qt and deployed on the test phone. Fig. 9 compares the memory cost of training each approach. Among the four proposed models, MLR-STP dramatically outperforms the two baselines in terms of memory cost. To be specific, a reduction of about 70% is observed on four data sets, and on the remaining one, its training memory cost is comparable to that of the baselines. Moreover, DT-SSP has a training memory cost comparable to that of the baselines on all data sets, while the other two models, especially SVM-SSP, have a much higher one. Fig. 10 compares the memory cost of testing each approach. SVM-SSP requires much more memory than the other approaches (e.g., more than eight times that of DT-SSP on average), and the cost of DT-SSP is comparable to that of the baselines. The two STP models require more memory than the two baselines but, fortunately, the amount is less than 6 MB, which is acceptable for a smart phone in practice. Nowadays, the memory of a smart phone is already at the GB level, and therefore it has enough memory to run the trained algorithms. Fig. 11 compares the energy cost of testing each model during half an hour. The energy cost of the standby mode during the same period is also given in [19]. In general, the standby power of a phone without any operation is about 0.04 mW. However, during a call, the standby power rises to 0.75 mW and the average peak power is more than 1.10 mW. In our experiment, all approaches except ICG-MP are


Fig. 11. The energy cost of running each model.

relatively low in power consumption. To be specific, their standby power is less than 0.05 mW and the average peak power when making a prediction is less than 0.25 mW. By contrast, ICG-MP's standby power is close to 0.25 mW and its average peak power is close to 0.4 mW. This is because it requires GPS positioning, while the others have no such requirement. We do not show the time cost of inference for each approach because it is trivial. All of the implemented STP approaches can output the inference result within 10 ms.

VI. CONCLUSION AND FUTURE WORK

In this work, we have proposed to take advantage of contextual information for predicting the stay time of mobile users. Specifically, we have formulated two strategies for modeling the relevance between contextual information and the stay time of mobile users. They are called Stay Status Prediction and Stay Time Prediction. The first is used to predict whether a mobile user will stay at the current location at a given future time point according to the contextual information at the current time point, while the second makes a direct prediction about how long a mobile user will stay at the current location. Moreover, we have studied several machine learning models used to implement them. We have also conducted extensive experiments on real data sets to evaluate several implementations of the proposed strategies and two baseline methods in terms of both effectiveness and efficiency for STP.

The robustness of STP approaches with respect to training data is crucial for the practical application of the proposed approach. In the future, we will further study the sensitivity of the proposed approach to the quality of the historical context data, such as its time length and sparseness. We will determine the weights of features via Principal Component Analysis, which can help us remove the features that may not be useful for the final prediction. It is also interesting to study Artificial Neural Network approaches [25], [30], [33] and how to learn from negative prediction results to build better STP systems. We will also consider the utilization of fuzzy logic and related algorithms [24], [32] for modeling the process of approximate inference about the stay time.

REFERENCES

[1] G. Adomavicius and A. Tuzhilin, Context-Aware Recommender Systems. New York, NY, USA: Springer, 2011.
[2] T. Anagnostopoulos, C. Anagnostopoulos, and S. Hadjiefthymiades, “Mobility prediction based on machine learning,” in Proc. 12th IEEE Int. Conf. Mobile Data Management (MDM2011), Los Alamitos, CA, USA, 2011, pp. 27–30.
[3] T. Bao, H. Cao, E. Chen, J. Tian, and H. Xiong, “An unsupervised approach to modeling personalized contexts of mobile users,” in Proc. 10th IEEE Int. Conf. Data Mining (ICDM’10), Sydney, Australia, 2010, pp. 38–47.
[4] R. B. Braga, A. Tahir, M. Bertolotto, and H. Martin, “Clustering user trajectories to find patterns for social interaction applications,” in Web and Wireless Geographical Inform. Syst., 2012, pp. 82–97.
[5] L. Breiman, J. H. Friedman, and R. A. E. A. Olshen, Classification and Regression Trees. Boca Raton, FL, USA: CRC Press, 1984.
[6] F. Calabrese, G. Di Lorenzo, and C. Ratti, “Human mobility prediction based on individual and collective geographical preferences,” in Proc. 13th IEEE Int. Conf. Intell. Transp. Syst. (ITSC’10), 2010, pp. 312–317.
[7] H. Cao, T. Bao, Q. Yang, E. Chen, and J. Tian, “An effective approach for mining mobile user habits,” in Proc. 19th ACM Conf. Inform. Knowl. Management (CIKM2010), Toronto, ON, Canada, 2010, pp. 1677–1680.
[8] C. Chang and C. Lin, “LIBSVM: A library for support vector machines,” ACM Trans. Intell. Syst. Technol., vol. 2, no. 27, pp. 1–27, Apr. 2011. [Online]. Available: http://www.csie.ntu.edu.tw/~cjlin/libsvm
[9] I. C.-H. Lu and I. L.-C. Lu, “Robust location-aware activity recognition using wireless sensor network in an attentive home,” IEEE Trans. Autom. Sci. Eng., vol. 6, no. 4, pp. 598–598, Oct. 2009.
[10] C. Cortes and V. Vapnik, “Support-vector networks,” Mach. Learn., vol. 20, no. 2, pp. 273–297, 1995.
[11] S. Derksen and H. J. Keselman, “Backward, forward and stepwise automated subset selection algorithms: Frequency of obtaining authentic and noise variables,” Br. J. Math. Statist. Psychology, pp. 265–282, 1992.
[12] R. E. Fan, P. H. Chen, and C. J. Lin, “Working set selection using second order information for training SVM,” J. Mach. Learn. Res., pp. 1889–1918, 2005.
[13] E. Horvitz, P. Koch, C. M. Kadie, and A. Jacobs, “Coordinate: Probabilistic forecasting of presence and availability,” in Proc. 18th Conf. Uncertainty Artif. Intell. (UAI’02), 2002, pp. 224–233.
[14] T. Jolliffe, Principal Component Analysis, 2nd ed. New York, NY, USA: Springer-Verlag, 2002.
[15] J.-M. Kim, H. Baek, and Y.-T. Park, “Probabilistic graphical model based personal route prediction in mobile environment,” Appl. Math. Inform. Sci., pp. 651–659, 2012.
[16] J. Krumm and A. Brush, “Learning time-based presence probabilities,” in Proc. 9th Int. Conf. Pervasive Computing (Pervasive’11), Jun. 2011, pp. 79–96.
[17] X. Li, H. Cao, E. Chen, and J. Tian, “Learning to infer the status of heavy-duty sensors for energy-efficient context-sensing,” ACM TIST, vol. 3, no. 2, pp. 35–35, 2012.
[18] H. Ma, H. Cao, Q. Y. E. Chen, and J. Tian, “An unsupervised approach to modeling personalized contexts of mobile users,” in Proc. 21st Int. Conf. World Wide Web (WWW2012), Lyon, France, 2012.
[19] M. I. A. V. O. Wigstron, B. Lennartson, and C. Breitholtz, “High-level scheduling of energy optimal trajectories,” IEEE Trans. Autom. Sci. Eng., vol. 10, no. 1, pp. 57–64, Jan. 2013.
[20] J. R. Quinlan, C4.5: Programs for Machine Learning. San Mateo, CA, USA: Morgan Kaufmann, 1993.
[21] D. Riboni and C. Bettini, “Cosar: Hybrid reasoning for context-aware activity recognition,” in Personal and Ubiquitous Computing, 2011, pp. 271–289.
[22] J. Scott, A. B. Brush, J. Krumm, B. Meyers, M. Hazas, S. Hodges, and N. Villar, “Preheat: Controlling home heating using occupancy prediction,” in Proc. 13th ACM Int. Conf. Ubiquitous Comput. (UbiComp’11), Sep. 2011, pp. 281–290.
[23] D. Steinberg and P. Colla, “Cart-classification and regression trees,” in Salford Systems, 1997, pp. 217–224.
[24] U. Straccia, “Transportation planning with modified s-curve membership functions using an interactive fuzzy multi-objective approach,” in Appl. Soft Comput., 2011, pp. 2656–2663.
[25] W.-M. Wu, F.-T. Cheng, T.-H. Lin, D.-L. Zeng, and J.-F. Chen, “Selection schemes of dual virtual metrology outputs for enhancing prediction accuracy,” IEEE Trans. Autom. Sci. Eng., vol. 8, no. 2, pp. 311–318, Apr. 2011.
[26] M.-I. W.-H. Liau, C.-L. Wu, and I. L.-C. Fu, “Inhabitants tracking system in a cluttered home environment via floor load sensors,” IEEE Trans. Autom. Sci. Eng., vol. 5, no. 1, pp. 10–10, Jan. 2008.
[27] W. H. Greene, Econometric Analysis, 5th ed. Englewood Cliffs, USA: Prentice-Hall, 2002.
[28] W. Woerndl, J. Huebner, R. Bader, and D. Gallego-Vico, “A model for proactivity in mobile, context-aware recommender systems,” in Proc. 5th ACM Conf. Recommender Syst., 2011, pp. 273–276.
[29] J. J.-C. Ying, W.-C. Lee, T.-C. Weng, and V. S. Tseng, “Semantic trajectory mining for location prediction,” in Proc. 19th ACM SIGSPATIAL Int. Conf. Advances Geographic Inform. Syst. (GIS’11), 2011, pp. 34–43.
[30] M. I. Y. Yu, T.-M. Choi, and C.-L. Hui, “An intelligent quick prediction algorithm with applications in industrial control and loading problems,” IEEE Trans. Autom. Sci. Eng., vol. 9, no. 2, pp. 276–276, Apr. 2012.
[31] Y. Zheng, X. Xie, and W.-Y. Ma, “Geolife: A collaborative social networking service among user, location and trajectory,” IEEE Data Eng. Bull., 2010.
[32] Y. Tang, M. C. Zhou, and M. Gao, “Fuzzy-Petri-net based disassembly planning considering human factors,” IEEE Trans. Syst., Man, Cybern.: Part A, vol. 36, no. 4, pp. 718–726, Jul. 2006.
[33] K. Venkatesh, M. C. Zhou, and R. Caudill, “Design of artificial neural network for tool wear monitoring,” J. Intell. Manuf., vol. 8, no. 3, pp. 215–226, 1997.

Sen Liu received the B.S. degree in computer science from the Beijing University of Posts and Telecommunications, Beijing, China, in 2013. Currently, he is working towards the Ph.D. degree at the Information and Communication Engineering Department, Beijing University of Posts and Telecommunications. His areas of interest include artificial intelligence, machine learning, data mining, mobile Internet, and cellphone applications.

Huanhuan Cao received the B.E. and Ph.D. degrees from the University of Science and Technology of China, Hefei, China, in 2005 and 2009, respectively. He was a Senior Researcher with the Beijing Lab of Nokia Research Center. His major research interests include mobile context mining and mobile user behavior analysis. He was an intern with Microsoft Research Asia (MSRA) from October 2007 to April 2009. While working at MSRA, he conducted a series of studies on leveraging large-scale search logs to support context-aware search. Dr. Cao won the Microsoft Fellow and Chinese Academy of Sciences President Award while at MSRA.

Lei Li received the B.S. degree in information engineering and the Ph.D. degree in signal and information processing from the Beijing University of Posts and Telecommunications, Beijing, China, in 1996 and 2001, respectively. She worked at the CapInfo Ltd. Company, from 2000 to 2003. In 2003, she joined the Center for Language Technology, Macquarie University, Sydney, Australia, as a Postdoctoral Researcher, and in 2004, joined the Beijing University of Posts and Telecommunications as an Associate Professor of Computer Science and Technology. Her research interests are in artificial intelligence, natural language processing, intelligent web services, machine learning and data mining. She has over 40 publications including 30+ journal and conference papers and 4 book-chapters. Prof. Li is a member of the Chinese Association for Artificial Intelligence and Secretary of the Natural Language Understanding Committee.

MengChu Zhou (S’88–M’90–SM’93–F’03) received the B.S. degree in control engineering from Nanjing University of Science and Technology, Nanjing, China, in 1983, the M.S. degree in automatic control from the Beijing Institute of Technology, Beijing, China, in 1986, and the Ph.D. degree in computer and systems engineering from Rensselaer Polytechnic Institute, Troy, NY, USA, in 1990. He joined the New Jersey Institute of Technology (NJIT), Newark, NJ, USA, in 1990, and is a Professor of Electrical and Computer Engineering. He is presently a Professor with Tongji University, Shanghai, China. He has over 490 publications including 11 books, 230+ journal papers (majority in IEEE TRANSACTIONS), and 18 book-chapters. His research interests are in Petri nets, sensor networks, web services, big data analysis, semiconductor manufacturing, transportation and energy systems. Dr. Zhou is a Life Member of Chinese Association for Science and Technology-USA and served as its President in 1999. He is a Fellow of the American Association for the Advancement of Science (AAAS). He is the founding Editor of the IEEE Press Book Series on “Systems Science and Engineering,” Editor of the IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, and Associate Editor of the IEEE TRANSACTIONS ON SYSTEMS, MAN AND CYBERNETICS: PART A, the IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, and the IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS.
