Hindawi Publishing Corporation International Journal of Distributed Sensor Networks Volume 2015, Article ID 481705, 12 pages http://dx.doi.org/10.1155/2015/481705
Research Article
A Location Prediction Algorithm with Daily Routines in Location-Based Participatory Sensing Systems
Ruiyun Yu,1 Xingyou Xia,1 Shiyang Liao,1 and Xingwei Wang2
1 Software College, Northeastern University, Shenyang 110819, China
2 College of Information Science and Engineering, Northeastern University, Shenyang 110819, China
Correspondence should be addressed to Ruiyun Yu; [email protected]
Received 21 November 2014; Accepted 11 January 2015
Academic Editor: Joel Rodrigues
Copyright © 2015 Ruiyun Yu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Mobile node location prediction is critical to efficient data acquisition and message forwarding in participatory sensing systems. This paper proposes a social-relationship-based mobile node location prediction algorithm using daily routines (SMLPR). The SMLPR algorithm models application scenarios based on geographic locations and extracts the social relationships of mobile nodes from their mobility. To account for the dynamics of user behavior that result from daily routines, SMLPR first makes a preliminary prediction of a node's mobility with a hidden Markov model over different daily time periods and then amends the prediction using the location information of other nodes that have a strong relationship with the node. Finally, the UCSD WTD dataset is used for simulations. Simulation results show that SMLPR achieves higher prediction accuracy than proposals based on the Markov model.
1. Introduction
Participatory sensing is a recently emerged sensing technology which emphasizes that people participate in the sensing process. It enables individuals and communities to gather, analyze, and share local knowledge, to subsequently make intelligent decisions, and to offer social services. The earliest research on participatory sensing was carried out by Srivastava et al., who proposed the concept of urban sensing in a 2006 technical report [1] that discusses the system architecture and technical methods of urban sensing. The MSG (Mobile Sensing Group) laboratory at Dartmouth College also conducts research in this area, including BikeNet [2], SoundSense [3], CenceMe [4, 5], MetroSense [6], and Bubble-Sensing [7].

Participatory sensing systems rely on mobile phone users to sense and transmit data for diverse purposes in the process of monitoring or solving a particular problem. Because of their large number of users, participatory sensing systems have the potential to acquire large amounts of data from various places and to address large-scale location-based problems [8–10]. A typical location-based participatory sensing system collects and records air quality measurements to monitor the pollution at a particular location. Moreover, smartphones with Internet connectivity can also contribute to the growth of participatory sensing systems. In [11], a solution based on web services is proposed to permit interaction between a mobile application and an IPv6-compliant WSN scenario.

In participatory sensing systems, mobile devices are usually weakly connected. Because connections are uncertain, nodes sometimes have to rely on encounter opportunities to accomplish data communication and transmission. If the location of mobile nodes can be predicted several time slots ahead, the service quality and efficiency of the system will be remarkably improved.

To solve location prediction problems, human mobility has been analyzed at different geographic scales [12–14]. In [12], the limits of predictability of human mobility patterns are examined. Data collected from 50,000 anonymous mobile phone users over a period of three months was used to study human mobility patterns, and different entropy measures were adopted to estimate the potential predictability of human dynamics. Based on this analysis, the authors report a 93% potential predictability in user mobility across the whole user base.
In [13], location data from 100,000 mobile phone users, collected by tracking each person's position for six months, is investigated. This research shows that, after some corrective preprocessing, the individual mobility patterns of the test persons can be described by a single spatial probability distribution. The results thus suggest that humans generally follow simple reproducible patterns.

Based on such analyses of human mobility, different techniques built on the Markov model have been applied to location prediction for individuals. In [15], a set of discrete locations is defined using the WIFI cells on a university campus. Two kinds of location predictors, the $k$th-order Markov predictor and an LZ-based predictor, are tested on predicting the next location. The results show that the second-order Markov predictor with a certain fallback feature performs best, providing a median accuracy of 72%. The work in [16] presents a similar extended Markov predictor which takes the arrival time and residence time into account. Specifically, delay embedding is used to extract location sequences of a certain length from time series. These sequences are then used directly to predict a user's location: the prediction is obtained by comparing the last observed locations to all embedded location sequences.

Hidden Markov models (HMMs) have also been considered for predicting human mobility. The work in [17] presents a hybrid method based on HMMs. The approach clusters location histories according to their characteristics and then trains an HMM for each cluster. With HMMs, location characteristics are treated as unobservable parameters, and the effects of each individual's previous actions are also accounted for in the prediction process. A prediction accuracy of 13.85% is achieved when considering regions of roughly 1280 square meters.

Besides the Markov model and HMM, other location prediction algorithms have been used to analyze location information [18–26], such as the artificial neural network-based algorithm [18], Bayesian network-based methods [19, 20], mobile-sink-based methods [21, 22], the regression-based method [23], and mobile anchor assisted localization algorithms [24, 25]. These methods predict the future positions of nodes from different perspectives, but they all focus on the behavior of each single node. In fact, however, the behavior of a node is determined not only by its location state in the previous period but also by the social relationships of mobile users.

In this paper, a social-relationship-based mobile node location prediction algorithm using daily routines (SMLPR) is proposed. In this approach, the social relationships of nodes are used to optimize the location prediction algorithm so that it adapts better to participatory sensing applications and achieves higher prediction accuracy.

The remainder of this paper is organized as follows. The network model is described in Section 2. Section 3 specifies the mobile node location prediction algorithm. Section 4 evaluates performance through extensive simulations. Section 5 concludes the paper.
2. Network Model
Human movements often exhibit a high degree of repetition, including regular visits to certain places and regular contacts during daily activities. In this paper, a hybrid urban network model is proposed for participatory sensing systems, as illustrated in Figure 1. The map $M$ is partitioned into small regions; that is, $M$ is represented as a finite set of regions $\{a_1, \ldots, a_n\}$ such that $\bigcup_{i=1}^{n} a_i = M$ and $a_i \cap a_j = \emptyset$ for $i \neq j$. The possibility of a user moving from one region to another is represented by a directed graph. Within these regions, the sensed data needs to be aggregated to reduce network overhead and to enhance its usefulness to consumers.

The network is a mixture of an opportunistic network and a centralized infrastructure, as shown in Figure 1. The centralized infrastructure consists of a number of wireless access points (APs) and a backbone connecting the APs. The purpose of this model is to collect data from a peer-to-peer network (scenario 2 in Figure 1) or from WIFI APs in the vicinity of the consumers (scenarios 1 and 3 in Figure 1), rather than from any 3G/4G server. Mobile nodes carrying smart devices can access the network only when they walk into the transmission range of an AP, and data transmission between peers can occur only when they fall into each other's transmission range, as in normal opportunistic networks.

Definition 1 (location). Inside a geographical region, the mobile device closest to a fixed location is selected to perform data collection, and the collected data is sent to a set of bounding APs, where it is stored and pulled by the consumers. The fixed location is termed the aggregation location.

Definition 2 (location granularity). An aggregation location includes a set of APs in adjacent positions. The granularity of the location represents the location's transmission range, which is determined by the number and the scope of the APs in the same geographical region.

In this study, the historical trajectories of users connecting to APs are recorded and divided by time granularity. The trajectory of a moving user is defined as a sequence of points $\{(\mathrm{id}_j, l_1, vt_1), (\mathrm{id}_j, l_2, vt_2), \ldots, (\mathrm{id}_j, l_m, vt_m)\}$, where $\mathrm{id}_j$ is the user's identifier and location $l_i$ is represented by the set of adjacent APs to which the user is connected at time slot $vt_i$, $1 \le i \le m$.

In this paper, a mechanism for constructing aggregation locations from AP information is proposed, which groups APs into locations at different granularities. The urban scenario is transformed into a graph $G = \{E, V, W\}$, where each AP is represented by a vertex $v \in V$ and the relation between two APs is represented by an edge $e \in E$, with $E \subset V \times V$. The relationship matrix of the APs is $W = [w_{ij}]$, $i \in N$, $j \in N$, where $w_{ij}$ represents the relationship between AP $i$ and AP $j$, which is calculated by

$$w_{ij} = \frac{2 n_{ij}}{n_i + n_j}. \tag{1}$$
Figure 1: Participatory sensing scenario (map $M$ with WIFI APs, mobile nodes, and scenarios 1–3).

Figure 2: Process of location construction (panels (a), (b), and (c); threshold shown: $\lambda \geq 0.6$).
In formula (1), $n_{ij}$ is the number of times that AP $i$ and AP $j$ appear together on users' devices in the same period, counted over all users, and $n_i$ is the total number of times that AP $i$ appears (likewise $n_j$ for AP $j$).
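The counting behind formula (1) can be illustrated with a minimal sketch. The input layout (one set of AP identifiers per user per period) and the function name `ap_relationship_weights` are assumptions made for illustration; the paper does not prescribe a data structure.

```python
from collections import Counter
from itertools import combinations


def ap_relationship_weights(observations):
    """Relationship weight 2*n_ij / (n_i + n_j) for every co-occurring AP pair.

    observations: iterable of sets of AP ids; each set is assumed to hold the
    APs seen by one user's device in one period (hypothetical input layout).
    """
    n_single = Counter()  # n_i: how often AP i appears in total
    n_pair = Counter()    # n_ij: how often AP i and AP j appear together
    for aps in observations:
        aps = set(aps)
        n_single.update(aps)
        n_pair.update(combinations(sorted(aps), 2))
    return {
        (i, j): 2 * n_ij / (n_single[i] + n_single[j])
        for (i, j), n_ij in n_pair.items()
    }
```

Pairs that never co-occur are simply absent from the result, which corresponds to a weight of zero.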
Using Kruskal's greedy algorithm [27], the maximum spanning tree of the graph $G$, denoted $T$, is easily obtained. After a weight $\lambda$ is chosen as the location granularity, every edge in $T$ whose weight is less than $\lambda$ is cut, which splits the tree into separate connected components. Each connected component is regarded as a location. Figure 2 shows the process of constructing the aggregation locations; a construction with granularity 0.6 is shown in Figure 2(c).

In most WLAN datasets, a connection is recorded in the format (node, contact time, APs, signal strength), and one or more APs to which a user connects may appear in one record at the same time, which can make the user's location ambiguous. Therefore, after the set of locations in the mobility scenario has been obtained, it is also necessary to estimate which location the user belongs to in a given period. The signal strength between a user's mobile device and a WIFI AP in the WLAN dataset helps to solve this problem. A weight between the user and location $i$ is calculated by

$$\mathrm{weight}_i = \frac{\sum_{j=1}^{n} \mathrm{strength}_j}{n}. \tag{2}$$

In formula (2), $n$ is the number of APs in location $i$, and $\mathrm{strength}_j$ is the signal strength between user $A$ and AP $j$ ($\mathrm{AP}_j \in \mathrm{location}_i$). User $A$ is considered to be at location $i$ in the time period if $i$ satisfies

$$i = \arg\max \{\mathrm{weight}_i\}. \tag{3}$$
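The location construction and the user-to-location assignment described above can be sketched as follows. This is an illustrative reading, not the authors' implementation: the function names are hypothetical, signal strengths are assumed to be non-negative with larger meaning stronger, and an AP that the user does not hear is assumed to contribute a strength of 0.

```python
def _find(parent, x):
    # Union-find root lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def build_locations(aps, weights, lam):
    """Group APs into locations: maximum spanning tree, then cut edges < lam.

    aps: iterable of AP ids; weights: {(ap_i, ap_j): w_ij}; lam: granularity.
    """
    aps = list(aps)
    # Kruskal on edges sorted by descending weight gives a maximum spanning tree.
    parent = {ap: ap for ap in aps}
    tree = []
    for (i, j), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j, w))
    # Keep only tree edges with weight >= lam; each component is one location.
    parent = {ap: ap for ap in aps}
    for i, j, w in tree:
        if w >= lam:
            parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for ap in aps:
        groups.setdefault(_find(parent, ap), set()).add(ap)
    return sorted((frozenset(g) for g in groups.values()), key=sorted)


def locate_user(locations, strengths):
    """Formulas (2) and (3): pick the location with the largest mean strength.

    strengths: {ap_id: signal strength}; unheard APs count as 0 (assumption).
    """
    def weight(loc):
        return sum(strengths.get(ap, 0.0) for ap in loc) / len(loc)

    return max(locations, key=weight)
```

For example, `build_locations("ABCD", {("A", "B"): 0.9, ("B", "C"): 0.4, ("C", "D"): 0.8}, 0.6)` keeps the two strong edges and cuts the 0.4 edge, which leaves the locations {A, B} and {C, D}.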
3. Algorithm Design
This paper proposes a simple method for predicting the future locations of mobile nodes on the basis of their previous movements to other locations. The approach considers different daily time periods, reflecting the fact that users behave differently and visit different places during their daily routines. The hidden Markov model is therefore introduced to capture the dynamics of user behavior that result from daily routines. Moreover, users experience a combination of periodic movement that is geographically limited and seemingly random jumps correlated with their social networks. Social relationships can explain about 10% to 30% of all human movement, while periodic behavior explains 50% to 70% [28]. On this basis, the social relationships between nodes are also exploited in this paper to optimize and amend the location prediction result.

3.1. Hidden Markov Prediction Model. In urban scenarios, users behave differently during different periods of the day, and their daily routines may influence their trajectories. The different daily time periods (referred to as daily samples) should therefore be considered in order to obtain a more realistic representation. With filtering and the hidden Markov model, this can be done in a simple way.

The hidden Markov model (HMM) is a well-known approach for the analysis of sequential data in which the sequences are assumed to be generated by a Markov process with hidden states. Figure 3 shows the general architecture of an instantiated HMM. Each shape in the diagram represents a random variable that can adopt any number of values. The random variable $x(t)$ is the location state at time $t$. The random variable $e(t)$ is the daily sample state at time $t$ (in this paper, the observed state is called evidence). The arrows in the diagram denote conditional dependencies. From the diagram, it is clear that the conditional probability distribution of the location variable $x(t)$ at time $t$, given the values of the location variable $x$ at all earlier times, depends only on the value of $x(t-1)$; the value at time $t-2$ and the values before it have no influence. This is called the Markov property. Similarly, the value of the evidence $e(t)$ depends only on the value of the location variable $x(t)$ at time $t$.

Figure 3: Example of hidden Markov model (hidden states $X_1$–$X_3$, evidence $E_1$–$E_4$, edge labels $a_{ij}$ and $b_{ij}$).

Given the result of filtering up to time $t$, the result for $t+1$ can be computed easily from the new evidence $e_{t+1}$. The calculation consists of two parts: first, the current state distribution is projected forward from $t$ to $t+1$; second, it is updated using the new evidence $e_{t+1}$. This two-part process is expressed by

$$P(X_{t+1} \mid e_{1:t+1}) = \alpha\, P(e_{t+1} \mid X_{t+1}) \sum_{X_t} P(X_{t+1} \mid X_t)\, P(X_t \mid e_{1:t}), \tag{4}$$

where $\alpha$ is a normalizing constant that makes the probabilities sum to 1. Within the summation, $P(X_{t+1} \mid X_t)$ is the transition model and $P(X_t \mid e_{1:t})$ is the current state distribution. $P(e_{t+1} \mid X_{t+1})$ is used to update the projected distribution and is obtained directly from the statistical data.

In a participatory sensing system, the hidden Markov model can be used in each application scenario to predict the future location state of each mobile node. The prediction process includes the following steps.

3.1.1. Preparatory Stage. At the beginning, the system needs to collect enough information on user movement trajectories to construct the Markov chain, so a "warm-up" stage is assumed in the prediction system. During the preparatory stage, the system only collects historical data and cannot provide any predictions. The warm-up stage can last for one day or one week, depending on the amount of information collected.

3.1.2. Determination of State Set. The location elements in the collected data are extracted and denoted as set $L$. As $L$ contains a number of location elements, the locations with higher visiting frequency are chosen as the state set of the system, denoted as set $E$. If there are $m$ locations in the current scene, the state space is $E = \{X_1, X_2, \ldots, X_m\}$, and location $i$ is the $i$th state $X_i$ of the Markov process.
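The state set determination of Section 3.1.2 can be sketched in a few lines. The paper does not state how the cutoff for "higher visiting frequency" is chosen; keeping the `m` most visited locations is an assumption made here, and `select_state_set` is a hypothetical name.

```python
from collections import Counter


def select_state_set(visits, m):
    """Return the m most frequently visited locations, most frequent first.

    visits: sequence of location ids extracted from the collected data (set L).
    The top-m cutoff is an illustrative assumption, not taken from the paper.
    """
    counts = Counter(visits)
    return [location for location, _ in counts.most_common(m)]
```

The position of a location in the returned list can then serve as its state index $i$ for $X_i$.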
3.1.3. Discretization of Data Set. Statistics are gathered on the data of all users related to the state set $E$. The data set of each user is then discretized into fixed time periods, so that the set after discretization is

$$\{(t_k, X_i)\}, \quad k = 1, 2, 3, \ldots, \quad i \in \{1, 2, 3, \ldots, m\}. \tag{5}$$

3.1.4. Calculation of 1-Order Transition Probability Matrix. Let $n_{ij}$ be the number of times that node $A$ departs from location $i$ for location $j$. The probability of node $A$ departing from location $i$ for location $j$ is then

$$p_{ij} = \frac{n_{ij}}{n}, \tag{6}$$

where $n$ is the total number of times that node $A$ departs from location $i$ to visit other locations in the data set. With $m$ locations in the set, an $m \times m$ transition probability matrix is generated as

$$P = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1m} \\ p_{21} & p_{22} & \cdots & p_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ p_{m1} & p_{m2} & \cdots & p_{mm} \end{bmatrix}. \tag{7}$$

Let $p_j^{(l)}$ be the probability that the node is in state $X_j$ at the initial moment $l$. Computing this probability for each state gives the initial distribution of the Markov chain:

$$P(l) = \left(p_1^{(l)}, p_2^{(l)}, \ldots, p_m^{(l)}\right). \tag{8}$$

For example, if the initial state is $X_2$, the initial distribution is $P(l) = (0, 1, 0, \ldots, 0)$, and the absolute distribution at time $l+1$ (written out here for five states) is

$$P(l+1) = P(l)\,P = \left(p_1^{(l+1)}, p_2^{(l+1)}, p_3^{(l+1)}, p_4^{(l+1)}, p_5^{(l+1)}\right). \tag{9}$$

3.1.5. Update the Result from the New Evidence. The different daily time periods (daily samples) are regarded as the evidence variable $e$, so the set of evidences is defined as $EV = \{e^r\}$, $r = 1, 2, \ldots, d$, where $d$ is the total number of daily time periods. For example, if the day is broken into four daily samples ($d = 4$), then

$$EV = \{\text{a.m.}, \text{noon}, \text{p.m.}, \text{evening}\}. \tag{10}$$

For each daily sample $r$, a diagonal matrix $O(e^r)$ is defined as

$$O(e^r) = [o_{ij}], \quad o_{ij} = \begin{cases} 0, & \text{if } i \neq j, \\ P(e^r \mid X = i), & \text{if } i = j, \end{cases} \tag{11}$$

where $P(e^r \mid X = i) = P(e^r, X = i)/P(X = i) = n_i^r / n_i$, $n_i^r$ is the number of times that the node arrives at location $i$ in daily sample $r$, and $n_i$ is the number of times that the node arrives at location $i$. Based on the above, the probability of the node arriving at location $i$ at the next time slot $l+1$ is calculated using

$$P(l+1) = \alpha\, O(e^r_{l+1})\, P(l+1) = \alpha\, O(e^r_{l+1})\, P(l)\, P. \tag{12}$$

Here the diagonal matrix $O(e^r_{l+1})$ scales the $i$th component of the projected distribution $P(l)\,P$ by $P(e^r_{l+1} \mid X = i)$, and $\alpha$ renormalizes the result. The state $X_j$ obtained by the system at time $l+1$ is $X_j = \arg\max \{p_j^{(l+1)}\}$. The formula above is a one-step prediction; the prediction of the state at $t+k+1$ follows recursively from the prediction for $t+k$, so the state $X_{t+k+1}$ is obtained by

$$X_{t+k+1} = \arg\max \{P(X_{t+k+1})\}. \tag{13}$$
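Steps 3.1.4 and 3.1.5 can be put together in a short sketch under stated assumptions: states and daily samples are encoded as integer indices, every consecutive pair in the discretized sequence counts as a transition (including staying in place), and a state with no observed departures gets a uniform row. These choices and all function names are illustrative; the paper does not specify them.

```python
def transition_matrix(states, m):
    """Formulas (6)-(7): p_ij = n_ij / n from a discretized state sequence."""
    counts = [[0] * m for _ in range(m)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        n = sum(row)
        # Uniform fallback for states never departed from (assumption).
        matrix.append([c / n if n else 1.0 / m for c in row])
    return matrix


def evidence_likelihoods(states, samples, m, d):
    """Formula (11): diagonal of O(e^r), i.e. n_i^r / n_i, for each sample r."""
    n_i = [0] * m
    n_ir = [[0] * m for _ in range(d)]
    for x, r in zip(states, samples):
        n_i[x] += 1
        n_ir[r][x] += 1
    return [
        [n_ir[r][i] / n_i[i] if n_i[i] else 0.0 for i in range(m)]
        for r in range(d)
    ]


def predict_next(dist, matrix, o_diag):
    """Formula (12): project dist through the matrix, weight by evidence, normalize."""
    m = len(dist)
    projected = [sum(dist[i] * matrix[i][j] for i in range(m)) for j in range(m)]
    weighted = [projected[j] * o_diag[j] for j in range(m)]
    total = sum(weighted)
    if total == 0:
        return projected  # evidence excludes every state; keep the projection
    return [w / total for w in weighted]


def most_likely_state(dist):
    """Formula (13): index of the most probable state."""
    return max(range(len(dist)), key=lambda j: dist[j])
```

Calling `predict_next` repeatedly, each time with the diagonal for the daily sample of the following slot, gives the recursive multi-step prediction described in the text.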
3.2. Social-Aware Prediction Optimization. In a participatory sensing system, a mobile node can be a social node carrying data acquisition equipment. The social relationship is therefore used to estimate the future locations of mobile nodes and to optimize the prediction result of the hidden Markov model.

The aim here is to capture the evolution of social interactions in the different periods of time (daily samples) over consecutive days by computing social strength from the average duration of contacts. Figure 4 shows how social interaction (from the point of view of user $A$) varies during a day. For instance, it indicates a daily sample (8 a.m.–12 p.m.) over which the social strength of user $A$ to users $B$ and $C$ is much stronger (less intermittent line) than the strength to users $D$, $E$, and $F$. Figure 4 aims to show the dynamics of a social network over a one-day period, where different social structures lead to different behavior when a user moves towards the social community to which the user is related.

Figure 4: Contacts a user $A$ has with a set of users in different daily samples $\Delta T_i$.

As illustrated in Figure 4, the total contact time of mobile nodes $A$ and $B$ during a daily sample $\Delta T_i$ in a day $k$ is

$$M_i^k = \sum_{c=1}^{n} \left(t_c^e - t_c^s\right), \quad c = 1, 2, 3, \ldots, n, \tag{14}$$

where $n$ is the number of contacts in $\Delta T_i$, $t_c^s$ is the start time of the $c$th contact of mobile nodes $A$ and $B$, and $t_c^e$ is the end time of the $c$th contact. The social strength between any pair of nodes $A$ and $B$ in $\Delta T_i$ is then

$$W(A, B)_i = \frac{\sum_{k=1}^{m} M_i^k}{m \times \Delta T_i}, \tag{15}$$

where $m$ is the total number of days in the historical record. According to formula (15), the social relationship matrix of the nodes in $\Delta T_i$ can be obtained. On the basis of the relationship matrix, mobile nodes can be partitioned into communities, which group closely related nodes into subgroups. Since only the users' proximity is taken into account, partition-based clustering methods such as $k$-means and fuzzy $c$-means are not applicable. A hierarchical clustering method, namely, complete linkage clustering [29], is therefore used as the community partition algorithm.

Suppose that social relationships are used to calculate the probability of node $A$ arriving at location $i$ ($i = 1, 2, \ldots, m$) in the next period. Given that node $A$ belongs to community $C$, and that the set of other nodes belonging to $C$ which are at location $i$ in the current time slot is $S = \{S_1, \ldots, S_j, \ldots, S_n\}$ with $S \subseteq C$, the conditional probability gives

$$P_i(A \mid S_j) = \frac{P_i(A, S_j)}{P_i(S_j)}, \quad j = 1, \ldots, n, \tag{16}$$

where $P_i(A \mid S_j)$ is the probability of node $A$ arriving at location $i$ given that node $S_j$ is already at location $i$; $P_i(S_j)$ is the probability that node $S_j$ stays at location $i$, which is obtained from the Markov model; and $P_i(A, S_j)$ is the encounter probability of node $A$ and node $S_j$ at location $i$, defined as

$$P_i(A, S_j) = \frac{f_i(A, S_j)}{\sum_{i=1}^{m} f_i(A, S_j)}, \tag{17}$$

where $f_i(A, S_j)$ is the number of encounters at location $i$. Given the relationship weight of node $A$ and node $S_j$ as $W(A, S_j) = \rho_j$, the probability of node $A$ arriving at location $i$ in the next time slot is

$$\lambda_j = \frac{\rho_j}{\sum_{j=1}^{n} \rho_j}, \quad P_i(A) = \sum_{j=1}^{n} \lambda_j P_i(A \mid S_j), \tag{18}$$

where $\lambda_j$ is the normalized weight of each conditional probability, so that $\sum_{j=1}^{n} \lambda_j = 1$. From the location distribution of all the nodes belonging to $C$, the probability of node $A$ arriving at each location can be obtained. This is combined with the prediction result of the hidden Markov model using the weight formula (19) to calculate the probability distribution of node $A$ over all locations in the location set; the location with the maximum visiting probability is the output of the prediction algorithm:

$$P_i = P_i^{\mathrm{HMM}} + d \left(P_i^{\mathrm{social}} - P_i^{\mathrm{HMM}}\right), \tag{19}$$

where $P_i^{\mathrm{HMM}}$ is the prediction probability of state $X_i$ from the hidden Markov model, $P_i^{\mathrm{social}}$ is the prediction probability of location $i$ based on social relationships, and $d$ is the damping factor, defined as the probability that the social relation between the nodes helps improve the accuracy of the prediction. The higher the value of $d$, the more the algorithm accounts for the social relation between the nodes. Using social relationships to optimize the prediction result is beneficial: it makes the transition probability matrix sparse and improves the accuracy of the prediction model.

Figure 5: Quantity of locations (x-axis: $\lambda$ value; y-axis: location quantity; annotations mark the total number of APs and the value used in the experiments).
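Formulas (14) to (19) can be traced with a small sketch. The data layout (contact intervals grouped by day, encounter counts keyed by peer and location) and the function names are assumptions for illustration, the community partition by complete linkage clustering is taken as already done, and all denominators are assumed to be positive. Note that the ratio in formula (16) is computed as written in the text and is not clipped to 1.

```python
def social_strength(contacts_by_day, sample_length):
    """Formulas (14)-(15): mean share of a daily sample spent in contact.

    contacts_by_day: one list per recorded day, each holding (start, end)
    contact times of the pair within the daily sample; sample_length is the
    duration of the daily sample in the same time unit.
    """
    m = len(contacts_by_day)
    total = sum(end - start for day in contacts_by_day for start, end in day)
    return total / (m * sample_length)


def social_probability(location, encounters, stay_prob, rho):
    """Formulas (16)-(18): weighted sum of P_i(A | S_j) over peers S_j.

    encounters[j][i]: encounter count f_i(A, S_j); stay_prob[j]: P_i(S_j) for
    the queried location; rho[j]: relationship weight W(A, S_j).
    """
    rho_total = sum(rho.values())
    p = 0.0
    for j, weight in rho.items():
        counts = encounters[j]
        joint = counts.get(location, 0) / sum(counts.values())  # formula (17)
        conditional = joint / stay_prob[j]                      # formula (16)
        p += (weight / rho_total) * conditional                 # formula (18)
    return p


def combine(p_hmm, p_social, d):
    """Formula (19): move the HMM estimate towards the social one by factor d."""
    return [h + d * (s - h) for h, s in zip(p_hmm, p_social)]
```

With `d = 0` the combined distribution equals the HMM prediction, and with `d = 1` it equals the social prediction, which matches the role of the damping factor described above.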
4. Experimental Analyses
4.1. Simulation Configuration. In this paper, the experimental data comes from the dataset provided by Wireless Topology Discovery (WTD) [30], from which two months of data, 13,215,412 records in total, is chosen to simulate the prediction algorithm. There are 275 nodes and 524 APs (access points) in the dataset. The number of locations into which APs are clustered according to the vicinity of AP positions is shown in Figure 5. Figure 5 shows that as the granularity $\lambda$ increases, the number of locations in the resulting scenario also increases. When $\lambda$ is 1, the number of locations equals the total number of APs. The location granularity $\lambda$ is set to 0.5 in the following experiments.

4.2. Similar User Clustering. To predict the future location of mobile nodes using social relationships, the social network structure of the system should be considered first. Based on formula (15), the relation strength between any pair of nodes $A$ and $B$ is calculated for the different daily samples (a.m., noon, p.m., and evening), which yields the social network structures of the dataset illustrated in Figure 6. A hierarchical clustering method, complete linkage clustering, is used to cluster mobile users. Figure 6(a) shows the social network clustering result in the a.m. period, and the clustering structures for noon, p.m., and evening are illustrated in Figures 6(b), 6(c), and 6(d), respectively.

Figure 6: Social network structures of the dataset in different daily samples: (a) a.m., (b) noon, (c) p.m., (d) evening.

4.3. Prediction Accuracy. To evaluate the accuracy of the prediction model, the processed node locations are divided into two parts: 50% of the original data is used to train the Markov model, and the rest is used as test cases for the prediction model. The prediction precision $P_{\mathrm{result}}$ is

$$P_{\mathrm{result}} = \frac{\sum_{i=1}^{n} \mathrm{accuracy}_i}{n}. \tag{20}$$

In formula (20), $n$ is the number of predictions, and $\mathrm{accuracy}_i$ is the result of the $i$th prediction:

$$\mathrm{accuracy}_i = \begin{cases} 1, & \text{when the result is right}, \\ 0, & \text{when the result is wrong}. \end{cases} \tag{21}$$
First, the training data set is used to train the prediction models, which include the standard Markov model (SMM) and the daily-routine-based prediction model (MLPR). The test cases are then used to verify each of the two models. The prediction accuracies of the two models are shown in Figure 7, where Figure 7(a) shows the prediction accuracy of nodes 1 to 92, Figure 7(b) that of nodes 93 to 184, and Figure 7(c) that of nodes 185 to 275. Figure 7 indicates that the daily-routine-based mobile node location prediction algorithm (MLPR) performs better than the standard Markov model. This shows that taking daily routines into account raises prediction accuracy.

Figure 7: Prediction precision of SMM and MLPR (x-axis: node number; y-axis: prediction accuracy): (a) nodes 1–92, (b) nodes 93–184, (c) nodes 185–275.

Next, the proposed social-relationship-based mobile node location prediction algorithm using daily routines (SMLPR) is compared with O2MM and SMLPN. Among these algorithms, the second-order Markov predictor (O2MM) has the best performance among order-$k$ Markov predictors [15], and the social-relationship-based mobile node location prediction algorithm (SMLPN) performs as well as or better than O2MM, as shown in the previous work [31]. The comparison is shown in Figure 8, which indicates that SMLPR, by combining daily routines with social relationships, achieves higher accuracy than O2MM and SMLPN.

Figure 9 shows the number of users in different precision ranges for SMM, O2MM, SMLPN, and SMLPR; SMLPR has the largest number of nodes in the higher precision ranges. For instance, the number of nodes with accuracy greater than 90% is 198 for SMLPR and 114 for O2MM, whereas SMM achieves only 55 nodes.

Lastly, the performance of these algorithms is summarized in Table 1. The accuracy of SMLPR (0.9014) is roughly 30% higher than that of the standard Markov model (0.6164) and nearly 10% higher than that of the second-order Markov model (0.8275). SMLPR also outperforms SMLPN (0.8488). In terms of cost, Table 1 shows that the time complexity of SMLPR is $O(N)$ whereas that of O2MM is $O(N^2)$, and the memory demand of SMLPR is $O(N^2)$ whereas that of O2MM is $O(N^3)$. SMLPR thus achieves better performance than the order-2 Markov predictor at much lower expense and is more practical than the order-2 Markov predictor in the WLAN scenario.

Table 1: The algorithm performance comparison.
Algorithm   Prediction accuracy   Time complexity   Storage space
SMM         0.6164                O(N)              O(N^2)
O2MM        0.8275                O(N^2)            O(N^3)
SMLPN       0.8488                O(N)              O(N^2)
SMLPR       0.9014                O(N)              O(N^2)

4.4. Impact of Location Granularity. In a location-based mobility scenario, location granularity may have a significant influence on prediction accuracy. To evaluate this impact, the algorithms' performance is tested while adjusting the granularity value $\lambda$; the result is shown in Figure 10. As $\lambda$ increases, the number of locations in the scenario grows, and the average accuracies of the four algorithms decrease. SMM and O2MM are affected more strongly by the location factor, and their accuracy reduces approximately to 25%. SMLPR shows a relatively moderate downward trend, and the effect of location granularity on SMLPR is not very obvious.
Figure 8: Prediction precision of O2MM, SMLPN, and SMLPR. Panels: (a) nodes 1–92; (b) nodes 93–184; (c) nodes 185–275. Axes: node number (horizontal), prediction accuracy (vertical). Series: SMLPN, SMLPR, O2MM.
5. Conclusion In this paper, the influence of the opportunistic characteristics of participatory sensing systems is introduced, and the problems of sensing nodes, such as intermittent connection, limited communication period, and heterogeneous distribution, are analyzed. This paper focuses on the mobility model of nodes in participatory sensing systems and proposes a mobile node location prediction algorithm that uses users' daily routines and the social relationships between mobile nodes. From the historical trajectories of mobile nodes, a state transition matrix is constructed with locations as the transition states, and a hidden Markov model is used to predict the mobile node's location over a given duration. Meanwhile, the social relationships between nodes are exploited to optimize and amend the prediction model. The prediction model is tested on the WTD data set and shown to be effective.
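The construction of a state transition matrix per daily period, summarized above, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a plain first-order Markov chain rather than the hidden Markov model of the paper, it omits the social-relationship amendment, and the period boundaries in `PERIODS`, the function names, and the sample visits are all hypothetical.

```python
from collections import defaultdict

# Hypothetical daily periods (hour ranges); the paper's actual partition
# of the day is not reproduced here.
PERIODS = {"night": range(0, 8), "day": range(8, 18), "evening": range(18, 24)}


def period_of(hour):
    """Map an hour of day (0-23) to the name of its daily period."""
    for name, hours in PERIODS.items():
        if hour in hours:
            return name
    raise ValueError(f"hour out of range: {hour}")


def build_transition_matrices(visits):
    """Build one row-normalized transition matrix per daily period.

    visits: chronological list of (hour, location) pairs for a single node.
    Returns {period: {src: {dst: probability}}}; each row sums to 1.
    A transition is assigned to the period in which it starts.
    """
    counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for (hour, src), (_, dst) in zip(visits, visits[1:]):
        counts[period_of(hour)][src][dst] += 1
    return {
        period: {
            src: {dst: n / sum(row.values()) for dst, n in row.items()}
            for src, row in rows.items()
        }
        for period, rows in counts.items()
    }


def predict_next(matrices, hour, current):
    """Return the most probable next location for this period, or None if unseen."""
    row = matrices.get(period_of(hour), {}).get(current)
    if not row:
        return None
    return max(row, key=row.get)


if __name__ == "__main__":
    # Made-up visit log for one node.
    visits = [(9, "office"), (12, "cafeteria"), (13, "office"), (19, "gym"),
              (21, "home"), (9, "office"), (12, "cafeteria"), (19, "home")]
    matrices = build_transition_matrices(visits)
    print(predict_next(matrices, 10, "office"))
    print(predict_next(matrices, 20, "gym"))
```

Keeping a separate matrix for each period lets the same current location lead to different predictions at different times of day, which is the intuition behind using daily routines.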
Conflict of Interests The authors declare that there is no conflict of interests regarding the publication of this paper.
Figure 9: The number of users in different precision range. Axes: prediction accuracy threshold in % (horizontal, >10 to >90), number of users (vertical). Series: SMM, O2MM, SMLPN, SMLPR.
Figure 10: The influence of location granularity to prediction accuracy. Axes: 𝜆 value (horizontal, 0.1 to 1), accuracy rate (vertical). Series: SMM, O2MM, SMLPN, SMLPR.
Acknowledgments This work is supported by the National Natural Science Foundation of China under Grant no. 61272529; the National Science Foundation for Distinguished Young Scholars of China under Grant nos. 61225012 and 71325002; Ministry of Education-China Mobile Research Fund under Grant no. MCM20130391; the Specialized Research Fund of the Doctoral Program of Higher Education for the Priority Development Areas under Grant no. 20120042130003; the Fundamental Research Funds for the Central Universities under Grant nos. N120104001 and N130817003; and Liaoning BaiQianWan Talents Program under Grant no. 2013921068.
References
[1] M. Srivastava, M. Hansen, J. Burke et al., “Wireless urban sensing systems,” Tech. Rep. 65, Center for Embedded Networked Sensing at UCLA, 2006.
[2] S. B. Eisenman, E. Miluzzo, N. D. Lane, R. A. Peterson, G.-S. Ahn, and A. T. Campbell, “BikeNet: a mobile sensing system for cyclist experience mapping,” ACM Transactions on Sensor Networks, vol. 6, no. 1, article 6, 2009.
[3] H. Lu, W. Pan, N. D. Lane, T. Choudhury, and A. T. Campbell, “SoundSense: scalable sound sensing for people-centric applications on mobile phones,” in Proceedings of the 7th ACM International Conference on Mobile Systems, Applications, and Services (MobiSys ’09), pp. 165–178, Kraków, Poland, June 2009.
[4] E. Miluzzo, N. D. Lane, K. Fodor et al., “Sensing meets mobile social networks: the design, implementation and evaluation of the CenceMe application,” in Proceedings of the 6th ACM Conference on Embedded Networked Sensor Systems (SenSys ’08), pp. 337–350, Raleigh, NC, USA, November 2008.
[5] E. Miluzzo, N. Lane, S. Eisenman, and A. Campbell, “CenceMe – injecting sensing presence into social networking applications,” in Smart Sensing and Context, G. Kortuem, J. Finney, R. Lea, and V. Sundramoorthy, Eds., vol. 4793 of Smart Sensing and Context, pp. 1–28, 2007.
[6] S. B. Eisenman, N. D. Lane, E. Miluzzo et al., “MetroSense project: people-centric sensing at scale,” in Proceedings of the Workshop on World-Sensor-Web, pp. 6–11, Boulder, Colo, USA, 2006.
[7] H. Lu, N. D. Lane, S. B. Eisenman, and A. T. Campbell, “Bubble-sensing: binding sensing tasks to the physical world,” Pervasive and Mobile Computing, vol. 6, no. 1, pp. 58–71, 2010.
[8] L. Deng and L. P. Cox, “Live compare: grocery bargain hunting through participatory sensing,” in Proceedings of the 10th Workshop on Mobile Computing Systems and Applications (HotMobile ’09), Santa Cruz, Calif, USA, February 2009.
[9] E. Kanjo, “NoiseSPY: a real-time mobile phone platform for urban noise monitoring and mapping,” Mobile Networks and Applications, vol. 15, no. 4, pp. 562–574, 2010.
[10] A. J. Perez, M. A. Labrador, and S. J. Barbeau, “G-Sense: a scalable architecture for global sensing and monitoring,” IEEE Network, vol. 24, no. 4, pp. 57–64, 2010.
[11] L. M. L. Oliveira, J. J. P. C. Rodrigues, A. G. F. Elias, and G. Han, “Wireless sensor networks in IPv4/IPv6 transition scenarios,” Wireless Personal Communications, vol. 78, no. 4, pp. 1849–1862, 2014.
[12] C. Song, Z. Qu, N. Blumm, and A.-L. Barabási, “Limits of predictability in human mobility,” Science, vol. 327, no. 5968, pp. 1018–1021, 2010.
[13] M. C. González, C. A. Hidalgo, and A.-L. Barabási, “Understanding individual human mobility patterns,” Nature, vol. 453, no. 7196, pp. 779–782, 2008.
[14] S.-M. Qin, H. Verkasalo, M. Mohtaschemi, T. Hartonen, and M. Alava, “Patterns, entropy, and predictability of human mobility and life,” PLoS ONE, vol. 7, no. 12, Article ID e51353, 2012.
[15] L. Song, D. Kotz, R. Jain et al., “Evaluating location predictors with extensive Wi-Fi mobility data,” in Proceedings of the 23rd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM ’04), vol. 2, pp. 1414–1424, 2004.
[16] S. Scellato, M. Musolesi, C. Mascolo, V. Latora, and A. T. Campbell, “NextPlace: a spatio-temporal prediction framework for pervasive systems,” in Pervasive Computing, vol. 6696 of Lecture Notes in Computer Science, pp. 152–169, Springer, Berlin, Germany, 2011.
[17] W. Mathew, R. Raposo, and B. Martins, “Predicting future locations with hidden Markov models,” in Proceedings of the 14th International Conference on Ubiquitous Computing (UbiComp ’12), pp. 911–918, September 2012.
[18] M. C. Mozer, “The neural network house: an environment that adapts to its inhabitants,” in Proceedings of the AAAI Spring Symposium, pp. 110–114, Stanford, Calif, USA, 1998.
[19] H. A. Karimi and X. Liu, “A predictive location model for location-based services,” in Proceedings of the 11th ACM International Symposium on Advances in Geographic Information Systems (GIS ’03), pp. 126–133, New Orleans, La, USA, November 2003.
[20] J. D. Patterson, L. Liao, D. Fox et al., “Inferring high-level behavior from low-level sensors,” in Proceedings of the 5th Annual Conference on Ubiquitous Computing (UbiComp ’03), pp. 73–89, Seattle, Wash, USA, 2003.
[21] C. Zhu, Y. Wang, G. Han, J. J. P. C. Rodrigues, and J. Lloret, “LPTA: location predictive and time adaptive data gathering scheme with mobile sink for wireless sensor networks,” The Scientific World Journal, vol. 2014, Article ID 476253, 13 pages, 2014.
[22] C. Zhu, Y. Wang, G. Han, J. J. P. C. Rodrigues, and H. Guo, “A location prediction based data gathering protocol for wireless sensor networks using a mobile sink,” in Proceedings of the 2nd Smart Sensor Networks and Algorithms (SSPA ’14), Co-Located with 13th International Conference on Ad Hoc, Mobile, and Wireless Networks (Ad Hoc ’14), Benidorm, Spain, June 2014.
[23] Y.-B. He, S.-D. Fan, and Z.-X. Hao, “Whole trajectory modeling of moving objects based on MOST model,” Computer Engineering, vol. 34, no. 16, pp. 41–43, 2008.
[24] G. Han, C. Zhang, J. Lloret, L. Shu, and J. J. P. C. Rodrigues, “A mobile anchor assisted localization algorithm based on regular hexagon in wireless sensor networks,” The Scientific World Journal, vol. 2014, Article ID 219371, 13 pages, 2014.
[25] G. Han, H. Xu, J. Jiang, L. Shu, and N. Chilamkurti, “The insights of localization through mobile anchor nodes in wireless sensor networks with irregular radio,” KSII Transactions on Internet and Information Systems, vol. 6, no. 11, pp. 2992–3007, 2012.
[26] G. Han, C. Zhang, T. Liu, and L. Shu, “MANCL: a multianchor nodes cooperative localization algorithm for underwater acoustic sensor networks,” Wireless Communications and Mobile Computing. In press.
[27] J. Kruskal, “On the shortest spanning subtree of a graph and the traveling salesman problem,” Proceedings of the American Mathematical Society, vol. 7, pp. 48–50, 1956.
[28] E. Cho, S. A. Myers, and J. Leskovec, “Friendship and mobility: user movement in location-based social networks,” in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’11), pp. 1082–1090, ACM, August 2011.
[29] G. Punj and D. W. Stewart, “Cluster analysis in marketing research: review and suggestions for application,” Journal of Marketing Research, vol. 20, no. 2, pp. 134–148, 1983.
[30] M. McNett and G. M. Voelker, “UCSD Wireless Topology Discovery Project [EB/OL],” 2013, http://www.sysnet.ucsd.edu/wtd/wtd.html.
[31] R. Ru and X. Xia, “Social-relationship-based mobile node location prediction algorithm in participatory sensing systems,” Chinese Journal of Computers, vol. 35, no. 6, 2014.