A Location Prediction Algorithm with Daily Routines in Location-Based ...

Hindawi Publishing Corporation International Journal of Distributed Sensor Networks Volume 2015, Article ID 481705, 12 pages http://dx.doi.org/10.1155/2015/481705

Research Article

A Location Prediction Algorithm with Daily Routines in Location-Based Participatory Sensing Systems

Ruiyun Yu,1 Xingyou Xia,1 Shiyang Liao,1 and Xingwei Wang2

1 Software College, Northeastern University, Shenyang 110819, China
2 College of Information Science and Engineering, Northeastern University, Shenyang 110819, China

Correspondence should be addressed to Ruiyun Yu; [email protected]

Received 21 November 2014; Accepted 11 January 2015

Academic Editor: Joel Rodrigues

Copyright © 2015 Ruiyun Yu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Mobile node location prediction is critical to efficient data acquisition and message forwarding in participatory sensing systems. This paper proposes a social-relationship-based mobile node location prediction algorithm using daily routines (SMLPR). The SMLPR algorithm models application scenarios based on geographic locations and extracts the social relationships of mobile nodes from their mobility. To account for the dynamism of users' behavior resulting from their daily routines, the SMLPR algorithm first predicts a node's mobility with a hidden Markov model for different daily periods of time and then amends the prediction results using the location information of other nodes that have a strong relationship with the node. Finally, the UCSD WTD dataset is used for simulations. Simulation results show that SMLPR achieves higher prediction accuracy than proposals based on the Markov model.

1. Introduction

Participatory sensing is a recently emerged sensing technology which emphasizes that people participate in the sensing process. It enables individuals and communities to gather, analyze, and share local knowledge, to subsequently make intelligent decisions, and to offer social services. The earliest research on participatory sensing was carried out by Srivastava et al., who proposed the concept of urban sensing in a 2006 technical report [1] that discusses the system architecture and technical methods of urban sensing. The MSG (Mobile Sensing Group) laboratory at Dartmouth College also conducts research in this area, including BikeNet [2], SoundSense [3], CenceMe [4, 5], MetroSense [6], and Bubble-Sensing [7].

Participatory sensing systems rely on mobile phone users to sense and transmit data for diverse purposes while monitoring or solving a particular problem. Given their large number of users, participatory sensing systems have the potential to acquire large amounts of data from various places and to address large-scale location-based problems [8–10]. A typical example of a location-based participatory sensing system collects and records air quality measurements to monitor the pollution of a particular location. What is more, smartphones with Internet connectivity can also contribute to the growth of participatory sensing systems. In [11], a solution based on web services is proposed to permit interaction between a mobile application and an IPv6-compliant WSN scenario.

In participatory sensing systems, mobile devices are usually weakly connected. Because connections are uncertain, nodes sometimes need encounter opportunities to accomplish data communication and transmission. If the location of mobile nodes can be predicted several time slots ahead, the service quality and efficiency of the system will be remarkably improved.

In order to solve the location prediction problem, human mobility has been analyzed at different geographic scales [12–14]. In [12] the limits of predictability provided by humans' mobility patterns are examined. Data collected from 50,000 anonymous mobile phone users over a period of three months is used to study human mobility patterns, and different entropy measures are adopted to estimate the potential predictability in human dynamics. Based on this analysis, a 93% potential predictability in user mobility across the whole user base is reported.

In [13] the location data from 100,000 mobile phone users, collected by tracking each person's position for six months, is investigated. This research shows that the individual mobility patterns of the test persons can be described by a single spatial probability distribution after some corrective preprocessing. Thus the results of this study suggest that humans generally follow simple reproducible patterns.

Based on the human mobility analysis, different techniques based on the Markov model have been applied to location prediction problems of human individuals. In [15], a set of discrete locations is defined using the WIFI cells on a university campus. Two different kinds of location predictors, the $k$th order Markov predictor and an LZ-based predictor, are tested in predicting the next location. Based on the test results, the research shows that the second order Markov predictor with a certain fallback feature performs best and provides a median accuracy of 72%. Literature [16] presents a similar extended Markov predictor, which takes the arrival time and residence time into account. Specifically, delay embedding is used to extract location sequences of a certain length from time series. These sequences are then used directly to predict a user's location, and the prediction result is obtained by comparing the last observed locations to all embedded location sequences.

Meanwhile, hidden Markov models (HMMs) are also considered to predict human mobility. Literature [17] presents a hybrid method on the basis of hidden Markov models. The proposed approach clusters location histories according to their characteristics and then trains an HMM for each cluster. Based on HMMs, location characteristics are considered as unobservable parameters, and the effects of each individual's previous actions are also accounted for in the process of prediction. Finally, a prediction accuracy of 13.85% can be achieved when considering regions of roughly 1280 square meters.

Besides the Markov model and HMM, there are other location prediction algorithms that are used to analyze location information in literature [18–26], such as the artificial neural network-based algorithm [18], Bayesian network-based methods [19, 20], mobile-sink-based methods [21, 22], the regression-based method [23], and mobile anchor assisted localization algorithms [24, 25].

These methods predict future positions of nodes from different perspectives, and all of them focus on the behavior of each single node. However, the behavior of a node is in fact decided not only by its location state in the former period but also by the social relationships of mobile users. In this paper, a social-relationship-based mobile node location prediction algorithm using daily routines (SMLPR) is proposed. In this approach, the social relationship of nodes is used to optimize the location prediction algorithm so that it can better adapt to participatory sensing applications and improve the prediction accuracy.

The remainder of this paper is organized as follows: the network model is illustrated in Section 2. Section 3 specifies the mobile node location prediction algorithm. Extensive simulations for performance evaluation are presented in Section 4. Section 5 concludes the paper.


2. Network Model

Human movements often exhibit a high degree of repetition, including regular visits to certain places and regular contacts during daily activities. In this paper, a hybrid urban network model is proposed for participatory sensing systems, as illustrated in Figure 1. The map $M$ is partitioned into small regions. In other words, $M$ is represented as a finite set of regions $\{a_1, \ldots, a_n\}$ such that $\bigcup_{i=1}^{n} a_i = M$ with $a_i \cap a_j = \emptyset$ ($i \neq j$). The movement possibility of the user from one region to another is represented by a directed graph. In these regions, the sensed data needs to be aggregated to reduce network overhead and to enhance its usefulness among consumers.

As mentioned above, the network is a mixture of an opportunistic network and a centralized infrastructure, as shown in Figure 1. The centralized infrastructure consists of a number of wireless access points (APs) and a backbone connecting the APs. The purpose of this model is to collect data from a peer-to-peer network (scenario 2 in Figure 1) or from WIFI APs in the vicinity of the consumers (scenarios 1 and 3 in Figure 1), rather than collecting it from any 3G/4G server. Mobile nodes carrying smart devices can access the network only when they walk into the transmission range of an AP, and data transmission between peers can occur only when they fall into each other's transmission range, as in normal opportunistic networks.

Definition 1 (location). Inside a geographical region, a mobile device closest to a fixed location is selected to perform data collection, and the collected data is sent to a set of bounding APs where it is stored and pulled by the consumers. The fixed location is termed the aggregation location.

Definition 2 (location granularity). An aggregation location includes a set of APs in adjacent positions. The granularity of the location represents the location's transmission range, which is decided by the number and the scope of the APs in the same geographical region.

In this study, historical trajectories of users connecting with APs are recorded and divided by time granularity. The trajectory of a moving user is defined as a sequence of points $\{(\mathrm{id}_j, l_1, vt_1), (\mathrm{id}_j, l_2, vt_2), \ldots, (\mathrm{id}_j, l_m, vt_m)\}$, where $\mathrm{id}_j$ is the user's identifier and location $l_i$ is represented by the set of adjacent APs which the user is connected to at time slot $vt_i$, $1 \le i \le m$.

In this paper, a mechanism to construct aggregation locations based on the information of APs is proposed, which divides APs into different locations at different granularities. The urban scenario is transformed into a graph $G = \{E, V, W\}$, where each AP is replaced by a vertex $v \in V$ and the relation between two APs is replaced by an edge $e \in E$, where $E \subset V \times V$. The relationship matrix of APs is $W = [w_{ij}]$, $i \in N$, $j \in N$, where $w_{ij}$ represents the relationship between AP $i$ and AP $j$ and takes the value $r_{ij}$ calculated by

$$r_{ij} = \frac{2 n_{ij}}{n_i + n_j}. \tag{1}$$


Figure 1: Participatory sensing scenario (map $M$ with WIFI APs, mobile nodes, locations and paths, and scenarios 1 to 3; the legend covers data collection, data distribution, data communication, opportunistic contact, and user movement).

Figure 2: Process of location construction, panels (a) to (c) (APs A to Q as vertices, edges labelled with relationship weights, and the cut threshold $\lambda \ge 0.6$).

In formula (1), $n_{ij}$ denotes the number of times AP $i$ and AP $j$ appear together on users' devices in the same period, counted over all users, and $n_i$ denotes the total number of times AP $i$ appears ($n_j$ is defined likewise for AP $j$).
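The relationship weight of formula (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input format (one collection of AP identifiers per connection record) and the function name `ap_relationship` are hypothetical and do not come from the paper or from the WTD dataset.

```python
from collections import Counter
from itertools import combinations


def ap_relationship(records):
    """Return {(ap_i, ap_j): r_ij} following formula (1).

    `records` is an iterable of AP-id collections; each collection holds the
    APs seen together in one record (an assumed input format).
    """
    n_single = Counter()  # n_i: how often AP i appears at all
    n_pair = Counter()    # n_ij: how often AP i and AP j appear together
    for aps in records:
        seen = sorted(set(aps))
        n_single.update(seen)
        n_pair.update(combinations(seen, 2))
    return {
        (a, b): 2 * n_ab / (n_single[a] + n_single[b])
        for (a, b), n_ab in n_pair.items()
    }


if __name__ == "__main__":
    demo = [{"A", "B"}, {"A", "B"}, {"A"}, {"B", "C"}]
    print(ap_relationship(demo))
```

Pairs that never co-occur are simply absent from the result, which corresponds to a weight of zero.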

Using Kruskal's greedy algorithm [27], the maximum spanning tree of the graph $G$, denoted as $T$, is easily obtained. After a weight $\lambda$ is chosen as the location granularity, every edge in $T$ whose weight is less than $\lambda$ is cut, which splits the tree into several separate connected components. Each connected component is regarded as one location. Figure 2 shows the process of constructing the aggregation locations, and a construction with granularity 0.6 is represented in Figure 2(c).

In most WLAN datasets, a connection is recorded in the format (node, contact time, APs, signal strength), and one or more APs that a user connects to may appear in one item at the same time, which can make the user's location ambiguous. Therefore, after the set of locations in the mobility scenario is obtained, it is also necessary to estimate which location the user belongs to in a given period. The signal strength between a user's mobile device and a WIFI AP in the WLAN dataset helps to solve this problem. A weight between the user and location $i$ can be calculated by

$$\mathrm{weight}_i = \frac{\sum_{j=1}^{n} \mathrm{strength}_j}{n}. \tag{2}$$

In formula (2), $n$ represents the number of APs in location $i$, and $\mathrm{strength}_j$ represents the signal strength between the user $A$ and AP $j$ ($\mathrm{AP}_j \in \mathrm{location}_i$). The user $A$ is considered to be at location $i$ in the time period if $i$ meets the condition

$$i = \arg\max \{\mathrm{weight}_i\}. \tag{3}$$
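A minimal sketch of the location construction and of formulas (2) and (3) follows. It relies on the observation that running Kruskal's algorithm in descending weight order and stopping at the first edge below $\lambda$ yields the same connected components as building the full maximum spanning tree and then cutting its light edges. The function names, the edge dictionary format, and the treatment of unheard APs as strength 0 are assumptions made for illustration.

```python
def build_locations(aps, weights, lam):
    """Group APs into locations: components of the maximum spanning tree
    after removing edges whose weight is below lam."""
    parent = {ap: ap for ap in aps}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Kruskal in descending order; edges below lam would be cut anyway.
    for (a, b), w in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        if w < lam:
            break
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for ap in aps:
        groups.setdefault(find(ap), set()).add(ap)
    return list(groups.values())


def assign_location(locations, strengths):
    """Index of the location with the largest mean signal strength,
    formulas (2) and (3). APs the user does not hear count as 0 (assumed)."""
    def weight(loc):
        return sum(strengths.get(ap, 0.0) for ap in loc) / len(loc)

    return max(range(len(locations)), key=lambda i: weight(locations[i]))


if __name__ == "__main__":
    edges = {("A", "B"): 0.9, ("B", "C"): 0.4, ("C", "D"): 0.7, ("A", "C"): 0.3}
    locs = sorted(build_locations(["A", "B", "C", "D"], edges, 0.6), key=min)
    print(locs, assign_location(locs, {"A": 10.0, "C": 30.0, "D": 20.0}))
```

Raising `lam` towards 1 leaves every AP in its own location, while lowering it merges APs into fewer, larger locations.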

3. Algorithm Design

This paper proposes a simple method for predicting the future locations of mobile nodes on the basis of their previous movements between locations. The proposed approach considers different daily time periods, reflecting the fact that users behave differently and visit different places during their daily routines. Therefore the hidden Markov model is introduced to capture the dynamism of users' behavior resulting from the daily routines.

What is more, users experience a combination of periodic movement that is geographically limited and seemingly random jumps correlated with their social networks. Social relationships can explain about 10% to 30% of all human movement, while periodic behavior explains 50% to 70% [28]. On this basis, the social relationship between nodes is also exploited in this paper to optimize and amend the location prediction result.

3.1. Hidden Markov Prediction Model. In an urban scenario, users adopt different behaviors during different periods of the day, and the users' daily routines may influence their trajectories. Thus the different daily time periods (referred to as daily samples) should be considered in order to guarantee a more realistic representation. With filtering and a hidden Markov model, this can be done in a simple way.

The hidden Markov model (HMM) is a well-known approach for the analysis of sequential data, in which the sequences are assumed to be generated by a Markov process with hidden states. Figure 3 shows the general architecture of an instantiated HMM. Each shape in the diagram represents a random variable that can adopt any number of values. The random variable $x(t)$ is the location state at time $t$. The random variable $e(t)$ is the daily sample state at time $t$ (in this paper, the observed state is called evidence). The arrows in the diagram denote conditional dependencies.

Figure 3: Example of hidden Markov model (states $X_1$ to $X_3$ linked by $a_{ij}$, evidence $E_1$ to $E_4$ linked by $b_{ij}$).

From the diagram, it is clear that the conditional probability distribution of the location variable $x(t)$ at time $t$, given the values of the location variable $x$ at all earlier times, depends only on the value of the location variable $x(t-1)$; the value at time $t-2$ and the values before it have no influence. This is called the Markov property. Similarly, the value of the evidence $e(t)$ depends only on the value of the location variable $x(t)$ at time $t$.

Given the result of filtering up to time $t$, one can easily compute the result for $t+1$ from the new evidence $e_{t+1}$. The calculation can be viewed as being composed of two parts: first, the current state distribution is projected forward from $t$ to $t+1$; second, it is updated using the new evidence $e_{t+1}$. This two-part process is expressed by

$$P(X_{t+1} \mid e_{1:t+1}) = \alpha P(e_{t+1} \mid X_{t+1}) \sum_{X_t} P(X_{t+1} \mid X_t) P(X_t \mid e_{1:t}), \tag{4}$$

where $\alpha$ is a normalizing constant used to make the probabilities sum to 1. Within the summation, $P(X_{t+1} \mid X_t)$ is the common transition model and $P(X_t \mid e_{1:t})$ is the current state distribution. $P(e_{t+1} \mid X_{t+1})$ is used to update the projected distribution, and it is obtainable directly from the statistical data.

In a participatory sensing system, for each application scenario, the hidden Markov model can be used to predict the future location state of each mobile node. The prediction process includes the following steps.

3.1.1. Preparatory Stage. At the beginning, the system needs to collect enough information about user movement trajectories to construct the Markov chain, and therefore a "warm-up" stage is assumed in the prediction system. During the preparatory stage, the system only collects historical data and cannot provide any predicted information. The warm-up stage can last for one day or one week depending on the amount of information collected.

3.1.2. Determination of State Set. The location elements in the collected data are extracted and denoted as set $L$. As set $L$ contains a number of location elements, the locations with higher visiting frequency are chosen as the state set of the system, denoted as set $E$. If there are $m$ locations in the current scene, the state space can be denoted as $E = \{X_1, X_2, \ldots, X_m\}$, and location $i$ is the $i$th state $X_i$ of the Markov process.

3.1.3. Discretization of Data Set. The data of all users related to state set $E$ is collected. Then the data set of each user is discretized into fixed time periods, so the set after discretization is denoted as follows:

$$\{(t_k, X_i)\}, \quad k = 1, 2, 3, \ldots, \quad i \in \{1, 2, 3, \ldots, m\}. \tag{5}$$

3.1.4. Calculation of 1-Order Transition Probability Matrix. Let $n_{ij}$ be the frequency with which node $A$ departs from location $i$ for location $j$; then the probability of node $A$ departing from location $i$ for location $j$ is denoted as

$$p_{ij} = \frac{n_{ij}}{n}, \tag{6}$$

where $n$ is the total number of times node $A$ departed from location $i$ to visit other locations in the data set. Therefore, supposing that there are $m$ locations in the set, an $m \times m$ transition probability matrix is generated as

$$P = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1m} \\ p_{21} & p_{22} & \cdots & p_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ p_{m1} & p_{m2} & \cdots & p_{mm} \end{bmatrix}. \tag{7}$$

Given $p_j^{(l)}$ as the probability of the node being in state $X_j$ at the initial moment $l$, and computing this probability for each state, the initial distribution of the Markov chain can be obtained as

$$P(l) = (p_1^{(l)}, p_2^{(l)}, \ldots, p_m^{(l)}). \tag{8}$$

For example, assume that the initial state is $X_2$; then the initial distribution is $P(l) = (0, 1, 0, \ldots, 0)$, and the absolute distribution at time $l+1$ is

$$P(l+1) = P(l) P = (p_1^{(l+1)}, p_2^{(l+1)}, p_3^{(l+1)}, p_4^{(l+1)}, p_5^{(l+1)}). \tag{9}$$

3.1.5. Update the Result from the New Evidence. The different daily time periods (daily samples) are regarded as the evidence variable $e$, so the set of evidences can be defined as $\mathrm{EV} = \{e_r\}$, $r = 1, 2, \ldots, d$, where $d$ is the total number of daily time period states. For example, if the day is broken into four daily samples ($d = 4$), EV is denoted as

$$\mathrm{EV} = \{\text{a.m.}, \text{noon}, \text{p.m.}, \text{evening}\}. \tag{10}$$

For each daily sample $r$, a diagonal matrix $O(e_r)$ is defined as

$$O(e_r) = [o_{ij}], \quad o_{ij} = \begin{cases} 0, & \text{if } i \neq j \\ P(e_r \mid X = i), & \text{if } i = j, \end{cases} \tag{11}$$

where $P(e_r \mid X = i) = P(e_r, X = i)/P(X = i) = n_{ir}/n_i$, $n_{ir}$ represents the frequency of the node arriving at location $i$ in the daily sample $r$, and $n_i$ represents the number of times that the node arrives at location $i$. Based on all of the above, the probability of the node arriving at location $i$ in the next time slot $l+1$ is calculated using

$$P(l+1)' = \alpha O(e_{l+1}^{r}) P(l+1) = \alpha O(e_{l+1}^{r}) P(l) P. \tag{12}$$

The state $X_j$ obtained by the system at time $l+1$ is then $X_j = \arg\max \{p_j^{(l+1)}\}$. The formula above is a one-step prediction, and it is easy to derive a recursive computation that predicts the state at $t+k+1$ from a prediction for $t+k$; therefore the state $X_{t+k+1}$ can be obtained by

$$X_{t+k+1} = \arg\max \{P(X_{t+k+1})\}. \tag{13}$$
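The steps of Section 3.1 can be sketched as follows, assuming locations and daily samples are already encoded as integer indices. The update of formula (12) is read here as an elementwise product of the diagonal of $O(e_r)$ with the projected distribution $P(l)P$; that reading, the function names, the uniform row for locations that are never departed from, and the fallback when the evidence rules out every state are all assumptions of this sketch rather than details given in the paper.

```python
def transition_matrix(states, m):
    """Formulas (6) and (7): row-normalised counts of moves i -> j."""
    counts = [[0] * m for _ in range(m)]
    for i, j in zip(states, states[1:]):
        counts[i][j] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a location never departed from gets a uniform row.
        matrix.append([c / total if total else 1.0 / m for c in row])
    return matrix


def evidence_table(states, periods, m, d):
    """Diagonal of O(e_r) in formula (11): table[r][i] = n_ir / n_i."""
    n_i = [0] * m
    n_ir = [[0] * m for _ in range(d)]
    for x, r in zip(states, periods):
        n_i[x] += 1
        n_ir[r][x] += 1
    return [[n_ir[r][i] / n_i[i] if n_i[i] else 0.0 for i in range(m)]
            for r in range(d)]


def predict_next(dist, matrix, evidence):
    """One filtering step: project with P (formula (9)), weight by the
    evidence for the next daily sample, then normalise (formula (12))."""
    m = len(matrix)
    projected = [sum(dist[i] * matrix[i][j] for i in range(m)) for j in range(m)]
    updated = [evidence[j] * projected[j] for j in range(m)]
    total = sum(updated)
    if total == 0:
        return projected  # assumption: ignore evidence that excludes all states
    return [u / total for u in updated]


if __name__ == "__main__":
    visited = [0, 1, 0, 1, 0, 2]   # location index per time slot
    sample = [0, 1, 0, 1, 0, 1]    # daily sample index per time slot
    P = transition_matrix(visited, 3)
    O = evidence_table(visited, sample, 3, 2)
    nxt = predict_next([1.0, 0.0, 0.0], P, O[1])
    print(nxt, max(range(3), key=nxt.__getitem__))
```

Calling `predict_next` repeatedly on its own output gives the recursive multi-step prediction that formula (13) refers to.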

3.2. Social-Aware Prediction Optimization. In a participatory sensing system, a mobile node can be a social node carrying data acquisition equipment. Thus, the social relationship is used to estimate the future locations of mobile nodes and to optimize the prediction result of the hidden Markov model. The aim in this paper is to capture the evolution of social interactions in the different periods of time (daily samples) over consecutive days by computing social strength based on the average duration of contacts.

Figure 4 shows how social interaction (from the point of view of user $A$) varies during a day. For instance, it indicates a daily sample (8 a.m.–12 p.m.) over which the social strength of user $A$ to users $B$ and $C$ is much stronger (less intermittent line) than the strength to users $D$, $E$, and $F$. Figure 4 aims to show the dynamics of a social network over a one-day period, where different social structures lead to different behavior when a user moves towards the social community that the user is related to.

As illustrated in Figure 4, the total contact time of mobile nodes $A$ and $B$ during a daily sample $\Delta T_i$ in a day $k$ is denoted as

$$M_i^k = \sum_{c=1}^{n} (t_c^e - t_c^s), \quad c = 1, 2, 3, \ldots, n, \tag{14}$$

where $n$ is the number of contacts in $\Delta T_i$, $t_c^s$ indicates the start time of the $c$th contact of mobile nodes $A$ and $B$, and $t_c^e$ indicates the end time of the $c$th contact of mobile nodes $A$ and $B$. Hence the social strength between any pair of nodes $A$ and $B$ in $\Delta T_i$ is denoted as

$$W(A, B)_i = \frac{\sum_{k=1}^{m} M_i^k}{m \times \Delta T_i}, \tag{15}$$

where $m$ is the total number of days in the historical record. According to formula (15), the social relationship matrix of nodes in $\Delta T_i$ can be obtained.

On the basis of the relationship matrix, mobile nodes can be partitioned into communities, which group closely related nodes into a subgroup. Since only the users' proximity is taken into account, partition-based clustering methods, such as $k$-means and fuzzy $c$-means, are not applicable. Therefore, a hierarchical clustering method is used.
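The social strength of formulas (14) and (15), together with an agglomerative grouping on the resulting strengths, can be sketched as below. The contact format (start and end times inside one daily sample, in the same unit as the sample length) and the similarity threshold used to stop merging are assumptions of this sketch; the paper does not state how the hierarchy is cut. The grouping uses complete linkage, where the similarity of two clusters is the weakest pairwise strength between their members.

```python
def social_strength(contacts_by_day, sample_length):
    """Formulas (14) and (15): mean fraction of the daily sample that two
    nodes spend in contact. `contacts_by_day` holds one list of
    (start, end) intervals per recorded day (assumed input format)."""
    m = len(contacts_by_day)
    total = sum(end - start for day in contacts_by_day for start, end in day)
    return total / (m * sample_length)


def complete_linkage(nodes, strength, threshold):
    """Merge clusters while the weakest pairwise strength between the two
    closest clusters stays at or above `threshold` (assumed stop rule).
    `strength` maps frozenset({a, b}) to W(a, b); missing pairs count as 0."""
    clusters = [{n} for n in nodes]

    def similarity(c1, c2):
        return min(strength.get(frozenset((a, b)), 0.0) for a in c1 for b in c2)

    while len(clusters) > 1:
        best, (i, j) = max(
            ((similarity(clusters[i], clusters[j]), (i, j))
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda item: item[0],
        )
        if best < threshold:
            break
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    days = [[(0, 600), (1200, 1800)], [], [(300, 900)]]
    print(social_strength(days, 14400))
    w = {frozenset("AB"): 0.8, frozenset("AC"): 0.6,
         frozenset("BC"): 0.7, frozenset("CD"): 0.1}
    print(complete_linkage(["A", "B", "C", "D"], w, 0.5))
```

The quadratic pair search is kept for readability; for a few hundred nodes, as in the dataset used later, this is unlikely to matter.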

Figure 4: Contacts a user $A$ has with a set of users in different daily samples $\Delta T_i$.

Figure 5: Quantity of locations (location quantity against the $\lambda$ value from 0.1 to 1; annotations mark the total number of APs, the value used in the experiments, and a mean of 445).

Complete linkage clustering [29], a hierarchical clustering method, serves as the community partition algorithm. Suppose that the social relationship is used to calculate the probability of node $A$ arriving at location $i$ ($i = 1, 2, \ldots, m$) in the next period. Given that node $A$ belongs to community $C$ and that the set of other nodes belonging to $C$ at location $i$ in the current time slot is denoted as $S = \{S_1, \ldots, S_j, \ldots, S_n\}$, where $S \subseteq C$, the following formula is proposed according to conditional probability:

$$P_i(A \mid S_j) = \frac{P_i(A, S_j)}{P_i(S_j)}, \quad j = 1, \ldots, n, \tag{16}$$

where $P_i(A \mid S_j)$ represents the probability of node $A$ arriving at location $i$ on the condition that node $S_j$ is already at location $i$; $P_i(S_j)$ represents the probability that node $S_j$ keeps staying at the location, which can be obtained from the Markov model calculation; and $P_i(A, S_j)$ represents the encounter probability of node $A$ and node $S_j$ at location $i$, defined as

$$P_i(A, S_j) = \frac{f_i(A, S_j)}{\sum_{i=1}^{m} f_i(A, S_j)}, \tag{17}$$

where $f_i(A, S_j)$ represents the number of encounters at location $i$. Given the relationship weight of node $A$ and node $S_j$ as $W(A, S_j) = \rho_j$, the probability of node $A$ arriving at location $i$ in the next time slot is

$$\lambda_j = \frac{\rho_j}{\sum_{j=1}^{n} \rho_j}, \quad P_i(A) = \sum_{j=1}^{n} \lambda_j P_i(A \mid S_j), \tag{18}$$

where $\lambda_j$ is the weight of each conditional probability, obtained by normalization so that $\sum_{j=1}^{n} \lambda_j = 1$.

According to the location distribution of all the nodes belonging to $C$, the probability of node $A$ arriving at each location can be obtained. This result is combined with the prediction result of the hidden Markov model through the weighting formula (19) to calculate the probability distribution of node $A$ arriving at every location in the location set, and the location with the maximum visiting probability is taken as the output of the prediction algorithm:

$$P_i = P_i^{\mathrm{HMM}} + d \, (P_i^{\mathrm{social}} - P_i^{\mathrm{HMM}}), \tag{19}$$

where $P_i^{\mathrm{HMM}}$ is the location prediction probability of state $X_i$ using the hidden Markov model, $P_i^{\mathrm{social}}$ is the prediction probability of location $i$ based on social relationship, and $d$ is the damping factor, defined as the probability that the social relation between the nodes helps improve the accuracy of the prediction. This means that the higher the value of $d$ is, the more the algorithm accounts for the social relation between the nodes. It is beneficial to use social relationship to optimize the prediction result, which makes the transition probability matrix sparse and improves the accuracy of the prediction model.
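Formulas (16) to (19) can be sketched as follows. Each community member currently at location $i$ is described by a tuple of its relationship weight $\rho_j$, its encounter counts with node $A$ per location, and its probability $P_i(S_j)$ of staying at location $i$. This tuple layout and the function names are assumptions for illustration, and the ratio in formula (16) is used as given without any further normalisation.

```python
def social_probability(i, friends):
    """Formula (18): weighted sum of P_i(A | S_j) over community members
    currently at location i.

    `friends` is a list of (rho_j, encounters_j, stay_j) tuples, where
    encounters_j[k] = f_k(A, S_j) and stay_j = P_i(S_j) (assumed layout).
    """
    total_rho = sum(rho for rho, _, _ in friends)
    if total_rho == 0:
        return 0.0  # no related node at this location
    prob = 0.0
    for rho, encounters, stay in friends:
        met = sum(encounters)
        joint = encounters[i] / met if met else 0.0   # formula (17)
        conditional = joint / stay if stay else 0.0   # formula (16)
        prob += (rho / total_rho) * conditional       # formula (18)
    return prob


def combine(p_hmm, p_social, d):
    """Formula (19): shift the HMM result towards the social result by d."""
    return [h + d * (s - h) for h, s in zip(p_hmm, p_social)]


def predict_location(p_hmm, p_social, d):
    """Location index with the maximum combined visiting probability."""
    combined = combine(p_hmm, p_social, d)
    return max(range(len(combined)), key=combined.__getitem__)


if __name__ == "__main__":
    at_location_1 = [(0.6, [1, 3], 0.75), (0.2, [2, 2], 1.0)]
    p_social = [0.0, social_probability(1, at_location_1)]
    print(p_social, predict_location([0.7, 0.3], p_social, 0.5))
```

With $d = 0$ the output reduces to the HMM prediction, and with $d = 1$ it follows the social prediction alone.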

4. Experimental Analyses

4.1. Simulation Configuration. In this paper, the experiment data comes from the dataset provided by Wireless Topology Discovery (WTD) [30], from which two months of data, 13,215,412 items in total, are chosen to simulate the prediction algorithm. There are 275 nodes and 524 APs (access points) in the dataset. According to the vicinity of AP positions, the number of locations into which the APs are clustered is shown in Figure 5.

Figure 5 shows that when the defined granularity $\lambda$ becomes bigger, the quantity of locations in the resulting scenario also becomes larger. When $\lambda$ is defined as 1, the quantity of locations is equal to the total number of APs. The location granularity $\lambda$ is set to 0.5 in the following experiment.

4.2. Similar User Clustering. In order to predict the future location of mobile nodes using social relationship, the social network structure in the system should be considered first. Based on the quantization formula (15), the relation strength between any pair of nodes $A$ and $B$ is calculated for the different daily samples (a.m., noon, p.m., and evening), and the resulting social network structures of the dataset are illustrated in Figure 6. A hierarchical clustering method, complete linkage clustering, is used to cluster mobile users. Figure 6(a) shows the social network clustering result in the a.m. period, and the clustering structures in the noon, p.m., and evening periods are illustrated in Figures 6(b), 6(c), and 6(d), respectively.

Figure 6: Social network structures of the dataset in different daily samples: (a) a.m., (b) noon, (c) p.m., (d) evening.

4.3. Prediction Accuracy. In order to evaluate the accuracy of the prediction model, the processed node locations are divided into two parts: 50% of the original information is used to train the Markov model, and the rest is used as the test cases of the prediction model. The prediction precision $P_{\mathrm{result}}$ is denoted as

$$P_{\mathrm{result}} = \frac{\sum_{i=1}^{n} \mathrm{accuracy}_i}{n}. \tag{20}$$

In formula (20), $n$ represents the number of predictions, and $\mathrm{accuracy}_i$ is the result of the $i$th prediction, denoted as

$$\mathrm{accuracy}_i = \begin{cases} 1, & \text{when result is right} \\ 0, & \text{when result is wrong.} \end{cases} \tag{21}$$
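The evaluation of formulas (20) and (21) amounts to a hit rate over the test cases, as the short sketch below shows. The paper does not say which 50% of the records is used for training, so the chronological first half taken by `split_half` is an assumption, and both function names are hypothetical.

```python
def split_half(trajectory):
    """Split a trajectory into training and test parts.
    Assumption: the chronologically first half is used for training."""
    cut = len(trajectory) // 2
    return trajectory[:cut], trajectory[cut:]


def prediction_precision(predicted, actual):
    """Formulas (20) and (21): share of predictions that match the
    location actually visited."""
    hits = [1 if p == a else 0 for p, a in zip(predicted, actual)]
    return sum(hits) / len(hits)


if __name__ == "__main__":
    train, test = split_half([3, 1, 3, 2, 3, 1, 3, 2])
    print(train, test, prediction_precision([3, 1, 3, 3], test))
```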

Firstly, the training data set is used to train the prediction models, which include the standard Markov model (SMM) and the daily-routine-based prediction model (MLPR). Afterward the test cases are used to verify each of the two models. The prediction accuracies of the two prediction models are shown in Figure 7, where Figure 7(a) shows the prediction accuracy of nodes 1 to 92, Figure 7(b) shows the prediction accuracy of nodes 93 to 184, and Figure 7(c) shows the prediction accuracy of nodes 185 to 275. Figure 7 indicates that the daily-routine-based mobile node location prediction algorithm (MLPR) performs better than the standard Markov model. This shows that daily routines can raise the accuracy and improve the algorithm's performance.

Figure 7: Prediction precision of SMM and MLPR: (a) nodes 1–92, (b) nodes 93–184, (c) nodes 185–275 (prediction accuracy against node number).

Next, the proposed social-relationship-based mobile node location prediction algorithm using daily routines (SMLPR) is compared with O2MM and SMLP𝑁. Among these algorithms, the second order Markov predictor (O2MM) has the best performance among order-$k$ Markov predictors [15], and the social-relationship-based mobile node location prediction algorithm (SMLP𝑁) has the same or even better performance than O2MM, as reported in the previous work [31]. The comparative result is shown in Figure 8, which indicates that SMLPR predicts better after combining daily routines with social relationship and achieves a higher accuracy than O2MM and SMLP𝑁.

Figure 9 shows the number of users in different precision ranges for SMM, O2MM, SMLP𝑁, and SMLPR, and it illustrates that SMLPR has the largest number of nodes in the higher precision ranges. For instance, the number of nodes with accuracy greater than 90% is 198 for SMLPR and 114 for O2MM, while SMM achieves only 55 nodes.

Lastly, the performance of these algorithms is shown in Table 1. The accuracy of SMLPR is about 28 percentage points higher than that of the standard Markov model (0.9014 versus 0.6164) and about 7 percentage points higher than that of the second order Markov model (0.8275). SMLPR also achieves a better result than SMLP𝑁. In terms of cost, Table 1 shows that the time complexity of SMLPR is $O(N)$ while that of O2MM is $O(N^2)$, and the memory demand of SMLPR is $O(N^2)$ while that of O2MM is $O(N^3)$. Thus, SMLPR performs better than the order-2 Markov predictor at much lower expense, and SMLPR is more practical than the order-2 Markov predictor in the WLAN scenario.

Table 1: The algorithm performance comparison.

                      SMM      O2MM     SMLP𝑁    SMLPR
Prediction accuracy   0.6164   0.8275   0.8488   0.9014
Time complexity       O(N)     O(N^2)   O(N)     O(N)
Storage space         O(N^2)   O(N^3)   O(N^2)   O(N^2)

4.4. Impact of Location Granularity. In a location-based mobility scenario, location granularity may have a significant influence on the prediction accuracy. In order to evaluate the impact of location granularity, the algorithms' performance is tested by adjusting the granularity value $\lambda$, and the result is shown in Figure 10.

As shown in Figure 10, as the location granularity $\lambda$ increases, the number of locations in the scenario grows, and the average accuracies of all four algorithms decrease. Among these algorithms, SMM and O2MM are affected more significantly by the number of locations, and the accuracy reduces approximately to 25%. SMLPR shows a relatively moderate downward trend, and the effect of location granularity on SMLPR is not very obvious.

Figure 8: Prediction precision of O2MM, SMLPN, and SMLPR. Panels: (a) nodes 1–92; (b) nodes 93–184; (c) nodes 185–275. Each panel plots prediction accuracy against node number for SMLPN, SMLPR, and O2MM.

5. Conclusion In this paper, the influence of the opportunistic characteristics of participatory sensing systems is introduced, and the problems of sensing nodes, such as intermittent connection, limited communication periods, and heterogeneous distribution, are analyzed. The paper focuses on the mobility model of nodes in participatory sensing systems and proposes a mobile node location prediction algorithm that uses users' daily routines and the social relationships between mobile nodes. From the historical trajectories of mobile nodes, a state transition matrix is constructed with locations as the transition states, and a hidden Markov model is used to predict the mobile node's location over a given duration. Meanwhile, the social relationships between nodes are exploited to optimize and amend the prediction model. The prediction model is tested on the WTD data set and shown to be effective.
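To make the transition-matrix construction summarized in the conclusion more concrete, the sketch below builds one row-normalized transition matrix per daily period from timestamped location records and predicts the most probable next location for the current period. It is a plain (visible) first-order Markov chain rather than the hidden Markov model used in the paper, and it omits the social-relationship amendment. The period boundaries in `PERIODS`, the function names, and the sample records are hypothetical.

```python
from collections import defaultdict

# Hypothetical daily periods as (name, start_hour, end_hour); not from the paper
PERIODS = [("night", 0, 8), ("work", 8, 18), ("evening", 18, 24)]


def period_of(hour):
    """Map an hour of day (0-23) to the name of its daily period."""
    for name, start, end in PERIODS:
        if start <= hour < end:
            return name
    raise ValueError(f"hour out of range: {hour}")


def build_transition_matrices(records):
    """records: time-ordered list of (hour, location).

    Returns {period: {src: {dst: probability}}}. Each transition is
    attributed to the period in which it starts, and each row sums to 1.
    """
    counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for (hour, src), (_, dst) in zip(records, records[1:]):
        counts[period_of(hour)][src][dst] += 1

    matrices = {}
    for period, rows in counts.items():
        matrices[period] = {}
        for src, row in rows.items():
            total = sum(row.values())
            matrices[period][src] = {dst: n / total for dst, n in row.items()}
    return matrices


def predict_next(matrices, hour, current):
    """Most probable next location for the period containing `hour`."""
    row = matrices.get(period_of(hour), {}).get(current)
    if not row:
        return None  # no history for this location in this period
    return max(row, key=row.get)


# Two days of toy records for one node (hypothetical data)
records = [
    (7, "dorm"), (9, "library"), (12, "canteen"), (13, "library"),
    (19, "gym"), (22, "dorm"),
    (7, "dorm"), (9, "library"), (12, "canteen"), (14, "lab"),
    (19, "gym"), (22, "dorm"),
]
matrices = build_transition_matrices(records)
```

Keeping a separate matrix per period lets the same location have different likely successors at different times of day, which is the intuition behind using daily routines. A prediction for a location never seen in the current period returns `None` here, whereas a fuller implementation would need some fallback.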

Conflict of Interests The authors declare that there is no conflict of interests regarding the publication of this paper.

Figure 9: The number of users in different precision range. The plot shows the number of users (0 to 300) whose prediction accuracy (%) exceeds each threshold from >10 to >90, for SMM, O2MM, SMLPN, and SMLPR.

Figure 10: The influence of location granularity on prediction accuracy. The plot shows accuracy rate against the λ value (0.1 to 1) for SMM, O2MM, SMLPN, and SMLPR.

Acknowledgments This work is supported by the National Natural Science Foundation of China under Grant no. 61272529; the National Science Foundation for Distinguished Young Scholars of China under Grant nos. 61225012 and 71325002; the Ministry of Education-China Mobile Research Fund under Grant no. MCM20130391; the Specialized Research Fund of the Doctoral Program of Higher Education for the Priority Development Areas under Grant no. 20120042130003; the Fundamental Research Funds for the Central Universities under Grant nos. N120104001 and N130817003; and the Liaoning BaiQianWan Talents Program under Grant no. 2013921068.

References
[1] M. Srivastava, M. Hansen, J. Burke et al., “Wireless urban sensing systems,” Tech. Rep. 65, Center for Embedded Networked Sensing at UCLA, 2006.
[2] S. B. Eisenman, E. Miluzzo, N. D. Lane, R. A. Peterson, G.-S. Ahn, and A. T. Campbell, “BikeNet: a mobile sensing system for cyclist experience mapping,” ACM Transactions on Sensor Networks, vol. 6, no. 1, article 6, 2009.
[3] H. Lu, W. Pan, N. D. Lane, T. Choudhury, and A. T. Campbell, “SoundSense: scalable sound sensing for people-centric applications on mobile phones,” in Proceedings of the 7th ACM International Conference on Mobile Systems, Applications, and Services (MobiSys ’09), pp. 165–178, Kraków, Poland, June 2009.
[4] E. Miluzzo, N. D. Lane, K. Fodor et al., “Sensing meets mobile social networks: the design, implementation and evaluation of the CenceMe application,” in Proceedings of the 6th ACM Conference on Embedded Networked Sensor Systems (SenSys ’08), pp. 337–350, Raleigh, NC, USA, November 2008.
[5] E. Miluzzo, N. Lane, S. Eisenman, and A. Campbell, “CenceMe – injecting sensing presence into social networking applications,” in Smart Sensing and Context, G. Kortuem, J. Finney, R. Lea, and V. Sundramoorthy, Eds., vol. 4793 of Smart Sensing and Context, pp. 1–28, 2007.
[6] S. B. Eisenman, N. D. Lane, E. Miluzzo et al., “MetroSense project: people-centric sensing at scale,” in Proceedings of the Workshop on World-Sensor-Web, pp. 6–11, Boulder, Colo, USA, 2006.
[7] H. Lu, N. D. Lane, S. B. Eisenman, and A. T. Campbell, “Bubble-sensing: binding sensing tasks to the physical world,” Pervasive and Mobile Computing, vol. 6, no. 1, pp. 58–71, 2010.
[8] L. Deng and L. P. Cox, “Live compare: grocery bargain hunting through participatory sensing,” in Proceedings of the 10th Workshop on Mobile Computing Systems and Applications (HotMobile ’09), Santa Cruz, Calif, USA, February 2009.
[9] E. Kanjo, “NoiseSPY: a real-time mobile phone platform for urban noise monitoring and mapping,” Mobile Networks and Applications, vol. 15, no. 4, pp. 562–574, 2010.
[10] A. J. Perez, M. A. Labrador, and S. J. Barbeau, “G-Sense: a scalable architecture for global sensing and monitoring,” IEEE Network, vol. 24, no. 4, pp. 57–64, 2010.
[11] L. M. L. Oliveira, J. J. P. C. Rodrigues, A. G. F. Elias, and G. Han, “Wireless sensor networks in IPv4/IPv6 transition scenarios,” Wireless Personal Communications, vol. 78, no. 4, pp. 1849–1862, 2014.
[12] C. Song, Z. Qu, N. Blumm, and A.-L. Barabási, “Limits of predictability in human mobility,” Science, vol. 327, no. 5968, pp. 1018–1021, 2010.
[13] M. C. González, C. A. Hidalgo, and A.-L. Barabási, “Understanding individual human mobility patterns,” Nature, vol. 453, no. 7196, pp. 779–782, 2008.
[14] S.-M. Qin, H. Verkasalo, M. Mohtaschemi, T. Hartonen, and M. Alava, “Patterns, entropy, and predictability of human mobility and life,” PLoS ONE, vol. 7, no. 12, Article ID e51353, 2012.
[15] L. Song, D. Kotz, R. Jain et al., “Evaluating location predictors with extensive Wi-Fi mobility data,” in Proceedings of the 23rd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM ’04), vol. 2, pp. 1414–1424, 2004.
[16] S. Scellato, M. Musolesi, C. Mascolo, V. Latora, and A. T. Campbell, “NextPlace: a spatio-temporal prediction framework for pervasive systems,” in Pervasive Computing, vol. 6696 of Lecture Notes in Computer Science, pp. 152–169, Springer, Berlin, Germany, 2011.
[17] W. Mathew, R. Raposo, and B. Martins, “Predicting future locations with hidden Markov models,” in Proceedings of the 14th International Conference on Ubiquitous Computing (UbiComp ’12), pp. 911–918, September 2012.
[18] M. C. Mozer, “The neural network house: an environment that adapts to its inhabitants,” in Proceedings of the AAAI Spring Symposium, pp. 110–114, Stanford, Calif, USA, 1998.
[19] H. A. Karimi and X. Liu, “A predictive location model for location-based services,” in Proceedings of the 11th ACM International Symposium on Advances in Geographic Information Systems (GIS ’03), pp. 126–133, New Orleans, La, USA, November 2003.
[20] J. D. Patterson, L. Liao, D. Fox et al., “Inferring high-level behavior from low-level sensors,” in Proceedings of the 5th Annual Conference on Ubiquitous Computing (UbiComp ’03), pp. 73–89, Seattle, Wash, USA, 2003.
[21] C. Zhu, Y. Wang, G. Han, J. J. P. C. Rodrigues, and J. Lloret, “LPTA: location predictive and time adaptive data gathering scheme with mobile sink for wireless sensor networks,” The Scientific World Journal, vol. 2014, Article ID 476253, 13 pages, 2014.
[22] C. Zhu, Y. Wang, G. Han, J. J. P. C. Rodrigues, and H. Guo, “A location prediction based data gathering protocol for wireless sensor networks using a mobile sink,” in Proceedings of the 2nd Smart Sensor Networks and Algorithms (SSPA ’14), Co-Located with 13th International Conference on Ad Hoc, Mobile, and Wireless Networks (Ad Hoc ’14), Benidorm, Spain, June 2014.
[23] Y.-B. He, S.-D. Fan, and Z.-X. Hao, “Whole trajectory modeling of moving objects based on MOST model,” Computer Engineering, vol. 34, no. 16, pp. 41–43, 2008.
[24] G. Han, C. Zhang, J. Lloret, L. Shu, and J. J. P. C. Rodrigues, “A mobile anchor assisted localization algorithm based on regular hexagon in wireless sensor networks,” The Scientific World Journal, vol. 2014, Article ID 219371, 13 pages, 2014.
[25] G. Han, H. Xu, J. Jiang, L. Shu, and N. Chilamkurti, “The insights of localization through mobile anchor nodes in wireless sensor networks with irregular radio,” KSII Transactions on Internet and Information Systems, vol. 6, no. 11, pp. 2992–3007, 2012.
[26] G. Han, C. Zhang, T. Liu, and L. Shu, “MANCL: a multianchor nodes cooperative localization algorithm for underwater acoustic sensor networks,” Wireless Communications and Mobile Computing. In press.
[27] J. Kruskal, “On the shortest spanning subtree of a graph and the traveling salesman problem,” Proceedings of the American Mathematical Society, vol. 7, pp. 48–50, 1956.
[28] E. Cho, S. A. Myers, and J. Leskovec, “Friendship and mobility: user movement in location-based social networks,” in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’11), pp. 1082–1090, ACM, August 2011.
[29] G. Punj and D. W. Stewart, “Cluster analysis in marketing research: review and suggestions for application,” Journal of Marketing Research, vol. 20, no. 2, pp. 134–148, 1983.
[30] M. McNett and G. M. Voelker, “UCSD Wireless Topology Discovery Project [EB/OL],” 2013, http://www.sysnet.ucsd.edu/wtd/wtd.html.
[31] R. Ru and X. Xia, “Social-relationship-based mobile node location prediction algorithm in participatory sensing systems,” Chinese Journal of Computers, vol. 35, no. 6, 2014.

