sensors Article
Conditional Random Field-Based Offline Map Matching for Indoor Environments Safaa Bataineh 1, *, Alfonso Bahillo 1 , Luis Enrique Díez 1 , Enrique Onieva 1,2 and Ikram Bataineh 3 1 2 3
*
Faculty of Engineering, University of Deusto, Av. Universidades, 24, Bilbao 48007, Spain;
[email protected] (A.B.);
[email protected] (L.E.D.);
[email protected] (E.O.) DeustoTech-Fundación Deusto, Fundación Deusto, Av. Universidades, 24, Bilbao 48007, Spain Department of Architecture, Jordan University of Science and Technology, Irbid 22110, Jordan;
[email protected] Correspondence:
[email protected]; Tel.: +34-72-2740-460
Academic Editor: Vittorio M. N. Passaro Received: 10 June 2016; Accepted: 12 August 2016; Published: 16 August 2016
Abstract: In this paper, we present an offline map matching technique designed for indoor localization systems based on conditional random fields (CRF). The proposed algorithm can refine the results of existing indoor localization systems and match them with the map, using loose coupling between the existing localization system and the proposed map matching technique. The purpose of this research is to investigate the efficiency of using the CRF technique in offline map matching problems for different scenarios and parameters. The algorithm was applied to several real and simulated trajectories of different lengths. The results were then refined and matched with the map using the CRF algorithm. Keywords: offline map matching; indoor positioning; indoor localization; navigation systems; conditional random fields; map modelling; inertial sensors; mobility of pedestrians
1. Introduction The importance of indoor navigation arises from the necessity of context-aware services that work inside buildings, given that people spend most of their time indoors [1]. Although outdoor navigation appeared prior to indoor navigation, the former techniques cannot be applied directly to the indoor environment. This is firstly associated with the fact that outdoor navigation systems mostly depend on Global Navigation Satellite Systems (GNSS) that function poorly indoors due to multipath and attenuation losses, and secondly because pedestrian indoor movement is limited by building structure and details that are totally different from those found on road maps [2]. As indoor environments possess spatial constraints that can be used to rule out certain incorrect localization results, indoor maps have become an additional data source with which to improve the accuracy and reliability of indoor localization systems. For instance, it is unreasonable that an estimated walking trajectory passes through an area occupied by an obstacle such as a wall. The process of utilizing map information in the localization processes of objects or persons is known as map matching. Map matching can be used to find the correspondence between a sequence of points representing the walking trajectory obtained via a localization system and a given map. In outdoor transportation systems, map matching techniques vary from simple geometric search methods to advanced complex probabilistic models [3]. Similarly, existing indoor map matching techniques can be geometric, topological, or probabilistic. The differences between indoor and outdoor map matching problems lie in the complex structure of indoor maps compared to road maps, as well as in the randomness of pedestrian behavior compared to that of vehicles [2].
Sensors 2016, 16, 1302; doi:10.3390/s16081302
www.mdpi.com/journal/sensors
Sensors 2016, 16, 1302
2 of 16
Indoor map matching techniques that use advanced probabilistic models are usually based on recursive Bayesian models such as Hidden Marcov Models (HMMs) and Particle Filters (PFs) [4–7]. However, recursive Bayesian models involve computing the joint probability distribution between all states and observation variables, which makes them computationally expensive. The Conditional Random Fields (CRFs), a probabilistic model that is widely employed in speech recognition, can also be used in map matching [8,9]. It is a sequence-labelling algorithm that employs the observations and context information as features to evaluate the conditional probability of state transitions. CRF is based on calculating conditional probability which makes it computationally more efficient than Bayesian models, besides being more flexible in handling arbitrary dependencies between observations [10]. However, CRF requires a backward phase in order to evaluate different path alternatives, which delays the process and makes it less suitable for real-time applications. On the other hand, offline (Offline localization systems: In localization systems, the expression “offline” has been used in the literature with two different meanings: (1) those localization systems that are not real time, also known as global systems, in which measurements are taken for the whole walk and an offline algorithm is then used to estimate the track; and (2) those localization systems that use local resources and sensors without having been connected to any global system or central server. In this paper, we use the term to refer to the first meaning.) (i.e., not real time) map matching can allow for reduced performance in favor of accuracy [11]. Offline map matching results can be used for both post-processing analysis and modelling purposes. Many indoor applications that need behavioral analysis and data mining can make use of offline map matched trajectories. CRF was applied for the first time in pedestrian indoor localization by Xiao et al. in their localization system MapCraft [12]. In MapCraft, the CRF map matching algorithm is tightly coupled to the localization system that takes raw measurements directly from sensors such as Wi-Fi, Bluetooth and inertial sensors as inputs to the algorithm, and fuses it together with the map information. What we mean by what we call ‘tight coupling’ is that the CRF algorithm here is used to fuse all measurements and spatial map information altogether at the same time; both positioning and map matching are done together in one module. MapCraft aims to be used as real time map matching system; however the delay produced by the backward phase of CRF is a real challenge that affects the efficiency of the algorithm when used for real-time tracking; In MapCraft they overcome this problem by making the compromise of converting the conditional probability discrete distribution obtained by the forward phase to a Gaussian distribution and displaying it on the map in real-time [12]. This means less accuracy as it does not give the optimal solution which can only be obtained in the backward phase; that means that the CRF was not implemented completely in MapCraft for the online mode. We do not know exactly to what degree that step affected the accuracy of their system when used online. Nevertheless, their step can be a good example of that CRF is more suitable for offline usage. As they mentioned clearly in their paper, the optimal path can only be calculated “in the case of delay tolerant offline tracking” [12]. Regarding offline tracking and trajectory detection, it is possible to apply the CRF can be directly to the raw sensors’ measurements to estimate the location and match it with map; however this makes the map matching dependent on the environment and highly customized, as the types of the available sensors should be known; or we can separate the map matching part from localization; we can take location trajectories obtained by any localization system and then match it with the map offline; the map matching system will be independent of the sensors types and of the localization system, and can be attached to (coupled with) any existing localization system as an independent module, separate from the localization module; this modular structures means more flexibility. In this paper, we present an offline CRF map matching algorithm that can be easily loosely coupled to any existing localization system; it uses only the output estimated walking trajectory of the localization system as an input and refines it based on only the map information, without the direct use of raw sensor data. The adoption of loose coupling means that the map matching algorithm does not need to know the implementation details of the primary localization system; this also then allows
Sensors 2016, 16, 1302
3 of 16
the map matching algorithm to be linked to any localization system as a separate module and to use only its output. Any change in the localization system will not affect the implementation of the map matching algorithm. An example of using CRF as an offline map matching algorithm for outdoor systems can be found in the work of Xu et al. [13]; in their work they use the GPS timestamped locations generated by the vehicle as input trajectory for the map matching algorithm and the output is a sequence of road segments traversed by the vehicle. As an outdoor map matching algorithm, the map model is based on road segments. In our work with indoor systems, the map structure is different from the road network, we used a grid based map model that divide the 2D map to squared cells. Our transition model is based on the neighborhood between cells and the distribution of obstacles in the building, where in outdoor environments the transition is dependent on the road network and topology. Instead of using GPS locations as input, we use locations produced by a local positioning system as input as the GPS is not effective inside buildings. We will explain in details in Section 2.2 how the CRF algorithm functions; but for the purpose of comparison we can say in brief that the CRF employs observations and context information through features functions that relates the observation with state transitions. In MapCraft, they used three feature functions depending on observations from sensors: (1) the first function uses displacement and heading measurements from the inertial sensors; (2) the second function is to handle correlations in heading errors of the inertial sensors; and (3) the third function uses the signal strength observation in conjunction with fingerprint map [12]. In our map matching CRF algorithm, we do not use observations from sensors as this supposed to be done by a pre-existing localization system; we only make use of the spatial context provided by the map as presented by the model in addition to the coordinates generated by the localization systems as observations. We use one feature function that makes use of the map information and the input coordinates. The rest of paper is organized as follows: Section 2 describes the system architecture and components, as well as the proposed CRF model; the experimental results are presented and discussed in Section 3; and finally the conclusions drawn from this research are presented in Section 4. 2. Overview of the System In this section, the global architecture of the system is presented, as well as the implemented algorithm. First, we give a brief description of the system architecture, components and behavior; further details follow in Sections 2.1 and 2.2. The system architecture is illustrated in Figure 1. As can be seen from this figure, the system is composed of three subsystems, with some interconnections between them. The three components of the system are: 1.
2.
3.
The primary localization system that produces the first estimated trajectory. This can be any localization system that produces, as its output, a walking trajectory in the form of a time sequence of estimated location coordinates. This trajectory will be the input of the map matching algorithm. The semantic map generation system, a unit that models the floor plan obtained from CAD files in a semantic format that can be used by the map matching algorithm. In our case, a grid-based map model was employed in which the floor plan map is divided into square uniform-grid cells. Each cell is associated with a semantic representation of its contents, for example if it contains a wall or a free space. The map matching algorithm that refines the estimated trajectory (path) using the CRF technique and the semantic floor plan information.
System operation proceeds as follows: once the primary estimated trajectory is obtained by the localization system, map matching is carried out by applying the CRF model that also uses the information obtained from the map to produce a refined (corrected) trajectory. The goal is to estimate the most feasible trajectory, taking into account the constraints provided by the map such as walls and
Sensors 2016, 16, 1302
4 of 16
other obstacles that are obtained from a semantic map generation system, a separate unit that extracts Sensors 2016, 16, 1302 4 of 16 map data from CAD files and represents it in a model so it can be used by the algorithm.
Figure 1. System Architecture, illustrating the coupling between the proposed map matching Figure 1. System Architecture, illustrating the coupling between the proposed map matching algorithm algorithm and the localization and semantic map systems. and the localization and semantic map systems.
2.1. The SemanticMap MapGeneration GenerationSystem System 2.1. The Semantic Indoormaps maps can available in different formats, images, or CAD Indoor can be be available in different formats, likelike images, PDFPDF files,files, or CAD files. files. These These formats are not suitable to be directly used by the map matching algorithm; two tasks should formats are not suitable to be directly used by the map matching algorithm; two tasks should be be performed, the map information should be extracted from the available map files, then this performed, the map information should be extracted from the available map files, then this information information should be represented in another format usable by the map matching algorithm. The should be represented in another format usable by the map matching algorithm. The most commonly most commonly used map models in indoor localization are the grid models and the graph models used map models in indoor localization are the grid models and the graph models [2]. In the grid [2]. In the grid models, the space is partitioned into regular cells with semantics. The graph models models, the space is partitioned into regular cells with semantics. The graph models [14] reconstruct [14] reconstruct the space as a graph where nodes represent entities or places of interest in the the space as a graph where nodes represent entities or places of interest in the building. In [12], for building. In [12], for example, they obtained graphs from maps in image formats using standard example, they obtained graphs from maps infrom image edge detecting algorithms edge detecting algorithms to extract edges theformats image,using then astandard grid-based map model was used to with extract the thenmap a grid-based map was used with celltosize m to celledges size offrom 0.8 m to image, model the information. In model our system, we preferred use of the0.8 CAD model the map information. In our system, we preferred to use the CAD files because CAD is files because CAD is very widely used in architectural design, and we modelled the map usingvery a widely used in architectural design, and we modelled the map using a grid-based model. grid-based model. The extractsthe themap mapdata datafrom fromthe the CAD files models Thesemantic semanticmap mapgeneration generation unit unit extracts CAD files andand models it initain a semantic model suitable for the map matching algorithm. We use the CAD data encoded the semantic model suitable for the map matching algorithm. We use the CAD data encoded ininthe Drawing are standard standardASCII ASCIItext textfiles filesthat thatare areoffered offered DrawingInterchange InterchangeFile Fileformat format (DXF). (DXF). DXF DXF files are byby CAD and that cancan be be easily read by other programs [15,16]. However, mapmap information obtained from CAD and that easily read by other programs [15,16]. However, information obtained DXF files is not suitable localization applications, as it is notasenhanced semantic from DXF files is not for suitable for localization applications, it is not with enhanced withinformation semantic information that allows tothe understand the architectural of theand building and to that allows computers to computers understand architectural structure ofstructure the building to distinguish distinguish between differentobjects architectural suchstairs as walls and stairs [17]. Instead, between different architectural such asobjects walls and [17]. Instead, DXF-derived map DXF-derived map information comprises only line, curve, circle, and polyline drawing data, which information comprises only line, curve, circle, and polyline drawing data, which means that it requires means that it requires more processing in order to be used by algorithm. the proposed more processing in order to be used by the proposed map-matching We map-matching extract the DXF algorithm. and We extract the DXF information represent ouronmap model,the which based on information represent it using our mapand model, whichitisusing based dividing mapisinto square dividing the map into square cells and determining the possible transitions of the pedestrian from cells and determining the possible transitions of the pedestrian from one cell to another depending on cell toobstacles. another depending existing obstacles. We extract theDXF simple information theone existing We extracton thethe simple entity information from the filesentity and locate it in the from the DXF files and locate it in the grid cell representation model of the floor plan. grid cell representation model of the floor plan. we will explain in Section 2.2, the CRF model is a classifier that requires pre-defined AsAswe will explain in Section 2.2, the CRF model is a classifier that requires pre-defined states/labels. To this end, the proposed map model is designed to suit this specific purpose by states/labels. To this end, the proposed map model is designed to suit this specific purpose by representing the map as a group of square cells, with each cell representing a squared area in the representing the map as a group of square cells, with each cell representing a squared area in the building; to the algorithm, this represents a state/label that can be used to specify the position of the building; to the algorithm, this represents a state/label that can be used to specify the position of the pedestrian. In the model, each cell has its own characteristics, be it representing a free space in the pedestrian. In the model, each cell has its own characteristics, be it representing a free space in the building or be it occupied by an obstacle such as a wall or furniture. For the purpose of representing building or be it occupied by an obstacle such as a wall or furniture. For the purpose of representing state transitions, each cell knows its neighbor cells and the neighbors of neighbor cells. state transitions, each cell knows its neighbor andthat the neighbors of neighbor cells. A between transition A transition graph/table is generated in suchcells a way the transition is only possible graph/table is generated in such a way that the transition is only possible between neighbor cells neighbor cells that do not contain any obstacle; for more flexibility, a transition to a neighbor ofthat a doneighbor not contain any obstacle; for more flexibility, a transition to a neighbor of a neighbor cell is also cell is also allowed under the condition that there are no obstacles impeding that transition. allowed under the condition thatlocation there are no obstacles impeding transition. search region The search region for the next from the current location that is known as theThe buffer [2], with thefor themaximum next location from the currentallowed locationinis each known as the buffer [2],aswith maximum transition transition distance direction known the the buffer size. Allowing transitions to neighbor cells and neighbors of neighbors results in a buffer size of 2 cells in each
Sensors 2016, 16, 1302
5 of 16
Sensors 2016, 16, 1302
5 of 16
distance allowed in each direction known as the buffer size. Allowing transitions to neighbor cells and neighbors of neighbors results in a buffer size ofan 2 cells in each direction, as shown in Figure direction, as shown in Figure 2a. Figure 2b presents example of possible transitions based on the2a. Figure 2b presents an example of possible transitions based on the existence of obstacles. 5 of 16 Sensors 2016, 16, 1302 existence of obstacles. direction, as shown in Figure 2a. Figure 2b presents an example of possible transitions based on the existence of obstacles.
(a)
(b)
Figure Searchregion regionfor forthe thenext nextlocation, location, including including allowed transitions in the next step (a) and Figure 2. 2.Search allowed (a) (b) transitions in the next step (a) and allowed steps in the presence of obstacles (b). allowed steps in the presence of obstacles (b).
Figure 2. Search region for the next location, including allowed transitions in the next step (a) and allowed steps in the presence of obstacles (b). 2.2. The Map Matching Algorithm
2.2. The Map Matching Algorithm A The CRFMap model is used as the base model in the developed system. As undirected probabilistic 2.2. Matching Algorithm A CRF model is used as the base model in the developed system. As undirected probabilistic graphical models developed for labelling data [8], CRF models are used for an input set of A CRF model is used aslabelling the base data model in CRF the developed system. As an undirected probabilistic graphical models [8], models are used for input setsuch of observations developed to predict for a vector of hidden variables . Unlike generative models as HMMs → observations → data graphical models developed for labelling [8], CRF models are used for an input set of x that to predict a vector of hidden variables y . Unlike generative models such as HMMs that model model the joint probability P , by applying Bayes’ rule, CRFs are discriminative models → → observations to predict a vector of hidden variables . Unlike generative models such as HMMs thethat joint probability P x , y distribution by applying PBayes’ rule, the CRFs are discriminative models that model model thetheconditional | over hidden variables given observation that model joint probability are discriminative models → P→, by applying Bayes’ rule, CRFs → ; a comparison of generative discriminative models can be found [18]. In linear chain→ that model the conditional P the | hidden over the hidden variables in given observation thevector conditional distribution Pdistribution x y and over variables observation vector y given x; CRFs, a special form of CRF graphs that model the output variable as sequence [9],Inthe conditional vector ; aofcomparison generative and discriminative beafound in [18]. linear chainCRFs, a comparison generativeofand discriminative models models can be can found in [18]. In linear chain probability of states given observations | the is proportional to as thea product of potential functions CRFs, a special form of CRF graphs thatPmodel output variable the conditional a special form of CRF graphs that model the output variable as a sequence sequence[9],[9], the conditional thatprobability link observations to consecutive states. Figure 3 shows a representation of the linear chain CRF of states given observations P→ |→ is proportional to the product potential functions probability of states given observations P x y is proportional to the product of potential functions model. Theobservations hidden states are dependent on input observation vectorof the ; linear each hidden state that link to consecutive states. Figure 3 shows a representation chain CRF that link observations consecutive states. Figure 3 shows ainput representation the linear chain CRF model. Theonhidden states are but dependent on the input observation vector As ;of each hidden state depends not one to input value, rather on whole vector. a result, they can be → → model. Thebyhidden y arevalue, dependent on input observation x ;As each hidden state depends not onstates one input rather on the wholewhich inputvector a result, they candepends bethe affected input observations frombut different time steps; isvector. considered an advantage of by input observations time steps; which considered an advantage ofby theinput notCRF onaffected one input value, but rather from on thedifferent whole input vector. As aisresult, they can be affected technique. CRF technique. observations from different time steps; which is considered an advantage of the CRF technique.
Figure Linear Chain model (adapted (adapted from [8]). Figure 3.3.3. Linear (adaptedfrom from [8]). Figure LinearChain Chain CRF CRF model [8]).
In the map matching vector →represents representsthe thesequence sequence locations In the map matchingproblem, problem,the thehidden hidden state state vector of of locations to to In the map matching problem, the hidden state vector y represents the sequence of locations to be calculated, i.e., correctedwalking walkingtrajectory; trajectory; our uses thethe cells in in be calculated, i.e., thethe corrected our Linear LinearChain ChainCRF CRFalgorithm algorithm uses cells be calculated, i.e., the corrected walking trajectory; our Linear Chain CRF algorithm uses the cells in map model hiddenstates/labels. states/labels. Vector Vector → represents that cancan be be either thethe map model as as thethe hidden representsthe thesystem systeminput input that either the map model as the locations hidden states/labels. Vector the system input that can as beineither x represents coordinates onan anestimated estimated trajectory obtained system, coordinates of of thethe locations on trajectory obtainedby bysome somelocalization localization system, as in coordinates of the locations on an estimated trajectory obtained by some localization system, as in the case of our algorithm,orordirect directmeasurements measurements from from sensors. thethe case of our algorithm, sensors. case ofThe our algorithm, or direct measurements from sensors. The CRF algorithm consists two phases: phases: the phase. CRF algorithm consists ofof two the forward forward phase phaseand andthe thebackward backward phase. Following the calculation of the conditional probabilities of all cells in all time steps during thethe The CRF algorithm consists of two phases: the forward phase and the backward phase. Following Following the calculation of the conditional probabilities of all cells in all time steps during forward phase, inference is carried out in the second phase (backward phase),with the optimal theforward calculation of the conditional probabilities of all cells in all time steps during the forward phase, phase, inference is carried out in the second phase (backward phase),with the optimal trajectory chosen from among different candidate trajectories. inference is carried out in the second phase (backward phase),with the optimal trajectory chosen from trajectory chosen from among different candidate trajectories. During the forward phase, the CRF algorithm evaluates the possible transitions at each time amongDuring different trajectories. thecandidate forward phase, the CRF algorithm evaluates the possible transitions at each time step according to the input trajectory and the transition graph obtained from the map model; this
step according to the input trajectory and the transition graph obtained from the map model; this
Sensors 2016, 16, 1302
6 of 16
During the forward phase, the CRF algorithm evaluates the possible transitions at each time step according to the input trajectory and the transition graph obtained from the map model; this involves calculating the probabilities of transition from all cells of the current time step to all cells of the next time step. A probability value is assigned to each cell in the map at each time step; this value represents how probable it is that the pedestrian is located in that cell at that time step (for example, the probability of pedestrian movement to a cell occupied by a wall is zero). This probability is also a conditional probability calculated using so-called feature functions that compute to what degree the input observations support the choice of a cell to be on the trajectory at that time step. However, cell selection depends not only on its probability value at the current time step, but also on the previous time steps, i.e., the path history, and how this cell is related to others in the path. Hence, choosing the cell with the highest probability is not enough; the probability of a whole trajectory should be calculated. The feature functions specify how the transition between two states is supported by the set of → observations x . The potential function of a cell is calculated at each time step by calculating the exponential of the summation of all feature functions that support the selection of this cell multiplied by the transition probability of moving to this cell from the current cell at the current time step. The higher the potential function value, the higher the probability of the cell to be the next cell. At each time step, j the potential function is the exponential of the summation of all feature functions fi at that time step, and can be written as: ψj
→ → m → x , y = exp( ∑ λi f i y j−1 , y j , x )
(1)
i =1
where m is the number of features andλi is the feature weight that can be determined by training the → → model. The conditional probability P x y of each cell is calculated by normalizing the potential function as follows: → → → → n n m → 1 1 P x y = → · ∏ ψj x , y = → ·exp( ∑ ∑ λi f i y j−1 , y j , x ) (2) Z x Z x j =1 j =1 i =1 where n is the number of output states/cells and Z is the normalization factor, with → Z x =
n
∑ ∏ ψj →
→ → x, y
(3)
x j =1
If N is the number of cells, then at every time step the potential functions should be calculated N × N times; however, knowing that transition can only happen between neighbor cells, one need only calculate the potential function of those neighbors. Thus, the calculation number at each step is O(N) and the complexity of the whole procedure is O(NT), where T is the number of time step observations. For each observation, each cell stores only one conditional probability value (the highest), while the previous cell that gave this value is also stored as the best parent. This is necessary for the following inference step. During the backward phase, inference is implemented in order to estimate the location over time, with the most likely sequence of hidden states calculated by maximising the sum of the conditional probability function. After the potential function for all cells is calculated for each observation, the optimal path is determined. This is carried out via a backward process using dynamic programming and the Viterbi algorithm [19]. The optimal path is obtained by maximizing the sum of conditional probabilities along the path: → → →∗ y = argmax P x y . (4) →
y
Sensors 2016, 16, 1302
Sensors 2016, 16, 1302
7 of 16
7 of 16
the previous time step, before summing these conditional probabilities along the path until reaching Although this process might be assumed to be of high complexity, as we need to choose between the first parent in theSensors path.2016, The16, path 1302with the highest sum is then chosen. The forward and backward 7 of 16 all possible trajectories, is shown actually linear because for each time step we only save one value of the phases of the map matching CRF algorithm itare in Algorithm 1. conditional probability each cell. Hence, numberprobabilities of candidate paths the reaching number of cells the previous time step,for before summing thesethe conditional along theequals path until (N). The backward process is O (NT), as we start a backward process for each cell by determining the the first parent in the path. The path with the highest sum is then chosen. The forward and backward Algorithm 1. CRF Algorithm for the Map Matching Problem best parent, which is the neighbor cell with the highest conditional probability at the previous time phases of the map matching CRF algorithm are shown in Algorithm 1. Input : Observation = a vector of coordinates of the input estimated trajectory 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
step, before summing these conditional probabilities along the path until reaching the first parent in
Output: CorrectedPath = a vector of coordinates of the output corrected trajectory Algorithm CRF with Algorithm for the Map Matching the path. The 1.path the highest sum is thenProblem chosen. The forward and backward phases of the map Forward Phase: matching CRF: Observation algorithm=are shown in Algorithm Input a vector of coordinates of the 1. input estimated trajectory 1 %Observation j = coordinates of the input at time step j For each observation (Observationj) Output: CorrectedPath = a vector of coordinates of the output corrected trajectory 2 For all cells (i) Algorithm 1. CRF Algorithm for the Map Matching Problem Forward Phase: 3 For all neighbor cells of i (k) %Observation j = input coordinates of the input at time step j For each observation=(Observation 1 4 Input : Observation a vector ofj)coordinates of the estimated trajectory fdis = T(i,k)/Distance (k,Observationj)%T(i,k): transition possibility from cell i to cell k {0,1} 2 5 Output: CorrectedPath = a vector of coordinates of the output corrected trajectory For all cells (i) Potential(k,Observationj) = exp(fdis); 3 6 ForwardFor Phase: all neighbor cells of i (k) = sum(Potential(:,j)) % normalization factor each observation (Observation 4 Z For j ) %Observation j = coordinates of the input at time step j 7 fdis = T(i,k)/Distance (k,Observationj)%T(i,k): transition possibility from cell i to cell k {0,1} ConditionalProbabilty 5 For all cells (i) (k,Observationj) = Potential(k,Observationj)/Z 8 Potential(k,Observationj) = exp(fdis); 6 For all neighbor cells i (k) If (ConditionalProbabilty (i,Observation j-1) > of ConditionalProbabilty (bestParent (k)) 9 ZT(i,k)/Distance = sum(Potential(:,j)) % normalization factor f = (k,Observation 7 j )%T(i,k): transition possibility from cell i to cell k {0,1} then CorrectedParent(k) = i dis ConditionalProbabilty (k,Observation j ) = Potential(k,Observation j)/Z 8 10 Potential(k,Observation ) = exp(f ); j dis End 9 11 Z = sum(Potential(:,j)) % normalization factor If (ConditionalProbabilty (i,Observationj-1) > ConditionalProbabilty (bestParent (k)) End 10 12 ConditionalProbabilty (k,Observationj ) = Potential(k,Observationj )/Z then CorrectedParent(k) =i End 11 If (ConditionalProbabilty (i,Observationj-1 ) > ConditionalProbabilty (bestParent (k)) End 13 End 12 then CorrectedParent(k) = i End 14 13 End Backward Phase End 15 14 = #Cells End #CandidatePaths End 15 16 End For p = 1 #CandidatePaths % p is the last cell in the CandidatePaths 16 17 End Backward Phase k=p % k is the current cell in the CandidatePath 17 18 Backward Phase = #Cells #CandidatePaths Construct CandidatePath: = #Cells 18 each #CandidatePaths % p is the last cell in the CandidatePaths 19 For p = 1 #CandidatePaths For all 19observations For p = 1(j 1) #CandidatePaths % p is the last cell in the CandidatePaths k=p % k is the current cell in the CandidatePath 20 20CandidatePath(j) k = p= k; %add cell K to the path%atktime is thestep current cell in the CandidatePath j Constructeach eachCandidatePath: CandidatePath: 21 21sum(CandidatePath) Construct = sum(CandidatePath) + ConditionalProbabilty(k,j) For observations (j 1) 22 22 For allallobservations 1) k = BestParent(k,j) % choose the best parent of k to be the next cell in the path 23 23 CandidatePath(j) k;%add %addcell cell path at time j CandidatePath(j) ==k; KK to to thethe path at time step step j End 24 24 sum(CandidatePath) = sum(CandidatePath) + ConditionalProbabilty(k,j) sum(CandidatePath) = sum(CandidatePath) + ConditionalProbabilty(k,j) End 25 k = BestParent(k,j) % choose the best parent of k to be the next cell in the path k = BestParent(k,j) % choose the best parent of k to be the next cell in the path 25 CorrectedPath with highest sum of ConditionalProbabilities 26= CandidatePath End 26 End
27 End End The main feature we use in our CRF map matching algorithm is the Euclidean distance 28 27that CorrectedPath = CandidatePath with highest sum of ConditionalProbabilities (in meters) between 28 the centre of the candidate cell and estimated derived from the CorrectedPath = CandidatePath withthe highest sum of position ConditionalProbabilities primary localization system; a smaller distance means a higher value of the potential function givenis the Euclidean distance The main feature that we use in our CRF map matching algorithm that the transition to thisThe cellmain is possible, as determined by theCRF transition graph obtained from feature that we use in our map matching algorithm is Euclidean distance (in meters) between the centre of the candidate cell and the estimatedthe position derived from the map model. The proposed feature function thusof bethe defined as follows: (in meters) between thecan centre candidate cell and the estimated position derived from the primary localization system; a smaller distance means a higher value of the potential function given primary localization system; a smaller distance means a higher value of the potential function given , that the transition to this=cell is possible, as determined by the transition graph obtained from the map that the transition to this cell is possible, as determined by the transition graph obtained from the , can thus be defined as follows: (5) model. The proposed feature function map model. The proposed feature function can thus be defined as follows: Minimum distance = cell size/2 where is a function that indicates the transition possibility from the Tcurrent (y, j−1 ,y j )cell to the candidate f= dis = Distance x ,y (5) cell, depending on the transition table, and which can be either 0 or 1, is the(, time is the (5) j j ) step and Minimum distance = cell size/2 current observation, which is the current location estimated by thedistance primary= localization system. This Minimum cell size/2 feature means that cells close toathe currentthat input locations higher probabilities far cells,cell to the candidate where is function indicates the have transition possibility fromthan the current where T is a function indicates possibility the cell to the candidate cell, depending on thethat transition table,the andtransition which can be either 0 orfrom 1, is thecurrent time step and is the cell, depending on the transition table,location and which can by be the either 0 orlocalization 1, j is the system. time step current observation, which is the current estimated primary Thisand x is means that cells close to is the current input locations have higher probabilities far cells, system. thefeature current observation, which the current location estimated by the primary than localization
This feature means that cells close to the current input locations have higher probabilities than far cells,
Sensors 2016, 16, 1302 Sensors 2016, 16, 1302
8 of 16 8 of 16
on the the condition condition that that transition transition to to the the cell cell is is possible possible from from its its neighbor neighbor cells cells and and no no obstacle obstacle forbids forbids on this transition. this transition. 3. Results 3. Resultsand andDiscussion Discussion The algorithm algorithm was was tested tested using using both both simulations simulations and The and data data obtained obtained from from real real measurements. measurements. Map information information for for the the 4th 4th floor floor of of the the DeustoTech DeustoTech building building at at the the University University of of Deusto Deusto (Spain) (Spain) Map was extracted from AutoCAD files and modelled as a grid-based map model. Section 3.1 presents was extracted from AutoCAD files and modelled as a grid-based map model. Section 3.1 presents the results results of of the the conducted conducted simulations, simulations, while while the the results results obtained obtained using using the the real real measurements measurements are are the presented in Section 3.2. presented in Section 3.2. 3.1. Simulations A base trajectory trajectory (shown in Figure 4) was was constructed constructed as the the ground ground truth, truth, with different unmatched trajectories then created randomly as the input (observations) of the map map matching matching algorithm. thethe input trajectory is theisoutput of an existing localization system algorithm. In Inreal-life real-lifeapplications, applications, input trajectory the output of an existing localization that represents the primary estimations (as we will in Section The map system is system that represents the primary estimations (asshow we will show in3.2). Section 3.2). matching The map matching applied toapplied refine the avoidingbycrossing obstacles; most feasible trajectory is system is to estimated refine the trajectory estimatedby trajectory avoiding crossingthe obstacles; the most feasible the sequence of positions violates the fewest constraints [12].constraints After matching the input trajectory trajectory is the sequencethat of positions that violates the fewest [12]. After matching the to the map using the CRFmap algorithm, (corrected) comparedtrajectory to the ground input trajectory to the using the thematched CRF algorithm, thetrajectory matchedwas (corrected) was truth in order to measure precision, the Cumulative Distribution Function (CDF) Distribution of the errors compared to the ground the truth in orderusing to measure the precision, using the Cumulative obtained calculating Euclidean distance between the position in the ground truth the Function by (CDF) of the the errors obtained by calculating theactual Euclidean distance between theand actual corresponding position. position in the corrected ground truth and the corresponding corrected position.
Figure 4. The Ground Truth, the arrows arrows indicate indicate the the direction direction of of walking. walking.
To simulate simulate different different possible possible input input trajectories trajectories with with different different qualities, qualities, different different levels levels of of noise noise To were added to the base track by allowing different degrees of deviation (distances) and randomness were added to the base track by allowing different degrees of deviation (distances) and randomness around the the ground ground truth. truth. A a primary primary estimation estimation (input) (input) around A low low noise noise level level was was used used to to represent represent a trajectory with with low low error error (mean (mean error error of of about about 1.3 1.3 m m and and standard standard deviation deviation of of about about 11 m m for for noise noise trajectory level 1); a simulated input with a high noise level was used to represent an input trajectory with high level 1); a simulated input with a high noise level was used to represent an input trajectory with high error (mean (meanerror errorofofabout about and standard deviation of about for noise level 4). 1Table error 2.52.5 mm and standard deviation of about 1.8 m1.8 formnoise level 4). Table shows1 shows approximate values of the mean and standard deviations of the four different noise levels approximate values of the mean and standard deviations of the four different noise levels obtained by obtained by simulating 25 random samples of each noise levels. Examples of different simulated simulating 25 random samples of each noise levels. Examples of different simulated trajectories with trajectories with different noise levels are 5. shown in Figure 5. different noise levels are shown in Figure
Sensors 2016, 16, 1302 Sensors 2016, 16, 1302
9 of 16 9 of 16
Table 1. Means and standard deviations for the different level of noise used in simulations. Table 1. Means and standard deviations for the different level of noise used in simulations.
Noise Level Mean (m) Noise Level 1 1.3Mean (m) 1 2 1.6 1.3 2 3 2.0 1.6 3 2.0 4 2.5 4
2.5
Standard Deviation (m) Standard Deviation 1 (m) 1 1.1 1.3 1.8
1.1 1.3 1.8
Figure with different different noise noise levels. levels. Figure 5. 5. Trajectories Trajectories with
The results of matching different input trajectories with the map are shown in Table 2. We can The results of matching different input trajectories with the map are shown in Table 2. We can see that in all the 300 samples that we have run the number of obstacles crossed by the map matched see that in all the 300 samples that we have run the number of obstacles crossed by the map matched trajectory is zero. This is an important proof of the efficiency of the algorithm. Mathematically, the trajectory is zero. This is an important proof of the efficiency of the algorithm. Mathematically, the best best path is the path with highest some of scores, the path might have the highest sum of score while path is the path with highest some of scores, the path might have the highest sum of score while one of one of these scores is zero; one single zero score might not affect much the whole sum. If at some these scores is zero; one single zero score might not affect much the whole sum. If at some point, the point, the score of the cell was zero because there is an encounter with an obstacle, it still can be in score of the cell was zero because there is an encounter with an obstacle, it still can be in the path if the the path if the sum of all points on the path is still high. However, such situation is very rare as our sum of all points on the path is still high. However, such situation is very rare as our algorithm and algorithm and features are designed to minimize the chance of crossing obstacles; what prevents features are designed to minimize the chance of crossing obstacles; what prevents crossing obstacles is crossing obstacles is that normally there are other free cells with higher potential scores to choose that normally there are other free cells with higher potential scores to choose instead of occupied cells. instead of occupied cells. That’s why the probability to cross obstacles is almost zero. No encounter That’s why the probability to cross obstacles is almost zero. No encounter with obstacles will happen with obstacles will happen unless it is necessary and there is no other choice; in odd cases like the in unless it is necessary and there is no other choice; in odd cases like the in the case of the divergence the case of the divergence (stuck) problem that happens when the trajectory is trapped in a small (stuck) problem that happens when the trajectory is trapped in a small area; the path might cross the area; the path might cross the wall to get out of the trap. To explain that, in some cases of divergence, wall to get out of the trap. To explain that, in some cases of divergence, the input location becomes so the input location becomes so far from the current cell to be evaluated (big distance), when we far from the current cell to be evaluated (big distance), when we calculate the feature (1/distance), and calculate the feature (1/distance), and the probability (Equation (2)) it will be of very small value the probability (Equation (2)) it will be of very small value (near to zero). That means no big difference (near to zero). That means no big difference of probability between free cells and blocked cells; both of probability between free cells and blocked cells; both would have very low scores because all are would have very low scores because all are distant from the real place. In this case, it is possible that distant from the real place. In this case, it is possible that the path will cross an obstacle to get out the path will cross an obstacle to get out from the trap, as again at the end the path is evaluated with from the trap, as again at the end the path is evaluated with the sum of probability along it not on a the sum of probability along it not on a single obstacle cross. According to our observations, this case single obstacle cross. According to our observations, this case of crossing obstacles happens very rarely of crossing obstacles happens very rarely and does not happen in most of divergence problem cases, and does not happen in most of divergence problem cases, but few of them. The divergence problem but few of them. The divergence problem itself happens rarely as we will see below. To study the itself happens rarely as we will see below. To study the effect of cell sizes and noise level, Table 2 was effect of cell sizes and noise level, Table 2 was constructed with different parameter combinations. constructed with different parameter combinations. To measure the precision, the cumulative error for To measure the precision, the cumulative error for 50% and 90% of positions on the trajectory was 50% and 90% of positions on the trajectory was calculated for different parameter values, in this case, calculated for different parameter values, in this case, cell size and noise level. For each parameter cell size and noise level. For each parameter combination, the algorithm was run on 25 random input combination, the algorithm was run on 25 random input trajectories and the average cumulative trajectories and the average cumulative error registered for 50% and 90% of positions. error registered for 50% and 90% of positions.
Sensors 2016, 16, 1302
10 of 16
Table 2. Simulated trajectory results using feature fdis . Path length = 352 m. Cell Size (m)
Noise Level
Number of Crossed Obstacles
0.8
1 2 3 4
Cumulative Error (m) 50%
90%
0 0 0 0
0.9867 2.682 5.7945 6.5158
2.6699 19.7820 32.464 32.3861
1
1 2 3 4
0 0 0 0
1.1574 1.2463 1.4853 2.1409
3.078 3.8172 5.1879 11.1061
1.5
1 2 3 4
0 0 0 0
19.4403 16.6941 17.7461 20.0473
70.7003 61.6751 68.5491 71.8129
It is clear from Table 2 that a cell size of 1m provides good results, and even when adding high levels of noise to the input the results still acceptable. At a low noise level (equal to 1), for 50% of the trajectory the average cumulative error is 1.16 m and for 90% the average cumulative error is 3.078 m, based on a total track length of 352 m. Using a larger cell size of 1.5 m length results in more cells occupied by obstacles and thus blocked to movement; this explains the high level of error observed in the respective results, as illustrated by the example shown in Figure 6a. Figure 6b illustrates how the corridor is blocked because it contains cells occupied by side walls. Besides being high in error we also notice the results of this cell size are also totally random in relation with the noise level, higher error of low noise level is just due to the random input samples, using different samples would result in different numbers, that explains the available results of 70.7003 m using noise level 1 while smaller error 61.6751 m error is obtained using noise level 2, which might look strange but it is normal; noise level in this specific situation is irrelevant and does not have a significant impact on the results as the error that results from the blocked corridors is much higher than the error that results from the noise. We have run other samples and we got results which looks normal (smaller error for less noise level) but we kept the current result to illustrate the irrelevance of the noise level in this case. Although the use of a cell size of 0.8 m would be expected to lead to more precise results, the obtained results show that this is correct only for low noise input; at higher noise levels the error was high, likely due to what we call the stuck or divergence problem. This particular problem occurs when the corrected trajectory enters a room or small area and cannot leave; an example of such a situation is shown in Figure 7. Although with smaller cells it is more likely that certain cells will contain no obstacles, some of these empty cells could form entrances to small areas, allowing the trajectory to pass through them and become trapped. The stuck problem recorded in 0.8 m cell size simulations resulted in high error values in about 16% of samples, with the remainder achieving good results (average cumulative error of 4.2522 m for 90% of observations at a noise level of 2). It should be noted that the values shown in Table 2 indicate the average cumulative errors in all samples, including those experiencing the stuck problem, which explains the high value of some of these averages. Figure 8 displays examples of the results obtained for different cell sizes and different noise levels.
Sensors 2016, 16, 1302 Sensors 2016, Sensors 2016,16, 16,1302 1302
11 of 16
11 16 of 16 11 of
Sensors 2016, 16, 1302
11 of 16
(a) (a)
(b) (b)
Figure 6.Large Large cellsresult resultin in more cells cells being occupied by obstacles and blocked totomovement, and Figure movement, Figure 6. 6. Large cells cells result in more more cells being being occupied occupied by by obstacles obstacles and and blocked blocked movement, and and (a) (a) (b) toto corridors can be blocked. shows a map matched trajectory for a cell size equal 1.5 m; occupied corridors (a) shows corridors can can be be blocked. blocked. (a) shows aa map map matched matched trajectory trajectory for for aa cell cell size size equal equal to to 1.5 1.5 m; m; occupied occupied cells are marked blue circles (b)cells where weoccupied can see how the corridor that is marked by theand red Figure 6. Large with cells result in more being by obstacles and blocked to movement, cells cells are are marked marked with with blue blue circles circles (b) (b) where where we we can can see see how how the the corridor corridor that that is is marked marked by by the the red red circle is blocked by obstacles. corridors can be blocked. (a) shows a map matched trajectory for a cell size equal to 1.5 m; occupied circle circle is is blocked blocked by by obstacles. obstacles. cells are marked with blue circles (b) where we can see how the corridor that is marked by the red 70 circle is blocked by obstacles. 70 60 70
60
50 60 4050
40 3040
stuck problem stuck problem stuck problem
meters
meters meters
50
30 2030
20 1020 10 010 0 0
00
20
20
0
20
Figure 7. Stuck problem
6050 60
60 60 50
5040 50
50 50 40
meters
3020 2010
meters meters meters
70 70 60
meters
7060 70
4030
40
30
estimated trajectory corrected Trajectory estimated trajectory estimated trajectory corrected Trajectory corrected 60 Trajectory80
20
10
10 0 0
20 0
0
0
100
120
100
100
120
120
m, noise level = 4.
Figure 7. 7. Stuck problem scenario, cell size == 0.8 m, noise level = = 4. Figure 7. Stuck size Figure Stuckproblem problemscenario, scenario, cell cell 70 size = 0.8 m, noise level = 4.
70
meters
actual trajectory estimated trajectory actual trajectory corrected trajectory actual trajectory estimated trajectory start and end points estimated trajectory trajectory corrected corrected trajectory start and end points 40 60end points80 start and meters 40 60 80 40 60 80 meters scenario, cell size = 0.8 meters
0
40 20
20
40
40
meters 60 meters 60
(a) meters (a) (a)
80
80
40 30 40 30 20 30 20 10 20
100 100
100
120 120
120
10 0 10 0 0
0
0
0
Figure 8. Cont. Figure 8. Cont. Figure Figure 8. 8. Cont. Cont.
20 20
20
estimated trajectory corrected Trajectory estimated trajectory estimated trajectory corrected Trajectory 40 60 Trajectory 80 corrected meters 40 60 80 meters60 40 80
(b) (b)meters
(b)
100
120
100
120
100
120
Sensors 2016, 16, 1302
12 of 16
Sensors 2016, 16, 1302
12 of 16 70
60
60
50
50
40
40
meters
meters
70
30
20
20 estimated trajectory corrected Trajectory
10
0
30
0
20
40
60 meters
estimated trajectory corrected trajectory
10
80
100
0
120
0
20
40
70
60
60
50
50
40
40
30
20
100
120
30
20 estimated trajectory corrected trajectory
10
0
80
(d)
70
meters
meters
(c)
60 meters
0
20
40
60 meters
(e)
estimated trajectory corrected trajectory
10
80
100
120
0
0
20
40
60 meters
80
100
120
(f)
Figure 8. Ground truth, estimated and corrected trajectories obtained using different cell sizes
Figure Ground truth,(a)estimated cell sizes and8. noise levels. Cell sizeand = 1corrected m, noisetrajectories level = 2;obtained (b) Cell using size =different 1 m, noise leveland = 3;noise levels. (a) Cell 1 m,level noise = size 2; (b) Cell size =level 1 m, level 3; (c)noise Cell size (c) Cell size =size 1 m,=noise = 4;level (d) Cell = 0.8 m, noise = 1;noise (e) Cell size== 0.8m, level== 1 m, (f) Cell size(d) = 0.8 m, noise noise2;level = 4; Cell size level = 0.8= 3. m, noise level = 1; (e) Cell size = 0.8 m, noise level = 2; (f) Cell size = 0.8 m, noise level = 3. A moderate walking speed is thought to be around 1.67 steps/s (1.67 Hz) [20], with an average step length of 0.7 m for women and 0.78 m for men [21], resulting in a movement speed A moderate walking speed is thought to be around 1.67 steps/s (1.67 Hz) [20], with an average of around 1.3 m/s. Assuming a frequency of input of 1 estimation/s (1 Hz), i.e., a time step equal to 1 step s, length of 0.7would m formove women 0.78time m for men inAs a movement speed of around the subject 1.3 and m every step, 2.6[21], m in resulting 2 steps, etc. the proposed map model 1.3 m/s. Assuming a frequency of input of 1 estimation/s (1 Hz), i.e., a time step equal to 1 s, the employs discrete values for movement (cell units), allowing only one cell transition, for example,subject in would move m (buffer every time step, to 2.61 m in would 2 steps,mean etc. that As the mapnot model employs discrete each time1.3 step size equal cell) theproposed system would be able to capture values movement units), walking allowingatonly one cell transition, for example, in each thefor movement of a(cell pedestrian moderate speed; it is therefore more suitable to time have step flexibility of movement by allowing 1, orthe 2 cell moves in each step to bycapture allowingthe transition to of (buffer size equal to 1 cell) would mean0,that system would nottime be able movement neighbor cells and to neighbors of neighbors (buffer size equal to 2), which is necessary in order to a pedestrian walking at moderate speed; it is therefore more suitable to have flexibility of movement obtain a 0,correct assuming cell size m. If thetransition input frequency was 2cells Hz and (2 to by allowing 1, or 2trajectory cell moves in eacha time step of by1allowing to neighbor estimations/s), this would mean an estimation every half-second time step. During this time period, neighbors of neighbors (buffer size equal to 2), which is necessary in order to obtain a correct trajectory a pedestrian moving at moderate speed moves around 0.65 m in each time step, a distance smaller assuming a cell size of 1 m. If the input frequency was 2 Hz (2 estimations/s), this would mean an than the cell size (1 m). In this case, if only one cell transition was allowed there should not be a estimation every half-second time step. During this time period, a pedestrian moving at moderate problem, while allowing more possibilities (1 or 2 cells) should also work. In some systems there is speednomoves around 0.65 in each time step, distanceregistered smaller than the cell (1 m). the In this case, if fixed frequency formestimations, with thea location at each stepsize whatever speed. only In one cell transition was allowed there should not be a problem, while allowing more possibilities the present case, if we assume a moderate walking speed the average frequency will be around (1 or 1.67 2 cells) Insize some there is no fixed frequency estimations, with Hz, should meaningalso thatwork. a buffer of 1systems cell would be suitable. However, in allfor cases a buffer size of the location registered at each stepinwhatever In the present if we assume a moderate 2 cells is still more flexible capturing the any speed. speed variations and incase, dealing with other types of movement suchaverage as running. walking speed the frequency will be around 1.67 Hz, meaning that a buffer size of 1 cell would TheHowever, results shown Table 2 are those for2 cells a buffer sizemore of 2 flexible cells andinassuming be suitable. in allincases a buffer size of is still capturingthat anythe speed estimations of the primary localization system are registered at a frequency of 1 Hz. To variations and in dealing with other types of movement such as running.
The results shown in Table 2 are those for a buffer size of 2 cells and assuming that the estimations of the primary localization system are registered at a frequency of 1 Hz. To experimentally test the
Sensors 2016, 16, 1302 Sensors 2016, 16, 1302
13 of 16 13 of 16
experimentally test and the effect of buffer size and estimation frequency on further the proposed algorithm, effect of buffer size estimation frequency on the proposed algorithm, simulations were further simulations were carried out using a cell size equal to 1 and noise level equal to 1; the results carried out using a cell size equal to 1 and noise level equal to 1; the results of these simulations are of these simulations are shown in Table 3. It is clear from this table that the results were worse when shown in Table 3. It is clear from this table that the results were worse when using a small buffer size using small buffer size (1 cell) 1Hzm frequency, 33.1054 m error forto90% of data (1 cell)aand 1Hz frequency, withand 33.1054 error for with 90% of data compared the 3.078 mcompared recorded to the 3.078 m recorded for a buffer size of 2 cells and the same 1 Hz frequency. This discrepancy is for a buffer size of 2 cells and the same 1 Hz frequency. This discrepancy is due to the inability of the due to thetoinability the trajectory to using reachsmall the correct using small steps, as the trajectory reach theof correct destination steps, asdestination the estimated trajectory was moving estimated trajectory was moving faster. Furthermore, the adoption of a 2 Hz measurement faster. Furthermore, the adoption of a 2 Hz measurement frequency achieved better results, even when frequency achieved results, even a small provides buffer. Nevertheless, a high buffer size using a small buffer. better Nevertheless, a highwhen bufferusing size always more flexibility. always provides more flexibility. Table 3. Simulated trajectory results for different step lengths and measurement frequencies using cell Table results for different size = 13.mSimulated and noisetrajectory level = 1. Path length = 352 m. step lengths and measurement frequencies using cell size = 1 m and noise level = 1. Path length = 352 m. Estimation Frequency
Estimation Frequency 1 Hz1 Hz 2 Hz2 Hz 1 Hz 1 Hz2 Hz 2 Hz
Buffer Size
Buffer Size 1 cell 1 cell 1 cell 1 cell 2 cells 2 cells 2 cells 2 cells
Cumulative Error (Meters)
Cumulative Error (Meters) 50% 90% 50% 90% 12.4931 33.1054 12.4931 33.1054 1.1348 2.4989 1.1348 2.4989 1.1574 3.078 1.1574 3.078 1.3639 3.0044 1.3639 3.0044
3.2. Real Measurements 3.2. Real Measurements Testing was carried out on the 4th floor of the DeustoTech building at the University of Testing was carried out on the 4th floor of the DeustoTech building at the University of Deusto Deusto (Bilbao, Spain). Primary results were estimated via a step-and-heading based Pedestrian (Bilbao, Spain). Primary results were estimated via a step-and-heading based Pedestrian Dead Dead Reckoning (PDR) system using a wrist-worn inertial measurement unit composed of three Reckoning (PDR) system using a wrist-worn inertial measurement unit composed of three accelerometers and three gyroscopes. Heading drift was reduced by applying the method known as accelerometers and three gyroscopes. Heading drift was reduced by applying the method known as improved Heuristic Drift Elimination (iHDE) [22], with the developed map matching algorithm then improved Heuristic Drift Elimination (iHDE) [22], with the developed map matching algorithm then applied offline. Estimation frequency was not constant but depended on the step, as the employed applied offline. Estimation frequency was not constant but depended on the step, as the employed PDR system calculated the location at each step. For the map model a cell size of 1m and a buffer size PDR system calculated the location at each step. For the map model a cell size of 1m and a buffer size of 2 cells were used. The estimated results of the primary localization system and the results corrected of 2 cells were used. The estimated results of the primary localization system and the results by CRF map matching for one round walk of the DeustoTech area are shown in Figure 9b. Figure 9c corrected by CRF map matching for one round walk of the DeustoTech area are shown in Figure 9b. displays the estimated trajectory for two rounds, with the corrected trajectory for this estimated Figure 9c displays the estimated trajectory for two rounds, with the corrected trajectory for this trajectory shown in Figure 9d; the ground truth is shown in Figure 9a. estimated trajectory shown in Figure 9d; the ground truth is shown in Figure 9a.
(a)
(b) Figure 9. Cont. Figure 9. Cont.
Sensors 2016, 16, 1302
14 of 16 14 of 16
70
70
60
60
50
50
40
40
meters
meters
Sensors 2016, 16, 1302
30
30
20
20
10
10
0
0
20
40
60 meters
(c)
80
100
120
0
0
20
40
60 meters
80
100
120
(d)
Figure 9. Ground Truth, estimated trajectory obtained from real measurements using iHDE and
Figure 9. Ground Truth, estimated trajectory from real measurements using iHDE and trajectory corrected using CRF. (a) Groundobtained truth; (b) Estimated and corrected trajectories; trajectory corrected using CRF. (a) Ground truth; (b) Estimated and corrected trajectories; (c) Estimated (c) Estimated trajectory for two circuits; (d) Corrected trajectory for two circuits. trajectory for two circuits; (d) Corrected trajectory for two circuits. Analysis of Figure 9 reveals that although the primary estimated trajectory crosses a number of walls and obstacles, was corrected successfully using the map matching algorithm and zero Analysis of Figurethis 9 reveals that although the primary estimated trajectory crosses a number obstacles was crossed by the map matched trajectory, see Table 4. As a consequence of matching the of walls and obstacles, this was corrected successfully using the map matching algorithm and zero trajectory with the map, the accuracy was also improved; even though the accuracy of the iHDE obstacles was crossed by the map matched trajectory, see Table 4. As a consequence of matching trajectory was already good, the map matching using CRF has further improved the accuracy and the trajectory with the map, the accuracy was also improved; even though the accuracy of the iHDE the cumulative error was decrease as shown in Table 4. Before applying the map matching, using trajectory was already good, the map matching using CRF has further improved the accuracy and the only the primary localization system (iHDE), the average cumulative error was 1.2861 m and for cumulative error was decrease as for shown 4. Before The applying the map matching, usingby only 50% of the trajectory 2.6858 m 90% in ofTable the trajectory. cumulative error was reduced the applying primary the localization (iHDE), the average cumulative error wastrajectory, 1.2861 mand and2.2316 for 50% CRF mapsystem matching algorithm to be 1.0634 m for 50% of the m of the for trajectory 2.6858 m for 90% of the trajectory. The cumulative error was reduced by applying the 90% of the trajectory; which means improvement of 17.5% and 19.8%, respectively.
CRF map matching algorithm to be 1.0634 m for 50% of the trajectory, and 2.2316 m for 90% of the Table 4. Real measurement results using feature fdis and respectively. cell size = 1 m. Path length = 347 m. trajectory; which means improvement of 17.5% and 19.8%, Cumulative Error (m) TableAlgorithm 4. Real measurement results using feature fdis and cell size = 1 m. Path length = 347 m. Number of Crossed Obstacles 50% 90% iHDE 15 1.2861 2.6858 Cumulative Error (m) iHDEAlgorithm + CRF 2.2316 Number 0of Crossed Obstacles 1.0634 50% 90% 15 1.2861 2.6858 4. Conclusions iHDE iHDE + CRF 0 1.0634 2.2316 The presented research has demonstrated the successful use of CRF as a model in the offline map matching process. Our CRF algorithm is able to correct trajectories produced by localization 4. Conclusions systems when given the appropriate map information. Grid cells of fixed length were used for the map with a has cell size of 1 m producing acceptableuse results. Theas usea of a larger Therepresentation, presented research demonstrated the successful of CRF model in cell the size offline achieved less accurate results, mainly due to obstacles blocking entire cells, a feature unsuitable for map matching process. Our CRF algorithm is able to correct trajectories produced by localization sites with narrow such as corridors. On the other hand, smaller cells were should provide systems when givenareas the appropriate map information. Grid although cells of fixed length used for the more accurate results, they also potentially increase the occurrence of the divergence problem that map representation, with a cell size of 1 m producing acceptable results. The use of a larger cell size occurs when trajectories are stuck inside small semi-enclosed areas. Even though the experiments achieved less accurate results, mainly due to obstacles blocking entire cells, a feature unsuitable for were carried out with pedestrians in mind, the same algorithm can be applied to any object moving sites with narrow areas such as corridors. On the other hand, although smaller cells should provide within the normal pedestrian speed range; this is because the algorithm can be loosely coupled with more accurate results, theyand alsoispotentially increase occurrence of the divergence problem any localization system independent of that the system implementation, instead dealing withthat occurs trajectories are stuck inside small semi-enclosed areas. Even though the experiments onlywhen its output coordinates. were carried out with pedestrians in mind,finding the same algorithm be to applied to any object moving Our future work will first involve methods with can which solve the stuck problem, within the normal pedestrian speed range; this is because the algorithm can be loosely coupled with possibly by adding more semantic information such as rooms and doors to the map model. In anyaddition, localization system is independent that system implementation, instead with only using more and context information of and adding more algorithm features can dealing help to further improve the results; people behavior related to the building structure can be another source of its output coordinates.
Our future work will first involve finding methods with which to solve the stuck problem, possibly by adding more semantic information such as rooms and doors to the map model. In addition, using
Sensors 2016, 16, 1302
15 of 16
more context information and adding more algorithm features can help to further improve the results; people behavior related to the building structure can be another source of information that we intend to employ in order to improve the accuracy. Exploring ways in which to use the CRF model efficiently for online applications is another future task we will work on. Acknowledgments: This work was supported by the Spanish Ministry of Economy and Competitiveness as part of the ESPHIA project (TIN2014-56042-JIN), as well as by the Erasmus Mundus Avempace II project (2012-2622/001-001-EMA2). Author Contributions: Safaa Bataineh wrote the paper, designed the system, developed the CRF algorithm and the map model, performed the experiments and analyzed the data; Alfonso Bahillo supervised and coordinated the overall process, contributed to the development and testing of the algorithm and reviewed the paper; Luis E. Díez developed the PDR localization system, provided the input data, and reviewed the paper; Enrique Onieva reviewed the paper; Ikram Bataineh helped in autoCAD and drew the building map using AutoCAD. Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations GNSS CRF HMM PDR DXF CDF iHDE
Global Navigation Satellite System Conditional Random Fields Hidden Marcov Models Pedestrian Dead-Reckoning Drawing Interchange file format Cumulative Distribution Function improved Heuristic Drift Elimination
References 1.
2. 3. 4. 5.
6.
7.
8. 9.
10. 11.
Klepeis, N.E.; Nelson, W.C.; Ott, W.R.; Robinson, J.P.; Tsang, A.M.; Switzer, P.; Behar, J.V.; Hern, S.C.; Engelmann, W.H. The National Human Activity Pattern Survey (NHAPS): A resource for assessing exposure to environmental pollutants. J. Expo. Anal. Environ. Epidemiol. 2001, 11, 231–252. [CrossRef] [PubMed] Shang, J.; Hu, X.; Gu, F.; Wang, D.; Yu, S. Improvement Schemes for Indoor Mobile Location Estimation: A Survey. Math. Probl. Eng. 2015, 2015, 397298. [CrossRef] Quddus, M.A.; Ochieng, W.Y.; Noland, R.B. Current map matching algorithms for transport applications: State-of-the art and future research directions. Transp. Res. C Emerg. Technol. 2007, 15, 312–328. [CrossRef] Arulampalam, M.S.; Maskell, S.; Gordon, N.; Clapp, T. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Trans. Signal Process. 2002, 50, 174–188. [CrossRef] Davidson, P.; Collin, J.; Takala, J. Application of particle filters for indoor positioning using floor plans. In Ubiquitous Positioning Indoor Navigation and Location Based Service (UPINLBS); IEEE: New York, NY, USA, 2010; pp. 1–4. Klepal, M.; Pesch, D. A bayesian approach for rf-based indoor localization. In Proceedings of the 4th International Symposium on Wireless Communication Systems ISWCS, Trondheim, Norway, 17–19 October 2007; pp. 133–137. Klepal, M.; Beauregard, S. A backtracking particle filter for fusing building plans with PDR displacement estimates. In Proceedings of the 5th Workshop on Positioning, Navigation and Communication WPNC, Hannover, Germany, 27 March 2008; pp. 207–212. Klinger, R.; Tomanek, K. Classical Probabilistic Models and Conditional Random Fields; Algorithm Engineering TU: North Rhine-Westphalia, Germany, 2007. Lafferty, J.; McCallum, A.; Pereira, F.C. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the 18th International Conference on Machine Learning, Williamstown, MA, USA, 28 June–1 July 2001. Ramos, F.T.; Fox, D.; Durrant-Whyte, H.F. CRF-Matching: Conditional Random Fields for Feature-Based Scan Matching. In Robotics: Science and Systems III; MIT Press: Cambridge, MA, USA, 2007. Pereira, F.C.; Costa, H.; Pereira, N.M. An off-line map matching algorithm for incomplete map databases. Eur. Tranp. Res. Rev. 2009, 1, 107–124. [CrossRef]
Sensors 2016, 16, 1302
12.
13. 14.
15. 16. 17.
18. 19. 20.
21. 22.
16 of 16
Xiao, Z.; Wen, H.; Markham, A.; Trigoni, N. Lightweight map matching for indoor localization using conditional random fields. In Proceedings of the 13th International Symposium on Information Processing in Sensor Networks IPSN-14, Berlin, Germany, 15–17 April 2014; pp. 131–142. Xu, M.; Du, Y.; Wu, J.; Zhou, Y. Map Matching Based on Conditional Random Fields and Route Preference Mining for Uncertain Trajectories. Math. Probl. Eng. 2015, 2015, 717095. [CrossRef] Lamarche, F.; Donikian, S. Crowd of virtual humans: A new approach for real time navigation in complex and structured environments. In Computer Graphics Forum; Blackwell Publishing, Inc.: Oxford, UK, 2014; pp. 509–518. Drawing Interchange and File Formats, Release 12. Available online: http://cnc25.free.fr/documentation/ format_dxf/dxf.pdf (accessed on 22 March 2016). Autodesk, Inc. (n.d.) Drawing Interchange File Formats. Available online: http://www.autodesk.com/ techpubs/autocad/acadr14/dxf/drawing_interchange_file_formats.htm (accessed on 22 March 2016). Schäfer, M.; Knapp, C.; Chakraborty, S. Automatic generation of topological indoor maps for real-time map-based localization and tracking. In Proceedings of the 2011 International Conference on Indoor Positioning and Indoor Navigation (IPIN), Guimaraes, Portugal, 21–23 September 2011; pp. 1–8. Jordan, A. On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. Adv. Neural Inform. Process. Syst. 2002, 14, 841. Forney, G.D., Jr. The viterbi algorithm. In Proceedings of the IEEE; IEEE: New York, NY, USA, 1973; Volume 61, pp. 268–278. Elsevier Health Sciences. Moderate Intensity Walking Means 100 Steps Per Minute. ScienceDaily 21 March 2009. Available online: www.sciencedaily.com/releases/2009/03/090317094719.htm (accessed on 22 April 2016). Determine Your Stride. Available online: https://cdn.shopify.com/s/files/1/0196/4616/files/Soleusdeterming_your_stride.pdf (accessed on 23 April 2016). Diez, L.E.; Bahillo, A.; Bataineh, S.; Masegosa, A.D. Enhancing Improved Heuristic Drift Elimination for Wrist-Worn PDR Systems in Buildings. In Proceedings of the IEEE Vehicular Technology Conference, Nanjing, China, 15–18 May 2016. © 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).