The VLDB Journal DOI 10.1007/s00778-008-0115-0
REGULAR PAPER
Instance optimal query processing in spatial networks Ke Deng · Xiaofang Zhou · Heng Tao Shen · Shazia Sadiq · Xue Li
Received: 19 September 2007 / Revised: 29 July 2008 / Accepted: 3 August 2008 © Springer-Verlag 2008
Abstract The performance optimization of query processing in spatial networks focuses on minimizing network data accesses and the cost of network distance calculations. This paper proposes algorithms for network k-NN queries, range queries, closest-pair queries and multi-source skyline queries based on a novel processing framework, namely, incremental lower bound constraint. By giving high processing priority to the query associated data points and utilizing the incremental nature of the lower bound, our algorithms are better optimized than the corresponding algorithms based on the known frameworks, incremental Euclidean restriction and incremental network expansion. More importantly, the proposed algorithms are proven to be instance optimal among classes of algorithms. Experiments on real road network datasets demonstrate the superiority of the proposed algorithms.
Keywords Spatial networks · Spatial queries · Instance optimality · Incremental lower bound constraint
K. Deng (B) · X. Zhou · H. T. Shen · S. Sadiq · X. Li
School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, QLD 4072, Australia
1 Introduction
An important category of spatial queries is related to spatial networks (e.g. road networks and water systems) due to their prevalence in a wide range of applications, such as intelligent transport systems, location-based services, and mobile workforce management. In these applications, network distances are usually used to measure the spatial relationships between data points accurately. For example, the query “give the shopping centers within 3 km from me” is relative to the proximity of my current location. Here, the network distance between any two points corresponds to the shortest path in the network along which an object (a person or a vehicle) can physically move from one point to the other. Network-based spatial queries have been extensively studied recently [10,13,18].

In a spatial query, if an environment dataset (i.e. a spatial network in this paper) needs to be used for distance calculations, the query is called constraint-based. In certain other applications, the distance between two points can be approximated using their Euclidean distance. The case of using the Euclidean distance is called constraint-free, as the distance between any two points can be calculated using only the coordinates of the two points. These two types of queries use quite different query processing strategies. For a constraint-free query, the key to efficient query processing is to reduce the number of data points to be accessed (i.e. to minimize the portion of the point dataset D accessed when answering a query). For a constraint-based query, however, one additional and often more critical goal is to minimize the portion of network data to be accessed and the number of network distance calculations required. This is because network data are typically much larger and much more complex than D, and network distance calculations are typically done using a costly shortest path algorithm.
When processing network queries, network distances can be either calculated on-the-fly [13,18], or pre-computed and stored on disk to improve online performance. For a spatial network where the pre-computed distances between all pairs of $n$ points (network nodes and/or data points) are completely maintained, the required storage of $O(n^2)$ can be very large. Moreover, any change, such as a road closure, causes updates of all relevant distance information materialized on disk in order to reflect the change of the network layout. The cost incurred can be very high if the changes involve the shortest paths of many pairs and/or the changes happen often. In this case, it is reasonable and attractive to use algorithms with on-the-fly network distance computation, which is the focus of this paper. We assume that the Euclidean distance between any two points is a lower bound of their network distance.

We propose algorithms for processing network k-NN queries, range queries, closest-pair queries and multi-source skyline queries (MN-skyline)¹ based on a novel processing framework, namely incremental lower bound constraint (LBC). As aforementioned, the optimization focus of network queries is on minimizing network data accesses and network distance calculations. By giving high processing priority to the query associated points and utilizing the incremental nature of the lower bound in our algorithms, network distances need to be computed only for data points in the final solution, while for the others only a necessary lower bound is computed. As a consequence, the network data accesses can be reduced to a just-enough level.² Compared to the corresponding algorithms based on the existing frameworks, incremental Euclidean restriction (IER)³ and incremental network expansion (INE), our algorithms access less network data (or the same in some cases) by performing fewer (or the same in some cases) network distance calculations.
More importantly, we prove that our algorithms are instance optimal over a class of algorithms with respect to network data accesses and network distance calculations. Instance optimality [5] corresponds to optimality in every instance (i.e. any spatial network and data point set), as opposed to just the worst case or the average case. The experiments using real datasets have demonstrated the optimality of our algorithms in various settings. This work aims to process network queries where the network data are typically much larger and more complex than D. However, in the extreme case where D is so large that each network contains several data points, our algorithms still demonstrate competitive performance compared to algorithms based on IER and INE.

¹ The multi-source skyline query is also known as spatial skyline query [16].
² The network data accesses and the network distance calculations correspond to each other: the more calculations are performed, the more network data are accessed, and vice versa (see Sect. 5.1).
³ LBC can be thought of as an optimization of IER.
This paper is a major extension of our previous work [3]. There we proposed three algorithms to process MN-skyline queries and proved the instance optimality of the algorithms based on LBC. In this paper, we extend that work by generalizing the incremental LBC into a framework and applying it to more network query types, including k-NN queries, range queries and closest-pair queries. More comprehensive related work, experiments and analysis are provided to make this paper self-contained.

The rest of the paper is organized as follows. Section 2 discusses network modeling and shortest path algorithms, and Sect. 3 reviews related work. In Sect. 4, the framework is introduced by first proposing the network k-NN query processing algorithm, followed by the algorithms for range queries, closest-pair queries and MN-skyline queries. Section 5 analyzes the proposed algorithms against algorithms based on IER and INE and proves the optimality of the proposed algorithms. After that, Sect. 6 presents the empirical performance study. The paper is concluded in Sect. 7.
2 Network modeling and distance computation
In this paper, we use road networks to illustrate spatial networks. A road network can be modeled as a graph $G = (E, V)$, where $V$ is a set of nodes corresponding to road junctions and $E$ is a set of (non-directional) edges between two nodes in $V$ corresponding to road segments. An edge can be a straight line or a polyline. If there is an edge in $E$ linking two nodes in $V$, these two nodes are adjacent to each other. Let $Adj(v)$ denote all the adjacent nodes of $v \in V$. Let $d(v, v')$ be the distance along a path between two nodes $v$ and $v'$ (i.e. the total length of the edges along a network path between the two nodes). If there is no path connecting $v$ and $v'$, $d(v, v') = \infty$. The network distance between $v$ and $v'$, denoted as $d_N(v, v')$, is defined using the network shortest path between the two nodes. In addition, we use $d_E(v, v')$ to denote their Euclidean distance.

Two representative network shortest path algorithms are Dijkstra’s algorithm [4] and the A* algorithm [11]. They both propagate a search “wavefront” from a source node $v_s$ until a destination node $v_d$ is reached. A heap $H$ is used to keep all the nodes in the wavefront. The initial step of Dijkstra’s algorithm is to put every node $v \in Adj(v_s)$, together with the distance $d(v_s, v)$ (set to the length of the edge linking $v_s$ and $v$), into $H$. Then, the algorithm iterates an expansion process by replacing a node $v \in H$, where $d(v_s, v)$ is the minimum for all nodes in $H$, by all the nodes in $Adj(v)$. For each node $v' \in Adj(v)$, $d(v_s, v')$ is set to $d(v_s, v) + d(v, v')$ if $v'$ is not yet in $H$ or (in case $v'$ is in $H$) the existing estimate of $d(v_s, v')$ is larger than $d(v_s, v) + d(v, v')$. This process terminates when $v_d$ is selected from $H$ to expand
(and $d_N(v_s, v_d) = d(v_s, v_d)$). Dijkstra’s algorithm can compute the shortest paths from a source node to multiple destination nodes. If there is only one destination, the wavefront expansion process can be optimized toward the direction of the destination node. This is the key idea of A*. Instead of selecting $v \in H$ with the minimum $d(v_s, v)$, a node $v' \in H$ with the minimum $d(v_s, v') + d_E(v', v_d)$ is selected to expand.⁴ That is, when selecting a node $v'$, not only is the computed network distance from $v_s$ to $v'$ considered, but the Euclidean distance from $v'$ to $v_d$ is also used as a directional guide. For any node $v \in H$, $d(v_s, v) + d_E(v, v_d)$ is called the distance lower bound of $v$ from $v_s$ to $v_d$ (denoted as $lb(v, v_s, v_d)$, or $v.lb$ when not causing ambiguity). Clearly, any valid network path from $v_s$ to $v_d$ via $v$ cannot be shorter than $v.lb$. One important property of A* is that it can find the network shortest paths to multiple destinations by traversing the network only once. To do that, besides maintaining the previous state of $H$, all intermediate network nodes need to be kept with their network distances to the source [18].

Both Dijkstra’s algorithm and A* compute the shortest path on-the-fly. There also exist several efficient algorithms that pre-compute the network distances between all pairs of nodes in order to avoid on-the-fly computation. One approach is to develop a transitive closure of $G$, at the cost of storage overhead. This type of approach is not suitable for large road networks, as it typically requires $O(|V|^2)$ disk space [2]. Another approach divides a large graph into smaller subgraphs and organizes them in a hierarchical fashion [6,9]. For each subgraph, the boundary nodes are the entrances and only the distances between these boundary nodes are pre-computed.
While computing the network distance using A* or Dijkstra’s algorithm, intermediate nodes inside a subgraph are skipped over, and thus this hierarchical subgraph structure is suitable for calculating network distances in very large networks. A recent study [8] stores and indexes the network distances between network nodes and data points with different levels of accuracy according to the distances between them. The drawback of this approach is the high maintenance cost when networks are updated.

To compute the network distance for a given pair of data points, we need to know on which edge each point is located and the distances from it to this edge’s two ends. In [13], the network model and the point dataset are stored independently. Such a storage scheme supports data independence for these two types of data. However, the geometric computation of mapping data points onto networks is an expensive operation. Another storage scheme [18] avoids the cost of online mapping by linking the networks and the point dataset with each other. One disadvantage of this scheme is that a network model is hard coded with a specific point dataset. In this paper, we use a middle layer by partially materializing the mapping between the two datasets. If a point $p$ is on a network edge $e$ between two adjacent nodes $v$ and $v'$, the distances $d(v, p)$ and $d(v', p)$ are pre-computed, and the id of $e$ is stored in the middle layer with the id of $p$ and the two pre-computed distances. This middle layer can be indexed using a B⁺-tree on edge ids and another B⁺-tree on data point ids. In either Dijkstra’s algorithm or the A* algorithm, the network distance computation can start by finding the edge of the query point using the B⁺-tree on data points. When an edge is visited by the expanding “wavefront”, the data points on this edge can be retrieved using the B⁺-tree on edges. More specifically, if an edge $(v, v')$ is visited by replacing $v$ in $H$ with $Adj(v)$, $v'$ is inserted into $H$ and $d_N(q, v)$ is computed. The data point on $(v, v')$, e.g. $p$, is retrieved. We say that $d_N(q, p)$ is computed if $d_N(q, v) + d(v, p)$ is less than $v''.lb$ for all $v'' \in H$. Obviously, the mapping cost is proportional to the amount of network accesses.

⁴ $d(v_s, v') = d_N(v_s, v')$ when $v'$ is selected to expand.
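To make the A* expansion described in this section concrete, the following Python sketch computes $d_N(v_s, v_d)$ with the Euclidean distance as the directional guide. It is a minimal illustration under stated assumptions, not the implementation used in this paper: the adjacency-list layout `adj`, the coordinate table `coords` and the function name `a_star` are hypothetical, both endpoints are taken to be network nodes, and the B⁺-tree middle layer is omitted.

```python
import heapq
import math


def a_star(adj, coords, source, target):
    """Return d_N(source, target), or math.inf if no path exists.

    adj:    node -> list of (neighbor, edge_length)   (assumed layout)
    coords: node -> (x, y), used for the Euclidean distance d_E
    Edge lengths are assumed to be at least the Euclidean distance
    between their end nodes, so v.lb never overestimates.
    """
    def d_e(u, v):
        (x1, y1), (x2, y2) = coords[u], coords[v]
        return math.hypot(x1 - x2, y1 - y2)

    dist = {source: 0.0}                      # best known d(v_s, v)
    heap = [(d_e(source, target), source)]    # entries are (v.lb, v)
    expanded = set()
    while heap:
        _, v = heapq.heappop(heap)            # node with the minimum v.lb
        if v in expanded:
            continue                          # stale heap entry
        if v == target:
            return dist[v]                    # d(v_s, v_d) = d_N(v_s, v_d)
        expanded.add(v)
        for w, length in adj[v]:
            nd = dist[v] + length
            if nd < dist.get(w, math.inf):    # new node or better estimate
                dist[w] = nd
                heapq.heappush(heap, (nd + d_e(w, target), w))
    return math.inf
```

Replacing `d_e(w, target)` by zero turns the same loop into Dijkstra’s algorithm, which is the undirected expansion the text contrasts A* with.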
3 Related work
Two frameworks are widely used to process queries in spatial networks. One is incremental network expansion (INE) and the other is incremental Euclidean restriction (IER). These two frameworks were generalized by Papadias et al. [13] to process a series of query types in networks. Similar approaches have also been applied to processing network aggregated nearest neighbor queries by Yiu et al. [18] and to processing network multi-source skyline queries in our recent work [3]. In this section, we review the algorithms for processing network k-NN queries, range queries, closest-pair queries [13] and multi-source skyline queries [3] based on INE and IER.

3.1 Incremental Euclidean restriction
IER generally performs pre-processing in Euclidean space first. Then the network distances are computed and used to delimit the candidate region by utilizing the lower bound nature of the Euclidean distance. Since the source and destination are known, the A* algorithm can be used to reduce the network search region. The exception is range queries, where the network search proceeds as in Dijkstra’s algorithm. This is because the range expands equally in all directions in this query type.

• IER k-NN Query (IER-KNN) Given a point dataset $D$ and a query point $q$ on spatial networks, a $k$ nearest neighbor ($k$-NN) query finds $S \subseteq D$ such that $|S| = k$ and, for any points $p \in S$ and $p' \in D - S$, $d_N(q, p) \le d_N(q, p')$. First, the query is processed on the point dataset based on Euclidean distances. The retrieved
Fig. 1 An example of IER-KNN. $T_1 = d_N(q, p_1)$ is computed first and is then updated to $T_2 = d_N(q, p_2)$ after $d_N(q, p_2)$ is computed
$k$ points are candidates and their network distances to $q$ are computed. Next, a threshold $\tau$ is defined and set to the $k$th smallest network distance among all candidates (i.e. the largest among the $k$ best candidates found so far). If another point has a Euclidean distance to the query point less than $\tau$, this point is a new candidate. This guarantees that there are no false misses in IER-KNN, that is, no data point in the final solution is mistakenly missed. To minimize the total number of candidates, IER-KNN each time retrieves only one new candidate $p$ and computes $d_N(q, p)$. If $\tau > d_N(q, p)$, $\tau$ is updated to reflect the new $k$th smallest distance ($d_N(q, p)$ itself when $k = 1$). As a result, the candidate region is reduced. Figure 1 shows an example of a query with $k = 1$. $p_1$ is the first Euclidean NN and $\tau = d_N(q, p_1)$. For this $\tau$, $p_2$ and $p_3$ are candidates. IER-KNN next retrieves $p_2$ and computes $d_N(q, p_2)$. As shown, $\tau > d_N(q, p_2)$, so $\tau$ is updated to $d_N(q, p_2)$. As $d_N(q, p_2) < d_E(q, p_3)$, $p_3$ is pruned from the candidate set. IER-KNN terminates when no more candidates can be found using the current $\tau$ and returns the candidates with the $k$ minimum network distances.

• IER Range Query (IER-RQ) Given a query point $q$ and a point dataset $D$ on spatial networks, for a given value $e$, a range query retrieves $S \subseteq D$ such that $d_N(q, p) \le e$, $\forall p \in S$. In the first step, a set of points is retrieved from the point dataset as candidates based on Euclidean distance. The second step expands a “wavefront” in the network from the query point in all directions to find all edges whose network distances are less than $e$ or that cross the boundary (similar to Dijkstra’s algorithm). Only the candidates located on these edges are returned and the other candidates are discarded.

• IER Closest-Pair Query (IER-CP) Given two datasets $S$, $T$ on spatial networks and a value $k$, a closest-pair query retrieves a set of pairs $CP = \{(s, t) \mid s \in S, t \in T\}$, $|CP| = k$, such that $d_N(s, t)_i < d_N(s, t)_j$ for $(s, t)_i \in CP$ and $(s, t)_j \in AP - CP$, where $AP = \{(s, t) \mid \forall s \in S, \forall t \in T\}$. Since there is no query point, the only approach is to treat every point in one dataset, say $S$, as a source. In the first step of IER-CP, the $k$ closest pairs are retrieved based on Euclidean distances with the support of a spatial index structure such as an R-tree. These pairs are candidates and the network distance of each pair is computed. The second step defines a threshold $\tau$ and sets $\tau$ to the $k$th smallest network distance among all candidates. Any other pair whose Euclidean distance is less than $\tau$ is retrieved as a new candidate. Similarly to IER-KNN, IER-CP retrieves only one such candidate each time to reduce the candidate region. When no more candidates can be found using the current $\tau$, IER-CP terminates and returns the candidate pairs with the $k$ minimum network distances.

• IER MN-skyline (IER-SK) Given a point dataset $D$, a network $E$ and a set of query points $Q$, a multi-source skyline query finds $S \subseteq D$ such that for any point $p \in S$, there is no other point $p' \in D - \{p\}$ with $d_N(q, p) \ge d_N(q, p')$ for $\forall q \in Q$. The skyline query by its nature can be widely used in decision support, financial data analysis and data mining. For example, find a hotel that is cheap and close to the University. While most skyline queries consider one query point, there are many applications that consider multiple query points at the same time, e.g. find a hotel close to the Botanic Garden, the University and the China Town. Our work [3] is the first study of this query type in road networks. In a constraint-free space, multi-source skyline queries have been studied in a recent work [16], where the same query type is known as the spatial skyline query. The basic idea is to perform the multi-source skyline query in Euclidean space first. Let $Sky_E$ be the Euclidean skyline retrieved using an R-tree based on the algorithm given in [3].
This algorithm is a simple extension to the approach proposed in [13].⁵ The points in $Sky_E$ are candidates and they are kept in a candidate set $C$. For each $p \in Sky_E$, the network distances to all $q \in Q$ are computed. If a point $p' \in D - Sky_E$, represented by $p'(d_E(q_1, p'), \ldots, d_E(q_{|Q|}, p'))$, cannot be dominated by $p(d_N(q_1, p), \ldots, d_N(q_{|Q|}, p))$ for all $p \in Sky_E$, $p'$ is a new candidate and is put into $C$. After retrieval of all such candidates, we compute the network distances from each candidate to all $q \in Q$. The network skyline, denoted as $Sky_N$, can be identified by pairwise comparing all candidates in $C$ based on the network distances. There are no false misses using this method. This is because, while a Euclidean skyline point $p$ may not be a network skyline point, all the data points potentially dominating $p(d_N(q_1, p), \ldots, d_N(q_{|Q|}, p))$ are retrieved.

⁵ Note that $Sky_E$ can also be retrieved using the algorithms proposed in [16], which can improve the performance to some extent but require complex data structures. $Sky_E$ retrieval is a filtering step here and the IER-SK performance is dominated by the costly refinement step where network distances are computed.
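The candidate test and the final pairwise comparison of IER-SK can be illustrated with a short Python sketch. It is only a sketch under assumptions: the function names and the dictionary layout (point id mapped to a tuple of distances, one per query point) are hypothetical, the distance vectors are taken as already computed, and dominance is implemented in the common form (no farther from every query point and strictly closer to at least one), which is slightly stricter than the wording of the definition above.

```python
def dominates(a, b):
    """True if vector a dominates vector b: a is no larger in every
    dimension and strictly smaller in at least one (assumed definition)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def new_candidates(sky_network, others_euclid):
    """Filtering step of the basic IER-SK idea.

    sky_network:   id -> network distance vector of a point in Sky_E
    others_euclid: id -> Euclidean distance vector of a point in D - Sky_E
    A point stays a candidate unless some network vector dominates its
    Euclidean vector (its network vector can only be larger, so a
    dominated point can never enter the network skyline).
    """
    return [pid for pid, ev in others_euclid.items()
            if not any(dominates(nv, ev) for nv in sky_network.values())]


def network_skyline(network_vectors):
    """Refinement step: pairwise comparison of all candidates in C."""
    return sorted(
        pid for pid, v in network_vectors.items()
        if not any(dominates(w, v)
                   for other, w in network_vectors.items() if other != pid)
    )
```

The pairwise comparison is quadratic in the number of candidates, which is acceptable here because, as footnote 5 notes, the cost is dominated by the network distance computations rather than by the comparisons.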
Fig. 2 An example for IER-SK: (a) candidate space defined by $p_1^n$; (b) candidate space defined by $p_5^n$
In [3], we also provided an approach to report network skyline points incrementally. Let us use the example in Fig. 2a to describe the idea. $p_1^e, \ldots, p_5^e$ are the data points $p_1, \ldots, p_5$ in Euclidean space (the dynamic space as termed in [12]). $p_1^e$ is a Euclidean skyline point retrieved from the R-tree and kept in $C$. Then, the network distances from $p_1^e$ to all query points are computed, so $p_1^e$ is shifted to $p_1^n$. The grey area defined by $p_1^n$ and the origin in Fig. 2a is used to fetch all data points in that region, thus $p_2^e, \ldots, p_4^e$ are added into $C$. Their network distances to all query points are then computed, and these points are shifted to their new positions $p_2^n, \ldots, p_4^n$. Skyline points can then be determined by pairwise comparing all data points in $C$ ($p_1^n, \ldots, p_4^n$). If a data point dominates $p_1^n$ and it cannot be dominated by other points, it is a network skyline point. As shown in Fig. 2a, $p_4^n$ is a network skyline point. $p_4^n$ and all the data points dominated by $p_4^n$ (i.e. $p_1^n$, $p_3^n$) are removed from $C$. Any undetermined data points (i.e. $p_2^n$) remain in $C$. Then, the next iteration starts by retrieving the next Euclidean skyline point using the R-tree. Notice that an entry is not accessed if it is within the region defined by $p_1^n$. In Fig. 2b, $p_5^e$ is the next Euclidean skyline point and $p_5^n$ is the shifted $p_5^e$. Any dominant candidates from the candidate space defined by $p_5^n$ should be compared with the previously determined network skyline points ($p_4^n$ so far) before determining whether they are network skyline points. If no more Euclidean skyline points can be retrieved, each of the undetermined candidates in $C$ is a network skyline point and the algorithm terminates. The candidates can be further pruned. As shown in Fig. 2, instead of retrieving and processing $p_2^e, \ldots, p_4^e$ in one batch, the local Euclidean skyline points among $p_2^e, \ldots, p_4^e$ can be retrieved and processed first.
That is, only $p_2^e$ and $p_4^e$ are processed and shifted to $p_2^n$ and $p_4^n$, respectively. Since $p_4^n$ dominates $p_3^e$, IER-SK need not retrieve and process $p_3^e$.

3.2 Incremental network expansion
In INE, a “wavefront” is gradually expanded in the network from the query point such that the data points closer to the query point are visited earlier than others. This is similar to using Dijkstra’s algorithm to find the shortest network distances to multiple destinations.
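The wavefront expansion just described can be sketched for the k-NN case as follows. This is a hedged illustration, not the algorithm of [13] verbatim: the function name `ine_knn`, the adjacency layout `adj` and the `points_on_edge` table (edge mapped to data points with their pre-computed offsets to both end nodes) are assumptions, and the query point is simplified to a network node. Data points are pushed into the same heap as network nodes, so they are reported in ascending network distance.

```python
import heapq


def ine_knn(adj, points_on_edge, q, k):
    """Return up to k (point_id, network_distance) pairs, nearest first.

    adj:            node -> list of (neighbor, edge_length)
    points_on_edge: (u, v) -> list of (point_id, dist_from_u, dist_from_v)
                    (hypothetical stand-in for the edge-to-point mapping)
    q:              query location, assumed here to be a network node
    """
    NODE, POINT = 0, 1
    heap = [(0.0, NODE, q)]          # the expanding "wavefront"
    settled, reported, result = set(), set(), []
    while heap and len(result) < k:
        d, kind, x = heapq.heappop(heap)
        if kind == POINT:
            if x not in reported:    # first pop gives the network distance
                reported.add(x)
                result.append((x, d))
            continue
        if x in settled:
            continue                 # stale heap entry for a settled node
        settled.add(x)
        for w, length in adj[x]:
            # Data points on edge (x, w), stored under either orientation.
            for pid, from_u, _ in points_on_edge.get((x, w), []):
                heapq.heappush(heap, (d + from_u, POINT, pid))
            for pid, _, from_v in points_on_edge.get((w, x), []):
                heapq.heappush(heap, (d + from_v, POINT, pid))
            if w not in settled:
                heapq.heappush(heap, (d + length, NODE, w))
    return result
```

The loop stops as soon as k data points have left the heap, which mirrors the termination condition of INE-KNN; no Euclidean information is used, so the expansion proceeds in all directions.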
• INE k-NN Query (INE-KNN) and Range Query (INE-RQ) It is straightforward to use INE in processing network $k$-NN queries and range queries. For a $k$-NN query, INE-KNN terminates once $k$ data points are visited by the “wavefront”. For a range query, the basic idea is that any data point visited during the expansion of the “wavefront” is returned. While its network data accesses and candidate size are the same as in the basic idea, INE-RQ [13] first finds all edges with network distances less than $e$ or crossing the boundary, and then retrieves all data points on these edges to answer the query.

• INE Closest-Pair Query (INE-CP) First, one point $s \in S$ finds its $k$ network NNs in the other dataset $T$. The $k$ pairs formed by $s$ and each of these $k$ nearest neighbors are put into the candidate set $C$. Let $\tau$ be the maximum network distance among all candidates in $C$. For another point $s' \in S$, INE-CP expands a “wavefront” from $s'$ to find its network NNs. For each $t \in T$ with $d_N(s', t) < \tau$, $(s', t)$ is inserted into $C$ to replace the existing candidate with distance $\tau$, and $\tau$ is updated. INE-CP stops processing $s'$ once the distance from the “wavefront” to $s'$ exceeds the current $\tau$. INE-CP terminates when all $s \in S$ have been examined.

• INE MN-skyline (INE-SK) The basic idea of this algorithm is to find the next nearest neighbor based on the network distance alternately for each query point using Dijkstra’s shortest path algorithm [3]. That is, the search space of each query point, defined as a circle around the query point, is expanded in a collaborative way until all skyline data points are found. Conceptually, there are two phases in this algorithm. The first phase, called the filtering phase, ends when the first data point visited by all query points is found. During this phase, all nearest neighbor data points encountered (from all query points) are considered as candidates of the network skyline points and are stored in a candidate set $C$. When this phase terminates (i.e.
a data point $p$ is visited by all query points), for any data point $p'$ not in $C$, it is clear that for every query point $q$, $d_N(p, q) \le d_N(p', q)$. In other words, all points not in $C$ when this phase terminates are dominated by $p$. Notice that $p$ is a skyline point. This is obvious: let the last query point to visit $p$ be $q$; then there exist no other data points which are closer to all other query points as well as to $q$. In the second phase (called the refinement phase), the points in $C$ are checked, using the same process of alternately expanding the search space from each query point to find its next nearest neighbor. Following the same observation as above, the next data point encountered is a skyline point as long as it is not dominated by the skyline points that have already been discovered. Notice that new data points encountered during this phase (i.e. not already in $C$) are simply discarded. Assume now that $p$ is identified as a skyline point by INE-SK. Let $C(p, q) \subseteq C$, $q \in Q$, be the set of data points
Fig. 3 An example of INE-SK: (a) candidate space, where $p_1$ is the 1st object visited by all query points; (b) candidate space pruned by $p_2$, the 2nd object visited by all query points
that are visited by $q$ after $q$ visits $p$. Then, the candidates in $\bigcap_{q \in Q} C(p, q)$ can be safely pruned, as they are dominated by $p$. Figure 3 is an example for two query points ($q_1$, $q_2$) and five data points ($p_1$ to $p_5$) (for presentation simplicity, Euclidean distances are used to illustrate the idea). $p_1$ is the first data point visited by all query points, and thus is the first skyline point. All the points within the two circles in Fig. 3a, $\{p_2, p_3, p_5\}$, are visited before $p_1$ is visited by both $q_1$ and $q_2$, and thus are included in $C$. Note that $p_4$ cannot be a skyline point, and thus can be safely pruned. During the refinement phase, $p_2$ is found as the next data point in $C$ visited by all query points (shown in Fig. 3b). $p_2$ is examined by comparing it with all previously computed skyline points. In this example, as $p_2$ is not dominated by $p_1$, it is reported as the next skyline point. All data points in $C$ which are dominated by $p_2$ are pruned from $C$, as shown in the shaded area in Fig. 3b. In this case, $p_3$ is pruned from $C$, but $p_5$ remains to be checked (and is later identified as a skyline point when $q_1$ visits it).

4 Incremental lower bound constraint
One major problem with INE is that the search space of each query point expands in all directions. This may cause unnecessary network search. To control the expansion directions, IER retrieves candidates in Euclidean space first. Then the network distances from the query point to these candidates are computed. While IER improves the strategy for search space expansion using the directional expansion property of the A* algorithm, it needs to compute the network distances for all candidates. In the example shown in Fig. 1, $p_1$ is a candidate not in the final solution, but IER-KNN has to compute $d_N(q, p_1)$. Given the very high cost of network distance calculations, it is desirable to reduce their number.
Such observations motivate us to design a new framework that utilizes the incremental lower bound and gives high processing priority to the points most likely to be in the final solution. As a consequence, query processing algorithms can reduce network data accesses to an optimal level, since network distances are computed only for points appearing in the final solution, while for the others only a just-enough lower bound is computed.

Let us define the concept of path distance lower bound, which is the key for our query processing algorithms to minimize the cost of network distance computation. Recall that A* propagates a search wavefront from a given source node $v_s$ to a given destination node $v_d$. Every node in the wavefront is kept in a heap $H$ with its distance lower bound $lb$, and if $v.lb \le v'.lb$ for $\forall v' \in H$, $v$ is selected as the expansion node. Clearly, $v.lb$ cannot exceed the length of the shortest path from $v_s$ to $v_d$, even if the shortest path, which is not known yet, does not pass through $v$. Therefore, $v.lb$ is a distance lower bound of the shortest path, and is called the path distance lower bound, denoted as $plb$. Let $v'$ and $v''$ be the expansion nodes just before and after $v$ is used as an expansion node; we have $v'.lb \le v.lb \le v''.lb$. In other words, the path distance lower bound is non-decreasing during the process of search space expansion (the initial path distance lower bound is the Euclidean distance between $v_s$ and $v_d$, and the final value when the shortest path is found is the actual network distance). In summary, the following is a formal definition of path distance lower bound.

Definition Given any two points $q$ and $p$ located in networks, the A* algorithm is used to compute the network distance from $q$ to $p$ with a heap $H$. Each network node $v$ in the “wavefront” is kept in $H$ with $v.lb = d(q, v) + d_E(v, p)$. The path distance lower bound, $plb$, is defined as $\min_{v \in H} v.lb$.

In our algorithms, each candidate has a $plb$ to the query point (if there is more than one query point, such as in MN-skyline, it may have several lower bounds, one for each query point). By expanding the network search toward the candidates most likely to be in the final solution (i.e. candidates with the minimum $plb$), the network data accesses in query processing can be minimized to an optimal level.
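The incremental nature of $plb$ can be illustrated with a small Python generator that pauses an A* search after every expansion and reports the current $\min_{v \in H} v.lb$. This is a sketch under assumptions rather than the data structure used by the LBC algorithms: the name `plb_steps` and the `adj` and `coords` layouts are hypothetical, $q$ and $p$ are simplified to network nodes, and edge lengths are assumed to be no shorter than the Euclidean distance between their end nodes.

```python
import heapq
import math


def plb_steps(adj, coords, q, p):
    """Yield the path distance lower bound plb before each expansion.

    The first value is d_E(q, p); the values never decrease; the last
    value is d_N(q, p), or math.inf if p cannot be reached from q.
    adj:    node -> list of (neighbor, edge_length)   (assumed layout)
    coords: node -> (x, y)
    """
    def d_e(u, v):
        (x1, y1), (x2, y2) = coords[u], coords[v]
        return math.hypot(x1 - x2, y1 - y2)

    dist = {q: 0.0}
    heap = [(d_e(q, p), q)]           # entries are (v.lb, v)
    expanded = set()
    while heap:
        lb, v = heap[0]               # minimum v.lb in H
        if v in expanded:
            heapq.heappop(heap)       # discard stale entry
            continue
        yield lb                      # current plb
        if v == p:
            return                    # plb now equals d_N(q, p)
        heapq.heappop(heap)
        expanded.add(v)
        for w, length in adj[v]:
            nd = dist[v] + length
            if nd < dist.get(w, math.inf):
                dist[w] = nd
                heapq.heappush(heap, (nd + d_e(w, p), w))
    yield math.inf
```

Because the search is a generator, a caller can advance it with `next()` only while the candidate’s $plb$ is still the smallest among all candidates and abandon it otherwise, which is the behaviour the framework relies on to avoid completing network distance computations for points outside the final solution.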
4.1 LBC k-NN queries (LBC-KNN)
The pseudo code of LBC-KNN is illustrated in Fig. 4. LBC-KNN first finds the $k+1$ Euclidean nearest neighbors $p_1, \ldots, p_{k+1}$ of the query point $q$. The points $p_1, \ldots, p_k$ are stored in a candidate set $C$ and $p_{k+1}$ is stored in $B$. The candidates in $C$ are processed as follows. For each candidate $p$, $p.plb$ is initialized as its Euclidean distance to the query point, and a heap denoted as $H_p$ is used to keep a copy of the current “wavefront”.⁶ For each $v \in H_p$, $v.lb$ is relative to $p$, that is, $v.lb = d(q, v) + d_E(v, p)$. As discussed, the minimum $v.lb$ for $\forall v \in H_p$ is the lower network distance bound of $p$, $p.plb$. In this way, based on the same “wavefront”, each candidate $p \in C$ has its own $p.plb$.

⁶ That is, the heap is the same for all candidates.
access v and Adj(v) if v.lb