A scalable and fast OPTICS for clustering trajectory big data

Cluster Comput (2015) 18:549–562 DOI 10.1007/s10586-014-0413-9

Ze Deng · Yangyang Hu · Mao Zhu · Xiaohui Huang · Bo Du

Received: 27 September 2014 / Revised: 21 November 2014 / Accepted: 8 December 2014 / Published online: 20 December 2014 © Springer Science+Business Media New York 2014

Abstract Clustering trajectory data is an important way to mine the hidden information behind moving-object sampling data, for example to understand trends in movement patterns, and it has gained high popularity in geographic information science. In the era of 'Big Data', the current approaches for clustering trajectory data generally do not apply, owing to excessive costs in both scalability and computing performance on trajectory big data. Aiming at these problems, this study first proposes a new clustering algorithm for trajectory big data, namely Tra-POPTICS, by modifying a scalable clustering algorithm for point data (POPTICS). Tra-POPTICS employs a spatiotemporal distance function and trajectory indexing to support trajectory data, and it can process trajectory big data in a distributed manner to achieve high scalability. Towards providing a fast solution to clustering trajectory big data, this study has explored the feasibility of utilizing contemporary general-purpose computing on the graphics processing unit (GPGPU). The GPGPU-aided clustering approach parallelizes Tra-POPTICS with the Hyper-Q feature of the Kepler GPU and massive GPU threads. The experimental results indicate that (1) the Tra-POPTICS algorithm has a clustering quality comparable to T-OPTICS (the state-of-the-art work on clustering trajectories in a centralized fashion) and outperforms T-OPTICS by four times on average in terms of scalability, and (2) G-Tra-POPTICS has a clustering quality comparable to T-OPTICS as well and further gains about a 30-fold speedup on average for clustering trajectories compared to Tra-POPTICS with eight threads. The proposed algorithms exhibit great scalability and computing performance in clustering trajectory big data.

Z. Deng (B) · Y. Hu · M. Zhu · X. Huang · B. Du
School of Computer Science, China University of Geosciences (Wuhan), Wuhan, Hubei, China
e-mail: [email protected]
Y. Hu e-mail: [email protected]
M. Zhu e-mail: [email protected]
X. Huang e-mail: [email protected]
B. Du e-mail: [email protected]

Keywords Trajectory big data · Clustering · Big data computing · GPGPU

1 Introduction

Big data is a collection of datasets so large and complex that it is difficult to work with using traditional data processing algorithms and models [46]. The challenges include data acquisition, modeling [24,49], storage [30], processing [31,47], security [51], analysis [2,27], service [23,44,45], etc. More specifically, with the recent wide availability of global positioning system (GPS), radio frequency identification (RFID), remote sensor and satellite technologies, trajectory data can be collected on a massive scale and in a real-time manner. In other words, in the current era of 'Big Data', trajectory data is trajectory big data. How to efficiently and effectively discover and create value from trajectory big data is an important research problem [40]. For example, in many big cities, data centers collect a large volume of GPS trajectory data from taxis at a certain frequency and quickly process these data to discover passengers' mobility and taxis' pick-up/drop-off behaviours. Based on the discovered knowledge, the data centers can recommend to taxi drivers where they can find passengers quickly [50]. Consider a Beijing traffic monitoring system [8] for another


example: every day the system analyzes large volumes of GPS trajectory data from Beijing road traffic to infer the root causes of anomalies. The insights from GPS trajectory big data can help the authorities avoid city road traffic jams. To discover the hidden value behind trajectory data, a basic data analysis task is to find moving objects that move in a similar way [43]. Therefore, an efficient and effective clustering approach for trajectories is essential for such 'Big Data' analysis tasks.

Clustering is the process of grouping a set of data into meaningful subclasses [21]. Clustering has been widely used in numerous applications such as market research, pattern recognition and data analysis. A number of clustering algorithms have been proposed in the past decades. These algorithms can roughly be categorized into four classes [21]: partition-based (e.g., K-means [32] and K-medoids [35]), hierarchy-based (e.g., SLINK [41] and CLINK [14]), grid-based (e.g., flexible grid-based clustering [1]) and density-based, such as DBSCAN [16] and OPTICS [4]. However, prior work focused mainly on clustering point data. In the past two decades, some clustering algorithms for trajectory data have been reported in the literature. Most clustering algorithms for trajectory data are model-based, distance-based or density-based [22]. The model-based methods derive models (such as a Markov model [3] or vector fields [17]) to describe trajectory data and cluster them using generic clustering algorithms (like K-means). The distance-based approaches (e.g., CluST [48]) cluster trajectories using generic clustering algorithms with trajectory distance functions. The density-based ones (e.g., TraClus [25] and Tra-DBSCAN [26]) use a density threshold around each trajectory to distinguish the relevant data items from noise.

In the current era of 'Big Data', clustering trajectory big data faces new challenges. First, the amount of trajectory data is growing to extreme volumes [52]. This requires that clustering methods process massive trajectories in a scalable manner. Unfortunately, previous trajectory clustering algorithms process trajectory data in a centralized fashion. Second, trajectory big data is generated at a real-time rate and is collected in the form of data streams [42]. Quickly processing trajectory big data thus becomes a grand performance challenge for clustering methods. Prior trajectory clustering approaches no longer apply to these new problems. There is a pressing need for a new trajectory clustering approach that supports (1) scalability to trajectory big data of increasing density and (2) high performance for fast clustering of trajectory big data.

To address these research challenges, this study first proposes a new trajectory clustering approach that aims at scalable clustering of trajectory big data. The scalable trajectory clustering approach is developed based on a scalable density-based clustering approach for point data (POPTICS [36]). POPTICS is a distributed OPTICS that uses multiple CPU cores or machines to simultaneously cluster a set of


point data based on the disjoint-set data structure and Prim's minimum spanning tree (MST) algorithm. Inspired by POPTICS, we extended POPTICS to scalably cluster trajectory big data; the resulting algorithm is called Tra-POPTICS. Tra-POPTICS inherits the scalability of POPTICS and incorporates support for processing trajectory data, including the construction of MSTs for trajectories and a trajectory indexing technique.

Another challenge is to achieve fast clustering of trajectory big data. This study focuses on gearing contemporary parallel computing technologies to Tra-POPTICS. The modern graphics processing unit (GPU) [11] has evolved into a highly parallel, multithreaded, many-core processor far beyond a graphics engine, and it is used for processing various types of data such as remote sensing data [28] and neural data [9,12]. The GPU is particularly suited for solving distributed clustering that can be expressed as task-parallel computations, and Tra-POPTICS is exactly such a case. Tra-POPTICS is implemented on one modern GPU with the Kepler architecture [34], yielding G-Tra-POPTICS. G-Tra-POPTICS can execute Tra-POPTICS in parallel with the Hyper-Q feature [34] of the Kepler GPU. Furthermore, the sub-procedures of Tra-POPTICS have been individually parallelized on the GPU.

A number of experiments have been carried out to evaluate the scalability of Tra-POPTICS. An OPTICS approach for clustering trajectories in a centralized fashion (T-OPTICS [33]) is used for comparison. Furthermore, G-Tra-POPTICS has been examined against Tra-POPTICS. The experimental results indicate that (1) Tra-POPTICS shows scalable speedups of four times on average on a 4-CPU-core machine compared to T-OPTICS, and (2) G-Tra-POPTICS on a Kepler GPU [34] dramatically outperforms Tra-POPTICS in terms of time efficiency, by 30 times on average.

The main contributions of this study are as follows:

1. We developed a scalable density-based trajectory clustering algorithm, Tra-POPTICS. Tra-POPTICS meets the need of clustering trajectory big data in a scalable manner.
2. We designed a parallel Tra-POPTICS on one Kepler GPU, G-Tra-POPTICS. G-Tra-POPTICS can significantly improve the time efficiency of clustering trajectories based on the Hyper-Q feature of the GPU.
3. In the course of designing G-Tra-POPTICS, we also provide parallel algorithms for the sub-procedures of Tra-POPTICS on the GPU. This is a successful attempt to use massive GPU threads for efficiently parallelizing Tra-POPTICS.

To the best of our knowledge, the proposed method is the first trajectory clustering on a Kepler GPU in a distributed manner for trajectory big data. Additionally, this study only


handles the OPTICS-based trajectory clustering method. Nevertheless, the approach can be straightforwardly applied to other trajectory clustering algorithms.

The remainder of this paper is organized as follows: Section 2 presents the background of clustering trajectory data. Section 3 formulates our problem. Section 4 proposes the scalable POPTICS algorithm for clustering trajectories (Tra-POPTICS). Section 5 introduces the design of the GPU-aided Tra-POPTICS (G-Tra-POPTICS). Section 6 presents the experiments and results for the performance evaluation of the proposed approaches. Section 7 concludes this paper with a summary and a plan for future work.

2 Background

A number of successful attempts have been made to cluster trajectories. The most salient works along this direction are described in this section. Trajectory clustering methods can be roughly classified into three categories: model-based, distance-based and density-based.

The model-based methods model the partial or whole trajectory dataset and look for a set of fitting parameters of the model to represent different clusters. For instance, in [13] and [3], the entire set of trajectories is represented as clusters with a regression mixture model and a Markov model, respectively, and the EM algorithm is used to estimate the parameters of these models. Like [13], Camargo et al. employ a regression mixture model to cluster tropical cyclone trajectories [7]. Recently, Ferreira et al. [17] use vector fields to model trajectories, and K-means is used to do the clustering. However, these methods model trajectories as a whole, so some similar portions of trajectories cannot be detected [25].

The distance-based clustering algorithms cluster trajectories using different trajectory distance functions and generic clustering algorithms [22]. For example, Chen et al. [10] introduced a trajectory distance function called Edit Distance on Real sequence (EDR). EDR is robust against trajectory imperfections. The trajectories are clustered by a hierarchical clustering algorithm based on EDR. In [37], a framework of trajectory distance operators is proposed to handle different types of trajectory similarity based on various motion parameters of the trajectories; an agglomerative hierarchical clustering with the complete-linkage criterion is used to process the trajectory data. Recently, Wu et al. [48] proposed a novel distance-based trajectory clustering algorithm, CluST. The main idea of CluST is to divide trajectory data into lines and to use a novel spatiotemporal line distance function that considers both the spatial and temporal characteristics. Using the proposed line distance function, trajectory lines are clustered with the k-means clustering algorithm.

The density-based clustering algorithms use a density threshold around each trajectory to distinguish the relevant


data items from noise. The dominant density-based trajectory clustering algorithms are based on DBSCAN and OPTICS, since both can discover clusters of arbitrary shape, are robust with respect to noise in the data, and can discover an arbitrary number of clusters to better fit the source data [33]. For example, TraClus [25] partitions trajectories into a set of line segments and then groups the line segments with the DBSCAN algorithm. However, TraClus does not take the temporal domain of the trajectory data into consideration. Unlike TraClus, ST-DBSCAN [5] can cluster spatial-temporal data according to its non-spatial, spatial and temporal attributes. Because OPTICS addresses the major limitation of DBSCAN, namely that DBSCAN cannot detect meaningful clusters in data of varying density, OPTICS is widely used to cluster trajectories. For instance, in [33] a time-focused OPTICS (T-OPTICS) is proposed to cluster trajectories at different time intervals for discovering motion behaviours of urban traffic at some interesting hours of the day. Similarly, in [39], to find the major areas where people drive to work, a progressive OPTICS-based clustering is introduced to process the trajectories of people driving to work.

Significantly different from the existing trajectory clustering methods, this study targets the emerging challenges of (1) scalable trajectory clustering processing and (2) enabling a fast solution to clustering trajectory big data. Our clustering method is a distributed and parallel method that satisfies these two requirements. The proposed method is the first trajectory clustering approach on a GPU for processing trajectory big data in a distributed manner.
3 Problem formulation

Assuming linear interpolation [46] between sampled locations, we first present the definitions utilized hereafter:

Definition 1 (Trajectory) A trajectory Ti consists of Ni consecutive points with positions and time stamps, namely Ti = {(p1, t1), (p2, t2), ..., (p_Ni, t_Ni)}, where pj and tj denote a data point position of Ti and the corresponding recording time, respectively, and t1 < t2 < ... < t_Ni < now. Two consecutive points constitute a line segment Lj = ((pj, tj), (pj+1, tj+1)). Therefore, Ti can also be defined by Ni − 1 line segments, i.e., Ti = {L1, L2, ..., L_{Ni−1}}.

Definition 2 (Subtrajectory) A subtrajectory STi is a subset of a trajectory Ti, i.e., a run of consecutive data points in the trajectory: STi = {(pm, tm), (pm+1, tm+1), ..., (pq, tq)}, where 1 ≤ m ≤ q ≤ Ni. Similarly, a subtrajectory can be composed of a set of consecutive line segments, namely STi = {Lm, Lm+1, ..., L_{q−1}}.
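Definitions 1 and 2 can be mirrored in a small data structure. The following Python sketch (the class and method names are ours, not the paper's) stores a trajectory as (x, y, t) samples and exposes the equivalent segment view and subtrajectory extraction:

```python
from dataclasses import dataclass

@dataclass
class Trajectory:
    """Definition 1: Ni consecutive (x, y, t) samples with strictly increasing t."""
    points: list  # [(x, y, t), ...]

    def segments(self):
        # Equivalent view as Ni - 1 line segments L_j = (sample_j, sample_{j+1})
        return list(zip(self.points, self.points[1:]))

    def subtrajectory(self, m, q):
        # Definition 2: the run of consecutive samples m..q (1-based, inclusive)
        return Trajectory(self.points[m - 1:q])
```

A trajectory with Ni samples thus yields exactly Ni − 1 segments, matching the alternative form of Definition 1.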


Definition 3 (Trajectory data) Trajectory data D is denoted as a snapshot D = {T1, T2, ..., TN} or {ST1, ST2, ..., STN}, where Ti and STi are the trajectories or subtrajectories of different moving objects at the snapshot.

Definition 4 (Density-based clustering for trajectory data) Given the trajectory data D, clustering is the process of partitioning D into a set of clusters C = {C1, C2, ..., Ck | Ci ⊆ D and ∩_{i=1}^{k} Ci = ∅} and a noise set Noise = {tr ∈ D | ∀i : tr ∉ Ci}, based on a trajectory similarity measure and a density-based clustering algorithm, where C ∪ Noise = D.

The objective of this paper is to cluster trajectory data using a distributed and parallel OPTICS algorithm based on one GPU device.

4 Clustering trajectory data scalably using POPTICS

This section first reviews the POPTICS algorithm. Then, the details of the proposed Tra-POPTICS are described, including the distance function, the description of the Tra-POPTICS algorithm and the optimization of Tra-POPTICS using a trajectory indexing technique.

4.1 POPTICS

OPTICS is a density-based clustering algorithm that can detect meaningful clusters in data of varying density by producing a linear order of the points such that points which are spatially closest become neighbors in the order [4]. OPTICS starts by adding an arbitrary point of a cluster to the order list and then iteratively expands the cluster by adding a point within the ε-neighborhood of a point in the cluster which is also closest to any of the already selected points. The process repeats for the remaining clusters. Meanwhile, OPTICS also computes the reachability distance of each point. The clusters for any clustering distance ε′ (ε′ < ε) can be extracted based on the computed order and reachability distances.

POPTICS [36] is a scalable OPTICS algorithm using concepts from graph algorithms. POPTICS exploits the similarities between OPTICS and MST algorithms to break the sequential access to data points in the classical OPTICS algorithm. The POPTICS algorithm is based on the observation that Prim's approach of adding one edge to the MST at a time is very similar to the OPTICS algorithm: the vertices of the MST are analogous to the points of OPTICS, and the edge weights of the MST are analogous to the reachability distances of OPTICS. Therefore, POPTICS extends MST construction to support OPTICS. As a result, POPTICS can divide the whole data set into local data subsets and let each processor core or machine compute a local MST by running OPTICS on its local data points in a distributed way. Then, a parallel merge of the local MSTs is


performed to obtain the global MST. The clusters for any clustering distance ε′ can be extracted from the global MST. As such, POPTICS achieves great scalability by avoiding the sequential computation of the reachability distances and the order of all points, compared to the classical OPTICS. However, POPTICS is aimed at point data. Therefore, in the following subsection, we introduce how to adapt POPTICS to support trajectory data.

4.2 Tra-POPTICS: POPTICS for clustering trajectory data

To support trajectory clustering with POPTICS, a distance function for trajectory data [18] is first used for measuring the dissimilarity between two trajectories. Then, we introduce the POPTICS algorithm for trajectory data, called Tra-POPTICS. Finally, an indexing approach based on the STR-tree structure [38] is applied in Tra-POPTICS for efficiently clustering trajectories.

4.2.1 The distance function for trajectories

In POPTICS, the distance function measures the Euclidean distance between two points. In our setting, the distance function needs to measure the spatiotemporal distance between two trajectories. Therefore, we apply a distance metric of spatiotemporal dissimilarity between trajectories [18] in our scenario. For simplicity, we assume in this paper that the moving objects move linearly with time in a 2D plane. Concretely speaking, given two trajectories T1 and T2, the spatiotemporal distance Dist(T1, T2) between T1 and T2 is approximately computed as:

Dist(T1, T2) ≈ Σ_{k=1}^{n−1} (D_{T1,T2}(t_k) + D_{T1,T2}(t_{k+1})) × (t_{k+1} − t_k) / 2,   (1)

where [t_1, t_n] is the common time period between T1 and T2 and t_k is a timestamp at which the distance is computed. D_{T1,T2}(t) is the Euclidean distance between the two trajectories as a function of time. Note that linear interpolation is applied in formula (1) because each trajectory is represented by a collection of discrete points with various sampling rates. D_{T1,T2}(t) is computed based on the definition in [19]:

D_{T1,T2}(t) = √(a × t² + b × t + c),   (2)

where, letting the line segments of T1 and T2 be (p_{tk}, p_{tk+1}) and (q_{tk}, q_{tk+1}) between timestamps t_k and t_{k+1} under linear interpolation, the values of a, b and c in formula (2) are computed as follows:

a = A / (t_{k+1} − t_k)²

b = B / (t_{k+1} − t_k) − 2 × A × t_k / (t_{k+1} − t_k)²

c = A × t_k² / (t_{k+1} − t_k)² − B × t_k / (t_{k+1} − t_k) + C

with

A = (q_{tk+1}.x − q_{tk}.x − p_{tk+1}.x + p_{tk}.x)² + (q_{tk+1}.y − q_{tk}.y − p_{tk+1}.y + p_{tk}.y)²

B = 2((q_{tk+1}.x − q_{tk}.x − p_{tk+1}.x + p_{tk}.x)(q_{tk}.x − p_{tk}.x) + (q_{tk+1}.y − q_{tk}.y − p_{tk+1}.y + p_{tk}.y)(q_{tk}.y − p_{tk}.y))

C = (q_{tk}.x − p_{tk}.x)² + (q_{tk}.y − p_{tk}.y)²

In these formulas, p.x, p.y, q.x and q.y are the coordinates of the points p and q on the x and y axes.
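As a concrete illustration, the sketch below (all helper names are ours; the trapezoidal weighting of formula (1) is assumed) evaluates D(t) by direct linear interpolation rather than through the coefficients a, b and c, which is numerically equivalent at the sampled timestamps:

```python
import math

def interp(traj, t):
    """Linearly interpolate a trajectory [(x, y, t), ...] (t ascending) at time t."""
    for (x1, y1, t1), (x2, y2, t2) in zip(traj, traj[1:]):
        if t1 <= t <= t2:
            u = 0.0 if t2 == t1 else (t - t1) / (t2 - t1)
            return (x1 + u * (x2 - x1), y1 + u * (y2 - y1))
    raise ValueError("t outside the trajectory lifespan")

def d_t(traj1, traj2, t):
    # D_{T1,T2}(t): Euclidean distance between the interpolated positions (formula (2))
    (px, py), (qx, qy) = interp(traj1, t), interp(traj2, t)
    return math.hypot(qx - px, qy - py)

def spatiotemporal_dist(traj1, traj2):
    # Formula (1): trapezoidal approximation of the integral of D(t) over the
    # common time period, sampled at the union of both trajectories' timestamps.
    t_lo = max(traj1[0][2], traj2[0][2])
    t_hi = min(traj1[-1][2], traj2[-1][2])
    ts = sorted({t for tr in (traj1, traj2) for (_, _, t) in tr
                 if t_lo <= t <= t_hi} | {t_lo, t_hi})
    return sum((d_t(traj1, traj2, a) + d_t(traj1, traj2, b)) * (b - a) / 2
               for a, b in zip(ts, ts[1:]))
```

As a sanity check, two parallel trajectories kept at a constant distance d over a common period of length T yield a value of d × T.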

4.2.2 The Tra-POPTICS algorithm

Using the above-mentioned distance function Dist(T1, T2) to compute the spatiotemporal distance of any pair of trajectories, we can design the POPTICS algorithm for clustering trajectory data (Tra-POPTICS). Like POPTICS, Tra-POPTICS can be implemented on shared memory or on distributed memory. In this paper, we focus on Tra-POPTICS based on shared memory. The Tra-POPTICS algorithm is presented in Algorithm 1. As we can see, the Tra-POPTICS algorithm consists of three steps.

Step 1: local MST computing (Lines 2–24). Each CPU thread processes a local disjoint subset of the trajectory data. Each CPU thread first finds the ε-neighbors of each trajectory with the FindNeighbors function, which is presented in Algorithm 2. In Algorithm 2, we simply scan each trajectory in the subset linearly, with two steps: interpolating the two trajectories within their common time period (Lines 4 and 5 in Algorithm 2) and measuring the distance between the two trajectories using the distance function of formula (1) (Line 6 in Algorithm 2). After getting the neighbours, each CPU thread computes the core distance of each trajectory (Line 8) and then the local MST (Lines 9–23). In this step, since p threads process the local data in parallel and p is the maximum number of active threads provided by the CPU, the computational complexity is O(n′ × runtime of an ε-neighborhood query), where n′ is the number of trajectories in the local subset. According to Algorithm 2, the query processing requires a linear scan of the entire subset; consequently, the time complexity of step 1 is O(n′²). Letting n be the total number of trajectories, we get n′ = n/p because the entire data set is equally assigned to the p threads, so the complexity is O((n/p)²).

Algorithm 1: Tra-POPTICS based on shared memory

 1  ClusteringProcedure(D, p, ε, ε′, minNumofTrs, CID)
        /* Input: D is a set of trajectory data, p is the number of CPU threads, ε is the spatiotemporal distance bound for clustering, ε′ is the clustering distance, minNumofTrs is the minimum number of trajectories in one cluster */
        /* Output: clusters in CID */
 2  Divide D into p equal-sized subsets D1, D2, ..., Dp; each subset is assigned to one CPU thread
 3  Set a queue Q shared among all CPU threads for generating the global MST
 4  for each CPU thread ti (1 ≤ i ≤ p) in parallel do
 5      for each trajectory tr ∈ Di do
 6          mark tr as processed
 7          Ns ← FindNeighbors(tr, Di, ε)
 8          GenCoreDistance(tr, Ns, ε, minNumofTrs)
 9          if tr.coreDistance ≠ NULL then
10              Update(tr, Ns, Pi)
11              while Pi ≠ empty do
12                  (tr1, tr, w) ← findMin(Pi)
13                  insert (tr1, tr, w) into Q
14                  if tr1 ∈ Di then
15                      mark tr1 as processed
16                      Ns′ ← FindNeighbors(tr1, Di, ε)
17                      GenCoreDistance(tr1, Ns′, ε, minNumofTrs)
18                      if tr1.coreDistance ≠ NULL then
19                          Update(tr1, Ns′, Pi)
20                      end
21                  end
22              end
23          end
24      end
        /* Merge all local results to get the global MST */
25  Set an empty MST T
26  for each trajectory tr ∈ D in parallel do
27      tr.parent ← tr
28  end
29  while Q ≠ empty do
30      (t1, t2, w) ← findMin(Q)
31      if t1.parent ≠ t2.parent then Union(t1, t2)
32          T ← T ∪ {(t1, t2, w)}
33  end
        /* Clustering from the MST with any ε′ (< ε) */
34  for each trajectory tr ∈ D do
35      CID[tr.ID] ← tr.ID
36  end
37  for each edge (tr1, tr2, w) ∈ T do
38      if w ≤ ε′ then CID[tr1.ID] ← tr2.ID or CID[tr2.ID] ← tr1.ID
39  end
40  return CID
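The merging stage and the cluster extraction of Algorithm 1 amount to a Kruskal-style pass over the queued local MST edges with a disjoint-set structure. A minimal Python sketch under our own naming (the edge list stands in for the contents of the shared queue Q):

```python
class DisjointSet:
    """Union-find with path halving, as used to merge local MSTs."""
    def __init__(self, items):
        self.parent = {x: x for x in items}
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x
    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb
            return True
        return False

def merge_local_msts(trajectory_ids, local_edges, eps_prime):
    # Kruskal-style merge of the edges emitted by all local threads, then
    # cluster extraction for a clustering distance eps_prime (< eps).
    global_mst = []
    ds = DisjointSet(trajectory_ids)
    for t1, t2, w in sorted(local_edges, key=lambda e: e[2]):  # findMin(Q)
        if ds.union(t1, t2):
            global_mst.append((t1, t2, w))
    # keep only global-MST edges with weight <= eps_prime to form clusters
    cid = DisjointSet(trajectory_ids)
    for t1, t2, w in global_mst:
        if w <= eps_prime:
            cid.union(t1, t2)
    return {tr: cid.find(tr) for tr in trajectory_ids}
```

Trajectories end up with the same label exactly when they are connected by MST edges no heavier than the chosen clustering distance.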

Step 2, generating the global MST (Lines 25–33), and Step 3, extracting clusters from the global MST (Lines 34–39), are similar to those in POPTICS [36]. The time complexity of both steps is O(n). Therefore, the overall computing complexity of Tra-POPTICS is O((n/p)² + 2n).

4.2.3 Indexing trajectories

According to the analysis of the computing complexity of Tra-POPTICS in Sect. 4.2.2, the time cost of the algorithm mainly lies in querying each trajectory's ε-neighbors; this cost is O(n′²), where n′ is the number of trajectories in the local subset. So, we use a spatiotemporal indexing structure (STR-tree [38]) to


index trajectories, reducing the time complexity of the ε-neighbors query to O(n′ × log n′). The STR-tree is an extension of the R-tree that supports efficient queries over trajectories. Differently from the R-tree, the STR-tree not only considers spatial closeness, but also tries to keep line segments belonging to the same trajectory together. Fig. 1 illustrates the scheme of indexing trajectories.

Fig. 1 An example of the indexing scheme of a trajectory based on STR-tree

As we can see, a trajectory tr is divided into three line segments, i.e., line 1, line 2 and line 3. These three lines are enclosed in corresponding MBBs (minimum bounding boxes) and inserted into the leaf node A of the STR-tree. As a result, there are three index entries (i.e., entry 1, entry 2 and entry 3) in leaf A. Each entry is of the form (entry-ID, #trajectory, MBB), where #trajectory identifies the trajectory that the corresponding line belongs to. The construction of the STR-tree for trajectories is based on the insertion algorithm in [38]. After constructing an STR-tree for a set of trajectories, Algorithm 3 performs the retrieval of the ε-neighborhood of a trajectory tr.

Algorithm 2: Find trajectory Neighbors

 1  FindNeighbors(tr, D, ε)
        /* Input: tr is a trajectory, D is a set of trajectory data for neighborhood candidates, ε is the spatiotemporal distance bound for clustering */
        /* Output: RS, a set of ε-neighbors */
 2  RS ← ∅
 3  for each trajectory di ∈ D (1 ≤ i ≤ |D|) do
 4      di′ ← Interpolate di with time period [max(di.startTS, tr.startTS), min(di.endTS, tr.endTS)]
 5      tr′ ← Interpolate tr with time period [max(di.startTS, tr.startTS), min(di.endTS, tr.endTS)]
 6      ds ← Dist(di′, tr′)    // Dist function: see formula (1)
 7      if ds ≤ ε then RS ← RS ∪ {di}
 8  end
 9  return RS

Algorithm 3: Find trajectory Neighbors with the STR-tree

 1  FindNeighborsWithTree(tr, n, ε, RS)
        /* Input: tr is a trajectory, n is one node of the STR-tree, ε is the spatiotemporal distance bound for clustering */
        /* Output: RS, a set of ε-neighbors */
 2  if n is a leaf node then
 3      Extract a set of trajectories TD from all index entries in node n or its neighbor leaf nodes
 4      for each trajectory di ∈ TD (1 ≤ i ≤ |TD|) do
 5          di′ ← Interpolate di with time period [max(di.startTS, tr.startTS), min(di.endTS, tr.endTS)]
 6          tr′ ← Interpolate tr with time period [max(di.startTS, tr.startTS), min(di.endTS, tr.endTS)]
 7          ds ← Dist(di′, tr′)
 8          if ds ≤ ε then RS ← RS ∪ {di}
 9      end
10  end
11  else
12      // n is a non-leaf node
13      for each index entry E in the node do
14          if E overlaps tr in the time dimension then
15              tr′ ← Interpolate tr with time period [max(E.startTS, tr.startTS), min(E.endTS, tr.endTS)]
16              // compute the distance between tr and E's MBB on the 2D (x,t) plane
17              tr_x ← the projection of tr′ on the (x,t) plane
18              MBB_x ← the projection of E.MBB on the (x,t) plane
19              d_x ← dist_traj_Rect(tr_x, MBB_x)
20              // compute the distance between tr and E's MBB on the 2D (y,t) plane
21              tr_y ← the projection of tr′ on the (y,t) plane
22              MBB_y ← the projection of E.MBB on the (y,t) plane
23              d_y ← dist_traj_Rect(tr_y, MBB_y)
24              // compute the distance between tr and E.MBB
25              d ← √((d_x)² + (d_y)²)
26              if d ≤ ε then
27                  FindNeighborsWithTree(tr, E.node, ε, RS)
28              end
29          end
30      end
31  end
32  return RS

In Algorithm 3, the trajectory tr is treated as a topological query to find its trajectory neighbors within the ε spatiotemporal distance. When the query tr reaches a leaf node, we first collect all trajectories from the lines in this leaf or its related neighbor leaves (Line 3), based on the line search algorithm in [38], since all or most line segments of one trajectory are indexed in one leaf node or neighboring nodes of the STR-tree. Then, we run the same computing procedure as in Algorithm 2 to get tr's ε-neighbors. Otherwise, if the query tr meets a non-leaf node, we choose the index entries whose time periods overlap that of tr. For each chosen entry, we compute the distance between tr and the MBB of the entry with the calculation method in [19] (Lines 15–25). Concretely, both tr and the MBB are projected onto two 2D planes (i.e., the (x,t)-plane and the (y,t)-plane). After the projections, for each plane, we compute the distance between the projection of tr and that of the MBB with the calculation method in [19], which measures the distance between a projected trajectory p and a rectangle M. More formally, the distance function is defined as:

dist(p, M) = Σ_{i=1}^{k} minDist(p.line_i, M)   (3)

As formula (3) shows, the distance equals the sum of the minimum distances between every line of the projection of the trajectory and the rectangle M (i.e., the projection of the MBB). If the space around a rectangle is decomposed into four quadrants by the two axes passing through its center, and Qs and Qe are the quadrants where the start and the end of a line lie, respectively, the minimum distance between a line L and the rectangle M is computed by the following four rules:

– Rule 1: If L intersects M, the minimum distance is zero.
– Rule 2: If L lies entirely in one quadrant Q of M, the minDist is the minimum among the minimum distance between the corner of M in Q and L, the distance between L's start point and M, and the distance between L's end point and M.
– Rule 3: If L's start and end points lie in adjacent quadrants Qs and Qe, the minDist is the minimum among the minimum distance between the corner of M in Qs and L, the one between the corner of M in Qe and L, the distance between L's start point and M, and the distance between L's end point and M.
– Rule 4: If L's start and end points lie in non-adjacent quadrants Qs and Qe, the minimum distance is the minimum of the two distances between L and the two corners of M that belong to neither Qs nor Qe.

Figure 2 illustrates an example of calculating the distance between the projection of one trajectory and a rectangle. In this example, the projection p consists of three lines (line1, line2 and line3). Based on the above rules, dist(p, M) = minDist(p.line1, M) + minDist(p.line2, M) + minDist(p.line3, M) = min{d1, d2, d3} + min{d3, d4, d5, d6} + min{d6, d7, d8} = d2 + d5 + d6.

Fig. 2 The distance between a 2D-projection of one trajectory and a rectangle
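The four rules above are a case analysis for the minimum distance between a line segment and a rectangle. The sketch below (our own helper names, not the paper's code) uses the equivalent direct computation instead: it returns zero when the segment touches or enters the rectangle (Rule 1) and otherwise takes the minimum over segment-to-edge distances, then sums per formula (3):

```python
import math

def point_seg_dist(px, py, ax, ay, bx, by):
    # distance from point (px, py) to segment (a, b)
    vx, vy = bx - ax, by - ay
    L2 = vx * vx + vy * vy
    t = 0.0 if L2 == 0 else max(0.0, min(1.0, ((px - ax) * vx + (py - ay) * vy) / L2))
    return math.hypot(px - (ax + t * vx), py - (ay + t * vy))

def segs_intersect(p1, p2, p3, p4):
    # proper crossing test via orientation signs
    def orient(a, b, c):
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    d1, d2 = orient(p3, p4, p1), orient(p3, p4, p2)
    d3, d4 = orient(p1, p2, p3), orient(p1, p2, p4)
    return ((d1 > 0) != (d2 > 0)) and ((d3 > 0) != (d4 > 0))

def seg_seg_dist(p1, p2, p3, p4):
    if segs_intersect(p1, p2, p3, p4):
        return 0.0
    return min(point_seg_dist(*p1, *p3, *p4), point_seg_dist(*p2, *p3, *p4),
               point_seg_dist(*p3, *p1, *p2), point_seg_dist(*p4, *p1, *p2))

def min_dist_seg_rect(p1, p2, xmin, ymin, xmax, ymax):
    # Rule 1: zero if the segment touches or lies inside the rectangle
    def inside(p):
        return xmin <= p[0] <= xmax and ymin <= p[1] <= ymax
    if inside(p1) or inside(p2):
        return 0.0
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    edges = list(zip(corners, corners[1:] + corners[:1]))
    return min(seg_seg_dist(p1, p2, a, b) for a, b in edges)

def dist_proj_rect(lines, rect):
    # Formula (3): sum of per-line minimum distances to the MBB projection
    return sum(min_dist_seg_rect(p1, p2, *rect) for p1, p2 in lines)
```

This direct form gives the same result as the quadrant-based rules, without the case split.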

5 A GPGPU-aided POPTICS for clustering trajectory data in parallel

Successful attempts have been made to use GPGPU to accelerate density-based clustering algorithms, e.g., CUDA-DClust [6] and GSCAN [29]. Böhm et al. proposed CUDA-DClust, a DBSCAN algorithm dedicated to GPUs under NVIDIA's CUDA architecture and programming model [6]. Loh et al. proposed an enhanced version of CUDA-DClust, called GSCAN [29], which outperformed CUDA-DClust by using a grid structure to reduce the number of unnecessary distance computations. However, these methods are designed neither for OPTICS nor for trajectory data. Inspired by these successes, this study developed a GPGPU-aided version of Tra-POPTICS called G-Tra-POPTICS. We first implement a coarse-grained parallelism of G-Tra-POPTICS based on a new GPU feature, Hyper-Q, and our trajectory indexing structure. Then, we focus on the extremely fine-grained parallelism of G-Tra-POPTICS by parallelizing the sub-procedures of the Tra-POPTICS algorithm on one GPU.

5.1 The coarse-grained parallelism of G-Tra-POPTICS

NVIDIA GPUs based on the Kepler architecture provide new architectural features such as Hyper-Q, Dynamic Parallelism, and GPUDirect [34], compared with the previous Fermi-based GPUs. Among these features, Hyper-Q enables multiple CPU threads to connect to one GPU simultaneously to launch GPU kernels, which dramatically increases GPU utilization and significantly reduces CPU idle time. Therefore, we divide the entire set of trajectory data using our trajectory indexing structure and assign the partitions to multiple CPU threads. These CPU threads then simultaneously connect to one Kepler GPU to execute the G-Tra-POPTICS kernels in a coarse-grained manner.

The parallelism of G-Tra-POPTICS based on our indexing structure and Hyper-Q is illustrated in Fig. 3, where m CPU threads are assigned to connect to one GPU. Each CPU thread Ti is responsible for processing a set of local trajectory data. When the spatiotemporal distances among different sets of local trajectory data differ markedly, the computation of the core distance of each trajectory can be limited to its local data set or a few neighboring sets, which significantly reduces the number of unnecessary distance computations and hence the computing overhead of G-Tra-POPTICS. We therefore use the STR-tree-based trajectory indexing structure to achieve such a data assignment for the CPU threads. As illustrated in Fig. 3, we evenly assign the trajectory data from the leftmost to the rightmost leaf node at the leaf level of the STR-tree to the m CPU threads in turn, since the trajectories indexed in different leaf nodes of the STR-tree already differ in terms of spatiotemporal distance (see Sect. 4.2.3, Indexing trajectories). After receiving its local data, each CPU thread simultaneously launches one task queue into the Hyper-Q of the GPU. The task queue contains the following three tasks: (1) the input task (I task) for transferring local data from host to GPU memory, (2) the clustering task (C task) for calling GPU kernels to execute the local MST computation of Tra-POPTICS, and (3) the output task (O task) for returning results from the GPU device to the host. The tasks in the m task queues in Hyper-Q can then be executed in parallel using massive GPU threads. Note that, after executing the task queues on the GPU in Fig. 3, a single CPU thread launches two GPU kernels to execute the merging stage and the cluster-extraction stage of Tra-POPTICS. To further improve G-Tra-POPTICS, we map the partial subtrees of the STR-tree indexing structure into GPU memory using the same multi-array approach used for mapping the CKDB-tree into GPU memory [15]. Additionally, the m CPU threads run the above clustering procedure on the GPU at least ⌈global_data_size / GPU_memory_size⌉ times, since the maximum data size for clustering trajectories is limited by the size of the GPU memory.

Fig. 3 The parallelism of G-Tra-POPTICS based on the STR-tree and Hyper-Q

5.2 The fine-grained parallelism of G-Tra-POPTICS

According to Algorithm 1, the Tra-POPTICS algorithm runs in three main steps: (1) computing local MSTs, (2) forming the global MST and (3) extracting clusters from the global MST. Therefore, the fine-grained parallelism of G-Tra-POPTICS can be implemented by parallelizing these three procedures on one GPU.
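As a host-side illustration of the coarse-grained scheme of Sect. 5.1, the sketch below uses Python threads in place of CPU threads driving Hyper-Q connections; `local_mst` is a hypothetical stand-in for one connection's I/C/O task queue, and the leaf lists stand in for STR-tree leaf nodes. This is a minimal sketch of the data assignment, not the paper's CUDA implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def round_robin_partition(leaves, m):
    """Evenly assign STR-tree leaf nodes to m workers, leftmost leaf first."""
    parts = [[] for _ in range(m)]
    for i, leaf in enumerate(leaves):
        parts[i % m].extend(leaf)
    return parts

def local_mst(partition):
    # Hypothetical stand-in for one Hyper-Q task queue:
    # I task (copy in), C task (cluster), O task (copy out).
    return {"size": len(partition), "edges": []}

def coarse_grained_cluster(leaves, m):
    parts = round_robin_partition(leaves, m)
    with ThreadPoolExecutor(max_workers=m) as pool:
        local_results = list(pool.map(local_mst, parts))
    return local_results  # a single thread then merges these local MSTs

print(round_robin_partition([[0, 1], [2, 3], [4, 5], [6, 7]], 2))
# [[0, 1, 4, 5], [2, 3, 6, 7]]
```

With more data than GPU memory, this whole procedure would be repeated per batch, which is where the ⌈global_data_size / GPU_memory_size⌉ factor comes from.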

5.2.1 Parallelizing the calculation of one MST

Since the procedure of generating multiple local MSTs is already parallelized via Hyper-Q (see Sect. 5.1, The coarse-grained parallelism of G-Tra-POPTICS), we detail the parallel generation of one MST on the GPU. We use one GPU thread block to generate an MST from one local data set D. The procedure is described in Algorithm 4. To make full use of the massive GPU threads, we modified the CPU-based Tra-POPTICS as follows. First, we compute all spatiotemporal distances between each pair of trajectories in parallel (Line 3) and store the results in a GPU matrix (Line 4); this matrix speeds up all subsequent steps. Then we use GPU threads to obtain the core distances of all trajectories simultaneously (Lines 7–12). For efficiency, part of the indexing structure is also kept on the GPU, subject to the limitation of GPU memory. After that, the MST is computed in parallel without considering the cycle issue (Lines 14–20). Finally, we remove the cycles in the MST (Lines 22–25).
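The per-trajectory work of Algorithm 4 can be mimicked sequentially on the host. The sketch below uses our own conventions: `M` is the precomputed distance matrix, a trajectory counts among its own ε-neighbors, reachability from trj to tri is taken as max(core(trj), dist(tri, trj)), and the tree-based neighbor search and cycle removal are omitted.

```python
def core_distance(dist_row, eps, min_tra):
    """Core distance of one trajectory: distance to its min_tra-th
    neighbor within eps (self included), or None if too few neighbors."""
    neigh = sorted(d for d in dist_row if d <= eps)
    return neigh[min_tra - 1] if len(neigh) >= min_tra else None

def local_mst_edges(M, eps, min_tra):
    """Host-side analogue of Algorithm 4: each core trajectory
    contributes its minimum-reachability edge (cycles not removed)."""
    n = len(M)
    core = [core_distance(M[i], eps, min_tra) for i in range(n)]
    edges = []
    for i in range(n):
        if core[i] is None:
            continue  # non-core trajectories contribute no edge
        # reachability distance from j to i = max(core[j], dist(i, j))
        cand = [(max(core[j], M[i][j]), i, j)
                for j in range(n) if j != i and core[j] is not None]
        if cand:
            w, i_, j_ = min(cand)  # the findMin(L) step
            edges.append((i_, j_, w))
    return edges
```

On the GPU, the outer loop over trajectories runs as parallel threads of one block, which is why the pairwise distance matrix is materialized first.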

Algorithm 4: Generate one MST on one GPU

1  GeneratingOneMSTonGPU(D, ε, minTra, t)
   /* Input: D is a local data set */
   /* Output: t is a local MST */
2  Set a matrix M in the GPU memory
3  for each pair of trajectories (tri, trj) (∈ D × D) in parallel do
4      M[i][j] ← Dist(tri, trj)  // for the Dist function see formula (1)
5  end
6  // Compute core distances of all data in parallel
7  for each trajectory tri (∈ D) in parallel do
8      Set an empty trajectory set RS
9      if the indexing subtree holding tri is loaded in the GPU then
10         FindNeighborsWithTreeOnGPU(tri, root, ε, RS)
11     else FindNeighborsWithTreeOnCPU(tri, root, ε, RS)
12     tri.coreDistance ← SetCoreDis(M, RS, ε, minTra)
13 end
14 // Generate the MST in parallel
15 for each trajectory tri (∈ D) in parallel do
16     if tri.coreDistance ≠ NULL then
17         Get a list L of triples (tri, trj, w) from the other core trajectories trj, where w is the reachability distance from trj to tri
18         (tri, trk, w) ← findMin(L)
19         t ← t ∪ {(tri, trk, w)}
20     end
21 end
22 // Remove the cycles of the MST
23 for each trajectory tri (∈ t) in parallel do
24     if any ancestor of tri ∈ {tri's children} then
25         remove the cycle
26 end

5.2.2 Parallelizing the merge of MSTs

After constructing multiple MSTs t1, t2, …, tm, we merge these MSTs simultaneously on the GPU. This procedure is shown in Algorithm 5. In this merge procedure, each GPU thread is responsible for merging a pair of MSTs. When the spatiotemporal distance between the roots of two MSTs is not more than ε, the two MSTs are merged by setting the pointer of one root to the other.

Algorithm 5: Merge MSTs on one GPU

1 MergingMSTsOnGPU(T, gt, ε)
  /* Input: T is a set of local MSTs */
  /* Output: gt is a global MST */
2 for each MST ti (∈ T) in parallel do
3     rooti ← ti's root
4 end
5 for each pair of MSTs (ti, tj) (∈ T × T) in parallel do
6     if Dist(rooti, rootj) ≤ ε then
7         merge ti and tj into ti or tj
8     end
9 end

5.2.3 Parallelizing the extraction of clusters

The extraction of clusters on the GPU is a parallel version of the corresponding part of Algorithm 1. This procedure is shown in Algorithm 6. The GPU threads in Algorithm 6 first initialize, in parallel, the cluster ID of each trajectory to its own trajectory ID. Then one group of GPU threads is assigned to process all edges of gt to extract the clusters. Each GPU thread examines one edge (ti, tj, w) to check whether ti and tj belong to the same cluster within the ε′ spatiotemporal distance. If the two trajectories are in the same cluster, we cluster them by resetting their cluster IDs to the same cluster ID.

Algorithm 6: Extracting clusters from the global MST on one GPU

1 ClusteringOnGPU(gt, ε′, CID)
  /* Input: gt is a global MST */
  /* Output: CID is a set of cluster identifiers indicating the cluster that every trajectory belongs to */
2 for each trajectory ti (∈ gt) in parallel do
3     CID[ti.ID] ← ti.ID
4 end
5 for each edge e (∈ gt) in parallel do
6     e.isProcessed ← FALSE
7 end
8 Set a GPU thread block TD to process gt
9 while not all edges in gt are processed do
10    for each GPU thread (∈ TD) in parallel do
11        process one edge e = (ti, tj, w) (∈ gt)
12        if e.isProcessed == TRUE then break
13        if w ≤ ε′ then CID[ti.ID] ← tj.ID or CID[tj.ID] ← ti.ID
14        e.isProcessed ← TRUE
15    end
16 end
17 return CID

6 Performance evaluation

6.1 Experimental setup

We have evaluated the performance of the proposed clustering algorithms, Tra-POPTICS and G-Tra-POPTICS, against a real trajectory dataset collected by GPS devices [20] on a platform equipped with a cutting-edge NVIDIA GPU. These experiments concern the quality of the clusters and the computing efficiency of the proposed clustering algorithms.

The trajectory dataset used in this paper comes from the GeoLife GPS trajectories [20]. These trajectories were recorded by different GPS loggers and GPS phones with different sampling rates. Each trajectory in this set is represented by a sequence of time-stamped points, each of which contains latitude and longitude information. The dataset contains 17,621 trajectories with a total distance of about 1.2 million kilometers and a total duration of more than 48,000 h. Approximately 91 % of the trajectories have a sampling rate of 1–5 s. Since clustering sub-trajectories can be more useful in many applications (especially in the analysis of specific regions of interest) than clustering trajectories as a whole [25], we extracted sub-trajectories from the dataset for the following experiments. All experiments were executed on one computer equipped with a Kepler GPU (GTX 780); the configurations are presented in Table 1. Additionally, the input parameters ε, minTra and ε′ are set to 120, 6 and 100 for all experiments.
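As a quick sanity check of the extraction stage (Sect. 5.2.3) under the settings above, the rule of Algorithm 6 — trajectories joined by an MST edge whose weight does not exceed the extraction threshold (100 here) end up with the same cluster ID — can be mimicked on the host with a small union-find. This is a sketch in our own notation: the GPU kernels reset ID pointers instead, and the edge-list format is assumed.

```python
class DSU:
    """Minimal union-find, standing in for the kernels' pointer resets."""
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x
    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def extract_clusters(n, mst_edges, eps_prime):
    """Algorithm 6's rule: an MST edge (ti, tj, w) with w <= eps_prime
    puts ti and tj into the same cluster."""
    dsu = DSU(n)
    for ti, tj, w in mst_edges:
        if w <= eps_prime:
            dsu.union(ti, tj)
    return [dsu.find(i) for i in range(n)]

edges = [(0, 1, 50.0), (1, 2, 90.0), (2, 3, 150.0), (3, 4, 40.0)]
cid = extract_clusters(5, edges, 100.0)
# trajectories 0, 1, 2 share one root; 3 and 4 share another
```

The same threshold test on root distances, with ε instead of ε′, gives the merge rule of Algorithm 5.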

Table 1 Configurations of the computer

Specifications of the CPU platform (Computer)
  OS:      Windows 7
  CPU:     i7-4770 (3.4 GHz, 4 cores)
  Memory:  32 GB DDR3

Specifications of the GPU platform (GTX 780)
  Architecture:  Kepler GK110
  Memory:        3 GB GDDR5
  Bandwidth:     Bi-directional bandwidth of 16 GB/s
  CUDA:          SDK 6.0

Fig. 5 The time costs of Tra-POPTICS

6.2 Tra-POPTICS evaluation

We investigated the clustering quality and computing performance of Tra-POPTICS. For comparison, we used an OPTICS algorithm that clusters trajectories in a centralized fashion (T-OPTICS) [33]. The maximum number of threads is set to eight, since the CPU used (i7-4770) has four cores with two hardware threads each.

6.2.1 Qualities of the clusters

In this experiment, we compared the clusterings obtained by Tra-POPTICS and T-OPTICS using the Omega-Index metric of [36]. The Omega-Index indicates the degree of similarity between the clustering results of Tra-POPTICS and T-OPTICS; its value ranges from 0 to 1, and a value of one means the two algorithms produce the same clustering. We extracted five subsets from the entire dataset. Each subset is a set of sub-trajectories with 50–100 sampling points, and the subset sizes are 3000, 6000, 9000, 12,000 and 15,000. Figure 4 shows that the Omega-Index of Tra-POPTICS with two threads averages 0.98 (range 0.96–0.99) over the varying data sizes, while the averages with 4, 6 and 8 threads are 0.92, 0.913 and 0.911. The results indicate that Tra-POPTICS maintains a high clustering quality, with Omega-Index values above 0.91.

6.2.2 Computing performance

In this experiment, we measured the time overhead and throughput of Tra-POPTICS with varying numbers of threads and varying data sizes. The computing efficiency of T-OPTICS, i.e., the efficiency using one thread, serves as the baseline for comparison. The results are shown in Figs. 5 and 6. Figure 5 shows that Tra-POPTICS significantly outperforms T-OPTICS (1 thread in Fig. 5): Tra-POPTICS on average gains speedups of 2×, 3.5×, 4.0× and 4.5× using 2, 4, 6 and 8 threads across the tested data sizes. The results show that Tra-POPTICS scales well compared with T-OPTICS. The reason is straightforward: Tra-POPTICS works in a parallel, distributed manner, while T-OPTICS follows a centralized one. Figure 6 shows results similar to those in Fig. 5.

6.3 G-Tra-POPTICS evaluation

We make two observations: (1) the clustering quality of G-Tra-POPTICS and (2) the impact of GPU execution on the computing performance of G-Tra-POPTICS.

6.3.1 Qualities of the clusters

Fig. 4 The clustering qualities of Tra-POPTICS


In this experiment, we establish m connections between the CPU and the GPU through the Hyper-Q feature, where m takes the values 10, 20, 30, 40 and 50. We observe the Omega-Index of G-Tra-POPTICS with varying numbers of connections and different data sizes, compared with T-OPTICS. Note that, on the GPU side of each connection, the GPU threads are assigned according to the parallel algorithms of G-Tra-POPTICS in Sect. 5.2 (The fine-grained parallelism of G-Tra-POPTICS).

Fig. 6 The throughput of Tra-POPTICS

Fig. 7 The clustering qualities of G-Tra-POPTICS

Fig. 8 The time costs of G-Tra-POPTICS

Figure 7 shows that G-Tra-POPTICS has a clustering quality comparable to T-OPTICS. The Omega-Index averaged over all connections is 0.98, 0.94, 0.923, 0.905 and 0.902 as the data size changes from 3000 through 6000, 9000 and 12,000 to 15,000. The decrease of the Omega-Index with increasing data size is due to the cumulative error of computing and comparing trajectory points when clustering trajectories.

Fig. 9 The throughput of G-Tra-POPTICS

6.3.2 Computing performance

To evaluate the computing performance of G-Tra-POPTICS, we compute the speedups in time cost and throughput of G-Tra-POPTICS with varying numbers of connections over Tra-POPTICS with eight threads. The experimental results in Figs. 8 and 9 show that G-Tra-POPTICS gains distinct speedups over Tra-POPTICS in all cases, especially for large data sizes. For instance, with 20 connections, the speedup in time cost is 19.9×, 27.9×, 31.7×, 33.2× and 33.3× for data sizes of 3000, 6000, 9000, 12,000 and 15,000, respectively; the average speedup is about 30×. Another observation is that, for a fixed-size dataset, G-Tra-POPTICS gains a constant speedup once the number of connections is greater than 30, in terms of both time cost and throughput. The reason is that 32 is the maximum number of connections supported by Hyper-Q.

7 Conclusion and future work

This study addresses the practical need to cluster trajectory big data scalably and quickly in the current era of 'Big Data'.


This study applied a scalable density-based clustering algorithm for point data, POPTICS, to processing trajectory big data. The proposed new clustering algorithm, namely Tra-POPTICS, meets the need of scalably processing trajectory big data by employing the spatiotemporal distance function and the trajectory indexing. A massively parallel method for Tra-POPTICS (G-Tra-POPTICS) has been developed to ensure the performance of clustering trajectory data with the support of a contemporary Kepler GPU. The proposed approach provides a new tool for fast trajectory big data clustering. The results show that (1) the Tra-POPTICS algorithm has a clustering quality comparable to T-OPTICS and outperforms T-OPTICS by an average of 4× in terms of scalability, and (2) G-Tra-POPTICS has a clustering quality comparable to T-OPTICS and gains a further speedup of about 30× for clustering trajectories compared with Tra-POPTICS with 8 threads. This study indicates that (1) the Tra-POPTICS algorithm adapts well to massive trajectory data owing to its distributed OPTICS scheme and (2) the G-Tra-POPTICS method exhibits great potential for clustering trajectory data in a real-time manner by combining the merits of both the CPU and the GPU. Meanwhile, the G-Tra-POPTICS algorithm is currently implemented on a single GPU device, which obviously limits the capacity for processing massive trajectory data. Therefore, in future work we will extend the trajectory clustering approach to process trajectory big data over multiple GPUs and CPUs based on distributed memory.

Acknowledgments This work was supported in part by the National Natural Science Foundation of China (Nos. 61272314, 61361120098, 61440018), the Program for New Century Excellent Talents in University (NCET-11-0722), the Excellent Youth Foundation of Hubei Scientific Committee (No. 2012FFA025), the China Postdoctoral Science Foundation (2014M552112), the Fundamental Research Funds for the National University, China University of Geosciences (Wuhan) (Nos. CUG120114, CUG130617, 1410491B17), Beijing Microelectronics Technology Institute under the University Research Programme (No. BM-KJ-FK-WX-20130731-0013), and the Hubei Natural Science Foundation (No. 2014CFB904).

References

1. Akodjènou-Jeannin, M.I., Salamatian, K., Gallinari, P.: Flexible grid-based clustering. LNAI 4702, 350–357 (2007)
2. Alhamazani, K., Ranjan, R., Jayaraman, P.P., Mitra, K., Wang, M., Huang, Z.G., Wang, L., Rabhi, F.A.: Real-time QoS monitoring for cloud-based big data analytics applications in mobile environments. In: IEEE international conference on mobile data management, pp. 661–670 (2014)
3. Alon, J., Sclaroff, S., Kollios, G., Pavlovic, V.: Discovering clusters in motion time-series data. In: IEEE conference on computer vision and pattern recognition, pp. 375–381 (2003)
4. Ankerst, M., Breunig, M.M., Kriegel, H.P., Sander, J.: OPTICS: ordering points to identify the clustering structure. In: ACM SIGMOD international conference on management of data, pp. 49–60 (1999)
5. Birant, D., Kut, A.: ST-DBSCAN: an algorithm for clustering spatial-temporal data. Data Knowl. Eng. 60, 208–221 (2007)
6. Böhm, C., Noll, R., Plant, C., Wackersreuther, B.: Density-based clustering using graphics processors. In: ACM international conference on information and knowledge management, pp. 661–670 (2009)
7. Camargo, S.J., Robertson, A.W., Gaffney, C.J., Smyth, P., Ghil, M.: Cluster analysis of typhoon tracks. Part II: large-scale circulation and ENSO. J. Clim. 20, 3654–3676 (2007)
8. Chawla, S., Zheng, Y., Hu, J.: Inferring the root cause in road traffic anomalies. In: International conference on data mining, pp. 141–150 (2012)
9. Chen, D., Li, X., Wang, L., Khan, S., Wang, J., Zeng, K., Cai, C.: Fast and scalable multi-way analysis of massive neural data. IEEE Trans. Comput. 63 (2014)
10. Chen, L., Özsu, M.T., Oria, V.: Robust and fast similarity search for moving object trajectories. In: ACM SIGMOD international conference on management of data, pp. 491–502 (2005)
11. Chen, D., Wang, L., Zomaya, A.Y., Dou, M., Chen, J., Deng, Z., Hariri, S.: Parallel simulation of complex evacuation scenarios with adaptive agent models. IEEE Trans. Parallel Distrib. Syst. 25 (2014)
12. Chen, D., Li, X., Cui, D., Wang, L., Lu, D.: Global synchronization measurement of multivariate neural signals with massively parallel nonlinear interdependence analysis. IEEE Trans. Neural Syst. Rehabil. Eng. 22, 33–43 (2014)
13. Chudova, D., Gaffney, S., Mjolsness, E., Smyth, P.: Translation-invariant mixture models for curve clustering. In: ACM SIGKDD international conference on knowledge discovery and data mining, pp. 79–88 (2003)
14. Defays, D.: An efficient algorithm for a complete link method. Comput. J. 20, 364–366 (1977)
15. Deng, Z., Wu, X., Wang, L., Chen, X., Ranjan, R., Zomaya, A., Chen, D.: Parallel processing of dynamic continuous queries over streaming data flows. IEEE Trans. Parallel Distrib. Syst. PrePrint
16. Ester, M., Kriegel, H., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the 2nd international conference on knowledge discovery and data mining, pp. 226–231 (1996)
17. Ferreira, N., Silva, C., Klosowski, J.T., Scheidegger, C.: Vector field k-means: clustering trajectories by fitting multiple vector fields. Comput. Graph. Forum 32, 201–210 (2013)
18. Frentzos, E., Gratsias, K., Theodoridis, Y.: Index-based most similar trajectory search. In: IEEE international conference on data engineering, pp. 816–825 (2007)
19. Frentzos, E., Gratsias, K., Pelekis, N., Theodoridis, Y.: Algorithms for nearest neighbor search on moving object trajectories. Geoinformatica 11, 159–193 (2007)
20. Geolife project (Microsoft Research Asia). http://research.microsoft.com/en-us/downloads/b16d359d-d164-469e-9fd4-daa38f2b2e13/ (2012)
21. Han, J., Kamber, M.: Data Mining: Concepts and Techniques, 3rd edn. Morgan Kaufmann, Burlington (2011)
22. Kisilevich, S., Mansmann, F., Nanni, M., Rinzivillo, S.: Spatio-temporal clustering. Data Mining and Knowledge Discovery Handbook, 2nd edn, pp. 855–874. Springer, New York (2010)
23. Kolodziej, J., Khan, S.U.: Multi-level hierarchical genetic-based scheduling of independent jobs in dynamic heterogeneous grid environment. Inf. Sci. 214, 1–19 (2012)
24. Kołodziej, J., González-Vélez, H., Wang, L.: Advances in data-intensive modelling and simulation. Future Gener. Comput. Syst. 37, 282–283 (2014)
25. Lee, J.G., Han, J., Whang, K.Y.: Trajectory clustering: a partition-and-group framework. In: ACM SIGMOD international conference on management of data, pp. 49–60 (2007)
26. Liu, L., Song, J., Guan, B., Wu, Z., He, K.: Tra-DBSCAN: an algorithm of clustering trajectories. Front. Manuf. Des. Sci. II(121–126), 4875–4879 (2012)
27. Liu, H., Chen, S., Kubota, N.: Intelligent video systems and analytics: a survey. IEEE Trans. Ind. Inform. 9, 1222–1223 (2013)
28. Liu, P., Yuan, T., Ma, Y., Wang, L., Liu, D., Yue, S., Kołodziej, J.: Parallel processing of massive remote sensing images in a GPU architecture. Comput. Inform. 33, 197–217 (2014)
29. Loh, W.K., Moon, Y.S., Park, Y.H.: Fast density-based clustering using graphics processing units. IEICE Trans. Inform. Syst. 97, 1349–1352 (2014)
30. Ma, Y., Wang, L., Liu, D., Yuan, T., Liu, P., Zhang, W.: Distributed data structure templates for data-intensive remote sensing applications. Concurr. Comput. 25, 1784–1793 (2013)
31. Ma, Y., Wang, L., Zomaya, A.Y., Chen, D., Ranjan, R.: Task-tree based large-scale mosaicking for massive remote sensed imageries with dynamic DAG scheduling. IEEE Trans. Parallel Distrib. Syst. 25, 2126–2135 (2014)
32. MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, pp. 281–297 (1967)
33. Nanni, M., Pedreschi, D.: Time-focused clustering of trajectories of moving objects. J. Intell. Inf. Syst. 27, 267–289 (2006)
34. NVIDIA Corporation: Kepler—the world's fastest, most efficient HPC architecture. http://www.nvidia.com/object/nvidia-kepler.html (2013)
35. Park, H.S., Jun, C.H.: A simple and fast algorithm for k-medoids clustering. Expert Syst. Appl. 36, 3336–3341 (2009)
36. Patwary, M.M.A., Palsetia, D., Agrawal, A., Liao, W.K., Manne, F., Choudhary, A.: Scalable parallel OPTICS data clustering using graph algorithmic techniques. In: The international conference for high performance computing, networking, storage and analysis, pp. 49:1–49:12 (2013)
37. Pelekis, N., Kopanakis, I., Marketos, G., Ntoutsi, I., Andrienko, G., Theodoridis, Y.: Similarity search in trajectory databases. In: International symposium on temporal representation and reasoning, pp. 129–140 (2007)
38. Pfoser, D., Jensen, C.S., Theodoridis, Y.: Novel approaches to the indexing of moving object trajectories. In: International conference on very large databases, pp. 395–406 (2000)
39. Rinzivillo, S., Pedreschi, D., Nanni, M., Giannotti, F., Andrienko, N., Andrienko, G.: Visually driven analysis of movement data by progressive clustering. Inf. Vis. 7, 225–239 (2008)
40. Shekhar, S., Evans, M.R., Gunturi, V., Yang, K.: Spatial big-data challenges intersecting mobility and cloud computing. In: ACM international workshop on data engineering for wireless and mobile access, pp. 1–6 (2012)
41. Sibson, R.: SLINK: an optimally efficient algorithm for the single-link cluster method. Comput. J. 16, 30–34 (1973)
42. Tang, L.A., Zheng, Y., Yuan, J., Han, J., Leung, A., Peng, W.C., Porta, T.L.: A framework of traveling companion discovery on trajectory data streams. ACM Trans. Intell. Syst. Technol. 5, 3:1–3:34 (2013)
43. Vlachos, M., Kollios, G., Gunopulos, D.: Discovering similar multidimensional trajectories. In: IEEE international conference on data engineering, pp. 673–684 (2002)
44. Wang, L., von Laszewski, G., Younge, A.J., He, X., Kunze, M., Tao, J., Fu, C.: Cloud computing: a perspective study. New Gener. Comput. 28, 137–146 (2010)
45. Wang, L., Chen, D., Hu, Y., Ma, Y., Wang, J.: Towards enabling cyberinfrastructure as a service in clouds. Comput. Electr. Eng. 39, 3–14 (2013)
46. Wang, L., Lu, K., Liu, P., Ranjan, R., Chen, L.: IK-SVD: dictionary learning for spatial big data via incremental atom update. Comput. Sci. Eng. 16, 41–52 (2014)
47. Wei, J., Liu, D., Wang, L.: A general metric and parallel framework for adaptive image fusion in clusters. Concurr. Comput. 26, 1375–1387 (2014)
48. Wu, H.R., Yeh, M.Y., Chen, M.S.: Profiling moving objects by dividing and clustering trajectories spatiotemporally. IEEE Trans. Knowl. Data Eng. 25, 2615–2628 (2013)
49. Xue, W., Yang, C., Fu, H., Wang, X., Xu, Y., Gan, L., Lu, Y., Zhu, X.: Enabling and scaling a global shallow-water atmospheric model on Tianhe-2. In: International parallel and distributed processing symposium, pp. 745–754 (2014)
50. Yuan, N.J., Zheng, Y., Zhang, L., Xie, X.: T-Finder: a recommender system for finding passengers and vacant taxis. IEEE Trans. Knowl. Data Eng. 25, 2390–2401 (2013)
51. Zhao, J., Wang, L., Tao, J., Chen, J., Sun, W., Ranjan, R., Kołodziej, J., Streit, A., Georgakopoulos, D.: A security framework in G-Hadoop for big data computing across distributed cloud data centres. J. Comput. Syst. Sci. 80, 994–1007 (2014)
52. Zoumpatianos, K., Idreos, S., Palpanas, T.: Indexing for interactive exploration of big data series. In: ACM SIGMOD international conference on management of data, pp. 1555–1566 (2014)

Ze Deng received the B.Sc. degree from China University of Geosciences, the M.Eng. degree from Yunnan University, and the Ph.D. degree from Huazhong University of Science and Technology, China. He is currently an Assistant Professor with the School of Computer Science, China University of Geosciences, Wuhan, China, and also a postdoctoral researcher with the Faculty of Resources, China University of Geosciences, Wuhan, China.

Yangyang Hu received the B.Sc. degree from China University of Geosciences. He is currently a master's degree candidate with the School of Computer Science, China University of Geosciences, Wuhan, China. His research interests include data management, high-performance computing, and neuroinformatics.


Mao Zhu received the B.Sc. degree from China University of Geosciences. He is currently a master's degree candidate with the School of Computer Science, China University of Geosciences, Wuhan, China. His research interests include data management, high-performance computing, and simulation.

Xiaohui Huang is currently a B.Sc. degree candidate with the School of Computer Science, China University of Geosciences, Wuhan, China. His research interests include high performance computing and data storage.


Bo Du is currently a B.Sc. degree candidate with the School of Computer Science, China University of Geosciences, Wuhan, China. His research interests include data management, high performance computing and simulation.