Towards Mining Approaches for Trajectory Data

International Journal of Advances in Science and Technology Vol. 2, No.3, 2011

Susanta Satpathy, Lokesh Sharma, Ajaya K. Akasapu, Netreshwari Sharma
Department of Computer Science and Engineering, Rungta College of Engineering and Technology, Bhilai, India
[email protected], {lksharmain, ajaykumar.akasapu, netresharma}@gmail.com

Abstract

The growing prevalence of location-acquisition technologies (RFID, GPS, GSM networks, etc.) is leading to the collection of large spatio-temporal datasets and to the prospect of discovering usable knowledge about movement behaviour, which encourages novel applications and services. Trajectory data mining is therefore emerging as a novel area of research. Mining trajectory data poses many challenges: mobility data must be analyzed by means of appropriate patterns and models extracted by efficient algorithms, and novel knowledge discovery processes must be developed that are explicitly tailored to analyzing trajectories with reference to geography, at appropriate scales and granularity. Urban traffic simulation is a straightforward example of an application for this kind of knowledge, since a classification model can offer a sophisticated alternative to the simple ad hoc behavior rules, provided by domain experts, on which current simulators are based. In this study we present a brief review of trajectory data mining. Several research issues remain open and unexplored; this paper therefore also reports some research challenges and directions for future work.

Keywords: Trajectory Data, Trajectory Association Rule Mining, Trajectory Classification, Trajectory Cluster.

1. Introduction

With the advent of telecommunications, particularly cellular phones, GPS technology and satellite imagery, robotics, Web traffic monitoring, and computer vision applications, the amount of spatiotemporal data stored every day has grown exponentially. Applications vary from tracking the movement of objects to monitoring the growth of forests. In most systems, time is considered to be unidirectional and linear. It is usually modeled as a discrete variable taking integer values that represent days, hours, and so on. Space, on the other hand, is considered bidirectional and nonlinear, especially in geographic systems; it is represented with real numbers, and the granularity is usually small. For example, it can be on the scale of feet or meters in an area that represents an entire city. Spatiotemporal data mining is a significantly more challenging task than simple spatial or temporal data mining. Spatiotemporal data is typically stored in 3-D format (2-D space information + time) [6] [7] [9].

During the last five years, attempts have been made to extend many techniques for knowledge discovery in classical relational or transactional data, such as association rule mining, frequent pattern discovery, clustering, classification, prediction and time-series analysis, to knowledge discovery in data with a spatial annotation. Much of this research considers only simple classes of patterns and focuses mainly on algorithmic aspects, often involving approximation techniques. The research in this field has not yet produced a theoretical framework for spatial data mining, which makes research on data mining for moving objects all the more challenging. The objectives in this area are manifold. First, the relevant patterns to mine for have to be discovered. Second, a taxonomy of these patterns will make clear for which mining tasks new techniques will have to be developed. Third, suitable algorithmic solutions will have to be developed to implement these mining tasks. Finally, this new research field could benefit from a clean, unified theoretical framework. These objectives are rather ambitious, and we will not try to tackle them for general moving or

March Issue


ISSN 2229 5216

changing objects, but rather in the more restricted setting of moving object data, or trajectory data. Still, these tasks go beyond research in spatial mining, because we always assume spatial information as background data for the moving object data, and this information will be involved in and appear in the mined patterns [5].

There is a clear and present need to bring together researchers from academia, government, and the private sector in the broad areas of knowledge discovery from trajectory data, such as trajectory data preprocessing, representation and transformation, scalable and distributed classification, prediction, clustering algorithms, and space-time sampling techniques.

In this paper we present a brief review of trajectory data mining. We discuss data mining algorithms for trajectory data, namely pattern discovery, classification and clustering. The remainder of the paper is organized as follows. Section 2 gives a semantic definition of trajectory data. Section 3 reports on pattern discovery in trajectory data. Section 4 presents classification on trajectory data. Section 5 reports on clustering techniques for trajectory data. Challenges and future directions are given in Section 6, and Section 7 concludes our study.
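As a concrete illustration of the 3-D representation mentioned in this section (2-D position plus a discrete timestamp), the following Python sketch stores a trajectory as a time-ordered list of samples. The names `Point`, `make_trajectory` and `position_at` are hypothetical and do not come from the paper, and the use of linear interpolation between samples is an assumption made only for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Point:
    """One spatiotemporal sample: integer time, real-valued position."""
    t: int      # discrete timestamp (unit is an assumption, e.g. seconds)
    x: float    # planar coordinate, e.g. meters
    y: float


def make_trajectory(samples: Iterable[Tuple[int, float, float]]) -> List[Point]:
    """Build a time-ordered trajectory and reject duplicate timestamps."""
    pts = sorted((Point(int(t), float(x), float(y)) for t, x, y in samples),
                 key=lambda p: p.t)
    for a, b in zip(pts, pts[1:]):
        if a.t == b.t:
            raise ValueError(f"duplicate timestamp {a.t}")
    return pts


def position_at(traj: List[Point], t: float) -> Tuple[float, float]:
    """Estimate the position at time t by linear interpolation (an assumption)."""
    if not traj or t < traj[0].t or t > traj[-1].t:
        raise ValueError("time outside the observed interval")
    for a, b in zip(traj, traj[1:]):
        if a.t <= t <= b.t:
            w = (t - a.t) / (b.t - a.t)
            return (a.x + w * (b.x - a.x), a.y + w * (b.y - a.y))
    # Only reached for a single-sample trajectory queried at its own timestamp.
    return (traj[0].x, traj[0].y)


if __name__ == "__main__":
    traj = make_trajectory([(10, 10.0, 20.0), (0, 0.0, 0.0)])
    print(position_at(traj, 5))
```

Keeping time as an integer and coordinates as floats mirrors the modeling convention described above; a real system would also need to fix units and a coordinate reference system.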

2. Trajectory Data Model

In this section we provide the basic definitions for trajectory data and describe the basic terminology of trajectory data mining [1] [5] [11] [12].

Definition 1 (Trajectory). A trajectory is the description of the movement of some object. More formally, a trajectory $T$ is the graph of a continuous mapping from $I \subseteq \mathbb{R}$ to $\mathbb{R}^2$ (the two-dimensional plane):

$$\alpha : I \subseteq \mathbb{R} \to \mathbb{R}^2 : t \mapsto \alpha(t) = (\alpha_x(t), \alpha_y(t)) \tag{1}$$

so that

$$T = \{ (\alpha_x(t), \alpha_y(t), t) \mid t \in I \} \subset \mathbb{R}^2 \times \mathbb{R}.$$

Definition 2 (Trajectory pattern mining). Given a trajectory database $TD$, a time tolerance $\tau$, a neighborhood function $N()$ and a minimum support threshold $s_{min}$, the trajectory pattern mining problem consists of finding all frequent T-patterns, i.e., all T-patterns $(S, A)$ such that $support_{TD,\tau,N}(S, A) \geq s_{min}$, where the support $support_{TD,\tau,N}$ of a T-pattern $(S, A)$ is the number of input trajectories $T \in TD$ such that $(S, A) \leq_{N,\tau} T$.

Definition 3 (Trajectory classification). Trajectory classification can be formalized as follows:

$$(T_1, c_1), \ldots, (T_m, c_m) \in (TD \times C) \tag{2}$$

where $TD$ is a non-empty set of trajectory lists $\{(t_0, x_0, y_0), (t_1, x_1, y_1), \ldots, (t_N, x_N, y_N)\}$, with $t_i, x_i, y_i \in \mathbb{R}$ for $i = 0, \ldots, N$ and $t_0 < t_1$