ISSUES IN THE MANAGEMENT OF MOVING POINT OBJECTS Ph.D. Thesis

Dieter Pfoser Advisor: Christian S. Jensen Computer Science Department, Aalborg University

Fredrik Bajers Vej 7E, DK-9220 Aalborg Øst, DENMARK

Abstract

Database applications dealing with spatiotemporal phenomena, i.e., phenomena located in space and evolving over time, are numerous. In this context, the present work investigates issues related to applications involving moving point objects. We present a method to assess moving point object data, propose a suitable representation of this data in the form of trajectories, and suggest prototypical queries. Trajectory data can stem from different movement scenarios such as unconstrained and constrained movement. The latter considers “infrastructure” that hinders the movement. For efficient query processing, we propose adapted and new access methods, as well as a new query processing technique that considers infrastructure in a filter step. An important query type is the join. By exploiting properties of the data, we propose a technique for computing this operation incrementally. The way we assess and represent moving point objects introduces uncertainty. Part of the thesis describes an augmented representation that considers uncertainty. We adapt the query processing techniques accordingly. A topic across chapters is the generation of data for the various experiments. We propose modifications to an existing spatiotemporal data generator and show how to create datasets that correspond to real-life situations.

DATABASE SUPPORT FOR OBJECTS IN CONTINUOUS MOTION Ph.D. Thesis

Dieter Pfoser Advisor: Christian S. Jensen Department of Computer Science, Aalborg University

Fredrik Bajers Vej 7E, DK-9220 Aalborg Øst, DENMARK

Summary (translated from the Danish Resumé)

The area of spatiotemporal applications is extensive, and many prototypical applications of database technology therefore exist. This thesis focuses on aspects of such applications that involve point objects capable of moving continuously. We present a method for observing such moving objects, propose a suitable representation of their movements in the form of so-called polylines, and describe prototypical queries on the objects' movement. The objects' movements can either be unconstrained or subject to certain constraints. In the latter case, there exists an "infrastructure" that imposes restrictions on the objects' movements. To achieve efficient query execution, we propose adapted and entirely new indexing techniques and algorithms, including algorithms that seek to exploit the infrastructure in an initial, so-called filter step. The join operation, which links different data elements based on their properties, is important. We exploit the way in which data is updated to develop a technique that makes it possible to compute the join operation incrementally, i.e., by reusing the results of earlier join computations. The way in which we observe and represent the objects' movement introduces uncertainty. Parts of this thesis develop an extended representation that takes this uncertainty into account. The thesis also adapts query execution techniques to take uncertainty into account. The need to generate data for use in empirical studies of the performance of query execution techniques recurs in several chapters of the thesis. The thesis describes modifications to an existing algorithm that generates spatiotemporal data, and it illustrates how data can be created that corresponds to real-life situations.

Contents

Chapter 1 Introduction
1.1 Spatiotemporal?
1.2 Spatiotemporal Research
1.3 Goals and Contributions
1.4 Outline of the Thesis
1.5 Acknowledgements
Chapter 2 Requirements and Definitions for the Spatiotemporal Domain
2.1 Concepts in Spatiotemporal Applications
2.1.1 Spatial Concepts
2.1.2 Temporal Concepts
2.1.3 Spatiotemporal Concepts
2.1.4 Analogies Between the Spatial and Temporal Domain
2.2 Moving Point Objects
2.2.1 An Application Scenario - GPS-based Fleet Management
2.2.2 Sampling Moving Objects
2.2.3 Infrastructure
2.2.4 Queries
2.3 Conclusions
Chapter 3 Data Generation
3.1 Background
3.1.1 GSTD Overview
3.2 Introducing Semantics in GSTD
3.2.1 Extending the GSTD Algorithm
3.2.2 Clustered Movements
3.2.3 Infrastructure
3.3 Complex Scenarios on Moving Object Trajectories
3.3.1 Movement Parameters
3.3.2 Clustered Movement
3.3.3 Obstructed Movement
3.4 Conclusions and Future Work
Chapter 4 Query Processing and Indexing
4.1 Introduction
4.1.1 Indexing
4.1.2 Query Processing
4.2 The Access Methods
4.2.1 The STR-tree
4.2.2 The TB-tree
4.2.3 A Qualitative Comparison
4.3 Query Processing Algorithms
4.3.1 Topological Queries
4.3.2 Combined Search Algorithms
4.4 Performance Comparison
4.4.1 Datasets
4.4.2 Space Utilization and Index Size
4.4.3 Preservation Parameter
4.4.4 Range Queries
4.4.5 Time Slice Queries
4.4.6 Topological Queries
4.4.7 Combined Queries
4.4.8 Summary
4.5 A Querying Processing Technique
4.5.1 Query Processing and Infrastructure
4.5.2 Query Window Split Algorithms
4.5.3 Query Window Segmentation and Indexing
4.6 Experimental Studies
4.6.1 Varying Query Window Size
4.6.2 Varying Infrastructure
4.6.3 Varying LRU Buffer Size
4.6.4 Summary
4.7 Conclusions and Future Work
Chapter 5 Incremental Join Processing
5.1 Temporal Joins
5.1.1 Temporal Joins and Partitioned Storage
5.1.2 Temporal Join Recomputation
5.1.3 Incremental Join Computation
5.2 Analytical Cost Formulas
5.2.1 Recomputation
5.2.2 Incremental Computation
5.3 Performance Study
5.3.1 General Considerations
5.3.2 Comparing Recomputation Algorithms
5.3.3 Incremental Computation Versus Recomputation
5.3.4 Summary of Performance Study
5.4 Conclusions and Future Work
Chapter 6 Indeterminacy
6.1 An Introduction to Indeterminacy Measures
6.1.1 Fuzzy Set Theory
6.1.2 Probability Theory
6.1.3 Differences between Fuzzy Set and Probability Theory
6.2 Temporal Indeterminacy
6.2.1 Indeterminate Time Points
6.2.2 Indeterminate Time Intervals
6.2.3 Differences between Fuzzy Set and Probability Theory in the Temporal Domain
6.3 Spatial Indeterminacy
6.3.1 Indeterminate Spatial Objects, Relationships and Attributes
6.3.2 Indeterminate Geometry
6.3.3 Differences between Fuzzy Set Theory and Probability Theory in the Spatial Domain
6.4 Spatiotemporal Indeterminacy
6.4.1 Spatiotemporal Scenarios and Change
6.5 Sampling and Uncertainty
6.5.1 Quantifying Uncertainty
6.5.2 Measurement Error
6.5.3 Uncertainty in Sampling
6.6 A Representation for Moving Point Objects
6.7 Conclusions and Future Work
Chapter 7 Query Processing and Indeterminacy
7.1 Queries and Uncertainty
7.2 Indexing
7.2.1 Context
7.2.2 Processing Uncertainty Queries
7.3 Constrained Scenarios
7.4 Conclusions
Chapter 8 Conclusions and Future Work
8.1 Conclusions
8.2 Future Work
Bibliography
Appendix A
Appendix B
Appendix C

Figures

Figure 2.1: Spatial objects and layers are orthogonal in space
Figure 2.2: Movements and space
Figure 2.3: (a) Topological predicates and (b) combined queries
Figure 3.1: Consecutive instances of a time-evolving object and the corresponding projections
Figure 3.2: Infrastructure (a set of rectangles)
Figure 3.3: Slow moving objects
Figure 3.4: Fast moving objects
Figure 3.5: Smoothly moving objects
Figure 3.6: A directed movement towards “south”
Figure 3.7: Clustered movements towards the corners
Figure 3.8: Obstructed movement, sparse infrastructure
Figure 3.9: Obstructed movement, dense infrastructure
Figure 4.1: An R-tree index
Figure 4.2: (a) Approximating trajectories using MBBs, and (b) mapping of line segments in a MBB
Figure 4.3: Insertion into the STR-tree
Figure 4.4: STR-tree insert algorithm
Figure 4.5: Different split scenarios
Figure 4.6: STR-tree split algorithm
Figure 4.7: Insertion into the TB-tree
Figure 4.8: TB-tree insert algorithm
Figure 4.9: The TB-tree structure
Figure 4.10: Trajectories and covering MBBs: (a) 10 trajectories; MBBs of the R-tree (b) at level 1 and (c) at level 2, respectively for the STR-tree (d)-(e), and respectively for the TB-tree (f)-(g)
Figure 4.11: Processing (a) combined and (b) topological queries
Figure 4.12: Stages in combined search
Figure 4.13: R-tree and STR-tree: CombinedSearch algorithm for trajectory-based queries
Figure 4.14: TB-tree: CombinedSearch algorithm update
Figure 4.15: Comparison STR/R-tree number of node accesses for insertion with varying preservation parameter p
Figure 4.16: Comparison on range queries: varying range, (a) 1%, (b) 10% and (c) 20% in each dimension
Figure 4.17: Comparison on range queries for datasets with a varying time horizon: range, (a) 1%, (b) 10% and (c) 20% in each dimension
Figure 4.18: Comparison on range queries for datasets with a varying objects agility: range, (a) 1%, (b) 10% and (c) 20% in each dimension
Figure 4.19: Comparison on time slice queries: varying spatial range, (a) 1%, (b) 10% and (c) 100% in each spatial dimension
Figure 4.20: Comparison on topological queries: varying range, (a) 1%, (b) 10% and (c) 20% in each dimension
Figure 4.21: Comparison on combined queries: (a) 1% inner range and 10% outer range and (b) 1% inner range and 20% outer range, in each dimension
Figure 4.22: Moving objects snapshots and infrastructure
Figure 4.23: Query window segmentation algorithm
Figure 4.24: (a) Elongated rectangles, and (b) seed points
Figure 4.25: Segmented query windows
Figure 4.26: Hilbert space filling curve
Figure 4.27: A snapshot of the trajectory dataset
Figure 4.28: Query processing under varying LRU buffer size
Figure 5.1: Temporal relations
Figure 5.2: Partitioned temporal relations
Figure 5.3: Temporal join for partitioned storage
Figure 5.4: deptLocation cur ⋈PT empDepartment old
Figure 5.5: empDepartment old ⋈PT deptLocation old
Figure 5.6: Overview incremental join computation
Figure 5.7: Temporal joins using (a) BlockSkip and (b) TupleSkip
Figure 5.8: NL vs. SMB join for varying buffer size
Figure 5.9: NL vs. SMB join for varying percentages of long-lived tuples: (a) buffer size 1/16 of relation size and (b) buffer size 1/2 of relation size
Figure 5.10: NL vs. SMB join for varying tuple lifespans
Figure 5.11: Recomputation versus incremental computation using varying outdatedness and buffer sizes
Figure 6.1: Determinate (I1) and indeterminate (I2) time points
Figure 6.2: Indeterminate time interval
Figure 6.3: Time interval: probabilities of bounding time points
Figure 6.4: Probability for an indeterminate point in space
Figure 6.5: Boundary point probability
Figure 6.6: Boundary point probability
Figure 6.7: Positional error in the GPS
Figure 6.8: Possible trajectories of a moving object
Figure 6.9: Uncertainty between samples
Figure 6.10: Probability functions for sampling errors, (a) normal-case sampling error (b) worst-case sampling error
Figure 6.11: Evolving sampling error
Figure 6.12: Error ellipses
Figure 6.13: Varying sampling rate
Figure 6.14: An example database containing positional and error information connected to a fleet management application of a taxi company
Figure 7.1: Summing up the probability
Figure 7.2: Query window expansion: high probability
Figure 7.3: Refinement step
Figure 7.4: Small-window query
Figure 7.5: Query window expansion: small query window
Figure 7.6: Query window expansion and infrastructure
Figure C.1: Relating Number of Tuples to Elapsed Time

Chapter 1 Introduction

1.1 Spatiotemporal?

What is the essence of spatiotemporal information? With respect to databases, temporal databases record the temporal aspects, e.g., the time when an accident happened, and spatial databases record the spatial aspects of objects and/or relationships, e.g., the location where an accident happened. Spatiotemporal databases deal with the combined concept, i.e., “spatiotemporal” expresses both aspects.

Spatial and temporal databases have been important sub-areas of database research for a long time. Researchers in both areas have always felt that there are important connections between the problems addressed by each area, and between the techniques and tools used to solve them. Many publications on temporal databases conclude with the phrase “the ideas in this work can be extended to spatial data management.” Similarly, many works on spatial databases suggest that techniques developed for spatial databases are applicable to the management of temporal data by restricting attention to one dimension only. But up to now, little has been done towards the systematic interaction and synergy between these two areas so that the respective claims can be formally verified, refuted, or appropriately qualified [46]. It is clear that, despite the many efforts, the spatial and temporal research areas have not yet met satisfactorily. The main reasons lie in the complexity of their components: space by itself is a complex and intricate issue, involving, among others, positions of objects, and spatial attributes that change values depending on specific locations.

Spatiotemporal applications (i.e., applications dealing with spatiotemporal concepts) can be categorized based on the types of data they manage, which may pertain to the past, the present, the future, or a combination of these.
For example, applications managing past data often conduct analyses of movements over time, answering queries such as, “What were the movements of the Vikings in the North Sea between year 1000 and year 1200?” Applications dealing with present and future data capture the current spatial extents of objects in the database and typically make predictions about the future extents of the objects. Sample queries include, “What is the position of flight SAS 286?” and “Where will flight SAS 286 be in 20 minutes?”

A different categorization is based on the type of objects spatiotemporal applications manage. The first category comprises applications dealing with continuously moving real-world objects that disregard the objects' spatial extents and represent their positions as points. Candidate applications include fleet management, air traffic control, military command-and-control systems, and people tracking. The second category comprises objects whose characteristics, as well as positions, may change over time, i.e., the position, the shape of the object, as well as other properties change discretely. Consider here a cadastral information system, which records the extent, ownership, and other information related to land parcels. The third category includes objects whose positions, shapes, and other related properties change continuously. Consider here environmental applications dealing with “pollution,” which is measured as a moving phenomenon that changes its properties and shape over time.

1.2 Spatiotemporal Research

Many groups internationally conduct research related to spatiotemporal information, and examples of this research are presented in this section.

The European network for training and mobility of young researchers (TMR) CHOROCHRONOS involves ten nodes from eight different European countries. The name is composed of the Greek words for space and time. The research conducted in this network ranges from perceptual issues, such as how we perceive spatiotemporal phenomena, e.g., life styles, over conceptual modeling issues, data representation (data types), and constraint databases, to issues such as query processing, indexing, and database architecture. The activities of the network are described in [46]. The network is coordinated by Timos Sellis from the National Technical University of Athens, Greece; more information about the network can be found at http://www.dbnet.ece.ntua.gr/~choros/.

Some of the first spatiotemporal research stems from a group at the Department of Electrical Engineering and Computer Science at the University of Illinois at Chicago, USA, headed by Ouri Wolfson. The research of this group focuses mainly on handling current and future positions of moving objects [134] [135] [136]. More information can be found at http://www.eecs.uic.edu/~wolfson/.

Several researchers in the Department of Spatial Information Science at the University of Maine, USA, investigate spatiotemporal reasoning and spatiotemporal application development in the context of storing and maintaining geographic information over time. More information can be found at http://www.spatial.maine.edu.

Further, a group around Donna Peuquet at the Department of Geography at Pennsylvania State University, USA, conducts research into what it takes to enhance a Geographic Information System with temporal capabilities. In the course of their research, they developed several prototype systems.
More information can be found at http://www.geog.psu.edu/faculty/donnaP.html.

Gail Langran pioneered work on how to include temporal aspects in geographic information systems [70]. The work of Stefano Spaccapietra’s group at the University of Lausanne, Switzerland, is mainly concerned with the conceptual modeling of spatiotemporal information [91]. A large-scale project on advanced transportation systems [22] is run by Caltrans, the Department of Transportation of the State of California, USA. Their goal is to integrate services such as traffic management, emergency vehicle management, and traveler information systems at a statewide level.

1.3 Goals and Contributions

As we saw previously, the aspects of spatiotemporal information related to databases are vast. The focus of the research effort documented in this PhD thesis is on moving point objects and, more specifically, on past and present information. We investigate issues related to the data itself, i.e., data acquisition, representation, and indeterminacy (uncertainty), as well as issues related to the use and exploitation of such data. This includes efficient query processing, in particular indexing and incremental query processing techniques.

An application that utilizes such techniques is fleet management. Among other tasks, this application requires keeping track of the movements of a number of vehicles. This application is used as an example later on, when discussing data acquisition and representation. Another application context is mobile computing [7]. The estimates are that, by the year 2003, 500 million people will use mobile terminals [64]. Many of these terminals will be equipped with a GPS device and, thus, may make their positions available to the outside, digital, and geo-referenced world. Applications here include spatiotemporal data mining, as well as providing geo- and time-referenced content. Further application examples include analyzing accelerator data in particle physics as well as pollution streams in environmental applications. In the former, collisions between high-speed particles produce particle trajectories that have to be analyzed. The latter example requires the recording and analysis of pollution streams modeled as a set of particles moving over time in lakes or other bodies of water [34].

Moving point objects stand for a particular type of application within the spatiotemporal context.
Part of the thesis is concerned with structuring the whole spatiotemporal application domain, i.e., what types of spatiotemporal data exist and how applications dealing with moving point objects relate to other applications. “Other” applications include cadastral applications with discretely changing shapes. To characterize indeterminacy for spatiotemporal data, and for moving point objects in particular, we first examine it in a broader context, i.e., we describe indeterminacy in the temporal, spatial, and spatiotemporal domains. Subsequently, we describe indeterminacy as related to trajectories. Moving point objects constitute a new kind of data. Consequently, new types of queries emerge. To process them, we have to devise new access methods and, more generally, new query processing techniques. In the thesis, we devise two new access methods and a new query processing technique that is an adaptation of the filter-refinement technique known from spatial databases.

To experiment with new techniques one needs data. To the best of our knowledge, three data generators are available. For our purposes, we adapted an existing spatiotemporal data generator to produce appropriate datasets.

1.4 Outline of the Thesis

The results reported in this thesis have been published in conference proceedings or are currently under submission.

1. Pfoser, D. and Jensen, C. S.: Spatiotemporal Query Processing for Constrained Movement Scenarios. Under submission, 2001.
2. Pfoser, D. and Tryfona, N.: Fuzziness and Uncertainty in Spatiotemporal Applications. Under submission, 2001.
3. Pfoser, D., Jensen, C. S., and Theodoridis, Y.: Novel Approaches in the Indexing of Moving Object Trajectories. In Proceedings of the 26th Conference on Very Large Databases, pp. 395-406, 2000.
4. Pfoser, D. and Theodoridis, Y.: Generating Semantics-Based Trajectories of Moving Objects. In Proceedings of the International Workshop on Emerging Technologies for Geo-Based Applications, Ascona, Switzerland, 2000.
5. Pfoser, D. and Jensen, C. S.: Capturing the Uncertainty of Moving-Object Representations. In Proceedings of the 6th International Symposium on the Advances in Spatial Databases, pp. 111-132, 1999.
6. Pfoser, D. and Jensen, C. S.: Incremental Join of Time-Oriented Data. In Proceedings of the 11th International Conference on Scientific and Statistical Database Management, pp. 232-242, 1999.
7. Pfoser, D. and Tryfona, N.: Requirements, Definitions, and Notations for Spatiotemporal Application Environments. In Proceedings of the 6th International Symposium on Advances in Geographic Information Systems, pp. 124-130, 1998.

Chapter 2 of the thesis gives an overview of the spatiotemporal application domain, i.e., what types of spatiotemporal applications, and thus data, exist and how we can characterize them. This chapter also describes the data, i.e., trajectories, and the queries stemming from moving point objects. Sampling is suggested as a method to obtain trajectories of moving point objects. Queries for spatiotemporal data can be derived from the spatial and temporal domains. However, whenever there is a new type of data, new types of queries arise as well.
Chapter 3 is concerned with spatiotemporal data generation. The testing of new data structures and access methods requires datasets, either real or synthetic. Since real datasets (i) are not accessible for some applications and (ii) may not be useful for stress testing, a considerable body of work in the literature addresses the generation of synthetic data according to given specifications. We describe extensions to an existing spatiotemporal data generator, the GSTD tool, and show how this modified generator can be used to create datasets for several example scenarios.

Chapter 4 devises two spatiotemporal access methods tailored to the requirements of trajectory data and associated queries. The assumption that spatial and spatiotemporal data share similarities drives the development of those access methods. Both methods are derived from the well-known R-tree structure. We consider the following particularities of spatiotemporal data: the preservation of trajectories in the index and the fact that the given type of data is append-only with respect to time. The two access methods we propose are the Spatio-Temporal R-tree (hereafter called STR-tree) and the Trajectory-Bundle tree (hereafter called TB-tree). In this work, we also devise a new technique for processing spatiotemporal range queries based on the two-step technique known from spatial query processing (filter and refinement step). We introduce an additional pre-processing step, in which we do not query the trajectory data itself, but static spatial objects that constrain movement. We name those objects infrastructure. Examples include buildings, lakes, and pedestrian zones, all places where cars are not permitted.

The two most important types of query operators are selection and join. After proposing access methods to support the processing of spatiotemporal range queries using indices, we proceed in Chapter 5 by presenting incremental algorithms for the processing of joins. These algorithms exploit the temporal append-only property of the data, as is the case with trajectory data.

Chapter 6 deals with fuzziness and uncertainty, or collectively, indeterminacy in the spatiotemporal application context. We integrate spatial and temporal indeterminacy and show how both can be expressed by using either fuzzy set theory or probability theory. We discuss the nature of spatiotemporal indeterminacy and give a mathematical description of it. We then apply this general framework to the case of moving point objects. When acquiring the positions, we are concerned with the error of the acquisition technique, sampling. The representation is imprecise since we use interpolated positions in-between samples. We quantify indeterminacy for the proposed representation.
Chapter 7 shows how to take into account the indeterminacy connected to trajectory data. We present a framework that allows the use of indeterminacy in conjunction with indices for query processing. We also show how the query processing technique that considers infrastructure can be applied to range queries while taking indeterminacy into account.
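The outline above states that trajectories are obtained by sampling and that positions in-between samples are interpolated. The following is a minimal sketch of that idea, assuming linear interpolation and a hypothetical (t, x, y) sample layout; the function name position_at is illustrative only and does not stem from the thesis.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical sample layout: (time, x, y)
Sample = Tuple[float, float, float]


def position_at(samples: List[Sample], t: float) -> Tuple[float, float]:
    """Estimate the position at time t from sampled positions.

    Assumes linear interpolation between consecutive samples and
    strictly increasing sample times.
    """
    if not samples or t < samples[0][0] or t > samples[-1][0]:
        raise ValueError("t is outside the sampled time span")
    times = [s[0] for s in samples]
    i = bisect_right(times, t)          # index of the first sample after t
    if i == len(samples):               # t coincides with the last sample
        return samples[-1][1], samples[-1][2]
    t0, x0, y0 = samples[i - 1]
    t1, x1, y1 = samples[i]
    f = (t - t0) / (t1 - t0)            # fraction of the segment travelled
    return x0 + f * (x1 - x0), y0 + f * (y1 - y0)


samples = [(0.0, 0.0, 0.0), (10.0, 10.0, 20.0), (20.0, 10.0, 40.0)]
```

Under these assumptions, any position reported for a time between two samples is an estimate, which is one reason why the representation is imprecise.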

1.5 Acknowledgements

This research was supported in part by the CHOROCHRONOS project, funded by the European Commission DG XII Science, Research and Development, as a Network Activity of the Training and Mobility of Researchers Program, contract no. ERBFMRX-CT96-0056, the Danish Technical Research Council through grant 9700780, and a grant from the Nykredit Data Corporation.

Chapter 2 Requirements and Definitions for the Spatiotemporal Domain

In the last decade, the information systems community, and more specifically the database research community, obtained valuable results in modeling and retrieving, on the one hand, spatial objects [53] [76] [103] [138] and, on the other hand, objects in a temporal framework [59] [61] [77] [109]. Some other efforts fall within the spectrum of spatiotemporal systems, i.e., they deal with combined spatial and temporal information: Story and Worboys [114] propose a design support environment for spatiotemporal databases focusing on the integration of time. Allen et al. [4] present a generic model consisting of objects, states, events, and conditions for explicitly representing causal links within a temporal GIS. Worboys [138] proposes a unified model for information, which is referenced by two spatial dimensions and two temporal dimensions (database and event times). Claramunt et al. [28] present a set of design patterns for spatiotemporal processes expressed in an object-relationship data model. Finally, in this spectrum, domain experts give their own solutions and modeling techniques to these issues, mainly driven by the need for fast and applicable answers. This results in spatiotemporal systems tightly coupled with specific software and hardware (see for example [9] [10] [141] for the use of Arc/Info in environmental modeling). The notable exception here is the work of Faria et al. [43], in which they propose an extensible framework for spatiotemporal application development.

It is clear that, despite the many efforts, the spatial and temporal research areas have not yet met satisfactorily. The main reasons lie in the complexity of their components: space by itself is a complex and intricate issue, involving, among other things, the position of objects and spatial attributes whose values change depending on specific locations.

In this chapter, we address the concepts involved in spatiotemporal applications as they are drawn from users’ requirements. The goal is to facilitate a better understanding of spatiotemporal applications by providing the concepts and the notations needed. Later on, these concepts will be translated into specific constructs and implementation issues. More specifically, the concepts of snapshots, changes, and versions of objects and maps, motion, and phenomena are presented and then combined to accommodate spatiotemporal needs. After giving an overview of the spatiotemporal domain, this chapter examines one sub-domain more closely, i.e., moving point objects. Here, we present example applications as well as methods for data acquisition and representation. Data is not self-sufficient, but has to be seen in connection with queries. We explain “old” query types in the new context and introduce new query types significant for the new domain.

2.1 Concepts in Spatiotemporal Applications

Spatiotemporal applications fall into the category of data intensive applications, often referred to as “non-standard,” including, among others, multimedia, VLSI design, and artificial intelligence based systems. They differ from business data processing (exemplified by the “supplier-supplies-parts” paradigm) in a variety of ways, centered around the support of complex objects, relationships among them, and long transactions. In addition, spatiotemporal applications deal with objects whose position in space, as well as its change over time, matters. Section 1.1 gave two approaches to categorize spatiotemporal applications. One way was with respect to the type of moving objects the application manages, i.e., we distinguished discretely and continuously changing objects with or without areal extent. This section describes a set of spatiotemporal concepts, in connection with examples from the above application categories, drawn from requirements as found in theoretical research [17] [40] as well as applied experience (the design and development of a utility management system [129] and the design of a cadastral database [21]). First, the spatial and temporal concepts are given independently, and then they are combined to give spatiotemporal concepts.

A database instance is a collection of objects, which represents a part of the real world. Each object belongs to an object class, which is characterized by a set of properties or attributes. Each attribute is associated with a domain, which is a set of values. So, each object in a database instance is represented by a set of values, each belonging to the domain of the corresponding attribute of the object class. A database is called spatial, temporal, or spatiotemporal if it manages spatial, temporal, or spatiotemporal concepts, respectively. Next, we describe these concepts.
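The notions of object class, attribute, and domain can be made concrete with a small sketch. It is an illustration only: the names ObjectClass and admits, the attributes "owner" and "area_m2", and the choice to model a domain as a membership test are assumptions made here and are not part of the thesis.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Assumption: a domain (a set of values) is modelled as a membership test.
Domain = Callable[[Any], bool]


@dataclass
class ObjectClass:
    """An object class, characterized by a set of attributes with domains."""
    name: str
    attributes: Dict[str, Domain]

    def admits(self, values: Dict[str, Any]) -> bool:
        """True if `values` holds one in-domain value for every attribute."""
        return (values.keys() == self.attributes.keys()
                and all(self.attributes[a](v) for a, v in values.items()))


# Hypothetical object class with two descriptive attributes.
land_parcel = ObjectClass("landparcel", {
    "owner": lambda v: isinstance(v, str),
    "area_m2": lambda v: isinstance(v, (int, float)) and v > 0,
})

# A database instance is then a collection of objects admitted by their class.
database_instance = [{"owner": "Smith", "area_m2": 812.5}]
```

In this reading, an object is nothing more than one value per attribute, and the object class decides whether a given set of values is admissible.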
2.1.1 Spatial Concepts

In order to start a discussion about spatial concepts, we first need to refer to space and define what we mean by spatial objects. Moreover, space has attributes, which are represented as layers. Objects and layers are orthogonal and complementary views of space.

2.1.1.1 Space

Space is a set. The elements of space are called points, while finite sets of points (i.e., subsets of space, which can be points, lines, or regions) are called geometric figures. Any set will do for space; however, for the practical purposes of current spatiotemporal applications, space is modeled as a subset of ℝ³. ℤ³, ℤ², ℤ, ℝ², and ℝ are the most common subsets used in practice. In all specific examples in this work, the domain of space is ℝ².

2.1.1.2 Spatial Objects

Objects in the real world have a position in space. In specific application environments, objects’ positions in space matter, and these objects are called spatial objects. For example, a moving “car” in a navigational system has a position, as does a “landparcel” in a cadastral system. Position p of objects is a function from objects to parts of space, G (i.e., geometric figures) [33].

p : spatial_objects → G    (2.1)

2.1.1.3 Spatial Attributes

As said before, objects have attributes, which characterize them. Spatial objects have, apart from descriptive attributes, also spatial attributes; for example, the “vegetation” of a “landparcel.” Values of spatial attributes depend on the referenced position and not on the object itself. If the spatial object “landparcel” changes position, then the value of “vegetation” may change. More specifically, spatial attributes are properties of space, and spatial objects located in specific positions inherit parts of these attributes. However, not all spatial objects have spatial attributes; this depends on the application requirements. For example, no spatial attribute can be assigned to a moving “car”, while many (e.g., “vegetation”, “soil type”) can be assigned to a “landparcel.”

2.1.1.4 Layers or Fields

Spatial attributes refer to the whole space and can be represented as layers (or fields) representing one theme (i.e., thematic maps). Informally speaking, a layer L is a representation of a spatial attribute.
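The position function of Formula (2.1) and the informal notion of a layer can be illustrated with a small sketch. It assumes that a geometric figure is stored as a finite set of points in the plane and that a layer is a lookup table from figures to attribute values; the names position, vegetation_layer, and spatial_attribute, as well as all data values, are hypothetical.

```python
from typing import Dict, FrozenSet, Optional, Tuple

Point = Tuple[float, float]
Figure = FrozenSet[Point]      # geometric figure: a finite set of points

UNIT_SQUARE: Figure = frozenset({(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)})

# Position p : spatial_objects -> G, kept as a lookup table (made-up data).
position: Dict[str, Figure] = {"landparcel_1": UNIT_SQUARE}

# One layer for the spatial attribute "vegetation": figure -> value.
vegetation_layer: Dict[Figure, str] = {
    UNIT_SQUARE: "forest",
    frozenset({(5.0, 5.0), (5.0, 6.0)}): "grassland",
}


def spatial_attribute(obj: str, layer: Dict[Figure, str]) -> Optional[str]:
    """Value that the object inherits from the layer at its current position."""
    for figure, value in layer.items():
        if position[obj] <= figure:      # the object lies within this figure
            return value
    return None                          # the layer says nothing about this place
```

In this sketch, the attribute value is looked up through the position rather than stored with the object, so assigning the object a new position may yield a different value, which mirrors the “landparcel” and “vegetation” example.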
Formally speaking, a layer is a function from geometric figures to attribute domains. L : G → D1 × D2 × ... × Dk (2.2) In Formula (2.2), G is a finite set of geometric figures and Di, with 1