Modeling Satellite Image Streams for Change Analysis

Carlos Rueda and Michael Gertz
Department of Computer Science
University of California, Davis, CA 95616, U.S.A.
{carueda,gertz}@ucdavis.edu

ABSTRACT


Fast detection of changes in environmental remotely sensed data is a major requirement in the Earth sciences, especially in natural disaster related scenarios. As satellite, transmission, and network technologies continue to improve, the real-time stream processing and delivery of geospatial data from remote sensors requires a systematic approach for change analysis and visualization in a streaming fashion. Although various approaches have been formulated to model the inherent spatial-temporal-spectral complexity of remotely sensed satellite data, there are still challenging peculiarities that demand a precise characterization in the context of environmental change detection. In this paper, we present a formal characterization of fundamental operational aspects for the unambiguous specification of change detection and visualization queries in a streaming fashion. This goal is accomplished by defining spatially-aware temporal operators with a consistent semantics for change analysis tasks, and a practically relevant image stream processing architecture founded on a precise execution model and realized by using scientific workflows particularly targeted at collaborative scientific environments. We illustrate our approach with representative examples in land cover and wildfire detection using live data from environmental remote sensors.

Categories and Subject Descriptors
J.2 [Computer Applications]: Physical Sciences and Engineering—Earth and atmospheric sciences; H.2.1 [Database Management]: Logical Design—Data models

General Terms
Design, Experimentation

Keywords
Change detection, Data streams, Remote sensing, Scientific workflows

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ACM GIS'07, November 7-9, 2007, Seattle, WA. Copyright 2007 ACM ISBN 978-1-59593-914-2/07/11 ...$5.00.

1. INTRODUCTION

Detection of environmental changes using remotely sensed data is a major requirement in the Earth sciences [16]. While the prevalent approach in the remote sensing community is to process image dataset batches after complete scenes have been collected [10], several sensitive scenarios, especially in disaster prevention and monitoring, would greatly benefit if the incoming image streams were processed in a more expeditious manner, ideally in real time. Various algorithms for wildfire detection, for example, operate in a very localized manner, e.g., at the pixel level, so there is no intrinsic reason to wait until whole scenes are completed before the change analysis can proceed.

The problem of online processing and analysis of change in satellite imagery can be formulated and addressed in terms of algorithms dealing with data streams, that is, unbounded, high-speed sequences of data tokens coming from one or multiple sources. There has recently been a growing interest in the processing of data streams, and several data stream management techniques have been proposed (see, e.g., [2, 6]). However, driven typically by applications in the areas of sensor networks, event monitoring, and network traffic analysis, these efforts focus on tuple-based streams, and little research has been done toward a systematic treatment of geospatial images coming from spaceborne and airborne remote sensors.

In this paper, we present an image stream processing model especially geared toward supporting change analysis and visualization tasks on remotely sensed, streaming geospatial data. As a necessary foundation for the realization of such a framework, we propose a characterization of key operations for change detection and visualization in remote sensing imagery. A particular challenge is the handling of time in the context of image fragments coming from multiple sources, at different rates, and partially composing full geospatial scenes. These aspects are central for an adequate characterization of operators dealing with spatio-temporal analysis, which is typical in remote sensing change detection algorithms. For the realization of the framework, we describe a concrete architecture based on the Kepler scientific workflow system [15]. By using Kepler, we benefit from several advantages, including ease of use, the availability of an extensive library of tools, formal underpinnings, and an active user community. We illustrate the functionality of our approach with examples in the context of wildfire and land cover change detection.


As satellite, transmission, and network technologies continue to improve, with reduced latencies and even real-time streaming delivery of data from remote sensors, the necessity and feasibility of constructing a full-fledged system to fulfill the above requirements become apparent; this work is a concrete step in that direction. Our contributions in the proposed framework are summarized as follows:

• A precise definition of spatially-aware operators with a consistent semantics to support temporal change analysis tasks, including detection and visualization;

• A practically relevant image stream processing architecture founded on a precise execution model and realized by using scientific workflows, particularly targeted at collaborative scientific environments.

The remainder of the paper is structured as follows. In the following section, we detail the key operations on streaming image data underlying change detection and analysis. In Section 3, we give an overview of typical change detection techniques in the context of remotely sensed data. The realization of our approach using the Kepler scientific workflow system is detailed in Section 4, followed by two examples in Section 5. After a review of related work in Section 6, we conclude the paper in Section 7 with a summary and discussion of ongoing and future work.

2. IMAGES AND IMAGE STREAMS

Central to our data model are the concepts of geospatial images and streams of (geospatial) images. In this paper, we extend basic image algebra elements presented in [7, 21] to expressly introduce a concrete characterization of streams and operations for supporting temporal analysis in streaming geospatial data.

2.1 Image Algebra Basics and Notation

For self-containment purposes, we briefly summarize relevant elements of the image algebra [21]. In general, an image a is a function from some set of points X in a topological space to some set of values F, that is, a : X → F. Each element (x, a(x)) ∈ a is called a pixel, where x is its location and a(x) its value. The spatial domain X and the set of values of the image are denoted domain(a) and range(a), respectively. We say a is an F-valued image on X. The set of all images from X to F is denoted F^X.

Typically, unary and binary operations associated with a value set F induce corresponding operations on F-valued images. For example, given an image a ∈ R^X, the image with the absolute values of the pixels in a is given by |a| ≡ {(x, |a(x)|) : x ∈ X}. Similarly, given a scalar h ∈ R, the thresholding operation yields the binary image χ_{≥h}(a) ≡ {(x, 1) : a(x) ≥ h} ∪ {(x, 0) : a(x) < h}. For a binary operation γ over F, the induced binary image operation is defined as a γ b ≡ {(z, a(z) γ b(z)) : z ∈ X}, for images a, b ∈ F^X. For example, the binary addition of real-valued images a, b ∈ F^X is a + b ≡ {(z, a(z) + b(z)) : z ∈ X}.

Given a function f : F → G, with F and G value sets, a value transform of an image a, denoted f ∘ a ≡ {(x, f(a(x))) : x ∈ X}, changes an image in F^X to an image in G^X. For example, the conversion of an RGB color image in (Z^3)^X into a gray-scale image in Z^X can be given by a function f : Z^3 → Z, frequently defined as f(r, g, b) ≡ ⌊.3r + .58g + .11b⌋. This is an example of a pointwise value transform. Other value transforms require more complex spatial operations; examples include edge detection, contrast stretch, and spatial filtering. Often, a block of source points around a point is used to compute the corresponding point value in the resulting image. For example, the standard neighborhood function N : Z^2 → 2^{Z^2}, defined as N(x, y) ≡ {(x, y), (x − 1, y), (x, y − 1), (x + 1, y), (x, y + 1)}, is commonly used for the smoothing of a raster image a ∈ R^{Z^2} as follows: b(x) = (1/5) Σ_{y ∈ N(x)} a(y).

The domain restriction of an image a ∈ F^X to a point set R ⊆ X, denoted a|_R, is defined as the function {(x, a(x)) : x ∈ R} ∈ F^R. Given two images a ∈ F^X, b ∈ F^Y, the spatial extension of a to b, denoted a|^b ∈ F^{X∪Y}, is defined as: a|^b(x) ≡ a(x) if x ∈ X, and b(x) if x ∈ Y \ X.
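To make the preceding definitions concrete, the following sketch (our illustration, not part of the original formulation) expresses the thresholding, gray-scale value transform, neighborhood smoothing, and domain restriction operations over raster images represented as NumPy arrays; the array-based representation and the use of NaN for the missing value ⊥ are our own assumptions.

```python
import numpy as np

NODATA = np.nan  # stands in for the special missing-value element (the "bottom" value)

def threshold(a: np.ndarray, h: float) -> np.ndarray:
    """chi_{>=h}(a): binary image with 1 where a(x) >= h and 0 elsewhere."""
    return (a >= h).astype(np.uint8)

def to_gray(rgb: np.ndarray) -> np.ndarray:
    """Pointwise value transform f(r, g, b) = floor(.3r + .58g + .11b)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.floor(0.3 * r + 0.58 * g + 0.11 * b)

def smooth(a: np.ndarray) -> np.ndarray:
    """Neighborhood smoothing b(x) = (1/5) * sum_{y in N(x)} a(y) with the 5-point N."""
    p = np.pad(a, 1, mode="edge")
    return (p[1:-1, 1:-1] + p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 5.0

def restrict(a: np.ndarray, region: np.ndarray) -> np.ndarray:
    """Domain restriction a|_R, with R given as a boolean mask; pixels outside R become NODATA."""
    return np.where(region, a, NODATA)

# Example usage on a small random image
img = np.random.rand(4, 4)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
result = threshold(smooth(restrict(img, mask)), h=0.5)
```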

2.2 Geospatial Images

Point sets of raster images are usually regularly-spaced lattices in R^2, i.e., they have the form Z_m × Z_n ≡ {(x, y) ∈ Z^2 : 0 ≤ x < n, 0 ≤ y < m} for some integers m and n, where Z_n ≡ {0, 1, …, n − 1}. Although, in general, sensors use their native coordinate reference systems and have their corresponding georeferencing procedures to assign geographic locations to the pixels in an image (e.g., in terms of latitude/longitude), in this paper we assume all images share a common reference system.

Value sets in remotely sensed data are typically subsets of the integers Z and the reals R. Also common are vectors of these types, e.g., Z^b, in the case of instruments able to generate multi-spectral images, where b is the number of bands. Because of possible errors in acquisition or transmission, it is common practice to specify a special value to represent missing or erroneous data. Consequently, all value sets in this paper are assumed to be implicitly augmented with this special element, denoted ⊥.

Remotely sensed images are timestamped according to the time of acquisition. Timestamps assigned by particular sensors may obey different criteria, e.g., different units, precision, and/or different time references. In this paper, we abstract from these differences and model timestamps as belonging to a global time reference, τ ∈ R, so all images can be compared in terms of time of acquisition. Although time could be modeled as an additional dimension in an image's point set, in this paper we regard the timestamp as a metadata attribute. We write τ(a) to denote the timestamp associated with an image a. Metadata is further discussed below.

2.3 Geospatial Image Streams

Let U and F be a point set and a value set, respectively. An image stream over U and F is an unbounded, timestamp-ordered sequence of F-valued images coming from a certain sensor or operator, α ≡ ⟨a_1, a_2, …, a_t, …⟩, where domain(a) ⊆ U for each image a in α. We refer to U as the field of view of the stream. Note that, implicitly, a ≡ a|_U. For each stream α, we use the notation ⟨…, a⟩ to explicitly denote the current image a in the stream, i.e., the image whose timestamp is currently the largest.

Figure 1 depicts an image stream over a point set U within our spatio-temporal reference system. The spatial domains of the images are represented by vertical line segments (a convenient simplification of a possibly multi-dimensional spatial domain). Notice that the spatial domain of an image is not necessarily a connected set and that images in a stream may be irregularly timestamped.

Figure 1: Example of an image stream.

Stream metadata. A set of metadata attributes needs to be associated with each geospatial image stream, thus permitting a management system to properly support a variety of processing and query requirements. Metadata attributes are used and processed so that appropriate transformations are propagated across the operators in the system and the resulting attributes are associated with the derived image streams. Important attributes associated with each image stream α include: U, the field of view; F, the value set associated with the images; B, the number of bands of the images (i.e., F has the form G^B for some set G); S, the spatial resolution, i.e., the size of a pixel in terms of the corresponding geographical coordinate reference space; and T, the temporal resolution, i.e., the period of time between two consecutive scans of the same geographic area. Table 1 shows values of some of these attributes for representative satellite sensors (only the highest resolutions are shown in each case).

Instrument      Bands   Spatial resolution   Temporal resolution
Landsat ETM+      8     15×15 m              16 days
QuickBird         5     0.61×0.61 m          1–5 days
MODIS            36     250×250 m            1 day
ASTER            14     15×15 m              1 day
AVHRR             5     1.1×1.1 km           1/2 day
GOES Imager       5     1×1 km               5 min

Table 1: Basic characteristics of some standard spaceborne sensor instruments [10].

Region of interest coverage. It is important to examine how a given spatial region can be covered by remotely sensed images. Consider a field of view U, a region of interest R ⊆ U, and one or more image streams over U, as shown in Fig. 2. Some possible, non-mutually exclusive cases of coverage patterns of R are: (a) always full coverage, (b) irregular, (c) incremental, and (d) overlapping coverage in the case of multiple stream sources.

Figure 2: Possible cases of different streams covering a particular spatial region R.

We say a stream α is homogeneous with respect to a region of interest R if R ⊆ domain(a) for all images a in α. Otherwise, we say the stream is heterogeneous. For example, the stream in Fig. 2(a) is homogeneous, but the others are heterogeneous with respect to R. Commonly, satellite streams are heterogeneous with respect to a given region of interest. For example, several satellite instruments follow a row-by-row, incremental coverage pattern similar to that shown in Fig. 2(c) (e.g., the GOES Imager). However, as we shall discuss, a simpler characterization of stream operations makes homogeneity a desirable condition.

It is often necessary to refer to the current state of a full region of interest even in the case of heterogeneous coverage. For visualization purposes, for instance, just displaying the current image (fragment) in the stream would be very limited. Moreover, many spatial analysis operations operate on complete region-of-interest images. Below, we introduce the stream spatial extension as a basis for handling these cases.
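As an illustration only (a sketch we add here, not taken from the paper), a minimal in-memory representation of a timestamped image fragment and of the stream-level metadata attributes U, F, B, S, and T described above could look as follows; all field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, Tuple
import numpy as np

@dataclass
class StreamMetadata:
    field_of_view: Tuple[slice, slice]   # U: row/column extent in the common reference grid
    value_set: type                      # F: e.g., np.int16 or np.float32
    bands: int                           # B: number of spectral bands
    spatial_resolution_m: float          # S: pixel size
    temporal_resolution_s: float         # T: revisit period for the same area

@dataclass
class ImageFragment:
    values: np.ndarray                   # pixel values over the fragment's spatial domain
    domain: Tuple[slice, slice]          # spatial domain within U (here a rectangular window)
    timestamp: float                     # acquisition time tau(a) in the global time reference

# An image stream is then just an unbounded, timestamp-ordered iterator of fragments:
ImageStream = Iterator[ImageFragment]
```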

2.4 Operations on Image Streams

We primarily classify the operators in our framework according to their cardinality in terms of the main operated streams. Possible parameters are not considered for determining the cardinality of an operator; parameters are commonly scalars, point sets, value sets, and images. Regardless of arity, any operation involving a ⊥ operand will in general produce the ⊥ result. In the following, we introduce operations pertinent to change detection and visualization. We start with some basic operations and terminology, and then introduce more advanced operations for temporal analysis.

Unless noted otherwise, unary operations on images are straightforwardly extended to streams by applying the operation to the current image of the stream. The spatial domain of the operated image may or may not be altered. We say an operator op is domain preserving if and only if domain(op(a)) = domain(a). Figure 3 illustrates the subsetting operation extended to streams, α|_R ≡ ⟨…, a|_R⟩. In the example, note that the image at τ_1 in α|_R is empty (not shown).

Figure 3: Stream spatial restriction.

Other restrictions can be determined by predicates over the values of the images in a stream. For example, an alternative definition of the thresholding operation can be expressed as a|_{≥h} ≡ {(x, a(x)) : a(x) ≥ h}. In this case, the operation is not necessarily domain preserving. In general, for a unary predicate (function) p : F → {0, 1} over image values, its extension to F-valued streams can be expressed as α|_p ≡ ⟨…, a|_p⟩.

The spatial extension of a stream α to an image b is denoted α|^b. Instead of applying the image spatial extension to the current image in the stream, in this case the image b used for extension is "updated" each time a new image is available in the stream, so the resulting stream reflects the cumulated changes over the region being covered by the images. Initially, b is used for extending the first image in the stream, and it is then updated with subsequent images. Thus, the extension operation on α is defined by:

  α|^b ≡ ⟨a_1|^{b^(0)}, a_2|^{b^(1)}, …, a_t|^{b^(t−1)}⟩, where b^(0) = b and b^(t) = a_t|^{b^(t−1)} for t > 0.

In contrast to the operations introduced so far, the stream extension is a stateful operation, because it requires internal memory to keep track of the progress of the extension. A useful version of the stream extension is simply α|^∅. Its restriction to a given region of interest R ⊆ U, (α|^∅)|_R (see Fig. 4), will always provide the current status of R. In particular, a direct visualization of the stream can simply consist in displaying the current image in (α|^∅)|_R.

Figure 4: Stream spatial extension with restriction.
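A minimal sketch (ours; it assumes rectangular fragments positioned by offsets within a fixed region of interest, and NaN as the empty/missing value) of the stateful extension (α|^∅)|_R: a buffer over R is patched with each arriving fragment, so its contents always reflect the current status of R.

```python
import numpy as np

class StreamExtension:
    """Stateful (alpha|^empty)|_R over a fixed H x W region of interest."""

    def __init__(self, height: int, width: int):
        # the buffer starts as the empty image: all pixels missing (NaN)
        self.buffer = np.full((height, width), np.nan)

    def push(self, fragment: np.ndarray, row0: int, col0: int) -> np.ndarray:
        """Patch a new fragment (given by its top-left offset within R) into the buffer
        and return the cumulative image, i.e., the current image of the extended stream."""
        h, w = fragment.shape
        self.buffer[row0:row0 + h, col0:col0 + w] = fragment
        return self.buffer.copy()

# Usage: feeding 1-row fragments, as produced by a row-by-row scanning sensor
ext = StreamExtension(height=4, width=6)
for row in range(4):
    current = ext.push(np.random.rand(1, 6), row0=row, col0=0)
# 'current' now holds the fully covered region of interest
```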

Given that remotely sensed streams are often heterogeneous, in our model we allow the image operands in a γ-binary operation to have different domains, i.e., a ∈ F^X and b ∈ F^Y; hence, the binary operation is re-defined as a γ b ≡ {(z, a(z) γ b(z)) : z ∈ X ∩ Y}. In this sense, we only require the images to share the same value set to be compatible for the operation. This compatibility property is extended to streams α ≡ ⟨…, a⟩ and β ≡ ⟨…, b⟩, provided the current images in the streams are compatible. As an example, the addition of two (compatible) streams α ≡ ⟨…, a⟩ and β ≡ ⟨…, b⟩ is α + β ≡ ⟨…, a + b⟩.

Time Windows

In standard data stream processing, it is common to provide mechanisms for specifying analysis windows across time. In our approach, the temporal selection of images in a stream is always specified in an object-based window manner, i.e., in terms of a number of past images in the stream relative to its current image. This is accomplished by means of the delay operator, which is introduced below. We note that other elements for a comprehensive time model in the context of change analysis on multiple-source image streams (e.g., image expiration, time granularity, out-of-order streams), although important, are not discussed in this paper.

Spatially-aware Delay Operation

A fundamental operation for temporal analysis on streams is the delay operation. Consider a stream α ≡ ⟨a_1, …, a_t⟩, t ≥ 1, where a_t denotes the current image in α in a given processing system. A standard, signal-processing-oriented definition of a delayed stream can be given by ∆(α) ≡ ⟨∅, a_1, …, a_{t−1}⟩, i.e., the current image in ∆(α) is the image that preceded a_t in α. (Note that the predecessor of the first image in a stream is defined as the empty image.)

Although intuitive, this definition is not in general appropriate for image streams in our setting because, as indicated above, coverage of regions of interest may progress in a heterogeneous manner. Consider the temporal image differencing operation α − ∆(α) = ⟨…, a_t − a_{t−1}⟩, which is an important representative technique in change detection. If α is heterogeneous, then this operation is of no use, because a_t and a_{t−1} may always have different spatial domains, so the result would be a stream of empty images. An unambiguous differencing operation can be given by α|^∅ − ∆(α|^∅), that is, by first homogenizing the operated stream. However, even if we only consider homogeneous streams (i.e., via stream extension when necessary), the delay operator defined above may still not be appropriate, especially when the spatial domains of the original images are significantly smaller than the region of interest under consideration.

A better definition of the delay operation can be expressed in pseudocode as shown in Algorithm 1. The operator keeps an internal image, m, which buffers the contents of the extension of the images received so far. When a new image is received, the corresponding subset of the buffer is output and the buffer is updated with the value of the input image over that spatial subset. This makes the delay operator behave as a spatially-aware, 1-capacity queue.

Algorithm 1 Obtain stream β = ∆(α)
  m ← ∅                                   ▷ initialize buffer
  for each input image a do
    b ← m|_{domain(a) ∩ domain(m)}         ▷ get result from buffer
    m ← a|^m                               ▷ update buffer with input
    τ(b) ← τ(a)                            ▷ set same timestamp
                                           ▷ other metadata assignments omitted
    output b

Given the ∆(α) definition in Algorithm 1, we define the temporal image differencing operation as tid(α) ≡ α − ∆(α). This is illustrated in Fig. 5 for a heterogeneous stream α. Note that empty images (not shown) are produced by ∆(α) at τ_1, and by tid(α) at τ_1 and τ_3.


Figure 5: Spatially-aware temporal image differencing, tid(α).

For notational convenience, we define ∆^1(α) ≡ ∆(α) and, inductively, ∆^d(α) ≡ ∆^{d−1}(∆(α)) for d ≥ 2.
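The following sketch (ours; it assumes rectangular fragments aligned to a fixed field of view and uses NaN for missing pixels) mirrors Algorithm 1: a buffer holds the extension of the images seen so far, each new fragment is answered with the previously buffered contents over its own domain, and differencing against the delayed stream then compares co-located pixels even under heterogeneous coverage.

```python
import numpy as np

class SpatialDelay:
    """Spatially-aware delay: for each incoming fragment, return the buffered pixels
    over that fragment's domain, then update the buffer with the fragment (Algorithm 1)."""

    def __init__(self, height: int, width: int):
        self.buffer = np.full((height, width), np.nan)   # m <- empty image

    def push(self, fragment: np.ndarray, row0: int, col0: int) -> np.ndarray:
        h, w = fragment.shape
        window = (slice(row0, row0 + h), slice(col0, col0 + w))
        delayed = self.buffer[window].copy()             # b <- buffer restricted to domain(a)
        self.buffer[window] = fragment                   # m <- a extended over m
        return delayed

def tid(fragment: np.ndarray, delayed: np.ndarray) -> np.ndarray:
    """Temporal image differencing tid(alpha) = alpha - Delta(alpha); NaN where no predecessor."""
    return fragment - delayed

# Example: two successive scans of the same two rows produce a usable difference
delay = SpatialDelay(height=2, width=3)
first = np.ones((2, 3))
d0 = tid(first, delay.push(first, 0, 0))     # all NaN: no previous coverage yet
second = 2 * np.ones((2, 3))
d1 = tid(second, delay.push(second, 0, 0))   # all 1.0: change between the two scans
```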

3. CHANGE DETECTION

Given an image stream, a high-level notion of temporal change detection can be expressed as the process of identifying variations of significance by analyzing the images in the stream. A precise definition of significance is highly application dependent [20]. We model the output from a change detection process over a stream α ≡ ⟨…, a⟩ with a stream ς(α) ≡ ⟨…, c⟩, where c ∈ {0, 1}^{domain(a)} is a change mask defined as:

  c(x) ≡ 1 if a significant change has occurred at a(x), and 0 otherwise.

In this definition, the period of analysis is left open: the change mask c can be based on a so-called bi-temporal analysis, i.e., between the current images in α and ∆(α), or, in general, on any subset of the current images in the delayed streams ∆^d(α), d ≥ 1. In what follows, we present some predominant techniques with a proper definition of the corresponding change mask.

Common Change Detection Techniques

One of the simplest techniques operates on classified images a ∈ (Z_k)^X, where each pixel value corresponds to a particular class from a predetermined set of k classes. The analysis is performed pixel by pixel to identify areas of change. Using our spatially-aware delay operator, the corresponding change mask, in the bi-temporal case for example, is simply ς(α) = (α ≠ ∆(α)).

The problem of change detection becomes much more challenging when dealing directly with the original (unclassified) images. The analysis involves the detection of change itself and often also the specific type of change. Several methods act directly on the raw numerical pixel measurements. Others involve transformations (spatial or spectral) that may result in quantities having a closer physical meaning (e.g., a vegetation index), while others involve the analysis of entities beyond the mere pixel, across time and space.

Image differencing, ratioing, and thresholding. Image differencing and ratioing are simple, yet widely used techniques for change detection in remote sensing imagery [22]. They are defined for two R-valued streams α and β as α − β and α/β, respectively. Given a threshold h, the resulting stream can be thresholded to determine the change mask, for example, as ς = χ_{≥h}(|α − β|). The meaning of change significance is determined by the given threshold value. A bi-temporal version of this technique on a single stream is ς = χ_{≥h}(|α − ∆(α)|).
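A small sketch (ours; plain NumPy arrays stand in for the current images of two compatible streams, and the ratio criterion shown, departure from 1, is one common choice rather than a definition from the paper) of differencing, ratioing, and thresholding into a change mask:

```python
import numpy as np

def differencing_mask(a: np.ndarray, b: np.ndarray, h: float) -> np.ndarray:
    """Change mask chi_{>=h}(|a - b|) for two co-registered images."""
    return (np.abs(a - b) >= h).astype(np.uint8)

def ratioing_mask(a: np.ndarray, b: np.ndarray, h: float) -> np.ndarray:
    """Ratio-based variant: flag pixels whose ratio a/b departs from 1 by at least h."""
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = a / b
    return (np.abs(ratio - 1.0) >= h).astype(np.uint8)

# Bi-temporal use on a single stream: 'a' is the current image, 'b' the spatially-aware delayed one
a = np.random.rand(8, 8)
b = np.random.rand(8, 8)
mask = differencing_mask(a, b, h=0.5)
```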

Index-based analysis. Instead of using original raw band values, change detection methods are also often applied to derived quantities. A common choice in vegetation-related studies is the normalized difference vegetation index, ndvi = (nir − vis)/(nir + vis), where nir and vis are the reflectance images at the near-infrared and visible wavelengths of the electromagnetic spectrum. Figure 6 illustrates the application of the image differencing and thresholding methods on ndvi images. The (absolute) difference d = |ndvi_jan − ndvi_jul| in part (a) corresponds to different months over North America in a certain year; brighter pixels are areas of greater change. The result in part (b), given by the image d|_{≥h}, identifies the areas of most change according to a given threshold.

Figure 6: North America ndvi difference images.

Image regression. This is also a bi-temporal method, in which pixel values in the current image a of a stream α are estimated by linear regression against the current image b in ∆(α) [10]. Let c_0 and c_1 be the resulting regression coefficients (e.g., obtained by a least squares technique). Then, the estimated (current) image in α̂ ≡ ⟨…, â⟩ is â = c_0 + c_1 · b, and the residual stream α − α̂ is used as a measure of change. Thresholding is applied to obtain the corresponding change mask. Figure 7 shows a change detection processing pipeline based on this technique. IR represents the binary operation that performs the linear regression between the current images in the input streams, generating the stream α̂ whose current image is the corresponding estimation.


Figure 7: Image regression change detection.
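A sketch (ours) of the regression-based pipeline of Fig. 7: fit a linear relation between the delayed and current images by least squares over valid pixels, form the residual, and threshold its absolute value; the per-scene least-squares fit shown here is one standard choice and not necessarily the exact estimation method used by the authors.

```python
import numpy as np

def regression_change_mask(current: np.ndarray, previous: np.ndarray, h: float) -> np.ndarray:
    """Estimate current ~ c0 + c1 * previous by least squares over valid pixels,
    then threshold the absolute residual |current - estimate| at h."""
    x = previous.ravel()
    y = current.ravel()
    valid = np.isfinite(x) & np.isfinite(y)
    c1, c0 = np.polyfit(x[valid], y[valid], deg=1)   # slope, intercept
    estimate = c0 + c1 * previous
    return (np.abs(current - estimate) >= h).astype(np.uint8)

# Example: the 'previous' image comes from the spatially-aware delay of the same stream
prev = np.random.rand(16, 16)
curr = 0.9 * prev + 0.05 + 0.01 * np.random.randn(16, 16)
mask = regression_change_mask(curr, prev, h=0.1)
```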

4. IMAGE STREAM PROCESSING ARCHITECTURE

In this section, we present a concrete realization of our change detection modeling framework. First, we describe the main components and the execution model of the overall architecture in an implementation-independent manner. Then, we use Kepler, a widely used scientific workflow problem-solving environment, as a means to support the realization of the architecture.

4.1 Architecture

Analysis of remotely sensed data streams can be modeled as being carried out through image pipelines. While data is continuously being transmitted between the outputs and the inputs of the participating processing elements, several additional components can be plugged in to facilitate monitoring, visualization, and other complementary tasks, including the possibility that external clients connect to the processing system to query for data products in the form of image streams. The architecture of our geospatial image stream management system, GISMS, is depicted in Fig. 8. Source components make external streams available to the processing system. Operator components perform concrete operations on input streams, generating an output stream. Sink components take input streams and perform final tasks such as event logging, reporting, and visualization.

Figure 8: Our GISMS architecture.

We have developed a prototype of our processing framework using the Kepler scientific workflow management system [15]. Kepler is a multi-institutional, open project aimed at offering scientists in diverse disciplines a system to design, execute, and deploy scientific workflows using Web- and Grid-based technologies. A scientific workflow is a network of actors whose mutual interaction is governed by a particular model of computation, as determined by a corresponding director.


This computational infrastructure is provided by the underlying Ptolemy II system [9], a robust and mature programming framework for modeling and simulating concurrent, real-time, embedded systems.

4.2 Execution Model

As shown in Fig. 8, our system is a network of data processing components whose execution is triggered as incoming data tokens arrive at the respective input ports. In general, this structure also applies to standard DSMS systems (e.g., [1], [5], [2]). Scheduling, query plan construction, and component interaction are handled by mechanisms that, however, are not in general exposed to the user by these systems. By using Kepler, we expressly address a requirement for more flexibility in the selection of particular executors that users may deem more appropriate for particular workflows.

Among the several models of computation provided by Ptolemy II, typical choices for signal and image processing are the Synchronous Dataflow (SDF) and the Process Network (PN) [14] models. In PN, components execute concurrently and communicate with each other under a blocking-read, non-blocking-write scheme. This model of computation naturally allows for parallel, pipelined execution of actors. SDF is a restricted form of PN in which predetermined token consumption/production settings allow the SDF director to run the workflow in a single thread, with pre-scheduled control over buffer sizes, and free of deadlocks.

When dealing with image streams from multiple sources, data for certain multi-input operators may arrive at different rates. If one input is long delayed (because of sensor scanning characteristics, network latencies, or other causes) while other inputs are actively providing data, then the operator should not be blocked indefinitely. A clear semantics for the execution behavior in such cases is required.

Upon each activation, a stream operator in our system is assumed to perform an atomic computation consisting of reading one image from each input port and writing one resulting image to its output port. Provided all inputs arrive at the same rate, no direct synchronization is required on operators. Synchronization is only required on the stream sources. By acting on the sources, the input synchronizer in our system makes sure that the downstream multi-input operators do not block under the above circumstances. Suppose only one stream source s_i, out of n > 1 sources, has just produced an actual image. Downstream operators whose input streams originate at s_i may involve processing of images originating from some sources s_j, j ≠ i. In this case, the input synchronizer will generate synthetic tokens from s_j. A parameter allows the user to indicate what token should be generated: previous indicates that the previous actual image be re-generated from s_j, and null indicates that the null image ∅ be generated. This mechanism is valid in the SDF domain and hence in the PN one as well. While input synchronization is mandatory to meet the basic requirements of the SDF domain, this mechanism explicitly avoids blocked operations in the PN domain.

For each disconnected subgraph in the workflow, the input synchronizer keeps a synchronization group containing the stream sources for the subgraph. Before generating any synthetic tokens, the input synchronizer will allow a user-specified amount of time for sources in a group to generate actual inputs.
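A simplified sketch (ours; the class, method, and policy names are illustrative and are not Kepler or Ptolemy II API) of the input synchronizer's token policy: after the user-specified wait time, sources in a synchronization group that did not produce an actual image receive a synthetic token according to the "previous" or "null" setting.

```python
from typing import Dict, Optional
import numpy as np

class InputSynchronizer:
    """Generates synthetic tokens for lagging sources so multi-input operators are not blocked."""

    def __init__(self, sources, policy: str = "previous"):
        assert policy in ("previous", "null")
        self.policy = policy
        self.last_actual: Dict[str, Optional[np.ndarray]] = {s: None for s in sources}

    def synchronize(self, arrived: Dict[str, np.ndarray]) -> Dict[str, Optional[np.ndarray]]:
        """Given the images that actually arrived in this round (after the allowed wait time),
        return one token per source in the group, filling gaps per the configured policy."""
        out: Dict[str, Optional[np.ndarray]] = {}
        for source, last in self.last_actual.items():
            if source in arrived:
                out[source] = arrived[source]
                self.last_actual[source] = arrived[source]
            elif self.policy == "previous":
                out[source] = last        # re-emit the previous actual image (None until one exists)
            else:
                out[source] = None        # the null/empty image
        return out

# Usage: only 'mwi' produced data this round; 'lwi' gets a synthetic token
sync = InputSynchronizer(sources=["mwi", "lwi"], policy="previous")
tokens = sync.synchronize({"mwi": np.zeros((1, 6))})
```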

5. APPLICATION EXAMPLES

We demonstrate our framework with relevant query processing scenarios in land cover and wildfire detection. Rather than discussing the particular advantages of the techniques, our goal in this paper is to demonstrate how our framework can be used to express and support such detection tasks in a streaming fashion.

We use real-time data streams from NOAA's Geostationary Operational Environmental Satellite GOES-West. At a rate of 2.1 Mbps, the sensor system in this satellite scans the Earth in a row-by-row pattern, so each stream is basically a sequence of 1-row images. In particular, our examples read data from the Imager instrument, which senses radiant and solar-reflected energy, generating data for five channels with different spatial resolutions and spectral characteristics. For instance, visible channel data is generated at 1×1 km spatial resolution and at a rate of about 190 KB/sec, taking about 26 minutes to scan the 22,600 × 10,900-pixel scene for the full field of view of the satellite. For our examples, we use the following image streams: nir, near infrared; vis, visible; mwi, midwave infrared; and lwi, longwave infrared. For convenience, we omit the details regarding the resampling operations required to put these streams at the same spatial resolution.

5.1 Land Cover Change Detection

In this example, we consider the normalized difference vegetation index, ndvi (Sect. 3), and assume that a stream α of ndvi images is already being generated in our example system. An approach based on temporal prediction analysis for land cover change detection on ndvi measurements is similar to the bi-temporal linear regression discussed in Sect. 3. In this case, multiple past images are analyzed by an operator TP_p to predict the current image in α based on the previous p images in α, i.e., α̂ = TP_p(α). A change detection query in terms of this technique, along with the usual image differencing method above, can be expressed as Q3: χ_{≥h}(|α − TP_p(α)|), and directly executed as shown in Fig. 9. The temporal sliding window of size p required by the composite operator TP_p is realized by cascading p delay operators. The internal P component performs the actual prediction, implementing some standard method (see, e.g., [4]). Thus, using as parameters the order of prediction p and the threshold h, the resulting ς(α) = χ_{≥h}(|α − TP_p(α)|) stream provides the change mask.

Figure 9: Image prediction change detection.
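A sketch (ours) of the Q3 pipeline: a sliding window over the p most recent images, conceptually realized by cascading p delay operators, feeds a per-pixel predictor whose absolute residual is thresholded; the simple averaging predictor used here is only a placeholder for the standard prediction methods the paper refers to.

```python
from collections import deque
import numpy as np

class TemporalPrediction:
    """TP_p over a homogeneous stream: predict the current image from the previous p images."""

    def __init__(self, p: int):
        self.window = deque(maxlen=p)   # plays the role of the cascaded delay operators

    def push(self, image: np.ndarray, h: float) -> np.ndarray:
        """Return the change mask chi_{>=h}(|image - prediction|); all zeros until
        p past images are available."""
        if len(self.window) < self.window.maxlen:
            mask = np.zeros_like(image, dtype=np.uint8)
        else:
            prediction = np.mean(np.stack(list(self.window)), axis=0)  # placeholder predictor
            mask = (np.abs(image - prediction) >= h).astype(np.uint8)
        self.window.append(image)
        return mask

# Example over a stream of ndvi images
tp = TemporalPrediction(p=3)
for _ in range(5):
    mask = tp.push(np.random.rand(8, 8), h=0.3)
```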

5.2 Wildfire Detection

In general, the goal of wildfire detection is to identify pixels with active fires in satellite imagery. Algorithms for detection of wildland fires usually exploit the sensitivity to subpixel heat of certain spectral bands, in particular, the midwave (mwi) and longwave (lwi) infrared bands [16].


As in the previous land cover scenario, several spectral indices have been investigated. Here, we use the normalized thermal index, calculated as nti = (mwi − lwi)/(mwi + lwi). Consider the change detection query Q1: (χ_{≥h}(nti))|_R, i.e., the thresholding of the normalized thermal index over a region of interest R specified in some suitable form (e.g., the corners of a latitude-longitude bounding box). This is a simple, yet representative query formulated in terms of a well-known data product name. Each image in the resulting stream is a classification of pixels into one of the classes fire or non-fire, according to the user-specified threshold h.

Q1 can be initially translated to (χ_{≥h}((mwi − lwi)/(mwi + lwi)))|_R. Since this is a pointwise calculation, equivalent rewritings of the query are χ_{≥h}(((mwi − lwi)|_R)/((mwi + lwi)|_R)) and χ_{≥h}((mwi|_R − lwi|_R)/(mwi|_R + lwi|_R)). The latter, in particular, corresponds to an optimal-cost execution plan, as the subsetting operations are performed on the input streams, so the index and thresholding computations are applied only to the pixels in the region of interest.

Given that the resulting Q1 stream may be heterogeneous if any of the input streams is heterogeneous, an alternative query, more suitable for visualization for example, can be written as Q2: ((nti|_{≥h})|^∅)|_R, i.e., the spatial restriction to R of the extension of the (non-domain-preserving) thresholding of the nti index. By a similar cost analysis, the plan {((mwi|_R − lwi|_R)/(mwi|_R + lwi|_R))|_{≥h}}|^∅, whose pipeline is illustrated in Fig. 10, would be optimal for query Q2. The SR actors perform the corresponding stream subsetting operations, and the last two actors perform the thresholding and the final stream extension for direct visualization of the region of interest R.

Figure 10: Execution of query Q2 for fire detection.

A screenshot of the actual Kepler workflow is shown in Fig. 11. The SR actors have been parameterized with part of the Northern hemisphere in the field of view of the GOES-West satellite. An especially designed viewer sink actor applies the extension of its input streams and displays the resulting combined image in real-time. In this case, the extension of the THR stream is overlaid on the lwi stream. (A low threshold value has been set for the sake of illustration.)
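A sketch (ours; 1-row fragments already subset to a rectangular region of interest, and NaN as the empty value, are assumed) of one activation of the optimized Q2 plan: compute the nti index on the subset inputs, apply the non-domain-preserving threshold restriction, and accumulate the surviving pixels in an extension buffer for display.

```python
import numpy as np

def nti(mwi: np.ndarray, lwi: np.ndarray) -> np.ndarray:
    """Normalized thermal index (mwi - lwi) / (mwi + lwi)."""
    with np.errstate(divide="ignore", invalid="ignore"):
        return (mwi - lwi) / (mwi + lwi)

def q2_step(mwi_row: np.ndarray, lwi_row: np.ndarray, row: int,
            buffer: np.ndarray, h: float) -> np.ndarray:
    """One activation of the Q2 plan for a 1-row fragment already subset to R:
    compute nti, keep only pixels >= h (the non-domain-preserving restriction),
    and patch them into the extension buffer that accumulates the state of R."""
    index = nti(mwi_row, lwi_row)
    fire = np.where(index >= h, index, np.nan)     # nti|_{>=h}: pixels below h leave the domain
    keep = ~np.isnan(fire)
    buffer[row, keep[0]] = fire[0, keep[0]]        # extension: update only the covered pixels
    return buffer

# Region of interest R as a 4 x 6 buffer, fed by successive 1-row scans
R = np.full((4, 6), np.nan)
for row in range(4):
    R = q2_step(np.random.rand(1, 6) * 300, np.random.rand(1, 6) * 300, row, R, h=0.1)
```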

Figure 11: Normalized Thermal Index calculation and visualization in Kepler.

In summary, the above examples illustrate some of the typical design elements provided by our framework that a scientist can use to create image processing pipelines for change detection. We have used the change mask concept as the main output of such detection tasks. By using a scientific workflow platform, we enrich our framework with important interaction capabilities, for example, the ability to steer the exploration process by varying the various parameters in the workflow and the possibility of easily plugging in monitoring and visualization tools at various points in the pipeline.

6. RELATED WORK

Models for spatio-temporal data and queries have been the subject of active research especially in the database community [3, 17, 24]. However, these approaches do not consider the streaming aspect addressed in this paper. On the other hand, as already pointed out, substantial work has recently been done in the context of data stream management systems (see, e.g., [1, 2, 5, 6]), as well as change detection in data streams ([11, 13, 26]). However, much of this work, even in the case of modeling underlying multi-dimensional data, is concerned with streams of tuples. Image pipelines for change detection can be realized by using a variety of available technologies, including, among many others, [12, 19, 25, 18]. However, these systems provide image processing functions that are traditionally more oriented to whole-scene analysis (perhaps internally decomposed into streaming pipelines), but not particularly designed for processing multi-source, heterogeneous streams of image fragments. Elements in [23] provide some support for this case, but the temporal aspect and the execution model required for multi-input operators were not addressed.

This work has been done in the context of the GeoStreams project [8], a framework to process multiple continuous queries against streaming remotely-sensed geospatial image data.

7. CONCLUSIONS AND FUTURE WORK

In this paper, we have built a conceptual spatio-temporal framework for expressing and realizing image processing workflows focused on change analysis in remotely-sensed, streaming geospatial data. Based on a parameterized execution model that users can customize according to their requirements, we have precisely defined a fundamental set of stream operators, some of which are particularly oriented toward temporal analysis, as building blocks for realizing change detection and visualization tasks. With a realization based on the Kepler scientific workflow system, we facilitate the integration of not only new stream operators, but also complete processing pipelines into our framework. This is enriched with other key scientist-friendly functionalities such as visual workflow design, composition, and execution. To our knowledge, no similar research has been done toward a systematic modeling and construction of tools for geospatial image stream processing, and change detection in particular.

Ongoing work includes the construction of Web Service based mechanisms for the composition of workflows in distributed environments. Our immediate plans also include the implementation and evaluation of an ample, representative set of change detection techniques, as well as the incorporation of especially tailored multi-query processing techniques to optimize the utilization and reuse of streams in collaborative scientific environments.

Acknowledgment. This work is in part supported by the National Science Foundation under Awards No. IIS-0326517 and ATM-0619139.

8. REFERENCES

[1] D. J. Abadi, D. Carney, U. Cetintemel, M. Cherniack, C. Convey, S. Lee, M. Stonebraker, N. Tatbul, and S. Zdonik. Aurora: a new model and architecture for data stream management. The VLDB Journal, 12(2):120–139, 2003.
[2] B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom. Models and issues in data stream systems. In PODS'02, pages 1–16. ACM Press, 2002.
[3] P. Baumann. A database array algebra for spatio-temporal data and beyond. In NGITS'99, LNCS 1649, pages 76–93. Springer, 1999.
[4] M. J. Carlotto. Detection and analysis of change in remotely sensed imagery with application to wide area surveillance. IEEE Transactions on Image Processing, 6(1):189–202, 1997.
[5] S. Chandrasekaran, O. Cooper, A. Deshpande, M. J. Franklin, J. M. Hellerstein, W. Hong, S. Krishnamurthy, S. Madden, V. Raman, F. Reiss, and M. A. Shah. TelegraphCQ: Continuous dataflow processing for an uncertain world. In CIDR, 2003.
[6] N. Chaudhry, K. Shaw, and M. Abdelguerfi. Stream Data Management. Springer, April 2005.
[7] M. Gertz, Q. Hart, C. Rueda, S. Singhal, and J. Zhang. A data and query model for streaming geospatial image data. In EDBT'06 Workshops, LNCS 4254, pages 687–699, 2006.

[8] Q. Hart and M. Gertz. Querying streaming geospatial image data: The GeoStreams Project. In SSDBM'05, pages 147–150, 2005.
[9] C. Hylands, E. Lee, J. Liu, X. Liu, S. Neuendorffer, Y. Xiong, Y. Zhao, and H. Zheng. Overview of the Ptolemy Project. Technical Report UCB/ERL M03/25, University of California, Berkeley, July 2003.
[10] J. R. Jensen. Introductory Digital Image Processing: A Remote Sensing Perspective. Prentice Hall, 2005.
[11] D. Kifer, S. Ben-David, and J. Gehrke. Detecting change in data streams. In VLDB, pages 180–191, 2004.
[12] K. Konstantinides and J. R. Rasure. The Khoros software development environment for image and signal processing. IEEE Transactions on Image Processing, 3(3):243–252, 1994.
[13] B. Krishnamurthy, S. Sen, Y. Zhang, and Y. Chen. Sketch-based change detection: methods, evaluation, and applications. In SIGCOMM'03, pages 234–247, 2003.
[14] E. A. Lee and T. M. Parks. Dataflow process networks. Proceedings of the IEEE, 83(5):773–801, 1995.
[15] B. Ludäscher, I. Altintas, C. Berkley, D. Higgins, E. Jaeger-Frank, M. Jones, E. Lee, J. Tao, and Y. Zhao. Scientific workflow management and the Kepler system. Concurrency and Computation: Practice & Experience, Special Issue on Scientific Workflows, 2007.
[16] R. S. Lunetta and C. D. Elvidge. Remote Sensing Change Detection: Environmental Monitoring Methods and Applications. Ann Arbor Press, 1998.
[17] A. P. Marathe and K. Salem. Query processing techniques for arrays. In SIGMOD'99, pages 323–334. ACM Press, 1999.
[18] D. Murray, J. McWhirter, S. Wier, and S. Emmerson. The Integrated Data Viewer: a web-enabled application for scientific analysis and visualization. In 19th Conference on Interactive Information Processing Systems, AMS, 2003.
[19] S. G. Parker, M. Miller, C. D. Hansen, and C. R. Johnson. An integrated problem solving environment: The SCIRun computational steering system. In HICSS'98, Vol. 7, 1998.
[20] R. Radke, S. Andra, O. Al-Kofahi, and B. Roysam. Image change detection algorithms: A systematic survey. IEEE Transactions on Image Processing, 14(3):294–307, 2005.
[21] G. X. Ritter and J. N. Wilson. Handbook of Computer Vision Algorithms in Image Algebra. CRC Press, 2001.
[22] P. L. Rosin. Thresholding for change detection. Computer Vision and Image Understanding, 86(2):79–95, 2002.
[23] C. Rueda, M. Gertz, B. Ludäscher, and B. Hamann. An extensible infrastructure for processing distributed geospatial data streams. In SSDBM'06, pages 285–290, 2006.
[24] S. Shekhar and S. Chawla. Spatial Databases: A Tour. Prentice Hall, 2002.
[25] J. Yeh. Image and video processing libraries in Ptolemy II. Technical Report UCB/ERL M03/52, EECS Department, University of California, 2003.
[26] Y. Zhu and D. Shasha. Efficient elastic burst detection in data streams. In SIGKDD'03, pages 336–345. ACM, 2003.
