A column-oriented data stream engine

E. Liarou, R. Goncalves, M. Kersten
CWI Amsterdam, The Netherlands
{erietta,goncalve,mk}@cwi.nl
ABSTRACT

This paper introduces the DataCell, a data stream management system designed as a seamless integration of continuous queries based on bulk event processing in an SQL software stack. The continuous stream queries are based on predicate windows, called "basket" expressions, which support arbitrarily complex SQL subqueries including, but not limited to, temporal and sequence constraints. The DataCell is designed for bulk event processing to capitalize on proven relational column-store database technology and achieve efficient resource utilization. It is implemented on top of an open-source DBMS. In this work we describe in detail the DataCell architecture, its data model, and its processing scheme. An analysis of the core algorithms provides an outlook on its performance in the envisioned application domains.
1. INTRODUCTION

Data Stream Management Systems (DSMS) have become an active research area in the database community. Inspiration comes from potentially large application areas, e.g., network monitoring, sensor networks, telecommunication, financial and web applications. Many DSMSs have been designed from scratch.

The performance requirements of (financial) stream applications have driven most DSMS architects to venture for a dedicated (bounded) main-memory solution. The application semantics are defined in a simplified SQL language framework from which a dedicated (Java) program is derived. A small runtime library complements the setup, and most attention can be given to application-specific input/output adapters.

The drawback became clear as the applications became more complex and demanding. Ad-hoc solutions to combine streams with proprietary persistent main-memory tables are insufficient. Commercial systems, such as Streambase (http://www.streambase.com/) and Coral8 (http://www.coral8.com/), stress the need for a better (JDBC) integration with a full-fledged relational DBMS.

In this work we describe a data stream management system, called the DataCell, starting at the other end of the spectrum. We claim that a data stream management system for the emerging application domain of ambient intelligence can be built on top of an extensible database engine. We present our vision in the context of the widely used open-source column-oriented DBMS MonetDB. The column orientation, together with its clearly defined software stack, provides an excellent basis for the exploration of novel database techniques. It should be understood, however, that the approach presented here is applicable in any other modern (SQL-based) system as well.

The main contributions and topics addressed in this paper are the following:

• Predicate windows. The DataCell generalizes the sliding-window approach predominant in DSMS to allow for arbitrary table expressions over multiple streams and persistent tables interchangeably. It enables applications to selectively process the stream and prioritize event processing based on application semantics.

• Bulk event processing. The DataCell processing engine is geared at bulk processing of events to amortize the overhead incurred by process management and function calls. This favors a skewed arrival distribution, where a peak load can be handled easily, and possibly within the same time frame, as an individual event. It capitalizes on the performance offered by column-store database systems.

• SQL compliance. Stream applications require the expressiveness of SQL'03. We do not resort to a redefinition of the window concept. Instead, we propose an orthogonal extension to SQL'03. Moreover, the complete state of the system can at any time be inspected using SQL queries.

The remainder of the paper is organized as follows. In Section 2 we present in detail the DataCell model, followed by a short introduction of the architecture at large in Section 3. Section 4 explores the scope of the solution by modeling stream-based application concepts borrowed from dedicated stream database systems. Section 5 provides preliminary experimental results. Finally, Sections 6 and 7 conclude with related work, a summary, and an outline of future work.
2. THE DATACELL

The DataCell is easily understood starting with a small example taken from an application domain. Assume an ambient (home) setting, where multiple sensors are embedded in the environment and generate an event when their environment status has changed. The sensors could be as simple as light switches, which emit a message when their state changes; microphones, which emit the noise level; temperature and humidity sensors; and smoke detectors, which are supposed to emit a signal when a disaster is about to unroll. The sensors may work independently, or be part of a local (wireless) network.

The stream characteristics in an ambient setting are much more diverse than commonly assumed in DSMS, i.e., many streams have a low-frequency event rate and a few have a highly skewed rate. Furthermore, the actions triggered are generally more complex than network-based aggregation. Complex views filter events of interest using long-term archival data as a frame of reference. They also rely on (external) services provided by specialized ambient algorithms. The information relevant for peers (sensors), archiving, or input to the ambient home algorithms is collected into a central hub, the DataCell: a kind of network-attached storage device with DSMS functionality.

The relational database paradigm is adopted to satisfy our needs. Its declarative programming scheme, combined with a dataflow-driven query scheduler, provides a generic solution. Extensions can be kept to a minimum. If warranted, a remote service can be contacted to handle a new request.

In the rest of this section we describe the DataCell model in more detail. The DataCell consists of the following components: receptors and emitters, baskets, basket expressions, and continuous queries. All components are encapsulated within the SQL'03 language framework [11]. The novelties are the basket and the basket expression, which capture and generalize the essence of streaming applications. Let us shortly discuss each component.
2.1 Receptor and Emitter

The periphery of a stream engine is formed by adapters, software components to interact with devices, RSS feeds and web services. The protocols range from simple messages to complex XML documents transported using either UDP or TCP/IP. The adapters for the DataCell consist of receptors and emitters. A receptor is a separate process thread that picks up incoming events from a communication channel and forwards them to the kernel for processing. Likewise, an emitter is a separate thread that picks up events prepared by the DataCell kernel and delivers them to interested clients, i.e., those who have subscribed to the query result. The interchange format is purposely kept simple. It aligns with the system-wide protocol to interact with external tools using a pre-XML textual interface for the exchange of flat relational tuples.
2.2 Baskets

Let us now introduce the basket data structure. Its role is to hold a portion of a stream, represented as a temporary main-memory table. Unlike in other stream applications, there is no a priori order; the basket is simply a (multi-)set of event records received from an adapter. The commonalities between baskets and relational tables are much too important to warrant a redesign from scratch. Therefore, their syntax and semantics are aligned with the table definition in SQL'03 as much as possible. A prime difference is the retention period of their content and the transaction semantics. Tuples are removed from the basket when "consumed" by a continuous query. They initiate the flow in the stream engine.

The stream basket definition below models an ordered sequence of events. The id takes its value from a sequence generator upon insertion, a standard feature in most relational systems nowadays. It can model event arrival order. The default expression for the tag ensures that the event is also timestamped upon arrival. The payload value is received from an external source.

create basket X(
    tag timestamp default now(),
    id serial,
    payload integer
);
Important differences between a basket and a relational table are summarized as follows:

• Basket Integrity. The integrity enforcement for a basket differs from that for a relational table. Events that violate the constraints are silently dropped. They are not distinguishable from those that never arrived in the first place. The integrity constraint acts as a silent filter.

• Basket ACID. Baskets are like temporary global tables; their content does not survive a crash or session boundary. However, concurrent access to their content is regulated using a locking scheme or by the scheduler.

• Basket Control. Unlike other DSMS, the DataCell provides control over the streams through the baskets. A stream becomes blocked when the basket where its events should be delivered is disabled. The state can be changed to enabled once the flow is needed again. Selectively enabling and disabling baskets can be used to debug a complex stream application.

With baskets as the central concept we purposely step away from the de-facto approach to process events in arrival order only. We consider arrival order a semantic issue, which may be easy to implement on streams directly, but which also raises problems with out-of-sequence arrivals [1], complicates the regulation of concurrent writes on the same stream, and unnecessarily complicates applications that do not care about arrival order.
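The basket semantics above (silent integrity filtering, consume-on-read, enable/disable control) can be sketched in a few lines of Python; the class and field names are illustrative, not part of the DataCell API:

```python
import itertools
import time

class Basket:
    """Illustrative sketch of basket semantics (a hypothetical class,
    not the actual DataCell/MonetDB implementation)."""
    _ids = itertools.count(1)

    def __init__(self, constraint=lambda rec: True):
        self.constraint = constraint   # integrity constraint: a silent filter
        self.enabled = True            # a disabled basket blocks its stream
        self.records = []              # a (multi-)set of event records

    def insert(self, payload):
        if not self.enabled:           # blocked stream: event is not delivered
            return False
        rec = {"tag": time.time(),           # timestamped upon arrival
               "id": next(Basket._ids),      # sequence number: arrival order
               "payload": payload}
        if not self.constraint(rec):   # violating events are silently dropped
            return False
        self.records.append(rec)
        return True

    def consume(self):
        """A continuous query consumes the content: tuples are removed."""
        batch, self.records = self.records, []
        return batch

b = Basket(constraint=lambda r: r["payload"] >= 0)
b.insert(5)
dropped = b.insert(-1)    # violates the constraint: silently filtered out
batch = b.consume()       # the query takes the remaining tuple; basket empties
```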
2.3 Basket Expressions

Basket expressions are the novel building blocks of DataCell queries. They encompass the traditional select-from-where-groupby SQL language framework. A basket expression is syntactically a sub-query surrounded by square brackets. However, the semantics are quite different. Basket expressions have side effects; they change the underlying tables during query evaluation. All tuples referenced in the (ordered) sub-query that contribute to the result set are also removed from their underlying store automatically. This leaves partially emptied baskets or tables behind. Note that a basket can also be inspected outside a basket expression. Then it behaves as any temporary table, i.e., tuples are not removed as a side effect of the evaluation.

The basket expression in the query below takes precedence and extracts all tuples available in X. All tuples selected are immediately removed from X, but they remain accessible through B during the remainder of the query execution. From this temporary table B we select the payloads satisfying the predicate.

select count(*)
from [select * from X] as B
where B.payload > 100;
The basket expressions initiate tuple transport in the context of the query. The net effect is a stream within the query engine. X is either a basket or a table. In both cases tuples are removed. However, deletion from tables is much more expensive, because it involves a sub-transaction commit with logging overhead. Baskets avoid this overhead; no transaction log is maintained.

Most DSMSs perform query processing over streams seen as a linear ordered list. This naturally leads to sequence operators, such as next, follows, and window expressions. The latter overloads the semantics of the SQL window construct to designate a portion of interest around each tuple in the stream. The window operator is applied to the result of a query and, combined with the iterator semantics of SQL, mimics a kind of basket expression. However, re-using SQL'03 window semantics introduces several problems. For example, windows are limited to expressions that aggregate only, they carry specific first/last window behavior, they are read-only queries, they rely on predicate evaluation strictly before or after the window is fixed, etc. In DSMS languages, such as streamSQL, a window can only be defined as a fixed-size stream fragment, a time-bounded stream fragment, or a value-bounded stream fragment.
The basket expressions provide a much richer ground to designate windows of interest. They can be limited in size using a sequence constraint, they can be explicitly defined by predicates over their content, and they can be based on predicates referring to objects elsewhere in the database.
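As a sketch of these semantics, the example query above can be mimicked in Python; `basket_expression` is a hypothetical helper, not DataCell code, and baskets are modeled as plain lists:

```python
def basket_expression(store, pred=lambda rec: True):
    """Side-effecting selection: tuples contributing to the result are
    removed from the underlying store, but remain accessible through
    the returned temporary table for the rest of the query."""
    result = [rec for rec in store if pred(rec)]
    store[:] = [rec for rec in store if not pred(rec)]   # the side effect
    return result

# select count(*) from [select * from X] as B where B.payload > 100;
X = [{"payload": p} for p in (50, 150, 200)]
B = basket_expression(X)                        # drains X entirely
count = sum(1 for rec in B if rec["payload"] > 100)
```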
Figure 1: DataCell Application (receptor R → basket B1 → query Q → basket B2 → emitter E, with access to a persistent table T)
2.4 Continuous Queries

With baskets and basket expressions at our disposal, it is only a small step towards support for continuous queries. Continuous queries are long-running queries that are continuously evaluated against incoming stream data. Generally speaking, a continuous query is re-executed whenever the database state changes. Two cases should be distinguished. For a non-streaming database, the result presented to the user is an updated result set, and it is the task of the query processor to avoid running the complete query from scratch over and over again. For a streaming database, repetitive execution produces a stream of results. The results only reflect the latest state, and any persistent state variable should be explicitly encoded.

The continuous query semantics are aligned with basket expressions. When a query has been executed, all its basket expressions are individually inspected for newly arrived information. If there is any, a new cycle of the sub-query evaluation starts.

Figure 1 illustrates how DataCell application scenarios can be modeled succinctly in graphical form. In this simple scenario, the receptor R appends the new incoming data to a basket B1. When new data appears, a submitted continuous query Q obtains access to the incoming stream and is evaluated; its results are placed in the basket B2, from which the emitter E forwards them to the interested subscribers. The graphical notation forms the basis for development tools.
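The cycle of Figure 1 can be summarized in a short Python sketch; the names B1, B2 and the helper `continuous_query` mirror the scenario but are otherwise hypothetical:

```python
def continuous_query(b1, b2, query):
    """One evaluation cycle of a continuous query Q between baskets B1
    and B2: the query fires only when new data sits in its input basket,
    consumes that input, and delivers its results for the emitter."""
    if not b1:
        return False                 # firing condition not met
    batch, b1[:] = b1[:], []         # consume the input basket
    b2.extend(query(batch))          # results go to the output basket
    return True

B1, B2, delivered = [], [], []
B1.extend([10, 200, 30])             # receptor R appends incoming events
fired = continuous_query(B1, B2, lambda batch: [x for x in batch if x > 100])
delivered.extend(B2)                 # emitter E forwards the results
B2[:] = []
```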
3. DATACELL ARCHITECTURE

In this section we present a high-level overview of the DataCell architecture and its underlying computational model.
3.1 Overview

The DataCell architecture is built on top of the extensible column-store database MonetDB. Its architecture provides a clean software stack to support powerful front-ends, such as SQL'03 and XQuery. The system is designed as a virtual machine architecture with an assembly language, called MAL (see http://monetdb.cwi.nl for details on MAL). Each MAL operator wraps a highly optimized relational primitive. It is the target for query compilers. Optimization is based on a modular decomposition of the task and encapsulated in MAL-to-MAL transformers. The DataCell is positioned between the (enhanced) SQL-to-MAL compiler and the MonetDB runtime engine. In particular, the SQL runtime has been extended to manage the baskets using the columns provided by the kernel, and with a scheduler to control activation of the continuous queries. The SQL compiler is extended with a few orthogonal language constructs to handle the basket expressions.

Figure 2: DataCell Architecture (receptors and emitters exchange tuples through baskets; a scheduler between the SQL compiler/DataCell SQL-MAL optimizer and the MonetDB kernel drives basket expressions such as [select * from X] as A and [select * from [select * from Y] as B] as C)

In Figure 2 we illustrate the multi-process architecture of the DataCell. Each receptor is mapped into a separate process thread to receive events and to insert them into the appropriate baskets. Likewise, each emitter is a thread which removes tuples from specific baskets and delivers them to registered clients. Both threads can act as a server, accepting remote connections, or as a client, connecting to a remote site to read/emit tuples. The bridge between the receptor baskets and the emitter baskets is made up of the continuous queries registered in the DataCell. These queries are defined at the SQL front-end, administered in a catalog for inspection, compiled and optimized as MAL plans, and, finally, scheduled for execution. Continuous queries always have at least one input basket and one output basket.
3.2 Computational Model

The computational model underlying continuous query processing is based on Petri nets [16], or predicate nets in particular [14]. A Petri net is a mathematical representation of discrete distributed systems. It uses a directed bipartite graph of places and transitions with annotations to graphically represent the structure of a distributed system. Places may contain tokens to represent information, and transitions model computational behavior. Edges from places to transitions model input relationships and, conversely, edges from transitions to places denote output relationships. A transition fires if there are tokens in all its input places. Once fired, the transition consumes the tokens from its input places, performs some processing task, and places result tokens in all its output places. This operation is atomic, i.e., performed in one non-interruptible step. The firing order of transitions is explicitly left undefined.

In the DataCell context all baskets are considered token placeholders. The basket expressions align with the Petri-net transitions. The firing condition maps to a test for non-empty basket expressions, i.e., a continuous query is automatically executed if its basket expressions produce at least one tuple. The result of the transition is the delivery of tuples to baskets or tables. The Petri-net model for the query below is shown in Figure 3.

select *
from [select * from X] as A,
     [select * from [select * from Y] as B] as C

Figure 3: Petri-net Example

An advantage of the Petri-net model is that it provides a clean definition of the computational state. Furthermore, the hierarchical nature of Petri nets allows us to display and analyze large and small models at different scales and levels of detail. The computational model produces a bottom-up and parallel evaluation sequence for queries with nested and sibling basket expressions. That is, it enables queries whose actions depend on intermediates produced within the same (compound) queries. Petri-net analysis tools, e.g., PIPE2 (http://pipe2.sourceforge.net/), can be used to analyze a model for problematic cases, such as deadlocks, conflicts, blocking and performance bottlenecks.
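A minimal Python rendering may make the mapping concrete; the Transition class is an illustrative toy, not the DataCell scheduler, and the X/Y/A/B/C names follow the query above:

```python
class Transition:
    """Minimal Petri-net sketch: places are lists of tokens; a transition
    fires when every input place is non-empty, atomically consuming one
    token per input place and producing into its output places."""
    def __init__(self, inputs, outputs, fn):
        self.inputs, self.outputs, self.fn = inputs, outputs, fn

    def try_fire(self):
        if not all(self.inputs):           # firing condition: tokens everywhere
            return False
        consumed = [place.pop(0) for place in self.inputs]
        for place, token in zip(self.outputs, self.fn(consumed)):
            place.append(token)
        return True

# baskets X and Y feed transitions A and B; B's output feeds C (nesting)
X, Y, A_out, B_out, C_out = [1], [2], [], [], []
A = Transition([X], [A_out], lambda toks: toks)
B = Transition([Y], [B_out], lambda toks: toks)
C = Transition([B_out], [C_out], lambda toks: toks)
for t in (A, B, C):                        # the firing order is left undefined
    t.try_fire()
```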
3.3 Runtime Behavior

The operational semantics of the DataCell follow directly from the Petri net, i.e., transitions run concurrently with process arbitration inherited from the operating system. Each basket expression, i.e., Petri-net transition, proceeds in a few well-defined steps.

1. A snapshot view is taken of all baskets of interest.
2. The basket predicate is evaluated as an ordinary relational query.
3. Qualifying tuples are permanently removed from the basket.
4. The result tuples are constructed and inserted in the output baskets.
5. Steps 1 to 4 are repeated for the next batch of tuples.

Figure 4: Marked Graph example

factory step1():bit;
    x_num:bat[:lng,:lng] := basket.bind("X","X_num");
    x_id:bat[:lng,:lng] := basket.bind("X","X_id");
    x1_num:bat[:lng,:lng] := basket.bind("X1","X_num");
    x1_id:bat[:lng,:lng] := basket.bind("X1","X_id");
    #mdb.setTrace(true);
    barrier go := true;
        basket.lock("X1");
        basket.lock("X");
        bat.insert(x1_num, x_num);
        bat.insert(x1_id, x_id);
        bat.delete(x_id);
        bat.delete(x_num);
        basket.unlock("X");
        basket.unlock("X1");
        yield qry00 := true;
    redo go := true;
    exit go;
end step1;

petrinet.transformation("user","step1");
petrinet.source("petri","step1","X");
petrinet.target("petri","step1","X1");

Figure 5: Petri net represented by a factory
The snapshot view is a cheap operation in MonetDB. It inspects all baskets mentioned in the basket expression and produces a view. Taking a range-based view is handled in sub-microseconds. These views fixate the database state to consider during the query execution cycle. This action could be a system-global atomic action. In phase two, all basket expressions are evaluated. For each we build a result table and create a list of tuples, the pivot list, that will be removed from the basket or table upon successful query completion. In phase three, the query is evaluated against the pivot list, comprised of the tables for inspection and the temporaries produced. The result set is inserted into tables or baskets.

This basic cycle opens a plethora of optimization challenges. If we deal with an ordered stream, cancellation of the cycle does not require redoing all the work. Instead, we can keep the intermediate around until the next iteration starts and cheaply decide whether the state of affairs has changed by comparing the view attributes. Furthermore, the combination with the computational model offers a solution to avoid conflicts, thanks to the type of Petri net used, a Marked Graph (MG), where every place has one incoming arc and one outgoing arc. This means that conflicts are avoided, but concurrency is not hindered. This type of Petri net allows us to represent and control the different parallel computing levels in the DataCell. The example in Figure 4 presents a Marked Graph, where a process is forked at transition T1 and synchronized at T4, and the other two, T2 and T3, are non-deterministic operations.

3.4 Factories

SQL queries are compiled and optimized into MAL functions parameterised with placeholders for all lexical constants. The function is called afterwards. Keeping the function in a cache simplifies and speeds up the processing of similar queries. The creation of a Petri net uses a MAL co-routine, called a factory, which is specified as an ordinary function, but whose execution state is saved between calls. The first time the factory is called, a thread is created in the local system to subsequently handle requests. Once a result has been produced, it is yielded to the caller, whereafter the factory processing is suspended until another call is made. It then continues with the first instruction after the yield.

The factory is a convenient construct to model continuous queries. Figure 5 illustrates how it moves tuples from basket "X" to basket "X1". Upon a successful execution, the Boolean value true is yielded as the result of the factory call and the factory "step1" is put to sleep. The second call wakes it up at the point where it went to sleep; it finds a redo statement and does another move of tuples from "X" to "X1". With this MAL co-routine the DataCell simply defines a Petri-net node with a factory as transition, the sources as the input places of the transition, and the targets as its output places. The system then only needs to check whether all the sources contain tuples to call the factory. An advantage of its use is the possibility of defining nested factories, which allows the creation of nested Petri-net nodes.
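For readers unfamiliar with MAL, a Python generator shows the same control flow as the factory of Figure 5 in spirit (an analogy only; baskets are plain lists here and locking is omitted):

```python
def step1_factory(x, x1):
    """Generator analogue of the MAL factory 'step1': execution state
    survives between calls, and each call resumes right after the
    previous yield, like the factory's redo loop."""
    while True:                 # the barrier/redo loop of the factory
        x1.extend(x)            # move all tuples from basket X to X1
        x[:] = []
        yield True              # suspend until the transition fires again

X, X1 = [1, 2], []
step1 = step1_factory(X, X1)
next(step1)                     # first call: moves the initial tuples
X.append(3)                     # new event arrives in basket X
next(step1)                     # wakes up at the redo, moves the new tuple
```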
4. QUERYING STREAMS

In this section we illustrate how the key features of a stream query language are handled in the DataCell model. The state-of-the-art query language streamSQL (http://blogs.streamsql.org/) is used as a frame of reference. Its design is based on experience gained in the Aurora [7] project and with CQL in the STREAM [2, 5] project. It also reflects an experience-based approach, where the language design evolved based on concrete application experiments.
4.1 Filter and Map

The key operations for a streaming application are the filter and the map operations. The filter operator inspects individual tuples in a basket and is the most common operator. Tuples that satisfy the filter are taken out of the basket; the others remain until further notice. A map operator takes an event and constructs a new one using built-in operators and calls to linked-in functions. Both operators directly map to the basket expression. There are no upfront limitations with respect to functionality, e.g., on predicates over individual events, nor a lack of access to global tables.

Both operators are defined with a single Petri-net node. The filter/map operation is mapped into the transition, which inserts the mapped/filtered results into the output basket.
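In Python terms, the pair of operators behaves roughly as follows (a sketch with made-up names; the real operators are basket expressions compiled to MAL):

```python
def filter_map(basket, keep, transform):
    """Filter/map sketch: tuples satisfying the filter are taken out of
    the basket and mapped; the rest remain until further notice."""
    taken = [rec for rec in basket if keep(rec)]
    basket[:] = [rec for rec in basket if not keep(rec)]
    return [transform(rec) for rec in taken]

X = [1, 42, 7, 99]
out = filter_map(X, keep=lambda v: v > 10, transform=lambda v: v * 2)
```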
4.2 Split and Merge

Stream splitting enables tuple routing in the query engine. It is heavily used to support a large number of continuous queries by factoring out the common parts of interest. Likewise, stream merging, which can be a join or a gather, is used to merge different results from a large number of common queries. Both are mainly used in the Petri-net model to support the parallel computation explained in Section 3.3. Figure 4 contains both operations: transition T1 is a splitting operation and transition T4 is a merging operation.

The join and gather operators of streamSQL share the semantic problem found in all stream systems, i.e., at any time only a portion of the infinite stream is available. This complicates a straightforward mapping of the relational join, because infinite memory would be required. The way out is window-based joins. They give a limited view over the stream, and any tuple outside the window can be discarded from further consideration. The boundary conditions are reflected in the join algorithm. For example, the gather operator works on the assumption that both streams have a uniquely identifying key to glue together tuples from different streams.

In the DataCell, we elegantly circumvent the problem using the basket expression semantics and the computational power of SQL. The DataCell immediately removes tuples that contribute to a basket predicate becoming true. In particular, it removes matching tuples used in a merge predicate. This way, merges over streams with uniquely tagged events are straightforward. Delayed arrivals are also supported: non-matched tuples remain stored in the baskets until a matching tuple arrives, or until a garbage collection query takes control.
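A small Python sketch illustrates this merge discipline under the assumption of uniquely tagged events (the dict-based baskets and the merge helper are illustrative, not DataCell code):

```python
def merge(left, right):
    """Merge two baskets (modeled as id -> payload dicts): matching
    tuples are removed immediately; non-matched tuples stay in their
    baskets until a partner arrives on a later cycle."""
    keys = set(left) & set(right)
    return [(k, left.pop(k), right.pop(k)) for k in sorted(keys)]

L = {1: "a", 2: "b"}
R = {2: "x", 3: "y"}
out = merge(L, R)             # only id 2 matches in this cycle
R[1] = "z"                    # delayed arrival on the right stream
out += merge(L, R)            # id 1 matches on a later cycle
```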
4.3 Aggregation

The initial strong focus on aggregation networks has made stream aggregation a core language requirement. In combination with the implicitly serial nature of event streams, most systems have taken the route of exploring a sliding-window approach to ease their expressiveness. In the DataCell, we have opted not to tie the concepts that strongly. Instead, an aggregate function is simply a two-phase processing structure: aggregate initialization followed by incremental updates.

This processing structure can be modeled by the MAL co-routine used to create a Petri-net transition, the factory presented in Section 3.4. The prototypical example is to calculate a running average over a single basket. A transition with such functionality is represented by the following factory.

factory average():bit;
    x_num:bat[:lng,:lng] := basket.bind("X","X_num");
    cnt := 0;
    sum := 0;
    barrier go := true;
        basket.lock("X");
        cnt := cnt + bat.count(x_num);
        sum := sum + aggr.sum(x_num);
        bat.delete(x_num);
        basket.unlock("X");
        yield avg := sum / cnt;
    redo go := true;
    exit go;
end average;

Each time the transition fires, the factory average yields the global average over the stream, from the moment the transition was defined up to the current moment.
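The same incremental structure can be expressed as a Python generator for illustration (an analogy to the MAL factory above, not its implementation); the state kept between firings is the running sum and count:

```python
def running_average(basket):
    """Generator analogue of the 'average' factory: each firing consumes
    the basket's current batch and yields the global average so far."""
    total, count = 0, 0
    while True:
        total += sum(basket)
        count += len(basket)
        basket[:] = []              # consumed tuples leave the basket
        yield total / count

X = [2, 4]
avg = running_average(X)
first = next(avg)                   # average over the first batch
X.extend([6])                       # a new event arrives
second = next(avg)                  # running average over all events so far
```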
4.4 Partition

Stream engines use a simple value-based partitioning scheme to increase parallelism and to group events. A partitioning generates as many copies of the down-stream plans as there are values in the partitioning column. This approach only makes sense if the number of values is limited. It is also not necessary in a system that can handle groups efficiently. In the context of MonetDB, value-based partitioning is considered a tactical decision taken automatically by the optimizers. A similar route is foreseen for handling partitions over streams to increase parallelism. Partitioning to group events of interest still relies on the standard SQL semantics.
Example 1. An example taken from the Linear Road Benchmark [4] is to collect a list sorted by traffic per minute. For this example the source contains tuples with a tag, which is a timestamp value. The transition selects all the tuples from the source and groups them by tag value, in this case minute by minute. This solution can be refined with a metronome to produce the result at regular intervals. For that we only need an extra source, the target of a metronome transition, which receives one NULL tuple every minute.
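A Python sketch of Example 1 (with hypothetical helper and field names) groups the consumed tuples minute by minute each time the metronome fires:

```python
from collections import Counter

def per_minute_traffic(events, tick_minute):
    """On a metronome tick at minute `tick_minute`, take all events
    tagged before the tick out of the source basket and report
    (minute, count) pairs sorted by traffic, busiest minute first."""
    due = [e for e in events if e["tag"] < tick_minute * 60]
    events[:] = [e for e in events if e["tag"] >= tick_minute * 60]
    counts = Counter(e["tag"] // 60 for e in due)
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

src = [{"tag": t} for t in (5, 20, 70, 80, 95, 130)]  # seconds since start
report = per_minute_traffic(src, tick_minute=2)       # metronome fires at 120s
```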
5. EVALUATION

In this section, we report on experiments designed to assess the potential of a full-fledged implementation of the DataCell and its query language framework. All experiments are run against the prototype DataCell implementation using manually modified MAL plans produced by the regular MonetDB/SQL compiler. This way we are able to assess the solution without immediately being confronted with all corners of a fully functional DataCell SQL compiler and its runtime system. The experiments were conducted on a 2.4 GHz AMD Athlon 64 processor equipped with 2 GB RAM and two 250 GB 7200 rpm S-ATA hard disks configured as RAID 0. The operating system is Fedora Core 6.

Figure 6: Effect of network. (a) Average latency per tuple (microseconds) and (b) throughput (tuples/sec) versus batch size (1-16 tuples), for 2, 4 and 8 columns.
5.1 Latency and Throughput
The baseline experiments are aimed at identifying the coarse-grained costs involved in dealing with event streams in the DataCell context. Of course, a non-negligible cost is the transport of any event through the communication layers to deliver it on the doorstep of the DataCell. It largely depends on the sensor and network characteristics. To quantify it, we ran experiments where the sensor was directly hooked up to the receiver. The latency over a TCP/IP socket on our Linux platform gives values as low as 50 usec and as high as 900 usec. Figures 6.a and 6.b illustrate some of the results obtained. Although a larger batch size improves the situation somewhat, the spread remains high. A sneaky way out of this dilemma is to ignore the network and look only at what is happening inside a DSMS engine. We are convinced that our underlying DBMS provides good performance, but also that we may not ignore the real-life cost of event shipping. The DataCell's bulk processing architecture is designed for and capable of dealing with fluctuating arrivals. In the remaining experiments we focus on a steady-state environment.
5.2 Conveyor Belt Experiments

The next series of experiments focuses on the latency and throughput using the DataCell in an application scenario. A series of continuous queries is lined up to pass tuples from the receptor to the emitters, as illustrated in Figure 8. The events are generated by a fake sensor process on the same machine. Each tuple received at the receptor is time-tagged, and the emitter sends the delay incurred by the event in the DataCell for post-processing at the receiver side. The latency for various sizes is shown in Figure 7.a. The corresponding throughput is shown in Figure 7.b. Each query is a simple basket expression to push all tuples through as quickly as possible.
Figure 8: Conveyor Experiments tinuous queries are chained together. The scheduler uses a naive round-robin policy, which inspects the baskets for newly arrived events, i.e. to satisfy the Petri-net firing condition. Subsequently, a MonetDB factory, a.k.o. co-routine, is started to restart the basket expression. The processing behavior strongly depends on how the basket synchronization is handled and the query complexity. The prototype implements a serial execution of the continuous queries using the Petri-net model. It means that, aside from the interaction with receptors and emitters, the baskets need not be locked/unlocked before being used. For long conveyor belts the latency is largely determined by the query scheduling overhead.
6.
RELATED WORK
The DataCell falls in the category of stream-engines for complex event processing (CEP). This field is a revival of realtime processing in the contex of active database processing. The language semantics and its integration with database technology is studied in e.g. [8, 15, 10, 13] Several DSDM solutions have been proposed, e.g., [5, 7, 9, 12], but few have reached a maturity to live outside the research labs. Example systems that can be downloaded for experimentation are Borealis6 and TelegraphCQ7 . The DataCell follows the MonetDB product family charter to disseminate a useful system as soon as possible. In particular, the functionality of the DataCell was inspired by StreamSQL and CQL[6, 3]. The latter project has been abandoned and the software is not maintained for ease of 6
The experiments confirm the linear behavior as more con-
Qk
7
http://www.cs.brown.edu/research/borealis/public/ http://telegraph.cs.berkeley.edu/telegraphcq/v0.2/
300000
400000 1000 tuples 2000 tuples 3000 tuples
Conveyor 350000
250000 300000 Throughput (tuples/sec)
Latency (microsecs)
200000
150000
100000
250000
200000
150000
100000 50000 50000
0
0 10
20
30
40
50 60 # of Queries
70
80
90
100
10
20
(a) Latency
30
40
50 60 # of Queries
70
80
90
100
(b) Throughput Figure 7: Effect of network
experimentation. StreamSQL is also based on CQL, but it carries the signs of meeting the requirements of their commercial client base. It has been developed for simpler queries and it has a limited-power query-specification GUI, with the SQL-based language coming only after the fact. Instead, the DataCell has been developed for complex queries and it supports the complete SQL-based language. Furthermore, it relies on the Petri-net computational abstraction to highlight and analyze effects of concurrent behavior.
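As a rough sketch of the scheduling behavior described for the conveyor experiments, the following toy model implements baskets, continuous queries with the Petri-net firing condition (a query is enabled when its input basket is non-empty), and a naive round-robin scheduler. All class and function names here are illustrative, not the DataCell API.

```python
from collections import deque

class Basket:
    """A basket holds newly arrived events until a query consumes them."""
    def __init__(self):
        self.events = deque()
    def append(self, e):
        self.events.append(e)
    def drain(self):
        batch, self.events = list(self.events), deque()
        return batch

class ContinuousQuery:
    """Fires when its input basket is non-empty (the Petri-net firing
    condition), consuming the whole batch at once and emitting the
    results into the output basket."""
    def __init__(self, name, inp, out, fn=lambda batch: batch):
        self.name, self.inp, self.out, self.fn = name, inp, out, fn
    def ready(self):
        return bool(self.inp.events)
    def fire(self):
        for e in self.fn(self.inp.drain()):
            self.out.append(e)

def round_robin(queries):
    """One scheduler pass: inspect every basket, fire the ready queries."""
    fired = 0
    for q in queries:
        if q.ready():
            q.fire()
            fired += 1
    return fired

# Conveyor belt: receptor basket -> Q1 .. Qk -> emitter basket.
k = 5
baskets = [Basket() for _ in range(k + 1)]
chain = [ContinuousQuery(f"Q{i+1}", baskets[i], baskets[i+1])
         for i in range(k)]
for e in range(1000):
    baskets[0].append(e)          # the receptor deposits a batch of events
while round_robin(chain):         # run until no query is enabled
    pass
print(len(baskets[-1].events))    # -> 1000 events reach the emitter
```

Because the queries execute serially within one scheduler pass, no basket locking is needed, mirroring the prototype's serial Petri-net execution; the per-pass inspection of every basket is also why long conveyor belts are dominated by scheduling overhead.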
The DataCell project aims to provide a data stream management system that exploits an extensible DBMS. Applications can formulate expressive queries over multiple data streams. Most DSMSs follow the event-at-a-time model, which incurs significant overhead; in contrast, the DataCell capitalizes on the performance characteristics of bulk event processing, i.e., basket-at-a-time processing in a column-store setting. This provides the flexibility for better query scheduling and for better exploitation and balancing of system resources.
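The contrast between the two processing models can be caricatured in a few lines: the event-at-a-time variant pays per-tuple dispatch overhead for every arriving event, while the basket-at-a-time variant lets events accumulate and evaluates the predicate once over the whole batch, the way a column-store operator would. This is only a Python analogy, not MonetDB code; the stream contents and predicate are arbitrary.

```python
import random
import timeit

events = [random.randint(0, 100) for _ in range(100_000)]

def one_at_a_time(stream, pred):
    """Event-at-a-time: the predicate is re-dispatched for every single
    event, mimicking a tuple-at-a-time stream operator."""
    out = []
    for e in stream:
        if pred(e):
            out.append(e)
    return out

def basket_at_a_time(basket, pred):
    """Basket-at-a-time: the predicate runs once over the accumulated
    batch, amortizing the interpretation overhead."""
    return list(filter(pred, basket))

pred = lambda v: v > 90
assert one_at_a_time(events, pred) == basket_at_a_time(events, pred)
t1 = timeit.timeit(lambda: one_at_a_time(events, pred), number=10)
t2 = timeit.timeit(lambda: basket_at_a_time(events, pred), number=10)
print(f"per-event {t1:.3f}s vs basket {t2:.3f}s")
```

The absolute numbers are irrelevant; the point is that the per-event loop repeats bookkeeping that the batch variant performs once per basket, which is exactly the overhead bulk processing amortizes.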
The literature on performance evaluation of stream engines does not yet provide many points of reference. GigaScope [9] claims a peak performance of up to a million events per second by pushing selection conditions down into the Network Interface Controller. Early presentations on Aurora report handling over 160K msg/sec.
The DataCell computational model is complete enough to support the necessary components for a robust stream engine: not only to support them, but also to avoid data conflicts between them. Furthermore, it opens opportunities to explore concurrency and query scheduling.
An application-specific performance analysis is the Linear Road Benchmark [4]. It compares Aurora and a commercial DBMS, systemX. Two solutions for systemX are given, one based on triggers and stored procedures, the other based on polling. The systems show the capability to handle between 100 (systemX) and 486 (Aurora) tuples/second. The DataCell architecture can be considered an in-between solution: not as specialized as a stream engine like Aurora, but much more flexible than systemX. The message throughput is largely determined by the network protocol, i.e., how quickly events can be brought into the DataCell. The most similar system is Coral8, which was also developed to support complex queries and attacks the stream problem from an SQL relational database perspective. It claims a total latency of around one or two milliseconds. Several systems claim different performance results, but in general there are three real dimensions to performance: absolute latency, which should be low; absolute throughput, which should be high; and the complexity of the queries/patterns/filters in use while the first two are observed. This makes comparisons between systems complex and context-dependent.
7. SUMMARY AND FUTURE WORK
A prototype DataCell kernel is operational and is used to assess its functionality and performance. Experiments based on patching the intermediate code produced by the SQL compiler indicate that bulk processing is effective and that basket expressions adapt nicely to event arrival rates. Dealing with more than 100K events/second using the SQL framework seems feasible. Future work....
8. ACKNOWLEDGMENTS
We wish to thank F. Groffen, M. Ivanova, and the other members of the MonetDB development team for their constructive comments on earlier versions of this paper.
9. REFERENCES
[1] D. J. Abadi, Y. Ahmad, M. Balazinska, U. Cetintemel, M. Cherniack, J.-H. Hwang, W. Lindner, A. S. Maskey, A. Rasin, E. Ryvkina, N. Tatbul, Y. Xing, and S. Zdonik. The Design of the Borealis Stream Processing Engine. In CIDR, 2005.
[2] A. Arasu, B. Babcock, S. Babu, M. Datar, K. Ito, I. Nishizawa, J. Rosenstein, and J. Widom. STREAM: The Stanford Stream Data Manager. In SIGMOD, 2003.
[3] A. Arasu, S. Babu, and J. Widom. CQL: A Language for Continuous Queries over Streams and Relations. In DBPL, 2003.
[4] A. Arasu, M. Cherniack, E. F. Galvez, D. Maier, A. Maskey, E. Ryvkina, M. Stonebraker, and R. Tibbetts. Linear Road: A Stream Data Management Benchmark. In VLDB, 2004.
[5] B. Babcock, S. Babu, M. Datar, R. Motwani, and D. Thomas. Operator Scheduling in Data Stream Systems. The VLDB Journal, 13(4):333–353, 2004.
[6] S. Babu and J. Widom. Continuous Queries over Data Streams. SIGMOD Record, 30(3):109–120, 2001.
[7] H. Balakrishnan, M. Balazinska, D. Carney, U. Centintemel, M. Cherniack, C. Convey, E. Galvez, J. Salz, M. Stonebraker, N. Tatbul, R. Tibbetts, and S. Zdonik. Retrospective on Aurora. The VLDB Journal, 13(4):370–383, 2004.
[8] R. S. Barga, J. Goldstein, M. H. Ali, and M. Hong. Consistent Streaming Through Time: A Vision for Event Stream Processing. In CIDR, pages 363–374, 2007.
[9] C. D. Cranor, T. Johnson, O. Spatscheck, and V. Shkapenyuk. Gigascope: A Stream Database for Network Applications. In SIGMOD, 2003.
[10] A. J. Demers, J. Gehrke, B. Panda, M. Riedewald, V. Sharma, and W. M. White. Cayuga: A General Purpose Event Monitoring System. In CIDR, pages 412–422, 2007.
[11] A. Eisenberg, J. Melton, K. G. Kulkarni, J.-E. Michels, and F. Zemke. SQL:2003 Has Been Published. SIGMOD Record, 33(1):119–126, 2004.
[12] L. Girod, Y. Mei, R. Newton, S. Rost, A. Thiagarajan, H. Balakrishnan, and S. Madden. The Case for a Signal-Oriented Data Stream Management System. In CIDR, 2007.
[13] C. Koch, S. Scherzinger, N. Schweikardt, and B. Stegmaier. Schema-based Scheduling of Event Processors and Buffer Minimization for Queries on Structured Data Streams. In VLDB, pages 228–239, Toronto, Canada, Sept. 2004.
[14] A. Maggiolo-Schettini and J. Winkowski. A Generalization of Predicate/Transition Nets. Helsinki University of Technology, Digital Systems Laboratory, Series A: Research Reports, 1990.
[15] J. Mihaeli and O. Etzion. Event Database Processing. In ADBIS (Local Proceedings), 2004.
[16] J. L. Peterson. Petri Nets. ACM Comput. Surv., 9(3):223–252, 1977.