Enterprise Application-specific Data Management

Jens Krueger, Martin Grund, Alexander Zeier, Hasso Plattner
Hasso Plattner Institute for IT Systems Engineering, University of Potsdam, Potsdam, Germany
{jens.krueger, martin.grund, alexander.zeier, hasso.plattner}@hpi.uni-potsdam.de
Abstract—Enterprise applications are presently built on a 20-year-old data management infrastructure that was designed to meet a specific set of requirements for OLTP systems. In the meantime, enterprise applications have become more sophisticated, data set sizes have increased, requirements on the freshness of input data have been strengthened, and the time allotted for completing business processes has been reduced. To meet these challenges, enterprise applications have become increasingly complicated to make up for shortcomings in the data management infrastructure. To address this issue we investigate recent trends in data management, such as main memory databases, column stores, and compression techniques, with regard to the workload requirements and data characteristics derived from actual customer systems. We show that a main memory column store is better suited for today's enterprise systems, which we validate using SAP's NetWeaver Business Warehouse Accelerator and a realistic set of data from an inventory management application.
I. INTRODUCTION

So far, column-oriented database management systems have mainly been used for analytical processing of enterprise data, leaving transaction processing aside due to the overhead of row-oriented operations in a column-oriented storage system. Several aspects of today's developments in computer science and business applications underline the feasibility of such a system. Firstly, hardware has reached a far more advanced level compared to the hardware available in the 1990s, when the decision to separate analytics from transaction processing systems was made, e.g. increased main-memory capacity and multi-core processor architectures. For example, bottlenecks created by the unbalanced growth of hardware (main-memory access speed grows by only about 10% per year whereas processor speed has been growing by about 60% per year, and the bandwidth of the connection between memory and processor is limited) play into the capabilities of column-oriented storage technology. This way of organizing data takes advantage of modern hardware by optimizing for sequential memory access, which allows efficient use of the memory hierarchies and prefetching methods of today's processors. Furthermore, computational resources are traded against light-weight compression schemes, facilitating a higher utilization of the memory bandwidth while enabling late materialization during plan execution.
Secondly, our analyses of the enterprise data of real companies show that column-oriented data management is able to exploit specifics of enterprise data that have so far been disregarded by traditional row-oriented data storage. Such characteristics include, for example, that data is updated less often than is usually assumed and that many attributes contain null values, are constant, or have only little variance. These results further promote the use of column-oriented storage, as its handicap compared to row-oriented storage is lessened given these transactional workload characteristics. In addition, our investigation of realistic workloads has shown that over 90% of all queries are select statements. Given this fact, a read-optimized store with a write-optimized buffer alongside achieves better overall performance than a traditional row store. Following this assumption, column-oriented databases facilitating compression offer both performance enhancements and reductions in storage consumption, as read accesses can be processed directly on compressed data during query execution. In prior work [1] we propose a write-optimized buffer to trade write performance for query performance and memory consumption by using the buffer as an intermediate store for several modifications, which are then merged in bulk by a compaction operation. The current state of the data is represented by this additional write-optimized buffer, which maintains the delta, in conjunction with the compressed main store, which is read-optimized and contains most of the data. Hence, it becomes feasible to incorporate such storage architectures in transactional environments if certain assumptions hold, such as a mostly read-focused workload.

In this paper we present the capabilities of the SAP NetWeaver Business Warehouse Accelerator, a main-memory, column-oriented data management system, in enterprise application environments. We show how transactional enterprise data can be analyzed swiftly and flexibly by operating on its original transactional format, without creating special structures for handling aggregates or analytic-style queries, while at the same time enabling more flexible applications with less complexity. The remainder of the paper is structured as follows: first, we introduce recent trends in data management that have the potential to lead to new architectures capable of handling the requirements of today's enterprise applications.
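To make the delta-plus-main idea tangible, the following is a minimal sketch, not the implementation described in [1]: a dictionary-compressed, read-optimized main partition sits next to an uncompressed, write-optimized buffer; reads consult both, and a compaction step merges the buffer into the main partition in bulk. All class and method names are invented for illustration.

```python
class BufferedColumnStore:
    """Read-optimized main partition plus a write-optimized delta buffer (illustrative only)."""

    def __init__(self):
        self.dictionary = []      # sorted distinct values of the main partition
        self.main = []            # value ids into the dictionary (compressed main)
        self.delta = []           # uncompressed, append-only write buffer

    def insert(self, value):
        # Writes only touch the cheap, uncompressed buffer.
        self.delta.append(value)

    def scan(self):
        # Reads see the union of the compressed main and the delta.
        yield from (self.dictionary[vid] for vid in self.main)
        yield from self.delta

    def compact(self):
        # Bulk merge: rebuild the dictionary, re-encode the main, clear the delta.
        values = list(self.scan())
        self.dictionary = sorted(set(values))
        positions = {v: i for i, v in enumerate(self.dictionary)}
        self.main = [positions[v] for v in values]
        self.delta = []


store = BufferedColumnStore()
for v in ["EUR", "USD", "EUR", "GBP"]:
    store.insert(v)
store.compact()
store.insert("EUR")
print(list(store.scan()), store.main, store.dictionary)
```

Queries always read main plus delta, so the compressed, read-optimized representation serves the bulk of the data while single modifications stay cheap until the next compaction.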
The following part introduces design considerations that have been derived from analyzing customer data. In the next section we present a cost model to theoretically evaluate our assumptions, followed by a validation of this approach with real customer data. The paper concludes with related work and a summary and outlook.

II. TRENDS IN ENTERPRISE DATA MANAGEMENT

Enterprise applications heavily rely on database management systems to take care of the storage and processing of their data. A common assumption about how enterprise applications work (row based, many updates) has driven decades of database research. A rising trend in database research shows how important it is to rethink how persistence should be managed in order to leverage new hardware possibilities and to discard parts of the over 20-year-old data management infrastructure. The overall goal is to define application persistence based on the data characteristics and usage patterns of the consuming applications in realistic customer environments. Of course it has to be considered that some of these characteristics may be weakened by the fact that the applications use "old" data management software. Stonebraker et al. propose a complete redesign of database architectures considering the latest trends in hardware and the actual usage of data in [2], [3]. Besides, Vogels et al. describe in [4] the design and implementation of a key-value storage system that sacrifices consistency under certain failure scenarios and makes use of object versioning and application-assisted conflict resolution.

We analyzed the typical setup for enterprise applications, consisting of a transactional OLTP part and an analytical OLAP part. The major difference between both applications is the way data is stored, and data is typically kept redundantly in both systems. While OLTP is well supported by traditional row-based DBMSs, OLAP applications are less efficient on such a layout, which led to the development of several OLAP-specific storage schemes, in particular multidimensional schemas. However, these schemas are often proprietary and difficult to integrate with the bulk of enterprise data that is stored in a relational DBMS. This poses serious problems to applications which are to support both the OLAP and the OLTP world. Given that, complex OLTP queries, such as the computation of the stock of a certain material in a specific location, cannot be computed from the actual material movement events but are answered using precomputed aggregates on a predefined granularity level. However, three recent developments have the potential to lead to new architectures which may well cope with such demands [5]:
• Main-Memory Databases,
• Column-Oriented Storage Schemas, and
• Query-aware light-weight Compression.
A. Main-Memory Database

Database management systems that answer all queries directly from main memory and do not rely on disk storage as their primary persistence are called in-memory or Main-Memory Databases (MMDBs) [6], [7]. Falling prices and constantly growing memory sizes make them more and more attractive for managing complete database instances. Moreover, even though the data set of enterprise applications increases, other data sets, such as unstructured data or web data captured e.g. from social network applications, grow faster. Due to organizational or functional partitioning, most systems do not exceed an active size of 1TB and grow only slowly. Transactional processing is based on actual events tied to the real world, and the number of customers, products, and other real-world entities does not grow as fast as main memory does.

Compared to the size of main memory, its latency and bandwidth have changed only slightly. To improve bandwidth utilization, modern CPUs have several caches directly on or close to the CPU. These caches have proven to be valuable targets for optimization, as shown for example in [8], [9]. In [10] Manegold et al. show that one factor determining the processing time of an in-memory data access is the number of cache misses, which can be mapped to CPU cycles as an equivalent of time. Furthermore, the authors derive a dependency on random and sequential data access patterns, distinguished by the stride between the read values. Following this rationale, it must be the goal of an MMDB to minimize the number of misses on each level of the memory hierarchy, which in most cases consists of level 1 cache, level 2 cache, and main memory, with increasing size and decreasing performance throughout the hierarchy.
Figure 1. Data Access with Increasing Stride (example row-oriented table with attributes a1-a5 and tuples r0-r3; a single cache line spans several attributes of a tuple)
The best performance is achieved if the data is accessed sequentially and the loaded cache line is fully consumed. Figure 1 shows an example table where multiple attributes span a single cache line in a row-oriented storage. The ultimate goal must be to read all attributes stored in this cache line to avoid additional misses. The concrete application implementation must ensure proper usage of the underlying memory architecture. For an MMDB, the optimization of the physical data layout in memory depends on the actual data access pattern; OLTP and OLAP workload characteristics are prominent examples of such requirements driving the architecture design. In addition, it becomes important to
make sure that the available memory bandwidth is used optimally and that as much data as possible is read sequentially.

B. Column Stores

A promising technology for facilitating complex queries over large data sets is column-oriented databases, which are based on the decomposed storage model devised by Copeland and Khoshafian in 1985 [11]. Each column is stored separately while the logical table schema is preserved by the introduction of surrogate identifiers (which might be implicit). The use of surrogate identifiers leads to extra storage consumption, which can be avoided, for example, by using the positional information of the attributes in the column as the identifier. Encoding approaches exist that avoid the redundant storage of attribute values, e.g. null values, or values in columns with only a small number of distinct values. For queries with high projectivity, column-oriented databases have to perform a positional join to re-assemble the different fields of a relation, but projectivity is usually low in analytical workloads since these workloads are largely attribute-focused rather than entity-focused [12]. Memory-based column databases, such as SAP's Business Warehouse Accelerator, can rapidly execute analytical queries because they exploit the specific characteristics of a column store by reading most of the data sequentially and in compressed form from main memory, while being as aware of the memory hierarchies as possible. Rapid access to single business entities spread over many tables is possible as well, but comes with a slight performance penalty due to random access. However, the point where this tuple-reconstruction overhead outweighs the advantages of column-oriented storage is much higher in main-memory based databases due to their random access capabilities, which makes the usage of such technology appealing for transactional workloads that include access to single instances. Taking all properties together, column-store DBMSs benefit from data access that reads relatively small attribute sets over a large set of tuples. In addition, when working with compression, low update rates are beneficial to avoid re-compression.
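As an illustration of the decomposed storage model with implicit positional surrogates, consider the following minimal sketch (invented names, not the layout used inside SAP's accelerator): each attribute lives in its own array, the tuple id is simply the position, scanning one attribute touches a single contiguous array, and reconstructing a full tuple requires one positional lookup per column.

```python
class ColumnTable:
    """Decomposed storage: one list per attribute, position = implicit surrogate id."""

    def __init__(self, attributes):
        self.columns = {name: [] for name in attributes}

    def insert(self, row):
        # A logical row is split across the attribute columns.
        for name, value in row.items():
            self.columns[name].append(value)

    def scan(self, attribute):
        # Attribute-focused (set) processing touches only the columns it needs, sequentially.
        return self.columns[attribute]

    def reconstruct(self, row_id):
        # Tuple reconstruction is a positional join across all columns.
        return {name: values[row_id] for name, values in self.columns.items()}


orders = ColumnTable(["customer", "material", "quantity"])
orders.insert({"customer": "C1", "material": "MAT-001", "quantity": 10})
orders.insert({"customer": "C2", "material": "MAT-001", "quantity": 5})
print(sum(orders.scan("quantity")))   # attribute-focused access over all tuples: 15
print(orders.reconstruct(0))          # entity-focused access: the full first tuple
```

The trade-off described above is visible directly in the sketch: aggregating one attribute reads a single contiguous list, while assembling one complete tuple requires touching every column once.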
C. Compression Techniques

Moore's Law, which states that the complexity of integrated circuits doubles roughly every two years, has held for CPUs for more than 35 years. Nothing similar is true for hard drive or memory speed, so that for many queries I/O has increasingly become the bottleneck [8]. A widening gap between the growth rate of CPU speed and memory access speed can be observed, as described in [13]. This trend argues for the use of compression techniques that require higher effort for decompression: since CPU power grows while resources are wasted waiting for I/O operations to complete, trading CPU power for I/O improvements is feasible. While early work on compression [14] explored improvements of I/O by reducing the size of the data, recent research, such as [15], [16], has focused on the effects of compression on query execution, i.e. late materialization strategies.

Data compression techniques exploit redundancy within the data and knowledge about the data domain for optimal results. Compression applies particularly well to columnar storage [16], since all data within a column a) has the same data type and b) typically has similar semantics and thus low information entropy, i.e. in many cases there are few distinct values. In Run-Length Encoding (RLE), the repetition of values is compressed to a (value, run-length) pair; for example, the sequence 'aaaa' is compressed to 'a[4]'. This approach is especially suited for sorted columns with little variance in attribute values. If no sorting is to be applied, bit-vector encoding is well suited, and many different variants of bit-vector encoding exist. Essentially, a frequently appearing attribute value within a column is associated with a bit string in which the bits reference positions within the column, and only those bits whose position holds the attribute value are set. The column is then stored without this attribute value and can be reconstructed in combination with the bit-vector. Another prominent example is dictionary encoding, where frequently appearing patterns are replaced by smaller (bit-compressed) symbols. Due to the compression within the columns, the density of information in relation to the space consumed is increased. As a result, more relevant information can be loaded into the cache at a time, increasing bandwidth utilization. Note that these techniques are also applicable to row-oriented storage, but they are efficient only in MMDBs.

In addition, performance can be improved by making the column-oriented query executor aware of the type of compression that is being used, as described in [17]. Decompressing while reading data puts additional load on the CPU. The main trade-off in compression is the compression ratio versus the cost of decompression. Hence, the goal is to postpone decompression to the latest possible point in processing, to both leverage the compressed data as much as possible and decompress only data that is written out.

However, when taking into account aggregate functions in real-time enterprise applications, which are heavily used when working on actual events instead of precomputed aggregates, typical operations are more analytical-style queries, and compression techniques on top of which these functions can be performed directly, without decompressing, become more appealing. Furthermore, the benefit of compression increases even more when considering sparse data in conjunction with a considerably low number of distinct values in each column.
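As a concrete illustration of the light-weight schemes above, the following minimal sketch (invented helper names, not the encodings used inside SAP BWA) shows dictionary encoding, run-length encoding, and a SUM computed directly on the RLE representation without decompressing.

```python
from itertools import groupby


def dictionary_encode(values):
    # Map each distinct value to a small integer code.
    dictionary = {v: code for code, v in enumerate(sorted(set(values)))}
    return dictionary, [dictionary[v] for v in values]


def rle_encode(values):
    # Collapse runs of equal values into (value, run_length) pairs.
    return [(v, len(list(group))) for v, group in groupby(values)]


def rle_sum(runs):
    # Aggregate directly on the compressed form: no decompression needed.
    return sum(value * length for value, length in runs)


column = [4, 4, 4, 7, 7, 1, 1, 1, 1]
dictionary, codes = dictionary_encode(column)
runs = rle_encode(column)
assert rle_sum(runs) == sum(column)
print(dictionary, codes, runs, rle_sum(runs))
```

The aggregate in rle_sum touches one pair per run instead of one value per tuple, which is exactly why columns with few distinct values and low update rates benefit most.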
III. DESIGN CONSIDERATIONS

Due to increased data set sizes, new requirements for business processes, and the increasing demand for actual event data, enterprise applications have become increasingly complicated to make up for shortcomings in the data management infrastructure, including separate stores for transactional and analytical data. As a consequence, today's database management systems can no longer fulfill the requirements of specific enterprise applications, since they are heavily optimized for one use case: either OLTP or OLAP processing. This leads to a mismatch between enterprise applications and the underlying data management layer, mainly because conventional RDBMSs cannot execute certain important complex operations in a timely manner. While this problem is widely recognized for analytical applications, it also pertains to sophisticated transactional applications. One way to make up for these shortcomings in the data management infrastructure is to package operations as long-running batch jobs. This approach slows down the rate at which business processes can be completed, possibly exceeding external requirements. Maintaining precomputed, materialized results of the operations is another solution; materialized views in data warehouses for analytical applications are an example of this approach, which makes applications less flexible, harder to use, and more expensive to maintain. In fact, enterprise applications use only a small subset of current database features, especially for transactional behavior and complex calculations on sets of data. For example, storing data redundantly by predefined characteristics is used to solve performance issues while accepting both an increase in complexity and a decrease in flexibility. This approach is very similar to the materialized view concept in analytical environments, but it is triggered by the application because of the complex application logic involved. To summarize: the current approach to dealing with today's enterprise requirements leads to more complexity in how data is transformed and stored. In the following we show how the actual enterprise characteristics map to current software technology, e.g. main memory column store databases.

A. Set Processing

Even though transactional systems work on single instances of objects, our investigation has shown that most of the data is consumed by set processing. In our work we distinguish between processing one single instance of an enterprise entity, such as one sales order or one customer, and processing attributes of a set of instances, such as reading all open (overdue) invoices or showing the top ten customers by sales volume. Even though enterprise applications rely on creating events corresponding to sales orders, invoices, etc., it is not possible to determine the state of an enterprise or a process from a single instance. For example, to determine the progress of a project, a context has to be created. To construct this context, different attributes from many events have to be read and typically aggregated in order to generalize the information. Possible examples of such contexts are dashboards and object worklists. Dashboards are used to present the current state of a project or any other semantically grouping object. Object worklists are used to generate tasks based on the context of other objects, such as creating follow-ups, invoices, etc. Furthermore, business entities exist whose state can be derived from actual events instead of precomputed aggregates; examples are a financial account or the inventory of a material. The sheer flexibility of event-based state reconstruction and the number of ways to execute reports clearly outnumber the possible ways of entering data. Before any business decision can be made, the context for this decision needs to be created by processing a set of enterprise entities. In general, the described set processing requires reading fewer attributes, but sequentially, instead of reconstructing single relations. The more operations can be categorized by this behavior, the more main memory column stores benefit from this characteristic.

B. Data Changes

In the context of enterprise data management, database systems are classified as being optimized either for online transaction processing or for online analytical processing, while enterprise applications encompass the characteristics of transactional processing as mentioned before. Consequently, it is assumed that the initiated workload is mainly focused on inserts and updates rather than on selects. In order to identify whether a read-optimized or a write-optimized store suits the actual workload, the database log files of 65 customers of an enterprise resource planning system have been analyzed with regard to the types of queries executed. The result is shown in Figure 2, which clearly shows the distribution towards read-mostly queries, clustering the analysis by month of the fiscal year of the investigated companies.

Figure 2. Query Workload for an ERP System (load for all customers: share of selects, inserts, updates, and deletes per month of the fiscal year)

The main finding of this analysis is that around 90% of all queries executed are selects. This result provides a strong argument for using a read-optimized store in enterprise workloads, which keeps up with data changes by means of a write-optimized buffer as proposed in [1], [18], to achieve the best overall performance in the analyzed environment. As mentioned before, compressed column stores require re-compression from time to time to achieve the best compression ratios and performance. The low modification rate seen in Figure 2 shows that it is possible to apply light-weight compression to data used in enterprise applications.

C. Sparse Enterprise Data

A common assumption is that all data stored in enterprise applications is always used together and that the data tends to be highly distinct. In this section we show that enterprise data is typically sparsely distributed and that only a narrow set of attributes is used together. To validate our assumption we analyzed the occurrence of distinct values per attribute in the main tables of the financial accounting and sales order processing applications in different customer systems. In a large enterprise system the accounting document has 98 attributes, while the corresponding line item contains 301 attributes. The material movements table consists of 180 columns. Figure 3(a) depicts the percentage of occurrence grouped by sparse groups over all tables and customers. It is obvious that most of the columns belong to the last sparse group, which includes a distinct value cardinality of one. This single value can be either a null value or a default value. Consequently, only certain attributes are used and thus of interest to the application. Which attributes belong to the sparse group with only one distinct value varies from company to company, depending on the company's industry. As shown in Figure 3(b), on a material movements table 43 out of 180 columns have a significant number of distinct values. Consequently, around 75% of all columns have a very small relative cardinality of distinct values. The attribute with the highest relative frequency is the movement number column, which is also the key of the table. The other columns with a high distinct value cardinality are the transport request number, quantity, and date and time columns. Striking is that 35% of all columns contain just one single distinct value. As a consequence we can define important application-specific characteristics: firstly, depending on the application and industry, many attributes in an enterprise application have only one distinct value, and as a result such attributes would heavily benefit from light-weight compression techniques. Secondly, the before-mentioned criteria are application specific and not general. Such characteristics must be known by the database management system to allow the best performance and to choose the best optimization techniques.

Figure 3. Sparse Analysis for Financial Accounting (a) and Inventory Management (b) (columns grouped by distinct value cardinality: 1, 2-32, 33-1024, and 1024-100000000)
Figure 4. Data Access with Increasing Stride (CPU cycles per value vs. stride in bytes, XEON; regions labeled L1 Hit, L1 Miss, and L2 Miss)
D. Findings

From the three sets of properties discussed above we can derive that enterprise applications show a behavior which is beneficial for main memory column databases. This includes a low update rate and known domain values, which allow compression to be applied, and set processing, which leverages sequential data access. In the next section we evaluate the above observations using a prototype built with the SAP Business Warehouse Accelerator (BWA). The input for our prototype are the above-mentioned application characteristics, which are then used in a modified SAP ERP system that uses the SAP BWA as its primary data source and thus allows complex OLTP operations and real-time analytics on current data.

IV. COST EVALUATION

In this section we want to show why column stores perform better in enterprise application scenarios than traditional row stores. Our observation is based on the cost model presented in [10] and extended in [19].

A. Random vs. Sequential Value Access

Even though RAM is called Random Access Memory because any value can be accessed randomly, the costs for accessing different memory locations differ due to the multi-level cache hierarchy of modern CPU architectures. To increase bandwidth, multiple values in memory are stored together on cache lines and loaded together. Each cache line can then be located in one of the different caches. If accessed data is not found in the cache closest to the CPU, the CPU incrementally requests the cache line (typically 64 bytes) from higher cache levels until it is loaded from memory. Requesting a value that is not available in the cache is called a cache miss. Cache misses on different levels of the memory hierarchy incur different penalties. Figure 4 shows an experiment in which a constant number of elements is accessed but the stride between them is varied to simulate the different access patterns of a row and a column store. Figure 5 shows the same experiment with throughput on the y-axis.
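The following minimal Python sketch mimics the structure of this experiment (a constant number of accesses with a varying, wrapping stride). It is illustrative only: interpreter overhead hides much of the cache behavior that a native measurement, such as the one shown in Figures 4 and 5, exposes, and the array and parameters are invented for the example.

```python
import array
import time


def strided_access_time(data, accesses, stride):
    """Touch `accesses` elements with a fixed stride, wrapping around the array."""
    n = len(data)
    index = 0
    total = 0
    start = time.perf_counter()
    for _ in range(accesses):
        total += data[index]
        index = (index + stride) % n
    return time.perf_counter() - start, total


# 64 MB of 8-byte integers; strides are given in elements (8 bytes each).
data = array.array('q', range(8 * 1024 * 1024))
for stride in (1, 8, 64, 512, 4096, 32768):
    elapsed, _ = strided_access_time(data, accesses=1_000_000, stride=stride)
    print(f"stride {stride:6d} elements: {elapsed * 1e9 / 1_000_000:8.1f} ns per access")
```

Sequential access (stride 1) consumes every loaded cache line completely, while large strides turn each access into a fresh cache line request, which is the effect the cost model below quantifies.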
Figure 5. Data Access with Increasing Stride - Throughput (throughput in GB/s vs. stride in bytes, XEON)
The results from this experiment clearly show the different levels of cache and, more importantly, the latency of data requests that incur cache misses. The overall goal is to minimize the number of cache misses, since they translate directly into CPU cycles spent waiting for the requested data. In addition to the sequential-reading analysis, the experiment shows another important property: the resulting graph can be used to show the difference between purely random and sequential access. Sequential reading has no stride, while purely random access can be simulated using a very high stride, where the stride spans more than the size of one cache level.

B. Cost Model

In the following we explain, using an enterprise application example, how the access costs for a given physical layout (rows or columns) can be calculated; for simplicity we model only one cache level. In our example we use a simplified query from the inventory management application: SELECT SUM(MENGE) FROM MSEG WHERE MATNR = $1. The translated query plan for this query is:
1) Scan MATNR for all matching rows (sequential scan)
2) Aggregate on MENGE (conditional read)
3) Write output
Using the cost model presented in [19], these basic plan elements translate into the following formal description:

s_trav ⊙ s_trav_cr ⊕ s_trav    (1)
where s_trav is a sequential traversal of a data region and s_trav_cr is a sequential traversal with conditional read, reconstructing only those values for which the condition holds. The scan and the aggregation are executed in parallel, and the result is written after the scan finishes. Writing the result is modeled using a write-through strategy, incurring an additional cache miss. For both the column store and the row store, the overall cost in terms of cache misses for this query plan is:
Cost(Q) = M(s_trav) + M(s_trav_cr) + M(s_trav)    (2)

Even though the first s_trav and the s_trav_cr are executed in parallel, both operations do not depend on the size of the cache, since no cache line is reused, and the overall cost can be calculated by adding the individual results. Now for each physical layout we calculate the costs. The cost model we use is based on calculating the misses on data regions. A region is determined by its width and its number of tuples (R.w, R.n). Furthermore, the width of a cache line (B) and the width of the read data (u) are important.

s_trav: when sequentially scanning a region R, all tuples are traversed and u values are read. The number of misses depends on the width of the gap between the accessed parts of a tuple (R.w − u): if the gap is smaller than a cache line (R.w − u < B), every cache line contains data that has to be read and no cache line can be skipped; if the gap is greater than a cache line, some lines may be skipped.
Figure 6. Cache misses for SELECT SUM(MENGE) FROM MSEG WHERE MATNR=$1 (cache misses vs. selectivity 0..1 for column store, row store, and reordered row store)
M(s_trav(R)) = (R.w · R.n) / B    if R.w − u < B
M(s_trav(R)) = R.n · ⌈u / B⌉      otherwise          (3)
s_trav_cr: a region R is scanned, but the u values are read only if a condition holds. Assuming equally distributed values in a column, the probability that a cache line has to be read is the probability that one or more of the 16 values on it (assuming B = 64 bytes) have to be read. The probability for each value to be accessed is the selectivity; other value distributions based on customer data analysis are currently work in progress.

P = 1 − (1 − selectivity)^B    (4)
|R|_B determines the number of cache lines the region R spans and can be calculated as:

|R|_B = (R.w · R.n) / B    (5)

Hence we calculate the cache misses as:

M(s_trav_cr) = P · |R|_B    (6)
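To make the composition of equations (2) to (6) concrete, the following is a minimal sketch that plugs in the parameters used in the comparison below (80 million tuples, 4-byte domain-encoded values, 64-byte cache lines). Two assumptions of mine are labeled in the comments: the exponent of P is taken as the number of values per cache line (B/u = 16, following the "16 values" remark above), and writing the single aggregate result is charged as one additional miss.

```python
import math

B = 64           # cache line width in bytes
U = 4            # width of the values that are actually read, in bytes
N = 80 * 10**6   # tuples in the MSEG-derived table used in the comparison below


def m_s_trav(region_width, tuples, u=U, b=B):
    # Equation (3): cache misses of a sequential traversal over a region.
    if region_width - u < b:
        return region_width * tuples / b
    return tuples * math.ceil(u / b)


def m_s_trav_cr(region_width, tuples, selectivity, u=U, b=B):
    # Equations (4)-(6): conditional read with uniformly distributed matches.
    # Assumption: the exponent is the number of values per cache line (b // u).
    p = 1 - (1 - selectivity) ** (b // u)
    region_lines = region_width * tuples / b        # |R|_B, equation (5)
    return p * region_lines                          # equation (6)


def cost_column_layout(selectivity):
    # Equation (2): scan the 4-byte MATNR column, conditionally read the
    # 4-byte MENGE column, then write the output (assumed: one extra miss).
    # A row layout would instead be modeled as a single ~400-byte-wide region.
    return m_s_trav(4, N) + m_s_trav_cr(4, N, selectivity) + 1


for sel in (1e-6, 1e-4, 0.0175, 0.1, 1.0):
    print(f"selectivity {sel:>8}: ~{cost_column_layout(sel):,.0f} cache misses")
```

At the selectivity of 0.0175 derived below, the conditional read adds only a fraction of the misses of the initial column scan, which is why the column layout stays far below the row layout in Figure 6.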
C. Comparison

Using formula (2) we can model the access cost for the inventory management query. For our comparison we use a customer-derived table containing material movement events with 80 million tuples. The table MSEG has ≈ 100 columns, each 4 bytes wide (i.e. domain encoded). When modeling a column store this yields 100 regions, each with R.n = 80 · 10^6 and R.w = 4. For the row store this yields one region with R.n = 80 · 10^6 and R.w = 400. In our model experiment we vary the selectivity from 0 to 1. Figure 6 clearly shows the advantage of a column store over a row store. To compare this to the behavior of a row store in memory we have to separate two different cases: firstly, the distance between the MENGE and MATNR attributes spans at least one cache line; secondly, MATNR and MENGE are located on the same cache line due to reordering of the physical attributes.
In the first case, at least one cache line is touched for every tuple, and a second cache line is touched for every tuple where the predicate matches. In the second case, the cache line on which the predicate is evaluated always also contains the data to be aggregated. Especially in low-selectivity cases, the main memory column store shows a clear advantage. The importance of low selectivity can be validated by comparing the results of the evaluation with enterprise application characteristics: in the analysis of customer data we see that out of 80M material movements, a single material generates at most 1.4M material movements, which corresponds to a selectivity of 0.0175.

V. CUSTOMER STUDY

To validate our findings from the customer data analysis and to apply the results of the evaluation, we implemented a prototype based on an SAP ERP system following the proposed approach, focusing on financial accounting and inventory management. The most important changes are the following:
• Redundancy-free data schema: typically, ERP systems rely on a set of materialized aggregates to supply real-time numbers ad hoc. In our prototype we replaced all materialized views holding precomputed aggregates and calculate the required numbers on-the-fly, exploiting the advantages of main-memory column stores (a sketch of this on-the-fly calculation follows after this list).
• Insert-only approach: due to the simpler data schema and the removal of all materialized and thus redundant data, it becomes possible to apply an insert-only approach. The clear advantage of this approach is that all modifications are maintained in an auditable form without any further changes to the application, and it is always possible to run reports on historical data even in the OLTP system.
• Stored procedures for elementary business functionality: to facilitate the access patterns of main memory databases, we identified elementary business logic and moved it as close as possible to the database. Such logic includes, but is not limited to, computing the balance of an account, determining the overdue items in financial accounting, or calculating the stock based on material movements.
• Integrated analytics: to extend the application with the flexibility required by dashboard or object worklist applications, we embedded the required analytical queries directly in the enterprise application, as the underlying data management supports this efficiently.
The results of the project showed that it is indeed possible to run major parts of financial accounting directly on top of a main memory column database. The prototype was able to generate the balance sheet directly from all single postings on a specific account, and it executed the overdue-items run in 3 seconds instead of the 20 minutes needed by the old system, based on the new database technology. It is important to mention that achieving this required extracting, analyzing, and interpreting the application characteristics.
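As an illustration of the redundancy-free, on-the-fly approach referenced in the first bullet above, the following minimal sqlite3 sketch computes a stock level directly from material movement events instead of maintaining a materialized aggregate. The table and the MATNR/MENGE columns follow the paper's MSEG example; the plant column name (WERKS), the sample data, and the sign convention for goods issues are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MSEG (MATNR TEXT, WERKS TEXT, MENGE INTEGER)")

# A few material movement events: positive quantities are goods receipts,
# negative quantities are goods issues (sign convention assumed here).
movements = [
    ("MAT-001", "PLANT-A", 100),
    ("MAT-001", "PLANT-A", -30),
    ("MAT-002", "PLANT-A", 50),
    ("MAT-001", "PLANT-B", 20),
]
conn.executemany("INSERT INTO MSEG VALUES (?, ?, ?)", movements)


def stock_level(material, plant):
    # No materialized aggregate: the stock is derived from the events on demand.
    row = conn.execute(
        "SELECT COALESCE(SUM(MENGE), 0) FROM MSEG WHERE MATNR = ? AND WERKS = ?",
        (material, plant),
    ).fetchone()
    return row[0]


print(stock_level("MAT-001", "PLANT-A"))   # 70
```

Because the aggregate is never stored, inserts remain simple appends and any granularity (per plant, per period, per batch) can be queried without predefining it.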
Table I. Storage Size Comparison

Description                      SAP BWA           MySQL
Material Movements               10.1GB            51GB
Material Movements w/ indices    10.7GB (+5.9%)    60.4GB (+18.4%)
A. Inventory Management – Evaluation

As an example extracted from the above prototype, we provide a detailed analysis that shows different application-specific characteristics and how they can be applied and interpreted together with different performance evaluation experiments. The example is taken from inventory management, as already considered in Section IV. The general structure of the participating tables in inventory management follows a very simple header-item principle, where one movement header contains many material movement line items. One header could, for example, be mapped to one order, and each line item to one line item of this order. In contrast to existing systems, we do not store material movement aggregates that accumulate all material movements on a certain granularity level. The advantages are two-fold: firstly, this reduces complexity because those aggregates no longer need to be maintained, and secondly, it reduces the amount of data stored.

The data set we are working with is extracted from a live customer system containing 80M material movements with 117,869 different materials. Figure 7(a) shows the distribution of the given material movements. One important conclusion is that only 10% of all materials generate more than 80% of all material movements. Optimizations assuming evenly distributed movements would therefore lead to wrong results. The data layout and the query processing must be optimized towards reading sequentially from memory. This example shows how beneficial main memory column stores can be. To evaluate the achievable performance when using a main memory column store, we tested the query execution both using the SAP prototype and using a standard MySQL database. The goal of this benchmark is to measure the performance when calculating the stock level for a given material and stock location based on the material movements. Figure 7(b) shows the results of the experiment, and Figure 7(c) shows the direct speedup when comparing both results. As a last step we analyzed the impact of using an index for the stock calculation queries. The first evaluation was to compare the size of the stored data in MySQL and SAP BWA, as shown in Table I.
Figure 8. Benefit of Using an Index for Stock Calculation (index speedup factor, exec(noindex)/exec(index), vs. number of rows selected)
SAP BWA uses an inverted index structure [20], and we used the default indices provided by MySQL. As shown in Figure 7(b), even with indices SAP BWA is faster than the conventional row store. The last part of the evaluation compares the benefit of using an index for SAP BWA depending on the number of rows touched by the stock calculation. Figure 8 shows the result of this experiment. For a very small number of touched tuples the index benefit is very high (about 100x), while for larger numbers of tuples the benefit is almost gone. From the customer data analysis we derived that the materials that generate the most movements will have around 1M to 2M material movements, based on the 80M data set, and will therefore not benefit from the available index. However, since only ≈ 6% more space is required to store the indices, it is advisable to keep them to efficiently support lookups of single tuples.

B. Customer Study Summary

To summarize the results of our customer validation: it is possible to build ERP applications on top of a main memory database. Our customer data analysis reveals that the characteristics of enterprise applications match those of main memory column databases. An important aspect of our observations is the significant reduction in the complexity of the application database schema. In the old system, applications tend to materialize data or aggregates in secondary tables and thus increase the overall database size. The goal of such materializations is to overcome the performance bottleneck of the old row-based storage layer. Using main memory column stores, such redundant data can be removed, saving not only the additional space but also the complexity of transaction handling, since the data no longer needs to be copied into the secondary storage. Furthermore, flexibility is increased due to the lack of predefined granularities. Hence, applications can take advantage of this flexibility, for example by executing dynamically generated queries based on the actual input of the transaction to provide faster and better process handling.
Figure 7. (a) Materials mapped to Material Movements (percentage of associated movements vs. top x percent of all materials) (b) Stock Level Calculation based on Material Movements (time in ms vs. number of aggregated rows, for MySQL, MySQL w/ index, SAP BWA, and SAP BWA w/ index) (c) Direct Speedup of SAP BWA vs. MySQL (vs. number of aggregated rows)
VI. RELATED WORK

As mentioned in Section II, much of the related work belongs to specialized databases. This research area covers main memory based databases as well as physical column-oriented data representation. The first topic was addressed early in [7], while the latest research leverages main memory data processing for certain application areas. The authors of [3] and [21] describe the use of main memory databases for transactional workloads while still relying on row-wise data storage. In contrast, the work around MonetDB [8] focuses on in-memory processing and binary association tables that store data fully vertically partitioned, optimized towards analytic-style queries. The idea of column store databases has been implemented in multiple academic projects, for example the main-memory based MonetDB [8] or C-Store as a disk-based variant [22]. Ramamurthy et al. describe in [23] a DBMS design for handling workloads of different characteristics by introducing mirrors with different physical storage layouts, as do the authors of [24].

Another area of related work concerns the capabilities of real-time reporting and operational reporting. The latter is a typical application of the emerging field of Active Warehousing [25], which aims at supporting tactical decisions as opposed to strategic ones. Tactical decisions usually need up-to-the-moment information. This is where the concept of the operational data store (ODS) comes into play. Unlike data warehouses, which contain rather static data, the contents of an ODS are updated during the course of operations, for example in order delivery scenarios. An ODS is designed to quickly perform relatively simple queries on small amounts of data, such as finding the status of a customer order, rather than the complex queries on large amounts of data typical for a data warehouse. In [26] the ODS is outlined as a hybrid structure with characteristics of both the data warehouse and transactional systems. The ODS provides users with OLTP response time, update capabilities, and DSS capabilities. In contrast to data warehouses, only a small amount of data is stored and kept for only a short period of time.
Since our work has shown the feasibility of working on events (i.e. transactional items) instead of precomputed aggregates, the features required by operational queries can be handled efficiently. Furthermore, the work of Grund et al. [27] presents a concept for leveraging a technical insert-only approach to keep track of data changes in the transactional data store, which enables reporting on historical data. The research around materialization of data, as for example proposed in [28], [29], is orthogonal to our work, as we avoid data redundancy in order to get rid of the maintenance overhead and the lack of flexibility caused by predefinition, by leveraging modern hardware. With the redundancy eliminated, pre-computation is no longer needed.

VII. SUMMARY AND OUTLOOK

In this paper we presented a new approach to in-memory based data management for enterprise applications. By analyzing customer applications and, even more important, customer data, we were able to deduce enterprise application characteristics. Following those characteristics, we built a prototype based on an SAP ERP Financial Accounting application that runs directly on a main memory column-store database. Furthermore, we showed how column stores may outperform general purpose row-based databases, both in a theoretical evaluation and by example, using logic from an inventory management application. The most important findings based on the customer analysis are: enterprise data is sparse data with a well-known value domain and a relatively low number of distinct values, and enterprise applications most of the time present data by building a context for a view, while modifications to the data happen only rarely. Based on those characteristics, which map perfectly to main memory column stores, it is possible to build better enterprise applications with leaner database schemas and more functionality. The possibility to perform operational reporting on current event data allows better and faster decision making, since no intermediate transformation into a reporting system is required.
During our validation we experimented with enriching the default overdue-items run of the financial accounting system with customer separation queries that allow a better separation of customers. Fixed customer configurations for dunning levels may no longer be necessary and can be replaced by rules based on real process logic. Current rack servers already provide more than 24 cores and about 1TB of main memory; in such environments it becomes necessary to rethink how parallelization is applied. We see different scenarios here based on workload and priorities, but most importantly based on enterprise characteristics directly derived from the consumers of the data. Furthermore, we plan to map the results of our evaluation to multi-tenancy and cloud-based main memory systems. We see that cloud-based main memory databases allow the best scaling results for multi-tenancy and therefore promise the best overall TCO.
REFERENCES

[1] J. Krueger, M. Grund, C. Tinnefeld, H. Plattner, A. Zeier, and F. Faerber, "Optimizing Write Performance for Read Optimized Databases," in DASFAA (to appear), 2010.
[2] M. Stonebraker and U. Çetintemel, ""One Size Fits All": An Idea Whose Time Has Come and Gone," in ICDE, 2005.
[3] M. Stonebraker, S. Madden, D. J. Abadi, S. Harizopoulos, N. Hachem, and P. Helland, "The End of an Architectural Era (It's Time for a Complete Rewrite)," in VLDB, 2007.
[4] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels, "Dynamo: Amazon's Highly Available Key-Value Store," in SOSP, 2007.
[5] H. Plattner, "A Common Database Approach for OLTP and OLAP Using an In-Memory Column Database," in SIGMOD Conference, 2009.
[6] H. Garcia-Molina and K. Salem, "Main Memory Database Systems: An Overview," IEEE Trans. Knowl. Data Eng., vol. 4, no. 6, 1992.
[7] D. J. DeWitt, R. H. Katz, F. Olken, L. D. Shapiro, M. Stonebraker, and D. A. Wood, "Implementation Techniques for Main Memory Database Systems," in SIGMOD Conference, 1984.
[8] P. A. Boncz, S. Manegold, and M. L. Kersten, "Database Architecture Optimized for the New Bottleneck: Memory Access," in VLDB, 1999.
[9] A. Ailamaki, D. J. DeWitt, M. D. Hill, and D. A. Wood, "DBMSs on a Modern Processor: Where Does Time Go?" in VLDB, 1999.
[10] S. Manegold, P. A. Boncz, and M. L. Kersten, "Generic Database Cost Models for Hierarchical Memory Systems," in VLDB, 2002.
[11] G. P. Copeland and S. Khoshafian, "A Decomposition Storage Model," in SIGMOD Conference, 1985.
[12] C. D. French, ""One Size Fits All" Database Architectures Do Not Work for DSS," in SIGMOD Conference, 1995.
[13] N. R. Mahapatra and B. Venkatrao, "The Processor-Memory Bottleneck: Problems and Solutions," Crossroads, vol. 5, no. 3, 1999.
[14] G. V. Cormack, "Data Compression on a Database System," Commun. ACM, vol. 28, no. 12, 1985.
[15] T. Westmann, D. Kossmann, S. Helmer, and G. Moerkotte, "The Implementation and Performance of Compressed Databases," SIGMOD Record, vol. 29, no. 3, 2000.
[16] D. J. Abadi, S. R. Madden, and M. Ferreira, "Integrating Compression and Execution in Column-Oriented Database Systems," in SIGMOD Conference, 2006.
[17] D. J. Abadi, D. S. Myers, D. J. DeWitt, and S. Madden, "Materialization Strategies in a Column-Oriented DBMS," in ICDE, 2007.
[18] P. A. Boncz, M. Zukowski, and N. Nes, "MonetDB/X100: Hyper-Pipelining Query Execution," in CIDR, 2005.
[19] H. Pirk, M. Grund, J. Krueger, U. Leser, and A. Zeier, "Cache Conscious Data Layouting for In-Memory Databases," Hasso Plattner Institute (to appear), 2010.
[20] F. Transier and P. Sanders, "Compressed Inverted Indexes for In-Memory Search Engines," in ALENEX, 2008.
[21] S. K. Cha and C. Song, "P*TIME: Highly Scalable OLTP DBMS for Managing Update-Intensive Stream Workload," in VLDB, 2004.
[22] M. Stonebraker, D. J. Abadi, A. Batkin, X. Chen, M. Cherniack, M. Ferreira, E. Lau, A. Lin, S. Madden, E. J. O'Neil, P. E. O'Neil, A. Rasin, N. Tran, and S. B. Zdonik, "C-Store: A Column-oriented DBMS," in VLDB, 2005.
[23] R. Ramamurthy, D. J. DeWitt, and Q. Su, "A Case for Fractured Mirrors," in VLDB, 2002.
[24] J. Schaffner, A. Bog, J. Krueger, and A. Zeier, "A Hybrid Row-Column OLTP Database Architecture for Operational Reporting," in BIRTE 2008, in conjunction with VLDB'08, 2008.
[25] A. Karakasidis, P. Vassiliadis, and E. Pitoura, "ETL Queues for Active Data Warehousing," in IQIS, 2005.
[26] W. H. Inmon, Building the Operational Data Store. New York, NY, USA: John Wiley & Sons, Inc., 1999.
[27] M. Grund, J. Krueger, C. Tinnefeld, and A. Zeier, "Vertical Partition for Insert-Only Scenarios in Enterprise Applications," in IE&EM, 2009.
[28] J. Kiviniemi, A. Wolski, A. Pesonen, and J. Arminen, "Lazy Aggregates for Real-Time OLAP," in DaWaK, 1999.
[29] W. P. Yan and P.-A. Larson, "Eager Aggregation and Lazy Aggregation," in VLDB, 1995.