Vectorizing Database Column Scans with Complex Predicates

Recommend Documents

Lazy Database Access with Persistent Predicates - CiteSeerX

all free variables are represented as null-values, thus, suspension or failure seem to be more ... A query to a persistent predicate can have multiple solutions computed non- .... (Query db sql result) is used for queries that return a result. ... As

Romance complex predicates: phrasal and

Jun 9, 1996 - e. *Luis las insisti o en comer. Full NP complements follow the verb they are ... As is discussed below, `clitic climbing' is a su cient test for complex .... Almost identical facts obtain with se in Spanish. ...... Juan me hizo vender-

Vectorizing images

Page 1 of 6. CorelDRAW tutorial. Vectorizing images. Welcome to CorelDRAW®, a comprehensive vector-based drawing and graphic-design program for.

Lazy Database Access with Persistent Predicates - Semantic Scholar

From the logic programming point of view, a database table can be ..... it the database currydb in a MySQL database on the local machine in the table persons.

Discovery of fuzzy predicates in database

It was created in 2007 by Paulo Cortez y Anibal. Morais in the Minho University, ... [3] M. Berry, M. Linoff and S. Gordon, “Data Mining. Techniques”, John Wiley ...

UNIVERSITY OF CALGARY Persian Complex Predicates: Evidence ...

DEGREE OF MASTER OF ARTS. GRADUATE ... the past couple of years, be it assisting me in my master's applications, giving feedback on my CLA paper, or ...

Romance Complex Predicates: In defence of the

Aug 11, 1997 - If not, I seek still to establish a weaker point, namely to suggest that âThere's more than one way to do itâ (Wall and Schwartz .... tomorrow.

Axiomatizing GSOS with Predicates

Our procedure is implemented in a tool that receives SOS specifications as input and ...... Lecture Notes in Computer Science 5201, Springer-Verlag, Berlin,.

On Complex Predicates in Brazilian Portuguese1 - Revistas ...

PRES.3.PL a ball in-the birthday of fifteen years or regret.PRES.3.PL nÃ£o terem tido um. ..... David Michaels & Juan Uriagereka (eds.), Step by Step â Essays in.

Shallow morphology based complex predicates extraction ... - CiteSeerX

This paper presents the extraction of Complex Predicates (CPs) in. Oriya based on shallow morphology and available seed lists of verbs. Generally Oriya ...

CAUSATIVES AS COMPLEX PREDICATES ... - Stanford University

F (i.e. Ä is monotone) and the Knuth-Bendix completion procedure (Knuth and. Bendix, 1970) succeeds. In such a ..... Technical report, Xerox Corporation, 1996.

El + verb complex predicates in Hungarian - UiT

El + verb complex predicates in Hungarian. Ãva DÃ©kÃ¡ny. CASTL, University of TromsÃ¸. Abstract. This paper investigates the structure of complex predicates com ...

A Semantic Argument for Complex Predicates - CiteSeerX

(3) a. We feared Noriega in power. b. What we feared most was Noriega in power. c. Noriega in ..... found in Shakespeare's Henry VI, part 3: (i). The Theefe doth ...

Machine Translation and Complex Predicates? - Semantic Scholar

predicate formation on the lexical entries of the main verbs. However,. I show that this solution is inadequate for a more general treatment of complex predicates ...

Complex Predicates: Verbal Complexes, Resultative ...

the same status as VPs in incoherent constructions or whether these constructions ...... Go buy some cheap tires for that scene, those inexpensive tires drive.

Complex predicates in Basque Cathryn Donohue - ANU

Basque. In Basque, there are three structural cases, but no case doubling is allowed. .... Grammar (LDG; Kiparsky 1997 and elsewhere; Wunderlich 1997 and.

Database Theory Column Probabilistic Databases

Apr 30, 2008 - The central problem in probabilistic databases is query evaluation. Given a Boolean query q and a probabilistic database, compute the answer ...

Demand Analysis with Partial Predicates

Feb 4, 2006 - analysis â which is closely related to backwards strictness analysis â in a semantic framework of ..... position p, i.e. f(e1,..., en)|i.p = ei|p and t|Ç« = t, with Ç« the empty string. ...... p â¦ f.i(t) = match(t1, t) â§Â·Â·Â·â

Predicate Abstraction with Indexed Predicates

Jul 2, 2004 - col devised by Steven German [German ]. We believe we are the first to ... ting, datatype reduction [Clarke et al. 1992] and symmetry reduction.

Scaling Up Concurrent Main-Memory Column-Store Scans - CiteSeerX

Figure 2: 4-socket server with Ivybridge-EX CPU. Table 1: Local and ... column-store of Microsoft SQL Server [21]. ... new automatic NUMA balancing of Linux, monitor perfor- ..... rent scans for different data placement and task scheduling.

Scaling Up Concurrent Main-Memory Column-Store Scans - CiteSeerX

The key comparative criterion for analytical relational database management systems (DBMS) is their efficiency in servicing numerous clients on big data.

WHAT A DATABASE REALLY IS: PREDICATES AND PROPOSITIONS

Despite the name, a database is best thought of as a repository not just for data, but ... —Chris Date ... as in the first paragraph of the following introduction).

Database Table Column Description submission ID submission ...

Description submission ... should be unique in all submissions from that center of this document .... Short text that can be used to call out experiment records in.

Column-Oriented Database Systems - Computer Science

l The only fundamental difference is the storage layout l However: we need to look at the big picture. VLDB 2009 Tutorial. Column-Oriented Database Systems. 3.

Vectorizing Database Column Scans with Complex Predicates

Download PDF

34 downloads 147 Views 740KB Size Report

Comment

Vectorizing Database Column Scans with Complex. Predicates. Thomas Willhalmâ¡ thomas.willhalm@intel.com. Ismail Oukidâ â¡ [email protected]. Ingo MÃ¼llerâ â.

Vectorizing Database Column Scans with Complex Predicates Thomas Willhalm‡ [email protected]

Ismail Oukid†‡ [email protected]

Ingo Muller ¨ †∗ [email protected]

Franz Faerber† [email protected] ‡

Intel GmbH

†

SAP AG

∗

Karlsruhe Institute of Technology

ABSTRACT The performance of the full table scan is critical for the overall performance of column-store database systems such as the SAP HANA database. Compressing the underlying column data format is both an advantage and a challenge, because it reduces the data volume involved in a scan on one hand and introduces the need for decompression during the scan on the other hand. In previous work [26] we have shown how to accelerate the column-scan with range predicates using SIMD instructions. In this paper, we present a framework for vectorized scans with more complex predicates. One important building block is the In-List predicate, where all rows whose values are contained in a given list of values are selected. While this seems to exhibit only little data parallelism on first sight, we show that a performant vectorized implementation is possible using the new Intel AVX2 instruction set. We also improve our previous algorithms by leveraging the increased vector-width. Finally in a detailed performance evaluation, we show the benefit of these optimizations and of the new instruction set: in almost all cases our scans needs less than one CPU cycle per row including scans with In-List predicate, leading to an overall throughput of 8 billion rows per second and more on a single core.

1.

Figure 1: The codewords of a column are not aligned to machine word boundaries. They may be spread across several machine words and share their machine word(s) with other codewords.

thereby optimizes the usage of the execution units in the core [1, 8, 13]. But even within a single thread, several instructions can be executed inside a single clock cycle. The focus of this paper lies on SIMD, where a Single Instruction can perform on Multiple Data elements. SIMD instructions also benefit from multiple execution ports and out-of-order execution and can take advantage of multiple cores by parallel execution. For the evaluation of our algorithms, we use the Intel Advanced Vector Extensions 2 (Intel AVX2), the implementation of SIMD instructions in the current 4th generation Intel Core architecture. They feature a vector length of 256 bits and implement vector-vector shift and gather instructions, which play a fundamental role for the performance of our routines. These SIMD instructions work on vectors of machine words such as 8, 16, 32, or 64bit integers. However if a column is compressed, its values are not materialized in this format in RAM. Instead, they may be stored in the commonly used combination of domain encoding and bit-compression: Each of the n distinct values of a column gets assigned an integer code between 0 and n − 1 and each value of the column is replaced by its codeword. The mapping between real values and codewords is stored in a different datastructure. The sequence of codewords is stored in a bit stream where every codeword just uses a fixed number of b = dlog ne bits. Figure 1 shows an example of a column encoded in such a bit stream. As the codewords do not start at machine word boundaries, some unpacking logic is necessary before the evaluation of the scan predicate.

INTRODUCTION

Today’s processors implement a rich set of techniques to accelerate the performance of applications. Most prominent is the use of multiple cores per processor, where each core can execute an independent thread. Additionally, several vendors implement simultaneous multi-threading, where a single core executes two or more threads concurrently and

1

The most common scans search values using an equality predicate, i.e., =, 6=, >, ≥,