Applying Rules for Partitioned Parallelism in OODBMS ... - CiteSeerX

Applying Rules for Partitioned Parallelism in OODBMS within an Optimizer Generator Framework

Carlo Giovano S. Pires

Javam C. Machado

Computer Science Department – Federal University of Ceara email: {giovano/javam}@lia.ufc.br

Abstract

This work presents a rule-based approach for declarative query optimizer generation considering parallel execution in object-oriented databases. The main goal of this work is to provide a framework that can capture relevant aspects of parallel query optimization in a declarative way, combining procedural techniques with the advantages of rule processing. One of those techniques is used for determining repartitioning and selecting the algorithms that evaluate the operators in a query tree, considering the trade-offs between processing costs and repartitioning costs. This technique was proposed as a procedural algorithm and was adapted to be used as rules in the context of object-oriented database optimizers. Another algorithm, used for query operator reordering in object-oriented databases, was adapted to consider repartitioning costs and was added to the framework. Finally, a new module for processing parallelism extraction rules was added to the framework, providing better support for inter-operator parallelism optimization techniques.

1 Introduction

Object-oriented database management systems have been used to support nonconventional applications. New web-based applications designed for the Internet require databases to store Java objects and large, complex contents such as XML structures. In addition, these applications require a database server able to handle a high load of requests from a large number of web users. The decrease in hardware costs and improvements in network technologies make parallel database architectures a viable alternative for improving OODBMS performance and scalability.

Optimizer generator tools have been proposed with the aim of structuring and simplifying the construction and prototyping of optimizers for new DBMSs, including the ones supporting the relational model. In general, they receive some high-level specification of the optimizer and generate the optimizer code. Those tools are very useful for object-oriented and parallel databases, due to the lack of a standard object-oriented algebra and the need to prototype new algorithms for distinct parallel architectures. EXODUS [7], VOLCANO [8], CASCADES [6] and OPTGEN [2] are some important examples of those tools. They use the concept of modules to provide the independence of the search strategies, together with logical and physical transformation rules.

The aim of this work is to combine query optimization techniques designed for parallel and object-oriented databases with the facilities provided by rule-based optimizer generators. The OPTGEN system was chosen among the others, since it has already been used to generate optimizers for an object-oriented algebra with parallelism extraction [13], and also due to its integration with the open-source and OQL-compliant object-oriented database Lambda-DB [1]. This system also provides the OPTL language [2] for specifying optimizers in a declarative way. In our approach, the original physical rules and the operator reordering algorithm used in the OPTL specification of Lambda-DB's optimizer are modified in order to take repartitioning costs into account. The procedural technique for optimizing queries with partitioned parallelism [9] is implemented using a rule-based approach, making it easier to reuse rules and to modify conventional optimizers to support parallel query execution.

This paper is organized as follows. First, we briefly discuss related work. Then we describe our approach for selecting operators and partitioning attributes with respect to repartitioning costs, as well as the way we implemented it in the framework for a parallel object algebra. In section 4 we show how OPTL rules can be used for selecting the operators and partitions to be used in a query plan. We also show how operator reordering with respect to repartitioning costs works. Finally, we conclude with the future steps we are planning to take.

2 Related Work

In [15], an optimizer generator tool was used to implement a parallelism extraction stage that annotates query trees with information regarding parallelism opportunities. It combines the parallelism extraction approach presented in [9] with a rule-based optimizer generator.

Another important related work was reported in [9]. It proposes a division of the query optimization phase. First, the optimizer should select the algorithms for implementing operators and reorder join operators with respect to repartitioning costs. This phase is named join ordering and query rewriting (JOQR). Next, parallelism extraction should be done and, finally, the scheduler should allocate resources for evaluating the plan (the parallelization phase). The work presented in [15] covers the parallelism extraction task. Their approach concerns only independent and pipeline parallelism; the opportunities for partitioned parallelism are not analyzed.

Our approach extends the technique described in [9], combining the JOQR phase with a rule-based optimizer generator, so that we are able to identify opportunities for exploiting partitioned parallelism. In addition, we add a new module to the optimizer generator that provides better support for the parallelism extraction proposed in [15], where this extraction is based only on the strategies used in the physical plans. Yet this phase can be transparent to strategy selection and operator reordering. In this new module, rules split physical operators into operators from a parallel physical algebra for inter-operator parallelism described in [14]. This paper details only our framework for the exploration of partitioned parallelism within a rule-based optimizer generator.

The algorithm proposed in [9] for join reordering is exponential. It works for relational queries, which usually have no more than ten joins. But queries in object-oriented databases lead to a higher number of joins. This is due to the translation of path expressions into pointer-joins [12]. In [3], an algorithm was proposed for operator reordering. It is called GOO-OO (Greedy Operator Ordering - Object-Oriented) and is based on a heuristic that evaluates first the joins with lower cost. This algorithm works for a high number of joins and also reorders nest and unnest operators, but it was proposed for sequential execution and thus does not work properly for parallel execution, where repartitioning costs can be higher than operator processing costs. The work presented here modifies the GOO-OO algorithm to consider repartitioning costs.

3 Query Rewriting with Repartitioning

The partitioning of data across distinct processing nodes can result in a high degree of parallelism and performance. The partitioning can be done by applying some function over selected attributes of the class extent. But, depending on the query plan and data distribution, it might be necessary to transfer the data from its original processing node to some other node. The cost of this repartitioning process may exceed the cost of the operator's local processing.

One factor that influences repartitioning costs is the choice of strategies to evaluate operators, together with their input and output constraints. For instance, a NESTED LOOP strategy has an input constraint that requires its input relations to be partitioned by the join attribute. If either one or both of them do not satisfy the input constraint, then they should be repartitioned by the join attribute. The cost of that process may exceed the local cost of NESTED LOOP. Thus, the cost of a query tree is the sum of the costs of all its operators, while the cost of an operator consists of the cost of repartitioning its inputs according to the strategy's input constraints plus the cost of the strategy itself.

The JOQR phase proposed in [9] should minimize the total cost of a query tree as follows:

1. Let Optc(i, A) be the minimal cost of the subtree rooted at node i such that i has the output constraint A satisfied, and let OptcStrategy(i, A) be the strategy that achieves this value. Then, for each node i in postfix order, compute Optc(i, A) and OptcStrategy(i, A) for each partition A in the set of all possible attribute partitions. The computation of Optc and OptcStrategy should be done according to the following recurrence:

$$\mathrm{Optc}(i, A) = \min_{s \in S} \Big[ \mathrm{StrategyCost}(s, R_1^s, \ldots, R_k^s) + \sum_{j=1}^{k} \min_{p \in P} \big[ \mathrm{Optc}(\alpha_j, p) + \mathrm{repartition}(R_j^s, p, \mathrm{inpCol}(s, A, j)) \big] \Big] \qquad (1)$$

where P is the set of partitions and R_m^s denotes statistics for strategy s and relation m. We have simplified this recurrence to consider only the partition property, but in [9] one can find details on how to work with a set of physical properties.

2. Let r be the root of the tree. Select the lowest value of Optc for r and the partition that produces it. This is the optimal partition and the corresponding OptcStrategy is the optimal strategy.

3. For each non-root node in prefix order, compute the optimal partitions and strategies by applying recurrence (1) in reverse.

[9] suggests the use of the JOQR phase in two ways:

(a) Use the phase as a postpass to a conventional optimizer.
(b) Use the phase as a replacement for a conventional optimizer.

We propose the use of the JOQR phase in the context of a rule-based framework for generating optimizers. With this approach we intend to reuse the technique across several optimizer specifications. The optimizer search strategy uses the following to support JOQR:

• Plans produced for each partitioning attribute in a sub-term are used in the consumer term to iterate over all possible input partitionings and then estimate the repartitioning cost.
• Plans produced for each physical operator in a sub-term are used in a term to iterate over all possible strategies.
• The plan with the lowest cost is the plan with the lowest Optc among all partitioning attributes, so it is the optimal plan.
• After the best plan has been selected, the EXCHANGE operator is inserted between two connected terms with different partitioning attributes. The repartitioning costs, estimated previously in the producer terms, are incurred in the EXCHANGE operator processes. Note that it works as a "glue" operator just to provide the repartitioning process.
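The three JOQR steps and the EXCHANGE insertion described above can be illustrated with a minimal Python sketch. It is not the implementation of [9] nor the generated optimizer code. The classes `Strategy` and `Node`, the functions `optc` and `build_plan`, all sizes and costs, and the cost model are hypothetical. In particular, the sketch assumes that inpCol(s, A, j) is always A, that repartitioning a result costs its size, and it uses a simplified two-extent plan without the SORT operator.

```python
import math
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional


@dataclass
class Strategy:
    name: str
    local_cost: float                         # StrategyCost(s, ...), assumed precomputed
    outputs: Optional[FrozenSet[str]] = None  # partitions it can deliver; None = any


@dataclass
class Node:
    name: str                                 # must be unique within the tree
    size: float                               # estimated result size
    strategies: List[Strategy]
    children: List["Node"] = field(default_factory=list)


def repartition(size, have, want):
    # Assumption: moving a result costs its size; no move, no cost.
    return 0.0 if have == want else float(size)


def optc(node, partitions, memo):
    """Bottom-up pass: memo[(node, A)] = (cost, strategy, chosen input partitions)."""
    for child in node.children:
        optc(child, partitions, memo)
    for A in partitions:
        best = (math.inf, None, ())
        for s in node.strategies:
            if s.outputs is not None and A not in s.outputs:
                continue                      # strategy cannot deliver partition A
            total, picks = s.local_cost, []
            for child in node.children:
                # Assumption: inpCol(s, A, j) == A for every input j.
                cost, p = min(
                    (memo[(child.name, p)][0] + repartition(child.size, p, A), p)
                    for p in partitions
                )
                total += cost
                picks.append(p)
            if total < best[0]:
                best = (total, s.name, tuple(picks))
        memo[(node.name, A)] = best


def build_plan(node, A, memo, depth=0):
    """Top-down pass: emit the chosen plan, adding EXCHANGE where partitions differ."""
    _, strategy, picks = memo[(node.name, A)]
    lines = ["  " * depth + f"{strategy} {node.name} [{A}]"]
    for child, p in zip(node.children, picks):
        d = depth + 1
        if p != A:
            lines.append("  " * d + f"EXCHANGE [{p} -> {A}]")
            d += 1
        lines.extend(build_plan(child, p, memo, d))
    return lines


P = ["EmpNumber", "OidDept"]
dept = Node("Dept", 100, [Strategy("Table_scan", 100, frozenset({"OidDept"}))])
emp = Node("Emp", 1000, [Strategy("Table_scan", 1000, frozenset({"EmpNumber"}))])
join = Node("Emp-Dept", 1000,
            [Strategy("Merge_join", 1100, frozenset({"OidDept"}))], [dept, emp])
root = Node("result", 1000, [Strategy("Reduce", 50)], [join])

memo = {}
optc(root, P, memo)                                        # step 1
best_cost, best_partition = min((memo[(root.name, A)][0], A) for A in P)  # step 2
plan = build_plan(root, best_partition, memo)              # step 3 + EXCHANGE
print(best_cost, best_partition)
print("\n".join(plan))
```

With these made-up numbers the cheapest root partition is OidDept, and the only EXCHANGE lands between the merge join and the scan of Emp, whose extent is partitioned by EmpNumber.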

Reduce (Oid(Dept))
  Merge_join (Oid(Dept))
    Table_scan Dept (Oid(Dept))
    [data repartitioning]
      Sort (EmpNumber)
        Table_scan Emp (EmpNumber)

Figure 1: Query tree

Figure 1 shows a physical plan with repartitioning for the following OQL query:

    select struct(NUM: e.EmpNumber, NAM: e.EmpName, D: e.dept.DeptName)
    from e in Emp

Extents Emp and Dept are originally partitioned by EmpNumber and OID(Dept) (the Object Identifier of each Dept instance), respectively. A parallel pointer-join version of the MERGE JOIN strategy requires, as an additional condition for the input-output constraints, the operands to be partitioned by the join attribute. Thus, the result of SORT should be repartitioned.

Table 1 shows the input-output constraints for the parallel object-oriented algebra described in [14]. The first line, for instance, shows that if the optimizer requires the result produced by NESTED LOOP to be partitioned by X, then both inputs should be partitioned by X and the join predicate should be on X. In a rule-based context, the notation means that the rule for the corresponding strategy should have no restriction about the attributes for partitioning. This kind of rule should generate plans for all possible partitions.

Strategy      Output  Input1  Input2  Additional requirements
Nested-Loop   X       X       X       Join predicate on X
Merge-Join                            Join predicate on X
Nest                                  X is a Nesting attribute
Groupby                               X is a Grouping attribute
Table-Scan                            X is the partition attribute of the extent
Index-Scan                            X is the partition attribute of the extent
Map                                   X is the attribute unveiled by the function
Materialize                           X is the path attribute
Reduce
Sort
Unnest
Union

Table 1: Input-Output Constraint

From Table 1, we define the following classification of the physical algebra operators according to the rule pattern that should be defined for optimizing plans with partitioned parallelism:

• Base Operators: Operators that access the extents. They are the base of the recurrence and their costs are estimated based on extent information in the catalog. They are Table-Scan and Index-Scan.
• Constrained Operators: Operators that run according to a specific partition. If the strategy has an additional condition for the input-output constraints, then the respective physical rule generates only one plan according to the condition. The operators are Nested-Loop, Map, Merge-Join, Nest, Groupby and Materialize.
• Free Operators: Operators that run over any partition. Strategies without an additional condition produce one plan for each possible partitioning attribute. They are Reduce, Sort, Unnest and Union.
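The classification above determines how many plans a physical rule emits for one strategy. The following Python sketch illustrates that decision only. The function `candidate_partitions` and its `required` parameter (standing for the attribute X fixed by the additional requirement in Table 1) are hypothetical names introduced here and are not part of OPTL or OPTGEN.

```python
BASE = {"Table-Scan", "Index-Scan"}
CONSTRAINED = {"Nested-Loop", "Map", "Merge-Join", "Nest", "Groupby", "Materialize"}
FREE = {"Reduce", "Sort", "Unnest", "Union"}


def candidate_partitions(strategy, all_partitions, required=None):
    """Return the output partitions for which a rule should generate a plan.

    required is the attribute X imposed by the strategy's additional
    requirement (join attribute, grouping attribute, extent partition, ...).
    """
    if strategy in FREE:
        # No additional condition: one plan per possible partitioning attribute.
        return list(all_partitions)
    if strategy in BASE or strategy in CONSTRAINED:
        if required is None:
            raise ValueError(f"{strategy} needs the attribute fixed by its condition")
        # The additional condition pins the partition: exactly one plan.
        return [required]
    raise KeyError(f"unknown strategy: {strategy}")


partitions = ["EmpNumber", "OidDept"]
print(candidate_partitions("Sort", partitions))
print(candidate_partitions("Merge-Join", partitions, required="OidDept"))
print(candidate_partitions("Table-Scan", partitions, required="EmpNumber"))
```

Base and constrained operators are treated alike in this sketch because both have an additional requirement in Table 1. They differ in how their cost is estimated, which the sketch does not model.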

1. Parsing OQL queries
2. Translation of OQL queries into Monoids
3. Checking the calculus for type
4. Query unnesting and translation of the monoid calculus
5. Join permutation using a cost-based (with repartitioning cost) heuristic
6. Physical plan generation using a rule-driven and cost-based (with repartitioning cost) system
7. Physical operators splitting and EXCHANGE insertion

Figure 2: Optimization Phases

This parallel algebra¹ implements the Volcano Operator Model [5] using the operator EXCHANGE for parallelism encapsulation. So, for each repartitioning process unveiled by the rules, the optimizer should automatically insert the EXCHANGE operator into the plan. Another approach is to use enforcer rules to insert the parallelism operator into the plan, moving this task and some other parallelism issues from the framework and other rules to the enforcer rules.

We incorporate these optimization techniques into a rule-based optimizer generator, such that a generated system would follow the steps shown in Figure 2. These phases are the same as those used in Lambda-DB's optimizer, but phases 5 and 6 were modified to incorporate repartitioning costs and phase 7 was added to incorporate, transparently, the parallelism extraction technique we have been working on.

4 Generating OQL parallel Optimizers

In this section we show how to start from an existing non-parallel optimizer generator tool and an optimizer specification and add new rules and features in order to extend the JOQR phase proposed in [9]. The extended JOQR phase generates terms and join trees taking into account the costs of repartitioning data for parallel execution, using a rule-based approach. Our approach is based on the following steps:

1. Specify new physical rules for an existing non-parallel optimizer generator to model the techniques described in [9]. Those rules should produce, for each term, one version of the same strategy for each possible partition, according to the input-output constraints. Then, these plans are associated with the term, stored in a system-provided hash table along with the plans for other strategies, and used as input plans for the consumer terms.

2. Provide support for operator reordering based on strategy and data repartitioning costs. The GOO-OO algorithm proposed in [3] was modified to order join, nest and unnest operators taking into account the trade-offs between operator costs and repartitioning costs. The modified algorithm is called GOO-OOP (GOO-OO with Parallelism) and it is a combination of the proposals in [3] and [9].

¹ See [14] for a detailed description.
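The effect of adding repartitioning costs to a greedy ordering heuristic, as in step 2 above, can be seen in a small Python sketch. It is only a sketch under stated assumptions and not the GOO-OOP algorithm itself: it handles joins only (no nest or unnest), and the function `greedy_order`, the `repart_weight` parameter, the Proj extent and the cost model are hypothetical. The assumed cost of a join is the sum of its input sizes plus the size of every input that is not already partitioned on the join attribute.

```python
from itertools import combinations


def greedy_order(inputs, joins, repart_weight=1.0):
    """inputs: name -> (size, partition attribute)
    joins: (name, name) -> (join attribute, selectivity)
    Returns the chosen joins, in order, as (label, cost) pairs."""
    # Each forest entry: (base names covered, size, partition, label).
    forest = [(frozenset([n]), s, p, n) for n, (s, p) in inputs.items()]
    order = []
    while len(forest) > 1:
        best = None
        for a, b in combinations(forest, 2):
            for (x, y), (attr, sel) in joins.items():
                connects = (x in a[0] and y in b[0]) or (x in b[0] and y in a[0])
                if not connects:
                    continue
                # Hypothetical cost model: read both inputs, plus move every
                # input that is not already partitioned on the join attribute.
                repart = sum(t[1] for t in (a, b) if t[2] != attr)
                cost = a[1] + b[1] + repart_weight * repart
                if best is None or cost < best[0]:
                    best = (cost, a, b, attr, a[1] * b[1] * sel)
        if best is None:
            raise ValueError("join graph is disconnected")
        cost, a, b, attr, size = best
        forest.remove(a)
        forest.remove(b)
        label = f"({a[3]} JOIN {b[3]})"
        # The join result is partitioned on the join attribute.
        forest.append((a[0] | b[0], size, attr, label))
        order.append((label, cost))
    return order


inputs = {"Emp": (1000, "EmpNumber"), "Dept": (100, "OidDept"), "Proj": (1500, "OidDept")}
joins = {("Emp", "Dept"): ("OidDept", 0.01), ("Dept", "Proj"): ("OidDept", 0.01)}

parallel = greedy_order(inputs, joins)                       # repartitioning counted
sequential = greedy_order(inputs, joins, repart_weight=0.0)  # repartitioning ignored
print(parallel[0][0], sequential[0][0])
```

With these made-up statistics, ignoring repartitioning makes the heuristic join Emp and Dept first, while counting it makes the join of Dept and Proj cheaper, because Emp would have to be moved.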

4.1 Physical Operators and Partitioning Selection

In this section we will present rule patterns and discuss some OPTL rules used to implement these patterns. The rule patterns are just abstractions of the rule models used in the main optimizer frameworks.

Free Physical Rule (Required=empty, Guard=cond): Logical Operator --fire/transf.--> Physical Operator [part 1] ... Physical Operator [part n], one for each [part i] in the collection's possible partitions

Figure 3: Free Rule Pattern

In Figure 3 we have a rule pattern to be used with free operators. The left side of the pattern denotes the rule header with its required property and condition. If the condition specified in the guard is satisfied, the rule is applied and a transformation in the search space occurs. The right box in the figure denotes this transformation. This type of rule does not receive a required property and transforms the search space by writing a new term (a physical operator) for each possible partition of the partial result.

    { order=order(); }
    unnest( ‘M, ‘e, ‘v, ‘path, ‘pred, ‘keep )
      = #forall out_partition in all_partitions(‘e) do
          UNNEST( ‘e, ‘v, ‘path, ‘pred, ‘keep )
            : { size = ^e.size;
                partition = out_partition;
                cost = ^e.size + (^e.cost + repart_cost(^e, ^e.partition, out_partition));
                order = ^e.order; };
        #end;

This rule (an OPTL implementation for the pattern in Figure 3) translates the unnest operator of the logical algebra into an UNNEST operator of the physical parallel algebra presented in [14]. For non-base operators the cost of the plan should be estimated according to recurrence (1). A new synthesized attribute should be defined to store the partitioning attribute of each term. The synthesized attributes size, partition, cost and order represent the physical properties of the plan and they are computed after a rewrite. This rule generates one plan for each partition (according to Table 1) and estimates the value of the attribute cost for each one of them. The plan also receives annotations (attribute partition) according to the partitioning chosen for parallelization. In the generated optimizer code this rule is processed for each input plan and thus for each strategy and partitioning in order to implement the JOQR phase.
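To make the attribute computation of the free rule concrete, the following Python sketch mirrors it outside OPTL. It is an illustration only: the `Plan` class and `unnest_rule` function are hypothetical, and `repart_cost` is assumed here to charge the size of the input when the partitions differ, since its actual definition is not given in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    op: str
    size: float
    partition: str
    cost: float
    order: tuple = ()


def repart_cost(plan, src, dst):
    # Assumption: repartitioning moves the whole input; same partition is free.
    return 0.0 if src == dst else float(plan.size)


def unnest_rule(e, all_partitions):
    """One UNNEST plan per output partition, with the rule's synthesized attributes."""
    return [
        Plan(
            op="UNNEST",
            size=e.size,
            partition=out_partition,
            cost=e.size + (e.cost + repart_cost(e, e.partition, out_partition)),
            order=e.order,
        )
        for out_partition in all_partitions
    ]


scan = Plan("TABLE_SCAN", size=1000, partition="EmpNumber", cost=1000)
plans = unnest_rule(scan, ["EmpNumber", "OidDept"])
for p in plans:
    print(p.op, p.partition, p.cost)
cheapest = min(plans, key=lambda p: p.cost)
```

In a generated optimizer the rule would be applied to every input plan of the term, so `unnest_rule` would be called once per stored plan for the input and the results kept per partition.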

Constrained Physical Rule (Required=part, Guard=cond): Logical Operator --fire/transf.--> Physical Operator [part]

Figure 4: Constrained Rule Pattern

In Figure 4 we have some partition (part), according to the operator and Table 1, as a required property. Note that the transformed term receives the annotation (part) in the physical operator.

    { order=‘expected_order; } join(‘x
