Scanning Index Sets with Polynomial Bounds Using Cylindrical Algebraic Decomposition

Armin Größlinger

Department of Informatics and Mathematics, University of Passau

Technical Report, Number MIP-0803 Department of Informatics and Mathematics University of Passau, Germany June 2008

Abstract. Automatic, model-based program transformation relies on the ability to generate code from a model description of the program. In the context of automatic parallelisation, cache optimisation and similar transformations, the task is to generate loop nests which enumerate the iteration points within given domains. Several approaches to code generation from polyhedral descriptions of iteration sets have been proposed and are in use. We present an approach to generating loop nests for index sets with arbitrary polynomials as bounds using cylindrical algebraic decomposition. The generated loops are efficient in the sense that no integer superset is enumerated. We also state where this technique is useful, i.e., where non-linearities in the loop bounds arise in loop program transformations, and show some examples of our approach with polyhedral and non-polyhedral input.

1 Introduction

Optimisation of program execution is a ubiquitous challenge in computer science. Programs have to be adapted to exploit parallelism, use caches efficiently or save power on new architectures. Using a model representation of the programs in question, one can facilitate an automatic or semi-automatic transformation process. The model represents the characteristics of the program at an abstract level, modelling, e.g., the execution order of dependent operations or the memory accesses of the program. The transformation is performed in this model, which permits an optimising search for the best transformation to achieve the desired target program, e.g., infusing a maximum of parallelism or a minimum of cache misses. After the model of a program has been transformed, corresponding code which can execute on the target architecture has to be generated from the model.

One successful example of such a model is the so-called polyhedron model (previously called the polytope model) of loop programs [Len93], in which loops are modelled by polyhedra. It has been used successfully to infuse parallelism and enhance cache behaviour. The code generation step has long been a subject of study, cf. Section 3. But not all useful transformations of programs can be expressed by linear algebra on polyhedra, which have linear bounds; there has been an increasing demand for non-linear transformations.

We drop the restriction to polyhedral index sets, allowing arbitrary semialgebraic sets, i.e., index sets bounded by polynomials in the variables and parameters. We give an example demonstrating the limitations of the polyhedron model and illustrating the idea of our code generation technique in Section 2. After discussing related work in Section 3, we give a precise formulation of the problem we solve and our main result, its solution, in Section 4. Section 5 presents some examples for polyhedral and non-polyhedral inputs. Section 6 discusses possible future improvements of the code generation algorithm and Section 7 concludes.
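To make the notion of an index set with a polynomial bound concrete, the following minimal sketch scans the hypothetical set S(n) = {(x, y) | 0 ≤ x, x² ≤ y ≤ n} in two ways. The set, the function names and the counting structure are assumptions chosen for illustration. The loops are written by hand and are not output of the technique described in this report. The first function scans the bounding box and guards the body with a test; the second uses the polynomial directly as a loop bound, so that no point outside S(n) is visited.

```c
#include <assert.h>

/* Counts how many points a loop nest visits and how often the body runs. */
typedef struct {
    long visited;
    long executed;
} ScanCount;

/* Scan the bounding box [0, n] x [0, n] and test membership in
   S(n) = {(x, y) | 0 <= x, x*x <= y <= n} at every visited point. */
ScanCount scan_bounding_box(long n) {
    assert(n >= 0);
    ScanCount c = {0, 0};
    for (long x = 0; x <= n; x++) {
        for (long y = 0; y <= n; y++) {
            c.visited++;
            if (x * x <= y) {
                c.executed++; /* the loop body would run here */
            }
        }
    }
    return c;
}

/* Scan S(n) with the polynomial x*x as the lower bound of y, so that
   every visited point belongs to the index set. */
ScanCount scan_exact(long n) {
    assert(n >= 0);
    ScanCount c = {0, 0};
    for (long x = 0; x * x <= n; x++) {
        for (long y = x * x; y <= n; y++) {
            c.visited++;
            c.executed++; /* the loop body would run here */
        }
    }
    return c;
}
```

For n = 4, both functions execute the body at the same 10 points, but the bounding-box variant visits all 25 points of the box to do so.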

2 Introductory Examples

We introduce the problem of code generation by means of a simple example in Section 2.1. A reader who is familiar with polyhedral code generation may want to skip to Section 2.2, which illustrates the additional challenges for code generation in our more general case.

2.1 Introduction to Code Generation

When we speak of code generation, we aim at generating loops that enumerate so-called index sets and execute statements (loop bodies) for each enumerated point. For example, let us generate code for two statements T1 and T2, where T1 is executed at every point in D1 = {x | 2 ≤ x ≤ 8} and T2 at every point in D2 = {x | 2 ≤ x ≤ p}. The sets D1 and D2 are called the index sets of T1 and T2, respectively. Since T1 is to be executed for x = 2, ..., 8 and T2 for x = 2, ..., p (for p ∈ Z), we have to generate loops with the index variable x which enumerate the respective x-values and execute T1 and T2 at the respective index points. Unfortunately, enumerating the x-values for the two statements independently, as in the following sequence of loops: for (x=2; x
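The index sets D1 and D2 defined above can be scanned in more than one order. The following is a minimal sketch of two options, not the report's own listing. The `Instance` type, the `record` helper and both function names are hypothetical; recording into a trace stands in for executing the loop bodies T1 and T2. One function scans the two index sets one after the other; the other scans them together in increasing x, running T1 before T2 at each point, which is one plausible order among several.

```c
#include <assert.h>
#include <stddef.h>

/* One executed statement instance: stmt is 1 for T1 or 2 for T2,
   x is the index point at which it ran. */
typedef struct {
    int stmt;
    int x;
} Instance;

/* Hypothetical stand-in for executing a loop body: append to a trace. */
static size_t record(Instance *trace, size_t n, size_t cap, int stmt, int x) {
    assert(n < cap);
    trace[n].stmt = stmt;
    trace[n].x = x;
    return n + 1;
}

/* Scan D1 = {x | 2 <= x <= 8} completely, then D2 = {x | 2 <= x <= p}. */
size_t scan_separately(int p, Instance *trace, size_t cap) {
    size_t n = 0;
    for (int x = 2; x <= 8; x++) {
        n = record(trace, n, cap, 1, x);
    }
    for (int x = 2; x <= p; x++) {
        n = record(trace, n, cap, 2, x);
    }
    return n;
}

/* Scan both index sets in increasing x, running T1 before T2 wherever
   a point belongs to both sets. */
size_t scan_combined(int p, Instance *trace, size_t cap) {
    size_t n = 0;
    int hi = p > 8 ? p : 8; /* largest x that can lie in D1 or D2 */
    for (int x = 2; x <= hi; x++) {
        if (x <= 8) {
            n = record(trace, n, cap, 1, x);
        }
        if (x <= p) {
            n = record(trace, n, cap, 2, x);
        }
    }
    return n;
}
```

Both functions execute the same set of statement instances and differ only in the order of execution. For p = 3, each records 9 instances, but the second instance is T1 at x = 3 in the separate scan and T2 at x = 2 in the combined scan.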
