Using Graph Coloring in an Algebraic Compiler

Teodor Rus and Sriram Pemmaraju
Department of Computer Science, The University of Iowa, Iowa City, Iowa 52242
Phone: (319) 335-0742
[email protected], [email protected]

January 19, 1996

Abstract

An algebraic compiler allows incremental development of the source program and builds its target image by composing the target images of the program components. In this paper we describe the general structure of an algebraic compiler, focusing on compositional code generation. We show that the mathematical model for register management by an algebraic compiler is a graph coloring problem in which an optimally colored graph is obtained by composing optimally colored subgraphs. More precisely, we define the clique-composition of graphs G_1 and G_2 as the graph obtained by joining all the vertices in a clique in G_1 with all the vertices in a clique in G_2, and show that optimal register management by an algebraic compiler is achieved by performing clique-composition operations. Thus, an algebraic compiler automatically provides an adequate clique separation of the global register management graph. We present a linear-time algorithm that takes as input optimally colored graphs G_1 and G_2 and constructs an optimal coloring of any clique-composition of G_1 and G_2. Motivated by the operation of clique-composition, we define the class of clique-composable graphs as those graphs that can be iteratively built from single vertices using the clique-composition operation. We show that the class of clique-composable graphs coincides with the well-known class of chordal graphs.

Computing Reviews Classification: D.3 [Programming Languages], D.3.4 [Processors], G.2.2 [Graph Theory].
General Terms: code generation, compilers, graph algorithms.

Contents

1 Introduction
2 The structure of an algebraic compiler
3 Code generation by semantic macro-operations
4 Register management in an algebraic compiler
5 Coloring a clique-composition
6 Clique-composable graphs

1 Introduction

It is well known that computer instructions that use registers as operands can be executed faster than those that use memory locations as operands. But typically, there is a limited number of registers available to hold operands during the execution of a program. This means that registers must be recycled during program execution. Cocke [7], Ershov [10], and Schwartz [22] have suggested the graph coloring problem as a mathematical model for register management in a compiler. The graph coloring problem can be formulated as follows. Let G = (V, E) be a simple undirected graph with vertex set V and edge set E. Suppose that C = {1, 2, ..., k} is a set of k distinct colors. A function f : V → C is called a k-coloring of G if for each edge (u, v) ∈ E, f(u) ≠ f(v). A k-coloring f of G is called optimal if there exists no (k−1)-coloring of G. It is well known [1, 11] that given a graph G = (V, E), deciding whether or not there is a k-coloring of G is NP-complete for all k, 2 < k < |V|.

The register management problem can be modeled as a graph coloring problem as follows. The register interference graph (RIG) [2] of a program P is defined as G(P) = (V(P), E(P)), where the vertex set V(P) contains a vertex for each variable of P and the edge set E(P) contains an edge (u, v) if the variables corresponding to the vertices u and v are "live" at the same time. Suppose that the target machine has k registers available to hold data during the execution of the program P. Then a k-coloring of G(P) provides an assignment of variables to registers. That is, determining whether the set of available registers of the target machine will suffice for the variables in P is equivalent to determining whether G(P) can be k-colored. It is well known that in conventional compilers the register interference graphs corresponding to straight-line code are interval graphs. Many problems, including the graph coloring problem, though NP-complete for arbitrary graphs, can be solved in polynomial time for interval graphs. However, register interference graphs corresponding to code with branches are not necessarily interval graphs, and no polynomial-time algorithms are known for coloring such graphs. Hence, compilers have had to depend on graph coloring heuristics. Chaitin [5, 6] proposes a graph-coloring heuristic that is commonly used by conventional compilers for global register management.

In this paper we consider register management in algebraic compilers. An algebraic compiler builds the target image of a source program by composing target images of program components. For that, the code generator maintains a register interference graph for each target image of a valid program component. A vertex of this graph represents a symbolic register used in the target image; two vertices of a register interference graph are connected by an undirected edge precisely when the registers they represent are live at the same time. Consequently, register management by an algebraic compiler can be modeled as a graph coloring problem in which it is required that an optimally colored graph be obtained by composing optimally colored graphs. This composition is defined such that the live registers of one graph are connected with all live registers of the second graph. That is, the live registers of the register interference graph of an image form a clique, and graph composition during new image generation must be compatible with the colorings of these cliques. To model this situation, we define the clique-composition of a pair of graphs G_1 and G_2 as the graph obtained by joining all the vertices in a clique in G_1 with all the vertices in a clique in G_2. That is, register management in an algebraic compiler can be modeled as the following problem: given optimal colorings of the graphs G_1 and G_2, compute the optimal coloring of a clique-composition of G_1 and G_2.
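To make the clique-composition operation concrete, the following is a minimal sketch that builds the composed graph from two register interference graphs stored as adjacency maps. The representation, the function name clique_compose, and the example vertices are our own illustrative choices, not notation taken from the paper.

```python
# A minimal sketch of clique-composition, assuming graphs are stored as
# {vertex: set_of_neighbors} adjacency maps with disjoint vertex names.
# C1 and C2 must be cliques (e.g., the live registers) in G1 and G2.
def clique_compose(G1, C1, G2, C2):
    G = {v: set(nbrs) for v, nbrs in G1.items()}
    G.update({v: set(nbrs) for v, nbrs in G2.items()})
    # Join every vertex of the clique C1 with every vertex of the clique C2.
    for u in C1:
        for v in C2:
            G[u].add(v)
            G[v].add(u)
    return G

# Example: a triangle composed with a single edge along a clique of each.
G1 = {'r1': {'r2', 'r3'}, 'r2': {'r1', 'r3'}, 'r3': {'r1', 'r2'}}
G2 = {'s1': {'s2'}, 's2': {'s1'}}
composed = clique_compose(G1, {'r1', 'r2'}, G2, {'s1', 's2'})
```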

The main motivation of this paper comes from the development of the algebraic technique of compilation, which is based on homomorphism computation rather than on automata theory. The algebraic compiler integrates all components of the compiler into a unique algorithm that embeds the source language into the target language and merges theoretical and practical aspects of compilation. Consequently, there are two (possibly disjoint) sets of readers addressed by this paper: one interested in compiler construction and one interested in graph theory. To keep the paper relevant for both groups of readers we organize it such that each section is self-contained and provides a deeper understanding of the fundamental problems resolved by the previous sections. Note that an algebraic compiler is a language-to-language translator, and since machine registers are usually not available in high-level languages, an algebraic compiler in general does not need to handle registers. A conventional compiler is an algebraic compiler whose target language is the machine or assembly language of a given machine. In this case machine registers are available and target program optimization requires optimal use of the machine registers. Therefore we need to develop techniques for register management suitable for the algebraic methodology of compilation. This is particularly important for RISC machines. Our methodology for register management by the algebraic compiler is appropriate for both types of machines, RISC and CISC. However, because register management is mandatory for RISC systems, we targeted it to RISC machines and experimented with it on the IBM RS/6000 system.

The rest of this paper is organized as follows. Section 2 introduces the algebraic technique of compilation. The main result of this section is the algorithm integrating all components of the compiler. Section 3 focuses on the mechanism of compositional code generation using semantic macro-operations. The main result of this section is the algorithm for computing register liveness during the process of code generation. Section 4 introduces the mathematical model of register management as performed by the algebraic compiler. The main result here is the algorithm that constructs the register interference graphs of the target images generated during compilation, together with their optimal colorings. The fundamental operations employed by this algorithm are the clique-composition of two graphs and the construction of an optimal coloring of a clique-composed graph from the optimal colorings of the graph components. Section 5 presents the mathematical aspects of the graph coloring problem used in Section 4. The main result of this section is an algorithm that solves this problem in linear time using linear space. Note that this problem arises naturally from the way an algebraic compiler builds the target image, and it allows the compiler to independently color subgraphs and reuse space once a subgraph has been colored. Since this is not naturally achieved in conventional compilers, Gupta, Soffa, and Ombres [15] use clique separators [13, 23] to decompose a register interference graph into subgraphs that can be independently colored and combined. The algorithm of Gupta et al. needs to be judicious in its choice of clique separators. On the other hand, the algebraic compiler, during the course of building the target image, automatically provides optimally colored subgraphs whose clique-composition needs to be optimally colored. Finally, in Section 6, we define clique-composable graphs as those graphs that can be constructed by iteratively clique-composing pairs of graphs starting from a set of isolated vertices. Clearly, clique-composable graphs are exactly the register interference graphs of algebraic compilers. The main result of this section is the characterization of clique-composable graphs as the well-known class of chordal graphs. The class of chordal graphs is a strict superset of the class of interval graphs [14] and, like interval graphs, chordal graphs have been the subject of a lot of research [12, 14, 16]. Problems that are NP-complete for arbitrary graphs (for example, graph coloring and maximum clique) can be solved in polynomial time for chordal graphs [12], and chordal graphs can be recognized in linear time.

2 The structure of an algebraic compiler

An algebraic compiler C : SL → TL is a language-to-language translator mapping a source language SL into a target language TL. An algebraic compiler is implemented by a triple ⟨R, G, M⟩ where R is a pattern-matcher recognizing valid language constructs of SL, G manages the target images of the constructs recognized by R, and M is a TL macro-processor that expands TL macro-operations into valid language constructs of TL called images. We assume that the syntax of the source language SL of the compiler is specified by a finite set R of Backus-Naur Form (BNF) specification rules. Each rule r ∈ R is an equation of the form A_0 = t_0 A_1 t_1 ... t_{n-1} A_n t_n, where each t_i, 0 ≤ i ≤ n, is a fixed string called a terminal symbol or is the empty word, and each A_i, 1 ≤ i ≤ n, is a variable called a nonterminal symbol or is the empty word. We use the notation lhs(r) to denote the left-hand side of the rule r, that is, lhs(r) = A_0, and rhs(r) to denote the right-hand side of the rule r, that is, rhs(r) = t_0 A_1 t_1 ... t_{n-1} A_n t_n. Also, we assume that the target language of the compiler, TL, is provided with a macro-processor M that allows target construct development by semantic macro-operations taking other target constructs as parameters [4, 17, 19, 24].

An algebraic compiler is specified by associating each source language specification rule r ∈ R, r : A_0 = t_0 A_1 t_1 ... t_{n-1} A_n t_n, with a target language macro-operation M(r) [19].

The macro-operation M(r) is a parameterized target representation of the computations expressed by the source language constructs specified by r. That is, M(r) is defined by the compiler implementor by "programming" the computation specified by r in the target language. Hence, we assume that the semantics of the source language is well defined. The calling name of the macro-operation M(r) is rhs(r), i.e., t_0 A_1 t_1 ... t_{n-1} A_n t_n. The components of the source language construct specified by r are source language constructs of syntax categories A_1, ..., A_n. Hence, the formal parameters of M(r) are the nonterminals A_1, A_2, ..., A_n, and the actual parameters are images of the source language constructs of syntax categories A_1, A_2, ..., A_n. The body of M(r) expresses the computation specified by r in SL as a target image in TL that is composed from the target images of the components of the construct specified by r. We use the symbol @, with indices if necessary, to denote the target images generated by M and manipulated by G during the process of compilation. That is, the actual parameters of the macro-operation M(r) are the target images @_1, @_2, ..., @_n of source language constructs of syntax categories A_1, A_2, ..., A_n. The set CS = {⟨r, M(r)⟩ | r ∈ R} is called a compiler specification. CS defines a macro-facility that extends the target language TL, allowing the implementation of the source language SL specified by R by embedding it into the target language TL. Algebraically, this mechanism of compilation embeds the source language algebra [3, 8, 20] into the target language algebra, preserving the computational structure of the source language constructs. In other words, the language of the abstract machine used to represent the program during the compilation process is actually a language already implemented on the target machine, rather than an intermediate form of the abstract syntax tree of the source text. The most primitive such language is the machine language.
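As a purely illustrative data-structure view of a compiler specification, the sketch below pairs a BNF rule with a macro-operation that maps the target images of the rule's nonterminals to the target image of its left-hand side. All class, function, and field names here are our own assumptions, not identifiers from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass(frozen=True)
class Rule:
    lhs: str                 # the nonterminal A_0
    rhs: Tuple[str, ...]     # terminals and nonterminals of t_0 A_1 t_1 ... A_n t_n

@dataclass
class Image:
    rep: str                 # target-language representation
    res: str                 # name of the register/value exporting the result
                             # (the other image properties are omitted in this sketch)

# A macro-operation M(r): build the image of A_0 from the images of A_1..A_n.
Macro = Callable[[List[Image]], Image]

def add_macro(components: List[Image]) -> Image:
    e, t = components
    # Concatenate the component representations and add their results (schematic).
    return Image(rep=f"{e.rep}\n{t.rep}\nA {e.res}, {t.res}", res=t.res)

# The compiler specification CS = {<r, M(r)> | r in R}.
CS: Dict[Rule, Macro] = {Rule(lhs="E", rhs=("E", "+", "T")): add_macro}
```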

The compilation process performed by C is a sequence of transformations of the input text during which source language constructs specified by the rules r ∈ R are discovered in the input text, their target images @_{lhs(r)} are constructed by expanding their associated macros M(r), and the portions of the input text representing such constructs are replaced by records of the form lhs(r) : @_{lhs(r)}. Suppose that after a number of transformations the input text has the form P = x t_0 A_1:@_1 t_1 ... A_n:@_n t_n y, where the tuple A_i:@_i, 1 ≤ i ≤ n, shows that a source language construct of syntax category A_i has been discovered by R as a valid component of the input and its target image constructed by M is @_i. The next transformation of P by C is performed as follows:

1. For each tuple ⟨r, M(r)⟩ ∈ CS, R interprets rhs(r) = t_0 A_1 t_1 ... A_n t_n as a pattern to be searched for in P. R ignores the target images embedded in P. When an occurrence of rhs(r) is discovered by R in P and the context (x, y) [21] surrounding rhs(r) in P determines that this portion of P is specified by r, then this portion of P can be replaced by lhs(r) preserving the syntactic validity of the input, i.e., P is transformed into P' = x lhs(r) y. This operation is denoted by R(r).

2. For each tuple ⟨r, M(r)⟩ ∈ CS, G interprets rhs(r) = t_0 A_1 t_1 ... A_n t_n as the name of the macro-operation M(r). Therefore, when R determines that a portion of the input can be replaced by lhs(r), G extracts the actual parameters @_1, ..., @_n from the portion of the input t_0 A_1:@_1 t_1 ... A_n:@_n t_n matched by R and calls the macro-processor M to expand the macro-operation M(r)(@_1, ..., @_n). Let @_0 be the resulting code thus generated. Then G associates the parameter @_0 with lhs(r), creating the record lhs(r) : @_0. This operation is denoted by G(r).

3. The operation by which M expands M(r)(@_1, ..., @_n) into the target image @_0 is denoted by M(r).

The relationship between the components R, G, and M of the algebraic compiler while performing a transformation of the input text is shown in Figure 1.

[Diagram: the input P = x t_0 A_1:@_1 ... A_n:@_n t_n y is matched against rhs(r); the operations R(r), G(r), and M(r) cooperate to expand M(r)(@_1, ..., @_n), producing the record lhs(r):@_0 and the transformed input P' = x lhs(r):@_0 y.]

Figure 1: The integration of the components of an algebraic compiler

The algorithm performed by the algebraic compiler is a sequence of transformations T_0, T_1, ..., T_m of the source text as described by (1), (2), (3) above and shown in Figure 1. Each T_i, 1 ≤ i ≤ m, takes source text already transformed by T_0, T_1, ..., T_{i-1} and applies the operation R(r) → G(r) → M(r). Note that R(r), G(r), and M(r) may use (in parallel) all specification rules r ∈ ∪_{k=0}^{i} R_k, where R_0, R_1, ..., R_m is a partition of R defined as follows:

1. R_0 = {r ∈ R | rhs(r) contains no parameters}; lhs(R_0) = {lhs(r) | r ∈ R_0}.

2. For each i, 1 ≤ i ≤ m, with lhs(R_k) = {lhs(r) | r ∈ R_k}, 0 ≤ k ≤ i−1, R_i is defined by:

   R_i = {r ∈ R \ ∪_{k=0}^{i−1} R_k | rhs(r) = t_0 A_1 ... t_{n−1} A_n t_n and A_1, ..., A_n ∈ ∪_{k=0}^{i−1} lhs(R_k)}.
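The partition above can be computed bottom-up: the first block collects the rules whose right-hand sides contain no nonterminals, and each later block collects the rules whose nonterminals are all produced by earlier blocks. The following is a small sketch of that computation, under the assumption that rules are stored as (lhs, rhs) pairs and that a predicate distinguishes nonterminals from terminals; the helper names are ours, not the paper's.

```python
def partition_rules(rules, is_nonterminal):
    """Split rules into blocks R_0, R_1, ... as described in Section 2.

    rules:          iterable of (lhs, rhs) pairs, rhs being a sequence of symbols.
    is_nonterminal: predicate telling nonterminal symbols apart from terminals.
    """
    remaining = list(rules)
    partition = []
    known_lhs = set()                      # union of lhs(R_k) over the blocks built so far
    while remaining:
        block = [(lhs, rhs) for (lhs, rhs) in remaining
                 if all(not is_nonterminal(s) or s in known_lhs for s in rhs)]
        if not block:                      # no rule is ready: grammar is not layerable this way
            raise ValueError("rules left over: " + str(remaining))
        partition.append(block)
        known_lhs |= {lhs for (lhs, _) in block}
        remaining = [r for r in remaining if r not in block]
    return partition
```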

3 Code generation by semantic macro-operations

The code generator G operates with target language constructs called images, specified by target language macro-operations. The macro-operations that specify images take as parameters both syntactic and semantic properties of other images. So far we have identified the following characteristic properties of an image that are manipulated by the code generator:

1. Representation: a well-formed expression according to the target syntax rules.

2. Type: the type of value the image can assume in the target language.

3. Standard: designates the computation standard, such as its syntax category.

4. Mode: defines structural properties of the image such as the flow of control among its components, the nature (data or process) of the computation represented by the image, and the manner in which the image can be used as a parameter in the construction of other images.

5. Import: computational objects (such as registers or variables) defined by the image components which are accessible to the image.

6. Export: objects defined by the image (such as registers or variables) which are available to other images using this image as a parameter.

7. Entry: entry points where a process performing the computation represented by the image can start.

8. Exit: exit points where a process performing the computation represented by the image can terminate.

All these properties are computable functions and are expressed by target macro-expressions; for easy manipulation they may be introduced by keywords followed by the macro-expressions defining them. Only the representation of an image is explicitly required; the specification of the other properties is optional. The mapping of a macro-expression specifying a property of an image into the target image it represents is performed by the macro-processor M.

The macro-processor M operates in a computing environment provided with: (1) its own interpretative arithmetic, available to the compiler implementor through a given set of pseudo-operations, (2) a collection of functions that take as arguments target images and return their syntactic and semantic properties, and (3) the usual substitution operations. The macro-processor M is called by the code generator G of the compiler when the recognizer R recognizes a valid source language construct in terms of its source language construct components, as seen in Figure 1. Since R is syntax directed, there may be properties of the components of the construct recognized by R that are undefined at recognition time. This means that target images can be partially generated. However, the algorithm performed by the compiler assures that any valid component of the input text will eventually be mapped into a valid image in the target language. This allows the incremental development of source language programs, provides for multiple target image generation by the compiler, and facilitates target program optimization by local optimization of the image components. In addition, there are three other major advantages of this philosophy:

- The code generator is a well-defined mathematical algorithm that can be implemented by a standard universal procedure operating on specific data structures. Thus, it can be specialized to produce a given target code by preprocessing the target specification rules, similar to the way the source syntax specification rules are preprocessed by various parser-generator tools to produce the data structures on which the parser of the source language operates.

- Since all components of the compiler are specified by well-defined mathematical algorithms and are implemented by standard universal procedures, they are integrated into the algorithm performed by the compiler by simply composing their data structures into a global data structure.

- The target specifications of the various constructs specified by BNF rules can be preserved in libraries of specifications and can be reused whenever a new language is specified and implemented. This makes the process of specifying and implementing languages reusable. The consequence is that compiler design and implementation can be automated by tools that preprocess a compiler specification file and update the data structure on which the predefined universal compiler operates.

We illustrate this philosophy of compiler implementation by assuming that the target language of the compiler is an assembly language and M is a macro-assembler that is capable of expanding assembly macro-operations into the assembly code they represent.

The semantic properties used by assembly macro-operations specifying images which are pertinent for compositional register management are: the standard of the image, introduced by the keyword stand; the mode of the image, which is a list of flow relations of the form @_i → @_j showing that the flow of information (control and data) proceeds from the component @_i to the component @_j of the image, introduced by the keyword mode; and the type of the image, introduced by the keyword type. The resources of an image pertinent for register management are: the representation of the image, which is a sequence of assembly statements introduced by the reserved word rep; the registers of the target machine used by the assembly language representation of the image, which are the imported objects introduced by the reserved word reg; the results generated by the process that performs the computation represented by the image, which are the exported objects introduced by the reserved word res; the entry points of the image, introduced by the reserved word entry; and the exit points of the image, introduced by the reserved word exit. For each property or resource of an image there is a function in the macro-processor M that returns the value of that property. These functions can be denoted by f(@_i), where f ∈ {stand, mode, type, rep, res, reg, entry, exit}. Thus, the general form of a compiler specification rule ⟨r, M(r)⟩ ∈ CS defining the mechanism of embedding the source language constructs specified by r into the assembly language is:

  ⟨A_0⟩ = t_0 A_1 t_1 ... t_{n-1} A_n t_n
    stand: A_0
    mode:  {@_i → @_j | @_i, @_j ∈ {@_1, @_2, ..., @_n}}
    type:  E_t;  entry: E_n
    rep:   L_0: [S_0]
           L_1: rep(@_{i_1}) [S_1]
           ...
           L_j: rep(@_{i_j}) [S_j]
           ...
           L_n: rep(@_{i_n}) [S_n]
    res:   E_r
    exit:  E_x

where E_t, E_n, E_r, E_x are macro-expressions and [S_j], 0 ≤ j ≤ n, when present, is a section of assembly code specific to this macro-operation. S_j has access to variables and constants used in the image @_{i_j}. L_j, 0 ≤ j ≤ n, denotes the label of the entry point of rep(@_{i_j}), and (i_1, i_2, ..., i_n) is a permutation of (1, 2, ..., n). The flow of information (control and data) in the image is specified by the control-flow relations L_{i-1} → L_i, 1 ≤ i ≤ n, and @_i → @_j, for each @_i → @_j ∈ mode(@_0). For example, using the mnemonic A to denote add and assuming that an assembly language statement of the form Mnemonic Operand_1 Operand_2 performs the operation Operand_2 := Operand_1 Mnemonic Operand_2, we can specialize the above rule to define the assembly language image of expressions specified by the BNF rule E = E + T as follows:

  E = E + T;
    stand: arithmetic expression;
    mode:  @_1 → @_2;
    type:  type(@_1);
    entry: entry(@_1);
    rep:   rep(@_1);
           rep(@_2);
           A res(@_1), res(@_2);
    res:   res(@_2);
    exit:  exit(@_2);
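To make this specialization concrete, here is a tiny, purely illustrative expansion of the E = E + T macro-operation over two component images, encoded as plain dictionaries of the image properties listed above; none of these names come from the paper's macro-assembler.

```python
def expand_E_plus_T(e_img, t_img):
    """Expand the E = E + T macro-operation over the images of E and T (a sketch)."""
    rep = "\n".join([
        e_img["rep"],                              # rep(@_1)
        t_img["rep"],                              # rep(@_2)
        f"A {e_img['res']}, {t_img['res']}",       # A res(@_1), res(@_2)
    ])
    return {"stand": "arithmetic expression",
            "type": e_img["type"],                 # type(@_1)
            "entry": e_img["entry"],               # entry(@_1)
            "rep": rep,
            "res": t_img["res"],                   # res(@_2)
            "exit": t_img["exit"]}                 # exit(@_2)

# Hypothetical component images for "x" and "y" loaded into registers R1 and R2.
x_img = {"rep": "L x, R1", "res": "R1", "type": "int", "entry": "L0", "exit": "L0"}
y_img = {"rep": "L y, R2", "res": "R2", "type": "int", "entry": "L1", "exit": "L1"}
sum_img = expand_E_plus_T(x_img, y_img)
```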

Note that with this paradigm of compilation the source language constructs and their target images are directly associated, and consequently no intermediate program form is necessary. Thus, the basic blocks constructed by a conventional compiler to define live ranges for program variables are here defined and controlled by the compiler implementor using the standard property of the image. That is, the compiler implementor can select specific syntax categories such as expression, assignment statement, loop, etc., to denote live ranges of variables, where register liveness is defined by the following rules:

1. A register holding a variable or a constant in a live range is implicitly live.

2. The liveness of a register holding the result of a computation must be explicitly specified by a macro-expression introduced by res.

The liveness of the registers used in the components of an image is further propagated in the composed image by the implicit and explicit rules stated below (a small sketch of this computation follows the rules). For that we denote by NewReg(S_j) the set of the new registers requested in S_j.

1. LiveAt(L_0) is the collection of registers used by S_0, if S_0 is present in @_0. Otherwise LiveAt(L_0) = ∅.

2. LiveAfter(L_0) is the collection of registers used in S_0 that belong to reg(@_{i_1}).

3. Let @_{k_1}, ..., @_{k_s} be all parameters such that @_{k_r} → @_{i_j} ∈ mode(@_0), 1 ≤ r ≤ s. Then LiveAt(L_j) = LiveAfter(L_{j-1}) ∪ (∪_{r=1}^{s} LiveAt(exit(@_{k_r}))), 1 ≤ j ≤ n.

4. LiveAfter(L_j) = LiveAt(exit(@_{i_j})) ∪ NewReg(S_j), 1 ≤ j ≤ n.

the register liveness in the target image components @ , @ , : : :, @n. The following example, 1

2

where we use the notation Ai : @i in the BNF rule expressing the while loop to associate the formal parameter Ai with its actual parameter, the target image @i, shows the manner in which we use the above functions to compute liveness.

WhileLoop : @ = while BooleanExpression : @ do Statement : @ stand: WhileLoop; mode:f@ ! @ ; @ ! @ g; rep: L : rep(@ ); Cmp True, res(@ ); JumpF L ; L : rep(@ ); Jump L ; L : Nop ; entry: L ; exit: L ; 0

1

1

1

2

2

2

1

1

1

3

2

2

1

3

1

3

LiveAt(L ) = ;; LiveAfter(L ) = ;; LiveAt(L ) = LiveAt(exit(@ )); LiveAfter(L ) = LiveAt(exit((@ )); LiveAt(L ) = LiveAfter(L ) [ LiveAt(exit(@ )); LiveAfter(L ) = LiveAt(exit(@ )) 0

0

1

2

2

1

1

1

1

2

2

Assuming that stand(@ ) is a live range and R ; R are the registers used in @ , where R is 1

1

2

1

1

the register holding the result of this computation and R holds a variable, and stand(@ ) 2

15

2

is not a live range and R ; R ; R are the registers used in @ , where R is the register 3

4

5

2

3

holding the result of the computation performed by @ , we have: LiveAt(L ) = fR g, 2

1

3

LiveAfter(L ) = fR ; R g, LiveAt(L ) = fR ; R g, and LiveAfter(L ) = fR g. 1

1

2

2

1

2

2

3

The advantage of this methodology results from the richer structure of an image when compared with the statements composing the intermediate forms. The code generator is aware of both the computation contents of the image it constructs and its compositional structure. The optimization actions performed by this code generator are programmed and operate on properties and resources of optimized image components. In particular, the mathematical model of register management is de ned by a graph coloring problem where optimally colored graphs, representing register assignments of the image components, are composed into an optimally colored graph, representing register assignment of the resulting image. Note the di erence between this graph coloring problem, that constructs an optimally colored register interference graphs of the local images taking as data optimally colored register interference graphs of the image components, and the graph coloring problem used by conventional code generators, that construct a global register interference graph and then try to nd its optimal coloring. That is, an algebraic compiler optimizes register management by performing local optimizations of the target images it constructs and preserve these optimizations when the target images are used as components of other target images.

4 Register management in an algebraic compiler The registers used by the image components of an image are expressed by colored register interference graphs where the colors assigned to the vertices representing live registers in the composed code segment form a clique. Consequently, register management by the code 16

generator discussed in this paper can be modeled as a problem in which we need to compose optimally colored graphs into an optimally colored graph preserving well de ned cliques. More

precisely, we de ne the clique-composition of graphs (G ; C ) and (G ; C ) where C and C 1

1

2

2

1

2

are cliques in G and G respectively, as the graph obtained by joining all the vertices 1

2

in the clique C with all the vertices in the clique C . Note the di erence between this 1

2

clique-composition operation, which is a graph constructor that preserves given cliques in the graph operands, and the clique-decomposition operation used in [23], which searches for a clique of a given graph that splits it. While for a given graph there may be many cliquedecompositions or no clique-decomposition, for the given graphs G and G and cliques 1

2

C and C the clique-composition operation uniquely determines the resulting graph. In 1

2

addition, notice that from a register management viewpoint the graph operands of a cliquecomposition operations represent registers locally assigned to hold variables, constants, and results of code segments components and the cliques C and C represent live-registers in 1

2

the composed code. Machine registers used by an image @ are either explicitly requested by the macro0

developer through a call to the function GetReg() when they are necessary in the local code

Si, 0  i  n, of M (r), or are implicitly provided as the registers already used by the image components @i, 1  i  n. Since this methodology is targeted to the RISC systems, each source language reference to a variable or a constant is mapped into a reference to a register. Using the mnemonic L for load and lexeme(token) the function that returns the source language name of its token argument, this can be illustrated by the following macrooperation that speci es the embedding of the smallest form of expressions, usually called factors, de ned by BNF rules of the form Factor = identifier or Factor = constant, into 17

assembly language: Factor = identi er; type: type(lexeme(identifier)); rep: L0: R = GetReg(type(lexeme(identifier))); L lexeme(identifier), R; Factor = constant; type: type(lexeme(constant)); rep: L0: R = GetReg(type(lexeme(constant))); L lexeme(constant), R; The compatibility of register allocation in image @ with the register allocation in the 0

components @i, 1  i  n, of @ is obtained by maintaining a register interference graph 0

G(@) and an optimal coloring C (G(@)) for each image @ where the colors are the physical registers allocated to the symbolic registers used in @. The vertices of G(@) are symbolic registers denoted by Ri ; there is an edge between Ri ; Rj 2 G(@) precisely when the physical registers allocated to Ri and Rj are alive at the same time during the execution of the code represented by @. The consistency and the optimality of the register allocation during the generation of the code @ is achieved by building the graph G(@ ) and its optimal coloring 0

0

C (G(@ )) from G(@i), and C (G(@i)), 0  i  n, using the clique composition operations 0

shown in Section 5. For the implementation of the algorithm constructing G(@) and C (G(@)), the registers used by the image @ are maintained by the code generator into a list of tuples denoted

reg(@) where each element has the form (Ri ; ai; vi; si) where Ri is a unique symbolic name generated by GetReg(), ai is the physical register allocated to Ri by the code generator, vi is the value held by Ri, and si is the status of the register de ned by the formula: 8 v; if R is alive at the exit from @ and holds a variable; > > < c; if Rii is alive at the exit from @ and holds a constant; si = > r; if R is alive at the exit from @ and holds a result; > : f; if Ri is free at exit from the execution of the code @ . i 0

0

0

0

18

A register Ri which is alive in reg(@) contains a result exported by the computation performed by the code @ and thus its physical register ai cannot be reallocated to another register Rj 6= Ri, when @ is a component of another image. On the contrary, if the register

Ri is free in the list reg(@) its physical register can be reallocated to another Rj 6= Ri when @ is used as a component of another image. Consequently, if R recognizes a source language construct speci ed by the BNF rule A : @ = t A : @ t : : : tn? An : @ntn then 0

0

0

1

1 1

1

the register allocation in the list reg(@ ) must be compatible with the register allocation in 0

the lists reg(@i), 1  i  n. The liveness of the registers in reg(@ ) is computed by the 0

code generator from the explicit declaration of the exported results using res() in view with both the implicit control- ow in the image, provided by the concatenation of the assembly language statements, and the explicit control- ow in the image, provided by mode(). The code generated by M when it expands the macro-expression introduced by rep is optimized to avoid multiple loads of a variable in the register holding it using the following equivalence of the symbolic registers generated by GetReg(): Two symbolic registers, Ri and Rj , are equivalent, Ri  Rj , if their tuples (Ri; ai; vi; si) and (Rj ; aj ; vj ; sj ) in reg(@ ) have the same status and contains 0

the same value, that is, si = sj and vi = vj . Hence, if Ri  Rj in reg(@ ) then they can be allocated the same physical register and their 0

vertices in the G(@ ) can be collapsed to just one vertex. This allows the macro-processor 0

M to optimize the image it produces by factoring out common subexpression evaluation. In particular, if Ri  Rj then M generates only one load instruction, L name Ri or L name Rj , whichever comes rst, when it expands the macro-expression de ning the image representation. 19

By (parallel) rewriting id!F, F!T, T!E, graph construction, and code generation y := Ex  + = Tx Fx Fx r

r

r

r

r

r

r

r

?

?

?

?

?

?

?

?

(R ; 1; x; v) L x, R (R ; 1; x; v) L x, R (R ; 1; x; v) L x, R (R ; 1; x; v) L x, R By rewriting T * F! T, graph composition and code generation y := + = Ex Txx Fx r

r

0

0

r

1

1

r

2

2

2

r

r

r

r

r

r

?

?

?

?

?

?

2

(R ; 1; x; v) L x, R (R ; 1; x  x; r) L x, R L x, R (R ; 2; x; v) M R , R By rewriting T / F! T, graph composition and code generation - L x, R y := Txx=x + Ex L x, R R R ? ? ? M R,R (R ; 2; x; v) D R , R (R ; 1; x; v) L x, R (R ; 1; x  x=x; r) By rewriting E+T! E, id := E! Assign, graph construction, and code generation - L x, R Assigny x xx=x L x, R R R R ? M R,R (R ; 2; x + x  x=x; r) (R ; 1; x; v) D R,R A R,R S R,y (R ; 1; x; v) L x, R r

0

0

1

r

1

r

2

2

r

r

2

3

1

r

r

r

2

r

0

0

1

0

r

2

r

2

r

2

r

:= +

1

3

r

r

1

3

2

1

2

1

0

1

3

0

0

1

0

1

0

1

1

Figure 2: Code generation for the assignment y := x + x  x=x We illustrate in Figure 2 the RIG construction and code generation by the macroprocessor M, where we use the mnemonics L for load, M for multiply, and A for add, and show the successive transformations of the assignment statement \y := x + x * x / x" assuming that it was originally tokenized to \y := id + id * id / id". To make the example self-contained we index nonterminals with the source language construct they represent and 20

label the nodes of the graphs constructed by M withe the register tuples maintained in the list reg(@). The macro-processor M constructs the register interference graph of an image while generating its representation. To describe the algorithm performed by M in this respect we assume that the BNF rule used by R is A = t A t : : : Antn , the portion of the source text 0

0

1 1

matched by the rhs(r) is t A : @ t : : : tn? An : @n tn, and we use the following notation: 0

1

1

1

1

1. If G and G are two graphs and P  G and P  G are cliques in G and 1

2

1

1

2

2

1

G , respectively, then G : P  G : P is their clique composition operation and 2

1

1

2

2

Color(G ; P ; f ; G ; P ; f ) is the algorithm that nd the optimal coloring of G : 1

1

1

2

2

2

1

P  G : P giving the optimal colorings f and f of G and G respectively. These 1

2

2

1

2

1

2

are explained in Section 5. 2. G(@ ) and f (@ ) are the register interference graph and its optimal coloring of the 0

0

target image @ constructed by M; 0

3. G(@j ) and f (@j ), 1  j  n, are the register interference graph and its optimal coloring of the target image @j , 1  j  n, component of @ . 0

4. G(Sj ) and f (Sj ), 0  j  n, are the clique containing all the vertices in NewReg(Sj ) and is its optimal coloring; 5. G(Lj ) and f (Lj ), 0  j < n, denote the register interference graph and its optimal coloring of the portion of @ obtained by processing the macro-operation up to (but not 0

including) the label Lj . Clearly, G(L ) = G(S ), f (L ) = f (S ) and G(@ ) = G(Ln), +1

0

C (G(@ )) = f (Ln). 0

21

0

0

0

0

With this notation, the macro-processor M constructs the register interference graph G(@ ) 0

and its optimal coloring f (G ) of the target image @ as follows: 0

0

1. If rhs(r) contains no non-terminals, then G(@ ) is either empty or a clique. The optimal coloring C (G(@ )) in this case uses as many colors as jG(@ )j. 2. Let Lj : $rep(@i ) [Sj ] be the code to be expanded at label Lj of the macro-operation and @k ! @i 2 $mode(@ ), 1  r  s. Assume that G(Lj? ) and f (Lj? ) have been constructed. Then G(Lj ) and f (Lj ), 1  j  n, are constructed as follows: 0

0

0

j

r

0

j

1

1

G := G(Lj? ); P := LiveAfter(Lj? ); f := f (Lj? ); G := G(@i ); P := ;; f := f (@i ); if (G 6 G ) then 1

1

2

1

2

j

2

1

2

1

1

j

1

begin

G := G : P  G : P ; f := Color(G ; P ; f ; G ; P ; f ); 1

1

1

1

2

1

2

1

1

2

end for r = 1 step 1 until s do

2

2

G := G(@k ); P := LiveAt(exit(@k )); f := f (@k ); if (G 6 G ) then 2

2

r

2

r

2

r

1

begin

G := G : P  G : P ; f := Color(G ; P ; f ; G ; P ; f ); 1

end

1

1

1

2

1

2

1

1

2

2

2

P := P [ LiveAt(exit(@k )); 1

1

endfor

r

G := clique(NewReg(Sj )); P := NewReg(Sj ); f := f (Sj ); G := G : P  G : P ; f := Color(G ; P ; f ; G ; P ; f ); G(Lj ) := G ; f (Lj ) := f ; 2

1

2

1

1

1

2

1

1

1

2

2

1

2

2

2

1

The algorithm building G(@ ) and C (G(@ )) is illustrated in Figure 3 showing the con0

0

struction of the register interference graph for IfClause constructs using the statement \if (i < 0) then E := x + y  y=x else E := x + x  x=x" where the RIG and the target image of the assignment E := x + x  x=x are shown in Figure 2 and the RIG and the code of the assignment E := x + y  y=x are constructed similarly. 22

if BEi