(ILP), which provides an integrated approach to several traditionally separate subproblems in code generation. We not only have a uni-
0-8186-5785-5/94
An Integrated Approach to Retargetable Code Generation
Tom Wilson, Gary Grewal, Ben Halley, Dilip Banerji
VLSI-CAD Group, University of Guelph
Guelph, Ontario, Canada N1G 2W1
challenge
fied model of the problem,
instruction
compilers
set
of
because
processors
instruction
erful
(ILP)
model.
methodology
for
for
ILP for
modeling
code
on data;
1 Introduction
data
creasingly
Set wide
Processors
use in
of their
inherent
capable
of generating
matically such
compilers
monly
ular
chip
application,
for
which
tion
available
different
chip
We have
developed that
a variety
model
the
teger
linear
grated problems
code chips,
constrained compilers
a difficult to
task
for
mention
poorly
prepared another,
any
The
one
a family
entire
code
program
approach
both
very
purpose (ILP),
quality
problem
which We not
provides only
Most
have
$03.00 © 1994 IEEE
compilers register
variable
conflicts
in-
example.
Special
signment
techniques
sub-
when
compiling
a uni-
code
generation
might
registers, in
the
purpose code
methods,
a
an address
is
an operand on yet
assignment
such
have focused
process, inspire Most
as [5-8],
stressed on
live
[1-4]
for
heuristic
the major chips.
that
to handle.
schemes
ISP
streams
parallelism
have
on
another.
operand
be able
registers
registers.
which
such
and provide for
of
used
and
been
to engender
operation
assignment
We an inte-
the
has
tends
to obtain
being several
from
purpose
as an
separate
among
markedly
general
parallelism
used
operand
in
different
instruction-level
ensemble
style,
cycle,
the
code
architectures.
traditionally
generation.
and a work-
high
ISP
generation
to several
in code
a methodology
friendlier parallelism
on one with
differs
of
a
programming
conventional
can generate
of special
and
have
less
ex-
generating
research
with
be used
memory,
memory,
published
by
are quite
ones may
to data
registers
machines
instruction-level
pipelined
certain
different
relevant
of the toward
genera-
are make
And
to imple-
is surrounded
capabilities
program
Much
The
is often
from
reg-
memory;
er, possibly
writing
of capabilities!
parallelism
hardware
Only for
the from
address
to data
ALU
whose
combinations
oriented
a partic-
architecture,
features
not
(DSP)
specialized.
etc.
preparing
The
of registers
a constant
access
count
branch.
constants
addresses,
com-
itself;
program
operations,
tracting
Furthermore,
enough”
and
ALU
with
for the next
the
number
varied
is often are
this
to retargetable
of data to or from
a register
instruction
a value
a conditional
a small
designs.
ing prototype for
that
chips
approach
movement
loading
updating
for
consequence
optimizing
ISP
but
processing
not just
architectural
quality
special
to market,
to optimize
The
or heavily
Awkward
of high
of the
has
function.
an unconventional
ISP
signal
designed
it often
the
dra-
requirements.
is specially
perform
suited.
digital
would
of code.
with
ment
available.
quality
real-time
and
because
concurrent
of the
isters
in-
Compilers
chips
time
applications for
attendant
if the to
the
these
generally
is the high
by
finding
versatility.
product’s
are not
employed
with
for
are
products,
and
code
their
One problem demanded
commercial
flexibility
hasten
(ISPs)
such integrated
memory;
a field Instruction
to important We believe
We have concentrated on ISP chips that are specially designed for DSP applications. The chips in question have a single ALU that can do a multiply and accumulate in a single instruction cycle. Although the instruction repertoire is limited, the instruction words are long, permitting several things to occur in parallel during each cycle. These include: one ALU operation
of ISPs.
a variety
adapted
architecture.
code generation.
a pow-
high-quality
but one which is sufficiently
it can be easily
in the target
is the first
level par-
provides
generating
that
variations
(ISPs)
allelism, small numbers of registers, and highly specialized register capabilities. Many traditionally separate subproblems in code generation have been unified and jointly optimized within a single integer linear programming
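As a rough illustration of what it means to state code-generation choices as 0-1 decisions under linear constraints, the following minimal sketch selects instruction patterns for a two-operation data flow graph. It covers only pattern selection, not the integrated model described in this paper. The operation names, the pattern table, the unit costs, and the exhaustive search (standing in for a real ILP solver) are all assumptions made for the example.

```python
from itertools import product

# Hypothetical toy DFG: t = a * b; y = t + c
OPS = ["mul", "add"]

# Hypothetical candidate patterns: name -> (operations covered, cost in instructions)
PATTERNS = {
    "MUL": ({"mul"}, 1),          # multiply implemented on its own
    "ADD": ({"add"}, 1),          # add implemented on its own
    "MAC": ({"mul", "add"}, 1),   # combined multiply and accumulate
}


def solve(ops, patterns):
    """Enumerate all 0-1 assignments and return (cost, chosen pattern names).

    Feasibility rule (a simplification for this sketch): every operation
    must be covered by exactly one selected pattern.
    """
    names = sorted(patterns)
    best = None
    for bits in product((0, 1), repeat=len(names)):
        chosen = [n for n, b in zip(names, bits) if b]
        feasible = all(
            sum(1 for n in chosen if op in patterns[n][0]) == 1 for op in ops
        )
        if not feasible:
            continue
        cost = sum(patterns[n][1] for n in chosen)
        if best is None or cost < best[0]:
            best = (cost, chosen)
    return best


if __name__ == "__main__":
    print(solve(OPS, PATTERNS))
```

With the table above, the search picks the single combined pattern `MAC` at cost 1. If `MAC` is removed from the table, it falls back to the two separate instructions at cost 2.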
Banerji
Group
abstract purpose
Dilip
Generation
of Guelph
Abstract Special
Code
as-
bottleneck published
approach
the
problem ally
by solving
employing
extensive
a succession
heuristics.
peephole
unoptimized
of subproblems,
One
approach
optimization
generated
to make
code.
a code
generator
that
lem
at once
to obtain
an integrated
that
thrives
on special
the
In contrast,
have
considers
purpose
as sets of registers
usu-
edges
[9, 10] uses most
we not
the
solution,
gives
but
one
registers.
Code
Generation
which
model registers
instructions important
idea
that
The
2.1 Subproblems

We begin with a data flow graph (DFG), which is generated by the front end of an appropriate compiler. The following subproblems that must be handled are included:
1. Map
combinations
onto
more
of
inclusive
multiply
and
generic
machine
DFG
use
operations
instructions,
such
live
value
must
be consumed
Another
add;
tions 2. Schedule them
operations
to specific
3. Assign when
data
on
control
functional (and
steps
and
bind
units;
address)
“extra”
alternative
values
to
registers
into
ber
of registers,
ily
store
values
exceeds
spills
introduce
certain
5. Introduce
of live values
which
to
resolve
registers
and
temporar-
copies
problems
with
the
in memory;
register-to-register
points
the num-
values
with that
at
special around
loops;
are required concerns
Similarly,
across
control
block
generic
registers
boundaries
and
7. Correctly
around
compact
(highly
parallel)
imum 2.2
consistently loops;
the
individual
machine
number
of final
Important
components
instructions
into
of
the
patterns
with
a min-
instructions.
Concepts
candidate
registers
in
patterns
that
ated
Model
code
Our Our
solution
gram.
Thus
tioned of any
is to
and
trade
Although tecture,
model the
of
off issues the
reflect model
the
the
any
arise
realities
solution
within
the
in
an
can
re-
address its
various
spill
result.
terms,
archi-
supports example,
certain
use at another. at
one
point points useful
have
subsume
cannot
which
overlap
edges
require
edges
elements
in the
implication some
and
gener-
regions one
requires that
feature
plus
op-
be adcurrent
reserving
an ad-
the array
using
autoincre-
of the
code,
use of the
point
selection using
access
could
base
or by
reference
Similarly,
among
array
its
of access,
at
of in-
instruc-
may
Operations,
and traversing
register
groups machine
potentially
(recomputing
point
Within
register
by
design.
active.
For
either
final
as final
solution
ex-
addressing,
operation
chosen
of only
Another
relate
affect
design.
also
Another
such
final
model
associated
application
of a particular in general
ment.
a final they
represent
at each
are
in
are called
dress register
criteria
constraints
an integrated
constraints
is expressed
the
offset
that
copies
code
inclusive
be chosen
patterns
and
the
which
objects.
dressed
menwhose
correctness
all
tional
pro-
inequalities,
ILP,
that
linear
subproblems
the
providing
actual
integer
linear
Since
by
thus
an
of the
specify
solution.
together
on all
terms
effect
“subproblems”, of the
based
model
in
feasible
considered late
we
above
combined
is
other,
generated
in the
DFG
For
characteristics
appear
not
code.
or register
of array
to more
in-
represent
are inserted
solution.
generic
chosen
each
spills
op-
DFG
final
modes
patterns,
same
the
in the
must
contains
which
on overall
for a correct
or may
several it;
of
may The
This
The
operations
where
operations
tions.
DFG
use in
alternative
the
DFG
other
to be handled
choose.
or copy
appear
one of which
onto units).
operations,
depending
if they
our
may
for
spill
They
structions, 6. Assign
and
ample
purpose
wrap
edges
one
is gen-
way.
is that
at places
solution.
exactly
certain
uniform
other
as sequentially
registers
possibilities
DFG
same
regis-
order;
operations)
purpose
ILP
be useful,
the
functional
the
optional
the
might
before
(like
same
some
(like
idea
which
example,
possible;
4. In case the number
basic
from
cludes
and
regis-
conventional
the
in
the
objects
special
allows
to
register
is essentially
in a systematic
as
mapped
that
notion
approach the
When-
must
resources
we
support.
mutual
are
non-sharable
and patterns can
and
they
other
one
sets of
view.
ter,
This
What
conflict”
edges
scheduling
This
of a scheduling
DFG
erated.
way
in-
DFG.
of registers,
machine
variable
two
compiler.
generator:
“live
ever
by a code
the
abandoned
in favor
DFG
inclusive
operations, the
is the
have
stresses
incompatibility,
of
numbers
that
We
certain
retargetability.
the
for various
assignment.
for more
parts
its inherent
depicting
view
be used
represent
cover
to do is change
An
Integrated
could that
allowable
ter 2
and
the
needs
prob-
that
“patterns”
structions
of
only
entire
and
related
the
same
is assignment
to a set of edges in the data
flow
might
imply
of an optional spill
code
at
value. of the graph
same
(DFG).
One
application
where and -
one
involves
edge
another
represents
(recentering
the same
the
To support constraints
edges, a loop,
general,
the
ability
blocks
when
and
edges
assign
flexibility,
the ILP
are dynamically
on the
code
blocks
current
ILP
contains
enabled
by
and
values
of solution
There
variables.
that
operation
given
a compiler
generic
path
has
also
lected.
been These
which
can
DFG
of data
with
and
Registers
a single An
and
such
If certain certain
we data
specific
then
within
the
makes
set
of
ducing trol
blocks
operations control
data
movement, in turn,
in-
●
●
t as a
using
the
but
the
be included, in
one
attempt
correctness
list
and
criteria
summarizes
conveys
the
for
what
general
any
the
con-
strategy
of
a basic
DFG
operation
can be included
in at most
pattern;
a DFG
operation
is active
pattern
and
if it is not
if it is met
covered
by
by at least
one
to the design
and
edge;
an edge is active totally
if it is essential
within
an active
an edge is inactive
pattern;
if it is totally
within
an active
pattern;
to us to
●
by each
address
certain tive
cal-
edge
sets are
(because
either
all
active
they
belong
to the
sets
may
have
they
represent
or all
inac-
same
alternative
most
one
implementation);
recognition
Such
●
possibilities
by appropriate
certain
edge
member
scheduling
(because
at
active
alternative
im-
to a register
that
plementations);
no
distinction
designs. must
between
A DFG
This
merges.
to
in
such
a logically can
be
depict Additional
single-
spanning
be structured within
nodes and
●
model.
blocks.
also
“optimum”
are the
guarantee
active
that
allocated
and
can an
following
an active
only
allows
permits
The
one active
56001
required
in resources.
remain
“dummy”
branches
been
solution,
dur-
model
assumption
This,
multiblock
control
that
●
be updated
use with
this
practice
in
between
Motorola
for in
in treats
model.
not
already
be circumvented
model and
The
of resources
conflict
The eral
has
family
options block
ma-
can occur
may
equivalent approach
any
number
longer.
constraints
straints our
capable
as autoincrement,
considered
time
so-
and
required
alternative
to find
be somewhat
or accumulators.
restricted
This
operation.
of any potential
add.
for
the
ILP
Presumably
t, is prespecified,
within
parameters
times
solution.
edges.
ALU
is
The
the feasible
function.
seeks a minimum
cost
The
the entire
of data
assume
memory.
manipulation,
culation
can
by the
and
operation
Other
se-
and
steps,
any
function:
running
operations
developed
addressing
are
memories,
the
specifically
seeks
min(t)
an architecture.
memory-resident
here.
of executing
fastest
no objective
the
patterns
nodes
execution.
registers
a particular
to
covering
registers such
objective
replaced
is multiply
ALU
ways,
instruction
exemplifies
regis-
Inter-block
addressed
ways
solution
steps
been
DFG
was
control
and
instructions
transfers
operand
standard
ing regular
same
Model
and
of control formed
variable
handle
non-pipelined
used for memory
in certain
must
here
the
basic
two
requires
has produced
it subsumes
operation
one or two
memory
data
example
of
simplest
other.
correctly
yet
single
is chosen,
presented only
not
into
manipulation.
parallel
know
but
it – both
with
use of the
is not
block
on inter-block
edge segments.
as such
are The
to any
appropriate to
of primitive
A classic
model
chines
ternal
identified
a pattern
within
The
are
A set of potential
be combined
such
have
example,
are groups
architecture. When
that
for
widths.
front-end
operations
sequences
architecture,
data
of
Assumptions
that
through
relevant
traversing
be represented
connecting
Overview
3.2
several
the number
Operational
a DFG
between
Values
or disabled,
Model
We assume
either
movement
correctly 3.1
operations
points.
or be associated
lution
The
may
ter on logically
model. 3
certain
merge
boundaries
the correct
several
keep
branch
same
to
enables
constraints
at once.
such that
“cyclic”
leaving
– perhaps
edges
of control
considered
depending
In
to different
interconnection
up
a value
a value
loop.
register
are being
linking
represents
●
sev-
points
by where
node
edges
must
be assigned
to an allowable
register
set;
a way
acceptable
done
active belongs
●
edges trol
introcon-
ordering
that
represent
block,
between
around
a loop,
cept
a copied
for
the same
must
control use
value);
value, blocks,
the
same
within or
a con-
wrapping
register
(ex-
●
active
edges
tween
the
active
●
operations
in the unit,
DFG,
edges the
same
register,
quential
the
not
but
must
related
by
same
in some
are
order
are
use that
functional
and
of the
same
assigned
register
to
the
in some
se-
Variables
and
following
symbols
index
conflicting
over
ALUs
V
potentially
conflicting
over
registers
V
that: or
not
erations.
covering
i, j, h, k
operation
If
7’
Membership
a value
of nodes nodes
it connects,
are used
tifying
edges
The lected
within
the
activated
the
conveyed
final
two
auxiliary
ILP,
are defined
pairs
besides
to
the
be
step
and
others for
register
as-
results
are
some
of which
of which
must
(y,t,z,x,u),
internal
use
type intgr
t
by
intgr
final
step
opi =
total
if 1, Opi activated
Xi j
o-1
if 1, edge
(i, j)
activated
U~j,
o-1
if 1, edge
(i, j)
uses register
o-1 o-1
if 1, ~i
must
operations
(nodes
operations
that
precede
of the
could
opj
not
constores
otherwise
conflict.
sets exclusive
mutually
dependent
requiring
same
one
edge
may
contain
alternatives for
activation
register
of other
of Cc identify ter.
Such
successive though
whose
are
wrap
It
must
that often
iterations.
also the
require
They
in the
patterns
that
cover
opi
patterns
that
cover
edge
~j
registers
suitable
for
edge
are sets of operation
an
activation set.
The
same
segments to
convey
are logically
separate
contain
the
interlock a loop
be actiall require
implies
from
all
that
may
activation
around
physically
A.
set
members
edges
edges
that
each
a set of edges status.
of a number
ments
from
activation
The
Constraints
sets regis-
or seg-
a value
to
“connected”,
model.
most
one combining
k
Detail
pattern
may
cover
any
op-
eration:
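A minimal sketch of how this restriction can be written as a linear inequality. It assumes a 0-1 variable $q_p$ that equals 1 when combining pattern $p$ is selected (the name $q_p$ is an assumption), and it takes $B_i$ to be the set of patterns that cover $op_i$:

```latex
\sum_{p \in B_i} q_p \le 1 \qquad \text{for every operation } op_i
```

Under this reading, an operation that no selected pattern covers would be implemented on its own.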
From opj
follow opj objects:
r
ing
last
set of alternative
any
edge must
be either
pattern. by
member
of some
nal
design
has its
xij
(2).
=
Such
A= and must
or be covered and
either
by a pattern.
the
one
by a combinedges
an
has no possible
1 from
A=, exactly
or covered
non-alternative
constraint
alternative
edges,
activated
Ordinary,
handled
is not
DFG)
appear
Bi B~j
following
are
will
may
in which
mutually
edge
edge
are
also
is the
sole
appear An covering
in the fiedge
that
pattern
outset,
(i, j) (Lj)eA~
(i, j) Each
The
(i) opi
operations
possibly
must qij
PEB