Problem
ILP for software pipelining
SCAN
SCAN: a Heuristic for Near-Optimal Software Pipelining
Florent Blachot, Benoît Dupont de Dinechin and Guillaume Huard
ID-IMAG, Grenoble, France
STMicroelectronics, Grenoble, France
01 June 2006
Conclusion
Software pipelining
Software pipelining: a classic compiler optimization
- improves the performance of inner loops on instruction-level parallel processors
- cyclic scheduling, 1-periodic with integral period
- objective is to minimize the period of the cyclic schedule
- NP-hard problem
- performed by heuristics such as iterative modulo scheduling
Software pipelining on a small instance
- 3 typed instructions to schedule on 2 typed processors
- a kernel occurs and, in loop context, this kernel is repeated a great number of times to obtain a periodic schedule
- idle time remains in this periodic schedule
- idea: move instructions from other iterations into the idle time to obtain a schedule with a lower period
[Figure: schedule of the instance on the 2 processors, time axis from 0 to 10]
Definition on a small instance
Definition of Period: number of time units for a new kernel
[Figure: periodic schedule on a time axis from 0 to 10, with P = 2]
Definition on a small instance
Definition of Time Horizon: number of time units for an old kernel
[Figure: periodic schedule on a time axis from 0 to 10, with H = 4]
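The two definitions can be illustrated with a minimal Python sketch. The instance below (instruction names, types, start dates and capacities) is hypothetical, since the figure of the slides is not recoverable; it is only chosen so that it yields P = 2 and H = 4. The sketch assumes unit-execution-time instructions and one processor per type.

```python
from collections import Counter

def horizon(starts):
    """Time horizon of one iteration, assuming unit-execution-time instructions."""
    return max(starts.values()) + 1

def is_resource_feasible(starts, types, capacity, period):
    """Check that no kernel slot (date modulo period) overloads a processor type."""
    usage = Counter()
    for ins, date in starts.items():
        usage[(date % period, types[ins])] += 1
    return all(count <= capacity[typ] for (_, typ), count in usage.items())

# Hypothetical instance: 3 typed instructions, 2 typed processors.
starts = {"a": 0, "b": 1, "c": 3}            # start dates within one iteration
types = {"a": "T1", "b": "T2", "c": "T1"}    # processor type needed by each instruction
capacity = {"T1": 1, "T2": 1}                # one processor of each type

print(horizon(starts))                                    # time horizon H
print(is_resource_feasible(starts, types, capacity, 2))   # period P = 2
print(is_resource_feasible(starts, types, capacity, 1))   # period P = 1
```

With these assumed dates, a new iteration can start every 2 time units while each iteration spans 4 time units; a period of 1 would put instructions a and c on the same processor in the same slot.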
Motivation to use ILP
Context of ST200 processors:
- family of VLIW embedded processors
- 4 UET instructions per time unit
- production compiler with a software pipeliner: Iterative Modulo Scheduler (IMS)
- on some loops, IMS fails to minimize the period
- embedded computing ⇒ a large amount of compilation time can be invested ⇒ ILP for these difficult loops
First disappointment on ILP
- 2 different ILP models [Eichenberger 1997, Dupont de Dinechin 2003]
- the most recent one was implemented in the production compiler
- some improvements, but most of the time the ILP times out without a better solution
⇒ expensive optimization without significant improvements
ILP formulation
- time-indexed formulation
- for all instructions i and for all time units t, a binary variable x_i^t is created
- x_i^t is true ⇔ instruction i is scheduled at time t

\[ \sum_{t=0}^{\bar{H}-1} x_i^t = 1 \qquad \forall i \in [1, n] \]
\[ x_i^t \in \{0, 1\} \qquad \forall i \in [1, n],\ \forall t \in [0, \bar{H}-1] \]
ILP formulation
ILP strongly depends on 2 fixed values:
- P: the value to minimize
- H̄: an upper bound on the Time Horizon

\[ \sum_{s=t}^{\bar{H}-1} x_i^s + \sum_{s=0}^{t+\theta_i^j-P\omega_i^j-1} x_j^s \le 1 \qquad \forall t \in [0, \bar{H}-1],\ \forall (i, j) \in E_{dep} \]
\[ \sum_{i=1}^{n} \sum_{k=0}^{\lfloor (\bar{H}-1)/P \rfloor} x_i^{t+kP}\, \vec{b}_i \le \vec{B} \qquad \forall t \in [0, \bar{H}-1] \]
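To make the role of P and H̄ concrete, here is a minimal Python sketch that evaluates the constraints of the time-indexed formulation on a given 0/1 assignment. It is a checker, not a solver, and not the authors' implementation. The instance (start dates, latencies theta, distances omega, requirement vectors b and capacities B) is hypothetical, and variables whose date falls outside [0, H̄−1] are assumed to be absent (treated as 0).

```python
def to_binary(starts, H):
    """x[i][t] = 1 iff instruction i is scheduled at time t."""
    return [[1 if s == t else 0 for t in range(H)] for s in starts]

def check(x, P, H, deps, b, B):
    n = len(x)
    # Assignment: each instruction is scheduled exactly once in [0, H-1].
    if any(sum(x[i]) != 1 for i in range(n)):
        return False
    # Dependences: sum_{s>=t} x_i^s + sum_{s<=t+theta-P*omega-1} x_j^s <= 1.
    for i, j, theta, omega in deps:
        for t in range(H):
            upper = min(t + theta - P * omega - 1, H - 1)
            if sum(x[i][t:]) + sum(x[j][s] for s in range(upper + 1)) > 1:
                return False
    # Resources: instructions at dates t, t+P, t+2P, ... share the same kernel slot.
    for t in range(H):
        for r in range(len(B)):
            used = sum(x[i][t + k * P] * b[i][r]
                       for i in range(n)
                       for k in range((H - 1) // P + 1)
                       if t + k * P < H)
            if used > B[r]:
                return False
    return True

# Hypothetical instance: 3 instructions, 2 resource types with one unit each.
H = 4
b = [[1, 0], [0, 1], [1, 0]]          # resource requirements of each instruction
B = [1, 1]                            # resource availabilities
deps = [(0, 1, 1, 0), (1, 2, 2, 0)]   # (i, j, latency theta, distance omega)
x = to_binary([0, 1, 3], H)

print(check(x, 2, H, deps, b, B))   # period P = 2
print(check(x, 1, H, deps, b, B))   # period P = 1
```

Note that the checker needs both P and H̄ before any constraint can even be written down, which is why both must be fixed before calling the ILP solver.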
How does H̄ influence the ILP?

\[ x_i^t \in \{0, 1\} \qquad \forall i \in [1, n],\ \forall t \in [0, \bar{H}-1] \]

- H̄ directly determines the number of variables of the ILP formulation; fewer variables ⇒ easier to solve
- but decreasing H̄ ⇒ the problem may become infeasible
- and increasing H̄ ⇒ often causes a timeout
- upper and lower bounds exist for the Time Horizon, but these bounds are far off the practical value
- a strictly mathematical approach cannot help us ⇒ experimental approach
Experimental approach
The approach consists in testing our ILP formulation for all P and for all H̄ on a large benchmark:
- lower bound for P: a theoretical one, MinP
- upper bound for P: a practical one, PIMS from IMS
- lower value for H̄: a practical one, HIMS from IMS
- upper bound for H̄: a practical one, set by a compiler constraint
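The exhaustive exploration described above can be sketched in Python as a plain sweep over the (P, H̄) grid. The function `toy_solve` is a hypothetical stand-in for a call to the ILP solver with a timeout; its thresholds and the bounds used in the example are invented for illustration and do not come from the benchmark.

```python
def toy_solve(p, h):
    """Hypothetical oracle mimicking the three possible outcomes of an ILP run."""
    if h < 20 - 2 * p:
        return "infeasible"   # horizon too short for this period
    if h > 16:
        return "timeout"      # too many variables for the time limit
    return "feasible"

def sweep(solve, min_p, p_ims, h_ims, h_max):
    """Run the solver on every (P, H) pair of the grid and record the outcome."""
    return {(p, h): solve(p, h)
            for p in range(min_p, p_ims + 1)
            for h in range(h_ims, h_max + 1)}

# Assumed bounds: MinP = 3, PIMS = 6, HIMS = 8, compiler limit on H = 18.
results = sweep(toy_solve, 3, 6, 8, 18)
best_p = min(p for (p, _), status in results.items() if status == "feasible")
print(len(results), best_p)
```

Every grid point costs one solver run, each of which may last up to the timeout, which is consistent with the very long total computing time of such an experiment.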
Experimental conditions
- 169 inner loops of classical tests
- Pentium 1.8 GHz
- CPLEX 9.0
- timeout of 3000s
⇒ total computing time approx. 2 weeks
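Space characterization
Space for a loop from Transfo.c with a timeout of 3000s
[Figure: (P, H̄) space with MinP and PIMS marked on the P axis and HIMS on the H̄ axis, offset marks +5, +5, +10, +15 and +64, and regions labeled Infeasible, Feasible and Timeout]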
Space characterization
General space for the benchmark with a timeout of 3000s
[Figure: (P, H̄) space with MinP, PIMS and HIMS marked, the IMS point, and regions labeled Infeasible, Feasible and Timeout]
Idea of SCAN
- the space contains a large area that is feasible and quickly solved
- idea: follow the frontier of this large area
[Figure: (P, H̄) space with MinP, PIMS, HIMS and the IMS point, regions labeled Infeasible, Feasible and Time Out, and the SCAN path along the frontier of the feasible area]
[Figure: SCAN for a loop from Transfo.c, same (P, H̄) space with offset marks +5, +5, +10, +15 and +64]
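The slides do not spell out how the frontier is followed, so the Python sketch below is only one plausible reading and should not be taken as the authors' algorithm. It assumes the walk starts from the IMS point (PIMS, HIMS), tries to lower P, enlarges H̄ by one when the ILP is infeasible, and stops on a timeout, at MinP, or at the upper limit on H̄. As before, `toy_solve` and all bounds are hypothetical.

```python
def toy_solve(p, h):
    """Hypothetical oracle mimicking the three possible outcomes of an ILP run."""
    if h < 20 - 2 * p:
        return "infeasible"
    if h > 16:
        return "timeout"
    return "feasible"

def scan_walk(solve, min_p, p_ims, h_ims, h_max):
    """Follow the feasible/infeasible frontier from the IMS point toward MinP."""
    best = (p_ims, h_ims)        # the IMS solution is known to be feasible
    p, h, calls = p_ims - 1, h_ims, 0
    while p >= min_p and h <= h_max:
        status = solve(p, h)
        calls += 1
        if status == "feasible":
            best = (p, h)        # keep this period and try a lower one
            p -= 1
        elif status == "infeasible":
            h += 1               # horizon too short: enlarge it
        else:
            break                # timeout: stop the walk
    return best, calls

best, calls = scan_walk(toy_solve, 3, 6, 8, 18)
print(best, calls)
```

On this toy space the walk reaches the same best period as an exhaustive sweep of the grid while calling the solver far fewer times, which is the kind of saving the frontier idea aims at.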
Results of SCAN

timeout   PSCAN vs. PHMS   PMinP vs. PSCAN   average time for scan
5s        4.10%            1.39%             3.36s
10s       4.21%            1.28%             5.95s
25s       4.28%            1.20%             9.19s
75s       4.29%            1.19%             22.10s
500s      4.29%            1.19%             98.34s
Conclusion
- SCAN makes the ILP formulation of software pipelining usable and efficient
- great improvement of the period, at the cost of 1 additional hour on the whole benchmark
- up to 33.3% gain on hard loops, like t32x32
- close to the lower bound (1.19%) ⇒ hard to gain more than SCAN