Reiner Hartenstein
Disruptive Trends urging to rethink Embedded System Implementation
The impact of shifting to multicore
TU Kaiserslautern
4 P issues: performance, programmer productivity, program efficiency, power consumption
market trends
© 2010,
[email protected]
http://hartenstein.de
Power Consumption of Computers
... has become an industry-wide issue: incremental improvements are on track, IPCC ?
but „we may ultimately need revolutionary new solutions“ [Horst Simon, LBNL, Berkeley]
Power consumption by internet: x30 until 2030 if trends continue
(~90% paid by customers?)
[G. Fettweis, E. Zimmermann: ICT Energy Consumption - Trends and Challenges; WPMC'08, Lapland, Finland, 8-11 Sep 2008]
„Google causes 2% of the world's electricity consumption“ (Google denied)
at Dallas [Randy Katz: IEEE Spectrum, Febr. 2009]
Energy cost may overtake IT equipment cost in the near future [Albert Zomaya]
vN: a Massive Power Guzzler
it's a symptom of the von Neumann Syndrome:
Software is extremely power-hungry, because of massively memory-cycle-hungry instruction streams
Software often has very bad performance
we need an approach using much less Software
triple paradigm
Growth beyond Moore's Law?
[Figure: relative performance (log scale, 10^3 to 10^13) vs. year (1970 to 2030)]
the end of the single-core era
triple paradigm
we need to learn parallel programming
... performance drops, productivity & other problems ...
„Multicore shifts the burden of Performance from Chip Designer to Software Developers.“ [J. Larus: Spending Moore's Dividend; C_ACM, May 2009]
Multimedia in the Multicore Era
[Figure: relative performance (MIPS) vs. year (1994 to 2030), marking the begin of the multicore era; courtesy E. Sanchez]
needed performance growing faster than Moore's law
Multimedia performance needs [Pierre Paulin, MPSoC'09]:
Application    Needs up to
Audio          800 MIPS
Graphics       11 GOPS
Video          160 GOPS
Digital TV     900 GOPS
GSM, GPRS, EDGE, UMTS, next standard
ICT market at an inflection point
The battle for the living room & mobile is more important than the PC market. Prosperity depends on network capacity, ..., efficient pricing, flexible platforms, & ...
... Cheap Revolution:
• low power
• affordable broadband
• software performance
triple paradigm
Senior Counselor to the U.S. Trade Representative (USTR) on strategy and negotiations.
Broadband is significant at the inflection point, prompting major market governance changes
Cowhey's & Aronson's Law
& massive funding needed
Performance Growth by Multicore?
& massive programmer productivity problems
[Figure: relative performance vs. year (1994 to 2030), marking the begin of the multicore era]
von-Neumann-only is not the silver bullet
Reconfigurable Computing is indispensable!
Dead Supercomputer Society
[Gordon Bell, keynote, ISCA 2000]
• ACRI • Alliant • American Supercomputer • Ametek • Applied Dynamics • Astronautics • BBN • CDC • Convex
• Cray Computer • Cray Research • Culler-Harris • Culler Scientific • Cydrome • Dana/Ardent/Stellar/Stardent
• DAPP • Denelcor • Elexsi • ETA Systems • Evans and Sutherland Computer • Floating Point Systems • Galaxy YH-1
• Goodyear Aerospace MPP • Gould NPL • Guiltech • ICL • Intel Scientific Computers • International Parallel Machines
• Kendall Square Research • Key Computer Laboratories • MasPar • Meiko • Multiflow • Myrias • Numerix • Prisma
• Tera • Thinking Machines • Saxpy • Scientific Computer Systems (SCS) • Soviet Supercomputers • Supertek
• Supercomputer Systems • Suprenum • Vitesse Electronics
only 2 or 3 successes
most in 1985-1995 - mainly research
the single core sequential mind set was the winner
easy fix?
Hastily knitted compilers for the heavy lifting ?
e. g. automatically parallelizing compilation via multi-threading, and many other ad-hoc solutions?
new types of bugs introduced
widespread confusion and competing claims
John Hennessy: „I would be panicked if I were in industry“
Amid the Clamor
Michael Wrinn (keynote at SIGCSE2010): Suddenly, All Computing Is Parallel: Seizing Opportunity Amid the Clamor
http://www.sigcse.org/sigcse2010/attendees/keynotes.php
„Foundational change will disrupt traditional habits throughout the discipline ....“
... especially how students are to be introduced ....
„The proud era of von Neumann architecture passes into history.“
Michael Wrinn is a senior course architect in the Intel Software College and works to bring parallel computing into the mainstream of undergraduate education. He also works with the ACM Education Council to bring industrial perspective to curriculum evolution.
HPRC: High Performance Reconfigurable Computers
programming dilemma …
… a taxonomy of design flows
RC*: Demonstrating the intensive Impact
*) RC = Reconfigurable Computing
SGI Altix 4700 with RC 100 RASC compared to Beowulf cluster [Tarek El-Ghazawi et al.: IEEE COMPUTER, Febr. 2008]

Application                   Speed-up factor   Savings: Power   Cost   Size
DNA and Protein sequencing    8723              779              22     253
DES breaking                  28514             3439             96     1116

much less memory and bandwidth needed
no software used !
massively saving energy
much less equipment needed
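A quick check of how the power-saving factors in the table above relate to the speed-up factors. This is only an arithmetic sketch: the numbers are copied from the table, while the dictionary layout and the function name `power_to_speedup_ratio` are illustrative choices, not part of the cited study.

```python
# Factors copied from the table above
# (SGI Altix 4700 with RC 100 RASC compared to Beowulf cluster).
factors = {
    "DNA and Protein sequencing": {"speedup": 8723, "power_saving": 779},
    "DES breaking": {"speedup": 28514, "power_saving": 3439},
}


def power_to_speedup_ratio(entry):
    """Return the power-saving factor as a fraction of the speed-up factor."""
    return entry["power_saving"] / entry["speedup"]


ratios = {app: power_to_speedup_ratio(entry) for app, entry in factors.items()}

for app, ratio in ratios.items():
    print(f"{app}: power saving is {ratio:.1%} of the speed-up")
```

Both ratios come out at roughly 9% and 12%, which is in line with the "~10% of speedup" rule of thumb quoted for the energy saving factors.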
Speed-up factors obtained by Software to Configware migration
No instruction fetch at runtime: no software !
& most MIPS running on FPGAs
[Figure: speed-up factor (log scale, 10^0 to 10^6) per application; reported factors range from 20 to 28500. Applications shown: Image processing, Pattern matching, Multimedia; DSP and wireless; real-time face detection; SPIHT wavelet-based image compression; BLAST; FFT; Reed-Solomon Decoding; Viterbi Decoding; MAC; video-rate stereo vision; pattern recognition; DES breaking; crypto; CT imaging; protein identification; DNA seq.; Smith-Waterman pattern matching; molecular dynamics simulation; Bioinformatics; Astrophysics; GRAPE]
A physical signal is the simplest and fastest way of message & data transport.
Abundant on-chip bandwidth available for parallelism of flexible granularity (by FPGA).
Power save factors obtained
Energy saving factors: ~10% of speedup
[Figure: the speed-up factor chart (log scale, 10^0 to 10^6) for Software to Configware migration, annotated with power save factors]
GPGPU and x86 multicore: no energy saving data available
Low Power Circuit Design: PowerOpt™ (ChipVision Design Systems): divides power consumption by up to 4
Why such Speed-up Factors ...
... with FPGAs: a much worse technology !
massive wiring overhead + massive reconfigurability overhead + routing congestion growing with FPGA size
The „Reconfigurable Computing Paradox“
main reason: no von Neumann Syndrome! no software!
using Configware and Flowware instead
Isn't NVIDIA the solution?
[Figure: relative performance vs. year (1994 to 2030), marking the begin of the multicore era]
Speed-up factors by GPGPUs (1)
CUDA ZONE pages [NVIDIA Corp.]: non-reviewed CUDA user submissions
http://www.nvidia.co.uk/object/cuda_home_uk.html#state=home
power consumption not reported!
Drawbacks: von Neumann syndrome, Programmer productivity
[Figure: speed-up factor (log scale, 10^0 to 10^3) of user submissions from Jan 2007 to Jan 2010, ranging from 1.3 to 675, by application area: Astrophysics, Bioinformatics, CFD (Computational Fluid Dynamics), Cryptography, DCC (Digital Content Creation), DSP (Digital Signal Processing), EDA, Graphics, Imaging, Numerics, oil & gas, Video & Audio]
Speed-up factors by GPGPUs (2)
CUDA ZONE pages [NVIDIA Corp.]: non-reviewed CUDA user submissions
power consumption not reported!
(up to ~600 x)
[Figure: the chart of CUDA user-submission speed-up factors (log scale, 10^0 to 10^3), Jan 2007 to Jan 2010, by application area]
Speed-up factors obtained (2)
by Software to Configware migration (up to ~30,000x) (200x) vs. GPU: almost 50x
[Figure: speed-up factors by Software to Configware migration (log scale, 10^0 to 10^6) overlaid with the GPGPU speed-up factors of the CUDA user submissions]
RC versus Multicore
„RC“ = Reconfigurable Computing
RC: speed-up often higher by orders of magnitude
Sure !
RC: energy-efficiency often higher: very much, or, by orders of magnitude ?
this is the silver bullet
Sure !
We need both: Multicore and RC
„Software“ stands for extremely memory-cycle-hungry instruction streams
Patterson's Law [Dave Patterson]: bandwidth gap grows 50% / year, has reached >1000x
“The Memory Wall”: coined by Sally McKee (& co-author)
Nathan's Law [Nathan Myhrvold]: Software is a gas. It expands to fill its containers ... until being limited by Moore's Law [& Kryder's Law]
Wirth's Law [Niklaus Wirth]: “software is slowing faster than hardware is accelerating“
The von Neumann Syndrome [C.V. Ramamoorthy]: overhead piles up to code sizes of astronomic dimensions
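A plausibility check of the two figures quoted for Patterson's Law (50% growth per year, more than 1000x reached). This is only a sketch under a stated assumption: the gap compounds yearly, starting from a ratio of 1. The constant names are illustrative.

```python
import math

GAP_GROWTH_PER_YEAR = 1.5   # "bandwidth gap grows 50% / year"
TARGET_GAP = 1000           # "has reached >1000x"

# Assumption: the gap compounds once per year from a starting ratio of 1,
# so the gap after n years is GAP_GROWTH_PER_YEAR ** n.
years_to_target = math.log(TARGET_GAP) / math.log(GAP_GROWTH_PER_YEAR)

print(f"about {years_to_target:.1f} years of 50% growth give a {TARGET_GAP}x gap")
```

Under this assumption, a bit more than 17 years of compounding are enough to exceed a 1000x gap.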
50 years Software Crisis (term by F. L. Bauer [1968])
Parkinson's Law [Cyril Northcote Parkinson, 1955]: bureaucracy growth independent of actual work to be done
The time has come
Max Planck: Replacement of false doctrines by new insights needs 50 years, waiting for not only old professors but also their scholars to die off.
Criticism of Software Engineering is not new:
F. L. Bauer 1968: coined the term „Software Crisis“
Peter G. Neumann 1985-2003: 216x “Inside Risks“ (18 years inside back cover of Comm_ACM)
N. N. 1995: THE STANDISH GROUP REPORT
Robert N. Charette 2005: Why Software Fails; IEEE Spectrum, Sep 2005
L. Savain 2006: Why Software is bad
Anthony Berglas 2008: Why it is Important that Software Projects Fail
CPU-centric flat world model
(Aristotelian model)
typical programmer qualification: sequential-only mind set - CPU-“centric“ but no hardware know-how (kind of tunnel view)
CPU not visible from SE
This Software-centric world model is obsolete
The Machine Model Dichotomy
von Neumann versus Anti-machine (data stream machine).
CPU: SE, Software Engineering
asM (auto-sequencing Memory): FE, Flowware Engineering
PE: Program Engineering
*) do not confuse with „dataflow“!
PE: the Generalization of Software Engineering - First Step
Procedural Languages Twins

imperative Software Languages        super systolic Flowware Languages
(program counter)                    (data counter(s))
read next instruction                read next data item
goto (instruction address)           goto (data address)
jump to (instruction address)        jump to (data address)
instruction loop                     data loop
instruction loop nesting             data loop nesting
instruction loop escape              data loop escape
instruction stream branching         data stream branching
no: no internally parallel loops     yes: internally parallel loops

But there is the Asymmetry for data parallelism
Machine twins: different data movement
Who moves operand to / from operator if not an instruction? if not Software?

# 1: moving data between von Neumann CPU cores
   data transport: via common memory
   execution triggered by: instruction stream
   strategy: moving data at run time
# 2: moving data between (r)DPU cores within (r)DPA
   data transport: piped thru, directly from (r)DPU to (r)DPU
   execution triggered by: arrival of data (transport-triggered*)
   strategy: moving the locality of execution at compile time
*) Daniel Tabac, Jack Lipovski

remember the Memory Wall (Patterson's Law)
A Heliocentric CS Model needed
Triple Paradigm Dual Dichotomy Approach. The Generalization of Software Engineering
PE: Program Engineering
CPU: SE, Software Engineering
asM (auto-sequencing Memory): FE, Flowware Engineering
structures (pipe network model): CE, Configware Engineering
time to space mapping issue
*) do not confuse with „dataflow“!
rDPU: reconfigurable Data-Path Unit
rDPA: reconfigurable Data-Path Array
Triple Paradigm Compilation
Code-X (mid '90ies: Jürgen Becker)
source „program“: C, FORTRAN, MATLAB, …
automatic partitioning between Software Engineering and Configware Engineering
software compiler (instruction scheduler): software code (instruction streams)
configware compiler (mapper, placement & routing, data scheduler): configware code (configuration) and flowware code (data streams)
SE Education Revolution
Software Engineering
by triple paradigm co-education: traditional qualification in the time domain + lean qualification in the space domain = lean hardware modeling qualification at a higher level of abstraction
Conclusions (1)
We urgently need a Software Education Revolution for using Multicore - and RC* (SERUM-RC*)
*) Reconfigurable Computing
We urgently need a Mead-&-Conway-dimension text book on triple-paradigm programming education
and a few new Matlab/Simulink boxes for a model-based lean instruction approach to undergraduate students
Conclusions (2)
To maintain a Booming Multicore Era: possible for 2 or 3 more decades? Not without Reconfigurable Computing!
[Figure: relative performance vs. year (2004 to 2030), marking the end of the single-core era]
thank you
END
extra pages for discussion:
The Systolic Array
(H. T. Kung paradigm)
Algebra experts' hobby, early 80ies
nice time/space notation - defines: ... which data item at which time at which port
[Figure: input data streams entering and output data streams leaving a DPA* (pipe network); each stream is drawn as data items (x) and idle slots (-) over the axes time and port #]
*) DPA = DataPath Array (array of DPUs)
DPU = DataPath Unit: has no program counter! it's no CPU!
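The time/space notation can be made concrete with a small schedule: which data item arrives at which port at which time. The sketch below is only an illustration; the one-step-per-port skew, the function name `skew_streams` and the item names are assumptions, not taken from the slide.

```python
def skew_streams(streams, idle="-"):
    """Build a time/space schedule for systolic input streams.

    schedule[port][t] is the data item presented at `port` at time step `t`.
    Assumption: the stream for port p is delayed by p time steps, so the
    streams enter the array staggered; unused slots hold the idle marker.
    """
    n_ports = len(streams)
    length = max(len(s) for s in streams) + n_ports - 1
    schedule = []
    for port, items in enumerate(streams):
        row = [idle] * port + list(items)
        row += [idle] * (length - len(row))
        schedule.append(row)
    return schedule


schedule = skew_streams([
    ["a0", "a1", "a2"],
    ["b0", "b1", "b2"],
    ["c0", "c1", "c2"],
])

for port, row in enumerate(schedule):
    print(f"port {port}: " + " ".join(f"{item:>2}" for item in row))
```

Each printed row is one port and each column one time step, which mirrors the rows of data items and idle slots in the figure.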
The von Neumann Syndrome
The instruction-stream-based von Neumann approach: has several von Neumann overhead phenomena per CPU!
The data-stream-based anti machine approach: has no von Neumann bottlenecks
the watering pot model [Hartenstein]
Data meeting the Processing Unit (PU)
... explaining the RC advantage
We have 2 choices:
by Software: routing the data by memory-cycle-hungry instruction streams thru shared memory
by Configware: data-stream-based: placement* of the execution locality (PU) ... pipe network generated by configware compilation
*) before run time
JPEG zigzag scan pattern
a datastream language example (an animation)
[Figure: PixMap with axes x and y; data counters trace the scan segments numbered 1 to 4, HalfZigZag and reverse (HalfZigZag)]

*> Declarations
EastScan is
  step by [1,0]
end EastScan;
SouthScan is
  step by [0,1]
end SouthScan;
NorthEastScan is
  loop 8 times until [*,1]
    step by [1,-1]
  endloop
end NorthEastScan;
SouthWestScan is
  loop 8 times until [1,*]
    step by [-1,1]
  endloop
end SouthWestScan;
HalfZigZag is
  EastScan
  loop 3 times
    SouthWestScan
    SouthScan
    NorthEastScan
    EastScan
  endloop
end HalfZigZag;

goto PixMap[1,1]
HalfZigZag;
SouthWestScan
uturn (reverse (HalfZigZag))
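The address sequence that the data counter generates can be replayed in ordinary software. The sketch below is a plain Python model, not the flowware language itself. It assumes an 8 x 8 PixMap with 1-based [x, y] addresses, checks each "until" condition before every step, and models uturn (reverse (HalfZigZag)) by replaying the recorded HalfZigZag steps in reverse order. The function names are illustrative.

```python
def jpeg_zigzag_addresses():
    """Replay the zigzag data-counter program; return the visited [x, y] addresses."""
    x, y = 1, 1                      # goto PixMap[1,1]
    path = [(x, y)]
    steps = []                       # every step taken so far, as (dx, dy)

    def step(dx, dy):
        nonlocal x, y
        x, y = x + dx, y + dy
        path.append((x, y))
        steps.append((dx, dy))

    def east_scan():
        step(1, 0)

    def south_scan():
        step(0, 1)

    def north_east_scan():           # loop 8 times until [*,1]
        for _ in range(8):
            if y == 1:
                break
            step(1, -1)

    def south_west_scan():           # loop 8 times until [1,*]
        for _ in range(8):
            if x == 1:
                break
            step(-1, 1)

    def half_zigzag():
        east_scan()
        for _ in range(3):
            south_west_scan()
            south_scan()
            north_east_scan()
            east_scan()

    half_zigzag()
    half_steps = list(steps)         # remember the first half for the replay
    south_west_scan()
    # Assumption: uturn (reverse (HalfZigZag)) is modelled by replaying the
    # recorded steps of HalfZigZag in reverse order.
    for dx, dy in reversed(half_steps):
        step(dx, dy)
    return path


path = jpeg_zigzag_addresses()
print(path[:6], "...", path[-1])
```

Under these assumptions the walk visits each of the 64 addresses exactly once, starting at [1,1] and ending at [8,8].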
Double Dichotomy
Paradigm Dichotomy (both in the time domain): von Neumann (Software-Domain): instruction stream, versus Anti Machine (Flowware-Domain): data stream
Relativity Dichotomy (time versus space): Procedure (Software-Domain): time domain, versus Structure (Configware-Domain): space domain
Paradigm Dichotomy: an old hat
HDL scene ~1970: paradigm mapping causes a time to space mapping
decision box turns into demultiplexer
[Figure: a decision box with ENABLE, CONDITION and the exits B0 (0) and B1 (1), next to a demultiplexer with ENABLE, CONDITION and the outputs B0 and B1]
W. A. Clark: 1967 SJCC, AFIPS Conf. Proc.
C. G. Bell et al: IEEE Trans-C21/5, May 1972
RTM as DEC product available: 1973
“That's so simple! why did it take 30 years to find out ?” reductionists' tunnel view
David Parnas: Put [very] Old Ideas Into Practice (PvOIIP)
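The decision box and the demultiplexer can be put side by side in a few lines. This is only a sketch: the signal names ENABLE, CONDITION, B0 and B1 come from the figure, while the assumption that CONDITION = 1 selects B1 and the function names are illustrative.

```python
def decision_box(condition, b0, b1):
    """Time domain: control flow leaves through exactly one of the two exits."""
    if condition:
        return b1()
    return b0()


def demultiplexer(enable, condition):
    """Space domain: the ENABLE signal is routed to output B0 or output B1.

    Assumption: CONDITION = 1 selects B1, CONDITION = 0 selects B0.
    Returns the pair (B0, B1).
    """
    b0 = enable and not condition
    b1 = enable and condition
    return b0, b1


for condition in (False, True):
    taken = decision_box(condition, lambda: "B0", lambda: "B1")
    print(f"CONDITION={int(condition)}: branch {taken}, outputs {demultiplexer(True, condition)}")
```

The branch taken by the decision box always matches the output that the demultiplexer raises, which is the time to space mapping in its smallest form.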
Paradigm Dichotomy (2)
[Figure: the Double Dichotomy diagram, highlighting the Paradigm Dichotomy between von Neumann (Software-Domain, instruction stream) and Anti Machine (Flowware-Domain, data stream), both in the time domain]
software to flowware mapping ?
Relativity Dichotomy
[Figure: the Double Dichotomy diagram, highlighting the Relativity Dichotomy between Procedure (Software-Domain, time domain) and Structure (Configware-Domain, space domain)]
time to space mapping
Relativity Dichotomy (2)
[Figure: mappings between time, time/space and space]
time domain: procedure domain
2 phases: 1) programming instruction streams 2) run time
space domain: structure domain
3 phases: 1) reconfiguration of structures 2) programming data streams 3) run time
time-iterative to space-iterative
a time to space mapping: n time steps, 1 CPU -> 1 time step, n DPUs
the space dimension is limited (e.g. because of the chip size)
a time to space/time mapping: n*k time steps, 1 CPU -> n time steps, k DPUs
loop transformation methodology: 70ies and later
Strip mining [D. Loveman, J-ACM, 1977]
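The step counts of the time to space/time mapping can be sketched in software. The code below only models the counting: the k DPUs are simulated sequentially, and the function names and the example loop body are illustrative assumptions.

```python
def time_iterative(items, body):
    """n*k time steps on 1 CPU: one loop iteration per time step."""
    results = []
    time_steps = 0
    for item in items:
        results.append(body(item))
        time_steps += 1
    return results, time_steps


def strip_mined(items, body, k_dpus):
    """n time steps on k DPUs: each time step handles one strip of k items."""
    results = []
    time_steps = 0
    for start in range(0, len(items), k_dpus):
        strip = items[start:start + k_dpus]
        # In hardware the k DPUs would work on this strip simultaneously;
        # here the strip is processed sequentially to model one time step.
        results.extend(body(item) for item in strip)
        time_steps += 1
    return results, time_steps


items = list(range(12))                      # n*k = 12 iterations
sequential, steps_cpu = time_iterative(items, lambda v: v * v)
mapped, steps_dpus = strip_mined(items, lambda v: v * v, k_dpus=4)
print(steps_cpu, "time steps on 1 CPU,", steps_dpus, "time steps on 4 DPUs")
```

With 12 iterations and k = 4 DPUs the strip-mined version needs n = 3 time steps and produces the same results as the sequential loop.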
POIIP: Loop turns into Pipeline [1979]
loop: Memory and CPU executing the loop body
Pipeline: the loop body becomes a (reconfigurable) DataPath Unit: rDPU
complex loop body, nested loops: complex rDPU, or pipe network inside rDPU, or complex pipe network of rDPUs
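A software model of the same idea: the operations of a loop body are laid out as consecutive stages through which the data items stream. This is only an analogy in Python; generators do not run in parallel, and the two-operation loop body and all names are assumptions made for illustration.

```python
def loop_version(items):
    """Time domain: one CPU runs the whole loop body for each item in turn."""
    results = []
    for x in items:
        y = x + 1        # first operation of the loop body
        z = y * 2        # second operation of the loop body
        results.append(z)
    return results


def stage(operation, upstream):
    """One pipeline stage: apply a single operation to a stream of data items."""
    for item in upstream:
        yield operation(item)


def pipeline_version(items):
    """Space domain (modelled): each operation of the loop body is its own stage."""
    stream = iter(items)
    stream = stage(lambda x: x + 1, stream)
    stream = stage(lambda y: y * 2, stream)
    return list(stream)


print(loop_version(range(5)))
print(pipeline_version(range(5)))
```

Both versions compute the same values; in the pipeline version the structure of the stages, not an instruction stream, determines what happens to each data item.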
The Bubble Sort Algorithm
loop i = 2 … N
  loop j = 2 … N
    if key [j-1] > key [j] then
      swap (key [j-1], key [j])
    endif;
  endloop j;
endloop i;
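A runnable transcription of the pseudocode above, as a minimal sketch. It assumes 0-based Python indexing in place of the 1-based keys and returns a sorted copy instead of sorting in place.

```python
def bubble_sort(key):
    """Sort a list of keys with the two nested loops of the pseudocode."""
    key = list(key)                  # work on a copy
    n = len(key)
    for _ in range(1, n):            # loop i = 2 … N
        for j in range(1, n):        # loop j = 2 … N
            if key[j - 1] > key[j]:
                # swap (key [j-1], key [j])
                key[j - 1], key[j] = key[j], key[j - 1]
    return key


print(bubble_sort([5, 2, 9, 1, 5, 6]))
```

The inner loop body is the "conditional swap" that the following slides map from the time domain into the space domain.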
bubble sort example
architecture instead of synchro
direct time to space mapping
[Figure: array of conditional swap units]
modification: with shuffle function: only half of the number of blocks
„Shuffle Sort“
accessing conflicts
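How an array of conditional swap units sorts can be simulated in software. The sketch below is not the slide's Shuffle Sort: as a stand-in it uses odd-even transposition sort, a well-known sorting network built from conditional swaps on neighbouring keys. All names are illustrative.

```python
def conditional_swap(a, b):
    """One conditional swap unit: output the smaller key first."""
    return (a, b) if a <= b else (b, a)


def swap_array_sort(keys):
    """Sort with rows of conditional swap units (odd-even transposition sort).

    In each clock step the units work on disjoint neighbour pairs, so in
    hardware they could all operate at the same time. n clock steps are
    enough for n keys.
    """
    keys = list(keys)
    n = len(keys)
    for clock_step in range(n):
        first = clock_step % 2       # alternate between even and odd pairs
        for j in range(first, n - 1, 2):
            keys[j], keys[j + 1] = conditional_swap(keys[j], keys[j + 1])
    return keys


print(swap_array_sort([7, 3, 8, 1, 6, 2, 5, 4]))
```

The inner loop stands for one row of swap units; only the outer loop over clock steps remains in the time domain.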
time 2 space mapping

Time domain: Procedure-Domain                   space-Domain: Structure-Domain
time-Algorithm: Program loop                    space-Algorithm: Pipeline
  n time steps, 1 CPU                             1 clock step, n DPUs
time-Algorithm: Bubble Sort                     space- / time-Algorithm: Shuffle Sort
  n x k time steps, 1 „conditional swap“ unit     k clock steps, n „conditional swap“ units