Computer Architecture in the Many-Core Era

Oct 2, 2006 - Beyond caches and domain decomposition ... Arithmetic is cheap, communication is expensive. • Arithmetic ... Local Register. Time. Cost*.

ICCD: 1

Oct 2, 2006

[Figure: log-scale plot of Perf (ps/Inst) and Linear (ps/Inst) versus year; vertical axis 1e-4 to 1e+7, horizontal axis 1980 to 2020. Source: Dally et al., “The Last Classical Computer”, ISAT Study, 2001.]

[Figure not recoverable in this copy. Source: S. Borkar, Intel.]

[Figure labels: 90 nm chip, $200, 1 GHz; 12 mm; 64-bit FPU (to scale), 0.5 mm; 1 clock; decreasing BW; increasing power.]

[Table not recoverable in this copy.]

*Cost of providing 1 GW/s of bandwidth. All numbers approximate.

[Figure: storage hierarchy diagram. Labels from top to bottom: Global Memory; Switch; LM; CM; Switch; RM (three); Switch (three); R (nine); A (nine).]

loop over cells
    flux[i] = ...
loop over cells
    ... = f(flux[i], ...)

Flux passed through SRF, no memory traffic
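The producer/consumer pattern in the two loops over cells can be sketched as follows. This is a minimal illustration, not the code from the talk: the function names, the strip size, the stand-in arithmetic (`2.0 * c` for the flux and `c + f` for `f(flux[i], ...)`), and the traffic counter are all assumptions. A small Python list stands in for the on-chip buffer (the SRF) that holds each strip of flux values.

```python
def two_pass(cells):
    """Baseline: store the whole flux array, then load it back in a second loop."""
    traffic = 0                        # flux words moved to or from memory
    flux = [0.0] * len(cells)
    for i, c in enumerate(cells):      # first loop over cells
        flux[i] = 2.0 * c              # stand-in for the flux computation
        traffic += 1                   # store flux[i]
    out = []
    for i, c in enumerate(cells):      # second loop over cells
        out.append(c + flux[i])        # stand-in for f(flux[i], ...)
        traffic += 1                   # load flux[i]
    return out, traffic


def strip_mined(cells, strip=4):
    """Process one strip at a time; flux only ever lives in a small local buffer."""
    traffic = 0                        # stays 0: flux never goes to memory
    out = []
    for start in range(0, len(cells), strip):
        block = cells[start:start + strip]
        local_flux = [2.0 * c for c in block]                  # producer kernel
        out.extend(c + f for c, f in zip(block, local_flux))   # consumer kernel
    return out, traffic


cells = [float(i) for i in range(10)]
baseline, baseline_traffic = two_pass(cells)
streamed, streamed_traffic = strip_mined(cells)
print(baseline == streamed, baseline_traffic, streamed_traffic)
```

Both versions compute the same result; the difference is only in where the intermediate flux values are held between the producing and consuming loops.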

Explicit re-use of Cells, no misses

All needed data and instructions on-chip, no misses

99% hit rate; 1 miss costs 100s of cycles, 10,000s of ops
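The arithmetic behind this point can be checked with a short calculation. The specific numbers below are assumptions chosen only to fall in the ranges the slide gives: a 1-cycle hit, a 300-cycle miss, and 64 arithmetic operations per cycle. The function names are hypothetical.

```python
def average_access_cycles(hit_rate, hit_cycles, miss_cycles):
    """Expected cycles per access for a given hit rate."""
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles


def ops_lost_per_miss(miss_cycles, ops_per_cycle):
    """Arithmetic operations that could have issued while one miss is outstanding."""
    return miss_cycles * ops_per_cycle


# Assumed values, not figures from the talk.
HIT_CYCLES = 1
MISS_CYCLES = 300
OPS_PER_CYCLE = 64

avg = average_access_cycles(0.99, HIT_CYCLES, MISS_CYCLES)
print(round(avg, 2))                                    # 3.99
print(ops_lost_per_miss(MISS_CYCLES, OPS_PER_CYCLE))    # 19200
```

Under these assumptions a 99% hit rate still makes the average access roughly four times slower than a hit, and a single miss idles on the order of ten thousand operation slots.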

[Figure: software-pipelined schedule ("SW Pipeline") with one iteration marked; cycle axis 0 to 120.]

ComputeCellInt kernel from StreamFem3D. Over 95% of peak with simple hardware. Depends on explicit communication to make delays predictable.
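How a software-pipelined loop approaches peak can be sketched with a simple cycle count. This is an illustrative model under stated assumptions, not the schedule from the talk: the iteration latency (120 cycles), the initiation interval (20 cycles), and both function names are hypothetical, and the model assumes every latency is fixed and known when the schedule is built.

```python
def pipelined_cycles(iterations, iteration_latency, initiation_interval):
    """Total cycles when a new iteration starts every `initiation_interval` cycles."""
    if iterations <= 0:
        return 0
    return (iterations - 1) * initiation_interval + iteration_latency


def utilization(iterations, iteration_latency, initiation_interval):
    """Busy issue cycles divided by total cycles, counting fill and drain."""
    total = pipelined_cycles(iterations, iteration_latency, initiation_interval)
    if total == 0:
        return 0.0
    return iterations * initiation_interval / total


# Assumed parameters for illustration only.
for n in (10, 100, 1000):
    print(n, pipelined_cycles(n, 120, 20), round(utilization(n, 120, 20), 3))
```

The fill and drain overhead is amortized as the iteration count grows. A static schedule like this only holds if no operation takes longer than planned, which is one way to read the slide's remark about predictable delays.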

StreamFEM application

[Figure: dataflow diagram. Labels: Element Faces; Gathered Elements; Gather Cell; Compute Flux States; Face Geometry; Compute Numerical Flux; Numerical Flux; Cell Geometry; Cell Orientations; Compute Cell Interior; Advance Cell; Elements (Current); Elements (New); Read-Only Table Lookup Data (Master Element).]

Prefetching, reuse, use/def, limited spilling

Node memory

void __task matmul::leaf( __in float A[M][P],
                          __in float B[P][N],
                          __inout float C[M][N] )
{
    for (int i=0; i
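The listing above is cut off in this copy, so its body is unknown. As a rough guide to what a leaf task with this signature might compute, the sketch below assumes it accumulates the product of A (M x P) and B (P x N) into C (M x N). The `__in` and `__inout` qualifiers suggest A and B are read-only and C is updated in place, but the accumulate semantics and the Python function name are assumptions.

```python
def matmul_leaf(A, B, C):
    """Accumulate A (M x P) times B (P x N) into C (M x N), in place.

    Assumed semantics: C += A * B. The original listing is truncated.
    """
    M, P, N = len(A), len(B), len(B[0])
    for i in range(M):
        for j in range(N):
            acc = C[i][j]                  # start from the existing value of C
            for k in range(P):
                acc += A[i][k] * B[k][j]
            C[i][j] = acc
    return C


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[1, 0], [0, 1]]
print(matmul_leaf(A, B, C))   # [[20, 22], [43, 51]]
```

A leaf like this operates only on blocks small enough to fit in the lowest level of the memory hierarchy; how the larger matrices are divided into such blocks is not shown in this copy.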