NASA Technical Memorandum 85945

Study of the Mapping of Navier-Stokes Algorithms onto Multiple-Instruction/Multiple-Data-Stream Computers

D. Scott Eberhardt, Donald Baganoff, and K. G. Stevens, Jr.

July 1984

National Aeronautics and Space Administration
Examples:

  Iterations   Speedup      Iterations   Speedup
       1        1.574            15       1.872
       2        1.705            50       1.895
       5        1.813           100       1.900
      10        1.857           400       1.904

TABLE 3.- TASK TIMINGS: NAVIER-STOKES EQUATIONS

                           Time-MIMD code, sec      Time-serial    Speedup
  Task                     Serial    Concurrent     code, sec
  ------------------------------------------------------------------------
  Setup                     6.57        --             6.07         0.924
  RHS                        .02       7.84           15.15         1.927
  Residual + BC              .84        --              .84         1.0
  LHS                        .08      15.47           30.50         1.972
  Output (optional)        (1.64)       --          Not measured    Assume 1.0
  Time/iteration            (.94)    (23.31)         (46.42)        1.914
  Output routines           3.24        --             3.22          .994

Examples:

  Iterations   Speedup      Iterations   Speedup
       1        1.635            25       1.899
       2        1.751            50       1.906
       5        1.842           100       1.910
      10        1.876           400       1.913
      15        1.889
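The example speedups just given can be reproduced from the Table 3 entries by treating setup and the output routines as one-time costs and everything else as a per-iteration cost. A minimal C sketch under that assumption (the constants are the reconstructed Table 3 values; the combining formula is our reading, not code from the report):

```c
#include <stdio.h>

/* One-time and per-iteration costs from Table 3, in seconds.
 * MIMD fixed   = setup (6.57) + output routines (3.24)
 * serial fixed = setup (6.07) + output routines (3.22)
 * Per iteration: MIMD .94 (serial part) + 23.31 (concurrent part),
 * serial code 46.42.
 */
#define MIMD_FIXED    (6.57 + 3.24)
#define SERIAL_FIXED  (6.07 + 3.22)
#define MIMD_ITER     (0.94 + 23.31)
#define SERIAL_ITER   46.42

int main(void)
{
    static const int iters[] = { 1, 2, 5, 10, 15, 25, 50, 100, 400 };

    printf("Iterations  Speedup\n");
    for (int i = 0; i < (int)(sizeof iters / sizeof iters[0]); i++) {
        double t_mimd   = MIMD_FIXED   + iters[i] * MIMD_ITER;
        double t_serial = SERIAL_FIXED + iters[i] * SERIAL_ITER;
        printf("%10d  %7.3f\n", iters[i], t_serial / t_mimd);
    }
    return 0;
}
```

The printed speedups agree with the tabulated examples to within a unit in the last digit, and approach the asymptotic per-iteration speedup 46.42/24.25 = 1.914 as the one-time costs are amortized over more iterations.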
TABLE 4.- TOTAL CPU TIMINGS FOR EULER EQUATION

  Number of    t_MAIN,   t_RHS,    t_LHS,    t_MIMD,   t_serial,   Speedup
  iterations     sec       sec       sec       sec        sec
  ------------------------------------------------------------------------
       1        11.60      4.21     13.67     29.48      45.32      1.537
       1        11.79      4.23     13.74     29.76      45.32      1.523
       2        14.25      8.41     27.36     50.02      82.70      1.653
       5        16.10     20.98     67.95    105.03     187.36      1.784
      10        22.60     42.04    132.38    197.02     365.60      1.856
      15        28.03     63.03    204.58    295.64     544.49      1.842
      25        33.45    104.79    338.04    476.28     897.78      1.885
TABLE 5.- TOTAL CPU TIMINGS FOR NAVIER-STOKES EQUATION

  Number of    t_MAIN,   t_RHS,    t_LHS,    t_MIMD,   t_serial,   Speedup
  iterations     sec       sec       sec       sec        sec
  ------------------------------------------------------------------------
       1        11.62      7.84     15.47     34.93      56.70      1.623
       2        13.93     15.69     30.87     60.49     105.07      1.737
       5        16.19     39.15     77.15    132.49     244.24      1.843
      10        20.58     78.17    155.13    253.88     478.13      1.883
      25        33.36    196.72    386.95    617.03    1173.25      1.901
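In Tables 4 and 5, t_MIMD is the sum of the three component timings and the speedup column is t_serial/t_MIMD. A short C check of the 25-iteration Navier-Stokes row, assuming exactly that column structure:

```c
#include <stdio.h>

int main(void)
{
    /* Table 5, 25-iteration row (seconds). */
    double t_main = 33.36, t_rhs = 196.72, t_lhs = 386.95;
    double t_serial = 1173.25;

    double t_mimd = t_main + t_rhs + t_lhs;        /* 617.03 */
    printf("t_MIMD  = %.2f s\n", t_mimd);
    printf("Speedup = %.3f\n", t_serial / t_mimd); /* 1.901 */
    return 0;
}
```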
TABLE 6.- STOPWATCH TIMINGS FOR EULER EQUATION

  Number of    t_MIMD,   t_serial,   Speedup
  iterations     sec        sec
  -------------------------------------------
       1        44.61      46.14      1.034
       1        36.98      46.14      1.248
       2        58.10      82.70      1.423
       5       114.68     188.32      1.642
      10       213.66     366.98      1.718
      15       311.86     545.98      1.751
      25       499.44     899.53      1.801
TABLE 7.- STOPWATCH TIMINGS FOR NAVIER-STOKES EQUATION

  Number of    t_MIMD,   t_serial,   Speedup
  iterations     sec        sec
  -------------------------------------------
       1        43.73      58.39      1.335
       2        68.58     106.75      1.557
       5       151.50     246.11      1.642
      10       269.69     479.39      1.778
      25       647.78    1175.21      1.814
Figure 1.- MIMD procedure for solving approximate-factored algorithms. [Flow diagram: previous iteration -> L_xyz RHS -> SYNCHRONIZE -> F_x^-1 -> q* -> SYNCHRONIZE -> F_y^-1 -> q** -> SYNCHRONIZE -> F_z^-1 -> q^(n+1) -> SYNCHRONIZE -> next iteration.]

Figure 2.- Implementation of a two-dimensional problem on a four-processor system. [Diagram: the four processors perform sweep 1 (j = 1, ..., Jmax at fixed k) and sweep 2 (k-direction) over the grid.]

Figure 3.- Memory partition for a four-processor system set up for a two-dimensional problem. [Diagram: 4 x 4 array of grid blocks, (1,1) through (4,4).]

Figure 4.- Serial code flow summary. [Flow chart: INITIALIZATION - data input, grid generation, variable initialization; MAIN ITERATION LOOP - boundary conditions, right-hand-side operator, explicit smoothing, calculate residual (convergence?), implicit integration (x-sweep, y-sweep, z-sweep), output every N iterations; FINAL OUTPUT ROUTINES.]
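The SYNCHRONIZE points in figure 1 are global barriers: no processor may begin the y-inversion until every processor has finished its share of the x-inversion, because each factored operator couples grid points along a different direction of the partition in figure 3. The sketch below shows that pattern in C with POSIX threads; the original code used separate VMS processes and event flags rather than threads, so the barrier and the kernel names here are modern stand-ins, not the AIR3D routines.

```c
#include <pthread.h>
#include <stdio.h>

#define NPROC 4
static pthread_barrier_t sync_point;

/* Hypothetical per-processor kernels: each touches only its own
 * partition of the grid (cf. the memory partition of fig. 3). */
static void rhs(int id)     { (void)id; /* form right-hand side   */ }
static void x_sweep(int id) { (void)id; /* invert F_x -> q*       */ }
static void y_sweep(int id) { (void)id; /* invert F_y -> q**      */ }
static void z_sweep(int id) { (void)id; /* invert F_z -> q^(n+1)  */ }

static void *iterate(void *arg)
{
    int id = *(int *)arg;

    rhs(id);      pthread_barrier_wait(&sync_point);  /* SYNCHRONIZE */
    x_sweep(id);  pthread_barrier_wait(&sync_point);  /* q*  done    */
    y_sweep(id);  pthread_barrier_wait(&sync_point);  /* q** done    */
    z_sweep(id);  pthread_barrier_wait(&sync_point);  /* iteration over */
    return NULL;
}

int main(void)
{
    pthread_t th[NPROC];
    int id[NPROC];

    pthread_barrier_init(&sync_point, NULL, NPROC);
    for (int i = 0; i < NPROC; i++) {
        id[i] = i;
        pthread_create(&th[i], NULL, iterate, &id[i]);
    }
    for (int i = 0; i < NPROC; i++)
        pthread_join(th[i], NULL);
    pthread_barrier_destroy(&sync_point);
    puts("one MIMD iteration complete");
    return 0;
}
```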
Figure 5.- Serial code subroutines. [Call tree for PROGRAM AIR3D: INITIA, GRID, JACOB, METOUT (x4), OUT2, EIGEN; STEP calls BC, RHS (FLUXVE, DIFFER, VISRHS, looping over the x-, y-, and z-metrics XXM, YYM, ZZM), MUTUR, SMOOTH, FILTRX, FILTRY, FILTRZ (each with AMATRX and BTRI; VISMAT in FILTRZ), MAP, OUT2 (x4); after the n-loop: OUT (x6), PLOT (x4), PLANE.]
Figure 6.- Concurrent code flow summary. [Flow chart: MEMORY MAPPING AND SUBPROCESS SETUP - map global sections, create flag clusters, create subprocesses (serial); INITIALIZATION - data input, grid generation, variable initialization (serial); MAIN ITERATION LOOP - boundary conditions (serial), right-hand-side operator (parallel), explicit smoothing (parallel), calculate residual - convergence? (serial), implicit integration - x-, y-, z-sweeps (parallel), output every N iterations (serial); FINAL OUTPUT ROUTINES (serial); SUBPROCESS DELETION AND MEMORY DEALLOCATION (serial).]
Figure 7.- Concurrent code subroutines. [Call tree: MAPPRM, SPAWN, INITIA, GRID, JACOB, METOUT (x4), METPASS/METREC (passing the XXM, YYM, ZZM metrics), OUT2, EIGEN; STEP calls BC, SUBRHS/RHS (FLUXVE, DIFFER over the three metrics), SMOOTH, FILTRX, FILTRY, FILTRZ (each with AMATRX and BTRI; VISMAT in FILTRZ), MAP, OUT2 (x4), with (SYNC) points between phases; OUT (x6), PLOT (x4), PLANE, DELPRM.]
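The setup phase of figures 6 and 7 (map global sections, create flag clusters, spawn subprocesses, then DELPRM to tear everything down) has a close POSIX analogue: one shared-memory region mapped into every process, with a synchronization object allocated inside it. The following minimal sketch is under that analogy only; mmap, fork, and semaphores are illustrative stand-ins for MAPPRM, SPAWN, the MA780 global sections, and the VMS event-flag clusters, not the report's actual mechanisms.

```c
#define _DEFAULT_SOURCE
#include <semaphore.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define NPROC 2
#define NPTS  1024

/* Layout of the "global section": shared flow variables plus a
 * counting semaphore standing in for a common-event-flag cluster. */
struct global_section {
    sem_t  done;        /* posted by each worker when its sweep ends */
    double q[NPTS];     /* shared solution vector */
};

int main(void)
{
    /* Map one region visible to all processes (cf. MAPPRM). */
    struct global_section *g = mmap(NULL, sizeof *g,
                                    PROT_READ | PROT_WRITE,
                                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    sem_init(&g->done, /*pshared=*/1, 0);

    for (int p = 0; p < NPROC; p++) {          /* cf. SPAWN */
        if (fork() == 0) {
            /* Each subprocess updates its own slice of q. */
            for (int i = p * NPTS / NPROC; i < (p + 1) * NPTS / NPROC; i++)
                g->q[i] = 1.0;
            sem_post(&g->done);                /* set completion flag */
            _exit(0);
        }
    }
    for (int p = 0; p < NPROC; p++)            /* wait on all flags */
        sem_wait(&g->done);
    while (wait(NULL) > 0) ;                   /* cf. DELPRM */
    printf("q[0] = %.1f, q[%d] = %.1f\n", g->q[0], NPTS - 1, g->q[NPTS - 1]);
    return 0;
}
```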
Figure 8.- Process/memory allocation (using data block names). [Diagram: the MAIN process runs on MERCURY and the SLAVE process on JUPITER. Each machine holds LOCAL data blocks (MAIN: GEO, READ, VAR3, COUNT, PLOT; both: VAR1, STEPUP, and VAR3 for BTRI, SUBRHS, and RHS), while the shared MA780 memory holds BASE, VIS, VARS, VARO, PPRCSS, TURMU, MUKIN, FS.]
Figure 9.- Speedup of the Pulliam-Steger AIR3D code using two processors to solve the Euler equations. [Plot: speedup (1.5 to 2.0) versus iterations (1 to 1000); curves for component timings, total CPU time, and real time.]
Figure 10.- Speedup of the Pulliam-Steger AIR3D code using two processors to solve the Navier-Stokes equation with a thin-layer approximation and an algebraic turbulence model. [Plot: speedup versus iterations (1 to 1000); curves for component timings, total CPU time, and real time.]
1.  Report No.: NASA TM-85945
2.  Government Accession No.:
3.  Recipient's Catalog No.:
4.  Title and Subtitle: STUDY OF THE MAPPING OF NAVIER-STOKES ALGORITHMS ONTO MULTIPLE-INSTRUCTION/MULTIPLE-DATA-STREAM COMPUTERS
5.  Report Date: July 1984
6.  Performing Organization Code:
7.  Author(s): D. Scott Eberhardt and Donald Baganoff (Stanford Univ., Stanford, CA), and Ken Stevens
8.  Performing Organization Report No.: A-9716
9.  Performing Organization Name and Address: Ames Research Center, Moffett Field, CA 94035
10. Work Unit No.:
11. Contract or Grant No.:
12. Sponsoring Agency Name and Address: National Aeronautics and Space Administration, Washington, DC 20546
13. Type of Report and Period Covered: Technical Memorandum
14. Sponsoring Agency Code: 505-37-01
15. Supplementary Notes: Point of Contact: Ken Stevens, Ames Research Center, MS 233-14, Moffett Field, CA 94035, (415) 965-5949 or FTS 448-5949
16. Abstract: Implicit approximate-factored algorithms have certain properties that are suitable for parallel processing. This study demonstrates how a particular computational fluid dynamics (CFD) code, using this algorithm, can be mapped onto a multiple-instruction/multiple-data-stream (MIMD) computer architecture. An explanation of this mapping procedure is presented, as well as some of the difficulties encountered when trying to run the code concurrently. Timing results are given for runs on the Ames Research Center's MIMD test facility, which consists of two VAX 11/780's with a common MA780 multi-ported memory. Speedups exceeding 1.9 for characteristic CFD runs were indicated by the timing results.
17. Key Words (Suggested by Author(s)): Navier-Stokes equations; MIMD; Concurrent processing; Computational fluid dynamics
18. Distribution Statement: Unclassified - Unlimited. Subject category 62
19. Security Classif. (of this report): Unclassified
20. Security Classif. (of this page): Unclassified
21. No. of Pages: 26
22. Price: A03

For sale by the National Technical Information Service, Springfield, Virginia 22161