PARALLEL COMPUTING METHODS FOR THE EPRI SPATIAL KINETICS CODE ARROTTA

Han Gyu Joo, Thomas J. Downar, Douglas A. Barber, Guobing Jiang
School of Nuclear Engineering, Purdue University, W. Lafayette, IN 47907-1290

Laurence D. Eisenhart, Antonio F. Dias, and Jeffrey L. Voskuil
S. Levy Incorporated, 3425 S. Bascom Avenue, Campbell, CA 95008-7006

Abstract

New neutronics solution methods implemented in the EPRI spatial kinetics code ARROTTA are presented. The new methods were originally developed for parallel execution, but they also yield a significant performance improvement in serial applications. The methods consist of a nonlinear nodal method based on two-node coupling relations derived for both the analytic nodal method (ANM) and the nodal expansion method (NEM), a Krylov subspace solver for the coarse mesh finite differenced problem, and a domain decomposition method to achieve parallelism. The new code is examined using various transient benchmark problems, including the NEA PWR rod ejection and the uncontrolled rod withdrawal at HZP. To verify the solution accuracy of the new methods, the original ARROTTA solutions are compared with the new solutions for both eigenvalue and transient calculations. Comparisons are also made to assess the solution accuracy of the two-node ANM and NEM. Parallel results are given for a two-processor Sun UltraSPARC II, which is a symmetric multiprocessor (SMP) machine.

Introduction

The ARROTTA-01 [1] computer program is a three-dimensional kinetics code developed by EPRI for solving LWR static and transient problems. ARROTTA neutronics are based on the Analytic Nodal Method (ANM) [2]. The code has been successfully benchmarked, and the results are documented in numerous reports [3] for a variety of LWR applications. The purpose of the work reported here was to significantly reduce the neutronic computation time of ARROTTA by implementing new neutronics methods which were originally designed to enhance the performance of the code on parallel computers. However, the methods developed here have also considerably reduced the computational burden on serial machines. These new neutronics methods include:

1. a nonlinear nodal method [4] based on "two-node" coupling relations derived from the ANM,
2. a time differencing method to generate the coarse mesh finite differenced (CMFD) transient fixed source problem [5],
3. the Krylov subspace based CMFD solution method [6], and
4. the incomplete domain decomposition method for parallel execution [6].

The methods were tested using a variety of benchmark problems, including the NEA rod ejection [7] and rod withdrawal [8] spatial kinetics benchmark problems. Applications were performed on a Sun UltraSPARC II multiprocessor. The work was confined to neutronics, and no changes were made to the thermal-hydraulics methods in ARROTTA.

Methods

In the nonlinear nodal method, higher order coupling between nodes is imposed only at the local or "two-node" level. The coupling coefficients of a global CMFD problem are iteratively updated until convergence is achieved. In contrast, the original ARROTTA solution scheme imposes the higher order coupling at the global level, which results in a considerably more complicated coefficient matrix. The nonlinear nodal method has been shown to be computationally superior in terms of CPU time, memory utilization [4], and parallel implementation [6].

The accuracy of the Analytic Nodal Method was preserved in the nonlinear nodal method by deriving a "two-node" coupling relation based on the ANM. The coupling at each two-node interface requires the solution of four 2x2 linear systems and one 4x4 linear system. For purposes of comparison, a "two-node" coupling relation based on the Nodal Expansion Method (NEM) was also implemented in ARROTTA, and comparisons of ANM and NEM for some of the benchmark problems are provided in the results section.

The time differencing methods and the details of the solution techniques for the transient fixed source problem are described in Reference 5. Significant reductions in the transient execution time are attributable to applying the theta time differencing method only to the CMFD problem.

To achieve the best performance for both serial and parallel applications, the CMFD solution in the nonlinear nodal method is based on a Krylov subspace method, Bi-Conjugate Gradient Stabilized (BiCGSTAB), accelerated with a preconditioning scheme based on a blockwise incomplete LU factorization [6].
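As a rough illustration of the preconditioned BiCGSTAB iteration named above, the following is a minimal pure-Python sketch. It is not the ARROTTA/P solver: the dense test matrix, the helper names (`matvec`, `jacobi_preconditioner`, `build_test_system`) and the diagonal (Jacobi) preconditioner are assumptions made for illustration, with the Jacobi scaling standing in for the blockwise incomplete LU factorization of Reference 6. The stopping test uses the relative residual, the same kind of tolerance described later for ARROTTA/P.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def jacobi_preconditioner(A):
    """Apply M^-1 with M = diag(A); a simple stand-in for blockwise ILU."""
    inv_diag = [1.0 / A[i][i] for i in range(len(A))]
    return lambda r: [d * ri for d, ri in zip(inv_diag, r)]

def bicgstab(A, b, apply_precond, tol=1e-3, max_iter=200):
    """Preconditioned BiCGSTAB; returns (x, iterations used).

    Stops when ||r|| / ||b|| < tol. Breakdown checks are omitted for brevity.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual for the zero initial guess
    r_hat = list(r)                  # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0
    for it in range(1, max_iter + 1):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        y = apply_precond(p)
        v = matvec(A, y)
        alpha = rho_new / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if math.sqrt(dot(s, s)) / b_norm < tol:
            return [xi + alpha * yi for xi, yi in zip(x, y)], it
        z = apply_precond(s)
        t = matvec(A, z)
        omega = dot(t, s) / dot(t, t)
        x = [xi + alpha * yi + omega * zi for xi, yi, zi in zip(x, y, z)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        if math.sqrt(dot(r, r)) / b_norm < tol:
            return x, it
        rho = rho_new
    return x, max_iter

def build_test_system(n):
    """Hypothetical nonsymmetric, diagonally dominant tridiagonal test problem."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 4.0 + 0.1 * i
        if i > 0:
            A[i][i - 1] = -1.2
        if i < n - 1:
            A[i][i + 1] = -0.8
    x_true = [1.0 + 0.5 * i for i in range(n)]
    return A, matvec(A, x_true), x_true

if __name__ == "__main__":
    A, b, x_true = build_test_system(20)
    x, iters = bicgstab(A, b, jacobi_preconditioner(A), tol=1e-10)
    print("iterations:", iters)
    print("max error:", max(abs(a - c) for a, c in zip(x, x_true)))
```

Each iteration costs two matrix-vector products and two preconditioner applications, so the quality of the preconditioner largely determines the total work.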
Parallelism is achieved using an incomplete domain decomposition preconditioning method [6].

As byproducts of the new solution methods, two additional improvements in the solution scheme were possible. First, the Wielandt shift method was implemented as an alternative to the Chebyshev polynomial technique to accelerate the fission source iteration during the eigenvalue calculation. This became practical because the Krylov CMFD solution method adopted for the transient fixed source problems does not involve a group-wise solution process. Second, the actual "ragged" reactor core boundary is treated explicitly instead of a "square" core. This treatment became trivial with the nonlinear nodal method and resulted in a noticeable reduction in the computational requirements. The version of ARROTTA with the modifications indicated above is referred to as ARROTTA/P in the following applications.

Applications

ARROTTA/P was validated and verified using a variety of test problems which encompass different core sizes, thermal/hydraulic feedback conditions, symmetry conditions, Assembly Discontinuity Factor (ADF) options, and mesh spacing variations. The test problems included the problems examined in Reference 3, as well as the more recent NEA spatial kinetics problems. Because of space limitations, the analysis here focuses on the NEA spatial kinetics benchmarks, since they have been widely analyzed by modern nodal codes. The NEA problems include the ejection of a control assembly from an initially critical core at hot zero power (HZP) and hot full power (HFP) conditions [7], as well as an uncontrolled withdrawal of a control bank at HZP [8]. The solutions were analyzed by comparison with published reference solutions [7,8].

In the ARROTTA model for the NEA problems, each fuel assembly is treated as 2x2 nodes in the radial plane and divided into 18 nodes axially. Minor changes were made to the heat conduction routines of ARROTTA to incorporate the specified prescription for Doppler feedback, which requires centerline and surface fuel temperatures. Because the NEACRP core is small (157 assemblies), some results are also provided for the HERMITE rod ejection problem [3], which has a larger core (193 assemblies), to examine core size effects.

All problems were executed on a two-processor Sun UltraSPARC II computer with double precision arithmetic. Execution times are provided for both serial and parallel performance. The test problems were run with both ARROTTA and ARROTTA/P. For ARROTTA, the convergence criteria for the eigenvalue, the steady-state global fission source, and the transient global fission source were set at 5x10^-6, 0.0001 and 0.001, respectively. In ARROTTA/P the relative residual was used as the tolerance, and a value of 0.001 was found to be consistent with the convergence criteria used in ARROTTA. For the transient solutions, fixed time step sizes ranging from 1 ms to 100 ms were selected depending on the type of problem.

Results

Comparisons of ARROTTA and ARROTTA/P solutions are discussed first for the eigenvalue calculations and then for the transient calculations. Parallel results for both problems are provided at the end of this section.

Eigenvalue Calculations

For the eigenvalue problem, the ANM option provides a consistently more accurate solution than the NEM option. This is evidenced by a comparison of the ANM and NEM eigenvalues and power distributions given in Table 1 and Figure 1.
This difference is more pronounced for smaller, more tightly coupled core problems such as the NEACRP, in which the error in both the keff and the power distribution of the NEM solution exceeds that of the ANM solution by an order of magnitude. For the larger core of the HERMITE problem, the NEM solution errors in eigenvalue and RPD were 9 pcm and 0.6%, respectively. ANM is more accurate primarily because its hyperbolic expansion functions are better able to treat flux gradients in the vicinity of the core reflector, and these gradients are larger at the boundary of smaller cores. It was verified that the NEM solution approached the ANM solution as the radial mesh size was refined.

As noted in Table 1, the execution time for the ANM "two-node" portion of the nonlinear nodal calculation exceeds the NEM time by 40% because of the higher CPU time required to evaluate the hyperbolic and trigonometric functions in ANM. However, the total execution time for the nonlinear nodal calculation with ANM exceeds the total time with NEM by only about 15%. Also as indicated in Table 1, the decrease in the execution time due to the use of the ragged core boundary is proportional to the fraction of the core consisting of dummy nodes in the square geometry. The execution time reduction is more pronounced for the small NEACRP cores, with about 24% dummy nodes, than for the large HERMITE cores, with about 10% dummy nodes. The use of the Wielandt shift results in about a 15% decrease in the execution time. It is worth noting that the number of outer iterations is significantly reduced with the Wielandt shift method.

Table 1 Summary of Eigenvalue Calculations for the NEACRP A2 Problem

Method    Geo.  Acc.   ∆k(a),  RPD(b)     Nout   CPU Times, sec    Neutronic(c)
                       pcm     Error, %          Nodal    Total    CPU Time Ratio
ARROTTA   SQ    CHEB    --       --        82      --     30.61        1.0
A/P-ANM   SQ    CHEB   -1.7     0.02       64     4.07     9.87        3.1
A/P-ANM   RG    CHEB   -2.0    -0.03       64     3.15     7.70        4.0
A/P-ANM   RG    WIEL   -1.9    -0.03       27     3.18     6.52        4.7
A/P-NEM   RG    WIEL   32.2     1.76       27     2.24     5.59        5.5

SQ: square core boundary; RG: ragged core boundary; CHEB: Chebyshev acceleration; WIEL: Wielandt shift.
a Based on ARROTTA keff of 0.999807 at 1160.6 ppm
b Based on ARROTTA Radial Power Distribution
c Taking ARROTTA neutronic iteration time as the numerator
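The drop in outer iterations with the Wielandt shift (Nout falls from 64 to 27 in Table 1) can be illustrated on a toy problem. The sketch below is not the ARROTTA/P implementation: it assumes a hypothetical one-group, one-dimensional model with loss operator M = tridiag(-1, 2 + sigma, -1) and production operator F = nu*I, and a fixed shift `k_shift` chosen above the expected eigenvalue. The paper does not state how the shift is chosen, so that choice is an assumption. The shifted iteration solves (M - F/k_shift) phi_new = mu F phi, with mu = 1/k - 1/k_shift.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm (no pivoting); lower[0] and upper[-1] are unused."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def outer_iterations(n, sigma, nu, k_shift=None, tol=1e-8, max_outer=10000):
    """Fission source (power) iteration, optionally with a fixed Wielandt shift.

    Returns (k_eff estimate, number of outer iterations).
    """
    lam_s = 0.0 if k_shift is None else 1.0 / k_shift   # shift in 1/k
    lower = [-1.0] * n
    upper = [-1.0] * n
    diag = [2.0 + sigma - lam_s * nu] * n                # diagonal of M - lam_s * F
    phi = [1.0] * n                                      # flat guess, sum(phi) == n
    mu = 1.0 - lam_s                                     # initial guess k = 1
    for it in range(1, max_outer + 1):
        source = [mu * nu * p for p in phi]
        raw = solve_tridiagonal(lower, diag, upper, source)
        total = sum(raw)
        mu = mu * n / total                              # eigenvalue update
        new_phi = [p * n / total for p in raw]           # renormalize flux
        change = max(abs(a - b) / a for a, b in zip(new_phi, phi))
        phi = new_phi
        if change < tol:
            return 1.0 / (mu + lam_s), it
    return 1.0 / (mu + lam_s), max_outer

if __name__ == "__main__":
    n, sigma, nu = 30, 0.5, 0.5
    k_exact = nu / (sigma + 2.0 - 2.0 * math.cos(math.pi / (n + 1)))
    print("analytic k:", k_exact)
    print("no shift  :", outer_iterations(n, sigma, nu))
    print("shifted   :", outer_iterations(n, sigma, nu, k_shift=1.02))
```

The shift moves the eigenvalues of the iteration operator so that the fundamental mode is better separated from the higher modes, which lowers the effective dominance ratio and hence the number of outer iterations. The shift must stay above the true eigenvalue for the shifted matrix to remain positive definite.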

[Figure 1: map of assembly-wise radial power. Each entry gives the ARROTTA relative power, the A/P-NEM error (%), and the A/P-ANM error (%).]

ARROTTA   A/P-NEM Error, %   A/P-ANM Error, %
0.2470        -1.34               0.04
0.4120        -1.26               0.00
0.3965        -1.11               0.00
0.9808        -0.92               0.01
1.4430        -0.64               0.01
1.3929        -0.39               0.01
0.6335         0.24               0.00
0.7532         1.59               0.01
0.5289        -1.13               0.00
0.6475        -1.03               0.02
1.0938        -0.80               0.01
1.4135        -0.67               0.01
1.4447        -0.20               0.01
1.1501         0.43               0.00
0.6674         2.05              -0.01
0.4980        -0.94               0.02
1.0661        -0.76               0.01
1.3683        -0.42               0.01
1.3628        -0.02               0.00
1.2830         1.07              -0.02
1.1347        -0.52               0.01
1.0640        -0.26               0.00
1.1877         0.39               0.00
0.8643         1.74              -0.03
0.4945         0.24               0.00
0.6457         1.72              -0.02

Maxima: A/P-NEM -1.34% (negative) and 2.05% (positive); A/P-ANM -0.03% (negative) and 0.04% (positive).

Figure 1 ARROTTA/P Radial Power Distribution Error in NEACRP Case C1

In general, the accuracy achievable with ARROTTA/P-ANM for the eigenvalue problem is comparable to that of the original ARROTTA. However, the execution time for the cases shown here is reduced by about a factor of 5 on a single processor of the UltraSPARC.

Transient Calculations

For the transient problems, the flux solutions from ARROTTA and ARROTTA/P-ANM are also in good agreement. The minor discrepancy noted in Figure 2 between the ARROTTA solutions and the reference PANTHER results for the NEA HZP rod ejection problems can be attributed to the use of a lumped parameter fuel temperature model, which cannot provide accurate fuel centerline and surface temperatures. These temperatures are required to evaluate the Doppler temperature as prescribed in the problem specification.

The advantage of ANM over NEM in solution accuracy for the transient problem is very noticeable in Figure 2. It is interesting to note that because the peripheral power is overpredicted and the interior power is underpredicted in the NEM steady-state flux solution, as seen in Figure 1, NEM underpredicts the peak power for the central rod ejection (Fig. 2a) and overpredicts the peak power for the peripheral rod ejection (Fig. 2b). It is also worth noting that the differences between the NEM and ANM peak power predictions are less pronounced for the larger core of the HERMITE problem (Fig. 2c).

The time step size has a considerable effect on the transient solution. As indicated in Table 2, the accuracy of the ARROTTA solution diminishes considerably as the time step size increases. In contrast, the accuracy of the ARROTTA/P solution is maintained with larger time step sizes, as shown in Table 2 and Figure 2. This is attributable to both the application of the theta time differencing scheme and the use of a precursor integration method. As indicated in Table 2, the execution time of ARROTTA/P is about a factor of 6 smaller than that of ARROTTA for the same time step size. Moreover, ARROTTA/P can achieve the same accuracy as ARROTTA with a larger time step size, and for the HZP cases ARROTTA/P shows as much as a factor of 28 reduction in the execution time.

ARROTTA and ARROTTA/P both provide acceptable accuracy on the NEACRP Rod Withdrawal problem, as shown in Figure 3. It is interesting to note that the rod cusping model plays an important role in this problem if coarse axial nodes are used. As indicated in Figure 3, the use of 30 cm axial nodes without a rod cusping model results in a gross misprediction of the core power, similar to that reported for ARROTTA in Reference 10.
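The time step sensitivity discussed above can be illustrated with the theta method on a scalar model problem. This is only a sketch under stated assumptions: the test equation dy/dt = lam*y, the values of `lam` and `theta`, and the function names are hypothetical, and the paper does not state the theta value used in ARROTTA/P or which scheme the original ARROTTA corresponds to. The update is y_new = y + dt*[theta*f(y_new) + (1 - theta)*f(y)], so theta = 1 is fully implicit (first order) and theta = 0.5 is Crank-Nicolson (second order).

```python
import math

def theta_solve(lam, y0, dt, n_steps, theta):
    """Integrate dy/dt = lam * y with the theta method and a fixed step."""
    y = y0
    for _ in range(n_steps):
        # Closed-form update for the linear test equation.
        y = y * (1.0 + (1.0 - theta) * lam * dt) / (1.0 - theta * lam * dt)
    return y

def theta_error(dt, theta, lam=-2.0, y0=1.0, t_end=1.0):
    """Absolute error at t_end against the exact solution y0 * exp(lam * t)."""
    n_steps = round(t_end / dt)
    exact = y0 * math.exp(lam * t_end)
    return abs(theta_solve(lam, y0, dt, n_steps, theta) - exact)

if __name__ == "__main__":
    for dt in (0.1, 0.01):
        print(f"dt={dt}: theta=1.0 error={theta_error(dt, 1.0):.2e}, "
              f"theta=0.5 error={theta_error(dt, 0.5):.2e}")
```

For this model problem, theta = 0.5 with the larger step is more accurate than theta = 1 with a step ten times smaller, which mirrors how a higher-order time differencing scheme lets the step size grow without losing accuracy.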

Table 2 Summary of Transient Calculations for NEACRP HZP Rod Ejection Problems

Case  Method    ∆t,   tpeak(a),  ppeak(a),  Nitr    CPU Times, sec     Neutronic(b)
                ms    sec        %                  Total     Neut.    CPU Time Ratio
A1    ARROTTA    1    0.553      134.6       --     2394.7    1615.2       1.0
A1    ARROTTA   10    0.500      144.8       --      368.9     289.2       5.6
A1    A/P-ANM    1    0.552      134.4      1473     952.3     259.1       6.2
A1    A/P-ANM   10    0.555      134.9       426     136.8      58.0      27.8
A1    A/P-NEM   10    0.576      123.4       411     132.3      53.9      30.0
C1    ARROTTA    1    0.266      553.6       --     4881.8    3331.7       1.0
C1    ARROTTA   10    0.183      816.9       --      792.8     636.7       5.2
C1    A/P-ANM    1    0.267      542.5      1473    1919.7     524.8       6.3
C1    A/P-ANM   10    0.265      555.6       445     268.8     120.1      27.7
C1    A/P-NEM   10    0.256      612.3       457     262.6     117.4      28.4

a Reference PANTHER results are 117.9% at 0.560 second and 475.2% at 0.268 second for Cases A1 and C1, respectively.
b Taking Neutronic Iteration Time of the 1 ms ARROTTA Case as the Numerator

[Figure 2: three panels of core power versus time (sec). (a) NEACRP A1, core power in %; (b) NEACRP C1, core power in %; (c) Westinghouse 4-Loop Problem (HERMITE Benchmark), core power in MW. Curves in each panel: reference (PANTHER in the NEACRP panels, HERMITE in the Westinghouse panel); ARROTTA, 1 ms; A/P-NEM, 1 ms; A/P-ANM, 1 ms; A/P-ANM, 10 ms.]

Figure 2 Results of HZP Rod Ejection Analyses for Different Benchmark Problems

[Figure 3: core power (%) versus time (sec). Curves: Reference; ARROTTA, 30 cm axial node, w/o rod cusping treatment; ARROTTA, 10 cm axial node, w/o rod cusping treatment; ARROTTA, 30 cm axial node, w/ rod cusping treatment; A/P-ANM, 30 cm axial node, w/ rod cusping treatment.]

Figure 3 Core Power Change in the NEA HZP Uncontrolled Rod Withdrawal Case A

Parallel Performance

ARROTTA/P was adapted to a Symmetric Multi-Processor (SMP) architecture using the GUIDE parallel package [9], which achieves parallelism by invoking multiple threads, and by employing domain decomposition. The parallel execution results given in Table 3 were obtained for a two-processor case in which the core was divided into two axial subdomains.

Table 3 Summary of Parallel Execution Results on a 2 Processor Ultra Sparc 2 Machine

1 Processor
Case                                      Nodal    CMFD     Total    Nitr   Nnodal
Eigenvalue Calculation, A1                 2.75     3.67     6.42      33      8
Eigenvalue Calculation, A2                 3.09     3.20     6.29      27      9
Rod Ejection Transient, C2 (HFP, 10 ms)   13.55    70.30    83.85     216     19
Rod Ejection Transient, A1 (HZP, 1 ms)    13.65   245.44   259.09    1473     39
Rod Ejection Transient, A1 (HZP, 10 ms)    9.54    48.49    58.03     426     27
Rod Ejection Transient, C1 (HZP, 10 ms)   19.52   100.61   120.13     445     28
Rod Withdrawal, Case A (100 ms)           21.80   216.64   238.44    1616     62

2 Processors
Case                                      Nodal    CMFD     Total    Nitr   Nnodal
Eigenvalue Calculation, A1                 1.41     1.96     3.37      33      8
Eigenvalue Calculation, A2                 1.57     1.74     3.31      27      9
Rod Ejection Transient, C2 (HFP, 10 ms)    6.74    37.17    43.91     220     19
Rod Ejection Transient, A1 (HZP, 1 ms)     6.92   125.77   132.69    1490     39
Rod Ejection Transient, A1 (HZP, 10 ms)    4.94    26.48    31.42     482     28
Rod Ejection Transient, C1 (HZP, 10 ms)    9.45    57.18    66.63     519     27
Rod Withdrawal, Case A (100 ms)           11.36   116.37   127.73    1762     64

Speedup
Case                                      Nodal    CMFD     Total
Eigenvalue Calculation, A1                 1.95     1.87     1.91
Eigenvalue Calculation, A2                 1.97     1.84     1.90
Rod Ejection Transient, C2 (HFP, 10 ms)    2.01     1.89     1.91
Rod Ejection Transient, A1 (HZP, 1 ms)     1.97     1.95     1.95
Rod Ejection Transient, A1 (HZP, 10 ms)    1.93     1.83     1.85
Rod Ejection Transient, C1 (HZP, 10 ms)    2.07     1.76     1.80
Rod Withdrawal, Case A (100 ms)            1.92     1.86     1.87

As indicated in the table, speedups over 1.9 were achieved in the eigenvalue calculations with 2 processors. The corresponding parallel efficiency of 95% is considered reasonably high. The 5% loss of efficiency in the parallel eigenvalue calculations, in which there is no computational overhead (no increase in the number of iterations or nodal updates), comes from the overhead of synchronizing multiple threads.

In the parallel transient calculations, the total number of BiCGSTAB iterations (Nitr) and the number of nodal (ANM) updates (Nnodal) increased from those of the single processor case. The increase is a consequence of the incomplete domain decomposition preconditioning, which neglects the spatial coupling effect when estimating the subdomain coupling effects [6]. The increase is, however, very small in milder transients such as the full power rod ejection and the zero power rod withdrawal, or when a smaller time step size is used. Even with the computational overhead associated with the increase in iterations, parallel speedups over 1.85 were achieved in most transient calculations.

Finally, it is worth noting in Table 3 that the number of nodal updates is much smaller than the number of time steps, which indicates that only the CMFD calculation is performed in most of the time steps, without invoking two-node nodal updates. This is one of the reasons that a significant reduction in execution time was achieved even in single processor cases.

Conclusions

Substantial performance improvement of ARROTTA was achieved for both eigenvalue and transient calculations by implementing new neutronic solution methods. The primary improvement is attributed to the use of the nonlinear nodal method, which allows CMFD-intensive transient calculations, and also to the use of a Krylov subspace based CMFD solver. Parallel methods implemented in the new ARROTTA provide almost linear speedups on SMP machines.
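The speedup and parallel efficiency figures quoted in the Parallel Performance discussion follow directly from the Table 3 timings. The short check below assumes the usual definitions (speedup = one-processor time divided by two-processor time, efficiency = speedup divided by the number of processors). The dictionary layout and function names are illustrative only, and the timings are the "Total" entries of Table 3 for the cases that can also be cross-checked against Tables 1 and 2.

```python
def speedup(t_serial, t_parallel):
    """Ratio of one-processor time to multi-processor time."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_proc):
    """Speedup normalized by the number of processors."""
    return speedup(t_serial, t_parallel) / n_proc

# (1-processor total, 2-processor total), taken from Table 3.
TOTAL_TIMES = {
    "Eigenvalue A1": (6.42, 3.37),
    "Eigenvalue A2": (6.29, 3.31),
    "Rod ejection A1, 1 ms": (259.09, 132.69),
    "Rod ejection A1, 10 ms": (58.03, 31.42),
    "Rod ejection C1, 10 ms": (120.13, 66.63),
}

if __name__ == "__main__":
    for case, (t1, t2) in TOTAL_TIMES.items():
        print(f"{case}: speedup {speedup(t1, t2):.2f}, "
              f"efficiency {100.0 * efficiency(t1, t2, 2):.0f}%")
```

The transient cases with a 10 ms step show the lowest speedups, consistent with the extra BiCGSTAB iterations (for example 482 versus 426 for A1) caused by the incomplete domain decomposition preconditioning.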
References

1. L. D. Eisenhart, "ARROTTA-01: Advanced Rapid Reactor Operational Transient Analysis," EPRI Code Manual, Project 1936-6 (1993).
2. K. S. Smith, "An Analytic Nodal Method for Solving the 2-Group, Multi-dimensional, Static and Transient Neutron Diffusion Equations," Nuc. Eng. Thesis, Dept. of Nuc. Eng., MIT, Cambridge, MA (1979).
3. R. E. Rohan and S. G. Wagner, "ARROTTA-HERMITE Code Comparison," EPRI Report NP-6614 (1989).
4. K. S. Smith, ANS Trans., 44, 265, Detroit, MI (1983).
5. H. G. Joo, T. J. Downar, and D. A. Barber, "Methods and Performance of a Parallel Reactor Kinetics Code PARCS," Proc. Intl. Conf. Reac. Phys., p. J-42, Mito, Japan (Sept. 1996).
6. H. G. Joo and T. Downar, Nucl. Sci. Eng., 123, 403 (1996).
7. H. Finnemann, et al., "Results of LWR Core Transient Benchmarks," Proc. Intl. Conf. Math. and Supercomp. in Nuc. App., 2, p. 243, Karlsruhe, Germany (April 1993).
8. R. Fraikin, "Review of a NEA-NSC PWR Benchmark on Uncontrolled Withdrawal of Control Rods at Hot Zero Power," Proc. Intl. Conf. Reac. Phys., p. J-99, Mito, Japan (Sept. 1996).
9. GUIDE(TM) Reference Manual Version 2.0, Kuck & Associates, Inc., Champaign, IL (1996).
