
Exploiting the Extensibility of the FLASH Code Architecture for Unsplit Time Integration

Dongwook Lee¹, Anshu Dubey¹, Kevin Olson², Klaus Weide¹, and Katerina Antypas³

¹ ASC FLASH Center, University of Chicago, 5640 S. Ellis, Chicago, IL 60637, USA
² Department of Physics, Drexel University, 3141 Chestnut St., Philadelphia, PA 19104, USA
³ Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA

Abstract. FLASH is a component-based, massively parallel, multiphysics simulation code with a wide user base. The time integration in FLASH was originally designed around Strang operator splitting for hydrodynamics. In version 3 of the FLASH release, we added an Unsplit Staggered Mesh magnetohydrodynamics (USM-MHD) solver based on the constrained transport method of Lee and Deane. Integrating this method tested and exercised the modularity and extensibility of the FLASH code architecture. The integration of the method into the FLASH code, and its later adaptation into an unsplit hydrodynamics solver, also validated two major architectural features of the FLASH code. The first is the mesh abstraction in FLASH, which allows either uniform or adaptive discretization to be selected at configuration time. The second is the hierarchical organization of the Equation of State handling, which has been generalized while retaining the flexibility of user control. In this paper we present the relevant architectural details of the FLASH code that facilitated the incorporation of unsplit time integration into a primarily directionally split framework. Additionally, we discuss the challenges posed by adaptive mesh refinement to the USM-MHD solver and their solutions. Finally, we present an analysis of the USM-MHD solver's performance on the BG/L machine at Argonne National Laboratory.

1. Introduction

The ASC/Flash Center at the University of Chicago has developed a publicly available astrophysics application code, FLASH (Fryxell et al. 2000; Dubey et al. 2009). FLASH is component-based, parallel, portable, and has a proven ability to scale to tens of thousands of processors. The FLASH code has successfully created abstract boundaries between the numerical solvers, the physical equations of interest, and the mesh on which these equations are solved. Between the growing base of external users and the demands of the research being conducted at the Center, there has been steady growth in the solver capabilities of the code, made possible largely by the modularity and extensibility of the code. FLASH's extensibility is based upon a client-server relationship between the solvers and the mesh.


With this model, the physics solvers can be oblivious of the details of the discretization methods and of the parallelization methodology. In addition, fully encapsulated functional units in the code allow several alternative time-stepping methods to co-exist, which further adds to the flexibility and extensibility of the code. FLASH started its existence with a directionally split time advancement method that was intertwined with the solver units. One of the major achievements of FLASH3 has been to abstract the knowledge of directional splitting away from those solvers and units that are not inherently directionally split. Infrastructure units such as Grid and IO, and several physics solvers such as Source terms and Gravity, are examples of units without directional splitting. In this paper we describe our efforts to extend the time-stepping capability in FLASH to directionally unsplit methods, with emphasis on the code units that had to undergo major structural changes to accommodate the demands placed on them by the new unsplit solvers. We describe a new unsplit hydrodynamics solver added to the code, including the results of a sphere advection verification test for stability. Finally, we show a weak scaling test of the Unsplit Staggered Mesh magnetohydrodynamics (USM-MHD) solver.

2. Extensibility of the FLASH Code

In its original incarnation, FLASH was a code built around the directionally split PPM solver. Because the code was an amalgamation of several legacy codes, data ownership was ambiguous, and there was very little abstraction between the solvers and the administrative aspects of the code. In that form, adding an unsplit solver would have been a daunting task. Realizing the importance of modularity for what was going to be a large multi-year, multi-user software project, the early developers of the code introduced some very farsighted architectural features. One of the most influential was the use of the UNIX directory structure, combined with a configuration tool, to implement inheritance and hierarchy in the code. This method of enforcing an object-oriented structure on the code without compromising its flexibility and performance continues to be used in the latest versions of the code with significant enhancements, some of which relate to the customizability of the code.

However, by itself the original architecture of FLASH was not sufficient to bring about the desirable features of modularity and extensibility. The next critical development in the code architecture was the formalization of the unit architecture, including the resolution of data ownership by the unit and the scope of data within the unit. This development led to a successful abstraction of code housekeeping and administration from the complexity of the solvers, while also allowing different time-stepping schemes to exist as alternative implementations in such a way that an application can choose between them at configuration time. The only constraint is that the collection of solvers picked along with the time-stepping scheme must be consistent with the selected scheme.
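To make the directory-based inheritance and configuration-time selection concrete, the following Python sketch mimics how a setup tool might resolve which implementation of a unit to include in an application. The directory names, file names, and resolution rule are hypothetical illustrations of the idea (a child directory overriding or specializing its parent), not FLASH's actual setup tool or unit layout.

```python
# Hypothetical sketch: a unit's directory tree encodes inheritance. Files found
# deeper along the selected path override same-named files above them, so an
# application picks an implementation ("split" vs. "unsplit") at setup time.

UNIT_TREE = {
    "physics/Hydro":                   ["Hydro_interface.F90", "Hydro_init.F90"],
    "physics/Hydro/HydroMain":         ["Hydro.F90"],
    "physics/Hydro/HydroMain/split":   ["hy_sweep.F90", "Hydro.F90"],
    "physics/Hydro/HydroMain/unsplit": ["hy_getFaceFlux.F90", "Hydro.F90"],
}

def resolve_implementation(selected_path):
    """Walk from the unit root down to the selected implementation,
    letting deeper directories override files of the same name."""
    chosen = {}
    parts = selected_path.split("/")
    for depth in range(2, len(parts) + 1):              # start at the unit root
        directory = "/".join(parts[:depth])
        for filename in UNIT_TREE.get(directory, []):
            chosen[filename] = f"{directory}/{filename}"  # override parent version
    return sorted(chosen.values())

# Selecting the unsplit implementation at "configuration time":
for source in resolve_implementation("physics/Hydro/HydroMain/unsplit"):
    print(source)
```

In FLASH itself this role is played by its setup/configuration tooling; the point of the sketch is only that the directory hierarchy, rather than the solver source, carries the inheritance information.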


One final hurdle in achieving unsplit time integration in FLASH was related to a few key units that are used by both split and unsplit solvers. The two most important units in this category are the Grid unit and the Equation of State (EOS) unit. The Grid unit must support face-centered and edge-centered variables on a staggered mesh because an accurate computation of the divergence of the magnetic field is essential in the USM-MHD solver. The split and unsplit hydrodynamics schemes need only cell-centered variables, which were the only kind supported in the early versions of FLASH. The EOS unit likewise always assumed that it operated on cell-centered data. Paramesh (MacNeice & Olson 2008), the AMR package used in FLASH, had basic support for face- and edge-centered variables, but FLASH's own housekeeping routines had to be generalized to handle the different kinds of variables. For example, the Grid unit had to provide an electric field correction routine analogous to flux conservation in AMR. The correction is necessary to preserve consistency of the electric fields stored in the edge-centered variables along shared fine-coarse block boundaries for the USM-MHD solver.

The EOS unit, in principle, computes only algebraic equations. However, for computational efficiency the process of filling guardcells often uses masking, which requires the EOS unit to carefully examine the dependencies among its variables. There is elaborate infrastructure in the unit to ensure consistency. In order to enable similar consistency checking for face- and edge-centered variables, we organized the EOS unit with a hierarchy in the complexity of its user interfaces. The single-point interfaces operate on simple scalars; there the users are on their own, and no error detection or correction is involved. The higher-level interfaces provide wrappers around the single-point calls, where the machinery for masking and for ensuring variable consistency comes into play. Thus much of the complexity of the unit is hidden from casual users, who use the higher-level interfaces, while more sophisticated users retain greater control. The higher-level interfaces also do not distinguish between cell-, face-, and edge-centered variables; that functionality is handled by the setup tool through mappings between mesh data structures and EOS variables determined at configuration time.
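The division of labor between the low-level and wrapped EOS interfaces can be pictured with a short sketch. The function names, mask convention, and dependency rule below are hypothetical and are not FLASH's actual EOS API; the sketch only shows the pattern of a thin single-point call that trusts its inputs, and a wrapper that checks variable dependencies and honors a mask before delegating to it.

```python
# Hypothetical sketch of a two-level EOS interface: a bare single-point call
# with no checking, and a wrapper that verifies the variables needed for the
# requested mode and skips masked-out cells (e.g., unused guardcells).

GAMMA = 5.0 / 3.0   # ideal-gas adiabatic index, for illustration only

def eos_single_point(dens, eint):
    """Lowest-level call: pure algebra, no error detection or correction."""
    return (GAMMA - 1.0) * dens * eint   # pressure

# Which input variables each (illustrative) EOS mode depends on.
REQUIRED = {"dens_ei": ("dens", "eint")}

def eos_wrapped(mode, cells, mask=None):
    """Higher-level wrapper: checks dependencies, honors a mask, then
    delegates each active cell to the single-point call."""
    for name in REQUIRED[mode]:
        if any(name not in cell for cell in cells):
            raise ValueError(f"EOS mode '{mode}' requires variable '{name}'")
    mask = mask if mask is not None else [True] * len(cells)
    for cell, active in zip(cells, mask):
        if active:
            cell["pres"] = eos_single_point(cell["dens"], cell["eint"])
    return cells

# Usage: two cells, the second one masked out.
cells = [{"dens": 1.0, "eint": 2.0}, {"dens": 10.0, "eint": 2.0}]
eos_wrapped("dens_ei", cells, mask=[True, False])
print(cells[0]["pres"], "pres" in cells[1])   # updated, untouched
```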

3. Unsplit solvers for Gas dynamics and Magnetohydrodynamics

A directionally unsplit solver for gas dynamics (unsplit hydro) is a new high-order Godunov hydrodynamics solver in FLASH. The method provides first-, second-, and third-order spatial accuracy for the solution. The simplest first-order Godunov method, although very diffusive, can be useful because of its robustness; in practice it serves as a fallback when the higher-order interpolation and reconstruction steps fail with unphysical negative states. The second-order accurate MUSCL-Hancock type formulation is the default; it has been benchmarked and is tested daily in FLASH's regression tests. Although the current unsplit solvers are at most second-order accurate for smooth flows and become first-order in the presence of shocks and discontinuities, the third-order PPM reconstruction algorithm can still increase solution accuracy because of its small amount of numerical dissipation. As a result, PPM can resolve small-scale features much better than the MUSCL-Hancock reconstruction.
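The difference between the first- and second-order options comes down to the data reconstruction step. The sketch below, written in Python with hypothetical helper names rather than FLASH's actual routines, shows a limited piecewise-linear (MUSCL-type) reconstruction of interface states; setting the slope to zero recovers the diffusive but robust first-order Godunov behavior described above.

```python
# Sketch of limited piecewise-linear (MUSCL-type) reconstruction for one
# variable on a 1D row of cells. Hypothetical helper names; not FLASH code.

def minmod(a, b):
    """Classic minmod limiter: zero at extrema, the smaller slope otherwise."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b

def reconstruct_interfaces(u, order=2):
    """Return (left, right) interface states for each interior cell i.
    order=1 gives piecewise-constant (first-order Godunov) states;
    order=2 gives limited piecewise-linear (second-order) states."""
    left, right = {}, {}
    for i in range(1, len(u) - 1):
        if order == 1:
            slope = 0.0                                   # first order: no slope
        else:
            slope = minmod(u[i] - u[i - 1], u[i + 1] - u[i])
        left[i] = u[i] - 0.5 * slope                      # state at face i-1/2
        right[i] = u[i] + 0.5 * slope                     # state at face i+1/2
    return left, right

# Usage: a step profile stays monotone after limited reconstruction.
u = [1.0, 1.0, 1.0, 10.0, 10.0, 10.0]
print(reconstruct_interfaces(u, order=2))
```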


The unsplit hydro solver is essentially a reduced version of the USM-MHD solver that has been available in FLASH3 for some time. In principle, the USM-MHD solver could be used in the zero magnetic field limit for pure hydrodynamics simulations. However, the USM-MHD solver uses additional memory in the form of face- and edge-centered variables for the magnetic and electric fields, respectively, and has the computational overhead of solving the induction equations separately. Since neither the additional memory nor the solution of the induction equations is necessary in gas dynamics simulations, it is desirable to have a separate unsplit hydro solver for greater efficiency.

The unsplit hydro implementation can not only solve 1D, 2D, and 3D problems, it can also switch between various numerical choices: different types of Riemann solvers (Roe's linearized solver, HLL, HLLC), slope limiters (monotonized central-difference, minmod, van Leer's harmonic mean of slopes, hybrid, Toro's limited slopes), applying slope limiters to primitive versus characteristic variables, a strong shock/rarefaction detection algorithm (Balsara 2001), handling of grid-aligned shock instabilities (Hanawa et al. 2008) such as the carbuncle and odd-even decoupling phenomena, and two different entropy fix routines for Roe's solver. It is worth mentioning that, in the data reconstruction steps required for the higher-order (second and third) methods, the use of characteristic limiting can significantly suppress unphysical oscillations in the vicinity of discontinuities. To accomplish monotone slope limiting on characteristic variables, the unsplit solvers calculate the eigensystems (eigenvectors and eigenvalues) at each cell, project the primitive variables into characteristic space, apply slope monotonicity constraints, and project the limited slopes back to the primitive variables. This approach was not implemented in the original split PPM hydro solver, and limiting on primitive variables has sometimes caused spurious temperature fluctuations in the post-shock regions of a 3D Type Ia gravitationally confined detonation (GCD) problem. In such circumstances it is necessary to use characteristic limiting, which better maintains the monotonicity of the solution.

Notable features of the unsplit hydro solver are that it preserves flow symmetries much better than the split formulations (Almgren et al. 2006), and that it supports a wide range of CFL numbers (CFL < 1) applied simultaneously to all three dimensions, a property that follows from the upwinded transverse flux formulation developed for the multidimensional USM-MHD solver (Lee & Deane 2009). In Figure 1, we advect a 3D density ball in a periodic domain to demonstrate the CFL limit of the unsplit solver. The ball is initialized with ρ = 10 in an ambient background with ρ = 1 and p = 1. The ball is in pressure equilibrium with the flow and is advected with the flow velocities, without interference from shocked gas, in order to detect any numerical instabilities during the simulation at a given CFL number. Figure 1 uses flow velocities vx = vy = vz = 1 integrated to a final time t = 2 with CFL = 0.8; by this time the ball has made two complete cycles over the domain. The unsplit solver preserves the spherical symmetry of the ball extremely well over a large CFL range in 3D, clearly demonstrating the stability of the method. Additionally, the upwinded transverse flux formulation, which accommodates the full multidimensional eigenstructure in a single unsplit step, requires only three Riemann solves per cell per time step in 3D (one in each direction). This makes the unsplit solver computationally less expensive than the 3D CTU scheme of Saltzman (1994), which needs 12 Riemann solves, without losing solution accuracy or stability.
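The characteristic limiting procedure described above amounts to a change of basis around each limiting operation. The sketch below illustrates the pattern with NumPy; the 2x2 eigensystem is a trivial stand-in rather than the actual hydrodynamic one, and the function names are hypothetical, but the sequence (project with the left eigenvectors, limit, project back with the right eigenvectors) is the one described in the text.

```python
# Sketch of slope limiting in characteristic variables: project primitive-variable
# differences with the left eigenvectors, limit each characteristic field
# independently, then project back with the right eigenvectors. A real solver
# evaluates the eigensystem from the local cell state; here it is a placeholder.
import numpy as np

def minmod(a, b):
    return np.where(a * b <= 0.0, 0.0, np.where(np.abs(a) < np.abs(b), a, b))

def limited_slope_characteristic(dW_minus, dW_plus, R):
    """dW_minus, dW_plus: backward/forward differences of primitive variables.
    R: matrix of right eigenvectors (columns); its inverse gives the left ones."""
    L = np.linalg.inv(R)                  # left eigenvectors
    dC_minus = L @ dW_minus               # project to characteristic space
    dC_plus = L @ dW_plus
    dC_lim = minmod(dC_minus, dC_plus)    # limit each characteristic field
    return R @ dC_lim                     # project back to primitive variables

# Usage with a placeholder eigensystem:
R = np.array([[1.0, 1.0],
              [-1.0, 1.0]])
dW_minus = np.array([0.5, -0.2])
dW_plus = np.array([2.0, 0.1])
print(limited_slope_characteristic(dW_minus, dW_plus, R))
```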


Figure 1. Thresholded images of the 3D density ball advection problem at times (a) t = 0.0, (b) t = 0.6, and (c) t = 2.0, using a uniform 128 × 128 × 128 grid with CFL = 0.8.
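For reference, the initial condition of this verification test is simple enough to write down directly. The sketch below sets it up on a uniform grid following the description in the text; the domain size, ball radius, and ball center are not specified in the paper and are chosen arbitrarily here, and the array layout is illustrative rather than FLASH's setup code.

```python
# Sketch of the 3D density-ball advection initial condition described in the text:
# a ball of density 10 in a background of density 1, uniform pressure 1, and a
# uniform velocity (1, 1, 1), run with CFL = 0.8 to t = 2 on a periodic domain.
import numpy as np

def init_density_ball(n=128, radius=0.1, center=(0.5, 0.5, 0.5)):
    # radius, center, and the unit-cube domain are arbitrary assumptions;
    # the paper does not specify them.
    x = (np.arange(n) + 0.5) / n                      # cell-center coordinates
    X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
    r2 = sum((c - c0) ** 2 for c, c0 in zip((X, Y, Z), center))
    dens = np.where(r2 < radius ** 2, 10.0, 1.0)      # rho = 10 inside the ball
    pres = np.ones_like(dens)                         # pressure equilibrium, p = 1
    velx = vely = velz = np.ones_like(dens)           # advection velocity (1, 1, 1)
    return dens, pres, velx, vely, velz

dens, pres, velx, vely, velz = init_density_ball()
print(dens.shape, dens.min(), dens.max())             # (128, 128, 128) 1.0 10.0
```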

4. Performance and scaling

FLASH has been developed over a decade, has continually expanded its solution capabilities, and has performed some of the largest scientific simulations to date. Shortly after FLASH1 was released to the public in 2000, the Flash Center accomplished a high-resolution (20,480 × 1024² effective grid) 3D simulation of the cellular structure in a carbon detonation front, achieving a sustained performance of 238 GFlops (11% of peak) on 6,420 processors of the ASCI Red machine using 64-bit arithmetic (Calder et al. 2000). This early achievement, which earned the Flash Center the Gordon Bell Prize, clearly demonstrated the high scalability of FLASH and of the PARAMESH library on which it is based. We have preserved and extended this level of performance on new architectures. In 2006 the Flash Center ran FLASH3 on an 1856³ uniform-grid compressible turbulence problem using 32,768 nodes of the Blue Gene/L at Lawrence Livermore National Laboratory (Fisher et al. 2007).

Figure 2. FLASH scaling performance (July 2007). A horizontal line would indicate perfect scaling. The blue line corresponds to the MHD performance, the yellow to filling guardcells, and the red to refinement and de-refinement changes.

Figure 2 shows a weak scaling result for FLASH3 on the 2D Orszag-Tang MHD vortex problem using the BG/L machine at Argonne National Laboratory.


The test uses the USM-MHD solver and achieves constant work per processor on an AMR mesh by increasing the number of coarse blocks with the number of processors. Scaling is nearly perfect up to 512 processors.
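The weak-scaling setup itself is easy to express: the per-processor work is held fixed while the total problem size grows with the processor count. The short sketch below uses illustrative block and cell counts rather than the actual parameters of the Figure 2 runs.

```python
# Sketch of a weak-scaling series: keep the number of top-level (coarse) blocks
# per processor fixed, so total work grows in proportion to the processor count.
# The counts below are illustrative, not the parameters of the Figure 2 runs.

BLOCKS_PER_PROC = 4         # assumed constant per-processor workload
CELLS_PER_BLOCK = 16 * 16   # assumed 2D block size

for nprocs in [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]:
    coarse_blocks = BLOCKS_PER_PROC * nprocs
    total_cells = coarse_blocks * CELLS_PER_BLOCK
    print(f"{nprocs:4d} procs: {coarse_blocks:5d} coarse blocks, "
          f"{total_cells:8d} cells, {total_cells // nprocs} cells/proc")
```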

5. Conclusion

In this paper we have discussed the features of the FLASH code architecture that facilitate its multiphysics capability and make it extensible and modular. Abstraction in the infrastructural code design is the key feature, which has been exercised in order to (i) bring modularity to the collection of multi-purpose code units and solvers, (ii) de-centralize data ownership, (iii) provide support for various grid data structures on both UG and AMR meshes, and (iv) provide a hierarchy and general grid-variable support in the EOS. As a result, the code provides significant flexibility for developing multiple alternative implementations within an object-oriented structure. The extensibility of the FLASH code has been demonstrated by its ability to support unsplit solvers in a framework originally designed around split solvers.

Acknowledgments. The FLASH code has been developed by the DOE-supported ASC/Alliance Center for Astrophysical Thermonuclear Flashes at the University of Chicago. This work is supported by the U.S. Department of Energy under Grant No. B523820 to the Center for Astrophysical Thermonuclear Flashes at the University of Chicago.

References

Almgren, A. S., Bell, J. B., Rendleman, C. A., & Zingale, M. 2006, ApJ, 637, 922

Balsara, D. S. 2001, J. Comput. Phys., 174, 614

Calder, A. C., Curtis, B. C., Dursi, L. J., Fryxell, B., Henry, G., MacNeice, P., Olson, K., Ricker, P., Rosner, R., Timmes, F. X., Tufo, H. M., Truran, J. W., & Zingale, M. 2000, ACM/IEEE Conference, 56-72

Dubey, A., Antypas, K., Ganapathy, M. K., Reid, L. B., Riley, K., Sheeler, D., Siegel, A., & Weide, K. 2009, Parallel Computing, accepted

Fisher, R. T., Kadanoff, L., Lamb, D. Q., Constantin, P., Foster, I., Cattaneo, F., Papka, M. E., Dubey, A., Plewa, T., Rich, P., Antypas, K., Abarzhi, S. I., Asida, S. M., Calder, A. C., Reid, L. B., Sheeler, D., Gallagher, J. B., Glendenin, C. C., & Needham, S. G. 2007, Technical Report ANL/MCS-P1401-0307, Argonne National Laboratory

Fryxell, B., Olson, K., Ricker, P., Timmes, F. X., Zingale, M., Lamb, D. Q., MacNeice, P., Rosner, R., Truran, J. W., & Tufo, H. 2000, ApJS, 131, 273

Hanawa, T., Mikami, H., & Matsumoto, T. 2008, J. Comput. Phys., 227, 7952

Lee, D., & Deane, A. E. 2009, J. Comput. Phys., 228, 952

MacNeice, P., & Olson, K. 2008, Drexel University, http://www.physics.drexel.edu/~olson/paramesh-doc/Users manual/amr.html

Saltzman, J. 1994, J. Comput. Phys., 115, 153