NUMERICAL MODELING OF SPACE PLASMA FLOWS, ASP Conference Series, Vol. 359, 2006, Nikolai V. Pogorelov and Gary P. Zank, eds.

A Case Study of Verifying and Validating an Astrophysical Simulation Code

A. C. Calder, N. T. Taylor, K. Antypas, D. Sheeler, and A. Dubey

Center for Astrophysical Thermonuclear Flashes, The University of Chicago, Chicago, IL 60637

Abstract. We describe the process of verifying and validating FLASH, a parallel, multi-physics simulation code intended to model astrophysical environments. Verification tests are designed to test and quantify the accuracy of the code. Validation tests are meant to ensure that simulations meaningfully describe nature by comparing the results of simulations to relevant laboratory experiments. The centerpiece of the verification process is the re-engineered FlashTest toolkit, which is used both as a stand-alone testing application and as a manager for a nightly test-suite. FlashTest exercises the unit test framework now available in FLASH3, the most recently released version, as well as a variety of standard verification tests. We also present a validation example in which simulations were directly compared to a laboratory experiment. We discuss our findings and evaluate the agreement between simulations and experiment.

1. Introduction

The Advanced Simulation and Computing (ASC) Center for Astrophysical Thermonuclear Flashes at the University of Chicago, supported by the U.S. Department of Energy, is charged with providing a freely-available simulation code to the astrophysical community. Accordingly, much of the Center's effort goes into developing and supporting a modern, user-friendly simulation code. Verification and Validation are critical testing steps for any code project, and the process is particularly important for software meant for wide distribution. In order to assure quality, the Flash Center has an ongoing, formal verification and validation effort.

The FLASH code (Fryxell et al. 2000; Calder et al. 2000) is an extensible, modular, component-based application code for multi-physics applications, particularly those relevant to astrophysics. FLASH has been applied to a variety of astrophysics problems, including X-ray bursts, classical novae, and type Ia supernovae, as well as to basic physics problems such as turbulence and turbulent mixing. FLASH has a variety of code units for managing infrastructure, for monitoring the progress of a simulation, and for solving physics equations. The physics units include units for hydrodynamics, relativistic hydrodynamics, magnetohydrodynamics, self-gravity, and nuclear burning. The most recent major version, FLASH3, supports multiple methods for managing the simulation mesh, including the PARAMESH library (MacNeice et al. 2000) for a block-structured adaptive mesh and a uniform mesh package developed in-house. FLASH3 also features well-defined public interfaces for each code unit and includes a framework for testing code units individually. This unit test framework works in tandem with the newly released toolkit, FlashTest, for verification and regression testing. FlashTest automates the process of verification testing, allowing it to be run regularly on multiple platforms.
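To make the idea of per-unit testing concrete, the sketch below checks a single routine against independently computed values through its public interface. FLASH itself is written in Fortran and tests its own unit interfaces; the Python code, the ideal-gas equation-of-state routine, and the test names here are purely hypothetical illustrations and are not drawn from FLASH.

```python
"""Conceptual sketch of a unit test exercising one routine through its
public interface.  The ideal-gas pressure routine is a hypothetical
stand-in for a physics unit; it is not FLASH code."""

import unittest

GAMMA = 5.0 / 3.0   # adiabatic index for a monatomic ideal gas


def eos_pressure(density, internal_energy, gamma=GAMMA):
    """Ideal-gas pressure from density and specific internal energy."""
    return (gamma - 1.0) * density * internal_energy


class EosUnitTest(unittest.TestCase):
    def test_known_value(self):
        # p = (gamma - 1) * rho * e; check one hand-computed case.
        self.assertAlmostEqual(eos_pressure(1.0, 1.5), 1.0, places=12)

    def test_linearity_in_density(self):
        # Doubling the density at fixed internal energy doubles the pressure.
        self.assertAlmostEqual(eos_pressure(2.0, 0.7),
                               2.0 * eos_pressure(1.0, 0.7), places=12)


if __name__ == "__main__":
    unittest.main()
```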

2. Verification and Validation

Verification and Validation (V&V) are the principal testing steps of simulation code development and as such are the primary methods for building confidence in modeling and simulation (Oberkampf et al. 2004). As computing and numerical methods have matured, V&V has become increasingly important and is now a blossoming discipline of its own (AIAA 1998; Roache 1998; Oberkampf et al. 2004). What may at first appear to be synonyms are actually technical terms for the process of verifying that a particular code is accurate and validating that the resulting simulations meaningfully describe nature. Adopting the definitions of the American Institute of Aeronautics and Astronautics (AIAA 1998), we have:

Verification: The process of determining that a model implementation accurately represents the developer's conceptual description of the model and the solution of the model.

Validation: The process of determining the degree to which a model is an accurate representation of the real world from the perspective of the intended uses of the model.

As noted by the AIAA, these are both ongoing processes without well-defined endpoints and should be part of the course of code development.

Verification tests are typically simple tests designed to measure the accuracy of a numerical solution compared to known or benchmarked solutions. Verification requires identifying and quantitatively describing error, and the approach for mesh-based numerical methods is typically a systematic study of mesh and time step resolution. Other numerical methods (e.g. Monte Carlo methods) require different procedures (AIAA 1998).

Validation involves comparison between numerical results and observed results. The scope of validation is larger than that of verification and numerical testing because it tests the fundamental assumptions of the theoretical model. The overall goal of the process may be phrased as probing the range of validity of the model (Calder et al. 2002). Validation quantifies error by applying verification techniques to the comparison between simulation results and data from laboratory experiments. In addition, prediction and the predictive capability of numerical methods are becoming increasingly important as the use of numerical methods continues to grow. V&V improves confidence in predictions, and the V&V methodology continues to evolve to better support predictive capability (Oberkampf et al. 2004).
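As a concrete illustration of the mesh-resolution approach just described, the sketch below measures the error of a simple numerical solution against a known exact solution at successively doubled resolutions and estimates the observed order of convergence. It is a generic, self-contained example (first-order upwind advection of a smooth profile), not a FLASH or FlashTest component; the function and variable names are ours.

```python
"""Minimal verification sketch: compute the discrete L1 error of a
first-order upwind advection scheme against the exact solution at several
grid resolutions and estimate the observed order of convergence."""

import numpy as np


def advect_upwind(n_cells, velocity=1.0, t_final=1.0, cfl=0.8):
    """Advect a smooth profile once around a periodic unit domain."""
    dx = 1.0 / n_cells
    x = (np.arange(n_cells) + 0.5) * dx          # cell centers
    u = np.sin(2.0 * np.pi * x)                  # initial condition
    dt = cfl * dx / velocity
    t = 0.0
    while t < t_final - 1.0e-12:
        dt_step = min(dt, t_final - t)
        # first-order upwind update for velocity > 0
        u = u - velocity * dt_step / dx * (u - np.roll(u, 1))
        t += dt_step
    exact = np.sin(2.0 * np.pi * (x - velocity * t_final))
    return np.sum(np.abs(u - exact)) * dx        # discrete L1 error norm


# Run the same problem at successively doubled resolutions.
resolutions = [32, 64, 128, 256]
errors = [advect_upwind(n) for n in resolutions]

for n_c, n_f, e_c, e_f in zip(resolutions, resolutions[1:], errors, errors[1:]):
    order = np.log(e_c / e_f) / np.log(2.0)      # observed convergence rate
    print(f"{n_c:4d} -> {n_f:4d} cells: L1 error {e_f:.3e}, order {order:.2f}")
```

For this scheme the observed order should approach one on smooth data; a measured order well below the formal order of a method is the kind of discrepancy a verification study is designed to expose.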

3. The FLASH Test Suite

During its development, FLASH has been subjected to a plethora of verification tests (Fryxell et al. 2000; Calder et al. 2002). FlashTest, an automated regression testing tool developed in-house, runs a series of simulations (the test-suite) on a nightly basis and verifies the correctness of the resultant data. This tool is essential to the FLASH development team's efforts to verify results on a wide variety of architectures (Intel, MIPS, Opteron, Power4), compilers (Portland Group, Lahey, Intel, XLF), and operating systems (Linux, IRIX, Mac OS X, AIX). The test-suite comprises a collection of FLASH simulations, built and executed with different grid packages, I/O libraries, numbers of processors, and variations in all manner of runtime parameters. The test-suite is also responsible for unit testing and for verifying FLASH's ability to accurately restart a test that has been stopped in the middle of a run.
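The combinatorial character of such a suite can be pictured as a small test matrix. The sketch below enumerates hypothetical build and run variations in the spirit described above; the dimension values, the setup-option strings, and the problem names are illustrative placeholders, not the actual FlashTest configuration or FLASH setup syntax.

```python
"""Sketch of a combinatorial test matrix in the spirit of the FLASH
test-suite.  All option strings below are illustrative placeholders."""

from itertools import product

# Hypothetical build/run dimensions exercised by a nightly suite.
grid_packages = ["paramesh", "uniform"]          # AMR vs. uniform grid
io_libraries = ["hdf5", "pnetcdf"]
proc_counts = [1, 4]
problems = ["Sod", "Sedov"]                      # stand-ins for standard tests

test_matrix = []
for problem, grid, io, nprocs in product(problems, grid_packages,
                                         io_libraries, proc_counts):
    test_matrix.append({
        "problem": problem,
        "setup_opts": f"-auto +{grid} +{io}",    # placeholder setup options
        "nprocs": nprocs,
    })

for spec in test_matrix:
    # A real suite would set up, build, run, and verify each entry;
    # here we only show the enumeration.
    print(f"{spec['problem']:6s}  {spec['setup_opts']:22s}  np={spec['nprocs']}")
```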

[Figure 1 plot: execution time, "Seconds in Monitoring Period" (y-axis, roughly 260-280 s), versus "Time (days)" (x-axis, 0-18 days).]

Figure 1. Results from FlashTest include timing data that may be used to monitor the performance of the code over time. This figure shows the execution time for a particular verification test plotted as a function of time over a period of eighteen days. The plot illustrates the (relatively minor) changes in performance of a simulation of a particular problem during development.

FlashTest consists of a logical progression of tests for problem set-up, compilation, execution, and verification of results. This progression is essential for reporting problems in any of the preliminary phases of simulation construction, and for alerting the developers to errors unintentionally introduced into the code. The most commonly applied test, the comparison test, executes a FLASH simulation and compares the resultant data to a benchmark that has previously been deemed correct by the developers. The restart test verifies that the code obtains a consistent result when it is restarted from a checkpoint file. It compares output from a restarted executable to that produced by the same executable when allowed to run continuously from initialization to completion. Unit tests report on the correct or incorrect functioning of individual FLASH units. The results generated by FlashTest are now easily analyzed via a second in-house application, the browser-based FlashTestView, which gathers test-suite data from all the platforms on which FlashTest is running and presents it to the user in an intuitive and easily navigable web page. Both of these applications, FlashTest and FlashTestView, have been made available to FLASH users in the most recent release (http://flash.uchicago.edu).
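To make the comparison test concrete, the sketch below checks the fields written by a new run against a previously approved benchmark to within a tolerance. It is a generic illustration of the idea, not FlashTest code: NumPy .npz archives and the field names stand in for the HDF5 or Parallel-NetCDF files that FLASH actually writes, and the tolerance is an arbitrary choice.

```python
"""Sketch of a benchmark comparison test: verify that the fields written
by a new run match a stored benchmark to within a tolerance.  The .npz
files and field names are placeholders for real checkpoint data."""

import numpy as np


def compare_to_benchmark(run_file, benchmark_file,
                         fields=("dens", "pres", "velx"),
                         rel_tol=1.0e-12):
    """Return (passed, report) for a run/benchmark pair of .npz files."""
    run, bench = np.load(run_file), np.load(benchmark_file)
    report, passed = [], True
    for name in fields:
        a, b = run[name], bench[name]
        if a.shape != b.shape:
            passed = False
            report.append(f"{name}: shape mismatch {a.shape} vs {b.shape}")
            continue
        scale = np.max(np.abs(b)) or 1.0          # guard against all-zero fields
        err = np.max(np.abs(a - b)) / scale       # magnitude-scaled max difference
        ok = err <= rel_tol
        passed &= ok
        report.append(f"{name}: max relative difference {err:.2e} "
                      f"({'ok' if ok else 'FAIL'})")
    return passed, report


if __name__ == "__main__":
    # Tiny self-contained demonstration with synthetic data.
    rng = np.random.default_rng(0)
    data = {k: rng.random((8, 8)) for k in ("dens", "pres", "velx")}
    np.savez("benchmark.npz", **data)
    np.savez("new_run.npz", **data)               # identical, so the test passes
    ok, lines = compare_to_benchmark("new_run.npz", "benchmark.npz")
    print("\n".join(lines))
    print("PASS" if ok else "FAIL")
```

The restart test follows the same pattern, except that the "run" data come from an executable restarted from a checkpoint and the "benchmark" from the same executable allowed to run continuously.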

Regular use of FlashTest allows developers to monitor the code during periods of development. Figure 1 illustrates this use with timing results over an eighteen-day period. The figure indicates that the execution time for a particular test varied by about 6% as changes were made. This information allows developers to determine when changes that affect performance occur and to investigate the cause.
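One simple way to act on timing data like that in Figure 1 is to flag nights whose execution time drifts beyond a chosen threshold from a trailing baseline. The sketch below is a generic illustration with synthetic numbers; the window size and threshold are arbitrary assumptions, and FlashTest's own reporting goes through FlashTestView rather than a script like this.

```python
"""Sketch of a performance-drift check over nightly timing data.  The
timing values are synthetic; a real suite would read them from its logs."""

import statistics

# Execution time in seconds for one test, one entry per night (synthetic).
timings = [268.0, 267.5, 269.1, 268.4, 270.2, 268.8, 283.0, 282.5]

window = 5          # nights used for the trailing baseline
threshold = 0.04    # flag changes larger than 4% of the baseline

for day in range(window, len(timings)):
    baseline = statistics.mean(timings[day - window:day])
    drift = (timings[day] - baseline) / baseline
    if abs(drift) > threshold:
        print(f"day {day}: {timings[day]:.1f}s is {drift:+.1%} "
              f"from the {baseline:.1f}s baseline -- investigate")
```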

4. Validation

A hierarchical approach is used to validate FLASH. Key components of the code are identified, and the results of each component's simulation are compared to actual laboratory data. Integrated simulations with multiple components are then validated as far as possible. Most of the validation effort has focused on the piecewise-parabolic method (PPM) hydrodynamics module and its ability to accurately simulate various fluid instabilities. These simulations include studies of the Rayleigh–Taylor instability, laser-driven shocks propagating through multi-layer targets (a configuration subject to Richtmyer–Meshkov and Rayleigh–Taylor instabilities) (Calder et al. 2002), and a weak shock hitting columns of a dense gas descending through air (Weirs et al. 2005).

Recent validation work revisits laser-driven shocks propagating through multi-layer targets. The current generation of these experiments has greatly improved diagnostics and is investigating the effects of three-dimensional perturbations on instability evolution (Hearn et al. 2006). Figure 2 shows results from one study involving a laser-driven shock propagating through a three-layer target of decreasing density. The configuration resembles a core-collapse supernova in which a shock propagates through shells of decreasing density as it rises to the surface of a massive star. The interaction of the shock with the shell interfaces spawns fluid instabilities, and subsequent mixing of the materials may explain certain features observed in the early spectra (Arnett, Fryxell, & Müller 1989). The experiment is designed to probe these fluid instabilities. Hence, a natural metric for comparison between simulation and experiment is the growth of these instabilities.

The figure shows instabilities growing from an initial sinusoidal perturbation between two of the layers. At the beginning of the simulation, the width of the mixed region is equal to the amplitude of the perturbation. As the shock passes through, the material is compressed, as may be observed in the decrease in the mixed-region width during approximately the first 10 ns. After the shock has passed, the instabilities grow. The figure presents four simulations of increasing resolution that all agree with the two experimental data points to within the spatial error.

Although the simulations agree well with the experiments, not all of the relevant physics was included in the simulations (e.g. a material equation of state), and the convergence study showed degraded agreement between simulations at the highest resolutions. We attribute this difference to the different amounts of small-scale structure found in the simulations. FLASH evolves the inviscid Euler equations, and as the resolution increases, the numerical viscosity decreases, allowing an increased development of small-scale structure, which appears to influence the evolution of larger structures such as the spikes. Discussion of the convergence of the Euler equations appears in Calder et al. (2002) and references therein.
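The comparison metric described above, spike (mixed-region) length versus time, reduces to checking whether each simulated curve passes within the experimental error bars at the measured times. The sketch below shows that check; all numbers are placeholders standing in for the data plotted in Figure 2, which are reported in Kane et al. (2001) and Calder et al. (2002).

```python
"""Sketch of the validation metric: does the simulated mixed-region
(spike) length agree with the experimental measurements to within the
experimental error bars?  All values below are placeholders."""

import numpy as np

# Simulated spike length vs. time for one resolution (placeholder values).
sim_time_ns = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0])
sim_spike_cm = np.array([0.010, 0.008, 0.015, 0.022, 0.029, 0.036, 0.042, 0.048])

# Experimental measurements: time, spike length, and spatial error of 25 microns.
exp_time_ns = np.array([30.0, 60.0])
exp_spike_cm = np.array([0.021, 0.043])
spatial_err_cm = 25.0e-4            # +/- 25 microns expressed in cm

# Interpolate the simulation onto the experimental times and test agreement.
sim_at_exp = np.interp(exp_time_ns, sim_time_ns, sim_spike_cm)
for t, sim_val, exp_val in zip(exp_time_ns, sim_at_exp, exp_spike_cm):
    agrees = abs(sim_val - exp_val) <= spatial_err_cm
    print(f"t = {t:4.0f} ns: simulation {sim_val:.4f} cm, "
          f"experiment {exp_val:.4f} +/- {spatial_err_cm:.4f} cm "
          f"-> {'agrees' if agrees else 'disagrees'}")
```

Repeating this check for each resolution in the convergence study gives a simple quantitative statement of whether the simulations fall within the experimental uncertainty.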


[Figure 2 plot: "Cu spike length (cm)" (y-axis, 0.00-0.06) versus "Simulation time (ns)" (x-axis, 0-80); curves for the experiment and for simulations at resolutions 512 × 256, 1024 × 512, 2048 × 1024, and 4096 × 2048.]

Figure 2. Results from a validation test consisting of a laser-driven shock propagating through a multi-layer target. The width of the mixed region between two layers is plotted vs. time for several simulations in a convergence study. Also shown are the experimental results at two times with spatial error bars of ±25 µm. The timing error is about the width of the diamonds marking the experimental result. The differences between the simulations at different resolutions are less than the uncertainty of the experimental results.

5. Summary and Conclusions

Verification and validation are the testing steps of computational science, and the principal conclusion we draw from our efforts is that these steps are essential for credible numerical modeling. The verification tests indicate that the code modules are performing as expected. The FLASH3 distribution includes FlashTest, a toolkit for automated verification and regression testing. Our experience shows that working with an automated testing tool is indispensable for software development. Making even the simplest of changes can introduce bugs, and FlashTest is invaluable for catching them in a timely manner. FLASH3 also features an improved framework for unit testing, thereby improving the overall test suite. We therefore highly recommend an automated testing procedure for any software project, particularly one with multiple developers.

The validation example presented here showed good agreement between simulation and experiment, and demonstrated that the FLASH hydrodynamics module captures the bulk flow properties and the resolvable morphology. In general, however, our validation results thus far are more mixed. In most cases we have reached the limit of what can be done without modifying the code to include modules not relevant to the astrophysics problems of interest. Even in the encouraging validation problem described here, missing physics and imperfect convergence of the solutions forced us to concede that the simulations have not yet been “validated.” Complete details of the experiment and the validation study may be found in Kane et al. (2001) and Calder et al. (2002). We note that validation includes quantifying the effects of omitted physics, but we have not performed such a study on these results. We also note that in some cases the simulation-experiment comparison is limited by the resolution of the experimental diagnostics, including the diagnostics of the initial conditions (Weirs et al. 2005). Despite mixed results, validation has been worth the investment because of the better understanding of the relevant issues and the increased confidence in the results. Unanswered questions, such as the role of small-scale structure, call attention to issues that might be overlooked in astrophysical simulations for which there cannot be a quantitative comparison to experimental data.

We end with observations on the validation process. Determining the metric for comparison and extracting a result from the experimental and simulational data requires input from both experimentalists and theorists, and we conclude that a meaningful validation program must include both parties. Experimentalists also benefit from validation, as the process suggests metrics for comparison, provides useful diagnostics, and supplies a virtual model that can guide the design of future experiments.

Acknowledgments. This work is supported in part at the University of Chicago by the U.S. Department of Energy under Grant B523820 to the ASC Alliances Center for Astrophysical Thermonuclear Flashes. The authors gratefully acknowledge the contributions of the many people who participated in the development of FLASH and in the continuing V&V effort. The authors thank Nathan Hearn and Lynn Reid for thoughtful conversations and for previewing this manuscript.

References

AIAA 1998, Guide for the Verification and Validation of Computational Fluid Dynamics Simulations, AIAA Report G-077-1998 (Reston, VA: American Institute of Aeronautics and Astronautics)

Arnett, D., Fryxell, B., & Müller, E. 1989, ApJ, 341, L63

Calder, A. C., Curtis, B. C., Dursi, L. J., Fryxell, B., Henry, G., MacNeice, P., Olson, K., Ricker, P., Rosner, R., Timmes, F. X., Truran, J. W., Tufo, H. M., & Zingale, M. 2000, in Proc. SC2000 (Los Alamitos: IEEE Computer Soc.), http://sc2000.org

Calder, A. C., Fryxell, B., Plewa, T., Rosner, R., Dursi, L. J., Weirs, V. G., Dupont, T., Robey, H. F., Kane, J. O., Remington, B. A., Drake, R. P., Dimonte, G., Zingale, M., Timmes, F. X., Olson, K., Ricker, P., MacNeice, P., & Tufo, H. M. 2002, ApJS, 143, 201

Fryxell, B., Olson, K., Ricker, P., Timmes, F. X., Zingale, M., Lamb, D. Q., MacNeice, P., Rosner, R., Truran, J. W., & Tufo, H. 2000, ApJS, 131, 273

Hearn, N., et al. 2006, in preparation

Kane, J. O., Robey, H. F., Remington, B. A., Drake, R. P., Knauer, J., Ryutov, D. D., Louis, H., Teyssier, R., Hurricane, O., Arnett, D., Rosner, R., & Calder, A. 2001, Phys. Rev. E, 63, 055401(R)

MacNeice, P., Olson, K. M., Mobarry, C., de Fainchtein, R., & Packer, C. 2000, Comp. Phys. Commun., 126, 330

Oberkampf, W. L., Trucano, T. G., & Hirsch, C. 2004, Appl. Mech. Rev., 57, 345

Roache, P. J. 1998, Verification and Validation in Computational Science and Engineering (Albuquerque: Hermosa)

Weirs, G., Dwarkadas, V., Plewa, T., Tomkins, C., & Marr-Lyon, M. 2005, Ap&SS, 298, 341
