
MOLECULAR PHYSICS 2018, VOL. 116, NOS. 15–16, 2061–2069 https://doi.org/10.1080/00268976.2018.1471532

THERMODYNAMICS 2017

Highly scalable discrete-particle simulations with novel coarse-graining: accessing the microscale

Timothy I. Mattox a, James P. Larentzos b, Stan G. Moore c, Christopher P. Stone d, Daniel A. Ibanez e, Aidan P. Thompson c, Martin Lísal f,g, John K. Brennan b and Steven J. Plimpton c

a Engility Corporation, DoD High Performance Computing Modernization Program (HPCMP) PETTT Group, U.S. Army Research Laboratory, Aberdeen Proving Ground, MD, USA; b Weapons and Materials Research Directorate, U.S. Army Research Laboratory, Aberdeen Proving Ground, MD, USA; c Multiscale Science Department, Sandia National Laboratories, Albuquerque, NM, USA; d Computational Science and Engineering, LLC, DoD High Performance Computing Modernization Program (HPCMP) PETTT Group, U.S. Air Force Research Laboratory, Dayton, OH, USA; e Computational Multiphysics Department, Sandia National Laboratories, Albuquerque, NM, USA; f Department of Molecular and Mesoscopic Modelling, Institute of Chemical Process Fundamentals of the CAS, v. v. i., Prague, Czech Republic; g Department of Physics, Faculty of Science, J. E. Purkinje University, Ústí n. Lab., Czech Republic

ABSTRACT


Simulating energetic materials with complex microstructure is a grand challenge, where until recently, an inherent gap in computational capabilities had existed in modelling grain-scale effects at the microscale. We have enabled a critical capability in modelling the multiscale nature of the energy release and propagation mechanisms in advanced energetic materials by implementing, in the widely used LAMMPS molecular dynamics (MD) package, several novel coarse-graining techniques that also treat chemical reactivity. Our innovative algorithmic developments rooted within the dissipative particle dynamics framework, along with performance optimisations and application of acceleration technologies, have enabled extensions in both the length and time scales far beyond those ever realised by atomistic reactive MD simulations. In this paper, we demonstrate these advances by modelling a shockwave propagating through a microstructured material and comparing performance with the state-of-the-art in atomistic reactive MD techniques. As a result of this work, unparalleled explorations in energetic materials research are now possible.

ARTICLE HISTORY: Received 10 January 2018; Accepted 16 April 2018; Published online 14 May 2018

KEYWORDS: coarse-graining; dissipative particle dynamics; energetic materials; microstructure; reaction kinetics; scalable parallel simulation

CONTACT: Timothy I. Mattox, [email protected]

Supplemental data for this article can be accessed at https://doi.org/10.1080/00268976.2018.1471532. © 2018 Informa UK Limited, trading as Taylor & Francis Group.

Introduction

Many types of manufactured materials are inherently heterogeneous, composed of polycrystalline grains, either as a result of processing or by design, such as in additive manufacturing. Microstructure at the grain scale can comprise various inhomogeneities, e.g. voids, defects within and between grains, or species variability such as mixtures, additives and fillers. Such microstructural heterogeneity dictates the macroscopic material properties and


material response to thermal and mechanical loading present in most applications and technologies. Thus, it is critical to understand the role of microstructure for nearly all materials in soft and condensed matter. Modelling and simulation of materials with microstructure is challenging, due to the disparate spatial and temporal scales over which the phenomena that govern the material behaviour occur. For example, material behaviour at the mesoscale is linked to the distribution



of grain properties (i.e. grain size, orientation, composition, etc.), as well as the curvature and interconnectivity of the boundaries between grains. At the atomistic scale, the local atomic or molecular arrangement and composition at grain boundaries play important roles in material performance and grain evolution. Materials that undergo chemical reactions further add to the complexity. Energetic materials composed of molecular crystalline grains are one such class of materials. In such materials, chemical decomposition and energy release occur rapidly at the molecular scale, yet the explosive response of the material manifests itself at the macroscale. Prior to the efforts described here, an inherent gap in computational capabilities existed due to these challenges. Atomistic-scale simulations are, in principle, viable, but their extreme computational demand limits simulations to simple, idealised microstructures. Therefore, simulating the competing effects of microstructural heterogeneities is not possible. Moreover, existing field-based and continuum-level approaches lack the fidelity needed to explicitly simulate microstructure, so that critical phenomena may not be properly captured. To overcome the above obstacles, our groups have recently developed micro- and mesoscale computational capabilities necessary to represent the salient physical and chemical features of polycrystalline energetic materials through coarse-grain (CG) particle approaches [1–4]. However, until now, the optimised high performance computing (HPC) implementations that enable grain-scale simulations of energetic materials have been lacking. As outlined in this paper, with increasing computing capabilities, algorithmic developments, and performance optimisations, these massive particle-based simulations are enabling research explorations into understanding the interdependence of these microstructure effects. Our computational capabilities are demonstrated by propagating a shockwave through a CG model of a common energetic material, RDX (hexahydro-1,3,5-trinitro-s-triazine, C₃H₆N₆O₆) [4], with grain-scale microstructure. To capture the range of simulated responses of energetic materials, chemical reactivity must be included in the model. The state-of-the-art method for modelling such systems requires atomistic reactive molecular dynamics (MD) simulations. These rely on interatomic potentials that properly describe the energy landscape and barriers between the initial materials, the reaction products, and the relevant transition structures. Numerous atomistic reactive potentials are available [5], with ReaxFF [6,7] likely the most widely used model. For example, atomistic reactive MD simulations of energetic materials have enabled direct study of the initiation and growth of hotspots (i.e. localised high-temperature

regions that can be generated by a variety of mechanisms, including exothermic chemical reactivity) created near a single microstructural heterogeneity [8] when the material is subjected to thermal and mechanical loading. However, reactive MD simulations of even a nanometre-sized heterogeneity require petascale computational resources. As such, the simulation of realistic samples containing an assembly of hundreds to thousands of grains or other microscale heterogeneities would require O(years) of wall-clock time on current petascale supercomputers – an impractical computational expense, before even considering whether sufficient memory would be available for such a model. Particle-based microscale simulation methods using CG models currently offer a promising route for extending the power of atomistic modelling towards the mesoscale. CG models are developed by grouping a set of smaller entities (e.g. atoms or molecules) into a single, larger entity. The computational speedup arises from the mapping of the atomistic degrees of freedom (DOF) onto the reduced set of CG DOF. For example, the 21 atoms in an RDX molecule are reduced to a single CG particle in the present work. However, features lost during this coarse-graining process will not be adequately recovered unless correctly reintroduced through an appropriate CG methodology. For instance, as a direct consequence of losing molecular DOF during coarse-graining, the standard MD approach is inadequate because the CG models exhibit faster dynamics than their atomistic counterparts [1]. Furthermore, at the atomistic scale, the formation and breaking of chemical bonds is treated explicitly and is conceptually intuitive; at the microscale, however, CG models and methods must collectively capture and recover the relevant chemical features lost during coarse-graining. CG models cannot replicate the full fidelity of atomistic models. Rather, they are intended to produce results of sufficient accuracy across a much broader range of material properties and conditions and, crucially, at considerably lower computational expense. Therefore, proper care must be taken in recovering the salient DOF that would otherwise be lost. The computational capabilities described here are rooted within the dissipative particle dynamics (DPD) method. DPD is a well-established coarse-graining technique for the simulation of soft matter that has only recently been adapted to energetic material composites [1–3]. As part of the effort described here, a suite of discrete-particle methods based on the DPD method has been developed, of which the energy-conserving variant (DPD-E) is particularly suitable for non-equilibrium simulation scenarios and thermally variant conditions [9,10]. DPD-E uniquely treats the CG DOF through both the dissipative forces and a particle internal energy term,


where the latter provides a numerical means of ensuring energy conservation during the simulation. Furthermore, the particle internal energy provides an additional mechanism for recovering the CG DOF, which is essential for accurately reproducing the atomistic model behaviour [1]. A unique capability within the suite of DPD methods is a variant that includes a microscale description of chemical reactivity (DPD-RX) [1]. The description of the project thus far has focused on the advances made from an algorithmic viewpoint, i.e. advancements in CG modelling and the DPD methodology. However, a second key element of this project has been the HPC implementation of these algorithmic advances, including code optimisation, acceleration, and adaptation to heterogeneous computer architectures. The advancement of both the algorithms and their computational performance is critical to accessing the spatial and temporal phenomena at the microscale. Several components of the DPD variants, such as the integration of the stochastic equations of motion, present unique challenges for HPC implementation and optimisation. The DPD-RX computational framework described here has been developed to address the gap in energetic material modelling capabilities, now enabling research explorations in scientific areas not previously possible. CG models and methods that include chemical reactivity can now be applied to investigate interacting microstructural heterogeneities at the grain scale. Both the innovative algorithmic developments and the computational performance enhancements that were critical in achieving this goal are briefly described below. Because this work focuses on computational performance, a quantitative comparison of the shockwave response of the CG model and the atomistic reactive MD model is not presented here. However, under non-reactive conditions, excellent quantitative agreement with atomistic MD simulations has been demonstrated elsewhere [1].
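For orientation, the following is a schematic of the conventional isothermal DPD equations of motion on which these variants build; the notation is generic and is not meant to reproduce the exact DPD-E/DPD-RX formulations of [1–3,9,10], which additionally evolve per-particle internal energies and species concentrations:

\[
m_i \dot{\mathbf{v}}_i = \sum_{j \neq i} \left( \mathbf{F}^{\mathrm{C}}_{ij} + \mathbf{F}^{\mathrm{D}}_{ij} + \mathbf{F}^{\mathrm{R}}_{ij} \right),
\quad
\mathbf{F}^{\mathrm{D}}_{ij} = -\gamma\, w^{\mathrm{D}}(r_{ij})\, (\mathbf{e}_{ij}\cdot\mathbf{v}_{ij})\, \mathbf{e}_{ij},
\quad
\mathbf{F}^{\mathrm{R}}_{ij} = \sigma\, w^{\mathrm{R}}(r_{ij})\, \zeta_{ij}\, \Delta t^{-1/2}\, \mathbf{e}_{ij},
\]

with the fluctuation–dissipation constraints $w^{\mathrm{D}} = (w^{\mathrm{R}})^{2}$ and $\sigma^{2} = 2\gamma k_{\mathrm{B}} T$, where $\mathbf{e}_{ij}$ is the unit vector between particles $i$ and $j$, $\mathbf{v}_{ij}$ their relative velocity, and $\zeta_{ij}$ a symmetric Gaussian random variate. In the energy-conserving variants, the heat generated by the dissipative interactions is deposited into the particle internal energies rather than removed by a fixed-temperature thermostat.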

Methods

CG models and methodologies

The DPD-based methods incorporate CG models upscaled from quantum-based models. For the application presented here, an RDX molecule is represented by an isotropic one-site CG model (CG-RDX) obtained by the multiscale coarse-graining method (MS-CG), which is a particle-based force-matching approach for deriving effective interaction potentials [4]. Each CG-RDX particle represents a single 21-atom RDX molecule, resulting in a reduction of 20 DOF between the atomistic and CG models.
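As a schematic of the force-matching idea underlying MS-CG (the precise variational statement and basis-set details are given in [4] and references therein; the notation below is generic), the effective CG forces are chosen to minimise the residual between the atomistic forces mapped onto the CG sites and the forces predicted by the CG potential over a set of reference configurations:

\[
\chi^{2} = \frac{1}{3 N n_{t}} \sum_{t=1}^{n_{t}} \sum_{I=1}^{N}
\left| \mathbf{f}^{\mathrm{ref}}_{I}(\mathbf{r}^{n}_{t}) - \mathbf{F}^{\mathrm{CG}}_{I}\!\left(\mathbf{R}^{N}_{t}\right) \right|^{2},
\]

where $\mathbf{f}^{\mathrm{ref}}_{I}$ is the net atomistic force mapped onto CG site $I$ in configuration $t$, $\mathbf{R}^{N}_{t}$ are the corresponding mapped CG coordinates, and the minimisation runs over the parameters of the CG pair potential.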


In the DPD-RX framework, the particle interaction potential does not explicitly break and form chemical bonds. Rather, the evolving chemistry is modelled through a CG-reactor depiction, such that chemical reactivity occurs within the local volume of the CG particle. At any given time, a particle represents a collection of molecular species at given concentrations. The CG-reactor is directly coupled to concentration-dependent interaction potential models that account for the particle transformation as the reactions progress [1]. At every DPD time-step, the chemical kinetic rate equations governing the concentrations are solved for all particles by numerical integration with a deterministic Runge–Kutta–Fehlberg scheme [11,12]. For the RDX decomposition model presented here, the chemical character of the particles can vary from pure RDX to a mixture of small-molecule product gases, and states in between. The decomposition reactions are described through a four-step reaction rate model [1,13] describing the decomposition of RDX and the combustion of its decomposition products.

In the energetic material composite models considered here, the samples are composed of CG particles arranged in a multifaceted polycrystalline structure with complex, planar interfaces. The polycrystalline RDX microstructure geometry is constructed using a Voronoi tessellation method [14,15], where each grain is a defect-free lattice of CG particles at a random 3D orientation. Further details of the CG model and methods used in this work are described elsewhere [1,13]. The initial development of the CG methodologies was implemented and validated within a research-level serial version of the code, which is limited to O(10³)-particle simulations [1]. This limitation motivated our implementation of DPD-RX within the LAMMPS software [16].

Stochastic integration schemes

Numerical integration of the stochastic equations of motion requires special consideration due to the pairwise coupling of particles through the random and dissipative forces within the DPD method. The most commonly applied numerical integration scheme for DPD applications originates from the velocity-Verlet (VV) integrator typically used for MD. However, for the energy-conserving DPD-E method, the required time-step is prohibitively small for applications to realistic material models involving dynamic processes (see Figure 1(a)). As such, schemes based upon the Shardlow-splitting algorithm (SSA) have been developed [17,18]. To date, these are considered the most efficient, stable integration schemes available for DPD simulations [2,3].
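Schematically, and glossing over details that differ among the published variants [2,3,17,18], the Shardlow approach splits the DPD(-E) propagator into a conservative part, integrated with standard velocity Verlet, and a product of pairwise dissipative/random parts, each advanced semi-implicitly over the full time-step:

\[
e^{\mathcal{L}\,\Delta t} \;\approx\; \Big[ \prod_{i<j} e^{\mathcal{L}^{\mathrm{DR}}_{ij}\,\Delta t} \Big]\, e^{\mathcal{L}^{\mathrm{C}}\,\Delta t},
\]

where $\mathcal{L}^{\mathrm{C}}$ propagates positions and velocities under the conservative forces only, and each $\mathcal{L}^{\mathrm{DR}}_{ij}$ updates only the velocities (and, for DPD-E, the internal energies) of the interacting pair $(i,j)$ through its dissipative and random forces. Because every pairwise sub-step both reads and writes the momenta of two particles, the ordering of these updates creates the data dependencies addressed by the parallelisation scheme described next.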


Figure 1. (a) The DPD-E method is applied to a 2500 × 40 × 40 nm³ polycrystalline RDX sample. All the SSA curves overlap and thus show the conservation of energy during the 25 ps DPD-E simulation for time-steps as large as 20 fs, while the VV integration scheme requires < 0.05 fs time-steps. Although a 20 fs integrator time-step is viable, a conservative 10 fs time-step is used for all benchmarks presented in this work. (b) A 2D representation of the directional communication for two arbitrary processors (labelled 10 and 11). The shaded regions represent the actively interacting particle pairs at one stage of the SSA parallelisation scheme (reproduced by permission from [2]).

For DPD-E, time-step increases of O(10³) were achieved compared with previously used integration schemes.

A parallel stochastic integration scheme

While the SSA approach has clear advantages over traditional integration techniques, it has not been widely adopted, partly due to its nontrivial implementation within massively parallel domain-decomposition codes. The design and implementation of the first-ever parallel version of the SSA is a critical component of the LAMMPS implementation, as it overcomes the time-step restrictions of other integration techniques [2,3]. Efficient parallelisation of the SSA requires complex communication patterns and significant coordination between computation and communication due to its pseudo-recursive nature. In particular, each particle's current momentum is both used in and updated by the SSA integration loop. If care is not taken, a data-dependency race condition would exist when two interacting particles are updated concurrently, risking the loss of particle momentum during updates. For a spatially decomposed MPI parallelisation of the SSA, we previously developed a directional communication scheme that avoids this race condition [2] (see Figure 1(b)). Furthermore, we recently implemented a multi-threaded, shared-memory parallelisation of the SSA that exploits concurrency within each MPI spatial-decomposition domain. This multi-threaded SSA algorithm uses a geometric-isolation scheme to preclude the race condition between threads. Although the MPI and thread-parallel implementations require more inter-process communication and coordination, the SSA exhibits improved stability

over longer time-steps compared with traditional DPD integrators, justifying its regular use in parallel applications. This finding is particularly true for DPD-E simulations, where the time-to-solution is improved by up to two orders of magnitude [2].

Adaptation into an HPC framework

The suite of DPD methods, including DPD-RX, has been integrated into the LAMMPS MD code [16] under the USER-DPD add-on package, enabling large-scale simulations of complex microstructures involving O(10⁹) CG particles. LAMMPS is a highly scalable domain-decomposition software originally developed at Sandia National Laboratories (SNL), and is one of the premier MD simulators for computational chemistry and materials science applications. The LAMMPS USER-DPD package leverages existing capabilities within LAMMPS with regard to the parallel communication framework, dynamic load balancing, parallel I/O, and data analysis techniques. Several features within LAMMPS USER-DPD include extensions to the energy- and enthalpy-conserving DPD variants, the SSA numerical integration scheme, the concentration-dependent interaction potentials, and efficient reaction kinetics solvers [11,12] with adaptive time-stepping. The LAMMPS USER-DPD package was optimised for use on heterogeneous architectures by leveraging the intra-node parallelism exposed through the Kokkos library. Kokkos was developed by SNL as a performance-portability abstraction layer that enables code to use vector, thread, and task-based parallelism [19]. A key feature of Kokkos is that it is a single-source system;


i.e. code can be written once in a form independent of the target hardware (e.g. CPU or GPU). The Kokkos C++ templates generate target-specific code based on compile-time options. This feature is critical for handling a wide variety of hardware, and protects developers from maintaining multiple, platform-specific versions of code. Further details about the Kokkos implementation of LAMMPS USER-DPD are provided in the Supplemental Material.
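To illustrate the single-source model, the fragment below is a generic Kokkos usage sketch (not code from the USER-DPD package): a simple per-particle update written once with Kokkos::parallel_for that runs as OpenMP threads on a CPU or as a CUDA kernel on a GPU, depending solely on build-time options.

```cpp
#include <Kokkos_Core.hpp>
#include <cstdio>

int main(int argc, char* argv[]) {
  Kokkos::initialize(argc, argv);
  {
    const int n = 1 << 20;

    // Device-resident arrays; the memory space is selected by the backend
    // chosen at compile time (e.g. host memory or GPU device memory).
    Kokkos::View<double*> vx("vx", n);
    Kokkos::View<double*> fx("fx", n);

    // Initialise forces and velocities in parallel on the chosen backend.
    Kokkos::parallel_for("init", n, KOKKOS_LAMBDA(const int i) {
      fx(i) = 1.0e-3;
      vx(i) = 0.0;
    });

    // Single-source velocity update: the same lambda body executes on
    // whichever backend the library was built for, without source changes.
    const double dt_over_m = 1.0e-2;  // illustrative constant, not a real parameter
    Kokkos::parallel_for("half_kick", n, KOKKOS_LAMBDA(const int i) {
      vx(i) += dt_over_m * fx(i);
    });

    // Parallel reduction as a simple correctness check.
    double sum = 0.0;
    Kokkos::parallel_reduce("sum_vx", n,
      KOKKOS_LAMBDA(const int i, double& acc) { acc += vx(i); }, sum);

    std::printf("sum of vx = %g\n", sum);
  }
  Kokkos::finalize();
  return 0;
}
```

Building the same source with the OpenMP or CUDA backend enabled in the Kokkos build configuration selects the target hardware; no platform-specific branches appear in the application code.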

Simulation details

To demonstrate the new capability for simulating microstructure-dependent material behaviour, a polycrystalline RDX sample representative of experimentally observed microstructures [20] is simulated under shock conditions. With proper modelling of the shockwave interaction with the grain interfaces and of the chemical response to the resulting temperature and pressure fluctuations, the overall response of the energetic material is dramatically different from that of a defect-free, homogeneous crystal. In other words, initiation of the explosive would not be properly simulated if microstructure were ignored. The sample was chosen to be 2500 nm long in the x-direction, so that an impact velocity of 2.25 km/s on the left edge allows approximately 0.35 ns of shockwave propagation before the shock front reaches the right edge (positive x-direction) of the sample. This duration is sufficient to observe interactions with multiple defects and the secondary shockwaves they generate, as well as the collapse of voids, which transfers energy to surrounding particles and initiates localised chemical reactions. To explore the performance and to better characterise the material behaviour, three sample thicknesses of 40, 100, and 300 nm in the y- and z-directions are considered, as listed in Table 1 and shown in Figure 2.
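As a quick consistency check on this propagation window (using the ≈ 6.9 km/s average shock speed reported in the Results section, i.e. 6.9 nm/ps), the transit time across the sample length follows as

\[
t \;\approx\; \frac{L}{u_{s}} \;=\; \frac{2500\ \mathrm{nm}}{6.9\ \mathrm{nm/ps}} \;\approx\; 0.36\ \mathrm{ns},
\]

in line with the approximately 0.35 ns of shock propagation budgeted before the front reaches the far edge.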


Table 1. Polycrystalline sample details.

                         Small        Medium       Large
  Sample thickness       40 nm        100 nm       300 nm
  Average grain size     30 nm        75 nm        225 nm
  No. of grains          243          90           31
  No. of atoms           4.3 × 10⁸    2.7 × 10⁹    2.4 × 10¹⁰
  No. of CG particles    2.0 × 10⁷    1.3 × 10⁸    1.1 × 10⁹
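Since each CG particle maps to a single 21-atom RDX molecule, the ratio of atoms to CG particles in Table 1 should be close to 21, which provides a quick consistency check on the listed sample sizes:

\[
\frac{4.3\times10^{8}}{2.0\times10^{7}} \approx 21.5,
\qquad
\frac{2.7\times10^{9}}{1.3\times10^{8}} \approx 21,
\qquad
\frac{2.4\times10^{10}}{1.1\times10^{9}} \approx 22,
\]

with the small spread attributable to the two-significant-figure rounding of the tabulated counts.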

For each distinct phase of the simulation runs, the total wall-clock time, including MPI launch time and I/O time, is recorded. At start-up, a binary restart file (up to 126 GB in size) containing the initial positions, velocities, and CG particle internal states is read in parallel via MPI-IO. The MPI processes then exchange their particles with the proper owners based on the current balanced spatial decomposition of the simulated volume. Prior to impacting the left edge of the sample to initiate the shock, the sample is equilibrated through a simulation protocol that consists of a 0.1 ns enthalpy-conserving DPD [3] simulation (using a Nosé-Hoover barostat damping parameter of 0.01 ns), followed by a 0.05 ns energy-conserving DPD simulation. The sample is then impacted at the left edge, creating a shock wave that propagates through the material for an additional 0.35 ns of simulation. In total, a 0.5 ns simulation using a 10 fs time-step is conducted to model the equilibration and shock wave propagation through the sample. Five snapshots (up to 347 GB each) of the system state are recorded at 0.15, 0.2, 0.3, 0.4, and 0.5 ns of simulated time, totalling over 1.5 TB of recorded data for the large sample (see Table 1). In addition, several hundred spatially distributed histograms of various particle state variables are recorded every 0.005 ns in order to collect simulation results at a higher temporal resolution. To explore the computational performance of the LAMMPS USER-DPD package, simulations were conducted on three different HPC systems (see Supplemental Table 1). The first system, Thunder, has traditional multi-core Intel E5-2699v3 Haswell (HSW) CPUs.
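In terms of integrator steps, the protocol above (0.1 ns + 0.05 ns of equilibration followed by 0.35 ns of shock propagation) amounts to

\[
N_{\mathrm{steps}} \;=\; \frac{0.5\ \mathrm{ns}}{10\ \mathrm{fs}} \;=\; 5\times10^{4}
\]

time-steps for the full 0.5 ns simulation of each sample.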

Figure 2. Three benchmark 3D polycrystalline RDX samples, where the interfaces between crystal grains are delineated by the colouring scheme [21]: (top) Small (2500 × 40 × 40 nm³), (middle) Medium (2500 × 100 × 100 nm³), and (bottom) Large (2500 × 300 × 300 nm³). Note: the white regions indicate where grains span the entire thickness of the sample. There are no voids in these samples.


Figure 3. The time evolution of a 3D polycrystalline RDX sample of size 2500 × 300 × 300 nm³ under shock is shown (left frames), where individual crystal grains are delineated by the colouring scheme [21]. The reaction progress and location of the triple junctions within the shocked, polycrystalline RDX sample are shown at various times (right frames). For visual clarity in the reaction progress snapshots, the unreacted material is not shown in order to depict the surface area of the reaction zones.

The other two systems, Trinity and Stampede2, have many-core Intel Xeon Phi Knights Landing (KNL) self-hosted processors. The HPC system-specific details, along with compile-time and run-time specifications, are provided in the Supplemental Material.

Results and analysis

The newly developed methods and software within the USER-DPD package distributed with LAMMPS are applied to simulate shock in the three polycrystalline samples, enabling investigations of microstructural features that would otherwise have been impractical with atomistic ReaxFF simulations (cost estimates using the LAMMPS USER-REAXC package [16,22,23] are provided in Supplemental Table 2). Figure 3 shows the progression of a compressive shock wave through the Large, O(10⁹)-particle sample. The flyer plate strikes the sample with an impact speed of 2.25 km/s, generating a shockwave that propagates through the sample in the positive x-direction at an average shock speed of ∼ 6.9 km/s. The propagation of the shockwave results in plastic deformation of the material and a transfer of kinetic energy into the particle internal energy. The particle internal temperature and density profiles are monitored along the shock direction of the sample (see Figure 4), where the density of the material trailing the shock front increases to ∼ 2.7 g/cm³ and the average particle internal temperature rises above 1000 K. Shortly after impact, some initial chemical reactivity at the impact surface is observed, corresponding to particle internal temperatures exceeding 1400 K. As the shockwave continues to propagate through the sample, energy localises at various triple junctions (i.e. the intersection of three crystal

grains), creating local hot-spots (> 1500 K) and initiating chemical reactions. The grain interfaces and triple-junction lines (with overlaid reaction zones) are illustrated in the left and right frames of Figure 3, respectively. The exothermically reacting particles release heat to neighbouring particles (see Supplemental Figure 1 for temperature profiles), causing the reaction zones to expand and evolve over the duration of the simulation. The simulations are conducted for approximately 0.5 ns, at which point the shock wave approaches the end of the 2.5 µm long sample.

Next, the computational performance is considered. The samples are simulated on three different supercomputers using a variety of node and core counts (see Supplemental Table 3). Figure 5 presents the weak and strong scaling results on log–log graphs. Ideal strong scaling is shown as the dashed diagonal black lines. Note that, if shown, ideal weak scaling would be represented by horizontal lines for each of the computational intensities. Also note that the rightmost data in Figure 5(a) show that the Large sample (i.e. O(10⁹) CG particles), run on over half a million cores of Trinity/KNL, achieved a parallel efficiency of 62% on one of the largest US supercomputers available. Further details of the computational performance on the three supercomputers can be found in the Supplemental Material.

The particle-based methods detailed here are able to capture the short-time physics that evolve immediately after the shockwave passes through the sample and that depend upon the microstructural heterogeneities. These large-scale simulations enable the exploration and understanding of the fundamental mechanisms that influence material performance. Since the emphasis of this work is the HPC implementation and performance, a quantitative comparison of the shockwave response of the CG model and the atomistic reactive MD model was not demonstrated.


Figure 4. The time evolution of the particle internal temperature (left frames) and density (right frames) profiles for a 3D polycrystalline RDX sample of size 2500 × 300 × 300 nm³ under shock is shown. The profiles are computed by sub-dividing the x-axis into a total of 360 bins of equal size.

Figure 5. (a) Log–log USER-DPD scaling results for Trinity/KNL: strong scaling on three sample sizes, Small (2.0 × 10⁷ particles), Medium (1.3 × 10⁸ particles), and Large (1.1 × 10⁹ particles); weak scaling at three computational intensities, low (∼ 570 particles/core), moderate (∼ 2000 particles/core), and high (∼ 5070 particles/core). A few data points for Stampede2 are marked with triangles, showing very similar performance for the same configurations as on Trinity/KNL. (b) Log–log USER-DPD scaling for Thunder: strong scaling on two sample sizes, Small (2.0 × 10⁷ particles) and Medium (1.3 × 10⁸ particles); weak scaling at two computational intensities, moderate (∼ 5000 particles/core) and high (∼ 11,200 particles/core).


However, elsewhere under non-reactive conditions, excellent quantitative agreement was demonstrated with atomistic MD simulations [1]. Moreover, qualitative agreement has been found for the adiabatic heating response between the CG simulations and atomistic reactive MD simulations [1,24,25].
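As a rough plausibility check on the reported shock state (assuming an ambient RDX crystal density of about 1.8 g/cm³, which is not quoted in the text, and taking the particle velocity behind the front to equal the 2.25 km/s drive velocity), the one-dimensional mass-conservation jump condition gives

\[
\rho_{1} \;=\; \rho_{0}\,\frac{u_{s}}{u_{s}-u_{p}} \;\approx\; 1.8\ \mathrm{g/cm^{3}} \times \frac{6.9}{6.9-2.25} \;\approx\; 2.7\ \mathrm{g/cm^{3}},
\]

consistent with the ∼ 2.7 g/cm³ plateau observed behind the shock front in Figure 4.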

Conclusions

A general computational software package for discrete-particle simulations was described that enables practical simulations of O(10⁹) particles. The DPD-RX computational framework can be considered general since other materials may be modelled without altering the method framework or computational code, thereby enabling research explorations that extend far beyond the energetic material composites presented here. To date, computational investigations of the microstructure-dependent material response have mainly focused on the effect of a single void or a single grain interface, but not on the interaction between microstructural heterogeneities themselves. Our previous simulations of single ideal interfaces of randomly oriented crystals under the same shock conditions as presented in this work exhibited minimal response, where the particle temperatures at the grain interface did not exceed the threshold to initiate chemistry over the timescale of the simulations. The degree to which hot-spots form and reactions evolve depends upon a number of factors that are neglected in such simplified models, including the average grain-size distribution, the grain orientation and composition, the grain interface curvature and interconnectivity, the porosity, and the speed, shape and character of the shock front passing through the sample. Variations in the local geometry and density, or the reflection of shockwaves through a polycrystalline sample, may lead to significantly different hot-spot formation mechanisms, which will ultimately dictate the macroscopic material response.

The DPD-RX implementation in the LAMMPS USER-DPD package detailed here evolved from a research-level serial version of the code that was limited to simulations of O(10³) CG particles [1]. A description of the software optimisations for applications on highly scalable, heterogeneous HPC architectures was presented. For this particular study, it was demonstrated that the application of state-of-the-art atomistic reactive MD simulations is not computationally feasible for probing the effects of grain-scale microstructural heterogeneities on material performance. For the largest system size examined, the time-to-solution for the CG model is

O(hours), whereas the fully atomistic model is O(years) using O(10) PFLOP/s supercomputing systems. Clearly, the application of these CG models and methods that treat chemical reactivity extends the length and time scales well beyond those currently realisable in atomistic simulations, enabling explorations of material properties for microstructure-dependent material systems that were not previously possible. For simulations beyond the microscale, these capabilities can be used to assess the macroscale response of energetic materials via a hierarchical multiscale simulation approach [26–28] that bridges the LAMMPS USER-DPD package with continuum-level hydrodynamic modelling applications (e.g. Lawrence Livermore National Laboratory's Arbitrary Lagrangian-Eulerian multi-physics code, ALE3D) [29]. Finally, the ramifications of the software development go beyond the discovery of new science and materials. For example, the computational savings of the CG method relative to analogous atomistic simulations translate into lower energy consumption, which leads to financial savings and reduced environmental impact.

Acknowledgements

The authors acknowledge the support of several institutions and individuals, including the AFRL DSRC, ARL DSRC, and ERDC DSRC for providing HPC resources that have contributed to the research results reported within this paper, and Christian Trott (SNL), Ross Smith (Engility), and Brian Barnes (ARL) for their technical assistance. We especially thank the Alliance for Computing at Extreme Scale (ACES), a partnership between Los Alamos National Laboratory and Sandia National Laboratories for the US Dept. of Energy's NNSA, for providing computing time on the Cray Trinity system. We also thank the Texas Advanced Computing Center (TACC) at the University of Texas at Austin for early access to the Stampede2 system.

Disclosure statement

No potential conflict of interest was reported by the authors.

Funding

This study was supported by the US Dept. of Defense High Performance Computing Modernization Program (HPCMP) User Productivity Enhancement, Technology Transfer, and Training (PETTT) activity (GSA Contract No. GS04T09DBC0017 through Engility Corporation). ML acknowledges funding provided by the US Army RDECOM-Atlantic and the US Army Research Office [grant no. W911NF-16-1-0566] and by the Czech Science Foundation [grant no. P208-16-12291S]. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the US Department of Energy's National Nuclear Security Administration under contract DE-NA-0003525.


ORCID

Timothy I. Mattox http://orcid.org/0000-0001-5265-4848
James P. Larentzos http://orcid.org/0000-0002-9873-4349
Stan G. Moore http://orcid.org/0000-0001-8951-2886
Christopher P. Stone http://orcid.org/0000-0002-9621-5334
Daniel A. Ibanez http://orcid.org/0000-0002-6537-5589
Aidan P. Thompson http://orcid.org/0000-0002-0324-9114
Martin Lísal http://orcid.org/0000-0001-8005-7143
John K. Brennan http://orcid.org/0000-0001-9573-5082

References

[1] J.K. Brennan, M. Lísal, J.D. Moore, S. Izvekov, I.V. Schweigert, and J.P. Larentzos, J. Phys. Chem. Lett. 5 (12), 2144–2149 (2014).
[2] J.P. Larentzos, J.K. Brennan, J.D. Moore, M. Lísal, and W.D. Mattson, Comput. Phys. Commun. 185 (7), 1987–1998 (2014).
[3] M. Lísal, J.K. Brennan, and J. Bonet Avalos, J. Chem. Phys. 135 (20), 204105 (2011).
[4] J.D. Moore, B.C. Barnes, S. Izvekov, M. Lísal, M.S. Sellers, D.E. Taylor, and J.K. Brennan, J. Chem. Phys. 144 (10), 104501 (2016).
[5] K. Farah, F. Müller-Plathe, and M.C. Böhm, Chem. Phys. Chem. 13 (5), 1127–1151 (2012).
[6] A.C.T. van Duin, S. Dasgupta, F. Lorant, and W.A. Goddard, J. Phys. Chem. A 105 (41), 9396–9409 (2001).
[7] K. Chenoweth, A.C.T. van Duin, and W.A. Goddard, J. Phys. Chem. A 112 (5), 1040–1053 (2008).
[8] T.-R. Shan, R.R. Wixom, and A.P. Thompson, Phys. Rev. B 94 (5), 054308 (2016).
[9] J. Bonet Avalos and A.D. Mackie, EPL 40 (2), 141 (1997).
[10] P. Español, EPL 40 (6), 631 (1997).
[11] C.P. Stone and R.L. Davis, J. Propul. Power 29 (4), 764–773 (2013).
[12] E. Fehlberg, NASA Technical Report NASA-TR-R-315 (1969).
[13] B.C. Barnes, J.K. Brennan, E.F.C. Byrd, S. Izvekov, J.P. Larentzos, and B.M. Rice, in Computational Approaches to Understanding Chemical Reactivity Under High Pressures, edited by N. Goldman (Springer, Cham, Switzerland, submitted 2017).
[14] D. Foley, S.P. Coleman, G. Tucker, and M.A. Tschopp, Army Research Laboratory Technical Note ARL-TN-0806 (2016).
[15] M.E. Fortunato, J. Mattson, D.E. Taylor, J.P. Larentzos, and J.K. Brennan, Army Research Laboratory Technical Report ARL-TR-8213 (2017).
[16] S.J. Plimpton, J. Comput. Phys. 117 (1), 1–19 (1995).
[17] T. Shardlow, SIAM J. Sci. Comput. 24 (10), 1267–1282 (2003).
[18] G. Stoltz, Europhys. Lett. 76 (5), 849–855 (2006).
[19] H.C. Edwards, C.R. Trott, and D. Sunderland, J. Parallel Distrib. Comput. 74 (12), 3202–3216 (2014).
[20] C.B. Skidmore, D.S. Phillips, P.M. Howe, J.T. Mang, and J.A. Romero, in Proceedings of the 11th International Detonation Symposium, Snowmass Village, CO, p. 556 (1998).
[21] A. Stukowski, Modell. Simul. Mater. Sci. Eng. 18 (1), 015012 (2010).
[22] H.M. Aktulga, J.C. Fogarty, S.A. Pandit, and A.Y. Grama, Parallel Comput. 38 (4–5), 245–259 (2012).
[23] M.A. Wood, D.E. Kittell, C.D. Yarrington, and A.P. Thompson, Phys. Rev. B 97, 014109 (2018).
[24] J.P. Larentzos, M. Lísal, and J.K. Brennan, private communication (2016).
[25] K.L. Joshi and S. Chaudhuri, Phys. Chem. Chem. Phys. 17, 18790 (2015).
[26] E.B. Tadmor and R.E. Miller, Modeling Materials: Continuum, Atomistic, and Multiscale Techniques (Cambridge University Press, Cambridge, 2011).
[27] B.C. Barnes, K. Leiter, R. Becker, J. Knap, and J.K. Brennan, Modell. Simul. Mater. Sci. Eng. 25, 055006 (2017).
[28] A. Abdulle, E. Weinan, B. Engquist, and E. Vanden-Eijnden, Acta Numer. 21, 1–87 (2012).
[29] A.L. Nichols, Lawrence Livermore National Laboratory Report LLNL-SM-433954 (2010).
