A Computation Technique for Rigid Particle Flows in an Eulerian Framework Using the Multiphase DNS Code FS3D

Philipp Rauschenberger, Jan Schlottke, and Bernhard Weigand

Abstract  A new technique to simulate the motion of rigid particles was implemented in the in-house VOF-based code FS3D with the aim of studying freezing processes of undercooled water droplets in the atmosphere. Particle deformation is determined and terminal velocities of a free falling sphere are compared to analytic results for validation. The computations were performed on the NEC SX-9 platform of the HLRS.

1 Introduction

The ITLR code Free Surface 3D (FS3D) shall be extended to simulate the freezing of supercooled droplets in the atmosphere. This requires the code to handle three phases: ice (solid), water and air (fluids). A prerequisite is the ability to treat the motion of rigid bodies in fluids. The method presented here is based on the works of Patankar [6] and Sharma et al. [13]. The numerical technique does not use any model for fluid-solid interaction and can hence be used for direct numerical simulations (DNS). The idea is to treat the whole computational domain (including the particle domain) as fluid. A body force is formally introduced in the momentum equation to force rigid body motion. The translatory and angular momenta of the rigid body are computed by taking into account the control volumes occupied by the particle. The resulting rigid body velocity is then projected onto the velocity field in the particle domain. Sharma et al. [13] move the particle centroid according to its current translatory and rotatory velocity. In contrast, the method presented here convects the VOF variable defining the particle with the linear transport equation. The advantage is that existing features of FS3D like evaporation and future

Philipp Rauschenberger
Institut für Thermodynamik der Luft- und Raumfahrt, Universität Stuttgart, Pfaffenwaldring 31, 70569 Stuttgart, Germany, e-mail: [email protected]


developments like freezing of water can be used and implemented in a coherent fashion. First, the mathematical formulation of the technique is given, followed by a description of the numerical method. Then, the results of two test cases are presented. The report concludes with a section on performance analysis and improvements on the NEC SX-9.

2 Mathematical Formulation

The whole computational region is treated as fluid in a first step. However, an additional condition forcing the fluid region to perform rigid body motion must be imposed; Carlson [3] therefore introduced the term rigid fluid method. No model for fluid-particle interaction is necessary. The viscosity of the fluid representing a particle is the same as that of the surrounding fluid, and any model for the viscous behavior of the base fluid may be used; however, a Newtonian fluid is assumed here. Let Ω designate the whole computational domain (fluid and particle) and P(t) be the region occupied by a particle. For simplicity, only one particle will be considered, but the extension to multiple particles is straightforward. The condition of rigid body motion is represented by the familiar equation of mechanics:

u = U + ω × r   (1)

u is the velocity vector at a point of P(t). U and ω are the translatory and angular velocity relating to the particle centroid, r is the vector pointing from the centroid to the respective point within the particle. In the fluid domain Ω \ P(t),

∇ · u = 0   in Ω \ P(t)   (2)

must hold. In the particle domain P(t), a supplementary condition (4) appears:

∇ · u = 0   in P(t)   (3)

D[u] = (1/2) (∇u + (∇u)^T) = 0   in P(t)   (4)

Equation (4) demands no deformation in P(t) with D[u] = 0. This constraint also assures that the velocity field is divergence free and makes (3) redundant. Formally, the rigidity constraint gives rise to a tensor field L in the particle domain assuring rigid body motion. Thus, the stress tensor reads τ = −pI + L + S, where p is the pressure and I the identity tensor; the viscous stress tensor S = 2μ D[u] vanishes due to (4). According to Patankar et al. [7], the rigidity constraint can be imposed with


∇ · (D[u]) = 0   in P(t)   (5)

D[u] · n_γ = 0   on ∂P(t)   (6)

Hence, the main problem is how to incorporate the rigidity constraint into the numerical method of FS3D.
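As a stand-alone illustration (plain NumPy, not part of FS3D), the following sketch evaluates a velocity field of the form (1) on a small grid and verifies numerically that its rate-of-strain tensor D[u] vanishes, i.e. that (1) automatically satisfies the rigidity constraint (4):

```python
import numpy as np

# Rigid body velocity field u = U + omega x r on a small Cartesian grid and a
# finite-difference check that D[u] = 0.5*(grad u + (grad u)^T) vanishes.
n, h = 16, 1.0 / 16
x = (np.arange(n) + 0.5) * h
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
centroid = np.array([0.5, 0.5, 0.5])
U = np.array([0.3, -0.1, 0.2])           # translatory velocity
omega = np.array([1.0, 2.0, -0.5])       # angular velocity

rx, ry, rz = X - centroid[0], Y - centroid[1], Z - centroid[2]
u = U[0] + omega[1] * rz - omega[2] * ry
v = U[1] + omega[2] * rx - omega[0] * rz
w = U[2] + omega[0] * ry - omega[1] * rx

# velocity gradient via central differences: grad[i, j] = d u_i / d x_j
grad = np.array([np.gradient(comp, h) for comp in (u, v, w)])
D = 0.5 * (grad + grad.transpose(1, 0, 2, 3, 4))     # rate-of-strain tensor
print("max |D[u]| =", np.abs(D).max())               # ~ 1e-16, i.e. rigid motion
```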

3 Numerical Method

FS3D uses a finite volume method on a staggered Cartesian grid according to the MAC method by Harlow and Welch [5]. All scalar variables (e.g. f, ρ, p, T) are located at the cell centers, while the velocities u, v, w are stored at the centers of the cell faces. Convective terms are discretized by second order accurate upwind schemes, while diffusive terms are approximated by central difference schemes of second order accuracy. Fluxes are determined with classical Godunov-type schemes. However, fluxes of the volume fraction f over cell faces are computed with the reconstructed interface plane provided by the PLIC algorithm of Rider and Kothe [8].

The numerical scheme to solve the momentum equations is based on the projection method by Bell et al. [1]. It basically consists of two steps. First, all terms except for the pressure gradient are treated by explicit schemes, yielding an intermediate velocity. In the second projection step, the pressure field is calculated implicitly from the Poisson equation:

∇ · ( (1/ρ(f)) ∇p ) = (∇ · u) / δt   (7)

It enforces the solenoidal continuity equation, which is in fact a pure velocity condition due to incompressibility. A multigrid solver is used to solve the resulting linear system of equations. It is based upon a coarse grid Galerkin approximation, where the efficient red-black Gauss-Seidel algorithm in a W-cycle scheme is applied as smoother. The number of pre- and post-smoothing steps is adjusted during runtime, and an overrelaxation of the Gauss-Seidel scheme is introduced depending on the convergence rate. More detailed information about the applied numerical procedure in FS3D can be found in [9] and [12].

In the rigid fluid method, the whole computational domain is treated as fluid. Solid and fluid are only distinct due to a density difference in Ω \ P(t) and P(t). The surface tension of the rigid particle is zero. Let û be an intermediate velocity field in Ω. Rigid body motion must be imposed in P(t). An additional source term f_rigid is formally introduced and rigid body motion is projected onto P(t):

u_rigid = û + (Δt/ρ(f)) f_rigid   in P(t)   (8)

Equations to compute f_rigid can be gained from (5) and (6):


∇ · D[u_rigid] = ∇ · D[û + (Δt/ρ(f)) f_rigid] = 0   (9)

D[u_rigid] · n = D[û + (Δt/ρ(f)) f_rigid] · n = 0   (10)

In contrast, Patankar [6] proposed that

u_rigid = Û + ω̂ × r   (11)

must hold and the linear and angular momentum must be conserved in P(t). The momenta can be determined from a simple integration over all particle cell volumes:

M Û = ∫_P(t) ρ(f) û dV   (12)

J ω̂ = ∫_P(t) r × ρ(f) û dV   (13)

Then, the velocity field ŭ accounting for rigid body motion in P(t) reads:

ŭ = û          in Ω \ P(t)   (14)
ŭ = u_rigid    in P(t)       (15)

Some remarks must be made about the computation of Û and ω̂ on a staggered grid. Each element of the translatory velocity Û may be computed separately on the respective momentum control volumes where the velocities u, v, w are defined. To do so, the exact mass in the momentum control volumes (mass of solid + mass of fluid) is determined with the PLIC-reconstructed planes in each cell. By this means, a no-slip condition is obtained at the interface of particle and fluid and momentum is conserved. The elements of ω̂ are computed with a vector product of u and r (e.g. ω_x = r_y w − r_z v), where r points to the center of the control volume, while velocities are defined on the faces. Hence, mass averages of the velocity (denoted u_mcv) are determined in the center of the control volumes to compute r × u_mcv.

The update of the particle position and orientation is done within the Eulerian framework, when the VOF variable f is advected at the next time step. The velocity field assures rigid body motion due to the above adaptations. Therefore, the rigid body is not treated as a Lagrangian particle as in other methods (e.g. Sharma et al. [13]). Rigid body motion is divergence free by definition, but there will be a divergence in the velocity field on ∂P(t) (i.e. the interface cells) due to the projection of the rigid body velocity onto the velocity field. Therefore, a pressure correction according to (7) must be conducted after having imposed the rigid body velocity. In return, the pressure correction disturbs the velocity field within the particle and particularly in the interface cells. This is no problem as long as the particle density is much higher than the fluid density, ρ_p/ρ_f ≫ 1, because then the solver adds most of the acceleration due to pressure to the fluid phase and the particle velocity is almost untouched.


However, if ρ_p ≈ ρ_f, the particle may rather be deformed by the non-rigid-body velocity field. Two pressure corrections, each preceded by a rigid body correction, are therefore performed at each time step.
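The following sketch summarizes the momentum averaging (12)-(13) and the projection (14)-(15) on a collocated grid with f as particle indicator. This is a simplification: FS3D works on a staggered grid and obtains the exact masses in the momentum control volumes from the PLIC reconstruction; the function name and array layout below are illustrative only.

```python
import numpy as np

def project_rigid_body(u, f, rho_p, h, centroid):
    """Impose rigid body motion inside the particle, cf. Eqs. (12)-(15).
    u: velocity field, shape (3, nx, ny, nz); f: particle volume fraction;
    rho_p: particle density; h: cell size; centroid: particle centroid."""
    nx, ny, nz = f.shape
    x = (np.arange(nx) + 0.5) * h
    y = (np.arange(ny) + 0.5) * h
    z = (np.arange(nz) + 0.5) * h
    X, Y, Z = np.meshgrid(x, y, z, indexing="ij")
    r = np.stack([X - centroid[0], Y - centroid[1], Z - centroid[2]])

    m = rho_p * f * h**3        # solid mass per cell (FS3D additionally accounts
    M = m.sum()                 # for the fluid mass in the momentum control volumes)

    # Eq. (12): translatory velocity from the linear momentum of the particle cells
    U = np.array([(m * u[i]).sum() for i in range(3)]) / M

    # Eq. (13): angular velocity from J*omega = sum m (r x u), J about the centroid
    L = np.sum(m * np.cross(r, u, axis=0), axis=(1, 2, 3))
    r2 = (r**2).sum(axis=0)
    J = (np.einsum("xyz,ij->ij", m * r2, np.eye(3))
         - np.einsum("xyz,ixyz,jxyz->ij", m, r, r))
    omega = np.linalg.solve(J, L)

    # Eqs. (11), (14), (15): overwrite the particle cells with rigid body motion
    u_rigid = U[:, None, None, None] + np.cross(omega, r, axis=0)
    u_new = u.copy()
    mask = f > 0.5
    for i in range(3):
        u_new[i][mask] = u_rigid[i][mask]
    return u_new
```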

4 Results

4.1 Deformation

In this section, the rigid fluid method is tested for deformation of a rigid particle. The setup is an oblate ellipsoid in a uniform inflow (see Fig. 1). The lateral boundaries are set to be free-slip boundary conditions. The outlet has a continuous (Neumann) condition, where an additional damping zone (grey zones in Fig. 1) avoids backflow [10]. The uniform inflow velocity on the left side, u_∞ = 8 m/s, corresponds to a relative Weber number of We = ρ_g u_∞² D_e / σ = 3.5, with D_e being the equivalent diameter of a sphere and surface tension σ = 0.072 kg/s². The particle density is ρ = 998.2 kg/m³ and the surrounding fluid is air (ρ = 1.2045 kg/m³, μ = 18.2 μPa s). A water droplet of the same shape is computed for reference. The test cases are computed with different resolutions: 32·2ⁿ × 16·2ⁿ × 16·2ⁿ cells for the nth run. The resulting deformation is depicted in Fig. 2. The reference water droplet is computed with a resolution of 128 × 64 × 64 cells. It oscillates between an oblate and a prolate form. The water droplet shape is depicted in Fig. 2 at distinct times (light blue droplets). Maximum relative deviations from the initial surface and the resolution of the ellipsoid semiaxes are given in Table 1. With the rigid fluid method, the deformation is reduced considerably already on the coarsest grid, although the ellipsoid fills only two cells in x-direction. Admittedly, the ellipsoid keeps on changing its shape throughout the considered period. Doubling the cell numbers in each direction (blue curve) leads to a constant value

Fig. 1 Geometry and boundary conditions of an ellipsoid in steady inflow; semiaxes a = 5·10⁻⁴ m, b = c = 3·10⁻³ m; lengths L = 1.6·10⁻² m


Table 1 Resolution and maximum deviation of surface area

Mesh                 res. of 2a   res. of 2b, 2c   |A/A0 − 1|max
128 × 64 × 64, ref        4             24              39%
64 × 32 × 32              2             12             5.2%
128 × 64 × 64             4             24             3.2%
256 × 128 × 128           8             48             0.2%

Fig. 2 Deviation of surface area related to initial surface of an ellipsoid in steady inflow with the rigid fluid method in comparison with a water droplet

of the surface area after some oscillations for times t < 0.2 s. These are due to the strong acceleration of the flow field that is initially at rest. Once the flow has developed, the surface area is virtually constant. On the finest grid, the maximum relative deviation of surface area is only 0.2%. Thus, the presented method is able to conserve the ellipsoid's shape if the particle is resolved with enough cells. The rigid ellipsoids are illustrated at time t = 0.1 s in Fig. 2. The lower resolution obviously leads to a thickening of the ellipsoid and a reduction of the two big semiaxes, i.e. a more spherical form.
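The quoted Weber number can be cross-checked from the geometry in Fig. 1 and the material data of this section (plain Python, not FS3D code):

```python
import math

# Equivalent sphere diameter of the ellipsoid (semiaxes from Fig. 1) and the
# resulting relative Weber number for the inflow conditions of Sect. 4.1.
a, b, c = 5e-4, 3e-3, 3e-3           # semiaxes [m]
rho_g = 1.2045                        # air density [kg/m^3]
u_inf = 8.0                           # inflow velocity [m/s]
sigma = 0.072                         # surface tension [kg/s^2]

V = 4.0 / 3.0 * math.pi * a * b * c   # ellipsoid volume
D_e = (6.0 * V / math.pi) ** (1.0 / 3.0)   # diameter of a sphere of equal volume
We = rho_g * u_inf**2 * D_e / sigma
print(f"D_e = {D_e:.3e} m, We = {We:.2f}")   # -> D_e ~ 3.3e-3 m, We ~ 3.5
```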

4.2 Terminal Velocity of a Free Falling Rigid Sphere

This section treats a rigid sphere falling freely in an unbounded fluid (see Fig. 3) with the following properties: gravitational acceleration g = 9.81 m/s², fluid dynamic viscosity μ = 1·10⁻³ kg/(m s), fluid density ρ_f = 998.2 kg/m³ and rigid particle density ρ_p = 1200 kg/m³. The boundary conditions are a no-flow condition at the left and right boundary and slip walls (symmetry) on the other sides. Clift et al. [4] give a correlation to determine the theoretical terminal velocity. It depends


Fig. 3 Geometry and boundary conditions of a free falling rigid sphere in an unbounded fluid: D = 5·10⁻⁴ m, channel width L = 4·10⁻³ m (L/D = 8)

on the dimensionless diameter N_D, from which the Reynolds number can be determined:

N_D = C_D Re_T² = 4 ρ_f (ρ_p − ρ_f) g d³ / (3 μ²)   (16)

log₁₀ Re = −1.7095 + 1.33438 W − 0.11591 W²,   W = log₁₀ N_D   (17)
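Evaluating (16) and (17) for the material data of this section reproduces the values quoted below (plain Python, independent of FS3D):

```python
import math

# Terminal velocity of the sphere from the Clift et al. correlation, Eqs. (16)-(17).
g = 9.81                       # gravitational acceleration [m/s^2]
mu = 1e-3                      # fluid dynamic viscosity [kg/(m s)]
rho_f, rho_p = 998.2, 1200.0   # fluid / particle density [kg/m^3]
d = 5e-4                       # sphere diameter [m]

N_D = 4.0 * rho_f * (rho_p - rho_f) * g * d**3 / (3.0 * mu**2)
W = math.log10(N_D)
Re_T = 10.0 ** (-1.7095 + 1.33438 * W - 0.11591 * W**2)
U_T = Re_T * mu / (rho_f * d)      # from Re_T = rho_f * U_T * d / mu
print(f"N_D = {N_D:.2f}, Re_T = {Re_T:.2f}, U_T = {U_T:.5f} m/s")
# -> N_D ~ 329.35, Re_T ~ 8.23, U_T ~ 0.01649 m/s
```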

Equation (17) is valid in the range 73 < N_D ≤ 580 and 2.37 < Re ≤ 12.2. For this setup, N_D = 329.35 and Re = 8.23, so the terminal velocity is U_T = 0.01649 m/s. The influence of the walls (due to L/D = 8) on the terminal velocity is considered to be small and allows a high number of cells per diameter.

A stability problem arises: the terminal velocity experiences a sudden jump, as can be seen in Fig. 4. After the peak is reached, the velocity drops back to a steady-state value. Figure 4 shows curves of the terminal velocity on the same grid but with different CFL numbers. The jump in velocity can be smoothed by lowering the time step and vanishes for CFL ≤ 0.1. It was discovered that the divergence of the velocity also experiences a sudden strong increase right before the velocity jump. By reducing the CFL limit, the increase in divergence is damped. Such a limitation would apply at every time step, whereas a restriction is actually only needed when the divergence exceeds a certain limit. The following time step limit is therefore based on the infinity norm of the divergence ||·||∞, because it is the most restrictive norm:


Fig. 4 Terminal velocity of the sphere with stability problem

Table 2 Free falling rigid sphere: discretization parameters, relative error in terminal velocity and deformation

Mesh               f_div        res. of D   rel. error in U_T   |A/A0 − 1|max
080 × 032 × 032    1.0·10⁻²         4            27.1%              0.72%
160 × 064 × 064    5.0·10⁻³         8            15.9%              0.83%
320 × 128 × 128    2.5·10⁻³        16            9.70%              0.52%
640 × 256 × 256    2.5·10⁻³        32            6.30%              0.28%

δt = f_div / ||∇ · u||∞   (18)
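In code, such a limit might be applied as follows (a sketch on a collocated grid; FS3D stores the velocities on staggered faces, and the array handling shown here is illustrative only):

```python
import numpy as np

def divergence_time_step_limit(u, v, w, h, f_div, dt_cfl):
    """Additional time step restriction based on the divergence of the velocity,
    cf. Eq. (18). u, v, w are cell-centered velocity components here (simplified;
    FS3D stores them on staggered cell faces), h is the cell size."""
    div = (np.gradient(u, h, axis=0)
           + np.gradient(v, h, axis=1)
           + np.gradient(w, h, axis=2))
    div_max = np.abs(div).max()              # infinity norm ||div u||_inf
    if div_max == 0.0:
        return dt_cfl                        # no restriction needed
    return min(dt_cfl, f_div / div_max)      # Eq. (18), never exceeding the CFL limit
```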

f_div is a factor decreasing with cell size. Results obtained with the above time step limit are depicted in Fig. 5. f_div as well as the relative errors in terminal velocity and deformation are listed in Table 2. The terminal velocities converge with decreasing cell size, differing by 6.7% from the analytic value on the finest grid. A grid refinement analysis according to Roache [11] yields an extrapolated relative error between the solution on the finest grid, U_T,1, and the asymptotic one on an infinitely fine grid, U_T,ext:

ε_ext = (U_T,ext − U_T,1) / U_T,ext = 2.6%   (19)

The fine-grid convergence index is GCI = 3.4%. The error of U_T,ext compared to the theoretical solution is then ε_th = 4.1%.
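The grid refinement analysis follows the usual Richardson-extrapolation procedure; a generic sketch is given below. The solution values u1 (finest), u2, u3 (coarsest) in the example call are hypothetical placeholders, and the safety factor and exact GCI variant may differ from what the authors actually used:

```python
import math

def grid_convergence(u1, u2, u3, r, Fs=1.25):
    """Richardson extrapolation and fine-grid convergence index (GCI) for three
    solutions u1 (finest), u2, u3 (coarsest) on grids refined by a constant
    ratio r. Generic textbook form after Roache [11]; not the authors' script."""
    p = math.log(abs((u3 - u2) / (u2 - u1))) / math.log(r)   # observed order
    u_ext = u1 + (u1 - u2) / (r**p - 1.0)                    # extrapolated solution
    eps_ext = abs((u_ext - u1) / u_ext)                      # rel. error, cf. Eq. (19)
    gci_fine = Fs * abs((u1 - u2) / u1) / (r**p - 1.0)       # fine-grid GCI
    return p, u_ext, eps_ext, gci_fine

# Example call with hypothetical terminal velocities [m/s] on three grids:
p, u_ext, eps_ext, gci = grid_convergence(u1=0.0176, u2=0.0180, u3=0.0186, r=2.0)
```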


Fig. 5 Terminal velocity of the free falling rigid sphere in an unbounded fluid with time step restriction

5 Performance Analysis

Switching from the NEC SX-8 to the NEC SX-9 at the High Performance Computing Center Stuttgart (HLRS) disclosed severe performance issues of the FS3D code on the new platform. Particularly the speed-up of the parallel versions proved to be intolerable. As reported by Weking et al. [15], computational performance on the NEC SX-8 is comparatively good. Vector lengths of 208 and vector operation ratios above 98% were achieved on a grid with 512 × 256 × 256 cells. The speed-up is 1.75, 2.81 and 3.85 computing on 2, 4 and 8 CPUs, respectively. Admittedly, these results are far from ideal. However, FS3D is a code for DNS of incompressible multiphase flow, and the numerical solution of the incompressible Navier-Stokes equations makes it necessary to solve a Poisson equation for the pressure. This results in a huge set of linear equations using up to 90% of the computation time.

In FS3D, a multigrid solver (cf. P. Wesseling [16]) is used that essentially solves the problem on consecutively coarser grids, leading to faster convergence. The coarsest grid is obtained by successively dividing all three grid dimensions by two until a further division would leave a remainder. The number of divisions plus one is the number of levels n_l. The coarsest grid is solved with a direct solver (dgesv of the LAPACK library). On coarse grids, the vector lengths are quite small. Below a certain coarse grid size, there is no speed-up anymore when splitting up the computation on several CPUs/threads (a thorough examination has actually shown that using multiple threads on these coarse grids slows down the computation). Even worse, the multigrid solver smooths the levels in a W-cycle, which means that the coarser the grid, the more often the smoothing step has to be performed. A W-cycle for a grid with 16³ cells is shown in Fig. 6a.
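The coarsening rule can be written compactly as follows (a sketch, not the FS3D bookkeeping; the min_cells parameter anticipates the coarse-grid limit discussed below and interprets it as a total cell count, which is an assumption):

```python
def multigrid_levels(nx, ny, nz, min_cells=1):
    """Number of multigrid levels: halve all three dimensions as long as every
    dimension is still divisible by two and the coarser grid keeps at least
    `min_cells` cells (cf. the 64-cell limit used on the NEC SX-9)."""
    levels = 1
    while (nx % 2 == 0 and ny % 2 == 0 and nz % 2 == 0
           and (nx // 2) * (ny // 2) * (nz // 2) >= min_cells):
        nx, ny, nz = nx // 2, ny // 2, nz // 2
        levels += 1
    return levels

print(multigrid_levels(512, 256, 256, min_cells=64))   # -> 7 levels
```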


Fig. 6 Multigrid W-cycle of a grid with 16³ cells

The key feature of the newly implemented parallelization is that the coarse grids are smoothed by one CPU only. This is realized with conditional OMP directives. Furthermore, coarsening is restricted to a certain limit such that the direct solver treats a somewhat larger set of linear equations. The limit is probably architecture dependent; on the NEC SX-9 it was found to be 64 cells. This new restriction is depicted in Fig. 6b. Hence, using the multigrid solver inevitably leads to a higher serial fraction of the code than a classic solution method that solves the problem directly on the finest grid. Nevertheless, the multigrid solver leads to considerably faster convergence. The bottom line is that no ideal speed-ups can be expected from FS3D due to the large fraction of serial code in the multigrid solver. However, the finer the resolution (i.e. the more grid cells), the lower the serial fraction and hence the better the speed-up, because the finer grids then play a more important role.

Typical numerical setups computed with FS3D involve single drops. The volume of fluid method (VOF) with piecewise linear interface reconstruction (PLIC) frequently implies loops that only do computations if the cell has a phase interface. Dynamic scheduling of the OMP do loops is used to obtain good load balance in these cases, and work arrays that store the indices of the interface cells are assigned prior to the actual computation.

On each multigrid level, one smoothing operation is done with the Gauss-Seidel method. The structure of the pressure Poisson equation leads to an algorithm that accesses all six direct neighbors of an element of the pressure array p. In order to vectorize this code, the red-black ordering method is used. At each smoothing step, two loops run through the whole array successively. This approach is also described by Soga et al. [14], who discuss the implementation of the red-black ordering method on the NEC SX-9. In contrast to their approach, here a mask is defined that hides the halo cells. This allows the loop to be traversed in 1D with a stride of 2, allowing for longer vector lengths (especially on coarse grids). As mentioned in [14], the compiler directive on adb is used on array p in order to store it in the NEC SX-9 assignable data buffer (ADB) to reduce access time.
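A minimal red-black Gauss-Seidel smoother for a constant-coefficient Poisson problem illustrates the access pattern (NumPy sketch; the FS3D routine is Fortran, solves the variable-coefficient system (7), traverses the masked array in 1D with stride 2, and keeps p in the ADB as described above):

```python
import numpy as np

def red_black_gauss_seidel(p, rhs, h, sweeps=1, omega=1.0):
    """Red-black Gauss-Seidel smoothing for laplace(p) = rhs on a uniform grid
    with one layer of halo cells. Simplified, constant-coefficient stand-in for
    the smoother described above."""
    nx, ny, nz = p.shape
    I, J, K = np.meshgrid(np.arange(nx), np.arange(ny), np.arange(nz), indexing="ij")
    interior = (I > 0) & (I < nx - 1) & (J > 0) & (J < ny - 1) & (K > 0) & (K < nz - 1)
    colors = [interior & ((I + J + K) % 2 == c) for c in (0, 1)]   # red / black masks

    for _ in range(sweeps):
        for mask in colors:      # one color at a time; within a color all updates
            nb = (np.roll(p, 1, 0) + np.roll(p, -1, 0)     # are independent and
                  + np.roll(p, 1, 1) + np.roll(p, -1, 1)   # therefore vectorizable
                  + np.roll(p, 1, 2) + np.roll(p, -1, 2))
            p[mask] = ((1.0 - omega) * p[mask]
                       + omega * (nb[mask] - h * h * rhs[mask]) / 6.0)
    return p
```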


All of the following performance results are obtained with a representative test case of a water droplet in air exposed to an incoming flow with a velocity of 2 m/s. Its diameter is 0.01 m and the computational domain is cubic with a lateral length of 0.05 m. The speed-up S of parallel versions is defined by

S = T₁ / T_N   (20)

where T_i is the computation time of one time step. The index N denotes the number of CPUs/threads used for the computation. Hence, T₁ marks the reference computation time of the parallel code on one CPU. The ideal speed-up is reached if S = N. This leads to the definition of the efficiency E:

E = S / N = T₁ / (N T_N)   (21)

The serial fraction F of the parallel code can be estimated experimentally by using the Karp-Flatt metric, which can be derived from Amdahl's law [2]:

S = 1 / ((1 − P) + P/N)   (22)

F = (1/S − 1/N) / (1 − 1/N)   (23)

In this formulation, the parallel fraction is P = 1 − F. Knowing the estimated serial fraction F, the maximum possible speed-up on an infinite number of CPUs is calculated from (22):

S_max = lim_{N→∞} 1 / ((1 − P) + P/N) = 1/(1 − P) = 1/F   (24)

The first step in improving the performance on the NEC SX-9 was to have a look at the vectorization of the code. Afterwards the code was parallelized with OpenMP. The current FS3D version is v52 including all vectorization and parallelization developments. It shall be compared to the former version v51. First, the average computation time of one time step is compared using the serial codes. The average is taken from 200 time steps and the initialization time is subtracted. Version v51 takes 6.33 s/timestep, while version v52 takes 4.46 s/timestep. Computation time is reduced by a factor of 1.4. Hence, vectorization adapted to the NEC SX-9 architecture already improves computation time considerably. Next, the computation times of the two versions are compared on different numbers of CPUs. Both versions are compiled on the NEC SX-9 with the P auto compiler flag. Figure 7 shows that computation time is considerably reduced with version v52. When using the OpenMP directives, additional reduction of computation time can be achieved.


Fig. 7 Comparison of average computation time per time step of versions v51 auto parallelized, v52 auto parallelized and v52 OpenMP on a grid with 256³ cells

Table 3 Computation time compared to v51 on multiple CPUs

CPUs                      1      2      4      8      16
T_v51,ap / T_v52,ap      2.65   3.38   4.09   4.72   5.05
T_v51,ap / T_v52,omp     2.57   3.67   5.15   6.81   8.24

The factors between the computation times of version v52 (auto parallelized and OpenMP) and version v51 for different numbers of CPUs are presented in Table 3. The factor is approximately 2.5 on one CPU. Using multiple CPUs, it increases up to 5.05 with auto parallelization and 8.24 with the OpenMP version. These results clearly show the improvements in computation time compared to version v51.

Figure 8 depicts the speed-up of all three versions and the ideal speed-up. The axes are scaled logarithmically with base 2. Version v51 has a speed-up of S = 1.53 on 16 CPUs with an estimated serial fraction of F = 63%. Thus, the maximum speed-up on an infinite number of CPUs is S_max = 1.6. Considerable improvement was achieved with version v52: the serial fraction reduces to F = 29.4% with the auto parallel compiler option set and to F = 14.9% when using the OpenMP directives. Therefore, the maximum speed-up of the auto parallel version is S_max = 3.4, with OpenMP directives it is S_max = 6.7. This clearly shows that the OpenMP version gives the best results; it shall now be tested on cases with more grid cells, where even better results are expected, because the fine grids show a very good speed-up and then play a more important role in the total computation time. The results are shown in Fig. 9.
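The quoted serial fractions follow directly from (23) and (24); a quick cross-check with the measured speed-up of v51 (values taken from the text, nothing FS3D-specific):

```python
def karp_flatt(S, N):
    """Experimentally determined serial fraction F, Eq. (23), and the resulting
    maximum speed-up 1/F on infinitely many CPUs, Eq. (24)."""
    F = (1.0 / S - 1.0 / N) / (1.0 - 1.0 / N)
    return F, 1.0 / F

# v51 on 16 CPUs: S = 1.53  ->  F ~ 63 %, S_max ~ 1.6
F, S_max = karp_flatt(S=1.53, N=16)
print(f"F = {100 * F:.1f} %, S_max = {S_max:.1f}")
```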


Fig. 8 Comparison of speed-up of versions v51 auto parallelized, v52 auto parallelized and v52 OpenMP on a grid with 256³ cells

Fig. 9 Comparison of speed-up of version v52 OpenMP on grids with 256³, 512³ and 768³ cells

The serial fraction reduces to F = 10.4% (S_max = 9.6) on the grid with 512³ cells and to F = 8.8% (S_max = 11.4) on the grid with 768³ cells. A computation with 1024³ cells is not possible on one node of the NEC SX-9 due to memory limitations. The average vector operation ratio is above 99% for all computations and the average vector lengths are 174, 211 and 222 on the grids with 256³, 512³ and 768³ cells, respectively. Table 4 shows the concurrent GFLOPS reached during the computation of the different grids. Since the peak performance of the NEC SX-9 is 100 GFLOPS per CPU, there is still some room for optimization: only approximately a seventh of the peak FLOPS is obtained.


Table 4 Concurrent GFLOPS of v52 OpenMP on different grids

CPUs    256³    512³    768³
1       12.0    15.1    16.7
2       20.6    27.1    30.4
4       32.6    45.2    51.8
8       45.8    67.8    79.7
16      56.8    89.0    107.8

Table 5 Comparison of speed-up and efficiency of v52 OpenMP on different grids

             256³               512³               768³
CPUs      S      E [%]       S      E [%]       S      E [%]
1        1.00   100.00      1.00   100.00      1.00   100.00
2        1.73    86.72      1.81    90.62      1.84    91.95
4        2.77    69.24      3.05    76.28      3.16    79.05
8        3.93    49.14      4.64    57.95      4.96    61.96
16       4.92    30.76      6.18    38.63      6.81    42.54

Table 5 lists speed-up and efficiency for each grid and can be taken as guidance for the choice of the number of CPUs in future computations. Keeping in mind that a certain amount of serial fraction is inevitable in FS3D, the results are relatively good.

6 Conclusion

A new rigid fluid method was implemented in FS3D and is able to simulate rigid body motion. The implementation of this method was necessary in order to investigate ice-water particles in clouds in the future. However, deformation cannot be completely inhibited due to the convection of the VOF variable and the necessary pressure correction after having imposed the rigid body velocity. The instability that occurs in accelerated motion, such as in the presented free fall test case, can be avoided with an additional time step restriction.

The performance improvements on the NEC SX-9 were successful. Ideal speed-ups cannot be reached due to the serial fraction of the applied multigrid solver. However, computationally expensive cases on large grids can be computed efficiently.

Acknowledgments. The authors are grateful to the High Performance Computing Center Stuttgart (HLRS) for support and the supply of computational time on the NEC SX-9 platform under Grant No. FS3D/11142. Sincere thanks are given to J. Hertzer and the HLRS team for very helpful advice on the code optimization and for the considerable time they devoted. The authors also gratefully acknowledge financial support of this project from the DFG for the Collaborative Research Centre SFB-TRR 75.


References

1. Bell, J.B., Colella, P., Glaz, H.M.: A second-order projection method for the incompressible Navier-Stokes equations. Journal of Computational Physics 85(2), 257–283 (1989)
2. Bengel, G., Baun, C., Kunze, M., Stucky, K.U.: Masterkurs Parallele und Verteilte Systeme. Vieweg+Teubner, Wiesbaden (2008)
3. Carlson, M.: Rigid, Melting and Flowing Fluid. Ph.D. thesis, College of Computing, Georgia Institute of Technology (2004)
4. Clift, R., Grace, J.R., Weber, M.E.: Bubbles, Drops and Particles. Dover Publications, Inc., Mineola, New York (2005)
5. Harlow, F.H., Welch, J.E.: Numerical calculation of time-dependent viscous incompressible flow of fluid with free surface. Physics of Fluids 8(12), 2182–2189 (1965)
6. Patankar, N.A.: A formulation for fast computations of rigid particulate flows. Center for Turbulence Research, Annual Research Briefs, pp. 185–196 (2001)
7. Patankar, N.A., Singh, P., Joseph, D.D., Glowinski, R., Pan, T.W.: A new formulation of the distributed Lagrange multiplier/fictitious domain method for particulate flows. International Journal of Multiphase Flow 26, 1509–1524 (2000)
8. Rider, W.J., Kothe, D.B.: Reconstructing volume tracking. Journal of Computational Physics 141(2), 112–152 (1998)
9. Rieber, M.: Numerische Modellierung der Dynamik freier Grenzflächen in Zweiphasenströmungen. Ph.D. thesis, University of Stuttgart (2004)
10. Rieber, M., Graf, F., Hase, M., Roth, N., Weigand, B.: Numerical simulation of moving spherical and strongly deformed droplets. Proceedings ILASS-Europe, pp. 1–6 (2000)
11. Roache, P.J.: Perspective – a method for uniform reporting of grid refinement studies. Journal of Fluids Engineering – Transactions of the ASME 116(3), 405–413 (1994)
12. Schlottke, J., Weigand, B.: Direct numerical simulation of evaporating droplets. Journal of Computational Physics 227(10), 5215–5237 (2008)
13. Sharma, N., Patankar, N.A.: A fast computation technique for the direct numerical simulation of rigid particulate flows. Journal of Computational Physics 205, 439–457 (2005)
14. Soga, T., Musa, A., Okabe, K., Komatsu, K., Egawa, R., Takizawa, H., Kobayashi, H., Takahashi, S., Sasaki, D., Nakahashi, K.: Performance of SOR methods on modern vector and scalar processors. Computers & Fluids 45(1), 215–221 (2011). http://www.sciencedirect.com/science/article/B6V26-51TYDY8-4/2/2656643dcf6a469e4321b243c552060d
15. Weking, H., Huber, C., Weigand, B.: Direct numerical simulation of single gaseous bubbles in viscous liquids. HLRS, High Performance Computing in Science & Engineering, pp. 1–13 (2009)
16. Wesseling, P.: An Introduction to Multigrid Methods. John Wiley & Sons (1991)
