Better GPU implementation of lattice Boltzmann method to simulate channel flow

Tanmay Agrawal1,2, Pei-Yao Hong2, Chao-An Lin2*

1 Department of Mechanical Engineering, National Institute of Technology, Hamirpur 177005, INDIA

2 Department of Power Mechanical Engineering, National Tsing Hua University, Hsinchu 30013, TAIWAN

*Corresponding author’s email address: [email protected], Phone no. +886-35742602

Abstract

We present an implementation of the lattice Boltzmann method (LBM) with the Bhatnagar-Gross-Krook (BGK) collision model on a graphics processing unit (GPU) to simulate fluid flow in a channel, commonly termed Poiseuille flow. The parallel computations required by LBM map readily onto programmable GPUs. The computing hardware used in the present work is an NVIDIA GeForce GTX 560, which fits in an ordinary workstation. Compute Unified Device Architecture (CUDA), developed by NVIDIA, is used to program the GTX 560. The performance of the GPU during LBM simulation is the main focus, and the effect on performance of using on-chip versus off-chip memory is illustrated. The results of the LBM simulation are in excellent agreement with the standard velocity profile for channel flow.

Keywords: lattice Boltzmann method, graphics processing unit, Poiseuille flow, compute unified device architecture, fluid simulation.

Nomenclature

GPU – Graphics Processing Unit

1. Introduction

Originating from lattice gas automata, the lattice Boltzmann method (LBM) has proven to be an efficient method for simulating a large variety of fluid flows and a potential alternative to conventional solvers of the Navier-Stokes (NS) equations [1,2,3,4,5]. An introduction to LBM theory and methodology may be found in [6, 7, 8, 9, 10]. Since LBM is explicit and generally needs only nearest-neighbor information, it is well suited to parallel computation [11, 12]. For this reason, there has been extensive research on implementing LBM on graphics processing units (GPUs). A GPU is a massively multi-threaded architecture that can be used for general-purpose computations [13].

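Unlike traditional computational fluid dynamics methods, which solve the nonlinear Navier-Stokes equations directly, LBM simulates fluid flow from the particle viewpoint. By applying macroscopic constraints, such as a low-velocity (low Mach number) expansion, the Navier-Stokes equations can be recovered from the lattice Boltzmann equation. Instead of solving a time-consuming Poisson equation for pressure, LBM obtains the pressure from a simple equation of state, $p = \rho c_s^2$, where $c_s$ is the lattice speed of sound. In the present work, LBM is implemented on a GPU and used to simulate Poiseuille flow, a standard problem whose solution is available in the literature. We also present the performance of the GPU in several test cases, which are discussed below.

2. Mathematical formulation and boundary conditions

The lattice Boltzmann equation on a uniform lattice with the BGK collision model can be expressed as

$$ f_i(\mathbf{x} + \mathbf{e}_i \Delta t,\, t + \Delta t) - f_i(\mathbf{x}, t) = -\frac{1}{\tau}\left[ f_i(\mathbf{x}, t) - f_i^{eq}(\mathbf{x}, t) \right] \tag{1} $$

where $f_i$ is the particle density distribution function along the particle velocity direction $\mathbf{e}_i$, $f_i^{eq}$ is the corresponding equilibrium distribution function, and $\tau$ is the dimensionless relaxation time that controls the rate of approach to equilibrium. Based on the density distribution functions, the macroscopic variables are defined as:

$$ \rho = \sum_i f_i \tag{2} $$

$$ \rho \mathbf{u} = \sum_i f_i \mathbf{e}_i \tag{3} $$

To implement lattice Boltzmann simulations, two essential steps are required, which are defined as follows:

Collision step:

$$ \tilde{f}_i(\mathbf{x}, t) = f_i(\mathbf{x}, t) - \frac{1}{\tau}\left[ f_i(\mathbf{x}, t) - f_i^{eq}(\mathbf{x}, t) \right] $$

where $\tilde{f}_i$ denotes the post-collision distribution function.

Streaming step:

$$ f_i(\mathbf{x} + \mathbf{e}_i \Delta t,\, t + \Delta t) = \tilde{f}_i(\mathbf{x}, t) $$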
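The collision and streaming steps can be sketched on the CPU as follows. This is a minimal illustration in plain Python, not the authors' CUDA code. The direction numbering, the second-order D2Q9 equilibrium, and the periodic wrap in `stream` are assumptions made to keep the sketch short, and all function names are hypothetical.

```python
# D2Q9 velocity set and weights. The weights are the standard D2Q9 values;
# the ordering of directions is an assumption (the source does not list it).
E = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def macroscopic(f):
    """Density and velocity from the nine populations at one node, Eqs. (2)-(3)."""
    rho = sum(f)
    ux = sum(fi * ex for fi, (ex, _) in zip(f, E)) / rho
    uy = sum(fi * ey for fi, (_, ey) in zip(f, E)) / rho
    return rho, ux, uy


def equilibrium(rho, ux, uy):
    """Second-order D2Q9 equilibrium (assumed form, not shown in the source)."""
    usq = ux * ux + uy * uy
    feq = []
    for (ex, ey), w in zip(E, W):
        eu = ex * ux + ey * uy
        feq.append(w * rho * (1 + 3 * eu + 4.5 * eu * eu - 1.5 * usq))
    return feq


def collide(f, tau):
    """BGK collision at one node: relax f toward equilibrium at rate 1/tau."""
    rho, ux, uy = macroscopic(f)
    feq = equilibrium(rho, ux, uy)
    return [fi - (fi - fe) / tau for fi, fe in zip(f, feq)]


def stream(grid):
    """Move each post-collision population to its neighbor along e_i.

    Periodic wrap is used here only for brevity; the paper applies
    wall, inlet, and outlet boundary conditions instead.
    """
    ny, nx = len(grid), len(grid[0])
    new = [[[0.0] * 9 for _ in range(nx)] for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            for i, (ex, ey) in enumerate(E):
                new[(y + ey) % ny][(x + ex) % nx][i] = grid[y][x][i]
    return new


def step(grid, tau):
    """One time step: collide at every node, then stream."""
    collided = [[collide(f, tau) for f in row] for row in grid]
    return stream(collided)
```

Each node's collision depends only on its own nine populations, and streaming touches only nearest neighbors, which is why the method maps naturally onto one GPU thread per lattice node.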

After collision, each distribution function moves to the neighboring node along the direction of its lattice velocity. The boundary conditions for the computational domain depend on the case being simulated. On the boundaries, some distribution functions are unknown after streaming because their upstream nodes lie outside the computational domain, so the boundary treatment is extremely important. Modified bounce-back boundary conditions [14] are applied at the walls, and pressure/velocity boundary conditions are applied at the inlet and outlet. These are described below:

a. Top wall: Velocity boundary conditions: Unknown distribution functions:

b. Bottom wall: Velocity boundary conditions: Unknown distribution functions:

c. Inlet: Velocity boundary conditions: Unknown distribution functions:

d. Outlet: Pressure boundary condition: Unknown distribution functions:
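As an illustration of which distribution functions are unknown at each boundary, the sketch below lists the D2Q9 directions that point from a boundary into the fluid, since those populations would have had to stream in from outside the domain. The direction numbering and the placement of the inlet on the left and the outlet on the right are assumptions; the source's own numbering is not recoverable, so the indices may differ from the paper's.

```python
# D2Q9 velocity set; the ordering of directions is an assumption.
E = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]

# Inward normal of each boundary (pointing from the boundary into the fluid).
INWARD_NORMAL = {
    "top wall": (0, -1),
    "bottom wall": (0, 1),
    "inlet": (1, 0),    # assumed to be the left end of the channel
    "outlet": (-1, 0),  # assumed to be the right end of the channel
}


def unknown_directions(boundary):
    """Indices i whose velocity e_i points into the fluid at this boundary."""
    nx, ny = INWARD_NORMAL[boundary]
    return [i for i, (ex, ey) in enumerate(E) if ex * nx + ey * ny > 0]


for name in INWARD_NORMAL:
    print(name, unknown_directions(name))
```

With this numbering, each straight boundary has three unknown populations, which the boundary condition must supply from the prescribed velocity or pressure.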

At the corners, the unknown distribution functions are obtained by equating the non-equilibrium part of the LBE.

3. Results

1. Velocity distribution

Fig. 1. Velocity contour in the channel

2. Pressure distribution

Fig. 2. Pressure distribution in the channel
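For reference, the standard plane Poiseuille solution that such results are usually compared against can be sketched as below: a parabolic velocity profile across the channel and a linear pressure variation along it. The function names, the wall positions at y = 0 and y = height, and the D2Q9 value c_s² = 1/3 are assumptions for illustration; the source does not give its comparison formula.

```python
def poiseuille_velocity(y, height, u_max):
    """Parabolic profile: zero at the walls (y = 0, y = height), u_max at mid-height."""
    return 4.0 * u_max * y * (height - y) / height ** 2


def linear_pressure(x, length, p_in, p_out):
    """Pressure in fully developed channel flow varies linearly from inlet to outlet."""
    return p_in + (p_out - p_in) * x / length


def pressure_from_density(rho, cs2=1.0 / 3.0):
    """Lattice equation of state p = rho * c_s^2 (c_s^2 = 1/3 assumed for D2Q9)."""
    return cs2 * rho
```

A simulated centerline velocity or pressure field can then be checked node by node against these expressions.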

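3. Performance improvement

The improvement in the performance of the GPU implementation when simulating Poiseuille flow with the D2Q9 LBGK model and utilizing shared memory is shown in Table 1. It also shows how the block size (number of threads in a block) affects the performance. Performance is measured in MLUPS (million lattice site updates per second), which is defined as:

$$ \text{MLUPS} = \frac{N_x \times N_y \times N_{\text{steps}}}{t_{\text{elapsed}} \times 10^{6}} $$

where $N_x \times N_y$ is the number of lattice sites, $N_{\text{steps}}$ is the number of time steps, and $t_{\text{elapsed}}$ is the elapsed wall-clock time in seconds.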
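As a worked example of the MLUPS metric, the sketch below computes it as the number of lattice sites times the number of time steps, divided by the elapsed time in seconds and by one million. The function name and the sample step count and timing are hypothetical, chosen only to show the arithmetic.

```python
def mlups(nx, ny, steps, seconds):
    """Million lattice site updates per second for an nx-by-ny lattice."""
    return nx * ny * steps / (seconds * 1.0e6)


# Hypothetical run: 1024 x 128 lattice, 10000 time steps, 2.0 s elapsed.
print(mlups(1024, 128, 10000, 2.0))
```

Because the metric normalizes by the lattice size, it allows runs on different grids to be compared directly.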

| Model | Grid size | Memory | Block size 16 | 32 | 64 | 128 | 256 | 512 |
|---|---|---|---|---|---|---|---|---|
| D2Q9 | 1024 × 128 | Global | 96.09 | 182.3 | 296 | 336.94 | 329.32 | 287.43 |
| D2Q9 | 2048 × 128 | Global | 92.43 | 171.78 | 280 | 330.57 | 328.09 | 287.75 |
| D2Q9 | 1024 × 128 | Shared | 183.06 | 334.36 | 564.96 | 682.66 | 681.59 | 562.5 |
| D2Q9 | 2048 × 128 | Shared | 180.66 | 336.51 | 547.27 | 718.2 | 695.34 | 599.87 |
| Improvement in GPU performance | | | 1.92x | 1.89x | 1.92x | 2.1x | 2.09x | 2.02x |

Table 1: Performance comparison (in MLUPS) while using shared memory

A similar improvement in performance when using the shared memory of the GPU, for different grid sizes of the computational domain, is shown in Fig. 3.
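The ratios behind Table 1 can be recomputed from the listed MLUPS values, as sketched below. Reading the two numbers after "D2Q9" as the grid size (for example 1024 × 128) is an assumption, as is the observation that the reported improvement row lies close to the mean of the two per-grid ratios; the source does not state how that row was obtained.

```python
BLOCK_SIZES = [16, 32, 64, 128, 256, 512]

# MLUPS values copied from Table 1; keys are the assumed grid sizes.
GLOBAL = {
    (1024, 128): [96.09, 182.3, 296.0, 336.94, 329.32, 287.43],
    (2048, 128): [92.43, 171.78, 280.0, 330.57, 328.09, 287.75],
}
SHARED = {
    (1024, 128): [183.06, 334.36, 564.96, 682.66, 681.59, 562.5],
    (2048, 128): [180.66, 336.51, 547.27, 718.2, 695.34, 599.87],
}
REPORTED = [1.92, 1.89, 1.92, 2.1, 2.09, 2.02]


def speedups(grid):
    """Shared-memory MLUPS divided by global-memory MLUPS, per block size."""
    return [s / g for s, g in zip(SHARED[grid], GLOBAL[grid])]


def mean_speedups():
    """Average of the per-grid speedups for each block size."""
    per_grid = [speedups(grid) for grid in GLOBAL]
    return [sum(col) / len(col) for col in zip(*per_grid)]


def best_block_size(values):
    """Block size with the highest MLUPS in one table row."""
    return BLOCK_SIZES[values.index(max(values))]


for size, mean, rep in zip(BLOCK_SIZES, mean_speedups(), REPORTED):
    print(size, round(mean, 3), rep)
```

In all four rows of the table, the highest throughput occurs at a block size of 128 threads, and shared memory roughly doubles the throughput at every block size.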

Fig. 3. Improvement in performance while using shared memory of a GPU

4. References

[1] S. Succi, “The Lattice Boltzmann Equation for Fluid Dynamics and Beyond”, Oxford University Press, 2001, p. 288.
[2] O. Filippova, D. Hänel, “A novel BGK approach for low Mach number combustion”, J. Comput. Phys. 158, 139, (2000).
[3] R. Mei, W. Shyy, D. Yu, L. S. Luo, “Lattice Boltzmann method for 3-D flows with curved boundary”, J. Comput. Phys. 161, 680, (2000).
[4] Z. Guo, T. S. Zhao, “A lattice Boltzmann model for convective heat transfer in porous media”, Numer. Heat Transfer B 47, 155, (2005).
[5] W. Shi, W. Shyy, R. Mei, “Finite difference based lattice Boltzmann method for inviscid compressible flows”, Numer. Heat Transfer B 40, 1, (2001).
[6] D. V. Patil, K. N. Lakshmisha, B. Rogg, “Lattice Boltzmann simulation of lid-driven flow in deep cavities”, Computers and Fluids 35, 1116, (2006).
[7] S. Chen, G. D. Doolen, “Lattice Boltzmann method for fluid flows”, Annu. Rev. Fluid Mech. 30, 328, (1998).
[8] Y. H. Qian, D. d’Humières, P. Lallemand, “Lattice BGK models for Navier-Stokes equation”, Europhys. Lett. 17, 479, (1992).
[9] X. He, L. S. Luo, “A priori derivation of the lattice Boltzmann equation”, Phys. Rev. E 55, R6333, (1997).
[10] X. He, L. S. Luo, “Theory of the lattice Boltzmann equation: From the Boltzmann equation to the lattice Boltzmann equation”, Phys. Rev. E 56, R6811, (1997).
[11] K. Mattila, J. Hyväluoma, J. Timonen, T. Rossi, “Comparison of implementation of the lattice-Boltzmann method”, Comput. Math. Appl. 55, 1514, (2008).
[12] W. Li, X. Wei, A. Kaufman, “Implementing lattice Boltzmann computation on graphics hardware”, Vis. Comput. 19, 444, (2003).
[13] J. A. Anderson, C. D. Lorenz, A. Travesset, “General purpose molecular dynamics simulations fully implemented on graphics processing units”, J. Comput. Phys. 227, 5342, (2008).
[14] X. He, Q. Zou, L. S. Luo, M. Dembo, “Analytic solutions of simple flows and analysis of nonslip boundary conditions for the lattice Boltzmann BGK model”, J. Stat. Phys. 87, 115, (1997).
