INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING, VOL. 40, 1857–1875 (1997)
FINITE ELEMENT ALGORITHMS FOR DYNAMIC SIMULATIONS OF VISCOELASTIC COMPOSITE SHELL STRUCTURES USING CONJUGATE GRADIENT METHOD ON COARSE GRAINED AND MASSIVELY PARALLEL MACHINES

SUNG YI
The School of Mechanical and Production Engineering, Nanyang Technological University, Nanyang Ave, Singapore 2263, Republic of Singapore

M. FOUAD AHMAD AND HARRY H. HILTON
National Center for Supercomputing Applications and Aeronautical and Astronautical Engineering Department, University of Illinois at Urbana-Champaign, Urbana, IL 61801, U.S.A.
SUMMARY

Recently much attention has been paid to high-performance computing and the development of parallel computational strategies and numerical algorithms for large-scale problems. In the present study, a finite element procedure for the dynamic analyses of anisotropic viscoelastic composite shell structures using degenerated 3-D elements has been studied on vector, coarse grained, and massively parallel machines. CRAY hardware performance monitors such as the Flowtrace and Perftrace tools are used to obtain performance data for subroutine program modules and specified code segments. The performances of the conjugate gradient method, the CRAY sparse matrix solver and the Feable solver are evaluated. SIMD and MIMD parallel implementations of the finite element algorithm for dynamic simulation of viscoelastic composite structures on the CM-5 are also presented. The performance studies have been conducted in order to evaluate the efficiency of the numerical algorithm on this architecture versus vector-processing CRAY systems. Parametric studies on the CM-5 as well as the CRAY system, and benchmarks for various problem sizes, are shown. The second study evaluates how effectively the finite element procedures for viscoelastic composite structures can be solved in the Single Instruction Multiple Data (SIMD) parallel environment; CM-FORTRAN is used and a conjugate gradient method is employed for the solution of the systems. In the third study, we propose to implement the finite element algorithm in a scalable distributed parallel environment using a generic message passing library such as PVM, so that the code is portable to a range of current and future parallel machines. We also introduce a domain decomposition scheme to reduce the communication time. The parallel scalability of the dynamic viscoelastic finite element algorithm in data parallel and scalable distributed parallel environments is also discussed. © 1997 by John Wiley & Sons, Ltd.

KEY WORDS: viscoelasticity; conjugate gradient solver; parallel computing
CCC 0029–5981/97/101857–19$17.50 © 1997 by John Wiley & Sons, Ltd.
Received 22 February 1996; Revised 22 August 1996

INTRODUCTION

The finite element method is a very attractive technique for solving boundary and initial value problems, and it has been providing researchers with powerful, versatile means for solving complex problems in science and engineering. However, evaluations of significantly large-scale problems and/or analyses of rate-dependent systems which are governed by hereditary integrals or by
high-order differential equations require large memory and long computational times, since the solution at a specific time is influenced by all previous time solutions. With the advent of fast microprocessors and high-bandwidth communication technology, it is possible to solve such large and complex problems more effectively, since vector and parallel process computers can provide increased capabilities in both computational speed and memory. High-speed parallel processing machines have demonstrated their potential to be the fastest supercomputers, a trend that may be accelerated in the future. Recently much attention has been paid to high performance computing and the development of parallel computational strategies and numerical algorithms for large-scale problems in science and engineering.1–3 High performance computing leads to accurate and efficient implementation of finite element analyses, and utilization of vector and coarse grained parallel processing machines is expected to improve code performance. In order to optimize large codes, hardware performance analyses are needed to point out sections and/or subroutines of codes where vectorization and parallelization are both useful and feasible, and where programs need to be restructured. Codes can be vectorized and parallelized on coarse grained machines (CRAY Y-MP, CRAY C-90) with full utilization of compiler tools such as fpp and fmp as well as visual vectorization and parallelization tools such as perfview and atexpert.

The two most commonly used programming paradigms in parallel environments are the Single Instruction Multiple Data (SIMD) and Multiple Instruction Multiple Data (MIMD) approaches. In a data parallel computer, all processors can perform the same operation on all data elements at the same time. The Connection Machine 5 (CM-5) provides high performance computing for large-scale problems with fine and coarse grained concurrency in a single architecture.
Also, the CM-5 takes advantage of the latest developments in compiling technologies, RISC microprocessors, operating systems, and networking. It combines the best features of existing parallel architectures, including fine and coarse grained concurrency and MIMD as well as SIMD control in a single integrated architecture. The CM-5 at the National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign, has 512 SPARC processors, vector units, 5 partition managers, 16 Gbyte of memory storage and 136 Gbyte of scaled disk. The peak performance is 64 Gflops on 64-bit floating-point operations. The processing nodes in the CM-5 can accomplish independent tasks or collaborate on a single problem, and each processing node has 32 Mbyte of memory. High parallel scalability can be achieved on the CM-5. Recently Tafti4 reported up to 8 Gflops performance on the 512-node CM-5 for the direct numerical simulation of channel flow with two homogeneous directions and one inhomogeneous direction.

Generally, the solution of sparse systems is the most computationally expensive task in finite element analyses and, therefore, efficient solvers must be used. Liu5 reviewed frontal and multifrontal solvers for finite element analyses, and Irons6 and Hood7 developed frontal methods for symmetric and non-symmetric systems, respectively. Detailed information on available sparse solvers can also be found in Reference 8. However, some solvers tailored to scalar machines may not be efficient on vector and parallel processing machines. Available on CRAY systems are the SPARSE algorithms, used as sparse matrix solvers for solutions of real sparse symmetric and positive-definite systems. The first study involves parametric studies on various CRAY machine architectures and benchmarks for various problem sizes.
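The conjugate gradient method referred to throughout is the standard iterative scheme for symmetric positive-definite systems such as those arising from the finite element discretization. The following is a minimal serial sketch of the classical algorithm (not the paper's CM-FORTRAN or CRAY implementation; the function name and test matrix are illustrative only):

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite matrix A
    using the classical (unpreconditioned) conjugate gradient method."""
    x = np.zeros_like(b)
    r = b - A @ x          # initial residual
    p = r.copy()           # initial search direction
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)    # step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:    # converged on residual norm
            break
        p = r + (rs_new / rs_old) * p  # conjugate new direction
        rs_old = rs_new
    return x

# small SPD test system (illustrative, not from the paper)
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = conjugate_gradient(A, b)
```

In exact arithmetic CG converges in at most n iterations for an n-by-n system; on parallel machines its appeal is that each iteration needs only one sparse matrix-vector product and a few inner products, all of which parallelize well.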
The performance of solvers such as the Feable subroutine, developed for structural analysis purposes at MIT in the early 1970s, and the SPARSE packages developed for the CRAY systems is evaluated. We also propose to implement the dynamic viscoelastic finite element algorithm on the massively parallel machine, the Connection Machine 5 (CM-5). The second study is to evaluate how effectively the finite element procedures for viscoelastic composite structures can be solved in the Single Instruction Multiple Data (SIMD)
parallel environment. CM-FORTRAN9 is used. A conjugate gradient method is employed for the solution of the systems. In the third study, we propose to implement the finite element algorithm in a scalable distributed parallel environment using a generic message passing library such as PVM.10 The code is portable to a range of current and future parallel machines. We also introduce a domain decomposition method to reduce the communication time. This domain decomposition algorithm is implementable on any MIMD parallel system. The performance of the dynamic viscoelastic finite element algorithm in data parallel and scalable distributed parallel environments is discussed in detail.

ANALYSIS

Analysis for viscoelastic solids

The theory of linear thermo-viscoelasticity leads to the following integral constitutive equations

$$\sigma_i(T, M, x, t) = \int_{-\infty}^{t} Q_{ij}\left[T_o,\, M_o,\, \xi_{ij}(x,t) - \xi'_{ij}(x,\tau)\right] \frac{\partial \epsilon_j(x,\tau)}{\partial \tau}\, \mathrm{d}\tau \qquad (1)$$

where
$$\xi_{ij}(x,t) = \int_0^{t} b_{ij}\left[T(x,s),\, M(x,s)\right] \mathrm{d}s \qquad (2)$$

and

$$\xi'_{ij}(x,\tau) = \int_0^{\tau} b_{ij}\left[T(x,s),\, M(x,s)\right] \mathrm{d}s$$
are reduced times which reflect material memory of temperature T and moisture M histories, the subscript o denotes reference conditions, x are principal material co-ordinates, and $Q_{ij}$ are relaxation moduli. The composite shell is assumed to be in a state of plane stress, and the relaxation moduli for an orthotropic composite lamina in the principal material directions are

$$[Q(T,M,t)] = \begin{bmatrix} Q_{11} & Q_{12} & 0 & 0 & 0 \\ Q_{21} & Q_{22} & 0 & 0 & 0 \\ 0 & 0 & Q_{44} & 0 & 0 \\ 0 & 0 & 0 & Q_{55} & 0 \\ 0 & 0 & 0 & 0 & Q_{66} \end{bmatrix} \qquad (3)$$
where $Q_{12} = Q_{21}$. These relaxation moduli are related to the viscoelastic Young's and shear moduli and to Poisson's ratios, and each may be temperature, moisture and time dependent:

$$Q_{11} = \frac{E_{11}}{1 - \nu_{12}\nu_{21}}, \quad Q_{22} = \frac{E_{22}}{1 - \nu_{12}\nu_{21}}, \quad Q_{12} = \frac{\nu_{12} E_{22}}{1 - \nu_{12}\nu_{21}}, \quad Q_{44} = G_{12}, \quad Q_{55} = K^{s}_{23} G_{23}, \quad Q_{66} = K^{s}_{31} G_{31} \qquad (4)$$
where $K^{s}_{23}$ and $K^{s}_{31}$ are shear correction factors. The relaxation moduli $\bar{Q}_{ij}$ with respect to the laminate axes can be obtained from co-ordinate transformations.
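As a numerical check on equations (3)-(4), the plane-stress relaxation moduli matrix can be assembled from the engineering constants at a fixed instant of time. The sketch below assumes Poisson reciprocity ($\nu_{21} = \nu_{12} E_{22}/E_{11}$) and a default shear correction factor of 5/6; the function name and the default are illustrative, not from the paper:

```python
import numpy as np

def plane_stress_moduli(E11, E22, nu12, G12, G23, G31, k23=5.0/6.0, k31=5.0/6.0):
    """Assemble the 5x5 plane-stress moduli matrix of equation (3)
    from engineering constants via equation (4).
    k23, k31 are the shear correction factors K^s_23, K^s_31
    (5/6 is a common assumption, not stated in the paper)."""
    nu21 = nu12 * E22 / E11          # Poisson reciprocity
    d = 1.0 - nu12 * nu21
    Q = np.zeros((5, 5))
    Q[0, 0] = E11 / d                # Q11
    Q[1, 1] = E22 / d                # Q22
    Q[0, 1] = Q[1, 0] = nu12 * E22 / d  # Q12 = Q21
    Q[2, 2] = G12                    # Q44, in-plane shear
    Q[3, 3] = k23 * G23              # Q55, transverse shear
    Q[4, 4] = k31 * G31              # Q66, transverse shear
    return Q

# representative graphite/epoxy-like constants (illustrative values)
Q = plane_stress_moduli(E11=140e9, E22=10e9, nu12=0.3,
                        G12=5e9, G23=4e9, G31=5e9)
```

For a viscoelastic material each engineering constant would itself be a function of reduced time per equations (1)-(2), so this assembly is repeated at every time step of the hereditary integral evaluation.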
Finite element formulation

Finite element procedures for the dynamic analyses of anisotropic viscoelastic composite shell structures can be formulated by using degenerated 3-D elements. Displacements in each shell element are expressed in terms of nodal degrees of freedom:

$$\begin{Bmatrix} u(x,t) \\ v(x,t) \\ w(x,t) \end{Bmatrix} = \sum_{i=1}^{n} N_i \begin{Bmatrix} u_i(t) \\ v_i(t) \\ w_i(t) \end{Bmatrix} + \sum_{i=1}^{n} N_i \frac{\zeta}{2}\, h_i \left[-\hat{v}_{2i},\, \hat{v}_{1i}\right] \begin{Bmatrix} \theta_{i1}(t) \\ \theta_{i2}(t) \end{Bmatrix} \qquad (5)$$

where n is the number of nodes per element; x are the laminate co-ordinates; $h_i$ is the laminar thickness at the ith node; $\zeta$ is the local curvilinear co-ordinate through the laminar thickness direction; $u_i, v_i, w_i$ are nodal displacements; $\theta_{i1}, \theta_{i2}$ are rotations; $N_i$ are shape functions; and $\hat{v}_{1i}$ and $\hat{v}_{2i}$ are unit vectors which are tangent to the midsurface and define the directions of rotations $\theta_{i1}$ and $\theta_{i2}$. By differentiating equations (5) with respect to the laminate co-ordinates, strains can then be obtained in terms of nodal displacements {q(t)} as

$$\{\epsilon(x,t)\} = [B]\{q(t)\} \qquad (6)$$
where [B] is the element strain-displacement matrix and {q(t)} is given by

$$\{q(t)\} = \lfloor u_1, v_1, w_1, \theta_{11}, \theta_{12}, \ldots, u_i, v_i, w_i, \theta_{i1}, \theta_{i2}, \ldots \rfloor^{\mathrm{T}} \qquad (7)$$
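The interpolation of equation (5) can be sketched directly: the displacement at a point is the shape-function-weighted sum of translational nodal displacements plus a through-thickness contribution from the two rotations. The sketch below assumes the 5-DOF-per-node ordering of equation (7); all names are illustrative:

```python
import numpy as np

def shell_displacement(N, h, zeta, dofs, v1hat, v2hat):
    """Evaluate (u, v, w) at one point of a degenerated shell element
    per equation (5).
    N      : (n,) shape function values at the point
    h      : (n,) laminar thickness at each node
    zeta   : through-thickness curvilinear co-ordinate in [-1, 1]
    dofs   : (n, 5) nodal DOFs [u_i, v_i, w_i, theta_i1, theta_i2]
    v1hat  : (n, 3) unit vectors defining rotation theta_i1
    v2hat  : (n, 3) unit vectors defining rotation theta_i2"""
    u = np.zeros(3)
    for i in range(len(N)):
        # translational part: N_i * {u_i, v_i, w_i}
        u += N[i] * dofs[i, :3]
        # rotational part: N_i * (zeta/2) * h_i * [-v2hat, v1hat] {theta_i1, theta_i2}
        u += N[i] * 0.5 * zeta * h[i] * (-v2hat[i] * dofs[i, 3]
                                         + v1hat[i] * dofs[i, 4])
    return u
```

At the midsurface (zeta = 0) the rotational term vanishes and the interpolation reduces to the usual isoparametric form, which is a quick sanity check on an implementation.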
By using a variational formulation and the above expressions, the following finite element equilibrium equations are obtained for each element:

$$\int^{\tau=t} M^{e}_{mn} \frac{\partial q_n(\tau)}{\partial \tau}\, \mathrm{d}\tau = f^{e}_{m}(t) \qquad (m, n = 1, 2, \ldots,$$