Rapid CFD Simulation of Internal Combustion Engines
H. Jasak, J.Y. Luo, B. Kaludercic and A.D. Gosman
Computational Dynamics Ltd.
H. Echtle, Z. Liang and F. Wirbeleit
Daimler-Benz AG
M. Wierse
SGI/Cray Research
S. Rips and A. Werner
University of Stuttgart
G. Fernstrom and A. Karlsson
AB Volvo Technological Development
ABSTRACT
Multi-dimensional modelling of the flow and combustion promises to become a useful optimisation tool for IC engine design. Currently, the total simulation time for an engine cycle is measured in weeks to months, thus preventing the routine use of CFD in the design process. Here, we describe three tools aimed at reducing the simulation time to less than a week. The rapid template-based mesher produces the computational mesh within 1-2 days. The parallel flow solver STAR-CD performs the flow simulation on a similar time-scale. The package is completed with COVISEMP, a parallel post-processor which allows real-time interaction with the data.

Corresponding author: Computational Dynamics Ltd, Olympic House, 317 Latimer Road, London W10 6RA, England, Tel: (+44) 181 969 9639, Fax: (+44) 181 968 8606

INTRODUCTION
Currently, the design of Internal Combustion (IC) engines relies mainly on experimental methods, where engine prototypes are constructed and tested in a range of operating regimes. A substantial amount of accumulated knowledge allows the engineer to meet the stringent demands on performance, emissions and fuel economy. However, as the complexity of the engine and the number of criteria that need to be achieved simultaneously increase, it becomes more and more important to understand and control the details of the flow, spray and the combustion processes. In the search for more detailed information about these processes in IC engines, Computational Fluid Dynamics (CFD) potentially offers an insight that can be reached experimentally only with great difficulty.

In some other areas of engineering design, CFD is already complementing experiments as a standard diagnostic tool in the optimisation process. Typical examples of this kind are the coolant jackets of IC engines, under-hood flows and passenger compartments in the automotive industry, air-conditioning systems for buildings, cooling of electronic components etc. The aerospace industry has replaced a lot of its prototype testing with CFD simulations.

In the simulation of the flow and combustion in IC engines, the progress seems to be slower. Here, CFD is used primarily as a research tool, for a number of reasons. The first is the physical complexity of the flow models. The system of partial differential equations to be solved consists of the Reynolds-averaged compressible Navier-Stokes equations with the constitutive relations for an ideal gas mixture, the energy equation, a turbulence Reynolds stress-flux model, spray model, ignition, chemical reaction and combustion model. Additionally, the flow is unsteady and some boundaries (piston and valves) move in time.

Progress in combustion (and turbulence) theory [1] has provided a number of models that claim good accuracy and predictive properties, thus setting the base for further use of CFD in engine design. However, the high cost of the computation and the complexity of the setup still hinder the routine use of CFD in the design process. For example, a typical turn-around time for the simulation of an IC engine can be of the order of many weeks. This is caused by several factors described below.
From the point of view of mesh generation, an IC engine is a complex geometry. The complexity is further increased by the moving boundaries: after the mesh has been built, it is necessary to prescribe its movement in time to accommodate the movement of the boundaries. Moreover, for good accuracy it is also advisable to modify the topology of the mesh to preserve sensible aspect ratios of the cells, which change with the boundary movement. For example, a mesh consisting of cubes at BDC may be unacceptably deformed at TDC. Therefore, the number of cell layers between the cylinder head and the piston needs to be changed as the piston moves. A similar action is also necessary around the valves as they open and close. In addition, the mesh quality should be preserved throughout the whole calculation. This sometimes requires additional connectivity changes: an acceptable mesh with the valves closed may become highly distorted when the valves open. Additional complexity is introduced if the volumes swept by the valves overlap with each other and with the volume swept by the piston, a situation regularly encountered in pent-roof designs.

A sensible mesh for combustion calculations in IC engines typically consists of around 60 000 cells at TDC, rising to about three times as many at BDC. Additional mesh is needed to model the inlet and exhaust manifolds, giving a typical total mesh size of around 300 000 cells. A typical integration time-step is around 0.1 of a degree Crank-Angle (CA). Allowing for some run-in time, a single engine cycle (720° CA) would thus easily require 10 000 time-steps. The simulation time for such a setup has been reported to be around 2 months on a fast single-processor workstation [2]. Currently, Daimler-Benz routinely runs simulations on meshes with similar resolution over 500° CA, with 4 000 time-steps in 3 weeks.

The amount of data available from the calculations described above is extremely large. Potentially, it can provide the spatial distribution of any of the variables (ρ, U, p, T, k, ε, species concentration and combustion variables) for all time-steps, but usually only selected data is of interest. Even the reduced set of data may be so large that it becomes very difficult to analyse. Ideally, one would like to visualise the iso-surfaces of different variables, plot cutting lines or planes mapped with the scalar or vector data and follow the tracks of seeded particles. Also, to aid understanding of the details of the flow, it would be useful to be able to generate animations of any of the above properties, with the possibility of "live" interaction with the data. All of the above requires a very powerful computer, capable of storing vast amounts of data [3].

Having in mind the complexity of the mesh generation and mesh movement requirements, long simulation times and the complexity of the post-processing, a single simulation cycle from the CAD description to the data analysis using conventional methodology could take up to two months. Moreover, even relatively small changes to the setup are often neither easy nor quick, as they might require a review of the whole of the mesh movement and/or the mesh structure. In a design cycle, an engineer would typically like to examine tens of different load regimes and engine speeds, for several configurations of, say, inlet manifolds, valve timings, valve shapes etc. This is clearly not feasible within the current turn-around time. If the results of a parametric study are to be used to improve the design, they need to be available in the time-frame of several days and even less for simple modifications, like a change in the valve lift profile or inlet port shape. This paper describes the main elements of a project that had as its major objective the reduction of the simulation time to meet these requirements.
RAPID SIMULATION STRATEGY
The HPS-ICE project was set up with the goal of reducing the turn-around time for the complete simulation cycle of an IC engine to 1 week or less. In order to achieve this, improvements were made to every step of the simulation cycle by adopting the following strategy:

In the first instance, it is necessary to considerably speed up the mesh generation and the setup of mesh movement, while preserving good quality. For this purpose, a template-based mesh generator was developed, based on previous experience of "good" mesh generation practices and capable of producing parametrised mesh movement routines. The aim is to enable creation of a mesh and the associated movement routines within 1-2 days, with the possibility of easy and quick changes of key engine parameters (e.g. valve shape, valve timings and the lift curve, shape of piston head, details of the combustion dome geometry, position of the fuel injector etc.).

The necessary reduction of the simulation time is achieved with the use of parallel computers. Scalable speedups for CFD codes on massively parallel computers have been reported in the past [3,4] but not for cases involving moving meshes and topology changes. If a similar speedup could be achieved for IC engine simulations, the turn-around time could be reduced to a few days or less, in line with the target time-frame.

The parallelisation of the solver opens an interesting option for rapid post-processing. Most of the necessary data analysis can be done on the distributed data sets, using the computational power of the parallel machine as a back-end for the graphics renderer. Once the data is processed, the actual visualisation can be done on a workstation with good graphics performance. This would allow the simultaneous use of the large memory and computing power of the parallel platform, bringing closer the goal of "live" interaction with the data.

In the rest of this paper, the successful development of the components outlined above will be described in more detail. Example results of the application of the methodology for the cold flow and combustion simulation to a representative 4-valve pent-roof engine will also be presented, together with some parallel speedup results and examples of post-processing. Finally, a summary and indication for future work will be given.

TEMPLATE-BASED MESH GENERATOR
A reliable automatic mesh generator for IC engine applications is still a subject of research rather than industrial reality. Although the progress in recent years has been significant, the "ultimate" mesh generator which produces good quality (preferably hexahedral) meshes for arbitrarily shaped domains with moving boundaries is still not available. Currently, the best meshes for IC engine geometries are created by skilled engineers with extensive experience in this area. The approach taken here has been to automate the mesh generation process, based on the know-how of experienced specialists, and make it available to the average user.

Pro-ICE, the engine-specific mesh generator developed within the HPS-ICE project, is based on the principle of template-mapping. The process is divided into the following stages:

The first stage is the analysis of the geometry in question, to establish the salient features. These include the type (Diesel or petrol), the number of valves, the type of the combustion dome (e.g. flat, basic pent-roof, shallow etc.), the geometry of the manifolds (separate, Siamese) etc. Other basic dimensions and characteristics of the engine are also specified: the piston diameter, the connecting rod length, the stroke, the engine speed and the maximum valve lift.

The second stage is the construction of the template, which is a mesh having the necessary topological features but not at this stage conforming to the actual geometry. The appropriate parts of this are extracted from a "template" library and the mesh for each part is described with a number of parameters which define the local mesh resolution. For example, the parameters defining a valve template are: no. of circumferential cells, no. of cell layers for the maximum lift, no. of ring radial cells (bottom, inner and outer), length of the valve stem cylinder etc. When all the parameters are defined, the template mesh is created automatically. It is also possible to subsequently modify the template manually in order to simplify the mapping process.

The next stage is to input the actual geometry of the engine. Usually, this information is available in a CAD format and is passed to the mesh generator as a shell surface mesh for the combustion dome and the arms, along with a set of profiles describing the shape of the valves. The description of the geometry is completed with the definition of the valve lift curves.

Finally, the template is mapped onto the real geometry. This is done in several stages, the first of which is the identification of "feature edges" of the template and the geometry. Once the edges and the corresponding surfaces are matched, the rest of the mesh is created by projection and mesh smoothing algorithms. The final mesh, together with the associated movement instructions automatically produced by the mesh generator, is written in a format suitable for the flow solver.
The remaining preparations for the simulation are done in the pre-processor for the flow solver. This includes the selection of the turbulence and combustion models, the type of fuel, the ignition timing, the time-step size, total simulation time etc. The code is now ready to run.

THE TEMPLATE LIBRARY
As already noted, within the mesh generator there exists a library of template geometries for different engine configurations. This is the most technically demanding part of the mesh generation: each template is based on previous experience of the "best" mesh structure for the geometry in question. The templates are also interchangeable, making it easy to, for example, change the port type on a Diesel engine. Each of the templates also comes with the associated parametrised mesh movement routines. Once the mesh is completed, the final mesh movement is automatically assembled from the segments from all parts of the mesh.

Two example templates are shown in Fig. 1. The first, Fig. 1(a), shows a 2-valve Diesel engine with a flat combustion deck and a steep helical intake port. The exhaust port is not included in this template. Fig. 1(b) shows the template for a 4-valve petrol engine with Siamese arms on both the intake and the exhaust side. The template library covers various configurations of port and intake arm geometries and is continuously being enlarged. It is also possible to manually modify the template mesh before the mapping process. Piston bowls, spark plug geometry and other similar features can be treated in this way, with the aid of the "Arbitrary Interface" mesh matching feature [5] of the flow solver.
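The valve template parameters named in the description of the second stage can be pictured as a small record. The following Python sketch is illustrative only: the class, its field names and the linear rule for the number of cell layers at partial lift are assumptions made for this example and are not taken from Pro-ICE.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ValveTemplate:
    """Mesh-resolution parameters for one valve (hypothetical field names)."""
    n_circumferential: int   # cells around the valve circumference
    n_layers_max_lift: int   # cell layers under the valve at maximum lift
    n_ring_bottom: int       # radial cells in the bottom ring
    n_ring_inner: int        # radial cells in the inner ring
    n_ring_outer: int        # radial cells in the outer ring
    stem_length_mm: float    # length of the valve stem cylinder
    max_lift_mm: float       # maximum valve lift

    def __post_init__(self):
        counts = (self.n_circumferential, self.n_layers_max_lift,
                  self.n_ring_bottom, self.n_ring_inner, self.n_ring_outer)
        if min(counts) < 1:
            raise ValueError("all cell counts must be at least 1")
        if self.stem_length_mm <= 0 or self.max_lift_mm <= 0:
            raise ValueError("lengths must be positive")

    def layers_at_lift(self, lift_mm: float) -> int:
        """Assumed rule: layers grow linearly with lift; none when closed."""
        if lift_mm <= 0:
            return 0
        fraction = min(lift_mm, self.max_lift_mm) / self.max_lift_mm
        return max(1, math.ceil(fraction * self.n_layers_max_lift))


intake = ValveTemplate(40, 10, 4, 6, 8, stem_length_mm=50.0, max_lift_mm=8.0)
# A quick resolution change creates a new record; nothing else is touched.
finer_intake = replace(intake, n_circumferential=48)
```

Keeping the record immutable and deriving variants with `replace` mirrors the idea that a template is reusable and only its parameters change from one engine configuration to the next.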
Figure 1: Template mesh. (a) 2-valve Diesel engine. (b) 4-valve petrol engine.
Figure 2: 4-valve Diesel engine template without the manifolds.
Figure 3: Complete mesh.

If an appropriate template is not available, the template mesher can be used to build only the moving parts of the mesh, which is then combined with a mesh for the manifolds built in some other way. Such an example is shown in Fig. 2. The final (mapped) mesh with the separate mesh for the manifolds is shown in Fig. 3. The two meshes are again interfaced using the "Arbitrary Interface" feature.

THE MAPPING PROCESS
The process of mapping the features of the template to the real geometry is done in three stages. Initially, the edges of the template are paired with the corresponding edges on the shell surface. Once the pairing is established, the edge vertices of the template are mapped to the correct position on the shell surface. The surfaces of the template are then projected onto the new geometry. The remaining work consists of the transfinite mapping of the internal mesh and a number of smoothing steps, both on the surface and the interior of the model. The feature edges of the geometry are preserved during the smoothing process.
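The transfinite mapping step mentioned in the mapping process can be illustrated in two dimensions. The sketch below is the generic textbook transfinite (Coons patch) interpolation of a structured patch from its four boundary curves; it is not the Pro-ICE implementation, and the function and argument names are chosen for this example only.

```python
def transfinite_2d(bottom, top, left, right):
    """Fill a structured patch from its four boundary point lists.

    bottom, top: n+1 points (x, y) running in the xi direction.
    left, right: m+1 points (x, y) running in the eta direction.
    The four corner points must coincide between adjacent curves.
    Returns grid[j][i] with j along eta and i along xi.
    """
    n = len(bottom) - 1
    m = len(left) - 1
    if len(top) != n + 1 or len(right) != m + 1:
        raise ValueError("opposite boundaries need equal point counts")
    grid = []
    for j in range(m + 1):
        eta = j / m
        row = []
        for i in range(n + 1):
            xi = i / n
            point = []
            for c in range(2):  # x and y components
                # Blend of the two pairs of opposite edges ...
                edges = ((1 - eta) * bottom[i][c] + eta * top[i][c]
                         + (1 - xi) * left[j][c] + xi * right[j][c])
                # ... minus the bilinear corner term counted twice above.
                corners = ((1 - xi) * (1 - eta) * bottom[0][c]
                           + xi * (1 - eta) * bottom[n][c]
                           + (1 - xi) * eta * top[0][c]
                           + xi * eta * top[n][c])
                point.append(edges - corners)
            row.append(tuple(point))
        grid.append(row)
    return grid


# Unit square with straight edges: the interior comes out uniform.
bottom = [(i / 2, 0.0) for i in range(3)]
top = [(i / 2, 1.0) for i in range(3)]
left = [(0.0, j / 2) for j in range(3)]
right = [(1.0, j / 2) for j in range(3)]
square = transfinite_2d(bottom, top, left, right)
```

Because the boundary points are reproduced by construction, edges that have already been mapped onto the geometry stay where they are while the interior is filled in, which is the property the mapping relies on before smoothing.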
Fig. 4 shows the intake port for the template and the final mesh for the engine simulation example mentioned earlier, which is the Daimler-Benz M-111 design. The template was manually modified before the mapping to accommodate the fuel injector cavity above the intake manifold, while the regularity and spacing of the template is preserved in the final mesh.

The mesh movement information is provided to the flow solver "on-the-fly". When the actual mesh topology changes, the mapping and smoothing will be repeated to obtain the new vertex positions. If no topology change occurs, the vertex positions are calculated using "smart interpolation", which guarantees that the edges and surfaces of the geometry will be preserved. Fig. 5 shows cross-sections through the mesh for two crank angles. In comparison with hand-built meshes, the template-based mesh is of higher quality, having lower average warpage angle and non-orthogonality. Moreover, the quality of the mesh is preserved during the mesh motion, which is not always the case for the hand-built meshes.
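To see why the mesh topology has to change with crank angle, it helps to look at how far the piston travels. The sketch below uses the standard slider-crank relation for the piston position; the rule for choosing the number of cell layers (gap divided by a target cell height) and all parameter names are assumptions for illustration, not the rule used by the mesh generator.

```python
import math


def piston_distance_from_tdc(ca_deg, stroke, conrod):
    """Distance of the piston below its TDC position (slider-crank).

    ca_deg is the crank angle in degrees, with TDC at multiples of 360
    and BDC halfway between, so 360 is TDC and 540 is BDC.
    """
    r = stroke / 2.0
    theta = math.radians(ca_deg)
    return (r * (1.0 - math.cos(theta)) + conrod
            - math.sqrt(conrod ** 2 - (r * math.sin(theta)) ** 2))


def n_cell_layers(ca_deg, stroke, conrod, clearance, target_height):
    """Assumed rule: keep the layer height close to target_height."""
    gap = clearance + piston_distance_from_tdc(ca_deg, stroke, conrod)
    return max(1, round(gap / target_height))


# Illustrative dimensions in metres (not the M-111 data).
STROKE, CONROD, CLEARANCE, CELL = 0.08, 0.14, 0.001, 0.001
layers_390 = n_cell_layers(390.0, STROKE, CONROD, CLEARANCE, CELL)
layers_540 = n_cell_layers(540.0, STROKE, CONROD, CLEARANCE, CELL)
```

With these made-up dimensions the layer count grows by more than an order of magnitude between TDC and BDC, which is why layers are added and removed rather than stretched.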
Figure 4: Mesh template and final mapped mesh for the Daimler-Benz M-111 geometry. (a) Template mesh. (b) Mapped mesh.
Figure 5: Mesh movement for the Daimler-Benz M-111 geometry (C.A. = 390.0 and C.A. = 540.0).

SIMPLE MODIFICATIONS
The template-based mesh generator described above allows quick modifications of the original mesh, both in terms of mesh movement and geometrical detail. For example, a change in the valve lift curve or in valve timing can now be done simply on a "point-and-click" basis, as the mesh movement routines are modified automatically.

Geometrical modifications can also be rapidly accommodated. Typically, the mesh structure and most of the edge mapping stays the same and only minimal user-interaction is needed.
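A change of valve timing amounts to changing a few numbers when the lift curve is parametrised. The following sketch assumes a smooth sine-squared lift profile purely for illustration; real lift curves come from the cam data supplied with the geometry, and the function name and arguments are hypothetical. It also assumes that the valve opens and closes within one 0-720 degree window without wrapping.

```python
import math


def valve_lift(ca_deg, open_ca, close_ca, max_lift):
    """Assumed lift profile: zero when closed, sin^2 bump while open.

    Angles are in degrees CA within one 720 degree cycle, open_ca < close_ca.
    """
    ca = ca_deg % 720.0
    if not (open_ca < ca < close_ca):
        return 0.0
    phase = (ca - open_ca) / (close_ca - open_ca)
    return max_lift * math.sin(math.pi * phase) ** 2


# Original timing and a variant with the whole event retarded by 10 degrees.
lift_original = valve_lift(470.0, 350.0, 590.0, 8.0)
lift_retarded = valve_lift(470.0, 360.0, 600.0, 8.0)
```

Since the mesh movement is driven by the lift value at each crank angle, retarding the event only changes the arguments passed here; the mesh structure itself is unaffected.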
PROBLEM SETUP AND THE SOLUTION ALGORITHM
The simulation of the fully premixed combusting flow in the M-111 engine is done using the standard k-ε model with wall functions to account for turbulence, with the combustion modelled by the premixed version of the 2-equation Weller model [6-9]. Additionally, two "passive scalar" transport equations are solved to track the fresh charge and the residual gas, bringing the total number of equations solved to 11. The engine operates at 1500 rpm and at part load, with the intake manifold pressure of 0.45 bar. The fuel-air mixture is stoichiometric with no residual gas recirculation and ignition occurs at 10° CA before TDC. The calculation starts 30° CA before TDC and lasts for 750° CA, with the time-step size corresponding to 0.1° CA.

PARALLEL FLOW SOLVER
The flow solver used in this study is the parallel version of STAR-CD. The equations are discretised using the finite volume method and a segregated solution algorithm on a moving grid with topology changes [5]. The parallelisation is done through domain decomposition, where a portion of the mesh is assigned to each processor. The exchange of information on the inter-processor boundaries is performed in the message-passing paradigm, suitable both for shared and distributed memory computers. The mesh decomposer and other parallel setup tools are a part of the pre-processing package for the flow solver. The parallel performance of the code on static meshes has been reported before, with linear scaling behaviour observed on up to 32 CPUs and beyond [4,10].

In an ideal parallel code the work should be equally divided between the processors. For the FV solvers, this is usually achieved by evenly distributing the cells and at the same time trying to minimise the communication between the processors. If the mesh topology does not change in time, this is done only once, at the beginning of the calculation. In engine simulations, due to the cell layer addition/removal, an initially well-balanced decomposition may deteriorate and re-balancing might become necessary: this could be done dynamically during the run, using one of the tools specially designed for this purpose (e.g. JOSTLE [11]). One should, however, find an appropriate trade-off between the cost of the load imbalance and the additional balancing work.

Currently, we have adopted a strategy where the mesh is initially decomposed in such a way that the cell activation/deactivation occurs evenly over all subdomains, thus preserving to a certain degree the initial load balance. This is achieved by a pie-like decomposition in the moving part of the mesh. While a solution of this kind is cost-effective on a relatively small number of CPUs as it incurs no re-balancing cost, it might not be appropriate for very large meshes and massive parallelism. For that purpose we need to resort to dynamic load balancing and tests of this are already under way.

The balanced domain decomposition for the Eulerian part of the calculation does not imply a balanced load for the Lagrangian particle-tracking procedure used in the spray model. Here, the pie-like decomposition of the combustion deck may improve the situation for multi-nozzle Diesel injectors, but a better practice is still under development.

INDUCTION RESULTS
The flow fields for two different piston positions are presented in Fig. 6. Fig. 7 shows the reduction in the execution time with the number of CPUs on the parallel machine. The results are presented for three parallel computers: a 180 MHz 24-CPU SGI Origin 2000 (CD), a 195 MHz 14-CPU SGI Origin 2000 (SGI) and a 512-CPU Cray T3E-900. The initial test was performed for the first 100 time-steps and then repeated for 1000 time-steps of the intake stroke on one of the machines. In all three cases the elapsed time on 16 CPUs is around 20 s per time-step. For the 7 200 time-steps needed to cover 720° CA at 0.1° CA per step, this would equate to 144 000 s, or 40 hours, in line with the target turn-around time. In terms of parallel efficiency the data in Fig. 7(b) corresponds to a speedup of 15.4 on 16 CPUs, only marginally worse than typical results for static mesh applications [10].

COMBUSTION CALCULATIONS
The combustion calculations for the M-111 case on the parallel platforms are incomplete at the time of writing; the full parallel performance data have not yet been assembled. However, the need for this capability can be demonstrated now, by reference to Fig. 8. This shows measured and calculated flame propagation, the latter on two meshes with 150 000 and 300 000 cells, respectively. The results on the finer mesh are clearly in closer agreement with the measurements, indicating that at least this degree of resolution is required for accuracy.
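The timing figures quoted in the induction results (about 20 s per time-step on 16 CPUs, a 0.1° CA time-step, and a speedup of 15.4 on 16 CPUs) can be cross-checked with a few lines of arithmetic. The helper names below are illustrative; the numbers are those given in the text.

```python
def steps_per_cycle(cycle_deg=720.0, step_deg=0.1):
    """Number of time-steps needed to cover one engine cycle."""
    return round(cycle_deg / step_deg)


def cycle_wall_time_hours(sec_per_step, cycle_deg=720.0, step_deg=0.1):
    """Elapsed time for one cycle at a fixed cost per time-step."""
    return steps_per_cycle(cycle_deg, step_deg) * sec_per_step / 3600.0


def parallel_efficiency(speedup, n_cpus):
    """Fraction of the ideal (linear) speedup actually achieved."""
    return speedup / n_cpus


n_steps = steps_per_cycle()                  # 720 deg CA at 0.1 deg per step
hours = cycle_wall_time_hours(20.0)          # 20 s per step on 16 CPUs
efficiency = parallel_efficiency(15.4, 16)   # speedup read from Fig. 7(b)
```

The result is 7 200 steps, 40 hours per cycle and an efficiency of roughly 96%, consistent with the statement that the moving-mesh case scales only marginally worse than static meshes.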
Figure 6: Cold flow results for the Daimler-Benz M-111 geometry: U and k at 390° and 540° CA. (Transient in-cylinder cold-flow analysis, M111 4-valve engine, part load (PMAN = 0.45 bar), section through valve centrelines; velocity magnitude in m/s and turbulent kinetic energy in m²/s².)
Figure 7: Cold flow results: execution time vs. number of CPUs. (a) Initial 100 time-steps (SGI Origin (CD), SGI Origin (SGI), Cray T3E-900; time [s]/10^4). (b) 1000 time-steps of the intake stroke (SGI Origin (CD); time [s]/10^5).
Figure 8: Experimental comparison of the flame propagation.

POST-PROCESSING
Techniques for analysing large amounts of data in a short period of time are the final component of the rapid analysis system. The aim is to provide post-processing techniques appropriate for the data in question: line and plane cuts with the scalar and vector data mapped onto them, iso-surfaces of the selected field, possibly coloured by some other field, and particle traces. Further, the user should be able to interactively modify the parameters for any of the above, spin and zoom in on parts of the geometry and finally perform an animation consisting of at least 100 time-steps for the field in question.

The post-processing issues described above need to be seen in the light of the size of the data sets in question. Sometimes it may be physically impossible to operate on the data on any available computer other than the parallel supercomputer used for the flow simulation, because of the memory requirements. Although parallel computers offer a huge increase in performance, they are designed as batch machines and their graphics performance is often poor. On the other hand, a high-end workstation provides fast (hardware-accelerated) graphics and supports some interesting special post-processing devices, like stereo-vision or virtual reality. The optimal approach to the post-processing requirements would therefore be to use the high computing power (and storage) of a supercomputer in combination with the high graphics performance of a workstation.

COVISEMP [3,12,13], the distributed visualisation software environment, allows direct operation on the data sets created by the parallel flow solver. Fortunately, most of the algorithms used to perform the post-processing operations described above can be done on a cell-by-cell basis with no inter-processor dependence and therefore parallelise naturally (the notable exception is the particle-tracking). Once the visualisation data is assembled, it is passed to the rendering part of the package, which operates on a workstation. Accelerated graphics hardware allows interaction with the renderer in real time. If any of the post-processing parameters (like the iso-surface level) changes, the back-end supercomputer is again brought into action.

Fig. 9 shows the post-processing environment, consisting of the Process Control Panel, which defines the flow of data, several menus, allowing the user to specify the post-processing parameters, and the rendered window. The last-named shows the flame front and the temperature mapped onto a semi-transparent cutting plane.

Figure 9: COVISEMP post-processing environment.

The above approach allows post-processing of the data sets in question with almost live interaction. Although the "chain" of post-processing has been broken in a sensible place, the communication between the back-end supercomputer and the renderer on the workstation is still intensive. The limiting factor in the whole process is the bandwidth of the connection between the two.

Having in mind the limited capacity of this data link between the back-end and the workstation, one way of enhancing the performance of the post-processor is the use of data reduction techniques. Data reduction also results in faster rendering and interaction with the data and reduced memory requirements on the workstation, which is particularly useful in animations. The data reduction approach used here relies on the fact that the computational mesh and, for example, the description of an iso-surface in terms of poly-triangles is frequently unnecessarily fine for the human viewer. The level of detail required depends, among other things, on the colour resolution of the post-processor and the distance from the object (zoom). It is therefore possible to considerably reduce the details of the surface description and compress the vertex-based (colour) data by analysing the neighbourhood of the vertex in question before passing the data to the renderer. An example of the surface data reduction is shown in Figs. 10 and 11. Such tools parallelise naturally and several of them are already available (OpenGL Optimiser from SGI [14], INDEX Project [15]).
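The cell-by-cell character of the post-processing operations can be made concrete with a small sketch. Each subdomain decides independently which of its cells are cut by an iso-surface, so no inter-processor communication is needed until the results are gathered for the renderer. The data layout and function names below are assumptions for illustration and do not reflect the COVISEMP interfaces; the loop over subdomains stands in for work that would run concurrently on separate processors.

```python
def cells_cut_by_iso(cell_vertex_values, iso):
    """Indices of cells whose vertex values straddle the iso level."""
    return [index for index, values in enumerate(cell_vertex_values)
            if min(values) <= iso <= max(values)]


def extract_distributed(subdomains, iso):
    """Apply the cell-local test to every subdomain independently.

    subdomains maps a processor id to a list of cells, each cell being a
    tuple of the field values at its vertices.
    """
    return {proc: cells_cut_by_iso(cells, iso)
            for proc, cells in subdomains.items()}


# Two tiny subdomains with two cells each (made-up temperature values).
subdomains = {
    0: [(0.0, 0.2, 0.1, 0.3), (0.4, 0.6, 0.5, 0.7)],
    1: [(0.7, 0.9, 0.8, 1.0), (0.45, 0.55, 0.5, 0.6)],
}
cut_cells = extract_distributed(subdomains, iso=0.5)
```

Only the cells flagged here would need to be triangulated and sent on, which is also where the data reduction step described above reduces the traffic over the link to the workstation.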
Figure 10: Data reduction: original surface, 17 000 triangles.
Figure 11: Data reduction: reduced surface, 5 000 triangles.

CONCLUSIONS AND FUTURE WORK
In this paper, a set of software tools developed to reduce the simulation time for the flow and combustion calculations in IC engines has been presented. They include a rapid template-based mesh generator, capable of producing a mesh and the associated movement setup in 1-2 days. Rapid modifications of the mesh movement parameters are also catered for.

With the use of parallel computers the simulation time for a complete engine cycle is reduced to as little as two days for the cold flow and estimated at about three days for the combusting flow on a 32-processor machine, with potential for further reduction. In order to preserve the load balance, a domain decomposition which guarantees that the cell activation/deactivation will be distributed over all processors has been used. This is a precursor to the potentially better-behaved dynamic load balancing, which is also being tested. The massively parallel STAR-HPC has been shown to preserve good scalability for moving meshes with topology changes.

The package is completed with COVISEMP, a post-processor capable of operating on distributed data sets and providing user-interaction in real time. This is achieved with the combination of a parallel supercomputer back-end and a fast graphics workstation. The CPU-intensive post-processing operations are performed on the domain decomposition originally used by the flow solver. Once the graphical information is assembled, it is compressed and passed to the workstation for rendering. Such a combination has proven to be highly efficient, simultaneously using the high computing power of a supercomputer and the superior graphics performance of a workstation.

The future work within this project will be aimed at further reducing the turn-around time. The options are numerous:

Currently, the bulk of the effort in mesh generation is associated with the edge mapping process, with the other parts proving to be sufficiently robust and user-friendly to allow the user to fine-tune the mesh to their satisfaction. While a certain part of the mapping will always be manual, some typical feature edges/splines could be created automatically or imported from the CAD data.

Better mesh quality allows us to use higher order discretisation, further improving the accuracy of the predictions. Ultimately, the Automatic Resolution Control tools can be used to dynamically adjust the local mesh resolution and the time-step size based on a-posteriori error estimates already available in the flow solver.

On the post-processing side, two promising strategies for further data compression are the mantissa reduction, as the single-precision data representation is unnecessarily accurate for most of the post-processing operations, and time-sequence compression, where the compression algorithms work on the time sequence rather than the un-ordered space sequence and achieve higher compression, finding more similarity in the data set. Both were developed by RUS within the INDEX Project and will be available as stand-alone tools. In addition, they will be integrated into the post-processing package.
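The idea behind mantissa reduction can be sketched in a few lines. The example assumes IEEE-754 single-precision storage (1 sign bit, 8 exponent bits, 23 mantissa bits) and simply zeroes the least significant mantissa bits, so that the data contains long runs of zero bits that a general-purpose compressor can exploit. It is a generic illustration with a hypothetical function name, not the tool developed within the INDEX Project.

```python
import struct


def reduce_mantissa(value, keep_bits):
    """Zero the low (23 - keep_bits) mantissa bits of a float32 value."""
    if not 0 <= keep_bits <= 23:
        raise ValueError("keep_bits must be between 0 and 23")
    # Reinterpret the single-precision bit pattern as an unsigned integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    drop = 23 - keep_bits
    mask = (0xFFFFFFFF << drop) & 0xFFFFFFFF
    # Sign and exponent are untouched; only trailing mantissa bits are cleared.
    (reduced,) = struct.unpack("<f", struct.pack("<I", bits & mask))
    return reduced


temperature = 3.14159
coarse = reduce_mantissa(temperature, keep_bits=10)
relative_error = abs(coarse - temperature) / temperature
```

Keeping 10 mantissa bits bounds the relative truncation error by about 2^-10, i.e. roughly 0.1%, which is typically below what a colour map can resolve.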
Acknowledgement
The work described in this paper has been performed within the High Performance Simulation of Internal Combustion Engines (HPS-ICE) Project, funded by the European Commission within the ESPRIT programme (Contract Number 20184), whose support is gratefully acknowledged. The partners in the Project are: Computational Dynamics Ltd, Daimler-Benz AG, SGI/Cray Research, the University of Stuttgart and AB Volvo.

REFERENCES
[1] Gosman, A.D.: "CFD modelling of flow and combustion for IC engines", In Wagner, S., Hirschel, E.H., Periaux, J., and Piva, R., editors, Computational Fluid Dynamics, pages 132-143. John Wiley and Sons, September 1994.
[2] Echtle, H., Liang, Z., Willand, J., and Wierse, M.: "Transient simulation of fluid flows in internal combustion engines on parallel computers", In ECCOMAS Conference Proceedings. John Wiley & Sons, 1998.
[3] Werner, A., Echtle, H., and Wierse, M.: "High performance simulation of internal combustion engines", In ACM/IEEE Supercomputing '98 Conference Proceedings, 1998: to be published.
[4] "STAR-CD: Computational Dynamics Web Page": http://www.cd.co.uk.
[5] "STAR-CD Version 3.05: Methodology and User Guide": Computational Dynamics Limited, 1998.
[6] Weller, H.G.: "The Development of a New Flame Area Combustion Model Using Conditional Averaging", Thermo-Fluids Section Report TF 9307, Imperial College of Science, Technology and Medicine, March 1993.
[7] Weller, H.G., Uslu, S., Gosman, A.D., Maly, R.R., Herweg, R., and Heel, B.: "Prediction of Combustion in Homogeneous-Charge Spark-Ignition Engines", In International Symposium COMODIA 94, pages 163-169. The Japan Society of Mechanical Engineers, 1994.
[8] Heel, B., Maly, R.R., Weller, H.G., and Gosman, A.D.: "Validation of SI Combustion Model Over Range of Speed, Load, Equivalence Ratio and Spark Timing", In International Symposium COMODIA 98. The Japan Society of Mechanical Engineers, 1998.
[9] Heel, B.: Dreidimensionale Simulation der Strömung und Verbrennung im Zylinder eines Otto-Forschungsmotors, PhD thesis, Universität Karlsruhe, 1997.
[10] Behling, S.R., Robinson, D., and Bauer, W.: "Recent experience with STAR-HPC on the CRAY T3E", In High Performance Computing in Automotive Design, Engineering and Manufacturing. Cray Research Inc., 1996.
[11] Walshaw, C., Cross, M., and Everett, M.G.: "Dynamic load-balancing for parallel adaptive unstructured meshes", In Heath, M. et al., editors, Parallel Processing for Scientific Computing. SIAM, Philadelphia, 1997.
[12] Wierse, A., Lang, U., and Rühle, R.: "Architectures of distributed visualization systems and their enhancements": Eurographics Workshop on Visualization in Scientific Computing, Abingdon, 1993.
[13] Rantzau, D. and Lang, U.: "A scalable virtual environment for large scale scientific data analysis", In Proceedings of the Euro VR Mini Conference 97, Amsterdam. Elsevier 1998, November 1997.
[14] "SGI OpenGL Optimizer White paper, Mountain View, California": http://www.sgi.com, 1998.
[15] "INDEX (Intelligent Data Reduction) Project": ESPRIT Contract No. 22745, 1997.