COMPUTERS CLUSTER FOR PARALLEL PROCESSING WITH THE MCNP4B CODE: LINUX + PVM + MCNP4B
Hugo Moura Dalle and Francisco J. Muniz
Centro de Desenvolvimento da Tecnologia Nuclear – CDTN/CNEN Caixa Postal 941 – Cidade Universitária – Pampulha 30123-970 – Belo Horizonte/MG - Brazil
ABSTRACT

This paper describes the procedures and results of installing a microcomputer cluster (Linux and PVM) to reduce the processing time of MCNP simulations. The requirements and specifics concerning the hardware, the Linux operating system, the PVM software package and a suitably compiled version of MCNP4B are presented. The system was tested to verify the gain in processing time as a function of the number of computers in the cluster, using the MCNP input developed to simulate the TRIGA IPR - R1 research reactor of CDTN/CNEN. Advantages and disadvantages of the MCNP code generated with several compilers are discussed, and the performance of the executable code generated by each compiler is compared. The results show the viability of the system: the processing time of MCNP simulations can be reduced significantly by using a PVM cluster of computers, and the cost/performance ratio of such a system is often better than that of a sequential machine (workstation or even mainframe).

Keywords: MCNP, computer cluster, PVM.

I. INTRODUCTION

This paper describes the procedures and results of installing a Linux microcomputer cluster that reduces the processing time of MCNP[1] simulations by means of the parallel processing techniques available through the PVM software package[2,3]. The hardware and software specifications are presented in detail. The system was tested to verify the gain in processing time as a function of the number of computers in the cluster, using the MCNP input developed to simulate the TRIGA IPR - R1 research reactor of CDTN/CNEN.

The Monte Carlo method has seen increasing use in reactor physics. Its main disadvantage is that practical application demands high processing power: millions of simulations are necessary to reach good accuracy with low statistical error.
Monte Carlo simulations therefore frequently need weeks or even months to run on standard sequential computers. Parallel processing techniques can be used to reduce the execution time of Monte Carlo applications, and a low-cost approach is the use of computer clusters. A computer cluster consists of several microcomputers linked by a network in such a way that, with each computer solving a part of the problem, the cluster reaches a global solution in a shorter time than a single computer could. A high performance computer costs millions of dollars; one microcomputer costs about one thousand dollars. For many applications it is possible to obtain high computational power, at extremely low cost, by running the programs on a network of PCs. The cluster described here was built from three available Pentium II 266 MHz computers. It was only necessary to upgrade the hard disks of these computers to obtain enough storage space for the programs (operating systems and their tools, compilers, MCNP, cross-section libraries, etc.), since it was decided to keep both operating systems, Windows and Linux, on the computers.
II. INSTALLING LINUX

The operating system chosen for the CDTN cluster was Linux, which is widely used for similar clusters around the world. Linux offers a high level of reliability, security, performance and stability, as well as multi-user and multitasking features. Another important feature of Linux is that it is free: since the main purpose of this cluster is a high performance, low cost system, Linux is the obvious choice. Among the various Linux distributions available, Conectiva Linux[4] was chosen; version 6.0 was the most up to date when the cluster was built. The Conectiva package is based on Red Hat Linux and, being produced by a Brazilian company, has the advantage of complete documentation and interfaces in Portuguese and good technical support. This version has a user-friendly interface that makes installation a simple procedure. In the
paragraphs below, the main installation procedures, and the changes to the standard configuration of Conectiva Linux 6.0 necessary for the correct operation of the cluster, are described.

The CDTN cluster consists of three Pentium II 266 MHz microcomputers (by the time this paper was written, three new Pentium III 900 MHz computers had been added to the cluster, but they were still under evaluation). All the computers had their hard disks divided into two partitions, an approach that allows the Windows operating system to be kept on the computers as well. The hardware is standard for microcomputers, as follows:

♦ midi-tower case with 350 W power supply for Intel Pentium processors;
♦ "off-board" motherboard for Intel Pentium processors;
♦ Intel Pentium processor;
♦ DIMM RAM memory chips;
♦ IDE ATA-100 hard drive, minimum 20 GB;
♦ 3.5" high density floppy drive;
♦ 10/100 Mbps Ethernet card;
♦ 52X internal CD-ROM drive;
♦ AGP 4X SVGA card with 32 MB;
♦ sound card and speakers;
♦ mouse;
♦ enhanced 104-key keyboard;
♦ SVGA color monitor.

The connection device is also a standard Ethernet network. The computers are connected to the CDTN network; no hub or switch was used to connect them in an exclusive network, and they are even located in different offices.

The hard disk of each computer was partitioned using the Windows Fdisk program to create one partition with 50% of the disk capacity. Windows is installed in this partition, and the other 50% of the disk must be left untouched by Fdisk, since the Windows and Linux formats are different; at installation time, Linux will find the unallocated 50% of the hard disk and format it. The installation sequence of the systems is important: Windows must be installed first, because its installation removes the partition manager (GRUB or LILO) from the Master Boot Record (MBR). After Windows is installed, Linux can be installed.
The installation procedure of Conectiva Linux 6.0[4] is simple and user-friendly. After the Linux installation, some changes and settings must be made. As root, run the Linuxconf tool to configure the network environment with the name of the machine, the Internet Protocol (IP) address, network device, kernel module, Domain Name Server (DNS) and Network Information Service (NIS). After that, the telnet, ftp, login and shell protocols must be activated in the file /etc/inetd.conf and the inet module must be started. Finally, two files must be created or modified: in the /etc directory, a file named hosts.equiv containing the names of the cluster machines and the login of the user must be created, and a copy of this file, named .rhosts, must be placed in the user's home directory.
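As a minimal sketch, the hosts.equiv/.rhosts step above could be scripted as follows. The host names (node1..node3) and user name (mcnpuser) are hypothetical examples, not the actual names used on the CDTN cluster:

```shell
# Sketch of the hosts.equiv / .rhosts setup described above.
# setup_rhosts ETC_DIR HOME_DIR: write hosts.equiv and its .rhosts copy.
# On the real cluster this would be run as: setup_rhosts /etc /home/mcnpuser
setup_rhosts() {
  etc_dir="$1"; home_dir="$2"
  # hosts.equiv lists each cluster machine with the trusted user login
  # (node1..node3 and mcnpuser are hypothetical names).
  cat > "$etc_dir/hosts.equiv" <<'EOF'
node1 mcnpuser
node2 mcnpuser
node3 mcnpuser
EOF
  # A copy of the same file, named .rhosts, goes in the user's home directory.
  cp "$etc_dir/hosts.equiv" "$home_dir/.rhosts"
  chmod 600 "$home_dir/.rhosts"
}
```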
III. PVM - PARALLEL VIRTUAL MACHINE

PVM is a software package that allows applications to be developed and run on a cluster of network-linked computers as if they were a single parallel computer. Many paradigms have been used in parallel processing, including shared memory, parallel compilers and message passing; the message passing model is the one adopted by PVM. The key concept of PVM is that it makes many computers look to the user like a single virtual computer, hence the name. PVM was designed to work with heterogeneous networks, built from computers of different manufacturers and/or models (486, Pentium, Pentium II, Pentium III, Sun, HP, etc.), operating systems (Linux, Solaris, etc.) and data formats. This feature gives PVM great versatility, but its performance is not as good as, for instance, that of the Message Passing Interface (MPI) software package, which does not have such sophisticated tools for managing heterogeneity. When working with PVM, keep in mind that the directory structure must be the same on all computers of the cluster, unless the cluster uses the Network File System (NFS).

The complete installation of Conectiva Linux 6.0 automatically installs the PVM software package, version 3.4.3, in the directory /usr/lib/pvm3. Nevertheless, some adjustments are necessary to make it operational. In the user's home directory a directory named pvm3 must be created; inside it, a link to /usr/lib/pvm3/lib must be made, as well as a directory named bin containing a directory named LINUX.
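The per-user directory layout just described can be sketched as a short script (the user's home directory path is an example; the PVM root is /usr/lib/pvm3 on Conectiva Linux 6.0 as stated above):

```shell
# Sketch of the per-user PVM directory layout described above.
# setup_pvm_dirs HOME_DIR PVM_ROOT: create ~/pvm3 with a link to the PVM
# libraries, plus the bin/LINUX directory where executables will be placed.
# On the real cluster: setup_pvm_dirs /home/mcnpuser /usr/lib/pvm3
setup_pvm_dirs() {
  home_dir="$1"; pvm_root="$2"
  # pvm3/bin/LINUX is where per-architecture executables (e.g. mcnp.pvm) go.
  mkdir -p "$home_dir/pvm3/bin/LINUX"
  # Link the system PVM libraries into the user's pvm3 directory.
  ln -sf "$pvm_root/lib" "$home_dir/pvm3/lib"
}
```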
IV. MCNP - MONTE CARLO N-PARTICLE TRANSPORT CODE

MCNP is a widely used general purpose code, based on the Monte Carlo method, for the stochastic simulation of the transport of neutrons, photons and electrons. Compiling MCNP version 4B for UNIX systems requires standard Fortran 77 and ANSI C compilers; on PCs running Linux, MCNP4B can be compiled with fort77 or g77 together with gcc[5]. The Fortran compiler installed by Conectiva Linux 6.0 is g77 version 0.5.25. This version, as well as versions 0.5.23 and 0.5.24, has a bug that interrupts the execution of MCNP when the "LIKE... BUT..." option is used in the MCNP input file. The problem can be avoided by using version 0.5.22 or an earlier version but, as will be made clear in the next section of this paper, the best choice is the newest version, g77-0.5.26 (g77 is GNU software, so it is freely available). Conectiva Linux 6.0 does not include the fsplit tool, which is necessary for the MCNP
compilation with g77; this program can be downloaded freely (http://rpmfind.net). Once all the aforementioned tools are operational, the multiprocessing version of MCNP4B can be compiled following the instructions that accompany the MCNP4B compact disc distributed by the Radiation Safety Information and Computational Center (RSICC). Only small modifications to the install.fix and prpr.id files are necessary; they are well described on page 13 of the MCNP electronic notebook maintained by RSICC (www-rsicc.ornl.gov/ENOTE/enotmcnp.html). After compilation, the multiprocessing MCNP executable must be renamed mcnp.pvm and copied to the hard disk of every computer of the cluster (unless NFS is being used). A copy or a soft link of mcnp.pvm must be placed in the previously created /home/.../pvm3/bin/LINUX directory and in the /usr/bin directory. Before running an MCNP application on the cluster, the parallel machine must be built using the PVM commands; after that, MCNP is ready to run using all the cluster resources.
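As an illustrative sketch, building the parallel machine from the PVM console and launching a run might look like the session below. The host names node2 and node3 are hypothetical, and the exact MCNP execute line (e.g. the tasks option) should be checked against the MCNP4B documentation:

```
$ pvm                 # start the PVM console; this enrolls the local host
pvm> add node2        # add the other cluster machines to the virtual machine
pvm> add node3
pvm> conf             # list the hosts now forming the parallel machine
pvm> quit             # leave the console; the pvmd daemons keep running

$ mcnp inp=<file> tasks 3    # example: run MCNP across the three hosts
```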
V. TESTING THE SYSTEM

The system performance was verified by simulating the TRIGA IPR - R1 research reactor of CDTN/CNEN[6]. The MCNP model of the IPR - R1 was run with several combinations of numbers of histories and cycles, compilers, operating systems, numbers of processors and library formats. MCNP executables were generated for Windows and Linux using three different compilers: Lahey5 for Windows, and g77-0.5.22 and g77-0.5.26 for Linux.

The effects of the data library format and of the number of processors on the processing time were compared. MCNP can read the cross-section data files in binary or ASCII format. The data library files are originally distributed by RSICC in ASCII format; nevertheless, binary files are smaller and can be read faster, so it is recommended to work with binary libraries. The format conversion can be done with the makxsf tool distributed together with MCNP. A processing time reduction of 2% was measured when the binary libraries were used instead of the formatted ones (for 20000 histories and 200 cycles).

On the Linux system, both parallel and sequential executable codes were generated. It was noted that in calculations with only one processor the sequential executable is faster than the parallel (PVM) one: running just a few histories and cycles (500 histories and 50 cycles), the sequential version is 13% faster. Nevertheless, as the number of histories and cycles increases this difference becomes smaller. On the Windows system (version 98) only a sequential version of MCNP was evaluated.
TABLE 1. System Tests Summary

Operat. System / Compiler   # of computers   # of cycles   # of histories   Time (hours)   Time Diff (%)
Linux / g77-0.5.22                 1              50              500            0.25           -7.4
Linux / g77-0.5.22                 1             110             5000            5.31             -
Linux / g77-0.5.22                 1             200            20000           37.58           -2.8
Win98 / Lahey5                     1              50              500            0.18          -33.3
Win98 / Lahey5                     1             110             5000            3.49          -34.3
Win98 / Lahey5                     1             200            20000           25.34          -34.4
Linux / g77-0.5.26                 1              50              500            0.14          -48.1
Linux / g77-0.5.26                 1             110             5000            3.02          -43.1
Linux / g77-0.5.26                 1             200            20000           21.67          -43.9
Linux / g77-0.5.22                 2              50              500            0.27             -
Linux / g77-0.5.22                 2             110             5000            5.29           -0.4
Linux / g77-0.5.22                 2             200            20000           38.65             -
Linux / g77-0.5.26                 2              50              500            0.17          -37.0
Linux / g77-0.5.26                 2             110             5000            3.28          -38.2
Linux / g77-0.5.26                 2             200            20000           23.84          -38.3
Linux / g77-0.5.22                 3              50              500            0.21          -22.2
Linux / g77-0.5.22                 3             110             5000            3.15          -40.7
Linux / g77-0.5.22                 3             200            20000           21.30          -44.9
Linux / g77-0.5.26                 3              50              500            0.13          -51.9
Linux / g77-0.5.26                 3             110             5000            2.20          -58.6
Linux / g77-0.5.26                 3             200            20000           15.80          -59.1

(A dash in the Time Diff column marks the worst case for that cycles/histories combination, to which the other times are compared.)
Table 1 summarizes the results of the tests carried out on the system. All the execution times are compared to the worst case. Concerning the compilers, it can be noted that g77-0.5.22 produces the slowest MCNP executables, Lahey5 is at an intermediate level, and g77-0.5.26 produces the fastest executables. Regarding the influence of the number of processors on the processing time, it can be seen that two computers running in parallel are slower than the same case run on just one computer with the sequential MCNP executable. An exception was found in the case of 110 cycles with 5000 histories, but the difference is so small that it can be neglected. No gain was found using two computers; on the contrary, three computers produce a favorable reduction in the processing time. When this paper was being written, three more computers were added to the cluster, but unfortunately they were not installed in time for the tests, which has limited the information about the relationship between the number of computers and the processing time. The results presented in Table 1 are plotted in Fig. 1.
Figure 1. System Tests Summary: number of cycles × number of histories versus processing time (hours) for each configuration (Linux with g77-0.5.22 and g77-0.5.26 on 1, 2 and 3 processors; Win98 with Lahey5 on 1 processor).

VI. CONCLUSIONS

The results show the viability of the system. A PVM cluster with three microcomputers (using the Linux operating system) was successfully installed and has demonstrated improved computational performance, at an extremely low cost, for MCNP applications. As microcomputers are cheap equipment, the hardware of the cluster can be upgraded frequently. The software costs are negligible as well, since the operating system (Linux), the compilers (g77 and gcc) and the PVM package are free of charge.

The MCNP4B Monte Carlo code was correctly installed on the system and a multiprocessing version was successfully compiled. In calculations using just one processor, the sequential executable code is faster than the PVM version, and no gain was found using two computers. Three computers, running the MCNP executable code compiled with g77 version 0.5.26, give the best performance (shortest execution time). However, as the cluster was configured with only three microcomputers, the information about the relationship between the number of processors and the execution time is still limited.

It is intended to increase the number of computers in the CDTN cluster. Adding more computers to the cluster is simple and cheap, and the computers need not be dedicated exclusively to MCNP simulations: the parallel machine can be assembled in the evening, the processes run during the night, and the computers released for other uses during the day.

ACKNOWLEDGEMENTS

The authors are most grateful to Elcio Tadeu Palmieri and Paulo Cezar Gomes from CDTN's Reactors Technology Service and to Dr. Robert Jeraj from the Jozef Stefan Institute.

REFERENCES

[1] Briesmeister, J. F., MCNP - A General Monte Carlo N-Particle Transport Code, Version 4B. LA-12625-M, 1994.

[2] Geist, A.; Beguelin, A.; Dongarra, J.; Jiang, W.; Manchek, R.; Sunderam, V., PVM: Parallel Virtual Machine. A Users' Guide and Tutorial for Networked Parallel Computing. The MIT Press, Cambridge, Massachusetts, 1994.

[3] Hargrove, W. W.; Hoffman, F. M.; Sterling, T., The Do-It-Yourself Supercomputer. Scientific American, Aug. 2001.

[4] CONECTIVA S. A., Guia do Usuário do Conectiva Linux 6.0. Casa e Escritório, 1st edition, 2000.

[5] GNU Compiler Collection, G77 0.5.26 (GCC 3.0.3) Manual, 2001.

[6] Dalle, H. M., Implantação de um Cluster de Computadores Linux para Processamento Paralelo e sua Utilização na Simulação com MCNP4B do Núcleo BOL do Reator TRIGA IPR - R1. NI-CT4-001/01, CDTN/CNEN, 1999.