Update and Restructure Legacy Code for (or Before) Parallel Processing

F. G. Tinetti1,4, M. Méndez1, M. A. Lopez2, J. C. Labraga2, P. G. Cajaraville3
1 III-LIDI, Fac. de Informática, UNLP, La Plata, Argentina
2 Centro Nacional Patagónico, CONICET, Puerto Madryn, Argentina
3 Fac. Ingeniería, UNPSJB, Sede Pto. Madryn, Puerto Madryn, Argentina
4 Comisión de Investigaciones Científicas de la Prov. de Bs. As., La Plata, Argentina

Abstract: Transforming old, large sequential applications into parallel versions is still a big challenge in the field of parallel computing. This paper presents a set of transformations to be applied to a medium/large numerical legacy program, with the final objective of parallel computing in mind. The steps are oriented to improve the source code while preserving the external behavior of the software. Each of these transformations has been selected at least to make old Fortran source code more readable and understandable, in order to upgrade it and make it easier to parallelize.

Keywords: Legacy Code, Parallelization, Fortran Code Upgrade, Numerical Simulation
1. Introduction

Legacy software has become an issue in many organizations, involving a number of problems and characteristics [26] [8] [6]. The numerical processing field is strongly related to legacy software since
• There is a large number of applications currently in production, being used in a number of organizations and areas such as aerospace, meteorology, etc. [22].
• Mathematical models and computer simulation have been applied for many decades.

Furthermore, single CPU (Central Processing Unit) performance (e.g. in terms of Mflop/s, or millions of floating point operations per second) is not expected to follow the so-called Moore’s Law [21] [12]. There will be almost no improvement in CPU clock rates; instead, the number of CPUs or cores is expected to rise, at least in the coming years [23] [24]. Multi- and many-core computers are now commonplace, and combinations of multiple multicore processors are now included in medium to large desktop and server computers. This is directly related to legacy code, since it is no longer valid to expect a runtime reduction just by getting the latest computer [23] [24]. There will be no runtime reduction unless the code is parallelized in some way. Unfortunately, numerical legacy code is amongst the software most resistant to change, and this paper is focused on how to approach these applications so that they can be upgraded and parallelized.
Numerical legacy applications are programmed mostly in Fortran for several reasons. Fortran has been and still is one of the most appropriate languages for numerical processing, mostly because numerical processing was almost the only application field at the time Fortran was created [4] [5]. Fortran is one of the first high level programming languages, and it has been in use and updated for decades [20] [?], unlike most current programming languages, including the most popular ones. Fortran was the first standardized language and it also has several standards reflecting its evolution [1] [2] [3] [15] [16] [17].

Legacy software/applications and the environments in which they are used have several features that make them difficult to change/update:
• The software documentation is lost, outdated, or never existed at all.
• Today's standard software development methodologies and/or tools were not used for reaching the current version of the software, or even the initial version.
• The current version is in fact the result of several maintenance/adaptation changes, not always well documented.
• Several developers have been involved, some of them at the initial stages of the development process and others as time and environment progressed. Also, each developer used his or her own coding style.
Numerical applications have several disadvantages which are combined with the previous ones on legacy code:
• Physical/mathematical models are usually coded in large and hard to read programs, due in part to the combination of low level programming language abstractions and the properties of the numerical method/s. Most of the numerical problems/properties involved are, in turn, a combination of discrete number representation and the numerical method used to compute a solution [18].
• The software developers usually have not been trained in software development tools/processes. Thus, the initial software version contains structures and/or coding specifically oriented towards using a specific computer or computing facility instead of solving a numerical problem. Also, hardware dependent code sections are undocumented and difficult to identify in the whole application.

This paper proposes a general methodology and presents a proof of concept on a specific legacy code, a global climate model (GCM). This program can be taken as representative of a medium/large numerical application, since
• It has about 300 files containing a total of about 58000 lines of FORTRAN 77 source code.
• Approximately 10% of the files are used for defining FORTRAN Common Blocks (global data).
• About 80% of the source code lines containing comments just identify the programmer and/or minor code modifications.
• Most of the FORTRAN routines access global data and also define aliased data via “equivalence” declarations.

The parallelization process is hard to start in legacy applications, and an incremental process is presented in this paper. Almost every change/update in the legacy code aims at parallel processing, either directly or indirectly:
• Enhancing readability allows the code to be understood and thus makes every other code change simpler and less error prone. Even though enhancing readability does not imply parallel processing per se, it is necessary for every further software modification needed for parallel processing.
• Some software sections can be approached almost directly for parallel processing, loops being the clearest ones. Loops are the first candidates for parallelization using OpenMP directives [10]. Also, loops are where most calculations are carried out, so they have to be understood in order to port that processing to shared/distributed memory parallel computers.
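As a minimal sketch of the kind of loop that is a first candidate for an OpenMP directive, consider the free-form Fortran module below. The module name `loop_demo`, the routine `axpy` and its arguments are hypothetical and are not taken from the GCM. The loop has independent iterations, which is the property that has to be established by reading the code before a directive like this one can be added safely.

```fortran
module loop_demo
  implicit none
contains
  ! Computes y(i) = a*x(i) + y(i) for every i.
  ! Hypothetical routine used for illustration only; not GCM code.
  subroutine axpy(n, a, x, y)
    integer, intent(in)    :: n
    real,    intent(in)    :: a
    real,    intent(in)    :: x(n)
    real,    intent(inout) :: y(n)
    integer :: i

    ! Each iteration reads and writes only element i, so the
    ! iterations are independent and can be shared among threads.
    ! When OpenMP is not enabled, the directives are plain comments.
    !$omp parallel do default(none) shared(n, a, x, y) private(i)
    do i = 1, n
      y(i) = a * x(i) + y(i)
    end do
    !$omp end parallel do
  end subroutine axpy
end module loop_demo
```

With GNU Fortran, for instance, the directives are honoured when compiling with `-fopenmp`; without that flag the same source compiles as sequential code, which makes it straightforward to compare both versions against the reference output.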
2. The General Approach

The general methodology is similar to that in almost every software maintenance process: apply a single change plus the necessary software testing, with two objectives. Unlike almost every software maintenance task, software testing after a change in this context is focused on assuring that the software did not change its behavior. Standard software maintenance tasks are usually focused in the opposite direction: changing the software behavior (for correcting or adding/deleting/changing software functionality) [25] [9] [7]. There is one point in common, though: some standard maintenance tasks are oriented towards enhancing performance, which is almost exclusively the focus of the work in this paper, including software parallelization.

There are several distinguishing characteristics of updating and restructuring legacy code for (or prior to) parallel processing:
• There are well known update definitions, specifically for Fortran legacy code. Those updates can be applied even without knowledge of the source code, and applying those changes may produce better knowledge of the software. For example, FORTRAN 77 code is written in the so-called fixed format, where every line has to begin in a predefined column. In this context, the change from fixed to free format can be applied directly to the whole software, automatically including some code indenting style. It is not necessary to know anything about the software, unlike a traditional maintenance task, where something about the software must be known before it can be changed/deleted/etc.
• The process can be reduced to those routines or software sections which have the highest processing requirements of the application. It is possible to take advantage of profiling in order to identify the routines or code sections that account for most of the elapsed runtime. However, this will not be used in this paper, since the change process will be applied to the whole application, regardless of the processing requirements of each routine. There are mainly two reasons for applying changes to the whole legacy code:
  – Avoiding further mixture of coding styles, since applying changes to only a section or set of routines will result in partially changed software: some sections/routines in old or legacy style and some others with new features/better style.
  – Distributed memory parallel programming usually implies some coding for adding at least the calls to send()-recv() communication routines. Including code in partially updated/changed legacy software can be worse than doing the same task directly on unchanged legacy code.
• Some software sections will require much more effort than others. More specifically, loops will be analysed and updated so that they can be easily handled in further parallelization steps.
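To make the first and last items above more concrete, the following is a minimal sketch of a small routine before and after this kind of update. The legacy fixed-form version appears only as comments, and the free-form version below it is meant to have the same external behavior. The names `update_demo` and `clamp_scale` are hypothetical and the routine is not taken from the GCM. Besides the source form, the sketch also assumes that the labeled loop ending and the old relational operator are rewritten in their Fortran 90 forms.

```fortran
module update_demo
  implicit none
contains
  ! Legacy fixed-form version, shown as comments for comparison
  ! (in the original layout, labels occupy columns 1-5 and
  ! statements start in column 7):
  !
  !       SUBROUTINE CLAMPS(N, F, V)
  !       INTEGER N, I
  !       REAL F, V(N)
  !       DO 10 I = 1, N
  !          IF (V(I) .LT. 0.0) V(I) = 0.0
  !          V(I) = F * V(I)
  !  10   CONTINUE
  !       END
  !
  ! Free-form version intended to behave identically.
  subroutine clamp_scale(n, f, v)
    integer, intent(in)    :: n
    real,    intent(in)    :: f
    real,    intent(inout) :: v(n)
    integer :: i

    do i = 1, n
      if (v(i) < 0.0) v(i) = 0.0   ! negative entries are set to zero
      v(i) = f * v(i)              ! then every entry is scaled by f
    end do
  end subroutine clamp_scale
end module update_demo
```

None of these rewrites requires knowing what the routine computes, which is why changes of this kind are candidates for being applied automatically to the whole application.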
Specifically, every software change/update will be made in a series of steps:
1) Identify and save (e.g. in a software version manager) the current legacy software application/program, which will be taken as the reference. Every change will be accepted/rejected according to its relationship to the reference program.
2) Select a specific change/update and apply it to the reference program. A new program version will be produced.
3) Check/verify the new program version by comparison with the previous one. Define and apply software testing comparison criterion/criteria in order to accept or reject the new program version. This may/should include a set of test cases if necessary.
4) Accept/reject the change according to the previous comparison. An accepted program version will be the candidate as the current version for the next change.
A rejected program version would be:
• Discarded, in order to avoid investing more time/effort in a possibly useless change.
• Reviewed, in order to find out the problem/s and possible solution/s.
5) Document the accepted/rejected change. In case of an accepted change, documentation should include at least a general description of the change plus several (if not all) specific/actual changes. Specific changes are highly prone to be produced automatically (e.g. by a software version manager).

The complete legacy software update process can be described as an iteration on these steps, each iteration for a different specific change, as shown in Fig. 1, which has several points in common with [11] [13] [14]. The first and last steps can be considered almost automatically made by some well-known tools. The Change step on Fortran legacy code has some issues directly related to the language (e.g. old and deprecated Fortran features), others related to the original software development tools/methodology (or lack thereof) and the current software version, and others related to the environment in which the updated software is expected to run (i.e. parallel computing hardware). The Change step is mostly about implementation details of the specific change and the tool selected for implementation. The Check step is one of the most interesting steps, entailing software testing. Some software testing is directly related to the application, specifically analysing and comparing program output. However, given that some changes are about readability (i.e. mostly syntax), it is also expected a priori that the change itself can be verified without knowledge of the application/output. The Accept/Reject step involves a difficult decision in numerical applications: when something changes in the output, whether it is safe to accept the new version or not. This is particularly difficult because numerical code is frequently used for simulating a system, and numerical models do not necessarily have a unique (correct) output. Specific decisions and implementations of these steps on the GCM described above are shown in the next section.

Fig. 1: Software Update Process

3. Defining and Verifying Changes

There are several issues/decisions involved in the legacy software update process as described in the previous section. At the highest level of abstraction, some decisions are related to the software feature which needs improvement. At the lowest level of abstraction, there are also several choices such as the tool/s used in the process. This section is focused on the most time and effort consuming steps of the updating process: Change, Check, and Accept/Reject.

3.1 Possible Changes: Update/Restructure
The first changes applied to the legacy GCM described above have a dual purpose: enhancing readability and highlighting issues relevant to the parallelization process. Many of the parallelization issues are related to the data involved in each calculation, whether it is local to a routine or used from global memory (Fortran Common blocks). Beyond the classical side effect problems, global memory usually makes the parallelization process harder. More global memory accessed in more routines necessarily implies more data traffic (communications) in distributed memory parallel computers such as clusters. Furthermore, more data traffic in general also implies performance penalties, since the multi-core caches become insufficient and/or dirty more often at runtime. Also, some changes are related to old and/or bad coding practices, sometimes accepted by Fortran compilers as language extensions. Surprisingly, most of those bad coding practices are not accepted in almost any of the standards, but just as language extensions and mostly non-portable features.
1) Remove Tabs: The tab character is not a legal character in any Fortran standard source code. The GCM as well as other legacy codes include such characters, which are accepted by compilers via the so-called Fortran compiler or language extensions. Also, the tab character is handled rather freely by editors and IDEs (Integrated Development Environments), so it is far from enhancing readability.
2) Change Fixed Form to Free Form: Fixed form source code format has not been mandatory in the Fortran standard since Fortran 90 [3]. This old language feature makes reading and understanding source code difficult. Furthermore, fixed format source code is prone to errors since blanks (white spaces), for example, are not meaningful and can be used (or not used) almost freely in the code. It is worth noting that fixed source form is still fully standard compliant [17].
3) Replace Old Style DO Loops: FORTRAN 66 and FORTRAN 77 do not have an end loop statement; it was introduced in the Fortran 90 standard [1] [2] [3]. As a consequence, DO loops use a continue statement as the ending point or, in the worst-case scenario, any labeled statement. Furthermore, shared DO loop termination is a coding style that hinders program readability. Loops are necessarily analyzed and must be well understood, since most of the numerical processing is done in loops. Loops are the initial focus for using OpenMP, for example, in the context of shared memory parallel computing. In distributed memory parallel computing it is necessary to know every data item involved in every loop in order to find out the amount of local and non-local data and, thus, the communication needs at each processing step.
4) Replace Obsolete Operators: Old FORTRAN relational operators (.lt., .eq., etc.) were replaced with those commonly used in modern programming languages: