MAPL - A High-Level Programming Paradigm to Support More Rapid and Robust Encoding of Hierarchical Trees of Interacting High-Performance Components

Max Suarez
Global Modeling and Assimilation Office, Goddard Space Flight Center, Maryland, USA
[email protected]

Atanas Trayanov
Global Modeling and Assimilation Office, Goddard Space Flight Center, Maryland, USA
[email protected]

Chris Hill
Department of Earth, Atmospheric and Planetary Sciences, Massachusetts Institute of Technology, Massachusetts, USA
[email protected]

Paul Schopf
George Mason University, Virginia, USA
[email protected]

Yuri Vikhliaev
Global Modeling and Assimilation Office, Goddard Space Flight Center, Maryland, USA
[email protected]
Abstract

We describe the design, and deployment in several large-scale Earth system codes, of an innovative programming library, MAPL. MAPL is a layer of software built on top of the Earth System Modeling Framework (ESMF) component library. It provides mechanisms for automating and managing key aspects of the interconnection and control of deep, hierarchical trees of interacting components. Examples of the use of the MAPL library, in both an illustrative five-component coupled system and with state-of-the-art large-scale Earth system models, are used to highlight MAPL's role in automating key aspects of the creation of sophisticated, scalable component-based systems.
High-performance component frameworks, such as the Common Component Architecture (CCA) [2], META-CHAOS [7], CACTUS [1], MCT [11], Fractal [4] and the Earth System Modeling Framework (ESMF) [9], provide a powerful basis for developing next generation Earth system models. The potential of these systems has energized Earth system model developer and user groups, resulting in growing interest in interchange of computational implementations and catalyzing more distributed approaches to development. Applications built using emerging component paradigms have started to appear, some containing large numbers of components and featuring multiple combinations of peer-peer and parent-child component interactions. Expressing the resulting component interactions succinctly and robustly presents a programming challenge. In this paper we describe a software library, called MAPL, that is built on top of ESMF. The ESMF software that underlies MAPL has been described in several articles [9, 5]. We review some key aspects of ESMF in this article, but much more detail can be found in existing publications and at the ESMF web site [15]. In essence, ESMF is a library of software for building and coupling weather, climate and related models. It supports component-oriented design to facilitate reuse and code sharing amongst Earth science modeling communities. ESMF presents a component programming model in which (i) components are software entities that usually perform, potentially parallel, computations on sets of arrays that represent some discretized one-, two- or three-dimensional space and (ii) components can be flexibly
Categories and Subject Descriptors: D.2.12 [Software Engineering]: Interoperability
General Terms: Component programming; Structured approaches
Keywords: Components; Hierarchy; Tree
Copyright 2007 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by a contractor or affiliate of the U.S. Government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
HPC-GECO/CompFrame'07, October 21-22, 2007, Montréal, Québec, Canada.
© 2007 ACM 978-1-59593-867-1/07/0010...$5.00
1. Introduction
Figure 1. A diagram showing the hierarchical relationship amongst components in the GEOS-5 software.

Figure 2. UML diagram of the interfaces that a standard ESMF component, in this case the SOLAR function from Figure 1, provides. All MAPL components provide these functions.

coupled together in a manner that reflects couplings between physical processes that are simulated in Earth and planetary science models. The goal of the MAPL software presented in this article is to ease building large, multi-component, Fortran based HPC applications that exploit the ESMF component programming model. All the MAPL software and documentation and ESMF software and documentation described and referenced in this article are available online [15, 17].

1.1 An example MAPL component hierarchy

Figure 1 shows a representation of the GEOS-5 software, a component based implementation of the Goddard Earth Observing System (GEOS) atmospheric general circulation model. Each box in the figure is a separate ESMF component. Each component performs a specific algorithmic function. For example, the AGCM component orchestrates the numerical time-stepping of two child components, COLUMN PHYSICS and DYNAMICS. These child components handle respectively (through multiple child components) atmospheric simulation physics (i.e. numerically simulating cloud processes and radiative processes) and atmospheric simulation dynamics (i.e. numerically simulating the fluid flow equations that involve flow and pressure in the atmosphere). As evident in the illustration, the GEOS-5 system consists of a hierarchical tree of components. The root of the tree (the box marked CAP) serves as the top-level control point for the simulation. When executed, data and control must pass among components in the correct sequence and along the right paths. A key role of MAPL is to provide support for succinct, comprehensible and correct expression of the control and data flows that underlie a component tree such as that in Figure 1. This article describes the features in MAPL that support programming an application patterned as a tree, and based on ESMF. In section 2 we briefly review the ESMF component model. In section 3 we describe key semantics and rules that MAPL introduces and show examples of the MAPL API. In section 4 we present a simple example of MAPL use, and in section 5 we show examples of MAPL being used to extend the GEOS-5 system to include multi-model ocean simulation capabilities. Finally we conclude in section 6 and examine future MAPL applications and development directions.

2. The ESMF component model
Each box in the tree in Figure 1 is a separate software component, with public component interface functions that conform to the ESMF component model described in [9, 5] and briefly recapped here. For example, the ESMF component box marked SOLAR in Figure 1 is controlled through four of the public component interface functions, SetServices(), Initialize(), Run() and Finalize(), that the ESMF component model defines.¹ In the ESMF component model these functions are envisioned to perform the following roles:

1. A component's SetServices() function is called when an instance of the component is created. The argument to SetServices() is a variable of type ESMF_GridComp. This variable corresponds to a specific instantiation of a component. In an ESMF component, the component-developer-written SetServices() function registers a series of component-specific functions by calling the ESMF function ESMF_GridCompSetEntryPoint(). Calling ESMF_GridCompSetEntryPoint() with appropriate arguments associates component-developer-supplied Initialize(), Run() and Finalize() functions with the component instance. As described in section 3, MAPL extends the role of a component's SetServices() function considerably.

2. A component's Initialize() function is called to configure an instance of the component. The arguments of an ESMF component's Initialize() function include a variable of type ESMF_GridComp. This variable is the instantiation of the component. In addition there are three other arguments, two of type ESMF_State and one of type ESMF_Clock. These are used to pass in to the component an import state variable, an export state variable and a simulation time counter variable respectively. The import state and export state variables are used in ESMF to pass data between components. ESMF provides an extensive API [16] for attaching typical Earth system model data structures to ESMF_State type variables. Two important aspects of the API that are relevant to MAPL are (a) data structures added to an ESMF_State can have arbitrary meta-data tags associated with them, and (b) an ESMF_State can contain another ESMF_State variable, allowing recursive nesting of ESMF_State variables.

3. A component's Run() function is called to carry out a cycle of the iteration that makes up the kernel of a component's computational algorithm. The Run() function is called repeatedly as part of a component instance's life-cycle. It takes the same argument list as Initialize().

4. A component's Finalize() function is called to terminate a component cleanly. It takes the same argument list as Initialize().

¹ ESMF defines a number of other component interface functions that a component can register [16]. However, to illustrate the core concepts in MAPL, SetServices(), Initialize(), Run() and Finalize() are sufficient.

Key ESMF component interface functions and signatures are shown in Figure 2. The ESMF library provides support for programming components with the listed interface functions. However, ESMF does not mandate that all components provide the four functions listed and does not make any assumptions about the semantics of each of the functions. The only things mandated by the ESMF component model are (i) that the respective component interface functions' argument lists be of the types shown and (ii) that the component interface functions be registered by a component's SetServices() function. All the processing that takes place in each component interface function is assumed to be user written. This makes ESMF very general and avoids imposing constraints on application developers. However, under ESMF, it is up to the component developer to write all the code for SetServices(), Initialize(), Run() and Finalize(). In contrast, when using ESMF through MAPL, components can in many situations employ default SetServices, Initialize, Run and Finalize interfaces that are provided by the MAPL library. This requires following some specific rules regarding the actions that are performed by each of these component interface functions. These rules are explained in section 3.

2.1 ESMF Gridded and Coupler Components

In ESMF, software components can be created of type ESMF_GridComp or ESMF_CplComp. These types correspond to what are called respectively "gridded components" and "coupler components" in the ESMF documentation [16]. The two types have very similar characteristics. By convention, gridded component type variables are used for components that carry out the main numerical and algorithmic computations of a code. Coupler components are used to map data between gridded components. For example, the SOLAR component in Figure 1 requires the three-dimensional, time-varying atmospheric pressure as part of its import state. This quantity is computed by another component in the tree, FVCORE, which will have produced the quantity as part of its export state. In ESMF, a coupler component is used to map the atmospheric pressure entry held in the FVCORE export state to the atmospheric pressure entry in the import state of SOLAR. In a non-MAPL ESMF component this mapping between components must be explicitly coded by a developer. When using ESMF through MAPL, this mapping is automated and no explicit developer code is required.

2.2 A simplified, non-MAPL, ESMF example

! Phase 1 Creation: creates components and registers user code
! procedures for methods called in later phases. Create steps are also
! called for Gridded Component COMPONENT2 and Coupler Component
! COUPLER21 (not shown).
COMPONENT1 = ESMF_GridCompCreate("Example Component 1", LAYOUT_GC1)
CALL ESMF_GridCompSetServices(COMPONENT1, component1_register)
COUPLER12  = ESMF_CplCompCreate("Example Coupler12", LAYOUT_CPL12)

! Phase 2 Initialization: call the user code initialization procedures
! registered in phase 1. Initialize steps are also called for
! COMPONENT2, COUPLER12, COUPLER21 (not shown).
CALL ESMF_GridCompInitialize( COMPONENT1, IMPORT1, EXPORT1, CLOCK, RC )

! Phase 3 Run: call the user code run procedures, normally within one
! or more control loops (not shown).
CALL ESMF_GridCompRun( COMPONENT1, IMPORT1, EXPORT1, CLOCK, RC )
CALL ESMF_CplCompRun(  COUPLER12,  EXPORT1, IMPORT2, CLOCK, RC )
CALL ESMF_GridCompRun( COMPONENT2, IMPORT2, EXPORT2, CLOCK, RC )
CALL ESMF_CplCompRun(  COUPLER21,  EXPORT2, IMPORT1, CLOCK, RC )

! Phase 4 Finalize: a component finalize procedure will usually do some
! form of shut-down I/O and deallocate resources.
CALL ESMF_GridCompFinalize( COMPONENT1, IMPORT, EXPORT, CLOCK, RC )

Figure 3. Simplified, Fortran-like, non-MAPL ESMF pseudo code for two Gridded Components COMPONENT1 and COMPONENT2 that communicate with one another through two Coupler Components COUPLER12 and COUPLER21.

The non-MAPL ESMF application code fragments in Figure 3 show the flow of control through the SetServices(), Initialize(), Run() and Finalize() phases and illustrate data flows encapsulated in import and export state variables of type ESMF_State. The figure shows key pieces of code from a hypothetical component based application in which two gridded components (COMPONENT1 and COMPONENT2) are coupled together by two coupler components (COUPLER12 and COUPLER21). In the section marked Phase 1 in Figure 3, gridded component variables COMPONENT1 and COMPONENT2, both of type ESMF_GridComp, are created (not all the code is shown) and their Initialize(), Run() and Finalize() functions are registered using appropriate SetServices calls. In this section the coupler component variables COUPLER12 and COUPLER21 are also created; again, only a fragment of the needed code is shown. In the section marked Phase 2 the Initialize() functions for each of the gridded and coupler components are executed. Only the Initialize() call for COMPONENT1 is shown in the figure; the code for the other components is very similar. The Phase 3 section shows the Run() component interface function being called for each of the gridded and coupler components in sequence. The role of the coupler components is evident. For example, the Run() stage of COUPLER12 takes the export state, EXPORT1 (an ESMF_State type variable), from gridded component COMPONENT1 and maps it to the import state, IMPORT2, of gridded component COMPONENT2. The Phase 4 section is where the Finalize() function of each component is invoked.

subroutine SetServices ( GC, RC )
type(ESMF_GridComp), intent(INOUT) :: GC
integer, optional,   intent(  OUT) :: RC
:
call MAPL_AddImportSpec(GC,SHORT_NAME='U',        &
     LONG_NAME='eastward_wind',UNITS='m s-1',     &
     DEFAULT=0.0,DIMS=MAPL_DimsHorzVert,          &
     VLOCATION=MAPL_VLocationCenter,RC=STATUS )
:
call MAPL_AddExportSpec(GC,SHORT_NAME='DUDT',     &
     LONG_NAME='eastward_wind_tendency',          &
     UNITS='m s-2',DIMS=MAPL_DimsHorzVert,        &
     VLOCATION=MAPL_VLocationCenter,RC=STATUS )
:
call MAPL_GridCompSetEntryPoint (GC,ESMF_SETRUN,  &
     Run, RC=STATUS)
:
C1=MAPL_AddChild(GC,NAME='C1',SS=c1_SetServices,  &
     RC=STATUS)
C2=MAPL_AddChild(GC,NAME='C2',SS=c2_SetServices,  &
     RC=STATUS)
:
call MAPL_AddConnectivity( GC,                    &
     SHORT_NAME = (/ 'X' /),                      &
     SRC_ID=C1,DST_ID=C2,RC=STATUS )
:
call MAPL_GenericSetServices ( GC, RC=STATUS)
:

Figure 4. Fortran 90 code fragments extracted from a MAPL based SetServices() function. The extracted code fragments show the key functions that are supplied by MAPL for use in the ESMF SetServices stage. Numerous declarations and other syntax, needed for compilation, have been omitted for clarity.

2.3 Applying the standard ESMF component model to a deep tree hierarchy

Extending the ESMF approach described above and illustrated in Figures 2 and 3 to a tree of the sort shown in Figure 1 presents some challenges. As noted in section 2.1, in the example used in Figure 1, entries in the export state of component FVCORE need to be mapped, via an ESMF coupler component, to entries in the import state of component SOLAR. There are many such data flows amongst the components in Figure 1; coding each of these using hand-written coupler components soon becomes impractical. Generating the hierarchy shown on the figure, using the basic ESMF paradigm, rapidly becomes a very complex exercise. Code like the fragments shown in Figure 3 and described in section 2.2 is required at every level in the hierarchy. In a non-MAPL ESMF code these pieces of glue and driver code are hand-written. MAPL, by introducing some rules about the semantics of SetServices(), Initialize(), Run() and Finalize(), can begin to address and automate this potential complexity.
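The four-phase control flow that must be hand-written at every level of a non-MAPL ESMF hierarchy can be sketched in miniature. The following sketch is in Python rather than the paper's Fortran, and its class and method names are invented stand-ins, not the real ESMF API; it mimics the Figure 3 sequence, with two gridded components exchanging data through two couplers that copy export entries into import states.

```python
class GridComp:
    """Stand-in for an ESMF gridded component: imports in, exports out."""
    def __init__(self, name):
        self.name = name
        self.imports, self.exports = {}, {}

    def initialize(self):
        # user-written Initialize(): seed the export state
        self.exports["field"] = 0.0

    def run(self):
        # user-written Run() kernel: consume imports, produce exports
        self.exports["field"] = self.imports.get("field", 0.0) + 1.0

class CplComp:
    """Stand-in for a coupler: copies an export state into an import state."""
    def __init__(self, src, dst):
        self.src, self.dst = src, dst

    def run(self):
        self.dst.imports.update(self.src.exports)

# Phases 1 and 2: creation and initialization (cf. Figure 3)
c1, c2 = GridComp("COMPONENT1"), GridComp("COMPONENT2")
cpl12, cpl21 = CplComp(c1, c2), CplComp(c2, c1)
c1.initialize(); c2.initialize()

# Phase 3: the run sequence of Figure 3, inside a driver loop
for step in range(3):
    c1.run(); cpl12.run(); c2.run(); cpl21.run()

print(c1.exports["field"], c2.exports["field"])   # 5.0 6.0
```

Even in this toy form, the driver must know the correct call order and wiring of every component and coupler; it is exactly this boilerplate, repeated at each level of a deep tree, that MAPL automates.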
3. Semantics and API of MAPL

In section 2 we reviewed core aspects of the ESMF component model. In this section we describe how the MAPL software, which is layered on top of ESMF, exploits the ESMF component paradigm. A key goal of MAPL is to simplify robust coding of hierarchies of ESMF components. In MAPL this is done primarily through (i) introducing recursive nesting of the four calls (SetServices, Initialize, Run and Finalize) contained in the numbered list in section 2 and (ii) associating certain default behaviors or semantics with each of the four calls. Here we outline the semantics that MAPL adds to each of SetServices(), Initialize(), Run() and Finalize().

Phase         Action(s)
SetServices   Registers import state specifications:
                call MAPL_StateAddImportSpec(...)
              Registers export state specifications:
                call MAPL_StateAddExportSpec(...)
              Registers the component's children:
                call MAPL_AddChild(...)
              Registers connections between children:
                call MAPL_AddConnectivity(...)
              Calls children's SetServices:
                call MAPL_GenericSetServices(...)
Initialize    Call MAPL_GenericInitialize(...):
                sets grid (inherited or set locally),
                automatically allocates state entries,
                automatically creates coupler components,
                calls children's Initialize phases.
Run           Call MAPL_GenericRun(...):
                calls automatic coupler components and
                children's Run phases.
Finalize      Call MAPL_GenericFinalize(...)

Table 1. Summary of extensions MAPL adds to the standard ESMF SetServices, Initialize, Run and Finalize phases.

3.1 SetServices semantics in MAPL

The SetServices() function in ESMF is used to register a set of function pointers that are to be associated with specific phases (illustrated in Figure 3) in a component's life-cycle. In standard ESMF descriptions this is the only role of SetServices(). In MAPL this aspect of SetServices is invoked by calling the library function MAPL_GridCompSetEntryPoint(). This function uses the ESMF library call ESMF_GridCompSetEntryPoint() to add an entry for a user-specified function to a function pointer table held with the gridded component variable. The argument ESMF_SETRUN passed to MAPL_GridCompSetEntryPoint() in Figure 4 is a flag indicating that the component-developer-supplied function Run() will be associated with the ESMF run phase for the component GC. MAPL_GridCompSetEntryPoint() also records information in internal MAPL data structures that are then attached to the ESMF_GridComp variable GC. This is possible because ESMF provides a function (called ESMF_UserCompSetInternalState()) that can associate pointers to arbitrary data structures and store them with a variable of type ESMF_GridComp.

3.1.1 Registering import and export state entries

Unlike standard ESMF, in MAPL, SetServices also includes several other parts. At the top of Figure 4, lists of import and export state variables are registered with the gridded component GC. The functions MAPL_AddImportSpec and MAPL_AddExportSpec respectively register import and export state entries. This registration indicates that the gridded component GC's import and export states will contain specific arrays of data. The arguments to the registration routines include meta-data that will be held with the arrays. For example, the call to MAPL_AddImportSpec registers an array, to be held in the import state of GC, with SHORT_NAME meta-data U. At the SetServices stage, however, memory for the array data is not allocated. Instead, flags passed in to the MAPL_AddXXXXXXSpec routines, called DIMS and VLOCATION, are used to indicate the shape (but not the size) of the array for this import or export. The actual memory allocation for these arrays is deferred until the Initialize() phase, at which point, by convention in MAPL, a simulation's domain extents are known.

3.1.2 Registering child components

Another extension to SetServices that MAPL introduces is the MAPL_AddChild call shown in Figure 4. This call registers "child" components that will become children of the parent component GC. The SetServices functions of the child gridded components C1 and C2 are c1_SetServices and c2_SetServices respectively in Figure 4. These child SetServices routines will look similar to the SetServices of their parent shown in the figure. The registration of child SetServices functions provides the basis for a recursive traversal of a tree like the one shown in Figure 1.

3.1.3 Recursive execution of the SetServices phase

A MAPL form of SetServices finishes with a call to MAPL_GenericSetServices. This call appears at the end of every component's SetServices when using MAPL. It invokes, in the order in which they were registered, the SetServices functions for each child component registered using MAPL_AddChild. The result of these extensions to the standard ESMF SetServices is that, when using MAPL, on return from execution of the SetServices phase an entire tree of components' SetServices phases can have been traversed. The traversal proceeds in a depth-first fashion starting from the top of the tree as drawn in Figure 1. The tree traversal is imposed by the MAPL library, so that the only explicit invocation of a SetServices phase of the style shown in the standard ESMF example in Figure 3 is in user code for the top-level component. Once the tree has been traversed, every component will have registered the arrays it provides in its export state and registered the arrays it requires in its import state. The import and export states of all its descendants are made available to each component at each level of the tree. For example, once the SetServices tree has been traversed, the gridded component variable associated with the RADIATION component in Figure 1 has a set of export state entries that MAPL implicitly "includes" from the child components SOLAR and IR.

In MAPL, the rules regarding propagation of export state entries up the tree of components are used to facilitate automation of large parts of the Initialize(), Run() and Finalize() phases. In standard ESMF, these phases typically require explicit user code from a component developer to handle data flows between components. In MAPL, many aspects can be resolved automatically once a tree of gridded components, with import and export state entries and their associated meta-data, is available.
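The registration-then-traversal pattern can be mocked up compactly. The following hypothetical Python sketch (the names echo MAPL_AddChild and MAPL_GenericSetServices but are stand-ins, not real bindings) shows how ending every SetServices with the generic call yields a depth-first traversal of the whole tree from a single top-level invocation.

```python
class Comp:
    def __init__(self, name):
        self.name, self.children, self._pending = name, [], []

def MAPL_AddChild(gc, name, ss):
    # registers a child SetServices function; nothing is invoked yet
    gc._pending.append((name, ss))
    return len(gc._pending) - 1          # a child id, as in MAPL

def MAPL_GenericSetServices(gc):
    # invoked at the end of every SetServices: runs each registered
    # child's SetServices in order, giving a depth-first tree traversal
    for name, ss in gc._pending:
        child = Comp(name)
        ss(child)
        gc.children.append(child)

visited = []

def leaf_ss(gc):
    visited.append(gc.name)
    MAPL_GenericSetServices(gc)          # no children registered: a no-op

def root_ss(gc):
    visited.append(gc.name)
    MAPL_AddChild(gc, "C1", leaf_ss)
    MAPL_AddChild(gc, "C2", leaf_ss)
    MAPL_GenericSetServices(gc)

root = Comp("ROOT")
root_ss(root)                            # one call traverses the whole tree
print(visited)                           # ['ROOT', 'C1', 'C2']
```

The design point the sketch illustrates is that the parent never needs to know how deep its subtree is; the generic call at the end of each SetServices carries the recursion.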
3.2 Coupling child import and export state entries in MAPL

The final operation that MAPL introduces into a component's SetServices phase is registering connectivity specifications for its children. An example of this is the MAPL_AddConnectivity() call in Figure 4. This function call registers that the export state entry "X" of component C1 maps to the import state entry "X" of component C2. In MAPL two conventions are followed regarding specifying coupling. The first is that coupling is specified by the parent of a set of children. The second is that component import state entries that are not coupled at the parent level by a MAPL_AddConnectivity() call are passed up the component tree. These import state entries then become part of the import state of the parent. Export state entries of child components also become part of their parent's export state. The preceding connectivity conventions are used to define a rule for expressing coupling between component import and export states that do not connect adjacent components. This rule requires that unsatisfied import state entries at a child level become imports at the parent level, in a recursive manner. A MAPL_AddConnectivity() call is then added at the level in the component tree at which a common ancestor of the components with connecting import and export state entries occurs. For example, as described in section 2.1, one of the import state entries for the component SOLAR in Figure 1 is the atmospheric pressure field, which is an export state entry of the component FVCORE in the figure. This connectivity, between the appropriate export state entry from FVCORE and the appropriate import state entry from SOLAR, is expressed through a MAPL_AddConnectivity() coupling in the SetServices() phase of the component AGCM, the first common ancestor of SOLAR and FVCORE.

3.3 Initialize() semantics in MAPL

Similarly to MAPL_GenericSetServices(), MAPL provides a MAPL_GenericInitialize() function. If a component-specific Initialize() function is not registered in SetServices() then MAPL associates the MAPL_GenericInitialize() function with the component's Initialize phase. In the MAPL_GenericInitialize phase, inheritance rules, together with the import and export state entry registration information from the SetServices phase, are used to check consistency of component-to-component connections, to automatically allocate memory for arrays associated with import and export states, and to automatically instantiate and configure the coupler components needed to map between components' import and export states. This is possible because the tree structure established in the MAPL SetServices phase contains all the information needed to represent the data flow throughout the component hierarchy. In MAPL, every MAPL component calls MAPL_GenericInitialize(). When MAPL_GenericInitialize() is executed it will call the Initialize() functions of the child components that were registered through calls to MAPL_AddChild.

3.3.1 Allocation of registered import and export state entries

At runtime, import and export state array entries that were registered in SetServices have memory allocated in MAPL_GenericInitialize(). The size of the arrays allocated is based on two things: (i) the shape flags that were set during MAPL_AddXXXXXXSpec() calls and (ii) the actual grid on which the gridded component is running. An example of a shape flag is the parameter MAPL_DimsHorzVert in Figure 4. This flag is used to define the shape (i.e. one-dimensional, two-dimensional horizontal, three-dimensional) of the array with respect to a grid that is defined during the Initialize() phase. The exact size of the grid, however, is not set until the Initialize() phase. In MAPL each gridded component has a grid associated with it. This grid is called the component's "natural grid", and it is the one used to determine the size of the arrays that are allocated for the import and export state entries registered in SetServices. If a component does not set a grid during Initialize() it will inherit one from its parent gridded component.

3.3.2 Recursive execution of the Initialize phase

By (i) requiring that all MAPL components call MAPL_GenericInitialize(), (ii) recording child components with their parent during SetServices, and (iii) having a component's MAPL_GenericInitialize() call its child components' Initialize functions, a MAPL code can execute the Initialize cycle for all the components in a tree such as the one in Figure 1 in sequence (depth first) as soon as the top-level component calls its Initialize method. This means that, on return from the top-level component's Initialize phase, all the import and export state arrays that were previously registered (in the SetServices phase) have been allocated. Component developer code that must be written is limited to special initialization operations that are not general enough to be easily automated. Most of the other generic aspects of initializing ESMF components are handled automatically by MAPL. In non-MAPL ESMF codes this initialization cycle must be hand coded explicitly.
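The division of labor between the two phases, with shapes recorded at SetServices time and memory allocated at Initialize time on an inherited or locally set grid, can be sketched as follows. The Python names here are hypothetical stand-ins for MAPL_AddImportSpec and MAPL_GenericInitialize, not the real API.

```python
def make_comp(grid=None):
    return {"grid": grid, "specs": [], "imports": {}, "children": []}

def add_import_spec(comp, name, dims):
    # analogue of MAPL_AddImportSpec: records a shape flag, allocates nothing
    comp["specs"].append((name, dims))

def generic_initialize(comp, parent_grid=None):
    # analogue of MAPL_GenericInitialize: fix the grid (inherited from the
    # parent if not set locally), allocate every registered entry, then
    # recurse depth-first into the children
    comp["grid"] = comp["grid"] or parent_grid
    nx, ny, nz = comp["grid"]
    for name, dims in comp["specs"]:
        n = nx * ny * nz if dims == "HorzVert" else nx * ny
        comp["imports"][name] = [0.0] * n      # allocation happens here
    for child in comp["children"]:
        generic_initialize(child, comp["grid"])

root = make_comp(grid=(4, 3, 2))               # root sets its "natural grid"
child = make_comp()                            # child sets no grid of its own
root["children"].append(child)
add_import_spec(child, "U", "HorzVert")        # shape known, size not yet

generic_initialize(root)
print(child["grid"], len(child["imports"]["U"]))   # (4, 3, 2) 24
```

Deferring allocation this way is what lets an entire tree of specs be declared before any domain extents exist, mirroring the MAPL convention that sizes are only known by Initialize time.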
3.4 Run() and Finalize() semantics in MAPL

The component Run() and Finalize() phases in MAPL follow a recursive strategy similar to that of SetServices() and Initialize(). A MAPL component that does not register a Run phase function will be given the default run phase function MAPL_GenericRun(). Any MAPL component that supplies its own Run() function must call MAPL_GenericRun() from that function. The MAPL_GenericRun() function runs a component's child components (i.e. calls the child components' Run phase functions) that were registered in the SetServices phase using the MAPL_AddChild() function. The recursion rule and automatic default rule also apply to the Finalize phase.

3.5 Coupler components in MAPL

As discussed in sections 2.1 and 2.2 and illustrated in Figure 3, in standard ESMF a developer is expected to provide coupler components (components of type ESMF_CplComp) as well as gridded components (components of type ESMF_GridComp). The role of coupler components is to map export state entries of one component to import state entries of another component. This operation is automated in MAPL. Connectivity between pairs of component import and export state entries is registered with MAPL_AddConnectivity during the components' SetServices phase. This information is used during the recursive traversal of the component tree's Initialize and Run phases to automatically create and execute coupler components that carry out the registered couplings.

3.6 Summary of MAPL semantics

Table 1 summarizes the actions that MAPL introduces to the SetServices, Initialize, Run and Finalize phases of a standard ESMF component in order to make it a MAPL component. As can be seen, the biggest change is in the amount of functionality that is added to the standard ESMF SetServices phase. The extra declarative code in SetServices allows aspects of the Initialize, Run and Finalize cycle that would be explicitly hand-coded in a non-MAPL ESMF code to be automated.

4. A simple MAPL example

The MAPL software distribution includes a simple "Held-Suarez" multi-component atmospheric simulation application. This application has a small hierarchy of components, illustrated in Figure 5. The bulk of the numerical computations take place in the components GEOS_HSGridComp and Fvdycore_GridComp. These components implement respectively (i) thermodynamic forcing terms [8] that describe an approximate rate at which the atmosphere moves toward radiative equilibrium as a function of spatial location and instantaneous atmospheric state and (ii) fluid flow equations of motion [12] in response to internal fluid state and subject to the thermodynamic forcing calculated by the GEOS_HSGridComp component. The GEOS_AgcmSimpl_GridComp component connects the two computational components together. In the following we briefly describe the MAPL aspects of these components.

Figure 5. Hierarchy of the simple atmospheric dynamics + physics component MAPL example.

4.1 MAPL in GEOS_AgcmSimpl_GridComp

In this application, the GEOS_AgcmSimpl_GridComp component declares connectivity between import and export state entries of its children, GEOS_HSGridComp and Fvdycore_GridComp.

4.1.1 SetServices phase

In its SetServices phase GEOS_AgcmSimpl_GridComp includes the calls

call MAPL_AddConnectivity ( GC,                  &
     SHORT_NAME = (/ 'DUDT', 'DVDT', 'DTDT' /),  &
     SRC_ID = PHS, DST_ID = DYN, RC=STATUS )

call MAPL_AddConnectivity ( GC,                  &
     SRC_NAME=(/'U   ','V   ','T   ','PLE ' /),  &
     DST_NAME=(/'U   ','V   ','TEMP','PLE ' /),  &
     SRC_ID=DYN, DST_ID=PHS, RC=STATUS )

These connections express (i) mapping of the GEOS_HSGridComp export state entries (DUDT, DVDT and DTDT) to the Fvdycore_GridComp import state and (ii) mapping of the Fvdycore_GridComp export state entries (U, V, T and PLE) to the GEOS_HSGridComp import state.

4.1.2 Initialize, Run and Finalize phases

The other phases of GEOS_AgcmSimpl_GridComp use MAPL defaults. This ensures that GEOS_AgcmSimpl_GridComp will call its child components' Initialize, Run and Finalize phases and will create and execute the needed coupler components. There is no need for any component developer driver and glue code for GEOS_AgcmSimpl_GridComp except for the SetServices phase. In a non-MAPL ESMF component, the component developer would need to provide significant explicit code for each of the other phases.
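The connectivity semantics described in sections 3.2 and 4.1, where a parent declares couplings between its children and imports left unsatisfied at one level bubble up to become imports of the parent, can be sketched abstractly. This is hypothetical Python, with the triples standing in for MAPL_AddConnectivity's SHORT_NAME/SRC_ID/DST_ID arguments rather than the real interface.

```python
def resolve(parent, connectivities):
    # connectivities: (name, src_child, dst_child) triples declared by the
    # parent, analogous to MAPL_AddConnectivity(SHORT_NAME, SRC_ID, DST_ID)
    connected = set()
    for name, src, dst in connectivities:
        dst["imports"][name] = src["exports"][name]   # the automatic coupler
        connected.add((id(dst), name))
    # the rule from section 3.2: imports not satisfied at this level
    # bubble up and become imports of the parent itself
    for child in parent["children"]:
        for name in child["needs"]:
            if (id(child), name) not in connected:
                parent["needs"].append(name)

# PHS needs U and PLE; only U is connected at this level, so PLE becomes
# an import of the parent AGCM, to be satisfied by *its* parent in turn
phs  = {"needs": ["U", "PLE"], "imports": {}, "exports": {"DUDT": 1.5}}
dyn  = {"needs": ["DUDT"],     "imports": {}, "exports": {"U": 7.0}}
agcm = {"needs": [], "imports": {}, "exports": {}, "children": [phs, dyn]}

resolve(agcm, [("DUDT", phs, dyn), ("U", dyn, phs)])
print(dyn["imports"]["DUDT"], phs["imports"]["U"], agcm["needs"])
# 1.5 7.0 ['PLE']
```

Applied recursively up a tree, this is why a coupling only ever needs to be declared once, at the first common ancestor of the producing and consuming components.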
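MAPL's declarative specs (such as the MAPL_AddImportSpec call shown for GEOS HSGridComp below) can likewise drive a generic Initialize. The following Python analogue is purely illustrative; add_import_spec and generic_initialize are invented names, not the MAPL implementation:

```python
# Sketch: specs declared up front; a generic initialize allocates a
# default-filled 2-D field for each one. All names are hypothetical.

specs = []

def add_import_spec(short_name, units, default, shape):
    specs.append({"name": short_name, "units": units,
                  "default": default, "shape": shape})

def generic_initialize():
    # Turn every declared spec into an allocated, default-filled
    # field, with no hand-written allocation code per component.
    state = {}
    for spec in specs:
        nx, ny = spec["shape"]
        state[spec["name"]] = [[spec["default"]] * ny
                               for _ in range(nx)]
    return state

# Declarative phase, loosely mirroring a MAPL_AddImportSpec call:
add_import_spec("V", units="m s-1", default=0.0, shape=(4, 3))
imports = generic_initialize()
```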
4.2 MAPL in GEOS HSGridComp

As with most computational kernels, GEOS HSGridComp requires component-developer code for each of SetServices, Initialize and Run.

4.2.1 SetServices phase

SetServices registers import and export entries. For example,

    call MAPL_AddImportSpec(GC, SHORT_NAME = 'V',        &
         LONG_NAME = 'northward_wind', UNITS = 'm s-1',  &
         DEFAULT = 0.0, DIMS = MAPL_DimsHorzVert,        &
         VLOCATION = MAPL_VLocationCenter, RC=STATUS )

registers an import state entry called V.

4.2.2 Initialize phase

The Initialize phase of GEOS HSGridComp makes use of the "internal state" that MAPL supports. A MAPL internal state is a container type similar to a standard ESMF import or export state, except that it is not passed explicitly across the Initialize, Run and Finalize function interfaces. The internal state variable for a given component is stored with the gridded component variable by MAPL (using the function ESMF_UserCompSetInternalState() described in section 3.1). MAPL internal state entries are registered in SetServices using the function MAPL_AddInternalSpec. Entries registered in the MAPL internal state are accessed from user code through the MAPL-supplied context information associated with the gridded component variable. Code fragments that access a two-dimensional array SPHI2 registered in the internal state of GEOS HSGridComp are shown in Figure 6; they are taken from the GEOS HSGridComp Initialize phase.

    subroutine Initialize(GC, IMP, EXP, CLOCK, RC)
      type(ESMF_GridComp), intent(inout) :: GC
      :
      type (MAPL_MetaComp), pointer :: MAPL
      type (ESMF_State)             :: INTERNAL
      real, pointer                 :: SPHI2(:,:)
      :
      call MAPL_GetObjectFromGC(GC, MAPL, RC=STATUS)
      call MAPL_Get(MAPL, INTERNAL_ESMF_STATE=INTERNAL, &
                    RC=STATUS)
      call MAPL_GetPointer(INTERNAL, SPHI2, 'SPHI2',    &
                           RC=STATUS)

Figure 6. Internal state access in MAPL.

4.2.3 Run phase

The Run phase for GEOS HSGridComp contains the mathematical relations that calculate the terms in the Held-Suarez radiative equilibrium formula [8]. These calculations use data that are held in variables registered in the SetServices phase. As with the internal state example in Figure 6, these variables are accessed using MAPL query functions. For example, the code sequence

    ! Pointers to imports
    call MAPL_GetPointer(IMP, PLE , 'PLE' , RC=STATUS)
    ! Pointers to exports
    call MAPL_GetPointer(EXP, DUDT, 'DUDT', RC=STATUS)

is used at the start of the Run phase of GEOS HSGridComp to set up access to the import and export state entries.

4.2.4 Finalize phase

There is no need for an explicit, component-developer-written Finalize phase for GEOS HSGridComp. This component only uses MAPL-based data structures, so MAPL is able to carry out any tidying needed at the Finalize phase and the component relies on the default MAPL GenericFinalize function.

4.3 MAPL in FVdycore GridComp

In contrast to GEOS HSGridComp, the FVdycore GridComp component uses its own data types to represent its internal state. A pointer to a custom data type, defined in the component code, is stored with the gridded component. MAPL and ESMF operations using the custom data type then appear in the SetServices, Initialize, Run and Finalize phases. This approach is taken in FVdycore GridComp because it has its own internal data organization.

4.3.1 SetServices phase

There are standard calls to MAPL_AddImportSpec and MAPL_AddExportSpec in the SetServices phase. However, there are no calls to MAPL_AddInternalSpec. Instead, a pointer to a special data type T_FVDYCORE_STATE is declared, allocated and attached to the gridded component using the function ESMF_UserCompSetInternalState.

4.3.2 Initialize, Run and Finalize phases

In the FVdycore GridComp Initialize, Run and Finalize phases, the pointer to the T_FVDYCORE_STATE variable allocated in SetServices is retrieved using the ESMF_UserCompGetInternalState function.

4.4 Time stepping

The time-stepping of an application tree in MAPL is driven from the root of the tree. Both Figure 1 and Figure 5 have a component called CAP at their root. This component is responsible for ticking a heartbeat "clock" signal that is stepped forward at a higher frequency than any of the child time-steps. At each clock tick CAP calls the Run phase of its children, which causes the whole tree to be traversed. Components then individually decide whether to carry out their own internal computations. The CAP component also launches the root SetServices, Initialize and Finalize phases.
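The internal-state idiom used by both GEOS HSGridComp and FVdycore GridComp, a private object attached to the gridded component in one phase and retrieved by later phases, can be sketched as follows. This is a hypothetical Python analogue of the ESMF_UserCompSetInternalState / ESMF_UserCompGetInternalState pair; GridComp and its methods are invented for illustration:

```python
# Sketch: attaching and retrieving a component-private state object.

class GridComp:
    """Stand-in for a gridded component that can carry opaque,
    component-defined state between phases."""
    def __init__(self):
        self._internal = {}

    def set_internal_state(self, key, state):   # SetServices-like
        self._internal[key] = state

    def get_internal_state(self, key):          # later phases
        return self._internal[key]

gc = GridComp()
# SetServices-like phase: allocate and attach a custom state type
# (the contents here are illustrative only).
gc.set_internal_state("T_FVDYCORE_STATE", {"dt": 1800.0, "step": 0})
# Initialize/Run/Finalize-like phases: retrieve the same object.
state = gc.get_internal_state("T_FVDYCORE_STATE")
state["step"] += 1
```

The benefit is that the state travels with the component rather than through argument lists, so the phase interfaces stay uniform across the whole tree.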
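The heartbeat scheme of section 4.4 can be sketched as follows (illustrative Python; the drive function and the timestep values are assumptions, not MAPL code): the root ticks a clock finer than any child timestep, and each component runs only on ticks that land on a multiple of its own timestep.

```python
# Sketch: CAP-style heartbeat driving children with different timesteps.

def drive(timesteps, heartbeat, stop):
    """timesteps: seconds per component; returns run counts after
    ticking a heartbeat clock from 0 to stop seconds."""
    runs = {name: 0 for name in timesteps}
    t = 0
    while t < stop:
        t += heartbeat                  # root ticks the heartbeat clock
        for name, dt in timesteps.items():
            if t % dt == 0:             # each child decides whether
                runs[name] += 1         # to do real work on this tick
    return runs

# Two-hour window: a 1800 s atmosphere and a 3600 s ocean under a
# 900 s heartbeat (all values illustrative).
runs = drive({"atmosphere": 1800, "ocean": 3600},
             heartbeat=900, stop=7200)
```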
4.5 MAPL history and MAPL IO

There are two other important functions that MAPL provides; both take advantage of the standardization of import and export state rules in MAPL. The first, MAPL History, is a component in its own right (see Figures 1 and 5). Its role is to gather and record requested information (under the control of runtime input parameters) about any import or export state variable in a MAPL application tree. The second is a general I/O layer called MAPL IO, developed by Arlindo da Silva. This layer takes the form of a library of functions included in MAPL that allows easy writing of MAPL import, internal and export state data to an external disk or other end-point.

5. Using MAPL to support multi-model ocean simulation

The approach for incorporating the FVdycore GridComp component into MAPL is being used to extend the Figure 1 system to include multiple ocean models. Multi-model coupling is of scientific interest as a technique to better understand the large uncertainties in simulation results. Building and configuring multi-model coupled systems is, however, typically very technically involved and time consuming. The authors of this paper have been using MAPL to simplify the task of developing a multi-model coupled system with different ocean general circulation models. This has involved incorporating ocean model MAPL components into the tree shown in Figure 1. The details of the configuration and exact approach taken are beyond the scope of this paper. Here we only describe MAPL aspects of the system and show results to illustrate the system in action.

5.1 The ocean models

The ocean models are Poseidon [14], MITgcm [10, 13] and MOM4 [18]. Poseidon, developed at George Mason University, has a density-coordinate-based vertical discretization. MITgcm is developed at the Massachusetts Institute of Technology, and the configuration used here has a height-based vertical coordinate and a cube-sphere mesh. MOM4 is developed at the Geophysical Fluid Dynamics Laboratory and, in this work, uses a normalized pressure vertical coordinate and a tri-polar horizontal mesh. Figure 7 shows the sea-surface temperature from the Poseidon - GEOS-5 coupled system. Figure 8 shows the annual mean sea-surface salinity from the MOM4 - GEOS-5 system. Figure 9 shows the GEOS-5 wind stress and its projection onto the MITgcm cube-sphere mesh.

Figure 7. Poseidon ocean model sea-surface temperature when coupled to the GEOS-5 system using MAPL.

Figure 8. MOM4 sea-surface salinity when coupled to the GEOS-5 system using MAPL.

Figure 9. Surface stress projected from the GEOS-5 mesh (upper) onto the MITgcm cube faces (lower) using MAPL coupling.

5.2 The use of MAPL

Each model is represented as a MAPL gridded component. It defines and allocates in SetServices an internal type that it uses to hold some or all of its internal state. The mapping of fields between the ocean mesh and the GEOS-5 mesh is carried out as part of the coupling step in the common parent of the ocean and GEOS-5 atmosphere components. This mapping and interpolation is hidden from both the atmosphere and ocean components and takes place on a mesh that is the union of the atmosphere and ocean grid cells. The union grid is a mesh whose cells can be aggregated to form either the atmosphere or the ocean grid [3]. The ocean components define the same set of import and export state entries for coupling. Overall, the approach taken at the computational level is almost identical to the approach used for the FVdycore GridComp in section 4.

6. Conclusions

We have described MAPL, a software layer that sits on top of the ESMF component model. It adds a number of powerful abstractions that greatly simplify the use of a general component framework like ESMF and make the construction of applications from hierarchical trees of components robust and clear. The MAPL approach centers on built-in recursion for tree-wide control flow and standard data types to permit reasoning about tree-wide data flow. These concepts
could equally be applied to other HPC component frameworks, to support applications that are naturally decomposed into acyclic trees like Figure 1. The MAPL library fits nicely with ESMF because ESMF provides mechanisms but does not set any policies across components. This focus on mechanisms is an appealing property in a framework, as it does not impose a structure. However, in many programming situations structured approaches can be easier to manage, comprehend and adapt [6]. The MAPL software adds such structure to ESMF by precluding "spaghetti" coupling of components and imposing an acyclic tree structure on applications. The acyclic tree abstraction is a very attractive pattern for many multi-component HPC applications, because it readily lends itself to substitution of portions of the tree, with well-defined consequences.

Acknowledgments

MAPL research and development has been funded by the NASA Modeling and Analysis Project.

References

[1] G. Allen, W. Benger, T. Goodale, H. Hege, G. Lanfermann, A. Merzky, T. Radke, E. Seidel, and J. Shalf. The Cactus code: A problem solving environment for the grid. In Proceedings of the Ninth IEEE International Symposium on High Performance Distributed Computing, HPDC-9, August 1-4 2000, Pittsburgh, pages 253–260. IEEE Computer Society, 2000.

[2] R. Armstrong, D. Gannon, A. Geist, K. Keahey, S. Kohn, L. McInnes, S. Parker, and B. Smolinski. Toward a common component architecture for high-performance scientific computing. In Proc. Conf. High-Performance Distributed Computing, 1999.

[3] V. Balaji, J. Anderson, I. Held, M. Winton, J. Durachta, S. Malyshev, and R. Stouffer. The Exchange Grid: A mechanism for data exchange between Earth System components on independent grids. In Proceedings of Parallel CFD 2005, pages 179–188. Elsevier, 2005.

[4] E. Bruneton, T. Coulpaye, and J. B. Stefani. Recursive and dynamic software composition with sharing. In Seventh International Workshop on Component-Oriented Programming, 2002.

[5] N. Collins, G. Theurich, C. DeLuca, M. Suarez, A. Trayanov, V. Balaji, P. Li, W. Yang, C. Hill, and A. da Silva. Design and implementation of components in the Earth System Modeling Framework. Int. J. High Perform. Comput. Appl., 19(3):341–350, 2005.

[6] E. Dijkstra. Go to statement considered harmful. Communications of the ACM, 11(3):147–148, 1968.

[7] G. Edjlali, A. Sussman, and J. H. Saltz. Interoperability of data parallel runtime libraries. In IPPS '97: Proceedings of the 11th International Symposium on Parallel Processing, pages 451–459, Washington, DC, USA, 1997. IEEE Computer Society.

[8] I. Held and M. Suarez. A proposal for the intercomparison of the dynamical cores of atmospheric general circulation models. Bulletin of the American Meteorological Society, 75(10):1825–1830, 1994.

[9] C. Hill, C. DeLuca, V. Balaji, M. Suarez, and A. da Silva. The architecture of the Earth System Modeling Framework. Computing in Science and Engineering, 6(1):18–28, 2004.

[10] C. Hill and J. Marshall. Application of a parallel Navier-Stokes model to ocean circulation. In Proceedings of Parallel CFD 1995, pages 545–552. Elsevier, 1995.

[11] J. Larson, R. Jacob, and E. Ong. The Model Coupling Toolkit: A new Fortran90 toolkit for building multiphysics parallel coupled models. Int. J. High Perform. Comput. Appl., 19(3):277–292, 2005.

[12] S. Lin. A vertically Lagrangian finite-volume dynamical core for global models. Monthly Weather Review, 132:2293–2307, 2004.

[13] J. Marshall, A. Adcroft, C. Hill, L. Perelman, and C. Heisey. A finite-volume, incompressible Navier-Stokes model for studies of the ocean on parallel computers. Journal of Geophysical Research, 102(C3):5753–5766, 1997.

[14] P. Schopf. The Poseidon web site. http://climate.gmu.edu/poseidon/, 2007.

[15] The ESMF developers. The ESMF web site. http://www.esmf.ucar.edu/, 2007.

[16] The ESMF Joint Specification Team. The ESMF User Guide and The ESMF Reference Guide. http://www.esmf.ucar.edu/, 2007.

[17] The MAPL developers. The MAPL web site. http://maplcode.org/, 2007.

[18] The MOM4 developers. The MOM4 web site. http://www.gfdl.noaa.gov/fms/pubrel/l/mom4/doc/mom4_manual.html, 2007.