XVII Brazilian Symposium on Formal Methods (SBMF'2013)
In: III Brazilian Conference on Software: Theory and Practice (CBSOFT'2013)
Derivation and Verification of Parallel Components for the Needs of an HPC Cloud
Thiago Braga Marcilon
Francisco Heron de Carvalho Junior
ParGO Research Group
Pós-Graduação em Ciência da Computação (MDCC/UFC)
Universidade Federal do Ceará
Fortaleza/CE, Brazil
Topics
● Context and motivations (HPC Storm);
● Goals of this study;
● The system of formal contracts of HPC Storm;
● Contract-based formal derivation process;
● Case studies on derivation of parallel code using Circus/HCL;
● Conclusions.
SBMF'2013 Brasília/DF, Brazil
Context and Motivations
HPC Storm
HPC Storm is HPC in clouds through components.
[Figure: HPC Storm at the meeting point of three areas: components (CBHPC: CCA, Hash, Fractal, GCM), HPC applications (computational sciences, engineering, …) and clouds. HPC combined with clouds gives "HPC in clouds"; components combined with clouds gives "CBSE in clouds".]
Context and Motivations
HPC Storm
● Services:
– IaaS (infrastructure), comprising parallel computing platforms;
– PaaS (platform), for developing components and applications that may exploit the potential performance of these parallel computing platforms;
– SaaS (software): applications built from components, for serving HPC users.
● Stakeholders:
– Domain specialists;
– Application providers;
– Component developers;
– Platform maintainers.
● Architecture:
– Front-End / Core / Back-End.
[Figure: HPC Storm architecture and stakeholders.
– Specialists (final users) use applications, offered by the Front-End (SaaS);
– Providers build applications;
– Developers build components, kept in the Core (PaaS);
– Maintainers manage the infrastructure of the Back-End (IaaS);
– Applications are built from components; the Back-End includes parallel computing platforms.]
Context and Motivations
● Hash Component Model
– A model of parallel components;
– Targeted at distributed-memory parallel computing platforms;
– Units + overlapping composition + component kinds;
– Makes it possible to isolate parallelism concerns inside components;
● Hash Programming Environment (HPE)
– http://hash-programming-environment.googlecode.com
– Reference implementation of Hash, for cluster computing platforms;
● Hash Type System (HTS)
– Discovery and binding of components according to assumptions about the execution context (application + target parallel computing platform);
– Contextual abstraction, abstract components and instantiation types;
– Which component is the best for a given context (instantiation type)?
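The question above can be read as a lookup: given a context, pick the most specialised catalogued component whose assumptions the context satisfies. The following is a minimal sketch of that reading only. It assumes, hypothetically, that a context is a flat set of key/value pairs and that "best" means "matches the most assumptions"; the names `Component`, `select_component` and the catalog entries are illustrative and are not part of HTS or HPE.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Component:
    """A concrete component and the context assumptions it was tuned for."""
    name: str
    assumptions: Dict[str, str] = field(default_factory=dict)

    def accepts(self, context: Dict[str, str]) -> bool:
        # Every assumption made by the component must hold in the context.
        return all(context.get(k) == v for k, v in self.assumptions.items())


def select_component(catalog: List[Component],
                     context: Dict[str, str]) -> Optional[Component]:
    """Return the accepted component with the most assumptions, if any."""
    candidates = [c for c in catalog if c.accepts(context)]
    if not candidates:
        return None
    # Assumption (simplification): more assumptions = more specialised.
    return max(candidates, key=lambda c: len(c.assumptions))


# Hypothetical catalog entries, for illustration only.
catalog = [
    Component("sort_generic"),
    Component("sort_cluster", {"platform": "cluster"}),
    Component("sort_cluster_int", {"platform": "cluster", "key_type": "int"}),
]
context = {"platform": "cluster", "key_type": "int", "nodes": "64"}
chosen = select_component(catalog, context)
print(chosen.name if chosen else "no component found")
```

A real type system would order instantiation types by subtyping rather than by counting matched keys; the count is used here only to keep the sketch short.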
HPC Storm
The System of Formal Contracts
● Each application orchestrates a set of available components to find (computational) solutions for problems in a specific domain of interest;
● Components may be built from (overlapping) composition of other components, defining dependencies;
● A component dependency is described by a contract;
● The contract (of a software component) specifies:
– The assumptions of the component about the execution environment (context = application + parallel computing platform);
– The computational task performed by the component;
– A specification of how the component performs its task;
– The guarantees of the component about its performance (QoS).
[Callout: Hash HTS extension]
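The four parts of a contract listed above can be pictured as a plain record. This is a sketch under stated assumptions, not the HPC Storm representation: the class name `Contract`, its field names, the key/value encoding of contexts and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Contract:
    """The four parts of a component contract (field names are illustrative)."""
    assumptions: Dict[str, str]   # execution context: application + platform
    task: str                     # what the component computes
    specification: str            # how it computes it (e.g. a Circus/HCL text)
    guarantees: Dict[str, float]  # performance (QoS) guarantees

    def holds_in(self, context: Dict[str, str]) -> bool:
        """True when the context meets every assumption of the contract."""
        return all(context.get(k) == v for k, v in self.assumptions.items())


# Hypothetical contract for an integer-sorting component.
is_contract = Contract(
    assumptions={"platform": "cluster", "memory": "distributed"},
    task="sort a sequence of integer keys",
    specification="<Circus/HCL specification text would go here>",
    guarantees={"max_seconds": 120.0},
)

print(is_contract.holds_in({"platform": "cluster", "memory": "distributed"}))
print(is_contract.holds_in({"platform": "cluster", "memory": "shared"}))
```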
The problem we are addressing in this paper
HPC Storm
The System of Formal Contracts
[Figure: an abstract component and its contract. Labels: contract; implementation assumptions (contextual abstraction), which include platform assumptions and performance requirements; component #2; Functionality (what it does); Algorithms (how it does it), influenced by the implementation assumptions; Behaviour (when it does it).]
HPC Storm
The System of Formal Contracts
[Figure: the Front-End (application), the Core (component catalog, holding the contracts of components) and the Back-End (platform1, platform2, platform3).]
Context and Motivations
HPC Storm – Component Certification
● Specialists demand correctness and performance obligations on component orchestrations from providers:
– Avoiding the high costs of unexpected errors and performance bottlenecks in long-running intensive computations;
● In turn, providers demand correctness and performance obligations on component implementations from developers;
● How to certify components in the cloud under HPC assumptions?
– In this paper, we are interested in the problem of how to certify that a component performs the computation specified in its contract;
– Performance concerns are still addressed only loosely.
Goals of this Study
● Propose a certification process for components in HPC Storm:
– With some emphasis on HPC assumptions;
– Parallel components of the Hash component model;
– Circus specification language for describing component contracts;
● F. H. de Carvalho Junior; R. D. Lins (2010) “Compositional Specification of Parallel Components Using Circus”, Electronic Notes in Theoretical Computer Science, vol. 260, n. 1, pages 47-72. (FACS'2008 proceedings)
● Evaluate the feasibility of deriving parallel code from refinement and translation of contracts written in an extension of Circus:
– Realistic case studies require: 1) an informal description (“pencil-and-paper”), from which a contract is derived; 2) an existing tuned implementation, built by professional HPC programmers;
– NAS Parallel Benchmarks (NPB).
HPC Storm
Contract-Based Certification Process
[Figure: derivation pipeline. abstract component + implementation assumptions (context) → contract → (refinement step) → abstract specification → (refinement step) → abstract specification → (refinement step) → concrete specification → (translation) → source code.]
– Refinement: refining towards a specification of the appropriate algorithms for the context;
– Translation: translating towards the best techniques for implementing the algorithms, taking advantage of particular features of the target parallel computing platform.
HPC Storm
Contract-Based Certification Process
[Figure: the Front-End (application) holds a component contract; the Core (component catalog) holds abstract components, their contracts, and components made of a concrete specification and source code.]
(1) the component is derived by refinement and translation from a contract of a catalogued abstract component;
(2) the application demands a component described by a contract of the given abstract component;
(3) for running, the application asks the Core for a component that implements the contract;
(4) if a component is found, the concrete specification must be matched against the abstract component specification, to ensure that it was derived by refinement rules.
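Steps (1) to (4) above can be sketched as a catalog lookup followed by a check of the recorded derivation. This is only an illustration under loud assumptions: the rule names in `KNOWN_RULES` are invented, the classes `DerivedComponent` and `Core` are hypothetical, and a membership test on rule names stands in for the proof obligations that an actual theorem prover would discharge.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical rule names; a real system would check each step with a prover.
KNOWN_RULES = {"split-loop", "introduce-parallelism", "data-refinement"}


@dataclass
class DerivedComponent:
    contract_id: str               # contract the derivation started from
    steps: List[Tuple[str, str]]   # (rule applied, resulting specification)
    source_code: str               # result of the final translation


class Core:
    """Toy component catalog indexed by contract identifier."""

    def __init__(self) -> None:
        self._catalog: Dict[str, DerivedComponent] = {}

    def register(self, component: DerivedComponent) -> None:
        # Step (1): a derived component is added to the catalog.
        self._catalog[component.contract_id] = component

    def resolve(self, contract_id: str) -> Optional[DerivedComponent]:
        # Step (3): the application asks for a component for this contract.
        component = self._catalog.get(contract_id)
        if component is None or not component.steps:
            return None
        # Step (4): accept only derivations made of known refinement rules.
        if any(rule not in KNOWN_RULES for rule, _ in component.steps):
            return None
        return component


core = Core()
core.register(DerivedComponent(
    "IS",
    [("data-refinement", "spec 1"), ("introduce-parallelism", "spec 2")],
    "// translated code",
))
core.register(DerivedComponent("CG", [("hand-edit", "spec 1")], "// code"))

print(core.resolve("IS") is not None)   # derivation uses only known rules
print(core.resolve("CG") is not None)   # "hand-edit" is not a known rule
```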
HPC Storm
Contract-Based Certification Process
● We propose Circus for specification of abstract components;
– Circus is a synergetic combination of Z, CSP and Dijkstra's guarded commands for formal specification of concurrent programs;
● Why Circus?
– It supports concurrency and, as a consequence, parallelism;
– It separates behaviour (CSP) and functional (Z) concerns;
● Z may specify functional tasks (actions) of components;
● CSP may specify orchestration of component actions towards a goal;
– It supports refinement!
– There are practical experiences with tools for verification (ProofPower-Z), refinement (CRefine) and automatic code generation (JCircus);
– In a previous work (FACS'2008), we proposed Circus/HCL, an extension of Circus for specification of parallel components in HPE.
Circus/HCL
● HCL = Hash Configuration Language
– An architecture description and configuration language for composition and orchestration of parallel components in HPE;
● HCL + Circus: what Circus may offer to HCL:
– A language for specification of parallel computations performed by components of the Hash component model in HPE;
– Circus/HCL incorporates HTS (contextual abstraction);
● Circus + HCL: what HCL may offer to Circus:
– A mechanism for (overlapping) composition of Circus specifications describing parallel computations.
Circus/HCL
[Figure: Circus/HCL example with the components dot_product and VecVecProduct.]
Case Studies
NAS Parallel Benchmarks (NPB)
● Evaluate the performance of high-end parallel computing platforms for CFD (Computational Fluid Dynamics) code;
● 8 original benchmarks:
– 5 kernels: EP, IS, CG, MG, FT;
– 3 pseudo-applications: SP, BT, LU;
● Standard workload sizes (problem classes): S, W, A, B, C, D, E, F, …
● Informal specifications (“pencil-and-paper”) that must be implemented to exploit the features of the target parallel computing platforms;
● Reference implementations developed by HPC specialists:
– Many versions: 1.x, 2.x, 3.x;
– Different parallel programming platforms:
● MPI, OpenMP, HPF, Globus, Java.
Case Studies
IS and CG
● IS (Integer Sorting)
– Symbolic computation (memory intensive);
– Bucketsort algorithm;
– Reference implementation in C/MPI;
● CG (Conjugate Gradient)
– Numeric computation (floating-point intensive);
– Applies the inverse power iteration method to find the lowest eigenvalue of a sparse positive-definite symmetric matrix;
– Reference implementation in Fortran/MPI;
● Circus/HCL specifications have been derived for both kernels, from the “pencil-and-paper” informal descriptions:
– IS and CG contracts, aimed at refinement and translation towards C#.
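The inverse power iteration described for the CG kernel above can be sketched as follows. This is a small, sequential, dense-matrix illustration and not the NPB reference code or the derived C# component: the function names, the starting vector, the iteration counts and the tolerance are assumptions chosen for the example, and no eigenvalue shift is applied.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(a: Matrix, x: Vector) -> Vector:
    """Dense matrix-vector product (a real kernel would use sparse storage)."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(x: Vector, y: Vector) -> float:
    return sum(xi * yi for xi, yi in zip(x, y))


def conjugate_gradient(a: Matrix, b: Vector, iters: int = 25,
                       tol: float = 1e-24) -> Vector:
    """Approximately solve a z = b for a symmetric positive-definite a."""
    z = [0.0] * len(b)
    r = list(b)            # residual b - a z, with z = 0 initially
    p = list(r)            # search direction
    rho = dot(r, r)
    for _ in range(iters):
        if rho <= tol:     # squared residual norm is small enough
            break
        q = matvec(a, p)
        alpha = rho / dot(p, q)
        z = [zi + alpha * pi for zi, pi in zip(z, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rho_new = dot(r, r)
        beta = rho_new / rho
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rho = rho_new
    return z


def lowest_eigenvalue(a: Matrix, outer_iters: int = 40) -> float:
    """Inverse power iteration: repeatedly solve a z = x and renormalise."""
    n = len(a)
    # Assumed starting vector; it must not be orthogonal to the eigenvector
    # of the lowest eigenvalue.
    x = [float(i + 1) for i in range(n)]
    norm = math.sqrt(dot(x, x))
    x = [xi / norm for xi in x]
    estimate = 0.0
    for _ in range(outer_iters):
        z = conjugate_gradient(a, x)
        # x has unit norm, so x.z approximates 1 / (lowest eigenvalue).
        estimate = 1.0 / dot(x, z)
        norm = math.sqrt(dot(z, z))
        x = [zi / norm for zi in z]
    return estimate


# The matrix [[2, 1], [1, 2]] has eigenvalues 1 and 3.
print(lowest_eigenvalue([[2.0, 1.0], [1.0, 2.0]]))
```

Each outer step costs one linear solve, which is why the conjugate gradient solver dominates the running time of this kind of kernel.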
Case Studies
Derivation of Parallel Code using Circus/HCL
[Figure: IS contract (Bucketsort) in Circus/HCL, annotated with its three parts: state, actions and protocol.]
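The bucketsort named in the IS contract can be sketched sequentially as follows. This is a plain illustration of the algorithm, not the Circus/HCL contract or the C/MPI reference implementation; the function name `bucket_sort` and its parameters are hypothetical. In a distributed-memory setting, each bucket would typically be owned by one process and the distribution loop would become a key exchange between processes.

```python
from typing import List


def bucket_sort(keys: List[int], max_key: int, num_buckets: int) -> List[int]:
    """Sort integers in [0, max_key) by splitting the key range into buckets."""
    if num_buckets <= 0 or max_key <= 0:
        raise ValueError("num_buckets and max_key must be positive")
    # Ceiling division: number of distinct key values covered by each bucket.
    width = -(-max_key // num_buckets)
    buckets: List[List[int]] = [[] for _ in range(num_buckets)]
    # Distribution phase: each key goes to the bucket owning its sub-range.
    for key in keys:
        if not 0 <= key < max_key:
            raise ValueError(f"key {key} outside [0, {max_key})")
        buckets[key // width].append(key)
    # Local sort phase, then concatenation in bucket order.
    result: List[int] = []
    for bucket in buckets:
        result.extend(sorted(bucket))
    return result


print(bucket_sort([5, 3, 9, 0, 7, 3], max_key=10, num_buckets=4))
```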
Case Studies
Derivation of Parallel Code using Circus/HCL
[Figure: Circus → (refinement + translation) → C#.]
Case Studies
Derivation of Parallel Code using Circus/HCL
[Figure: IS contract → (refine) → IS concrete specification → (translate) → IS C# code. The IS component is verified against the IS contract.]
Case Studies
Derivation of Parallel Code using Circus/HCL
[Figure: CG contract → (refine) → CG concrete specification → (translate) → CG C# code. The CG component is verified against the CG contract.]
See the details of the refinement and translation steps in the paper.
Conclusions
● The specification (from the “pencil-and-paper” description), refinement and translation processes were time-consuming and error-prone:
– High mathematical skills are necessary;
– This reinforces the need to develop tools for guiding the code derivation process;
● It is possible to choose an appropriate sequence of refinement and translation rules for tuning the component performance:
– In the case of IS and CG, we have used rectangular arrays, instead of jagged ones, and ordered loops for improving data locality;
● Common assumptions of HPC programmers;
– The reference implementations of IS and CG were useful as a baseline;
– Question: is it feasible to systematically guide the application of refinement and translation rules according to context in a semi-automatic derivation tool?
● Language abstractions and syntactic sugar on Circus for helping translation towards HPC code:
– Ex: type Array_k(T) ≡ ℕ^k → T, where _[_,_,...,_] ≡ Array_k(T) × ℕ^k → T (indexing);
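The array abstraction in the example above, read as a function from k indices to a value, matches the rectangular arrays and ordered loops mentioned in these conclusions. The following sketch shows only the index arithmetic. The class `RectArray` is hypothetical, row-major order is an assumption, and Python lists give no real control over memory layout, so no locality gain should be expected from this code itself.

```python
from typing import Generic, List, Sequence, TypeVar

T = TypeVar("T")


class RectArray(Generic[T]):
    """k-dimensional rectangular array stored in one flat, row-major list."""

    def __init__(self, shape: Sequence[int], fill: T) -> None:
        self.shape = tuple(shape)
        size = 1
        for extent in self.shape:
            size *= extent
        self._data: List[T] = [fill] * size

    def _offset(self, index: Sequence[int]) -> int:
        if len(index) != len(self.shape):
            raise IndexError("wrong number of indices")
        offset = 0
        for i, extent in zip(index, self.shape):
            if not 0 <= i < extent:
                raise IndexError("index out of range")
            # Row-major order: the last index varies fastest.
            offset = offset * extent + i
        return offset

    def __getitem__(self, index: Sequence[int]) -> T:
        return self._data[self._offset(index)]

    def __setitem__(self, index: Sequence[int], value: T) -> None:
        self._data[self._offset(index)] = value


a = RectArray((2, 3), 0)
# Ordered loops: the innermost loop runs over the last index, so successive
# accesses touch neighbouring positions of the flat storage.
for i in range(2):
    for j in range(3):
        a[i, j] = 10 * i + j
print(a[1, 2])
```

A jagged array, by contrast, would be a list of separately allocated rows, so moving from one row to the next may jump to an unrelated memory region.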
Conclusions
● This work has collected evidence about the feasibility of using formal methods of specification and derivation for HPC software, in the context of the HPC Storm project;
● Also, it outlines a process of certified software development for the needs of HPC Storm;
● (Challenging) further work:
– Performance comparison of derived code and reference implementations (journal version?);
● The experimental methodology must be rigorous to obtain useful evidence;
– Investigate how to use contextual abstraction for guiding code derivation;
– Incorporate the certification process in the HPC Storm implementation;
● The Core (component catalog) must manage the component code, their specifications, and contract matching on top of an existing theorem prover;
● The component developer's Front-End must support specification and semi-automatic derivation of code.