Knowledge-Based Resource Management for Distributed Problem Solving

Sergey Kovalchuk, Aleksey Larchenko, and Alexander Boukhanovsky

e-Science Research Institute, National Research University ITMO, Saint-Petersburg, Russian Federation
[email protected], [email protected], [email protected]

Abstract. A knowledge-based approach to building and executing composite high-performance applications is proposed as a solution for complex, computation-intensive scientific tasks that use a set of existing software packages. The approach is based on a semantic description of the existing software used within a composite application. It allows applications to be built according to user quality requirements and a domain-specific task description. The CLAVIRE platform is described as an example of a successful implementation of the basic principles of the proposed approach, and an exploration of the performance characteristics of the described software solution is presented.

Keywords: composite application, high-performance computing, e-science, expert knowledge processing, performance estimation.

1 Introduction

Nowadays, scientific experiments often require huge amounts of computation during simulation or data processing. The performance of contemporary supercomputers is increasing rapidly, which makes it possible to solve computation-intensive scientific problems and to process large arrays of data stored in archives or produced by sensor networks. Today we can speak of a new paradigm for scientific research, often called e-Science [1]. This paradigm introduces many issues that have to be solved through collaboration between IT specialists and domain scientists, and these issues become more urgent as the relevant hardware and software turn into large, complex systems.

Currently there is a lot of software for solving particular domain problems, developed by domain scientists using their favorite programming languages and parallel technologies. Thus, today we have a great diversity of software for problem solving in almost every scientific domain. On the other hand, with powerful computational resources we have the ability to solve complex problems that require different software pieces combined within composite applications. To address this issue, problem solving environments (PSE) [2] were introduced as an approach to software composition. But two problems remain. First, there is the problem of integrating previously developed software: given the diversity of technologies, data formats, and execution platforms, it is quite complicated to join different software even within a PSE.

Y. Wang and T. Li (Eds.): Knowledge Engineering and Management, AISC 123, pp. 121–128. springerlink.com © Springer-Verlag Berlin Heidelberg 2011


Second, there is the problem of using third-party software with a lack of knowledge about its functionality and internal features.

Looking at contemporary computational resources, we also see a high level of diversity in architectures, technologies, supported software, etc. Moreover, today we can combine different computational resources using metacomputing, Grid, or Cloud approaches. In this case the integration problem becomes more important, as we must take care of performance in a heterogeneous computational environment because of the computational intensity of e-Science software.

Today's common approach to the integration problem is typically based on service-oriented architecture (SOA). This approach allows the development of composite applications using sets of services that give access to diverse resources in a unified way. But the problem of composite application performance within a heterogeneous computational environment still remains: the developer must configure every running service in a way that makes the whole application faster. In the case of a single application implementing a simple algorithm, there are many approaches to performance estimation and optimization depending on software and hardware features [3, 4]. But in the case of a composite application that uses diverse existing software and hardware resources and implements complex algorithms, we face more complex issues of composite application development and execution.

In this paper we present our experience of knowledge-based description of distributed software and its use for performance optimization and support of software composition. Within this approach, expert knowledge is used to describe a set of domain-specific services in a way that allows composing applications for solving complex simulation problems, taking into account the technical features of the software and hardware available within the computational environment.

2 Basic Concepts

2.1 Semantic Description of Resources

As described above, there is a great diversity of software and hardware resources that need to be integrated within composite applications for solving e-Science problems. The description of this software should include the following structure of knowledge, typically available to experts:

• Software specification. Basic statements used for software identification (name, version, etc.) are given within this part of the description.
• Implemented algorithms. This part of the description allows composing services to solve more complex problems within the domain. It can also be used for searching and comparing alternative solutions.
• Performance model. This part of the knowledge allows estimating execution time depending on the execution parameters: the computational environment specification and the domain-specific data parameters. Together with the previous parts, it allows estimating the execution time of the application with particular hardware and input parameters.
• Input and output sets. Describing the sets of incoming and outgoing data, this part of the knowledge gives information on the parameters' structure, data formats, and the way of


passing. Using this information it is possible to connect software pieces within a composite application automatically, and to apply data decomposition and transformation.
• Way of running. This part describes the procedure of calling the application, including low-level parameter passing (e.g. files or streams) and the environment access procedure (e.g. service invocation, pre- and post-configuration of resources).
• Hardware and software dependencies. The set of requirements of the application is presented within this part. It is needed for appropriate software deployment (in case this procedure is available within the computational environment).

Using this set allows describing the existing software available for a particular problem domain. But it is also necessary to describe the computational service environment that allows running this software. The service environment description should include the following structure of information:

• Hardware characteristics. With a set of resources available statically, dynamically, or on demand, a full description of the resources is required.
• Available services. This part maps the set of services (described as shown above) onto the set of resources available within the computational environment.

Finally, given the information mentioned above, we should describe the domain-specific usage process of the software for solving particular tasks. Here and further on we use quantum chemistry (QC) as an example problem domain. The proposed concepts and technologies were applied within this domain during the HPC-NASIS project [5], a platform for QC computer simulation using a distributed environment with integrated well-known computational software.

• Problems. This set of knowledge describes known domain problems that can be solved using the available software. Typically the problem set is well known for a particular field of knowledge (e.g. in quantum chemistry such problems include the single point problem or geometry optimization).
• Methods. This part defines well-known methods of the problem domain implemented in software. E.g., in the QC domain there are methods such as Hartree-Fock (HF) or density functional theory (DFT).
• Domain-specific values. This part of the semantic description contains domain-specific types, their structure, and possible values. E.g., for the QC domain we should describe concepts like basis or molecular structure.
• Solution quality estimation. With expert knowledge it is possible to define a quality estimation procedure for known methods and their implementations, taking into account domain-specific input values. This procedure is defined for a particular quality characteristic space (precision, speed, reliability, etc.). For example, it is possible to define the precision of a selected method for solving the single point problem for a given molecular structure and basis. Using this part of the knowledge makes it possible to estimate quality metrics for each call within a composite application and the integral quality of the whole application.

First of all, the description defined above guides composite application building using a) the semantic description of particular software pieces and b) the available resource definitions. In this case the set of performance models allows execution time estimation and consequent structural optimization of the composite application. Beyond that, the last part of the knowledge extends the available facilities with explanation and quality estimation


expressed in domain-specific terms. As a result, it is possible to build a software system that can "speak" with the domain specialist in his/her native language.

2.2 Implementation of Knowledge Base

Today, semantic knowledge is typically expressed using ontologies. For the semantic software description given above, we can define an ontology structure that integrates all the parts of knowledge. Fig. 1 shows an example of the ontology part describing available software using the proposed structure. This part of the ontology (simplified for illustration) defines five concepts: package (which represents a particular software piece), method (implemented within the software), cluster (as a subclass of resource), value (domain-specific), and data format. An important part of the knowledge is given as attributes of individuals and relationships between the individuals. E.g., we can define that the ORCA package implements the DFT method with particular quality and performance (defined as a constant quality value, a function with a set of parameters, or a table with profile values).


Fig. 1. Basic ontology structure for software semantic description
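The knowledge structure sketched in Fig. 1 might be captured in a record like the following; this is a minimal illustration, where the field names, the ORCA attribute values, and the quadratic timing model are hypothetical placeholders rather than the actual CLAVIRE schema or measured data:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class PackageDescription:
    name: str                         # software specification
    version: str
    methods: List[str]                # implemented algorithms
    inputs: Dict[str, str]            # parameter name -> data format
    outputs: Dict[str, str]
    run_command: str                  # way of running
    requirements: List[str]           # hardware/software dependencies
    # performance model: (cpu_cores, basis_functions) -> estimated seconds
    estimate_time: Callable[[int, int], float]

# Illustrative individual; values are invented, not ORCA's real profile.
orca = PackageDescription(
    name="ORCA", version="2.8",
    methods=["HF", "DFT"],
    inputs={"molecule": "xyz", "basis": "text"},
    outputs={"energy": "float"},
    run_command="orca input.inp",
    requirements=["linux", "mpi"],
    estimate_time=lambda cores, nbasis: 0.01 * nbasis**2 / cores,
)

print(orca.estimate_time(2, 100))  # -> 50.0
```

A record of this shape carries exactly the parts listed in Sect. 2.1: identification, algorithms, I/O sets, way of running, dependencies, and a performance model.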

This approach supports dynamic composition of software and hardware resources within a complex application given by a set of requirements. For instance, it is possible to compose an application that requires the shortest execution time, or one that gives the most precise solution within the desired time, using the available parameter dictionary.
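The individuals and relations of such an ontology fragment can be queried in triple form; the sketch below is a minimal stand-in (plain tuples instead of an ontology store), with illustrative individual names taken from the Fig. 1 example:

```python
# Triple-style sketch of the Fig. 1 ontology fragment: packages, methods,
# resources, and the relations between them. Names are illustrative.
triples = [
    ("ORCA", "is_a", "Package"),
    ("DFT", "is_a", "Method"),
    ("ORCA", "implements", "DFT"),
    ("Cluster1", "is_a", "Cluster"),
    ("ORCA", "deployed_on", "Cluster1"),
]

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the given (possibly None) pattern."""
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)]

# Which packages implement DFT?
print([s for s, _, _ in query(predicate="implements", obj="DFT")])  # -> ['ORCA']
```

In a production ontology the same pattern query would be posed against an OWL/RDF store rather than a Python list.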

3 Approach Implementation

3.1 CLAVIRE Platform

The approach described above can be considered part of the iPSE concept (Intelligent Problem Solving Environment) [6], which forms a conceptual basis for a set of projects performed by National Research University ITMO in recent years. During these projects an infrastructure platform for building and executing composite


applications for e-Science was developed. The CLAVIRE (CLoud Applications VIRtual Environment) platform allows building composite applications using domain-specific software available within a distributed environment. One of the important features of the platform is the application of expert knowledge for solving the following tasks:

• Composite application building using domain-specific knowledge within an intelligent subsystem. This process takes into account the actual state of the computational resources within the environment, the available software, and the data uploaded by the user. Using the semantic description of software, this subsystem forms an abstract workflow (AWF) definition of the composite application. The AWF contains calls of software (using domain-specific values as high-level parameters) without mapping to particular resources.
• Parallel execution of the AWF using performance optimization based on models defined as part of the knowledge. During this procedure the technical parameters of execution are tuned to reach the best available performance for the resources selected for executing parts of the AWF. As the AWF's elements are mapped to particular resources and low-level parameters are defined, the workflow turns into a concrete workflow (CWF).
• Data analysis and visualization. These procedures are supported by knowledge about a) the data formats used during execution of the composite application; b) the domain-specific problem being solved, for automatic selection and presentation of the data required by the user.

3.2 Knowledge-Based Solution Composition

The most interesting of the knowledge-based procedures mentioned above is composite application building using user-defined data and requirements (performance, quality, etc.). Within the CLAVIRE platform a tree-based dialog with the user is provided as a decision support tool (see Fig. 2). Passing through the nodes, represented by domain-specific concepts (there are four levels in the current tree implementation: problem, method, package (software), and service), the user can define or select given input and required output values at every level of the tree. Passing through levels 1 (Problem) to 3 (Package) produces an AWF available for automatic execution, but the user can go further to level 4 (Service), which allows fine-tuning the execution parameters during production of the CWF. While passing through the tree nodes the user can control the generation of next-level nodes by defining parameters of auto-generation or by blocking the processing of selected nodes. After passing the tree the user can compare the available solutions, which are estimated by quality values. The performance of the composite application is estimated using the performance models of the software used for solving the subtasks; these models also make it possible to estimate and optimize the execution time. Another performance issue is related to the planning process of the whole composite application. The CLAVIRE platform uses well-known heuristics, selected depending on the performance estimation for the current state of the computational environment.
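The AWF-to-CWF step described above can be illustrated with a toy mapping; the package names follow the paper's QC examples, while the timing table and the lowest-estimated-time selection rule are assumptions for illustration, not the platform's actual planner:

```python
# An abstract workflow (AWF) names only packages and domain parameters;
# mapping each call to a concrete resource yields the concrete workflow (CWF).
abstract_wf = [
    {"package": "GAMESS", "params": {"basis": "6-31G"}},
    {"package": "ORCA",   "params": {"basis": "MINI"}},
]

# Estimated execution time (seconds) of each package on each resource;
# figures are invented for illustration.
estimates = {
    ("GAMESS", "cluster1"): 120, ("GAMESS", "grid"): 90,
    ("ORCA", "cluster1"): 60,    ("ORCA", "grid"): 150,
}

def to_concrete(awf):
    """Map each AWF call to the resource with the lowest estimated time."""
    cwf = []
    for call in awf:
        resource = min(
            (r for p, r in estimates if p == call["package"]),
            key=lambda r: estimates[(call["package"], r)],
        )
        cwf.append({**call, "resource": resource})
    return cwf

for step in to_concrete(abstract_wf):
    print(step["package"], "->", step["resource"])
```

Running this prints `GAMESS -> grid` and `ORCA -> cluster1`: each call lands on whichever resource its performance model favors.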

[Fig. 2 shows the tree-based dialog: from the Problem level (single point task, geometry optimization, saddle point search, parameter analysis) through the Method level (Hartree-Fock, DFT, Hartree-Fock + MP2, Hartree-Fock + CC), the Package level (ORCA, GAMESS, MOLPRO, with input parameters such as basis: 6-31G, MINI, PC3 and SCF settings) to the Service level (ITMO Cluster #1, GridNNN), where each run option is annotated with estimated execution time, price, reliability, and precision.]

Fig. 2. Tree-based dialog
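The run options at the Service level of the dialog can be compared programmatically; the sketch below uses the quality figures from the Fig. 2 example (execution time, price, reliability, precision), while the selection rule itself is an illustrative assumption, not the platform's actual algorithm:

```python
# Candidate run options as presented at the Service level of the dialog;
# figures follow the Fig. 2 example.
candidates = [
    {"name": "ITMO Cluster #1", "time_min": 32, "price": 10,
     "reliability": 0.98, "precision": 0.60},
    {"name": "GridNNN", "time_min": 90, "price": 30,
     "reliability": 0.95, "precision": 0.95},
]

def best(cands, require):
    """Pick the fastest candidate meeting a minimum-precision requirement."""
    ok = [c for c in cands if c["precision"] >= require]
    return min(ok, key=lambda c: c["time_min"]) if ok else None

print(best(candidates, require=0.9)["name"])   # precise run -> GridNNN
print(best(candidates, require=0.5)["name"])   # fastest acceptable run -> ITMO Cluster #1
```

The same pattern extends to other quality axes (price, reliability) by changing the filter and the key.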

4 Numerical Results

During exploration of the CLAVIRE platform many experiments were performed. Fig. 3 shows selected results of computational experiments with the performance models used by the platform. The experiments were performed using a planning simulation system whose performance models were parameterized using test runs of a set of packages with the CLAVIRE platform.

A) Comparing alternative implementations of the same computational problem. Three QC packages (GAMESS – 1, ORCA – 2, MOLPRO – 3) were compared while solving the single point problem on the same input data. Performance estimation using the parameterized models shows that in the case of time optimization the best choice is to use ORCA (I) running on 2 CPU cores (II).

B) Comparing packages with two-dimensional models shows a more complex case of alternative selection. In the example shown, package 1 (GAMESS) is better in the region above the dotted line, while package 2 (ORCA) is better for parameters in the region below the line. This simple case shows that the performance model can combine domain-specific parameters (here, the count of basis functions) with technical parameters (CPU count).

C) Comparing the CLAVIRE overhead to the overhead of the underlying computational Grid environment (the Russian Grid network for nanotechnologies, GridNNN [7], was used as the computational platform) shows that CLAVIRE has a quite low overhead level and can be used as a distributed computation management system.
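The two-parameter selection of panel B can be sketched as follows; the model coefficients are invented for illustration and are not the measured CLAVIRE profiles:

```python
# Each package gets a simple performance model over (cpu_cores,
# basis_functions); the planner picks whichever predicts the shorter run.
# Coefficients are illustrative: GAMESS has a higher per-basis cost but a
# lower per-core overhead than ORCA in this toy setup.
models = {
    "GAMESS": lambda cores, nbasis: 5.0 * nbasis / cores + 2.0 * cores,
    "ORCA":   lambda cores, nbasis: 3.0 * nbasis / cores + 20.0 * cores,
}

def pick(cores, nbasis):
    """Return the package with the lowest predicted execution time."""
    return min(models, key=lambda p: models[p](cores, nbasis))

# With few basis functions one package wins; with many, the other's lower
# per-basis cost dominates its larger per-core overhead.
print(pick(4, 10), pick(4, 200))  # -> GAMESS ORCA
```

The dotted line in panel B corresponds to the boundary in the (CPU count, basis-function count) plane where the two predicted times are equal.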

[Fig. 3 shows probability-density plots of the distribution of Grid overheads, data transfer overheads, total CLAVIRE overheads, and workflow control overhead.]

Fig. 3. Performance characteristics of CLAVIRE

D) Experiments with the application of heuristics for workflow planning show that they can bring a notable decrease of execution time for composite applications with a large number of subtasks. On the other hand, for subtasks that are identical from a performance point of view, different heuristics give almost the same results. For instance, the distributions of estimated execution time for a parameter sweep task, planned with different heuristics under stochastic behavior of the computational environment, almost overlap.
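A planning heuristic of the kind discussed in D) can be sketched as greedy list scheduling for a bag of independent subtasks (the parameter sweep case); the longest-task-first rule and the durations below are illustrative, not the platform's actual heuristics:

```python
import heapq

def schedule(durations, n_resources):
    """Makespan of greedy scheduling: each task goes to the resource that
    becomes free earliest, considering tasks longest-first."""
    free_at = [0.0] * n_resources          # min-heap of resource free times
    heapq.heapify(free_at)
    for d in sorted(durations, reverse=True):
        t = heapq.heappop(free_at)         # earliest-available resource
        heapq.heappush(free_at, t + d)
    return max(free_at)

print(schedule([4, 3, 3, 2, 2, 2], 2))  # -> 8.0
```

With identical subtask durations, the task-ordering step no longer matters, which is consistent with the observation that different heuristics then give almost the same results.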

5 Discussions and Conclusion

Today there are many solutions that try to build composite applications automatically using knowledge bases (e.g. [8, 9]). Typically they use semantic pattern-based composition of workflows as the composite application description. But the most powerful approach to composite application building should actively involve domain-specific expert knowledge that describes the high-level computational experiment process. This description has to be clear to domain specialists (i.e. the end-users of computational


platforms) even if they have no technological background. The approach of high-level knowledge-based support of computational experiments is also widely discussed [10, 11], but appropriate common implementations of this approach are still lacking.

Within the described work we are trying to build a solution that can be adapted to any problem domain containing computation-intensive tasks. This solution should isolate the end-user from the technical (hardware and software) features of the underlying architecture. It should be focused on a high-level concept of the computational experiment, common and understandable to almost every domain scientist.

The described knowledge-based approach to composite application organization for e-Science supports the building and execution of composite applications developed using a set of existing computational software. It was applied within a set of past and ongoing projects (including the CLAVIRE platform) performed by University ITMO. The projects were developed for solving computation-intensive problems in various domains: quantum chemistry, hydrometeorology, social network analysis, ship building, etc. These projects successfully apply formal knowledge description for building and running composite applications, processing and visualizing data, supporting users, etc.

Acknowledgments. This work was supported by the projects "Multi-Disciplinary Technological Platform for Distributed Cloud Computing Environment Building and Management CLAVIRE", performed under Decree 218 of the Government of the Russian Federation, and "Urgent Distributed Computing for Time-Critical Emergency Decision Support", performed under Decree 220 of the Government of the Russian Federation.

References

1. Hey, T., Tansley, S., Tolle, K. (eds.): The Fourth Paradigm: Data-Intensive Scientific Discovery. Microsoft Research (2009)
2. Rice, J.R., Boisvert, R.F.: From Scientific Software Libraries to Problem-Solving Environments. IEEE Computational Science & Engineering 3(3), 44–53 (1996)
3. Kishimoto, Y., Ichikawa, S.: Optimizing the Configuration of a Heterogeneous Cluster with Multiprocessing and Execution-Time Estimation. Parallel Computing 31(7), 691–710 (2005)
4. Dolan, E.D., Moré, J.J.: Benchmarking Optimization Software with Performance Profiles. Mathematical Programming 91(2), 201–213 (2002)
5. HPC-NASIS, http://hpc-nasis.ifmo.ru/
6. Boukhanovsky, A.V., Kovalchuk, S.V., Maryin, S.V.: Intelligent Software Platform for Complex System Computer Simulation: Conception, Architecture and Implementation. Izvestiya VUZov, Priborostroenie 10, 5–24 (2009) (in Russian)
7. Start GridNNN, http://ngrid.ru/ngrid/
8. Kim, J., et al.: Principles for Interactive Acquisition and Validation of Workflows. Journal of Experimental & Theoretical Artificial Intelligence 22, 103–134 (2010)
9. Gubała, T., Bubak, M., Malawski, M., Rycerz, K.: Semantic-Based Grid Workflow Composition. In: Wyrzykowski, R., Dongarra, J., Meyer, N., Waśniewski, J. (eds.) PPAM 2005. LNCS, vol. 3911, pp. 651–658. Springer, Heidelberg (2006)
10. Gil, Y.: From Data to Knowledge to Discoveries: Scientific Workflows and Artificial Intelligence. Scientific Programming 17(3), 1–25 (2008)
11. Blythe, J., et al.: Transparent Grid Computing: A Knowledge-Based Approach. In: Proceedings of the 15th Annual Conference on Innovative Applications of Artificial Intelligence, pp. 12–14 (2003)