RESEARCH ARTICLE
Adv. Sci. Lett. 21, 2243-2246, 2015
Copyright © 2015 American Scientific Publishers All rights reserved Printed in the United States of America
Advanced Science Letters Vol.21, 2243-2246, 2015
Modeloo – the Tool for Teaching Parallel Computations
Vitaliy Mezhuyev1*, Jasni Mohamad Zain1, Nikolay Kudinov2, Vladimir Lavrik2 and Vladislava Mezhuyeva2
1 Faculty of Computer Systems and Software Engineering, University Malaysia Pahang, Gambang, Malaysia
2 Department of Informatics and Software Engineering, Berdyansk State Pedagogical University, Shmidta str., 4, Berdyansk, Ukraine, 71100
The paper presents a methodology for teaching Parallel and Distributed Computations (PDC) using Modeloo, a software tool developed by the authors. The idea of Modeloo is to design PDC algorithms in the form of an “operations-operands” graph. Modeloo is specially adapted for teaching purposes: it actively involves students in developing parallel algorithms and analysing their effectiveness.
Keywords: parallel computations, graph “operations-operands”, software tool, Modeloo.
1 INTRODUCTION
The high complexity and abstractness of the subject “Parallel and distributed computations” call for a revision of traditional pedagogical techniques and approaches and for the use of new information and communication technologies. The objective of the course “Parallel and distributed computations” is to form skills in the design and analysis of parallel algorithms through the development of corresponding models of computation. The analysis includes a study of the effectiveness of computations for the chosen algorithm (e.g. the effectiveness of its parallelisation) and an estimate of the maximum possible performance of computations for a concrete type of algorithm (i.e. the effectiveness of the parallel method of solution).
Modeloo lets students develop models of parallel computations in the form of an “operations-operands” graph and then analyse the resulting parallel algorithms. The “operations-operands” model was described by D.P. Bertsekas and J.N. Tsitsiklis [1], and by V.V. Voyevodin and Vl.V. Voyevodin [2].
* Email Address: [email protected]
For simplification, the “operations-operands” model assumes that every computing operation takes the same time, equal to one unit (in some system of measurement). It also assumes immediate data transmission between computers, which is acceptable for computing systems with shared memory, such as parallel vector processors or symmetric multiprocessors.
In our approach, students use Modeloo for modelling PDC algorithms at the initial stages of learning and continue with more complex tools (considered later in the paper). Modeloo allows the properties of parallel and distributed algorithms to be analysed while they are being designed, which is very important before writing the corresponding software code. Using the “operations-operands” graph, students analyse the most important characteristics of parallel computations, e.g. deduce how many processors they need to solve a problem with maximum efficiency. The use of “operations-operands” graphs has great didactic value: it helps students learn different properties of PDC algorithms (acceleration, efficiency) and form and further improve the practical skills needed for parallel computations.
In this paper, we describe the possibilities of the Modeloo tool and introduce the corresponding teaching methodology.
The paper is organized as follows. First, an analysis of existing techniques and software tools for modelling PDC is given. Next, the methodology of applying Modeloo for teaching purposes is introduced, followed by an example of algorithm analysis in Modeloo. Feedback from students using the Modeloo tool is then reported. The conclusion, plans for future research and references complete the paper.
2 ANALYSIS OF EXISTING TOOLS FOR PARALLEL COMPUTATIONS
Nowadays, parallel computers are gaining popularity as an effective solution for low-cost supercomputing [3]. The development of multi-core technology has posed big challenges for software design. To take full advantage of the performance offered by new multi-core hardware, software programming models have shifted from sequential to parallel programming.
Let us consider existing tools for performing parallel computations and analysing them. Quite popular are Intel VTune Performance Analyzer and Intel Thread Checker, described in [4]. jParalize, a simple, free and lightweight tool for parallelizing Matlab computations on multicores and clusters, was described in [5]. The authors of [6] propose to emulate a generalized distributed memory in heterogeneous networked environments with the Parallel Virtual Machine software infrastructure. However, none of these software tools takes the educational aspect into account.
One of the few papers that describe a teaching tool for introducing students to interactive parallel and distributed programming is [7]. Its authors develop a system of modular interactive tiles as a tool for easy, fast and flexible hands-on exploration of PDC issues, and show how to implement interactive parallel and distributed processing with different behavioural software models, such as open-loop, randomness-based, rule-based, user-interaction-based, AI-based and ALife-based software.
In [8] MapReduce was presented as a tool that allows scientists and engineers to develop applications that exploit the computing power of a cloud. The authors propose using MapReduce to study how to develop effective cloud applications. This is quite an interesting technique with clear advantages for teaching purposes.
The authors of [9] propose a software tool for parallel computing and effective control of parallel runs using the ANSYS solver. A typical application is an optimization problem in which a response has to be calculated for each design point.
An extended event graph-based modelling method for Parallel and Distributed Discrete-Event Simulation (PDES) was described in [10]. That paper proposes an extension of the event graph, called the Extended Event Graph (EEG), which captures the communication of logical processes via sent events, and an EEG-based modelling method for PDES. This method shifts the focus of PDES development from writing code to building models, and the system implementation can be generated automatically and directly from the EEG model.
In [11] the C* system was introduced and subsequently used by its author for teaching purposes. The C* package has two major elements:
- a C* or Pascal* compiler (the choice depends on the student's preference);
- a Parallel Computer Simulation System.
It allows primitives for parallel programming to be added to a “classical” C (or Pascal) program. The resulting parallel form is called the C* or Pascal* language. The C* system allows users to work in three modes of operation: Shared-memory, Distributed-memory and MPI. Each mode has different primitives for parallel programming, such as formal statements, spinlocks, remote procedure calls, streams etc. Definitely, this resource can be used at the initial stage of PDC learning to start writing parallel programs. From our point of view, the weaknesses of this system are:
- students have to learn a new language (C* or Pascal*), which is needed only for training and will not be used further in practice;
- this language contains only a limited number of operations and so allows only a limited number of computation primitives to be defined;
- the C* / Pascal* compiler has rather weak capabilities.
In [12] Software Tools for Academics and Researchers (StarHPC), created at the Massachusetts Institute of Technology, was introduced. StarHPC provides the development environment and computational resources necessary to teach parallel programming in OpenMP and OpenMPI. StarHPC has the following components:
- a virtual machine, used by students;
- administrative scripts, used by the administrator;
- an Amazon Elastic Compute Cloud (EC2) machine, used to build a cluster shared by the class.
The limitation is that StarHPC is hosted on Amazon's EC2 web service, so running the virtual machine is costly. In addition, StarHPC was designed to support advanced computer science courses covering the challenges of parallel programming, so this resource cannot be used for teaching PDC at the initial stage.
The authors of [13] suggest teaching parallel computing paradigms using Python's MPI4py on the LittleFe educational cluster. Besides MPI4py, the proposed teaching exercises use the Disco map-reduce libraries, which de-emphasize low-level details and so leave time for elaborating solutions to more complex problems.
In [14] the authors discuss principles of teaching parallel programming to first-year students. For this purpose, they offer their own electronic lecture notes, accompanied by multimedia material for animation of parallel data structures, visualization of algorithms, experimentation with various architectures, monitoring of parallel programs and demonstration of scientific visualization, with a focus on algorithmic over architectural issues. These tools include animation programs, hypermedia, video clips and audio files. The authors suggest using the Parallaxis programming language because it is similar to Pascal, supports the SIMD programming paradigm, can simulate many topologies, runs on a variety of platforms, including Macintosh, Unix and the MasPar parallel processor, and is free.
The authors of [15] propose that the study of parallel programming should neither rely on abstract examples or so-called “toy problems” nor focus on a single programming model. They present a software package, called Shallow Water Equations (SWEs), that supports teaching different parallel programming models.
There is also the Paralab [16] tool, developed at Nizhniy Novgorod University, Russia, by the group of Prof. V. Gergel. This software allows parallel computations to be modelled in simulation mode on a personal computer or run on a real cluster. Paralab also implements real-time visualization of parallel computations in software models. Paralab users can choose the type of network for parallel computations, the number of processors and their characteristics, and specify the solution method and visualization options. Paralab has a mode for long computations and a function for saving results. In addition, the tool can build diagrams from previously saved data. Definitely, Paralab can help students acquire theoretical knowledge in the domain of parallel computations and improve their skills in the development of computational clusters. At the same time, its interface is rather complex for students who are just starting to learn PDC.
In summary, teaching parallel and distributed computing is becoming a very important task. We support the idea that learning parallel computations should start in the first years of study. At the same time, existing methodologies and tools are too complex to be used by beginners. In this paper we propose a new technique, based on the development of PDC models as “operations-operands” graphs, which we consider a first step in teaching parallel computations.
3 THE METHODOLOGY OF USING MODELOO
Modeloo supports the visual development of PDC models as “operations-operands” graphs, storing the models in files, and loading and editing models in order to calculate the “acceleration” and “efficiency” of parallel algorithms and to determine the number of processors needed for the most effective solution of a problem. Modeloo presents the dependencies in a parallel algorithm in the form of an oriented graph. The entry nodes of the graph represent the operands; the other nodes represent the computing operations. Edges show the dependences between nodes of the computing scheme. After the visual development of a model, Modeloo helps students analyse the parameters of a parallel algorithm, such as the time of computation (sequential, parallel and on a paracomputer), the acceleration and the efficiency.
The proposed methodology for developing a model using Modeloo has the following steps:
- specification of the operands: the nodes of the graph that have no entering edges;
- specification of the operations to be executed: the nodes of the graph that have at least one entering edge;
- representation of the existing dependences in the computing algorithm by connecting nodes with edges, starting from the operands and finishing at the exit node, i.e. the result of the computation (Modeloo shows the result of each intermediate operation near its node and the final result near the exit node);
- specification of the values of the data used in the algorithm, and computation of the performance parameters of the parallel algorithm.
An example of using Modeloo for developing a PDC model is shown in Fig. 1.
Fig. 1. Development of the model of a parallel algorithm using the Modeloo tool
The problem is the computation of the simple algebraic expression (x1*x2)+(y1-y2). The existing dependences in the algorithm are represented as an oriented graph. The model assumes immediate data transmission between all processors of the system and that any operation takes one unit of time. In addition, Modeloo takes into account that branches of the graph with no logical connections between them can be executed in parallel in the chosen computation scheme.
Students specify the operations to be executed in the algorithm and the existing dependences between them, obtaining a model of the computation in the form of an acyclic oriented graph G=(V,R). Here, V is the set of the graph's nodes representing operations of the algorithm, and R is the set of the graph's edges. The edge r=(i,j) belongs to the graph only if operation j uses the result of operation i. For example, Fig. 2 shows the graph of the algorithm that computes the surface area of a rectangle defined by the coordinates of two opposite corners (the parameters of this algorithm are analysed in Section 4).
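To make the graph model concrete, the expression (x1*x2)+(y1-y2) from Fig. 1 can be encoded outside Modeloo as a small Python sketch. The dictionary layout, the node names (`mul`, `sub`, `add`), the operand values and the `evaluate` helper are illustrative assumptions; they do not reproduce Modeloo's internal representation or file format. The sketch evaluates every node and groups the operations by level, where operations on the same level have no dependences between them and could be executed in parallel.

```python
import operator

# Hypothetical encoding of an "operations-operands" graph (not Modeloo's format).
# Operands are the entry nodes; the values are chosen only for illustration.
operands = {"x1": 3, "x2": 4, "y1": 10, "y2": 6}

# Operation node -> (function, nodes whose results it uses).
# An edge (i, j) exists when operation j uses the result of node i.
operations = {
    "mul": (operator.mul, ("x1", "x2")),
    "sub": (operator.sub, ("y1", "y2")),
    "add": (operator.add, ("mul", "sub")),
}


def evaluate(operands, operations):
    """Return the value and the level of every node of the graph."""
    values = dict(operands)
    levels = {name: 0 for name in operands}  # operands sit on level 0

    def visit(name):
        if name not in values:
            func, inputs = operations[name]
            args = [visit(i) for i in inputs]
            values[name] = func(*args)
            # Level = length of the longest path from an operand to this node.
            levels[name] = 1 + max(levels[i] for i in inputs)
        return values[name]

    for name in operations:
        visit(name)
    return values, levels


values, levels = evaluate(operands, operations)

# Operations on the same level do not depend on each other.
parallel_steps = {}
for name in operations:
    parallel_steps.setdefault(levels[name], []).append(name)

print(values["add"])    # 16
print(parallel_steps)   # {1: ['mul', 'sub'], 2: ['add']}
```

In this sketch the multiplication and the subtraction fall on the same level, which corresponds to the two independent branches of the graph, while the addition must wait for both results.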
Fig. 2. An example of a computing model for the algorithm using the “operations-operands” graph
Using Modeloo, students can deduce that different computation schemes and different computing models can be built for the execution of the chosen algorithm. By developing models and performing computations according to different schemes, students learn various possibilities of parallelisation and discover the most effective computing scheme for the parallel execution of the algorithm. E.g. to compute the surface area of a rectangle, the lengths of its sides can be computed first and then their product.
4 EXAMPLE OF ALGORITHM ANALYSIS IN MODELOO
In the considered computational model, the nodes of the “operations-operands” graph without entering edges serve as inputs (initial values of operands), and the nodes without exiting edges as outputs of operations. Let us denote by V the set of the graph's nodes excluding the input nodes, and by d(G) the graph's diameter (the length of the longest path in the graph). Students apply these parameters of the graph to compute the time, performance and efficiency characteristics of parallel algorithms.
Let us denote the execution time of a parallel algorithm by Tp, where p is the number of processors used to implement the scheme. E.g. T1 is the time of the sequential solution, i.e. the execution time of the algorithm on one processor. This time is important and is used further to compute the performance of parallel algorithms. Since each operation is assumed to take one unit of time, and the number of nodes of the computing scheme in the graph G, excluding the input nodes, is |V|, the uniprocessor (serial) time of a problem solution is T1(G)=|V|. For example, in the case of finding the area of a rectangle, T1=7.
T∞ is the minimum execution time of the parallel algorithm when an unlimited number of processors is available (the so-called paracomputer). Here, all the separate branches of the computing scheme can be executed in parallel, so T∞(G)=d(G); in our case T∞=3.
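The two quantities T1(G) and T∞(G) can be checked with a short Python sketch. The text does not list the operations of the rectangle-area graph, so the scheme below is an assumption: the area is taken in the expanded form x2*y2 - x2*y1 + x1*y1 - x1*y2, which has seven operations and is consistent with the stated values T1=7 and T∞=3. The names `DEPENDS_ON`, `serial_time` and `paracomputer_time` are hypothetical.

```python
from functools import lru_cache

# Assumed scheme for the rectangle-area example (not read from Fig. 2):
# area = x2*y2 - x2*y1 + x1*y1 - x1*y2, the expansion of (x2-x1)*(y2-y1).
# Each operation maps to the nodes whose results it uses; names that are
# not keys of the dictionary (x1, y1, x2, y2) are operands.
DEPENDS_ON = {
    "m1": ("x2", "y2"),
    "m2": ("x2", "y1"),
    "m3": ("x1", "y1"),
    "m4": ("x1", "y2"),
    "s1": ("m1", "m2"),    # x2*y2 - x2*y1
    "s2": ("m3", "m4"),    # x1*y1 - x1*y2
    "area": ("s1", "s2"),  # s1 + s2
}


def serial_time(depends_on):
    """T1(G) = |V|: one time unit per operation node."""
    return len(depends_on)


def paracomputer_time(depends_on):
    """T_inf(G) = d(G): edges on the longest path from an operand to a result."""

    @lru_cache(maxsize=None)
    def depth(node):
        if node not in depends_on:  # operands are available at time 0
            return 0
        return 1 + max(depth(i) for i in depends_on[node])

    return max(depth(node) for node in depends_on)


t1 = serial_time(DEPENDS_ON)
t_inf = paracomputer_time(DEPENDS_ON)
print(t1, t_inf)  # 7 3
```

Under this assumed scheme the four multiplications form the first level, the two subtractions the second and the final addition the third, which gives the diameter of 3.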
The performance (acceleration) of a parallel algorithm using p processors, in comparison with the sequential solution, is defined by Sp(n)=T1(n)/Tp(n), i.e. as the ratio of the solution time on a uniprocessor computer to the execution time of the parallel algorithm on a computer with p processors. Here, n parameterizes the computing complexity of the problem and can be treated as the size of the input data. For example, using 2 processors to find the surface area of a rectangle, Tp=4 and Sp=7/4.
The efficiency of processor use in a parallel algorithm is defined by Ep(n)=Sp(n)/p, the quotient of the acceleration and the number of processors used (the efficiency is the average fraction of the execution time of the parallel algorithm during which the processors are actually used for computation). In our example, for 2 processors the efficiency is 7/8.
After the visual development of the model of computation, students analyse the values T1, T∞ and Tp, as well as Sp and Ep, in Modeloo.
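The values Tp=4, Sp=7/4 and Ep=7/8 for two processors can be reproduced with the following Python sketch. It rests on two assumptions that are not stated in the text: the rectangle area is computed in the expanded form x2*y2 - x2*y1 + x1*y1 - x1*y2 (seven unit-time operations), and Tp is obtained by a greedy list schedule that gives priority to operations with the longest remaining path to the result. This is a common heuristic, not necessarily the procedure Modeloo uses, and it is not guaranteed to be optimal for arbitrary graphs. The names `DEPENDS_ON` and `parallel_time` are hypothetical.

```python
from fractions import Fraction

# Assumed scheme (not read from Fig. 2): operation -> nodes it depends on.
# Names that are not keys (x1, y1, x2, y2) are operands.
DEPENDS_ON = {
    "m1": ("x2", "y2"),
    "m2": ("x2", "y1"),
    "m3": ("x1", "y1"),
    "m4": ("x1", "y2"),
    "s1": ("m1", "m2"),    # x2*y2 - x2*y1
    "s2": ("m3", "m4"),    # x1*y1 - x1*y2
    "area": ("s1", "s2"),  # s1 + s2
}


def parallel_time(depends_on, p):
    """Time steps used by a greedy list schedule on p processors."""
    successors = {node: [] for node in depends_on}
    for node, inputs in depends_on.items():
        for i in inputs:
            if i in successors:  # operands have no entry here
                successors[i].append(node)

    tail = {}  # operations on the longest path from a node to an exit node

    def tail_length(node):
        if node not in tail:
            tail[node] = 1 + max(
                (tail_length(s) for s in successors[node]), default=0
            )
        return tail[node]

    done = set()
    steps = 0
    while len(done) < len(depends_on):
        # An operation is ready when all operations it depends on are done.
        ready = [
            node for node in depends_on
            if node not in done
            and all(i in done or i not in depends_on for i in depends_on[node])
        ]
        ready.sort(key=tail_length, reverse=True)  # critical path first
        done.update(ready[:p])                     # at most p operations per step
        steps += 1
    return steps


p = 2
t1 = len(DEPENDS_ON)               # T1 = |V|
tp = parallel_time(DEPENDS_ON, p)  # Tp
speedup = Fraction(t1, tp)         # Sp = T1 / Tp
efficiency = speedup / p           # Ep = Sp / p
print(tp, speedup, efficiency)     # 4 7/4 7/8
```

The priority rule matters in this example: if a subtraction were scheduled before the remaining multiplications in the second step, the schedule on two processors would take five steps instead of four.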
5 FEEDBACK FROM STUDENTS USING THE MODELOO TOOL
After the course “Parallel and distributed computations” was finished, we asked students to give feedback about using Modeloo. Most of the students mentioned the simplicity of Modeloo in comparison with other tools, which is quite important at the initial stage of learning PDC. The next point is that the natural notation, the “operations-operands” graph, allows students to understand the meaning of parallel computations. In addition, using Modeloo improves students' abilities in visual modelling and develops their skills in constructing schemes for parallel computations.
CONCLUSION AND FUTURE PLANS
The analysis of existing software tools used for PDC shows their limitations for teaching purposes. To overcome this problem, the Modeloo tool was developed, which allows students to model schemes of computation using “operations-operands” graphs and to analyse parallel algorithms. An advantage of Modeloo is that it takes the principles of teaching into account, which results in the active involvement of students both in the development of parallel algorithms and in the analysis of their effectiveness. Modeloo has a simple interface but advanced functionality, allowing intuitive development of PDC models and their further analysis (acceleration and efficiency of parallel algorithms, including the number of CPUs needed for the most effective solution). In our future work we will improve Modeloo's functionality and enhance the proposed teaching technique for modelling parallel and distributed computations, giving students the possibility of executing algorithms on a real cluster.
REFERENCES
[1] D.P. Bertsekas and J.N. Tsitsiklis. Some Aspects of Parallel and Distributed Iterative Algorithms – A Survey. Automatica, Vol. 27, No. 1, pp. 3-21, 1991.
[2] V. V. Voyevodin, Vl. V. Voyevodin. Parallel computations. BHV-St. Petersburg, 608 p., 2002.
[3] N. Kumar. Simulation Study for Performance and Prediction of Parallel Computers. BIJIT-BVICAM's International Journal of Information Technology, Vol. 4, No. 2, 2012.
[4] S. Hua and Z. Yang. Comparison and Analysis of Parallel Computing Performance Using OpenMP and MPI. The Open Automation and Control Systems Journal, 5, pp. 38-44, 2013.
[5] A. Karbowski, M. Majchrowski, and P. Trojanek. jParalize – a simple, free and lightweight tool for parallelizing Matlab computations on multicores and in clusters. Minisymposium on HPC Software: Tools, Libraries and Frameworks, PARA 2008: 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing, May 2008.
[6] R. Z. Khan, J. Ali. A Practical Performance Analysis of Modern Software used in High Performance Computing. International Journal of Advanced Research in Computer Science and Software Engineering, Volume 2, Issue 3, Mar. 2012.
[7] L. Pagliarini, H. H. Lund. An educational tool for interactive parallel and distributed processing. Artificial Life and Robotics, Volume 16, Issue 4, pp. 441-447, Feb. 2012.
[8] W.-C. Shih, S.-S. Tseng, C.-T. Yang. Performance Study of Parallel Programming on Cloud Computing Environments Using MapReduce. Information Science and Applications (ICISA), 2010 International Conference, pp. 1-8, Apr. 2010.
[9] J. Susen, J. Sindler, M. Sulitka. A New Software Tool for Parallel Computing with ANSYS. In Proceedings of 20th SVSFEM ANSYS Users' Group Meeting and Conference 2012, Oct. 2012.
[10] W. Xia, Y. Yao, X. Mu. An extended event graph-based modelling method for parallel and distributed discrete-event simulation. Mathematical and Computer Modelling of Dynamical Systems: Methods, Tools and Applications in Engineering and Related Sciences, Volume 18, Issue 3, pp. 287-306, 2012.
[11] Bruce P. Lester. The Art of Parallel Programming, 2nd Edition, 568 pages, 2006.
[12] Ceraj Ivica, Justin T. Riley, Charles Shubert. StarHPC - Teaching Parallel Programming within Elastic Compute Cloud. ITI, International Conference on Information Technology Interfaces, Cavtat/Dubrovnik, Croatia, pp. 353-356, 2009.
[13] José Ortiz-Ubarri, Rafael Arce-Nazario. Modules to teach parallel computing using Python and the LittleFe Cluster. The Int. Conference for High Performance Computing, Networking, Storage and Analysis, 2013.
[14] Donald Johnson, David Kotz, Fillia Makedon. Teaching Parallel Computing to Freshmen. Conference on Parallel Computing for Undergraduates, Colgate University, 1994.
[15] Alexander Breuer, Michael Bader. Teaching Parallel Programming Models on a Shallow-Water Code. Conference on Parallel Processing for Scientific Computing, 2012.
[16] Victor Gergel, Anna Labutina. The ParaLab System for Investigating the Parallel Algorithms. Methods and Tools of Parallel Programming Multicomputers, pp. 95-104, 2010.