Parallelization of Matrix - Vector Multiplication on a ...

Parallelization of Matrix - Vector Multiplication on a Cluster Platform

Vasilis Stefanidis, Panagiotis D. Michailidis and Konstantinos G. Margaritis
Parallel and Distributed Processing Laboratory
Department of Applied Informatics, University of Macedonia
156 Egnatia str., P.O. Box 1591, 54006 Thessaloniki, Greece
E-mail: {bstefan,panosm,kmarg}@uom.gr
URL: http://macedonia.uom.gr/~{bstefan,panosm,kmarg}

Keywords: matrix - vector multiplication, parallel algorithm, cluster computing, performance analysis.

The multiplication of a vector by a matrix is the kernel operation in many algorithms used in scientific computation. Several studies of distributed matrix - vector multiplication have been made [1]. Recently, we proposed four variations of matrix - vector multiplication on a cluster of workstations [2]. These variations are based on the row block decomposition scheme and use the dynamic master - worker paradigm. In [2] we also proposed a performance prediction model for the four parallel implementations.

In this paper, we present two parallel matrix - vector multiplication implementations on a cluster of workstations using the Message Passing Interface (MPI) library. Both implementations are based on the block checkerboard decomposition scheme and use the master - worker programming model. In the first implementation, the master allocates the square blocks of the matrix and the blocks of the vector to the workers; the block columns of the matrix are distributed in a cyclic fashion. In the second implementation, we assume that the blocks of the matrix and the vector are already stored in the local memory of the workers, so the master does not allocate them.

Furthermore, we present a performance analysis of the two proposed matrix - vector implementations on a cluster of workstations. Experiments carried out on a cluster of workstations confirm the effectiveness of the performance analysis and the accuracy of its predictions. Finally, the experimental results show that the first implementation achieves lower speedups than the second implementation.
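
As an illustration of the block checkerboard decomposition, the following is a minimal serial sketch in Python, not the MPI implementation described above. It rests on assumptions that are not stated in the text: the block size divides the matrix dimension, block column j is assigned to worker j mod p, and each worker accumulates a partial result vector that the master sums at the end. The function name `checkerboard_matvec` and its parameters are hypothetical.

```python
def checkerboard_matvec(A, x, block, num_workers):
    """Simulate y = A x with A split into square blocks of size `block`.

    Assumption (not from the paper): block column bj is handled by
    worker bj % num_workers, i.e. a cyclic distribution of block columns.
    """
    n = len(A)
    assert n % block == 0, "sketch assumes the block size divides n"
    nb = n // block  # number of blocks per dimension

    # One partial result vector per simulated worker.
    partial = [[0.0] * n for _ in range(num_workers)]

    for bj in range(nb):
        w = bj % num_workers                      # cyclic assignment
        x_blk = x[bj * block:(bj + 1) * block]    # vector block for this column
        for bi in range(nb):
            for r in range(block):
                row = bi * block + r
                a_row = A[row][bj * block:(bj + 1) * block]
                # Local block-times-vector-block product on worker w.
                partial[w][row] += sum(a * v for a, v in zip(a_row, x_blk))

    # The master reduces the workers' partial vectors into y.
    return [sum(p[i] for p in partial) for i in range(n)]


A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
x = [1, 0, 2, 1]
y = checkerboard_matvec(A, x, block=2, num_workers=2)
```

In an actual MPI program, the loop over block columns would run concurrently on the workers and the final sum would be a reduction at the master; the sketch only shows how the blocks and partial results fit together.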

References
[1] V. Kumar, A. Grama, A. Gupta, G. Karypis, Introduction to Parallel Computing, The Benjamin/Cummings Publishing Company, 1994.
[2] T. Typou, V. Stefanidis, P. Michailidis, K. Margaritis, Matrix Vector multiplication on a cluster of workstations, in Proceedings of the First International Conference From Scientific Computing to Computational Engineering, Athens, Greece, September 8-10, 2004.

