Parallel Implementation of Bi-cubic Interpolation Algorithm using MPI on Multi-core Systems
Lokendra Singh Umrao, Ravi Shankar Singh, Srijan Misra and S Sujana
Department of Computer Science & Engineering, Indian Institute of Technology (BHU), Varanasi–221 005, India
Objectives
Speedup Computation
The main objectives of the paper are:
• Study parallel computing and the use of additional processing resources to accelerate computations.
• Show that a massively parallel computer can solve scientific problems faster than a single processor.
• Parallelize the Bi-cubic Interpolation algorithm using the MPI API.
• Show that substantial speedup is possible if sufficiently many processors are available.
Introduction
Multi-core chips are an important new trend in computer architecture. Several microprocessor manufacturers have entered the era of multi-core processors, in which multiple processors are added to the same chip instead of simply increasing the frequency of a single processor. The focus has now shifted towards putting multiple cores on the same chip, thereby allowing applications to run independently on different cores at the same time. Not only can different applications run in parallel, but a single application can also be threaded so that each thread gets its own core and they all run in parallel.
Basics
When a single application runs on an underlying multi-core processor, it can itself be threaded and thereby utilize all of the available computing power. An algorithm has to be studied carefully to identify the regions in which it can be parallelized, and various synchronization issues have to be taken care of while converting a sequential algorithm into a parallel one. Once an algorithm has been analysed for parallelization, it can be threaded so that the different tasks run in parallel.
Figure 1: Speedup performance based on Amdahl’s law
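As a reference for the speedup curves in Figure 1, the standard definitions (textbook formulas, not taken from the poster itself) are: the speedup on p processors is S_p = T_1 / T_p, and Amdahl's law bounds it by

S(N) \le \frac{1}{(1 - P) + P/N}

where T_1 is the serial run time, T_p the parallel run time on p processors, P the fraction of the program that can be parallelized, and N the number of processors. With P = 0.95 and N = 4, for example, the bound is about 3.48.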
Software Environment
• Operating System: Linux
• Development tools: gcc compiler with MPI
Methods
Conclusion
Automatic Parallelization: Many compilers provide a flag or option for automatic program parallelization. When this is selected, the compiler analyses the program, searching for independent sets of instructions, and in particular for loops whose iterations are independent of one another. It then uses this information to generate explicitly parallel code.
Message Passing Interface: MPI is an application programming interface for distributed-memory parallel programming that makes it easy to create processes and assign them independent tasks. With a few library calls one can spawn processes, divide the work among them, and synchronize them.
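The following is a minimal sketch, not the authors' actual code, of how bi-cubic interpolation can be parallelized with MPI in C: the source image is broadcast to every process, each process interpolates one block of destination rows, and the root gathers the stripes. The image size, the 2x scale factor, and the Keys cubic convolution kernel (a = -0.5) are assumptions made purely for illustration.

/*
 * Sketch: row-wise MPI parallelization of bi-cubic interpolation.
 * Assumptions (not from the poster): grayscale image stored as doubles,
 * 2x upscaling, Keys cubic convolution kernel with a = -0.5.
 */
#include <mpi.h>
#include <stdlib.h>
#include <math.h>

#define SRC_W 512
#define SRC_H 512
#define SCALE 2
#define DST_W (SRC_W * SCALE)
#define DST_H (SRC_H * SCALE)

/* Keys cubic convolution kernel with a = -0.5 */
static double cubic(double x) {
    const double a = -0.5;
    x = fabs(x);
    if (x <= 1.0) return (a + 2.0) * x * x * x - (a + 3.0) * x * x + 1.0;
    if (x <  2.0) return a * x * x * x - 5.0 * a * x * x + 8.0 * a * x - 4.0 * a;
    return 0.0;
}

/* Fetch a source pixel, clamping coordinates at the image borders. */
static double src_at(const double *src, int x, int y) {
    if (x < 0) x = 0; if (x >= SRC_W) x = SRC_W - 1;
    if (y < 0) y = 0; if (y >= SRC_H) y = SRC_H - 1;
    return src[y * SRC_W + x];
}

/* Interpolate one destination pixel from its 4x4 source neighbourhood. */
static double bicubic(const double *src, double sx, double sy) {
    int ix = (int)floor(sx), iy = (int)floor(sy);
    double sum = 0.0;
    for (int m = -1; m <= 2; m++)
        for (int n = -1; n <= 2; n++)
            sum += src_at(src, ix + n, iy + m) *
                   cubic(sx - (ix + n)) * cubic(sy - (iy + m));
    return sum;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Every process holds the (small) source image; the root fills it. */
    double *src = malloc(SRC_W * SRC_H * sizeof(double));
    if (rank == 0)
        for (int i = 0; i < SRC_W * SRC_H; i++) src[i] = rand() % 256;
    MPI_Bcast(src, SRC_W * SRC_H, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    /* Static block decomposition of destination rows among processes. */
    int rows = DST_H / size;            /* assumes DST_H divisible by size */
    int row0 = rank * rows;
    double *part = malloc(rows * DST_W * sizeof(double));

    for (int y = 0; y < rows; y++)
        for (int x = 0; x < DST_W; x++)
            part[y * DST_W + x] =
                bicubic(src, (double)x / SCALE, (double)(row0 + y) / SCALE);

    /* Collect the interpolated stripes on the root process. */
    double *dst = (rank == 0) ? malloc(DST_H * DST_W * sizeof(double)) : NULL;
    MPI_Gather(part, rows * DST_W, MPI_DOUBLE,
               dst,  rows * DST_W, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    free(part); free(src); if (rank == 0) free(dst);
    MPI_Finalize();
    return 0;
}

Each process works only on its own stripe of destination rows, so no communication is needed during the interpolation itself; the only collective steps are the initial broadcast of the source image and the final gather of the results.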
Increasing the number of processes beyond four does not lead to a proportional increase in speedup because the machine is quad-core, although hyper-threading still gives good performance up to eight processes. Hence the full computing power of these next-generation multi-core processors can be exploited, to the benefit of the end user, if algorithms are designed for distributed-memory parallel programming.
Future Work
• Further improve the parallel algorithm.
• Apply different approaches such as OpenMP or CUDA.
Platform
Model name: Intel Core i7-3770
Clock rate: 3.40 GHz
Architecture: i686
CPU op-mode(s): 32-bit, 64-bit
Byte order: Little Endian
CPU(s): 8
Socket(s): 1
Cores per socket: 4
Threads per core: 2
On-line CPU(s) list: 0-7
CPU family: 6
Model: 58
CPU MHz: 1600
L1 cache per core: 32 KB
L2 cache per core: 256 KB
L3 cache per socket: 8192 KB
Results
Figure 2 shows that, as the number of processes used for the computation is increased up to four, the real time for interpolating the image decreases continuously. The speedup is about 200% with two processes and nearly 350% with four.
Figure 2: Parallelization overhead for Bi-cubic Interpolation

Contact Information
• Web: http://www.iitbhu.ac.in/cse
• Email: [email protected]
• Phone: +91 9415772305