Optimizing Irregular Shared-Memory Applications for Clusters ∗

3 downloads 674 Views 858KB Size Report
Jun 12, 2008 - evaluate the performance of the optimized LDSM system on a set of representative irregular benchmarks. ... Languages, Performance.
Optimizing Irregular Shared-Memory Applications for Clusters ∗ Seung-Jai Min

School of Electrical and Computer Engineering Purdue University, West Lafayette, IN 47907

[email protected]

ABSTRACT Irregular applications pose challenges in optimizing communication, due to the difficulty of analyzing irregular data accesses accurately and efficiently. This challenge is especially big when translating irregular shared-memory applications to message-passing form for clusters. The lack of effective irregular data analysis in the translation system results in unnecessary or redundant communication, which limits application scalability. In this paper, we present a Lean Distributed Shared Memory (LDSM) system, which features a fast and accurate irregular data access (IDA) analysis. The analysis uses a region-based diff method and makes use of a runtime library that is optimized for irregular applications. We describe three optimizations that improve the LDSM system performance. A parallel array reduction transformation reduces overheads in the analysis. A packed communication optimization and a differential communication optimization effectively eliminate unnecessary and redundant messages. We evaluate the performance of the optimized LDSM system on a set of representative irregular benchmarks. The optimized LDSM executes irregular applications on average 45% faster than the hand-tuned MPI applications.

Categories and Subject Descriptors D.3.4 [Programming Languages]: [Compilers, Run-time environments]

General Terms Languages, Performance

Keywords Compiler Analysis, Runtime Techniques, OpenMP, MPI, Irregular Data Accesses, Performance ∗This work was supported, in part, by the National Science Foundation under grants No. 0429535-CCF, 0650016-CNS, and CNS-0751153.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ICS’08, June 7–12, 2008, Island of Kos, Aegean Sea, Greece. Copyright 2008 ACM 978-1-60558-158-3/08/06 ...$5.00.

Rudolf Eigenmann

School of Electrical and Computer Engineering Purdue University, West Lafayette, IN 47907

[email protected]

#pragma omp parallel for private(i, j) for (i=0; i

Suggest Documents