MPI/CTP - Implementation of MPI Reconfigurations in MPI/CTP Reconfigurable protocol recovers performance Conclusion and Future Work

MPI/CTP: A Reconfigurable MPI for HPC Applications

Manjunath Gorentla Venkata, Patrick Bridges
Scalable Systems Lab
The University of New Mexico

Static configuration of protocol is not enough

Static configuration of a communication protocol cannot optimize service in all cases
Varying static protocol demands
  HPC applications have diverse communication requirements
  HPC machines have a variety of network interface capabilities
Dynamic communication characteristics

Solution: MPI/CTP, a reconfigurable implementation of MPI
  A fine-grained reconfigurable protocol driven by application- and hardware-specific protocol optimizations

Outline

1. MPI/CTP - Implementation of MPI
2. Reconfigurations in MPI/CTP
3. Reconfigurable protocol recovers performance
4. Conclusion and Future Work

MPI/CTP features

Supports various granularities of configuration
  Fine grained - functional properties and protocol behavior
  Coarse grained - selecting network interfaces for message transfer during component start time

Supports all types (sync, async, buffered) of point-to-point operations

MPI/CTP - Bird's-eye view of design

MPI/CTP is built using the Cactus framework and CTP
MPI functionality is implemented as microprotocols
Example microprotocols: MPI message transfer, Demultiplexing, Reliability, Ordering, Congestion Control

Events and event handlers
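
The relationship between microprotocols, events, and event handlers can be illustrated with a minimal sketch. It is not the Cactus or CTP API: the class `CompositeProtocol`, its methods `bind` and `raise_event`, and the three handler functions are hypothetical names chosen for illustration. Only the event name `SegmentFromUser` and the microprotocol roles (ordering, reliability, message transfer) come from the slides. The idea shown is that each microprotocol is a set of handlers bound to events, so a protocol is configured by choosing which handlers to bind.

```python
from collections import defaultdict


class CompositeProtocol:
    """Hypothetical stand-in for a composite protocol built from microprotocols."""

    def __init__(self):
        self._handlers = defaultdict(list)  # event name -> [(order, handler)]
        self.shared = {}                    # shared data visible to every handler

    def bind(self, event, handler, order=0):
        # Handlers run in ascending `order`; ties keep their binding order.
        self._handlers[event].append((order, handler))
        self._handlers[event].sort(key=lambda pair: pair[0])

    def raise_event(self, event, msg):
        for _, handler in list(self._handlers[event]):
            handler(self, msg)


# Each "microprotocol" here is reduced to a single handler (an assumption).
def ordering(proto, msg):
    seq = proto.shared.get("next_seq", 0)
    msg["seq"] = seq                        # stamp a sequence number
    proto.shared["next_seq"] = seq + 1


def reliability(proto, msg):
    # Remember the message until it is acknowledged.
    proto.shared.setdefault("unacked", {})[msg["seq"]] = msg


def transfer(proto, msg):
    # Stand-in for handing the segment to a transport driver.
    proto.shared.setdefault("wire", []).append(msg)


def build(microprotocols):
    """Compose a protocol by binding the chosen handlers to SegmentFromUser."""
    proto = CompositeProtocol()
    for order, handler in enumerate(microprotocols):
        proto.bind("SegmentFromUser", handler, order)
    return proto
```

Calling `build([ordering, reliability, transfer])` yields a protocol that tracks unacknowledged messages, while `build([ordering, transfer])` omits that bookkeeping, for example on a network that already provides reliable delivery.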

MPI/CTP - Relevant design details

[Figure, built up over six slides: the MPI/CTP composite protocol. Labeled parts: MPI API (Open, Close, Push, Pop), MPI Support, Shared Data, Events, Microprotocols, Transport Driver.]

Callouts added by successive slides:

1. MPI layer to match semantics of Cactus and MPI applications
2. MPI/CTP matching/demultiplexing
3. Message transfer protocols: Eager, Eager Rendezvous, Rendezvous
4. MPI related events: MPI Wait, SegmentFromUser, SegmentFromNet
5. MPI/CTP message headers: Rank, Tag, RTS, CTS
6. Compatible with CTP functions: Reliability, Congestion Control, Flow control (with Retransmit and Positive Ack)
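
How the header fields (Rank, Tag, RTS, CTS) relate to matching/demultiplexing and to the eager and rendezvous transfers can be sketched as follows. This is an illustration under stated assumptions, not the MPI/CTP implementation: the `Header` layout, the `kind` strings, and the `Receiver` class are hypothetical, wildcard receives are ignored, and the CTS simply echoes the rank and tag of the RTS it answers.

```python
from dataclasses import dataclass


@dataclass
class Header:
    rank: int   # rank of the sending process
    tag: int    # MPI message tag
    kind: str   # "EAGER", "RTS", "CTS", or "DATA" (hypothetical encoding)


class Receiver:
    def __init__(self):
        self.posted = []      # (rank, tag) of receives posted by the application
        self.unexpected = []  # (header, payload) arrivals with no posted receive
        self.delivered = []   # payloads handed to the application
        self.sent = []        # control headers sent back to the sender

    def _accept(self, hdr, payload):
        if hdr.kind == "RTS":
            # Rendezvous: tell the sender it may now transmit the data.
            self.sent.append(Header(hdr.rank, hdr.tag, "CTS"))
        else:
            # Eager: the payload arrived together with the header.
            self.delivered.append(payload)

    def post_receive(self, rank, tag):
        # A late receive first searches the unexpected queue.
        for i, (hdr, payload) in enumerate(self.unexpected):
            if (hdr.rank, hdr.tag) == (rank, tag):
                del self.unexpected[i]
                self._accept(hdr, payload)
                return
        self.posted.append((rank, tag))

    def segment_from_net(self, hdr, payload=None):
        if hdr.kind == "DATA":
            # Data following a CTS was matched during the handshake.
            self.delivered.append(payload)
            return
        key = (hdr.rank, hdr.tag)
        if key in self.posted:
            self.posted.remove(key)
            self._accept(hdr, payload)
        else:
            self.unexpected.append((hdr, payload))
```

In this sketch an eager message that arrives before its receive is posted must be held in the unexpected queue, whereas an unmatched RTS holds only a header. That difference is one reason the fraction of preposted receives matters when choosing a transfer protocol.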

Reconfiguration opportunities

Network-based
  Reliability protocols

Application-based
  Message list management
  Message transfer protocols
  Collectives

Case study: Adapting protocol behavior

MPI/CTP message transfer microprotocols: Eager Rendezvous, Rendezvous, Eager
Percentage of preposted receives dictates the message send protocol
Per-message, per-peer protocol reconfiguration
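
A per-message, per-peer selection driven by the preposted-receive percentage could look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the thresholds in `HYPOTHETICAL_POLICY`, the mapping of percentage ranges to protocols, the sliding window of 32 messages, and the idea that the sender learns from receiver feedback whether a message found a preposted receive. The slides state only that the percentage dictates the protocol.

```python
from collections import deque

# (minimum preposted percentage, protocol), highest threshold first.
# These numbers are invented for the example, not taken from MPI/CTP.
HYPOTHETICAL_POLICY = [
    (75.0, "Eager"),
    (25.0, "EagerRendezvous"),
    (0.0, "Rendezvous"),
]


class PeerStats:
    """Sliding window of 'was the receive preposted?' outcomes for one peer."""

    def __init__(self, window=32):
        self.history = deque(maxlen=window)

    def record(self, was_preposted):
        self.history.append(bool(was_preposted))

    def preposted_percent(self):
        if not self.history:
            return 0.0
        return 100.0 * sum(self.history) / len(self.history)


def choose_protocol(percent, policy=HYPOTHETICAL_POLICY):
    for threshold, name in policy:
        if percent >= threshold:
            return name
    return policy[-1][1]


class AdaptiveSender:
    def __init__(self):
        self.peers = {}  # peer rank -> PeerStats

    def on_feedback(self, peer, was_preposted):
        self.peers.setdefault(peer, PeerStats()).record(was_preposted)

    def protocol_for(self, peer):
        # Consulted once per outgoing message, so the choice can change
        # from one message to the next and differ between peers.
        stats = self.peers.get(peer)
        percent = stats.preposted_percent() if stats else 0.0
        return choose_protocol(percent)
```

A peer with no history falls back to the most conservative entry of the policy, which in this sketch is Rendezvous.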

Experiment setup

Hardware - 2.3 GHz Pentium III Xeon, Myrinet NIC with Lanai 7 processor
Software - Linux kernel version 2.4.2, GM 2.1.1, MPICH 1.2.6, OpenMPI 1.0.2
Benchmark - derived from SNL benchmarks
  Measure effective throughput
  Vary percent of receives preposted
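
The two benchmark knobs can be made concrete with a small sketch. It does not reproduce the SNL-derived benchmark: the function names are hypothetical, throughput is taken to mean payload bits delivered per second of elapsed time (in units of 10^6 bits per second), and spreading the preposted receives evenly across the run is an assumption.

```python
def effective_mbps(payload_bytes, elapsed_seconds):
    """Throughput in Mbps (10**6 bits per second) over the elapsed time."""
    return payload_bytes * 8 / elapsed_seconds / 1e6


def preposted_schedule(num_messages, percent):
    """Return one flag per message: True if its receive is posted in advance.

    The requested percentage is rounded to a whole number of messages and
    those messages are spread evenly across the run.
    """
    k = round(num_messages * percent / 100)
    return [
        (i + 1) * k // num_messages > i * k // num_messages
        for i in range(num_messages)
    ]
```

For example, `preposted_schedule(10, 30)` marks three of ten receives as preposted, and the benchmark would then time the transfers and report `effective_mbps` for each percentage.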

Selecting protocol dynamically provides more bandwidth

[Figure: Adaptation based on Percentage of PrePosted Receives. x-axis: Pre Posted Receives Percentage (10 to 100). y-axis: Bandwidth in Mbps (160 to 270). Series: Bandwidth-Adaptive, EagerRendezvous, Rendezvous.]

Overhead in current prototype

[Figure: Raw Bandwidth Curve - Indicating Overhead. x-axis: Message Size in Bytes (0 to 300000). y-axis: Bandwidth in Mbps (0 to 1200). Series: OpenMPI, cactus-MPI, MPICH. Annotations: Zero copy, Offload.]

Related Work

System      Approach                          MPI/CTP
Open-MPI    Coarse grained reconfiguration    Fine grained
MPICH       Compile time reconfiguration      Run time
H-CTP       Designed for Grid systems         HPC systems

Conclusion

MPI/CTP recovers lost performance
  Flexibility to provide more bandwidth to application

Contributions
  Protocol architecture for application and hardware specific protocol reconfiguration in MPI
  Prototype MPI implementation supporting reconfiguration at compile time and runtime

Future Work

Support for zero-copy in MPI/CTP
Demonstrate advantages of hardware specific protocol reconfiguration in MPI
Collective and single sided MPI operations reconfigurable to application requirements

Thanks

Acknowledgments
  UNM - Prof. Patrick Bridges, Prof. Barney Maccabe
  SNL - Ron Brightwell, Rolf Riesen
  SSL members
  DOE - Office of Science grant DE-FG02-05ER25662
  Sandia University Research Program contract number 190576
