Dataflow Language Compilation for a Single Chip Massively Parallel Processor (Keynote Abstract)
Benoit Dupont de Dinechin
CTO, Kalray, France

Abstract: The Kalray MPPA-256 processor (Multi-Purpose Processing Array) integrates 256 processing engine (PE) cores and 32 resource management (RM) cores on a single 28 nm CMOS chip. These cores are distributed across 16 compute clusters and 4 I/O subsystems. On-chip communication and synchronization are supported by an explicitly addressed dual network-on-chip (NoC), with one node per compute cluster and 4 nodes per I/O subsystem. The Kalray MPPA software development kit includes a complete programming environment for a C-based dataflow language, whose compiler fully automates the distributed execution of tasks across the processing, memory, communication, and synchronization resources of the MPPA architecture.
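As a quick consistency check on the figures above, the following sketch tallies the NoC nodes and the PE cores per compute cluster. The node counts come from the abstract (reading it as 4 nodes per I/O subsystem); the even split of PE cores across compute clusters is an assumption made for illustration, and all names are hypothetical.

```python
# Figures stated in the abstract.
COMPUTE_CLUSTERS = 16
IO_SUBSYSTEMS = 4
PE_CORES = 256
NOC_NODES_PER_CLUSTER = 1
NOC_NODES_PER_IO_SUBSYSTEM = 4


def noc_node_count():
    """Total NoC nodes: one per compute cluster plus four per I/O subsystem."""
    return (COMPUTE_CLUSTERS * NOC_NODES_PER_CLUSTER
            + IO_SUBSYSTEMS * NOC_NODES_PER_IO_SUBSYSTEM)


def pe_cores_per_cluster():
    """PE cores per compute cluster.

    Assumption (not stated in the abstract): all PE cores sit in compute
    clusters and are spread evenly across them.
    """
    return PE_CORES // COMPUTE_CLUSTERS


print(noc_node_count())        # 16 * 1 + 4 * 4
print(pe_cores_per_cluster())  # 256 // 16
```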
© 2013 IEEE 978-1-4799-1010-6/13/$31.00
We first introduce the model of computation of the Kalray dataflow language, which is based on cyclo-static dataflow with extensions such as the firing thresholds of Karp & Miller computation graphs. We then describe the main steps of dataflow compilation to a distributed execution platform. These include task sequencing, communication buffer sizing, task clustering, DMA engine exploitation, place & route, NoC bandwidth allocation, and generation of run-time tables. Finally, we discuss the suitability and restrictions of this and related static dataflow models of computation with regard to the dynamic and real-time requirements of embedded applications targeted by the MPPA processor.
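To make the cyclo-static model and the buffer sizing step more concrete, here is a minimal sketch for a single producer-to-consumer channel. It is not the Kalray compiler's algorithm: the function names, the demand-driven schedule, and the example rates are all assumptions chosen for illustration. The sketch balances the channel to find how many rate cycles each task runs per iteration, then simulates one iteration to measure the peak number of buffered tokens. The `threshold` parameter follows the Karp & Miller convention that a consumer may fire only when the channel holds at least that many tokens, even though it removes fewer.

```python
from math import gcd


def repetition_counts(prod, cons):
    """Return (producer cycles, consumer cycles) per balanced iteration.

    prod and cons list the tokens produced / consumed in each phase of one
    cyclo-static cycle. Totals per cycle are assumed to be positive.
    """
    p, c = sum(prod), sum(cons)
    g = gcd(p, c)
    return c // g, p // g


def peak_occupancy(prod, cons, threshold=0, initial=0):
    """Simulate one iteration and return the peak token count on the channel.

    Hypothetical schedule: the consumer fires whenever it is enabled,
    otherwise the producer fires. The consumer is enabled when the channel
    holds at least max(threshold, tokens consumed in the current phase).
    """
    q_prod, q_cons = repetition_counts(prod, cons)
    prod_left = q_prod * len(prod)  # producer firings in one iteration
    cons_left = q_cons * len(cons)  # consumer firings in one iteration
    i = j = 0                       # current producer / consumer phase
    tokens = peak = initial
    while prod_left or cons_left:
        need = max(threshold, cons[j])
        if cons_left and tokens >= need:
            tokens -= cons[j]
            j = (j + 1) % len(cons)
            cons_left -= 1
        elif prod_left:
            tokens += prod[i]
            i = (i + 1) % len(prod)
            prod_left -= 1
            peak = max(peak, tokens)
        else:
            raise RuntimeError("deadlock: consumer below its firing threshold")
    return peak


# Producer emits 2 then 1 tokens per cycle; consumer takes 1 token per phase.
print(repetition_counts([2, 1], [1, 1]))
print(peak_occupancy([2, 1], [1, 1]))
print(peak_occupancy([2, 1], [1, 1], threshold=2, initial=1))
```

Under this schedule, raising the threshold above the consumption rate increases the buffer requirement, and without an initial token the last consumer firing can never be enabled, which the sketch reports as a deadlock. A real compiler would size buffers against its actual task sequencing rather than this simple demand-driven order.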