Swift/T: Scalable Data Flow Programming for Many-Task Applications

Recommend Documents

Feb 27, 2013 - The data flow programming model of the Swift parallel scripting language [6] can elegantly express, through implicit parallelism, the massive ...

Collective Communications for Scalable Programming

Adlib was originally developed using C++ by the Parallel Compiler Runtime Con- ... The basic features of HPJava [10] [11] [12] have been described in several ear- ... A translated and compiled HPJava program is a standard Java class file, ..... ifica

Collective Communications for Scalable Programming

sections regardless of source and destination mapping. Another function, write- ... void sendReq(int offset, int[] strs, int[] exts, int dstId) { ... } void recvReq(int offset ...

Scalable Performance for Web Applications

management from VMware vFabric tc Server or Tomcat application servers , as well as .... custom Java applications â in

Anthill: A Scalable Run-Time Environment for Data Mining Applications

We show scalability results of two different data-mining algorithms that were parallelized using our approach and our runt-time support. Both applications scale.

Service-Oriented Data Denormalization for Scalable Web Applications

Apr 25, 2008 - A conversation with Werner Vogels. ACM. Queue, 4(4):14â22, May 2006. [13] S. D. Gribble, E. A. Brewer, J. M. Hellerstein, and. D. Culler.

Scalable Inter-Vehicular Applications

aggregation and summarisation explicitly, allowing a compiler to auto- matically optimise the ... Internetâthese applications will evolve beyond disconnected intra-vehicle applications and will help to ..... In the exponentia- tion example, we ..

Data-Flow Oriented Visual Programming Libraries for ... - CiteSeerX

Both so-called viewers offer solid, hidden-line, wire-frame and .... and other symmetric sequences. http://www.netlib.org/fftpack/index.html. 6. Boor,. C. de:.

Orla: A data flow programming system for very

warehouse are SQDADL expressions and the repository returns the ..... e.g., in the construction of a yield curve on the USD all maturities are needed, but.

A Methodology for Programming Scalable Architectures - CiteSeerX

contract numbers N00014-90-J-1899 and N00014-93-1-0273), by an ... In this paper, we describe a methodology for programming concurrent computers which.

Concurrent Programming for Scalable Web Architectures

Jan 17, 2012 ... 3.1.1 Server-Side Technologies for Dynamic Web Content . . . . . . . . . . 23. 3.1.2 Tiered ... 4 Web Server Architectures for High Concurrency. 45.

Polyglot Programming in Applications Used for Genetic Data Analysis

Jul 31, 2014 - libraries, and tools, used to develop applications for process- ing genetic data. .... Ajax techniques available on modern web browsers, mainly.

Constraint Programming Languages for Big Data Applications - Rice CS

Constraint Programming Languages for Big Data Applications. Francesca Rossi1 and Vijay Saraswat2. 1 University of Padova

Polyglot Programming in Applications Used for Genetic Data Analysis

Jul 31, 2014 - three-layer architecture: the desktop, the database server, the thin client, and the ... framework was designed to support multiuser applications. Collaboration .... are written explicitly and this library has the best docu- mentation.

Concurrent Programming for Scalable Web Architectures

Jan 17, 2012 - the host of the web server hosting the resource. ...... Event-driven web servers like nginx (e.g. GitHub, WordPress.com), lighttpd (e.g. YouTube,.

A programming model for spatio-temporal data streaming applications

languages (e.g., C/C++, Java, PHP, JavaScript, python, etc.) do not ... led to the tragic crash of Air France flight 447, killing all 216 passengers and 12 aircrew [2].

Polyglot Programming in Applications Used for Genetic Data Analysis

Jul 31, 2014 - ties of modern computers with multiple processors and/or multiple .... (MVVM) client-side JavaScript framework AngularJS [8]. Apache Flex is a ...

Scalable Problem Localization for Distributed Systems - Programming ...

{rui.zhang,bruno,swm}@comlab.ox.ac.uk [email protected] ... works, at their best, are able to achieve O(logN) problem localization time and O(1) per node ...

Network Configuration and Flow Scheduling for Big Data Applications

Nov 3, 2015 - Storm [5] is another approach that aims at streaming data analytics, while ...... Big Data applications are a major representative in today's cloud ...

Tree-based Overlay Networks for Scalable Applications

data analysis or computation. As the number of nodes. in the system increases, both the computational and. communication demands on the system increase ...

LENOVO'S SCALABLE SOLUTIONS FOR SAP APPLICATIONS

Enterprise Cloud platform software running on the hyperconverged Lenovo HX .... This consolidated, scale out solution re

APPLICATIONS FOR THE SCALABLE COHERENT ... - SLAC

Apr 9, 1990 - processor chip technology, and are not susceptible to quick obsolescence. .... signal return) pins to keep noise under control; there are already high speed ... Therefore, we specify a mechanism which allows each interface to ...

Tree-based Overlay Networks for Scalable Applications

study efficient ways for tools and applications to run. effectively in ... TB Â¯ON infrastructure, be it a tool (as is most common) ..... ing trends in business analytics.

Designing Scalable Synthetic Compact Applications for Benchmarking ...

Oct 15, 2006 - the Scalable Synthetic Compact Applications (SSCA) benchmark suite, ..... application programmer to attain high performance on these codes.

Swift/T: Scalable Data Flow Programming for Many-Task Applications

Download PDF

4 downloads 37 Views 190KB Size Report

Comment

Feb 27, 2013 - Argonne National Laboratory &. University of Chicago [email protected]. Abstract. Swift/T, a novel programming language implementation for ...

Swift/T: Scalable Data Flow Programming for Many-Task Applications Justin M. Wozniak

Timothy G. Armstrong

Michael Wilde

Argonne National Laboratory & University of Chicago [email protected]

University of Chicago [email protected]

Argonne National Laboratory & University of Chicago [email protected]

Daniel S. Katz

Ewing Lusk

Ian T. Foster

University of Chicago & Argonne National Laboratory [email protected]

Argonne National Laboratory [email protected]

Argonne National Laboratory & University of Chicago [email protected]

Abstract Swift/T, a novel programming language implementation for highly scalable data flow programs, is presented. Categories and Subject Descriptors D.3.2 [Programming Languages]: Language Classifications— Concurrent, distributed, and parallel languages; Data-flow languages General Terms Languages

We have evaluated the performance and programmability of Swift/T for a collaboration graph analysis and optimization application, a branch-and-bound game solver, and synthetic stress tests of language constructs. Our tests show that Swift/T can already scale to 128K compute cores with 85% efficiency for 100second tasks. Thus, Swift/T provides a scalable parallel programming model for productively expressing the outer levels of highlyparallel many-task applications. The benefits of these advances are illustrated by considering the Swift code fragment in Figure 1.

Keywords MPI; ADLB; Swift; Turbine; exascale; concurrency; dataflow; futures

1.

Introduction

Many important application classes that are driving the requirements for extreme-scale systems—branch and bound, stochastic programming, materials by design, uncertainty quantification—can be productively expressed as many-task data flow programs. The data flow programming model of the Swift parallel scripting language [6] can elegantly express, through implicit parallelism, the massive concurrency demanded by these applications while retaining the productivity benefits of a high-level language. However, the centralized single-node evaluation model of the previously developed Swift implementation limits scalability. Overcoming this important limitation is difficult, as evidenced by the absence of any massively-scalable data flow languages in current use. We present here Swift/T, a new data flow language implementation designed for extreme scalability. Its technical innovations include a distributed data flow engine that balances program evaluation across massive numbers of nodes using data-flow-driven task execution and a distributed data store for global data access. Swift/T extends the Swift data flow programming model of external executables with file-based data passing to finer-grained applications using in-memory functions and in-memory data. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. PPoPP’13, February 23–27, 2013, Shenzhen, China. c 2013 ACM 978-1-4503-1922/13/02. . . $10.00 Copyright

1 2 3 4 5 6 7 8 9 10 11 12

Start

int X = 1000, Y = 1000; Outer int A[][]; Loops int B[]; Inner foreach x in [0:X-1] { Loops foreach y in [0:Y-1] { check() if (check(x, y)) { then / else A[x][y] = g(f(x), f(y)); f() } else { g() A[x][y] = 0; }} sum() B[x] = sum(A[x]); }

...

... T

...

...

F

Task Data Spawn Task Data wait/write

Figure 1. Simple data flow application. The implicit parallelism of this code generates 1 million concurrent executions of the inner block of expressions, invoking as many as 4M function calls (3M within conditional logic). Previously, the single-node Swift engine would perform the work of sending these leaf function tasks to distributed CPUs at