Multicore Programming in ParaSail
Parallel Specification and Implementation Language
S. Tucker Taft
SofCheck, Inc.
[email protected]
Abstract. The advent of multicore processors requires a new approach to programming. ParaSail is an example of such a new approach. It is a marriage of implicit parallelism and formal methods integrated into a simplified yet powerful programming language.
Keywords: parallel programming, formal methods, multicore processor, race-free
1 Introduction
ParaSail [1] is a new language for race-free parallel programming, with a distinct approach to supporting parallelism. ParaSail has two overarching themes: language semantics should be parallel by default, forcing the programmer to work harder if sequential execution is required; and all checking, for race conditions, user-defined assertions, and other potential run-time problems such as null values or array out of bounds, should be performed at compile-time. ParaSail was created not by bolting parallelism and formal annotations onto an existing language, but rather by going back to basics, and building parallelism and formal annotations into the language from the beginning, while simplifying and unifying concepts wherever possible.
2 Implicitly Parallel
All expression evaluation in ParaSail is parallel by default. Statements and loops may be executed explicitly in parallel, explicitly in sequence, or (by default) in an order constrained only by data dependences. Annotations such as preconditions, postconditions, assertions, constraints, and invariants are woven into the syntax, and are enforced at compile-time. Both sequential and concurrent data structures are supported, with both lock-based and lock-free concurrency mechanisms provided. To enable its full compile-time checking, ParaSail does not allow operations to access global variables, requiring all outputs and non-constant inputs of an operation to be explicitly declared. In addition, no aliasing is permitted between a writable parameter that is of a non-concurrent type and any other parameter to the operation.
3 Simplified and Unified Language Concepts
To make conceptual room for including implicit parallelism and formal annotations in ParaSail, a conscious attempt was made to eliminate from the language all extraneous concepts, and to unify those that remain. ParaSail has four basic concepts: modules, types, objects, and operations. All modules are parameterized (like a generic template). Every type is an instance of a module. Every object is an instance of a type. Operations are defined in modules and operate on objects.
There is no special syntax for built-in types. Instead, all aspects of a type are definable by the user, including what literals are appropriate for the type, whether the type is indexable like an array, the comparison operations available on the type, and any other operators available on the type.
Modules may be sequential or concurrent, with their instances being sequential or concurrent types, respectively. Instances of a concurrent type support concurrent access by multiple threads. Synchronization for concurrent objects may be indicated as locked, conditionally queued, or lock-free.
There is no explicit use of the heap or pointers in ParaSail. All ParaSail objects effectively live on the stack, in a region associated with the scope where they are declared. Objects are extensible and shrinkable as the result of assignment, but never share space with any other object. Storage management is automatic within each region, but there is no need for asynchronous garbage collection since all size-changing operations are explicit and there is no sharing.
There are no exceptions in ParaSail, though it is possible for one thread to explicitly “exit” or “return” from a lexically enclosing construct, and as a side effect terminate all other threads active within the construct. Large objects are generally passed by reference, but since there is no aliasing for non-concurrent objects and no exceptions, passing non-concurrent objects by copy is feasible as well.
4 Parallel Run-Time Model and Pico-Threading
ParaSail’s run-time model is most closely related to that of Intel’s Cilk language [2], where small computations are spawned off as pico-threads, which are then served by a set of worker processes running on separate processors or cores. Computations spawned by a given worker are served last-in, first-out (LIFO) by that worker. When a worker runs out of threads to serve, it steals from the queue of another worker, but in this case using first-in, first-out (FIFO). Because of the lack of aliasing and the concurrent looping constructs, ParaSail is also amenable to the stream computing model of CUDA [3] and OpenCL [4], where the body of a concurrent loop becomes a “kernel” which is executed on each element of the container or stream over which the iteration applies. Because of the lack of pointers and exceptions, passing parameters by copy, as might be required when communicating with a Graphics Processing Unit (GPU), is straightforward.
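The LIFO-for-owner, FIFO-for-thief discipline described above can be illustrated with a small single-threaded simulation in Python. This is only a sketch of the queueing policy, not ParaSail's run-time: the names `Worker`, `spawn`, `next_task`, `steal_from`, and `run` are hypothetical, and a real scheduler would use concurrent deques and actual processor cores.

```python
from collections import deque


class Worker:
    """Hypothetical worker owning a double-ended queue of pending pico-threads."""

    def __init__(self, name):
        self.name = name
        self.tasks = deque()

    def spawn(self, task):
        # Newly spawned work goes on the "young" end of the owner's queue.
        self.tasks.append(task)

    def next_task(self):
        # The owner serves its own work last-in, first-out (LIFO).
        return self.tasks.pop() if self.tasks else None

    def steal_from(self, victim):
        # A thief takes the oldest entry of the victim's queue (FIFO).
        return victim.tasks.popleft() if victim.tasks else None


def run(workers):
    """Round-robin simulation; returns (worker, task) pairs in service order."""
    log = []
    while any(w.tasks for w in workers):
        for w in workers:
            task = w.next_task()
            if task is None:
                # Out of local work: try to steal from another worker.
                for victim in workers:
                    if victim is not w:
                        task = w.steal_from(victim)
                        if task is not None:
                            break
            if task is not None:
                log.append((w.name, task))
    return log


busy, idle = Worker("busy"), Worker("idle")
for t in ["t1", "t2", "t3", "t4"]:
    busy.spawn(t)
trace = run([busy, idle])
```

In this run the owner serves its newest work (`t4`, then `t3`) while the idle worker steals the oldest (`t1`, then `t2`), so the two workers pull from opposite ends of the queue.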
5 Deterministic and non-Deterministic Race-Free Parallel Programming
ParaSail makes it easy for the programmer to achieve determinism when desired, but does not force overspecification. For example, the iterations of a (non-concurrent) loop over a sequence are by default unordered, but the programmer may specify “forward” or “reverse” explicitly. Leaving the order unspecified enables the compiler to more readily interleave or run in parallel non-data-dependent parts of the loop. Similarly, by default the execution of sequential statements is limited only by data dependencies involving non-concurrent data structures, but the programmer can force strictly sequential execution by using “;;” rather than simply “;” to separate statements. Or the programmer can go the other way, and effectively declare that there are no non-concurrent data structure dependencies by using “||” rather than “;” to separate statements, essentially “forcing” parallel execution.
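As a rough analogy only, the difference between “;;” and “||” can be mimicked in Python with a thread pool. The sketch below is not ParaSail and assumes hypothetical helper functions `sum_of_squares` and `sum_of_cubes`; in Python the programmer must establish by hand that the two statements are independent, whereas the text above describes ParaSail checking this at compile-time.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(values):
    return sum(v * v for v in values)


def sum_of_cubes(values):
    return sum(v * v * v for v in values)


def run_sequentially(values):
    # Analogue of "A ;; B": B does not start until A has finished.
    a = sum_of_squares(values)
    b = sum_of_cubes(values)
    return a, b


def run_in_parallel(values):
    # Analogue of "A || B": neither statement writes data the other reads,
    # so both may be started at once and joined afterwards.
    with ThreadPoolExecutor(max_workers=2) as pool:
        fa = pool.submit(sum_of_squares, values)
        fb = pool.submit(sum_of_cubes, values)
        return fa.result(), fb.result()


data = [1, 2, 3, 4]
seq_result = run_sequentially(data)
par_result = run_in_parallel(data)
```

Because the two computations only read `values`, both orderings produce the same pair of results; only the permitted schedule differs.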
6 Object-Oriented Programming in ParaSail
ParaSail supports object-oriented programming, including inheritance and polymorphism. Each module has an “interface,” and if not declared as abstract, a “class” that defines it. Modules may inherit operation interfaces from one or more other modules, and may inherit operation code and data components from at most one other module. Named sets of operations may be effectively appended to a module, without disturbing the original module, largely bypassing the need for “visitor” operations. A polymorphic variant of a type, identified by appending a “+” to the type name, may be used anywhere a type is used, to represent any type that implements the associated interface.
7 Conclusion
Our position is that languages like ParaSail are the way to bring safe and efficient parallel programming to the masses, which will be mandatory as we move into the era of multicore on the desktop. ParaSail fosters the use of parallel programming by making programs parallel by default, while eliminating programmer concerns like race conditions and run-time failures, thereby easing the debugging burden. This burden is further reduced by eliminating exceptions, the heap, and reassignable pointers, and by unifying the typically distinct concepts of generic templates, packages, namespaces, modules, interfaces, classes, objects, and structs into a single notion of module, with every type being an instance of a module, and every object being an instance of a type.
References
1. Taft, S. T.: ParaSail Programming Language blog, http://parasail-programming-language.blogspot.com (2011)
2. Blumofe et al.: Cilk: An Efficient Multithreaded Runtime System, http://publications.csail.mit.edu/lcs/pubs/pdf/MIT-LCS-TM-548.pdf (1995)
3. NVIDIA: What is CUDA, http://www.nvidia.com/object/what_is_cuda_new.html (2011)
4. Khronos Group: OpenCL - The open standard for parallel programming of heterogeneous systems, http://www.khronos.org/opencl/ (2011)
Appendix
As an example of the syntax of ParaSail, here is a parallel version of the Quicksort algorithm in ParaSail. The expressions in braces are annotations which are checked for validity at compile-time. Comments start with “//”. Reserved words are in lower case. In this example, user identifiers are in mixed case, though that is not required. Note that rather than explicit recursion, a parallel “continue loop” is used to perform the sorting of the two partitions of the original array.

    interface Sorting<One_Dim_Array<>> is
        // Non-recursive parallel quicksort
        procedure Quicksort(Arr : ref var One_Dim_Array;
          function Before(Left, Right : One_Dim_Array::Element_Type)
            -> Boolean is "<");
    end interface Sorting;

    class Sorting is
      exports
        procedure Quicksort(Arr : ref var One_Dim_Array;
          function Before(Left, Right : One_Dim_Array::Element_Type)
            -> Boolean is "<")
        is
            for A => Arr while Length(A) > 1 loop
                // Handle short arrays directly. Partition longer arrays.
                if Length(A) == 2 then
                    if Before(A[A.Last], A[A.First]) then
                        // Swap the elements if out of order
                        A[A.Last] :=: A[A.First];
                    end if;
                else
                    // Partition array
                    const Mid := A[A.First + Length(A)/2];
                    var Left : Index_Type := A.First;
                    var Right : Index_Type := A.Last;
                    until Left > Right loop
                        var New_Left : Index_Type := Right+1;
                        var New_Right : Index_Type := Left-1;
                        block
                            // Find item in left half to swap
                            for I in Left .. Right forward loop
                                if not Before(A[I], Mid) then
                                    // Found an item that can go into right partition
                                    New_Left := I;
                                    if Before(Mid, A[I]) then
                                        // Found an item that *must* go into right part
                                        exit loop;
                                    end if;
                                end if;
                            end loop;
                        ||
                            // In parallel, find item in right half to swap
                            for J in Left .. Right reverse loop
                                if not Before(Mid, A[J]) then
                                    // Found an item that can go into left partition
                                    New_Right := J;
                                    if Before(A[J], Mid) then
                                        // Found an item that *must* go into left part
                                        exit loop;
                                    end if;
                                end if;
                            end loop;
                        end block;
                        if New_Left > New_Right then
                            // Nothing more to swap
                            // Exit loop and recurse on two partitions
                            Left := New_Left;
                            Right := New_Right;
                            exit loop;
                        end if;
                        // Swap items
                        A[New_Left] :=: A[New_Right];
                        // continue looking for items to swap
                        Left := New_Left + 1;
                        Right := New_Right - 1;
                    end loop;
                    // At this point, "Right" is right end of left partition
                    // and "Left" is left end of right partition
                    // and the partitions don't overlap
                    // and neither is the whole array
                    // and everything in the left partition can precede Mid
                    // and everything in the right partition can follow Mid
                    // and everything between the partitions is equal to Mid.
                    {Left > Right; Right < A.Last; Left > A.First}
                    {(for all I in A.First .. Right => not Before(Mid, A[I]));
                     (for all J in Left .. A.Last => not Before(A[J], Mid));
                     (for all K in Right+1 .. Left-1 =>
                        not Before(Mid, A[K]) and not Before(A[K], Mid))}
                    // Iterate on two halves (in parallel)
                then
                    continue loop with A => A[A.First .. Right];
                ||
                    continue loop with A => A[Left .. A.Last];
                end if;
            end loop;
        end procedure Quicksort;
    end class Sorting;