Internet-Based TSP Computation with Javelin++ Michael O. Neary and Peter Cappello Department of Computer Science University of California, Santa Barbara Santa Barbara, CA 93106 fneary,
[email protected]
Abstract
Javelin++ is a Java-based infrastructure for Internet computing. This paper presents an extension of Javelin++ to solve the Traveling Salesman Problem (TSP), a computationally complex combinatorial problem. Javelin++'s piecework computational model is extended to support a branch-and-bound model that is applied to the TSP computation. This extension implements the pipelined RAM model of cache consistency. Like Javelin++'s piecework computational model, the underlying architecture of this branch-and-bound model is based on clients and hosts coordinated by brokers. The paper presents: a branch-and-bound computational model, the implementation of a scalable task scheduler using distributed work-stealing, the design and implementation of fault tolerance and load balancing via eager scheduling, and the results of performance experiments, which appear promising. The paper focuses on a scalable implementation of branch-and-bound (especially the “bound” part) on Internetworked computers. Javelin++'s seamless integration of distributed work-stealing, bound propagation, and lazy, light-weight, eager scheduling is the principal contribution of this paper.
1. Introduction
Our goal is to use Java to harness the Internet's vast, growing computational capacity for ultra-large, coarse-grained parallel applications. By providing a portable, secure programming system, Java holds the promise of harnessing this large heterogeneous computer network as a single, homogeneous, multi-user multiprocessor [6, 10, 1]. Other research projects designed to exploit this capacity include Charlotte [5], Atlas [4], Popcorn [8], Javelin [9], Bayanihan [16], Manta [18], Ajents [11], and Globe [3].
While global computing raises many issues, we focus here on three issues that are fundamental to every Java-based global computing application:
Performance — If there is no niche where Java-based global computing outperforms existing multiprocessor systems, then there is no reason to use it.
Scalability — In order for the system to outperform existing multiprocessor systems, it must harness a larger set of processors: The architecture must scale to a higher degree than existing multiprocessor architectures, such as networks of workstations (NOW) [2].
Fault tolerance — An architecture that scales to thousands of hosts must be fault tolerant, particularly when hosts, in addition to failing, may dynamically disassociate from further participation in an ongoing computation.
The piecework computational model that Javelin++ supports is extended to implement the pipelined RAM [12] model of cache consistency. This model of shared memory is strong enough to support branch-and-bound computation (in particular, bound propagation), but weak enough to enable high performance. Each host caches a copy of the current minimum cost upper bound.

We present a high-performance, scalable, fault tolerant, Internetworked architecture for the Traveling Salesman Problem. For such an architecture to succeed, its architects must stay mindful of the central technical constraint: on the Internet, communication latency is large. Absolute communication efficiency depends on the communication protocol used; architectural scalability does not. In Javelin++, we use RMI (as opposed to, say, using TCP directly), but the choice of protocol is not the focus of our investigation.

The remainder of the paper is organized as follows. The next section presents the branch-and-bound model of computation. Section 3 presents the scheduler, a scalable, distributed, deterministic work stealer, as well as the fault tolerance scheme (which also contributes to balancing the load among hosts), a distributed eager scheduler. Section 4 presents results of performance experiments. The final section concludes the paper, outlining some immediately fruitful areas of scalable global computing research and development.
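To make the idea of a per-host cached bound concrete, the following is a minimal sketch, not Javelin++ code. The class name BoundCache and its methods are hypothetical, and costs are assumed to be integral. It illustrates why a weak consistency model can suffice for bound propagation: because a cached value is only ever replaced by a smaller one, hosts may receive updates late or in different orders and still prune safely, if less aggressively.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical host-local cache of the best known tour cost (not a Javelin++ class). */
final class BoundCache {
    // Start at "infinity": no complete tour has been found yet.
    private final AtomicLong upperBound = new AtomicLong(Long.MAX_VALUE);

    /** Returns the cached upper bound; it may lag behind the bounds held by other hosts. */
    long get() {
        return upperBound.get();
    }

    /**
     * Merges a bound found locally or received from another host.
     * Returns true only if the cache improved, i.e. the value is worth forwarding.
     */
    boolean offer(long candidate) {
        long current = upperBound.get();
        while (candidate < current) {
            if (upperBound.compareAndSet(current, candidate)) {
                return true;
            }
            current = upperBound.get(); // lost a race; re-read and retry
        }
        return false; // stale or duplicate update: ignore it
    }

    /** A subproblem can be skipped when its lower bound cannot improve on the cached bound. */
    boolean canPrune(long lowerBound) {
        return lowerBound >= upperBound.get();
    }
}
```

In this sketch a stale cache never causes a wrong answer: a host holding an outdated, larger bound simply explores some subproblems that a better-informed host would have skipped.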
2. Model of Computation
The branch-and-bound method intelligently enumerates all feasible points of a combinatorial optimization problem. (See Papadimitriou and Steiglitz [15] for a more complete discussion of branch-and-bound.) The qualification intelligent refers to the fact that not all feasible solutions are examined. Branch-and-bound, in effect, produces a proof that the best solution is found without actually examining all feasible solutions. The method successively partitions the solution space (branches), and skips a subspace (bounds), when there is sufficient information to infer that none of the subspace's solutions are as good as a solution that already has been found. A basic, sequential branch-and-bound algorithm is as follows (please see [15]):

    activeset = {0}; // "0" is the original problem.
    U = infinity;
    while ( !activeset.empty() ) {
        node = activeset.select(); // removes node
        for (int i = 1; i