An Efficient BSP/CGM Algorithm for the Matrix Chain Ordering Problem

Mounir Kechid and Jean Frédéric Myoupo
Université de Picardie-Jules Verne, Laboratoire Modélisation, Information & Systèmes
33 rue Saint Leu, 80039 Amiens, France
[email protected]

Abstract- The matrix chain ordering problem (MCOP) is widely used in computer science, especially in combinatorial optimization. Even though there has been intensive work on the parallelization of dynamic programming on PRAM and systolic arrays, among others, its parallel version on BSP/CGM is still to be done. In our former work [10], we proposed a BSP/CGM algorithm for this problem running in O(n^3/P). That approach was based on the classical sequential algorithm (running in O(n^3)), hence the algorithm we obtained is not optimal. In this paper our strategy is based on Yao's sequential algorithm for dynamic programming [18], which runs in O(n^2). Our resulting algorithm runs in S super-steps with O(n^2/P) execution time per processor. To our knowledge, it is the first CGM algorithm for this problem derived from Yao's acceleration technique.

Keywords: Dynamic Programming, Parallel Algorithms, BSP/CGM Algorithms.

1. INTRODUCTION

Dynamic programming is a widely used technique for solving problems in combinatorial optimization. The idea is to order the computation of the sub-problem solutions in such a way that each of them is computed only once. This situation arises when we deal with a multistage acyclic dependence graph:
• Each node is a sub-problem.
• A node that does not have an outgoing arc corresponds to the initial state.
• An outgoing arc from node N1 to node N2 means that the computation of the optimal solutions of N2 depends on the optimal solutions of N1.
• A stage is a set of independent sub-problems. A sub-problem SP belongs to level i if it has no outgoing arcs to levels j ≤ i and it has at least one outgoing arc to level i+1.

Solving the problem amounts to first solving the sub-problems that make up its multi-level DAG. According to the nature of the dependences between sub-problems, two multi-level dynamic programming approaches are to be considered [6]:
a. serial dynamic programming, in which the solution of a sub-problem of a given level depends exclusively on solutions of sub-problems of the immediately preceding level;
b. non-serial dynamic programming, in which the solution of a sub-problem of a given level depends on solutions of sub-problems of several preceding levels.

Moreover, according to the number of terms of the recurrence (see equation 1 below), dynamic programming is said to be monadic if the inherent cost function has only one recurrence term, and polyadic otherwise.

In this paper we tackle, for the first time to our knowledge, the parallelization of such a model on the bridging coarse-grain model BSP/CGM (Bulk Synchronous Parallel model/Coarse Grain Multicomputer) [16, 17]. Specifically, the MCOP (Matrix Chain Ordering Problem), a non-serial dynamic programming problem, is the one addressed in this paper. CGM seems the best suited for the design of algorithms that are not too dependent on an individual architecture. A BSP/CGM machine is a set of P processors, each having its own local memory of size M (with O(M) >> O(1)) and connected to a router able to deliver messages in a point-to-point fashion. A BSP/CGM algorithm consists of alternating local computations and global communication rounds. Each communication round consists of routing a single h-relation with h = O(M). A CGM computation/communication round corresponds to a BSP super-step with communication cost g·M [6], where g is the cost of communicating one word in the BSP model. To produce an efficient BSP/CGM algorithm, the designer must seek to maximize the speedup and minimize the number of communication rounds (ideally independent of the problem size, and constant in the best case).

The generic¹ sequential algorithm that solves the MCOP (1) runs in O(n^3) [5]. For an equivalent problem, the Optimal Binary Search Tree problem, Kechid and Myoupo [10] proposed an efficient BSP/CGM algorithm with O(P) communication rounds, running in O(n^3/P) per processor. However, this BSP/CGM algorithm does not match the best sequential algorithm.
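As an illustration of the O(n^3) sequential scheme mentioned above, here is a minimal Python sketch. It assumes the standard textbook recurrence for the MCOP, Cost(i,j) = min over i ≤ k < j of Cost(i,k) + Cost(k+1,j) + d(i-1)·d(k)·d(j), which may differ in notation from equation (1) of the paper. The function name `matrix_chain_cost` and the `dims` parameter are hypothetical names chosen for this example.

```python
def matrix_chain_cost(dims):
    """Classical O(n^3) dynamic program for the MCOP (textbook recurrence).

    dims has length n + 1; matrix M_i has shape dims[i-1] x dims[i].
    Returns Cost(1, n), the minimum number of scalar multiplications.
    """
    n = len(dims) - 1
    if n < 1:
        return 0
    # cost[i][j] = optimal cost of the product M_i ... M_j (1-indexed).
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    # Each chain length is one level of the dependence DAG. A level uses
    # results from all shorter lengths, which is why the problem is non-serial.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            # Try every split point k between M_i..M_k and M_{k+1}..M_j.
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j] + dims[i - 1] * dims[k] * dims[j]
                for k in range(i, j)
            )
    return cost[1][n]
```

For three matrices of shapes 10x30, 30x5 and 5x60, `matrix_chain_cost([10, 30, 5, 60])` evaluates both parenthesizations (4500 and 27000 scalar multiplications) and returns the smaller one. The three nested loops over length, i and k give the cubic running time that the acceleration technique aims to reduce.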
In fact, depending on the semantic nature of the function f, different specific sequential algorithms have been proposed in the literature [Knuth, Yao, Chin, Raman]. For the MCOP, by reducing the number of sub-problems to be solved, Yao [18] presented an accelerated generic algorithm that brought the execution time down from O(n^3) to O(n^2). This approach is called dynamic programming acceleration. Introducing dynamic programming acceleration in BSP/CGM is not straightforward. Recently, Kechid and Myoupo [11] remarked that it is difficult to parallelize the Knuth acceleration of the OBST (Optimal Binary Search Tree) problem (O(n^3) → O(n^2)) in BSP/CGM. The processing of the MCOP on parallel models other than BSP/CGM can be found in [1, 2, 4, 7, 14, 15].

¹ With the semantics of f differing from one problem to another.

1.1 Our contribution

In this paper we propose a BSP/CGM parallel algorithm for the MCOP based on Yao's acceleration. It runs in O(S) communication rounds and needs O(n^2/P) operations per processor, S being dynamically computed. To the best of our knowledge, it is the first CGM parallel algorithm based on Yao's sequential version of the MCOP.

The rest of this paper is organized as follows. Section 2 defines the MCOP and the Optimal Triangulation Problem for a convex polygon, which is equivalent to the MCOP. Section 3 presents Yao's sequential algorithm for the MCOP, which is derived from the above equivalence and runs in O(n^2) [Yao 1981]. The MGEN algorithm is described in Section 4. A conclusion ends the paper.

Solving P efficiently reduces to computing Cost(1,n).

2.2 Optimal Triangulation Problem

As said earlier, some properties of the function f in the MCOP can help to accelerate the sequential solution of the MCOP from O(n^3) to O(n^2). To point out these properties, one usually considers another optimization problem that is equivalent to the MCOP [Yao 1982, Chin 1981, Raman 1990]: the so-called Optimal Triangulation Problem. Consider a convex polygon with n+1 weighted vertices (v0, v1, …, vn), each with weight W(vi). The problem is to find the minimal cost of a triangulation of this polygon. The triangles must not intersect one another, and the cost of a triangle with vertices vi, vj and vk is W(vi)*W(vj)*W(vk). The arc vivj (i
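The triangulation cost defined in this section can be sketched in Python as follows. This is a plain cubic-time dynamic program given only to illustrate the cost model and its equivalence with the MCOP; it is not Yao's O(n^2) algorithm. The function name `min_triangulation_cost` and the table `t` are hypothetical, and the recurrence T(i,j) = min over i < k < j of T(i,k) + T(k,j) + W(vi)*W(vk)*W(vj) is the standard one, assumed here rather than taken from the paper.

```python
def min_triangulation_cost(weights):
    """Minimum-cost triangulation of a convex polygon v_0, ..., v_n.

    weights[i] is W(v_i); a triangle (v_i, v_k, v_j) costs
    W(v_i) * W(v_k) * W(v_j). Runs in O(n^3) time.
    """
    n = len(weights) - 1
    if n < 2:
        return 0  # fewer than three vertices: nothing to triangulate
    # t[i][j] = minimal cost of triangulating the sub-polygon v_i, ..., v_j.
    t = [[0] * (n + 1) for _ in range(n + 1)]
    for gap in range(2, n + 1):
        for i in range(0, n - gap + 1):
            j = i + gap
            # The side v_i v_j belongs to exactly one triangle (v_i, v_k, v_j).
            t[i][j] = min(
                t[i][k] + t[k][j] + weights[i] * weights[k] * weights[j]
                for k in range(i + 1, j)
            )
    return t[0][n]
```

With the weights taken equal to the matrix dimensions, the result coincides with the MCOP optimum: `min_triangulation_cost([10, 30, 5, 60])` gives the same value as the best parenthesization of three matrices of shapes 10x30, 30x5 and 5x60.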
