IEEE TRANSACTIONS ON MAGNETICS, VOL. 46, NO. 8, AUGUST 2010

Grid-Enabled Tabu Search for Electromagnetic Optimization Problems

Sara Carcangiu, Alessandra Fanni, Anna Mereu, and Augusto Montisci

Electrical and Electronic Engineering Department, University of Cagliari, Cagliari 09123, Italy

The use of Grid Computing to solve electromagnetic optimization problems by means of the Tabu Search strategy is proposed in this paper. In order to significantly reduce the prohibitive computational cost of the numerical analyses required by the majority of iterative algorithms, two different Tabu Search strategies have been ported to the grid. Both strategies belong to the Domain Decomposition family: the decomposition of the search space and the decomposition of the neighborhood. The performance of the different parallel implementations has been evaluated on some electromagnetic benchmarks.

Index Terms: Design of electromagnetic devices, finite element methods, grid computing, optimization methods.

I. INTRODUCTION

IN THE DESIGN of electromagnetic structures, it is often necessary to analyze the electromagnetic field distribution using numerical techniques such as the Finite Element Method (FEM). To optimize the design, iterative techniques are usually applied to search for the potentially optimal configuration in the solution domain. When the number of design parameters to be optimized is considerable, the number of electromagnetic problems to be solved can be of the order of thousands. Because numerical electromagnetic solutions are often computationally intensive, using them within the iterative optimization process can be unfeasible.

One way to overcome this problem is to use approximating techniques, such as neural networks [1]. The main drawback of approximating models is the approximation error, which can alter the value of the solution corresponding to the same design parameters. Another way to avoid the prohibitive computational time of iterative optimization is to use Grid Computing technology. Grid computing is a family of technologies for dynamically and opportunistically provisioning computing power from a pool of resources. The Grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed “autonomous” resources. These resources are dynamically assigned at runtime depending on their availability, capability, performance, cost, and the user’s quality-of-service requirements [2].

In this paper, Tabu Search (TS) is proposed as the search algorithm, and its parallel implementation on a computational grid is presented. TS is a family of metaheuristic procedures that search for the optimal solution by exploring the variable space and storing the features that correspond to bad previous moves. Such features are labeled as tabu and are avoided during the search for the optimum [3].
In the literature, different approaches have been adopted to parallelize the TS. In [4], a hierarchical classification of the parallel TS strategies is presented. Two main types of parallelization can be performed: the first is the so-called Multiple TS task category, in which multiple TS algorithms are run in parallel and may differ in some parameters, such as the initial solution and the tabu list size; the second is the so-called Domain Decomposition. In this work, we compare the performance of two parallel TS strategies, both belonging to the Domain Decomposition family: the decomposition of the search space and the decomposition of the neighborhood. The decomposition of the search space implies that the domain is decomposed into a number of smaller sub-domains, each of which is explored by a separate TS. The decomposition of the neighborhood is performed by assigning to each task a different portion of the neighborhood to be evaluated.

Manuscript received December 23, 2009; revised February 25, 2010; accepted March 04, 2010. Current version published July 21, 2010. Corresponding author: S. Carcangiu (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TMAG.2010.2045487

II. GRID-ENABLED TABU SEARCH ALGORITHMS

The optimization problem under investigation consists in finding a set of design parameters that allows the device to produce the desired electric or magnetic field at prefixed points, under feasibility constraints. TS is typically a discrete algorithm, but suitable strategies can be developed to apply it to optimization problems with real-valued variables. The first step of the proposed methodology consists of subdividing the range of each variable into a finite number of uniformly distributed sub-ranges (whose size is a design parameter). A symbol of a discrete alphabet is associated with each sub-range, so that every solution in the continuous domain can be identified with an ordered sequence of symbols. According to this choice, the search strategy moves from one discrete solution to another by simply modifying the value of one variable at a time.
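The discretization and the one-variable-at-a-time moves described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the integer encoding of the symbols, and the choice of enumerating every alternative symbol of each variable are assumptions made only for illustration.

```python
from typing import List, Tuple

# Assumption: a configuration is a tuple holding one sub-range index
# (the "symbol") per design variable.
Config = Tuple[int, ...]


def subrange_bounds(lo: float, hi: float, n_sub: int, k: int) -> Tuple[float, float]:
    """Return the interval covered by the k-th of n_sub uniform sub-ranges."""
    width = (hi - lo) / n_sub
    return lo + k * width, lo + (k + 1) * width


def cartesian_neighborhood(current: Config, n_sub: int) -> List[Config]:
    """List every configuration that differs from `current` in one variable."""
    neighbors = []
    for var, symbol in enumerate(current):
        for new_symbol in range(n_sub):
            if new_symbol != symbol:
                neighbors.append(current[:var] + (new_symbol,) + current[var + 1:])
    return neighbors


if __name__ == "__main__":
    # Hypothetical case: 4 design variables, 100 sub-ranges each.
    hood = cartesian_neighborhood((10, 20, 30, 40), n_sub=100)
    print(len(hood))                        # 4 variables x 99 alternatives
    print(subrange_bounds(0.0, 8.0, 4, 1))  # second of four sub-ranges of [0, 8]
```

With this encoding, a real-valued candidate is obtained by picking a point inside the interval returned by `subrange_bounds` for each symbol of the sequence.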
At each iteration, the neighborhood associated with the current configuration is built by performing all the possible moves that can be played starting from that configuration. This simple paradigm can lead to some inefficiencies, and its success would be doubtful if only a straightforward random extraction were performed in the subintervals to generate a real-valued feasible solution. Local optimization algorithms, such as the one-dimensional Golden Search, may be performed in each subinterval to find the optimum value corresponding to a new current solution. The simple iterative scheme of the TS is enhanced by introducing several rule-of-thumb criteria such as aspiration (an element may be removed from the Tabu List under certain conditions), intensification (deeper exploration of a region that looks promising), and diversification (leaving a region that does not look promising).

0018-9464/$26.00 © 2010 IEEE

A. Parallel Strategy

The idea for a straightforward parallelization scheme is to decompose the search space into a set of disjoint subspaces, each of them explored by an instance of TS on a different CPU. When all the processes terminate, the obtained results are written to secondary memory. Then, a new process reads the secondary memory and gathers the results to obtain the optimal solution. The resulting parallel algorithm is simple, because inter-process communication is not necessary and only a final synchronization point is needed.
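The search-space decomposition can be sketched as follows, under several assumptions that are not in the paper: the space is cut along the first variable only, a thread pool stands in for the grid jobs, the results are gathered in memory instead of through secondary memory, `toy_objective` replaces the FEM-based objective, and `tabu_search` is a deliberately minimal TS (no aspiration, intensification, or diversification).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

Config = Tuple[int, ...]
Bounds = List[Tuple[int, int]]  # inclusive (low, high) symbol range per variable
Objective = Callable[[Config], float]


def split_first_variable(n_sub: int, n_vars: int, n_jobs: int) -> List[Bounds]:
    """Cut the symbol range of variable 0 into n_jobs disjoint blocks."""
    edges = [j * n_sub // n_jobs for j in range(n_jobs + 1)]
    return [[(edges[j], edges[j + 1] - 1)] + [(0, n_sub - 1)] * (n_vars - 1)
            for j in range(n_jobs)]


def tabu_search(objective: Objective, bounds: Bounds,
                iterations: int = 50) -> Tuple[float, Config]:
    """Minimal TS confined to `bounds`: best non-tabu one-variable move per step."""
    current = tuple(lo for lo, _ in bounds)
    best = (objective(current), current)
    tabu = {current}
    for _ in range(iterations):
        moves = [current[:v] + (s,) + current[v + 1:]
                 for v, (lo, hi) in enumerate(bounds)
                 for s in range(lo, hi + 1)]
        candidates = [(objective(m), m) for m in moves if m not in tabu]
        if not candidates:
            break
        value, current = min(candidates)
        tabu.add(current)  # never revisit an explored configuration
        best = min(best, (value, current))
    return best


def parallel_ts(objective: Objective, n_sub: int, n_vars: int,
                n_jobs: int) -> Tuple[float, Config]:
    """Run one independent TS per subspace and keep the best final result."""
    subspaces = split_first_variable(n_sub, n_vars, n_jobs)
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        results = list(pool.map(lambda b: tabu_search(objective, b), subspaces))
    return min(results)  # the only synchronization point


def toy_objective(config: Config) -> float:
    """Hypothetical stand-in for a FEM evaluation: squared distance to a target."""
    target = (13, 4, 17)
    return float(sum((x - t) ** 2 for x, t in zip(config, target)))


if __name__ == "__main__":
    print(parallel_ts(toy_objective, n_sub=20, n_vars=3, n_jobs=4))
```

The jobs never exchange information while running, which mirrors the property stated above: only the final gathering step needs all of them to have finished.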

Fig. 1. Die press with electromagnet.

B. Master-Slave Strategy

The previously described parallelization scheme does not fully exploit the peculiarity of the TS metaheuristic of using the past history of the search to guide the exploration of the solution domain. An alternative use of the Grid Computing paradigm is made in this paper by decomposing the neighborhood. In fact, the most important requirement in using a TS algorithm consists in defining the set of admissible moves, i.e., the neighborhood of a given configuration. Our TS implements Cartesian moves, which consist of changing the value of one design parameter at a time. The exploration of a single variable can be carried out by independent jobs: for each configuration in the neighborhood, a FEM analysis has to be performed in order to evaluate the fitness of that solution.

It is important to notice that the execution of cyclic jobs is not allowed in the Grid. This led us to design the core of our algorithm to run locally in the user interface and to run the parallel exploration of the neighborhood in the Grid, which was made possible by splitting the program into a master-slave architecture. At the beginning, the master program sets up a catalogue that holds the Tabu List, i.e., the list that keeps track of the previously explored configurations in order to avoid cycling around an already explored configuration. The slaves also have access to this catalogue. Each slave writes its own best solution into the catalogue. When all of them have finished writing, the master program reads and compares the data, selecting the optimal move. The database is created by using several Application Programming Interfaces (APIs) available for the AMGA (ARDA Metadata Grid Application) Metadata Catalogue [5]. After the creation of the “tabulist” catalogue, a while cycle starts.
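The exchange between master and slaves through the catalogue can be sketched as below. This is only an illustration under stated assumptions: a Python dictionary plays the role of the catalogue and a separate set plays the role of the Tabu List (no AMGA call is used or implied), threads stand in for the grid jobs, one slave is assigned to each design variable, and `toy_objective` replaces a FEM analysis. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Set, Tuple

Config = Tuple[int, ...]
Objective = Callable[[Config], float]
Catalogue = Dict[int, Tuple[float, Config]]  # slave index -> its best entry


def slave(objective: Objective, current: Config, var: int, n_sub: int,
          tabu: Set[Config], catalogue: Catalogue) -> None:
    """Explore the moves of one variable and record the best non-tabu result."""
    best = None
    for symbol in range(n_sub):
        candidate = current[:var] + (symbol,) + current[var + 1:]
        if candidate in tabu:
            continue
        entry = (objective(candidate), candidate)  # stands in for one FEM run
        if best is None or entry < best:
            best = entry
    if best is not None:
        catalogue[var] = best  # each slave writes only under its own key


def master(objective: Objective, start: Config, n_sub: int,
           iterations: int) -> Tuple[float, Config]:
    """Run the TS core locally and delegate neighborhood exploration to slaves."""
    current, tabu = start, {start}
    best = (objective(start), start)
    for _ in range(iterations):
        catalogue: Catalogue = {}
        with ThreadPoolExecutor(max_workers=len(current)) as pool:
            for var in range(len(current)):
                pool.submit(slave, objective, current, var, n_sub, tabu, catalogue)
        # Leaving the `with` block waits until every slave has finished writing.
        if not catalogue:
            break
        value, current = min(catalogue.values())  # master selects the best move
        tabu.add(current)
        best = min(best, (value, current))
    return best


def toy_objective(config: Config) -> float:
    """Hypothetical stand-in for a FEM evaluation: squared distance to a target."""
    target = (13, 4, 17)
    return float(sum((x - t) ** 2 for x, t in zip(config, target)))


if __name__ == "__main__":
    print(master(toy_objective, start=(0, 0, 0), n_sub=20, iterations=10))
```

Unlike the search-space decomposition, every iteration here ends with a synchronization point, which is where communication overhead enters.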
The while cycle is governed by two parameters: the total number of calls to the objective function and the number of iterations of the Tabu Search to be performed. It is also possible to stop the algorithm when the objective function reaches a certain threshold. Through the APIs, the master program launches the execution of the slaves. During the first iteration, the master sends the jobs to the slaves starting from an initial configuration. The jobs are parametric jobs in which all slaves perform the same task: given the current configuration, the neighborhood of one variable is explored and different FEM analyses are run. When one slave finds a configuration that is a good candidate to be stored in the catalogue, it checks the tabu list: if that configuration is not already stored in the list, the slave computes the hash code corresponding to the configuration and stores all the information in the catalogue. When all the slaves terminate their jobs, the master program analyzes the entries written by the slaves in the catalogue and performs the optimal current move.

III. APPLICATIONS

The computing infrastructure we used is based on several computation centers located in the main research and academic institutions in Sardinia, Italy, and it consists of more than 100 1U System x3455 nodes (200 AMD Opteron 2218 CPUs, 400 CPU cores). The fiber-optic connections allow the distributed resources to be dynamically aggregated, reaching a peak aggregate computing power of some TeraFlops. In order to compare the performance of the two grid-enabled TS codes, they have been used to find the optimal configuration of two electromagnetic test beds: T.E.A.M. Problem 25, “Optimization of Die Press Model” [6], and T.E.A.M. Problem 22, “Optimal design of a superconducting magnetic energy storage (SMES) device” [7]. The two-dimensional models and the computation of the objective functions of the two benchmarks are performed through calls to the ELFIN [8] finite element code. In the following, we show the improvement in computational time achieved by implementing the two algorithms in the computational Grid.

A. TEAM Workshop Problem 25

The die press with electromagnet for the orientation of magnetic powder is used for producing anisotropic permanent magnets [6]. The shape of the die mold is controlled by a circle of radius R1 for the inner die and an ellipse represented by L2, L3 and L4 for the outer die (see Fig. 1). The problem constraints are the ranges of the design parameters, which are shown in Table I.
The four parameters have to be chosen so that the magnetic flux density is radial and equal to 0.35 T in the cavity where the magnetic powder is inserted. The objective function to minimize is

$$W = \sum_{i=1}^{n} \left[ (B_{xi} - B_{xio})^2 + (B_{yi} - B_{yio})^2 \right] \qquad (1)$$

$$B_{xio} = 0.35 \cos\theta_i, \qquad B_{yio} = 0.35 \sin\theta_i \qquad (2)$$

where $B_{xi}$ and $B_{yi}$ are the components of the flux density in $n$ points along a circular arc of 45° having radius of 11.75 mm, whereas $\theta_i$ is the angle from the $x$ axis, and the subscript $o$ identifies the required values.

TABLE I RANGE OF THE DESIGN VARIABLES

Each design parameter can assume 100 equidistant values within its range. In this way, the dimension of the search space to be explored is $100^4$. In order to limit the computation time, the size of the neighborhood can be changed. To avoid any loss of generality, we adopted a strategy in which the neighborhood size varies dynamically within a given range, following a Reactive TS scheme [3]. In the present implementation these bounds are: and . The optimization procedure terminates when a stable value of the objective function is found, or when the maximum number of FEM analyses is reached. This number has been fixed at 10 000, which corresponds to 40 iterations.

As said before, the parallel strategy consists in subdividing the search space into disjoint sub-domains. In order to evaluate how the search space decomposition affects the performance of this strategy, different subdivisions have been compared in terms of the number of iterations needed for the optimization procedure to converge. Fig. 2 shows the trend of the objective function for some examples of subdivisions. As can be seen, the number of different jobs (equal to the number of CPUs) and the way the search space is divided affect the convergence speed, whereas the optimal value is practically the same in all the cases. The same stop criterion and number of jobs have been fixed for the Master-Slave strategy.

Fig. 2. Objective function trend for different jobs obtained with Parallel TS strategy.

Table II compares the performance of the proposed TS algorithms with those reported in the literature. As can be noted, the obtained values show good agreement with those obtained by other authors.

TABLE II TEAM PROBLEM 25: COMPARISON WITH RESULTS FROM OTHER AUTHORS

The performance of the two Grid-enabled TS is reported in Table III in terms of speedup and efficiency, parameters usually used to evaluate the performance of parallel algorithms [9]. As the Grid we used is composed of a number of identical CPUs, we can extend the concepts of speedup and efficiency, used in parallel systems, to the grid environment. The speedup $S_p = T_1/T_p$ is the ratio of the execution time of the serial algorithm when executed on one processor to that when executed on $p$ processors, whereas the efficiency $E_p = S_p/p$ is the ratio of the actual speedup to the theoretical maximum speedup, equal to $p$. Here $T_1$ and $T_p$ are the total times required to run the TS algorithm on a single CPU and on $p$ CPUs, respectively. The maximum efficiency is obtained when the speedup is the closest to the theoretical speedup [12].

TABLE III TEAM PROBLEM 25: PERFORMANCE OF THE MASTER-SLAVE TS VERSUS THE PARALLEL TS

The results highlight the superiority of the Master-Slave approach with respect to the Parallel strategy. The Master-Slave strategy has speedup limits and appears to saturate. This is due mainly to communication overhead and to some uncertain factors such as instability of computing nodes, dynamic changes of the Grid environment, total load on the Grid, and so on. The overhead can be neglected if the parallelized jobs have a high computational load. As the number of slaves increases, the speedup falls further and further below the theoretical one. This phenomenon depends on the communication time, the number of variables, the size of the sub-ranges, the computational cost of a single FEM calculation, and the fraction of calculus performed by the Master [12]. More specifically, the number for which the saturation begins depends on the ratio between and the total computation time, and on the ratio .

B. TEAM Workshop Problem 22

The Master-Slave strategy has also been used to optimize a superconducting magnetic energy storage (SMES) device [7], which stores a significant amount of energy in magnetic fields with a fairly simple and economical coil arrangement that can be rather easily scaled up. There are 8 design variables, all related to two coils (see Fig. 3): the radii of coils 1 and 2, $R_1$ and $R_2$; their heights $h_1$ and $h_2$; their thicknesses $d_1$ and $d_2$; and the values of the current densities $J_1$ and $J_2$, respectively. In the implemented TS algorithm each design parameter can assume 100 equidistant values within its range. The given current density and the maximum magnetic flux density value on the coil must not violate, at any point of the coils, the superconducting quench condition [7]. There are two objectives: maintaining a prescribed level for the energy stored in the device, and minimizing the stray field evaluated along the lines indicated in Fig. 3. These two conflicting goals give rise to a multi-objective problem, which is reduced to a single-objective one in the benchmark definition. The objective function to minimize is [7]

$$OF = \frac{B_{\mathrm{stray}}^2}{B_{\mathrm{norm}}^2} + \frac{|E - E_{\mathrm{ref}}|}{E_{\mathrm{ref}}} \qquad (3)$$

where the reference stored energy $E_{\mathrm{ref}}$ and the reference stray field $B_{\mathrm{norm}}$ are expressed in MJ and mT, respectively [7]. $B_{\mathrm{stray}}^2$ is defined as

$$B_{\mathrm{stray}}^2 = \frac{1}{22} \sum_{i=1}^{22} \left| B_{\mathrm{stray},i} \right|^2 \qquad (4)$$

where $B_{\mathrm{stray},i}$ is evaluated at 22 equidistant points along the lines indicated in Fig. 3.

Fig. 3. Configuration of the SMES device.

Coupling the Tabu Search algorithm with the ELFIN code for energy and field calculations, the serial algorithm required about 56 h and about 130 iterations. The stop criterion fixed the maximum number of FEM analyses at 100 000. The obtained optimal solution leads to an energy of 180 MJ and a stray field of 8.9 T. Table IV reports the performance of the Master-Slave strategy in terms of speedup and efficiency, showing the benefits of using grid computing technologies. Note that the speedup saturates and the efficiency decreases when the number of processors is much greater than the number of design parameters to optimize.

TABLE IV TEAM PROBLEM 22: PERFORMANCE OF THE MASTER-SLAVE TS

IV. CONCLUSION

In this paper, two grid-enabled implementations of the TS algorithm, the Parallel and the Master-Slave strategies, have been tested and compared. The experiments showed that porting the algorithm to a computational Grid reduces the computation time with respect to the implementation on a single processor, even when the overhead due to the communication delay between the computing elements is taken into account. The Master-Slave strategy exhibits better performance than the Parallel strategy, but the former is subject to the saturation phenomenon, whereas the latter seems to be immune. As a consequence, the Master-Slave strategy appears to be suitable when the computational resources are limited, whereas the Parallel strategy allows one to exploit a great number of processors.

ACKNOWLEDGMENT

This work makes use of results produced by the Cybersar Project managed by the Consorzio COSMOLAB, a project co-funded by the Italian Ministry of University and Research within the PON 2000–2006. More information is available at http://www.cybersar.it. The authors would like to thank Prof. Salvatore Alfonzetti and Dr. Emanuele Dilettoso for the availability of the ELFIN code and for the useful discussions.

REFERENCES

[1] D. Cherubini, A. Fanni, A. Montisci, and P. Testoni, “Inversion of MLP neural networks for direct solution of inverse problems,” IEEE Trans. Magn., vol. 41, no. 5, pp. 1784–1787, May 2005.
[2] R. Buyya, “Grid computing: Making the global cyberinfrastructure for eScience a reality,” CSI Commun., vol. 29, no. 1, Jul. 2005.
[3] E. Cogotti, A. Fanni, and F. Pilo, “A comparison of optimization techniques for Loney’s solenoids design: An alternative Tabu Search approach,” IEEE Trans. Magn., vol. 36, no. 4, pp. 1153–1157, Jul. 2000.
[4] E. G. Talbi, Z. Hafidi, and J. M. Geib, “Parallel Tabu search for large optimization problems,” in Proc. Meta-Heuristics: Advances and Trends in Local Search Paradigms for Optimization, Boston, 1999, pp. 345–358.
[5] [Online]. Available: http://amga.web.cern.ch/amga/
[6] N. Takahashi, K. Muramatsu, M. Natsumeda, K. Ohashi, K. Miyata, and K. Sayama, “Solution of problem 25 (optimization of die press model),” in Proc. ICEF’96, Hubei, China, Oct. 1996, pp. 383–386.
[7] C. Magele, TEAM Benchmark Problem 22, 1996. [Online]. Available: www-igte.tu-graz.ac.at/team
[8] G. Aiello, S. Alfonzetti, G. Borzì, and N. Salerno, “An overview of the ELFIN code for finite element research in electrical engineering,” in Software for Elect. Eng.: Anal. Design VI, A. Konrad and C. Brebbia, Eds. Southampton, U.K.: WIT Press, 1999, pp. 143–152.
[9] F. Luna, A. J. Nebro, and E. Alba, “Observations in using Grid-enabled technologies for solving multi-objective optimization problems,” Parallel Comput., vol. 32, pp. 377–393, 2006.
[10] P. Alotto and M. A. Nervi, “An efficient algorithm for the optimization of problems with several local minima,” Int. J. Numer. Meth. Eng., no. 50, pp. 847–868, 2001.
[11] A. Canova, G. Gruosso, and M. Repetto, “Magnetic design optimization and objective function approximation,” IEEE Trans. Magn., vol. 39, no. 5, pp. 2154–2162, Sep. 2003.
[12] W. H. Ware, “The ultimate computer,” IEEE Spectrum, vol. 9, no. 3, pp. 84–91, Mar. 1972.