Faster, more accurate, parallelized inversion for shape optimization in

COMPEL: The International Journal for Computation and Mathematics in Electrical and Electronic Engineering Faster, more accurate, parallelized inversion for shape optimization in electroheat problems on a graphics processing unit (GPU) with the real-coded genetic algorithm Victor U. Karthik Sivamayam Sivasuthan Arunasalam Rahunanthan Ravi S. Thyagarajan Paramsothy Jayakumar Lalita Udpa S. Ratnajeevan H. Hoole

Downloaded by Michigan State University At 08:04 14 January 2015 (PT)

Article information: To cite this document: Victor U. Karthik Sivamayam Sivasuthan Arunasalam Rahunanthan Ravi S. Thyagarajan Paramsothy Jayakumar Lalita Udpa S. Ratnajeevan H. Hoole , (2015),"Faster, more accurate, parallelized inversion for shape optimization in electroheat problems on a graphics processing unit (GPU) with the real-coded genetic algorithm", COMPEL: The International Journal for Computation and Mathematics in Electrical and Electronic Engineering, Vol. 34 Iss 1 pp. 344 - 356 Permanent link to this document: http://dx.doi.org/10.1108/COMPEL-06-2014-0146 Downloaded on: 14 January 2015, At: 08:03 (PT) References: this document contains references to 39 other documents. To copy this document: [email protected] The fulltext of this document has been downloaded 4 times since 2015* Access to this document was granted through an Emerald subscription provided by 191576 []


The current issue and full text archive of this journal is available on Emerald Insight at: www.emeraldinsight.com/0332-1649.htm

COMPEL 34,1


Received 27 June 2014 Revised 22 July 2014 Accepted 6 August 2014

REGULAR PAPER

Faster, more accurate, parallelized inversion for shape optimization in electroheat problems on a graphics processing unit (GPU) with the real-coded genetic algorithm Victor U. Karthik and Sivamayam Sivasuthan Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan, USA

Arunasalam Rahunanthan Department of Mathematics and Computer Science, Edinboro University, Edinboro, Pennsylvania, USA

Ravi S. Thyagarajan and Paramsothy Jayakumar The US Army Tank Automotive Research Development and Engineering Center, Warren, Michigan, USA, and

Lalita Udpa and S. Ratnajeevan H. Hoole Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan, USA Abstract

COMPEL: The International Journal for Computation and Mathematics in Electrical and Electronic Engineering Vol. 34 No. 1, 2015 pp. 344-356 Emerald Group Publishing Limited 0332-1649 DOI 10.1108/COMPEL-06-2014-0146

Purpose – Inverting electroheat problems involves synthesizing the electromagnetic arrangement of coils and geometries to realize a desired heat distribution. To this end two finite element problems need to be solved, first for the magnetic fields and the joule heat that the associated eddy currents generate and then, based on these heat sources, the second problem for heat distribution. This two-part problem needs to be iterated on to obtain the desired thermal distribution by optimization. Being a time consuming process, the purpose of this paper is to parallelize the process using the graphics processing unit (GPU) and the real-coded genetic algorithm, each for both speed and accuracy. Design/methodology/approach – This coupled problem represents a heavy computational load with long wait-times for results. The GPU has recently been demonstrated to enhance the efficiency and accuracy of the finite element computations and cut down solution times. It has also been used to speedup the naturally parallel genetic algorithm. The authors use the GPU to perform coupled electroheat finite element optimization by the genetic algorithm to achieve computational efficiencies far better than those reported for a single finite element problem. In the genetic algorithm, coding objective functions in real numbers rather than binary arithmetic gives added speed and accuracy. Findings – The feasibility of the method proposed to reduce computational time and increase accuracy is established through the simple problem of shaping a current carrying conductor so as to yield a constant temperature along a line. The authors obtained a speedup (CPU time to GPU time

This material is declared a work of the U.S. Government and is not subject to copyright protection in the United States. Approved for public release; distribution is unlimited.


ratio) saturating to about 28 at a population size of 500 because of increasing communications between threads. But this is far better than what is possible on a workstation. Research limitations/implications – By using the intrinsically parallel genetic algorithm on a GPU, large complex coupled problems may be solved very quickly. The method, demonstrated here without accounting for radiation and convection, may be trivially extended to more completely modeled electroheat systems. Since the primary purpose here is to establish methodology and feasibility, the thermal problem is simplified by neglecting convection and radiation. While that introduces some error, the computational procedure is still validated. Practical implications – The methodology established has direct applications in electrical machine design, metallurgical mixing processes, and hyperthermia treatment in oncology. In these three practical application areas, the authors need to compute the exciting coil (or antenna) arrangement (current magnitude and phase) and device geometry that would accomplish a desired heat distribution to achieve mixing, reduce machine heat or burn cancerous tissue. The process presented does this more accurately and speedily. Social implications – Particularly the above-mentioned application in oncology will alleviate human suffering through use in hyperthermia treatment planning. The method presented provides scope for new commercial software development and employment. Originality/value – Previous finite element shape optimization of coupled electroheat problems by this group used gradient methods whose difficulties are explained. Others have used analytical and circuit models in place of finite elements. This paper applies the massive parallelization possible with GPUs to the inherently parallel genetic algorithm, and extends it from single field system problems to coupled problems, and thereby realizes practicable solution times for such a computationally complex problem. Further, by using GPU computations rather than CPU, accuracy is enhanced. And then by using real number rather than binary coding for object functions, further accuracy and speed gains are realized.
Keywords CAD, Finite element methods, Applied electromagnetics, Computational methods, Genetic algorithms, Field optimization
Paper type Research paper


Optimization of electroheat system shapes for desired temperature

Electroheating is used in many processes where it is often desirable to accomplish a particular thermal distribution, whether to save an alternator from overheating, to accomplish the necessary melting of the ore, or to burn cancerous tissue without hurting healthy tissue. As shown in Figure 1, the design process involves setting the parameters {p} that describe the electroheat system (consisting of geometric dimensions, currents in magnitude and phase, and material values), solving the eddy current problem for the magnetic vector potential A (Starzynski and Wincenciak, 1998):

\[ -\frac{1}{\mu}\nabla^2 A = J = \sigma_e E = \sigma_e\left[-j\omega A - \nabla\varphi\right] \tag{1} \]

where $\mu$ is the magnetic permeability, $\sigma_e$ is the electrical conductivity, E is the electric field strength and $-\nabla\varphi$ is the externally imposed electric field driving the current (Chari and Salon, 2000; Hoole, 1990). The frequency $\omega$ is relatively low (50 Hz to 1 kHz) so that the current density J has only a conduction term $\sigma_e E$ and no displacement term $j\omega\varepsilon E$. After finding A (Pham and Hoole, 1995), we compute the joule heating density q from:

\[ E = -j\omega A - \nabla\varphi \tag{2} \]

and:

\[ q = \frac{\sigma_e}{2} E^2 \tag{3} \]

[Figure 1. Finite element analysis and optimization of coupled magnetothermal problems. Flowchart blocks: Start; Design Parameters; Finite Element Analysis with Mesh Deformation (Electromagnetic System, EM Geometry; Thermal System, Thermal Geometry); Performances of Thermal Systems; Optimization by Coupled Objective Minimization; Stop if Minimum.]

Once we have the heat source distribution q, the second problem of finding the resulting temperature is addressed by solving (Morinova and Mateeve, 2012):

\[ -\sigma_t \nabla^2 T = q \tag{4} \]

where $\sigma_t$ is the thermal conductivity. Since the problem began with defining the parameters of system description {p}, we note that T = T({p}) since the computed T will depend on the values of {p}. When a particular temperature distribution $T_o(x, y)$ is desired, the problem is one of finding that {p} which will yield:

\[ T(\{p\}) = T_o \tag{5} \]
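As an illustration of how the heat source feeding the thermal problem follows from the eddy current solution, Equations (2) and (3) can be evaluated pointwise once A is known. The sketch below is written in Python for brevity rather than in the CUDA C used in this work; the function name `joule_heat_density` is hypothetical, and it assumes that A and the gradient term are complex phasors and that the E squared of Equation (3) is to be read as the squared magnitude of E.

```python
def joule_heat_density(A, grad_phi, omega, sigma_e):
    """Joule heating density at one point, following Equations (2) and (3).

    A        : complex phasor of the magnetic vector potential at the point
    grad_phi : complex phasor of the gradient of the scalar potential
    omega    : angular frequency
    sigma_e  : electrical conductivity
    Assumption: E^2 in Equation (3) is taken as |E|^2 for phasor quantities.
    """
    # Equation (2): E = -j*omega*A - grad(phi)
    E = -1j * omega * A - grad_phi
    # Equation (3): q = (sigma_e / 2) * |E|^2
    return 0.5 * sigma_e * abs(E) ** 2


# Example with the nominal test-problem values omega = 10/s and sigma_e = 15 S/m
q_example = joule_heat_density(A=1.0 + 0.0j, grad_phi=0.0, omega=10.0, sigma_e=15.0)
```

In a finite element setting the same evaluation would be repeated per element, with the resulting values of q forming the right-hand side of the thermal problem.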

This is recognized as inverting (5) to find {p} and is therefore referred to as the inverse problem, which is now well understood in the literature, particularly when we are dealing with one branch of physics like electromagnetics (Arora and Hang, 1976; Marrocco and Pironneau, 1978; Hoole et al., 1991; Hoole, 1995). In multi-branch coupled physics problems like the electroheat problem under discussion, {p} is defined in the electromagnetic system and F in the thermal system. Further, when dealing with numerical methods such as the finite element method, T is given at the nodes and $T_o$, rather than being a function of x and y, is more conveniently defined at measuring points i, numbering say m. The design desideratum then may be cast as an object function F to be minimized with respect to {p}:

\[ F = F(\{p\}) = \frac{1}{2}\sum_{i=1}^{m}\left[T^i - T_o^i\right]^2 \tag{6} \]

where $T^i$ is T as computed at the same m points i where $T_o$ is defined. The optimization process, by whatever method (Vanderplaats, 1984), keeps adjusting {p} until F is minimized. At that point {p} would represent our best design.

Context and novelty

To set this work in context, we first note how this two-part optimization problem has been worked on before, and then state what is novel here. In our work the design vector {p} includes geometric dimensions. That is, we do shape optimization by an accurate finite element eddy current problem followed by another accurate finite element temperature problem. The inherent parallelism of the genetic algorithm was

recognized as far back as 1994 by Henderson (1994) and exploited on parallel computers. In 2009 Wong and Wong (2009) and Robilliard et al. (2009) took that parallelization work to the graphics processing unit (GPU), working still with single field systems. We apply their work to the coupled field problem where solutions can take very long and reducing solution times is critical. Further, in the genetic algorithm the object functions are coded in binary arithmetic to make mutations. But the natural form of object functions is real arithmetic, so we bring Deb and Kumar’s (1995) work into finite element optimization to represent object functions in their normal real-coded form to reap not only speed but also accuracy.

The coupled electroheat optimization problem has indeed been tackled before, but with key differences. Pham and Hoole (1995) have used two finite element solutions and done shape optimization. But it is difficult to build general purpose software with that process because of difficulties with the gradient optimization. Siauve et al. (2004) solve eddy current and thermal problems sequentially but what they optimize is not shape but the antenna currents. Likewise in the electroheat optimization work of Nabaei et al. (2010), it is the current distribution they optimize. Battistetti et al. (2001) avoid a two-part finite element solution by using an analytical solution for the electromagnetic part and a finite difference solution for the thermal part. Di Barba et al. (2003) use circuit models.

Method of optimization – the genetic algorithm

In the optimization of a single physics problem as in magnetostatics or eddy current analysis, we construct one finite element mesh, solve for the magnetic vector potential A, and then change {p}. The method by which we change {p} depends on the method of optimization we employ (Vanderplaats, 1984). In coupled problems like the electroheat problem under discussion, two different meshes are often required (Pham and Hoole, 1995). Moreover, the optimization process too imposes huge difficulties depending on the method employed. In the simpler zeroth-order methods, only the value of F, given {p}, needs to be computed. Therefore zeroth-order optimization methods, although more slowly convergent than gradient methods, are the best route for optimizing coupled electroheat problems.

Having settled on a zeroth-order method of optimization that does not need gradient information on the object function, we take cognizance that most zeroth-order methods are statistical so that several-fold more object function computations need to be made. The practicable alternatives are simulated annealing and the genetic algorithm, both good methods widely used in industry and a part of commercial software (Venkataraman, 2009). Going by the literature, Preis et al. (1990), staunch advocates of the zeroth-order evolution strategy, merely say it is competitive with its higher order deterministic counterparts (which we take to mean the same in time at best), but claim its “robustness and generality” are superior. This we agree with because search methods will never see mesh-induced artificial local minima as a problem (Hoole et al., 1991). So we did a quick study, the results of which, shown in Figure 2, support the genetic algorithm. This study was done on the coupled electroheat problem described in greater detail below. Therefore we chose the genetic algorithm (Deb et al., 2002; Toledo et al., 2014). We note that in results from other disciplines besides finite element optimization, Manikas and Cain (1996) also say that “the genetic algorithm was shown to produce solutions equal to or better than simulated annealing” in their work.
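To make concrete what a zeroth-order method asks of the analysis, the following minimal sketch shows a genetic search that only ever calls F({p}) and never a gradient. It is an illustration under stated assumptions and not the code used in this work: it is written in Python rather than CUDA C, the name `genetic_minimize` is hypothetical, and a simple blend cross-over with Gaussian mutation and binary tournament selection stands in for the operators actually employed.

```python
import random


def genetic_minimize(F, bounds, pop_size=20, generations=30, seed=0):
    """Minimize F over box bounds using only values of F (no gradients)."""
    rng = random.Random(seed)
    # Random initial population of parameter vectors {p}
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = min(pop, key=F)
    for _ in range(generations):
        new_pop = [best[:]]  # elitism: the best design always survives
        while len(new_pop) < pop_size:
            # Binary tournament selection of two parents
            a = min(rng.sample(pop, 2), key=F)
            b = min(rng.sample(pop, 2), key=F)
            # Blend cross-over (an illustrative stand-in operator)
            w = rng.random()
            child = [w * x + (1.0 - w) * y for x, y in zip(a, b)]
            # Gaussian mutation, clipped back into the bounds
            child = [
                min(hi, max(lo, c + rng.gauss(0.0, 0.05 * (hi - lo))))
                for c, (lo, hi) in zip(child, bounds)
            ]
            new_pop.append(child)
        pop = new_pop
        best = min(pop, key=F)
    return best, F(best)


# Usage on a stand-in object function whose minimum is at p = (1, 1)
def toy_F(p):
    return 0.5 * sum((x - 1.0) ** 2 for x in p)


p_best, F_best = genetic_minimize(toy_F, [(-5.0, 5.0), (-5.0, 5.0)])
```

Because every candidate in a generation is evaluated independently of the others, the evaluations of F within one generation are the natural unit to distribute over parallel threads.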

[Figure 2. Genetic algorithm speed compared with simulated annealing. Plot titled "Fitness Score with Time for Genetic Algorithm and Simulated Annealing": fitness score against time (s), one curve for the genetic algorithm and one for simulated annealing.]

In the genetic algorithm, the design parameter vector {p} is represented by a binary encoding method. A chromosome is a vector {p}. Its fitness score f is defined in terms of the object function F:

\[ f = \frac{1}{1 + F} \tag{7} \]
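The object function of Equation (6) and the fitness score of Equation (7) amount to a few lines of arithmetic. The sketch below, in Python with hypothetical function names, assumes the temperatures at the m measuring points have already been extracted from the thermal finite element solution.

```python
def object_function(T_computed, T_desired):
    """Equation (6): half the sum of squared temperature mismatches."""
    if len(T_computed) != len(T_desired):
        raise ValueError("need one desired temperature per measuring point")
    return 0.5 * sum((t - t0) ** 2 for t, t0 in zip(T_computed, T_desired))


def fitness_score(F):
    """Equation (7): maps F in [0, infinity) to a fitness in (0, 1]."""
    return 1.0 / (1.0 + F)


# Two measuring points, each off by one degree from a desired 60 degrees
F_example = object_function([61.0, 59.0], [60.0, 60.0])
f_example = fitness_score(F_example)
```

The mapping keeps the fitness bounded, so chromosomes with very large mismatches simply receive a fitness close to zero.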

Therefore when F goes to its minimum at 0, f will reach its maximum value of 1.

Real arithmetic for object functions

The traditional process of using binary arithmetic for the cross-over and mutation of the genetic algorithm is tedious because object functions are usually in real arithmetic. This means they need to be re-coded in binary arithmetic to perform the mutations and then reconverted to real arithmetic. The process is clearly slow, given the numerous cross-over operations involved in a genetic algorithm process. Further, as shown by Deb and Kumar (1995), using real-coded object functions is more accurate. So that is what we do. Their procedure for getting offspring from real-coded object functions is based on mimicking the probability of having offspring with binary arithmetic, to get a simulated binary cross-over (SBX) operator which may create any solution in the entire real space $[-\infty, \infty]$. Kalyanmoy Deb has liberally made his well-commented generic code available on the web[1], which we adapted and applied for the first time to finite element coupled problem optimization.

GPU computation for genetic algorithms for electroheat problems

We have at this point decided on optimization by the real-coded genetic algorithm in preference to other zeroth-order methods and gradient methods. We have a two-stage coupled problem, having to solve the magnetic field for A and then the thermal problem where we solve for the temperature. For realistic problems this has to be done several times, indeed tens of thousands of times, in searching the solution space for

the minimum object function. Wait times can be excessive, making optimization practicably infeasible. To cut down solution time, parallel processing needs to be resorted to (Hoole, 1990). From the 1990s multiprocessor computers have been tried out. Typically with n processors (or computing elements), solution time could be cut down by almost a factor of (n-1). This route, however, is not desirable because supercomputers are prohibitively expensive and there are technical difficulties in sharing memory, because beyond 32 processors the costs become very high. Although much of this parallelization work was moved to cheaper engineering workstations by the late 1990s (Hoole and Agarwal, 1997), the restrictions on processors remained. Recently the GPU, endowed with much computing prowess to handle graphics operations, has been exploited to launch a computational kernel as several parallel threads (Hoole et al., 2014a, b). Wong and Wong (2009), Robilliard et al. (2009) and Fukuda and Nakatani (2012) have shown that the genetic algorithm with its inherently parallel structure may be efficiently implemented on the GPU to optimize magnetic systems. We extend that here to coupled problems. Cecka et al. (2011) have also created and analyzed multiple approaches in assembling and solving sparse linear systems on unstructured meshes. The GPU coprocessor using single-precision arithmetic achieves speedups of 30 or so in comparison with a well optimized double-precision single core implementation (Hoole et al., 1991). The matrix computations have already been parallelized by us as efficient GPU code and reported (Sivasuthan et al., 2014a, b). The Deb and Kumar (1995) code also was rewritten by us for the GPU. The superior accuracy of GPU computations over CPU computations has been suggested by Yablonski (2011) and further justified by Hoole et al. (2014a, b). 
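For orientation, the simulated binary cross-over at the heart of the real-coded genetic algorithm can be sketched in a few lines. The sketch follows the commonly published form of the SBX operator and is not the GPU code described here; it is written in Python for brevity, the function names are hypothetical, and the distribution index eta = 2 is an illustrative assumption.

```python
import random


def sbx_spread_factor(u, eta):
    """Spread factor beta for a uniform random number u in [0, 1)."""
    if u <= 0.5:
        return (2.0 * u) ** (1.0 / (eta + 1.0))
    return (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))


def sbx_crossover(parent1, parent2, eta=2.0, rng=random):
    """Create two real-coded children from two real-coded parents."""
    child1, child2 = [], []
    for x1, x2 in zip(parent1, parent2):
        beta = sbx_spread_factor(rng.random(), eta)
        # Children are placed symmetrically about the parents' mean
        child1.append(0.5 * ((1.0 + beta) * x1 + (1.0 - beta) * x2))
        child2.append(0.5 * ((1.0 - beta) * x1 + (1.0 + beta) * x2))
    return child1, child2


# Usage: cross two chromosomes of heights (values are illustrative only)
random.seed(0)
c1, c2 = sbx_crossover([2.0, 3.0], [4.0, 5.0])
```

No conversion between binary and real representations appears anywhere in the operator, which is the source of the speed advantage claimed for real coding; each gene pair is also independent of the others, so the loop maps naturally onto parallel threads.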
We allocate GPU blocks and threads equal in number to the population size and compute the f values simultaneously (Figure 6). Since we have 65,536 blocks ($2^{16}$) and 512 threads in a general GPU, we can go up to a population size of 65,536 × 512 = 33,554,432 using the one GPU card on a PC. Since we do not need such a large population size for effective optimization, this is not restrictive. All the finite element calculation parts were programmed on the GPU in the CUDA C language. So, given the population of vectors {p}, we launch the fitness computation kernel as several threads (one for each {p}). The fitness score for all chromosomes is thereupon calculated at the same time in parallel.

Test problem: shaping an electroheated conductor to achieve temperature profile

The test problem chosen is a simple one on which the method can be demonstrated and its feasibility established. Shown in Figure 3 is a rectangular conductor which is heated by a current through it. The equi-temperature profiles would be circle-like around the conductor. But we want a constant temperature along two lines parallel to the pre-shaping rectangular conductor’s two opposite edges to be shaped (Figure 3). The question is this: how should that edge be reshaped to accomplish a constant temperature along the lines on either side of the conductor? Figure 3 presents the associated boundary value problem formed from a quarter of the minimal system for analysis, consisting of a square conductor (with $\mu_r = 10$, $\sigma_e = 15$ S/m, and $\sigma_t = 0.1$ W/m/°C). A current density $J_o = 10 + j0$ A/m$^2$ has a relatively low frequency of $\omega = 10$/s, kept deliberately low to avoid a very fine mesh, our purpose


here being to investigate and establish methodology rather than to solve large problems in their full complexity. The top edge of the conductor has to be shaped to get a constant temperature profile of 60°C at y = 6 cm, at 10 equally spaced measuring points in the interval 4 cm ≤ x ≤ 8 cm as shown in Figure 3 along the measuring line where, along the lines of (6), we define the object function:

\[ F = \frac{1}{2}\sum_{i=1}^{10}\left[T_i - 60\right]^2 \tag{8} \]

where $T_i$ is the temperature computed at the measuring point i. An erratic undulating shape with sharp edges arose when Pironneau (1984) optimized a pole face to achieve a constant magnetic flux density and this was overcome through constraints (Subramaniam et al., 1994). Haslinger and Neittaanmaki (1997) suggest Bezier curves to keep the shapes smooth with just a few variables to be optimized, while Preis et al. (1990) have suggested fourth order polynomials which, when we tried them, gave us smooth but undulating shapes. As such we follow Subramaniam et al. (1994) and extend their principle, so as to maintain a non-undulating shape by imposing the constraints:

\[ h_1 > h_2 > h_3 > h_4 > h_5 > h_6 > h_7 \tag{9} \]

to ensure a smooth shape. We have had no difficulties even using 18 parameters to shape an alternator in like manner to get a sinusoidal flux distribution (Sivasuthan et al., 2014a).

[Figure 3. Numerical model for coupled electroheat problem (symmetric quarter; nominal values shown for magnetic permeability μ, electrical conductivity σ and thermal conductivity k). Labels in the drawing: Conductor; Air; Edge to be shaped; Measuring line (sampled into 10 points); Line of Symmetry; outer boundary A = 0.0 and T = 20°C; symmetry condition ∂A/∂n = 0, ∂T/∂n = 0; material labels μr = 1.0, σ = 0.0, k = 0.1 and μr = 10, σ = 15, k = 100; dimensions 2 cm, 4 cm, 4 cm, 8 cm and 12 cm.]

The penalties were imposed by adding a penalty term to the object function F in (8) whenever it fails to satisfy the conditions for constraints (Vanderplaats, 1984; Cecka et al., 2011). Tolerance boundaries of each $h_i$ were set to:

\[ 1.5\ \text{cm} \le h_i \le 5.5\ \text{cm} \tag{10} \]
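One simple way to realize such a penalty is sketched below. The source states only that a penalty term is added to F when the constraints are not met; the linear form of the penalty, the weight of 1,000 and the function names are assumptions made for illustration, as is the reading of the ordering constraint (9) as h1 to h7 in descending order with ties treated as feasible.

```python
def constraint_violation(h, h_min=1.5, h_max=5.5):
    """Total violation (in cm) of the bounds (10) and the ordering (9)."""
    # Bounds: each height must lie between h_min and h_max
    bound = sum(max(0.0, h_min - hi) + max(0.0, hi - h_max) for hi in h)
    # Ordering: each height must not exceed the one before it
    order = sum(max(0.0, nxt - prev) for prev, nxt in zip(h, h[1:]))
    return bound + order


def penalized_object_function(F, h, weight=1.0e3):
    """Object function plus an (assumed linear) penalty on violations."""
    return F + weight * constraint_violation(h)


# A feasible design is left unchanged; an infeasible one is pushed up
feasible = [5.0, 4.5, 4.0, 3.5, 3.0, 2.5, 2.0]
infeasible = [5.0, 4.0, 4.5, 3.0, 2.5, 2.0, 1.0]
F_feasible = penalized_object_function(2.0, feasible)
F_infeasible = penalized_object_function(2.0, infeasible, weight=10.0)
```

Because the penalized value feeds directly into the fitness score, infeasible shapes are never rejected outright but are steadily outcompeted by feasible ones.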


The parameterized problem-specific mesh is shown in Figure 4, where the device descriptive parameter set {p} consists of the seven heights $h_i$. In the process of optimization, as these heights $h_i$ change, the mesh connections remain the same but the element sizes and shapes change.

Results and discussion thereof

Figure 5 shows the optimum shape of the conductor and temperature profile after 40 iterations for a population size of 512. The corner of the conductor rising toward the line where the constant 60°C temperature is desired is as to be expected. For as seen in Figure 6 (which shows the design goals being accomplished), the constant 60°C temperature is perfectly matched. The lower graph giving the initial temperature shows that the temperature drops above the corner of the conductor. Therefore to address this, the corner has to rise close to the line of measurement to heat the line above the corner and Figure 5 shows that this is what the optimization process has accomplished.

[Figure 4. The parameterized geometry, with the heights h1 to h7 marked.]

[Figure 5. Optimal shape by the genetic algorithm, showing the shaped edge and the contour at T = 60 °C.]

[Figure 6. Temperature distribution: desired, initial and optimized. Temperature (°C) against x coordinate (cm) from 4.00 to 8.00.]

Significant speedup was accomplished with GPU computation as seen in Figure 7. No gains in speedup beyond a factor of 28 were seen after a population size of 500. The meandering nature of the gain after that may be attributed to the happenstance inherent to a statistical method like the genetic algorithm. The gain of 28 or so for optimization including matrix solution is lower than the “30 or more” reported by Cecka et al. (2011) in their paper on direct finite element solution, but it is an impressive figure given the extensive communication time in a coupled problem such as this. To verify that the performance of our programming is superior we programmed the conjugate gradients solution for ever larger matrix sizes and measured the GPU:CPU computational time ratio. Our findings are seen in Figure 8 where we achieve a far superior gain of 147 as explained

in a companion paper (Hoole et al., 2014b). This may be because Cecka et al. could achieve only a two to threefold gain in matrix assembly. Even accounting for that, however, their gain is very small considering that matrix assembly takes but trivial, negligible time in the finite element solution process where the preponderant computational load is from matrix solution. The faster computational speed realized for different population sizes with real-coded object functions is shown in Figure 9, as is to be expected. (The times shown are the total for five runs to eliminate probabilistic changes in the genetic algorithm which produce the odd shape.) In the numerous experiments we performed, we found real-coded algorithms to be consistently faster, but speedup varied from 10 to 30 per cent. These numbers are still being interpreted and will be reported on in due course because the real coding of object functions remains unknown to finite element analysis in our discipline.

[Figure 7. Speedup: GA optimization GPU time: CPU time. Ratio of GPU/CPU times against population size.]

[Figure 8. Speedup of the conjugate gradients algorithm: matrix size vs CPU time/GPU time.]

[Figure 9. Computing time with real arithmetic and binary arithmetic. Time (s) against population size for 40 iterations, with one curve for real and one for binary coding.]

Conclusions

Shape optimization for the electroheat problem using GA has been presented and validated using a simple geometry and neglecting radiation and convection. Real-coded object functions give faster, more accurate solutions. This is the first two-stage finite element solution of a magnetic field problem and then a thermal

problem repeated in optimization iterations for finding both shape and currents and is amenable to implementation as a general purpose software tool. This avoids the need to use circuit models or analytical solutions to avoid the difficulties of optimization in a two-stage finite element solution process. The procedure provides for shape optimization whereas the extant limited literature on two-stage electroheat problem optimization shows only the current magnitude and phase being optimized without change of shape. This problem was computed using GPU parallel computing techniques whereby speedups of 28 were demonstrated. This is comparable to the speedup of 30 recently demonstrated in the literature for a single finite element solution. Yet we have demonstrated a speedup of 148 for a single finite element matrix equation solution. A companion paper will show how such an immense speedup was achieved (Hoole et al., 2014b).

Note

1. For the source code, visit Kalyanmoy Deb’s COIN Lab web site: www.egr.msu.edu/~kdeb/COIN.shtml Choose Source Code and then get the code at the link under “Single-objective GA code in C (for Windows and Linux)”: GA in C (Real + Binary + Constraint Handling). For GPU programming see: http://on-demand.gputechconf.com/gtc-express/2011/presentations/GTC_Express_Sarah_Tariq_June2011.pdf and http://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html

References

Arora, J.S. and Hang, E.J. (1976), “Efficient optimal design of structures by generalized steepest descent programming”, Int. J. Num. Meth. Eng, Vol. 10 No. 4, pp. 747-766.
Battistetti, M., Di Barba, P., Dughiero, F., Farina, M., Lupi, S. and Sivini, A. (2001), “Optimal design of an inductor for transverse flux heating using a combined evolutionary-simplex method”, COMPEL, Vol. 20 No. 2, pp. 507-522.
Cecka, C., Lew, A.J. and Darve, E. (2011), “Assembly of finite element methods on graphics processors”, Int. J. Num. Meth. Eng, Vol. 85 No. 5, pp. 640-669.
Chari, M.V.K. and Salon, S.J. (2000), Numerical Methods in Electromagnetism, Academic Press, San Diego, CA.
Deb, K. and Kumar, A. (1995), “Real-coded genetic algorithms with simulated binary crossover: studies on multimodal and multiobjective problems”, Complex Systems, Vol. 9 No. 6, pp. 431-454.
Deb, K., Pratap, A., Agarwal, S. and Meyarivan, T. (2002), “A fast and elitist multiobjective genetic algorithm: NSGA-II”, IEEE Trans. Evolutionary Computation, Vol. 6 No. 2, pp. 182-197.
Di Barba, P., Dughiero, F., Lupi, S. and Savini, A. (2003), “Optimal design of devices and systems for induction-heating: methodologies and applications”, COMPEL, Vol. 22 No. 1, pp. 111-122.
Fukuda, H. and Nakatani, Y. (2012), “Recording density limitation explored by head/media co-optimization using genetic algorithm and GPU-accelerated LLG”, IEEE Trans. Magn, Vol. 48 No. 11, pp. 3895-3898.
Haslinger, J. and Neittaanmaki, P. (1997), Finite Element Approximation for Optimal Shape, Material and Topology Design, Wiley, Chichester.

Henderson, J.L. (1994), “Laminated plate design using genetic algorithms and parallel-processing”, Computing Systems in Engineering, Vol. 5 Nos 4-6, pp. 441-453.
Hoole, S.R.H. (1990), “Finite element electromagnetic field computation on the sequent symmetry 81 parallel computer”, IEEE Trans. on Magnetics, Vol. 26 No. 2, pp. 837-840.

Hoole, S.R.H. (1995), Finite Elements, Electromagnetics and Design, Elsevier, Amsterdam.
Hoole, S.R.H. and Agarwal, K. (1997), “Parallelization of optimization methods”, IEEE Trans. Magn, Vol. 33 No. 2, pp. 1966-1969.
Hoole, S.R.H., Sivasuthan, S., Karthik, V.U., Rahunanthan, A., Thyagrajan, R.S. and Jayakumar, P. (2014a), “Electromagnetic device optimization: the forking of already parallelized threads on graphics processing units”, ACES Journal, September, Vol. 29 No. 9, pp. 677-694.
Hoole, S.R.H., Karthik, V.U., Sivasuthan, S., Rahunanthan, A., Thyagrajan, R.S. and Jayakumar, P. (2014b), “Finite elements, design optimization, and nondestructive evaluation: a review in magnetics, and future directions in gpu-based, element-by-element coupled optimization and NDE”, Intl. J. App. Electromagnetics in Matrls, Under review, doi 10.3233/JJAE-140061.
Hoole, S.R.H., Subramaniam, S., Saldanha, R., Coulomb, J.L. and Sabonnadiere, J.C. (1991), “Inverse problem methodology and finite elements in the identification of inaccessible locations, sources, geometry and materials”, IEEE Trans. on Mag, Vol. 27 No. 3, pp. 3433-3443.
Manikas, T.W. and Cain, J.T. (1996), “Genetic algorithms vs. simulated annealing: a comparison of approaches for solving the circuit partitioning problem”, Tech. Report TR-96-101, Department of Electrical Engineering, University of Pittsburgh, Pittsburgh.
Marrocco, A. and Pironneau, O. (1978), “Optimum design with Lagrangian Finite Elements: design of an electromagnet”, Computer Methods in Applied Mechanics and Engineering, Vol. 15 No. 3, pp. 277-308.
Morinova, I. and Mateeve, V. (2012), “Inverse source problem for thermal fields”, COMPEL, Vol. 31 No. 3, pp. 996-1006.
Nabaei, V., Mousavi, S.A., Miralikhani, K. and Mohensi, H. (2010), “Balancing current distribution in parallel windings of furnace transformers using the genetic algorithm”, IEEE Trans. Magn, Vol. 46 No. 2, pp. 626-629.
Pham, T.H. and Hoole, S.R.H. (1995), “Unconstrained optimization of coupled magneto-thermal problems”, IEEE Trans. Magn, Vol. 31 No. 3, pp. 1988-1991.
Pironneau, O. (1984), Optimal Shape Design for Elliptic Systems, Springer-Verlag, New York, NY.
Preis, K., Magele, C. and Biro, O. (1990), “FEM and evolution strategies in the optimal design of electromagnetic devices”, IEEE Trans. Magnetics, Vol. 26 No. 5, pp. 2181-2183.
Robilliard, D., Marion-Poty, V. and Fonlupt, C. (2009), “Genetic programming on graphics processing units”, Genetic Programming and Evolvable Machines, Vol. 10 No. 4, pp. 447-471.
Siauve, N., Nicolas, L., Vollaire, C., Nicolas, A. and Vasconcelos, J.A. (2004), “Optimization of 3-D SAR distribution in local RF hyperthermia”, IEEE Trans. Magn, Vol. 40 No. 2, pp. 1264-1267.
Sivasuthan, S., Karthik, V.U., Rahunanthan, A., Jayakumar, P., Thyagarajan, R.S., Udpa, L. and Hoole, S.R.H. (2014a), “A Script-based Parameterized Finite Element Mesh for Design and NDE on a GPU”, IETE Technical Review (accessed November 2014).
Sivasuthan, S., Karthik, V.U., Rahunanthan, A., Jayakumar, P., Thyagarajan, R.S., Udpa, L. and Hoole, S.R.H. (2014b), “GPU computation: Why element by element conjugate gradients?”, Sixteenth Biennial IEEE Conference on Electromagnetic Field Computation, Annecy, France, 25-28 May.
Starzynski, J. and Wincenciak, S. (1998), “Benchmark problems for optimal shape design for 2D Eddy currents”, COMPEL, Vol. 17 No. 4, pp. 448-459.


Subramaniam, S., Arkadan, A.A. and Hoole, S.R.H. (1994), “Optimization of a magnetic pole face using linear constraints to avoid jagged contours ‘Constraints for Smooth Geometric Contours from Optimization’”, IEEE Trans. Magn, Vol. 30 No. 5, pp. 3455-3458.
Toledo, C.F.M., Oliveira, L. and Franca, P.M. (2014), “Optimization of neural networks: a comparative analysis of the genetic algorithm and simulated annealing”, Journal of Computational and Applied Mathematics, Vol. 261 No. 1, pp. 341-335.
Vanderplaats, G.N. (1984), Numerical Optimization Techniques for Engineering Design, McGraw-Hill, New York, NY.
Venkataraman, P. (2009), Applied Optimization with MATLAB Programming, 2nd ed., John Wiley, Hoboken, NJ.
Wong, M.L. and Wong, T.T. (2009), “Implementation of parallel genetic algorithms on graphics processing units”, in Gen, M., Green, D., Katai, O., McKay, B., Namatame, A., Sarkar, R.A. and Zhang, B.-T. (Eds), Intelligent and Evolutionary Systems, Book Series: Studies in Computational Intelligence, Vol. 187, pp. 197-216.
Yablonski, D. (2011), “Numerical accuracy differences in CPU and GPGPU codes”, Paper No. 58, Electr. and Comp. Engineering Master’s Thesis, Northeastern University, Boston, MA, available at: http://hdl.handle.net/2047/d20001067 (accessed 10 February 2014).

Further reading
Hoole, S.R.H. (1989), Computer-Aided Analysis and Design of Electromagnetic Devices, Elsevier, New York, NY.
Hoole, S.R.H., Sathiaseelan, V. and Tseng, A. (1990), “Computation of Hyperthermia-SAR Distributions in 3-D”, IEEE Trans. Magnetics, Vol. 26 No. 2, pp. 1011-1014.
Hoole, S.R.H., Weeber, K.R. and Subramaniam, S. (1991), “Fictitious minima of object functions, finite element meshes, and edge elements in electromagnetic device synthesis”, IEEE Trans. Magn, Vol. 27 No. 3, pp. 5214-5216.
Randy, L.H. (1995), “Comparison between genetic and gradient-based optimization algorithms for solving electromagnetics problem”, IEEE Trans. Magnetics, Vol. 31 No. 3, pp. 1932-1935.
Randy, L.H. and Sue, E.H. (2004), Practical Genetic Algorithms, 2nd ed., John Wiley, Hoboken, NJ.

Corresponding author
Dr S. Ratnajeevan H. Hoole can be contacted at: [email protected]

