IEEE TRANSACTIONS ON MAGNETICS, VOL. 29, NO. 2, MARCH 1993
Artificial Neural Networks in the Solution of Inverse Electromagnetic Field Problems
S. Ratnajeevan H. Hoole
Department of Engineering, Harvey Mudd College, Claremont, CA 91711
Abstract: In an inverse electromagnetic field problem, we are asked to determine those values of the descriptive parameters of the device that would give us a required performance. In this paper, the use of artificial neural networks in the solution of inverse electromagnetic field problems is investigated. It is shown that artificial neural networks, while being no panacea, have a role to play in a limited domain of applications. That is, while it is ineffective to train networks to cover a broad class of devices, it is indeed possible to develop well-trained networks that function effectively over a narrow range of performance of a particular class of device.
I. INVERSE PROBLEMS IN ELECTROMAGNETICS
In a typical design of an electromagnetic device, or for that matter any engineering device, specifications are in terms of performance. That is, for a desired output vector {O}, we are asked to find the required descriptive parameters {p}. Here {O} may consist of flux densities at specified points, forces, etc., while {p} might have device dimensions, material properties such as permeabilities, and current densities as its elements. However, well-known analysis techniques like the finite element method [1] allow us only to predict performance {O} for given parameters {p}. The inverse problem, that is, the real engineering problem of finding the device-descriptive {p} for a required output {O}, has therefore been tackled iteratively [2-22] using gradient techniques, evolution strategies and simulated annealing. These methods are costly and require hundreds (even thousands when the latter statistical methods are employed) of finite element analyses of the design for different combinations of {p}. Other approximate schemes have also been proposed, using the manipulation of flux lines as objects for designing electromagnetic devices with expert system methodology [23], but such methods are, as stated, approximate, and the resulting design has to be fine-tuned using one of the more sophisticated numerical schemes. This paper investigates the use of neural networks in inverse electromagnetic field problems.
II. ARTIFICIAL NEURAL NETWORKS
Figure 1: An Artificial Neural Network
What is a neural network and what does it do? A neural network is said to be a model of the brain and how it functions in the process of recognition [24-26]. This recognition is based on learning from past experience. The relevant model for our purposes here, however, is merely an algorithm, a piece of code, that does a mapping. Using given data pairs ({O}, {p}), our past experience, the neural network learns the mapping {O} → {p}. In the process of learning, the neural network merely computes the mapping function. While there is much terminology associated with neural networks, it does not have to delay us here since it is meant more to dazzle than clarify.
As shown in Fig. 1, the neural network has an input layer of neurons (computing elements), one or more hidden layers of neurons, and an output layer of neurons. Here the inputs to the neural network {O} (that is, the output performance of the electromagnetic device) are symbolized by {B} in anticipation of the desired flux density at a set of points that we will specify as the output for our example. Each neuron is depicted as a circle with a value in it, this value being the input in the first layer, the output in the last layer, and both the input and output for neurons in the hidden layer. We may increase the number of neurons per layer and also have more hidden layers. This generally, but not always, increases the power of the network, although there are diminishing returns and issues of architectural design involved. Each neuron of a layer is connected to each one of the next layer, and the path from neuron i of layer 1 to neuron j of the next layer has the weight w1ij. The process of learning has to do with computing these weights. The neuron with input/output 1 in each layer is there for generality, for thresholding purposes. The mapping from one layer to the next is now done thus for a neuron j in the intermediate layer:
$$\mathrm{Sum}_j = \sum_{i=0}^{m} w1_{ij}\, B_i \tag{1}$$
$$h_j = \frac{1}{1 + e^{-\mathrm{Sum}_j}} \tag{2}$$
Figure 2: A Sigmoid Function
where the latter is the sigmoid function depicted in Fig. 2. Other functions incorporating the range (0,1), such as the sine function, the hyperbolic tangent and the straight line, have also been used. The index i = 0 corresponds to the thresholding neuron. Another point to note is that by the very nature of the sigmoid
function, whose values are in the range (0,1), our {p} cannot be raw numbers. They must first be scaled (or, more formally, mapped) into this interval before they are used. While a thresholding function is sometimes utilized in place of the sigmoid function in pattern recognition, for the back-propagation algorithm we shall use, this sigmoid function is desirable. Thus, depending on the weights, for a given input {B} the network will return the output {p}. It is thus these weights that determine the mapping. Finding these weights is known as training or learning, and one way to do it is to use the back-propagation algorithm.
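The layer-to-layer mapping of Eqs. (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration and not the code used in the paper: the function names, the tiny 2-2-1 network and its weights are all hypothetical, and the thresholding neuron is modelled as an extra value fixed at 1 in position 0 of each layer.

```python
import math


def sigmoid(x):
    """Logistic sigmoid of Eq. (2); its values lie strictly in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(weights, inputs):
    """Map one layer to the next using Eqs. (1) and (2).

    weights[i][j] is the weight from neuron i of this layer to neuron j of
    the next layer; row i = 0 belongs to the thresholding neuron, whose
    value is fixed at 1.
    """
    values = [1.0] + list(inputs)  # prepend the thresholding neuron
    n_next = len(weights[0])
    return [
        sigmoid(sum(weights[i][j] * values[i] for i in range(len(values))))
        for j in range(n_next)
    ]


def network_forward(w1, w2, B):
    """Return the hidden values {h} and the output {p} for scaled input {B}."""
    h = layer_forward(w1, B)
    p = layer_forward(w2, h)
    return h, p


# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output, made-up weights.
w1 = [[0.0, 0.0], [0.5, -0.5], [0.25, 0.75]]  # (1 + 2) rows x 2 columns
w2 = [[0.0], [1.0], [-1.0]]                   # (1 + 2) rows x 1 column
h, p = network_forward(w1, w2, [0.2, 0.4])
```

Because every output passes through the sigmoid, `p` can never leave (0, 1), which is why the target parameters have to be scaled into that interval before training.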
Figure 4: Test Problem (labels: Object: Constant Flux Density; Idealized Air-gap)
In the back-propagation algorithm, shown in Figure 3 and based on [24], the network needs ({B}, {p}) data pairs, which are used to train the network to learn the function that maps {B} to {p}. {B} is fed in as input, and the network, with arbitrary weights, is allowed to compute {p}. Then, using the difference between the computed {p}, denoted {pc}, and the target {p}, the {p} that we know to be correct, we move back into the network, adjusting the weights so as to minimize the error.
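The output-layer error term of the algorithm in Figure 3 can be motivated by a short derivation. This sketch assumes a squared-error measure E, which the text does not state explicitly; η is the learning rate and σ is the sigmoid of Eq. (2).

```latex
\begin{align*}
E &= \tfrac{1}{2} \sum_{j} \left(p_j - p_{jc}\right)^2,
  \qquad p_{jc} = \sigma(s_j), \qquad s_j = \sum_{i} w2_{ij}\, h_i \\
\sigma'(s) &= \sigma(s)\,\bigl(1 - \sigma(s)\bigr) \\
-\frac{\partial E}{\partial w2_{ij}}
  &= \left(p_j - p_{jc}\right) p_{jc}\left(1 - p_{jc}\right) h_i
   = \delta 2_j\, h_i \\
\Delta w2_{ij} &= \eta\, \delta 2_j\, h_i
\end{align*}
```

Under this assumption each weight moves down the error gradient, and the factor p_jc(1 - p_jc) in the error term is simply the derivative of the sigmoid.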
Figure 5: The Reduced Problem
III. NEURAL NETWORKS IN ELECTROMAGNETICS
The earliest use of a neural network in electromagnetics was by Ahn, Lee, Lee and Lee [27] and Dyck, Lowther, and McFee [28], who used it to determine optimal finite element meshes. It was more recently, however, that neural networks were used to solve inverse problems. Hoole [29], Low and Chao [30], Mohammed, Park, Uler and Ziqiang [31] and Ishiguro, Tanaka and Uchikawa [32] have independently proposed employing artificial neural networks to solve inverse electromagnetic field problems. As proposed in [29], data pairs ({B}, {p}) are generated using finite element analysis and these are used to train the network. The trained network, it has been proposed, can then be used to compute the network output {p} for a desired network input {B}. The purpose of this paper is to investigate, in greater detail than is available in the literature and was appropriate for an educational journal [29], the use of neural networks in solving inverse electromagnetic field problems. In conclusion, it is shown that neural networks, while having their natural niche in inverse problem solution, need to be used with care, and are applicable only to a limited class of problems.

η ← 0.95 (or other learning rate)
Set each weight w1ij, w2ij to a random number between -0.1 and 0.1
For every training pair {B}, {p} Do
    Assign this {B} to the input units xi
    For j ← 0 to l
        hj ← 1 / (1 + exp(-Σ(i=0..m) w1ij xi))
    For j ← 0 to n
        pjc ← 1 / (1 + exp(-Σ(i=0..l) w2ij hi))
    For j ← 0 to n    [δ2j is the error between the output pjc and the target output pj]
        δ2j ← pjc (1 - pjc) (pj - pjc)
    For j ← 0 to l    [compute the error δ1j in the hidden layer]
        δ1j ← hj (1 - hj) Σ(i=1..n) δ2i w2ji
    For j ← 0 to n
        For i ← 0 to l
            Δw2ij ← η δ2j hi
            w2ij ← w2ij + Δw2ij
    For j ← 0 to l
        For i ← 0 to m
            Δw1ij ← η δ1j xi
            w1ij ← w1ij + Δw1ij
End.
Figure 3: The Back-Propagation Algorithm
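The steps of Figure 3 can be turned into a small runnable sketch. Everything named here (`train`, `forward`, `sum_squared_error`, the 1-3-1 network and the toy data pairs) is hypothetical and chosen only for illustration. In a real study the pairs would come from finite element analyses. The sketch follows Figure 3 as printed, so it has no momentum term, and it treats the thresholding neuron as a fixed value of 1 at index 0 of the input and hidden layers.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(w1, w2, B):
    """Forward pass; index 0 of the input and hidden layers is the threshold (value 1)."""
    x = [1.0] + list(B)
    h = [1.0] + [
        sigmoid(sum(w1[i][j] * x[i] for i in range(len(x))))
        for j in range(len(w1[0]))
    ]
    pc = [
        sigmoid(sum(w2[i][j] * h[i] for i in range(len(h))))
        for j in range(len(w2[0]))
    ]
    return x, h, pc


def train(pairs, n_hidden, eta=0.95, epochs=2000, seed=0):
    """Online back-propagation following the steps of Fig. 3 (no momentum)."""
    rng = random.Random(seed)
    m, n = len(pairs[0][0]), len(pairs[0][1])
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(m + 1)]
    w2 = [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for B, p in pairs:
            x, h, pc = forward(w1, w2, B)
            # Output-layer error: d2_j = pc_j (1 - pc_j) (p_j - pc_j)
            d2 = [pc[j] * (1 - pc[j]) * (p[j] - pc[j]) for j in range(n)]
            # Hidden-layer error: d1_j = h_j (1 - h_j) * sum_i d2_i w2_ji,
            # computed before the second-layer weights are changed.
            d1 = [
                h[j + 1] * (1 - h[j + 1])
                * sum(d2[i] * w2[j + 1][i] for i in range(n))
                for j in range(n_hidden)
            ]
            for i in range(n_hidden + 1):
                for j in range(n):
                    w2[i][j] += eta * d2[j] * h[i]
            for i in range(m + 1):
                for j in range(n_hidden):
                    w1[i][j] += eta * d1[j] * x[i]
    return w1, w2


def sum_squared_error(w1, w2, pairs):
    return sum(
        (p[j] - forward(w1, w2, B)[2][j]) ** 2
        for B, p in pairs
        for j in range(len(p))
    )


# Hypothetical training pairs ({B}, {p}), both already scaled into (0, 1).
pairs = [([b], [0.2 + 0.6 * b]) for b in (0.1, 0.3, 0.5, 0.7, 0.9)]
w1_0, w2_0 = train(pairs, n_hidden=3, epochs=0)     # untrained weights
w1_t, w2_t = train(pairs, n_hidden=3, epochs=2000)  # same seed, trained
error_before = sum_squared_error(w1_0, w2_0, pairs)
error_after = sum_squared_error(w1_t, w2_t, pairs)
```

Comparing `error_before` with `error_after` shows whether training has reduced the misfit on the toy pairs. The weights are updated after every pair rather than once per pass, which is how the loop in Figure 3 reads.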
IV. THE POLE FACE EXAMPLE
To test the ideas proposed, we shall take as an example the optimization of the contour of a pole-face so as to make the flux density in the air-gap constant. This configuration is shown in Fig. 4 to scale, where the distance from the top to the bottom is 10 cm. By symmetry and periodicity, as well as by imposing no-leakage conditions at the top and bottom, the problem may be reduced to that shown in Fig. 5. This geometry was optimized by the structural mapping technique [14], determining the three nodal displacements u1, u2, and u3, regarding the pole as a structure, so as to produce that smooth distortion of a structure that would give us the required constant flux density in the air-gap. The value of u = 0.1 corresponds to a shift of 0.2 cm. Table 1 summarizes, for each desired flux density from 1.0 T to 1.6 T, the resulting three parameters {p} and the actual values of the flux density B at 5 measuring points in the air-gap. Naturally, the parameters {p} are the actual displacements u mapped according to
$$p = \begin{cases} 0.01 + 0.05056819\,(u + 9.72411) & \text{if } u < 0 \\ 0.51 + 0.040068918\,(u - 1.08174) & \text{if } u \ge 0 \end{cases} \tag{3}$$
Table 1: Specified B and Corresponding {p} and {B} from Structural Optimization
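The piecewise scaling of Eq. (3) is easy to check numerically. In the sketch below the function names are hypothetical, and the inverse mapping with its branch rule (choose the branch by which side of 0.5 the scaled value falls on) is an assumption added for illustration. It holds only for displacements that land in the scaled sub-ranges the constants of Eq. (3) appear to target.

```python
def scale_displacement(u):
    """Map a nodal displacement u to a network parameter p, following Eq. (3)."""
    if u < 0:
        return 0.01 + 0.05056819 * (u + 9.72411)
    return 0.51 + 0.040068918 * (u - 1.08174)


def unscale_parameter(p):
    """Assumed inverse of Eq. (3): recover u from a scaled parameter p.

    The branch is picked by comparing p with 0.5, which is an assumption
    and is valid only for values that fall in the lower or upper sub-range.
    """
    if p < 0.5:
        return (p - 0.01) / 0.05056819 - 9.72411
    return (p - 0.51) / 0.040068918 + 1.08174


# The constants place these two displacements at the start of each sub-range.
p_low = scale_displacement(-9.72411)
p_mid = scale_displacement(1.08174)

# Round trip for one negative and one positive displacement.
u_neg = unscale_parameter(scale_displacement(-3.0))
u_pos = unscale_parameter(scale_displacement(5.0))
```

A network output would be passed through something like `unscale_parameter` to turn it back into a physical displacement.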
Under the mapping (3), all positive u's are mapped onto (0.51, 0.99) and all negative u's onto (0.01, 0.49) because, as explained, of the range of the sigmoid function; the numbers 0 and 1 are avoided since these are limiting values of the sigmoid function.
V. RESULTS
Using all but the second row of Table 1 (B = 1.1), a neural network of 4 inputs, 3 outputs and 1 hidden layer with 15 nodes was trained using a momentum rate of 0.95 and a learning rate of 0.95 on a SEQUENT Parallel Computer with 8 processors. The network was then asked to predict the parameters for a desired B of 1.1 T, whose solution we know from Table 1 to be {0.490, 0.900, 0.090}. The result returned by the network was {0.435, 0.904, 0.097}. It took 621 iterations. Be it noted that the data points are over a patterned sequence of geometries. The field solution corresponding to this geometry is shown in Fig. 6.
Subsequently, the same network was trained using arbitrary data sets from the many field solutions that were obtained in the process of each optimization. That is, the data points were scattered, this time, over a slightly broader range of operation of the device. Now to test the network, for flux densities corresponding to {p} = {0.785, 0.704, 0.307}, the output of the network was {0.813, 0.740, 0.244}; that is, the error had increased. When completely arbitrary data was used, as to be expected, the network returned answers with much larger error, necessitating the use of even more points and even more neurons. Unless the training sets were close to the answer sought, generally, unmitigable convergence problems were experienced in training with the back-propagation algorithm.¹ Further observations are that the training time for the network depends on the order in which the data sets [are used] in running the back-propagation algorithm. That is, if the data are given in the order B = 1.0, 1.2, 1.4, 1.6, the convergence is superior to when the order is arbitrary.²
Figure 6: Flux Pattern and Pole-face for B = 1.1 T in Airgap
VI. CONCLUSIONS
[...] the more expensive and conventional optimization strategies.
REFERENCES
[1] S. Ratnajeevan H. Hoole, "Computer Aided Analysis and Design of Electromagnetic Devices," Elsevier, New York, 1989.
[2] S. Ratnajeevan H. Hoole, S. Subramaniam,
Saldanha, J.-L. Coulomb, and J.-C. Sabonnadiere, "[...] Problem Methodology and Finite Elements in the Identification of [...] Inaccessible Locations, Sources, and Materials," IEEE Trans. Magn., Vol. 27, pp. 3433-3443, May 1991.
[3] S. Ratnajeevan H. Hoole, "Optimal Design, Inverse Problems and Parallel Computers," IEEE Trans. Magn., Vol. 27, pp. 4146-4149, Sept. 1991.
[4] S. Subramaniam, S. Kanaganathan and S. R. Hoole, "Two Requisite Tools in the Optim[...] Electromagnetic Devices," IEEE Trans. Magn., [...] 4105-4109, Sept. 1991.
[5] S. Ratnajeevan H. Hoole and S. Sirikumaran, [...] off Aircraft and Shape Optimization of a Rid[...], IEEE Trans. Magn., Vol. 27, pp. 4150-415[...]
[6] S. Ratnajeevan H. Hoole, K. Weeber and [...], "Fictitious Minima of Object Functions, [...] Meshes, and Edge Elements in Elec[...] Synthesis," IEEE Trans. Magn., Vo[...] Nov. 1991.
[7] M. Ratnarajan R. Hoole and S. Ratnajeevan H. Hoo[...] Hessian in Inverse Problem Opt[...] Electromagnetic Field Computation," [...] Electromag. Matrls., Vol. 2, Special Sup[...] Application of Electromagnetic Forces, Jan. [...] 270.
[8] S. Ratnajeevan H. Hoole, "Inverse Solutio[...] Antenna Problem," Int. J. App. Electr[...], Special Supplement on the Application [...] Forces, Jan. 1991, pp. 251-254.
¹ Similar experience has been confirmed in a personal communication by Y. Uchikawa of Nagoya University, Nagoya, Japan.
² Also confirmed in a personal communication by Nathan Ida of the University of Akron, Akron, OH, U.S.A.
[9] S. Ratnajeevan H. Hoole and S. Subramaniam, "Higher Finite Element Derivatives for the Quick Synthesis of Electromagnetic Devices," IEEE Trans. Magn., Vol. 28, No. 2, pp. 1565-1568, March 1992.
[10] S. Ratnajeevan Hoole and S. Subramaniam, "Inverse Problems with Boundary Elements," IEEE Trans. Magn., Vol. 28, No. 2, pp. 1529-1532, March 1992.
[11] K. Weeber and S. Ratnajeevan H. Hoole, "The Subregion Method in Magnetic Field Analysis and Design Optimization," IEEE Trans. Magn., Vol. 28, No. 2, pp. 1561-1564, March 1992.
[12] S. Ratnajeevan H. Hoole, "An Integrated System for the Synthesis of Coated Waveguides from Specified Attenuation," IEEE Trans. Microwave Theory and Techniques, Vol. 40, No. 7, pp. 1564-1571, July 1992.
[13] K. R. Weeber and S. Ratnajeevan H. Hoole, "Geometric Parametrization and Constrained Optimization Techniques in the Design of Salient Pole Synchronous Machines," IEEE Trans. Magn., Vol. 28, pp. 1948-1960, July 1992.
[14] Konrad Weeber and S. Ratnajeevan H. Hoole, "A Structural Mapping Technique for Geometric Parametrization in the Synthesis of Magnetic Devices," Int. J. Num. Meth. Eng., Vol. 33, pp. 2145-2179, July 15, 1992.
[15] S. Ratnajeevan H. Hoole, "Synthesizing a Square-wave Generating Synchronous Machine," Int. J. Appl. Electromagn. in Matrls., in press.
[16] K. R. Weeber, E. Johnson, S. Sinniah, K. Holte, S. Sabonnadiere and S. R. H. Hoole, "Design Sensitivity for Skin Effect and Minimum Volume Optimization of Magnetic Shields," IEEE Trans. Magn., Sept. 1992.
[17] K. Preis, O. Biro, M. Friedrich, A. Gottvald and C. Magele, "Comparison of Different Optimization Strategies in the Design of Electromagnetic Devices," IEEE Trans. Magn., Vol. 27, No. 5, pp. 4154-4157, 1991.
[18] S. Russenschuck, "Application of Lagrange Multiplier Estimation to the Design Optimization of Permanent Magnet Synchronous Machines," IEEE Trans. Magn., Vol. 28, No. 2, pp. 1526-1529, 1992.
[19] I. Park, B. Lee and S. Hahn, "Design Sensitivity Analysis for Nonlinear Magnetostatic Problems Using Finite Element Method," IEEE Trans. Magn., Vol. 28, No. 2, pp. 1534-1537, 1992.
[20] J. Simkin and C. Trowbridge, "Optimizing Electromagnetic Devices Combining Direct Search Methods with Simulated Annealing," IEEE Trans. Magn., Vol. 28, No. 2, pp. 1546-1549, 1992.
[21] M. Kasper, "Shape Optimization by Evolution Strategy," IEEE Trans. Magn., Vol. 28, No. 2, pp. 1556-1559, 1992.
[22] R. R. Saldanha, J. L. Coulomb, and J. C. Sabonnadiere, "An Ellipsoid Algorithm for the Optimum Design of Magnetostatic Problems," IEEE Trans. Magn., Vol. 28, No. 2, pp. 1573-1576, 1992.
[23] D. Kurumbalapitiya and S. Ratnajeevan H. Hoole, "Rules for Manipulating Equipotential Plots in Expert System Design," Int. J. Appl. Electromagn. in Matrls., in press.
[24] Elaine Rich and Kevin Knight, "Artificial Intelligence," 2nd ed., McGraw-Hill, New York, 1991.
[25] M. Takeda and J. W. Goodman, "Neural Networks for Computation: Number Representations and Programming Complexity," Applied Optics, Vol. 28, No. 18, Sept. 15, 1986.
[26] V. Vemuri, "Artificial Neural Networks: Theoretical Concepts," IEEE Computer Society Press, Los Alamitos, CA, 1990.
[27] Chang-Hoi Ahn, Sang-Soo Lee, Hyuek-Jae Lee and Soo-Young Lee, "A Self-organizing Neural Network Approach for Automatic Mesh Generation," IEEE Trans. Magn., Vol. 27, pp. 4201-4204, Sept. 1991.
[28] D. N. Dyck, D. A. Lowther, and S. McFee, "Determining an Approximate Finite Element Mesh Density Using Neural Network Techniques," IEEE Trans. Magn., Vol. 28, pp. 1767-1770, March 1992.
[29] S. Ratnajeevan H. Hoole, "A Course on Computer Modeling for Second or Third Year Undergraduates," IEEE Trans. Educ., Vol. 36, No. 1, Feb. 1993.
[30] T. Low and B. Chao, "The Use of Finite Elements and Neural Networks for the Solution of Inverse Electromagnetic Problems," IEEE Trans. Magn., Vol. 28, No. 5, pp. 2811-2813, Sept. 1992.
[31] O. Mohammed, D. Park, F. Uler and C. Ziqiang, "Design Optimization of Electromagnetic Devices Using Artificial Neural Networks," IEEE Trans. Magn., Vol. 28, No. 5, pp. 2805-2807, Sept. 1992.
[32] A. Ishiguro, Y. Tanaka and Y. Uchikawa, "An Estimation of Current Distribution Using Genetic Algorithms," Int. Symp. on Nonlinear Phenomena in Electromagnetic Fields (ISEM), Paper AP-2-10, Jan. 26-29, 1992.
Acknowledgements
This work was supported by the Harvey Mudd College-Southern California Edison Center for Excellence in Electrical Systems. Help from Duy Hong Nguyen, a Senior at Harvey Mudd College, in training the neural network is gratefully acknowledged.
S. Ratnajeevan H. Hoole (M '83, SM '89) was born in Tamil Ceylon on Sept. 15, 1952. He received the B.Sc. degree in electrical engineering (with honors) in 1975 from the University of Ceylon, the M.Sc. degree with the Mark of Distinction in electrical engineering from the University of London in 1979, jointly offered through Imperial College and Queen Mary College, and the Ph.D. degree in electrical engineering from Carnegie-Mellon University, Pittsburgh, PA, in 1982. After working as a faculty member in Ceylon, Nigeria, and Drexel University, and as a consultant in Singapore and the U.S., he joined Harvey Mudd College, Claremont, CA, in 1987 as an Associate Professor of Engineering, and now holds the rank of Professor with continuous tenure. His area of interest is the finite element and other numerical methods and their application in computer-aided design. He has published numerous papers in this area, besides his textbook "Computer Aided Analysis and Design of Electromagnetic Devices" (Elsevier, 1989). He is presently working under contract on a book for Oxford University Press on undergraduate electromagnetics, and another for Elsevier on deeper issues in finite element analysis; these are both to come out early in 1993. He is also the Guest Editor of the special issue of the IEEE Transactions on Education on Computation and Computers (November 1992).
Dr. Hoole is a past Chairman of the IEEE Magnetics Society's Philadelphia Chapter and serves on the Editorial Boards of Electrosoft, the Journal of Electromagnetic Waves and Applications, and the International Journal of Applied Electromagnetics in Materials. He is presently the General Chairman of the IEEE Magnetics Society's Conference on Electromagnetic Field Computation (Claremont, CA, August 3-5, 1992). Dr. Hoole is a member of the Electromagnetics Academy, and the Magnetics, Education, and Computer Societies of the IEEE, as well as the Power Engineering Society and its Electric Machine Committee. Dr. Hoole also teaches a course titled "The Political Economy of South Asia" at Harvey Mudd College, and is the Director of the Sri Lanka Studies Institute, Claremont, CA. He has published numerous non-technical articles in places such as the refereed journal Indian Church History Review, The Los Angeles Times, and newspapers in Sri Lanka.