Artificial Neural Networks in the Solution of Inverse Electromagnetic Field Problems
S. Ratnajeevan H. Hoole
Department of Engineering, Harvey Mudd College, Claremont, CA 91711
Abstract: In an inverse electromagnetic field problem, we are asked to determine those values of the descriptive parameters of the device that would give us a required performance. In this paper, the use of artificial neural networks in the solution of inverse electromagnetic field problems is investigated. It is shown that artificial neural networks, while being no panacea, have a role to play in a limited domain of applications. That is, while it is ineffective to train networks to cover a broad class of devices, it is indeed possible to develop well-trained networks that function effectively over a narrow range of performance of a particular class of device.
I. INVERSE PROBLEMS IN ELECTROMAGNETICS
In a typical design of an electromagnetic device, or for that matter any engineering device, specifications are in terms of performance. That is, for a desired output vector {O}, we are asked to find the required descriptive parameters {p}. Here {O} may consist of flux densities at specified points, forces, etc., while {p} might have device dimensions, material properties such as permeabilities, and current densities as its elements. However, well-known analysis techniques like the finite element method [1] allow us only to predict performance {O} for given parameters {p}. The inverse problem, which is the real engineering problem of finding the device-descriptive {p} for a required output {O}, has therefore been tackled iteratively [2-22] using gradient techniques, evolution strategies and simulated annealing. These methods are costly and require hundreds (even thousands, when the latter statistical methods are employed) of finite element analyses of the design for different combinations of {p}. Other approximate schemes have also been proposed, using the manipulation of flux lines as objects for designing electromagnetic devices by expert system methodology [23], but such methods are, as stated, approximate, and the resulting design has to be fine-tuned using one of the more sophisticated numerical schemes. This paper investigates the use of neural networks in inverse electromagnetic field problems.
Figure 1: An Artificial Neural Network
What is a neural network and what does it do? A neural network is said to be a model of the brain and how it functions in the process of recognition [24-26]. This recognition is based on learning from past experience. The relevant model for our purposes here, however, is merely an algorithm, a piece of code, that does a mapping. Using given data pairs ({O}, {p}), our past experience, the neural network learns the mapping {O} → {p}. In the process of learning, the neural network merely computes the mapping function. While there is much terminology associated with neural networks, it need not delay us here, since it is meant more to dazzle than to clarify.
As shown in Fig. 1, the neural network has an input layer of neurons (computing elements), one or more hidden layers of neurons, and an output layer of neurons. Here the inputs to the neural network {O} (that is, the output performance of the electromagnetic device) are symbolized by {B}, in anticipation of the desired flux density at a set of points that we will specify as the output for our example. Each neuron is depicted as a circle with a value in it, this value being the input in the first layer, the output in the last layer, and both the input and output for neurons in the hidden layer. We may increase the number of neurons per layer and also have more hidden layers. This generally, but not always, increases the power of the network, although there are diminishing returns and issues of architectural design involved. Each neuron of a layer is connected to each neuron of the next layer, and the path from neuron i of layer 1 to neuron j of the next layer has the weight w1ij. The process of learning has to do with computing these weights. The neuron with input/output 1 in each layer is there for generality, for thresholding purposes. The mapping from one layer to the next is now done thus for a neuron j in the intermediate layer:
$$\mathrm{Sum}_j = \sum_{i=0}^{m} w1_{ij}\, B_i \tag{1}$$
$$h_j = \frac{1}{1 + e^{-\mathrm{Sum}_j}} \tag{2}$$
Figure 2: A Sigmoid Function
where the latter is the sigmoid function depicted in Fig. 2. Other functions incorporating the range (0,1), such as the sine function, the hyperbolic tangent and the straight line, have also been used. The index i = 0 corresponds to the thresholding neuron. Another point to note is that, by the very nature of the sigmoid function whose values are in the range (0,1), our {p} cannot be raw numbers. They must first be scaled (or, more formally, mapped) into this interval before they are used. While a thresholding function is sometimes utilized in place of the sigmoid function in pattern recognition, for the back-propagation algorithm we shall use, the sigmoid function is desirable. Thus, depending on the weights, for a given input {B} the network will return the output {p}. It is thus these weights that determine the mapping. Finding these weights is known as training or learning, and one way to do it is to use the back-propagation algorithm.
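The layer-to-layer mapping described above (a weighted sum followed by the sigmoid) can be illustrated with a short Python sketch. This is only a sketch under stated assumptions: the function names `sigmoid`, `layer_forward` and `network_forward`, the list-of-lists weight layout `w[i][j]` with row 0 reserved for the thresholding neuron, and all numeric weights are hypothetical choices made for the example and are not taken from the paper.

```python
import math

def sigmoid(x):
    """Logistic sigmoid; its values lie strictly inside (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(values, w):
    """Map one layer to the next.

    values: outputs of the current layer, WITHOUT the thresholding neuron.
    w[i][j]: weight from neuron i of this layer to neuron j of the next;
             row i = 0 belongs to the thresholding neuron, whose value is 1.
    """
    inputs = [1.0] + list(values)  # prepend the thresholding neuron
    n_next = len(w[0])
    return [sigmoid(sum(w[i][j] * inputs[i] for i in range(len(inputs))))
            for j in range(n_next)]

def network_forward(B, w1, w2):
    """Input {B} -> hidden layer -> output {p}, each p_j lying in (0, 1)."""
    h = layer_forward(B, w1)
    return layer_forward(h, w2)

# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output, made-up weights.
w1 = [[0.0, 0.1], [0.5, -0.5], [-0.3, 0.2]]  # rows: threshold + 2 inputs
w2 = [[0.0], [1.0], [-1.0]]                  # rows: threshold + 2 hidden
p = network_forward([1.0, 1.2], w1, w2)
```

Because every output passes through the sigmoid, each element of `p` necessarily falls inside (0, 1), which is why the target parameters have to be scaled into that interval before training.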
Figure 4: Test Problem (figure label: Constant Flux Density)
In the back-propagation algorithm, shown in Figure 3 and based on [24], the network needs ({B}, {p}) data pairs, which are used to train the network to learn the function that maps {B} to {p}. {B} is fed in as input, and the network with arbitrary weights is allowed to compute {p}. Then, using the difference between the computed {p}, denoted {pc}, and the target {p}, the {p} that we know to be correct, we move back into the network adjusting the weights so as to minimize the error.
Figure 5: The Reduced Problem
The earliest use of a neural network in electromagnetics was by Ahn, Lee, Lee and Lee [27] and Dyck, Lowther, and McFee [28], who used it to determine optimal finite element meshes. It was only more recently, however, that neural networks were used to solve inverse problems. Hoole [29], Low and Chao [30], Mohammed, Park, Uler and Ziqiang [31] and Ishiguro, Tanaka and Uchikawa [32] have independently proposed employing artificial neural networks to solve inverse electromagnetic field problems. As proposed in [29], data pairs ({B}, {p}) are generated using finite element analysis and these are used to train the network. The trained network, it has been proposed, can then be used to compute the network output {p} for a desired network input {B}. The purpose of this paper is to investigate the use of neural networks in solving inverse electromagnetic field problems in greater detail than is available in the literature and than was appropriate for an educational journal [29]. In conclusion, it is shown that neural networks, while having their natural niche in inverse problem solution, need to be used with care, and are applicable only to a limited class of problems.

    η ← 0.95 (or other learning rate)
    Set each weight w1ij, w2ij to a random number between -0.1 and 0.1
    For every training pair {B}, {p} Do
        Assign this {B} to the input units
        For j ← 0 to l
            hj ← 1 / (1 + exp(-Σ(i=0..m) w1ij Bi))
        For j ← 0 to n
            pjc ← 1 / (1 + exp(-Σ(i=0..l) w2ij hi))
        For j ← 0 to n    [δ2j is the error between the output pjc and the target output pj]
            δ2j ← pjc (1 - pjc) (pj - pjc)
        For j ← 0 to l    [compute the error δ1j in the hidden layer]
            δ1j ← hj (1 - hj) Σ(i=1..n) δ2i w2ji
        For j ← 0 to n
            For i ← 0 to l
                Δw2ij ← η δ2j hi
                w2ij ← w2ij + Δw2ij
        For j ← 0 to l
            For i ← 0 to m
                Δw1ij ← η δ1j Bi
                w1ij ← w1ij + Δw1ij
    End.
Figure 3: The Back-Propagation Algorithm
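The steps of the back-propagation algorithm of Figure 3 can be sketched in Python as follows. This is a minimal sketch and not the code used for the results reported here: the names `init_weights`, `forward` and `train_pair`, the weight layout (row 0 for the thresholding neuron), the network size and the single training pair are all assumptions made for illustration, and the momentum term mentioned with the results is omitted.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_weights(n_from, n_to, rng):
    """(n_from + 1) x n_to weights in [-0.1, 0.1]; row 0 is the threshold neuron."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_to)]
            for _ in range(n_from + 1)]

def forward(x, w):
    """Weighted sum over threshold + inputs, followed by the sigmoid."""
    xs = [1.0] + list(x)
    return [sigmoid(sum(w[i][j] * xs[i] for i in range(len(xs))))
            for j in range(len(w[0]))]

def train_pair(B, p, w1, w2, eta=0.95):
    """One weight update for one ({B}, {p}) pair.

    Returns the squared error measured BEFORE the update.
    """
    h = forward(B, w1)    # hidden-layer values
    pc = forward(h, w2)   # computed output {pc}
    # Output-layer error terms.
    d2 = [pc[j] * (1 - pc[j]) * (p[j] - pc[j]) for j in range(len(p))]
    # Hidden-layer error terms, using w2 before it is changed;
    # row k + 1 of w2 belongs to hidden neuron k (row 0 is the threshold).
    d1 = [h[k] * (1 - h[k]) * sum(d2[j] * w2[k + 1][j] for j in range(len(p)))
          for k in range(len(h))]
    hs = [1.0] + h
    for i in range(len(hs)):
        for j in range(len(p)):
            w2[i][j] += eta * d2[j] * hs[i]
    Bs = [1.0] + list(B)
    for i in range(len(Bs)):
        for k in range(len(h)):
            w1[i][k] += eta * d1[k] * Bs[i]
    return sum((p[j] - pc[j]) ** 2 for j in range(len(p)))

rng = random.Random(0)
w1 = init_weights(2, 3, rng)   # 2 inputs -> 3 hidden neurons (hypothetical)
w2 = init_weights(3, 1, rng)   # 3 hidden neurons -> 1 output
B, p = [0.2, 0.7], [0.8]       # made-up, already-scaled training pair
errors = [train_pair(B, p, w1, w2) for _ in range(200)]
```

With a single made-up pair the error simply shrinks as the update is repeated; with real data the loop would run over every ({B}, {p}) pair in each pass.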
IV. THE POLE FACE EXAMPLE
To test the ideas proposed, we shall take as an example the optimization of the contour of a pole face so as to make the flux density in the air gap a constant. This configuration is shown in Fig. 4 to scale, where the distance from the top to the bottom is 10 cm. By symmetry and periodicity, as well as by imposing no-leakage conditions at the top and bottom, the problem may be reduced to that shown in Fig. 5. This geometry was optimized by the structural mapping technique [14], determining the three nodal displacements u1, u2, and u3, regarding the pole as a structure, so as to produce that smooth distortion of a structure that would give us the required constant flux density in the air gap. The value of u = 0.1 corresponds to a shift of 0.2 cm. Table 1 summarizes, for each desired flux density from 1.0 T to 1.6 T, the resulting three parameters {p} and the actual values of the flux density B at 5 measuring points in the air gap. Naturally, the parameters {p} are the actual displacements u mapped according to
$$p = \begin{cases} 0.01 + 0.05056819\,(u + 9.72411) & \text{if } u < 0 \\ 0.51 + 0.040068918\,(u - 1.08174) & \text{if } u \ge 0 \end{cases} \tag{3}$$
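The piecewise scaling of Eq. (3) can be written as a small Python function. The constants are those given in Eq. (3); the function names `scale_u` and `unscale_p`, and in particular the inverse mapping with its branch chosen by the test p ≥ 0.51, are assumptions added for illustration and do not appear in the paper.

```python
def scale_u(u):
    """Map a nodal displacement u to a network target p using Eq. (3)."""
    if u < 0:
        return 0.01 + 0.05056819 * (u + 9.72411)
    return 0.51 + 0.040068918 * (u - 1.08174)

def unscale_p(p):
    """Hypothetical inverse of scale_u.

    Assumption: values p >= 0.51 came from the second branch of Eq. (3),
    all others from the first branch.
    """
    if p >= 0.51:
        return 1.08174 + (p - 0.51) / 0.040068918
    return -9.72411 + (p - 0.01) / 0.05056819

p_neg = scale_u(-2.0)       # a negative displacement
p_pos = scale_u(5.0)        # a positive displacement
u_back = unscale_p(p_pos)   # recovers the displacement 5.0 up to rounding
```

An inverse of this kind would be needed to turn the scaled outputs returned by a trained network back into displacements.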
The mapping of Eq. (3) sends all positive u's onto (0.51, 0.99) and all negative u's onto (0.01, 0.49) because, as explained, of the range of the sigmoid function; the numbers 0 and 1 are avoided since these are limiting values of the sigmoid function.

Table 1: Specified B and Corresponding {p} and {B} from Structural Optimization

V. RESULTS
Using all but the second row of Table 1 (B = 1.1), a neural network of 4 inputs, 3 outputs and 1 hidden layer with 15 nodes was trained using a momentum rate of 0.95 and a learning rate of 0.95 on a SEQUENT Parallel Computer with 8 processors. The network was then asked to predict the parameters for a desired B of 1.1 T, whose solution we know from Table 1 to be {0.490, 0.900, 0.090}. The result returned by the network was {0.435, 0.904, 0.097}. It took 621 iterations. Note that the data points are over a patterned sequence of geometries. The field solution corresponding to this geometry is shown in Fig. 6.

Figure 6: Flux Pattern and Pole-face for B = 1.1 T in Airgap

Subsequently, the same network was trained using arbitrary data sets from the many field solutions that were obtained in the process of each optimization. That is, the data points were scattered, this time, over a slightly broader range of operation of the device. Now, to test the network, for flux densities corresponding to {p} = {0.785, 0.704, 0.307}, the output of the network was {0.813, 0.740, 0.244}; that is, the error had increased. When completely arbitrary data was used, as was to be expected, the network returned answers with much larger error, necessitating the use of even more points and even more neurons. Unless the training sets were close to the answer sought, generally, unmitigable convergence problems were experienced in training with the back-propagation algorithm.¹ A further observation is that the training time for the network depends on the order in which the data sets are presented in running the back-propagation algorithm. That is, if they are given in the order B = 1.0, 1.2, 1.4, 1.6, the convergence is superior to when the order is arbitrary.²

... the more expensive and conventional optimization strategies.

REFERENCES
S. Ratnajeevan H. Hoole, "Computer Aided Analysis and Design of Electromagnetic Devices," Elsevier, New York, 1989.
S. Ratnajeevan H. Hoole, S. Subramaniam,
Saldanha, J.-L. Coulomb, and J.-C. Sabonnadiere, "Problem Methodology and Finite Elements in the Identification of Inaccessible Locations, Sources, and Materials," IEEE Trans. on Magn., Vol. 27, pp. 3433-3443, May 1991.
S. Ratnajeevan H. Hoole, "Optimal Design, Inverse Problems and Parallel Computers," IEEE Trans. Magn., Vol. 27, pp. 4146-4149, Sept. 1991.
S. Subramaniam, S. Kanaganathan and S. R… Hoole, "Two Requisite Tools in the Optim… Electromagnetic Devices," IEEE Trans. Magn., … 4105-4109, Sept. 1991.
S. Ratnajeevan H. Hoole and S. Sirikumaran … off Aircraft and Shape Optimization of a Rid… IEEE Trans. Magn., Vol. 27, pp. 4150-415…
S. Ratnajeevan H. Hoole, K. Weeber and … "Fictitious Minima of Object Functions, … Meshes, and Edge Elements in Elec… Synthesis," IEEE Trans. Magn., Vo… Nov. 1991.
M. Ratnarajan R. Hoole and S. Ratnajeevan H. Hoo… Hessian in Inverse Problem Opt… Electromagnetic Field Computation," … Electromag. Matrls., Vol. 2, Special Sup… Application of Electromagnetic Forces, Jan… 270.
S. Ratnajeevan H. Hoole, "Inverse Solutio… Antenna Problem," Int. J. App. Electr… Special Supplement on the Application … Forces, Jan. 1991, pp. 251-254.
¹ Similar experience has been confirmed in a personal communication by Y. Uchikawa of Nagoya University, Nagoya, Japan.
² Also confirmed in a personal communication by Nathan Ida of the University of Akron, Akron, OH, U.S.A.
Acknowledgements
This work was supported by the Harvey Mudd College-Southern California Edison Center for Excellence in Electrical Systems. Help from Duy Hong Nguyen, a senior at Harvey Mudd College, in training the neural network is gratefully acknowledged.
