Improving the Effectiveness of Self-Organizing Map Networks Using a Circular Kohonen Layer

M. Y. Kiang*, U. R. Kulkarni, M. Goul, A. Philippakis, R. T. Chi, and E. Turban
Department of Information Systems, School of Accountancy, College of Business, Arizona State University, Tempe, AZ 85287-3606
Division of Computer Information Systems, School of Business, California State University at Long Beach, Long Beach, CA 90840
email: [email protected]
Abstract

Kohonen's self-organizing map (SOM) network is one of the most important network architectures developed during the 1980s. The main function of SOM networks is to map input data from an n-dimensional space to a lower-dimensional (usually one- or two-dimensional) plot while maintaining the original topological relations. A well-known limitation of the Kohonen network is the "boundary effect" of nodes on or near the edge of the network. The boundary effect retains the undue influence of the initial random weights assigned to the nodes of the network, leading to ineffective topological representations. To overcome this limitation, we introduce and evaluate a modified, "circular" weight-adjustment procedure. This procedure is applicable to a class of problems where the actual coordinates of the output map need not correspond to the original input topology. We tested the circular method on an example problem from the domain of group technology, typical of this class of problems.
1. Introduction

The Self-Organizing Map (SOM) network, a variation of neural computing networks, is a categorization network developed by Kohonen [12, 13]. SOM networks have been successfully applied to various problem domains, including speech recognition [16, 29], image data compression [19], image and character recognition [1, 25], robot control [24, 28], medical diagnostic tools [26], and group technology [7, 14, 22]. One major drawback of SOM networks is the "boundary effect" of nodes on or near the edges of the network. This research develops and evaluates a "circular" training algorithm that aims to overcome some of the ineffective topological representations caused by the boundary effect. Under the conventional training procedure, boundary nodes are prevented from exercising sufficient influence on nodes that, in the application problem, may be their natural "neighbors," and the undue influence of the initial random weights assigned to the nodes is thereby preserved. In such cases, the resulting topology of the trained network may obscure or prevent otherwise natural and meaningful contiguities between some nodes, leading to less effective clustering possibilities. A hypothesized consequence is that the network is good at creating small local groups that accurately reflect the original relations among data points within the same group, but the global picture of the map is difficult to maintain; that is, the relative positions of the subgroups may not represent the true relations of the input. This problem occurs especially in applications where the actual coordinates of the output need not correspond to the original input topology and where the input data is sparse relative to the network size. It is therefore important to employ a suitable training algorithm to overcome these ineffectual clustering tendencies. The basic idea is to develop an approach that allows clusters to self-organize across edge boundaries. Theoretically, by relaxing these artificial edge
* Corresponding Author. This research is supported in part by a grant from the Dean's Award for Excellence Summer Research Grants program at Arizona State University, College of Business.
Proceedings of The Thirtieth Annual Hawaii International Conference on System Sciences. ISBN 0-8186-7862-3/97 $17.00 © 1997 IEEE
boundaries, we give the SOM network wider latitude for self-organization. This objective is consistent with the notions of adaptation, emergent structures, and dimensional reduction in the study of complex and chaotic systems. To achieve this goal, we review the structure and operational features of Self-Organizing Map networks in the next section. Section 3 analyzes the magnitude of the boundary-effect problem for different network sizes and presents a new, "circular" method for adjusting the weights of the network. In Section 4 we describe the experimental procedure that we used to evaluate the new, circular method, using a well-known classification problem from the operations management literature as a test case. The paper concludes with a summary of our findings and a brief discussion of future research.
2. Self-Organizing Map Network (SOM)

In our Kohonen neural network model, an input pattern is denoted by a vector of order M: x = (x1, x2, ..., xM), where M is the number of input signals simultaneously incident on the nodes of a two-dimensional Kohonen layer. Associated with each of the N nodes in the n×n Kohonen layer is a weight vector of order M, denoted wi = (wi1, wi2, ..., wiM), where wij is the weight value associated with node i corresponding to the jth signal of an input vector. Kohonen proposed an unsupervised training algorithm for generating a mapping of an input signal from a high-dimensional space to a one- or two-dimensional topological space. This is done by adapting the weight vectors of the nodes in the Kohonen layer using the following adaptation rule [13]:

wi(tk+1) = wi(tk) + a(tk)[x - wi(tk)],  for i ∈ NI(tk)     (1)
wi(tk+1) = wi(tk),  otherwise,

where a(tk) is the learning coefficient, which decreases over time, and NI(tk) is the set of nodes considered to be in the topological neighborhood of the winner node I. The winner I is the neuron that responds maximally to the input signal, i.e., its weight vector matches the input vector most closely among all the Kohonen layer nodes. NI contains all nodes that are within a certain radius of node I. One aspect of the above adaptation rule is that the weight vectors of all the nodes within the set NI are updated at the same rate.
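The adaptation rule above can be sketched in a few lines of code. The following is a minimal illustration, not the authors' implementation; the map size, learning rate, and neighborhood radius are assumed values chosen only for the example.

```python
import numpy as np

def winner(weights, x):
    """Return the (row, col) index of the node whose weight vector
    is closest to the input x (the maximally responding neuron)."""
    dists = np.linalg.norm(weights - x, axis=2)   # (n, n) grid of distances
    return np.unravel_index(np.argmin(dists), dists.shape)

def update(weights, x, alpha, radius):
    """One application of the adaptation rule: move the weights of all
    nodes within `radius` of the winner toward the input x at rate alpha."""
    n = weights.shape[0]
    wr, wc = winner(weights, x)
    rows, cols = np.indices((n, n))
    grid_dist = np.hypot(rows - wr, cols - wc)    # distance on the map grid
    mask = grid_dist <= radius                    # the neighborhood set N_I(t_k)
    weights[mask] += alpha * (x - weights[mask])  # w <- w + a(t_k)[x - w]
    return weights

rng = np.random.default_rng(0)
W = rng.random((5, 5, 3))           # 5x5 Kohonen layer, inputs of order M = 3
x = np.array([0.2, 0.8, 0.5])
W = update(W, x, alpha=0.5, radius=1.0)
```

In a full training run, both alpha and the radius would be decreased over time, as the text describes.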
Weight Adaptation Function

A Gaussian-type neighborhood adaptation function that decreases in both the spatial domain and the time domain has been proposed [3, 17, 23]. Lo and Bavarian [20] have shown that an algorithm using a Gaussian-type function enforces ordering in the neighborhood set at every training iteration, yielding faster convergence. They modified Kohonen's adaptation rule (1) to include the amplitude of neighborhood adaptation Ai(tk) as follows:

wi(tk+1) = wi(tk) + α(tk) Ai(tk)[x - wi(tk)],  for i ∈ NI(tk)
wi(tk+1) = wi(tk),  otherwise.

In general, the farther a node is from the winner, the lower the amplitude Ai and hence the lower the update rate of that node's weight vector. For α(tk)Ai(tk) above, we use a Gaussian-type neighborhood adaptation function h(tk, r), similar to the one used by Mitra and Pal [20]. This function decreases in both the spatial and time domains. In the spatial domain, its value is largest at the winner node and gradually decreases with increasing distance from it.
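As an illustration of such an amplitude term, the sketch below uses a standard Gaussian neighborhood whose width shrinks over time. This is a generic example rather than the specific function used in the paper; sigma0 and tau are assumed decay constants introduced only for the illustration.

```python
import numpy as np

def gaussian_amplitude(grid_dist, t, sigma0=2.0, tau=1000.0):
    """Amplitude A_i(t_k): equal to 1 at the winner (grid_dist = 0),
    decaying with both distance from the winner and training time t."""
    sigma = sigma0 * np.exp(-t / tau)   # neighborhood width shrinks over time
    return np.exp(-(grid_dist ** 2) / (2.0 * sigma ** 2))
```

Multiplying the learning coefficient by this amplitude reproduces the qualitative behavior described above: nodes near the winner receive nearly the full update, while distant nodes receive almost none.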
h(tk, r) = a (1 − r·f) / (1 + tk² / cdenom)
Here r is the radial distance of a node from the winner node. Nodes within a radius Rk are considered for adaptation at time tk. Hence, 0