THRESHOLD FUNCTION REALIZATION OF ARBITRARY SWITCHING FUNCTIONS AND ITS APPLICATIONS*
Mankuan Vai
Department of Electrical and Computer Engineering, Northeastern University, Boston, MA 02115, U.S.A.

Indexing terms: Threshold functions, Switching functions, Classifiers, Neural networks

A systematic approach is provided to design a valid threshold logic implementation for arbitrary switching functions. This approach creates a single layer of hidden units, on an as-needed basis, for switching functions that are not linearly separable.
Introduction: Threshold circuits are the basic building blocks of neural networks. A switching function $F(x_1, x_2, \ldots, x_n)$ on $n$ variables is a threshold function if and only if there exist a weight vector $\vec{w} = (w_1, w_2, \ldots, w_n)$ and a threshold $T$ such that

$$F = 1 \ \text{if} \ f(x_1, \ldots, x_n) = \sum_{i=1}^{n} w_i x_i \ge T, \quad \text{and} \quad F = 0 \ \text{if} \ f(x_1, \ldots, x_n) = \sum_{i=1}^{n} w_i x_i < T. \tag{1}$$
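The definition in (1) can be illustrated with a minimal Python sketch. The function names `threshold_unit` and `truth_table`, and the two-input AND example, are illustrative assumptions and do not come from the paper.

```python
from itertools import product


def threshold_unit(weights, threshold, inputs):
    """Return 1 if the weighted input sum reaches the threshold, else 0, as in (1)."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0


def truth_table(weights, threshold):
    """Evaluate the unit on all 2^n binary input patterns, in counting order."""
    n = len(weights)
    return [threshold_unit(weights, threshold, bits)
            for bits in product((0, 1), repeat=n)]


# Illustrative example (not from the paper): a two-input AND gate
# is a threshold function with w = (1, 1) and T = 2.
and_table = truth_table((1, 1), 2)
or_table = truth_table((1, 1), 1)
```

A switching function is a threshold function exactly when some choice of `weights` and `threshold` reproduces its truth table in this way.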
Variations of linear programming are commonly used for solving the system of linear inequalities associated with (1) [1]. For example, an RTF (realization of threshold functions) map was developed to facilitate the detection of contradictions in the inequalities and the finding of a solution [2]. A multiple-layer neural network can be trained to model any switching function. However, learning algorithms generally cannot guarantee convergence to an exact solution [3]. Also, there is no known method that predicts the required number of hidden units in a multiple-layer network before training [4]. A number of deterministic methods have been proposed [5, 6]. One method provides a hyperplane for each input pattern to separate it from all others and uses an OR operation to group the desired partitions together [5]. This method cannot fully utilize the capability of a threshold function and does not provide a minimal implementation of a switching function. Another method uses a Voronoi diagram over the set of input patterns in the space [6]. This method depends on heuristic or visualization methods, which degrade rapidly when the number of input variables (i.e., the number of dimensions of the space) goes beyond three. We have developed a systematic approach that implements switching functions that are not linearly separable.

Approach: The structure of a four-variable RTF map is shown in Fig. 1a. Each node is labeled with a number corresponding to an input pattern and is circled if it is a minterm. Each link indicates the difference between the weighted input sums (as defined by f(x1, x2, ..., xn) in (1)) of its end nodes; for example, D10 = f1 - f0. The original use of this RTF map, as proposed by Torng, is to reduce the number of inequalities and to detect a function that is not linearly separable by exposing contradictory inequalities. Our approach is based on the fact that, given an arbitrary switching function, a linearly separating hyperplane can always be found after moving some of the input patterns from the 0-set to the 1-set. The change introduced to the switching function by this movement can be counteracted by a single layer of hidden units. These hidden units, when presented with the input patterns whose outputs were flipped, produce counterweights to the output threshold unit, which implements the modified switching function.

We will describe our approach using the function F = x1x2 + x3x4, which is not linearly separable, as an example. The RTF map of this function is shown in Fig. 1a. All the nodes corresponding to the minterms of F are circled. For a pair of nodes fi and fj, two types of inequality may be obtained. If fi is circled while fj is not, then fj has to be less than fi, and vice versa. The inequalities identified in Fig. 1a are: D10 > 0 (f2 ↔ f3, f6 ↔ f7, f10 ↔ f11); D42 < 0 (f3 ↔ f5); D42 > 0 (f10 ↔ f12). The last two inequalities are contradictory, and Torng's method will thus terminate. The current approach removes this conflict by circling either f5 or f10. We arbitrarily select f5 to be circled. The modified RTF map is shown in Fig. 1b and provides the following inequalities: D10 > 0 (f2 ↔ f3, f4 ↔ f5, f10 ↔ f11); D42 > 0 (f10 ↔ f12); D21 < 0 (f5 ↔ f6); and D84 < 0 (f5 ↔ f9). No conflicts are observed. The fact that the switching function has now been expanded will be dealt with in a later step.
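The contradiction between D42 < 0 and D42 > 0 can be reproduced with a brute-force check. The sketch below is not the RTF map procedure; it is an assumed, simpler stand-in that compares every circled pattern with every uncircled one and looks for two inequalities that are exact negatives of each other. The names `bits` and `pairwise_conflicts` are hypothetical, and x1 is assumed to be the most significant bit of the pattern number, consistent with 0101 being pattern 5.

```python
def bits(k, n=4):
    """Pattern number k as (x1, ..., xn), with x1 the most significant bit (assumed)."""
    return tuple((k >> (n - 1 - i)) & 1 for i in range(n))


def pairwise_conflicts(one_set, n=4):
    """Return pairs of requirements f_i > f_j that directly contradict each other.

    Each requirement (i circled, j uncircled) is w . d > 0 with d = bits(i) - bits(j).
    If both d and -d occur, no weight vector can satisfy both.
    """
    zero_set = [k for k in range(2 ** n) if k not in one_set]
    seen = {}
    for i in sorted(one_set):
        for j in zero_set:
            d = tuple(a - b for a, b in zip(bits(i, n), bits(j, n)))
            seen.setdefault(d, (i, j))  # keep one witness pair per difference
    conflicts = []
    for d, pair in seen.items():
        neg = tuple(-c for c in d)
        if neg in seen and d > neg:  # report each contradictory pair once
            conflicts.append((pair, seen[neg]))
    return conflicts


ORIGINAL = {3, 7, 11, 12, 13, 14, 15}           # minterms of F = x1x2 + x3x4
original_conflicts = pairwise_conflicts(ORIGINAL)
after_f5 = pairwise_conflicts(ORIGINAL | {5})
after_f5_f9 = pairwise_conflicts(ORIGINAL | {5, 9})
```

Because this check inspects all pairs rather than only the links of the map, it still reports a conflict after f5 alone is circled (the pairs f9, f12 and f3, f6), which the chart procedure encounters in a later iteration. An empty result is only a necessary condition for linear separability, not a proof of it.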
* This research was supported in part by ONR under grant N00014-94-1-0687.
A chart is now set up in Fig. 2. The columns corresponding to the circled fi's in the RTF map are checked. Each row of this chart represents an iteration which calculates the fi's. A new inequality has to be added if an entry in a checked column is equal to or smaller than that in an unchecked column. An additional inequality is thus found in the first iteration: D10 + D21 > 0 (f1 ↔ f3). This inequality can be satisfied by setting D10 = 2, and the second iteration begins. The inequalities D42 + D21 < 0 (f3 ↔ f6), D42 + D84 < 0 (f3 ↔ f9), and D21 + D42 > 0 (f9 ↔ f12) are identified in iterations 2, 3 and 4, respectively. An unresolvable conflict (D21 + D42 > 0 and D42 + D21 < 0) has arisen. This conflict can be removed by circling (or checking) f9, which also removes the inequality D84 < 0. A solution is found in the last iteration. A threshold function with w1 = f8 = 2, w2 = f4 = 2, w3 = f2 = 1, w4 = f1 = 3, and T = 4 is provided for the modified RTF map, which now has two extra minterms (5 and 9).

The last step of this approach implements a hidden unit (or, in general, hidden units) to remove these two extra minterms. This is done by designing the hidden unit to produce a one when the input is either 0101 (5) or 1001 (9). The hidden unit output does not affect the output threshold unit as long as it does not produce a one when a minterm of F = x1x2 + x3x4 is applied, so the remaining patterns in the 0-set can be treated as don't-cares. The switching function of the hidden unit is thus defined as H(x1, x2, x3, x4) = Σm(5, 9) + d(0-2, 4, 6, 8, 10). Applying the above approach determines a threshold function with w1h = -1, w2h = -1, w3h = -2, w4h = 0, and Th = -1. In order to suppress the output of the output threshold unit, a weight of -max(f5, f9) + T - 1 = -2 is assigned to the connection from the hidden unit to the output unit, as shown in Fig. 3.

In general, the switching function of the hidden unit itself may not be linearly separable. While in principle the above process can be repeated for the design of a hidden unit, a series of hidden units will have an adverse effect on the performance of the final circuit. In order to ensure that only a single layer of hidden units is used, conflicts observed during the implementation of a hidden unit are removed by moving patterns from the 1-set to the 0-set. Additional hidden units are then generated to produce the counterweight for the patterns moved.

Application: In addition to realizing a logic function, a threshold unit can be used as a classifier, as shown in Fig. 4. The patterns to be classified are shown in Fig. 4a, and each of them is labeled by a binary number created by concatenating its row (X1X2) and column (X3X4) coordinates. This is a difficult classification problem since one group is completely surrounded by another. Fig. 4b shows the result of applying the above method to design a threshold gate that performs the classification.
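The two-unit implementation of F = x1x2 + x3x4 derived in the Approach section (Fig. 3) can be checked exhaustively. The sketch below uses the weights and thresholds quoted in the text; the function names and the bit ordering (x1 as the most significant bit of the pattern number) are assumptions made for illustration.

```python
def bits(k, n=4):
    """Pattern number k as (x1, ..., xn), with x1 the most significant bit (assumed)."""
    return tuple((k >> (n - 1 - i)) & 1 for i in range(n))


def step(weights, threshold, inputs):
    """Threshold unit: 1 if the weighted sum reaches the threshold, else 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0


# Values quoted in the text.
OUTPUT_WEIGHTS, OUTPUT_THRESHOLD = (2, 2, 1, 3), 4      # w1..w4, T
HIDDEN_WEIGHTS, HIDDEN_THRESHOLD = (-1, -1, -2, 0), -1  # w1h..w4h, Th
HIDDEN_TO_OUTPUT = -2                                   # -max(f5, f9) + T - 1


def network(x):
    """Output unit fed by the four inputs plus the single hidden unit."""
    h = step(HIDDEN_WEIGHTS, HIDDEN_THRESHOLD, x)
    total = sum(w * xi for w, xi in zip(OUTPUT_WEIGHTS, x)) + HIDDEN_TO_OUTPUT * h
    return 1 if total >= OUTPUT_THRESHOLD else 0


def target(x):
    """The example function F = x1x2 + x3x4."""
    x1, x2, x3, x4 = x
    return (x1 & x2) | (x3 & x4)


# Patterns on which the output unit alone, the hidden unit, and the network fire.
output_alone = [k for k in range(16) if step(OUTPUT_WEIGHTS, OUTPUT_THRESHOLD, bits(k))]
hidden_fires = [k for k in range(16) if step(HIDDEN_WEIGHTS, HIDDEN_THRESHOLD, bits(k))]
mismatches = [k for k in range(16) if network(bits(k)) != target(bits(k))]
```

With these values the output unit alone fires on the minterms of F plus patterns 5 and 9, the hidden unit fires on 5 and 9 and on some of the don't-care patterns, and the combined network matches F on all 16 inputs.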
References

[1] Muroga, S.: 'Threshold Logic and Its Applications' (Wiley, New York, 1971)
[2] Torng, H. C.: 'An approach for the realization of linearly separable switching functions', IEEE Trans. on Electronic Computers, 1966, EC-15, (1), pp. 14-20
[3] Rumelhart, D. E., Hinton, G. E., and Williams, R. J.: 'Learning internal representations by error propagation', in Rumelhart, D. E., and McClelland, J. L. (Eds.): 'Parallel Distributed Processing, Explorations in the Microstructure of Cognition', Vol. 1 (The MIT Press, Cambridge, MA, USA, 1986), pp. 318-362
[4] Huang, S. C., and Huang, Y. F.: 'Bounds on the number of hidden neurons in multilayer perceptrons', IEEE Trans. on Neural Networks, 1991, 2, (1), pp. 47-55
[5] Hussain, B., and Kabuka, M. R.: 'Neural network transformation of arbitrary Boolean functions', SPIE Neural and Stochastic Methods in Image and Signal Processing, 1992, 1766, pp. 355-367
[6] Bose, N. K., and Garga, A. K.: 'Neural network design using Voronoi diagrams', IEEE Trans. on Neural Networks, 1993, 4, (5), pp. 778-787
Legends to Figures:
Fig. 1 The RTF map for realizing F = x1x2 + x3x4: (a) the original map, (b) the modified map.
Fig. 2 A chart demonstrating the realization of the example switching function.
Fig. 3 The threshold logic implementation of F = x1x2 + x3x4.
Fig. 4 A classifier example: (a) the pattern distribution; (b) a threshold function implementation.
Fig. 1
D10  D21  D42  D84  f0  f1  f2  f3  f4  f5  f6  f7  f8  f9  f10 f11 f12 f13 f14 f15
1    -1   1    -1   0   1   0   1
2    -1   1    -1   0   2   1   3   2   4   3
3    -2   1    -1   0   3   1   4   2   5   3   6   1   4
3    -2   1    -2   0   3   1   4   2   5   3   6   0   3   1   4   2
3    -2   1    0    0   3   1   4   2   5   3   6   2   5   3   6   4   7   5   8
Fig. 2
Fig. 3
Fig. 4