Neural Networks Applied in Linear Programming Problems: Design and Complexity Analysis

IVAN NUNES DA SILVA, ANDRE NUNES DE SOUZA, JOSE ALFREDO C. ULSON
Department of Electrical Engineering
State University of São Paulo – UNESP/FE/DEE
CP 473, CEP 17033-360, Bauru – SP
BRAZIL

Abstract: - Artificial neural networks are richly connected networks of simple computational elements modeled on biological processes. Systems based on artificial neural networks have high computational rates due to the use of a massive number of these computational elements. Neural networks with feedback connections provide a computing model capable of solving a rich class of optimization problems. In this paper, a modified Hopfield network is developed for solving linear programming problems. The internal parameters of the network are obtained using the valid-subspace technique. A complexity analysis of the main neural networks used in linear programming is also developed. Simulated examples are presented as an illustration of the proposed approach.

Key-Words: - Artificial neural networks, linear programming, complexity analysis, operations research, artificial intelligence, systems optimization.

1 Introduction

Artificial neural networks (ANN) have been applied to several classes of optimization problems and have shown promise for solving such problems efficiently. In this paper, a new approach based on ANN for solving linear programming problems is presented, together with a complexity analysis of several networks used in linear programming.

Engineers and operations researchers are traditionally involved in problem solving. The acceptance of the field of operations research in the study of industrial, business, military, and governmental activities can be attributed, at least in part, to the extent to which the operations research approach and methodology have aided decision makers. Applications of operations research in the industrial context lie mainly in the areas of linear programming and statistical analysis. Thus, linear programming problems play a fundamental role in many areas of science and engineering, where a set of design parameters is optimized subject to inequality constraints [1].

Basically, all of the neural networks [2]-[5] used in linear programming contain some penalty parameters. The stable equilibrium points of these networks, corresponding to the solution of the optimization problem, are obtained only when the penalty parameters are sufficiently large. Moreover, the convergence process of the network depends on the correct adjustment of these parameters. Hence, a modified Hopfield network that does not depend on penalty parameters is developed in this paper.

Linear programming problems are normally solved by the Simplex method [1]. However, the Simplex method is a finite algorithm, whereas the neural network based algorithms applied to linear programming problems can be viewed as infinite algorithms [5]. Infinite algorithms can sometimes perform better than finite algorithms on finite problems. Thus, it is important to investigate the neural network approach to solving linear programming problems as an alternative class of mathematically infinite algorithms.

An artificial neural network is a dynamic system consisting of highly interconnected and parallel nonlinear processing elements that shows extreme efficiency in computation. The main benefits of using ANN on linear programming problems are the following: i) the ability to learn and therefore to generalize; ii) the ease of implementation in hardware; iii) the capacity to map complex systems without the need to know the eventual mathematical models associated with them.

This paper is written with three principal objectives. The first and most important objective is to suggest that artificial neural networks can effectively map problems related to linear programming. A second objective is to present a systematic approach to designing neural networks for applications in linear programming. The third objective is to present a comparative analysis of the neural approaches used in linear programming.

The organization of the present paper is as follows. In Section 2, the modified Hopfield network developed for solving linear programming problems is presented. In Section 3, the mapping of the linear programming problem onto the modified Hopfield network is formulated. Section 4 contains an analysis of the complexity of neural networks used in linear programming. In Section 5, simulation results are given to validate the developed network. In Section 6, the key issues raised in the paper are summarized and conclusions are drawn.

2 The Modified Hopfield Network

Besides providing a new approach for solving linear programming problems, artificial neural networks provide a method for exploring intrinsically parallel and adaptive processing architectures. In this paper, a modified Hopfield network with equilibrium points representing the problem solution has been developed.

As introduced in [6], Hopfield networks are single-layer networks with feedback connections between nodes. In the standard case, the nodes are fully connected, i.e., every node is connected to all other nodes, including itself. The node equation for the continuous-time network with n neurons is given by:

du_i(t)/dt = -η.u_i(t) + Σ_{j=1}^{n} T_ij.v_j(t) + i_i^b    (1)

v_i(t) = g(u_i(t))    (2)

where u_i(t) is the current state of the i-th neuron; v_j(t) is the output of the j-th neuron; i_i^b is the offset bias of the i-th neuron; η.u_i(t) is a passive decay term; and T_ij is the weight connecting the j-th neuron to the i-th neuron.

In Equation (2), g(u_i(t)) is a monotonically increasing threshold function that limits the output of each neuron so that the network output always lies in or within a hypercube. It is shown in [6] that the equilibrium points of the network correspond to values v(t) for which the energy function (3) associated with the network is minimized:

E(t) = -(1/2).v(t)^T.T.v(t) - v(t)^T.i^b    (3)

The mapping of linear programming problems using a Hopfield network consists of determining the weight matrix T and the bias vector i^b that place the equilibrium points at the problem solution. A modified energy function Em(t) is used here, defined as follows:

Em(t) = Econf(t) + Eop(t)    (4)

where Econf(t) is a confinement term that groups all the constraints imposed by the problem, and Eop(t) is an optimization term that conducts the network output to the equilibrium points. Thus, the minimization of Em(t) by the modified Hopfield network is conducted in two stages:

i) minimization of the term Econf(t):

Econf(t) = -(1/2).v(t)^T.Tconf.v(t) - v(t)^T.iconf    (5)

where v(t) is the network output, Tconf is the weight matrix and iconf is the bias vector belonging to Econf. This corresponds to the confinement of v(t) into a valid subspace that satisfies the constraints imposed by the problem.

ii) minimization of the term Eop(t):

Eop(t) = -(1/2).v(t)^T.Top.v(t) - v(t)^T.iop    (6)

where Top is the weight matrix and iop is the bias vector belonging to Eop. This moves v(t) towards an optimal solution (the equilibrium points).

Thus, the operation of the modified Hopfield network consists of three main steps, as shown in Fig. 1.

Fig. 1 - The Modified Hopfield Network [schematic of the iteration loop: (I) v(t+1) = Tconf.v(t) + iconf; (II) v ← g(v); (III) Δv(t) = Δt.(Top.v(t) + iop)]

Step (I): Minimization of Econf, corresponding to the projection of v(t) onto the valid subspace defined by [7,8]:

v(t) = Tconf.v(t) + iconf    (7)

where Tconf is a projection matrix (Tconf.Tconf = Tconf) and Tconf.iconf = 0. This operation corresponds to an indirect minimization of Econf(t).

Step (II): Application of a nonlinear 'symmetric ramp' activation function constraining v(t) to a hypercube:

g_i(v_i) = lim_inf, if v_i < lim_inf
g_i(v_i) = v_i,     if lim_inf ≤ v_i ≤ lim_sup    (8)
g_i(v_i) = lim_sup, if v_i > lim_sup

where v_i(t) ∈ [lim_inf, lim_sup].

Step (III): Minimization of Eop, which involves updating v(t) towards an optimal solution (defined by Top and iop) corresponding to the network equilibrium points, which are the solutions of the linear programming problem, by applying the gradient of the energy term Eop:

dv(t)/dt = -∂Eop(t)/∂v,   Δv = -Δt.∇Eop(v) = Δt.(Top.v + iop)    (9)

Therefore, the minimization of Eop consists of updating v(t) in the opposite direction of the gradient of Eop. These results are also valid when a 'hyperbolic tangent' activation function is used.

As seen in Fig. 1, each iteration has two distinct stages. First, as described in Step (III), v is updated using the gradient of the term Eop alone. Second, after each update, v is projected directly onto the valid subspace. This is an iterative process, in which v is first orthogonally projected onto the valid subspace (7) and then thresholded so that its elements lie in the range [lim_inf, lim_sup]. Thus, the mapping of a linear programming problem onto a modified Hopfield network consists of determining the weight matrices Tconf and Top, and the vectors iconf and iop.
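The three steps above can be summarized in a short numerical sketch. The following Python/NumPy fragment is a minimal illustration of one possible discrete-time implementation of equations (7)-(9), assuming the matrices Tconf and Top and the vectors iconf and iop have already been computed (their construction is derived in Section 3). The function name, the fixed iteration count and the step size are illustrative choices, not part of the original formulation.

```python
import numpy as np

def modified_hopfield(t_conf, i_conf, t_op, i_op,
                      lim_inf, lim_sup, dt=1e-4, n_iter=100000, seed=0):
    """Sketch of the modified Hopfield iteration (Fig. 1, Eqs. (7)-(9))."""
    rng = np.random.default_rng(seed)
    v = rng.uniform(0.0, 1.0, size=i_conf.shape)     # random initial output
    for _ in range(n_iter):
        # Step (III): gradient step on Eop, Eq. (9)
        v = v + dt * (t_op @ v + i_op)
        # Step (I): orthogonal projection onto the valid subspace, Eq. (7)
        v = t_conf @ v + i_conf
        # Step (II): 'symmetric ramp' activation, Eq. (8)
        v = np.clip(v, lim_inf, lim_sup)
    return v
```

In practice the loop would normally be stopped when the update norm ||Δv|| falls below a tolerance rather than after a fixed number of iterations.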

3 Formulation of Linear Programming Problems by the Modified Hopfield Network

A linear programming problem is a problem of minimizing or maximizing a linear function in the presence of linear constraints of the inequality and/or equality type. Since equality constraints can easily be converted into inequality constraints [1], only inequality constraints are considered in this formulation. Consider the following linear programming problem, with m constraints and n variables:

Minimize Eop(v) = c^T.v    (10)

subject to Econf(v):

A^T.v ≤ b    (11)

z_min ≤ v ≤ z_max    (12)

where A ∈ ℜ^(n×m), b ∈ ℜ^m, and c, v, z_min, z_max ∈ ℜ^n. The conditions in (11) and (12) define a bounded convex polyhedron. The vector v must remain within this polyhedron if it is to represent a valid solution of the optimization problem (10). A solution can be obtained by a modified Hopfield network, whose valid subspace guarantees the satisfaction of condition (11). Moreover, the initial hypercube represented by the inequality constraints in (12) is mapped by the 'symmetric ramp' function (8) used as the network activation function.

Defining Tconf, iconf, Top and iop

The parameters Tconf and iconf are calculated by transforming the inequality constraints in (11) into equality constraints through the introduction of a slack variable w_j ∈ ℜ for each inequality constraint:

g_i(v) + Σ_{j=1}^{q} δ_ij.w_j = 0    (13)

where the w_j are slack variables, treated in the same way as the variables v_i, and δ_ij is defined by the Kronecker impulse function:

δ_ij = 1, if i = j
δ_ij = 0, if i ≠ j    (14)

After this transformation, the problem defined by equations (10), (11) and (12) can be rewritten as:

Minimize Eop(v) = c^T.v    (15)

subject to Econf(v):

(A+)^T.v+ = b+    (16)

z_min ≤ v_i+ ≤ z_max , i ∈ {1..n}    (17)

0 ≤ v_i+ ≤ z_max , i ∈ {n+1..N}    (18)

where N = n + m, and v+^T = [v^T  w^T] ∈ ℜ^N is the vector of extended variables. Note that Eop does not depend on the slack variables w. If the equality constraints in (16) are linearly independent, a solution of Equation (16) is given by:

v+ = A+.((A+)^T.A+)^(-1).b+    (19)

and the expression of the valid subspace in (7) must take this solution into account, i.e.:

v+ = Tconf.v+ + iconf    (20)

From (19) and (20), the bias vector iconf is taken as:

iconf = A+.((A+)^T.A+)^(-1).b+    (21)

so that (20) can be written as:

v+ = Tconf.v+ + A+.((A+)^T.A+)^(-1).b+    (22)

Inserting the value of (16) in (22), the expression for Tconf is given by:

Tconf = I - A+.((A+)^T.A+)^(-1).(A+)^T    (23)

where I is the identity matrix.

The parameters Top and iop are such that the vector v+ is updated in the direction opposite to the gradient of the energy function Eop. Since conditions (11) and (12) define a bounded convex polyhedron, the objective function (15) has a unique global minimum. Thus, the parameters Top and iop are obtained by comparing equations (15) and (6). After this comparison, these parameters are given by:

iop = -c    (24)

Top = 0    (25)

To illustrate the performance of the proposed neural network, some simulation results are presented in Section 5.
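As a concrete illustration of equations (19)-(25), the sketch below computes Tconf, iconf, Top and iop from an extended constraint matrix A+ and right-hand side b+. It is a minimal Python/NumPy sketch under the assumption that (A+)^T.A+ is invertible (i.e., the extended constraints are linearly independent); the function name, argument layout (A+ stored as an N×m array) and the use of a linear solve in place of an explicit inverse are illustrative choices, not taken from the paper.

```python
import numpy as np

def lp_parameters(a_plus, b_plus, c, n_slack):
    """Valid-subspace parameters for the LP mapping, Eqs. (21), (23)-(25).

    a_plus : (N, m) extended constraint matrix, so that (A+)^T v+ = b+.
    b_plus : (m,)   right-hand side of the equality constraints.
    c      : (n,)   cost vector of the original variables.
    n_slack: number of slack variables appended to v (N = n + n_slack).
    """
    big_n = a_plus.shape[0]
    gram = a_plus.T @ a_plus                        # (A+)^T A+, assumed invertible
    pinv_term = a_plus @ np.linalg.solve(gram, np.eye(gram.shape[0]))
    t_conf = np.eye(big_n) - pinv_term @ a_plus.T   # Eq. (23): projection matrix
    i_conf = pinv_term @ b_plus                     # Eq. (21): particular solution
    t_op = np.zeros((big_n, big_n))                 # Eq. (25)
    i_op = -np.concatenate([c, np.zeros(n_slack)])  # Eq. (24); slacks carry no cost
    return t_conf, i_conf, t_op, i_op
```

The projection properties Tconf.Tconf = Tconf and Tconf.iconf = 0 stated after equation (7) can be checked numerically on the returned matrices as a sanity test.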

4 Complexity Analysis of Neural Networks Applied in Linear Programming Problems

The architecture analysis of an algorithm based on neural networks allows one to estimate the resources needed for the network implementation, such as the number of logical gates and/or the amount of memory required. In this section, the complexity of each neuron, the number of neurons used, and the complexity of the model are analyzed for each of the architectures considered. As known from the literature, neural networks can be implemented in hardware and in software. In a hardware implementation, these measures can be used to compare and estimate the relative area complexity [9] of each neural network, independently of the technology used. In a discrete-time implementation, which can be in hardware or software (as is the case of the proposed network), these measures provide a basis to compare and predict the relative time complexity per iteration [10] of each network.

Thus, the complexity of the proposed neural network is analyzed for problems given by equations (10)-(12), defined by N variables, Meq equality constraints, and Min inequality constraints. In this paper, the neural complexity and the model complexity are described in terms of the asymptotic complexity function Θ(.), which is defined by [11]:

Θ(g(n)) = {f(n), if there are positive constants k1 and k2 such that, for n sufficiently big, k1.g(n) ≤ f(n) ≤ k2.g(n)}    (26)

where n is the size of the considered instance, and f(n) and g(n) are functions ℵ→ℜ. As an example, if f(n) = n² + 100n and g(n) = n², then f(n) has complexity Θ(n²), since n² + 100n is less than or equal to 2n², and greater than or equal to n², for all n greater than or equal to 100, that is:

Θ(n²) = {n² + 100n, such that n² ≤ n² + 100n ≤ 2n², for all n ≥ 100}    (27)

In this case, the values assumed for the constants k1 and k2 in (26) are k1 = 1 and k2 = 2.

Neural Complexity

In the analysis of neural complexity, the complexity of each neuron is defined as the number of multiplication/division and addition/subtraction operations performed per iteration in each neuron. The distinction between multiplication/division and addition/subtraction operations is made because the implementation costs in software and/or hardware for each of these classes can be significantly different. For the modified Hopfield network (MHN) proposed in this paper, the equation that synthesizes the operations performed in each neuron is given by:

v_k = -Δt.c_k + ((T_k^val)^T.v + s_k)    (28)

The neural complexity (per neuron) of the MHN is therefore (1 + (Meq + Min)) multiplications and (2 + (Meq + Min)) additions/subtractions. Thus, the asymptotic complexity of the neurons, for both multiplications and additions/subtractions, is Θ(Meq + Min).

Model Complexity

The model complexity is defined as the total number of multiplication/division and addition/subtraction operations performed (per iteration) by the network. The estimation of the number of multiplications/divisions and additions/subtractions is given by:
• Total number of multiplications/divisions per iteration = (number of multiplications/divisions per neuron) x (number of neurons in the network).
• Total number of additions/subtractions per iteration = (number of additions/subtractions per neuron) x (number of neurons in the network).

In implementations using, for example, analog circuits, the area complexity is related to the total number of multiplications/divisions and additions/subtractions performed in each processing unit. In discrete-time simulations using a computer, the time complexity associated with each iteration is likewise related to the total number of multiplications/divisions and additions/subtractions required by each algorithm.

The total number of neurons used in the proposed MHN is (N + Min). Therefore, the model complexity is (N + Min).(1 + Meq + Min) for the multiplications, and (N + Min).(2 + Meq + Min) for the additions/subtractions. The asymptotic complexity of the model is Θ((N + Min).(Meq + Min)) for both the multiplications and the additions/subtractions. Table 1 presents a comparison of the model and neuron asymptotic complexity of the five neural networks mentioned in this paper.

Table 1 - Results of Complexity Analysis

ANN       | Number of neurons | Neural Complexity | Model Complexity
Tank      | N                 | Θ(Meq + Min)      | Θ(N.(Meq + Min))
Kennedy   | N                 | Θ(Meq + Min)      | Θ(N.(Meq + Min))
Rodríguez | N                 | Θ(Meq + Min)      | Θ(N.(Meq + Min))
Zak       | N + Min           | Θ(Meq + Min)      | Θ((N + Min).(N + Meq + Min))
MHN       | N + Min           | Θ(Meq + Min)      | Θ((N + Min).(Meq + Min))

Using the asymptotic complexity of the model associated with each network, we can estimate the resources necessary for implementing each model in relation to the size of a specific problem. For linear programming problems with a large number of variables and constraints (that is, N ≈ Min, Meq), the complexity order {O(.)} of the model is O(N²) for all networks. For problems with a large number of variables compared with the total number of constraints, the complexity order of the model is O(N²) for the network of Zak [5], while for the other models (Tank [4], Kennedy [2], Rodríguez [3], and the MHN proposed in this paper) it is O(N). Another important factor to be analyzed is the number of initialization parameters (weighting constants) that must be provided before the simulations can start. The networks of Kennedy [2] and Rodríguez-Vázquez [3] require the adjustment of two parameters (s and Δt), while the network of Zak [5] requires the adjustment of three initialization parameters (ρ, γ and Δt).
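The per-iteration operation counts above can be turned into a small helper for sizing a given problem instance. The following Python sketch simply evaluates the counting formulas stated in this section for the proposed MHN; it assumes nothing beyond those formulas, and the function name is illustrative.

```python
def mhn_operation_counts(n_vars, m_eq, m_in):
    """Per-iteration operation counts of the modified Hopfield network.

    n_vars : number of original variables N
    m_eq   : number of equality constraints Meq
    m_in   : number of inequality constraints Min
    """
    neurons = n_vars + m_in                  # one neuron per variable and per inequality slack
    mults_per_neuron = 1 + (m_eq + m_in)     # multiplications/divisions, from Eq. (28)
    adds_per_neuron = 2 + (m_eq + m_in)      # additions/subtractions, from Eq. (28)
    return {
        "neurons": neurons,
        "multiplications_per_iteration": neurons * mults_per_neuron,
        "additions_per_iteration": neurons * adds_per_neuron,
    }

# Example: the gasoline blending problem of Section 5 (N = 4, Meq = 2, Min = 2)
print(mhn_operation_counts(4, 2, 2))
```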

5 Simulation Results

The modified Hopfield network proposed in the previous sections has been used to solve the linear programming problem known as the 'Gasoline Blending Problem'. An oil company produces two types of crude oil components with the specifications given in Table 2. The company blends component 1 and component 2 to produce gasoline A and gasoline B, with the following requirements:

1) The minimum acceptable octane number of gasoline A is 105. The octane number is estimated by using the weighted average of the octane numbers of both components.
2) The maximum acceptable vapor pressure of gasoline A is 8 lb/in². The vapor pressure of gasoline is estimated by using the weighted average of the vapor pressures of both components.
3) Gasoline B has no blending requirement.

Table 2 - Specification for the Example

Component | Performance (octane number) | Vapor Pressure (lb/in²) | Production (units/day)
1         | 110                         | 10                      | 40
2         | 100                         | 5                       | 60

All units of the components remaining after gasoline A has been produced are blended into gasoline B. At present, gasoline A sells at a profit of $8/unit, and gasoline B sells at a profit of $5/unit. How much of gasoline A and B should the company produce to maximize profit? For convenience, the following basic variables are defined:

v1 = number of units of component 1 in gasoline A,
v2 = number of units of component 2 in gasoline A,
v3 = number of units of component 1 in gasoline B,
v4 = number of units of component 2 in gasoline B.

This problem can be equivalently expressed as:

Maximize Eop(v) = 8v1 + 8v2 + 5v3 + 5v4

subject to:

(110v1 + 100v2)/(v1 + v2) ≥ 105
(10v1 + 5v2)/(v1 + v2) ≤ 8
v1 + v3 = 40
v2 + v4 = 60

For this problem, with two inequality constraints and two equality constraints, the solution vector (equilibrium point) obtained by the modified Hopfield network is v = [40.001 40.001 0.000 19.998]^T, with Eop(v) = 740.006. These results are close to the optimal solution provided by the Simplex method, namely v* = [40 40 0 20]^T, with Eop(v*) = 740. The initial value of v was randomly generated between 0 and 1.
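For reference, the Simplex solution quoted above can be reproduced with an off-the-shelf LP solver. The sketch below uses SciPy's `linprog` on a linearized form of the blending constraints (each ratio constraint is multiplied through by v1 + v2, which is positive at any useful operating point). This is only an independent check of the target solution v* = [40 40 0 20]^T, not part of the neural approach itself.

```python
import numpy as np
from scipy.optimize import linprog

# Maximize 8v1 + 8v2 + 5v3 + 5v4  ->  minimize the negated objective
c = np.array([-8.0, -8.0, -5.0, -5.0])

# Ratio constraints multiplied through by (v1 + v2):
#   110v1 + 100v2 >= 105(v1 + v2)  ->  -5v1 + 5v2 <= 0
#   10v1 + 5v2   <=   8(v1 + v2)   ->   2v1 - 3v2 <= 0
a_ub = np.array([[-5.0,  5.0, 0.0, 0.0],
                 [ 2.0, -3.0, 0.0, 0.0]])
b_ub = np.array([0.0, 0.0])

# Daily production of each component must be fully allocated
a_eq = np.array([[1.0, 0.0, 1.0, 0.0],    # v1 + v3 = 40
                 [0.0, 1.0, 0.0, 1.0]])   # v2 + v4 = 60
b_eq = np.array([40.0, 60.0])

res = linprog(c, A_ub=a_ub, b_ub=b_ub, A_eq=a_eq, b_eq=b_eq,
              bounds=[(0, None)] * 4)
print(res.x, -res.fun)   # expected: [40. 40. 0. 20.] and 740.0
```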

To illustrate that the modified Hopfield network (MHN) can be used efficiently, its results were compared with those obtained by four other classes of neural network models used in linear programming problems: the Kennedy model [2], the Rodríguez model [3], the Tank model [4], and the Zak model [5]. Table 3 shows the results provided by these approaches.

Table 3 - Comparative Analysis of the Results

ANN       | v1     | v2     | v3     | v4     | Eop
Kennedy   | 40.193 | 40.132 | -0.051 | 19.951 | 742.100
Rodríguez | 41.702 | 41.451 | -0.006 | 18.800 | 759.194
Tank      | 40.529 | 40.418 | -0.014 | 19.564 | 745.326
Zak       | 40.001 | 40.001 |  0.000 | 19.973 | 739.881
MHN       | 40.001 | 40.001 |  0.000 | 19.998 | 740.006

These results show the efficiency of the modified Hopfield network for solving linear programming problems. The network has also been tested with various initial conditions. All trajectories lead toward the same equilibrium point, which represents the problem solution.

6 Conclusions

In this paper, a modified Hopfield network has been developed for solving linear programming problems related to operations research. The internal parameters of the network were explicitly computed using the valid-subspace technique, which guarantees the network convergence. Simulation results confirm that the proposed network is a feasible alternative for solving such problems efficiently. A complexity analysis is also presented, showing the computational behavior involved in each neural approach normally used for solving linear programming problems. In addition to providing a new approach for application to linear programming problems, the modified Hopfield network does not require any special treatment for its initialization.

Acknowledgements

The authors express thanks to FAPESP for providing financial support under grant No. 98/08480-0.

References:
[1] M. S. Bazaraa and C. M. Shetty, Nonlinear Programming, John Wiley & Sons, New York, 1979.
[2] M. P. Kennedy and L. O. Chua, "Neural networks for nonlinear programming", IEEE Trans. Circuits Syst., Vol. 35, 1988, pp. 554-562.
[3] R. V. Rodríguez et al., "Nonlinear switched-capacitor neural network for optimization problems", IEEE Trans. Circuits Syst., Vol. 37, 1990, pp. 384-398.
[4] D. W. Tank and J. J. Hopfield, "Simple neural optimization networks: an A/D converter, signal decision network, and a linear programming circuit", IEEE Trans. Circuits Syst., Vol. 33, 1986, pp. 533-541.
[5] S. H. Zak, V. Upatising and S. Hui, "Solving linear programming problems with neural networks", IEEE Trans. on Neural Networks, Vol. 6, 1995, pp. 94-104.
[6] J. J. Hopfield, "Neurons with a graded response have collective computational properties like those of two-state neurons", Proc. of the National Academy of Science, Vol. 81, 1984, pp. 3088-3092.
[7] S. V. B. Aiyer, M. Niranjan and F. Fallside, "A theoretical investigation into the performance of the Hopfield Model", IEEE Trans. on Neural Networks, Vol. 1, 1990, pp. 53-60.
[8] I. N. Silva, L. V. R. Arruda and W. C. Amaral, "Robust estimation of parametric membership regions using artificial neural networks", International Journal of Systems Science, Vol. 28, 1997, pp. 447-455.
[9] R. P. Brent and H. T. Kung, "The area-time complexity of binary multiplication", Journal of the ACM, Vol. 28, 1981, pp. 521-534.
[10] C. Gimarc, V. Milutinović and O. Ersoy, "Time complexity of binary modeling and comparison of parallel architectures for Fourier transform oriented algorithms", Proc. of the Hawaii Int. Conf. Syst. Sciences, HICSS-22, Vol. 1, 1989, pp. 160-170.
[11] C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity, Prentice-Hall, Englewood Cliffs, NJ, 1982.
