Concurrent Subspace Optimization of Mixed Continuous/Discrete Systems

Marc A. Stelmack* and Stephen M. Batill†
Department of Aerospace and Mechanical Engineering
University of Notre Dame
Notre Dame, Indiana

*Graduate Research Assistant, Member AIAA.
†Professor, Associate Fellow AIAA.
Copyright © 1997 by Stephen M. Batill. Published by the American Institute of Aeronautics and Astronautics, Inc. with permission.

Abstract

An extension of the method of Concurrent Subspace Optimization (CSSO) has been developed to accommodate mixed continuous/discrete design problems. The mixed CSSO framework employs artificial neural networks to provide approximations to the design space, which are the means of coordinating design decisions in the individual disciplines. This approach is applied to a nonhierarchic test problem which contains continuous and discrete design variables. The results demonstrate that the mixed CSSO framework is able to locate optimal designs while requiring fewer complete system analyses than conventional optimization techniques. Computational resources remain a concern, however, due to the large number of contributing (disciplinary) analyses required to perform mixed optimization at the discipline level. Results also demonstrate that the database of design information assembled during CSSO can be exploited to enhance the efficiency of subsequent runs, even if the requirements of the system design problem are altered.

I. Introduction

Recent work in multidisciplinary design optimization (MDO) has been inspired by the need to design complex systems quickly and at low cost. Such systems often involve mutually dependent disciplines, each of which has some impact on the whole. This interdependent nature requires that design decisions in each discipline be coordinated if a single, system design goal is to be effectively pursued. Many engineering systems, to which MDO techniques are potentially applicable, are not defined entirely by continuous design variables. Rather, these systems may contain only discrete design variables or be "mixed," containing both continuous and discrete design variables. The designer of an aircraft wing, for example, has the task of choosing the number of spars in the wing, their locations, and the materials to be used in its construction. While the location of a spar is a continuous design variable, the number of spars must be an integer and thus is a discrete design variable, as is material selection, since a structural designer would likely choose one of several available materials. Mixed systems present a particular challenge because many traditional optimization methods were developed to deal with systems that are either exclusively continuous or discrete. Any optimization technique that requires the merit function and constraints to be continuously differentiable with respect to all the design variables is not well-suited for mixed optimization; a space containing discrete variables generally does not meet the differentiability condition. Furthermore, combinatorial (discrete-only) optimization schemes are often very expensive in terms of the number of merit function evaluations required. In the event that each system analysis is expensive, the application of such methods is only viable if some type of approximate analysis is available. While some design optimization strategies do not accommodate mixed systems, methods do exist which can either be extended to them or used in conjunction with established discrete and mixed optimization techniques. The Concurrent Subspace Optimization (CSSO) method proposed by Sobieski [1] and modified by Renaud and Gabriele [2] provides a framework in which discipline-level designers are provided with information regarding the impact(s) of their design decisions on non-local states and on the system-level merit function and constraints. These capabilities enable the CSSO method to be implemented with coupled, nonhierarchic design problems. The CSSO algorithm has



been further modified and implemented by Sellar, et al. [3] in a manner which incorporates response surfaces represented by artificial neural networks (this version is referred to herein as the CSSO/NN framework). This method has compared favorably with conventional techniques in terms of the number of complete system analyses required during the course of a system optimization. It has also been demonstrated that neural networks possess the ability to represent systems containing both continuous and discrete design variables [4]. The extension of the CSSO/NN algorithm to include discrete design variables is the focus of this paper. The method has been applied to a coupled, multidisciplinary design problem which had previously been used in the validation of the CSSO/NN algorithm and was modified to include discrete design variables. This paper will describe the demonstration problem and discuss issues pertinent to the implementation of such an optimization framework.

II. CSSO/NN Framework

The neural network-based concurrent subspace optimization algorithm, on which the mixed MDO framework is based, is illustrated in Figure 1. The algorithm begins with the selection of a set of baseline designs, which are analyzed to provide the database from which the initial system approximations are constructed. The response surfaces used in the mixed CSSO/NN framework are feedforward, sigmoid-activation artificial neural networks [5]. The subspace designers then optimize the system design based on their specialized expertise and analysis capabilities, using whatever tools are appropriate to their discipline. This requires them to solve the system-level optimization problem based upon this expertise in their own discipline and approximate analyses in all other disciplines. Response surface approximations are the sources of information for the subspace solution of the system optimization problem, as they are utilized in lieu of complete system analyses to provide information about non-local states. The designs obtained through each subspace optimization are analyzed, the database is augmented with the corresponding design and state variable values, and the response surfaces are reconstructed to reflect this new information. The final step of each CSSO iteration is a system optimization based entirely on response surface approximations. The design point provided by this approximate system optimization is also added to the design database and considered in constructing the response surfaces at the outset of each subsequent iteration.

[Figure 1: Concurrent Subspace Optimization Framework]

The CSSO/NN algorithm has been demonstrated to be useful in the optimization of nonhierarchic engineering systems through application to a number of demonstration problems [6]. Its application, however, has heretofore been limited to continuous problems. An important goal of the current work is to allow the decomposition of the system design problem to be based on the disciplines that characterize it, rather than on the nature (continuous vs. discrete) of individual design variables. Artificial neural networks (ANN's) are the means by which the system is approximated. Thus, a single method of response surface generation can be used regardless of whether a particular subspace is continuous, discrete, or mixed.

Demonstration Problem

The mixed CSSO/NN framework has been applied to a simple example which operates on three design variables (x1, x2, and x3) and produces two states (y1 and y2). The problem has been formulated such that there is one continuous (x1) and two discrete (x2, x3) design variables. The demonstration problem contains two "disciplines," or contributing analyses (CA's), which are closed-form analytic expressions in the design variables and non-local states.

This system analysis is given below in the form of two analytic expressions, each representing a contributing analysis.

CA1:  y1 = x1^2 + x2 + x3 - 0.2 y2
CA2:  y2 = sqrt(y1) + x1 + x3                                   (1)

The system is nonhierarchic because the CA's are mutually dependent. The system analysis for this problem is the process by which, given a fixed set of design variables, the values of states y1 and y2 are determined. This process consists of the iterative solution of the set of coupled, nonlinear algebraic equations specified above. The two CA's exchange state information during the system analysis in the manner shown in Figure 2.

[Figure 2: System Analysis for Example Problem]

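To make this iterative analysis concrete, the following sketch (not part of the original paper) solves the two coupled contributing analyses of equation 1 by successive substitution for a fixed design vector; the starting guess, tolerance, and iteration limit are arbitrary illustrative choices.

```python
import math

def ca1(x, y2):
    """Contributing analysis 1: y1 = x1^2 + x2 + x3 - 0.2*y2."""
    x1, x2, x3 = x
    return x1**2 + x2 + x3 - 0.2 * y2

def ca2(x, y1):
    """Contributing analysis 2: y2 = sqrt(y1) + x1 + x3."""
    x1, x2, x3 = x
    return math.sqrt(max(y1, 0.0)) + x1 + x3

def system_analysis(x, tol=1e-8, max_iter=200):
    """Iterate the coupled CA's until the states stop changing (one complete SA)."""
    y1, y2 = 1.0, 1.0                      # arbitrary starting guess
    for _ in range(max_iter):
        y1_new = ca1(x, y2)
        y2_new = ca2(x, y1_new)
        if abs(y1_new - y1) < tol and abs(y2_new - y2) < tol:
            return y1_new, y2_new
        y1, y2 = y1_new, y2_new
    return y1, y2                          # return last iterate if not converged

if __name__ == "__main__":
    print(system_analysis((2.89, 0, 1)))   # states at the reported global optimum
```

For this particular system the substitution map is a contraction, so the iteration converges for every design in the stated variable ranges; a more general implementation would monitor convergence explicitly.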
This problem does not correspond to any particular physical application, but its form makes it ideal for studying mixed optimization in the CSSO context, and its size makes it convenient for displaying and evaluating results. The "merit" of a design is given by an analytic expression in design and state variables. The goal of the system optimization is to minimize this merit function while meeting specified constraints associated with the values of states y1 and y2. The system optimization problem is stated as

Minimize:    f = f(x, y) = x2^2 + x3 + y1 + exp(-y2)
Subject To:  g1 = y1/8 - 1 >= 0
             g2 = 1 - y2/10 >= 0
             -10 <= x1 <= 10
             x2 in {0, 2, 4, 6, 8, 10}
             x3 in {1, 3, 5, 7, 9}                              (2)

A cross-section of the design space for this problem is illustrated in Figure 3. The graph shows the functional dependence of the merit function f on design variable x1 with the remaining design variables fixed at x2 = 0 and x3 = 1. The central region of the space is infeasible because the constraint on state y1 is violated therein, while designs on either side of this area are feasible. The region on the far right is infeasible because of the constraint on y2. The global optimum is located at x = (2.89, 0, 1), where the merit function is f = 9.00. There also exists a local optimum at x = (-2.69, 0, 1), whose merit is f = 9.32, as indicated on the graph.

[Figure 3: Example Problem Design Space (merit function f versus x1 with x2 = 0 and x3 = 1; feasible and infeasible regions and the global and local optima are indicated)]

One aspect of this example is that the "discrete" variables x2 and x3 were created by simply selecting a number of values of inherently continuous parameters. Thus, the concept of a gradient would make some sense for these variables. This is not, however, always the case in discrete optimization, as it is possible for a single discrete variable to dictate multiple problem parameters. Material selection is one instance in which no "derivative" exists; the "direction" from one material to another is an ambiguous concept. "Moving" from aluminum to steel, for example, results in many material property changes. In the context of optimization, some of these are likely to be desirable and some not (e.g., increasing strength at a cost of higher weight).

III. Mixed Optimization Strategy

A fundamental issue is how the optimization, at both the subspace and system coordination steps, is handled. To allow any given disciplinary subproblem to be mixed without the need for further decomposition, an optimization technique that accounts for both continuous and discrete variables is suggested. This was accomplished by combining two distinct optimization strategies, one suitable for purely discrete and the other for

purely continuous optimization, to create a "hybrid" technique. A number of possible approaches exist, one of which is to nest the two optimizers and perform a fully discrete optimization at each iteration of a continuous one, or vice-versa. Unfortunately, running one optimizer or the other many times would compound the issue of computational time and resources. The number of analyses that would be required by such a scheme likely depends on exactly how the two optimizers "communicate" with each other. Qualitatively speaking, however, increasing the number of required analyses is almost always undesirable in the context of MDO. A more feasible approach is to run each optimizer independently of the other. It would not suffice, however, to simply fix the continuous variables at some initial estimate while running the discrete optimizer, fix the discrete variables at the values obtained, and then run the continuous optimizer to find a solution. Figure 4(a) illustrates a shortcoming in this approach. The figure depicts a two-dimensional design space defined by one continuous (xc, along the vertical axis) and one discrete (xd, along the horizontal axis) variable. The possible values of the latter are indicated by the "tick marks" on its axis. The design space is constrained as shown and the "optimal" search direction points up and to the right as indicated. If the continuous variable xc is initially estimated and fixed at the value xc0 shown in Figure 4(a), a discrete optimizer will choose xd by analyzing points only along the line xc = xc0. The optimal value of xd along that line is shown as point "A" in the figure. Proceeding to fix xd at this value while allowing a continuous optimizer to adjust xc will move the design until it encounters the constraint boundary at point "B". It is evident from the graph that this is not the optimal solution. Thus, this technique gives rise to some legitimate concern that the optimizer may be prevented from exploring the optimal region of the design space. Some mixed optimization techniques, such as that of Praharaj and Azarm [7], combat this tendency by repeating this sequence until the final design converges. Figure 4(a), however, does not suggest that re-running the two optimizers sequentially starting from point "B" would necessarily improve the design (or even move it at all). To address the problem described above, each optimization problem in the mixed CSSO algorithm is temporarily transformed into a fully discrete one. This is accomplished by discretizing the continuous variables, allowing the discrete optimizer to control all the design variables.

[Figure 4: A Simple Mixed Design Space ((a) sequential discrete-then-continuous optimization stalls at point B on the constraint boundary; (b) discretizing the continuous variable lets the discrete optimizer reach point C near the optimum)]

This enables it to explore any region of the space. If the continuous variable in the previous example, xc, were discretized as shown in Figure 4(b) and this technique applied, the discrete optimizer would (hopefully) locate the point "C" on the figure. Starting from this intermediate solution, a continuous optimizer could then locate the optimal point shown. Continuous optimization in the current formulation of the mixed CSSO algorithm is performed using the generalized reduced gradient (GRG) method. It is a gradient-based technique that is suitable for constrained optimization [8]. The discrete optimization is performed by simulated annealing. The theory on which this method is founded, as well as its formulation and usefulness in discrete optimization, has been documented [9, 10]. The reasons it was selected for the mixed CSSO framework are related to certain characteristics of the method. The algorithm

possesses the ability to escape local minima, increasing the likelihood that the global minimum will be found. Conversely, the GRG method, like all gradient-based continuous optimization schemes, is susceptible to local minima. Continuous optimization follows simulated annealing in the hybrid scheme employed by the mixed CSSO framework; it is therefore important that the discrete optimizer be one that will likely yield a design point near the global solution. The main drawback of simulated annealing, as is common among discrete optimization methods, is the large number of merit function evaluations often required. Optimization performed in the CSSO framework, however, is based partially (subspace optimizers) or entirely (system coordination) on response surface approximations, which enable information to be obtained quickly. Essentially, Concurrent Subspace Optimization will capitalize on simulated annealing's global robustness while reducing (though not eliminating) its high cost.

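A minimal sketch of the hybrid scheme described above, under several stated assumptions: the continuous variable x1 is temporarily discretized to the integers so that simulated annealing can move all three design variables, and a simple shrinking-step line search on x1 stands in for the GRG optimizer. The cooling schedule, move rules, and penalty weight are illustrative choices, not the paper's settings.

```python
import math
import random

def system_analysis(x):
    """Fixed-point solution of the coupled CA's in equation 1 (see earlier sketch)."""
    x1, x2, x3 = x
    y1, y2 = 1.0, 1.0
    for _ in range(200):
        y1 = x1**2 + x2 + x3 - 0.2 * y2
        y2 = math.sqrt(max(y1, 0.0)) + x1 + x3
    return y1, y2

def penalized_merit(x):
    """Merit function of equation 2 plus a doubled linear penalty on the constraints."""
    y1, y2 = system_analysis(x)
    f = x[1]**2 + x[2] + y1 + math.exp(-y2)
    return f + 2.0 * (max(8.0 - y1, 0.0) + max(y2 - 10.0, 0.0))

X1_GRID = list(range(-10, 11))          # temporary discretization of x1
X2_VALS = [0, 2, 4, 6, 8, 10]
X3_VALS = [1, 3, 5, 7, 9]

def simulated_annealing(steps=2000, t0=10.0, seed=0):
    """Discrete phase: Metropolis acceptance over the fully discretized space."""
    rng = random.Random(seed)
    x = [rng.choice(X1_GRID), rng.choice(X2_VALS), rng.choice(X3_VALS)]
    best = list(x)
    for k in range(steps):
        t = t0 * (1.0 - k / steps) + 1e-3                # simple linear cooling
        cand = list(x)
        i = rng.randrange(3)
        cand[i] = rng.choice((X1_GRID, X2_VALS, X3_VALS)[i])
        delta = penalized_merit(cand) - penalized_merit(x)
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x = cand
            if penalized_merit(x) < penalized_merit(best):
                best = list(x)
    return best

def refine_x1(x, step=1.0, tol=1e-4):
    """Continuous phase: shrink-step line search on x1 (a stand-in for GRG)."""
    x = list(x)
    while step > tol:
        for trial in (x[0] - step, x[0] + step):
            if -10 <= trial <= 10 and penalized_merit([trial, x[1], x[2]]) < penalized_merit(x):
                x[0] = trial
                break
        else:
            step *= 0.5                                  # no improvement: shrink the step
    return x

if __name__ == "__main__":
    x_sa = simulated_annealing()
    print("discrete phase:", x_sa, "refined:", refine_x1(x_sa))
```

In the full CSSO framework the expensive `system_analysis` call would be replaced by response surface evaluations during the subspace and coordination optimizations, which is what makes the large number of annealing evaluations affordable.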
[Figure 5: Small Penalty Function (merit function and penalized merit function versus x1, showing an apparent optimum inside the infeasible region)]

Constrained Discrete Optimization

In performing constrained optimization by simulated annealing, the constraints were formulated as penalty functions and added to the objective function. The reason for this is that "design decisions" in simulated annealing are based solely on objective function values at the design points considered. The form of these penalty functions can affect the performance of the optimizer. For example, consider the design space for the demonstration problem described earlier. Figures 5 and 6 show the merit functions that result from penalizing infeasible designs in two different ways. The plots show the dependence of f on x1 in a cross-section of the space where x2 and x3 are fixed at 0 and 1, respectively. Both penalties are zero for feasible designs and positive otherwise. Figure 5 corresponds to a penalty function that is so small it creates an "apparent optimum" well inside the infeasible region, as shown. This penalty function was defined as

p = (1/2) [max(8 - y1, 0) + max(y2 - 10, 0)]                    (3)

The apparent optimum is infeasible because this penalty term is insufficient to offset the reduction in the merit function between the constraint boundaries and that point. In other words, while the penalized merit function is greater than the original merit function everywhere in the infeasible region, it does not prevent infeasible designs from comparing favorably with feasible ones. This is evident from the graph in Figure 5.

[Figure 6: Large Penalty Function (merit function and penalized merit function versus x1)]

Conversely, the penalized merit function shown in Figure 6 makes the optimal points, including the local optimum, extremely pronounced. This penalty function was defined by

p = { 0,    if y1 >= 8 and y2 <= 10
    { 100,  otherwise                                           (4)

While simulated annealing is able to escape local optima by sometimes moving to a worse design, the probability of this happening at a given time is dependent on (among other things) how much worse the "candidate" design is than the current one: the greater the discrepancy, the less chance of moving to the worse design. In this case, it would be very difficult for an optimizer to escape from the region of the local minimum; the large penalty has created two "narrow valleys" in what was originally a smooth function.

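For reference, the penalty forms discussed in this section written out as code (an illustrative sketch, not the paper's implementation); each takes the exact states from a system analysis and returns the term added to the merit function of equation 2. The doubled linear form anticipates the variant reported later (equation 6) as the most effective for this problem.

```python
import math

def penalty_small(y1, y2):
    """Equation 3: a linear penalty too weak to keep the discrete search feasible."""
    return 0.5 * (max(8.0 - y1, 0.0) + max(y2 - 10.0, 0.0))

def penalty_step(y1, y2):
    """Equation 4: a flat penalty that walls off the infeasible region."""
    return 0.0 if (y1 >= 8.0 and y2 <= 10.0) else 100.0

def penalty_doubled(y1, y2):
    """Doubled linear penalty (later used as equation 6)."""
    return 2.0 * (max(8.0 - y1, 0.0) + max(y2 - 10.0, 0.0))

def penalized_merit(x, y1, y2, penalty):
    """Merit function of equation 2 plus the chosen penalty term."""
    return x[1]**2 + x[2] + y1 + math.exp(-y2) + penalty(y1, y2)
```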
IV. Results

The mixed optimization scheme described above was implemented to solve the problem stated by equation 2 using both "all-at-once" optimization and the mixed CSSO/NN framework. For the former, a full system analysis was performed each time a merit function or constraint computation was necessary. Optimization in this sense does not involve decomposing the problem according to the disciplines it contains, nor does it incorporate any means of state or merit function approximation. The two sets of results were compared in terms of required computational resources.

All-at-Once Optimization

A total of 50 all-at-once trials was performed to evaluate the different penalty functions, which were used to enforce feasibility during the discrete phase of optimization. In each case, the continuous variable, x1, was discretized into 21 points, the integers -10, -9, ..., 9, 10. Ten trials were run for each of the five different penalties listed in the first column of Table 1. The second column of the table lists, for each penalty, the design points obtained after only the discrete phase of optimization. This is the best design found by the simulated annealing algorithm (which can be different from the final design yielded by it, but such occasions are rare). This phase was initiated from a random starting point for these trials. The discrete variables (x2 and x3) were then fixed at the indicated values and a one-dimensional GRG (continuous) optimization was initiated from the simulated annealing solution. During the latter phase of the mixed optimization algorithm, x1 was adjusted to obtain the final solution listed in the third column of the table. Whether this solution corresponds to the global or local optimum in the design space is indicated by a (G) or (L) in the table, respectively. Finally, the value listed in the fourth column indicates how many (out of 10) trials resulted in that solution. In all-at-once optimization, since only exact analyses are performed, multiple GRG runs that start from the same point always finish at the same solution. Hence, the first row in Table 1 indicates that the best design located was (2,0,1) in 9 simulated annealing runs, and that the subsequent GRG optimization yielded the global optimum (2.89,0,1) in those 9 trials.

Table 1: All-at-Once Optimization Results

PENALTY                                     SIM. ANN. SOLN.   GRG SOLN.          #
max(8-y1, 0) + max(y2-10, 0)                (2,0,1)           (2.89,0,1)  (G)    9
                                            (1,0,1)           (2.89,0,1)  (G)    1
2 [max(8-y1, 0) + max(y2-10, 0)]            (3,0,1)           (2.89,0,1)  (G)    10
(1/2) [max(8-y1, 0) + max(y2-10, 0)]        (1,0,1)           (2.89,0,1)  (G)    3
                                            (0,0,1)           (-2.69,0,1) (L)    6
                                            (-1,0,1)          (-2.69,0,1) (L)    1
0 if y1 >= 8 and y2 <= 10; 100 otherwise    (3,0,1)           (2.89,0,1)  (G)    5
                                            (-3,0,1)          (-2.69,0,1) (L)    5
[max(8-y1, 0)]^2 + [max(y2-10, 0)]^2        (3,0,1)           (2.89,0,1)  (G)    6
                                            (-3,0,1)          (-2.69,0,1) (L)    4

The results listed in Table 1 indicate that the first two penalty functions were, among those listed, the best-suited to this particular problem. Specifically, all of the trials for these two cases

yielded the global optimum. The only discrepancy between these two penalty functions is that for the first one, the intermediate solutions (those provided by simulated annealing in the discrete phase of the optimization) were located in the infeasible region. This is again due to the penalty being too small to discourage infeasible designs. Doubling this penalty resulted in the intermediate solution being feasible as well as closer to the global optimum. The corresponding merit function is illustrated in Figure 7. It is evident from the figure that while the locations of the local and global optima have been exactly maintained, the penalty is not so large that it "traps" simulated annealing near the local optimum. The third and fourth penalty functions were those defined by equations 3 and 4, respectively. As suggested by Figure 5, the intermediate solutions in the former case are all infeasible.

[Figure 7: An Effective Penalty Function (merit function and penalized merit function versus x1)]

The results also indicate that the one located most often is closer to the local optimum than the global; thus, the global solution was obtained only in a minority of tests. The solutions obtained using the last two penalty functions listed in Table 1 were fairly evenly distributed between the global and local optima. This is consistent with the reasoning that a large penalty results in the discrete optimizer rarely moving from the region of the local optimum to that of the global, or vice-versa.

Concurrent Subspace Optimization

The same demonstration problem was also solved using the mixed CSSO/NN framework. For the decomposition into two subspace optimization problems, it was assumed that designers in subspace 1 were responsible for computing the state y1 using the appropriate "analysis tool" (CA1). Information about the non-local state, y2, is required as input to CA1 and was provided by a neural network. This response surface gave an approximation of y2 as a function of the complete design vector. Furthermore, it was assumed that the designers for this subspace had the freedom to adjust x1 and x2 but not x3. The first subspace optimization problem (SSO1) was thus stated as

Minimize:    f = f(x2, x3^0, y1^0, ~y2) = x2^2 + x3^0 + y1^0 + exp(-~y2)

Subject To:  g1 = y1^0/8 - 1 >= 0
             g2 = 1 - ~y2/10 >= 0
             -10 <= x1 <= 10
             x2 in {0, 2, 4, 6, 8, 10}                          (5)

The notation in equation 5 is as follows: x3^0 indicates that system design variable x3 is fixed for this subspace, ~y2 is an approximate value of y2 obtained through the ANN, and y1^0 is the value of y1 obtained by using the system design vector and ~y2 as inputs to CA1. The second subspace optimization problem (SSO2) was cast in an analogous manner; the designers in subspace 2 used CA2 to compute y2 based on an approximation to y1. Also, design variable x2 was fixed in SSO2, allowing it to adjust x1 and x3. The two subspace optimization problems shared one design variable (x1) and were both mixed.

An initial set of four testcases was run in which the initial database was relatively sparsely populated, containing either 6 or 8 points located at the "face-centers" or corners of the design space, respectively. The penalty function for the discrete phase of the mixed optimization scheme in all these cases, selected based on the all-at-once results discussed above, was

p = 2 [max(8 - y1, 0) + max(y2 - 10, 0)]                        (6)

The results of these testcases are summarized in Table 2.

Table 2: CSSO Results (sparse initial database)
case    Initial DB    Start. Pt.     Final Pt.
1       corners       (-10,10,9)     (-2.64,0,1)  (L)
2       corners       (10,0,1)       (2.88,0,1)   (G)
3       face-cen.     (-10,6,5)      (-2.66,0,1)  (L)
4       face-cen.     (0,0,5)        (2.95,0,1)   (G)

Two testcases were run for each initial database used. Of these, one was initiated from the point in that database furthest from the global optimum and the other from the nearest. Cases 1 and 3, which were initiated as far from the global optimum as possible, both converged on the local optimum. Conversely, cases 2 and 4 did result in the global optimum after being initiated at points closer to it. How strongly the location of the final design is influenced by the initial design, however, is unclear given the limited number of testcases and the number of other factors which can influence the performance of CSSO. The neural network approximations, for example, play a critical role in determining what design points will be located by the mixed CSSO/NN algorithm. This is particularly important for the demonstration problem being considered here because the difference in merit between the global (f = 9.00) and local (f = 9.32) optima is quite small. Also, the neural networks are trained to meet the target outputs to a prescribed tolerance, which is given as a percentage of the range of the appropriate output (either y1 or y2) amongst all the designs in the database; the range of y1, for example, is on the order of 100. The range of the merit function is of the same order of magnitude, as y1 is one (often the largest) of four terms which are summed in calculating f. The combined effect of these characteristics is that the merit difference between the global and local optima becomes less than the permitted discrepancy between its exact and approximate values. It is therefore possible that the response surfaces meet all training requirements but still indicate that the local optimum has a better merit function value than does the global.


The convergence of the merit function and state y1 for one trial are illustrated in Figures 8(a) and 8(b), respectively. The dashed horizontal line in Figure 8(b) indicates the constraint boundary on y1. The initial design was infeasible, as the value of y1 at that point fell below this threshold. The first two iterations yielded designs which were feasible but far from optimal, resulting in their relatively high merit function values. The design point then approached the global optimum and converged in subsequent iterations.

[Figure 8: CSSO Convergence History ((a) merit function and (b) state y1 versus CSSO iteration, with the constraint boundary on y1 marked)]

The behavior exhibited by the example above, in which the design point did not approach an optimum until after the first few iterations, is fairly typical of cases where the initial database is sparse. In such instances, only a limited number of designs are used to train the neural networks during the first iteration, resulting in response surface approximations that may only vaguely resemble the design space. There may not be any training data in some regions of the space, including that of the optimum. The tendency of the subspace and system coordination optimizers to provide non-optimal designs during the first few iterations is not entirely detrimental. The designs obtained at this stage of the algorithm, when optimization is based on approximations that may only modestly conform to the actual space and/or to each other, tend to be distributed throughout the design space. Augmenting the database with such points enables the neural networks to map more of the space accurately in subsequent iterations. Thus, the quality of the approximations improves as the algorithm progresses, enabling an optimum to be located in later iterations.

[Figure 9: Design Space Approximation (nonparametric response, initial NN, NN after 2 iterations, and final NN approximations of y1 versus x1)]

Figure 9 shows several neural network approximations to y1 present at various iterations of the same trial. They are compared with the nonparametric (exact) response, which is shown by the solid line. The graphs show y1 as a function of x1 in the plane of the optimum (x2, x3 = 0, 1). As can be seen in the figure, the initial network (trained using only the designs in the initial database), aside from showing that y1 increases with x1 for positive x1, did not map that region of the space accurately. The response surface approximation after 2 iterations, at which point 6 design points had been added to the database (2 from subspace optimizations and 1 from system coordination at each iteration), shows some improvement. It was essentially coincident with the nonparametric curve in one portion of the domain, but was still inaccurate for negative x1. Finally, the network at the final iteration reflected the actual behavior of the system over the entire range of

the input. Such progression of the approximations illustrates the point made above. The time and computational resources required to analyze every point in the initial database contribute to the overall cost of a CSSO run. In general, however, information about the system being optimized may be available before the optimization process actually begins. An additional set of testcases was run to investigate the effects of exploiting such information on the efficiency of the mixed CSSO/NN algorithm. A complete database that existed at the end of a previous run was used as the initial database in each of these testcases. Furthermore, the system design problem itself was altered by changing the constraint on y1 from that given in equation 2 to

g1 = y1/12 - 1 >= 0                                             (7)

This modification is akin to changing the performance requirements of a physical system, the typical what-if question. Design databases assembled during CSSO/NN optimization of the original problem remained useful because they contained the values of states, rather than constraints, at various design points. Finally, after a number of such trials, the constraint was again changed to

g1 = y1/18 - 1 >= 0                                             (8)

and the CSSO/NN algorithm applied to this third version of the problem. Adjusting the minimum value of y1 widened the infeasible region in the center of the design space (recall Figure 3). The corresponding local and global minima for the two modified constraints are listed in Table 3.

Table 3: Optimal points after constraint modifications
min. y1    global opt. (x), f        local opt. (x), f
12         (3.55,0,1), 13.0          (-3.35,0,1), 13.3
18         (4.35,0,1), 19.0          (-4.15,0,1), 19.3

The results of CSSO/NN optimization are listed in Table 4. An entry such as "from 2" in the "initial database" column means that the design database that existed at the end of trial 2 (in Table 2) was used as the initial database for that run.

Table 4: CSSO Results starting with an existing database
case    y1min    init. DB    start. pt.       final pt.
5       12       from 3      (-2.675,0,1)     (3.68,0,1)   (G)
6       12       from 4      (2.95,0,1)       (3.80,0,1)   (G)
7       12       from 2      (-10,10,9)       (3.55,0,1)   (G)
8       18       from 3,5    (3.68,0,1)       (-4.15,0,1)  (L)
9       18       from 4,6    (3.80,0,1)       (4.47,0,1)   (G)
10      18       from 1,5    (-10,10,9)       (4.37,0,1)   (G)

Figure 10 illustrates the convergence history for testcase 6 listed in Table 4. The algorithm was initiated at the optimal point of the original problem and converges on the "new" optimum in only 3 iterations. Although the design point did not have to move very far, this improvement in efficiency can also be attributed to the fact that exploiting an existing design database results in relatively good response surface approximations, even during early iterations. This is evidenced by Figure 11, which compares the initial and final neural network approximations of y1 versus x1 in this testcase to the exact function around the optimum. The plots show that even the network trained using the initial database was fairly representative of the nonparametric function throughout the space. Thus, the tendency of the subspace and system coordination optimizers to yield sporadic, non-optimal points at first is not prevalent when the CSSO/NN algorithm is initiated with an existing database. The accuracy of the approximation improved further as the design database grew during optimization.

[Figure 10: Convergence History for case #6 (merit function f versus CSSO iteration)]

[Figure 11: Design Space Approximations (nonparametric response, initial NN, and final NN approximations of y1 versus x1)]

Finally, the convergence history for testcase 10 is shown in Figure 12. The initial database for this case contained information from two prior applications of the mixed CSSO/NN framework. The requirements (constraints) in testcase 10 were different than those for either of the testcases which contributed database information. Nonetheless,

the solution converged by locating the optimum in each of the first two iterations. It is also of note that, for the sake of determining where and when the solution converges in such an instance, this run was initiated at a point in the design space far from the optimum.

[Figure 12: Convergence History for case #10 (merit function f versus CSSO iteration)]

Computational Resources

The number of system and contributing analyses required to perform multidisciplinary design optimization is a primary concern. Table 5 lists the average number of system analyses (SA's) and contributing analyses (CA1, CA2) required to perform optimization via the methods discussed in this paper. For CSSO/NN runs, it is indicated how many CA's were required to perform the SA's and how many were required during subspace optimization. The results for CSSO/NN runs have also been divided into three categories: those in which no database information was taken from previous problems, those in which a database was taken from one problem, and those in which databases were taken from two problems. The values listed in the table are averages computed from all the testcases described earlier.

Table 5: Computational Requirements for Optimization
method               #SA    #CA1 (SA's)   #CA1 (SSO)   #CA2 (SA's)   #CA2 (SSO)
all-at-once          285    1139          -            1139          -
CSSO w/ 0 pvs. DB    39     156           4284         156           4514
CSSO w/ 1 pvs. DB    18     72            1856         72            2083
CSSO w/ 2 pvs. DB    11     44            1746         44            1393

The number of system analyses required by CSSO was considerably less than that for all-at-once optimization. This is consistent with results that have been obtained through earlier implementations of the (continuous-only) CSSO/NN framework [3]. An obvious consequence of this is that the number of contributing analyses required to perform the necessary SA's is also reduced. The mixed CSSO/NN algorithm, however, required more total calls to each CA than did all-at-once. The reason for this lies in the fact that simulated annealing, like many discrete optimization methods, is very expensive. Also, while each subspace optimization required fewer CA's than did all-at-once (because they do not call the iterative system analysis), they were performed at every iteration of CSSO. The total number of CA's required thus depends on how many iterations are performed.

The results listed in Table 5 demonstrate that the ability of CSSO to incorporate design data obtained from previous experience is beneficial. This is evidenced by the reduction in required SA's and CA's as more of such information is included. The reason this trend is attributable to the use of prior design data is two-fold. First, no SA's (and hence no CA's) are required to build the initial database in a CSSO run if the designs therein have already been analyzed. More importantly, using more data generally resulted in response surfaces which better approximate the system, meaning fewer CSSO iterations are required before the subspace and system coordination optimizers are able to locate an optimum. This considerably lessens the total number of CA's required to perform all the subspace optimizations over the course of CSSO. In fact, in two of the three runs initialized with two existing databases, the total number of CA's required was less than the average number required by all-at-once. Both of these runs converged in 2 iterations. The remaining testcase also had located the global optimum after a single iteration but took several more to converge, resulting in an inordinate number of CA's. Had this run been "suspended" after its second iteration, the resultant design would have still been nearly optimal.

V. Comments Regarding Neural Networks

The artificial neural networks used for function approximation in the mixed CSSO/NN framework have been mentioned many times throughout this paper. Little of the work presented here addressed issues related specifically to the ANN's. A number of such issues exist, however, and are worthy of some discussion. These include the means by which the networks are trained, their ability to represent discrete systems, and how the parameters which define a neural network are archived and/or adjusted from one CSSO iteration to the next. Background information regarding neural network architecture and functionality is not included here; such details can be found in Zurada [5] and Rumelhart and McClelland [11], to name a few.

The response surfaces used in the mixed CSSO/NN framework were 3-layer neural networks trained by the Error Back-Propagation technique [11]. The number of neurons, or "nodes", in the first (input) and third (output) layers were dictated by the number of inputs and outputs of the functions being approximated. The number of nodes in the middle layer ("hidden nodes") was dependent on the number of points in the design database. Given that the number of equations represented by m training points of a function with k outputs is

Neqn = m k                                                      (9)

and that the number of parameters ("weights") to be determined in training a network with i inputs, j hidden nodes, and k outputs is

Nunk = (i + 1) j + (j + 1) k                                    (10)

the number of hidden nodes required to obtain an exactly determined system can be determined by equating the right hand sides of equations 9 and 10:

j = k (m - 1) / (i + k + 1)                                     (11)

The number of hidden nodes in each network used by this application of the mixed CSSO/NN framework was initially set to the integer obtained by truncating the value specified by equation 11. Additional hidden nodes were added in the event that a network was unable to conform to the training data within the desired tolerance. Thus, the size of the hidden layer generally increased during the course of CSSO. There is some concern that an excessive number of hidden nodes may result in unwanted fluctuations in network output, causing the response surface to exhibit one or more "false optima". While no such behavior has yet occurred in this application of the mixed CSSO/NN framework, methods of avoiding it are worthy of consideration. For example, the subspace and system coordination optimizers often yield designs very near to each other during the iterations just prior to convergence. Counting nearly-coincident designs as only one point (m = 1 in equation 11) would slow the growth of the hidden layer.

Another issue that arises is how discrete inputs are represented by a neural network. As mentioned previously, the discrete variables in the demonstration problem described in this paper were obtained by selecting several discrete values of inherently continuous parameters. As such, these variables were very easy to represent; a single neuron was used for each, and its value was analogously restricted to several specific points in a continuous domain. Neural network representation is not as trivial, however, for discrete design variables for which no concept of direction exists. Material selection, for example, is an instance in which the selection of one "design variable" (a specific material) dictates multiple problem parameters (density, strength, elastic modulus, etc.). Several options exist for representing, via a neural network, the task of selecting one of N materials. It is possible to have a single neuron take on one of N discrete values, each of which corresponds to a specific material. Alternatively, that single design decision could be represented by N binary neurons, each of which corresponds to a specific material. One of those neurons is then "turned on" when the appropriate material is selected. The subject of neural network representation(s) of discrete systems has been explored by Batill and Swift [12]; however, an investigation of such representations in the context of CSSO has not been performed.
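A sketch of the two input encodings just described for a choice among N materials; the material names and the helper functions are hypothetical and illustrative only, not part of the paper's implementation.

```python
MATERIALS = ["aluminum", "steel", "titanium"]          # hypothetical choices

def encode_single_neuron(material):
    """One input neuron whose value is the index of the selected material."""
    return [float(MATERIALS.index(material))]

def encode_one_of_n(material):
    """N binary input neurons; the one matching the selection is 'turned on'."""
    return [1.0 if m == material else 0.0 for m in MATERIALS]

print(encode_single_neuron("steel"))   # [1.0]
print(encode_one_of_n("steel"))        # [0.0, 1.0, 0.0]
```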

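A small helper (illustrative only) applying equation 11 and the truncation rule described above to size the hidden layer; `train_to_tolerance` in the comment is a hypothetical routine standing in for the back-propagation training loop, which is not shown.

```python
def initial_hidden_nodes(m, i, k):
    """Equation 11, truncated to an integer: j = k(m - 1) / (i + k + 1)."""
    j = (k * (m - 1)) // (i + k + 1)
    return max(j, 1)                      # keep at least one hidden node

# Example: 10 database points, 3 design-variable inputs, 1 output state.
j = initial_hidden_nodes(m=10, i=3, k=1)
print(j)                                  # -> 1 for this small database

# If the trained network misses the tolerance, grow the hidden layer one node
# at a time (train_to_tolerance is a hypothetical training routine):
# while not train_to_tolerance(j):
#     j += 1
```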
Finally, the way in which network weights are archived and/or initialized is of consequence. All the networks trained during mixed CSSO/NN optimization were initialized with small random weights. In all iterations after the first, an alternative would have been to resume the training of each network from the point at which it last terminated. The weights and/or network architecture would then be adjusted only if the existing response surface approximation did not conform to designs which had since been added to the database. A potential benefit of this approach is a reduction in network training time, as little or no adjustment in weights is necessary when a design recently added to the database is very close to one that was already in it. Archiving network weights can also be construed as a means of exploiting prior experience, as these weights were a function of whatever information was used during training. A drawback of this approach, however, is that the functional form of the response surfaces is "biased" towards that of their predecessors. In the CSSO context, it thus becomes likely that the designs given by the subspace and system coordination optimizers continually lie at or near the same point. The information in the resulting database is then concentrated in one region of the space, while designs in other regions may never be analyzed. Essentially, maintaining weights in this fashion puts the most faith in the initial networks, which were trained using the least amount of data.

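A sketch, not the paper's implementation, contrasting the two initialization policies discussed here: fresh small random weights at every iteration (the policy actually used) versus warm-starting from weights archived at the previous iteration. The archive structure and function names are hypothetical.

```python
import random

def random_weights(n_weights, scale=0.1, seed=None):
    """Fresh small random weights, the policy used in this work."""
    rng = random.Random(seed)
    return [rng.uniform(-scale, scale) for _ in range(n_weights)]

archive = {}                                   # weights keyed by network name

def initial_weights(name, n_weights, warm_start=False):
    """Warm-start from archived weights if requested and available."""
    if warm_start and name in archive:
        return list(archive[name])
    return random_weights(n_weights)

# After training at each CSSO iteration, the result could be stored for reuse:
# archive["y2_surrogate"] = trained_weights
```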
VI. Summary

An extension of the Concurrent Subspace Optimization method which is applicable to mixed continuous/discrete problems has been developed. An optimization scheme which combines the simulated annealing and generalized reduced gradient methods, making it suitable for mixed problems, serves as a central component of the mixed CSSO framework. This allows the subspace problems to be mixed without further decomposition being necessary. Response surface approximations are provided by artificial neural networks and are used to provide information about the entire system during discipline-level optimization. The mixed CSSO/NN framework has been applied to a nonhierarchic test problem and was consistently able to locate optimal designs. This method did eliminate many of the complete system analyses required by conventional optimization techniques. Computational resources remain a concern, however, due to the large number of calls to the contributing (disciplinary) analyses required to execute the mixed optimization scheme at the subspace level. Results did demonstrate that the database of design information assembled during CSSO can be exploited to enhance the efficiency of subsequent runs, even if the requirements of the system design problem are altered.

The initial implementation of this mixed CSSO/NN framework demonstrates that the method is applicable to problems containing both continuous and discrete design variables. The issue of computational resources must be addressed if it is to become a feasible option for large problems. Other future efforts include an investigation of the implications, for the neural network representation of the design vector, of discrete variables for which no concept of direction exists.

Acknowledgments

This work was supported in part by the National Aeronautics and Space Administration, Langley Research Center, Grant NAG-1-1561. Dr. J. Sobieski, Project Monitor.

References

[1] J. Sobieszczanski-Sobieski. Optimization by Decomposition: A Step From Hierarchic to Non-Hierarchic Systems. NASA Conference Publication 3031, Part 1, Second NASA/Air Force Symposium on Recent Advances in Multidisciplinary Analysis and Optimization, Hampton, Virginia, September 1988.
[2] J. E. Renaud and G. A. Gabriele. Approximation in Nonhierarchic System Optimization. AIAA Journal, 32(1), January 1994.
[3] R. S. Sellar, S. M. Batill, and J. E. Renaud. Response Surface Based, Concurrent Subspace Optimization for Multidisciplinary System Design. 34th AIAA Aerospace Sciences Meeting and Exhibit, AIAA 96-0714, Reno, Nevada, January 1996.
[4] R. S. Sellar, J. E. Renaud, and S. M. Batill. Optimization of Mixed Discrete/Continuous Design Variable Systems Using Neural Networks. 5th AIAA/NASA/USAF/ISSMO Symposium on Multidisciplinary Analysis and Optimization, AIAA 94-4348, Panama City, Florida, September 1994.
[5] Jacek M. Zurada. Introduction to Artificial Neural Systems. PWS Publishing Company, 1992.
[6] R. S. Sellar, M. A. Stelmack, S. M. Batill, and J. E. Renaud. Response Surface Approximations for Discipline Coordination in Multidisciplinary Design Optimization. 37th AIAA/ASME/AHS/ASC Structures, Structural Dynamics, and Materials Conference, AIAA 96-1383, Salt Lake City, Utah, April 1996.
[7] S. Praharaj and S. Azarm. Two-Level Nonlinear Mixed Discrete/Continuous Optimization-Based Design: An Application to Printed Circuit Board Assemblies. In Advances in Design Automation, Volume 1, ASME, 1992.
[8] G. V. Reklaitis, A. Ravindran, and K. M. Ragsdell. Engineering Optimization: Methods and Applications. John Wiley and Sons, 1983.
[9] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes in FORTRAN, 2nd edition. Cambridge University Press, 1992.
[10] I. O. Bohachevsky, M. E. Johnson, and M. L. Stein. Generalized Simulated Annealing for Function Optimization. Technometrics, 28(3), August 1986.
[11] D. E. Rumelhart and J. L. McClelland. Parallel Distributed Processing: Explorations in the Microstructure of Cognition. The MIT Press, 1986.
[12] S. M. Batill and R. A. Swift. Preliminary Structural Design: Defining the Design Space. Wright Laboratory Report WL-TR-93-3004, 1993.
