Modeling Initial Design Process Using Artificial Neural ...

MODELING INITIAL DESIGN PROCESS USING ARTIFICIAL NEURAL NETWORKSa

Downloaded from ascelibrary.org by California Department of Transportation on 08/04/17. Copyright ASCE. For personal use only; all rights reserved.

Discussion by S. Rajasekaran,3 Member, ASCE, and J. V. Ramasamy4

The authors are to be congratulated for applying the ANN with a back-propagation algorithm to the initial design of rectangular reinforced-concrete single-span beams. However, the writers would like to offer the following comments.

First, experience shows that one hidden layer is sufficient even for most nonlinear problems. This is also seen in Fig. 5(a-c).

Second, the authors have not given any rule for choosing the number of neurons in the hidden layer. Since there are eight input nodes and five output nodes, it is enough to take between L and 2L nodes in the hidden layer (i.e., 8 to 16), where L = number of neurons in the input layer. But the authors have taken 30 nodes, which involves more computer time.

Third, the writers have also used ANN for the initial areas of truss members of industrial roofs (Ramasamy, PhD thesis submitted for review, 1997). For a given set of inputs such as span, angle of roof, whether access is provided, and spacing, the output is the areas of six regions of truss members. Once the ANN is trained, it is possible to get initial areas for any other pattern of input that are very near to the actual areas. This substantially reduces the number of subsequent analysis and design cycles needed to arrive at the optimal design. The writers have also combined a genetic algorithm with this approach.

Fourth, instead of training the ANN with the same learning rate η, one can use an adaptive back-propagation neural network (Atiya et al. 1992) by changing the learning rate according to the error as

[equation not legible in this copy] (1)

where E = error in one epoch; η0 = given learning rate to start with; and η = learning rate at every iteration. Using the adaptive back-propagation algorithm, the ANN can be trained in fewer cycles than an ANN with the usual back-propagation algorithm.

Fifth, it is also not clear why the authors have used two neurons for the M15 and M20 grades of concrete and three neurons for Fe-250, Fe-415, and Fe-500 steel. Instead, one neuron for concrete strength and one neuron for steel grade may be used. Thus the input neurons are reduced from eight to five. This can be done by normalizing the strength to the maximum value of 40; M15 is then represented as 0.375 and M20 as 0.5. This can be done for steel also.

Finally, the authors have arrived at an important conclusion in this study: that damaging up to two nodes will not badly affect the performance of the network. For a general ANN, more such experiments have to be performed before arriving at such conclusions.
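As a rough illustration of the remark about computer time, the sketch below counts the trainable weights and biases for hidden layers of 8, 16, and 30 nodes. It assumes a fully connected network with a single hidden layer and one bias per node; the authors' actual architecture may differ, and the function names are hypothetical.

```python
def hidden_size_range(n_in):
    """Rule of thumb from the discussion: between L and 2L hidden nodes."""
    return n_in, 2 * n_in


def mlp_parameter_count(n_in, n_hidden, n_out):
    """Weights plus biases of a fully connected net with one hidden layer.

    Assumption: every node in the hidden and output layers has one bias.
    """
    hidden_params = (n_in + 1) * n_hidden    # n_in weights + 1 bias per hidden node
    output_params = (n_hidden + 1) * n_out   # n_hidden weights + 1 bias per output node
    return hidden_params + output_params


N_IN, N_OUT = 8, 5                            # node counts quoted in the discussion
low, high = hidden_size_range(N_IN)
counts = {h: mlp_parameter_count(N_IN, h, N_OUT) for h in (low, high, 30)}

for hidden, total in counts.items():
    print(f"{hidden:2d} hidden nodes -> {total} trainable parameters")
```

Under these assumptions, 30 hidden nodes give 425 parameters against 229 for 16 nodes, so each training pass updates nearly twice as many values.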

APPENDIX.

Nalina, K. (1995). Artificial Neural Networks in Civil Engg, ME Thesis submitted to the Bharathiar University.

Ramasamy, J. V. (1995). Expert System for the Design of Industrial Roofs using ANN and General Algorithm, Ph.D thesis submitted to the Bharathiar University.

Discussion by Imad A. Basheer,5 Student Member, ASCE, and Yacoub M. Najjar,6 Associate Member, ASCE

The authors have used the back-propagation algorithm (BP) to model an optimizer used to design singly reinforced concrete single-span beams. The relative success of the BP as a tool for optimization stems from the fact that the authors have used the available optimization model of Chakrabarty (1992). Hence, the authors have utilized the BP as a prediction tool rather than as an optimization search technique for designing the best beam cross sections. While the paper has nicely treated the addressed subject, it has also shown the success of BP and the popularity of such a neural network algorithm in an important field of structural design. In the following, the discussers address a few issues regarding the generality of the developed concrete beam design neural network and the enhancement of the developed model.

First, the success of BP is linked to its ability to generalize from the examples on which the network has been trained. The examples the authors used generally cover a reasonable domain of reinforced-concrete beam design; however, the neural network they developed is limited to three types of steel (reflected by their yield stress) and to two types of concrete mixes (reflected by their compressive strength). Whereas steel does not vary widely in strength from one construction project to another, concrete may vary over a much wider range. To make the developed neural network applicable to a wide spectrum of design cases, continuous values of both the concrete strength and the steel yield stress should have been used in the input layer instead of the 1 and 0 (on/off) type of data, in exactly the same way as was done for the dead and live loads.

This would have at least three advantages: (1) reducing the input variables by three, which may consequently enhance the network generalization and improve its predictive capability; (2) exploiting the advantage of neural networks over expert systems, which mainly work on discrete data; and (3) providing more flexibility for users in design.

As an enhancement of the developed design model, and in order to expand its domain of applicability, the neural network can be designed to include the unit costs of all materials as input nodes in the input layer. This, in addition, can render the developed reinforced-concrete beam design network an efficient tool for decision making and for studying the economic feasibility of construction projects.
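The two input encodings under discussion can be compared with a minimal sketch. It assumes the grade designations give strengths in N/mm^2 (M15 = 15, Fe-415 = 415), takes the normalizing maximum of 40 for concrete from the first discussion, and assumes a maximum of 500 for steel, which is not stated in the text. The function names are hypothetical.

```python
CONCRETE_GRADES = (15, 20)        # M15, M20 (N/mm^2)
STEEL_GRADES = (250, 415, 500)    # Fe-250, Fe-415, Fe-500 (N/mm^2)
CONCRETE_MAX = 40.0               # normalizing maximum quoted in the first discussion
STEEL_MAX = 500.0                 # assumption: strongest steel grade listed


def encode_binary(concrete, steel):
    """On/off encoding: one input node per material grade (2 + 3 = 5 nodes)."""
    if concrete not in CONCRETE_GRADES or steel not in STEEL_GRADES:
        raise ValueError("binary encoding covers only the listed grades")
    return ([1.0 if concrete == g else 0.0 for g in CONCRETE_GRADES]
            + [1.0 if steel == g else 0.0 for g in STEEL_GRADES])


def encode_continuous(concrete, steel):
    """Continuous encoding: one normalized input node per material (2 nodes)."""
    return [concrete / CONCRETE_MAX, steel / STEEL_MAX]


binary = encode_binary(15, 415)
continuous = encode_continuous(15, 415)

print(binary)                             # five material nodes
print(continuous)                         # two material nodes
print(len(binary) - len(continuous))      # input nodes saved
print(encode_continuous(25, 415))         # an intermediate concrete strength
```

The continuous form accepts intermediate strengths such as 25 N/mm^2, although a network would still need training examples spread over that range before its predictions there could be trusted.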

REFERENCES

Atiya, A. F., Parlos, A. G., Muthusami, J., Fernandez, B., and Tsai, W. K. (1992). "Accelerated learning in multilayer networks." Proc., Int. Joint Conf. Neural Networks, Vol. 3, 925.

Closure by Abhijit Mukherjee7 and Jayant M. Deshpande8

The writers thank the discussers for their interest in the paper. The points raised by Rajasekaran and Ramasamy are analyzed as follows:

aJuly 1995, Vol. 9, No. 3, by Abhijit Mukherjee and Jayant M. Deshpande (Paper 6733).
3Prof. and Head, Dept. of Civ. Engrg., P.S.G. Coll. of Technol., Coimbatore, Tamil Nadu, India.
4Lect. in Civ. Engrg., P.S.G. Coll. of Technol., Coimbatore, Tamil Nadu, India.
5PhD Candidate and Asst. Prof., Dept. of Civ. Engrg., Kansas State Univ., Manhattan, KS 66506.
6PhD Candidate and Asst. Prof., Dept. of Civ. Engrg., Kansas State Univ., Manhattan, KS.
7Asst. Prof., Indian Inst. of Technol., Powai, Bombay 400 076, India.
8PhD Student, Indian Inst. of Technol., Powai, Bombay 400 076, India.

JOURNAL OF COMPUTING IN CIVIL ENGINEERING / APRIL 1997 / 145

J. Comput. Civ. Eng., 1997, 11(2): 145-145


Concerning the discussers' first point, although a single hidden layer is sufficient for many problems, it is not uncommon to use two hidden layers. The rule of thumb is that when the number of nodes in the hidden layer exceeds twice the number of nodes in the input layer, it is preferable to use two hidden layers. In a two-hidden-layer network the first hidden layer clusters the data and the second hidden layer classifies the data further. Like many other rules in ANN, this rule is also based on experience and has no proof. Fig. 5(b) presents a study on the damage tolerance of the network. A damage-tolerant network need not be an oversized one.

As for their second point, the networks presented in the paper are the product of a systematic exercise with various network configurations. The size of the hidden layer was increased gradually, starting from five nodes. The writers obtained the best results with the networks presented in the paper.

Analyzing Rajasekaran and Ramasamy's fourth point, it is true that a varying learning rate can accelerate convergence. However, a constant watch on the learning rate should be kept during training. Methods that automatically update the learning rate run the risk of network paralysis or of becoming trapped in local minima. In the present investigation the network converged monotonically, and the writers felt no need to change the learning rate.

To clarify their fifth point of discussion, concrete construction practice allows a few standard grades of concrete and steel. Any other design with nonstandard material is of little interest. Hence, the ANN should be trained for the standard materials only. If one input node is used for all the grades of concrete the network will be smaller, but there is always a chance of a designer inputting a nonstandard material and getting absurd results. Moreover, the network learns very quickly with binary input (1 or 0). The contribution to the hidden layer from a material node that is not chosen is zero (due to 0 as input). The material that is not selected is irrelevant to the design problem at hand. Thus a binary input for materials helps in clustering the data right from the input level. As the prediction of the network is extremely fast, the size of the network is not of much concern.

On the point raised by Basheer and Najjar regarding the inclusion of the unit cost of materials, the writers would like to point out that neural networks are model-free estimators. Hence, they do not require the unit cost of materials and workmanship to arrive at the total cost. If the unit costs fluctuate uniformly, the prediction of the network can be multiplied by the index of fluctuation. If the costs fluctuate arbitrarily, the network must be trained for different fractions of unit costs to the total cost. That would lead to a very large number of cost-fraction combinations for training. Usually the unit costs fluctuate uniformly. Therefore, the arbitrary fluctuation of unit costs has been neglected.
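The remark that an unselected material node contributes nothing to the hidden layer can be checked with a small sketch. The weights below are hypothetical values chosen only for illustration, not values from the paper, and only the three steel-grade input nodes are modeled.

```python
def hidden_preactivation(weights, biases, inputs):
    """Weighted sum reaching each hidden node: b_j + sum_i w_ji * x_i."""
    return [b + sum(w * x for w, x in zip(row, inputs))
            for row, b in zip(weights, biases)]


# Hypothetical weights: 2 hidden nodes fed by the 3 steel-grade input nodes
# (columns correspond to Fe-250, Fe-415, Fe-500).
weights = [[0.4, -0.7, 0.2],
           [-0.1, 0.5, 0.9]]
biases = [0.0, 0.0]

fe415 = [0, 1, 0]                      # binary input selecting Fe-415
pre = hidden_preactivation(weights, biases, fe415)
print(pre)                             # only the Fe-415 column survives

# Changing the weights attached to the unselected grades leaves the sums unchanged.
altered = [[9.0, -0.7, -3.0],
           [5.0, 0.5, 7.0]]
print(hidden_preactivation(altered, biases, fe415) == pre)
```

With a binary input, each hidden node sees only the weight attached to the selected grade, so during training only those weights receive a nonzero gradient from that pattern.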

QUALITATIVE GEOMETRIC REASONER FOR INTEGRATED DESIGNa

Discussion by Ian F. C. Smith,3 Member, ASCE

The discusser would like to congratulate the authors on a very interesting paper related to an important area of computing in civil engineering. However, after noting a form of the word "reasoning" in the title, the discusser was disappointed to find that the majority of the paper dealt with representations and transformations rather than reasoning. The authors state on page 251 that: "The primary task of the GRID is to interpret the low-level coordinate information stored in CAD models (or spatial representations in general) and transform the coordinates into abstract relationships that form the basis for geometric reasoning." This was given in the context of a statement earlier on that page: "A prototype GRID system was developed to present a standard methodology for qualitative geometric reasoning within the design process." It seems to the discusser that the authors have succeeded in fulfilling the goals of the former statement, while the contribution to the latter statement is harder to find in the paper. Representation and transformational issues are dealt with in the majority of the paper, whereas reasoning is touched on only in the last page or two. On page 257 it is stated that: "Although a primitive set of causal modelers were implemented within the GRID prototype, this issue is left for open investigation...."

Would it be possible for the authors to supply more information on the reasoning capabilities that result from implementing GRID-like transformations? Also, it would be helpful to have greater insight into the types of qualitative reasoning that are intended. In addition, it would be interesting to find out how the authors see their work fitting into the substantial amount of research performed over the past 15 years on qualitative physics and qualitative reasoning.

The discusser would also like to raise a few questions in the area of context-sensitive lexicons. For the range of practical problems associated with chemical plants, how many of these lexicons are needed? Also, how many lexicons are used in more than one situation? What are the parameters used to define the context? Finally, what are the problems experienced thus far with the lexicon framework?

aOctober 1995, Vol. 9, No. 4, by Paul S. Chinowsky and Kenneth F. Reinschmidt (Paper 8486).
3Artificial Intelligence Lab. (LIA), Comp. Sci. Dept., Fed. Inst. of Technol. (EPFL), 1015 Lausanne, Switzerland.

PREDICTION OF ESTUARINE INSTABILITIES WITH ARTIFICIAL NEURAL NETWORKSa

Discussion by Robert J. Schalkoff2

The majority of this paper concerns an interesting application of a feed-forward (FF) artificial neural network (ANN), and salient issues such as network architecture, training using gradient descent [back-propagation (BP)], and the resulting network mapping and generalization capabilities. Furthermore, the author articulates a number of design decisions that are often made iteratively in the process of developing a useful mapping network. The discusser feels that some of the broad generalizations of ANN operation and technology, while probably harmless, are somewhat inaccurate. There are profound differences between the operation of the human brain and that of an ANN (Churchland and Sejnowski 1993; Koch and Poggio 1987).

aOctober 1995, Vol. 9, No. 4, by John P. Grubert (Paper 8676).
2Prof., Dept. of Electr. and Comp. Engrg., Clemson Univ., Clemson, SC 29634-0915.
