Automatic component identification using artificial neural networks techniques

M. Schleinkofer, A. Bastian, C. van Treeck and E. Rank, Lehrstuhl für Bauinformatik, Technische Universität München, Munich, Germany

Abstract

Updating existing plans or, where the original plans are no longer accessible, producing a completely new set of drawings to reflect the current state of existing building stock is a common requirement in the refurbishment and redevelopment of old buildings. A convenient and promising way of collecting such data is to make use of laser measurement technology. The models obtained with this technique then have to be transferred to a building product model in order to meet the needs of the subsequent technical project planning stage. We therefore utilise artificial intelligence procedures and propose a method based on artificial neural networks for the analysis of the extracted objects. Objects are categorised with respect to building component classes.

Keywords: Neural network, artificial intelligence, solid model, building product modelling, AEC, CAD.

1 Introduction

Redevelopment schemes and the refurbishment of old buildings account for a large proportion of the building activities conducted in developed countries. In the interests of the long-term sustainability of the environment and resources it is therefore of particular importance to provide appropriate high-performance,


computer-aided planning and simulation tools. A research project funded by the Bavarian State Ministry for the Environment aims to develop the fundamental principles of a model-based planning process for existing building stock. The core aspect is a so-called product model, in which not only the three-dimensional geometry of an edifice but also all the relevant product data (materials, physical properties, ecological data, etc.) are stored and made available for the simulation of the entire life-cycle of the building [1][6]. The project has the largely automated, two-phase creation of this product model as its main focus. An exact assessment (and study) of the current state of the building is indispensable for successful planning. For this reason, one of the first steps is to construct the geometric volume model using methods commonly practised in laser-supported engineering surveying. Once the “point cloud” has been completed, it is analysed by a project partner and initially transferred to a surface model. This forms the basis for the subsequent creation of a volume model, from which the product model is later derived in a second step. Here, we intend to focus our attention on this second stage.

Figure 1: Point cloud, surface model, volume model and product model.

The transferral of the volume model to a product model with AEC objects (architecture, engineering and construction) should be done automatically. For this purpose, the computer is required to identify the objects of the volume model as components. Whereas pattern recognition is already quite advanced in other sectors (image analysis, biometry, for example), very few component identification approaches are available so far in the field of structural engineering. In order to recognise components that exist in the form of topologically distinct volume objects, it is obvious to base the decision procedure on their characteristic features.


Figure 2: The way from a laser scan to a product model of a truss.

This paper deals with pattern recognition with respect to applying neural network techniques. It is organised as follows: first we discuss which characteristic parameters of the volume model should be used as an input for the identification algorithm. These parameters are then brought into an assessable and – if possible – optimised form for the recognition process, the artificial neural network. This involves examining the suitability of the training data used for training the net. Since the choice of network, including the learning algorithm, depends on the task in hand, the focus then shifts to this aspect. The final section takes a look at the best combination of training variants for producing reliable results.

2 Artificial neural networks for the classification of building components

2.1 Procedure

Artificial neural networks (ANNs) [3] consist of cells (neurons, perceptrons) which communicate with one another. The input into such a cell determines its activation level, which is forwarded in the form of an output signal to other cells, where it becomes their input. It is therefore necessary to feed a certain amount of information into a neural network for internal evaluation in order to ultimately produce the computational result. Communication between the user and the net is based on input neurons (located in the input layer) and output neurons (output layer). The interstitial neurons belong to hidden layers, to which the user has no access. There are many different types of neural network, which vary in their connection structure, their propagation rule and their learning algorithm. An example of a simple net is the layer-wise connected feedforward network (Figure 3).


Figure 3: An artificial neural network with an input layer (left-hand side) and an output layer (right-hand side), with a hidden layer in between.

A trained network stores its knowledge in its edge weights, which strengthen or weaken the relaying of signals between the neurons. During the learning process these weights are modified in such a way that the net's output is optimised for a training database. The aim of the training phase is to enable the conditioned network to handle unknown data samples in a similar way. This is no longer possible if the network is fitted too closely to the training data, due to excessive training or too few training examples. 3D volume objects (solids) form the basis for the building component classification process. By definition they can be arbitrarily formed, so the collection of meaningful parameters is an important task. Once these data are available, they are brought into a format which is accessible to the net. The network, which has as many output neurons as there are types (categories) of building elements to be differentiated, provides an output value for each category, which then has to be interpreted as a network decision.
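To make the decision stage concrete, the following sketch propagates an input vector through a small layered feedforward net. It is purely illustrative: the layer sizes, edge weights and bias handling are assumptions of this sketch, not the network configuration used in the paper.

```python
import math

def sigmoid(x):
    # Activation function (see Section 2.6).
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, layers):
    """Propagate an input vector through a layered feedforward net.

    `layers` is a list of weight matrices; row j of layers[k] holds the
    weights from every neuron of layer k to neuron j of layer k+1, with
    a trailing bias weight (an assumption of this sketch).
    """
    signal = list(inputs)
    for weights in layers:
        signal = [sigmoid(sum(w * s for w, s in zip(row, signal + [1.0])))
                  for row in weights]
    return signal

# Hypothetical 2-2-1 net with made-up edge weights.
hidden = [[0.5, -0.4, 0.1], [0.3, 0.8, -0.2]]   # 2 neurons, 2 inputs + bias
output = [[1.0, -1.0, 0.0]]                     # 1 neuron, 2 hidden + bias
result = forward([1.0, 0.0], [hidden, output])
print(result)
```

Because the sigmoid maps every activation into (0, 1), each output neuron "fires" a value in that interval, which is what the post-processing step later interprets.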

2.2 Choice of geometrical attributes

Artificial neural networks are already employed successfully for pattern recognition in 2D bitmaps [3][7], where brightness values of colour components (or of greyscales only) are transferred to a neural net. A possible option for solving the problem addressed in this paper would therefore be to discretise the objects in continuous space in a grid fashion ("voxelisation") [8] and then transfer the voxel matrix to the net. Yet it should be borne in mind that the size of the input layer depends directly on the number of voxels. Thus, using a high resolution would lead to a large number of voxels, causing an immense amount of data. The procedure presented in this paper therefore uses a different approach: capturing the geometry directly from the objects.

In order to identify building components, it is necessary to define decision parameters that are equally valid for every object. For this reason, it is straightforward to adopt the approach of applying attributes that common sense would also use as allocation criteria, while simultaneously attempting to analyse the objects regardless of their position within the general context. In the present case, the relationship between neighbouring solids is not assessed. The following geometrical properties of the volume objects will be used as decision criteria:

• Based on the expansion of the object, we can take the height as it stands. The two horizontal dimensions are obtained by using the smallest box [2] that encloses all the points projected onto the horizontal plane.

• Next, we ascertain the body's surface area and its volume.

• Boolean operations are used in order to determine the solid's position within the other objects. In this context it should be noted that only components with mass are allowed to be represented as volume elements, unlike topological areas, such as the room itself. This means that all bodies are represented without their openings being cut out. A restriction to this rule is that the openings (like windows or doors) themselves (which are actually massless) are also depicted as volume objects (see Figure 4). This results from the overriding stipulation that all the objects dealt with here are generated from laser scan data.
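As an illustration of the extent criteria, the following sketch extracts height and horizontal dimensions from a solid's vertex list. Note a simplification: the paper uses the smallest (possibly rotated) enclosing box of the horizontal projection [2], whereas this sketch takes the axis-aligned box; the function name and the point representation are assumptions.

```python
def box_features(points):
    """Extract rough extent features from a solid.

    `points` is an assumed list of (x, y, z) vertex tuples. The
    axis-aligned box used here is a simplification of the minimal
    enclosing box computed in the paper.
    """
    xs, ys, zs = zip(*points)
    height = max(zs) - min(zs)
    dx, dy = max(xs) - min(xs), max(ys) - min(ys)
    # Order the horizontal extents so "length" is the larger one.
    length, width = max(dx, dy), min(dx, dy)
    return length, width, height

# A 4 m x 0.3 m x 2.5 m wall-like box.
pts = [(x, y, z) for x in (0, 4.0) for y in (0, 0.3) for z in (0, 2.5)]
print(box_features(pts))  # (4.0, 0.3, 2.5)
```

Surface area, volume and the Boolean containment test would come from the solid modelling kernel of the CAD system and are therefore not reproduced here.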


Figure 4: Objects representing walls and their massless openings are both described by volume objects (shown here as a wireframe model).

2.3 Conditioning of the geometrical attributes

The attributes are transferred to the artificial neural network by means of a vector of type "double"; accordingly, they have to be converted into this vector. The spatial dimensions, length l, width w and height h, are already available in an appropriate form. The ratios l / w and l / h are used as additional network input data. The same applies to surfaces and volumes, which can be transferred directly, their ratios serving to increase their significance. The result of the intersection test has to be converted into a floating point value. The correlation "is incorporated in" is more significant than "incorporates": walls or slabs may contain between 0 and n objects, whereas openings are always contained within exactly one object. This information is transferred as the floating point number 1.0 ("incorporated in") or 0.0 ("not incorporated in"), respectively.
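The conditioning step can be sketched as follows. This is illustrative Python, not the authors' implementation: the function name, parameter names and the exact vector layout are assumptions; only the choice of quantities (dimensions, ratios, area, volume, containment flag) follows the text above.

```python
def condition(length, width, height, area, volume, is_contained):
    """Assemble the network's input vector of doubles.

    Absolute dimensions plus their ratios, surface area and volume plus
    their ratio, and the containment relation encoded as 1.0
    ("incorporated in") or 0.0 ("not incorporated in").
    """
    return [
        length, width, height,
        length / width, length / height,
        area, volume, area / volume,
        1.0 if is_contained else 0.0,
    ]

# The 4 m x 0.3 m x 2.5 m wall-like box from before:
# surface area 23.9 m^2, volume 3.0 m^3, not contained in anything.
vec = condition(4.0, 0.3, 2.5, 23.9, 3.0, False)
print(vec)
```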

2.4 Conditioning of the training data

The following terms are defined with regard to the training stage: a single set of data used to train the net or to request a decision is called a sample. All samples combined form the training basis. Each pass through the training basis for conditioning the net is called a training cycle. A net either has to be trained for a given number of cycles or until the net falls below a given error value. One way of calculating the error value is to validate the output of the network against the given target values of the training basis employed. Another option is to use additional samples which are not part of the training basis, but for which the type of component is also already known.

A problem regarding the training of artificial neural networks is the choice of a suitable database. Based on a few selected input samples, the network has to derive a more general, more abstract form of the sample [9]. The training basis therefore has to satisfy certain criteria: it should cover worst-case scenarios and contain numerous variations. In particular, identical data records should be avoided. In order to train the net competently for all possible outcomes, there has to be an adequate number of samples for each of the required categories.
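The two criteria at the end of Section 2.4 (no identical records, enough samples per category) lend themselves to a simple automated check. The function below is an editorial sketch; the sample representation and the threshold are assumptions, not taken from the paper.

```python
from collections import Counter

def check_training_basis(samples, min_per_category=5):
    """Sanity checks on a training basis.

    `samples` is an assumed list of (feature_tuple, category) pairs;
    `min_per_category` is an illustrative threshold.
    """
    problems = []
    if len(set(samples)) < len(samples):
        problems.append("identical data records present")
    counts = Counter(cat for _, cat in samples)
    for cat, n in counts.items():
        if n < min_per_category:
            problems.append(f"too few samples for category '{cat}' ({n})")
    return problems

basis = [((1.0, 2.0), "wall"), ((1.0, 2.0), "wall"), ((3.0, 1.0), "slab")]
print(check_training_basis(basis, min_per_category=2))
```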

2.5 The artificial neural network employed

A layered feedforward net was used [10] to solve the current problem, the nodes of the net being arranged in a succession of layers with one-way communication only. Neural networks of this kind are simple to handle but still provide sufficient complexity thanks to a fairly large number of hidden layers. The learning algorithm employed here is the well-known backpropagation algorithm [3][4]. An input sample is presented to the net. As in the decision stage, this input passes through the network, which computes an output. Once this output has been calculated, it is compared with the required result, which yields an error vector. The new edge weights are recalculated with the help of this error vector using reverse computation, i.e. starting at the output layer and ending at the input layer, after which the values are updated.

Although the use of the backpropagation algorithm in combination with hierarchical feedforward networks is a reliable method, typical problems of gradient descent procedures may sometimes occur: the procedure can stagnate in local minima or on flat plateaus, and there is the phenomenon of oscillation in steep canyons. There are several well-known procedures that endeavour to mitigate this weakness of the backpropagation algorithm, but they are not employed in this particular application. The duration of the learning procedure is another factor that should not be underestimated. It can be controlled by the learning factor η, for example. If η is too large, which would accelerate the learning progress, there is a risk of not attaining the optimum solution because the jumps on the error surface are too big.
Since the size of the input vector is small due to the low number of object parameters, and because the number of output neurons depends on the number of object categories, the number of degrees of freedom is low enough to allow the training cycles to be computed in an acceptable time. The networks under investigation were trained in a matter of a few seconds.
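The training loop described above can be sketched for a net with one hidden layer. This is a generic textbook backpropagation [3][4] in Python, not the authors' implementation; the hyperparameters (η, hidden size, cycle count) and the toy task are illustrative assumptions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(samples, n_hidden=7, eta=0.5, cycles=1000, seed=0):
    """Backpropagation for a 1-hidden-layer feedforward net (sketch).

    `samples` is a list of (input_vector, target_vector) pairs.
    Returns (W1, W2): input->hidden and hidden->output weight matrices,
    each row carrying a trailing bias weight.
    """
    rnd = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    W1 = [[rnd.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    W2 = [[rnd.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    for _ in range(cycles):
        for x, t in samples:
            # Forward pass, as in the decision stage.
            xb = list(x) + [1.0]
            h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in W1]
            hb = h + [1.0]
            o = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in W2]
            # Error terms: output layer first, then propagated backwards.
            d_o = [(t[k] - o[k]) * o[k] * (1 - o[k]) for k in range(n_out)]
            d_h = [h[j] * (1 - h[j]) * sum(d_o[k] * W2[k][j] for k in range(n_out))
                   for j in range(n_hidden)]
            # Weight updates, scaled by the learning factor eta.
            for k in range(n_out):
                for j in range(n_hidden + 1):
                    W2[k][j] += eta * d_o[k] * hb[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    W1[j][i] += eta * d_h[j] * xb[i]
    return W1, W2

# Toy XOR-style task just to exercise the algorithm.
data = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]
W1, W2 = train(data, n_hidden=4, eta=0.5, cycles=1000)
```

With input vectors and weight matrices this small, the cost per cycle is negligible, which matches the training times of a few seconds reported above.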


2.6 Post processing

Each output neuron is assigned to a building element class. The values read off the output neurons have to be evaluated in order to determine the most appropriate component category. Since the output neurons in the implemented case "fire" values within the interval [0; 1], and since the results can be interpreted as "the bigger the value, the more likely", the largest values are compared with one another and, ideally, one will stand out against all its rivals. The category to which this neuron is assigned is then regarded as the network's output decision. It should be noted that the output of a neuron is not to be confused with a value from the calculus of probabilities. For one thing, the sum of all the output values does not equal 1. For another, the final operation carried out in the decision process means that the output of the output layer is not linear; instead, it follows the implemented activation function. In the programme employed here, this was the sigmoid function (see Figure 5).

f(x) = 1 / (1 + e^(-x))

Figure 5: The sigmoid function is used as the activation function.
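The winner selection described in Section 2.6, together with the shortlist of possible decisions mentioned in the conclusion, can be sketched as follows. The ambiguity margin is an editorial assumption, not a value from the paper.

```python
def decide(outputs, categories, margin=0.1):
    """Interpret the output layer's values.

    Returns the winning category, a shortlist ordered by output value,
    and an ambiguity flag raised when the runner-up is within `margin`
    of the winner (the margin is an illustrative choice).
    """
    ranked = sorted(zip(outputs, categories), reverse=True)
    shortlist = [cat for _, cat in ranked]
    ambiguous = len(ranked) > 1 and ranked[0][0] - ranked[1][0] < margin
    return shortlist[0], shortlist, ambiguous

cats = ["wall", "slab", "door", "window", "slab opening", "beam", "column"]
winner, shortlist, ambiguous = decide([0.9, 0.2, 0.1, 0.85, 0.0, 0.1, 0.1], cats)
print(winner, ambiguous)  # wall True
```

Flagging the decision as ambiguous rather than forcing a choice is one way of supporting the user interaction proposed in the conclusion.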

2.7 Sample application

An extension to the commercial CAD software Autodesk Architectural Desktop 2005 was programmed for the purpose of testing the proposed method. This application makes it possible to train the network directly within the CAD environment, i.e. to convert arbitrarily shaped volume bodies into AEC components.

First of all, a simple feedforward network with a single hidden layer is chosen for testing purposes. The aforementioned backpropagation algorithm is applied by way of the training algorithm. Initially, the number of component classes, i.e. the number of output neurons, is set at seven, namely "wall", "slab", "door", "window", "slab opening", "beam" and "column", referring to the most important component classes of the standard CAD object description. To begin with, the number of neurons in the hidden layer is set at seven, too. As the input parameters themselves as well as their number are changed several times during testing, the number of input neurons has to be modified accordingly, ranging from two to eight.

Training data is taken from an existing standard CAD model (see Figure 7). The ability of the given data to serve as training material is initially tested by training and testing the net using the same set of data. Various input parameters, aligned to those a person would use to decide between different component categories, are tried out in multiple combinations. Some helpful and practical parameters that can be extracted from the CAD object description include:

• dimensions of the minimum enclosing box in x-, y- and z-direction,

• area, volume and centroid coordinates of the solid itself,

• number of objects contained in a solid under consideration and number of objects containing that solid.

Training with various combinations and numbers of input parameters yields two main results concerning the kind and number of input parameters:

• The more input parameters there are, the more "confused" the network becomes and the longer it takes to be trained to recognise at least most of the training samples correctly. With more than four input parameters there is almost no training effect at all, and the network never achieves an error quota of less than thirty per cent in recognising its own training data (see Figure 6a). Consequently, as only a few parameters can be passed to the network, these have to be chosen carefully, and it is not wise to add parameters just for the sake of adding information.

• An obvious point was confirmed: absolute values are less significant than their ratios to one another. In combination with the former observation, the input of absolute dimensions is therefore replaced by the input of their ratios.

Taking these two observations into account, the input is reduced to the three most informative parameters:

• "x / y" (i.e. larger horizontal dimension to smaller horizontal dimension of the minimum enclosing box),

• "x / z" (i.e. larger horizontal dimension to vertical dimension of the minimum enclosing box) and

• "contains" (i.e. number of objects contained in the solid under consideration).


Figure 6a and b: Incorrect classifications and normalised network error for training an existing set of standard CAD object descriptions, supplemented by additional slabs taken from other buildings, using eight input parameters (dimensions in x-, y- and z-direction, ratio area / volume, number of objects contained, number of objects containing, ratios x / y and x / z).

In terms of the composition of the set of training data, a heavily one-sided distribution of the different output categories (or object types, as the case may be) within the data set is found to be error-prone with regard to classification.


Considering a typical building, for example, there are many walls and windows but only a few slabs (Figure 7). The consequence is that a network trained using this data set finds it hard to learn to classify slabs, but is easily trained on walls. Thus it classifies most of the samples as walls, including windows, doors and especially slabs. This problem is solved by adding more samples, namely slabs from other existing buildings, to the structure used as a training data set, thus turning it into a serviceable set of data for training.
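The imbalance described above can be quantified before training. The sketch below reports how many additional real samples each class still needs; it deliberately does not duplicate records (which Section 2.4 warns against) but mirrors the paper's remedy of collecting further samples, e.g. slabs from other buildings. The function name, the sample representation and the sample counts are illustrative assumptions.

```python
from collections import Counter

def deficit_per_category(samples, target=None):
    """Report how many extra samples each component class still needs.

    `samples` is an assumed list of (features, category) pairs; by
    default the largest class size is taken as the target.
    """
    counts = Counter(cat for _, cat in samples)
    target = target or max(counts.values())
    return {cat: target - n for cat, n in counts.items() if n < target}

# A hypothetical house model: many walls, some windows, few slabs.
house = [(None, "wall")] * 20 + [(None, "window")] * 12 + [(None, "slab")] * 2
print(deficit_per_category(house))  # {'window': 8, 'slab': 18}
```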

Figure 7: A typical house features an unbalanced number of different components. To train an artificial neural network with these data it is necessary to add further samples from other buildings.

The fastest convergence and the most reliable recognition of the training data are achieved by following the aforementioned terms and conditions (see Figure 8a). The network is found to classify the training data correctly after performing 190 training cycles. From this point on, further training merely serves to improve determinism and reduce the normalised network error. Observing the normalised error produced by the network after several training steps, we conclude that convergence of the network error is necessary but not sufficient for effective training, i.e. for obtaining a small percentage of unrecognised samples (see Figure 6b and Figure 8b). At least we observe a reasonable behaviour of the network error convergence rate: for a completely trainable case (Figure 8), the network error is lower than that for the case with a remaining thirty per cent of incorrect classifications (Figure 6), regardless of the number of training cycles performed.

Figure 8a and b: Incorrect classifications and normalised network error for training an existing set of standard CAD object descriptions, supplemented by additional slabs taken from other buildings, using three input parameters ("x / y", "x / z", "contains").


Another criterion required for the convergence of the algorithm, and an indication of a trained, well performing network, is provided by the development of the edge weights in the network throughout training. Only when the change in these weights is close to zero during each additional training cycle can the network be described as "trained" and the algorithm considered to have converged (Figure 9).
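This stopping criterion can be sketched as a comparison of two successive weight snapshots. The nested-list layout and the tolerance are editorial assumptions, not values from the paper.

```python
def converged(prev_weights, new_weights, tol=1e-4):
    """Declare the net "trained" once the edge weights stop moving.

    Both arguments are assumed lists of layer weight matrices; the
    tolerance is an illustrative choice.
    """
    flat = lambda ws: [w for layer in ws for row in layer for w in row]
    return max(abs(a - b)
               for a, b in zip(flat(prev_weights), flat(new_weights))) < tol

before = [[[0.50, 0.30]], [[0.10]]]
after_ = [[[0.50005, 0.29998]], [[0.10001]]]
print(converged(before, after_))  # True
```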

Figure 9: Weights during training in the network described above (each line depicts the development of one weight between the input layer and the hidden layer).

Making changes to the network architecture by reducing or increasing the number of hidden nodes, or alternatively omitting the entire hidden layer, does not improve convergence; instead, it leads to a residual percentage of incorrect classifications, no matter how many training cycles are applied.

Since the main reason for applying neural networks to this task is to achieve improved adaptability to new, unknown sets of data, a fictitious set of components was devised (see Figure 4 and Figure 11) for testing the network trained as described above. The results are satisfactory: after two hundred cycles of training, the network already manages to classify all the invented testing samples correctly (Figure 10). From this point on, further training again serves only to improve the distinction between the categories and reduce the normalised network error.


Figure 10: Incorrect classifications when determining the fictitious set of components in Figure 4 by means of the trained network according to Figure 8a and b.

3 Conclusion

Using artificial neural networks to identify building components is a way of reacting flexibly to different samples. It attempts to imitate the approach common sense might adopt to solve the problem. Attributes that are also significant for human beings were adapted to a format that could be used for the network. For this reason, forms of building elements that are hitherto unknown to the system are not excluded from the outset. The process we used here may make wrong decisions if the result calculated by the network fails to produce a clear favourite. In this case, user interaction may be useful in avoiding misinterpretations. In addition, the net can provide the user with a shortlist of possible decisions as a guide. Some useful results have already been obtained in the tested cases. Conventional component forms were identified and assigned to the correct component category with great reliability.


Figure 11: Result of the classification based on the volume objects shown in Figure 4: solids are identified as AEC-objects.

References

[1] Neuberg, F.; Ekkerlein, C.; Rank, E.; Faulstich, M.: Integrated Life Cycle Simulation and Assessment of Buildings. In: Xth International Conference on Computing in Civil and Building Engineering (ICCCBE-X), Weimar, Germany, 2004

[2] O'Rourke, J.: Finding minimal enclosing boxes. In: International Journal of Computer and Information Sciences 14(3), 183-199, 1985

[3] Rojas, R.: Neural Networks - A Systematic Introduction. Springer-Verlag, Berlin, New York, 1996

[4] Rumelhart, D.; Hinton, G.; Williams, R.: Learning internal representations by error propagation. In: Rumelhart, D.; McClelland, J. (eds.): Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1, 318-362, MIT Press, Cambridge, 1986

[5] Stergiou, C.; Siganos, D.: Neural Networks. http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html, downloaded 12.04.2005

[6] van Treeck, C.; Rank, E.: Analysis of Building Structure and Topology Based on Graph Theory. In: Xth International Conference on Computing in Civil and Building Engineering (ICCCBE-X), Weimar, Germany, 2004

[7] Welstead, S.: Neural Network and Fuzzy Logic Applications in C/C++. John Wiley & Sons, New York, 1994

[8] Wenisch, P.; Wenisch, O.: Fast octree-based voxelization of 3D Boundary Representation objects. Technical report, Lehrstuhl für Bauinformatik, TU München, 2004

[9] Wikipedia, Die freie Enzyklopädie (eds.): Künstliches neuronales Netz (artificial neural network). http://de.wikipedia.org, downloaded 21.07.2004

[10] Zell, A.: Simulation neuronaler Netze. Addison-Wesley, Bonn, 1994
