
Industry/Government Track Paper

Effective Localized Regression for Damage Detection in Large Complex Mechanical Structures

Aleksandar Lazarevic
University of Minnesota, United Technologies
200 Union Street SE, 4-192, Minneapolis, MN 55455, USA
1-612-626-8096

Ramdev Kanapady
University of Minnesota
111 Church Street, SE 125 Minneapolis, MN 55455, USA
1-612-626-8101

Chandrika Kamath
Lawrence Livermore National Lab.
Box 808, L-561 Livermore, CA 94551, USA
1-925-423-3768

ABSTRACT

In this paper, we propose a novel data mining technique for efficient damage detection within large-scale complex mechanical structures. Every mechanical structure is defined by a set of finite elements that are called structure elements. Large-scale complex structures may have an extremely large number of structure elements, and predicting the failure in every single element using the original set of natural frequencies as features is an exceptionally time-consuming task. Traditional data mining techniques simply predict failure in each structure element individually using global prediction models that are built considering all data records. In order to reduce the time complexity of these models, we propose a localized clustering-regression based approach that consists of two phases: (1) building a local cluster around a data record of interest and (2) predicting the intensity of damage only in those structure elements that correspond to data records from the built cluster. For each test data record, we first build a cluster of data records from the training data around it. Then, for each data record that belongs to the discovered cluster, we identify the corresponding structure elements and build a localized regression model for each of these structure elements. These regression models for specific structure elements are constructed using only a specific set of relevant natural frequencies and only those data records that correspond to the failure of that structure element. Experiments performed on the problem of damage prediction in a large electric transmission tower frame indicate that the proposed localized clustering-regression based approach is significantly more accurate and more computationally efficient than our previous hierarchical clustering approach, as well as global prediction models.

General Terms

Algorithms, Performance, Design, Experimentation.

Keywords

Clustering, localized regression, structure elements, damage detection, mechanical structures.

1. INTRODUCTION

With the increasing demand for safety and reliability of structures and mechanical systems, damage detection by non-destructive evaluation methods has attracted considerable attention recently. The phenomenon of damage in a material includes localized softening or cracks in a certain neighborhood of a structural component due to high operational loads, or the presence of flaws due to manufacturing defects. Methods that identify the presence, the location and the severity of damage in a mechanical structure are useful for non-destructive evaluation procedures that are typically employed in agile manufacturing and rapid prototyping systems. In addition, these techniques will be critical for reliable prediction of damage in structural systems such as bridges, skyscrapers, aircraft structures, and various structures deployed in space. Since every structural system is defined by a set of finite elements that are called structure elements, reliable structural damage prediction is usually considered in terms of damage detection in structure elements, where damage results in changes in structural responses such as static deformations and dynamic characteristics (e.g., natural frequencies and mode shapes). Although rigorous damage models exist, in this work we primarily focus on the aspect of structural damage that is assumed to be associated with structural stiffness, as a reduction in Young's modulus or modulus of elasticity (E) of the structure elements [16]. In these situations, there are three levels of damage identification: (i) Recognition - qualitative indication that damage might be present in the structure; (ii) Localization - information about the probable position of the damage in the structure (which structure elements are damaged); (iii) Assessment - estimate of the severity of the damage in the structure (intensity of the damage for failed structure elements).

Categories and Subject Descriptors

H.2.8 [Database Management]: Database Applications (data mining, scientific databases, spatial databases)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. KDD’04, August 22–25, 2004, Seattle, Washington, USA Copyright 2004 ACM 1-58113-888-1/04/0008...$5.00.

A practical damage assessment methodology must be capable of predicting changes in the structural stiffness as a function of changes in structural response and dynamic characteristics (e.g. natural frequency and the mode shapes) [19]. Standard analytical techniques employ mathematical models to approximate the relationships between specific damage conditions and changes in



created cluster, we identify corresponding structure elements assuming that the failed element is one of these identified structure elements. More specifically, by identifying corresponding structure elements we focus our prediction only on the structure elements that are highly likely to be damaged. Therefore, instead of predicting the intensity of damage in all structure elements, we build a prediction model for each of these identified corresponding structure elements in order to determine which of these structure elements has really failed. The prediction model for a specific structure element is constructed using only those data records from the entire training data set that correspond to different intensities of its failure and using only a specific set of relevant natural frequencies. Experiments performed on the problem of damage prediction in a large electric transmission tower frame indicate that the proposed localized clustering-regression based approach is more accurate than our previously proposed hierarchical partitioning approaches [7]. In addition, the proposed approach also requires less computational time than our hierarchical partitioning approaches.

the structural response or dynamic properties. Such relationships can be computed by solving a class of so-called inverse problems [4, 15]. However, the existing approaches for solving these problems have several major drawbacks, namely: i) a large amount of modal information, such as eigenvalues and eigenvectors associated with the damaged structure, has to be employed to identify the damage in the structure accurately; ii) the more sophisticated methods involve computationally cumbersome system solvers, typically based on singular value decomposition techniques, non-negative least squares techniques, bounded variable least squares techniques, etc.; and iii) all of these computationally intensive procedures need to be repeated for any newly available measured test data for a given structure. Hence there is a need to explore alternative, more computationally efficient and accurate approaches for the damage identification problem. An immediate alternative is to design data mining techniques that can enable real-time prediction and identification of damage for newly available test data once a sufficiently accurate model is developed from the training data. Recently, a number of researchers [12, 16, 18-20] have used data mining techniques, such as neural networks, to address the problem of damage detection in mechanical structures by using static displacements and dynamic characteristics. Most of these studies, which are categorized as the “direct approach”, require the prediction of the material property, namely Young's modulus, of all the structure elements in the domain individually or simultaneously. As a consequence, they are restricted to prediction of damage in mechanical structures with a small number of structure elements (order of ten). The development of a predictive model that can correctly identify the location and severity of damage in practical large-scale complex structures using the “direct approach” can be a considerable challenge.
Increased geometric complexity of the structure causes an increase in the number of structure elements (target variables from the data mining point of view), which also causes an increase in the number of prediction models that need to be built. This growth increases not only the time required for training prediction models but also the time required for data generation, since each damage state (data record) requires an eigen solver to generate natural frequencies and mode shapes of the structure. All these limitations make the “direct approach” not scalable to situations in which thousands of elements are present in the complex geometry of the structure or when multiple elements in the structure have been damaged simultaneously.

2. PROBLEM DESCRIPTION

The problem of predicting damage intensity in structure elements conceptually corresponds to the equivalent data mining problem of predicting a large number of continuous target variables. Traditional data mining techniques for addressing this problem simply predict each of these target variables simultaneously or individually. However, in damage detection, due to the nature of the problem, there is no need to predict all target variables but only those that are of particular interest (damaged structure elements). In addition, the characteristics of the domain may cause (i) a strong correlation among the target variables (structure elements) that need to be predicted, and (ii) the existence of high-dimensional and heterogeneous data, where different prediction models are responsible for different regions. In such databases, incorporating this background knowledge into the learning process may typically produce more efficient and accurate prediction models. It is interesting to observe that the problem of predicting multiple target variables is also present in other emerging applications. For instance, in a manufacturing process we may want to predict various quality aspects of a product from the parameter settings used in the manufacturing, while in financial markets, given econometric variables as predictors, the goal may be to predict changes in the valuations of the stocks in a large number of industry groups or mutual funds [2].

In this paper, we propose an effective and scalable data mining technique for accurate damage detection in very large-scale complex mechanical structures. The proposed localized clustering-regression based approach consists of two phases:

• finding the most similar training data records to a test data record of interest and creating a cluster from these data records; and

• predicting damage intensity only in those structure elements that correspond to data records from the built cluster, assuming that all other structure elements are undamaged.

Significant data mining research work has been done in the area of predicting multiple target variables. Some of these approaches include multitask learning [3], learning to learn [1], curds and whey algorithm [2], clustering learning tasks [17] and learning internal representations [6]. However, most of the previous work on multiple task learning is based on the idea that the tasks to be learnt jointly are somehow “algorithmically related”, in the sense that the results of applying a specific learning algorithm to these tasks are assumed to be similar. A key feature that distinguishes our work from these existing algorithms is that our approach does not learn all target variables, but instead focuses on learning only a limited number of target variables of interest by exploring problem-specific interrelationships among them. In addition, it decomposes the complex problem of predicting multiple target

In the first step, for each test data record, we build a local cluster around that specific data record. The cluster contains the most similar data records from the training data set. Then, in the second step, for all identified similar training data records that belong to



the number of target variables usually causes an increase in the number of data records needed for learning prediction models, as well as an increase in the time needed for building prediction models. These issues need to be successfully addressed when predicting a large number of target variables.

variables from the global data set into a simpler one where particular target variables are predicted using only the relevant set of data records as well as the relevant set of features.

2.1 Problem definition

Given a high-dimensional heterogeneous data set D with a large number of continuous target variables (Table 1), the problem consists of effectively and accurately predicting the real values of the target variables of interest. The typical data layout used in predicting multiple target variables is shown in Table 1, where data set D contains data records d1, d2, … , dN. Each data record di is described with the pair {f, E}, where f = {f1, f2, … , fm} is the feature set, while E = {E1, E2, … , En} is the set of target variables. Each data record di in the data set D pertains to a specific state depending on domain-specific knowledge.

Table 1. A typical input to data mining model for damage detection in mechanical structures.

2. Correlation among target variables. Structure elements in mechanical structures are considered related if they are spatially close to each other or if they are symmetric with respect to some symmetry axis. For example, structure elements Ex and Ey in Figure 1 are correlated since they are spatially close, while structure elements Ex and Ez are correlated due to symmetry. The fact that the target variables are correlated is not properly addressed in many existing algorithms for learning multiple tasks. Traditional methods for transfer of knowledge [1, 9] or multitask learning [3] are usually aimed only at learning a fixed single task.

3. High dimensionality of the problem. The number of natural frequencies and mode shapes for very large and complex mechanical structures may be extremely large (order of thousands). Using a large number of features when building prediction models unnecessarily slows down the learning process. Straightforward feature selection is not applicable for this type of problem since in most cases prediction of individual target variables depends on all features. In mechanical structures, damage in two individual structure elements that are very close or symmetric in 3D space produces two data records (see Table 1) with very similar low natural frequencies, i.e., frequencies with the lowest values (e.g., frequencies f1 to f100), and the only way to determine which element actually failed is to consider higher natural frequencies (e.g., frequencies f600 to f700). Regardless of whether the damaged elements are close or symmetric in 3D space, it is reasonable to group these elements in the same substructure. In this way, the background knowledge about the similarity of spatially close or symmetric elements is embedded into the procedure for sophisticated dimensionality reduction.

4. Heterogeneous data distribution. Since each data record corresponds to a specific damage state of a particular structure element, only the data records generated for that structure element are relevant for predicting the damage in it. In such cases, standard global predictive models trained on the entire set of data records do not have satisfactory prediction performance. In many scenarios involving prediction of multiple target variables, only very small regions of data sets may be responsible for predicting particular target variables or their groups, and building standard global models may only harm the overall prediction performance.

Data records | Features (Frequencies)        | Target variables
             | f1      f2      ...  fm       | E1    E2   ...  En
d1           | 72.833  151.67  ...  213.45   | 0.5E  E    ...  E
d2           | 73.45   152.56  ...  216.65   | 0.6E  E    ...  E
...          | ...     ...     ...  ...      | ...   ...  ...  ...
dN           | 74.01   153.01  ...  214.21   | E     E    ...  0.7E

Since mechanical structures may be represented with a certain number of structure elements (Figure 1), the problem of predicting multiple target variables in damage detection in mechanical structures is reduced to predicting the intensity of the damage in structure elements {E1, E2, … , En} using dynamic characteristics (e.g. natural frequencies) as features {f1, f2, … , fm} (Table 1).
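The data layout of Table 1 can be pictured with a minimal sketch. The class name `DataRecord`, the helper `damaged_element`, and the normalized base modulus `E_BASE` are illustrative assumptions that do not appear in the paper; the sample rows reuse the values printed in Table 1, with the elided middle columns left out.

```python
from dataclasses import dataclass
from typing import List, Optional

E_BASE = 1.0  # assumption: undamaged Young's modulus normalized to 1


@dataclass
class DataRecord:
    """One damage state: natural frequencies (features) and element moduli (targets)."""
    frequencies: List[float]  # f1 .. fm
    moduli: List[float]       # E1 .. En, expressed as fractions of E_BASE


def damaged_element(record: DataRecord, tol: float = 1e-9) -> Optional[int]:
    """Return the 0-based index of the first damaged element, or None if undamaged."""
    for j, e in enumerate(record.moduli):
        if e < E_BASE - tol:
            return j
    return None


# Rows patterned on Table 1 (only the columns printed there are kept).
d1 = DataRecord([72.833, 151.67, 213.45], [0.5, 1.0, 1.0])
d2 = DataRecord([73.45, 152.56, 216.65], [0.6, 1.0, 1.0])
dN = DataRecord([74.01, 153.01, 214.21], [1.0, 1.0, 0.7])
```

Under the single-failure setting used in this paper, each record has exactly one modulus below the base value, so `damaged_element` recovers the target variable of interest for that record.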

Figure 1. Each square in the airplane represents the individual damage detection zone (element). The intensity of the element’s damage needs to be predicted by a global predictive model. (Labeled elements in the figure: Ex, Ey, Ez.)

The existing data mining methods that were recently proposed for learning multiple target variables [12, 16, 18, 20] are not adequate for addressing this specific problem of predicting damage in mechanical structures due to the following challenges: 1.

2.2 Our Previous Work

To address the aforementioned challenges, we have recently proposed a hierarchical partitioning approach [7, 14] to predict the damage in complex mechanical structures. The hierarchical partitioning approach consists of three phases that are applied recursively: (1) partitioning of the data set, (2) localization of groups of interesting target variables, and (3) prediction of the target variables. In the partitioning step, similar data records are first grouped using

Large number of target variables that need to be predicted. Large-scale complex mechanical structures typically have an extremely large number of structure elements (order of thousands) that need to be predicted. Current data mining techniques for learning multiple target variables are designed for predicting a relatively small number of them. Increase in



symmetry of structure elements, but they also provide information about the correlation among structure elements, which is not available in the manual partitioning. Therefore, by using a particular set of features (frequencies), it is possible to group similar target variables (structure elements) that are either close or symmetric in 3D space. In such a way, data records with a similar set of features belong to one group of target variables, while data records that have different features from a specific set of features correspond to different groups. When the identified group needs to be further partitioned, a different and/or additional set of features that brings additional knowledge needs to be considered (e.g., higher frequencies in the damage detection problem) in order to distinguish among similar target variables already identified. For example, a clustering algorithm applied only to the low natural frequencies may identify the regions that contain structure elements that are spatially close or spatially symmetric, since the damage in those structure elements results in very similar low frequencies. In order to further split these regions, there is a need to focus on higher frequencies, since this is the only way to distinguish between two spatially close or symmetric structure elements.

manual splitting or a clustering algorithm, and then corresponding groups of similar target variables are identified. In the localization step, a classification model is used to predict which group of the target variables is of particular interest and needs to be further investigated. If the identified group of target variables still contains a large number of target variables, then the partitioning and localization steps are repeated recursively and the identified group is further split into subgroups with more similar target variables. When the number of target variables per identified subgroup is sufficiently small, the prediction phase takes place and the target variables are predicted using localized prediction models that are built not from all data records in the entire data set, but only from those data records that correspond to the particular subgroup. In the partitioning step of our hierarchical approach [7], we have used two approaches, namely manual partitioning and clustering-based partitioning. The first and simplest method for partitioning target variables into similar groups is to perform manual partitioning of target variables by incorporating heuristics describing the spatial locality of target variables (e.g., locations of target variables) [14]. Figure 2 illustrates manual partitioning of the airplane structure.

The proposed hierarchical clustering based approach for partitioning groups of target variables and predicting their values

Algorithm Partitioning(D, m1)
• Given: set with data records D = {d1, d2, … , dN}, where each di = {f1, … , fm, E1, … , En}, i = 1, …, N.
• Select the set of low natural frequencies {f1, f2, …, fm1}, where m1 0 has to contain at least minimum number of points, i.e.

3. METHODOLOGY

Figure 4. Typical plot of sorted k-distances (horizontal axis: data records; annotations: ε-threshold, ε-neighborhood).

The minimum number of points is specified in the same way as in the original DBSCAN algorithm, and it is set to k+1. When both parameters (ε and the minimum number of points) are set, we simply identify all training data records that are in the ε-neighborhood of a given test data record and assign these data points to a cluster.

In the rare cases when there are no points in the ε-neighborhood of a given test data record, we switch to the nearest neighbor approach and build a cluster using k nearest training data records.
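The cluster-building step, with its fallback to nearest neighbors, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `build_local_cluster` and its parameters are hypothetical, each record is a list of natural frequencies sorted from lowest to highest, distances are Euclidean over the first `n_low` frequencies, and the fallback is triggered only when the ε-neighborhood is empty. The minimum-number-of-points condition is not enforced here.

```python
import math
from typing import List, Sequence


def build_local_cluster(
    test_freqs: Sequence[float],
    train_freqs: Sequence[Sequence[float]],
    eps: float,
    k: int,
    n_low: int,
) -> List[int]:
    """Return indices of the training records that form the cluster around a test record.

    Distances are computed on the first n_low (lowest) natural frequencies only.
    """
    query = test_freqs[:n_low]
    dists = [math.dist(query, row[:n_low]) for row in train_freqs]

    # Primary rule: all training records inside the eps-neighborhood.
    cluster = [i for i, d in enumerate(dists) if d <= eps]
    if cluster:
        return cluster

    # Fallback: the k nearest training records (ties broken by record index).
    order = sorted(range(len(dists)), key=lambda i: (dists[i], i))
    return order[:k]
```

Returning indices rather than records keeps the link to the target variables, so the damaged structure element of each cluster member can be looked up afterwards.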

In the nearest neighbor approach, the cluster of similar data records is simply built from the k nearest data records in the training data set, where the Euclidean distance is computed using only



belong to a built cluster. In the example from Figure 5, structure elements E1 and E3 correspond to 5 training data records, which is 83% (5/6) of the entire cluster (the size of cluster is 6). Our experiments have shown that the failed structure element is always among the structure elements that correspond to the training data records within the discovered cluster. In the case when several structure elements have the same number of corresponding data records, the advantage is given to those structure elements that correspond to the data records that are closer to a given test data record.
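One way to read the selection rule is: rank the structure elements by how many cluster records they account for, then keep adding elements until at least 75% of the cluster is covered. The sketch below follows that reading. The function name `select_elements` is hypothetical, and breaking ties by the single closest record of each element is an assumption about how "closer to a given test data record" is measured.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def select_elements(
    cluster_elements: Sequence[Hashable],
    cluster_distances: Sequence[float],
    coverage: float = 0.75,
) -> List[Hashable]:
    """Pick the structure elements whose records cover >= `coverage` of the cluster.

    cluster_elements[i] is the damaged element of the i-th cluster record and
    cluster_distances[i] is that record's distance to the test record.
    """
    counts: Dict[Hashable, int] = defaultdict(int)
    nearest: Dict[Hashable, float] = {}
    for elem, dist in zip(cluster_elements, cluster_distances):
        counts[elem] += 1
        nearest[elem] = min(dist, nearest.get(elem, float("inf")))

    # Most records first; ties go to the element with the closest record.
    ranked = sorted(counts, key=lambda e: (-counts[e], nearest[e]))

    chosen: List[Hashable] = []
    covered, total = 0, len(cluster_elements)
    for elem in ranked:
        if covered >= coverage * total:
            break
        chosen.append(elem)
        covered += counts[elem]
    return chosen


# Counts as in the Figure 5 example: E3 has 3 records, E1 has 2, En has 1.
# Which record maps to which element, and the distances, are made up here.
elements = ["E3", "E1", "E3", "En", "E1", "E3"]
distances = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
picked = select_elements(elements, distances)
```

With these counts the loop stops after E3 and E1, which together account for 5 of the 6 cluster records (83%).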

3.2 Building localized regression models

When the cluster of training data records similar to a given test data record is identified, the next step is to determine which structure elements correspond to these data records and to predict the failure only in those structure elements. To illustrate the principle of identifying corresponding target variables (structure elements), let us assume that the training data records that are similar to a test data record and belong to the cluster discovered in the first step are presented in shaded rows in Figure 5. Let us also assume that the training data set contains 14 data records

When drawing an analogy between the localized regression approach proposed in this paper and our previously proposed hierarchical partitioning approach, it can be observed that the first step of building a cluster in the localized regression approach basically corresponds to the first step in the hierarchical partitioning approach, when clusters at the first level are identified. However, there are two main distinctions: (1) in the localized regression approach here, there is only one cluster that is built, unlike many clusters in the hierarchical partitioning approach; and (2) the cluster that is built in the localized regression approach around a test data record is typically significantly smaller than the clusters discovered at the first level of partitioning in the hierarchical approach.

It can be observed from Figure 5 that the training data records from a built cluster typically correspond to the failure in more than one structure element. For example, in Figure 5, the built cluster contains training data records d1, d2, d5, d9, d10 and d14 that correspond to the failure in structure elements E1, E3 and En (* denotes the damaged structure elements). These corresponding structure elements are also presented as shaded columns in Figure 5. In general, depending on the size and the quality of the cluster, the number of corresponding damaged structure elements can vary significantly.

Figure 5. Typical training data records obtained by building a cluster around a given test data record. (Layout: rows are the training data records d1 to d14; columns are the features f1, f2, f3, …, fm and the target variables E1, E2, E3, …, En; * marks the damaged structure element of each record.)

Identifying structure elements of interest that need to be predicted in the second step of the localized regression approach corresponds to the further refinement of structure elements identified in the previous step. This step essentially corresponds to the further levels of partitioning in the hierarchical approach and represents further splitting of the cluster identified at the first level of the localized regression approach. When the structure elements of interest are identified, the next step is to build localized regression models for predicting the failure in these structure elements. These localized models are constructed using only the data records that correspond to the failure in a particular structure element in the entire training data set. For example, if we predict the failure in structure elements E1 and E3, then not only data records d1, d2, d9, d10 and d14 will be used in constructing the regression models for these structure elements, but all the data records from the training data set that are generated for different intensities of the failure in the elements E1 and E3.
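A minimal sketch of assembling the training set for one localized model is given below. The function names are hypothetical. It assumes that moduli are stored as fractions of an undamaged base value `e_base`, that frequencies in each record are sorted from lowest to highest, and that the "relevant" features are the `n_high` highest frequencies (with `n_high` at least 1).

```python
from typing import List, Sequence


def records_for_element(
    train_moduli: Sequence[Sequence[float]],
    element: int,
    e_base: float = 1.0,
    tol: float = 1e-9,
) -> List[int]:
    """Indices of ALL training records in which `element` is damaged (E < e_base)."""
    return [i for i, row in enumerate(train_moduli) if row[element] < e_base - tol]


def localized_training_set(train_freqs, train_moduli, element, n_high):
    """Build (X, y) for one element: its n_high highest frequencies -> its modulus."""
    idx = records_for_element(train_moduli, element)
    X = [list(train_freqs[i][-n_high:]) for i in idx]
    y = [train_moduli[i][element] for i in idx]
    return X, y
```

Note that the record indices come from the whole training set, not from the cluster: the cluster only decides which elements get a model.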

Although all structure elements that correspond to data records from the discovered cluster are of particular interest for predicting the intensity of damage, we typically choose to predict the failure in only a few of them. The number of structure elements for which we predict the failure depends on the total number of different structure elements within the cluster. For example, in Figure 5, there are six training data records that correspond to the failure in three structure elements E1, E3 and En. The structure elements that will be predicted are determined according to the largest number of corresponding data records within the cluster. For example, the structure element E1 corresponds to two data records, the structure element E3 corresponds to three data records, while the structure element En corresponds to only one data record. Therefore, since structure elements E1 and E3 have the largest number of corresponding data records within the built cluster, we will predict the failure only in those two structure elements. In general, the structure elements that we choose to predict should correspond to at least 75% of all training data records that

In addition, unlike the first step, when only the lowest natural frequencies were used to build a cluster around a test data record, in this step only the highest natural frequencies are employed to build a regression model for predicting the failure in structure elements, since low natural frequencies do not provide any useful information for distinguishing between similar (spatially close or symmetric) structure elements. The number of the highest natural frequencies is again determined heuristically according to expert background knowledge. As local regression models, we have trained 2-layered feedforward neural network models with the number of hidden neurons equal to the number of input attributes. In order to reduce the number of input attributes considered in the neural network model, variance-based dimensionality reduction through principal component analysis [8] is also employed, such that the newly transformed features retain some predefined part of the variance. We have used three neural network learning algorithms:



The electric transmission tower (Figure 6), studied in [14], has been chosen to demonstrate the effectiveness of our data mining approach in damage detection. The training data set of 6,241 data records, 900 natural frequencies (features) and 312 structure elements (target variables) is generated by failing a random single element by a random amount. This data set corresponds to the scenario of 20 failure states per element (312 elements x 20 = 6,240, plus one data record for the non-damaged mechanical structure). The testing data set was obtained using the same procedure and contained 7,800 data records, which corresponds to 25 different scenarios of failure for each structure element.

resilient propagation [11], conjugate gradient backpropagation with Powell-Beale restarts [10], and Levenberg-Marquardt [5].
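The variance-based dimensionality reduction applied before training the neural networks can be sketched with NumPy. This is an illustration only: the function name `pca_retain_variance` is hypothetical, the 0.95 default is an assumed value for the "predefined part of the variance" (the paper does not state it), and the neural network training itself is not shown.

```python
import numpy as np


def pca_retain_variance(X: np.ndarray, retained: float = 0.95):
    """Project rows of X onto the fewest principal components keeping `retained` variance.

    Returns (Z, mean, components) so that test records can be projected the same way:
    z_test = (x_test - mean) @ components.T
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s ** 2
    ratio = np.cumsum(var) / var.sum()
    # Smallest number of components whose cumulative variance reaches the target.
    n_comp = int(np.searchsorted(ratio, retained)) + 1
    n_comp = min(n_comp, len(s))
    components = Vt[:n_comp]
    return Xc @ components.T, mean, components
```

The number of columns of the returned `Z` is the number of input attributes of the network, and therefore also the number of hidden neurons under the sizing rule used in this paper.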

4. EXPERIMENTS

4.1 Experimental Setup

Predicting damage in mechanical structures using data mining techniques requires the following steps for training data generation:

• Feature Construction: To build the right data mining model it is important to construct a useful set of features that will successfully characterize the damage states, capture the physics of the problem at hand, and be independent of operational loads for a given structure. Since natural frequencies and mode shapes of the mechanical structure meet these criteria, they are selected as useful features. This selection is made due to the following considerations: (i) these quantities can be measured from the actual physical structures, (ii) the natural frequencies represent global behaviors of the structure, while the mode vectors represent the local characteristics of the structure, and (iii) the number of features can be limited to very few low natural frequencies and mode shapes compared to the number of degrees of freedom in the structure. In this study, however, our features are limited to natural frequencies only.

4.2 Experimental Results

We first performed manual partitioning of the electric transmission tower into four legs and a head (Figure 6), and predicted the existence of the damage within these substructures. Then, we also applied the hierarchical clustering approach and partitioned the transmission tower into five sub-structures, shown in Figure 7. The detailed results about the prediction performance of these two approaches are available in our previously published work [7]. Here, we only provide the comparison of these two partitioning approaches to the localized regression method proposed in this paper.

• Data Generation: The data for building prediction models is generated by using a typical finite element analysis code. In the typical data layout shown in Table 1, the feature set f = {f1, f2, … , fm} corresponds to the set of m natural frequencies, while the set of target variables E = {E1, E2, … , En} represents the values of Young's modulus of elasticity for all n finite elements. Each data record di in the data set D pertains to a failure state, where the failure state is simulated by failing a single element in the structure and performing the eigenanalysis of the finite element model. In a more complex scenario, data records may correspond to the failure of more elements, but this analysis is out of the scope of this paper. In our experiments, the elements are failed in steps (e.g., each element is failed by reducing E from the base value of E to E' in steps of ∆E, where ∆ is a small fraction).
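The enumeration of single-element failure states can be sketched as below. The generator name `damage_states` and the 20 stiffness levels are illustrative assumptions. Each yielded state would then be passed to a finite element eigen solver to obtain its natural frequencies, a step that is not shown.

```python
from typing import Iterator, List, Tuple


def damage_states(
    n_elements: int, fractions: List[float]
) -> Iterator[Tuple[int, List[float]]]:
    """Yield (damaged_element_index, moduli) for every single-element failure state.

    Moduli are fractions of the base value E. The first state is the undamaged
    structure, reported with element index -1.
    """
    yield -1, [1.0] * n_elements
    for j in range(n_elements):
        for frac in fractions:
            moduli = [1.0] * n_elements
            moduli[j] = frac  # fail element j to frac * E
            yield j, moduli


# 20 hypothetical stiffness levels per element: 0.96E, 0.92E, ..., 0.20E.
levels = [round(1.0 - 0.04 * s, 2) for s in range(1, 21)]
states = list(damage_states(312, levels))
```

With 312 elements and 20 levels per element, the enumeration produces 312 x 20 + 1 = 6,241 states, matching the size of the training data set used in the experiments.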


Figure 7. Illustrative five sub-structures at the first level of clustering employing the algorithm Partitioning. (Figure is best viewed in color.)

It is important to note that the prediction performance of all methods was measured using the coefficient of determination, defined as R² = 1 - MSE/σ², where MSE is the mean squared error of the predictions and σ is the standard deviation of the target variable. The R² value measures the explained variability of the target variable, where 1 corresponds to a perfect prediction and 0 to a trivial mean predictor. If the prediction model is worse than the mean predictor, the R² value can be less than 0. To alleviate the effect of neural network instability in our experiments, the R² value for each element from the substructures is averaged over 10 trials of the neural network learning algorithm.
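As a small illustration of this measure, the sketch below computes the coefficient of determination from the definition above and averages it over several trials. It assumes that σ² is the population variance of the observed target values; the function names are hypothetical.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - MSE / variance of the target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = np.mean((y_true - y_pred) ** 2)
    return 1.0 - mse / np.var(y_true)  # np.var is the population variance

def mean_r2_over_trials(y_true, predictions_per_trial):
    """Average the score over repeated trials (e.g., 10 neural network runs)."""
    return float(np.mean([r2_score(y_true, p) for p in predictions_per_trial]))

y = [1.0, 2.0, 3.0, 4.0]
perfect = r2_score(y, y)                      # 1.0: perfect prediction
trivial = r2_score(y, [2.5, 2.5, 2.5, 2.5])   # 0.0: mean predictor
poor = r2_score(y, [4.0, 3.0, 2.0, 1.0])      # -3.0: worse than the mean predictor
```

The third case shows how a model that is worse than the mean predictor yields a negative value.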

Figure 6. Three-dimensional model of the electric transmission tower discretized using beam elements (head and legs labeled).

When applying the localized regression method, one of the major issues was selecting the right set of natural frequencies to be used in building a cluster around a test data record. Since the natural frequencies that are easy to measure in practice are usually below 25 Hz, higher frequencies are typically eliminated first. To identify the exact number of frequencies to be used in the first step (building a cluster), we use a heuristic approach that looks for the largest gap between two natural frequencies in the vicinity of some round natural frequency (e.g., the biggest gap around 0.5 Hz or 1 Hz). Thus, the clusters were built using the 202 lowest frequencies, since these frequencies were smaller than 0.5 Hz. Both the nearest neighbor and the density-based approaches were investigated when building a cluster around a test data record. The prediction performance achieved using the density-based approach was slightly better due to the better quality of the clusters built locally around a test data record. The quality of the clusters identified using the density-based approach was better because the training data points around a test data record belong to higher-density regions, which are typically good indicators of valuable clusters. Due to lack of space, only the results obtained using the density-based approach are presented in this paper. When constructing localized regression models, only the highest 200 natural frequencies were used. Again, this number was determined in a manner similar to that explained for the low natural frequencies. The experimental results of predicting the intensity of damage in the elements using the straightforward direct approach, the partitioning approaches (after the first level of partitioning), and the localized regression method are given in Table 2. For the straightforward direct approach, we also ran experiments using standard regression analysis, but the results were worse than or comparable to those obtained by neural networks.

[…] structure elements, as well as the average R²-value of all the elements within particular structures. Since the R²-value for particular elements may be less than 0 in many cases (e.g., -1 or -2 for extremely poor global regression models in the direct approach), the average R²-value for substructures is computed such that models with a negative R²-value are assigned an R²-value of zero. Thus, the exceptionally bad regression models do not negatively influence the accurate ones. This kind of analysis only helps the direct (global) approach and manual partitioning, since regression models built using these two approaches are very often exceptionally poor and have large negative R²-values. On the other hand, this happens rarely for regression models built using the hierarchical clustering approach and never for regression models built using the localized regression method.

It is apparent from Table 2 that both partitioning approaches produce localized regression models that are more accurate than the global regression models used in the direct approach. In addition, the regression models built using the hierarchical clustering approach are in most cases more accurate than the regression models obtained by manual partitioning. Only for approximately 2% of the total number of elements are the models Mj, j = 1, …, 312, obtained from manual partitioning more accurate than the models Rj, j = 1, …, 312, constructed in the hierarchical clustering approach. For example, the prediction of element E145, which belongs to the structure S5, is more accurate when using the localized models obtained through manual partitioning (R² = 0.245) than when using the models obtained in the hierarchical clustering based approach (R² = 0.165).

Table 2 also shows that both partitioning-based approaches are consistently inferior to the prediction models constructed using the localized regression method. Even for the structure elements for which the hierarchical partitioning approach achieves the best prediction performance (except for the structure element E96), the models built by the localized regression method are significantly more accurate. To illustrate the superior performance of the localized regression method, we plot the R² values for all 312 structure elements obtained using the localized regression method and both hierarchical partitioning approaches (Figure 8). Again, when plotting R²-values that are less than 0, these negative R²-values are set to zero.
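The frequency-selection heuristic used in this section (cutting the frequency list at the largest gap near a round frequency such as 0.5 Hz) is only described informally. One possible reading is sketched below. The width of the search window, the function name, and the example frequencies are assumptions made for illustration and do not come from the experiments.

```python
import numpy as np

def cutoff_by_largest_gap(freqs, round_value, window):
    """Number of lowest frequencies to keep, cutting at the widest gap near round_value.

    Only gaps between consecutive frequencies that both lie within
    +/- window of round_value are considered. Returns None if there is no such gap.
    """
    f = np.sort(np.asarray(freqs, dtype=float))
    lo, hi = round_value - window, round_value + window
    best_gap, best_count = -1.0, None
    for i in range(len(f) - 1):
        if f[i] >= lo and f[i + 1] <= hi:
            gap = f[i + 1] - f[i]
            if gap > best_gap:
                best_gap, best_count = gap, i + 1  # keep f[0], ..., f[i]
    return best_count

# Hypothetical frequencies (Hz); the widest gap near 0.5 Hz is between 0.49 and 0.58.
example = [0.10, 0.30, 0.42, 0.46, 0.49, 0.58, 0.60, 0.90]
n_keep = cutoff_by_largest_gap(example, round_value=0.5, window=0.15)  # 5
```

In this toy case the five lowest frequencies are kept, all of which lie below 0.5 Hz, which mirrors how the 202 lowest frequencies were retained for cluster building.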

Table 2. Prediction of damage intensity (given in R² values) for those elements within structures for which the hierarchical clustering approach achieves the worst and the best accuracy.

Columns: Structure; Element; Direct (global) approach; Manual partitioning; Hierarchical clustering approach; Localized regression method

Structure labels: S2, S3, S4, S5

Element rows:
E15 (worst), E96 (best), average*
E241 (worst), E263 (best), average*
E312 (worst), E209 (best), average*
E102 (worst), E2 (best), average*
E207 (worst), E195 (best), average*
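The averaging rule described in the text, in which models with a negative R² value are counted as zero before averaging, can be sketched as follows. The starred average rows of Table 2 presumably follow this rule. The function name and the sample values are hypothetical.

```python
import numpy as np

def clipped_mean_r2(r2_values):
    """Average R^2 over elements, treating negative values as zero."""
    r2 = np.asarray(r2_values, dtype=float)
    return float(np.mean(np.maximum(r2, 0.0)))

# Hypothetical per-element scores for one sub-structure.
scores = [0.8, 0.6, -2.0]
plain = float(np.mean(scores))     # dominated by the single very poor model
clipped = clipped_mean_r2(scores)  # the poor model contributes zero instead
```

With these sample values the plain mean is negative although two of the three models are accurate, while the clipped mean stays positive. This is why the rule mainly benefits approaches that produce many very poor models.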
