Copyright 2004 Society of Photo-Optical Instrumentation Engineers. This paper is made available as an electronic reprint with permission of SPIE. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or commercial purposes, or modification of the content of the paper are prohibited.

A comparison of feature-based classifiers for ultrasonic structural health monitoring

Jennifer E. Michaels*, Adam C. Cobb and Thomas E. Michaels
School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0250

ABSTRACT

Diffuse ultrasonic signals received from ultrasonic sensors that are permanently mounted near, on or in critical structures of complex geometry are very difficult to interpret because multiple modes and reflections interfere constructively and destructively. Both changing environmental and structural conditions affect the ultrasonic wave field, and the resulting changes in the received signals are similar and of the same magnitude. This paper describes a differential feature-based classifier approach to the problem of determining whether a structural change has actually occurred. Classifiers utilizing time and frequency domain features are compared to classifiers based upon time-frequency representations. Experimental data are shown from a metallic specimen subjected to both environmental changes and the introduction of artificial damage. Results show that both types of classifiers are successful in discriminating between environmental and structural changes. Furthermore, classifiers developed for one particular structure were successfully applied to a second one that was created by modifying the first structure. Best results were obtained using a classifier based upon features calculated from time-frequency regions of the spectrogram.

Keywords: Ultrasound, diffuse waves, classifier, spectrogram, damage detection, structural health monitoring

1. INTRODUCTION

Active ultrasonic sensors offer many advantages over other proposed sensors for monitoring the health of a structure because of their potential for global measurements and the well-known sensitivity of high frequency elastic waves to both damage and microstructural changes. Waves generated by one transducer and received by the same or a different transducer convey information about the entire material volume through which the wave has propagated. The problem is reliably and quantitatively correlating the ultrasonic response to actual structural conditions, particularly in the face of environmental effects such as temperature and surface condition changes.

Conventional ultrasonic nondestructive evaluation (NDE) is a mature technology that has been successful in detecting both defects and bulk material properties such as porosity, inclusions and distributed damage. However, the basic measurement is essentially made at a point, and motion of the transducer is required, often with expensive scanning equipment, to obtain volumetric information. An additional difficulty with ultrasonic NDE, particularly for material property measurements, is obtaining repeatable results. There is much variation in the coupling of the transducers to the structure being inspected, in the positioning of the transducers, and in the transducers themselves. The approach taken for most ultrasonic NDE methods is generally not suitable for an in-situ structural health monitoring (SHM) system due to the need for either moving the transducer, or using many transducers, in order to get wide area coverage.

The ideal scenario for an ultrasonic SHM system would be to permanently mount a small number of sensors on, in or near the structure, and either continuously or intermittently activate them in order to monitor the ultrasonic response [1]. Existing research has concentrated primarily on generating guided waves (i.e. Rayleigh or Lamb waves) that can travel long distances with little dispersion and are relatively simple to analyze [2,3]. An alternative approach, often referred to as the acousto-ultrasonic technique, has been investigated primarily for NDE [4]. This method requires exciting an ultrasonic source that generates a diffuse wave field and monitoring both the frequency-dependent decay rate of the diffuse field and features of the power spectrum that are typically called shape parameters.

*[email protected]

Health Monitoring and Smart Nondestructive Evaluation of Structural and Biological Systems III, edited by Tribikram Kundu, Proc. of SPIE Vol. 5394 (SPIE, Bellingham, WA, 2004) · 0277-786X/04/$15 · doi: 10.1117/12.540087


The guided wave method has been successful for both NDE and SHM for structures with a plate-like or cylindrical geometry that support propagation of these modes. The diffuse wave approach has had limited success as the received signals are very complex and parameters being monitored do not always correlate to the material properties of interest. There is one advantage of an SHM system over an NDE system -- having permanently mounted ultrasonic sensors. This advantage eliminates the significant problem of lack of repeatability of transducers and coupling conditions. The work presented here utilizes quantitative differential signal processing methods to explicitly take advantage of the fixed nature of the sensors and structures. Other related applications, such as processing of speech, sonar, radar and biomedical signals, do not have this advantage, and thus suitable differential methods have not been developed in these fields.

2. EXPERIMENTAL PROCEDURE

The specimen utilized for this work is a 6061 aluminum plate, 50.8 mm x 152.4 mm x 4.76 mm (2” x 6” x 3/16”), as previously reported [5] and illustrated in Figure 1(a). This specimen, although simple in both material and geometry, is of interest because reflections from the boundaries play a major role in the recorded signals and the geometry is not suitable for the use of guided waves. Two epoxy-backed piezoelectric transducers were attached to the top surface of the specimen as shown in Figure 1(b). These transducers were constructed by bonding 12.5 mm diameter, longitudinally polarized, 2.25 MHz PZT disks to a 0.127 mm (0.005”) thick brass plate, and were attached to the specimen using cyanoacrylate adhesive. A conventional ultrasonic pulser/receiver (Panametrics 5072PR) was used for spike mode transducer excitation and waveform amplification. Waveforms were digitized with a sampling rate of 12.5 MHz and a resolution of 8 bits, and each recorded waveform was the average of 50 signals.

Figure 1. (a) Drawing and (b) photograph of aluminum specimen used for all measurements.

The specimen was first subjected to environmental changes consisting of temperatures ranging from 9°C to 38°C (48°F to 100°F) and varying surface conditions. Surface conditions were changed by both wetting the top surface and placing a small, oil-coupled aluminum block at various positions. The changes due to wetting the plate were not repeatable because of the difficulty in applying a consistent pattern of water, but the changes due to the oil-coupled block were generally consistent and thus could be repeated in conjunction with the introduction of flaws.

Artificial flaws were introduced into the plate by first drilling a single 1.98 mm (5/64”) diameter hole, and subsequently enlarging it to a final diameter of 6.35 mm (1/4”). A second hole was later drilled in the same manner, as can be seen in Figure 1(b). Ultrasonic signals were recorded after each incremental change in diameter, and environmental changes were also applied. Table 1 provides a summary of all 275 recorded signals; note that there is significant redundancy in order to assess measurement repeatability.

The major experimental difficulty in developing differential classifiers for damage detection is that once a specimen is damaged, it can, in general, no longer be used for differential measurements. For this study, this difficulty was mitigated somewhat by using holes for defects, and by drilling a second hole. The purpose of the second hole was not to introduce additional damage, but to consider the specimen with the first hole as, in essence, a “new” specimen, and the second hole as the introduction of damage into this new specimen. One issue addressed as part of this study is whether or not a classifier developed for one specimen can be applied to another. This issue is very important because it is of course not practical to introduce damage in order to train a classifier to predict damage.


Table 1. Summary of recorded signals.

Signals    Description
1-27       Undamaged specimen, temperature varied from 9°C (48°F) to 38°C (100°F)
28-33      Undamaged specimen, temperature varied from 31°C (88°F) to 26°C (78°F)
34-53      Undamaged specimen, 23°C (73°F)
54-73      Undamaged specimen, 22°C (72°F)
74-94      Undamaged specimen, temperature varied from 10°C (50°F) to 32°C (90°F)
95-129     Undamaged specimen, heated and surface wetted, then cooling off and drying
139-149    Undamaged specimen, 24°C (75°F)
150-168    Undamaged specimen, varied surface conditions via contact with oiled block
169-219    Hole #1, drilled in increments combined with various environmental effects
220-239    Specimen with hole #1, 23°C (73°F)
240-263    Specimen with hole #1, varied surface conditions
264-275    Hole #2, drilled in increments

Figure 2 illustrates typical signals from the specimen as follows: (a) undamaged and nominal environmental conditions (room temperature and dry, free surface), (b) undamaged at 32°C (90°F), and (c) 4.76 mm (3/16”) diameter hole and nominal environmental conditions. For all of the signals, digitization began 10 µsec after transmit, and the total time window was 1000 µsec. This window includes most of the energy of the received signal; a longer window does not yield significant additional information because the signal amplitude is lower than the noise level of the 8 bit digitizer.

Figure 2. Recorded waveforms from (a) the undamaged specimen at room temperature, (b) the undamaged specimen at 32°C (90°F), and (c) the specimen after introduction of damage.

Note that the signal window of 1000 µsec is equivalent to a longitudinal wave traveling approximately 6,350 mm (250 inches) in aluminum, which is over 40 times the length of the specimen. For reference, the calculated first arrival of the longitudinal wave is 16.3 µsec, the first shear arrival is 32.4 µsec, and the Rayleigh wave arrival is at 34.8 µsec. These arrivals are consistent with the recorded waveforms, and no attempt was made to identify additional arrivals due to the large number of combinations of edge reflections. The general trend and appearance of the three signals in Figure 2 are very similar, as is true for all of the recorded signals, although subtle differences are clearly evident. There are no obvious characteristics of the signals that distinguish between environmental and structural conditions.


3. ANALYSIS METHODS

Due to the complexity of the recorded signals and the subtlety of the changes, a differential feature-based approach to analysis was taken, in contrast to the usual approach of calculating time and frequency domain features from a single signal [6]. The interpretation here of “differential” is that all methods and features explicitly incorporate comparison to a reference signal, and that if there are no changes from this reference, the condition of the structure is classified as unchanged [5]. Although this approach may seem obvious, quantitatively incorporating it into the signal processing steps required careful consideration.

In preparation for development and evaluation of a feature-based classifier, four sets of labeled data were constructed, referred to as the training, evaluation, comprehensive and new data sets. Consistent with the differential approach, each member of these four data sets consists of two waveforms referred to as the signal and the baseline. For N waveforms, there are N² − N different ways of defining signal-baseline pairs; for the 275 recorded waveforms there are thus 75,350 possible combinations. Many of these combinations are not logical (e.g. using a signal from a drilled hole as a reference), but there is clearly a rich set of data from which to select signal pairs.

The training data set consisted of 50 signal pairs, of which 32 were due to environmental changes (class 1) and 18 to structural changes (class 2). The environmental conditions included both temperature changes and surface condition variations, and the structural changes consisted of various sizes of hole #1 combined with environmental effects. The evaluation data set was constructed in a similar manner but with different signal pairs. The comprehensive data set consisted of 132 signal/baseline pairs where the baseline was the undamaged specimen with various environmental conditions and all of the flaw signals were from hole #1. None of the 132 pairs were in the training and evaluation data sets, and many of the conditions were not representative of the training and evaluation data. The new data set consisted of 46 signal/baseline pairs where the baseline was the specimen containing hole #1 drilled to its full diameter of 6.35 mm (1/4”), and the structural changes consisted of hole #2 combined with environmental effects. The purpose of the new data set is to determine if a classifier developed for one specimen is effective for a different, albeit similar, specimen.

Several different methodologies were used to calculate features, as described in subsequent sections. For all cases, linear classifiers were implemented with a neural network consisting of a single hard-limit perceptron [7]. The input to the classifier is a set of N features, and the output is a zero for the first class (no damage) and a one for the second class (damage). The training data set was used to construct the classifier, and the evaluation data set was used for verification. The combined performance on the training and evaluation data sets was the basis for retaining a classifier and its associated features for further consideration. The comprehensive data set was used to assess classifier performance for data recorded from the base specimen. As stated previously, the new data set was used to assess performance on what is in essence a different structure.
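The paper uses a single hard-limit perceptron as the linear classifier. The sketch below shows how such a classifier might be trained and applied; the synthetic feature data, learning rate, and epoch count are illustrative assumptions, not values from the paper.

```python
import numpy as np

def train_perceptron(X, y, epochs=100, lr=0.1):
    """Train a single hard-limit perceptron: output = step(w.x + b)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            if pred != yi:
                # Standard perceptron update on each misclassified sample
                w += lr * (yi - pred) * xi
                b += lr * (yi - pred)
                errors += 1
        if errors == 0:          # converged: all training pairs correct
            break
    return w, b

def classify(X, w, b):
    """Hard-limit output: 0 = no damage (class 1), 1 = damage (class 2)."""
    return (X @ w + b > 0).astype(int)

# Toy, linearly separable feature vectors mimicking the 32 environmental
# and 18 structural training pairs (values are invented for illustration).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, size=(32, 2)),
               rng.normal(1.0, 0.1, size=(18, 2))])
y = np.array([0] * 32 + [1] * 18)

w, b = train_perceptron(X, y)
print((classify(X, w, b) == y).mean())   # 1.0 on separable training data
```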

3.1 Time and Frequency Domain Features

The first approach taken to classification, similar to previously reported results [5], was to define and calculate a set of differential features in both the time domain and the frequency domain. The essence of the differential approach is that all of the features are defined relative to a reference, or baseline, signal. In addition, all features are defined so that they are identically zero if the signal is the same as the baseline. For example, instead of using the parameter of “Peak Amplitude”, the feature would be defined as the difference of the peak amplitudes scaled by the baseline peak amplitude (Feature = [Asignal − Abaseline]/Abaseline). Table 2 summarizes the 24 features that were calculated.

One measure of a single feature’s ability to separate two classes is the Fisher Discriminant Ratio [7],

    FDR = (µ1 − µ2)² / (σ1² + σ2²)                                    (1)

where µ1 and µ2 are the means and σ1² and σ2² are the variances of the two classes (environmental and structural, respectively). The FDR is large for a specific feature if the classes have a large difference in means combined with small variances. Table 3 shows the FDRs for the 24 features sorted in descending order. It can be seen that feature 24 has the largest FDR, and thus is most effective in separating the classes.
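To make the differential-feature and FDR definitions concrete, the sketch below computes a plausible implementation of the differential energy feature (feature 1 in Table 2) over synthetic signal pairs and evaluates its FDR via equation (1). The signals, perturbation levels, and exact feature definition are invented for illustration and are not taken from the paper.

```python
import numpy as np

def differential_energy(sig, base):
    """Differential energy: scaled so identical signals give exactly 0."""
    e_s, e_b = np.sum(sig ** 2), np.sum(base ** 2)
    return (e_s - e_b) / e_b

def fisher_discriminant_ratio(f1, f2):
    """Equation (1): (mu1 - mu2)^2 / (sigma1^2 + sigma2^2)."""
    return (np.mean(f1) - np.mean(f2)) ** 2 / (np.var(f1) + np.var(f2))

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1e-3, 1000)
base = np.sin(2 * np.pi * 5e3 * t) * np.exp(-3e3 * t)  # toy decaying baseline

# Hypothetical signal pairs: environmental changes perturb the energy
# slightly, structural changes perturb it more strongly.
env_features = [differential_energy(base * (1 + rng.normal(0.0, 0.01)), base)
                for _ in range(32)]
struct_features = [differential_energy(base * (1 + rng.normal(0.1, 0.01)), base)
                   for _ in range(18)]

print(differential_energy(base, base))                       # 0.0
print(fisher_discriminant_ratio(env_features, struct_features))
```

A feature with a large FDR, such as this well-separated toy example, is a good candidate for the classifier input set.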


Table 2. Summary of features used for classification of environmental vs. structural changes.

Feature      Description
1            Differential energy
2            Differential amplitude
3            Differential center of energy -- time domain
4            Differential cross correlation peak amplitude
5            Differential curve length
6,7,8        Differential time domain 1st, 2nd and 3rd moments
9,10         Differential energy in time domain, +/-5% and +/-10% of reference time center
11           Energy of difference between normalized envelopes
12,13,14     1st, 2nd and 3rd moments of absolute value of difference between normalized envelopes
15           Differential center frequency
16           Normalized inner product of amplitude spectrum with reference spectrum (minus one)
17,18,19     Differential frequency domain 1st, 2nd and 3rd moments
20,21        Differential energy in frequency domain, +/-5% and +/-10% of reference frequency center
22,23        Slope and error of local time shift vs. time as determined by Short Time Cross Correlation
24           Normalized Short Time Cross Correlation peak at T=910 µsec (minus one)

Table 3. Fisher discriminant ratios for time and frequency domain features.

Feature   FDR       Feature   FDR       Feature   FDR
24        3.4332    6         0.2213    19        0.0604
23        1.2227    13        0.2204    10        0.0476
22        0.8108    8         0.2128    20        0.0421
21        0.6319    14        0.1796    16        0.0222
1         0.5424    5         0.1256    3         0.0189
11        0.4421    17        0.1245    9         0.0092
12        0.2915    18        0.0857    15        0.0007
7         0.2308    4         0.0828    2         0.0000

3.2 Spectrogram Features

The second method of calculating differential features is based upon the spectrogram, a time-frequency representation of a time signal. The spectrogram is the magnitude of the Short Time Fourier Transform, which is calculated by taking multiple discrete Fourier transforms using a sliding time window [8]. The result is a two dimensional representation of the signal where time is the horizontal axis and frequency is the vertical axis, as shown in Figure 3 for the time signal of Figure 2(a). For this figure, the grey scale palette is inverted so that large amplitudes are dark. For the data shown here, there are a total of 180 time pixels over 900 µsec, and 1025 frequency pixels over 6.25 MHz, for a pixel resolution of 45 µsec x 6.1 kHz.

Rectangular regions were defined in the time-frequency plane in order to calculate features. The feature F associated with a particular region is defined as

    F = (Esignal − Ebaseline) / Ebaseline                             (2)

where E is the energy of the signal (or baseline) in that region as computed from the spectrogram. Note that the value of the feature is zero if the energy in the region is unchanged.
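A minimal sketch of equation (2), assuming a simple Hann-windowed short-time Fourier transform; the window length, hop size, and test signal below are illustrative choices, not the paper's processing parameters.

```python
import numpy as np

def spectrogram(x, win_len=256, hop=128):
    """Magnitude-squared STFT: rows = frequency bins, columns = time frames."""
    win = np.hanning(win_len)
    frames = np.array([x[i:i + win_len] * win
                       for i in range(0, len(x) - win_len + 1, hop)])
    return (np.abs(np.fft.rfft(frames, axis=1)) ** 2).T

def region_feature(S_sig, S_base, f_slice, t_slice):
    """Equation (2): relative energy change in one time-frequency region."""
    e_s = S_sig[f_slice, t_slice].sum()
    e_b = S_base[f_slice, t_slice].sum()
    return (e_s - e_b) / e_b

fs = 12.5e6                                   # 12.5 MHz sampling rate, as in the paper
t = np.arange(0, 1e-4, 1 / fs)
base = np.sin(2 * np.pi * 1e6 * t) * np.exp(-2e4 * t)   # toy decaying signal
S_b = spectrogram(base)

# Identical signal and baseline: the region feature is identically zero.
print(region_feature(S_b, S_b, slice(0, 50), slice(0, 5)))   # 0.0
```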


Figure 3. Spectrogram of the signal from the undamaged specimen under nominal environmental conditions.

The difficulty in region selection is that there are a virtually unlimited number of ways to segment all or part of the time-frequency plane. Considered here are two methods for selecting regions: (1) manually identifying areas of interest by observing the spectrograms, and (2) automatically tiling the time-frequency plane. For the first method, it was noted that there are horizontal bands of energy in the spectrogram. A total of ten bands were delineated with nine time divisions per band for a total of 90 regions, as shown in Figure 4(a). For the automated method, it was first noted that there is little significant energy above 2.8 MHz. Thus, the overall area of interest was from 0 to 900 µsec, and 0 to 2.8 MHz. Regions were defined by repeated N x N tilings of this area, with N ranging from 1 to 20. Note that N=1 corresponds to one region encompassing the entire area of interest, N=2 is four regions, and so forth up to the 400 regions corresponding to N=20. There are a total of 2,870 regions, as computed by 1² + 2² + 3² + … + 20² = 2,870. For N=20, the region size is 45 µsec by 170 kHz; Figure 4(b) shows these 400 regions superimposed on the spectrogram.
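The N x N tiling scheme can be sketched as follows; representing each region as a (t0, t1, f0, f1) tuple is an assumption made for illustration.

```python
def tile_regions(t_max=900.0, f_max=2.8e6, n_max=20):
    """Enumerate the N x N tilings of the 0-900 us, 0-2.8 MHz area, N = 1..20."""
    regions = []
    for n in range(1, n_max + 1):
        dt, df = t_max / n, f_max / n
        for i in range(n):
            for j in range(n):
                # (t0, t1, f0, f1) corners of one rectangular region
                regions.append((i * dt, (i + 1) * dt, j * df, (j + 1) * df))
    return regions

regions = tile_regions()
print(len(regions))   # 1^2 + 2^2 + ... + 20^2 = 2870
```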


Figure 4. Time-frequency regions for feature calculation; (a) all of the manually selected regions, and (b) 400 of the automatically selected regions for N=20.


Since there are too many features for detailed analysis, a reduced number were chosen for further consideration. For the work presented here, the regions chosen manually do not overlap, but regions computed with the automated algorithm for different values of N could overlap. Features from overlapping regions tend to be highly correlated and thus do not add significant additional information; therefore overlapping regions were not permitted. A total of 24 features were selected by picking those with the highest FDRs from non-overlapping regions. This number of features is arbitrary and was chosen so that there were the same number of spectrogram features as time and frequency domain features. Tables 4 and 5 show FDR values for the 24 best manually and automatically selected features, respectively.

Table 4. Fisher discriminant ratios for spectrogram features based upon manually selected regions.

Feature   FDR       Feature   FDR       Feature   FDR
40        10.3121   30        0.9168    8         0.4769
23        1.8880    47        0.8512    19        0.4493
17        1.8309    16        0.6388    15        0.4349
26        1.6419    42        0.6365    53        0.4059
32        1.1513    39        0.6130    50        0.3979
56        0.9752    41        0.5067    45        0.3816
37        0.9534    28        0.5026    22        0.3546
64        0.9367    2         0.4945    67        0.2920

Table 5. Fisher discriminant ratios for spectrogram features based upon automatically selected regions.

Feature   FDR       Feature   FDR       Feature   FDR
1         9.9258    9         1.7528    17        1.2652
2         8.2295    10        1.7316    18        1.2397
3         2.7710    11        1.6433    19        1.1787
4         2.6481    12        1.6354    20        1.1689
5         2.2034    13        1.5302    21        1.1636
6         2.0570    14        1.3336    22        1.1545
7         2.0317    15        1.2905    23        1.1374
8         1.7798    16        1.2818    24        1.1085
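The selection of non-overlapping, high-FDR regions described above can be sketched as a greedy pass over FDR-sorted candidates. The paper does not specify its exact selection algorithm, and the candidate regions and FDR values below are hypothetical.

```python
def overlaps(r1, r2):
    """True if two (t0, t1, f0, f1) rectangles intersect."""
    return not (r1[1] <= r2[0] or r2[1] <= r1[0] or
                r1[3] <= r2[2] or r2[3] <= r1[2])

def select_top_features(scored_regions, k=24):
    """Greedily keep the k highest-FDR regions that overlap no kept region."""
    kept = []
    for fdr, region in sorted(scored_regions, reverse=True):
        if all(not overlaps(region, r) for _, r in kept):
            kept.append((fdr, region))
        if len(kept) == k:
            break
    return kept

# Hypothetical scored regions: the two highest-FDR candidates overlap,
# so only one of them survives the selection.
candidates = [
    (9.9, (300, 400, 2e5, 5e5)),
    (8.2, (350, 450, 3e5, 6e5)),   # overlaps the region above -> discarded
    (2.7, (0, 100, 1.5e6, 1.7e6)),
]
print([fdr for fdr, _ in select_top_features(candidates)])   # [9.9, 2.7]
```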

4. RESULTS

4.1 Classifiers Based on Time and Frequency Domain Features

The FDR is useful for evaluating the effectiveness of a single feature, but combinations of features have the potential to be more effective than single features. With a total of 24 features, there are 16,777,215 (2²⁴ − 1) different combinations of features that could be used to form a classifier, too many to evaluate empirically. It can be seen from the FDRs shown in Table 3 that feature 24 is much better than the rest. In order to evaluate a manageable number of classifiers, the 276 combinations of two and three features that included feature 24 were evaluated using the training data set to create the classifier and the evaluation data set to verify performance. There were no classifiers with perfect performance for both data sets, but there were 13 that correctly classified at least 95 of the 100 signal pairs. Twelve features in addition to feature 24 were present in these 13 combinations: 1, 2, 6, 7, 8, 9, 10, 17, 18, 19, 20 and 22. It was noted that features 6, 7 and 8 were highly correlated, as were features 17, 18 and 19; only those with the highest FDRs were kept, resulting in nine remaining features: 1, 2, 7, 9, 10, 18, 20, 22 and 24. All combinations of these features were evaluated using the training and evaluation data, and there were eight combinations that had no classification errors for the 100 signal pairs in the two data sets. The one with the fewest features was considered to be the preferred classifier because of its simplicity; it consisted of features 1, 2, 9 and 24.
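The 276 combinations follow from fixing the best feature and adding one or two of the remaining 23 (C(23,1) + C(23,2) = 23 + 253 = 276). A sketch of the enumeration:

```python
from itertools import combinations

def candidate_combos(n_features=24, best=24, max_extra=2):
    """All combinations that pair the best feature with up to max_extra others."""
    others = [f for f in range(1, n_features + 1) if f != best]
    combos = []
    for r in range(1, max_extra + 1):     # one or two features added to the best
        for extra in combinations(others, r):
            combos.append((best,) + extra)
    return combos

combos = candidate_combos()
print(len(combos))   # C(23,1) + C(23,2) = 276
```

Each combination would then be used to train a perceptron on the training data and be scored on the evaluation data.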


This four-feature classifier was then used to classify data in the comprehensive and new data sets. For the comprehensive data set there were seven misses and three false alarms, and for the new data set there were five misses and three false alarms, for an overall correct classification rate of 89.9% for the combined data sets. Confusion matrices summarizing these results are shown in Figures 5(a) and (b). Note that properly classified “No Flaw” signal pairs are in the upper left corner, properly classified “Flaw” signal pairs are in the lower right, false alarms (“No Flaw” classified as “Flaw”) are in the upper right, and misses (“Flaw” classified as “No Flaw”) are in the lower left.

In an attempt to improve performance, the outputs of the eight classifiers with perfect performance on the training and evaluation data were combined using a voting scheme whereby the overall output was set to that of the majority of the classifiers. In the case of a tie, the output was considered to be a flaw. Results are slightly improved in terms of the number of misses, as shown by the confusion matrices in Figures 5(c) and (d), with an overall correct classification rate of 91.0%.

Four Feature Classifier
(a) Comprehensive Data
              Classified No Flaw   Classified Flaw
   No Flaw    (86) 96.7%           (3) 3.4%
   Flaw       (7) 16.3%            (36) 83.7%
(b) New Data
              Classified No Flaw   Classified Flaw
   No Flaw    (21) 87.5%           (3) 12.5%
   Flaw       (5) 22.7%            (17) 77.3%

Combined Classifier
(c) Comprehensive Data
              Classified No Flaw   Classified Flaw
   No Flaw    (85) 95.5%           (4) 4.5%
   Flaw       (6) 14.0%            (37) 86.0%
(d) New Data
              Classified No Flaw   Classified Flaw
   No Flaw    (21) 87.5%           (3) 12.5%
   Flaw       (3) 13.6%            (19) 86.4%

Figure 5. Confusion matrices for classifiers determined from time and frequency domain features.
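The majority-vote combination rule, with ties resolved as a flaw, can be sketched as:

```python
def majority_vote(outputs):
    """Combine binary classifier outputs; ties are resolved as 'flaw' (1)."""
    ones = sum(outputs)
    zeros = len(outputs) - ones
    return 1 if ones >= zeros else 0

# Eight classifiers voting on one signal pair
print(majority_vote([1, 1, 1, 0, 0, 1, 1, 0]))   # 5 vs 3 -> 1 (flaw)
print(majority_vote([1, 0, 1, 0]))               # 2 vs 2 tie -> 1 (flaw)
```

Resolving ties toward "flaw" biases the combined classifier against misses at the cost of more false alarms, consistent with the improvement in misses reported above.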

4.2 Classifiers Based on Spectrogram Features

The 24 best manually selected features were evaluated as described in section 4.1, whereby 276 combinations of two and three features were considered, always including the feature with the highest FDR. Of these combinations, 36 classified data in both the training and evaluation data sets with no errors. This situation presents a difficulty in selecting a classifier for further evaluation, as all 36 classifiers performed perfectly. In an effort to determine the most significant features, the ones most used in these 36 classifiers were tabulated; the top four features were 40, 64, 22 and 47, with 22 and 47 being used the same number of times. Feature 22 was discarded since it had a lower FDR than feature 47. The classifier consisting of features 40, 64 and 47 was then used to classify data in the comprehensive and new data sets. For the comprehensive data set there were four misses and eleven false alarms, and for the new data set there were three misses and one false alarm, for an overall correct classification rate of 89.3%. Confusion matrices summarizing these results are shown in Figures 6(a) and (b).

Since there were so many classifiers that performed perfectly on the training and evaluation data sets, even with just two and three features, there was no clear method for predicting which one of them would perform the best on the comprehensive and new data. Therefore an attempt was made to find the so-called “best” classifier by trial and error, which after a modest effort resulted in the classifier consisting of features 2, 23, 40, 56 and 64. Results were considerably improved for the comprehensive data set, and are similar for the new data set, as can be seen in the confusion matrices of Figures 6(c) and (d); the overall correct classification rate is 95.5%.

Three Feature Classifier
(a) Comprehensive Data
              Classified No Flaw   Classified Flaw
   No Flaw    (78) 87.6%           (11) 12.4%
   Flaw       (4) 9.3%             (39) 90.7%
(b) New Data
              Classified No Flaw   Classified Flaw
   No Flaw    (23) 95.8%           (1) 4.2%
   Flaw       (3) 13.6%            (19) 86.4%

“Best” Classifier
(c) Comprehensive Data
              Classified No Flaw   Classified Flaw
   No Flaw    (88) 98.9%           (1) 1.1%
   Flaw       (2) 4.7%             (41) 95.3%
(d) New Data
              Classified No Flaw   Classified Flaw
   No Flaw    (21) 87.5%           (3) 12.5%
   Flaw       (2) 9.1%             (20) 90.9%

Figure 6. Confusion matrices for classifiers determined from manually selected spectrogram regions.


The same procedure for identifying the best features and classifiers was followed using features calculated from the 2,870 automatically generated regions. The best 24 were selected from non-overlapping regions based upon the FDR values. Due to the large number of features, they were renumbered from 1 to 24 in order of decreasing FDR, with feature 1 being the best. A total of 276 combinations of these 24 features were evaluated as classifiers, with the best feature (#1) being combined with one or two additional features. Of these combinations, 28 classified data in both the training and evaluation data sets with no errors. The features most used in these 28 classifiers were tabulated; the top three were 1, 2, and 9. The classifier consisting of these three features was then applied to the data in the comprehensive and new data sets. For the comprehensive data set the performance was excellent, with only one false alarm and no misses. Results for the new data set, however, were very poor, with 19 misses (but no false alarms). The overall correct classification rate is 88.8%. Confusion matrices summarizing these results are shown in Figures 7(a) and (b).

As was done for the manually selected regions, an attempt was made to find the so-called “best” classifier by trial and error. This process resulted in the classifier consisting of features 1, 4, 5, 8, 14, 22 and 24. As shown in Figures 7(c) and (d), results are the same for the comprehensive data set with only one false alarm, and are considerably improved for the new data set, where there are only two misses and two false alarms. The overall correct classification rate is 97.2%, with only five incorrectly classified signal pairs out of 178.

Three Feature Classifier
(a) Comprehensive Data
              Classified No Flaw   Classified Flaw
   No Flaw    (88) 98.9%           (1) 1.1%
   Flaw       (0) 0%               (43) 100%
(b) New Data
              Classified No Flaw   Classified Flaw
   No Flaw    (24) 100%            (0) 0%
   Flaw       (19) 86.4%           (3) 13.6%

“Best” Classifier
(c) Comprehensive Data
              Classified No Flaw   Classified Flaw
   No Flaw    (88) 98.9%           (1) 1.1%
   Flaw       (0) 0%               (43) 100%
(d) New Data
              Classified No Flaw   Classified Flaw
   No Flaw    (22) 91.7%           (2) 8.3%
   Flaw       (2) 9.1%             (20) 90.9%

Figure 7. Confusion matrices for classifiers determined from automatically selected spectrogram regions.

5. DISCUSSION OF RESULTS

The two classifiers created from time and frequency domain parameters performed similarly, as shown via the confusion matrices of Figure 5. Not surprisingly, performance was slightly better for the comprehensive data set than the new data set. Of interest are the features that were incorporated in these classifiers, particularly features 1, 2, 9 and 24; refer to Table 2 for a description of all features. Note that features 1 and 2 are global measures of the difference in energy and amplitude, respectively, between the signal and the baseline. Feature 9 is a measure of the shift in energy in the time domain within +/-5% of the center of energy in time. Feature 24 is a measure of the local coherence of the signal with the baseline near the end of the recorded signal [5]. The significance of these parameters is that none of them depend upon the time or frequency domain details of the signals, which means it is more likely that classifiers based on these features will be generally applicable to different structures and types of damage.

Somewhat better performance was achieved with the addition of features 7, 10, 18, 20 and 22. Feature 7 is the change in the time domain second moment, feature 10 is a measure of the change in energy in the time domain within +/-10% of the time center of energy, feature 18 is the change in the second moment of the amplitude spectrum, feature 20 is a measure of the change in energy in the frequency domain within +/-5% of the frequency center of energy, and feature 22 is a measure of time shift as a function of time. Note that features 7 and 18 are both related to how the signal decays as a function of time; these features are similar to the acousto-ultrasonic shape parameters employed by others [9].
Feature 10 conveys similar information to feature 9, although over a broader time window, and feature 22 is related to temperature change since it is well known that changes in temperature result in a time-dependent time shift of a diffuse ultrasonic signal [10]. These results will serve as important guidelines for future classifier designs, allowing reduced sets of features to be considered.

The spectrogram-based features were selected for use in classifiers in two steps: (1) sorting by FDR, and (2) sorting by performance on the training and evaluation data. Figure 8 shows the best 24 of the manually selected regions plotted on top of the spectrogram of Figure 3. Figure 8(a) shows regions below 500 kHz, and Figure 8(b) shows regions above 500 kHz. All regions used in one of the classifiers summarized in Figure 6 (i.e., 2, 23, 40, 47, 56 and 64) are shown with a bold, italicized font. Figure 9 is the corresponding plot for the automatically selected areas; the entire time-frequency plane is shown in one view. Note that regions 1, 2, 4, 5, 8, 9, 14, 22 and 24 are shown with a bold, italicized font, indicating that they were used in the classifiers summarized in Figure 7.

(b)

Figure 8. Best 24 regions for the manually selected areas; (a) frequencies below 500 kHz, and (b) frequencies above 500 kHz. Regions used in classifiers are shown in bold and italics.

Figure 9. Best 24 regions for the automatically selected areas; regions used in classifiers are shown in bold and italics.
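The first step of the two-step selection procedure, sorting candidate features by FDR, can be sketched as follows. The standard two-class Fisher discriminant ratio is assumed here; step (2), sorting by classifier performance on the training and evaluation data, is omitted.

```python
import numpy as np

def fisher_discriminant_ratio(x_class1, x_class2):
    """Two-class FDR for one scalar feature: separation of the class means
    relative to the within-class variances."""
    m1, m2 = np.mean(x_class1), np.mean(x_class2)
    v1, v2 = np.var(x_class1), np.var(x_class2)
    return (m1 - m2)**2 / (v1 + v2)

def rank_features(F_damage, F_benign):
    """Rank feature columns by FDR, best first. Rows are observations,
    columns are candidate features (e.g., time-frequency regions)."""
    fdr = np.array([fisher_discriminant_ratio(F_damage[:, j], F_benign[:, j])
                    for j in range(F_damage.shape[1])])
    order = np.argsort(fdr)[::-1]
    return order, fdr
```

A feature that cleanly separates structural changes from environmental changes receives a large FDR and is ranked first, which is the sense in which the "best 24 regions" above were identified.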

It is interesting to compare the regions determined by the manual and automated region selection methods. The best performing region for each method is in the same general area of the time-frequency plane: between 300 and 400 µsec, and 200 and 500 kHz. Since the FDR is higher for the manually selected region, its shape probably more closely encompasses the true area of maximum sensitivity. It is also of interest that the frequency of this region is low compared to the transducer frequency of 2.25 MHz. It is clear from the spectrogram that the lower frequencies are dominant, indicating that the structure is selectively attenuating the higher frequencies. Another area of the time-frequency plane with region overlap between the methods is between 0 and 100 µsec and from 1.5 to 1.7 MHz. It is not surprising that the higher frequency features are at shorter times due to the more rapid attenuation of the high frequency content. There are also overlapping regions of significance in the area between 100 and 200 µsec and 600 and 800 kHz that are important for several of the classifiers.
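A region feature of the kind compared here can be sketched as the change in spectrogram energy within one time-frequency rectangle. The STFT parameters and the baseline-normalized form below are illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

def stft_power(x, fs, nperseg=256, noverlap=192):
    """Magnitude-squared STFT (a basic spectrogram) using only numpy."""
    step = nperseg - noverlap
    starts = np.arange(0, len(x) - nperseg + 1, step)
    win = np.hanning(nperseg)
    S = np.array([np.abs(np.fft.rfft(win * x[s:s + nperseg]))**2
                  for s in starts]).T          # shape (freq, time)
    f = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    t = (starts + nperseg / 2) / fs
    return f, t, S

def region_energy_change(x, baseline, fs, t_lim, f_lim):
    """Fractional change in spectrogram energy within one time-frequency
    region, relative to the baseline region energy."""
    def region_energy(y):
        f, t, S = stft_power(y, fs)
        rows = (f >= f_lim[0]) & (f <= f_lim[1])
        cols = (t >= t_lim[0]) & (t <= t_lim[1])
        return S[np.ix_(rows, cols)].sum()
    e0 = region_energy(baseline)
    return (region_energy(x) - e0) / e0
```

With these assumed parameters, the best-performing region identified above would correspond roughly to t_lim=(300e-6, 400e-6) and f_lim=(200e3, 500e3).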


Proc. of SPIE Vol. 5394

It is perhaps most interesting to compare the “best” classifiers that were found for both methods. The best classifier for the manually selected regions, consisting of features 2, 23, 40, 56 and 64, is heavily weighted with shorter time and higher frequency information. For the automatically selected regions, the best classifier depends primarily upon lower frequency regions distributed throughout the entire time window. Both of these classifiers, although dependent upon very different regions in the time-frequency plane, had excellent performance on both the comprehensive data set and the new data set. One observation is that all regions in the best classifiers had significant energy in the nominal baseline signal; regions with low energy were generally not found to be useful as classifiers. This observation holds for both the manually and automatically selected regions. The biggest problem encountered, for which a solution has not yet been found, is how to predict which classifiers will perform well on the new data, which, as previously discussed, is essentially from a different structure. Most of the classifiers had reasonable performance on the comprehensive data set, but there was little correlation between performance on the training and evaluation data sets and performance on the new data set. In general, the best classifiers based upon spectrogram features performed better than those based upon time and frequency domain features, although it was more straightforward to select a classifier using time and frequency domain features.
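The performance comparisons in this section reduce to confusion-matrix quantities: false alarm rate, miss rate, and overall correct classification rate. A minimal tally, assuming the convention that class 1 denotes a structural change:

```python
import numpy as np

def classification_rates(y_true, y_pred):
    """False-alarm rate, miss rate, and overall correct-classification rate
    for a binary decision (1 = structural change, 0 = environmental change
    only; the labeling convention is an assumption)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    n_neg = max(int(np.sum(y_true == 0)), 1)
    n_pos = max(int(np.sum(y_true == 1)), 1)
    false_alarm = np.sum((y_true == 0) & (y_pred == 1)) / n_neg
    miss = np.sum((y_true == 1) & (y_pred == 0)) / n_pos
    correct = float(np.mean(y_true == y_pred))
    return false_alarm, miss, correct
```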

6. SUMMARY AND CONCLUSIONS

Classifiers based upon differential features of diffuse ultrasonic signals were shown to be successful in discriminating between environmental and structural changes introduced in a small aluminum plate. Classifiers were constructed from time and frequency domain parameters, as well as from spectrogram parameters; both performed well, although the best classifiers utilizing spectrogram parameters had superior performance in terms of false alarms, misses, and overall correct classification rate. A methodology for automatically selecting regions in the time-frequency plane was introduced and was shown to produce classifiers that outperformed those created by manually selecting regions. Furthermore, the classifiers were shown to be effective for a structure for which they were not designed -- the same aluminum plate but with a hole drilled in it. This extension is noteworthy because the introduction of the hole significantly changed the details of the diffuse ultrasonic signals. Also, this extension meant that the signals from the original specimen with a flaw became the baseline signals for the modified specimen. There is one significant problem for which an acceptable solution has not yet been found -- an automatic method for determining which classifiers are extendable to new specimens. Part of this problem is that the classifiers performed so well on the base specimen that there was no aspect of the training and evaluation data that could be used to provide additional selectivity. One important outcome of this study is a better understanding of the classifiers that performed well; this information will be useful for future work. Only linearly separable classifiers, implemented as neural networks with a single hard-limit perceptron, were considered here, primarily because they worked well and the representational power of the classifier was therefore not an issue.
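The classifier architecture just described, a single hard-limit perceptron operating on normalized differential features, can be sketched as follows. The perceptron learning rule and parameters below are illustrative assumptions, not the authors' training procedure.

```python
import numpy as np

def normalize(F):
    """Zero-mean, unit-standard-deviation normalization over the combined
    training/evaluation feature matrix; stats are returned for reuse."""
    mu, sd = F.mean(axis=0), F.std(axis=0)
    return (F - mu) / sd, mu, sd

def train_perceptron(X, y, epochs=100, lr=0.1):
    """Single hard-limit perceptron trained with the classical perceptron
    learning rule (converges for linearly separable data)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            w += lr * (yi - pred) * xi
            b += lr * (yi - pred)
    return w, b

def classify(X, w, b):
    """Hard-limit decision: 1 = structural change, 0 = no change."""
    return (X @ w + b > 0).astype(int)
```

Normalization statistics computed on the training and evaluation data would be reused unchanged when classifying signals from a new structure.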
It was observed that the final classifier was very dependent upon the normalization of the data; for this study, all features were normalized to zero mean and unit standard deviation within the combined training and evaluation data sets. Also considered but not reported were multi-layer perceptron (MLP) neural networks. Various numbers of perceptrons and hidden layers were considered, as well as different types of perceptrons. Although some of these classifiers performed well, they were not used because there was no convergence to a unique solution for the weights and biases; the final network was dependent upon the starting values for network training. Multiple classifiers with the same architecture met very stringent stopping criteria but resulted in widely varying classifier performance. Anticipated future work will concentrate on the following:

• Additional experimental work on extensibility of classifiers to different structures.
• Consideration of additional artificial defects such as notches.
• Application to structures with introduced actual damage (e.g., cracks).
• Investigation of other classifier architectures such as support vector machines.
• Modeling of the effect of environmental and structural changes on diffuse ultrasonic waves.


REFERENCES

1. T.E. Michaels and J.E. Michaels, "Sparse Ultrasonic Transducer Array for Structural Health Monitoring," in Review of Progress in QNDE, Vol. 23, eds. D.O. Thompson and D.E. Chimenti, American Institute of Physics, in press, expected 2004.
2. J.L. Rose, "Standing on the Shoulders of Giants: An Example of Guided Wave Inspection," Materials Evaluation, 60, 53-59, 2002.
3. M.J.S. Lowe, D.N. Alleyne, and P. Cawley, "Defect detection in pipes using guided waves," Ultrasonics, 36, 147-154, 1998.
4. A. Vary, "Acousto-Ultrasonic Characterization of Fiber Reinforced Composites," Materials Evaluation, 40, 650-654, 1982.
5. J.E. Michaels and T.E. Michaels, "Ultrasonic Signal Processing for Structural Health Monitoring," in Review of Progress in QNDE, Vol. 23, eds. D.O. Thompson and D.E. Chimenti, American Institute of Physics, in press, expected 2004.
6. T.J. Case and R.C. Waag, "Flaw Identification from Time and Frequency Features of Ultrasonic Waveforms," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 43, 592-600, 1996.
7. R. Schalkoff, Pattern Recognition, 89-104, John Wiley & Sons, Inc., New York, 1992.
8. S. Mitra, Digital Signal Processing: A Computer-Based Approach, 767-771, McGraw-Hill, New York, 2001.
9. A.L. Gyekenyesi, L.M. Harmon and H.E. Kautz, "The Effect of Experimental Conditions on Acousto-Ultrasonic Repeatability," Proceedings of the 2002 SPIE Conference on Nondestructive Evaluation and Health Monitoring of Aerospace Materials and Civil Structures, 177-186, 2002.
10. R. Weaver and O. Lobkis, "Temperature dependence of the diffuse field phase," Ultrasonics, 38, 491-494, 2000.
