19th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007
ULTRASONIC SIGNAL PROCESSING FOR ARCHAEOLOGICAL CERAMIC CLASSIFICATION
PACS: 43.35.Zc
Salazar, Addisson1; Parra, Angela2; Vergara, Luis1; Doménech, María Teresa2; Serrano, Arturo1
1 Universidad Politécnica de Valencia, Instituto de Telecomunicación y Aplicaciones Multimedia, Valencia, Spain; [email protected]
2 Universidad Politécnica de Valencia, Instituto de Restauración del Patrimonio, Valencia, Spain; [email protected]
ABSTRACT
This paper presents a new procedure for classifying archaeological ceramics using ultrasonic nondestructive testing. The procedure offers important advantages over traditional methods of archaeological ceramic characterization, such as chemical or thermoluminescence analyses, which are expensive and/or destructive. It consists of three stages: measuring with the through-transmission technique, extracting features from the measured ultrasonic signals, and classifying the feature set into classes corresponding to historic or protohistoric periods. The procedure was tested on several archaeological ceramic pieces from three deposits in the Comunidad Valenciana, Spain. The pieces belonged to different periods: Bronze Age, Iberian, Roman and Middle Age. Results are included that compare the proposed classification procedure, based on mixtures of independent component analyzers, with other classification techniques such as k-nearest neighbour and linear discriminant analysis. The results were verified against information from microscope and porosity analyses.
INTRODUCTION
Mixture models of independent component analyzers (ICA) [1] were introduced in [2], considering a source model that switches between Laplacian and bimodal densities. More recently this model has been relaxed using generalised exponential sources [3], self-similar areas as mixtures of Gaussian sub-features [4], and sources with non-Gaussian structures recovered by a learning algorithm based on Beta divergence [5]. Real applications of those works include the separation of eye-movement artefacts from EEG recordings, the separation of 'background' brain tissue, fluids and tumours in fMRI images, and the separation of voices and background music in conversations.
The contribution of this paper is the introduction of ICA mixture models in ultrasonic non-destructive testing of materials, particularly in classifying archaeological pottery sherds from different ages. It is assumed that materials from different ages have different physical properties. The standardization of an efficient and non-destructive method such as ultrasonic testing for material characterization could become an important contribution for archaeologists: several current techniques for the characterization and dating of archaeological ceramics, such as thermoluminescence, chemical methods, and thin-section microscopy, are destructive and expensive [6].
The procedure applied to separate the classes (archaeological ages) in the ICA mixtures incorporates a versatile scheme with the following features: (i) no assumptions about the pdf of the sources, (ii) a learning algorithm that maximizes a cost function using gradient ascent, (iii) supervised-unsupervised learning of the model parameters, and (iv) flexibility to use any ICA algorithm in the iterative learning process. The archaeological pottery pieces come from three deposits in eastern Spain: Requena, Lliria, and Enguera.
A total of 480 pieces were available from the Bronze Age, Iberian, Roman, and Middle Age periods; they were measured by ultrasonic testing in through-transmission mode (using two transducers: one emitter and one receiver) [7]. From the recorded signals the following features were extracted: wave propagation velocity, principal and centroid frequencies, amplitude and attenuation of the principal frequency, power, total attenuation and attenuation curve initial value of the signal, and non-linear parameters (third-order autocovariance, time-reversibility, and instantaneous centroid frequency). Definitions of the features are given in Table 1. The signal features were preprocessed with PCA, and an iterative process of classifying with a variable number of components was applied, using LDA (Linear Discriminant Analysis) and kNN (k-Nearest Neighbors) as classifiers to determine the best number of PCA components [8]. The selected components were used as input for the ICA mixture model learning procedure, using a non-parametric ICA (MIXCA), JADE, TDSEP, and fastICA [9-11] to compute the increments of the model parameters; several supervision ratios were tested [12-14]. Classification results for up to 100 Monte Carlo trials are presented.
ICA MIXTURE ALGORITHM
The observation vectors $x_k$ are modelled as the result of applying a linear transformation $A_k$ to a vector $s_k$ (sources), whose elements are independent random variables, plus a bias vector $b_k$; thus $x_k = A_k s_k + b_k$, and the sources are recovered with $W_k = A_k^{-1}$. The algorithm is based on maximizing the log-likelihood; that is, the following cost function is minimized,
$$ -L(X/\Psi) = -\log p(X/\Psi) = -\sum_{n=1}^{N} \log p\left(x^{(n)}/\Psi\right) \qquad (1) $$
where $\Psi$ is a compact notation for all the unknown parameters $W_k$, $b_k$ for all the classes $C_k$ $(k = 1 \ldots K)$. In a mixture model, the probability of every available feature vector can be separated into the contributions due to every class. The training of the data proceeds with the following algorithm:
0. Initialize $i = 0$, $W_k(0)$, $b_k(0)$.
1. Compute $s_k^{(n)}(i) = W_k(i)\left(x^{(n)} - b_k(i)\right)$, $k = 1 \ldots K$, $n = 1 \ldots N$.
2. Directly use $p(C_k/x^{(n)}, \Psi(i)) = p(C_k/x^{(n)}, \Psi)$ for those $k$-$n$ pairs with knowledge about $p(C_k/x^{(n)}, \Psi)$. Compute
$$ p(C_k/x^{(n)}, \Psi(i)) = \frac{\left|\det W_k(i)\right| \, p\left(s_k^{(n)}(i)\right)}{\sum_{k'=1}^{K} \left|\det W_{k'}(i)\right| \, p\left(s_{k'}^{(n)}(i)\right)} $$
for the rest of the $k$-$n$ pairs. Use
$$ p\left(s_{km}^{(n)}\right) = a \sum_{n' \neq n} e^{-\frac{1}{2}\left(\frac{s_{km}^{(n)} - s_{km}^{(n')}}{h}\right)^2}, \quad m = 1 \ldots M, \; k = 1 \ldots K $$
to estimate the marginals $p\left(s_k^{(n)}(i)\right)$.
3. Use the selected ICA algorithm to compute the increments $\Delta_{ICA}^{(n)} W_k(i)$ corresponding to the observation $x^{(n)}$, $n = 1, \ldots, N$, that would be applied in $W_k(i)$ in an "isolated" learning of class $C_k$. Compute the total increment by
$$ \Delta W_k(i) = \sum_{n=1}^{N} \Delta_{ICA}^{(n)} W_k(i) \cdot p(C_k/x^{(n)}, \Psi(i)) $$
Update $W_k(i+1) = W_k(i) + \alpha \cdot \Delta W_k(i)$, $k = 1 \ldots K$.
4. Compute $\Delta b_k(i)$ using
$$ \Delta b_k(i) = \sum_{n=1}^{N} \left[ -\operatorname{diag}\left[ f\left(s_k^{(n)}\right) \right] w_{km}(i) \right] \cdot p(C_k/x^{(n)}, \Psi(i)), \quad k = 1 \ldots K $$
Use
$$ f\left(s_{km}^{(n)}\right) = \frac{1}{h^2} \left[ \frac{\sum_{n' \neq n} s_{km}^{(n')} \, e^{-\frac{1}{2}\left(\frac{s_{km}^{(n)} - s_{km}^{(n')}}{h}\right)^2}}{\sum_{n' \neq n} e^{-\frac{1}{2}\left(\frac{s_{km}^{(n)} - s_{km}^{(n')}}{h}\right)^2}} - s_{km}^{(n)} \right] $$
to estimate $f\left(s_k^{(n)}\right)$. Update $b_k(i+1) = b_k(i) + \beta \cdot \Delta b_k(i)$, $k = 1 \ldots K$, or simply re-estimate
$$ b_k(i+1) = \frac{\sum_{n=1}^{N} x^{(n)} \, p(C_k/x^{(n)}, \Psi(i))}{\sum_{n=1}^{N} p(C_k/x^{(n)}, \Psi(i))}, \quad k = 1 \ldots K. $$
5. Go back to step 1, with the new values $W_k(i+1)$, $b_k(i+1)$ and $i \rightarrow i + 1$.
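The kernel-based quantities used in steps 2 and 4 can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the normalising constant a is omitted (the density is returned unnormalised), and `reestimate_bias` covers only the "simply re-estimate" option of step 4 for a single class.

```python
import math

def kernel_density(samples, n, h):
    """Unnormalised leave-one-out Gaussian-kernel estimate of p(s^(n)).

    `samples` holds the values of one source component over all
    observations; the constant `a` of the text is left out here.
    """
    s_n = samples[n]
    return sum(math.exp(-0.5 * ((s_n - s) / h) ** 2)
               for j, s in enumerate(samples) if j != n)

def score(samples, n, h):
    """Score f(s^(n)): kernel-weighted mean of the other samples minus s^(n), over h^2."""
    s_n = samples[n]
    others = [s for j, s in enumerate(samples) if j != n]
    weights = [math.exp(-0.5 * ((s_n - s) / h) ** 2) for s in others]
    weighted_mean = sum(w * s for w, s in zip(weights, others)) / sum(weights)
    return (weighted_mean - s_n) / h ** 2

def reestimate_bias(observations, class_posteriors):
    """Posterior-weighted mean of the observations for one class."""
    total = sum(class_posteriors)
    dim = len(observations[0])
    return [sum(p * x[d] for p, x in zip(class_posteriors, observations)) / total
            for d in range(dim)]
```

For a sample lying at the centre of a symmetric set the score is zero, and for a sample at the edge it points back towards the bulk of the data; a very small h can make all kernel weights underflow, which this sketch does not guard against.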
Then, the estimated parameters of the mixture model are used to allocate classes to the testing dataset. Given an observation $x$, the class assigned is the one with the maximum value of the following expression,
$$ p(C_k/x) = \frac{\left|\det A_k^{-1}\right| \, p(s_k)}{\sum_{k'=1}^{K} \left|\det A_{k'}^{-1}\right| \, p(s_{k'})} \qquad (2) $$
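The class allocation of Equation (2) can be illustrated with a small self-contained sketch. Several things here are assumptions made only for illustration: each class model is given as a pair (W, b) with W standing for the inverse of A_k, the source density is a product of unit Laplacian marginals (the paper estimates the densities non-parametrically instead), and all function names are hypothetical.

```python
import math

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def laplacian_pdf(s):
    # Illustrative marginal only; not the density estimate used in the paper.
    return 0.5 * math.exp(-abs(s))

def posteriors(x, models, source_pdf=laplacian_pdf):
    """Posterior p(C_k / x) for every class model (W_k, b_k), as in Eq. (2)."""
    scores = []
    for W, b in models:
        centred = [xi - bi for xi, bi in zip(x, b)]
        s = [sum(w * c for w, c in zip(row, centred)) for row in W]
        scores.append(abs(det(W)) * math.prod(source_pdf(v) for v in s))
    total = sum(scores)
    return [v / total for v in scores]

def classify(x, models):
    """Index of the class with the largest posterior."""
    post = posteriors(x, models)
    return max(range(len(post)), key=post.__getitem__)
```

With two identity-matrix models centred at (0, 0) and (5, 5), an observation near the origin is allocated to the first class and one near (5, 5) to the second.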
EXPERIMENT SETUP
The equipment setup used in the experiments was the following: (1) Ultrasound equipment setup: ultrasound system (Matec PR5000), transducers (2.25 MHz Krautkramer), pulse width (4 µs), pulse amplitude (100%), no analog filters, excitation signal (tone burst 1.050 MHz), operation mode (through transmission), amplifier gain (43 dB); (2) Acquisition equipment setup: equipment (Handyscope HS-3), sampling frequency (100 MHz), sample number (10000), observation time (1 ms), vertical resolution (16 bits), dynamic range (100 mV/division), average (16 acquisitions), PC connection (USB). Figure 1 shows a diagram of the equipment.
[Figure 1 diagram labels: control; data acquisition equipment; ultrasound equipment; ultrasound transducers; digital data transmission (USB bus); data acquisition setup (RS232 bus); analog data transmission (coaxial cable, BNC connector); storage box (cables, sensors, connectors, etc.); mobilization]
Figure 1 – Equipment setup
Distribution of the pieces was: 47 Bronze Age, 155 Iberian, 138 Roman, and 140 Middle Age pieces. They were measured using a rubber adaptor to match the acoustic impedance of the transducer to that of the piece. This adaptor has several advantages: better coupling to the curved surface of the material, harmlessness to the piece, and ease of automation; see Figure 2.
[Figure 2 diagram labels: emitter; adaptor; ceramic sherd; adaptor; receiver; connections to ultrasound system]
Figure 2 – Detail of the measurement by ultrasound
[Figure 3: waveforms labelled Bronze age (1), Iberian period (2), Roman period (3) and Middle age period (4); horizontal axis ticks from 1000 to 10000]
Figure 3 – Collected signal waveforms
Some of the waveforms of the collected signals are shown in Figure 3. Note that the time of flight of the ultrasonic waves is different for these pieces.
FEATURE EXTRACTION AND PCA
The features extracted from the signals are listed in Table 1.
Propagation velocity: $v = \dfrac{\text{piece thickness}}{\text{ultrasound time of flight}}$
Centroid frequency: $f_c = \dfrac{\int_{f_1}^{f_2} f \cdot X(f) \, df}{\int_{f_1}^{f_2} X(f) \, df}$
Principal frequency: $f_{max} \, / \, X(f_{max}) \geq X(f) \;\; \forall f$
Principal frequency amplitude: $X(f_{max})$
Principal frequency attenuation (dB): $x_i(t) = TF^{-1}\left\{ X(f) \cdot \text{Notch filter}, f_i(f) \right\}$, $\hat{x}_i(t) = A e^{-\beta_i t} \Rightarrow At\_FReson = \beta_i$
Total signal attenuation: $\hat{x}(t) = A e^{-\beta t} \Rightarrow Aten\_Total = \beta$
Attenuation curve initial value (dB): $\hat{x}(t) = A e^{-\beta t}$, $P_o = 10 \log(A)$
Signal power: $P\_Total = \dfrac{\int_0^T x^2(t) \, dt}{T}$
Third-order autocovariance: $x(t) \cdot x(t-1) \cdot x(t-2)$
Time-reversibility: $\dfrac{1}{\sigma_x^3} \left( \dfrac{dx(t)}{dt} \right)^3$
Instantaneous centroid frequency: $f_c(t = t_0)$
Table 1 – Ultrasound signal features
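As an illustration of how some of the Table 1 features could be computed from a sampled record, the following sketch covers propagation velocity, centroid frequency, principal frequency and signal power. It rests on several assumptions that are not taken from the paper: X(f) is read as the magnitude spectrum, a naive DFT is used to keep the example dependency-free, power is approximated as the mean squared sample, and all function names are hypothetical.

```python
import cmath
import math

def magnitude_spectrum(x, fs):
    """Naive DFT magnitudes for the non-negative frequency bins (O(N^2))."""
    n = len(x)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        mags.append(abs(acc))
    return freqs, mags

def centroid_frequency(freqs, mags, f1, f2):
    """Magnitude-weighted mean frequency inside the band [f1, f2]."""
    band = [(f, m) for f, m in zip(freqs, mags) if f1 <= f <= f2]
    return sum(f * m for f, m in band) / sum(m for _, m in band)

def principal_frequency(freqs, mags):
    """Frequency of the largest spectral magnitude, and that magnitude."""
    k = max(range(len(mags)), key=mags.__getitem__)
    return freqs[k], mags[k]

def signal_power(x):
    """Mean squared sample, a discrete stand-in for the power integral."""
    return sum(v * v for v in x) / len(x)

def propagation_velocity(thickness_m, time_of_flight_s):
    """Piece thickness divided by ultrasound time of flight."""
    return thickness_m / time_of_flight_s
```

For a pure sinusoid spanning a whole number of periods, the centroid and principal frequencies coincide with the tone frequency and the power equals half the squared amplitude, which gives a quick sanity check before applying the functions to measured signals.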
In order to reduce dimensionality, the features were preprocessed with PCA. To determine the best number of components, an iterative procedure was carried out, varying the number of components and classifying with LDA and kNN. The classification procedure divided the records into two datasets, 75% of the data for training and 25% for testing, and was repeated for 100 Monte Carlo trials. The relatively small size of the dataset may cause a false-minima problem: for small sample sizes there exist rotations (instances of W) for which portions of the data spuriously approximately align. To address this problem, an augmented dataset was processed in the training stage, adding 3 replicates of the original samples with additive spherical Gaussian noise [15]. Figure 4 shows the results of the component-number selection. The parameter k of kNN was 10. The lowest classification error (0.273) was obtained for 6 components and the LDA algorithm using quadratic distance. For this case, the percentage of success per class was: Bronze (87), Iberian (84.5), Roman (47.3), and Middle Age (80).
[Figure 4: classification error (0.25 to 0.65) versus number of components (1 to 11) for LDA-linear, LDA-Mahalanobis, LDA-quadratic and kNN]
Figure 4 – Component selection and results for LDA and kNN
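The data handling described above (a 75%/25% split followed by augmenting the training set with noisy replicates) might look roughly as follows. This is a sketch under stated assumptions: records are taken to be (features, label) pairs, the noise standard deviation `sigma` is a made-up value since the paper does not report it, and the function names are hypothetical.

```python
import random

def split_train_test(records, train_fraction=0.75, rng=random):
    """Shuffle a copy of the records and cut it into training and test parts."""
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def augment_with_noise(records, replicates=3, sigma=0.05, rng=random):
    """Return the originals followed by noisy replicates.

    The same standard deviation is used in every dimension, i.e. the
    additive Gaussian noise is spherical; `sigma` is an assumed value.
    """
    augmented = list(records)
    for _ in range(replicates):
        for features, label in records:
            noisy = [v + rng.gauss(0.0, sigma) for v in features]
            augmented.append((noisy, label))
    return augmented
```

In a Monte Carlo experiment the two functions would be called once per trial with a freshly seeded `random.Random` instance, so that each trial draws a different split and different noise.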
RESULTS AND DISCUSSION
Figure 5 shows the scatter plot of the first three components with the estimated parameters (basis vectors and bias terms) obtained during training for each of the four classes, depicted on the data. Figure 5 has been rotated to highlight a point of view from which the data of the different classes are discernible; however, there are zones where the data are crowded together, making it difficult to separate samples of different classes. For this case the ICA mixture algorithm was able to find the right direction and location of the parameters using non-parametric ICA. Figure 6 shows the evolution of the log-likelihood through the iterations of the ICA mixca algorithm for the case of Figure 5. The curve shows ascending behaviour: as the number of iterations increases, the adjusted parameters fit the observed data and the log-likelihood is maximized. The algorithm converged successfully at the 158th iteration. In addition, Figure 6 shows the gradient ascent technique overcoming two zones of local maxima. Thus the tuning of parameter α, in step 3 of the ICA mixture algorithm of Section 2, is critical for convergence.
[Figure 5 legend: Bronze (+, green), Iberian (o, blue), Roman (*, yellow), Middle age (x, cyan)]
Figure 5 – 3D scatter plot of the data in the component subspace with the ICA mixture parameters outlined
[Figure 6 axes: log-likelihood (-4000 to 6000) versus iteration number (0 to 160)]
Figure 6 – Data log-likelihood as a function of the number of iterations of the ICA mixture algorithm
We have verified through Monte Carlo simulations that the convergence of the algorithm depends on the overlapping areas of the classes, the parameter initialization, and the supervision ratio. The convergence rate was high; most of the non-convergence cases, in which the algorithm became stuck, corresponded to unsupervised classification. However, after some labelling of the data the algorithm converged. This should not be a problem when enough historical data are available. Moreover, the gradient method gave a good compromise between convergence speed and computational load. In addition, the selection of the smoothing parameter h was critical in determining the kernel for the non-parametric estimation of the mixture parameters.
The average classification success percentages of the ICA mixture classifications, using different ICA algorithms to estimate the increments of the model parameters and two different supervision ratios, are shown in Figures 7 and 8. The supervision ratio is the proportion of labelled data among the training data. We had a total of 480 × 0.75 = 360 original samples for training. Three noisy replicas of the original samples were added to obtain a total of 1440 training samples. Thus, for a supervision ratio of 0.7, we had 1008 labelled and 432 unlabelled samples in the training stage. The best classification performance for the ICA mixture algorithm was obtained using a non-parametric ICA algorithm (MIXCA). For a supervision ratio of 0.5 the average percentages of success were: MIXCA = 71%, JADE = 65%, TDSEP = 65.5%, and fastICA = 60.4%. The Mixca results with this limited supervision in training are comparable with the best results of LDA with quadratic distance (72.7%). For a supervision ratio of 0.7 the results are much better for Mixca (80.42%); those for JADE (72.3%) are comparable with LDA with quadratic distance; and those for TDSEP (67.8%) and fastICA (66.7%) become better than LDA with linear and Mahalanobis distances and kNN.
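The training-set bookkeeping in this paragraph can be reproduced with a short calculation. The sketch assumes that the supervision ratio is the labelled share of the whole augmented training set, which is the reading consistent with 1008 labelled samples out of 1440; the function name is hypothetical and the unlabelled count is simply the remainder.

```python
def training_counts(total_pieces=480, train_fraction=0.75,
                    replicates=3, supervision_ratio=0.7):
    """Return (original training samples, augmented training samples,
    labelled samples, unlabelled samples)."""
    originals = round(total_pieces * train_fraction)
    training = originals * (1 + replicates)
    labelled = round(training * supervision_ratio)
    return originals, training, labelled, training - labelled

if __name__ == "__main__":
    for ratio in (0.5, 0.7):
        print(ratio, training_counts(supervision_ratio=ratio))
```

Changing `supervision_ratio` shows how the balance between labelled and unlabelled samples shifts while the size of the augmented training set stays fixed.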
Table 2 contains the confusion matrix obtained by Mixca with a 0.7 supervision ratio. The Roman category is not very difficult to classify, but it is often assigned to pieces of the Bronze and Iberian ages. The Middle Age category is assigned to 11% of the Iberian pieces. Thus the Roman and Middle Age categories account for most of the misclassification of Bronze and Iberian pieces. The good average percentage of success obtained (80.42%) indicates a good match between the 6D component space projected from the calculated features and an ICA mixture model. Some of the pieces had been treated with consolidation products for conservation, but this did not seem to affect their classification by archaeological age. In addition, there was prior knowledge about the correlation of ceramic age with material porosity: Bronze and Iberian (high porosity), Middle Age (medium porosity), and Roman (low porosity). Thus the correct classification of the sherds indicates that ultrasonic signal features can measure changes in physical properties, such as porosity, of the archaeological ceramics.
              Bronze   Iberian   Roman   Middle age
Bronze        0.71     0         0       0
Iberian       0        0.74      0.07    0.02
Roman         0.21     0.15      0.83    0.1
Middle age    0.07     0.11      0.1     0.88
Table 2 – Confusion matrix by Mixca with 0.7 supervision ratio. Values are proportions (1 = 100%)
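A few lines of code make it easier to read Table 2. The sketch below assumes that columns correspond to the actual age and rows to the assigned age (each column then sums to roughly 1), and it weights the diagonal by the overall number of pieces per age (47, 155, 138, 140), which is only an approximation of the class proportions in the test sets. Under those assumptions the weighted success is about 0.80, close to the reported 80.42%.

```python
CLASSES = ["Bronze", "Iberian", "Roman", "Middle age"]

# Table 2 values; rows assumed to be assigned age, columns actual age.
CONFUSION = [
    [0.71, 0.00, 0.00, 0.00],
    [0.00, 0.74, 0.07, 0.02],
    [0.21, 0.15, 0.83, 0.10],
    [0.07, 0.11, 0.10, 0.88],
]

# Number of pieces per age in the whole collection.
PIECES = [47, 155, 138, 140]

def per_class_success(confusion):
    """Diagonal of the confusion matrix: success rate of each class."""
    return [confusion[k][k] for k in range(len(confusion))]

def weighted_success(confusion, counts):
    """Average success rate weighted by the number of pieces per class."""
    diagonal = per_class_success(confusion)
    return sum(d * c for d, c in zip(diagonal, counts)) / sum(counts)

def column_sums(confusion):
    """Sum of each column; close to 1 if columns are actual classes."""
    return [sum(row[j] for row in confusion) for j in range(len(confusion[0]))]

if __name__ == "__main__":
    print(dict(zip(CLASSES, per_class_success(CONFUSION))))
    print(round(weighted_success(CONFUSION, PIECES), 4))
    print([round(s, 2) for s in column_sums(CONFUSION)])
```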
[Figures 7 and 8: % success (0 to 1) per age (Bronze, Iberian, Roman, Middle) for mixca, jade, tdsep and fastica]
Figure 7 – Results of ICA algorithms 0.5 supervision ratio
Figure 8 – Results of ICA algorithms 0.7 supervision ratio
CHEMICAL AND PHYSICAL ANALYSES
To complement the ultrasound measurements, a variety of morphological and physicochemical characterizations were carried out using conventional instrumental techniques, so that their results could be contrasted with those above. 25 representative pieces were selected for chemical and physical analyses. Those pieces were observed, photographed and analyzed by means of optical microscopy, scanning electron microscopy (SEM) and X-ray diffraction. Physical analyses to evaluate density and open porosity were also applied to the selected pieces. Preparation of the samples for the SEM analysis required a laborious protocol. Test specimens were made by embedding small stratigraphic fragments in polyester resin, with cold mounting and short hardening times. The hardened samples were then polished with wet sandpaper, eroding the resin until the stratigraphic section of the fragment appeared on one of the faces, allowing observation and surface characterization by scanning electron microscopy. Additional eroding was required to obtain a polished transparent surface, with parallel faces and similar height and width, in order to ease observation by SEM. Some of the test specimens are shown in Figure 9.
Figure 9 – Test specimens prepared for SEM
After preparation, the test specimens were photographed under the optical microscope to obtain supplementary information such as texture and colour, and to identify areas of interest in the fragments. Making, observing, measuring, and processing the data of the 25 test specimens took approximately 20 days. The physical analyses to obtain density and open porosity data required many weighings of the fragments over a week. For X-ray diffraction analysis, small fragments of the pieces were pulverized with a mortar.
Data provided by these analytical studies show clear morphological differences between the different groups of processed fragments. The ceramic corresponding to the Bronze Age exhibits a dark brown tone with considerable porosity and many dark ferrous spots associated with magnetite, as well as reddish ferrous iron oxide nuclei. The Iberian ceramic from Enguera has a variable tonality, between orange and black. Abundant iron oxide (III) nuclei can be found, as well as more isolated dark magnetite spots. It is an iron-rich ceramic (up to 7.45% of Fe2O3), with a noticeable calcium content (up to 6.30% of CaO). Regarding the physical properties, the real densities were around 2.4 g/cm3, and the bulk density and the porosity were around 1.8 g/cm3 and 22%, respectively.
The fragments of Roman ceramic from Lliria have variable characteristics depending on the typology (sigillata, common or amphora). In all cases they have an orange-toned paste, small-size porosity and a low level of temper (degreaser), which rises in quantity from the sigillata typology to the amphora, with Fe2O3 contents of 5.71, 6.36 and 9.24% respectively, and CaO contents of 0.67, 2.92 and 1.29% respectively. The real densities are very similar, between 2.4 and 2.7 g/cm3, and the bulk densities and porosities are around 2.1, 1.75 and 1.8 g/cm3, and 42, 28 and 31% respectively. It is worth noting the high porosity of the sigillata fragments, which is associated with small to very small, highly connected pores that allow large water absorption once the varnish layer is removed.
Finally, the Middle Age ceramic shows a bright orange to brown colour, indicating ferrous pastes. Numerous red ferrous iron oxide nuclei of small to very small size can be seen, as well as dark magnetite spots. White limy masses can also be seen, associated with a high CaO content (around 8%). Regarding density and porosity, real density values range over 2.0-2.38 g/cm3, bulk density over 1.68-1.88 g/cm3 and porosity over 10.0-25.7%.
Results of the chemical and physical analyses on a sample of the pieces showed differences among the porosities and other properties of pieces from different ages. Those results seem to be correlated with the extracted ultrasound parameters.
Thus, the conclusions of both procedures were useful for classifying the ceramics by age. However, there are significant differences in time consumption, equipment cost, and harmlessness of the procedures, the proposed ultrasound procedure being the more effective. These preliminary results have to be confirmed by more extensive research.
CONCLUSIONS
Matching the problem of classifying archaeological sherds by age to an ICA mixture model means that the mixing matrices of the sources (basis vectors) and the centroids (bias terms) allow the pieces to be separated into different classes. Basis vectors define a particular relationship between the features or components for each class, i.e., archaeological age. Features in nondestructive testing by ultrasound come from signals that contain information on the material microstructure, because the ultrasonic pulse travels through the material. In material characterization, this information can be related to physical properties such as porosity, elasticity, material grain, and so on. The frequency components of the injected ultrasonic pulse change as they travel through the material, and the ultrasonic wave propagation velocity depends on the material porosity. Since the signal features fit the ICA mixture model well, the "sources" in the ICA mixture model could be considered as the responses to ultrasound of different material microstructures, corresponding to different archaeological periods.
ICA mixture classification performed better than LDA and kNN classification; the results of Mixca with a 0.7 supervision ratio in training were better than those of LDA and kNN with a supervision ratio of 1. ICA mixture defines more arbitrary surfaces than the hyperplanes defined by LDA. That is because ICA mixture expands the observation vector in multidimensional spaces whereas LDA develops the observation vector in a unidimensional expansion basis. In addition, ICA mixture estimates the source probability density function with no assumption of Gaussianity or other pdf constraints; it is more versatile than the estimates made by kNN. The proposed procedure works in a semi-supervised training scheme, which makes it possible to approach more problems where complete supervision is not possible. In spite of those advantages, ICA mixture models have some restrictions, such as greater complexity, increased matrix dimensionality, and the fact that the model assumed for each category is adapted.
References:
[1] A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis. John Wiley & Sons, 2001.
[2] T.W. Lee, M. Girolami, and T.J. Sejnowski, "Independent component analysis using an extended Infomax algorithm for mixed sub-Gaussian and super-Gaussian sources", Neural Computation, vol. 11, no. 2, pp. 409-433, 1999.
[3] W.D. Penny and S. Roberts, "Mixtures of independent component analyzers", in Proc. ICANN2001, Vienna, August 2001, pp. 527-534.
[4] R. Choudrey and S. Roberts, "Variational Mixture of Bayesian Independent Component Analysers", Neural Computation, vol. 15, no. 1, pp. 213-252, 2003.
[5] N.H. Mollah, M. Minami, and S. Eguchi, "Exploring Latent Structure of Mixture ICA Models by the Minimum ß-Divergence Method", Neural Computation, vol. 18, pp. 166-190, 2005.
[6] R.E. Taylor and M.J. Aitken, Dating in Archeology (Advances in Archaeological and Museum Science), Springer, 2003.
[7] J. Krautkrämer, Ultrasonic Testing of Materials, Springer, 4th edition, Berlin, 1990.
[8] R.O. Duda, P.E. Hart, and D.G. Stork, Pattern Classification. Wiley-Interscience, 2nd edition, 2000.
[9] J.F. Cardoso and A. Souloumiac, "Blind beamforming for non Gaussian signals", Radar and Signal Processing, IEE Proceedings-F, vol. 140, no. 6, pp. 362-370, 1993.
[10] A. Ziehe and K.R. Müller, "TDSEP - an efficient algorithm for blind separation using time structure", Proc. of ICANN'98, pp. 675-680, 1998.
[11] A. Hyvärinen, "Fast and Robust Fixed-Point Algorithms for Independent Component Analysis", IEEE Transactions on Neural Networks, vol. 10, no. 3, pp. 626-634, 1999.
[12] O. Chapelle, B. Schölkopf, and A., Semi-supervised learning. MIT Press, 2006.
[13] K. Nigam, A.K. McCallum, S. Thrun, and T. Mitchell, "Text classification from labeled and unlabeled documents using EM", Machine Learning, vol. 39, pp. 103-134, 2000.
[14] J. Larsen, A. Szymkowiak, and L.K. Hansen, "Probabilistic hierarchical clustering with labeled and unlabeled data", Int. Journal of Knowledge Based Intelligent Engineering Systems, 2001.
[15] E.G. Learned-Miller and J.W. Fisher, "ICA using spacings estimates of entropy", Journal of Machine Learning Research, vol. 4, pp. 1271-1295, 2003.