Association Rules Enhanced Classification of Underwater Acoustic Signal
Jie Chen1, Haiying Li2, Shiwei Tang1
1 National Laboratory on Machine Perception, Peking University, P. R. China
2 National Key Laboratory of Underwater Information Processing and Control, Northwestern Polytechnical University, P. R. China
Abstract
Classification of underwater acoustic signals is an important field of pattern recognition. Inspired by the way human sonar experts are trained, we propose a two-phase training algorithm that uses association rules to reveal understandable intrinsic rules contributing to correct classification within the datasets known to be misclassified. Preliminary experimental results demonstrate the potential of classification association rules to improve the accuracy of underwater acoustic signal classification.
1. Introduction
Classification of underwater acoustic signals is an important field of pattern recognition and has been studied extensively over the last ten years [1]. Research on this topic falls into two categories: feature extraction and classification algorithms. Although many feature extraction techniques have been developed, features from different classes overlap considerably because of the complicated mechanism of vessel radiated noise [2]. Several kinds of neural networks have been used as classifiers of vessel radiated noise [3], yet the classification of underwater acoustic signals remains a hard problem owing to the complicated mechanisms of subtle targets and their diverse environments. With this problem in mind, we note that some recent results on classification and association mining can help to tackle it. Classification and association mining were studied in isolation for many years until recently. Ali et al. [4] discussed the use of association rule mining for discovering models of the data that may not cover all classes or all examples. Liu et al. [5] mined association rules and pruned them using both minimum support and pessimistic estimation to implement a classification algorithm called CBA (Classification Based on Associations). Wang et al. [6] proposed a method to build a decision tree from association rules.
Proceedings of the 2001 IEEE International Conference on Data Mining (ICDM’01) 0-7695-1119-8/01 $17.00 © 2001 IEEE
However, the potential of these association rule based classification techniques has not been fully explored, and few papers applying them to target recognition have been reported so far. Drawing on the way human sonar experts are trained, we propose a two-phase training algorithm with an association rules enhanced classifier for the classification of underwater acoustic signals. Training an expert usually requires a long and laborious procedure. Trainees sometimes have to listen to, and learn to distinguish, recordings of vessels that are similar and often confused with one another. Inspired by this procedure, we aim to use association rules to reveal understandable intrinsic rules that contribute to correct classification within the datasets known to be misclassified, so as to build an accurate and robust classifier that combines the advantages of probabilistic neural networks (PNN) and association rule based classifiers.
2. CBA
The main idea of the CBA algorithm is to build a classifier from the complete set of classification association rules that satisfy a user-specified minimum support and minimum confidence. Classification rules are a special subset of association rules whose right-hand side is restricted to the class attribute. Compared with neural network classifiers, classifiers built from classification rules are easy to understand and implement. The CBA algorithm consists of two parts, a rule induction algorithm and a classifier-building algorithm; we omit the details here.
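The two thresholds mentioned above can be made concrete with a minimal sketch. It assumes the usual CBA definitions: the support of a rule is the fraction of all cases that match both its condition set and its class, and its confidence is the fraction of cases matching the condition set that also carry that class. The function name `rule_support_confidence`, the attribute names `zcr` and `line`, and the toy cases are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, List, Tuple

# A case is a set of discretized (attribute, value) items plus a class label.
Item = Tuple[str, str]
Case = Tuple[FrozenSet[Item], str]


def rule_support_confidence(
    cases: List[Case], condset: FrozenSet[Item], label: str
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule `condset -> label`."""
    # Labels of all cases whose items contain the rule's condition set.
    covered = [y for items, y in cases if condset <= items]
    # Cases that match the condition set and also carry the rule's class.
    hits = sum(1 for y in covered if y == label)
    support = hits / len(cases) if cases else 0.0
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence


# Hypothetical toy data: two discretized features, two classes.
cases: List[Case] = [
    (frozenset({("zcr", "high"), ("line", "yes")}), "I"),
    (frozenset({("zcr", "high"), ("line", "no")}), "I"),
    (frozenset({("zcr", "high"), ("line", "yes")}), "II"),
    (frozenset({("zcr", "low"), ("line", "yes")}), "II"),
]

sup, conf = rule_support_confidence(cases, frozenset({("zcr", "high")}), "I")
print(f"support={sup:.2f} confidence={conf:.2f}")
```

A rule would be retained only if both values reach the user-specified minimums; rule generation and pruning are not shown here.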
3. Algorithm of association rules enhanced classification
Our objectives are (1) to separate the misclassification region of the feature space, and (2) to extract the intrinsic rules that contribute to correct classification within the misclassification region. Hence we propose a two-phase training algorithm to enhance the predictive accuracy of the classifier.
In the first phase, for a given training set T, a portion of the training cases is treated as the actual training set T'. Because it is easy to train and has a sound statistical foundation in Bayesian estimation theory, we train a probabilistic neural network, called PNN-1, to model the training set T'. We then input the whole set T into the trained network, so that T falls into two subsets denoted Tc and Tm. The former is the set of cases that have been correctly classified, and the latter contains the remaining cases, which have been misclassified.
In the second phase, we suppose that effective low-dimensional patterns exist in the misclassification region even though the high-dimensional patterns in that region overlap. Intuitively, classification association rules can be used to discover these possible low-dimensional patterns. Thus the CBA algorithm is applied to mine the classification association rules in Tm and to construct an association rule based classifier, CBA-1, which takes the form of a list of rules r1, r2, ..., rn, where each ri is a classification association rule selected from all rules that are both frequent and accurate. CBA-1 is a local, or partial, classifier with respect to the whole feature space, so it is necessary to build a second probabilistic neural network, PNN-2, that separates the two subsets Tc and Tm, labeled C and M respectively.
For a new input case, PNN-2 decides which sub-classifier is used as the final classifier. If the new case is classified as M, it is classified by CBA-1; otherwise it is classified by PNN-1. Because the training set T has been partitioned and the classifier is enhanced by classification association rules, it is reasonable to expect that new cases within or near the misclassification region Tm will be correctly classified by CBA-1. On the other hand, most other new cases, classified into class C, will tend to be correctly classified by PNN-1, or at least will be no more likely to be misclassified than if PNN-1 were applied to them directly.
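The two-phase procedure and the routing of new cases can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: `train_pnn` is a minimal Gaussian-kernel PNN with a hypothetical smoothing parameter `sigma`, `train_two_phase` and `majority_stub` are hypothetical names, and `majority_stub` is only a placeholder for CBA-1 (it predicts the majority class of Tm and does no rule mining). The one-dimensional toy data are invented.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Sample = Tuple[Vector, str]
Classifier = Callable[[Vector], str]


def train_pnn(samples: List[Sample], sigma: float) -> Classifier:
    """Minimal PNN: one Gaussian kernel per training case, averaged per class."""
    by_class: Dict[str, List[Vector]] = {}
    for x, y in samples:
        by_class.setdefault(y, []).append(x)

    def classify(x: Vector) -> str:
        def score(patterns: List[Vector]) -> float:
            total = 0.0
            for p in patterns:
                d2 = sum((a - b) ** 2 for a, b in zip(x, p))
                total += math.exp(-d2 / (2.0 * sigma ** 2))
            return total / len(patterns)
        # Sorted class names make ties deterministic.
        return max(sorted(by_class), key=lambda c: score(by_class[c]))

    return classify


def majority_stub(cases: List[Sample]) -> Classifier:
    """Placeholder for CBA-1: always predicts the majority class of its cases."""
    label = Counter(y for _, y in cases).most_common(1)[0][0]
    return lambda x: label


def train_two_phase(
    T: List[Sample],
    T_prime: List[Sample],
    train_rule_classifier: Callable[[List[Sample]], Classifier],
    sigma: float,
) -> Classifier:
    # Phase 1: train PNN-1 on T' and split T into Tc (correct) and Tm (misclassified).
    pnn1 = train_pnn(T_prime, sigma)
    Tc = [(x, y) for x, y in T if pnn1(x) == y]
    Tm = [(x, y) for x, y in T if pnn1(x) != y]
    if not Tm:
        return pnn1  # nothing misclassified, so no second phase is needed
    # Phase 2: build the local classifier on Tm and the C/M router PNN-2.
    cba1 = train_rule_classifier(Tm)
    pnn2 = train_pnn([(x, "C") for x, _ in Tc] + [(x, "M") for x, _ in Tm], sigma)

    def classify(x: Vector) -> str:
        return cba1(x) if pnn2(x) == "M" else pnn1(x)

    return classify


# Hypothetical one-dimensional toy data; T' is a portion of T.
T_prime: List[Sample] = [([0.0], "I"), ([1.0], "II")]
T: List[Sample] = T_prime + [([0.2], "I"), ([1.2], "II"), ([0.9], "I")]

classify = train_two_phase(T, T_prime, majority_stub, sigma=0.2)
for x in ([0.0], [0.9], [1.2]):
    print(x, classify(x))
```

In this toy run the case at 0.9 is the only one PNN-1 misclassifies, so it alone forms Tm, and new inputs near it are routed to the placeholder rule classifier while the others fall through to PNN-1.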
4. Experimental results
We carried out a preliminary experiment applying the proposed algorithm to the classification of four classes of real vessel radiated noise data, 964 cases altogether. The feature vectors extracted from these cases compose the whole dataset. We use half of the set as the training set T and half of T as the actual training set T'. The remaining 482 cases are tested using the two-phase constructed classifiers (PNN-2, PNN-1 and
CBA-1) and compared with a straightforward probabilistic neural network classifier, PNN-0, which treats T as the training set directly. Table 1 shows the results of 10-fold cross validation. All test-set recognition rates are averages over the 10 folds, and the same training and test sets are used for PNN-0 and the two-phase constructed classifiers throughout the cross validation. The class labels of the four classes of vessel radiated noise are I, II, III, and IV.
Table 1. Test results of two-phase constructed and PNN-0 classifiers.
Classifier    I        II       III      IV       Average
PNN-0         91.5%    82.2%    80.7%    86.9%    85.3%
Two-phase     92.1%    82.0%    84.3%    89.3%    86.9%
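As a quick arithmetic check on Table 1, the short sketch below recomputes the Average column from the four per-class rates, assuming it is the unweighted mean over classes I to IV (the text does not state how the average was formed). The variable names are illustrative.

```python
# Per-class recognition rates (%) copied from Table 1, in the order I, II, III, IV.
rates = {
    "PNN-0": [91.5, 82.2, 80.7, 86.9],
    "Two-phase": [92.1, 82.0, 84.3, 89.3],
}

# Unweighted mean over the four classes (an assumption about the Average column).
averages = {name: sum(values) / len(values) for name, values in rates.items()}
for name, avg in averages.items():
    print(f"{name}: {avg:.1f}%")

# Difference between the two classifiers in percentage points.
gain = averages["Two-phase"] - averages["PNN-0"]
print(f"gain: {gain:.1f} points")
```

Under that assumption the recomputed means agree with the reported 85.3% and 86.9% to one decimal place, a difference of about 1.6 percentage points.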
5. Conclusion
Inspired by the way human sonar experts are trained, we have proposed a two-phase training algorithm that uses association rules to reveal understandable intrinsic rules contributing to correct classification within the datasets known to be misclassified. Preliminary experimental results on real data demonstrate the potential of classification association rules to improve the accuracy of underwater acoustic signal classification. In future work, we will study the impact of the partition of the feature space and of the discretization of continuous attributes. We suggest that using this algorithm as an incremental training algorithm, and adding expert-guided selection of rules, would improve the performance of practical classification systems for underwater acoustic signals.
References
[1] J. Huang, J. Zhao, and Y. Xie, “Source classification using pole method of AR model”, IEEE ICASSP-97, 1997, Vol. 1, pp. 567-570
[2] J. Chen, H. Li, J. Sun, and Z. Li, “Fractal feature of underwater acoustic signals in zero-crossing domain”, WESTPRAC VII, Kumamoto, Japan, 2000, pp. 1129-132
[3] X. Li, F. Zhu, “Application of the zero-crossing rate, LOFAR spectrum and wavelet to the feature extraction of passive sonar signals”, Proc. 3rd World Conf. Intell. Cont. & Auto., 2000, Vol. 4, pp. 2461-2463
[4] K. Ali, S. Manganaris, and R. Srikant, “Partial classification using association rules”, KDD, 1997, pp. 115-118
[5] B. Liu, W. Hsu, and Y. Ma, “Integrating classification and association rule mining”, KDD, 1998, pp. 80-86
[6] K. Wang, S. Zhou, and Y. He, “Growing decision tree on support-less association rules”, KDD, 2000, pp. 265-269