Towards a General Signal Interpretation System

Signal-to-Symbol Conversion Level †‡

Yifan Gong and Jean-Paul Haton
CRIN/INRIA-Nancy, 54506 Vandoeuvre, France
Conference submitted for: Image, Speech and Signal Processing

* This work is supported by the European Economic Community within the Esprit-II project No. 2167 (AITRAS).
† Neither this paper nor any version close to it has been or is being offered for publication elsewhere.
‡ Accepted for publication by the 10th Int. Conf. on Pattern Recognition.

Abstract

The signal-to-symbol conversion level of a generic signal interpretation system is presented. Using an application-independent structure, the system compiles a user-supplied, application-specific task description to generate programs capable of performing statistical and structural pattern recognition and reasoning in a given application domain. The computation of three types of symbols (quantitative, qualitative and compound) is outlined. The notion of symbol context and its use in the system are presented. In this system, symbolic processing, signal processing routines and similarity measures can be specified in terms of situations. We give an evaluation of the system performance in an application to nondestructive testing of steam generators using eddy current signals.

Contents

1 Introduction
2 Representation Space
3 Signal-to-Symbol Conversion
    3.1 Quantitative Symbols
    3.2 Qualitative Symbols
        3.2.1 Definition
        3.2.2 Qualitative Symbol Computation: an Example
    3.3 Compound Symbols
4 Contextual Processing of Symbols
    4.1 Introduction
    4.2 Context
    4.3 Environment
5 Situation-Sensitive Signal Processing and Comparison
6 Application and Evaluation
    6.1 Introduction
    6.2 Description of the Spaces
    6.3 Symbols
    6.4 Context, Environment and Situation
    6.5 Evaluation
7 Conclusion

1 Introduction

Signal understanding is based on an interpretation process which transforms, through successive levels of abstraction, an observed signal into known symbolic structures [4]. This process requires some kind of reasoning, which can only be carried out on a symbolic representation of information. A conversion of the signal into symbols, which supply the basic vocabulary for reasoning, is therefore necessary.

Few works have addressed the problem of signal-to-symbol conversion in a generic framework. Traditionally, specific systems are designed to deal with specific applications. In this paper we formalize the signal-to-symbol conversion problem and propose a generic approach for solving it within an environment called Generator of Signal Interpretation Generic MAchine level-1 (G1 for short). Using G1, an interpretation system is instantiated each time an application is given. The system is described in terms of its representation spaces, associated signal processing procedures and definitions of symbols. A language has been defined and its compiler implemented; the user specifies the task in this language. G1 compiles the specification and generates a C program for the task. Running this program, compiled together with user-supplied application-specific procedures, the user obtains the symbols needed for reasoning, in either top-down or bottom-up mode [8].

According to their principles, existing signal-to-symbol conversion schemes follow one of two approaches [9]:

• a physical model based approach, which tends to represent the underlying physical model that actually generates the signal;

• a mathematical model based approach, which uses general pattern recognition methodology to achieve the conversion, independently of the physical phenomenon.

Since the physical model based approach is highly application-dependent, at present only the mathematical model based approach can cover a large number of phenomena. In order to build a generic system, i.e. a system whose structure is independent of the physical process of an application, we have investigated systematic techniques following the mathematical model based approach.

In the following, we introduce the principles of G1. Signal representation spaces (section 2) and the computation of three types of symbols (section 3) are outlined. The notion of symbol context is presented and its use in G1 discussed (section 4). The possibility of performing situation-sensitive signal processing and signal comparison is presented (section 5). Finally, an example of application of the system to a nondestructive testing task using eddy current signals is included (section 6).

2 Representation Space

G1 requires the specification of representation spaces as well as procedures for mapping between spaces. A signal-to-symbol conversion system can therefore be described by the following representation spaces:

• an original space O where the signal is measured;
• a visual space V in which signals can be visually examined and interpreted through human expertise;
• a parameter space P in which signals are compared using some similarity measure;
• a symbol space S which is composed of all possible symbolic descriptions of signals.

The above spaces should be defined for a given application. Mappings between spaces are needed. Two mappings are essentially application-specific:

• from original space O to parameter space P, and
• from original space O to visual space V.

Signal processing procedures provided by the user carry out these mappings. These procedures convert the signal into suitable spaces chosen by the user. The mapping from P to S is discussed in the next section. In addition, the user must provide the functions needed to compute signal similarity in the P space, since no general application-independent solution exists for this.

3 Signal-to-Symbol Conversion

Symbols in space S are of one of the following three types: quantitative symbols, qualitative symbols, and compound symbols. Symbols of different types require different conversion techniques. We define these symbol types and the related conversion techniques below.

3.1 Quantitative Symbols

A quantitative symbol (QTS) describes some physical property of the signal. The problem is therefore to calculate, in the parameter space P, some signal property $f \in F$, where F is made up of all properties to be measured, giving a real value:

$$\mathrm{QTS}: \quad P \to F \times \mathbb{R} \qquad (1)$$

This is in fact a measuring process performing some parametrization, totally dependent on the physical meaning of the property and thus application-specific. Consequently, the application user should provide G1 with specific routines. Typical examples of such symbols are frequencies, time intervals between two events, or the energy of a signal.
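As an illustration of Eq-1, the following minimal Python sketch shows how user-supplied measuring routines might be registered and applied to a signal in P. The routine names, the registry `QUANTITATIVE_SYMBOLS` and the choice of a list of real samples as the P representation are assumptions made for this example only; they are not part of G1, which generates C programs.

```python
import math
from typing import Callable, Dict, Sequence

# A quantitative symbol is a routine mapping a signal in P to one real value.
Routine = Callable[[Sequence[float]], float]


def energy(samples: Sequence[float]) -> float:
    """Signal energy, taken here as the sum of squared samples."""
    return sum(x * x for x in samples)


def rms_amplitude(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of a non-empty signal."""
    return math.sqrt(energy(samples) / len(samples))


# Hypothetical registry: property name f in F -> user-supplied routine.
QUANTITATIVE_SYMBOLS: Dict[str, Routine] = {
    "Energy": energy,
    "RmsAmplitude": rms_amplitude,
}


def measure(samples: Sequence[float]) -> Dict[str, float]:
    """Apply every registered routine, giving (property, value) pairs."""
    return {name: routine(samples)
            for name, routine in QUANTITATIVE_SYMBOLS.items()}


print(measure([3.0, 4.0]))
```

Keeping the routines behind a name-to-function table mirrors the split described in the text: the framework only needs to know which properties exist, while their physical meaning stays in user code.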

3.2 Qualitative Symbols

3.2.1 Definition

A qualitative symbol (QLS) indicates the presence of a fact. This presence is measured by a likelihood coefficient ranging from 0 to 1, which expresses the uncertainty: a higher value indicates greater certainty about the symbol's presence. In this case, we are given a symbol set S and, for each symbol $s \in S$, a list of images $i_{k,s} \in P$, called the training set, where $i_{k,s}$ is the kth image of the symbol s. The problem consists in finding, using the training set, a projection $\varphi$:

$$\mathrm{QLS}: \quad \varphi : P \to S \times \mathbb{R}_{[0,1]} \qquad (2)$$

such that

$$\forall s \in S,\ \forall k: \quad \varphi(i_{k,s}) = (s, l_{k,s}) \quad \text{and} \quad \sum_{\forall s, \forall k} l_{k,s} \ \text{is maximized},$$

where $l_{k,s}$ is the computed likelihood that the kth image of symbol s is mapped to s. $\varphi$ is then used for qualitative symbol computation.

Let us consider an example of a qualitative symbol. The fact "signal has front transition" can be represented by the symbol FrontTransition. A computed likelihood value of 1 indicates that the signal does have a front transition, whereas 0 states that it has none at all.

3.2.2 Qualitative Symbol Computation: an Example

3.2.2.1 a. Introduction

A number of pattern recognition techniques may be used to compute qualitative symbols. We now describe a method which has been used in continuous speech recognition [5]. This method is based on reference pattern comparison and therefore has the advantage of assuming nothing about the underlying physical phenomenon.

3.2.2.2 b. Training

Training is the process of obtaining knowledge about $\varphi$ under a given structure. It is performed in this case in three steps:

1. Data reduction: Using unsupervised statistical classification [7], clusters are computed in the P space for all data. Each cluster center is stored and given a code $n \in N$. The resulting list of (vector, code) pairs is called a codebook. The size of the codebook can be controlled so as to represent the signal space with sufficient precision.
2. Vector quantization: Vector quantization [6] is applied to code all images in the training set, i.e. to replace images by their codes. The result is the coded version of the training set, called the dictionary, in the form of pairs $(c \in N^p, s \in S)$, where p is the number of components of the signal.
3. Codebook reduction: All codebook entries (images and their codes) that were not used for coding any reference during the vector quantization phase are dropped. A reduced codebook is thus obtained [2].
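The three training steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of [2] or [5]: plain k-means with a deterministic start stands in for the unsupervised classification of [7], and all function names and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]
Image = Sequence[Vector]  # one image = p component vectors in P


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(v: Vector, codebook: Dict[int, Vector]) -> int:
    """Vector quantization of one component: code of the closest center."""
    return min(codebook, key=lambda code: sq_dist(v, codebook[code]))


def build_codebook(vectors: List[Vector], size: int,
                   iterations: int = 10) -> Dict[int, Vector]:
    """Step 1, data reduction (k-means stand-in, deterministic start)."""
    distinct = list(dict.fromkeys(vectors))
    centers = dict(enumerate(distinct[:size]))
    for _ in range(iterations):
        members: Dict[int, List[Vector]] = {code: [] for code in centers}
        for v in vectors:
            members[nearest(v, centers)].append(v)
        for code, group in members.items():
            if group:  # an empty cluster keeps its previous center
                centers[code] = tuple(sum(col) / len(group)
                                      for col in zip(*group))
    return centers


def build_dictionary(training_set: Dict[str, List[Image]],
                     codebook: Dict[int, Vector]
                     ) -> List[Tuple[Tuple[int, ...], str]]:
    """Step 2: replace every training image by its codes, keep its symbol."""
    return [(tuple(nearest(v, codebook) for v in image), symbol)
            for symbol, images in training_set.items()
            for image in images]


def reduce_codebook(codebook: Dict[int, Vector],
                    dictionary: List[Tuple[Tuple[int, ...], str]]
                    ) -> Dict[int, Vector]:
    """Step 3: drop codebook entries never used to code a reference."""
    used = {code for codes, _ in dictionary for code in codes}
    return {code: vec for code, vec in codebook.items() if code in used}


# Toy data (hypothetical): four vectors seen in P, two labeled images.
all_vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.1)]
training_set = {
    "FrontTransition": [[(0.0, 0.0), (1.0, 1.0)]],
    "Plateau": [[(1.0, 1.0), (0.9, 1.1)]],
}
codebook = build_codebook(all_vectors, size=3)
dictionary = build_dictionary(training_set, codebook)
reduced = reduce_codebook(codebook, dictionary)
print(dictionary)       # coded training set
print(sorted(reduced))  # codes that survive the reduction
```

In this toy run the cluster around (5.0, 5.0) codes no labeled image, so its entry is dropped in the reduction step.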

3.2.2.3 c. Recognition

Recognition is the phase that carries out signal-to-symbol conversion. A test signal is given in the P space. Recognition is achieved by the following steps:

1. Using the reduced codebook, the components of the test signal are coded, resulting in a list of pairs $(n \in N, \lambda \in \mathbb{R}_{[0,1]})$, where $\lambda$ is the likelihood of coding.
2. The cumulated likelihood is computed with respect to each reference r in the dictionary:

$$l^{(r)} = \frac{1}{L^{(r)}} \sum_{i=1}^{L^{(r)}} \lambda_i \qquad \forall r \qquad (3)$$

   where $L^{(r)}$ is the number of components of the reference.
3. The resulting list is sorted, using the likelihood as key.
4. The reference dictionary is consulted to find the symbol for each code.
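The recognition steps can be sketched in Python as below. Two points are assumptions of this sketch, since the text does not fix them: the likelihood of coding is taken as 1/(1 + Euclidean distance to the chosen center), and a component contributes to the cumulated likelihood of a reference only when its code equals the reference's code at the same position. All names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]
Dictionary = List[Tuple[Tuple[int, ...], str]]  # (codes of a reference, symbol)


def coding_likelihood(v: Vector, center: Vector) -> float:
    """Assumed likelihood of coding: 1 / (1 + Euclidean distance)."""
    dist = sum((x - y) ** 2 for x, y in zip(v, center)) ** 0.5
    return 1.0 / (1.0 + dist)


def code_signal(signal: Sequence[Vector],
                codebook: Dict[int, Vector]) -> List[Tuple[int, float]]:
    """Step 1: code each component, keeping the likelihood of coding."""
    coded = []
    for v in signal:
        best = max(codebook, key=lambda c: coding_likelihood(v, codebook[c]))
        coded.append((best, coding_likelihood(v, codebook[best])))
    return coded


def recognize(signal: Sequence[Vector], codebook: Dict[int, Vector],
              dictionary: Dictionary) -> List[Tuple[str, float]]:
    """Steps 2 to 4: cumulate likelihoods per reference, sort, name symbols."""
    coded = code_signal(signal, codebook)
    scored = []
    for ref_codes, symbol in dictionary:
        # Assumption: only components whose code matches the reference count.
        total = sum(lam for (code, lam), ref in zip(coded, ref_codes)
                    if code == ref)
        scored.append((symbol, total / len(ref_codes)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy reduced codebook and dictionary (hypothetical values).
toy_codebook = {0: (0.0, 0.0), 1: (1.0, 1.0)}
toy_dictionary = [((0, 1), "FrontTransition"), ((1, 1), "Plateau")]
ranking = recognize([(0.0, 0.0), (1.0, 1.0)], toy_codebook, toy_dictionary)
print(ranking)
```

The result is a list of (symbol, likelihood) pairs, which is the form of Eq-2: every candidate symbol is returned with its likelihood rather than a single hard decision.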

3.2.2.4 d. Learning

Learning consists in finding more general descriptions of a training set. In this example, learning is done by using a small codebook size and by dropping every reference whose code has already been used for coding another reference of the same symbol.

3.3 Compound Symbols

A compound symbol (CMS) is introduced by the designer in order to describe the structure of a signal. It is defined on the basis of already defined symbols, using the "and" connector (&):

$$\mathrm{CMS}: \quad (S \times \mathbb{R}_{[0,1]}) \times (S \times \mathbb{R}_{[0,1]}) \times \cdots \times (S \times \mathbb{R}_{[0,1]}) \to (S \times \mathbb{R}_{[0,1]}) \qquad (4)$$

The definition of a compound symbol follows the syntax:

$$\mathrm{CMS} = A_1 \,\&\, A_2 \,\&\, A_3 \,\&\, \cdots \,\&\, A_n \qquad (5)$$

where each $A_i$ can in turn be a compound symbol, a qualitative symbol, an expression over quantitative symbols (Section 3.1), or a fuzzy descriptor (FUD):

$$A_i = \mathrm{CMS} \mid \mathrm{QLS} \mid \mathrm{expr}(\mathrm{QTS}) \mid \mathrm{FUD} \qquad (6)$$

A fuzzy descriptor converts the imprecision (fuzziness) of a natural language description, which the user may employ to specify a signal, into a machine representation [3]. Descriptions such as Is-About(x), Is-Great(x), and Is-Small(x) can be used for symbol definition, where x is a symbol. A fuzzy descriptor has the following syntax:

$$\mathrm{FUD} = \text{symbol} \ \ \text{qualifier} \ \ \langle \text{interval} \rangle \qquad (7)$$

with qualifier $\in$ {Is-About, Is-Great, Is-Small}, giving the degree, ranging from 0 to 1, to which a symbol satisfies the qualifier with respect to a given interval.

A CMS s is associated with a likelihood Q(s), as in the case of a QLS. Q(s) depends on the relative importance of each term on the right-hand side of the rule, and on the likelihood of each term. In order to express this relative importance, we extend the form of Eq-5, yielding:

$$\mathrm{CMS} = A_1[c_1] \,\&\, A_2[c_2] \,\&\, A_3[c_3] \,\&\, \cdots \,\&\, A_n[c_n] \qquad (8)$$

in which $c_i$ is the relative importance of term $A_i$ with respect to the other terms. Based on intuition, the mechanism for computing the likelihood of a compound symbol should satisfy the following constraints:

1. The likelihood should increase if the likelihood of one of its component symbols increases.
2. The more important an individual symbol in the expression is, the more influence it should have on the resulting likelihood.

In G1, the likelihood Q(x) of a symbol x is computed by Eq-9:

$$Q(x) = \begin{cases} \dfrac{1}{\sum_{j=1}^{n} c_j} \displaystyle\sum_{i=1}^{n} Q(A_i) \cdot c_i & \text{if } x \text{ is CMS} \\[2ex] \mathrm{likelihood}(x) & \text{if } x \text{ is QLS or FUD} \end{cases} \qquad (9)$$

As an example, the compound symbol

DangerSignal = FrontTransition[7] & Energy Is-About <E_s, R_e> [4]

describes the fact that a dangerous signal is one having a front transition and an energy approximately (as indicated by $R_e$) equal to $E_s$. The two component facts have an importance of 7/11 and 4/11, respectively, in the symbol.
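The weighted combination of Eq-9 can be checked with a short recursive sketch in Python. The class names `Leaf` and `Compound` and the likelihood values 0.9 and 0.5 are assumptions chosen for illustration; only the weights 7 and 4 come from the DangerSignal example.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    """A QLS or FUD whose likelihood in [0, 1] was computed elsewhere."""
    name: str
    likelihood: float


@dataclass
class Compound:
    """A CMS: terms A_i paired with their relative importance c_i."""
    name: str
    terms: List[Tuple["Node", float]]


Node = Union[Leaf, Compound]


def Q(x: Node) -> float:
    """Likelihood of a symbol: importance-weighted mean of its terms."""
    if isinstance(x, Leaf):
        return x.likelihood
    total_importance = sum(c for _, c in x.terms)
    return sum(Q(a) * c for a, c in x.terms) / total_importance


danger = Compound("DangerSignal", [
    (Leaf("FrontTransition", 0.9), 7),             # illustrative likelihood
    (Leaf("Energy Is-About <Es, Re>", 0.5), 4),    # illustrative likelihood
])
print(Q(danger))  # (0.9 * 7 + 0.5 * 4) / 11
```

Because the result is a weighted mean with positive weights, raising the likelihood of any term raises Q, and a term with a larger c_i moves Q more, which are the two constraints listed above.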

4 Contextual Processing of Symbols

4.1 Introduction

In real-world applications, the existence of the two kinds of symbols discussed in subsections 3.2 and 3.3 is constrained by more global conditions which can be evaluated at run time. In this section we discuss how these conditions are represented in G1. For each type of symbol we use a specific model: the context for qualitative symbols and the environment for compound symbols.

4.2 Context

Qualitative symbols are computed using pattern recognition techniques, with reference patterns extracted from a training corpus. In general, the references are accompanied by specifications about the occurrence of the symbol they represent. It is meaningless to compare references which have different specifications. The notion of context is introduced in order to distinguish symbols that have the same name but different specifications. In the task description, qualitative symbols, with their images and the associated contexts, are represented in G1 as:

$$\text{symbol} : \{\text{image},\ \text{predicate}(\text{arg}_1, \text{arg}_2, \ldots)\} \qquad (10)$$

The image of a symbol will not be compared with a test pattern if the evaluation of the associated predicate gives a FALSE result.

4.3 Environment

Compound symbols are further divided into two categories:

• those introduced for representing intermediate facts;
• those representing final symbols that should be fed to higher levels of interpretation, called objective symbols.

Objective symbols can only occur when a list of preconditions is verified. We use the notion of environment to model such constraints. An environment together with a symbol list is expressed as follows:

$$\text{predicate}(\text{arg}_1, \text{arg}_2, \ldots) : \text{symbol}_1, \text{symbol}_2, \ldots \qquad (11)$$

The list of symbols will be computed only when the evaluation of the related predicate gives a TRUE result.
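Both mechanisms of this section, context (Eq-10) and environment (Eq-11), amount to guarding work with a run-time predicate. The Python sketch below illustrates this reading; the state dictionary, the predicate factory `at`, the symbol name Dent, the location names and the pairing of locations with symbols are all hypothetical and only serve the example.

```python
from typing import Callable, Dict, List, NamedTuple, Sequence, Tuple

State = Dict[str, object]            # run-time facts the predicates inspect
Predicate = Callable[[State], bool]


class Reference(NamedTuple):
    """Eq-10: a symbol, one of its images, and its context predicate."""
    symbol: str
    image: Tuple[float, ...]
    context: Predicate


Environment = Tuple[Predicate, List[str]]   # Eq-11: predicate : symbol list


def comparable(references: Sequence[Reference],
               state: State) -> List[Reference]:
    """Context: keep only references whose predicate holds right now."""
    return [ref for ref in references if ref.context(state)]


def enabled_symbols(environments: Sequence[Environment],
                    state: State) -> List[str]:
    """Environment: list the objective symbols that may be computed."""
    enabled: List[str] = []
    for predicate, symbols in environments:
        if predicate(state):
            enabled.extend(symbols)
    return enabled


def at(location: str) -> Predicate:
    """Hypothetical predicate factory: true when the probe is at `location`."""
    return lambda state: state.get("location") == location


references = [
    Reference("Dent", (0.1, 0.2), at("support_plate")),
    Reference("Dent", (0.7, 0.3), at("u_bend")),
]
environments = [
    (at("support_plate"), ["ExternalLossMatter"]),
    (at("u_bend"), ["DangerSignal"]),
]
state = {"location": "u_bend"}
print([ref.image for ref in comparable(references, state)])
print(enabled_symbols(environments, state))
```

The two references share the name Dent but are never compared against the same test pattern, because at most one of their predicates is true for a given state.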

5 Situation-Sensitive Signal Processing and Comparison

The performance of the signal-to-symbol conversion for qualitative symbols described in 3.2 is closely related to two factors:

• the parameter space P, and
• the similarity measure between vectors in P.

These two factors were introduced in Section 2. No unique solution exists for choosing either P or the similarity (or likelihood, or distance) measure. In particular, in some pattern recognition applications it is preferable to switch the signal representation space and the associated similarity measure according to situations known only at run time. In other words, different analyses and different criteria are used to compare signals of different categories. Situation-dependent spaces and similarity measures are used to satisfy this requirement. In G1, the signal representation space and the similarity measure are specified in terms of a "situation", in the following form:

$$\begin{array}{l} \text{predicate}(\text{arg}_1, \text{arg}_2, \ldots) \\ \qquad \text{signal-processing-routine},\ \text{similarity-measuring-function} \end{array} \qquad (12)$$

This formula should be interpreted as follows: if predicate, which specifies a situation, returns TRUE, then signal-processing-routine will be used for representing the signals and the signals will be compared using similarity-measuring-function. The predicate otherwise is provided to specify the case where a default procedure will be used.
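One way to read the situation form (12) is as an ordered dispatch table whose last entry uses the predicate otherwise. The Python sketch below follows that reading; the first-match rule, the toy routines and the toy similarity functions are assumptions of this example, not a description of the code G1 generates.

```python
from typing import Callable, Dict, List, Sequence, Tuple

State = Dict[str, object]
Signal = Sequence[float]
Routine = Callable[[Signal], List[float]]
Similarity = Callable[[Signal, Signal], float]
Situation = Tuple[Callable[[State], bool], Routine, Similarity]


def otherwise(state: State) -> bool:
    """Default situation: always applicable."""
    return True


def select(situations: Sequence[Situation],
           state: State) -> Tuple[Routine, Similarity]:
    """Return the routine and measure of the first situation that holds."""
    for predicate, routine, similarity in situations:
        if predicate(state):
            return routine, similarity
    raise LookupError("no situation matched and no 'otherwise' entry given")


def compare(situations: Sequence[Situation], state: State,
            a: Signal, b: Signal) -> float:
    """Represent both signals with the selected routine, then compare."""
    routine, similarity = select(situations, state)
    return similarity(routine(a), routine(b))


# Toy routines and measures (hypothetical).
def raw(signal: Signal) -> List[float]:
    return list(signal)


def rectified(signal: Signal) -> List[float]:
    return [abs(x) for x in signal]


def inverse_l1(a: Signal, b: Signal) -> float:
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)))


def exact_match(a: Signal, b: Signal) -> float:
    return 1.0 if list(a) == list(b) else 0.0


situations: List[Situation] = [
    (lambda s: s.get("location") == "u_bend", rectified, inverse_l1),
    (otherwise, raw, exact_match),
]
print(compare(situations, {"location": "u_bend"}, [1.0, -2.0], [1.0, 2.0]))
print(compare(situations, {"location": "straight"}, [1.0, -2.0], [1.0, 2.0]))
```

The same pair of signals is judged identical in one situation and different in the other, which is the behaviour the section asks for: the representation and the criterion both depend on the situation.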

6 Application and Evaluation

6.1 Introduction

In this section we describe some issues related to the application of G1 in the framework of AITRAS, a large-scale ESPRIT project on an environment for performing real-time signal understanding tasks. This application deals with multi-frequency (FL, FM, FH) eddy current inspection of steam generator tubes in nuclear power plants, with the objective of diagnosing the nature, origin, type, shape and size of possible defects of the tubes [1]. The approach taken by AITRAS is to interpret signal patterns in a suitable display space, as an expert analyst would do. G1 is used as the signal-to-symbol conversion level of the AITRAS system. Using G1, the programming of an application consists of two parts:

• declarative programming: in this phase, several elements describing the application task are given: the spaces, the three types of symbols, contexts and environments, and situations. No low-level programming manipulating arrays or loops is needed.
• procedural programming: in this phase, application-specific routines are provided, such as similarity measures and signal parametrizations, in which more domain semantics is involved.

The two types of knowledge are compiled and integrated into an executable program.

6.2 Description of the Spaces

For this application the following spaces were defined:

• O: The original space is made up of multi-frequency complex eddy current signals.
• V: The visual space is the x-y plane, where the trace of a moving point is controlled by the amplitudes of x(t) and y(t). In this space, signals are usually seen as closed curves with several lobes.
• P: The parameter space is chosen as the complex Fourier transform of the signals in the x-y plane, including FL, FM, FH and two mixed signals derived from the combination of these signals.
• S: The symbol space is described in subsection 6.3.

The similarity measure $\lambda$ mentioned in Eq-3 is based on the Euclidean distance generalized to complex vectors:

$$\lambda(x_1, x_2) = \frac{1}{1 + \alpha\, d(x_1, x_2)} \qquad (13)$$

where $\alpha$ is a constant and

$$d(x_1, x_2) = \sqrt{\frac{1}{p} \sum_{i=0}^{p} \left[ (x_1^r - x_2^r)^2 + (x_1^i - x_2^i)^2 \right]} \qquad (14)$$

with the superscripts r and i denoting the real and imaginary parts of a vector component.
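The similarity measure of Eq-13 and the distance of Eq-14 can be sketched in Python on lists of complex numbers. This is an illustration under assumptions: the constant of Eq-13 is called `alpha` and defaults to 1.0 (the paper gives no value), and the sum is normalized by the number of components actually present, whereas Eq-14 as printed sums from i = 0 to p.

```python
import math
from typing import Sequence


def distance(x1: Sequence[complex], x2: Sequence[complex]) -> float:
    """Root-mean-square Euclidean distance between two complex vectors."""
    if len(x1) != len(x2) or len(x1) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    total = sum((a.real - b.real) ** 2 + (a.imag - b.imag) ** 2
                for a, b in zip(x1, x2))
    return math.sqrt(total / len(x1))


def similarity(x1: Sequence[complex], x2: Sequence[complex],
               alpha: float = 1.0) -> float:
    """Similarity in (0, 1]: equals 1 for identical vectors."""
    return 1.0 / (1.0 + alpha * distance(x1, x2))


print(distance([3 + 4j], [0j]))
print(similarity([1 + 1j, 2 - 1j], [1 + 1j, 2 - 1j]))
```

Mapping the distance through 1/(1 + alpha d) turns an unbounded dissimilarity into a value between 0 and 1, so it can be used directly as a likelihood of coding in the recognition phase.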

6.3 Symbols

Symbols are extracted from oral explanations of the reasoning process given by expert analysts.

Quantitative symbols include the phase, amplitude and direction of the main lobe, the signal-to-noise ratio, and the signal width. These parameters are computed for the three frequencies (FL, FM, FH).

Qualitative symbols contain labels of defect types and the following shape-related descriptions of the signal: pointed or round extremity, smooth or irregular contour, phase close to zero, "banana"-type shape.

Compound symbols include: amplitude increases when frequency increases, angle stability for three frequencies, external loss of matter. As an example, the last symbol is specified by the following rule:

ExternalLossMatter =
    (FLAngle) Is-Small <$\theta_1$, $\theta_2$> [2] &
    (FMAngle) Is-Small <$\theta_3$, $\theta_4$> [1] &
    (FHAngle) Is-Small <$\theta_5$, $\theta_6$> [3] &
    (FLAngle) Is-Great <$\theta_7$, $\theta_8$> [1] &
    (FMAngle) Is-Great <$\theta_9$, $\theta_{10}$> [1] &
    (FHAngle) Is-Great <$\theta_{11}$, $\theta_{12}$> [1]

which means: external loss of matter will occur if the angles of the lobes of the low, medium and high frequency components (FL, FM, FH) are all small (compared to ($\theta_1$, $\theta_2$), ($\theta_3$, $\theta_4$), ($\theta_5$, $\theta_6$), respectively) and are all great (compared to ($\theta_7$, $\theta_8$), ($\theta_9$, $\theta_{10}$), ($\theta_{11}$, $\theta_{12}$), respectively). In this rule the $\theta$'s specify the fuzzy intervals.

6.4 Context, Environment and Situation

Context is used to distinguish symbols that have the same name but whose references were extracted from different tube locations. This constraint prevents references extracted at different locations from being compared. Environment is used to specify the relation between tube locations and possible tube defects, so that the search for a given defect is focused only on specific locations. The application requires different signal processing procedures in different situations, according to the tube axis location. Using the notion of situation, each location is associated with a suitable signal processing procedure and a similarity measure.

6.5 Evaluation

We have evaluated the conversion performance for the qualitative symbols, for which hand-labeled signals were provided by expert analysts. Approximately 800 defect signal files from two steam generators were available. They represent about 20 defect types. These files were randomly divided into two parts: the training corpus and the test corpus. This separation ensures that the interpretation system is evaluated under open test conditions; in other words, none of the signals to be recognized has been seen by the system before the test. Table 1 gives the conversion result obtained by G1 for one of the two generators; no significant difference in recognition rate is observed for the other generator. For some defects, the training data are insufficient. The conversion performance would certainly improve if more training samples for these defects were added.

7 Conclusion

This work contributes to the formulation of a knowledge-based generic tool for signal-to-symbol conversion in which the notions of context, environment, and situation-sensitive space and similarity measure are exploited. We propose to split the solution of the signal-to-symbol conversion problem into three types, according to the nature of the problem: compound symbols are obtained using a structured rule-based methodology, qualitative symbols are computed by systematic, application-independent techniques, whereas quantitative symbols need application-specific solutions. For the first two types of symbols, i.e. compound symbols and qualitative symbols, it is the existence of a symbol that is at issue, whereas for the last one, the quantitative symbols, it is some physical property of the real object to be recognized. The possibility of specifying the three types of symbols and of introducing context-sensitive processing of symbols and signals means that G1 provides a suitable shell for integrating domain-specific pattern recognition knowledge, and especially for integrating numerical and symbolic processing in a signal interpretation process. G1 thus constitutes a generic tool for generating signal-to-symbol conversion systems for the interpretation of image, speech and other signals.
     AAAA AAAE AAAZ BDAI BDAP CFBI CGBF CGBH CGBI CGBL CGBP CGBZ CZZZ ZZZZ
AAAA    0^   0    0    0    0    0    0    0    0    0    0    0    0    1   (  0/  1 = 0.000)
AAAE    0   10^   1    0    0    0    0    0    0    0    0    0    0    1   ( 10/ 12 = 0.833)
AAAZ    0    1    6^   0    0    0    0    0    0    0    0    0    0    5   (  6/ 12 = 0.500)
BDAI    0    0    0    0^   0    0    0    0    0    0    0    0    0    0   (  0/  0 = 0.000)
BDAP    0    0    0    0    0^   0    0    0    1    0    0    0    0    0   (  0/  1 = 0.000)
CFBI    0    0    0    0    0    1^   0    0    0    0    0    0    0    0   (  1/  1 = 1.000)
CGBF    0    0    0    0    0    0    6^   0    0    0    0    0    0    0   (  6/  6 = 1.000)
CGBH    0    0    0    0    0    0    0    6^   0    0    0    0    0    0   (  6/  6 = 1.000)
CGBI    0    0    0    2    0    0    0    0    3^   2    1    0    0    0   (  3/  8 = 0.375)
CGBL    0    0    0    0    0    0    0    0    0    5^   0    0    0    0   (  5/  5 = 1.000)
CGBP    0    0    0    2    0    1    0    0    2    0    0^   0    0    0   (  0/  5 = 0.000)
CGBZ    0    0    0    0    0    0    0    0    0    0    0    0^   0    1   (  0/  1 = 0.000)
CZZZ    0    0    0    0    0    0    0    1    0    0    0    0    0^   0   (  0/  1 = 0.000)
ZZZZ    0    2    3    0    0    0    0    0    0    0    0    0    0  153^  (153/158 = 0.968)

Global recognition rate = 190/217 = 0.876

Table 1: Recognition results for 14 defect-related qualitative symbols, expressed in the form of a confusion matrix, obtained on a test corpus of 217 defect signals. The codebook size is 256 and 32 complex-valued Fourier coefficients are used.
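The per-symbol and global rates reported with Table 1 follow directly from the confusion matrix: each rate is the diagonal count divided by the row total. The Python sketch below recomputes them from the table entries; the function and variable names are chosen for this example, and an empty row is reported as 0.0, as in the table.

```python
from typing import List, Tuple

LABELS = ["AAAA", "AAAE", "AAAZ", "BDAI", "BDAP", "CFBI", "CGBF",
          "CGBH", "CGBI", "CGBL", "CGBP", "CGBZ", "CZZZ", "ZZZZ"]

# Rows of Table 1, in the order of LABELS.
CONFUSION = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],      # AAAA
    [0, 10, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],     # AAAE
    [0, 1, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5],      # AAAZ
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],      # BDAI
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],      # BDAP
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],      # CFBI
    [0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0],      # CGBF
    [0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0],      # CGBH
    [0, 0, 0, 2, 0, 0, 0, 0, 3, 2, 1, 0, 0, 0],      # CGBI
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0],      # CGBL
    [0, 0, 0, 2, 0, 1, 0, 0, 2, 0, 0, 0, 0, 0],      # CGBP
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],      # CGBZ
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],      # CZZZ
    [0, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 153],    # ZZZZ
]


def recognition_rates(matrix: List[List[int]]) -> Tuple[List[float], float]:
    """Return (per-row rates, global rate) of a square confusion matrix."""
    per_row = []
    for k, row in enumerate(matrix):
        row_total = sum(row)
        per_row.append(row[k] / row_total if row_total else 0.0)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return per_row, correct / total


per_row, global_rate = recognition_rates(CONFUSION)
for label, rate in zip(LABELS, per_row):
    print(f"{label}: {rate:.3f}")
print(f"Global recognition rate = {global_rate:.3f}")
```

The row totals add up to the 217 test signals and the diagonal to 190, which reproduces the global rate quoted with the table.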

References

[1] D. Dobbeni. Improved NDE in nuclear power plants with computer controlled acquisition and analysis. Technical Report 21.05.87, Belgian Nuclear Society, Belgium, 1987.
[2] Y. Gong. Contribution to automatic interpretation of uncertain signals. PhD thesis, Université de Nancy 1, France, May 1988.
[3] Y. Gong and J.-P. Haton. A knowledge based system for contextually deformed pattern interpretation applied to Chinese tone recognition. In Proc. of 2nd International Conference on Artificial Intelligence, pages 521-530, Marseille, France, 1986. IIRIAM.
[4] Y. Gong and J.-P. Haton. A specialist society for continuous speech understanding. In Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing 1988, pages 627-630, New York City, April 1988.
[5] Y. Gong and J.-P. Haton. Phoneme based continuous speech recognition without pre-segmentation. In Proceedings of European Conference on Speech Technology, volume 1, pages 121-124, Edinburgh, September 1987.
[6] R. M. Gray. Vector quantization. IEEE ASSP Magazine, April 1984.
[7] Y. Linde, A. Buzo, and R. M. Gray. An algorithm for vector quantizer design. IEEE Trans. on Communications, COM-28(1):84-95, Jan. 1980.
[8] M. Nagao. Control strategies in pattern analysis. Pattern Recognition, 17:45-56, 1984.
[9] G. P. Singh and S. Udpa. The role of digital signal processing in NDT. NDT International, 19(3):125-132, June 1986.