PAGEOPH, Vol. 135, No. 1 (1991)

0033-4553/91/010061-15$1.50 + 0.20/0 © 1991 Birkhäuser Verlag, Basel

Focus-of-attention Techniques in the Automatic Interpretation of Seismograms

CLAUDIO CHIARUTTINI¹

Abstract--The focus-of-attention techniques implemented in SNA2, a knowledge-based system for seismogram interpretation, are presented. They consist of data compression of the input digital records, scanning of the compressed traces to detect candidate seismograms and extraction of seismogram features. A criterion is given to rate the clarity of seismograms; the clarity defines the order in which the system will consider them to build up the interpretation. The proposed techniques are simple and fast; they allow quick rejection of noise and let the system focus its attention on the portions of traces containing relevant information.

Key words: Seismology, seismic networks, artificial intelligence, expert systems, automatic interpretation, seismic event detection, data compression.

1. Introduction

When interpreting seismograms, the trained seismologist has the ability to rapidly identify the presence or absence of relevant signals and to formulate a preliminary, although rough, hypothesis about the event to be analyzed. He is able to sort the records of an event according to their clarity: records containing clear seismograms, records in which either the signal is difficult to interpret or its presence is doubtful, and records that contain no signal at all. Then, based on the inspection of a few key parameters of the clear signals (like amplitude, frequency and approximate initial time) he makes a first qualitative guess about the features of the event. The subsequent increasingly detailed analyses lead either to the rejection of the hypothesis--if it proves wrong--and to the statement of a new one, or to the quantitative refinement of the hypothesis itself. This kind of activity is known in Artificial Intelligence as focus-of-attention (NII, 1986).

The present note describes the focus-of-attention techniques implemented in version 2 of SNA (Seismic Network Analyzer), a knowledge-based system for the interpretation of seismograms (CHIARUTTINI and ROBERTO, 1988; CHIARUTTINI et al., 1989; ROBERTO and CHIARUTTINI, 1990), and the results of tests with data collected by a microseismic network.

¹ Istituto di Geodesia e Geofisica, Università di Trieste, Via dell'Università 7, 34123 Trieste, Italy.


In SNA2 the knowledge is represented in three different ways: rules, procedures and facts (NII and FEIGENBAUM, 1978; BARR and FEIGENBAUM, 1982; FROST, 1986). Rules are condition-action pairs: they implement decisions and inferences. Procedures are computations to be performed with a basically sequential control. Facts are the static knowledge of the system, like the station coordinates and the velocities of seismic phases. At the procedural level the data records are scanned to extract the objects of reasoning (seismograms, phases, noise, ...) and to provide a description of them. The objects and the relations among them are represented in SNA2 by a symbolic data structure based on a semantic net (ROBERTO and CHIARUTTINI, 1990). At the logical level, inferences are made to create new objects--the event, for instance--to refine their attributes, and to check the consistency of the interpretation.

SNA2 builds the interpretation incrementally, by successive activations of independent specialized modules called knowledge sources (KS) (NII, 1986). The analysis of the event starts with the set of candidate seismograms of higher clarity, and subsequently considers the signals in the lower clarity levels to refine the hypothesis. In this way ambiguous signals are interpreted within the context of clear ones, as the analyst does. The processing we describe here is the essential activity of the knowledge source Initialize of SNA2.
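As a minimal sketch of the rule/procedure/fact split and the incremental activation of knowledge sources, consider the following. All names, values and control details here are invented for illustration; they are not taken from SNA2 itself.

```python
# Facts: static knowledge, e.g. station coordinates and phase velocities
# (all values below are illustrative placeholders).
FACTS = {"station_coords": {"TRI": (45.65, 13.76)}, "v_p": 6.0, "v_s": 3.5}

def ks_initialize(interp):
    """Procedural knowledge source: scan the data and post symbolic
    objects into the shared interpretation (here faked with constants)."""
    interp["onset_clear"] = True   # stand-in for real trace scanning
    interp["onset_time"] = 12.3    # seconds into the record (invented)

def rule_p_arrival(interp):
    """Rule as a condition-action pair: if a clear onset exists and no
    P phase has been asserted yet, assert one."""
    if interp.get("onset_clear") and "P" not in interp:
        interp["P"] = interp["onset_time"]

def interpret():
    interp = {}                         # shared symbolic data structure
    for ks in (ks_initialize,):         # KSs activated incrementally
        ks(interp)
    for rule in (rule_p_arrival,):      # rules then fire on the new objects
        rule(interp)
    return interp

print(interpret())
```

In a real blackboard system the KS activation order is decided by a scheduler rather than a fixed loop, but the division of labour is the same: procedures create objects, rules draw inferences over them, facts stay constant.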

2. Principles for the Identification of Candidate Seismograms

The extraction of objects is a signal-to-symbol transformation (NII et al., 1982) that results in the description of a trace in terms of candidate seismograms and purely noisy intervals, each one with its own attributes. This can be accomplished by a suitable segmentation of traces. The trace scanning procedure is required to be simple, reliable and to provide a description of the features of noise and extracted signals. It should also be auto-adaptive, owing to the large variability of noise features with time and across traces. We achieve this by exploiting basic knowledge about seismogram shape and simple statistics.

We consider vertical component records produced by an automatic data acquisition system that records upon some event trigger. The basic assumptions necessary for detection are:

1. records start with background noise;
2. background noise is stationary;
3. occasional bursts of noise may occur;
4. seismograms appear suddenly and fade away gradually;
5. seismograms are oscillatory with respect to the baseline.

Item 4 implies that seismograms have some minimum duration, provided that the signal-to-noise ratio is not vanishingly small. Item 5 means that seismograms can be properly described in terms of amplitude and frequency and that their envelope is roughly symmetrical with respect to the baseline.
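A detector built on these assumptions could, for instance, estimate the background level from the start of the record (assumptions 1 and 2) and require a minimum duration above threshold (assumption 4) to reject short noise bursts (assumption 3). The sketch below is only illustrative: the window length, threshold factor and minimum duration are invented, not the values used in SNA2.

```python
import numpy as np

def candidate_onsets(trace, win=20, noise_wins=10, k=4.0, min_wins=3):
    """Flag candidate seismogram onsets against a noise level estimated
    from the beginning of the record. All parameters are illustrative."""
    n = len(trace) - len(trace) % win
    env = np.abs(trace[:n]).reshape(-1, win).mean(axis=1)  # coarse envelope
    noise = env[:noise_wins].mean()      # assumptions 1-2: record starts
                                         # with stationary background noise
    above = env > k * noise              # auto-adaptive threshold
    onsets, run = [], None
    for i, flag in enumerate(above):
        if flag and run is None:
            run = i                      # a run above threshold begins
        elif not flag and run is not None:
            if i - run >= min_wins:      # assumption 4: minimum duration
                onsets.append(run * win) # onset as a sample index
            run = None
    if run is not None and len(above) - run >= min_wins:
        onsets.append(run * win)
    return onsets

# Synthetic record: stationary background, a sudden high-amplitude
# oscillatory burst, then background again.
trace = np.concatenate([0.1 * np.sin(0.7 * np.arange(400)),
                        2.0 * np.sin(0.7 * np.arange(200)),
                        0.1 * np.sin(0.7 * np.arange(400))])
print(candidate_onsets(trace))
```

Averaging the absolute amplitude over short windows before thresholding is what keeps the oscillatory signal (assumption 5) from being split at every zero crossing.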

3. Data Compression and Preprocessing

Digital data collection systems record samples at rates that assure a good reconstruction of the waveforms. Such detail is unnecessary at the beginning of the analysis; it is more convenient to use a recognition algorithm operating on a reduced number of data, provided they retain the essential features of the original ones. The compressed data lie at an intermediate level of abstraction between the input data and the symbolic objects to be described; they can be used in subsequent steps of the analysis, for instance when searching for secondary arrivals.

A convenient way of compressing the data is to partition the seismic traces into segments of fixed duration and to describe each segment by means of an amplitude and a frequency. There are several possible estimates of such quantities; a few among the simplest ones were tested. As amplitude estimates, both the average and the peak value of the absolute amplitude were considered. Average amplitude is more robust, since it is biased by the presence of spikes to a lesser extent. Peak amplitude has the advantage of greater sensitivity than average amplitude in detecting the sudden onset of seismograms.

The simplest ways to estimate frequency are counting peaks or zero crossings. Both measures provide the same results when applied to narrow-band signals, but may differ considerably with wide-band inputs, like seismic traces. First, the number of peaks and troughs usually exceeds the zero-crossing count, or at most equals it. Second, the number of zeroes is less robust, being biased by baseline shifts, while the number of extrema is insensitive to this. The problem is best understood by considering the superposition of two sinusoids of different frequencies. Let a1, f1 and a2, f2 be their respective amplitudes and frequencies, and assume f2
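The fixed-duration segmentation and the competing amplitude and frequency estimates can be sketched as follows. The segment length and the synthetic two-sinusoid example are illustrative choices, not those of the paper; the comparison shows how the extrema count can track a weak high-frequency component riding on a strong low-frequency one, while the zero-crossing count stays closer to the low frequency.

```python
import numpy as np

def compress(trace, seg=100):
    """Reduce a trace to per-segment amplitude and frequency estimates.
    Segment length is an illustrative choice."""
    n = len(trace) - len(trace) % seg
    segs = trace[:n].reshape(-1, seg)
    avg_amp = np.abs(segs).mean(axis=1)   # robust to spikes
    peak_amp = np.abs(segs).max(axis=1)   # sensitive to sudden onsets
    # Frequency proxies: zero crossings vs. local extrema per segment
    # (cast booleans to int before differencing; numpy forbids bool subtract).
    zc = (np.diff(np.signbit(segs).astype(np.int8), axis=1) != 0).sum(axis=1)
    d = np.diff(segs, axis=1)
    extrema = (np.diff(np.signbit(d).astype(np.int8), axis=1) != 0).sum(axis=1)
    return avg_amp, peak_amp, zc, extrema

# Strong low-frequency sinusoid plus a weak high-frequency one:
# peaks and troughs follow the fast component, zero crossings the slow one.
t = np.arange(400)
x = 1.0 * np.sin(2 * np.pi * t / 100) + 0.2 * np.sin(2 * np.pi * t / 10)
avg_amp, peak_amp, zc, extrema = compress(x, seg=400)
print(zc[0], extrema[0])
```

On this input the extrema count greatly exceeds the zero-crossing count, illustrating the first difference noted above; a constant baseline shift added to `x` would change `zc` but leave `extrema` untouched, illustrating the second.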