On the combinatorial complexity of fuzzy pattern matching in music ...

On the Combinatorial Complexity of Fuzzy Pattern Matching in Music Analysis

Richard E. Overill
Algorithm Design Group, Department of Computer Science, King's College London, Strand, London WC2R 2LS, U.K.

e-mail: [email protected]

Abstract: In music analysis it is a common requirement to search a musical score for occurrences of a particular musical motif and its variants. This tedious and time-consuming task can be carried out by computer, using one of several models to specify which variants are to be included in the search. The question arises: just how many variants must be explicitly considered? The answer has a profound effect on the computer time needed. In this paper, recurrence relations and closed form analytic expressions are derived for the run time complexity of two models of "fuzzy pattern matching" for use in music analysis; each model assumes the existence of an atomic exact pattern matching operation. The formulae so obtained are evaluated and tabulated as a function of their independent parameters. These results enable a priori estimates to be made of the relative run times of different music searches performed using either model. This is illustrated by applying the results to an actual musical example. Key Words: music analysis, combinatorial complexity, analytic formulae, recurrence relations, approximate string matching

Richard E. Overill, BSc, PhD, C.Math, FIMA, C.Eng, MBCS, MIEE, is Lecturer in Computer Science at King's College London. His research includes the design, analysis, and practical implementation of algorithms on supercomputers. He has also given lecture-recitals on the keyboard music of the Tudor composers Thomas Tallis (1985), John Blitheman (1991), and William Byrd (1993). Computers and the Humanities 27: 105-110, 1993. © 1993 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

Locating the occurrences of variations or developments of a given motif in a musical score is central to the practice of music analysis. For example, one might wish to locate all occurrences of the 3-note motif (Ex. 1), and of all its variants with one fewer note overall (Exx. 2-4) or with one extra note overall (Exx. 5-8, where * denotes an extra note) or with the same number of notes overall but with each interval inaccurate by up to one semitone (Exx. 9-16), and all exact transpositions of these. Clearly, for a musical score of substantial length this would constitute a tedious and time-consuming manual task. However, the capability of modern digital computers for rapid searching and matching of large numbers of symbolic strings (representing, in this case, the notes of a musical score or motif) has given rise to the development of computer programs to automate the searching and matching aspects of music analysis (cf. Pearce, 1992).

In fact, the music analyst's task is but one example of a class of problems known as Approximate String Matching (ASM), which is currently an area of great research activity in computer science (cf. Galil and Giancarlo, 1988) not least because of its applicability in fields as apparently diverse as genetic sequencing, speech processing, and text correction (Sankoff and Kruskal, 1983), in addition to music analysis (Mongeau and Sankoff, 1990).

An important practical aspect of computer-assisted music analysis is having the ability to predict how long a putative music search might take in order to assess its feasibility. In this paper, we analyze the run time complexity of two models of "fuzzy pattern matching" for use in music analysis. One of these (Model A) is a straightforward development of the ideas exemplified above, while the other (Model B) has recently been implemented and employed for a variety of music analysis tasks (Pearce, 1992). In Model A, we want to locate the motif and its variants with up to fmax fewer notes overall, up to emax extra notes overall, and up to imax semitones inaccuracy in each interval. In Model B, we want to locate the motif and its variants with up to fmax consecutive fewer notes at any point, up to emax consecutive extra notes at any point, and up to imax semitones inaccuracy in each interval.

In section 2, which may safely be skipped by readers not wishing or needing to know the theory, we encapsulate the mathematical development of the two models, deriving recurrence relations and closed form analytic expressions without explicit reference to music analysis. In section 3, we illustrate a real application of both models to an actual musical example, presenting results which indicate how the run time of each model depends on the choice of each of its free parameters. Finally, in section 4, we make some observations on the range of applicability of the models and issue some caveats concerning their use.

2. Theory

In the abstract models of "fuzzy pattern matching" to be analyzed here we explicitly assume the existence of a symbol alphabet A which is defined over a linear distance metric; that is, the 'distance' in A of a symbol s from a reference symbol r is measured by the position in A of s relative to that of r. In this context we may remark that the term "symbol" is used to denote an entity which may require several characters for its representation. We also implicitly assume the existence of either a symbol string comparator C(P,Q) which yields the result true if (and only if) its two symbol string operands P and Q are identical, or else an exact symbol string matching operation M(P,Q) which yields the result true if (and only if) P occurs in Q.
Finally, we implicitly assume the existence of a hypothetical wildcard symbol * which exactly matches precisely one instance of any symbol s in A; that is, M(s,*) is true if (and only if) s is in A.
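The three predicates can be sketched in a few lines of Python. This is only an illustration under stated assumptions: symbols are encoded as integers, the wildcard * is encoded as `None`, and the helper name `symbols_match` is hypothetical rather than anything defined in the paper.

```python
from typing import Optional, Sequence

Symbol = Optional[int]      # assumption: symbols are integers, None stands for *
WILDCARD: Symbol = None

def symbols_match(p: Symbol, q: Symbol) -> bool:
    """True if the two symbols are identical or either one is the wildcard."""
    return p is WILDCARD or q is WILDCARD or p == q

def C(P: Sequence[Symbol], Q: Sequence[Symbol]) -> bool:
    """Comparator: True iff P and Q have equal length and agree symbol by symbol."""
    return len(P) == len(Q) and all(symbols_match(p, q) for p, q in zip(P, Q))

def M(P: Sequence[Symbol], Q: Sequence[Symbol]) -> bool:
    """Matcher: True iff P occurs as a contiguous substring of Q."""
    return any(C(P, Q[i:i + len(P)]) for i in range(len(Q) - len(P) + 1))

print(M([2, WILDCARD, 4], [1, 2, 9, 4]))  # True: the wildcard absorbs the 9
```

Both operations act on whole strings, which is the "atomic" behaviour the analysis relies on.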

Given these three fundamental predicates, we can state the generalized form of our abstract model as follows: search a target symbol string T for every occurrence of a pattern symbol string P and all the variants of P which contain fewer symbols, extra symbols, inaccurate 'distances' between adjacent symbols, or any combination of these variants. We now specify two distinct forms of the generalized abstract model. In Model A we require the search to locate every variant of P in T with up to fmax fewer symbols overall, up to emax extra symbols overall, and up to imax inaccuracy in the 'distance' between adjacent symbols. In Model B we require the search to locate every variant of P in T with up to fmax consecutive fewer symbols at any point, up to emax consecutive extra symbols at any point, and up to imax inaccuracy in the 'distance' between adjacent symbols.
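The difference between "overall" (Model A) and "consecutive at any point" (Model B) can be made concrete by brute force. The sketch below is an illustration, not the author's program: it marks each symbol of a length-n string as kept (0) or removed (1) and filters the masks by the rule of each model. The function names are hypothetical.

```python
from itertools import product

def longest_run(mask):
    """Length of the longest run of 1s (removed symbols) in a 0/1 tuple."""
    best = run = 0
    for bit in mask:
        run = run + 1 if bit else 0
        best = max(best, run)
    return best

def removal_masks(n, fmax, model):
    """All admissible ways of marking symbols of a length-n string for removal."""
    masks = []
    for mask in product((0, 1), repeat=n):
        if model == "A":
            allowed = sum(mask) <= fmax           # at most fmax removed overall
        else:
            allowed = longest_run(mask) <= fmax   # at most fmax removed in a row
        if allowed:
            masks.append(mask)
    return masks

print(len(removal_masks(3, 1, "A")))  # 4
print(len(removal_masks(3, 1, "B")))  # 5
```

For n = 3 and fmax = 1 the extra Model B variant is the mask (1, 0, 1): two symbols are removed in total, but never two in a row. The counts 4 and 5 agree with the corresponding entries of Table 1.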

Model A

Firstly, for a symbol string of length n there are n symbols from which a total of f distinguishable symbols can be removed. From elementary combinatorial theory, the number of ways in which this can be done is given by the binomial coefficient $\binom{n}{f}$ (Abramowitz & Stegun, 1972). Therefore the number of ways in which up to fmax distinguishable symbols can be removed is:

$$\sum_{f=0}^{f_{\max}} \binom{n}{f} \qquad (1)$$

Secondly, for a symbol string of length n there are (n + 1) slots into which a total of e extra (indistinguishable) symbols can be added. This problem is isomorphic with the problem of removing a total of e distinguishable symbols from a symbol string of length (n + e), which can be done in $\binom{n+e}{e}$ ways. Therefore the number of ways in which up to emax extra (indistinguishable) symbols can be added is:

$$\sum_{e=0}^{e_{\max}} \binom{n+e}{e} \qquad (2)$$
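The isomorphism used for the extra symbols can be checked by direct enumeration. The following sketch (function names are hypothetical) counts the ways of distributing e indistinguishable extra symbols over the n + 1 slots and compares the result with the binomial coefficient; it relies on `math.comb`, available from Python 3.8.

```python
from itertools import product
from math import comb

def count_insertions(n, e):
    """Brute force: ways to place e indistinguishable symbols into n + 1 slots."""
    slots = n + 1
    return sum(1 for counts in product(range(e + 1), repeat=slots)
               if sum(counts) == e)

def up_to_emax(n, emax):
    """Number of ways to add up to emax extra symbols overall, as in (2)."""
    return sum(comb(n + e, e) for e in range(emax + 1))

print(count_insertions(3, 2), comb(3 + 2, 2))  # 10 10
print(up_to_emax(4, 2))                        # 21
```

The value 21 for n = 4, emax = 2 is the Model A entry of Table 1 with fmax = imax = 0.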


Thirdly, for a symbol string of length n, there are (n - 1) pairs of adjacent symbols, each of which may have a 'distance' inaccuracy taking on any one of (2imax + 1) distinct values. Therefore the number of ways in which up to imax inaccuracy within each pair of adjacent symbols can be realized is:

$$(2 i_{\max} + 1)^{n-1} \qquad (3)$$
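As a small illustration of (3), the variants with inaccurate 'distances' can be generated explicitly. The sketch assumes the pattern is already given as its n - 1 adjacent distances; the example pattern `(2, -1)` and the function name are hypothetical.

```python
from itertools import product

def inaccurate_variants(distances, imax):
    """Every distance string whose entries differ from the original by at most imax."""
    offsets = range(-imax, imax + 1)
    return [tuple(d + delta for d, delta in zip(distances, deltas))
            for deltas in product(offsets, repeat=len(distances))]

variants = inaccurate_variants((2, -1), 1)   # a 3-symbol pattern has 2 distances
print(len(variants))                         # 9
```

With n = 3 and imax = 1 this gives (2·1 + 1)^2 = 9 variants, the original pattern included.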

Thus if P is of length n, the total number of variants of P required by Model A, with full coupling between fewer symbols, extra symbols and inaccurate symbols in (1), (2) and (3) respectively, is given by:

$$\sum_{f=0}^{f_{\max}} \sum_{e=0}^{e_{\max}} \binom{n}{f} \binom{n-f+e}{e} (2 i_{\max} + 1)^{n-f-1} \qquad (4)$$
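A minimal sketch of the Model A count follows. It reads (4) as a double sum over f and e of C(n, f) · C(n - f + e, e) · (2imax + 1)^(n - f - 1), which is the reading that reproduces the Model A column of Table 1; the function name is hypothetical and fmax < n is assumed so that at least one symbol remains.

```python
from math import comb

def model_a_variants(n, fmax, emax, imax):
    """Number of Model A variants of an n-symbol pattern (assumes fmax < n)."""
    total = 0
    for f in range(fmax + 1):
        remaining = n - f                          # symbols left after removal
        for e in range(emax + 1):
            total += (comb(n, f)                   # choose the f removed symbols
                      * comb(remaining + e, e)     # place e extra symbols
                      * (2 * imax + 1) ** (remaining - 1))  # inaccurate distances
    return total

print(model_a_variants(4, 2, 2, 2))  # 4425
```

Setting two of the three parameters to zero reduces the function to (1), (2) or (3), which gives a quick sanity check.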

Model B

Firstly, for a symbol string of length n the total number of ways of removing up to k consecutive symbols at any point is given by the modified Fibonacci sequence of order k (Bollinger, 1984), as defined by the linear recurrence relation:

$$F(n,k) = \begin{cases} 2^{n} & (1 \le n \le k) \\ \sum_{i=n-k-1}^{n-1} F(i,k) & (n > k) \end{cases} \qquad (5)$$

with F(0,k) = F(-1,k) = 1, and F(i,k) = 0 for i < -1, for non-negative k. The closed form expression for the recurrence (5) in terms of the binomial coefficients (Philippou, 1983) is given by:

$$F(n,k) = \sum_{i=0}^{\left[\frac{n+1}{\ldots}\right]} (-1)^{i} \binom{n+1-(k+1)i}{i}\, 2^{\,n+1-\ldots} \qquad (6)$$

where [x] denotes the greatest integer in x.

However, in order to achieve correct coupling between fewer, extra and inaccurate symbols later on, we require a more general recurrence relation which yields the number of symbol strings of length n' derivable from an initial string of length n by removing up to k consecutive symbols at any point. We proceed by constructing the recurrence:

$$\begin{aligned} G(n',m',k',0) &= \sum_{m=0}^{k'} G(n'-1,m',k',m) \\ G(n',m',k',m) &= G(n',m'-1,k',m-1) \qquad (0 < m < k') \\ G(n',m',k',k') &= G(n',m'-1,k',k'-1) + G(n',m'-1,k'-1,k'-1) \end{aligned} \qquad (7)$$

with G(n',0,0,0) = 1 for all non-negative n' and G(0,m',m',m') = 1 for all non-negative m'. The four parameters n', m', k' and m represent respectively the total number of symbols remaining, the total number of symbols removed, the maximum number of symbols that can be removed at any point, and the number of symbols removed at the end of the string. We now define:

$$G'(n',m',k') = \sum_{m=0}^{k'} G(n',m',k',m) \qquad (8)$$

and the number of strings of length n' (= n-m') which can be derived from a string of length n by removing up to k consecutive symbols at any point is then given by summing over all k':

$$\sum_{k'=0}^{k} G'(n',m',k') \qquad (9)$$
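The recurrences for F and G can be cross-checked against each other with a short memoized sketch. Several points are assumptions of this illustration: the sum in (5) is taken over the k + 1 terms F(n-1,k), ..., F(n-k-1,k), which is what the stated initial conditions and Table 1 require; k' is interpreted as the exact length of the longest removed run; all combinations not reachable from the base cases are taken to be zero; and the argument `t` stands for the paper's fourth parameter m.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def F(n, k):
    """Ways to remove runs of at most k consecutive symbols from n symbols."""
    if n < -1:
        return 0
    if n <= 0:
        return 1                      # F(0,k) = F(-1,k) = 1
    return sum(F(i, k) for i in range(n - k - 1, n))

@lru_cache(maxsize=None)
def G(n, m, k, t):
    """n symbols kept, m removed, longest removed run exactly k, trailing run t."""
    if n < 0 or m < 0 or t < 0 or t > k or k > m:
        return 0                      # assumption: unreachable states count as zero
    if m == 0:
        return 1                      # G(n',0,0,0) = 1
    if n == 0:
        return 1 if m == k == t else 0    # G(0,m',m',m') = 1
    if t == 0:                        # last symbol kept
        return sum(G(n - 1, m, k, j) for j in range(k + 1))
    if t < k:                         # trailing run shorter than the maximum
        return G(n, m - 1, k, t - 1)
    # t == k: the trailing run attains the maximum
    return G(n, m - 1, k, k - 1) + G(n, m - 1, k - 1, k - 1)

def G_prime(n, m, k):
    """Sum of G over the trailing run length."""
    return sum(G(n, m, k, t) for t in range(k + 1))

def removals_up_to(n_kept, m, k):
    """Ways to remove m symbols, leaving n_kept, with no run longer than k."""
    return sum(G_prime(n_kept, m, kp) for kp in range(k + 1))

def total_removals(n, k):
    """Sum over every possible number m of removed symbols."""
    return sum(removals_up_to(n - m, m, k) for m in range(n + 1))

print(F(5, 2), total_removals(5, 2))  # 24 24
```

The agreement between `F` and `total_removals` is the kind of cross-check the text describes: summing the refined counts over the number of removed symbols recovers the modified Fibonacci numbers.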

(10)

However, in order to achieve correct coupling between fewer, extra and inaccurate symbols we again require the number of strings of length n' which can be derived from an initial string of length n by the addition of up to k consecutive extra symbols at any point. This problem is isomorphic with the problem of determining the number of strings of length n (= n'-m') which can be derived from a string of length n' by removing up to k consecutive symbols at any point. From (9), this is given immediately by:

$$\sum_{k'=0}^{k} G'(n',m',k') \qquad (11)$$

Thirdly, for a symbol string of length n, up to imax inaccuracy within each pair of adjacent symbols can be realized in $(2 i_{\max} + 1)^{n-1}$ ways, exactly as for Model A in (3) above. Thus if P is of length n, the total number of variants of P required by Model B with up to fmax consecutive fewer symbols and up to emax consecutive extra symbols at any point is given, with the aid of (9) and (11), by:

$$\sum_{i=0}^{f_{\max}} \sum_{f=0}^{n-1} \sum_{j=0}^{e_{\max}} \sum_{e=0}^{(n+1)j} G'(n-f,f,i)\, G'(n-f,e,j)\, (2 i_{\max} + 1)^{n-1-f} \qquad (12)$$

It is worth remarking that (1), (2) and (3) serve as cross-checks for (4) in each of those cases where only one of fmax, emax and imax respectively is non-zero. Similarly, (6), (10) and (3) serve as cross-checks for (12) under the same conditions. It should also be noted that in order to avoid the inefficient generation of redundant and spurious variants of P, the transformations are necessarily applied to P in the following order: fewer symbols, inaccurate 'distance' between adjacent symbols, extra symbols.
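The Model B total can be sketched without the four-parameter recurrence. The code below is an equivalent reformulation rather than the author's program: it assumes that the sums of G' over i and over j in (12) count, respectively, the placements of f removed symbols and of e extra symbols into the (n - f + 1) gaps around the remaining symbols, with at most fmax or emax per gap. The function names are hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def bounded_placements(slots, items, cap):
    """Ways to put `items` indistinguishable symbols into `slots` gaps, at most `cap` each."""
    if slots == 0:
        return 1 if items == 0 else 0
    return sum(bounded_placements(slots - 1, items - j, cap)
               for j in range(min(cap, items) + 1))

def model_b_variants(n, fmax, emax, imax):
    """Number of Model B variants of an n-symbol pattern."""
    total = 0
    for f in range(n):                        # f = 0 .. n-1 symbols removed in total
        kept = n - f
        gaps = kept + 1
        fewer = bounded_placements(gaps, f, fmax)
        extra = sum(bounded_placements(gaps, e, emax)
                    for e in range(gaps * emax + 1))
        total += fewer * extra * (2 * imax + 1) ** (kept - 1)
    return total

print(model_b_variants(4, 2, 2, 2))  # 39303
```

The result reproduces the Model B column of Table 1. With fmax = imax = 0 the function returns (emax + 1)^(n + 1), for example 243 for n = 4 and emax = 2, which matches the extra-symbols-only entries of the table.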

3. Application to Music Analysis

As explained in section 1, in the context of music analysis the parameters fmax and emax control the maximum degree of variation in the motif permitted by the model in terms of fewer notes and extra notes respectively, while the parameter imax controls the maximum tolerable degree of inaccuracy in any interval within the motif. From a practical point of view, it is convenient to work directly with interval strings rather than pitch strings (Pearce, 1992); an interval string may be defined as the first difference of a pitch string. This approach neatly accounts for the occurrence of exact transpositions of the motif and its variants within the musical score.
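The first-difference idea is easy to illustrate. The sketch assumes pitches are encoded as MIDI note numbers, which is a choice made here and not something the paper specifies; the motif used is the B-A-C-H signature discussed later in this section (in German note names B is B flat and H is B natural).

```python
def interval_string(pitches):
    """First difference of a pitch string, in semitones."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

# Assumed encoding: MIDI note numbers for B flat, A, C, B natural.
bach = [70, 69, 72, 71]
transposed = [p + 5 for p in bach]      # the same motif a fourth higher

print(interval_string(bach))            # [-1, 3, -1]
print(interval_string(transposed))      # [-1, 3, -1]
```

Because a transposition adds the same constant to every pitch, it cancels in the differences, so one interval string matches every exact transposition of the motif.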


In Table 1, the number of variants of 3-, 4-, and 5-note motifs generated by Model A (4) and Model B (12) is tabulated as a function of the parameters fmax, emax, and imax. The results for Model A and Model B were produced by two straightforward computer programs which are available from the author, either as printed listings or as ASCII files via email or on a PC diskette.

It is important to realize that the sample results presented in Table 1 do not represent a direct measure of the total time required for a fuzzy search of a musical score for a given motif. Rather, they represent the number of variants of the motif which must be generated and matched against the musical score at each point, assuming that each of the variants is used explicitly, as (for example) in the atomic string comparison or string matching operations of SNOBOL4+ (Emmer, 1985). The total time required depends also on the length of the musical score in a way which is itself a function of the particular string operation primitives employed. The results in Table 1 therefore reflect the relative performances of different searches in a given musical score using the same primitive string operations.

From Table 1, we see that Model B is computationally far more demanding than Model A. This is principally due to the fact that in Model B much longer (and hence many more) variants of the motif can be generated: in Model B up to emax extra notes may be added in each of the (n+1) slots, so the maximum length for a variant of an n-note motif is n + (n+1)emax notes, whereas in Model A it is just (n+emax) notes.

As an example from practical music analysis, we will take the task of searching the last section of the musical score of the final (incomplete) fugue from J.S. Bach's Art of Fugue (Bach, 1752) for variants of the composer's 4-note musical 'signature' "BACH" (Ex. 17). This will exemplify the kinds of choices and trade-offs available to the music analyst.
If we required all variants of the "BACH" motif with up to 2 fewer notes at each point, up to 2 extra notes at each point, and an intervallic inaccuracy of up to 2 semitones, then we would use Model B with fmax = emax = imax = 2. From Table 1, this would require 39303 "BACH" variants to be generated and matched at each successive point in the score. However, if we reduced the scope of our search to fmax = emax = imax = 1 then only 1512 "BACH" variants would be involved and we would expect the search time to be reduced by a factor of 39303/1512 ≈ 26. On the other hand, if we used the simpler Model A, which specifies a maximum number of fewer notes overall and a maximum number of extra notes overall, then the corresponding numbers of "BACH" variants to be considered are 4425 and 342 respectively (see Table 1), and their ratio predicts a relative speed-up for the search by a factor of almost 13. In addition, a cross-comparison between Model A and Model B demonstrates that the more extensive search (fmax = emax = imax = 2) with Model A would take nearly 3 times longer than the less extensive search (fmax = emax = imax = 1) with Model B, as given by the ratio 4425/1512.

Table 1. The number of variants generated by Model A and Model B from an initial motif of n notes with the parameters shown.

| motif length (notes) n | maximum fewer notes fmax | maximum extra notes emax | maximum interval inaccuracy imax | number of variants, Model A | number of variants, Model B |
|---|---|---|---|---|---|
| 3 | 0 | 0 | 0 | 1 | 1 |
| 3 | 1 | 0 | 0 | 4 | 5 |
| 3 | 0 | 1 | 0 | 5 | 16 |
| 3 | 1 | 1 | 0 | 17 | 44 |
| 3 | 0 | 0 | 1 | 9 | 9 |
| 3 | 1 | 1 | 1 | 81 | 220 |
| 4 | 0 | 0 | 0 | 1 | 1 |
| 4 | 1 | 0 | 0 | 5 | 8 |
| 4 | 0 | 1 | 0 | 6 | 32 |
| 4 | 1 | 1 | 0 | 26 | 120 |
| 4 | 0 | 0 | 1 | 27 | 27 |
| 4 | 1 | 1 | 1 | 342 | 1512 |
| 4 | 2 | 0 | 0 | 11 | 13 |
| 4 | 0 | 2 | 0 | 21 | 243 |
| 4 | 2 | 2 | 0 | 141 | 747 |
| 4 | 0 | 0 | 2 | 125 | 125 |
| 4 | 2 | 2 | 2 | 4425 | 39303 |
| 5 | 0 | 0 | 0 | 1 | 1 |
| 5 | 1 | 0 | 0 | 6 | 13 |
| 5 | 0 | 1 | 0 | 7 | 64 |
| 5 | 1 | 1 | 0 | 37 | 328 |
| 5 | 0 | 0 | 1 | 81 | 81 |
| 5 | 1 | 1 | 1 | 1377 | 10392 |
| 5 | 2 | 0 | 0 | 16 | 24 |
| 5 | 0 | 2 | 0 | 28 | 729 |
| 5 | 2 | 2 | 0 | 283 | 2952 |
| 5 | 0 | 0 | 2 | 625 | 625 |
| 5 | 2 | 2 | 2 | 34375 | 628704 |
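The ratios quoted for the "BACH" searches follow directly from the four relevant entries of Table 1. The short sketch below simply redoes that arithmetic; the dictionary layout and the function name are illustrative choices.

```python
# Variant counts for the 4-note motif, keyed by (model, fmax = emax = imax).
table1 = {("A", 1): 342, ("A", 2): 4425, ("B", 1): 1512, ("B", 2): 39303}

def relative_cost(search, baseline):
    """Ratio of variant counts, taken as an estimate of relative run time."""
    return table1[search] / table1[baseline]

print(round(relative_cost(("B", 2), ("B", 1)), 2))  # 25.99
print(round(relative_cost(("A", 2), ("A", 1)), 2))  # 12.94
print(round(relative_cost(("A", 2), ("B", 1)), 2))  # 2.93
```

These are relative estimates only: they compare searches over the same score with the same string primitives and say nothing about absolute run time.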

4. Concluding Remarks

Both the models analyzed here are also equally capable of handling inversion, retrogradation or inverted retrogradation of a motif; all that is required is to apply the appropriate transformation or sequence of transformations to the original motif before the process of variation is begun. As defined here, both models are concerned with pitch alone, not duration. However, the extension of the parameter set required to incorporate duration into either model is fundamentally straightforward, if somewhat tedious to implement.

It is important to note that the combinatorial complexity of both models is exponential in n, fmax, and emax, but geometric in imax. The overall exponential growth of both models is the penalty to be paid for restricting the comparison and pattern matching operations to be atomic (that is, to operate on entire strings). Recent ASM algorithms (cf. Galil and Giancarlo, 1988) use sophisticated pre-processing of the musical score and/or the motif and also sub-string operations to achieve polynomial time performance; a very recent example is time quadratic (Galil and Park, 1990) and depends linearly on the lengths of the motif and the musical score. From the music analyst's point of view the traditional approaches, such as the two analyzed in this paper, are likely to prove of limited usefulness and applicability simply because of their exponential time complexity. It is likely that the newer approaches mentioned above hold the key to further practical developments in computer-assisted music analysis in the 1990s.
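To indicate why sub-string operations avoid the exponential blow-up, here is a sketch of the classic dynamic-programming formulation of approximate matching with at most k differences. It is not the algorithm of Galil and Park, and it does not reproduce Model A or Model B: unit costs for an extra, missing or substituted symbol are an assumption made purely for illustration. Its running time is proportional to the product of the pattern and text lengths, without generating any variants explicitly.

```python
def approximate_matches(pattern, text, k):
    """End positions j (1-based) at which some substring of text ending at j
    is within k insertions, deletions or substitutions of pattern."""
    m = len(pattern)
    column = list(range(m + 1))          # distances against the empty text prefix
    hits = []
    for j, symbol in enumerate(text, start=1):
        previous, column = column, [0]   # row 0 stays 0: a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == symbol else 1
            column.append(min(previous[i - 1] + cost,   # match or substitution
                              previous[i] + 1,          # extra symbol in the text
                              column[i - 1] + 1))       # symbol missing from the text
        if column[m] <= k:
            hits.append(j)
    return hits

print(approximate_matches([1, 2, 3], [5, 1, 2, 3, 5], 0))  # [4]
print(approximate_matches([1, 2, 3], [1, 3], 1))           # [2]
```

In a music setting the symbols would be intervals, and the substitution cost could be replaced by a test on interval inaccuracy; that adaptation is left as a hypothetical extension.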

Acknowledgement

It is a pleasure to thank my colleagues Dr John Martin and Mr Andrew Wells for their invaluable advice on obtaining the recurrence relations in section 2. I am also most grateful to Dr Alastair Pearce for permission to refer to the material of his Ph.D. thesis prior to publication. The helpful comments of the referees and editor on a first draft of the paper were appreciated.

References

Abramowitz, M., and I.A. Stegun, eds. A Handbook of Mathematical Functions. New York: Dover, 1972, p.822.
Bach, J.S. "Die Kunst der Fuge" (BWV 1080). Contrapunctus XVIII, bb.193-4 et seq., 1752. Frankfurt: Edition Peters, Nr.218.
Bollinger, R.C. "Fibonacci K-sequences, Pascal T-triangles, and K-in-a-row Problems." Fibonacci Quarterly, 22 (1984), 146-51.
Emmer, M.B. SNOBOL4+: The SNOBOL4 Language for the PC User. Englewood, NJ: McGraw-Hill, 1985.
Galil, Z., and R. Giancarlo. "Data Structures and Algorithms for Approximate String Matching." Journal of Complexity, 4 (1988), 33-72.
Galil, Z., and K. Park. "An Improved Algorithm for Approximate String Matching." SIAM Journal on Computing, 19 (1990), 989-99.
Mongeau, M., and D. Sankoff. "Comparison of Musical Sequences." Computers and the Humanities, 24 (1990), 161-75.
Pearce, A.T.P. "MAP-A Computer Program for Music Information Retrieval." Ph.D. Thesis. University of London, 1992.
Philippou, A.N. "A Note on the Fibonacci Sequence of Order K and the Multinomial Coefficients." Fibonacci Quarterly, 21 (1983), 82-86.
Sankoff, D., and J.B. Kruskal, eds. Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. Reading, MA: Addison-Wesley, 1983.

Music examples. Exx. 1-17 (notation not reproduced); Ex. 17 is labelled B, A, C, H.