optimality statement of the woody's method and ... - Laboratoire I3S

14 downloads 0 Views 216KB Size Report
Set d2=d3=…=dI=0. First step, m=1 (m=1..I). Cross-correlate. Trial xm. Template = mean of all the xk for k m. Find dm corresponding to max. ^. Shift realization ...
LABORATOIRE

INFORMATIQUE, SIGNAUX ET SYSTÈMES DE SOPHIA ANTIPOLIS UMR 6070

O PTIMALITY S TATEMENT OF THE W OODY ’ S AND I MPROVEMENT

METHOD

Aline CABASSON, Olivier MESTE, Grégory BLAIN, Stéphane BERMON Projet BIOMED Rapport de recherche ISRN I3S/RR–2006-28–FR Septembre 2006

L ABORATOIRE I3S: Les Algorithmes / Euclide B – 2000 route des Lucioles – B.P. 121 – 06903 Sophia-Antipolis Cedex, France – Tél. (33) 492 942 701 – Télécopie : (33) 492 942 898 http://www.i3s.unice.fr/I3S/FR/

R ÉSUMÉ : Même si les signaux d’électrocardiogrammes sont très étudiés, l’analyse des ondes P l’est un peu moins sans doute à cause de la difficulté d’estimer la position de cette onde. Aussi, peu d’études ont été mené sur l’analyse du temps de conduction auriculoventriculaire (intervalle PR) en situation d’exercice. Les méthodes habituelles pour estimer les intervalles PR sont basées sur la détection du maximum de la fonction d’intercorrélation. Cette étude propose une nouvelle méthode d’estimation de temps de retards basée sur le Maximum de Vraisemblance et qui généralise la méthode d’estimation de retards de Woody. Ainsi, cette nouvelle approche permet de determiner les intervalles PR en tenant compte de l’onde T qui vient chevaucher l’onde P notamment lors d’un exercice physique où la fréquence cardiaque augmente.

M OTS CLÉS : estimation de temps de retards, MV, temps de conduction auriculo-ventriculaire, effort

A BSTRACT: Very little works have been done on the atrioventricular conduction time (PR interval), undoubtedly because this signal is difficult to extract and process, as in exercise tests where T-P fusion occurs during higher heart rates, what makes this problem still interesting. The common approach for the estimation of PR interval during both exercise and recovery is to determine the latency using the detection of the maximum of cross correlation function. This work aims to present a new method of time delay estimation with unknown signal based on an iterative Maximum-Likelihood approach which generalizes the well known Woody’s method. This leads to a new approach to determine the PR intervals taking into account the presence of the T wave that is modeled.

K EY WORDS : time delay estimation, MLE, atrioventricular conduction time, exercise

Optimality Statement of the Woody’s method and Improvement

A. Cabasson1, O. Meste1, G. Blain2, S. Bermon3 1 2

Laboratoire I3S-CNRS-UNSA, Universit´e de Nice Sophia-Antipolis, France Laboratoire de Physiologie des Adaptations, Facult´e des Sciences du Sport, Universit´e de Nice Sophia-Antipolis, France 3 Institut Mon´egasque de M´edecine du Sport, Monaco

Abstract Very little works have been done on PR intervals, undoubtedly because this signal is difficult to extract and process, as in exercise tests where T-P fusion occurs during higher heart rates, what makes this problem still interesting. The common approach for the estimation of PR interval during both exercise and recovery is to determine the latency using the detection of the maximum of cross correlation function. This work aims to present a new method of time delay estimation with unknown signal based on an iterative Maximum-Likelihood approach which generalizes the well known Woody’s method. This leads to a new approach to determine the PR intervals taking into account the presence of the T wave that is modeled.

1

Introduction

The analysis of the heart period series is a difficult task especially under graded exercise conditions. Correlation techniques are usually used to estimate the PR by determination of the latency using the detection of the maximum of the cross correlation [1, 2]. Here, we will present a new approach to determine the PR interval taking into account the presence of the T wave which is especially difficult to extract during high rates because it overlaps the P wave. In order to estimate the PR intervals, we will use the Maximum Likelihood Estimator (MLE). The idea behind maximum likelihood parameter estimation is to determine the parameters that maximize the probability of the sample data. The first part of the paper is devoted to present the well known Woody’s method [3] which is used to analyze variable latency signals but which is suboptimal. We will present a new method to determine the PR intervals which formalizes the Woody’s one , using an iterative MLE to estimate delays which correspond to PR intervals. We will extend the Woody model because in exercise tests T-P fusion occurs during higher heart rates. It will be based on the modeling of the T wave which overlaps the P wave especially during the exercise, in order to generalized more over.

1

2

Woody’s method

Charles D. Woody [3], presented in 1967 an adaptative filter which allowed identification and analysis of variable latency signals and the basis of detection of latency by correlation. He calculated the cross correlation between each sweep and a template. Hence, the time lag matching the cross correlation maximum corresponds to the latency shift of the given sweep. In this part of report, we will describe this method. In the model, xi (n) represents all the sampled observations (for all n) of the considered ith interval to estimate (i = 1..I, with I the number of trials). Each observation contains sdi (n), considered unknown, defined as the reference wave, or the template, delayed by di as sdi (n) = s(n − di ), plus ei (n) an observation’s noise: xi (n) = sdi (n) + ei (n)

(1)

The technique derives from iterative correlation and averaging of the data signals. The method can be summarized as follows: given an initial estimate sb(n) of the template, and the set of observations xi (n), the delay di for each trial is estimated as: N 1 X dbi = arg max xi (n)b sd (n) d N n=1

(2)

where N is the total number of samples in each trial. At each step i, the maximum of cross correlation between the template and the ith trial gives the estimation of delay dbi . When all the dbi for i = 1..I are estimated, each ith data trial is corrected by his ith delay dbi . The average of these aligned data trials gives a new template. Then, a new iteration for i from 1 to I is computed to determine the new dbi until convergence.

However, we observe that Woody do not allows an amplitude variability αi as Ja´skowski and Verleger [4] who refereed to a more general model in which also the amplitude jitter is allowed : xi (n) = αi .sdi (n) + ei (n)

We can assume that Woody do not take this additional parameter in his model because the template is made from a constant weighted averaging. Besides, this method is suboptimal because the considered signal is included into the average taken as a template in the cross correlation step. So, the cross correlation is biased. Also, the same template is used to estimate all the delays during one iteration ; all the estimated delays are taking into account in averaging process at the end of the iteration.

3

Woody’s Method Improvement regards the optimality

Here, we present the theoretical formulation of our improvement in the problem of delay estimation where the amplitude jitter αi is not considered in order to confirm that the Woody’s method is not optimal. So, we consider the same model as Woody for the observations given by: xi (n) = sdi (n) + ei (n)

2

(3)

The noise ei (n) is a white Gaussian noise with null mean and a variance σ 2 . For one observation, with n fixed, we consider the probability as following:   1 1 p(xi (n); s(n), di ) = √ exp − 2 (xi (n) − sdi (n))2 (4) 2σ σ 2π By assuming that the noise is white, and consequently independent, all the observations are also independent. Then, for xi = [xi (1), xi (2), ..., xi (N )]T . Y p(xi ) = p(xi (n)) (5) n

Thus, p(xi ; s, di ) =



1 2πσ 2

 N2

1 X (xi (n) − sdi (n))2 exp − 2 2σ n

!

Then, for all i, the pdf of the processes xi ’s, given the delay di ’s and signal vectors s, is: !   N2I 1 1 XX 2 p(X; s, d) = (xi (n) − sdi (n)) exp − 2 2πσ 2 2σ i n

(6)

(7)

where X = [x1 , x2 , ..., xI ] and d = [d1 , d2 , ..., dI ]T . b which maximise the probability of So according to the MLE, the objective is to find bs and d X. Then, the criterion J to be minimized is: J=

1 XX . (xi (n) − sdi (n))2 2σ 2 i n

(8)

Finally, the aim of the study is then to solve: b = arg min J (b s, d)

(9)

s,di

b are intertwined, we compute firstly the derivation of (8) Since the parameters to find b s and d regards s(n), that produces: 1X b s= xk,−dk (10) I k

Then we replace in our expression of the criterion (8) s(n) by its estimate: J

=

1 XX 1X (x (n) − xk,di −dk (n))2 i 2σ 2 i n I k

=

=

1 XX 1 X x2i (n) + 2 ( xk,di −dk (n))2 2 2σ i n I k ! X 2 − xi (n) xk,di −dk (n) I k  !2 1 X X 1 XX X 2 xi (n) + 2 xk,di −dk (n) 2σ 2 I i n n i k !# X 2 XX − xi (n) xk,di −dk (n) I i n k

3

In this last expression, the second term is not function of the delay di because when we compute a double integral on signals which are delayed of di , it is the same that calculate the double integral of the mean of these signals. Then, this second term is a an approximation: !2 1X X ≈ xk,−dk (n) I n k

Then, the criterion is: J

 1 X X 1 xi (n)2 + 2σ 2 n I i

=

X

!2

xk,−dk (n)

k

!# X 2X − xi (n) xk,di −dk (n) I i k " # 1 X X 1 2 2 xi (n) + A − B 2σ 2 n I I i

=

Also, the terms A and B are equals because: A

X

=

!2

xk,−dk (n)

k

XX

=

k

B

=

X

xi (n)

i

= ≃

xk,−dk (n)xl,−dl (n)

l

XX i

k

k

i

XX

X

!

xk,di −dk (n)

k

xi (n)xk,di −dk (n) xk,−dk (n)xi,−di (n) = A

The criterion J is then simplified: " # 1 XX 1 X X 2 xi (n) − xk,−dk (n)xi,−di (n) J= 2σ 2 n I i i k

Also, i and k have symmetrical parts, then: " 1 X X 1X J = xi (n)2 − xi (n)2 2 2σ n I i i # I I 2 XX − xk,−dk (n)xi,−di (n) I i k>i "  1 X 1 X = 1− xi (n)2 2σ 2 n I i # I X I X 2 − xk,−dk (n)xi,−di (n) I i k>i

4

(11)

Set d2=d3=…=dI=0

First step, m=1 (m=1..I)

Trial xm

Cross-correlate

Template = mean of all the xk for km

Find d^m corresponding to max Shift realization xm by d^m

Replace template with the shifted trial xm

Increment m

no

Finished all m’s ?

yes

no

Convergence ?

yes ^ Output : d

Figure 1: Flow chart of the Woody Improved algorithm.

P And finally, as the term (1 − I1 ) i x2i is positive, minimize J comes down to maximize the second term in the sum. Then the estimator of the Woody Improved method is: " # I X I XX dbi = arg max xk,−dk (n)xi,−di (n) (12) di

n

i

k>i

The solution of (12) is obtained using an iterative scheme described by the flow chart presented b in criterion (9) which will be unique if a side condition, such in figure 1. The aim is to find (b s, d) P b as i di equal to a constant. Arbitrarily, we fix a delays average equal to 0.

Then, using an iterative algorithm, the dk are fixed for k 6= i, we compute the optimum of the criterion J with respect to variable i: we replace di by dbi and we reiterate.

5

Here, we present the first steps of this algorithm. 1st step : m = 1 Aim : estimation of the delay db1 Hypothesis: db2 = db3 = ... = dbI = 0 Computation of the criterion (12) : " # I I X X X J= x1,−d1 (n) xk (n) + x2 (n) xk (n) + ...... n

k=2

k=3

We observe that only the first term is function of d1 . So, maximize this criterion amounts to make the cross correlation between two signals : • x1 delayed of d1 • the mean of all the xi for i 6= 1

The maximum of this cross correlation gives db1 .

2nd step : m = 2 Aim : estimation of the delay db2 Hypothesis : d1 → db1 and db3 = db4 = . . . = dbi = 0 Computation of the criterion (12) :

J

=

X n

=

X n

"

I X

x1,−db1 (n) x2,−d2 (n) +

k=3

x2,−d2 (n) x1,−db1 (n) +

k=3

"

I X

!

xk (n)

+ x2,−d2 (n)

!#

xk (n)

+

X n

"

I X

xk (n) + x3,−d3 (n)

k=3

x1,−db1 (n)

I X

xk (n) + . . .

k=4 I X

k=3

#

xk (n) + x3,−d3 (n)

I X

xk (n) + . . .

k=4

As previously, only the term of the criterion is function of the delay d2 and maximize this criterion amounts to make the cross correlation between two signals : • x2 delayed of d2 • the mean of all the xi for i 6= 2 with x1 corrected by db1

The maximum of this cross correlation gives db2 . 3rd step : m = 3 . . .

At the end of all steps, we assure the unicity of our estimation thanks to the hypothesis that the average of the delays dbi at each iteration is constant. Arbitrarily, we choose that the average of the delays during the iterations must be equal to the average of the delays estimated during the first iteration. In conclusion, for each i, in order to determine the delay dbi , we maximize the correlation function between the observation xi and the average of all the xk for k 6= i with the trials xk for k < i realigned, corrected by the dbk for k < i already estimated. As for the Woody’s method, several iterations are necessary to converge to the optimal solution. The difference between this Woody Improved method and the Woody’s one is that the latter uses the correlation between the trial xi and the template, the mean of all trials, whereas here, the correlation is not biased by the presence of the considered xi in the template. That is why the Woody method [3] is suboptimal. Also, our Woody Improved method converges faster because the trial xi is corrected by dbi and the template is updated before the next step. In conclusion, the Woody’s method of 1967 is running but it is suboptimal. 6

#

4

Woody’s method Improvement-Generalization of the model

We remember that our aim is to determine the PR interval on effort ECG but the determination of the P wave is very particulary difficult, especially during the effort and at the beginning of the recovery where the T wave is superposed the P wave. Our idea is then to take into account the presence of the T wave in our model and then it will be easier to estimate the PR interval. We can describe a model in which xi (n) represents all the observations (for all n) of the considered ith PR interval. Each observation contains sdi (n), still considered unknown, defined as the reference wave delayed by di as sdi (n) = s(n − di ), plus an observation’s noise ei (n), a Gaussian white noise with null mean and a variance of σ 2 . Charles D. Woody [3], presented as we have seen in the first part of this report, a system which is based on iterative correlation-averaging techniques. Later, Pham et al. studied the estimation of variable latencies of noisy signals [5]. Ja´skowski and Verleger [4] refereed to a more general model in which the amplitude variability is also allowed : xi (n) = αi .sdi (n) + ei (n) However, these two studies, [5, 4], are not really fair regards the optimality of the method since they include frequency a priori in their approach. As in exercise tests T-P fusion occurs during higher heart rates, we can consider in order to generalized more over that the T wave is represented by a function f (n;θ i ). We assume that the T wave should be described by a regular and smooth function, i.e. a lth order polynomial function characterized by its coefficients in the vector θ i . Finally, our model is expressed like : xi (n) = αi .sdi (n) + αi .fdi (n; θ i ) + ei (n)

(13)

where i, i = 1..I, is the number of realizations, and the variable di is the ith PR interval to be estimated up to an unknown constant. As previously, it is obvious that if we do not impose constraints on the estimated delays, we will estimate the signal s with a time-lag. That is why, it is necessary to impose that the average of the estimated delays equal a constant. For example, we choose the average of the delays identical to the average of the estimated delays at the end of the first iteration. In order to estimate the PR intervals, we use the MLE (Maximum Likelihood Estimation). The idea behind maximum likelihood parameter estimation is to determine the parameters that maximize the probability, the likelihood, of the sample data. The noise ei (n) is an iid Gaussian noise with zero mean and a variance of σ 2 . So, for one observation, n fixed, we consider the likelihood function:   1 2 p(xi (n); s(n), di , θi , αi ) = Γ. exp − 2 .(xi (n) − αi .fdi (n; θi ) − αi .sdi (n)) 2σ For all the samples, i.e. all the n, if the noise is white then all the observations are independent and then: Y p(xi ) = p(xi (n)) n

7

And, 1 X p(xi ; s, di , θi , αi ) = Υ. exp − 2 . (xi (n) − αi .fdi (n; θi )) − αi .sdi (n))2 2σ n

!

Then, for all records, i.e. all i, the pdf of the processes xi ’s, given the delay di ’s, the signal vector s, the coefficients θi and the parameter of the amplitude jitter α, is: ! 1 XX 2 (xi (n) − αi .fdi (n; θ i ) − αi .sdi (n)) p(X; s, d, θi , αi ) = Ψ. exp − 2 . 2σ n i

where X = [x1 , x2 , . . . , xI ] and d = [d1 d2 . . . , dI ]T . The objective is to estimate the di ’s for all i, in other words to maximise p(X; s, d, θi , αi ). So according the MLE, the criterion J to be minimized is: X J= k xi − αi .sdi − αi .fdi (θ i ) k2 (14) i

To solve this kind of problem, first we make a change of variables: yi = xi − αi .fdi (θ i )

So we consider the criterion: J=

X i

(15)

k yi − αi .sdi k2

(16)

The noise sweeps ei (n) are modeled as trials of a common zero mean stationary Gaussian process and we assume that ej and ek are independent for j 6= k. Since the signal s(n) is unknown, we can estimate it by minimization of the criterion (16) with respect to s(n): bs =

1X 1 y I αk k,−dk k

Substituting this estimation for s in the equation (16), we obtain: J

=

X i

αi X 1 yk,di −dk k2 I α k i k X αi X 1 = k yi − (xk,di −dk − αk .fdi (θ k )) k2 I α k i

=

X

k yi − αi .b sd i k 2 k yi −

k

Using the equation (15), we obtain:

J

= =

X

αi X 1 .(xk,di −dk − αk .fdi (θ k )) k2 I α k i k X αi X 1 αi X k xi − .xk,di −dk − αi .fdi (θ i ) + fdi (θ k ) k2 I α I k i k xi − αi .fdi (θ i ) −

k

k

8

As previously mentioned, we can consider that the T wave is represented by a function f (n;θ k ) which is, for example, a lnd order polynomial characterized by its coefficients in the vector θk = [θk (0), θk (1), . . . , θk (L)]T : fdi (n; θ k )] =

L X l=0

Then, we have: αi X fdi (n; θk ) = I k

=

θk [l].(n − di )l

I L αi X X θk [l].(n − di )l I

(17)

k=0 l=0

I  αi X  θk [0].1 + θk [1].(n − di )1 + θk [2].(n − di )2 + .... I

(18)

k=0

This is corresponding to the average of the functions f (n; θk ), which is delayed of di . Also, in order to assert that the model is identifiable, we add a new non restrictive constraint that is the average of the functions f (n; θk ) is zero. And finally, the criterion to be minimized becomes: J=

X i

k xi − αi .fdi (n; θi ) −

I αi X 1 xk,di −dk k2 I αk k=1

When we develop this criterion (20), we obtain : I

J

=

α1 . X 1 k x1 − α1 .fd1 (θ1 ) − xk,d1 −dk k2 I αk k=1 I

+

+

J

= + +

k x2 − α2 .fd2 (θ2 ) − k x3 − α3 .fd3 (θ 3 ) −

α2 . X 1 xk,d2 −dk k2 I αk k=1

I α3 . X 1 xk,d3 −dk k2 + . . . I αk k=1

  α1 1 1 1 k x1 − α1 .fd1 (θ1 ) − x1,0 + x2,d1 −d2 + x3,d1 −d3 + ...... k2 I α1 α2 α3   α2 1 1 1 k x2 − α2 .fd2 (θ2 ) − x1,d2 −d1 + x2,0 + x3,d2 −d3 + ...... k2 I α1 α2 α3   α3 1 1 1 k x3 − α3 .fd3 (θ 3 ) − x1,d3 −d1 + x2,d3 −d2 + x3,0 + ...... k2 + . . . I α1 α2 α3

9

(19)

We can observe, for example for d1 , that it appears especially in the first term and is present only once in the following terms. Then, we can make the approximation that in the following terms the d1 ’s influence is negligible; only the 1st term in the criterion is then considered for the first step. Then, thanks to this approximation, for the ith step, the criterion to be minimized is: J =k xi − αi .fdi (θ i ) −

I αi X 1 xk,di −dk k2 I αk

(20)

k=1

Then, the optimization can use an iterative algorithm. In the first time, we define a reference wave, a template, which is the average of the observations which do not contain T wave considering that all the αi equal 1. Thanks to the MLE, for the first step (i.e. i = 1), we estimate the coefficient α b1 , the coefficient b θ1 of the polynomial function and the delay db1 . We adjust the first observation by substraction of the polynomial function and realign it using the estimated delay. A new template is computed in order to be used in the next steps. If necessary, the process can be iterated depending on the convergence of the algorithm. Thanks to this model, we take into account the overlapping T wave. Then, the PR intervals are produced up to an unknown constant by the estimated delays dbi .

5

Conclusion

In this report, it has been demonstrated that the Woody’s method [3], is suboptimal and that our approach is faster. The techniques to estimate the PR intervals were based only on the detection of the maximum of cross correlation function [1, 2]. In this study, it has been presented a new method based on an iterative Maximum-Likelihood approach which generalizes the well known Woody’s method. Thanks to this new technique, the estimation of PR interval on effort ECG takes into account the presence of the T wave which overlaps the P one at high heart rate.

References [1] O. Meste, G. Blain, and S. Bermon, “Hysteresis Analysis of the PR-PP relation under Exercise Conditions,” in Computers In Cardiology, vol. 31, (Chicago), pp. 461–464, September 2004. [2] A. Cabasson, O. Meste, G. Blain, and S. Bermon, “A New Method for the PP-PR Hysteresis Phenomenon Enhancement under Exercise Conditions,” in Computers In Cardiology, vol. 32, (Lyon), pp. 723–726, September 2005. [3] C. D. Woody, “Characterization of an Adaptative Filter for the Analysis of Variable Latency Neuroelectric Signals,” Med. & biol. Eng. Comp., vol. 5, pp. 539–553, 1967. [4] P. Ja´skowski and R. Verleger, “Amplitudes and Latencies of Single-Trial ERP’s Estimated by a Maximum-Likelihood Method,” IEEE Transactions on Biomedical Engineering, vol. 46 (no.8), pp. 987–993, August 1999. [5] D. T. Pham, J. M¨ocks, W. K¨ohler, and T. Gasser, “Variable latencies of noisy signals: Estimation and testing in brain potential data,” Biometrika, vol. 74, pp. 525–533, 1987.

10