
J Geod (2008) 82:527–542 DOI 10.1007/s00190-007-0203-8

ORIGINAL ARTICLE

The Bayesian detection of discontinuities in a polynomial regression and its application to the cycle-slip problem

Maria Clara de Lacy · Mirko Reguzzoni · Fernando Sansò · Giovanna Venuti

Received: 15 December 2006 / Accepted: 13 November 2007 / Published online: 4 January 2008 © Springer-Verlag 2007

Abstract This paper deals with the problem of detecting and correcting cycle-slips in Global Navigation Satellite System (GNSS) phase data by exploiting the Bayesian theory. The method is here applied to undifferenced observations, because repairing cycle-slips already at this stage could be a useful pre-processing tool, especially for a network of permanent GNSS stations. If a dual frequency receiver is available, the cycle-slips can be easily detected by combining two phase observations or phase and range observations from a single satellite to a single receiver. These combinations, expressed in a distance unit form, are completely free from the geometry and depend only on the ionospheric effect, on the electronic biases and on the initial integer ambiguities; since these terms are expected to be smooth in time, at least in a short period, a cycle-slip in one or both of the two carriers can be modelled as a discontinuity in a polynomial regression. The proposed method consists of applying the Bayesian theory to compute the marginal posterior distribution of the discontinuity epoch and to detect it as a maximum a posteriori (MAP) in a very accurate way. Concerning the cycle-slip correction, a pair of simultaneous integer slips in the two carriers is chosen by maximizing the conditional posterior distribution of the discontinuity amplitude given the detected epoch. Numerical experiments on simulated and real data show that the discontinuities with an amplitude 2 or 3 times larger than the noise standard deviation are successfully identified. This means that the Bayesian approach is able to detect and correct cycle-slips using undifferenced GNSS observations even if the slip occurs by one cycle. A comparison with the scientific software BERNESE 5.0 confirms the good performance of the proposed method, especially when data sampled at high frequency (e.g. every 1 s or every 5 s) are available.

Keywords GNSS · Cycle-slips · Bayesian approach

M. C. de Lacy
Dept. Ingeniería Cartográfica, Geodésica y Fotogrametría, Universidad de Jaén, Campus Las Lagunillas s/n, 23071 Jaén, Spain

M. Reguzzoni (B)
Geophysics of the Lithosphere Department, Italian National Institute of Oceanography and Applied Geophysics (OGS), c/o Politecnico di Milano, Via Valleggio 11, 22100 Como, Italy

F. Sansò · G. Venuti
DIIAR, Politecnico di Milano – Polo regionale di Como, Via Valleggio 11, 22100 Como, Italy

1 Introduction

When a GNSS (Global Navigation Satellite System) receiver is turned on, the fractional phase between satellite and receiver carrier is observed and an integer counter is initialized. During the tracking period, the counter is incremented whenever the fractional phase exceeds one cycle. If a loss of the signal lock occurs, the integer counter is reinitialized, causing a jump in the accumulated phase. This jump is called cycle-slip and it is equal to an integer number of cycles. The discontinuity is generally due to a poor reception or to the presence of obstacles in the path of satellite signals; its amplitude varies from one to millions of cycles. Detecting and correcting cycle-slips is a classical issue in geodesy and is part of the more general problem of fixing integer variables in GNSS phase observations. In this respect, the cycle-slip correction can be related to the ambiguity


resolution, which is faced with many techniques, like the LAMBDA (Least-squares AMBiguity Decorrelation Adjustment) method (Teunissen 1997; Teunissen and Kleusberg 1998) or different Bayesian approaches (Betti et al. 1993; Gundlich and Koch 2002; de Lacy et al. 2002). In the last decades, algorithms specifically dedicated to the cycle-slip analysis have been proposed and implemented. For instance, in the BERNESE software (Beutler et al. 2006), few minutes of double difference phase observations are modelled by a polynomial function of low degree and cycle-slips are searched in the residuals of this polynomial interpolation. In 1990, Blewitt studied the problem with respect to undifferenced observations, exploiting the widelane and ionospheric combinations. The approach, called Turbo Edit (Blewitt 1990), is implemented in scientific programmes that process undifferenced observations as GIPSY-OASIS II (Lichten et al. 1995) and BERNESE 5.0 (Beutler et al. 2006). In this paper, the problem of cycle-slip detection and correction in undifferenced observations is faced by exploiting the Bayesian theory. This work can be considered the generalization of a previous one (Sansò and Venuti 1997), where it was assumed to directly observe the integer ambiguity of the GNSS phase data, derived by the Euler and Goad equations. Epoch by epoch, this ambiguity was modelled as a constant value plus a white noise and the presence of a cycle-slip was seen as a discontinuity in this constant time series. On the contrary, here we consider combinations of the GNSS observations as L 1 − L 2 , completely free from the satellite geometry, whose generally smooth behaviour can be realistically modelled by a polynomial regression. Another important generalization with respect to the previous work, very much in the spirit of the Bayesian approach, is to consider the variance of the observation noise as a random variable, along with the regression parameters and the cycle-slip epoch and amplitude. 
The noise variance, in Sansò and Venuti (1997), was instead fixed to an a-priori value generally smaller than the true one. Furthermore, the posterior distribution of noise variance allows us to set up a test on the model accuracy used to subdivide the original data set into intervals, each containing at most one cycle-slip. In this way, it is possible to analyse data with more than one cycle-slip without increasing too much the computational burden. Again, this strategy is different from the one presented in Sansò and Venuti (1997), where the one-jump model was also applied to data with many cycle-slips, in the hope of detecting them sequentially. Finally, we would like to mention the sign-constrained robust least squares with subjective breakdown point method (Xu 2005) as a possible technique applicable to cycle-slip detection. The method is quite interesting and general in nature, but to our knowledge it has not yet been directly implemented to solve our specific problem. It might become an alternative to our proposal in the future.


In this paper, the proposed method, which has been implemented in MATLAB, is presented in detail. In particular, in Sect. 2, the GNSS observation equations and the linear combinations used in this work are described. In Sect. 3, the Bayesian approach for the detection of the discontinuity epochs as well as their amplitudes in a generic time series is presented, while in Sect. 4 the method is applied to GNSS observations. In Sect. 5, the numerical experiments made on simulated and real data in order to test the performance of the method are presented. The results are also compared with those obtained by the BERNESE software. The analytical computation of the posterior distributions is reported in Appendices A and B.

2 Observation equations

We will consider the following model for GPS pseudorange and carrier phase observables specific to a receiver-satellite pair, i.e. for undifferenced data (Euler and Goad 1991):

$$
\begin{cases}
P_{1j}^{i} = \rho_{j}^{i} + J_{1j}^{i} + \nu_{1j}^{i} \\
P_{2j}^{i} = \rho_{j}^{i} + K J_{1j}^{i} + \nu_{2j}^{i} \\
L_{1j}^{i} = \rho_{j}^{i} - J_{1j}^{i} + \lambda_1 B_{1j}^{i} + \nu_{3j}^{i} \\
L_{2j}^{i} = \rho_{j}^{i} - K J_{1j}^{i} + \lambda_2 B_{2j}^{i} + \nu_{4j}^{i}
\end{cases}
\tag{1}
$$

where $P_{1j}^{i}$ and $P_{2j}^{i}$ are the code pseudoranges; $L_{1j}^{i}$ and $L_{2j}^{i}$ are the carrier phases expressed as ranges; $\rho_{j}^{i}$ represents the distance between the receiver $j$ and the satellite $i$ positions, including the clock terms, the satellite and receiver range electronic delays and the tropospheric delay; $J_{1j}^{i}$ is the ionospheric delay effect on the $L_1$ frequency biased by the differential code delay; $B_{kj}^{i}$ is called ambiguity bias and is formed by lumping together the non-zero initial phases and the integer carrier phase ambiguities; in other words, $B_{1j}^{i}$ and $B_{2j}^{i}$ are the initial constants, which are not integer in general; $K = (f_1/f_2)^2$, $f_1$ and $f_2$ being the first and second GPS frequency respectively; $\nu_{1j}^{i}$, $\nu_{2j}^{i}$, $\nu_{3j}^{i}$, $\nu_{4j}^{i}$ are the measurement noises.

The GPS system constants used in the above equations are the following: $c = 299792458$ m/s (the speed of light);

$$
\lambda_1 = \frac{c}{f_1} \approx 19.0\ \text{cm}; \qquad \lambda_2 = \frac{c}{f_2} \approx 24.4\ \text{cm};
$$

with $f_1 = 154 \times f_0$, $f_2 = 120 \times f_0$, $f_0 = 10.23$ MHz.

It is important to stress that each formula in Eq. (1) is expressed in a distance unit. Furthermore, all of them are affected by the electronic bias, constant at least in short time periods, representing the travel time of the signal through the circuitries of the receiver and satellite.

2.1 Linear combinations of observations

Two linear combinations of the observations in Eq. (1) are used in order to detect cycle-slips. These are called geometry free combination and ionospheric combination, respectively. Both of them are independent of the geometry term $\rho$, so that we are sure that they are insensitive to clock instability.

1. The geometry free linear combination

For the sake of simplicity, hereafter the indexes relative to satellite and receiver will be omitted. The geometry free combination at the epoch $t$ is given by:

$$
L_1(t) - L_2(t) = (K - 1) J_1(t) + \lambda_1 B_1 - \lambda_2 B_2 + \delta N + \nu_{L_1 L_2}(t)
\tag{2}
$$

where $\delta N = \lambda_1 \delta N_1 - \lambda_2 \delta N_2$ is the cycle-slip amplitude when present. The term $\nu_{L_1 L_2}$ represents the noise of the $L_1 - L_2$ combination, ranging from 3 to 4 mm.

2. The ionospheric combination

This linear combination involves code pseudorange and phase observations at the $f_1$ frequency and is given by:

$$
P_1(t) - L_1(t) = 2 J_1(t) - \lambda_1 B_1 - \lambda_1 \delta N_1 + \nu_{P_1 L_1}(t)
\tag{3}
$$

where $\delta N_1$ is the cycle-slip amplitude in the carrier phase $L_1$ when present. The noise $\nu_{P_1 L_1}$ of this combination ranges from a few to tens of centimetres, depending on the satellite elevation (Bona 2000).

Considering short intervals of time, if no cycle-slip is present, all the terms in Eqs. (2) and (3) can be considered as constants apart from the ionospheric effect $J_1(t)$, whose smooth behaviour can be modelled as a linear combination of polynomial functions (see e.g. Fig. 1). It follows that the cycle-slip detection is reduced to finding discontinuities in a smooth (polynomial) regression. In most cases, the ionospheric effect is well interpolated by a linear function (first order polynomial); however, it is recommended to use higher order polynomials to take into account the effects of the satellite elevation and of the smoothing performed by some receivers to reduce the noise of code measurements.

Fig. 1 Typical behaviour of the ionospheric effect in the geometry free linear combination (horizontal axis: epoch; vertical axis: $L_1 - L_2$ [m])
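As a hedged illustration of Eqs. (1)–(3), the following minimal Python sketch builds noise-free observables from assumed values of the geometry, ionospheric and ambiguity terms and checks that the two combinations no longer depend on $\rho$. The function names (`observables`, `geometry_free`, `ionospheric`, `slip_jump`) and the numerical inputs are illustrative assumptions, not part of the authors' software.

```python
# Illustrative sketch of Eqs. (1)-(3); all names and input values are assumptions.
C = 299_792_458.0            # speed of light [m/s]
F0 = 10.23e6                 # fundamental GPS frequency [Hz]
F1, F2 = 154 * F0, 120 * F0  # L1 and L2 carrier frequencies [Hz]
LAM1, LAM2 = C / F1, C / F2  # carrier wavelengths [m]
K = (F1 / F2) ** 2


def observables(rho, j1, b1, b2):
    """Noise-free version of Eq. (1) for one receiver-satellite pair."""
    p1 = rho + j1
    p2 = rho + K * j1
    l1 = rho - j1 + LAM1 * b1
    l2 = rho - K * j1 + LAM2 * b2
    return p1, p2, l1, l2


def geometry_free(l1, l2):
    """Left-hand side of Eq. (2)."""
    return l1 - l2


def ionospheric(p1, l1):
    """Left-hand side of Eq. (3)."""
    return p1 - l1


def slip_jump(dn1, dn2):
    """Jump [m] produced in L1 - L2 by slips of dn1 and dn2 cycles (Eq. 2)."""
    return LAM1 * dn1 - LAM2 * dn2


# Hypothetical values: range [m], ionospheric delay [m], ambiguity biases [cycles]
rho, j1, b1, b2 = 2.1e7, 3.5, 12.3, -7.8
p1, p2, l1, l2 = observables(rho, j1, b1, b2)
gf = geometry_free(l1, l2)
iono = ionospheric(p1, l1)
gf_expected = (K - 1) * j1 + LAM1 * b1 - LAM2 * b2
iono_expected = 2 * j1 - LAM1 * b1

print(f"lambda1 = {LAM1:.4f} m, lambda2 = {LAM2:.4f} m")
print(f"one-cycle slip on L1: {slip_jump(1, 0) * 100:.2f} cm")
print(f"simultaneous slips (9, 7): {slip_jump(9, 7) * 1000:.2f} mm")
```

The last line shows why some simultaneous slips are hard to see in $L_1 - L_2$ alone: the two contributions nearly cancel, leaving a jump of a few millimetres.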

3 The Bayesian approach in the detection of discontinuities

In this section, we deal with the general problem of detecting discontinuities in a time-series of data, modelled by a smooth regression. For the sake of simplicity, we first consider a situation with one discontinuity only. In this case the observation equation can be written as

$$
y_0(t_i) = \sum_{j=1}^{m} \varphi_j(t_i)\, x_j + k\, h_\tau(t_i) + \nu(t_i), \qquad i = 1, 2, \ldots, n
\tag{4}
$$

where $\varphi_j(t_i)$ are the base functions of the regression; $x_j$ are the unknown parameters of the regression; $\tau$ is the epoch of the discontinuity; $k$ is the amplitude of the discontinuity; $\nu$ is the observation noise; $h_\tau(t_i)$ is the Heaviside function, defined as

$$
h_\tau(t_i) =
\begin{cases}
0 & t_i < \tau \\
1 & t_i \ge \tau
\end{cases}
$$

Equation (4) can also be expressed in vector notation, i.e.

$$
y_0 = A x + k\, h_\tau + \nu
\tag{5}
$$

where the generic element $A_{ij}$ of the design matrix $A$ is given by $\varphi_j(t_i)$. An example of the different terms of Eq. (5) along with their joint effect is shown in Fig. 2. Furthermore, we make the hypothesis that the observation noise $\nu$ is a Gaussian white noise with zero mean and unknown variance $\sigma_0^2$, namely

$$
\nu \sim N\!\left(0, \sigma_0^2 I\right).
\tag{6}
$$
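The observation model of Eqs. (4)–(6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the monomial basis on a rescaled time axis, the function names and the parameter values are hypothetical choices, not taken from the paper.

```python
import random


def heaviside(times, tau):
    """h_tau(t_i): 0 before the discontinuity epoch, 1 from tau onwards."""
    return [1.0 if t >= tau else 0.0 for t in times]


def design_matrix(times, m):
    """A_ij = phi_j(t_i); here phi_j is assumed to be a monomial of degree j-1
    in the time rescaled to [0, 1]."""
    t0, t1 = times[0], times[-1]
    return [[((t - t0) / (t1 - t0)) ** j for j in range(m)] for t in times]


def simulate(times, x, tau, k, sigma0, seed=0):
    """Draw y0 = A x + k h_tau + nu with Gaussian white noise (Eqs. 5 and 6)."""
    rng = random.Random(seed)
    A = design_matrix(times, len(x))
    h = heaviside(times, tau)
    return [sum(a * xj for a, xj in zip(row, x)) + k * hi + rng.gauss(0.0, sigma0)
            for row, hi in zip(A, h)]


# Hypothetical example: 250 epochs, quadratic trend, jump of 3 sigma at epoch 150
times = list(range(1, 251))
y0 = simulate(times, x=[1.0, 0.5, -0.3], tau=150, k=3.0, sigma0=1.0)
print(len(y0), y0[:3])
```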

Fig. 2 Example of one discontinuity in a polynomial regression: the different terms of the model, $y_1 = Ax + \nu$ and $y_2 = k\,h_\tau$ (top), and the resulting signal $y_0 = y_1 + y_2$ (bottom); horizontal axis: epoch

Therefore, the observation vector $y_0$ can be seen as a sample of the n-dimensional normal random variable

$$
y_0 \sim N\!\left(A x + k\, h_\tau,\ \sigma_0^2 I\right)
\tag{7}
$$

where, of course, $n$ is the number of observations in the considered time-series. In this framework, the problem to be solved is the estimation of the epoch $\tau$ of the discontinuity and its amplitude $k$. The idea is to face this problem with a Bayesian approach (Koch 1990), which models as random variables not only the observables, but also all the unknown parameters and allows us to determine their distribution conditional to the given observations. This distribution is called posterior. The method requires the introduction of a probability distribution for each parameter, the so-called prior, describing our a-priori knowledge about the parameter before looking at the data. This is in general a useful tool, allowing us to force some constraints on the parameters to be estimated. For instance, in the processing of the $P_1 - L_1$ combination (Eq. 3), we express our a-priori information on the amplitude of the cycle-slip by enforcing the condition of being integer. This condition will also constrain the subsequent discontinuity search in the $L_1 - L_2$ combination (Eq. 2), as explained in detail in Sect. 4.

In the case of detecting discontinuities in a generic time-series, we exploit the concept of non-informative priors (Box and Tiao 1992). In particular, for the “position parameters” we set up the following uniform prior distributions

$$
x \sim U\!\left[\mathbb{R}^m\right] \qquad p(x) = \text{const.}
\tag{8}
$$

$$
\tau \sim U\!\left[\{t_1, t_2, \ldots, t_n\}\right] \qquad p(\tau) = \frac{1}{n} = \text{const.}
\tag{9}
$$

$$
k \sim U\!\left[\mathbb{R}\right] \qquad p(k) = \text{const.}
\tag{10}
$$

while for the “variability parameter” $\sigma_0^2$ the non-informative prior can be expressed as

$$
p(\sigma_0^2) \propto \frac{1}{\sigma_0^2}.
\tag{11}
$$

Note that we will compare the resulting posterior distribution of the noise variance $\sigma_0^2$ with an a-priori value, thus introducing indirectly some prior information on this parameter. All the parameters are considered a-priori stochastically independent, so that

$$
p(x, \tau, k, \sigma_0^2) = p(x) \cdot p(\tau) \cdot p(k) \cdot p(\sigma_0^2).
\tag{12}
$$

It has to be emphasized that the prior distributions (8), (10) and (11) are improper probability distributions, since the normalization condition cannot be satisfied. However, the Bayesian approach allows for their use (Box and Tiao 1992), on condition that they give rise to a proper posterior distribution. By exploiting the well-known Bayes theorem (Bayes 1763) and taking into account the expressions (7) and (12), the joint posterior distribution of the parameters can be written as

$$
p(x, \tau, k, \sigma_0^2 \,|\, y_0) = \frac{p(y_0 \,|\, x, \tau, k, \sigma_0^2) \cdot p(x, \tau, k, \sigma_0^2)}{p(y_0)}
\propto \frac{1}{\left(\sigma_0^2\right)^{n/2}} \exp\!\left(-\frac{1}{2\sigma_0^2} \left| y_0 - A x - k\, h_\tau \right|^2\right) \frac{1}{\sigma_0^2}.
\tag{13}
$$

In order to detect the discontinuity in the data, we first compute the marginal posterior distribution of the random variable “discontinuity epoch” p(τ | y 0 ), then estimate the epoch τ of the discontinuity as a maximum a posteriori (MAP), i.e. as the value with the highest a posteriori probability. Therefore, the amplitude of this discontinuity can be estimated from its posterior distribution p(k| τ , y 0 ), given the detected epoch, again exploiting the MAP principle. The algebraic derivation of these posterior distributions is reported in Appendix A. In particular, the marginal posterior distribution of the discontinuity epoch is given by

$$
p(\tau \,|\, y_0) \propto a_\tau^{\frac{n-m-2}{2}} \cdot \left(a_\tau c - b_\tau^2\right)^{-\frac{n-m-1}{2}}
\tag{14}
$$

where

$$
a_\tau = h_\tau^{+} (I - P)\, h_\tau, \qquad
b_\tau = h_\tau^{+} (I - P)\, y_0, \qquad
c = y_0^{+} (I - P)\, y_0
$$

and $P = A (A^{+} A)^{-1} A^{+}$ is the usual least-squares projector on Span$\{A\}$, the superscript $+$ denoting transposition. In this respect, one can think of computing a classical least-squares interpolation on the whole data set and then deriving the term $c$ as the sum of all the squared residuals and the term $b_\tau$ as the sum of the residuals from the discontinuity epoch $\tau$ onwards. The model (5) represents the case of no discontinuities when $\tau = t_1$, i.e.

$$
h_\tau = h_{t_1} = [1\ 1\ \cdots\ 1]^{+}.
\tag{15}
$$

In this case, if a constant term is already modelled as part of the regression, we have

$$
a_{t_1} = h_{t_1}^{+} (I - P)\, h_{t_1} = 0
\tag{16}
$$

and consequently, for any $y_0$,

$$
p(\tau = t_1 \,|\, y_0) = 0.
\tag{17}
$$

In other words, the case of data without discontinuities cannot be simply detected by using a MAP criterion, i.e. by comparing $p(\tau = t_1 \,|\, y_0)$ with $p(\tau \neq t_1 \,|\, y_0)$. This case has to be identified by observing that the posterior distribution (14) of the discontinuity epoch $\tau$ is uniform or, at least, that it does not present any value with a predominant probability, as shown by an example in Sect. 5.

A numerical consideration about the inversion of the normal matrix $A^{+} A$ is worthwhile. Since the base functions $\varphi_j(t_i)$ are typically polynomials of degree $j - 1$, in order to speed up the inversion and have a better conditioned normal matrix (especially for high values of $m$), it is recommended to use orthogonal bases, e.g. derived from Gram-Schmidt orthonormalization (Berger 1987). This is what we have implemented in our software.

As for the conditional posterior distribution of the discontinuity amplitude, it is a properly translated and rescaled Student-t,

$$
p(k \,|\, \tau, y_0) \propto \left[ 1 + \frac{\left(k - \dfrac{b_\tau}{a_\tau}\right)^2}{\dfrac{a_\tau c - b_\tau^2}{a_\tau^2}} \right]^{-\frac{n-m}{2}};
\tag{18}
$$

namely, considering the following linear transformation, we have

$$
t = \frac{k - \dfrac{b_\tau}{a_\tau}}{\sqrt{\dfrac{a_\tau c - b_\tau^2}{a_\tau^2}}}\ \sqrt{n - m - 1} \sim t_{n-m-1}.
\tag{19}
$$

It is clear that the MAP value of $k$, i.e. the estimate of the discontinuity amplitude at the epoch $\tau$, corresponds to its mean value $b_\tau / a_\tau$. In many geodetic observation techniques, the amplitude $k$ of a possible discontinuity can be usefully modelled as an integer variable defined in a bounded interval, i.e. $k \in [k_{\min}, k_{\max}] \subset \mathbb{Z}$. In these cases we can restrict the domain of the admissible values of $k$ and hence get a “sharper” estimate for such a parameter. To this aim, we set up a discrete uniform prior distribution

$$
p(k) = \frac{1}{k_{\max} - k_{\min} + 1} = \text{const.}
\tag{20}
$$

and the corresponding (discrete) conditional posterior distribution $p(k \,|\, \tau, y_0)$ can be computed by evaluating the probability density (18) for every integer value of $k$ and by applying a proper normalization. Such a discretization is of course not required to determine the MAP value, which, in this case, is simply the integer number nearest to $b_\tau / a_\tau$.

Until now we have assumed to have only one discontinuity in the data set. The logical question is: what happens if more than one discontinuity is present? The model (5) is not valid anymore and therefore its application would generally lead to wrong results. At this point, one could think of generalizing the model (5) to the case of $r$ discontinuities (see formulas in Appendix B). However, the numerical complexity of the exhaustive evaluation of the discrete posterior distribution (B2) increases exponentially with the number $r$ of the considered discontinuities, making this approach too heavy from a numerical point of view. Furthermore, its application requires an a-priori knowledge of an upper bound of the number of discontinuities in the data. This is typically unrealistic and contradicts the need of automating the procedure. Of course, in a Bayesian approach, the number $r$ could be considered as an integer random variable as well (if necessary with a non-informative prior distribution), but this approach, although possible, would lead to an even more complicated solution.

An alternative strategy could be to split the whole dataset into many subsets containing at most one discontinuity, which could then be detected according to the model (5). The problem becomes that of finding a way to select time intervals with one discontinuity only. We propose the



following solution, where a moving window is used to scan the whole data set and select the time span under study; the dimension of this window is iteratively reduced as long as the corresponding data have more than one discontinuity. To check this condition, a test procedure on $\sigma_0^2$, namely a test on the accuracy of the used model, is established, based on an a-priori value $\tilde{\sigma}_0^2$ of the variance of the observation noise and on the conditional posterior distribution of $\sigma_0^2$, given the epoch $\tau$ of the detected discontinuity; this posterior reads

$$
p(\sigma_0^2 \,|\, \tau, y_0) \propto \frac{1}{\left(\sigma_0^2\right)^{\frac{n-m-1}{2}+1}} \exp\!\left(-\frac{1}{2\sigma_0^2}\, \frac{a_\tau c - b_\tau^2}{a_\tau}\right),
\tag{21}
$$

as described in Appendix A. It is an inverse chi-square distribution after applying a proper linear transformation, i.e.

$$
\frac{a_\tau}{a_\tau c - b_\tau^2}\ \sigma_0^2 \sim \text{inv}\chi^2_{n-m-1}.
\tag{22}
$$

If more than one discontinuity is present in the data, the model (5) will not fit the observations $y_0$ and we will probably have

$$
p(\sigma_0^2 > \tilde{\sigma}_0^2 \,|\, \tau, y_0) \cong 1.
\tag{23}
$$

This means that the considered time window has to be further reduced. In general we state that the data set contains more than one discontinuity when the condition

$$
p(\sigma_0^2 > \tilde{\sigma}_0^2 \,|\, \tau, y_0) > 1 - \alpha
\tag{24}
$$

is verified, where $\alpha$ is the chosen significance level.

Let us conclude this section with a short remark. As discussed before, it could happen that no discontinuities are detected in the time span under study, namely the distribution (14) is practically uniform. In this case the observation model to be tested is

$$
y_0 = A x + \nu
\tag{25}
$$

and the corresponding conditional posterior distribution of $\sigma_0^2$ is given by (Gelman et al. 1995)

$$
p(\sigma_0^2 \,|\, y_0) \propto \frac{1}{\left(\sigma_0^2\right)^{\frac{n-m}{2}+1}} \exp\!\left(-\frac{c}{2\sigma_0^2}\right),
\tag{26}
$$

that is

$$
\frac{\sigma_0^2}{c} \sim \text{inv}\chi^2_{n-m}.
\tag{27}
$$
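To make the use of Eq. (14) concrete, here is a minimal, standard-library Python sketch that evaluates the marginal posterior of the discontinuity epoch on a small simulated window and reads off the amplitude estimate $b_\tau / a_\tau$. It follows the recommendation above of working with a Gram-Schmidt orthonormal polynomial basis, but everything else (function names, equispaced epochs, 0-based epoch indices, the simulated values) is an assumption made for illustration and is not the authors' MATLAB implementation.

```python
import math
import random


def orthonormal_poly_basis(n, m):
    """Gram-Schmidt orthonormalization of 1, t, ..., t^(m-1) on n equispaced epochs."""
    ts = [i / (n - 1) for i in range(n)]
    basis = []
    for j in range(m):
        v = [t ** j for t in ts]
        for q in basis:
            dot = sum(qi * vi for qi, vi in zip(q, v))
            v = [vi - dot * qi for qi, vi in zip(q, v)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        basis.append([vi / norm for vi in v])
    return basis


def residual(basis, v):
    """Return (I - P) v, with P the least-squares projector on the span of the basis."""
    r = list(v)
    for q in basis:
        dot = sum(qi * ri for qi, ri in zip(q, r))
        r = [ri - dot * qi for qi, ri in zip(q, r)]
    return r


def epoch_posterior(y0, m):
    """Normalized p(tau | y0) from Eq. (14) and the estimate b/a for each epoch index."""
    n = len(y0)
    basis = orthonormal_poly_basis(n, m)
    ry = residual(basis, y0)
    c = sum(r * r for r in ry)
    log_p = [-math.inf] * n   # index 0 is tau = t_1, where a = 0 (Eqs. 16 and 17)
    k_hat = [0.0] * n
    for i in range(1, n):
        h = [0.0] * i + [1.0] * (n - i)
        rh = residual(basis, h)
        a = sum(rh[i:])       # h^+ (I - P) h
        b = sum(ry[i:])       # h^+ (I - P) y0: sum of the residuals from tau onwards
        s = a * c - b * b
        if a > 1e-12 and s > 0.0:
            log_p[i] = 0.5 * (n - m - 2) * math.log(a) - 0.5 * (n - m - 1) * math.log(s)
            k_hat[i] = b / a
    top = max(log_p)
    w = [math.exp(lp - top) for lp in log_p]
    total = sum(w)
    return [wi / total for wi in w], k_hat


# Hypothetical window: 60 epochs, quadratic trend, jump of 10 sigma at index 30
rng = random.Random(1)
n, m, true_index, k_true, sigma0 = 60, 3, 30, 1.0, 0.1
y0 = [0.5 + 2.0 * (i / n) - 1.5 * (i / n) ** 2
      + (k_true if i >= true_index else 0.0)
      + rng.gauss(0.0, sigma0) for i in range(n)]
post, k_hat = epoch_posterior(y0, m)
map_index = max(range(n), key=post.__getitem__)
print(map_index, round(post[map_index], 3), round(k_hat[map_index], 3))
```

Working with logarithms and normalizing at the end avoids overflow in the large exponents of Eq. (14); the loop costs one projection per candidate epoch, which is cheap for windows of a few tens of epochs.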

4 Bayesian Identifier of CYCLE Slips: the BICYCLES software

The Bayesian approach is here applied to the combinations of GPS data described in Sect. 2. The differences between the two phase observations or between phase and range observations of a single GPS receiver and a single satellite are in fact very smooth in time, so that they are well approximated by a polynomial regression, at least in a limited time period. A cycle-slip in one or both of the two carriers is modelled as a discontinuity in this polynomial regression. The software described in this section is called BICYCLES, which stands for Bayesian Identifier of CYCLE Slips; it can be logically divided into two levels. The deeper one, i.e. the core of the proposed algorithm, corresponds to the general procedure described in Sect. 3. It looks for all the discontinuity epochs in a given time-series, also providing, for each of them, the conditional posterior distribution of the discontinuity amplitude. The method is summarized in the data-flow diagram in Fig. 3.

A comment is due about the dimension of the “research window”. There is a typical trade-off. If the window is too large, the two “branches” of the polynomial regression, before and after the discontinuity, are very well determined and, generally, this implies a very accurate estimate of the discontinuity amplitude. On the other hand, the determination of the discontinuity epoch can be a little fuzzy; in fact, by confusing the correct epoch $\tau$ with, for instance, the previous or the following one, the corresponding high residual will not significantly affect $\hat{\sigma}_0^2$ because of the large number of the remaining small residuals in the average. On the contrary, if the window is under-dimensioned, the estimate of the discontinuity epoch is generally sharp, but the higher variability in the data interpolation can lead to wrong results, especially in the discontinuity amplitude estimation.

In the developed software, the default value for the window dimension is 60 epochs, considering also that the data set under study has to be reasonably fitted by a 4th degree polynomial. When more than one discontinuity is present, the window dimension is progressively reduced down to a minimum dimension (15 epochs by default); below this minimum, the data are not analyzed anymore for the reasons already mentioned. Other relevant default values of the parameters (see Fig. 3) are the probability threshold in the selection of the discontinuity epoch $\tau$, that is $P_{\tau,\min} = 0.90$, and the significance level of the $\sigma_0^2$ tests, that is $\alpha = 0.10$.

Let us come now to the software upper level (see the scheme in Fig. 4), concerning the application of the Bayesian technique to the specific GPS problem. The main algorithm in Fig. 3 is first applied to the $L_1 - L_2$ combination, which has a noise level of the order of 3–4 mm. Numerical simulations show that all the discontinuities with an amplitude


Fig. 3 Data-flow diagram of the BICYCLES main module: detection of discontinuities (in terms of epoch and amplitude) in a generic time series of data

Fig. 4 Data-flow diagram of the BICYCLES software: detection and correction of cycle-slips in GNSS data

2 or 3 times larger than the observation noise, i.e. all the discontinuities that cannot be mixed up with simple noise, are successfully identified. Since the wavelengths of the two carriers ($\lambda_1 \approx 19$ cm and $\lambda_2 \approx 24$ cm) are some orders of magnitude greater than the r.m.s. of the $L_1 - L_2$ noise, cycle-slips of even one cycle in only one of the two carriers $L_1$ or $L_2$ are easily detected. This is not true for simultaneous cycle-slips in the two carriers. In fact, since the amplitude of the discontinuity in the $L_1 - L_2$ signal is a combination of the two different wavelengths $\lambda_1$ and $\lambda_2$, as shown in Eq. (2), there are pairs of simultaneous cycle-slips, $\delta N_1$ and $\delta N_2$, whose amplitude is smaller than the $L_1 - L_2$ noise level. Some examples are $\delta N_1 = 5$, $\delta N_2 = 4$ or $\delta N_1 = 9$, $\delta N_2 = 7$. Fortunately, these

cycle-slips can be easily detected analyzing the phase-range combination P1 − L 1 ; in fact, the noise r.m.s. of this combination is of the order of 20–30 cm, but the cycle-slips under consideration have amplitudes always 2 or 3 times larger than such a noise level. Once the epochs of the discontinuities have been found, the problem is to determine the amplitude of these discontinuities, i.e. the integer number of cycles slipped in one of the two carriers or, simultaneously, in both of them. For each detected cycle-slip, we can compute the conditional posterior distribution of the amplitude given the epoch τ , either in the L 1 − L 2 or in the P1 − L 1 combinations. Obviously, due to the small observation noise, the information coming from L 1 − L 2 is much more accurate, but unfortunately the cycle-slips in the two carriers are mixed into a unique (not integer) value. On the other hand, although the discontinuity amplitude in P1 − L 1 is badly estimated, it depends only on the cycle-slip in the L 1 carrier. Note that the discontinuity amplitude is considered as a continuous random variable k12 in the L 1 − L 2 signal and as an integer variable k1 in the P1 − L 1 signal, meaning that the latter combination is expressed in cycles and not in metric form. The idea is to maximize the L 1 − L 2 posterior distribution p(k12 | τ , y 0 ), with k12 = λ1 k1 + λ2 k2 , restricting the k1 values to a few integer numbers around k˜1 , i.e. the MAP estimate of the P1 − L 1 posterior distribution p(k1 | τ , y 0 ).

In particular, after selecting a small range of admissible $k_1$ values, i.e.

$$
k_1 \in \left[\tilde{k}_1 - \Delta k,\ \tilde{k}_1 + \Delta k\right] \subset \mathbb{Z},
\tag{28}
$$

with $\Delta k = 5$ by default, we derive, for each $k_1$, the corresponding integer value of $k_2$ that maximizes $p(k_{12} \,|\, \tau, y_0)$, i.e.

$$
k_2 = \left[\frac{\mu_{k_{12}} - \lambda_1 k_1}{\lambda_2}\right],
\tag{29}
$$

where $[\cdot]$ denotes the nearest integer number and $\mu_{k_{12}}$ is the mean value of $p(k_{12} \,|\, \tau, y_0)$, analytically computed from Eq. (19). Among this set of simultaneous integer slips in the two carriers, the most probable pair with respect to $p(k_{12} \,|\, \tau, y_0)$ is chosen as the estimate of the discontinuity amplitude.

Finally, we would like to underline that when a data gap occurs, instead of introducing a further ambiguity to be estimated, one could think of removing all the epochs with zero values and then repairing the cycle-slip, generated before and after the signal loss, by applying the BICYCLES program.

Fig. 5 Simulated data with two discontinuities at the epochs $\tau_1 = 200$ and $\tau_2 = 300$. The discontinuity amplitudes are $k_1 = 3$ and $k_2 = 5$, respectively
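The integer search of Eqs. (28) and (29) can be sketched as follows. The sketch assumes, as in the text, $k_{12} = \lambda_1 k_1 + \lambda_2 k_2$, and uses the fact that the Student-t posterior of $k_{12}$ is symmetric and unimodal around its mean, so the most probable pair is the one whose $k_{12}$ lies closest to $\mu_{k_{12}}$. The function name `best_integer_pair` and the test values are hypothetical.

```python
# Illustrative sketch of Eqs. (28)-(29); names and inputs are assumptions.
C = 299_792_458.0
F0 = 10.23e6
LAM1, LAM2 = C / (154 * F0), C / (120 * F0)   # carrier wavelengths [m]


def best_integer_pair(mu_k12, k1_tilde, half_width=5):
    """Scan k1 around the P1-L1 estimate k1_tilde and return the pair (k1, k2)
    whose combined jump LAM1*k1 + LAM2*k2 is closest to the L1-L2 mean mu_k12."""
    best = None
    for k1 in range(k1_tilde - half_width, k1_tilde + half_width + 1):   # Eq. (28)
        k2 = round((mu_k12 - LAM1 * k1) / LAM2)                          # Eq. (29)
        misfit = abs(LAM1 * k1 + LAM2 * k2 - mu_k12)
        if best is None or misfit < best[0]:
            best = (misfit, k1, k2)
    return best[1], best[2]


# A one-cycle slip on the first carrier only, observed with a 2 mm error in
# L1 - L2, while the noisier P1 - L1 estimate of k1 is off by one cycle.
pair = best_integer_pair(LAM1 + 0.002, k1_tilde=2)
print(pair)
```

Keeping the search range narrow matters: shifting $k_1$ by 9 cycles and $k_2$ by 7 cycles in the opposite direction changes $k_{12}$ by only about 3 mm, so a much wider range would admit pairs that the $L_1 - L_2$ signal can hardly tell apart.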

5 Numerical tests

The capability of the Bayesian algorithm to detect discontinuities has been tested in four different scenarios:

– simulated data with cycle-slips,
– simulated data without cycle-slips,
– real data with simulated cycle-slips,
– real data with real cycle-slips.

5.1 Simulated data with cycle-slips

A time series of 500 observations with a noise variance $\sigma_0^2 = 1$ is simulated; two jumps are added to these data: the first one at the epoch $\tau_1 = 200$ with an amplitude $k_1 = 3\sigma_0 = 3$; the second one at the epoch $\tau_2 = 300$ with an amplitude $k_2 = 5\sigma_0 = 5$. The resulting signal is shown in Fig. 5. It can be seen that the first discontinuity is quite embedded into the noise and therefore it is more difficult to detect. We start by determining a time-window with a single discontinuity. To this aim the test (24) is performed, to verify the significance of an a-priori noise variance $\tilde{\sigma}_0^2$ with respect to the conditional posterior distribution of $\sigma_0^2$ given the discontinuity epoch $\tau$. In this simulation, in order to clarify how the method works, two different time-windows are considered.

– In the first case, both the discontinuities are present (see Fig. 6). Since the method looks for a single discontinuity, only the jump at $\tau = 300$, i.e. the largest, is detected. However, the corresponding conditional posterior distribution of $\sigma_0^2$ (see Fig. 6) implies a higher noise variance than the real one ($\sigma_0^2 = 1$), leading to the test failure and to the consequent time-window shrinking. What happens here is that the interpolating model, disregarding the first discontinuity, produces residuals larger than expected, especially when close to the epoch $\tau = 200$.

Fig. 6 The most likely interpolation in a time-window containing both the discontinuities (top). Conditional posterior distribution of the noise variance $\sigma_0^2$ given the selected data and the detected discontinuity epoch $\tau = 300$ (bottom). The hypothesis that $\sigma_0^2 = 1$ is rejected
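The variance test that triggers the window shrinking can be sketched with the standard library alone. The sketch relies on the fact that, if $S = c - b_\tau^2 / a_\tau$ is the residual sum of squares of the one-jump fit, then $S / \sigma_0^2$ follows a chi-square distribution with $n - m - 1$ degrees of freedom a posteriori, so that $p(\sigma_0^2 > \tilde{\sigma}_0^2 \,|\, \tau, y_0)$ is a chi-square cumulative probability. The series used for the chi-square CDF and all function names are assumptions made for illustration; the series is meant for moderate arguments only.

```python
import math


def chi2_cdf(x, dof):
    """Chi-square CDF via the series of the regularized lower incomplete gamma
    function P(dof/2, x/2); adequate for moderate x (a sketch, not a library routine)."""
    if x <= 0.0:
        return 0.0
    s, z = dof / 2.0, x / 2.0
    term = 1.0 / s
    total = term
    k = 0
    while abs(term) > 1e-15 * abs(total) and k < 10000:
        k += 1
        term *= z / (s + k)
        total += term
    return total * math.exp(-z + s * math.log(z) - math.lgamma(s))


def prob_variance_exceeds(s_res, dof, sigma2_prior):
    """p(sigma0^2 > sigma2_prior | tau, y0), with s_res / sigma0^2 ~ chi2(dof)."""
    return chi2_cdf(s_res / sigma2_prior, dof)


def more_than_one_jump(s_res, dof, sigma2_prior, alpha=0.10):
    """Condition (24): the a-priori variance is too small for the observed residuals."""
    return prob_variance_exceeds(s_res, dof, sigma2_prior) > 1.0 - alpha


# Hypothetical windows with 50 degrees of freedom and a-priori variance 1:
print(more_than_one_jump(100.0, 50, 1.0))   # residuals twice as large as expected
print(more_than_one_jump(50.0, 50, 1.0))    # residuals in line with the a-priori variance
```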


– In the second case, only the first discontinuity is included in the selected time-window (see Fig. 7). The conditional posterior distribution of σ₀² given τ = 200 is now centered around the true value of the noise variance (σ₀² = 1); as a consequence the test on σ₀² is passed, meaning that the hypothesis of a single discontinuity in the time span under study is accepted.

Fig. 7 The most likely interpolation in a time-window containing only one discontinuity (top). Conditional posterior distribution of the noise variance σ₀² given the selected data and the detected discontinuity epoch τ = 200 (bottom). The hypothesis that σ₀² = 1 is accepted

Once the size of the time-window has been defined, we concentrate on the discontinuity detection. The marginal posterior distribution of the discontinuity epoch τ, given the observations y₀ in Fig. 7, is shown in Fig. 8. The MAP estimate τ = 200, with a probability of about 0.85, coincides with the true value; note that, due to the small amplitude of the discontinuity compared with the noise level (k1 = 3σ₀), the surrounding epochs do not have a negligible probability. As for the estimate of this amplitude, it can be modelled either as a continuous variable or as an integer variable (as it actually is). In the former case (see Fig. 9), its conditional posterior distribution given the discontinuity epoch τ has a Student-t shape and the MAP value of k is equal to 3.08; in the latter case (see Fig. 10), the conditional posterior of k is a discrete distribution and the MAP value k = 3 coincides with the true discontinuity amplitude, thanks to the stronger a-priori information on the domain of k.

Fig. 8 Marginal posterior distribution of the discontinuity epoch τ given the observations of Fig. 7

Fig. 9 Conditional posterior distribution of the discontinuity amplitude k given the observations of Fig. 7 and the detected discontinuity epoch τ = 200 (see Fig. 8). k is considered as a continuous random variable

Fig. 10 Conditional posterior distribution of the discontinuity amplitude k given the observations of Fig. 7 and the detected discontinuity epoch τ = 200 (see Fig. 8). k is considered as an integer random variable
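The detection step can be sketched numerically. The following Python fragment is a minimal illustration, not the BICYCLES implementation: it assumes a quadratic polynomial for the smooth part, a Heaviside step h_τ (zero before the epoch, one from the epoch onwards) and evaluates the marginal posterior of the epoch in the closed form of Eq. (A14) of Appendix A. The function name `epoch_posterior`, the polynomial degree and the simulated series are illustrative choices.

```python
import numpy as np

def epoch_posterior(y, degree=2):
    """Marginal posterior of the discontinuity epoch, following Eq. (A14).

    Index i means that the step starts at sample i (0-based). The first
    sample gets probability 0 because a step over the whole window cannot be
    separated from the constant term of the polynomial.
    """
    n = y.size
    t = np.linspace(-1.0, 1.0, n)            # rescaled time (assumed)
    A = np.vander(t, degree + 1)             # polynomial design matrix
    m = A.shape[1]
    Q, _ = np.linalg.qr(A)                   # orthonormal basis of range(A)

    def residual(v):                         # (I - P) v
        return v - Q @ (Q.T @ v)

    ry = residual(y)
    c = ry @ ry
    log_p = np.full(n, -np.inf)
    for i in range(1, n):
        h = np.zeros(n)
        h[i:] = 1.0                          # Heaviside step starting at i
        rh = residual(h)
        a = rh @ rh
        b = rh @ ry
        log_p[i] = 0.5 * (n - m - 2) * np.log(a) \
                 - 0.5 * (n - m - 1) * np.log(a * c - b * b)
    p = np.exp(log_p - log_p.max())          # avoid overflow
    return p / p.sum()                       # discrete normalization

# Simulated example (illustrative numbers): unit-variance noise, jump of 10
rng = np.random.default_rng(0)
n = 100
t = np.linspace(-1.0, 1.0, n)
y = 1.0 + 0.5 * t - 0.8 * t**2 + rng.normal(0.0, 1.0, n)
y[60:] += 10.0
p = epoch_posterior(y)
tau_map = int(np.argmax(p))
```

With a jump this large the maximum is expected at the simulated epoch; for amplitudes of 2 to 3 σ₀ the neighbouring epochs share a non-negligible part of the probability, as in Fig. 8.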

5.2 Simulated data without cycle-slips

A time series of 40 values without discontinuities is considered (see Fig. 11). The marginal posterior distribution of the discontinuity epoch τ given these observations (no time-window selection is required here) is computed and shown in Fig. 12. This distribution is practically uniform, that is to say there are no epochs with a predominant probability. The value τ = 1 is not displayed in Fig. 12 since its probability of being a discontinuity epoch is always equal to 0, as discussed in Sect. 3.

Fig. 11 Simulated data without discontinuities (true model in solid line)

Fig. 12 Marginal posterior distribution of the discontinuity epoch given the observations of Fig. 11 (no discontinuities are present)

5.3 Real data with simulated cycle-slips

In order to evaluate the performance of the Bayesian method when applied to cycle-slip detection in GNSS data, artificial jumps are introduced in some RINEX files. These data are then processed by BERNESE and BICYCLES. It is important to stress that undifferenced GPS data processing with BERNESE 5.0 requires IGS precise ephemerides (http://igscb.jpl.nasa.gov/) and the high-rate satellite clocks estimated by the Center for Orbit Determination in Europe (CODE, http://www.aiub.unibe.ch/download/BSWUSER50/ORB/), while BICYCLES requires RINEX files only. In particular, simulated cycle-slips are introduced in the following GPS data, collected from 23 to 27 May 2005:

– 1 s data from the permanent station of Milan, Italy (http://gps.agra.unimi.it/) with Ashtech ZII receiver and Ashtech ground plane antenna;
– 5 s and 30 s data from the permanent station of Como, Italy (http://geomatica.como.polimi.it/) with TRIMBLE 4000SSI receiver and TRM29659.00 antenna.

In the case of 1 s data (see Table 1) BICYCLES is able to detect and correct all the simulated discontinuities, while BERNESE can only estimate the epoch but not the amplitude when the cycle-slip is a multiple of (δN1 = 1, δN2 = 1). Note that the “unfortunate” couples (δN1 = 5, δN2 = 4) and (δN1 = 9, δN2 = 7) are correctly estimated by both BERNESE and BICYCLES. The same conclusions can be drawn when 5 s data are used (see Table 2); in addition, in this case a cycle-slip is not even detected by BERNESE.

Table 1 Cycle-slips estimated by BERNESE and BICYCLES using 1 s GPS data. Wrong or incomplete estimates in bold

| Epoch | Satellite | Simulated jump (δN1, δN2) | BERNESE | BICYCLES |
|---|---|---|---|---|
| 14:25:00 | 2 | (5, 4) | (5, 4) | (5, 4) |
| 14:25:00 | 4 | (5, 18) | (5, 18) | (5, 18) |
| 14:25:00 | 8 | (0, −2) | (0, −2) | (0, −2) |
| 14:25:00 | 10 | (−1, 1) | (−1, 1) | (−1, 1) |
| 14:25:00 | 13 | (−2, −3) | (−2, −3) | (−2, −3) |
| 14:25:00 | 16 | (23, 24) | (23, 24) | (23, 24) |
| 14:25:00 | 23 | (1, 32) | (1, 32) | (1, 32) |
| 14:25:00 | 24 | (9, 7) | (9, 7) | (9, 7) |
| 15:15:45 | 2 | (0, 1) | (0, 1) | (0, 1) |
| 15:15:45 | 4 | (11, 26) | (11, 26) | (11, 26) |
| 15:15:45 | 8 | (9, 7) | (9, 7) | (9, 7) |
| 15:15:45 | 10 | (−5, −8) | (−5, −8) | (−5, −8) |
| 15:15:45 | 13 | (5, 4) | (5, 4) | (5, 4) |
| 15:15:45 | 23 | (9, 7) | (9, 7) | (9, 7) |
| 16:35:00 | 2 | (1, 8) | (1, 8) | (1, 8) |
| 16:35:00 | 8 | (2, 2) | **Only epoch** | (2, 2) |
| 16:35:00 | 10 | (9, 7) | (9, 7) | (9, 7) |
| 16:35:00 | 13 | (0, 8) | (0, 8) | (0, 8) |
| 16:35:00 | 26 | (2, −21) | (2, −21) | (2, −21) |
| 16:35:00 | 28 | (−5, −4) | (−5, −4) | (−5, −4) |
| 16:35:00 | 29 | (1, 1) | **Only epoch** | (1, 1) |
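Why couples such as (5, 4) and (9, 7) are called “unfortunate” can be seen by converting a couple of integer slips into the jump it produces in the geometry-free L1 − L2 combination. The sketch below assumes that this jump, in metres, is k = λ1 δN1 − λ2 δN2 (the combination is defined earlier in the paper, so the sign convention used here is an assumption) and uses the nominal GPS carrier frequencies; `l1_l2_jump` is an illustrative name.

```python
C = 299_792_458.0          # speed of light [m/s]
F1 = 1575.42e6             # nominal GPS L1 frequency [Hz]
F2 = 1227.60e6             # nominal GPS L2 frequency [Hz]
LAMBDA1 = C / F1           # L1 wavelength, about 0.190 m
LAMBDA2 = C / F2           # L2 wavelength, about 0.244 m

def l1_l2_jump(dn1, dn2):
    """Jump [m] in the L1 - L2 combination for integer slips (dn1, dn2).

    Assumed convention: k = lambda1 * dn1 - lambda2 * dn2.
    """
    return LAMBDA1 * dn1 - LAMBDA2 * dn2

for couple in [(1, 1), (5, 4), (9, 7), (3, -22), (2, -23)]:
    print(couple, round(l1_l2_jump(*couple), 4))
```

Under these assumptions (5, 4) and (9, 7) give jumps of only about −2.5 cm and 3 mm respectively, so they can hardly be seen in the L1 − L2 combination alone. The couples (3, −22) and (2, −23) give about 5.94 m and 6.00 m, which matches the amplitude range on the horizontal axis of Fig. 13.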


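Table 2 Cycle-slips estimated by BERNESE and BICYCLES using 5 s GPS data. Wrong or incomplete estimates in bold

| Epoch | Satellite | Simulated jump (δN1, δN2) | BERNESE | BICYCLES |
|---|---|---|---|---|
| 05:02:25 | 3 | (11, −11) | (11, −11) | (11, −11) |
| 05:10:35 | 18 | (−1, 86) | (−1, 86) | (−1, 86) |
| 05:11:30 | 21 | (12, 38) | (12, 38) | (12, 38) |
| 12:10:45 | 4 | (−1, 2) | (−1, 2) | (−1, 2) |
| 12:10:45 | 7 | (−1, 1) | **Only epoch** | (−1, 1) |
| 12:10:45 | 11 | (0, 1) | (0, 1) | (0, 1) |
| 12:10:45 | 13 | (3, 2) | (3, 2) | (3, 2) |
| 12:10:45 | 20 | (2, −3) | (2, −3) | (2, −3) |
| 12:10:45 | 23 | (3, 4) | **Not detected** | (3, 4) |
| 12:10:45 | 24 | (7, 9) | (7, 9) | (7, 9) |
| 12:10:45 | 25 | (−2, −2) | **Only epoch** | (−2, −2) |
| 13:55:55 | 2 | (4, 5) | (4, 5) | (4, 5) |
| 13:55:55 | 4 | (3, 2) | (3, 2) | (3, 2) |
| 13:55:55 | 8 | (3, 11) | (3, 11) | (3, 11) |
| 13:55:55 | 10 | (1, −1) | (1, −1) | (1, −1) |
| 13:55:55 | 13 | (4, 23) | (4, 23) | (4, 23) |
| 13:55:55 | 20 | (0, 3) | (0, 3) | (0, 3) |
| 13:55:55 | 23 | (5, 5) | **Only epoch** | (5, 5) |
| 13:55:55 | 24 | (0, −2) | (0, −2) | (0, −2) |
| 13:56:20 | 13 | (2, 3) | (2, 3) | (2, 3) |

Table 3 Cycle-slips estimated by BERNESE and BICYCLES using 30 s GPS data. Wrong or incomplete estimates in bold

| Epoch | Satellite | Simulated jump (δN1, δN2) | BERNESE | BICYCLES |
|---|---|---|---|---|
| 04:15:00 | 3 | (7, 13) | (7, 13) | (7, 13) |
| 04:15:00 | 6 | (0, 43) | (0, 43) | (0, 43) |
| 04:15:00 | 15 | (5, 4) | (5, 4) | (5, 4) |
| 04:15:00 | 16 | (1, 1) | **Only epoch** | (1, 1) |
| 04:15:00 | 18 | (9, 7) | (9, 7) | (9, 7) |
| 04:15:00 | 21 | (−2, −3) | (−2, −3) | (−2, −3) |
| 04:15:00 | 22 | (0, −11) | (0, −11) | (0, −11) |
| 13:21:00 | 21 | (2, 21) | (2, 21) | (2, 21) |
| 14:07:00 | 2 | (1, 1) | **Only epoch** | (1, 1) |
| 14:07:00 | 4 | (2, 8) | (2, 8) | (2, 8) |
| 14:07:00 | 8 | (3, −22) | (3, −22) | **(2, −23)** |
| 14:07:00 | 13 | (−1, 31) | (−1, 31) | (−1, 31) |
| 14:07:00 | 20 | (9, 7) | (9, 7) | (9, 7) |
| 14:07:00 | 23 | (−4, −5) | **Not detected** | (−4, −5) |
| 14:07:00 | 24 | (2, −2) | (2, −2) | (2, −2) |
| 18:32:00 | 7 | (3, 8) | (3, 8) | (3, 8) |
| 18:32:00 | 8 | (−3, 3) | **Only epoch** | (−3, 3) |
| 18:32:00 | 9 | (0, 75) | (0, 75) | (0, 75) |
| 18:32:00 | 10 | (0, 1) | (0, 1) | (0, 1) |
| 18:32:00 | 18 | (−4, −5) | **Only epoch** | **(−5, −6)** |
| 18:32:00 | 26 | (5, 4) | (5, 4) | (5, 4) |
| 18:32:00 | 28 | (11, 12) | (11, 12) | (11, 12) |
| 18:32:00 | 29 | (1, 22) | (1, 22) | (1, 22) |
| 21:44:30 | 5 | (5, 4) | (5, 4) | (5, 4) |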

Concerning the 30 s data analysis (see Table 3), BICYCLES seems to be less robust than in the previous cases; in particular, due to the less smooth shape of the ionospheric effect when sampled every 30 s, the amplitudes of two cycle-slips are wrongly estimated. In these cases, however, the estimated couples (δN1, δN2) fall in the tail of the conditional posterior distribution of the cycle-slip amplitude in the L1 − L2 combination given the correctly detected epoch (see Fig. 13); this means that the detected cycle-slip should be tagged, but not corrected.

Fig. 13 Wrongly estimated cycle-slip on the basis of the conditional posterior distribution of the discontinuity amplitude in the L1 − L2 combination given the detected epoch. Marked values: true value (δN1 = 3, δN2 = −22), estimated value (δN1 = 2, δN2 = −23)

5.4 Real data with real cycle-slips

The Bayesian method has also been tested on two real 30 s GPS data sets. In particular, GPS data from the days 24 and 27 May 2005 have been analyzed with BERNESE 5.0 and BICYCLES. Both dual frequency receivers, Leica GX1230 and Leica GRX1200PRO with LEIAX1202 and LEIAT504 antennae respectively, belong to the University of Jaen (Spain). The latter is the receiver of the University permanent station. The results of the analysis are shown in Table 4 and can be summarized as follows:

– If a cycle-slip is detected, BERNESE and BICYCLES always estimate the same discontinuity epoch.
– In one case BERNESE estimates the epoch but not the amplitude of the discontinuity. However, since the elevation of the satellite is lower than 8°, this case does not represent an interesting example.
– There is one case in which the couples (δN1, δN2) estimated by BERNESE and BICYCLES are significantly different, even though the corresponding amplitudes of the cycle-slip in the L1 − L2 combination are practically the same. Since the elevation of the satellite is higher than 20°, a further investigation has been carried out to identify the correct estimate.

Table 4 Cycle-slips estimated by BERNESE and BICYCLES using 30 s real GPS data. Incomplete or inconsistent estimates in bold

| Station | Epoch | Satellite | BERNESE (δN1, δN2) | BICYCLES (δN1, δN2) |
|---|---|---|---|---|
| Leica GX1230 | 393 | 13 | (0, −10) | (0, −10) |
| Leica GX1230 | 430 | 16 | **Only epoch** | (−1, 16) |
| Leica GX1230 | 868 | 19 | (0, −29) | (0, −29) |
| Leica GX1230 | 1257 | 28 | (0, −87) | (0, −87) |
| Leica GX1230 | 1475 | 7 | (0, −70) | (0, −70) |
| Leica GRX1200PRO (ujaen) | 199 | 14 | (−43, 45) | (−43, 45) |
| Leica GRX1200PRO (ujaen) | 209 | 14 | (−36, −62) | (−36, −62) |
| Leica GRX1200PRO (ujaen) | 212 | 21 | (−16, −31) | (−16, −31) |
| Leica GRX1200PRO (ujaen) | 1590 | 7 | **(23, 34)** | **(32, 41)** |
| Leica GRX1200PRO (ujaen) | 2364 | 10 | (44, 55) | (44, 55) |
| Leica GRX1200PRO (ujaen) | 2590 | 7 | (−58, −100) | (−58, −100) |
| Leica GRX1200PRO (ujaen) | 2592 | 29 | (−55, −77) | (−55, −77) |
| Leica GRX1200PRO (ujaen) | 2605 | 14 | (−17, −33) | (−17, −33) |

In the case under study, the GPS data come from the University of Jaen permanent station, therefore the ITRF00 coordinates of this station are known. They are reported in Table 5 and are considered as “true” values in this test. A 30 min time series of GPS data has been extracted from the RINEX file, so that only the jump associated with satellite 7 at the epoch 1590 is present. This cycle-slip has been corrected by introducing in the RINEX file an additional jump, in one case with an amplitude opposite to the BERNESE estimate, i.e. (−23, −34), and in another case with an amplitude opposite to the BICYCLES estimate, i.e. (−32, −41). After that, the coordinates of the permanent station have been estimated by using the PPP (Precise Point Positioning) method provided by BERNESE 5.0. The results are shown in Table 5. The estimated coordinates have been compared with the “true” ones. Using the BERNESE correction the differences range from 0.03 to 0.38 m, which is consistent with the current accuracy level provided by PPP when a short observation session is considered. On the other hand, using the BICYCLES correction the differences are much higher, from 0.74 to 1.9 m. In any case, this is a situation in which the cycle-slip detected by BICYCLES should be flagged without forcing any correction, since the selected couple (δN1, δN2) falls in the tail of the posterior distribution (see Fig. 14).

Table 5 Coordinates of the permanent station installed at the University of Jaen

| Coordinates | X (m) | Y (m) | Z (m) |
|---|---|---|---|
| ITRF00(2003.5) | 5036324.955 | −332898.888 | 3887177.279 |
| PPP with (δN1, δN2) = (23, 34) | 5036324.989 | −332899.268 | 3887177.452 |
| PPP with (δN1, δN2) = (32, 41) | 5036325.698 | −332900.802 | 3887178.408 |

Fig. 14 Comparison between BERNESE and BICYCLES estimates with regard to the conditional posterior distribution of the discontinuity amplitude in the L1 − L2 combination given the detected epoch τ = 1590. Receiver Leica GRX1200PRO, satellite 7

Fig. 15 Phase-range combination P1 − L1 [m] with a cycle-slip at the epoch τ = 1590. Receiver Leica GRX1200PRO, satellite 7

Note that the increasing noise level of the P1 − L1 combination close to the cycle-slip epoch (see Fig. 15) can be blamed for the wrong estimate of the cycle-slip amplitude by BICYCLES. In fact, when the difference between the two cycle-slip corrections in the L1 − L2 combination is of the order of the noise level, the possibility of discriminating between them mainly relies on the quality of the phase-range combination. This limitation could be overcome by the introduction of the third frequency in the GNSS measurements, as foreseen in the modernized GPS and in the Galileo System (Zimmermann et al. 2006).
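The coordinate differences quoted in this section follow directly from Table 5. As a minimal check (variable names are illustrative, and the third coordinate is taken to be Z), the component-wise differences between each PPP solution and the ITRF00 coordinates can be computed as:

```python
import numpy as np

# Coordinates from Table 5, in metres, ordered as (X, Y, Z)
itrf = np.array([5036324.955, -332898.888, 3887177.279])
ppp_bernese = np.array([5036324.989, -332899.268, 3887177.452])   # (23, 34) removed
ppp_bicycles = np.array([5036325.698, -332900.802, 3887178.408])  # (32, 41) removed

# Absolute differences with respect to the reference coordinates
diff_bernese = np.abs(ppp_bernese - itrf)
diff_bicycles = np.abs(ppp_bicycles - itrf)

print(np.round(diff_bernese, 3))
print(np.round(diff_bicycles, 3))
```

The BERNESE correction leaves differences of about 0.03, 0.38 and 0.17 m, while the BICYCLES correction leaves about 0.74, 1.91 and 1.13 m, in agreement with the ranges given in the text.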

6 Conclusions

The general problem of detecting discontinuities in a smooth signal, i.e. in a signal that can be reasonably modelled by a multiple polynomial regression, has been studied in this work. A solution based on the Bayesian approach has been proposed and successfully tested on simulated data. Then the method has been applied to the specific geodetic problem of GNSS cycle-slip detection. An algorithm based on the joint use of different linear combinations of undifferenced observations has been implemented in a software called BICYCLES. Numerical tests on real data (with both simulated and real cycle-slips) show the satisfactory performance of BICYCLES, also in comparison with the widely used BERNESE software. In particular, BICYCLES seems to be a very effective tool for detecting and correcting cycle-slips in GNSS data sets sampled at 1 s and 5 s intervals, while some errors can occur in the amplitude estimate if 30 s data are used. However, such cases can generally be recognized by the low a-posteriori probability of the estimated amplitude, which prevents a wrong correction of the detected cycle-slip, as the numerical examples clearly show. Furthermore, we expect that the efficiency of the method when applied to 10 s data will not differ from the 1 s and 5 s cases. At this sampling rate, in fact, the behaviour of the ionosphere is such that it does not produce large residuals with respect to a smooth interpolation.

The BICYCLES software, with its ability to detect cycle-slips in the undifferenced observations with a very limited computational burden, could find applications, for instance, in the data pre-processing of permanent GNSS stations, in order to provide services. It is also important to stress that the algorithm can be generalized to a triple frequency scenario; this research is currently under development. The method described can, of course, be applied to all those phenomena which can be modelled as smooth functions with discontinuities embedded in noise: for example, staying in the GPS field, to detect cycle-slips in double differences or, in a geophysical context, to determine a sudden break in a slowly deforming body, as happens in a seismic area when an earthquake occurs.

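Appendix A: Posterior distributions in a regression with one discontinuity

In this Appendix, the case of a single discontinuity in a smooth time-series of data is considered and the marginal posterior distribution of the discontinuity epoch $\tau$ is analytically derived along with the conditional posterior distribution of the discontinuity amplitude $k$ and of the noise variance $\sigma_0^2$. Starting from the joint posterior distribution of all the parameters $p(x, \tau, k, \sigma_0^2 \mid y_0)$, given by Eq. (13), we first compute the marginal distribution with respect to the parameters $x$ of the regression, i.e.

$$
p(\tau, k, \sigma_0^2 \mid y_0) = \int_{\mathbb{R}^m} p(x, \tau, k, \sigma_0^2 \mid y_0)\, dx
\;\propto\; \int_{\mathbb{R}^m} \frac{\exp\left[-\frac{1}{2\sigma_0^2}\, \| y_0 - Ax - k h_\tau \|^2\right]}{\left(\sigma_0^2\right)^{\frac{n}{2}+1}}\, dx . \tag{A1}
$$

Defining $\hat{x}$ as the least-squares estimate of the vector $x$ (for given $\tau$ and $k$), we can write the orthogonal decomposition

$$
\begin{aligned}
\| y_0 - Ax - k h_\tau \|^2 &= \| y_0 - A\hat{x} - k h_\tau \|^2 + \| A(x - \hat{x}) \|^2 \\
&= \| y_0 - A\hat{x} - k h_\tau \|^2 + (x - \hat{x})^{+} N (x - \hat{x})
\end{aligned} \tag{A2}
$$

where $N = A^{+}A$ is the normal matrix of the corresponding least-squares problem. Substituting Eq. (A2) into Eq. (A1), we have

$$
\begin{aligned}
p(\tau, k, \sigma_0^2 \mid y_0) &\propto \frac{\exp\left[-\frac{1}{2\sigma_0^2}\, \| y_0 - A\hat{x} - k h_\tau \|^2\right]}{\left(\sigma_0^2\right)^{\frac{n}{2}+1}} \cdot \int_{\mathbb{R}^m} \exp\left[-\frac{1}{2\sigma_0^2}\, (x - \hat{x})^{+} N (x - \hat{x})\right] dx \\
&= \frac{\exp\left[-\frac{1}{2\sigma_0^2}\, \| y_0 - A\hat{x} - k h_\tau \|^2\right]}{\left(\sigma_0^2\right)^{\frac{n}{2}+1}} \cdot \frac{\left(2\pi\sigma_0^2\right)^{\frac{m}{2}}}{\sqrt{\det N}} \cdot \int_{\mathbb{R}^m} \frac{\sqrt{\det N}}{\left(2\pi\sigma_0^2\right)^{\frac{m}{2}}} \exp\left[-\frac{1}{2\sigma_0^2}\, (x - \hat{x})^{+} N (x - \hat{x})\right] dx .
\end{aligned} \tag{A3}
$$

Note that the integrand in the last line of Eq. (A3) is just the density function of an m-dimensional normal random variable (with mean equal to $\hat{x}$ and covariance matrix equal to $\sigma_0^2 N^{-1}$). Therefore this integral is equal to 1 by the normalization condition and the marginal distribution reads

$$
p(\tau, k, \sigma_0^2 \mid y_0) \propto \frac{\exp\left[-\frac{1}{2\sigma_0^2}\, \| y_0 - A\hat{x} - k h_\tau \|^2\right]}{\left(\sigma_0^2\right)^{\frac{n-m}{2}+1}} . \tag{A4}
$$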
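The orthogonal decomposition of Eq. (A2), on which the marginalization over x relies, can be checked numerically. The fragment below is only a sanity check on random data: the design matrix, the step epoch, the amplitude and the trial vector x are arbitrary illustrative values, and the transpose (written with the superscript + in the text) is the NumPy `.T`.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 30, 3
A = rng.normal(size=(n, m))        # arbitrary full-rank design matrix
y0 = rng.normal(size=n)            # arbitrary observations
h = np.zeros(n)
h[12:] = 1.0                       # step function h_tau (illustrative epoch)
k = 2.5                            # illustrative amplitude
x = rng.normal(size=m)             # any trial parameter vector

z = y0 - k * h                     # data reduced by the step
N = A.T @ A                        # normal matrix
x_hat = np.linalg.solve(N, A.T @ z)  # least-squares estimate for given tau, k

lhs = np.sum((z - A @ x) ** 2)
rhs = np.sum((z - A @ x_hat) ** 2) + (x - x_hat) @ N @ (x - x_hat)
print(lhs, rhs)                    # the two values agree up to round-off
```

The cross term vanishes because the least-squares residual is orthogonal to the columns of A, which is what allows the integral over x to be carried out as a Gaussian normalization.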

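The next step is to marginalize the distribution (A4) with respect to the noise variance $\sigma_0^2$, i.e.

$$
p(\tau, k \mid y_0) = \int_0^{+\infty} p(\tau, k, \sigma_0^2 \mid y_0)\, d\sigma_0^2
\;\propto\; \int_0^{+\infty} \frac{\exp\left[-\frac{1}{2\sigma_0^2}\, \| y_0 - A\hat{x} - k h_\tau \|^2\right]}{\left(\sigma_0^2\right)^{\frac{n-m}{2}+1}}\, d\sigma_0^2 . \tag{A5}
$$

Let us define

$$
\eta = \frac{1}{2\sigma_0^2}\, \| y_0 - A\hat{x} - k h_\tau \|^2 \tag{A6}
$$

so that

$$
\sigma_0^2 = \frac{\| y_0 - A\hat{x} - k h_\tau \|^2}{2\eta} , \tag{A7}
$$

$$
d\sigma_0^2 = -\frac{\| y_0 - A\hat{x} - k h_\tau \|^2}{2\eta^2}\, d\eta . \tag{A8}
$$

By substituting the new variable $\eta$ into expression (A5), we obtain

$$
p(\tau, k \mid y_0) \propto \frac{1}{\| y_0 - A\hat{x} - k h_\tau \|^{\,n-m}} \int_0^{+\infty} \eta^{\frac{n-m}{2}-1} \exp[-\eta]\, d\eta
= \frac{\Gamma\!\left(\frac{n-m}{2}\right)}{\| y_0 - A\hat{x} - k h_\tau \|^{\,n-m}}
\propto \frac{1}{\| y_0 - A\hat{x} - k h_\tau \|^{\,n-m}} \tag{A9}
$$

where $\Gamma(\cdot)$ is the Euler gamma function. Now, considering the expression (A9) and computing the marginal distribution with respect to the discontinuity amplitude $k$, we can finally derive the sought posterior distribution of the discontinuity epoch $\tau$, i.e.

$$
p(\tau \mid y_0) = \int_{-\infty}^{+\infty} p(\tau, k \mid y_0)\, dk \;\propto\; \int_{-\infty}^{+\infty} \frac{1}{\| y_0 - A\hat{x} - k h_\tau \|^{\,n-m}}\, dk . \tag{A10}
$$

In order to solve the integral in Eq. (A10), it is useful to explicitly express the dependence of the quadratic form $\| y_0 - A\hat{x} - k h_\tau \|^2$ on the variable $k$, also introducing the least-squares projector $P = A(A^{+}A)^{-1}A^{+}$, i.e.

$$
\begin{aligned}
\| y_0 - A\hat{x} - k h_\tau \|^2 &= \| (I - P)(y_0 - k h_\tau) \|^2 \\
&= (y_0 - k h_\tau)^{+} (I - P)(y_0 - k h_\tau) \\
&= a_\tau k^2 - 2 b_\tau k + c \\
&= a_\tau \left[ \left( k - \frac{b_\tau}{a_\tau} \right)^2 + \frac{a_\tau c - b_\tau^2}{a_\tau^2} \right]
\end{aligned} \tag{A11}
$$

where

$$
a_\tau = h_\tau^{+} (I - P) h_\tau , \qquad b_\tau = h_\tau^{+} (I - P) y_0 , \qquad c = y_0^{+} (I - P) y_0 .
$$

By substituting Eq. (A11) into Eq. (A10) and defining the new variable

$$
t = \frac{k - \dfrac{b_\tau}{a_\tau}}{\sqrt{\dfrac{a_\tau c - b_\tau^2}{a_\tau^2}}}\, \sqrt{n-m-1} \tag{A12}
$$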
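The quantities a_τ, b_τ and c of Eq. (A11) are all that is needed to evaluate the posteriors, so it is worth checking the completion of the square numerically. The following fragment is a sketch on random data; the quadratic polynomial, the step epoch and the trial amplitude are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
t = np.linspace(-1.0, 1.0, n)
A = np.vander(t, 3)                      # quadratic polynomial regression
y0 = rng.normal(size=n)                  # arbitrary observations
h = np.zeros(n)
h[25:] = 1.0                             # step function h_tau

P = A @ np.linalg.solve(A.T @ A, A.T)    # least-squares projector
M = np.eye(n) - P                        # I - P
a = h @ M @ h
b = h @ M @ y0
c = y0 @ M @ y0

k = 1.7                                  # any trial amplitude
resid = np.sum((M @ (y0 - k * h)) ** 2)  # ||(I - P)(y0 - k h)||^2
quad = a * k**2 - 2.0 * b * k + c
completed = a * ((k - b / a) ** 2 + (a * c - b**2) / a**2)
print(resid, quad, completed)            # three equal values up to round-off
```

Since only a_τ and b_τ depend on the candidate epoch, the projector (or an orthonormal basis of the columns of A) can be computed once per time-window and reused for every τ.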


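we obtain

$$
\begin{aligned}
p(\tau \mid y_0) &\propto \int_{-\infty}^{+\infty} a_\tau^{-\frac{n-m}{2}} \left[ \frac{a_\tau c - b_\tau^2}{a_\tau^2} \right]^{-\frac{n-m}{2}} \left[ 1 + \frac{\left( k - \frac{b_\tau}{a_\tau} \right)^2}{\frac{a_\tau c - b_\tau^2}{a_\tau^2}} \right]^{-\frac{n-m}{2}} dk \\
&= a_\tau^{-\frac{n-m}{2}} \left[ \frac{a_\tau c - b_\tau^2}{a_\tau^2} \right]^{-\frac{n-m}{2}} \int_{-\infty}^{+\infty} \left[ 1 + \frac{t^2}{n-m-1} \right]^{-\frac{n-m}{2}} \frac{1}{\sqrt{n-m-1}} \sqrt{\frac{a_\tau c - b_\tau^2}{a_\tau^2}}\; dt \\
&= a_\tau^{\frac{n-m-2}{2}} \left( a_\tau c - b_\tau^2 \right)^{-\frac{n-m-1}{2}} \frac{\sqrt{\pi}\; \Gamma\!\left(\frac{n-m-1}{2}\right)}{\Gamma\!\left(\frac{n-m}{2}\right)} \cdot \int_{-\infty}^{+\infty} \frac{\Gamma\!\left(\frac{n-m}{2}\right)}{\sqrt{\pi (n-m-1)}\; \Gamma\!\left(\frac{n-m-1}{2}\right)} \left[ 1 + \frac{t^2}{n-m-1} \right]^{-\frac{n-m}{2}} dt .
\end{aligned} \tag{A13}
$$

The integrand in the last line of Eq. (A13) is just the density function of a Student-t random variable (with $n - m - 1$ degrees of freedom), so the corresponding integral is equal to 1 and the marginal posterior distribution of $\tau$ results

$$
p(\tau \mid y_0) \propto a_\tau^{\frac{n-m-2}{2}} \cdot \left( a_\tau c - b_\tau^2 \right)^{-\frac{n-m-1}{2}} . \tag{A14}
$$

Note that the expression (A14) is not a density function but a discrete probability distribution, which has to be normalized by the condition

$$
\sum_{i=1}^{n} p(\tau = t_i \mid y_0) = 1 . \tag{A15}
$$

The epoch of the discontinuity can now be estimated as the maximum a posteriori (MAP) $\hat{\tau}$ of the distribution (A14). Then the amplitude of the discontinuity can be obtained by maximizing its conditional posterior distribution given the detected epoch $\hat{\tau}$. Using Eqs. (A9) and (A11), we have

$$
\begin{aligned}
p(k \mid \hat{\tau}, y_0) &\propto p(\tau = \hat{\tau}, k \mid y_0) \propto \frac{1}{\| y_0 - A\hat{x} - k h_{\hat{\tau}} \|^{\,n-m}} \\
&= \frac{1}{a_{\hat{\tau}}^{\frac{n-m}{2}} \left[ \left( k - \frac{b_{\hat{\tau}}}{a_{\hat{\tau}}} \right)^2 + \frac{a_{\hat{\tau}} c - b_{\hat{\tau}}^2}{a_{\hat{\tau}}^2} \right]^{\frac{n-m}{2}}}
\;\propto\; \left[ 1 + \frac{\left( k - \frac{b_{\hat{\tau}}}{a_{\hat{\tau}}} \right)^2}{\frac{a_{\hat{\tau}} c - b_{\hat{\tau}}^2}{a_{\hat{\tau}}^2}} \right]^{-\frac{n-m}{2}}
\end{aligned} \tag{A16}
$$

which can be reduced to a Student-t density function (with $n - m - 1$ degrees of freedom) by the linear transformation (A12). In this way the normalization of the Student-t can be used to normalize the distribution (A16), i.e.

$$
p(k \mid \hat{\tau}, y_0) = \frac{\Gamma\!\left(\frac{n-m}{2}\right)}{\Gamma\!\left(\frac{n-m-1}{2}\right) \sqrt{\pi\, \dfrac{a_{\hat{\tau}} c - b_{\hat{\tau}}^2}{a_{\hat{\tau}}^2}}} \left[ 1 + \frac{\left( k - \frac{b_{\hat{\tau}}}{a_{\hat{\tau}}} \right)^2}{\frac{a_{\hat{\tau}} c - b_{\hat{\tau}}^2}{a_{\hat{\tau}}^2}} \right]^{-\frac{n-m}{2}} . \tag{A17}
$$

In order to set up the time span of the moving window and verify whether the data contain only one discontinuity, it is useful to compute the conditional posterior distribution of the noise variance $\sigma_0^2$ given the discontinuity epoch $\hat{\tau}$. Using Eqs. (A4) and (A11), we have

$$
\begin{aligned}
p(\sigma_0^2 \mid \hat{\tau}, y_0) &\propto p(\tau = \hat{\tau}, \sigma_0^2 \mid y_0) = \int_{-\infty}^{+\infty} p(\tau = \hat{\tau}, k, \sigma_0^2 \mid y_0)\, dk \\
&\propto \int_{-\infty}^{+\infty} \frac{\exp\left[-\frac{1}{2\sigma_0^2}\, \| y_0 - A\hat{x} - k h_{\hat{\tau}} \|^2\right]}{\left(\sigma_0^2\right)^{\frac{n-m}{2}+1}}\, dk \\
&= \frac{\exp\left[-\frac{1}{2\sigma_0^2}\, \frac{a_{\hat{\tau}} c - b_{\hat{\tau}}^2}{a_{\hat{\tau}}}\right]}{\left(\sigma_0^2\right)^{\frac{n-m}{2}+1}} \cdot \int_{-\infty}^{+\infty} \exp\left[-\frac{a_{\hat{\tau}}}{2\sigma_0^2} \left( k - \frac{b_{\hat{\tau}}}{a_{\hat{\tau}}} \right)^2\right] dk \\
&= \frac{\exp\left[-\frac{1}{2\sigma_0^2}\, \frac{a_{\hat{\tau}} c - b_{\hat{\tau}}^2}{a_{\hat{\tau}}}\right]}{\left(\sigma_0^2\right)^{\frac{n-m}{2}+1}} \sqrt{\frac{2\pi\sigma_0^2}{a_{\hat{\tau}}}} \cdot \int_{-\infty}^{+\infty} \sqrt{\frac{a_{\hat{\tau}}}{2\pi\sigma_0^2}}\, \exp\left[-\frac{a_{\hat{\tau}}}{2\sigma_0^2} \left( k - \frac{b_{\hat{\tau}}}{a_{\hat{\tau}}} \right)^2\right] dk .
\end{aligned} \tag{A18}
$$

The integrand in the last line of Eq. (A18) is just the density function of a normal random variable (with mean equal to $b_{\hat{\tau}}/a_{\hat{\tau}}$ and variance equal to $\sigma_0^2/a_{\hat{\tau}}$, as follows from the factor $a_{\hat{\tau}}$ in Eq. (A11)). Therefore this integral is equal to 1 and the conditional posterior distribution of the noise variance $\sigma_0^2$ is given by

$$
p(\sigma_0^2 \mid \hat{\tau}, y_0) \propto \frac{1}{\left(\sigma_0^2\right)^{\frac{n-m-1}{2}+1}} \exp\left[-\frac{1}{2\sigma_0^2}\, \frac{a_{\hat{\tau}} c - b_{\hat{\tau}}^2}{a_{\hat{\tau}}}\right] \tag{A19}
$$

which can be reduced to an inverse chi-square density function (with $n - m - 1$ degrees of freedom) by the linear transformation

$$
\mathrm{inv}\,\chi^2 = \frac{a_{\hat{\tau}}}{a_{\hat{\tau}} c - b_{\hat{\tau}}^2}\, \sigma_0^2 . \tag{A20}
$$

Again, the known normalization of the inverse chi-square can be used to normalize the distribution (A19), i.e.

$$
p(\sigma_0^2 \mid \hat{\tau}, y_0) = \frac{\left( \dfrac{a_{\hat{\tau}} c - b_{\hat{\tau}}^2}{a_{\hat{\tau}}} \right)^{\frac{n-m-1}{2}} \exp\left[-\dfrac{1}{2\sigma_0^2}\, \dfrac{a_{\hat{\tau}} c - b_{\hat{\tau}}^2}{a_{\hat{\tau}}}\right]}{2^{\frac{n-m-1}{2}}\; \Gamma\!\left(\frac{n-m-1}{2}\right) \left(\sigma_0^2\right)^{\frac{n-m-1}{2}+1}} . \tag{A21}
$$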
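The two conditional posteriors can be evaluated directly from a, b and c at the detected epoch. The sketch below is illustrative and is not the BICYCLES code: it takes the Student-t form of Eq. (A16) with its normalization, restricts it to a set of integer candidates for the discrete case, and uses for the noise variance a scaled inverse chi-square with n − m − 1 degrees of freedom and scale (a c − b²)/a, i.e. the value obtained by completing the square in Eq. (A11). Function names and the numerical values of a, b, c, n and m are assumptions made for the example.

```python
import math
import numpy as np

def amplitude_posterior(k, a, b, c, n, m):
    """Normalized Student-t density of the amplitude k (continuous case)."""
    s2 = (a * c - b * b) / a**2
    log_norm = (math.lgamma(0.5 * (n - m)) - math.lgamma(0.5 * (n - m - 1))
                - 0.5 * math.log(math.pi * s2))
    return np.exp(log_norm - 0.5 * (n - m) * np.log1p((k - b / a) ** 2 / s2))

def integer_amplitude_posterior(candidates, a, b, c, n, m):
    """Discrete posterior obtained by restricting k to integer candidates."""
    w = amplitude_posterior(np.asarray(candidates, dtype=float), a, b, c, n, m)
    return w / w.sum()

def variance_posterior(s2_0, a, b, c, n, m):
    """Scaled inverse chi-square density of the noise variance."""
    nu = n - m - 1
    scale = (a * c - b * b) / a          # residual sum of squares at k = b/a
    log_p = (0.5 * nu * math.log(0.5 * scale) - math.lgamma(0.5 * nu)
             - (0.5 * nu + 1.0) * np.log(s2_0) - 0.5 * scale / s2_0)
    return np.exp(log_p)

# Illustrative numbers: continuous MAP b/a = 3, residual sum of squares 100
a, b, c, n, m = 10.0, 30.0, 190.0, 100, 3
ints = np.arange(-5, 11)
p_int = integer_amplitude_posterior(ints, a, b, c, n, m)
k_map = int(ints[np.argmax(p_int)])      # integer MAP amplitude
k_cont = b / a                           # continuous MAP amplitude
```

A time-window would be accepted when the a-priori variance falls in a high-probability region of `variance_posterior`, in the spirit of the test (24); a low value of `p_int` at its maximum suggests tagging the cycle-slip instead of correcting it.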

Appendix B: Posterior distribution in the case of r discontinuities

The model (5) can be generalized to the case of r discontinuities as follows

$$
y_0 = Ax + H_\tau k + \nu \tag{B1}
$$

where
$\tau = [\tau_1\ \tau_2\ \cdots\ \tau_r]^{+}$ = epochs of the discontinuities;
$k = [k_1\ k_2\ \cdots\ k_r]^{+}$ = amplitudes of the discontinuities;
$H_\tau = [h_{\tau_1}\ h_{\tau_2}\ \cdots\ h_{\tau_r}]$.

By using this model and introducing an r-dimensional uniform non-informative prior for the vectors $\tau$ and $k$, we get the following marginal posterior distribution of the discontinuity epochs

$$
p(\tau \mid y_0) \propto \frac{1}{\sqrt{\det A_\tau}\; \left( C - B_\tau^{+} A_\tau^{-1} B_\tau \right)^{\frac{n-m-r}{2}}} \tag{B2}
$$

where

$$
A_\tau = H_\tau^{+} (I - P) H_\tau , \qquad B_\tau = H_\tau^{+} (I - P) y_0 , \qquad C = y_0^{+} (I - P) y_0 .
$$

This distribution is the direct generalization of that in Eq. (14), as can be easily verified by fixing r = 1.
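The reduction to a single discontinuity can be written out explicitly. The following short derivation is a sketch that assumes the exponent $(n-m-r)/2$ in Eq. (B2) and uses the scalar quantities $a_\tau$, $b_\tau$ and $c$ of Appendix A, to which $A_\tau$, $B_\tau$ and $C$ reduce for $r = 1$:

```latex
% r = 1:  A_tau = a_tau,  B_tau = b_tau,  C = c
\begin{aligned}
p(\tau \mid y_0)
  &\propto a_\tau^{-\frac{1}{2}}
     \left( c - \frac{b_\tau^2}{a_\tau} \right)^{-\frac{n-m-1}{2}}
   = a_\tau^{-\frac{1}{2}}\, a_\tau^{\frac{n-m-1}{2}}
     \left( a_\tau c - b_\tau^2 \right)^{-\frac{n-m-1}{2}} \\
  &= a_\tau^{\frac{n-m-2}{2}}
     \left( a_\tau c - b_\tau^2 \right)^{-\frac{n-m-1}{2}} ,
\end{aligned}
```

which coincides with the single-discontinuity result of Eq. (A14).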

References

Bayes T (1763) An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society of London, vol 53, pp 370–418. Published posthumously, then reprinted in Biometrika, vol 45, pp 296–315 (1958)
Berger M (1987) Geometry I. Translated from the French by Cole M and Levy S. Springer, Berlin
Betti B, Crespi M, Sansò F (1993) A geometric illustration of ambiguity resolution in GPS theory and a Bayesian approach. Manus Geod 18:317–330
Beutler G, Bock H, Brockmann E, Dach R, Fridez P, Gurtner W, Habrich H, Hugentobler U, Ineichen D, Jaeggi A, Meindl M, Mervart L, Rothacher M, Schaer S, Schmid R, Springer T, Steigenberger P, Svehla D, Thaller D, Urschl C, Weber R (2006) BERNESE GPS software version 5.0
Blewitt G (1990) An automatic editing algorithm for GPS data. Geophys Res Lett 17(3):199–202
Bona P (2000) Precision, cross correlation and time correlation of GPS phase and code observations. GPS Solutions 4(2):3–13
Box GEP, Tiao GC (1992) Bayesian inference in statistical analysis. Wiley, New York
Euler HJ, Goad CC (1991) On optimal filtering of GPS dual frequency observations without using orbit information. Bull Geod 65:130–143
Gelman A, Carlin JB, Stern HS, Rubin DB (1995) Bayesian data analysis. Chapman & Hall, London
Gundlich B, Koch KR (2002) Confidence regions for GPS baselines by Bayesian statistics. J Geod 76:55–62
Koch KR (1990) Bayesian inference with geodetic applications. Lecture Notes in Earth Sciences, vol 31. Springer, Berlin
de Lacy MC, Sansò F, Rodriguez-Caderot G, Gil AJ (2002) The Bayesian approach applied to GPS ambiguity resolution. A mixture model for the discrete-real ambiguities alternative. J Geod 76:82–94
Lichten SM, Bar-Sever YE, Bertiger EI, Heflin M, Hurst K, Muellerschoen RJ, Wu SC, Yunck TP, Zumberge JF (1995) GIPSY-OASIS II: a high precision GPS data processing system and general orbit analysis tool. Technology 2006, NASA Technology Transfer Conference, Chicago, October 24–26
Sansò F, Venuti G (1997) Integer variables estimation problems: the Bayesian approach. Ann Geofisica XL(5):1415–1431
Teunissen PJG (1997) On the GPS widelane and its decorrelating property. J Geod 71:577–587
Teunissen PJG, Kleusberg A (1998) GPS for geodesy, 2nd edn. Springer, Berlin
Xu P (2005) Sign-constrained robust least squares, subjective breakdown point and the effect of weights of observations on robustness. J Geod 79:146–159
Zimmermann F, Haak T, Hill C (2006) The Galileo system simulation facility-validation with real measurement data. ENC06, European Navigation Conference and Exhibition, Manchester, 8–10 May, 2006
