Multidimensional statistical tests for imprecise data - Semantic Scholar

2 downloads 0 Views 859KB Size Report
Multidimensional tests for imprecise data. 4. Wuhan, 1. June 2006. Uncertainty modeling using imprecise quantities. Requirements: ➢Adequate description of ...
Multidimensional statistical tests for imprecise data Hansjörg Kutterer and Ingo Neumann Geodetic Institute University of Hannover Germany The VI Hotine-Marussi International Symposium on Theoretical and Computational Geodesy

Multidimensional tests for imprecise data

1

Wuhan, 1. June 2006

Contents •

Motivation



Uncertainty modeling using imprecise quantities



Statistical hypothesis tests in case of imprecise data - One-dimensional case - Multi-dimensional case



Applications - Globaltest in least-squares adjustment - Congruence tests



Conclusions

Multidimensional tests for imprecise data

2

Wuhan, 1. June 2006

Motivation Focus in this presentation : •

Measurement process: -

Stochasticity

Stochastics (Bayesian approach)

-

Observation imprecision

-

(Outliers)

Interval mathematics Fuzzy theory

GUM Guide to the Expression of Uncertainty in Measurement

Systematic effects Multidimensional tests for imprecise data

3

Wuhan, 1. June 2006

Uncertainty modeling using imprecise quantities Requirements: ¾Adequate description of Stochastics ¾Adequate description of Imprecision

Fuzzy set

% := {(x, m % (x)) x ∈X } , m % : X → [ 0,1] A A A

Imprecise quantity

Multidimensional tests for imprecise data

X universe m A% ( x ) membership function

4

Wuhan, 1. June 2006

Uncertainty modeling using imprecise quantities General case: fuzzy intervals α

xm

mean value

r

radius

L(.)

left reference fct.

R ( . ) right reference fct.

Fuzzy number

Interval

α

α

Multidimensional tests for imprecise data

5

Wuhan, 1. June 2006

Uncertainty modeling using imprecise quantities Examples of fuzzy numbers

LR-fuzzy x% = ( x m , x l , x r ) LR LL-fuzzy L-fuzzy

x% = ( x m , x l , x r ) LL

α

x% = ( x m , x s ) L

Multidimensional tests for imprecise data

6

Wuhan, 1. June 2006

Uncertainty modeling using imprecise quantities Set-theoretical operations Intersection m A% ∩ B% = min ( m A% , m B% ) Union

Minimum rule

m A% ∪ B% = max ( m A% , m B% ) Maximum rule

Complement m A% c = 1- m A%

Fundamental rules of arithmetic, e. g. L-fuzzy numbers e.g. ⎧ X % +Y % = (x m + y m , x s + ys ) L ⇒ ⎨ % = (a x m , a x s ) L ⎩a X

Multidimensional tests for imprecise data

7

Wuhan, 1. June 2006

Mean value xˆ computation from n samples

Example -

Standard deviation

-

σ 2x垐=

n

1 2 σ ∑ x n i =1 i

1 n xˆ r = ∑ x i r n i =1

Imprecision



σx =

1 ⎞2

1 ⎛ 2 σ ⎜∑ x ⎟ n ⎝ i =1 i ⎠ n

constant influence

Comparison of the law of propagation between different types of uncertainty 14 12

Standard devation Interval radii

10

[m m]

both

Distance measurement: n = 1 ... 20 σ i = 10 mm

8 6

x r = 3 mm

4 2 0 1

2

3

4

5

6

7

8

9

10 11 12 13 14 15 16 17 18 19 20

Number of samples for the computation of the mean value

Multidimensional tests for imprecise data

8

Wuhan, 1. June 2006

Example Remaining systematics: GPS phase observations (DD) Contribution to the interval radius in [mm]

25

interval radius satellite specific propagation specific station specific observation specific

20

Pair of satellites: ascending - descending

15

10

5

0 10/65

20/54

30/42

40/33

50/25

60/16

70/ 7

Elevation in [°] (satellite k / satellite l)

Multidimensional tests for imprecise data

9

Wuhan, 1. June 2006

Uncertainty modeling using imprecise quantities Extension principle according to Zadeh

(

% ,K, A % % =g A B 1 n m B% ( y ) =

)

:⇔ sup

( x1 ,K , x n ) ∈ X1 × K × X n g ( x1 ,K , x n ) = y

(

min m A% ( x1 ) ,K , m A% ( x n ) 1

n

)

∀ y∈Y

definition of a fuzzy vector

Extension principle: Formulation based on fuzzy vectors % = g ( z% ) w

:⇔ m w% ( w ) =

sup z ∈R n g (z) = w

Multidimensional tests for imprecise data

m z% ( z ) ∀ w ∈R

10

Wuhan, 1. June 2006

Uncertainty modeling using imprecise quantities Fuzzy vectors: Minimum rule 2D : m z% ( z ) = min(m x% (x), m y% (y))

Propagation of fuzziness / imprecision: Multidimensional tests for imprecise data

11

% = F z% ? w Wuhan, 1. June 2006

Uncertainty modeling using imprecise quantities 2D : m z% ( z ) = min(m x% (x), m y% (y))

m w% ( w ) = sup m z% ( z ) ∀ w ∈ R p n z∈R Fz = w

% ⊇ = ( F w m , F ws ) Applications: w Multidimensional tests for imprecise data

12

Propagation of fuzziness / imprecision

Minimum rule

L

Wuhan, 1. June 2006

Uncertainty modeling using imprecise quantities Extended least-squares estimation Representation with membership function - T T xˆ = X WX X Wy :⇔

(

)

m % ( x垐 )= xˆ

m y% ( y )

sup

(

y∈R

n

)

∀ x∈ R

u

-

xˆ = X T WX X T Wy

Representation as an α-cut (equivalent: interval mathematics) %x垐 = [ x ] = X T WX X T W [ y ] α

(

)

y% α Multidimensional tests for imprecise data

13

Wuhan, 1. June 2006

Statistical hypothesis tests in case of imprecise data m (x ) 1

Precise case (1D) R

A

R

and unique decisions !

Example:

TClear

Two-sided comparison of a mean value with a given value

x

Null hypothesis H0, alternative hypothesis HA, error probability γ → Definition of regions of acceptance A and rejection R

Multidimensional tests for imprecise data

14

Wuhan, 1. June 2006

Statistical hypothesis tests in case of imprecise data Consideration of imprecision m (x ) 1

Classical case

A

R

m (x ) 1

R

Imprecise case

T x

x

Multidimensional tests for imprecise data

15

Wuhan, 1. June 2006

Statistical hypothesis tests in case of imprecise data Consideration of imprecision m (x ) 1

Classical case

A

R

m (x ) 1

R

Imprecise case T%

T x

x Imprecision of test statistics due to the imprecision of the observations

T% = ( T m , Tl , Tr )LR Multidimensional tests for imprecise data

e. g., 16

T m ~ N ( μ, 1) Wuhan, 1. June 2006

Statistical hypothesis tests in case of imprecise data Consideration of imprecision m (x ) 1

Classical case

A

R

m (x ) 1

R

Imprecise case T%

% A

T x

x Imprecision of the region of acceptance due to the linguistic fuzziness of notions like „outlier“

% = (A , A , A , A ) A m n l r LR Multidimensional tests for imprecise data

17

Fuzzy-interval Wuhan, 1. June 2006

Statistical hypothesis tests in case of imprecise data Consideration of imprecision m (x ) 1

Classical case

A

R

m (x ) 1

R

Imprecise case

% T% R

% A

% R

T x

x Imprecision of the region of rejection as set-theoretical complement of the region of acceptance

%c R% = A Multidimensional tests for imprecise data

⇒ m R% = 1 − m A% 18

Wuhan, 1. June 2006

Statistical hypothesis tests in case of imprecise data Consideration of imprecision m (x ) 1

Classical case

A

R

R

m (x ) 1

Imprecise case

% T% R

% A

Conclusion: T Transitions regions prevent a clear and unique test decision ! x

% R

x

Formulation of hypotheses

Tm

⎧μ ⎪ =0 ~ N ( μ, 1) and ⎨ ⎪⎩μ = δ , δ ≠ 0

Multidimensional tests for imprecise data

19

H0 HA Wuhan, 1. June 2006

Statistical hypothesis tests in case of imprecise data Conditions for an adequate test strategy • Quantitative comparison of the imprecise test statistics Τ% and the % and rejection R% regions of acceptance A • Precise criterion pro or con acceptance • Probabilistic interpretation of the results

Multidimensional tests for imprecise data

20

Wuhan, 1. June 2006

Statistical hypothesis tests in case of imprecise data Degree of disagreement

Degree of agreement γ : F (R × R

δ: F (R × R

) → [0, 1]

of the imprecise test statistics T% ∈ F ( R

of the imprecise test statistics

)

T% ∈ F ( R

)

with the

with the

imprecise region of rejection

imprecise region of acceptance

R% ∈ F ( R

alternatives

) → [0, 1]

)

% ∈ F (R A

% R% ) = height ( T% ∩ R% ) γ R% ( T% ) := γ ( T,

(

card T% ∩ R% % R% = γ R% T% := γ T, card T%

( )

(

)

( )

Multidimensional tests for imprecise data

)

δA% ( T% ) := 1 − γ A% ( T% )

)

21

Wuhan, 1. June 2006

Statistical hypothesis tests in case of imprecise data Degree of disagreement

Degree of agreement

Degree of rejectability ρ: F (R

) → [0, 1]

of the imprecise test statistics T% ∈ F ( R )

(

ρ R% ( T% ) := min γ R% ( T% ) , δA% ( T% )

Multidimensional tests for imprecise data

22

) Wuhan, 1. June 2006

Statistical hypothesis tests in case of imprecise data •

Evaluate the test criterion

ρR%

%T ⎧⎨≤ ⎫⎬ ρ ∈ [ 0,1] ⇒ ⎧⎨ Do not reject H 0 crit > ⎩ ⎭ ⎩ Reject H 0

( )

• Calculate the probability of a Type I error Type II error

( ( ) 1 − β = P ( ρ ( T% ) ≤ ρ

Multidimensional tests for imprecise data

α = P ρR% T% > ρcrit H 0 R%

23

crit

HA

)

) Wuhan, 1. June 2006

Statistical hypothesis tests in case of imprecise data m( x ) 1

γ R% (T% )

The height criterion:

Tm

~

T

δ A% (T% )

~

A

-k

with:

~

0

R x

k

γ R% (T% ) = height (T% ∩ R% )

δ A% (T% ) = 1 − height (T% ∩ A% )

⎧ Do not reject H 0 ⎧≤ ⎫ % % % ρ R% (T ) := min γ R% (T ), δ A% (T ) ⎨ ⎬ ρcrit ∈ [ 0,1] ⇒ ⎨ ⎩> ⎭ ⎩ Reject H 0

(

Multidimensional tests for imprecise data

)

24

Wuhan, 1. June 2006

Statistical hypothesis tests in case of imprecise data m( x ) 1

Tm

The card criterion:

~

~

T

~

A

R x

card (T% ∩ R% ) with:

γ R%

( )

)

( ) %) card ( T% ∩ A δA% ( T% ) = 1 − card ( T% )

card (T% ∩ A% ) Overlap region

Multidimensional tests for imprecise data

(

card T% ∩ R% T% = card T%

25

Wuhan, 1. June 2006

Statistical hypothesis tests in case of imprecise data Test situation with tight bounds (weak imprecision): m( x )

Tm

1

~

R

~

T

~

~

A

-k

0

R x

k

degree of rejectability for the card criterion degree of rejectability for the height criterion Multidimensional tests for imprecise data

26

Wuhan, 1. June 2006

Statistical hypothesis tests in case of imprecise data Test situation with wide bounds (strong imprecision): m( x)

Tm

~

~

1

R

~

T

~

A

-k

0

R x

k

degree of rejectability for the card criterion degree of rejectability for the height criterion Multidimensional tests for imprecise data

27

Wuhan, 1. June 2006

Multidimensional hypothesis tests in case of imprecise data Test situation without imprecision (quadratic form): Ω = y T Σ −yy1 y ~ χ 2 (f ,0)

f := degrees of freedom

y:= Vector with an expected value of zero (under H 0 ) Σ yy := Variance covariance matrix (VCM) of the vector y

Hypotheses: H 0 : E( y ) = 0 H A : E(y ) = δ with δ ≠ 0

Test decision: Ω > χ12−γ (f ,0) ⇒ Reject H 0 Multidimensional tests for imprecise data

28

(γ:=error probability) Wuhan, 1. June 2006

Multidimensional hypothesis tests in case of imprecise data Ω = y T Σ −yy1 y

with y ∈ y%

Search the smallest and largest elements for y ∈ y% for a sufficient number of α-cuts Optimization

algorithm

% % Ω α min = min(Ω α ) % % Ω α max = max(Ω α )

Rigorous realization of Zadeh‘s extension principle! Multidimensional tests for imprecise data

29

Wuhan, 1. June 2006

Multidimensional hypothesis tests in case of imprecise data α−cut optimization

Multidimensional tests for imprecise data

30

Wuhan, 1. June 2006

Multidimensional hypothesis tests in case of imprecise data α−cut optimization

Multidimensional tests for imprecise data

31

Wuhan, 1. June 2006

Multidimensional hypothesis tests in case of imprecise data Resulting test scenario Æ 1D comparison

Final decision based on the height/card criterion Multidimensional tests for imprecise data

32

Wuhan, 1. June 2006

Applications

A monitoring network of a tunnel:

2000 m Entrance area

Entrance area

Reference: SBA Oldenburg Multidimensional tests for imprecise data

33

Wuhan, 1. June 2006

Applications

Global test in least-squares adjustment

( )

( ( )

( ))

ρR% T% := min γ R% T% , δA% T%

Height criterion: 0.60 < ρcrit = 0.8 → Accept H 0

Card criterion: 0.86 > ρcrit = 0.8 → Reject H 0

( )

( )

criterion

γ R% T%

δA% T%

height

1.00

0.60

card

1.00

0.86

Multidimensional tests for imprecise data

34

Wuhan, 1. June 2006

Applications

Congruence test I [ tunnel ]

Strong imprecision

16 identical points Specification

Low tide

Tide

observations

261

261

parameters

148

148

( )

γ R% T% = 0.15

( )

( ))

= 0.08

( )

δ A% T% = 0.08

Multidimensional tests for imprecise data

( ( )

ρR% T% := min γ R% T% , δA% T%

35

0.08 < ρcrit = 0.5 → Accept H 0 Wuhan, 1. June 2006

Conclusions •





Statistical hypothesis tests can be extended for imprecise data • Degree of rejectability Æ comparison of fuzzy intervals • 1D case is straightforward, mD case needs α-cut optimization Imprecision is an additive term of uncertainty what leads to a more reluctant rejection of the null hypothesis than in pure stoch. case Not shown but computable: Type I and Type II error probs.

Multidimensional tests for imprecise data

36

Wuhan, 1. June 2006

Future work •

In progress: Using real data for the assessment and validation and for the computation of more realistic fuzzy intervals for the influence parameters



Further research: - Sensitivity analysis - Kalman filtering

Multidimensional tests for imprecise data

37

Wuhan, 1. June 2006

Acknowledgements The presented results are developed within the research project KU 1250/4-1 ”Geodätische Deformationsanalysen unter Verwendung von Beobachtungsimpräzision und Objektunschärfe”, which is funded by the German Research Foundation (DFG). This is gratefully acknowledged.

Multidimensional tests for imprecise data

38

Wuhan, 1. June 2006

Multidimensional tests for imprecise data

39

Wuhan, 1. June 2006

Applications A monitoring network of a lock:

The lock Uelzen I

Monitoring network

Monitoring the deformations of the lock Multidimensional tests for imprecise data

40

Wuhan, 1. June 2006

Applications

Multiple outlier test (5 horizontal directions)

( )

( ( )

( ))

ρR% T% := min γ R% T% , δA% T%

( )

γ R% T% = 0.60

( )

δ A% T% = 0.16

Multidimensional tests for imprecise data

41

= 0.16

0.16 < ρcrit = 0.8 → Accept H 0 Wuhan, 1. June 2006

Applications

Congruence test II [ lock Uelzen ]

6 identical points Specification

Epoch 1999

Epoch 2004

observations

317

144

parameters

60

39

( )

γ R% T% = 0.58

( ( )

( ))

= 0.52

( )

δ A% T% = 0.52

Multidimensional tests for imprecise data

( )

ρR% T% := min γ R% T% , δA% T%

42

0.52 > ρcrit = 0.5 → Reject H 0 Wuhan, 1. June 2006

Suggest Documents