Multidimensional tests for imprecise data. 4. Wuhan, 1. June 2006. Uncertainty modeling using imprecise quantities. Requirements: â¢Adequate description of ...
Multidimensional statistical tests for imprecise data Hansjörg Kutterer and Ingo Neumann Geodetic Institute University of Hannover Germany The VI Hotine-Marussi International Symposium on Theoretical and Computational Geodesy
Multidimensional tests for imprecise data
1
Wuhan, 1. June 2006
Contents •
Motivation
•
Uncertainty modeling using imprecise quantities
•
Statistical hypothesis tests in case of imprecise data - One-dimensional case - Multi-dimensional case
•
Applications - Globaltest in least-squares adjustment - Congruence tests
•
Conclusions
Multidimensional tests for imprecise data
2
Wuhan, 1. June 2006
Motivation Focus in this presentation : •
Measurement process: -
Stochasticity
Stochastics (Bayesian approach)
-
Observation imprecision
-
(Outliers)
Interval mathematics Fuzzy theory
GUM Guide to the Expression of Uncertainty in Measurement
Systematic effects Multidimensional tests for imprecise data
3
Wuhan, 1. June 2006
Uncertainty modeling using imprecise quantities Requirements: ¾Adequate description of Stochastics ¾Adequate description of Imprecision
Fuzzy set
% := {(x, m % (x)) x ∈X } , m % : X → [ 0,1] A A A
Imprecise quantity
Multidimensional tests for imprecise data
X universe m A% ( x ) membership function
4
Wuhan, 1. June 2006
Uncertainty modeling using imprecise quantities General case: fuzzy intervals α
xm
mean value
r
radius
L(.)
left reference fct.
R ( . ) right reference fct.
Fuzzy number
Interval
α
α
Multidimensional tests for imprecise data
5
Wuhan, 1. June 2006
Uncertainty modeling using imprecise quantities Examples of fuzzy numbers
LR-fuzzy x% = ( x m , x l , x r ) LR LL-fuzzy L-fuzzy
x% = ( x m , x l , x r ) LL
α
x% = ( x m , x s ) L
Multidimensional tests for imprecise data
6
Wuhan, 1. June 2006
Uncertainty modeling using imprecise quantities Set-theoretical operations Intersection m A% ∩ B% = min ( m A% , m B% ) Union
Minimum rule
m A% ∪ B% = max ( m A% , m B% ) Maximum rule
Complement m A% c = 1- m A%
Fundamental rules of arithmetic, e. g. L-fuzzy numbers e.g. ⎧ X % +Y % = (x m + y m , x s + ys ) L ⇒ ⎨ % = (a x m , a x s ) L ⎩a X
Multidimensional tests for imprecise data
7
Wuhan, 1. June 2006
Mean value xˆ computation from n samples
Example -
Standard deviation
-
σ 2x垐=
n
1 2 σ ∑ x n i =1 i
1 n xˆ r = ∑ x i r n i =1
Imprecision
⇔
σx =
1 ⎞2
1 ⎛ 2 σ ⎜∑ x ⎟ n ⎝ i =1 i ⎠ n
constant influence
Comparison of the law of propagation between different types of uncertainty 14 12
Standard devation Interval radii
10
[m m]
both
Distance measurement: n = 1 ... 20 σ i = 10 mm
8 6
x r = 3 mm
4 2 0 1
2
3
4
5
6
7
8
9
10 11 12 13 14 15 16 17 18 19 20
Number of samples for the computation of the mean value
Multidimensional tests for imprecise data
8
Wuhan, 1. June 2006
Example Remaining systematics: GPS phase observations (DD) Contribution to the interval radius in [mm]
25
interval radius satellite specific propagation specific station specific observation specific
20
Pair of satellites: ascending - descending
15
10
5
0 10/65
20/54
30/42
40/33
50/25
60/16
70/ 7
Elevation in [°] (satellite k / satellite l)
Multidimensional tests for imprecise data
9
Wuhan, 1. June 2006
Uncertainty modeling using imprecise quantities Extension principle according to Zadeh
(
% ,K, A % % =g A B 1 n m B% ( y ) =
)
:⇔ sup
( x1 ,K , x n ) ∈ X1 × K × X n g ( x1 ,K , x n ) = y
(
min m A% ( x1 ) ,K , m A% ( x n ) 1
n
)
∀ y∈Y
definition of a fuzzy vector
Extension principle: Formulation based on fuzzy vectors % = g ( z% ) w
:⇔ m w% ( w ) =
sup z ∈R n g (z) = w
Multidimensional tests for imprecise data
m z% ( z ) ∀ w ∈R
10
Wuhan, 1. June 2006
Uncertainty modeling using imprecise quantities Fuzzy vectors: Minimum rule 2D : m z% ( z ) = min(m x% (x), m y% (y))
Propagation of fuzziness / imprecision: Multidimensional tests for imprecise data
11
% = F z% ? w Wuhan, 1. June 2006
Uncertainty modeling using imprecise quantities 2D : m z% ( z ) = min(m x% (x), m y% (y))
m w% ( w ) = sup m z% ( z ) ∀ w ∈ R p n z∈R Fz = w
% ⊇ = ( F w m , F ws ) Applications: w Multidimensional tests for imprecise data
12
Propagation of fuzziness / imprecision
Minimum rule
L
Wuhan, 1. June 2006
Uncertainty modeling using imprecise quantities Extended least-squares estimation Representation with membership function - T T xˆ = X WX X Wy :⇔
(
)
m % ( x垐 )= xˆ
m y% ( y )
sup
(
y∈R
n
)
∀ x∈ R
u
-
xˆ = X T WX X T Wy
Representation as an α-cut (equivalent: interval mathematics) %x垐 = [ x ] = X T WX X T W [ y ] α
(
)
y% α Multidimensional tests for imprecise data
13
Wuhan, 1. June 2006
Statistical hypothesis tests in case of imprecise data m (x ) 1
Precise case (1D) R
A
R
and unique decisions !
Example:
TClear
Two-sided comparison of a mean value with a given value
x
Null hypothesis H0, alternative hypothesis HA, error probability γ → Definition of regions of acceptance A and rejection R
Multidimensional tests for imprecise data
14
Wuhan, 1. June 2006
Statistical hypothesis tests in case of imprecise data Consideration of imprecision m (x ) 1
Classical case
A
R
m (x ) 1
R
Imprecise case
T x
x
Multidimensional tests for imprecise data
15
Wuhan, 1. June 2006
Statistical hypothesis tests in case of imprecise data Consideration of imprecision m (x ) 1
Classical case
A
R
m (x ) 1
R
Imprecise case T%
T x
x Imprecision of test statistics due to the imprecision of the observations
T% = ( T m , Tl , Tr )LR Multidimensional tests for imprecise data
e. g., 16
T m ~ N ( μ, 1) Wuhan, 1. June 2006
Statistical hypothesis tests in case of imprecise data Consideration of imprecision m (x ) 1
Classical case
A
R
m (x ) 1
R
Imprecise case T%
% A
T x
x Imprecision of the region of acceptance due to the linguistic fuzziness of notions like „outlier“
% = (A , A , A , A ) A m n l r LR Multidimensional tests for imprecise data
17
Fuzzy-interval Wuhan, 1. June 2006
Statistical hypothesis tests in case of imprecise data Consideration of imprecision m (x ) 1
Classical case
A
R
m (x ) 1
R
Imprecise case
% T% R
% A
% R
T x
x Imprecision of the region of rejection as set-theoretical complement of the region of acceptance
%c R% = A Multidimensional tests for imprecise data
⇒ m R% = 1 − m A% 18
Wuhan, 1. June 2006
Statistical hypothesis tests in case of imprecise data Consideration of imprecision m (x ) 1
Classical case
A
R
R
m (x ) 1
Imprecise case
% T% R
% A
Conclusion: T Transitions regions prevent a clear and unique test decision ! x
% R
x
Formulation of hypotheses
Tm
⎧μ ⎪ =0 ~ N ( μ, 1) and ⎨ ⎪⎩μ = δ , δ ≠ 0
Multidimensional tests for imprecise data
19
H0 HA Wuhan, 1. June 2006
Statistical hypothesis tests in case of imprecise data Conditions for an adequate test strategy • Quantitative comparison of the imprecise test statistics Τ% and the % and rejection R% regions of acceptance A • Precise criterion pro or con acceptance • Probabilistic interpretation of the results
Multidimensional tests for imprecise data
20
Wuhan, 1. June 2006
Statistical hypothesis tests in case of imprecise data Degree of disagreement
Degree of agreement γ : F (R × R
δ: F (R × R
) → [0, 1]
of the imprecise test statistics T% ∈ F ( R
of the imprecise test statistics
)
T% ∈ F ( R
)
with the
with the
imprecise region of rejection
imprecise region of acceptance
R% ∈ F ( R
alternatives
) → [0, 1]
)
% ∈ F (R A
% R% ) = height ( T% ∩ R% ) γ R% ( T% ) := γ ( T,
(
card T% ∩ R% % R% = γ R% T% := γ T, card T%
( )
(
)
( )
Multidimensional tests for imprecise data
)
δA% ( T% ) := 1 − γ A% ( T% )
)
21
Wuhan, 1. June 2006
Statistical hypothesis tests in case of imprecise data Degree of disagreement
Degree of agreement
Degree of rejectability ρ: F (R
) → [0, 1]
of the imprecise test statistics T% ∈ F ( R )
(
ρ R% ( T% ) := min γ R% ( T% ) , δA% ( T% )
Multidimensional tests for imprecise data
22
) Wuhan, 1. June 2006
Statistical hypothesis tests in case of imprecise data •
Evaluate the test criterion
ρR%
%T ⎧⎨≤ ⎫⎬ ρ ∈ [ 0,1] ⇒ ⎧⎨ Do not reject H 0 crit > ⎩ ⎭ ⎩ Reject H 0
( )
• Calculate the probability of a Type I error Type II error
( ( ) 1 − β = P ( ρ ( T% ) ≤ ρ
Multidimensional tests for imprecise data
α = P ρR% T% > ρcrit H 0 R%
23
crit
HA
)
) Wuhan, 1. June 2006
Statistical hypothesis tests in case of imprecise data m( x ) 1
γ R% (T% )
The height criterion:
Tm
~
T
δ A% (T% )
~
A
-k
with:
~
0
R x
k
γ R% (T% ) = height (T% ∩ R% )
δ A% (T% ) = 1 − height (T% ∩ A% )
⎧ Do not reject H 0 ⎧≤ ⎫ % % % ρ R% (T ) := min γ R% (T ), δ A% (T ) ⎨ ⎬ ρcrit ∈ [ 0,1] ⇒ ⎨ ⎩> ⎭ ⎩ Reject H 0
(
Multidimensional tests for imprecise data
)
24
Wuhan, 1. June 2006
Statistical hypothesis tests in case of imprecise data m( x ) 1
Tm
The card criterion:
~
~
T
~
A
R x
card (T% ∩ R% ) with:
γ R%
( )
)
( ) %) card ( T% ∩ A δA% ( T% ) = 1 − card ( T% )
card (T% ∩ A% ) Overlap region
Multidimensional tests for imprecise data
(
card T% ∩ R% T% = card T%
25
Wuhan, 1. June 2006
Statistical hypothesis tests in case of imprecise data Test situation with tight bounds (weak imprecision): m( x )
Tm
1
~
R
~
T
~
~
A
-k
0
R x
k
degree of rejectability for the card criterion degree of rejectability for the height criterion Multidimensional tests for imprecise data
26
Wuhan, 1. June 2006
Statistical hypothesis tests in case of imprecise data Test situation with wide bounds (strong imprecision): m( x)
Tm
~
~
1
R
~
T
~
A
-k
0
R x
k
degree of rejectability for the card criterion degree of rejectability for the height criterion Multidimensional tests for imprecise data
27
Wuhan, 1. June 2006
Multidimensional hypothesis tests in case of imprecise data Test situation without imprecision (quadratic form): Ω = y T Σ −yy1 y ~ χ 2 (f ,0)
f := degrees of freedom
y:= Vector with an expected value of zero (under H 0 ) Σ yy := Variance covariance matrix (VCM) of the vector y
Hypotheses: H 0 : E( y ) = 0 H A : E(y ) = δ with δ ≠ 0
Test decision: Ω > χ12−γ (f ,0) ⇒ Reject H 0 Multidimensional tests for imprecise data
28
(γ:=error probability) Wuhan, 1. June 2006
Multidimensional hypothesis tests in case of imprecise data Ω = y T Σ −yy1 y
with y ∈ y%
Search the smallest and largest elements for y ∈ y% for a sufficient number of α-cuts Optimization
algorithm
% % Ω α min = min(Ω α ) % % Ω α max = max(Ω α )
Rigorous realization of Zadeh‘s extension principle! Multidimensional tests for imprecise data
29
Wuhan, 1. June 2006
Multidimensional hypothesis tests in case of imprecise data α−cut optimization
Multidimensional tests for imprecise data
30
Wuhan, 1. June 2006
Multidimensional hypothesis tests in case of imprecise data α−cut optimization
Multidimensional tests for imprecise data
31
Wuhan, 1. June 2006
Multidimensional hypothesis tests in case of imprecise data Resulting test scenario Æ 1D comparison
Final decision based on the height/card criterion Multidimensional tests for imprecise data
32
Wuhan, 1. June 2006
Applications
A monitoring network of a tunnel:
2000 m Entrance area
Entrance area
Reference: SBA Oldenburg Multidimensional tests for imprecise data
33
Wuhan, 1. June 2006
Applications
Global test in least-squares adjustment
( )
( ( )
( ))
ρR% T% := min γ R% T% , δA% T%
Height criterion: 0.60 < ρcrit = 0.8 → Accept H 0
Card criterion: 0.86 > ρcrit = 0.8 → Reject H 0
( )
( )
criterion
γ R% T%
δA% T%
height
1.00
0.60
card
1.00
0.86
Multidimensional tests for imprecise data
34
Wuhan, 1. June 2006
Applications
Congruence test I [ tunnel ]
Strong imprecision
16 identical points Specification
Low tide
Tide
observations
261
261
parameters
148
148
( )
γ R% T% = 0.15
( )
( ))
= 0.08
( )
δ A% T% = 0.08
Multidimensional tests for imprecise data
( ( )
ρR% T% := min γ R% T% , δA% T%
35
0.08 < ρcrit = 0.5 → Accept H 0 Wuhan, 1. June 2006
Conclusions •
•
•
Statistical hypothesis tests can be extended for imprecise data • Degree of rejectability Æ comparison of fuzzy intervals • 1D case is straightforward, mD case needs α-cut optimization Imprecision is an additive term of uncertainty what leads to a more reluctant rejection of the null hypothesis than in pure stoch. case Not shown but computable: Type I and Type II error probs.
Multidimensional tests for imprecise data
36
Wuhan, 1. June 2006
Future work •
In progress: Using real data for the assessment and validation and for the computation of more realistic fuzzy intervals for the influence parameters
•
Further research: - Sensitivity analysis - Kalman filtering
Multidimensional tests for imprecise data
37
Wuhan, 1. June 2006
Acknowledgements The presented results are developed within the research project KU 1250/4-1 ”Geodätische Deformationsanalysen unter Verwendung von Beobachtungsimpräzision und Objektunschärfe”, which is funded by the German Research Foundation (DFG). This is gratefully acknowledged.
Multidimensional tests for imprecise data
38
Wuhan, 1. June 2006
Multidimensional tests for imprecise data
39
Wuhan, 1. June 2006
Applications A monitoring network of a lock:
The lock Uelzen I
Monitoring network
Monitoring the deformations of the lock Multidimensional tests for imprecise data
40
Wuhan, 1. June 2006
Applications
Multiple outlier test (5 horizontal directions)
( )
( ( )
( ))
ρR% T% := min γ R% T% , δA% T%
( )
γ R% T% = 0.60
( )
δ A% T% = 0.16
Multidimensional tests for imprecise data
41
= 0.16
0.16 < ρcrit = 0.8 → Accept H 0 Wuhan, 1. June 2006
Applications
Congruence test II [ lock Uelzen ]
6 identical points Specification
Epoch 1999
Epoch 2004
observations
317
144
parameters
60
39
( )
γ R% T% = 0.58
( ( )
( ))
= 0.52
( )
δ A% T% = 0.52
Multidimensional tests for imprecise data
( )
ρR% T% := min γ R% T% , δA% T%
42
0.52 > ρcrit = 0.5 → Reject H 0 Wuhan, 1. June 2006