Spatial sampling for the construction of occurrence maps

M. Bonneau, N. Peyrard and R. Sabbadin

– p. 1

Problem
How to design an efficient spatial sampling method to estimate an occurrence (0/1) map when:
• the map has spatial structure
• the map does not evolve during the sampling period
• observations are imperfect
• the set of sites that can be sampled is finite
• sampling has a cost

– p. 2

Motivation
• Control of invasive species: construction of fire ant invasion maps in Queensland, Australia
• Weed management: construction of maps of grown plants and of the seed bank
Question: where should supplementary sampling effort be allocated to estimate an invasion map of satisfying quality, taking the sampling cost into account?

– p. 3

Proposed method
Based on:
• a stochastic model of binary maps and of sampled observations
• tools from sequential decision-making under uncertainty
• the same model and the same criterion for selecting the sampled area and for reconstructing the map
Two modeling options:
• Image analysis: HMRF and Maximum Posterior Marginals (MPM)
• Geostatistics: Boolean model and kriging – p. 4

Three-step method
1) From a first, arbitrary sampling, estimate the model parameters and compute the initial knowledge = the map distribution conditional on the observations
2) Based on this initial knowledge, choose a sampling strategy specifying which sites to sample in order to improve the knowledge
- static or adaptive sampling
- trade-off between cost and expected map quality
3) Build the invasion map from the improved knowledge – p. 5
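Read as an algorithm, the three steps form a simple pipeline. The sketch below is only a structural skeleton under illustrative assumptions: the three function names and their bodies are hypothetical placeholders for parameter estimation, sampling design and map reconstruction, not the implementation used in this work.

```python
# Structural skeleton of the three-step method (all names hypothetical).

def initial_knowledge(first_sample, observations):
    # Step 1: estimate model parameters and summarise P(X | y_a0);
    # here replaced by fixed, uninformative placeholder values.
    return {"marginals": [0.5] * 9, "params": {"theta": 0.2}}

def design_sample(knowledge, budget):
    # Step 2: choose which sites to sample next
    # (placeholder rule: most uncertain marginals first).
    m = knowledge["marginals"]
    return sorted(range(len(m)), key=lambda i: abs(m[i] - 0.5))[:budget]

def reconstruct_map(knowledge):
    # Step 3: build the occurrence map from the improved knowledge.
    return [int(p > 0.5) for p in knowledge["marginals"]]

knowledge = initial_knowledge(first_sample=[0, 4, 8], observations=[1, 0, 1])
print(design_sample(knowledge, budget=3), reconstruct_map(knowledge))
```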

Table of Contents
1- Image analysis approach
2- Geostatistics approach
3- Experimental comparison
4- Methodological comparison
5- Conclusion / perspectives

– p. 6

Image analysis approach
Hidden Markov Random Field on G = {V, E}
[Figure: two layers of sites, the hidden variables and the observations]

– p. 7

The model
• X = {X_i, i ∈ V}, X_i ∈ {0, 1}, is an HMRF:
  P(X = x) = \frac{1}{Z} \prod_{c \in C} \psi_c(x_c)
• Y = {Y_i, i ∈ V}, Y_i ∈ {0, 1}, with
  P(Y = y \mid X = x) = \prod_{i \in V} P(Y_i = y_i \mid X_i = x_i)
Example: P(Y_i = 0 \mid X_i = 1) = \theta > 0 (false positive) – p. 8
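To make the notation concrete, here is a small numerical sketch on a toy 2 × 2 grid (not taken from the slides): the pairwise potentials, the interaction strength and the per-site error probability used below are illustrative assumptions, and the normalising constant Z is computed by brute force, which is only feasible on toy grids.

```python
import itertools
import numpy as np

# Toy 2x2 lattice: 4 binary hidden variables and the edge cliques.
V = [0, 1, 2, 3]
E = [(0, 1), (2, 3), (0, 2), (1, 3)]
beta = 0.8       # illustrative interaction strength (Potts-like potential)
err = 0.2        # illustrative per-site observation error P(Y_i != X_i)

def pair_potential(xi, xj):
    # psi_c(x_c) for an edge clique: favours equal neighbours.
    return np.exp(beta) if xi == xj else 1.0

def unnormalised_prior(x):
    # Product of clique potentials; dividing by Z gives P(X = x).
    return np.prod([pair_potential(x[i], x[j]) for i, j in E])

def obs_likelihood(y, x):
    # P(Y = y | X = x) factorises over sites (conditional independence).
    return np.prod([err if y[i] != x[i] else 1.0 - err for i in V])

configs = list(itertools.product([0, 1], repeat=len(V)))
Z = sum(unnormalised_prior(c) for c in configs)

x, y = (1, 1, 0, 0), (1, 0, 0, 0)
print("P(X=x)     =", unnormalised_prior(x) / Z)
print("P(Y=y|X=x) =", obs_likelihood(y, x))
```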

Map reconstruction
• a = {a_i, i ∈ V}, a_i ∈ {0, 1}: sampling action; y_a: observations on the sampled set a
• MPM estimator of the hidden map:
  x^{MPM} = \{x_i^{MPM}\}_{i \in V}, \quad x_i^{MPM} = \arg\max_{x_i} P_a(X_i = x_i \mid y_a)
→ Value of knowledge:
  V^{MPM}(a, y_a) = \sum_{i \in V} \max_{x_i} P_a(X_i = x_i \mid y_a)

– p. 9
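Given the posterior marginals P_a(X_i = x_i | y_a), both the MPM map and the value of knowledge V^{MPM} reduce to per-site operations. A minimal sketch, assuming the marginals are already available (the numbers below are illustrative, not computed from a real model):

```python
import numpy as np

# marg1[i] = P_a(X_i = 1 | y_a) for each site i (illustrative values).
marg1 = np.array([0.9, 0.55, 0.2, 0.48, 0.97])

# MPM estimator: per-site maximiser of the posterior marginal.
x_mpm = (marg1 > 0.5).astype(int)

# Value of knowledge: sum over sites of the largest marginal probability.
v_mpm = np.maximum(marg1, 1.0 - marg1).sum()

print("x_MPM =", x_mpm)   # [1 1 0 0 1]
print("V_MPM =", v_mpm)   # approaches |V| as the marginals become confident
```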

Static sampling
• Given an initial sample a^0 and its observations y_{a^0}, we want to optimise the next sampling action a
• Value of a sample a:
  U^{MPM}(a) = \sum_{y_a} P_{a,a^0}(y_a \mid y_{a^0}) \, V^{MPM}(a, y_a, a^0, y_{a^0})
• Optimisation problem:
  \arg\max_{a,\, |a| \le A_{max}} U^{MPM}(a)
This optimisation problem is NP-hard. – p. 10
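U^{MPM}(a) averages the value of knowledge over all possible observation vectors y_a on the candidate sample. The sketch below enumerates them exactly on a toy problem in which the current distribution over hidden maps is given as an explicit table (a randomly drawn stand-in for P(x | y_{a^0})) and observations follow a site-wise error model with illustrative parameters; exhaustive enumeration of maps, observations and candidate samples is exactly what becomes intractable on real maps.

```python
import itertools
import numpy as np

err = 0.2    # illustrative per-site observation error probability
n = 4        # toy map with 4 sites

# Current distribution over hidden maps, as an explicit table
# (random stand-in; on a real map this table is intractable).
configs = np.array(list(itertools.product([0, 1], repeat=n)))
p_x = np.random.default_rng(0).dirichlet(np.ones(len(configs)))

def obs_lik(y_a, x, a):
    # P(Y_a = y_a | X = x): site-wise errors with probability err.
    return np.prod([err if y_a[k] != x[i] else 1 - err
                    for k, i in enumerate(a)])

def value_of_sample(a):
    # U(a) = sum_{y_a} P(y_a) * sum_i max_{x_i} P(X_i = x_i | y_a)
    u = 0.0
    for y_a in itertools.product([0, 1], repeat=len(a)):
        lik = np.array([obs_lik(y_a, x, a) for x in configs])
        p_ya = float(lik @ p_x)
        if p_ya == 0.0:
            continue
        post = lik * p_x / p_ya                                # P(X = x | y_a)
        marg1 = np.array([post[configs[:, i] == 1].sum() for i in range(n)])
        u += p_ya * np.maximum(marg1, 1 - marg1).sum()
    return u

# Exhaustive search over all candidate samples of size <= 2.
best = max((a for r in range(1, 3)
            for a in itertools.combinations(range(n), r)),
           key=value_of_sample)
print("best sample:", best, "U =", value_of_sample(best))
```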

Heuristic static sampling (1)
• Explore the sites where the initial knowledge is most uncertain, i.e. where the marginal P_{a^0}(x_i \mid y_{a^0}) is closest to 1/2:
  \arg\max_{a,\, |a| \le A_{max}} \sum_{i:\, a_i = 1} \min\big(P_{a^0}(X_i = 1 \mid y_{a^0}),\, P_{a^0}(X_i = 0 \mid y_{a^0})\big)
• Computing the marginals is itself NP-hard ⇒ approximation using the sum-product algorithm

– p. 11
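Once the (approximate) marginals are available, the heuristic amounts to ranking the sites by how close their marginal is to 1/2 and keeping the A_max most uncertain ones. A minimal sketch, assuming the marginals P_{a^0}(X_i = 1 | y_{a^0}) have already been computed (e.g. by sum-product); the values are illustrative:

```python
import numpy as np

def heuristic_static_sample(marg1, a_max):
    """Select the a_max sites whose marginal P(X_i=1 | y_a0) is closest to 1/2."""
    uncertainty = np.minimum(marg1, 1.0 - marg1)   # = min of the two marginals
    return np.argsort(-uncertainty)[:a_max]        # most uncertain sites first

marg1 = np.array([0.92, 0.51, 0.10, 0.47, 0.73, 0.50])
print(heuristic_static_sample(marg1, a_max=3))     # -> [5 1 3]
```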

Heuristic static sampling (2)
The approximation corresponds to two simplifying assumptions:
• additional observations are reliable: θ = 1
• the joint probability is approximated by a product of “approximate conditional marginals”:
  P_a(x \mid y_a) \approx \prod_{i=1}^{n} P_a(x_i \mid y_a)

– p. 12

Adaptive spatial sampling (1)
• Principle:
- sampling locations are not fixed once and for all before the sampling campaign
- intermediate observations are taken into account to design the next sampling step
- a cell may be visited more than once

– p. 13

Adaptive spatial sampling (2)
• A sampling strategy δ is a tree
• A trajectory of δ is τ = (a^1, y^1, \ldots, a^K, y^K)
Value of a trajectory:
  U(\tau) = \sum_{k=1}^{K} V^{MPM}\big(P_{a^0, a^1, \ldots, a^k}(X \mid y^0, y^1, \ldots, y^k)\big)
Value of a strategy:
  V(\delta) = \sum_{\tau} P(\tau \mid \delta)\, U(\tau) – p. 14
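The expectation over trajectories can be computed recursively on the strategy tree: each branch carries the probability of an observation outcome and the value of knowledge reached after it. The sketch below uses a hypothetical node structure (a dict with an action and a list of (probability, value, child) branches) and illustrative numbers; it only shows the structure of V(δ), not how the probabilities and values would be obtained from the HMRF.

```python
# A strategy node: a sampling action plus, for each possible observation
# outcome, its probability, the value V^MPM reached after observing it,
# and the next node (None at the end of the campaign). Numbers illustrative.

def strategy_value(node):
    """Expected sum of the per-step values over trajectories of the strategy."""
    if node is None:
        return 0.0
    return sum(prob * (value + strategy_value(child))
               for prob, value, child in node["branches"])

leaf_a = {"action": {3}, "branches": [(0.6, 4.1, None), (0.4, 3.7, None)]}
leaf_b = {"action": {5}, "branches": [(0.5, 3.9, None), (0.5, 3.5, None)]}
root   = {"action": {1}, "branches": [(0.7, 3.2, leaf_a), (0.3, 2.8, leaf_b)]}

print("V(delta) =", strategy_value(root))
```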

Heuristic adaptive spatial sampling
• Exact computation is PSPACE-hard ⇒ heuristic algorithm:
- online computation
- approximate static-sampling method applied at each step

– p. 15
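The adaptive heuristic therefore alternates online between selecting the next sites with the static heuristic on the current marginals and updating the marginals with the new observations. A structural sketch under strong simplifying assumptions: sample_sites and update_marginals below are crude stand-ins for the field observations and for the sum-product update of the conditional marginals, and all numerical choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def select_most_uncertain(marg1, budget):
    # Static heuristic: sites whose marginal is closest to 1/2.
    return np.argsort(-np.minimum(marg1, 1.0 - marg1))[:budget]

def sample_sites(sites, hidden_map, err=0.2):
    # Stand-in for field sampling: noisy observation of the hidden map.
    flip = rng.random(len(sites)) < err
    return np.where(flip, 1 - hidden_map[sites], hidden_map[sites])

def update_marginals(marg1, sites, obs, weight=0.8):
    # Stand-in for re-running sum-product with the new observations.
    marg1 = marg1.copy()
    marg1[sites] = weight * obs + (1.0 - weight) * marg1[sites]
    return marg1

hidden_map = rng.integers(0, 2, size=25)   # illustrative hidden map
marg1 = np.full(25, 0.5)                   # initial knowledge: uninformative
for step in range(3):                      # K sampling steps, budget 5 each
    a = select_most_uncertain(marg1, budget=5)
    y = sample_sites(a, hidden_map)
    marg1 = update_marginals(marg1, a, y)
print("MPM map after sampling:", (marg1 > 0.5).astype(int))
```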

Evaluation of the HMRF method
• Evaluation on simulated data (Potts model with external field)
• Comparison of the behavior of:
- random sampling (RS)
- static heuristic sampling (SHS)
- adaptive heuristic sampling (AHS)

– p. 16

Procedure
• Repeat 10 times:
- simulate a hidden map x from P(x) (50 × 50 sites)
- apply a regular initial sampling (about 10% of the area): a^0
- simulate y_{a^0} from P(y_i | x_i)
- apply RS, SHS and AHS, 10 times


• Comparison on reconstruction errors


– p. 17

Error rates
[Figure: reconstruction error vs. sampling effort for static, adaptive and random sampling, for three parameter settings: α = 2, β = 0.8; α = 0, β = 0.5; α = −1, β = 0.4; with θ = 0.8]

– p. 18

Where do we sample? Hidden map
[Figure: simulated 50 × 50 hidden map; α = (1, −1), β = 0.4, θ = 0.8]

– p. 19

Where do we sample? Static sampling: A and O
[Figure: sampled sites (A) and observations (O) under static heuristic sampling]

– p. 20

Where do we sample? Static sampling: marginals
[Figure: conditional marginals under static heuristic sampling]

– p. 21

Where do we sample? Adaptive sampling: A and O (cumulated)
[Figure: cumulated sampled sites (A) and observations (O) under adaptive heuristic sampling]

– p. 22

Where do we sample? Adaptive sampling: marginals
[Figure: conditional marginals under adaptive heuristic sampling]

– p. 23

General behavior
• Adaptive HS ≥ Static HS ≃ Random S
• The gap between Adaptive HS and Static HS increases with:
- the sampling resources
- the structure of the hidden map

– p. 24

Geostatistics approach
The hidden map is a realisation of a Boolean model Ξ:
  X(s) = \mathbb{1}_{\{s \in \Xi\}}, \quad \forall s \in W, \; W \text{ a compact window}

– p. 25
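A Boolean model with disk grains can be simulated by drawing germ points from a homogeneous Poisson process on a slightly dilated window and taking the union of disks centred at the germs; X(s) is then the indicator of that union. A minimal sketch, reusing for illustration the setting of the experiments below (W = [0, 500]², λ = 10⁻⁴, r = 20); the grid discretisation details are my own choice.

```python
import numpy as np

rng = np.random.default_rng(1)
lam, r = 1e-4, 20.0           # intensity and grain radius (cf. the experiments)
W = (500.0, 500.0)            # observation window [0, 500] x [0, 500]

# Germs: homogeneous Poisson process on W dilated by r (avoids edge effects).
area = (W[0] + 2 * r) * (W[1] + 2 * r)
n_germs = rng.poisson(lam * area)
germs = rng.uniform([-r, -r], [W[0] + r, W[1] + r], size=(n_germs, 2))

def x_of(s):
    """X(s) = 1 if s belongs to the union of disks (the Boolean set Xi)."""
    return int(np.any(np.sum((germs - s) ** 2, axis=1) <= r ** 2))

# Discretise on a 50 x 50 regular grid, as in the experiments.
grid = np.linspace(5.0, 495.0, 50)
hidden_map = np.array([[x_of(np.array([u, v])) for u in grid] for v in grid])
print("occupied fraction:", hidden_map.mean())
```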

Map reconstruction versus sampling choice
• Map restoration: from the observations, using kriging
• Sample choice: given an initial sample a^0 and observations y_{a^0}, we want to optimise the next sampling action a
Previous sampling steps should influence the choice of a → conditional kriging

– p. 26

Conditional kriging
• Compute an estimator p_i^{\lambda^*}(Y_a, y_{a^0}) of E[X_i \mid Y_a, y_{a^0}], with
  \lambda^* = \arg\min_{\lambda} E\big[(p_i^{\lambda}(Y_a, y_{a^0}) - X_i)^2 \mid y_{a^0}\big]
  and p_i^{\lambda}(Y_a, y_{a^0}) = \lambda_i + \sum_{k \in a} \lambda_k Y_k + \sum_{l \in a^0} \gamma_l y_l
• Kriging estimator: x_i^{krig} = 1 if p_i^{\lambda^*}(Y_a, y_{a^0}) > 1/2, and 0 otherwise
This is possible because X and Y are of the same (binary) nature. – p. 27
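The predictor is linear in the observations, with weights chosen to minimise the expected squared error; for one target site this amounts to solving a linear system built from the covariance of the indicator field. A minimal simple-kriging sketch under illustrative assumptions: an exponential covariance with made-up sill, range and mean (the slides use the Boolean-model covariance), and the observation error θ is ignored here.

```python
import numpy as np

def cov(h, sill=0.25, range_=30.0):
    # Illustrative isotropic exponential covariance of the 0/1 field.
    return sill * np.exp(-h / range_)

def simple_kriging_weights(sample_xy, target_xy):
    """Weights of the linear predictor p_i = mean + sum_k w_k (y_k - mean)."""
    d = np.linalg.norm(sample_xy[:, None, :] - sample_xy[None, :, :], axis=-1)
    K = cov(d)                                               # sample-sample
    k0 = cov(np.linalg.norm(sample_xy - target_xy, axis=-1)) # sample-target
    return np.linalg.solve(K, k0)

sample_xy = np.array([[10.0, 10.0], [40.0, 15.0], [25.0, 60.0]])
y = np.array([1.0, 1.0, 0.0])          # binary observations at the samples
mean = 0.3                             # illustrative prior occupancy rate
w = simple_kriging_weights(sample_xy, np.array([30.0, 30.0]))
p = mean + w @ (y - mean)              # kriging estimate of E[X_i | y]
x_krig = int(p > 0.5)                  # thresholded 0/1 kriging map value
print(round(float(p), 3), x_krig)
```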

Static sampling
• Value of a sample a:
  U^{kri}(a) = -\sum_{i \in V} E\big[(p_i^{\lambda^*}(Y_a, y_{a^0}) - X_i)^2 \mid y_{a^0}\big]
• Same form as in the HMRF case:
  U^{kri}(a) = \sum_{y_a} P_{a^0,a}(y_a \mid y_{a^0}) \, V^{kri}(a^0, a, y_{a^0}, y_a)
  with V^{kri}(a^0, a, y_{a^0}, y_a) = -\sum_{i \in V} E\big[(p_i^{\lambda^*}(y_a, y_{a^0}) - X_i)^2 \mid y_{a^0}, y_a\big] – p. 28

Heuristic sampling
• Static sampling:
- the same assumptions lead to the same heuristic as in the HMRF approach
- rank the sites according to the conditional marginals
- marginals computed using simple kriging
• Adaptive sampling:
- same as in the HMRF approach
- online computation + static heuristic at each step – p. 29

Evaluation of the geostatistical method
• Evaluation on simulated data (Boolean model with disks of constant diameter as grains)
• Comparison of the behavior of:
- random sampling (RS)
- static heuristic sampling (SHS)
- adaptive heuristic sampling (AHS)

– p. 30

Procedure
• Repeat 10 times:
- simulate a hidden map x on W = [0; 500] × [0; 500]
- V = 50 × 50 regular grid
- apply a regular initial sampling (about 10% of the area): a^0
- simulate y_{a^0} from P(y_i | x_i)
- apply RS, SHS and AHS, 10 times

– p. 31

Error rates
[Figure: reconstruction error vs. sampling effort for static, adaptive and random sampling, for grain radii r = 15, r = 20 and r = 30, with λ = 10^{-4}, θ = 0.8]
Adaptive HS ≫ Static HS ≃ Random S, except when the spatial structure is high – p. 32

Where do we sample?
[Figure: sampled sites under Static HS and under Adaptive HS]

– p. 33

Comparison on Lavender map: global error
[Figure: global reconstruction error vs. sampling effort for ten variants: HMRF-static-true, HMRF-static-init, HMRF-adapt-true, HMRF-adapt-init, HMRF-adapt-reest, krig-static-true, krig-static-init, krig-adapt-true, krig-adapt-init, krig-adapt-reest]
Map provided by A. Calonnec, INRA Bordeaux

– p. 34

Comparison on Hevea map: global error
[Figure: global reconstruction error vs. sampling effort for the same ten HMRF and kriging variants]
Map provided by D. Nandris and F. Pellegrin, IRD – p. 35

Theoretical comparison
• Differences:
- Boolean models: continuous space; spatial dependence described by the covariance
- HMRF: discrete or structured space; conditional independences; can handle categorical variables

– p. 36

Theoretical comparison
• Similarities:
- Sampling: the same assumptions lead to the same heuristic
- Restoration: in the 0/1 case, kriging amounts to maximising the conditional marginals
• They differ in the choice of the approximation method:
- HMRF: marginals computed using sum-product
- Boolean models: marginals computed using kriging

– p. 37

Conclusion
• Two methods for spatial sampling of a binary spatial process
• The geostatistical approach seems to perform better. Why?
- the model?
- the criterion?
- the approximation method?

– p. 38

Around this work
• Exact algorithms for small (MAP) problems: combining variable elimination and tree search
• Mathieu Bonneau: PhD thesis on adaptive spatial sampling for weed mapping in crop fields, with Sabrina Gaba (INRA Dijon):
- development of a more general framework: knowledge gathering in graphical models
- development of approximate resolution algorithms using reinforcement learning
- application to mapping the abundances of (multiple?) species from spatial sampling of grown plants

– p. 39

Around this work: and many other interesting questions
• How should the sampling cost be modelled?
• Are there other choices for the value of a sample?
• What if the process to be mapped evolves during the sampling period?
• How can sampling actions be combined with management actions?

– p. 40