INVERSION OF REMOTE SENSING DATA IN A SHALLOW WATER ENVIRONMENT USING A TRANS-DIMENSIONAL PROBABILISTIC FRAMEWORK

A thesis submitted for the degree of Doctor of Philosophy of The Australian National University

Stephen Sagar
August 2015

© Copyright by Stephen Sagar 2015

All Rights Reserved


Declaration

Except where otherwise indicated in the text, the research described in this thesis is my own original work, figurative use of the first person plural notwithstanding. The paper reproduced in Appendix B was published in IEEE Transactions on Geoscience and Remote Sensing, and was co-authored with Vittorio Brando and Malcolm Sambridge. I was lead author for this publication and took responsibility for the research.

Stephen Sagar
March 31st, 2015


Acknowledgements

I would firstly like to thank my supervisors for providing me with such a wide range of experience and knowledge to draw on during this Ph.D. I would like to thank Malcolm Sambridge for his constant encouragement and enthusiasm for new ideas and applications, and for always being generous with his time, knowledge and humour. Thank you also to Vittorio Brando for always being open and willing to bounce around new concepts, and for being a much needed anchor to the practicalities of the remote sensing problem at hand. Many times I have left meetings with you both feeling rejuvenated about the prospects of my research, and this has been invaluable in getting me through the past four years. Although my research directions didn’t follow what he might have preferred, I would also like to thank Phil Cummins for being the initial impetus behind this Ph.D adventure, without whom it would not have happened. I am much indebted to Thomas Bodin for setting me off down the path of trans-dimensional inversion methods, and for providing a large amount of his time, knowledge and practical advice in the initial stages of my work.

This research was made possible by the support of Geoscience Australia, and I am grateful for the significant investment made by the organisation in my education and research outputs. In particular, thanks to Medhavy Thankappan, Adam Lewis and Magnus Wettle for their faith in the initial application stages of my research proposal, and their ongoing support and interest. I have also had the opportunity to work more closely with my colleagues at CSIRO during my Ph.D, and would like to thank Arnold Dekker, Janet Anstee and Hannelie Botha in particular for many interesting conversations on all things shallow water. Data used in this research has been generously provided from a number of sources, and I specifically thank John Hedley, Stuart Phinn and Curtis Mobley. Details of the funding and data sources can be found in the acknowledgements in Appendix B.

Calculations for this work were performed on the Terrawulf III cluster, a computational facility supported through AuScope. AuScope Ltd is funded under the National Collaborative Research Infrastructure Strategy (NCRIS), an Australian Commonwealth Government programme.

Finally, my most heartfelt thanks are for my family. To my wife Sonja, for providing unconditional support, love, shallow water bathymetry monsters and laughs throughout what has been a very interesting journey for both of us. To Colin, for constant companionship and some unwanted keyboard napping on cold Canberra winter days. And to my little man Lars, whose arrival could arguably be considered counterproductive, but who has certainly provided additional motivation in these end stages.


Publications

The following conference and journal papers were produced during the course of the research in this thesis:

Sagar, S., Sambridge, M., Brando, V. & Cummins, P. (2012). A Segmentation Based Bayesian Inversion Algorithm for Shallow Water Bathymetry Retrieval. Proceedings of Ocean Optics XXI, 8-12 October 2012, Glasgow, Scotland.

Sagar, S., Brando, V. & Sambridge, M. (2014). Noise Estimation of Remote Sensing Reflectance Using a Segmentation Approach Suitable for Optically Shallow Waters. IEEE Transactions on Geoscience and Remote Sensing, 52(12), pp. 7504-7512.


Abstract

Image data from remote sensing platforms offer an opportunity to observe and monitor the physical environment at a scale and precision unavailable to previous generations. The ability to estimate environmental parameters in a range of terrestrial and aquatic scenarios, often in remote and inaccessible areas, is a key benefit of using remote sensing data. Estimating physical parameters from remote sensing data takes the form of an inverse problem, predominantly tackled using single-solution optimisation approaches applied on a pixel-by-pixel basis. These types of inversion methods are poorly suited to the non-uniqueness that characterises many of the physical models in remote sensing, and often require some form of subjective regularisation to produce sensible estimates and parameter combinations.

In this thesis the inversion of remote sensing data is cast in a probabilistic framework, with the first application of a trans-dimensional sampling algorithm to this form of problem in a shallow water environment. Probabilistic sampling techniques offer considerable benefits in terms of encompassing uncertainties in both the model and the data, and provide an ensemble of solutions from which parameter estimates and uncertainties can be inferred. However, probabilistic sampling has not been widely applied to remote sensing image data, primarily due to the high dimension of the data and inverse problem. Using a physical model for a shallow water environment, we demonstrate the application of a spatially partitioned reversible jump Markov chain Monte Carlo (rj-McMC) algorithm, previously developed for geophysical applications, in which the dimension of the inverse problem is treated as unknown. To deal effectively with the increased dimension of the remote sensing problem, a new version of the algorithm is developed in this thesis, utilising image segmentation techniques to guide the dimensional changes in the rj-McMC sampling process.

Synthetic data experiments show that the segment guided component of the algorithm is essential for sampling the high dimensions of a complex spatial environment such as a coral reef. As the complexity of the data and forward model is increased, further innovations to the algorithm are introduced. These include enabling the estimation of data noise as part of the inversion process, and the use of the segmentation to develop informed starting points for the initial parameters in the model.

The original algorithm developed in this thesis is then applied to a hyper-spectral remote sensing problem in the coral waters of Lee Stocking Island, Bahamas. The algorithm is shown to produce an estimated depth model of increased accuracy in comparison to a range of optimisation inversion methods applied at this study site. The self-regularisation of the partition modelling approach proves effective in minimising the pixel-to-pixel variation in the depth solution, whilst maintaining the fine-scale spatial discontinuities of the coral reef environment. Importantly, inferring a depth model from an ensemble of solutions, rather than a single solution, enables uncertainty to be attributed to the model parameters. This is a crucial step in integrating bathymetry information estimated from remote sensing data with other traditional surveying methods.


Contents

Declaration
Acknowledgements
Publications
Abstract

1 Introduction
  1.1 Context of the Study
  1.2 Problem Statement
  1.3 Aim & Scope
  1.4 Significance
  1.5 Overview of the Thesis

2 Remote Sensing Data & Inversion
  2.1 What is Remote Sensing?
    2.1.1 Dimensionality and Resolution
  2.2 Remote Sensing and Parameter Estimation
    2.2.1 Empirical Methods
    2.2.2 Physics-based Methods
      2.2.2.1 Benefits and Limitations
      2.2.2.2 Retrieval Algorithms
    2.2.3 Dealing with Uncertainty
      2.2.3.1 Incorporating Uncertainty in the Cost Function
      2.2.3.2 Reflecting Uncertainty in Estimated Parameters
  2.3 Spatial Dependence & Coherency
  2.4 Summary

3 A Probabilistic Inversion Framework
  3.1 Introduction
  3.2 Probabilistic Inversion
    3.2.1 The Prior
    3.2.2 The Likelihood
    3.2.3 Why a Probabilistic Approach?
  3.3 Markov chain Monte Carlo
    3.3.1 Remote Sensing and McMC Inversion
  3.4 A Trans-dimensional Inversion Approach
    3.4.1 The rj-McMC Algorithm
      3.4.1.1 Partitioning and Parametrisation of the Model
      3.4.1.2 The Prior
      3.4.1.3 Proposing Model Samples
      3.4.1.4 Determining the Acceptance Probability
  3.5 Summary

4 Resolving the Spatial Dimension
  4.1 Introduction
  4.2 Raster Data Test
    4.2.1 Synthetic Data Creation
    4.2.2 Prior and Proposal Distributions
      4.2.2.1 Uninformative Priors
      4.2.2.2 Efficient Sampling and Proposal Distributions
    4.2.3 Implementation and Ensemble Generation
      4.2.3.1 Burn-In and Convergence
    4.2.4 Ensemble Interpretation
    4.2.5 Analysis of Spatial Retrieval
  4.3 Object Based Image Segmentation
    4.3.1 Segmentation Procedure
  4.4 A Spatially Guided rj-McMC
    4.4.1 Dimensional Change Moves
    4.4.2 The Prior
    4.4.3 The Proposal
    4.4.4 Acceptance Terms
    4.4.5 Comparison to the Naive Algorithm
      4.4.5.1 Sampling in a High Dimensional Space
      4.4.5.2 Preservation of Parsimony
  4.5 Uncertainty and Spatial Regularisation
    4.5.1 Application to High Noise Data
      4.5.1.1 The SG Algorithm
      4.5.1.2 A Pixel-Based McMC
  4.6 Summary

5 A Shallow Water Depth Model
  5.1 Introduction
  5.2 The Shallow Water Inversion Problem
    5.2.1 The Radiative Transfer Model
    5.2.2 Model and Data Uncertainties
      5.2.2.1 Empirical Model Inputs
      5.2.2.2 Theory Error
      5.2.2.3 Data Noise
  5.3 Synthetic Data Creation
    5.3.1 Inputs
      5.3.1.1 SIOP and Substrate Parameterisation
      5.3.1.2 Data Creation, Noise and Resampling
    5.3.2 Segmentation
  5.4 SG Algorithm Application
    5.4.1 Numerical Experiments
    5.4.2 Computational Improvements
      5.4.2.1 Likelihood Calculation
      5.4.2.2 Sample Storage
  5.5 An Alternate Noise Model
    5.5.1 A Maximum Likelihood Approach
      5.5.1.1 Acceptance Terms with a ML Noise Estimation
    5.5.2 Convergence and Sampling
    5.5.3 Results
  5.6 Summary

6 The Full Shallow Water Inverse Problem
  6.1 Introduction
  6.2 Extending to Multiple Parameters
    6.2.1 Synthetic Data Construction
      6.2.1.1 Rationale
      6.2.1.2 Data Segmentation
    6.2.2 Modifications to the Algorithm
      6.2.2.1 Non-Dimensional Change Moves
      6.2.2.2 Birth and Death Moves
  6.3 Application of the ML-SG Algorithm
    6.3.1 Parameterisation and Convergence
    6.3.2 Full Ensemble Based Solutions
    6.3.3 Parameter Interactions in the Shallow Water Model
  6.4 Error Sources and Sensitivity
    6.4.1 Increased Data Noise
    6.4.2 Empirical Parametrisation Errors
      6.4.2.1 Incorrect SIOP Parameters
      6.4.2.2 Incorrect Substrate Library
    6.4.3 Spectral Resolution Sensitivity
  6.5 A Hybrid Algorithm Approach
    6.5.1 Utilising a Segment Based Optimisation
    6.5.2 Hybrid Algorithm Results
      6.5.2.1 Joint Posterior Probability Distributions
  6.6 Summary

7 Case Study: Lee Stocking Island, Bahamas
  7.1 Introduction
  7.2 Background & Study Location
    7.2.1 Study Site
    7.2.2 Previous Studies
    7.2.3 The Data
    7.2.4 Parameterising the ML-SG Algorithms
  7.3 Inversion using the ML-SG Algorithms
    7.3.1 Segmentation
      7.3.1.1 Optimisation for the Hybrid Algorithm
    7.3.2 Probabilistic Inversion of PHILLS Data
      7.3.2.1 Stability of the Solutions
      7.3.2.2 Ensemble Solutions & Estimators
      7.3.2.3 Empirical Parameterisation of the RT Model
      7.3.2.4 Uncertainty Estimators for Depth
  7.4 Validation and Assessment
    7.4.1 A Spatially Regularised Solution
    7.4.2 Accuracy and Uncertainty
    7.4.3 Comparison to a Pixel-Based Solution
      7.4.3.1 Assessment of Water Column Constituent Solutions
    7.4.4 Implications & Discussion
  7.5 Summary

8 Future Directions
  8.1 Introduction
  8.2 The Current Algorithm
  8.3 Future Research Directions
    8.3.1 Multiple Substrate Sampling
      8.3.1.1 A Simplified rj-McMC
      8.3.1.2 Initial Results
    8.3.2 A Multiple Voronoi Partition Approach
      8.3.2.1 Implementation
      8.3.2.2 Efficiency Challenges
      8.3.2.3 Initial Results
    8.3.3 Uncertainty in the Empirical Parameterisation
  8.4 Summary

9 Synthesis and Conclusions
  9.1 Using Spatial Coherency to Assist the Inversion
    9.1.1 Tackling High Dimensions
    9.1.2 Informing the Inversion
  9.2 An Ensemble of Image Solutions
    9.2.1 Partition Modelling of Pixel-Based Data
    9.2.2 The Ensemble and Uncertainty
  9.3 A General Spatial Inversion Framework
  9.4 Criticisms and Challenges

Bibliography

A Radiative Transfer Model Parameterisation

B Noise Estimation in Shallow Water

Chapter 1
Introduction

1.1 Context of the Study

The ability to observe, monitor and assess our environment from afar is of benefit to a wide range of disciplines and applications. In the natural environment, the ability to minimise the costs, logistical restrictions and political sensitivities associated with direct access to a study site has a direct influence on the capacity to implement projects ranging from conservation to mapping. Remote sensing provides a unique tool for unobtrusive, low-impact observation of the environment across a range of spatial and temporal scales. The data provided by the vast variety of air and space-borne remote sensing sensors can provide valuable information and products on environmental conditions and phenomena at local and global scales, in both terrestrial (Liang 2008) and aquatic (Goodman et al. 2013) settings. The benefits this information can provide, both in the cost-effective manner in which it can be acquired and in the unique scale and coverage extent it offers, have resulted in remote sensing being increasingly embraced by scientists and policy makers far removed from the specific remote sensing science community. Indeed, with the broad acceptance of tools such as Google Earth, remote sensing has become well established in the public consciousness as people seek to observe and understand the world at scales and with precision unavailable to previous generations.

One of the challenges facing the remote sensing science community is the ability to bridge the gap between the scientific products that can be produced from remote sensing and the usability, format and standards required by the wide range of stakeholders in government, academia and the private sector. To increase the uptake and usability of remote sensing derived products, it is important that end users’ expectations are considered throughout the scientific process (Vanden Borre et al. 2011).

An environment particularly suited to the application of remote sensing is the shallow water aquatic zone. In this research we define the shallow water zone in relation to remote sensing, utilising the concept of optically shallow water. A body of water is considered to be optically shallow when there is a quantifiable contribution of reflected light from the substratum (or bottom type) to the remote sensing reflectance signal (Brando et al. 2009). This definition is discussed in more detail in Chapter 5, and is a function of water quality, bottom type, sensor and environmental conditions. For this research we consider the shallow water zone to consist of coastal areas with water column depths ranging anywhere from 0 to ~25 m.

The shallow water zone presents a wide range of challenges across stakeholders. Environmental conservationists and scientists are presented with the challenges of monitoring and studying world heritage listed coral reefs such as the Great Barrier Reef, the largest coral reef ecosystem on earth, distributed across approximately 346,000 square kilometres off the coast of Queensland, Australia (Day and Dobbs 2013). Governments and policy makers must tackle and balance environmental concerns with economic issues such as shipping access and fisheries, as well as plan and assess emergency management contingencies resulting from events including tsunami inundation and storm surge. Hydrographic surveyors (including the military) continue to address issues in charting water column depth, or bathymetry, in shallow water environments with restricted access constraints, whether they be political, environmental, logistical or cost-based.

The remote sensing science community is uniquely placed to tackle many of these challenges in the shallow water zone. Over the last 30+ years, remote sensing methods have been developed to assess, map and monitor coral reef systems (Yamano et al. 2002; Mumby et al. 2004; Phinn et al. 2011; Goodman et al. 2013); examine and manage fisheries (Platt and Sathyendranath 2008; Hamel and Andrefouet 2010); observe and model water quality (Phinn et al. 2005; Maritorena et al. 2010; Schroeder et al. 2012; Aurin et al. 2013); and estimate bathymetry (Lyzenga 1978; Bierwirth et al. 1993; Lee et al. 1999; Stumpf et al. 2003; Brando et al. 2009; Hedley et al. 2009). This is by no means an exhaustive list of remote sensing applications in the shallow water zone; however, it does include examples of a particular form of problem that is often tackled: that of estimating environmental properties or parameters directly from the remote sensing signal.

Applications of this type take the form of an inverse problem, in which, through the use of a physical model relating the parameters of interest to the data, we aim to infer or estimate properties of the environmental parameters (Camps-Valls et al. 2011).

An ongoing challenge in many of these parameter estimation inverse problems, and one which relates directly to the usability of the product by a stakeholder, is the ability to quantify the uncertainty of parameters produced and estimated by remote sensing analysis. This is of particular relevance where the parameters being estimated are incorporated into a broader risk assessment model, or merged with similar data derived from a technique other than remote sensing. One such parameter, and the focus of much of this research, is shallow water bathymetry. The ability not only to estimate the depth of the water column, but also to produce meaningful uncertainty estimates, is an important step in allowing full integration of remote sensing derived shallow water bathymetry with other highly calibrated bathymetry sources such as multi-beam sonar and laser airborne depth sounding (LADS).

To effectively examine uncertainty in an inversion problem, we are required to move beyond obtaining a single “best fitting” solution given the data and a selected optimisation criterion. Instead, we are interested in how plausible the obtained model is, and therefore how large the uncertainties of the parameters are in the final solutions (Debski 2010). A probabilistic inversion approach operates from a statistical point of view, enabling representation of all pieces of information in an inversion problem in probabilistic terms. Inversion and inference under this framework evaluate the a posteriori probability distribution function (or posterior PDF) over the model space, based on the observed data and any prior information on the model and parameters (Smith 1991). Therefore, instead of a single, optimum model and parameter set from an inversion solution, the user can evaluate the ensemble of model solutions characterising the posterior PDF. This distinction between a single optimal solution and an ensemble of solutions is fundamental to understanding uncertainty in the inversion problem. The posterior PDF describes the probability of any given model in the solution ensemble being the true one. Interpreting the posterior PDF as the full solution of the inverse problem therefore allows a wide range of statistical properties to be inferred from the ensemble, including parameter uncertainty (Debski 2010).
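To make this distinction concrete, the short sketch below is illustrative only: the numbers are invented, and the actual sampling algorithm is the subject of Chapters 3 and 4. It shows how, once an ensemble of solutions is available, point estimates and uncertainties become simple summaries of the ensemble rather than properties of a single best-fit model:

```python
import numpy as np

# Illustrative only: a stand-in for an ensemble of water-column depths (m) at
# one location, as would be drawn from the posterior PDF by a sampler.
rng = np.random.default_rng(42)
ensemble = rng.normal(loc=7.2, scale=0.6, size=10_000)

# With an ensemble in hand, estimators and uncertainties are simple summaries:
depth_mean = ensemble.mean()                             # posterior mean
ci_low, ci_high = np.percentile(ensemble, [2.5, 97.5])   # 95% credible interval

print(f"depth = {depth_mean:.2f} m "
      f"(95% credible interval: {ci_low:.2f} to {ci_high:.2f} m)")
```

A single-solution optimiser would return only one number here; the ensemble supports any statistical property of interest, which is the sense in which the posterior PDF is the full solution of the inverse problem.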

1.2 Problem Statement

Despite the numerous methods and algorithms developed to estimate environmental parameters from remote sensing data, to date there has been a lack of research into quantifying the uncertainty of these parameters. In order to increase the confidence that a diverse range of stakeholders have in remote sensing derived products, and therefore increase their uptake and acceptance, there is a requirement for a greater understanding of the characteristics and components of uncertainty in these products and parameter estimates.

Probabilistic inversion methods, although well suited to parameter uncertainty estimation, have not been widely used in the remote sensing science community, primarily due to two related issues: the high computational demands of probabilistic inversion algorithms, and the large size and high dimensionality of many remote sensing data and inversion problems (Baret and Buis 2008). A family of probabilistic inversion methods, known as trans-dimensional methods, treat the dimension of the inverse problem as unknown, allowing the data to inform the complexity of the model (Malinverno 2002; Sisson 2005). In particular, trans-dimensional methods which employ a spatially adaptive partitioning approach (e.g. Bodin and Sambridge (2009); Hopcroft et al. (2009)) have the potential to tackle the spatial complexity of the remote sensing problem, and offer insights into parameter and model uncertainty.

To the best of our knowledge, these types of trans-dimensional inversion methods have not previously been applied to an inverse problem of the dimension and scale encountered in remote sensing. The problem therefore lies in finding a way of addressing the dimensional challenges of applying such an inversion method to remote sensing data, in order to fully utilise the benefits it can bring to understanding uncertainty.
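To illustrate the partitioning idea referred to above, the sketch below is a simplified stand-in, not the algorithm developed in later chapters. It parameterises an image by a variable number of Voronoi nuclei, each carrying a single parameter value; every pixel inherits the value of its nearest nucleus, so a sampler that adds ("birth") or removes ("death") nuclei changes the dimension of the model itself:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)

# A model is a set of Voronoi nuclei plus one parameter value per nucleus.
# The number of nuclei -- and hence the model dimension -- is itself unknown.
n_nuclei = 5
nuclei = rng.uniform(0.0, 100.0, size=(n_nuclei, 2))  # nucleus (x, y) positions
values = rng.uniform(0.0, 25.0, size=n_nuclei)        # e.g. depth (m) per cell

# Rasterise the partition: each pixel takes the value of its nearest nucleus,
# giving a piecewise-constant model over the image grid.
ny, nx = 100, 100
yy, xx = np.mgrid[0:ny, 0:nx]
pixels = np.column_stack([xx.ravel(), yy.ravel()])
_, owner = cKDTree(nuclei).query(pixels)              # nearest-nucleus index
model_image = values[owner].reshape(ny, nx)
print(model_image.shape, model_image.min(), model_image.max())
```

The appeal for the remote sensing problem is that a handful of nuclei can describe an image of millions of pixels, with the data themselves determining how many nuclei are warranted.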

1.3 Aim & Scope

The aim of this research is to formulate the remote sensing inverse problem in a suitable probabilistic format, and to develop a method for solving the problem using probabilistic sampling of remote sensing raster data. Through this process we aim to address two key components of the problem at hand:

1. Examine the feasibility and restrictions in applying a trans-dimensional probabilistic approach to a high-dimensional remote sensing inverse problem, and develop techniques to address these.

2. Evaluate the uncertainty characteristics and interactions of the estimated parameters and models that can be inferred from an ensemble of solutions in the probabilistic framework.

Primarily, the focus of this research is on the estimation of parameters within a shallow water aquatic environment. However, where applicable, parallels and potential links to other remote sensing problems in terrestrial and aquatic environments will be identified. The physics of the shallow water radiative transfer model is not being examined here, and we do not seek to refine the model and equations outlined in Chapter 5 and Appendix A. However, we do show and discuss the interactions of various components of the model in a probabilistic inversion framework, as well as sensitivities that can be identified and examined as part of our new method of remote sensing data inversion. In a sense, the development of the probabilistic inversion method in this research has been conducted with the consideration that the user may wish to substitute any forward model into the inversion method, albeit with some restrictions that are identified and discussed throughout the thesis.

Similarly, the data used in this research each have their own specific characteristics, for example the choice of sensor and data source examined in Chapter 7. The rationale behind the choice of each data source is outlined in the corresponding case studies; however, the development and design of the inversion method itself is such that it is intended as a general framework for the probabilistic inversion of remote sensing raster data, independent of data and sensor specifications.

1.4 Significance

The primary significance of this research, on a theoretical level, is the development of the first trans-dimensional probabilistic sampling algorithm for optical remote sensing data. We are unaware of any applications of this style of method to the types of remote sensing raster data in this thesis, and specific innovations have been developed, resulting in a fundamentally new method and algorithm for remote sensing inversion. Techniques to address the computational and dimensional challenges of remote sensing inversion using a probabilistic method are proposed and tested, yielding a method that has the potential to be adapted to remote sensing inversion problems beyond the focus of the specific case studies in this research.

On a practical level, the work described here contributes to an increased understanding of the interpretation and characterisation of the uncertainty of environmental parameters estimated from remote sensing data. The specific scope of this thesis examines the shallow water inversion problem within the framework of a trans-dimensional probabilistic inversion method for the first time, providing new insights into the uncertainty of parameters estimated from this type of inversion. This moves the shallow water inversion problem away from the single model solution of the classical optimisation approach, towards a fundamentally different style of inversion in which we draw inferences from an ensemble of model solutions.

The ability to assign uncertainties to parameters, such as water column depth, begins the process of more closely aligning parameters estimated from remote sensing with those derived from other methods. In the particular case of hydrographic surveying, as with all forms of surveying, uncertainty is a fundamental component of the measurement itself. Although bathymetry models derived from remote sensing are unlikely to meet the stringent uncertainty standards specified for hydrographic surveys, they can serve as a valuable tool for non-navigational purposes, supplementing surveys in regions with limited bathymetry coverage. Thus, this research provides a framework for placing bathymetry estimates and their uncertainties in context for comparison with data from established surveying methods.

1.5 Overview of the Thesis

This thesis contains a further eight chapters that can be broadly grouped into three main themes: background and theory (Chapters 2 and 3), method development and application (Chapters 4 to 7), and future work and synthesis (Chapters 8 and 9).

In Chapter 2, we introduce the concept of remote sensing, and define key characteristics and terms relating to the nature of the data and its acquisition. We discuss and evaluate past and current methods of information retrieval from remote sensing data, drawing a distinction between the empirical and analytical methods widely used in the scientific community. With a focus on analytical inversion, we present a critical review of the current methods, where we identify and argue the need for a method which can encompass uncertainty in all facets of the model, data and parameter estimates.

In Chapter 3 we introduce and discuss the concepts of probabilistic and Bayesian inverse theory, and the tools, such as Markov chain Monte Carlo (McMC), that can be used for inversion and parameter estimation under the Bayesian framework. The limitations and challenges involved in applying these forms of probabilistic tools to remote sensing inversion problems are examined, and attempts in the literature to address these issues are identified. Based on the insights and needs presented in these two chapters, we introduce the concept and detail of a trans-dimensional Bayesian inversion algorithm, a Voronoi polygon driven reversible jump Markov chain Monte Carlo (rj-McMC), which forms the foundation of the method development in the following chapters.

Chapter 4 begins the process of method development through the application of the original version of the rj-McMC detailed in Chapter 3 to a simple grey-scale raster dataset. Insights into the ability of the method to retrieve spatial features in the grey-scale test lead to the development of a hypothesis based on using spatial coherency features of the data to guide the rj-McMC. The concept of image segmentation is discussed, and a new segment guided method is developed and evaluated on the grey-scale problem in comparison to the original “naive” version of the algorithm.

In Chapter 5, the segment guided method is extended to a simplified version of the shallow water remote sensing inversion problem. The shallow water radiative transfer forward model is described, and discussed in the context of its use in the rj-McMC framework. The forward model is then used to create a synthetic data set, with only the depth parameter varied, to test the ability of the segment guided method to retrieve the spatial structure of the data. Results of the method are analysed and evaluated, leading to a fundamental modification in how noise is considered in the algorithm, to address the increased scale and dimensionality of the new data and inverse problem. Computational speed enhancements implemented in the algorithm code are also detailed, in preparation for moving to the full radiative transfer problem and larger data sets in later chapters.

The synthetic data set is amended in Chapter 6 to incorporate spatial variability in all five parameters of the shallow water radiative transfer model. An additional hybrid method is also proposed in this chapter, utilising an optimisation approach based on the segmentation of the data set to inform the initial parameterisation of the segment guided rj-McMC. The results of all versions of the method are evaluated and discussed, looking forward to the application of the method to a real data case study in Chapter 7.

Chapter 7 details a hyper-spectral data case study at Lee Stocking Island, Bahamas, implementing the insights and method developments from the previous chapters. Results are analysed and compared to extensive field data and remote sensing studies completed at the study site, to evaluate the various components of the developed segment guided rj-McMC method.

In the final two chapters (8 and 9) we summarise the conclusions that can be drawn from the study and evaluate potential future techniques and directions for research.

In the appendices we have included work and information that is fundamental to the research in this thesis, but outside the aims and scope of its main topics. In Appendix A we detail the parametrisation of the Brando et al. (2009) semi-analytical shallow water radiative transfer model, and outline its implementation in an optimisation context. In Appendix B we detail a methodology (published as part of this research) for the estimation of a remote sensing reflectance noise covariance matrix using a segmentation approach, suitable for use in areas and imagery in the optically shallow zone.

In figure 1.1 we show a schematic overview of the algorithm structure developed in this thesis. Core algorithm components are highlighted, along with the relevant sections and chapters in which these have been developed or described. We present this algorithm schematic again, with further discussion, in Chapter 8, where it is used to summarise the innovations of the previous chapters and provide context for future research directions. In this introductory section we wish to provide the reader with a visual overview of the thesis and algorithm, to supplement the chapter descriptions and provide an early signpost to relevant sections in the manuscript.


Figure 1.1 – Schematic of Algorithm Developments in this Thesis including relevant Chapters and Sections.


Chapter 2
Remote Sensing Data & Inversion

2.1 What is Remote Sensing?

In the broadest of definitions, remote sensing can be considered the acquisition of data or information about an object without touch or contact. Clearly, this concept requires a narrowing of focus for the study of particular applications, and for the scope of this work, the definition given by Campbell and Wynne (2011) refines a number of key attributes:

“Remote sensing is the practice of deriving information about the earth’s land and water surfaces using images acquired from an overhead perspective, using electromagnetic radiation in one or more regions of the electromagnetic spectrum, reflected or emitted from the earth’s surface.”

Firstly, the definition of data as an image is key to many of the concepts explored in this work. With data in this form, in many ways an enhanced version of a digital photograph, we are dealing with a spatially continuous observation of the earth’s surface in a pixel-based raster format. The benefits and challenges involved in working with this kind of data are central to this research. Secondly, acquisition of images from an overhead perspective defines a scope broad enough to encompass observations from sensors on both airborne and space-based platforms.

The third concept of this definition can be used to draw a further distinction about the nature of the data used in this work: that of passive optical remote sensing. Passive optical remote sensing measures the signal at the sensor as the emergent radiation from the earth-atmosphere system in the observation direction, relying on solar radiation as an illumination source. This radiation is measured at different wavelengths across the electromagnetic spectrum (figure 2.1), resulting in a spectral signature for the target being observed (Camps-Valls et al. 2011).

Figure 2.1 – The Electromagnetic spectrum classification based on wavelength (Camps-Valls et al. 2011).

Passive optical remote sensing is predominantly focused on the visible (VIS) and infrared (IR) regions of the electromagnetic spectrum, which, when including the near-infrared and shortwave-infrared, can be considered the wavelength region of 400 to 3000 nm. It is within these wavelengths that the energy received at the sensor is a product of reflected sunlight, as distinct from emitted energy in the longer wavelength regions such as thermal infrared or microwave energy. Applications in aquatic environments are typically more narrowly focused on the visible (and perhaps near-infrared) portion of the spectrum, in the range of 400 to 750 nm, due to the strong absorption of water and the consequently negligible bottom signal measurable at longer wavelengths.

Drawing together these three defined concepts of remote sensing data gives a clearer scope of the type of data used in this research: pixel-based image data acquired over the visible and near-infrared portion of the electromagnetic spectrum from a passive air or space-borne sensor platform.

2.1.1 Dimensionality and Resolution

Within the now defined scope of remote sensing data for this work, there are two key concepts of resolution that contribute directly to the dimensionality of the data, and therefore to the dimension of the inverse problem that must be tackled.


The spatial resolution of a remote sensing data source is determined by the Instantaneous Field of View (IFOV) of the sensor, considered as the angle of view from which a signal is received by a single detector element of the sensor (Brown et al. 2005). In practical terms, in combination with the height and sampling rate of the sensor, the IFOV determines the spatial dimensions of a pixel in an image, sometimes referred to as the ground sample distance (GSD) (Schowengerdt 2006).

The spectral resolution of a remote sensing sensor is determined by the number, position and bandwidth of the spectral channels of the instrument across the electromagnetic spectrum (Camps-Valls et al. 2011). Higher spectral resolution is characterised by a larger number of narrow bandwidth spectral channels, which therefore produce a closer representation of the true spectral signature of an object. Figure 2.2 provides a visual illustration of two data sets with varied spatial and spectral resolution acquired over the same study area.

Figure 2.2 – An Example of Spectral and Spatial Resolution – A decrease in information content can be observed in the decreased spectral and spatial resolution of the AVNIR-2 data.

The implications of spatial resolution are self-evident in many respects: the higher the resolution, the smaller the object that can be defined on the earth’s surface, and the more accurately highly complex environments can be mapped and classified. This was demonstrated in various coral reef environments by Andrefouet et al. (2003) in an evaluation of the improved classification accuracy achieved by the higher resolution IKONOS sensor in comparison to Landsat ETM+ data. In practice, this means that a remote sensing data source is nearly always selected on the basis of the scale and requirements of the task at hand, along with the cost implications (Mumby et al. 1999), both in terms of the price of the data and the computational costs associated with an increased spatial resolution. Low spatial resolution can be used effectively for the analysis of large-scale phenomena taking place over extended areas with a relatively low spatial gradient of change. On the other hand, mapping of highly heterogeneous or complex features, such as urban infrastructure or coral reefs, generally requires a higher spatial resolution, which can markedly increase the dimensionality of the data as the study area increases (Hamel and Andrefouet 2010).

When considering the spectral resolution of the data, the complexity of the trade-offs between costs, coverage and resolution increases dramatically. Once again, the data source is generally selected to suit the requirements of the task at hand. For example, in coral reef environments it has been widely shown that lower spectral resolution sensors are either incapable of distinguishing between reef substrata types, or significantly poorer at doing so than higher spectral resolution sensors (Hochberg and Atkinson 2003; Hedley et al. 2004; Botha et al. 2013). Higher spectral resolution has historically been associated with hyper-spectral sensors, characterised by a large number of narrow contiguous bands across the electromagnetic spectrum (Camps-Valls et al. 2011). These sensors are predominantly carried on airborne platforms, where a high spatial resolution can still be achieved with a trade-off of increased acquisition costs; on satellite platforms, they tend towards a relatively lower spatial resolution. Lower spectral resolution sensors are often referred to as multi-spectral sensors, which are satellite-based sensors covering a range of spatial resolutions. Examples include QuickBird, IKONOS, Landsat and ALOS AVNIR-2, where the compensation for a decreased spectral resolution is generally a lower data cost and higher spatial coverage. As technology advances, the distinction between hyper-spectral and multi-spectral sensors continues to become less defined, with new-generation high spatial resolution space-borne sensors such as WorldView-2 bridging the gap between spectral and spatial resolution.

The aim of this section is not to provide an in-depth account of the pros and cons of the range of airborne and space-based sensors currently available. Instead, we wish to illustrate the wide range of resolution “combinations” available in the data provided by these sensors, and how these resolutions can have a strong impact on the dimensionality and computational cost associated with the analysis of these data.
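As a rough, back-of-the-envelope illustration of this point, the sensor configurations below are hypothetical, chosen only to be representative of the multi-spectral and hyper-spectral extremes. They show how quickly the data volume, and the number of unknowns in a pixel-by-pixel inversion, grow with resolution over the same study area:

```python
# Hypothetical sensor configurations over the same 10 x 10 km study area.
scenes = {
    # name: (ground sample distance in m, number of spectral bands)
    "multi-spectral satellite": (30.0, 4),
    "hyper-spectral airborne":  (1.0, 100),
}
side_m = 10_000.0
params_per_pixel = 5  # e.g. five shallow water model parameters per pixel

for name, (gsd, bands) in scenes.items():
    pixels = (side_m / gsd) ** 2
    print(f"{name}: {pixels:,.0f} pixels, "
          f"{pixels * bands:,.0f} spectral observations, "
          f"{pixels * params_per_pixel:,.0f} unknowns if inverted pixel-by-pixel")
```

The hyper-spectral scene here implies some 10^8 pixels and 10^10 spectral observations, which is the scale of problem any probabilistic sampling scheme must contend with.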

2.2 Remote Sensing and Parameter Estimation

The ability to estimate environmental parameters from remote sensing data is one of the core capabilities of the discipline, and is the subject of ongoing research in fields that span the full range of aquatic (Dekker et al. 2011; Le et al. 2013) and terrestrial (Liang 2007) applications. In the case of passive optical remote sensing, transforming the observed spectral signature of the earth’s surface into environmental parameters describing its biophysical properties takes the form of an inverse problem. Methods for solving the inverse problem range from simple empirical models utilising calibration over experimental data sets, to highly complex analytical models of physical radiative transfer processes (Baret and Buis 2008).

In this section we examine the range of remote sensing inversion methods, with a focus on the physics-based family of methods and the particular challenges and limitations inherent in their application to remote sensing data.

2.2.1 Empirical Methods

Empirical methods are based around building relationships between the observed spectral data and, in some cases, field and calibration observations, to develop indices and information on environmental parameters. They often require a relationship between the data and the physical environment to be calibrated over experimental observations, and are sometimes referred to as statistical inversion methods.

One of the most well known illustrations of an empirical method is the estimation of the Normalised Difference Vegetation Index, or NDVI. The NDVI is used as an indicator of vegetation greenness, or health, by calculating a ratio of the relative reflectance in the visible red and near infra-red (NIR) portions of the spectrum (Rouse et al. 1973). NDVI illustrates some of the characteristics attributed to empirical methods more generally: although simple to implement, they are limited by the fact that they tend to be site-, sensor- and/or time-specific (Kutser et al. 2003). For instance, an NDVI value derived using the red and NIR bands of one sensor cannot be directly compared to the NDVI derived from another sensor with a different spectral, spatial and radiometric configuration.

Some of the earliest techniques for shallow water bathymetry estimation fall into the empirical category of methods. Lyzenga (1978, 1981) developed well known and widely used techniques to retrieve bathymetry, using a ratio-based index for individual substrate types calibrated against known depth measurements. The band ratio form proposed by Lyzenga (1978) was further developed by Stumpf et al. (2003) in an attempt to account for low albedo and varying substrate composition in an image, and tuned successfully for a specific site/sensor data set by Dierssen et al. (2003) to enable examination of seagrass properties along with bathymetry. In addition to the site/sensor-specific aspects of these methods, they are generally limited by the non-uniqueness of the band ratio as a function of depth, water column constituents and bottom reflectance (Dekker et al. 2011), and also by the requirement of known depth and/or water constituent measurements or estimates in the study area with which to calibrate the algorithm. For these reasons, we focus on the family of analytical physics-based methods in this research.
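The two empirical recipes mentioned above are simple enough to state in a few lines. In the sketch below, the NDVI follows Rouse et al. (1973) directly, while the band-ratio depth follows the form of Stumpf et al. (2003); the constants m0 and m1 and the reflectance values are placeholders, since in practice m0 and m1 must be calibrated against known depths for a given site and sensor:

```python
import numpy as np

def ndvi(red, nir):
    """Normalised Difference Vegetation Index (Rouse et al. 1973)."""
    return (nir - red) / (nir + red)

def ratio_depth(r_blue, r_green, m0, m1, n=1000.0):
    """Band-ratio depth in the form of Stumpf et al. (2003): a log-ratio of
    two water-leaving reflectance bands, scaled and offset by site/sensor
    specific calibration constants m1 and m0 (placeholders here)."""
    return m1 * np.log(n * r_blue) / np.log(n * r_green) - m0

# Toy reflectance values for a single pixel:
print(ndvi(red=0.05, nir=0.40))                                  # ~0.78
print(ratio_depth(r_blue=0.02, r_green=0.05, m0=15.0, m1=25.0))  # depth in m
```

The site/sensor specificity criticised above is visible directly in the code: change the sensor's band positions or the study site, and m0 and m1 must be re-calibrated against new known depths.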


2.2.2 Physics-based Methods

Analytical, or physics-based, methods use a radiative transfer (RT) model to simulate the electromagnetic radiation interactions (absorption, scattering) of a light beam travelling through a particular medium (Mobley 1994). These models take into account the spectral, structural and biochemical properties of the environmental medium, and simulate the reflected radiation for a given observation configuration (wavelengths, sun/sensor geometry). Solving the inverse problem is therefore the application of a retrieval algorithm that in principle is able to accurately estimate the variables of interest from the observed spectral data (Camps-Valls et al. 2011). The full cycle of the forward and inverse problem in remote sensing is illustrated in figure 2.3.

Figure 2.3 – The full remote sensing forward (solid lines) and inverse problem (dashed lines) (Baret and Buis 2008).

An important consideration in the general formulation of the inverse problem in all physics-based methods is that the analytical RT model, and the additional variables used in its construction, are an approximation of the complex physical radiative transfer processes taking place in the environmental medium in question, be that in a terrestrial or aquatic setting. In the shallow water inverse problem dealt with in this research, the RT model is required to account for all aspects of the optically shallow water system shown in figure 2.4. The full details of the RT model used in this work, and the parametrisations and processes required to enable its formulation and use in the inversion process, are given in Chapter 5.

Figure 2.4 – An optically shallow water system – detailing the processes which affect the remote sensing signal (Dekker et al. 2002).
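As a concrete, heavily simplified illustration of such a forward model, the sketch below implements a generic two-flow approximation for optically shallow water: the modelled reflectance is a mix of the optically deep water-column signal and the bottom signal, the latter attenuated exponentially over the two-way path through the water column. This is a schematic stand-in only; the semi-analytical model actually used in this thesis (Brando et al. 2009) is detailed in Chapter 5 and Appendix A, and the single attenuation coefficient here collapses several absorption, backscattering and geometry terms:

```python
import numpy as np

def shallow_rrs(depth, kappa, rrs_deep, rho_bottom):
    """Schematic two-flow shallow water reflectance.

    depth      -- water column depth H (m)
    kappa      -- effective diffuse attenuation per band (1/m); in a real RT
                  model this is built from absorption, backscattering and
                  sun/sensor geometry
    rrs_deep   -- optically deep water reflectance per band
    rho_bottom -- bottom (substrate) albedo per band
    """
    two_way = np.exp(-2.0 * kappa * depth)  # down- and up-welling attenuation
    return rrs_deep * (1.0 - two_way) + (rho_bottom / np.pi) * two_way

# Toy three-band example: as depth increases, the bottom contribution vanishes
# and the spectrum converges to the deep water signal.
kappa = np.array([0.06, 0.10, 0.35])
rrs_deep = np.array([0.004, 0.006, 0.002])
rho = np.array([0.15, 0.20, 0.18])
for depth in (1.0, 5.0, 20.0):
    print(depth, shallow_rrs(depth, kappa, rrs_deep, rho))
```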


2.2.2.1 Benefits and Limitations

In contrast to empirical methods, physics-based inversion methods offer a level of transferability and repeatability between geographical locations, and in principle require no input or site data other than the observed spectral data. This means they offer the correct mathematical framework for true inference and inversion, rather than an image-based proxy solution. Advantages of this form of inversion include being able to apply one algorithm across a time series of data, or between sensors, to evaluate suitability for a particular application (Dekker et al. 2011).

The inversion problem in remote sensing is in general ill-posed and non-unique (Camps-Valls et al. 2011). This raises practical problems that must be addressed, such as the difficulty in obtaining a global and unique solution. Theoretically, this non-uniqueness means that for any one observed spectrum, there may exist a number of different parameter solution sets that generate an equally well fitting modelled spectrum using the forward model. Additionally, the problem can be ill-conditioned, where small perturbations in the observed data can lead to large changes in the inverse solution (Kabanikhin 2008; Ampe et al. 2015). Variations in the data that contribute to this non-uniqueness can result from the combined influence of uncertainties in both the observations and the model (Combal et al. 2002). In terms of the observed spectra, uncertainties come from a combination of the sensor noise of the instrument and the data processing used to convert raw sensor data to the reflectance measurement (including atmospheric correction). Model uncertainties refer to the assumptions, and sometimes simplifications, contained in the forward RT model that may not fully reflect the radiative transfer processes for the medium in question.

An example of this form of model uncertainty can be found in the use of semi-analytical forms of the RT model, one of which is used in this research and detailed further in Chapter 5. A semi-analytical model can often be parametrised from a full analytical model to isolate environmental parameters of interest, and consists of an approximate solution to the radiative transfer process, including the unknown parameters of interest (the analytical part), and assumptions regarding other parameters in the model (the empirical part) (Wang et al. 2005). Uncertainties from this kind of model are therefore based both on how well the analytical part models the radiative transfer process (theory error), and on how accurate the assumed empirical parameters are with respect to the actual physical environment. Model uncertainty can also manifest in other forms, such as that identified by Lee (2009), who examined the effects of applying a semi-analytical model for shallow water remote sensing derived for narrowband (hyperspectral) data to a wideband (multispectral) data source, finding a potential 20% model uncertainty under certain conditions that propagated through to the estimated parameters of interest.

One of the features of the physics-based approach to inversion that makes it attractive for remote sensing applications is the ability to apply methods to remote geographical locations with little or no prior information. In a practical sense, however, this often means a potential increase in some of the above mentioned uncertainties involved in the method. For example, in a semi-analytical model, assumed parameters or input constants may need to be taken from other representative environments (Sagar and Wettle 2010). In the following two sections, we look at some common retrieval algorithms used in physics-based inversions and how they deal with these related issues of non-uniqueness and uncertainty.

2.2.2.2 Retrieval Algorithms

Retrieval algorithms for physics-based inversion methods are based around the concept of matching a modelled reflectance spectrum with the observed reflectance spectrum, through the minimisation of some defined misfit or cost function, typically in the form of a least squares or Euclidean distance type metric. In the literature, the methods used to achieve this can be predominantly grouped into two classes: optimisation and look-up table (LUT) methods.

Optimisation methods employ an iterative minimisation search technique, often based on the widely used Simplex (Nelder and Mead 1965) or Levenberg-Marquardt (LM) (Levenberg 1944; Marquardt 1963) algorithms. The parameters of interest are varied in the forward RT model until the cost function defining the fit between the modelled and observed spectra is minimised. In the shallow water remote sensing inversion field, this type of method is widely used, with the closest modelled spectrum and associated parameters accepted as the solution on a pixel-by-pixel basis (Lee et al. 1999, 2001; Albert and Gege 2006; Goodman and Ustin 2007; Klonowski et al. 2007; Brando et al. 2009; Fearns et al. 2011).

To deal with the non-uniqueness of the inverse problem, optimisation methods often need to incorporate some form of regularisation to constrain and guide the algorithm to the correct solution and parameter set (Combal et al. 2002; Baret and Buis 2008). The most common form of regularisation in a pixel-based optimisation inversion is the incorporation of prior information regarding the parameters being estimated. Brando et al. (2009) used extensive fieldwork in the study area to determine optimisation ranges for the parameters of interest, and to derive specific values for the assumed parameters and inputs to the semi-analytical model used. Whilst this approach is highly beneficial in terms of minimising the non-uniqueness of the problem, the luxury of such site-specific, fieldwork-derived parametrisations is often not available. As previously stated, one of the benefits of using a physics-based method arises in precisely such a situation, where fieldwork inputs or site information cannot be obtained. In other shallow water applications, parameter ranges are “constrained within reasonable physical limits” (Goodman and Ustin 2007), or initial starting values are defined using expert knowledge of the medium (Lee et al. 1999) and/or simulations using derived analytical equations (Albert and Gege 2006). Regardless of the method used to incorporate prior knowledge, the central aim from an optimisation perspective is to avoid unlikely parameter combinations (or local minima) by reducing the parameter search subspace, and to facilitate a faster and more economical inversion (Combal et al. 2002).
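In code, a pixel-by-pixel optimisation retrieval reduces to a loop of misfit minimisations. The sketch below inverts a single pixel of the schematic two-flow model from Section 2.2.2 using SciPy's Nelder-Mead simplex implementation; the starting point, and the choice to invert only depth plus an attenuation scaling, are placeholders standing in for the prior constraints discussed above:

```python
import numpy as np
from scipy.optimize import minimize

# Schematic two-flow forward model (see Section 2.2.2), fixed bands/substrate.
KAPPA = np.array([0.06, 0.10, 0.35])
RRS_DEEP = np.array([0.004, 0.006, 0.002])
RHO_BOTTOM = np.array([0.15, 0.20, 0.18])

def forward(depth, kappa_scale):
    two_way = np.exp(-2.0 * kappa_scale * KAPPA * depth)
    return RRS_DEEP * (1.0 - two_way) + (RHO_BOTTOM / np.pi) * two_way

def cost(params, observed):
    """Least-squares misfit between modelled and observed spectra."""
    depth, kappa_scale = params
    return np.sum((forward(depth, kappa_scale) - observed) ** 2)

# Synthesise an "observed" pixel with known parameters, then invert it.
observed = forward(6.0, 1.0)
result = minimize(cost, x0=[10.0, 0.5], args=(observed,),
                  method="Nelder-Mead")  # simplex search (Nelder & Mead 1965)
print(result.x)  # best-fit (depth, kappa scale); ~(6.0, 1.0) in this toy case
```

Because only the single best-fitting parameter set is kept, any other parameter combination producing a near-identical spectrum is silently discarded, which is exactly the non-uniqueness problem described above.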
Look-Up Table (LUT) methods involve the construction of a database of spectra using an analytical or semi-analytic forward RT model. Once constructed, the database is searched to find the modelled spectra which minimises the cost function when compared to observed spectra. The associated parameters of interest used to construct the best-fit model are then determined to be the solution for that observation. Inversion methods using LUTs have been used widely in terrestrial remote sensing applications such as estimating canopy characteristics (Weiss et al. 2000; Combal et al. 2002; Atzberger and Richter 2012; Laurent et al. 2013), and to a lesser extent in shallow water aquatic environments (Mobley et al. 2005; Kutser et al. 2006; Hedley et al. 2009). These methods overcome the potential local minima issues encountered in optimisation techniques, as the search and solution is global, reaching the true minimum of the database (Baret and Buis 2008). The limitations of the method are therefore based primarily on the method in which the database is constructed (Weiss et al. 2000; Hedley et al. 2009).

The challenges involved in the construction of a LUT database centre around the ability to create a LUT that is representative of the full parameter space and environmental conditions, whilst maintaining a manageable computational size and structure (Combal et al. 2002). Hedley et al. (2009) present an adaptive look-up tree algorithm aimed at efficiently constructing a LUT in spectral space. Given an upper and lower bound on the parameters used in the LUT construction, the algorithm adaptively subdivides that parameter space to ensure even sampling of the spectral space. This results in larger parameter steps being used in situations where the parameter in question has a minimal influence on the forward modelled spectra, and inversely, smaller steps when the parameter has a strong influence on the shape or magnitude of the modelled spectra. Similarly, Weiss et al. (2000) use a derived distribution and transformation model of each parameter based on the forward model reflectance sensitivity across the parameter range, again aiming for an even sampling of the spectral space. In this method the parameter transformation model is selected in a manner described by the authors as "trial and error", rather than the recursive subdivision method in Hedley et al. (2009), and is specific only to the canopy RT model used in the study.

A good summary and evaluation of the state-of-the-art optimisation and LUT physics-based methods currently being applied to the shallow water inversion problem is given by Dekker et al. (2011).
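The search step itself is simple, as the sketch below illustrates for a naive regular-grid LUT built from the same toy forward model used earlier; real implementations replace this brute-force grid with the adaptive sampling schemes just described. All parameter ranges and values are invented for illustration.

```python
import numpy as np

# Naive regular-grid LUT over two hypothetical parameters (depth, brightness).
depths = np.linspace(0.5, 20.0, 40)
brightness = np.linspace(0.5, 1.5, 11)
kd = np.array([0.04, 0.06, 0.10, 0.35])
r_deep = np.array([0.010, 0.012, 0.008, 0.001])
albedo = np.array([0.30, 0.35, 0.40, 0.42])

params = np.array([(d, b) for d in depths for b in brightness])
atten = np.exp(-2.0 * np.outer(params[:, 0], kd))       # (entries, bands)
lut = r_deep * (1.0 - atten) + params[:, 1:2] * albedo * atten

def lut_invert(observed):
    # Scan every entry: the minimum cost found is global by construction.
    cost = np.sum((lut - observed) ** 2, axis=1)
    best = np.argmin(cost)
    return params[best], cost[best]

observed = lut[215] + 0.0005          # synthetic pixel with a small offset
print(lut_invert(observed))
```

The trade-off noted above is visible here: halving the parameter step in each dimension quadruples the table size, which is what motivates adaptive construction schemes.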
Common to both the optimisation and LUT family of inversion methods is the selection of a single best-fit model. Whilst optimal a priori information can be incorporated into the various optimisation models and LUT construction, anything less than an optimal parametrisation of these models results in an increase in the non-uniqueness and model uncertainties associated with the inversion problem, in addition to the uncertainties associated with the actual observations. In their simplest form, these single solution methods are generally ill-equipped to deal with these uncertainties and to reflect them in the estimated environmental parameters.

2.2.3 Dealing with Uncertainty

The ability to deal with uncertainty effectively in the remote sensing inversion process is crucial to the acceptance and uptake of derived products by users outside of the remote sensing community. In the case of bathymetric products, satellite-derived bathymetry is often promoted as a complementary data source to standard hydrographic survey methods (Quadros 2013), but without the strictly defined standards and uncertainties assigned to these other survey products. The inability to quantify the uncertainty of estimated environmental parameters has a flow-on effect when they are required as inputs to other models or processes, consequently limiting the uncertainty evaluation of these next-level products (Lee et al. 2010b).

At this point it is useful to make the distinction between the accuracy and uncertainty of an inversion solution and its associated parameter estimates. Often, particularly in shallow water bathymetry estimation, the quality of the depth model is based on its accuracy in comparison to known bathymetric measurements in the study area, using a metric such as the root mean square error (RMSE) or a correlation coefficient (Stumpf et al. 2003; Goodman et al. 2008; Brando et al. 2009; Dekker et al. 2011; Fearns et al. 2011). This is an important process, and the only way to assess the absolute accuracy of the derived measurement. It does however create a somewhat contradictory problem in respect to the promoted benefits of the physics-based method, in that it requires no known depths and as such can be applied in remote areas. In these situations reflecting the uncertainty of the estimated depths takes on even greater importance; however, this must be approached with care as it is well known that a highly precise measurement can also be highly inaccurate.

In the literature there are generally two components of dealing with uncertainty in remote sensing inversion problems: incorporating the uncertainty into the inversion process itself (Lauvernet et al. 2008; Brando et al. 2009; Timmermans et al. 2009; Laurent et al. 2013), and/or propagating it into the solution parameters (Wang et al. 2005; Lee et al. 2010a; Hedley et al. 2012a). In both of these components, much centres around what is known about the uncertainty in the problem, what it is considered to encompass, and how it is estimated.

2.2.3.1 Incorporating Uncertainty in the Cost Function

Efforts to incorporate uncertainty into the inversion process often involve methods of reflecting it within the minimised cost function. Brando et al. (2009) use an integrated spectral measure, the noise equivalent reflectance difference (NEΔR), which captures both the sensor signal-to-noise ratio and scene-specific characteristics (Brando and Dekker 2003; Wettle et al. 2004). Details and methods of calculating this measure are discussed further in appendix B. The NEΔR is used in two stages of the Brando et al. (2009) method: 1) as a weighting factor in the cost function that is minimised as part of the optimisation inversion; and 2) to determine a quality indicator output of the inversion defined as the Substratum Detectability Index (SDI). The SDI is used to reflect levels of detectability, in terms of the NEΔR, of the modelled reflectance signal from the shallow water substrate in relation to the modelled deep water reflectance signal. Thus, the SDI is able to act as a confidence indicator for the estimated parameters, such as depth, as a function of the contribution of the substrate reflectance.

These indicators, the minimised cost function and SDI, provide uncertainty-informed information on the quality of the inversion closure and the confidence that can be placed in the estimated parameters, although they do not translate to a quantifiable uncertainty for the estimated parameters. For example, a low minimised closure value and a high SDI would indicate a "better" solution, and more confidence could be placed in the depth estimated in this pixel than in a corresponding high closure value, low SDI neighbour. However, this difference is not quantifiable in standard uncertainty terms such as standard deviation or confidence intervals, and as such can only be used in a qualitative manner. Additionally, model uncertainties are not reflected in the Brando et al. (2009) method, hence its application is highly dependent on the accuracy of the RT model and associated parametrisation.

In terrestrial canopy (Timmermans et al. 2009; Laurent et al. 2013) and land surface (Lauvernet et al. 2008) optimisation inversion studies, uncertainty has been incorporated through the use of a cost function which minimises χ² as a function of the observed and modelled data and parameters (Tarantola 2005). The general form of this cost function as used by Laurent et al. (2013) is described by

$$\chi^2 = \frac{1}{2}(L_o - L)^T C_o^{-1} (L_o - L) + \frac{1}{2}(v_a - v)^T C_a^{-1} (v_a - v), \qquad (2.1)$$

where L_o is the observed data, L is the modelled data, C_o is the covariance matrix of the data and model uncertainties, v is the vector of estimated parameter values, v_a is the vector of a priori parameter values, and C_a is the covariance matrix of the a priori parameters. This form works as a type of regularisation for the very high dimensions of the canopy inversion problem by providing a framework to incorporate prior uncertainties on the estimated parameters, as well as uncertainties associated with the data and model parameters.

To take full advantage of this formulation requires thorough knowledge of the full characteristics of both covariance matrices dealing with data, model and a priori parameter uncertainties. Additionally, this approach makes the possibly restrictive assumption that all errors and prior information on the model parameters follow a Gaussian distribution. Lauvernet et al. (2008) demonstrate the regularisation strength of the approach through its application to a very high dimensional joint atmospheric and canopy parameter retrieval problem. Whilst it is successful in minimising the ill-posedness of the problem, the use of synthetic data in the study ensured that the characteristics of both of the covariance matrices in the equation were fully known, a situation made much more difficult in a real world study, which is acknowledged by the authors. This issue is apparent in the vegetation canopy inversion study by Laurent et al. (2013). In this work, rigorous field work and modelling was completed prior to the inversion to determine accurate a priori estimates of the parameters and their uncertainties to populate the C_a matrix. Conversely, the covariance matrix C_o reflecting the data and model uncertainty was set as the identity matrix, due to the difficulty in estimating these uncertainties. This is noted in the study as a critical issue that requires further research to properly balance the radiometric and a priori information. Baret and Buis (2008) further note that in this cost function model, the off-diagonal covariance terms in C_o are even more difficult to estimate, primarily due to model uncertainties, and are often poorly known.
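Equation 2.1 translates directly into code. The sketch below is a minimal transcription with invented diagonal covariances; as the discussion above makes clear, populating C_o and C_a realistically is the hard part in practice.

```python
import numpy as np

def chi_squared(L_obs, L_mod, C_o, v, v_a, C_a):
    """Cost function of Eq. 2.1: a radiometric misfit term plus an
    a priori penalty on the parameters, each weighted by its covariance."""
    r_data = L_obs - L_mod
    r_prior = v_a - v
    return 0.5 * r_data @ np.linalg.solve(C_o, r_data) \
         + 0.5 * r_prior @ np.linalg.solve(C_a, r_prior)

# Invented numbers: four bands, two parameters (e.g. depth, brightness).
L_obs = np.array([0.021, 0.025, 0.018, 0.004])
L_mod = np.array([0.020, 0.026, 0.017, 0.004])
C_o = np.diag(np.full(4, 1e-6))   # data/model covariance (often poorly known)
v = np.array([6.2, 0.9])          # current parameter estimates
v_a = np.array([5.0, 1.0])        # a priori parameter values
C_a = np.diag([4.0, 0.25])        # a priori parameter covariance
print(chi_squared(L_obs, L_mod, C_o, v, v_a, C_a))
```

Setting C_o to the identity, as Laurent et al. (2013) were forced to do, amounts to declaring every band equally and independently uncertain at a fixed scale, which silently rebalances the two terms of the cost.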


Critically, whilst a full range of uncertainties can theoretically be accounted for in these maximum likelihood Bayesian cost function methods, effective propagation of these uncertainties to the solution requires a linear problem where all uncertainties are Gaussian and well characterised. It is where these criteria and assumptions are not met that we require a more flexible framework for incorporating uncertainty into the inversion solution and estimated parameter set.

2.2.3.2 Reflecting Uncertainty in Estimated Parameters

The problem of reflecting the uncertainties of estimated parameters using "single solution" remote sensing inversion techniques (optimisation, LUTs) has been tackled using a variety of theoretical and experimental methods. In an analytical ocean colour inversion study, Lee et al. (2010a) use the theory of error propagation to estimate the uncertainties of inverted inherent optical properties (IOP) parameters. This method is employed on a step-wise quasi-analytical algorithm (Lee et al. 2002), by developing analytical equations to propagate uncertainties of four variables in the algorithm throughout the inversion. This theoretical form of error propagation is effective, but only with a thorough knowledge of the specific RT model and the relative uncertainties of the variables used to propagate through the algorithm. For a full evaluation of uncertainties, and the ability to produce quality maps, it is also acknowledged that data and model uncertainties would need to be incorporated into the error propagation process.

Wang et al. (2005) address the ocean colour IOP uncertainty problem by creating an ensemble of inversion results from simulated and field based reflectance measurements. The ensemble is generated by an optimisation inversion of each measured spectra, using a different combination of spectral IOP shape parameters in the RT model. From this full ensemble of solutions, those within 10% of the measured spectra at all wavelengths are considered as belonging to the final ensemble, and their associated IOP parameter estimations used to infer uncertainty. The approach is highly dependent on the error budget assigned to select the final ensemble of solutions, which is based on a combination of data and model uncertainties. The process used to select 10% as the final non-wavelength-specific measure appears somewhat arbitrary, with a much wider wavelength-dependent range of uncertainties described for each of the uncertainty budget consideration areas. Given that this selection is the primary driver behind the magnitude of the parameter uncertainty estimations, this would appear to be a significant shortcoming of the method. Similar to Lee et al. (2010a), for the Wang et al. (2005) process to be applied to real remote sensing reflectance data, data noise associated with the remote sensing reflectance signal would need to be incorporated in the methodology.

In a shallow water environment, Hedley et al. (2012a) propose an innovative technique for propagating noise and environmental uncertainties through a radiative transfer LUT inversion method (Hedley et al. 2009). In this study, synthetic data-sets representing three multi-spectral sensors (Landsat ETM+, SPOT-4 and the proposed Sentinel-2 sensor) were simulated from an underlying hyperspectral data-set. Using an estimation technique based on Wettle et al. (2004), NEΔR covariance matrices were derived from real Landsat and SPOT imagery, and the published Sentinel-2 signal-to-noise ratio values. The estimated NEΔR matrix was used to perturb the simulated remote sensing reflectance, by creating twenty new spectra with additive noise components. This ensemble of spectra is inverted independently to estimate an ensemble of twenty parameter values for each pixel, from which uncertainties can be estimated. This approach is a practical method for estimating parameter uncertainties and could easily be applied to real imagery and other inversion problems/methods, providing an accurate estimation of the NEΔR covariance matrix can be made.

One issue that is not discussed in the paper is the potential influence of the ensemble sample size (20) chosen to infer the mean and uncertainty of each pixel solution, and how dependent these propagated uncertainty estimates are on the chosen sample size. It is understandable that the number of perturbed ensemble spectra be limited, due to the computational time of the inversion. It would be interesting however to see a distribution of the ensemble of estimated parameters for a single pixel, and whether this distribution remains constant for an increasing number of perturbed spectra in the ensemble sample.
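The mechanics of this perturbation approach are compact enough to sketch. Below, noise realisations are drawn from an assumed NEΔR covariance, each perturbed spectrum is inverted with a user-supplied routine (for instance the hypothetical lut_invert above), and the parameter ensemble is summarised; the covariance values are invented.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
noise_cov = np.diag(np.full(4, 2.5e-7))  # assumed NEΔR covariance (4 bands)

def perturbation_ensemble(observed, invert, n_members=20):
    """Invert n_members noise-perturbed copies of one spectrum and
    summarise the resulting parameter ensemble. Hedley et al. (2012a)
    use 20 members; sensitivity to this choice is discussed above."""
    zero_mean = np.zeros(len(observed))
    noise = rng.multivariate_normal(zero_mean, noise_cov, size=n_members)
    # invert() is assumed to return (parameters, cost), as lut_invert does.
    estimates = np.array([invert(observed + eps)[0] for eps in noise])
    return estimates.mean(axis=0), estimates.std(axis=0)

# Usage (with the earlier hypothetical LUT routine):
# param_mean, param_std = perturbation_ensemble(observed, lut_invert)
```

Plotting the ensemble for a single pixel at increasing n_members would directly address the sample-size question raised above.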

2.3 Spatial Dependence & Coherency

As a spatially referenced measurement of the environment, remote sensing data naturally exhibits varying degrees of spatial dependence and coherency. Fundamentally, this is expressed by the geostatistical principle that nearby pixels have a tendency to be more similar than pixels further apart; if the reflectance is large in a particular band of one pixel, it is likely to be large in its neighbour (Atkinson 2002). This principle is the driver behind the increased use of techniques such as object-based image analysis for classification in aquatic (Phinn et al. 2011; Lyons et al. 2012) and terrestrial (Ke et al. 2010; Myint et al. 2011) environments, and has been used to assist in the constraint of problems such as spectral unmixing (Martin and Plaza 2011; Shi and Wang 2014). In the context of remote sensing inversion problems, the use of spatial constraints or structural characteristics has been largely neglected, with most inversion retrieval algorithms applied to independent pixels (Baret and Buis 2008).

One of the first attempts to introduce spatial information into an inversion method was by Atzberger (2004), who demonstrated that incorporating an object signature, in addition to the individual pixel spectral data, improved inversion performance. The motivation behind the Atzberger (2004) study was to regularise the non-uniqueness of the canopy inversion problem, using neighbourhood data characteristics rather than prior information on the parameters of interest. The approach was tested on synthetic Landsat data built to represent "agricultural fields", consisting of 5 × 5 pixels, with a neighbourhood object signature drawn from each field and inverted along with each individual pixel spectra. This method was further developed by Atzberger and Richter (2012) to incorporate a hierarchical style approach to constrain spatial variability of parameters at different object scales, whilst utilising the object signature of a moving 3 × 3 pixel kernel. In this study, fields/objects were manually defined from the remote sensing imagery and spatial constraints imposed on the average leaf angle (ALA) parameter, under the assumption that it was constant for each individual field. Within each field, a moving 3 × 3 object kernel was used to further constrain other variables at that scale in the RT model, whilst also contributing neighbourhood information to the inversion. Both of these studies showed a significant improvement in the parameter retrieval using neighbourhood information when compared to an individual pixel inversion.

Recently, Laurent et al. (2013) proposed a variation on this canopy inversion object-based approach, applying it to hyperspectral data and incorporating prior information in the form of the cost function (Eq. 2.1) described in section 2.2.3.1. A two step approach is utilised, firstly optimising an average spectral signature from each digitised field/object. Secondly, with the assumption that some of the estimated parameters will remain constant for the object, the spectra are used to tailor the construction of an object-specific LUT for use in a pixel based inversion. The results are compared to those from a regular pixel based LUT inversion, with a significant improvement in accuracy observed when compared to field observations.

These object-based methods have all been shown to be effective as a regularisation tool in the canopy inversion problem, and illustrate the potential of using spatial information more generally in inversion studies. It is however worth considering how the "objects" in the three aforementioned studies are defined. In Atzberger (2004), the object size is known a priori as a characteristic of the synthetic data, whilst in Atzberger and Richter (2012) and Laurent et al. (2013) object fields are manually digitised, under the assumption of expert or prior knowledge. In a digitisation process such as this, there will always be a degree of subjectivity in the object definition, and as the object boundaries drive the estimated parameters and are not adaptive within the inversion, errors in the object definition will flow directly into the estimated parameters. In the case studies in question here, where agricultural fields tend to be defined by highly contrasting boundaries in sharp geometrical shapes, this issue may be of minimal concern. In environments where "objects" are less clearly defined, and boundaries considered "fuzzy" in a spectral separability sense, this component of the method needs to be more thoroughly considered.

Spatial information has also been considered in the inversion problem without an explicit object definition, through the incorporation of spatial smoothness constraints. These types of approaches develop a hypothesis on the parameters being estimated (or in the model), with respect to the scale and degree to which they vary spatially in the environment. At a larger scale, Lauvernet et al. (2008) demonstrated the ability to constrain atmospheric parameters in a constant window approach when inverting a coupled surface-atmosphere radiative transfer model, whilst Wang et al. (2008) propose a variable scale smoothness constraint to assist in regularisation of land surface parameter retrievals.

In the smaller scale shallow water inversion problem, spatial information has had very limited use, with conflicting results from studies that have implemented spatial smoothness constraints within the inversion. In their study on the influence of data pre-processing on the retrieval of bottom depth and reflectance, Goodman et al. (2008) implement a kernel processing operation that imposes spatial uniformity of water properties within a moving kernel, whilst allowing independent pixel based estimates of depth and bottom reflectance. The inversion is completed in an optimisation framework, with the kernel cost function minimised to estimate one set of three water property parameters for all kernel pixels, and individual depth and bottom reflectance parameters for each pixel in the kernel. The implication apparent in this cost function formulation is that if the hypothesis of constant water properties within the kernel does not hold fully, the water property parameters estimated from the full kernel minimised solution will result in a sub-optimal solution at a pixel level. This possible scenario is reflected in the results, with the authors observing a decreased accuracy, primarily in the bathymetry, when using the kernel processing option. Filippi and Kubota (2008) propose a method based on a similar hypothesis, with the addition of proposed spatial smoothness constraints also on the depth and substrate reflectance. The method incorporates a linear diffusion term into the minimised cost function to impose smoothness of all parameters across a moving kernel of 4 neighbouring pixels to each evaluated pixel. The process visits each pixel ~5 times to fully realise the inversion result and smoothness constraint. The authors found significant accuracy improvement in estimated depths and the removal of random artefacts when compared to an optimised inversion of the original un-smoothed RT model.

There are a couple of possible reasons as to why these two methods have produced conflicting results, even when both employed the same semi-analytical RT model developed by Lee et al. (1998, 1999). Firstly, the spatial resolution of the data sets used in each study differs substantially. Goodman et al. (2008) use AVIRIS data with a 20 m pixel resolution, whilst the Filippi and Kubota (2008) study is completed using 1.3 m pixel resolution PHILLS data. The implication here is that whilst the hypothesis of smoothness appears to have held for all five parameters of a kernel region of 3.9 m width, the same hypothesis for only the three water property parameters does not hold over the larger 60 m wide kernel. This raises the issue of scale in spatial coherency processes such as these. In a highly heterogeneous environment such as a coral reef, an assumption of smoothness may hold, but at different and variable scales across the image for each of the estimated parameters. This issue in turn is related to the data being analysed; in a heterogeneous environment a smoothness assumption for a parameter such as depth is likely to prove less valid as the spatial resolution of the data source decreases. Secondly, the iterative approach used by Filippi and Kubota (2008), where each pixel is visited a number of times, may have a beneficial effect in diffusing the smoothness constraint appropriately throughout each pixel neighbourhood. We would however consider that the scale issue identified above is the major contributor to the accuracy improvement observed.
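To make the shared-parameter formulation concrete, the sketch below packs one set of water properties for a whole kernel together with per-pixel depth and bottom parameters into a single vector, in the style of the Goodman et al. (2008) kernel cost described above; the packing scheme and the forward_model signature are assumptions for illustration.

```python
import numpy as np

def kernel_cost(theta, spectra, forward_model, n_water=3):
    """Summed least squares misfit over all pixels in a kernel.

    theta packs [shared water properties (n_water values) |
                 per-pixel (depth, bottom) pairs, one per spectrum].
    If water properties actually vary inside the kernel, the shared
    block forces a compromise that degrades the per-pixel fit.
    """
    water = theta[:n_water]
    per_pixel = theta[n_water:].reshape(len(spectra), 2)
    total = 0.0
    for obs, (depth, bottom) in zip(spectra, per_pixel):
        modelled = forward_model(water, depth, bottom)  # hypothetical model
        total += np.sum((modelled - obs) ** 2)
    return total

# For a 3 x 3 kernel this gives 3 + 9 * 2 = 21 unknowns in one joint
# optimisation, versus 5 unknowns in each of 9 independent inversions.
```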
The issues identified in the methods in this section all point to the need for a degree of flexibility in the process of utilising spatial dependency. This could take the form of accounting for potential subjectivity in object definition, or of allowing an adaptive component to the scale-dependent hypothesis that drives the smoothness constraint methods. In a heterogeneous environment the challenge appears to be how to account for highly varying degrees of spatial coherency across an image whilst still using it as an effective regularisation tool.

2.4 Summary

In this chapter we have introduced the concept of the retrieval of environmental parameters from remote sensing raster data using physics-based inversion methods. One of the key issues in the inversion of remote sensing data is the non-uniqueness of the inversion problem. A number of regularisation techniques have been discussed which largely overcome this non-uniqueness problem. However, fully incorporating uncertainty into the inversion and propagating it into meaningful uncertainty estimates associated with retrieved parameters remains an area which has received little attention in the literature. This is particularly the case where the uncertainties associated with the problem are poorly known or difficult to characterise.

Of the methods that have addressed uncertainty in the final estimated parameters of the inversion, the generation of an ensemble of solutions driven by model and noise variations is shown to be a practical approach to enabling uncertainty to be quantified. This leads us to consider a new family of probabilistic inversion methods in the following chapter which infer all the characteristics of the estimated parameters from an ensemble solution. These methods offer a flexible framework for propagating uncertainty in the non-linear remote sensing inverse problem, where uncertainties in the problem and prior knowledge may not follow a Gaussian distribution.

To assess the benefits of considering spatial coherency, in the following chapter we look to incorporate a spatial component to the probabilistic inversion, aiming to address the requirements of flexibility and adaptivity of scale that have been identified in this chapter.

Chapter 3

A Probabilistic Inversion Framework

3.1 Introduction

In this chapter we begin by introducing the concept of a probabilistic inversion framework, through a review of Bayesian theory and Markov chain Monte Carlo (McMC) methods that can be applied to a range of inverse problems. In doing so we are looking to contrast the fundamental differences between these methods and a single solution optimisation approach, and in particular how these differences relate to the issues in remote sensing data inversion identified in the previous chapter. We then discuss the factors that have limited the application of probabilistic inversion methods to remote sensing data to date. This leads us to examine the features of a trans-dimensional Bayesian inversion method and the potential benefits and challenges involved in implementing this form of algorithm on remote sensing raster data.

The aim of this chapter is to introduce the conceptual foundation of these methods in a remote sensing context, so that the more specific aspects of their implementation and theory can be addressed in subsequent chapters dealing with their application, development and modification. In that manner, we conclude the chapter with an outline of the trans-dimensional algorithm that will form the foundation of the methods further developed in later chapters.


3.2 Probabilistic Inversion

In a probabilistic framework, the inverse problem can be considered as a type of information inference process. This means that the solution of the inverse problem is regarded as a kind of extraction and joining of all the available information at hand to solve the problem (Debski 2010). The core concept of a probabilistic inversion approach is the representation and mapping of all of this information into probabilistic terms, using a Bayesian interpretation of probability that considers each piece of information in an equivalent form of a probability distribution (Smith 1991; Box and Tiao 1992). This concept extends beyond the information inputs to the inverse problem. In a probabilistic framework, the solution is a probability distribution of the estimated parameters over the model space, quantifying the "chance" that a given model and set of parameters is true (Debski 2010). Such an approach has, in principle, the potential to provide a framework to encompass the various aspects of uncertainty in the physical remote sensing problems identified in Chapter 2. This not only includes the uncertainty associated with the model and observed data, but importantly, that which can be inferred about the estimated parameters of interest.

If we were to consider a physical system and inverse problem, such as we encounter in a remote sensing problem, any given set of physical parameters to be estimated can be considered as a model (m) existing as a point in the model space (Mosegaard and Tarantola 1995). To cast this problem into the probabilistic framework, we must be able to define and quantify probability distributions within the model space describing all of the pieces of information (and their associated uncertainties) we have regarding the problem. This includes the observed data, any prior knowledge we have about the parameters of interest, and the physical model itself (MacKay 2003). In Bayesian terminology, the relationship between these probability distributions is given by Bayes' rule (Bayes 1763),

$$\text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}, \qquad (3.1)$$

or in notation terms

$$p(m \mid d_{obs}) = \frac{p(d_{obs} \mid m)\, p(m)}{p(d_{obs})}, \qquad (3.2)$$

where dobs is the vector of observed data and m is the vector of unknown model
parameters. In this notation, x | y means x conditional on y; interpreted in terms of the posterior p(m | dobs), it is the probability of obtaining the set of model parameter unknowns m given the observed data dobs. The likelihood p(dobs | m) describes the probability of observing the data given a particular model m, while the prior p(m) contains the prior information we have on the model, represented by a probability density over m. As described in Bodin and Sambridge (2009), the term p(dobs) is not a function of the model m, simplifying to

$$p(m \mid d_{obs}) \propto p(d_{obs} \mid m)\, p(m). \qquad (3.3)$$

To frame this as a parameter estimation problem, the full posterior probability density function (PDF) contains all the inferred information on the parameters m in the model space. Thus, the posterior PDF solution is a function of what we know of a particular model m before considering the data (the prior), and a probabilistic view of the fit between data predicted from the model, g(m), and the observed data dobs (the likelihood).

3.2.1 The Prior

The prior in our Bayesian formulation of the inverse problem is a key and somewhat controversial component of the probabilistic inversion approach (Scales and Tenorio 2000). The prior represents any a priori knowledge or information that we may have regarding the model, before the data has been considered. Therefore, in a strict Bayesian formulation, the prior should be formed independent of the observed data in the inverse problem (Tarantola 2005). This knowledge to form the prior can come from a wide range of sources: earlier or less accurate experiments, field observations or knowledge of the study area, or simply physical constraints of the problem at hand. For example, if we considered a shallow water bathymetry inversion problem, physical constraints could lead us to reasonably define a prior that restricts the water column depth to be a non-negative number. With some expert knowledge of the study area, or other bathymetry data, we might be able to further refine this prior by setting an upper limit to the depths we expect in our inversion solution. The most common criticism that is made of the Bayesian approach is the subjective nature in which prior knowledge is interpreted and represented as the required probability distribution in the inversion framework (Scales and Tenorio 2000).
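For the bathymetry example above, such a prior could be written as a simple bounded uniform density; the sketch below works in log space, and the bounds are illustrative only.

```python
import numpy as np

def log_prior_depth(depth, z_min=0.0, z_max=30.0):
    """Uniform prior on water column depth: constant density inside a
    physically reasonable interval, zero outside it (bounds illustrative)."""
    if z_min <= depth <= z_max:
        return -np.log(z_max - z_min)  # normalised uniform log-density
    return -np.inf                     # physically impossible model
```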
Even within Bayesian applications, there exist approaches such as empirical Bayes, which aims to form a more objective prior through interpretation of the observed data, prior to the inversion (Gouveia and Scales 1998; Carlin and Louis 2000). The influence of the prior within our simple Bayesian formulation (equation 3.3) highlights why the prior is such an important component in the probabilistic inversion framework. The prior acts as a form of regularisation of the inversion solution, determining which solutions are "reasonable" given what we already know about the problem (Debski 2010). If, on comparison, the posterior and prior PDFs are similar, it would be reasonable to infer that we have not learned a great deal about the model from the observed data. In the extreme example, if the priors on our parameters of interest are well known with a minimal uncertainty, there is little logical need for an inversion solution at all. The observed data in this case will not add any information beyond the tight constraints of our priors, as we already know our solution. At the other end of the scale, a lack of prior knowledge is still required to be represented as a probability distribution in our inverse framework. Often, as in the applications in future chapters of this thesis, this is done through the use of an uninformative prior (e.g. Bodin and Sambridge 2009; Agostinetti and Malinverno 2010; Dettmer et al. 2012; for further discussion see sections 3.4.1.2 & 4.2.4). Most importantly, we must recognise that the prior has an unavoidable influence on the posterior in a Bayesian inversion framework, with all results being dependent on, and interpreted relative to, the choice of prior used (Sambridge et al. 2013). In the case of large data applications, such as remote sensing, the data dominates the prior and allows for a well-behaved but uninformative prior to be used.

3.2.2 The Likelihood

The likelihood function expresses how well a given model g(m) with a particular set of parameter values m can explain the observed data dobs, taking into account observational errors, or noise in the data (Bodin and Sambridge 2009; Debski 2010). A common form of likelihood used in Bayesian inference is a Gaussian distribution

$$p(d_{obs} \mid m) = \frac{1}{\sqrt{(2\pi)^n |C|}} \exp\left(-\frac{1}{2}\phi(m)\right), \qquad (3.4)$$

where the misfit φ(m) quantifies the agreement between the modelled and observed data, and takes the form of

$$\phi(m) = (g(m) - d_{obs})^T C^{-1} (g(m) - d_{obs}). \qquad (3.5)$$

In these expressions, n is the number of data and C is the covariance matrix of size n × n that characterises the uncertainty, or noise, in the data. In the case where the data noise is independent and identically distributed (i.i.d.) in a Gaussian distribution, then

$$C = \sigma^2 I, \qquad (3.6)$$

where σ is the noise standard deviation and I is the identity matrix. In this work we choose the Gaussian likelihood as the most appropriate form for our problem. However, we acknowledge that there exists a wide range of likelihood forms described in the literature, including (and not limited to) hierarchical (e.g. Noh et al. (2012)) and heteroscedastic or non-Gaussian (e.g. Schoups and Vrugt (2010)) set-ups.
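For the i.i.d. case of equation 3.6, this likelihood reduces to a few lines of code when evaluated in log space (which avoids numerical underflow for large n); the forward model g is left abstract and user-supplied in this minimal sketch.

```python
import numpy as np

def log_likelihood(m, d_obs, g, sigma):
    """Gaussian log-likelihood of equations 3.4-3.5 with C = sigma^2 I
    (equation 3.6). Working in log space avoids underflow when n is large."""
    residual = g(m) - d_obs
    n = len(d_obs)
    phi = np.sum(residual ** 2) / sigma ** 2              # misfit, Eq. 3.5
    return -0.5 * phi - 0.5 * n * np.log(2.0 * np.pi * sigma ** 2)
```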

3.2.3 Why a Probabilistic Approach?

The key difference between a probabilistic inversion approach and optimisation is in the form of the inversion solution. Through characterisation of the full posterior PDF in a probabilistic approach we are presented with an ensemble of model solutions, rather than a single best fit or optimal solution generated through optimisation. The ensemble of model solutions is a probability distribution over the model space, quantifying the "chance" that a given model is the true one. Thus, as stated by Debski (2010), the probabilistic approach provides a natural framework for comparing different models and allows the estimation of inversion uncertainties.

There is also considerable flexibility offered by the ability to infer a solution from the full ensemble of models. For example, a single best fit solution can still be inferred through optimisation of the posterior PDF, and results in the model with the maximum posterior value, or MAP in Bayesian terminology. One of the challenges is to interpret and inspect the full posterior PDF, which for many non-linear high-dimensional inverse problems becomes an issue as it is often analytically intractable (Gilks et al. 1996; Gallagher et al. 2009). In many physical inverse problems a numerical sampling approach provides the most efficient way to sample the posterior, the most common approaches belonging to the class of Markov chain Monte Carlo (McMC) methods (standard references include Tierney 1994; Gilks et al. 1996; Liang et al. 2010).

3.3 Markov chain Monte Carlo

The concept of a McMC method is to conduct a random walk of the model space using an iterative technique, by means of the ergodic Markov chain process, to draw samples from the posterior probability distribution (Debski 2010; Liang et al. 2010). The chain can be constructed in such a way that after a period of sampling referred to as the "burn-in", the random walk (hopefully) converges on an area of high posterior probability, with the generated model samples in the chain characterising the posterior probability distribution (Tierney 1994; Gilks et al. 1996).

The generation of samples in a McMC proceeds by making a proposal of a new model m′, based on the current model m, using a draw from the probability distribution q(m′ | m). The new model is assessed for acceptance to the chain based on an acceptance probability α(m′ | m). The assessment of the new model combines a comparison of the relative likelihoods, and information regarding the proposal and prior distributions of both models, m and m′, to define the acceptance probability

$$\alpha(m' \mid m) = \min\left[1,\ (\text{prior ratio}) \times (\text{likelihood ratio}) \times (\text{proposal ratio})\right], \qquad (3.7)$$

or in notation terms

$$\alpha(m' \mid m) = \min\left[1,\ \frac{p(m')}{p(m)} \times \frac{p(d_{obs} \mid m')}{p(d_{obs} \mid m)} \times \frac{q(m \mid m')}{q(m' \mid m)}\right]. \qquad (3.8)$$

This acceptance criterion ensures the target posterior density will be represented by the generated samples, considering the defined prior information on the model, the nature and structure of the model proposal, and the data misfit and associated variance of the data noise (Liang et al. 2010). In this acceptance equation we can see the probability distribution of the reverse proposal, q(m | m′), which can be described as the probability of returning to the previous model m given the new proposed model m′. The complete process can be summarised thus:


- Select an initial model m based on a prior distribution p(m).
- Propose a new model m′ using a random draw from the proposal distribution q(m′ | m).
- Determine the acceptance probability of m′ based on equation 3.8.
- Generate a random number r from a uniform distribution between 0 and 1.
- If r < α, then m → m′: the new model m′ is accepted and becomes the current model for the next iteration of the chain. Otherwise (r ≥ α), m′ is rejected and the current model is unchanged for the next iteration.
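These steps translate directly into a compact sampler, sketched below for a fixed-dimension problem with a symmetric Gaussian random-walk proposal (so the proposal ratio in equation 3.8 cancels); the log-density arguments are user-supplied functions such as the earlier prior and likelihood sketches.

```python
import numpy as np

def mcmc_sample(log_prior, log_likelihood, m0, step, n_iter, rng):
    """Fixed-dimension McMC sampler with a symmetric Gaussian random-walk
    proposal; the acceptance test of Eq. 3.8 then reduces to the prior
    and likelihood ratios, compared here in log space."""
    m = np.asarray(m0, dtype=float)
    log_post = log_prior(m) + log_likelihood(m)  # assumes a valid start
    chain = []
    for _ in range(n_iter):
        m_new = m + step * rng.standard_normal(m.size)   # propose m'
        log_post_new = log_prior(m_new) + log_likelihood(m_new)
        if np.log(rng.uniform()) < log_post_new - log_post:
            m, log_post = m_new, log_post_new            # accept m'
        chain.append(m.copy())                           # else keep m
    return np.array(chain)

# The samples after burn-in (e.g. chain[5000:]) characterise the posterior.
```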

The form of the McMC method described above is that of the widely used Metropolis-Hastings algorithm (Metropolis et al. 1953; Hastings 1970). McMC algorithms such as these distinguish themselves from Monte Carlo methods where models are sampled independently (possibly with a very low posterior probability); and it is this inherent memory feature of the McMC method that enables efficient sampling of a high posterior probability region of model space (Gilks et al. 1996; Malinverno 2002). Once sampling in this high posterior probability region, it is the acceptance probability α(m′ | m) in equation 3.8 that allows the chain to relax and asymptotically sample the target posterior probability distribution (Tierney 1994).

In certain applications (e.g. Bodin et al. 2009), where there exist symmetrical proposal distributions (i.e. q(m′ | m) = q(m | m′)) and a prior ratio of unity, the acceptance criterion in equation 3.8 is then only dependent on the likelihood ratio. This means that all proposed models m′ with a higher likelihood in comparison to the previous model m are accepted, while proposed models with a lower likelihood are accepted with a probability equal to the likelihood ratio (Liang et al. 2010). This scenario exists in the non-dimensional moves of the algorithm we describe in section 3.4.1.3; however, in the same section we see that when moving into a trans-dimensional setting the acceptance of proposed samples is dependent on all components in the acceptance term (equation 3.8).

3.3.1 Remote Sensing and McMC Inversion

In considering the application of a probabilistic method of inversion to remote sensing data, it is worth again examining the key features that differentiate this family of methods from optimisation-based approaches which are common in remote sensing applications (see Chapter 2). Following from this we can examine how these features relate to and possibly address issues that are common to the remote sensing inverse problem, and what the potential barriers are in adopting a probabilistic method.

One of the most significant features of moving from an optimisation inversion method to one such as a McMC that is probabilistic in nature is the fundamental shift in interpretation of the inversion result. As we have discussed, instead of the focus on the best fitting single solution, the McMC inversion provides a full ensemble of models. Not only does that enable characterisation of the uncertainties of estimated parameters based on their sampled distribution, there is also considerable flexibility in the interpretation of the ensemble when looking to infer information about the model. For example, in a remote sensing inverse problem that may be highly non-unique, a classic problem in an optimisation framework may be convergence of the algorithm to a local minimum well removed from the global minimum solution (Baret and Buis 2008; Camps-Valls et al. 2011). As a McMC has no single model solution, in theory this type of incorrect convergence is avoided. In principle, a McMC ensemble characterises the full posterior PDF and we are able to examine the probability of all minima, local and global, in that solution.

In practice, the performance of a McMC and the effective sampling of the full posterior PDF is sensitive to a number of factors. These include the scale and design of the proposal distributions, the defined prior distributions and the dimension and nature of the inverse problem, data and noise (Mosegaard and Tarantola 1995; Gelman et al. 1996; Malinverno 2002; Sambridge et al. 2006; Charvin et al. 2009). All of these factors need specific consideration when casting the remote sensing inverse problem in a McMC framework. Indeed, designing an appropriately efficient McMC algorithm will be one of the primary concerns for the applications in this thesis.

Studies which tackle the physics-based inversion of remote sensing data in a Bayesian framework, particularly using a full McMC method, have been very limited to date. Although there have been applications on remotely sensed measurements for atmospheric applications (Haario et al. 2004; Tamminen 2004), the application to optical remote sensing inverse problems as defined in the scope of this work will be addressed for the first time here using a trans-dimensional approach introduced in the following section. It would appear the primary reason for the lack of uptake of McMC methods in remote sensing is the scale of the data and the corresponding computational demands (Baret and Buis 2008). If we were to consider a hypothetical scene from a new generation sensor such as World View 2 (figure 2.2), covering an area of 8 km², the number of individual pixels is in excess of 15 million. Even when considering a lower spatial resolution sensor such as Landsat (30 m pixels compared to 2 m pixels), a comparable area still equates to over 70,000 pixels. In the McMC framework described thus far, inversion of these data sets would require each pixel spectra to be inverted with its own independent McMC process, a considerable computational burden.

Zhang et al. (2005) apply the Metropolis McMC algorithm to the inversion of data from the Moderate Resolution Imaging Spectroradiometer (MODIS) to estimate light absorption parameters in a forest environment. MODIS data has a 500 m spatial resolution, and therefore the authors base the inversion on a number of temporal acquisition periods over the one pixel covering the study area, resulting in a data set of approximately 12 spectra for each McMC inversion. Whilst the assumption of study area invariance has allowed an increase in spectra to be used in the inversion for the study pixel, the size of the inversion problem is an order of magnitude or so smaller than that of tackling a full image inversion. Even so, Zhang et al. (2005) note that the Metropolis algorithm is very computationally intensive to obtain a reliable estimate of the posterior parameter distributions.

Another issue relevant to a full image McMC inversion is made apparent by the results presented by Zhang et al. (2005). As has been discussed, the ability to examine the posterior PDF of an estimated parameter is a strength of the McMC approach. One of the challenges brought by this flexibility is how to interpret the ensemble effectively to generate sensible solutions and uncertainty estimations for the problem at hand. This is commonly done using standard formulas for the expected model, mean, or standard deviations and/or by using statistical intervals such as the Bayesian credible interval (Box and Tiao 1992; Gilks et al. 1996; Bodin et al. 2009; Gallagher et al. 2009; Minsley 2011). The marginal posterior PDFs of the canopy parameters estimated in the MODIS forest study have been grouped by the authors as "well-constrained" and "poorly-constrained", as illustrated by the two examples shown in figure 3.1. From these distributions, it is clear that interpretation of the marginal posterior PDFs can not only give valuable insights into the uncertainties of each parameter, but also their sensitivity within the inversion model. The challenge when considering these kinds of results from a pixel based McMC inversion is how to interpret the parameter posterior PDFs in an appropriate manner to represent the solution in an image format.

Figure 3.1 – Marginal posterior PDF histograms for canopy parameters in a McMC inversion (Zhang et al. 2005): (a) an example of a "well constrained" parameter; (b) an example of a "poorly constrained" parameter.

Indeed, there is limited value in inferring a single solution for the parameter estimated in figure 3.1b, where each parameter value within the prior range is given a similar posterior probability. This type of distribution also demonstrates the value of a probabilistic inversion approach. A scenario such as this unconstrained parameter can only be discovered in a probabilistic framework, whilst an optimisation approach on the same data may display strong parameter instabilities without any visible justification (Debski 2010). Extracting the mean parameter value for each pixel ensemble solution individually, especially where the distribution reflects a high uncertainty such as exhibited in figure 3.1b, is likely to produce a high pixel-to-pixel variability in the final image solution. Although this variability may be somewhat reflected in the associated uncertainty solution image, the relevance and usability of the estimated parameter image is still likely to be compromised. It therefore may be highly beneficial to incorporate and leverage spatial coherency features of remote sensing data into the McMC framework to minimise the potentially subjective pixel-to-pixel variations in parameter interpretation and estimation.

Closely related to interpretation of the posterior PDF is the nature of how noise and uncertainty are reflected in the McMC inversion method. In contrast to optimisation methods, noise plays a fundamental role in the McMC algorithm, through representation in the likelihood function (equation 3.4). One of the challenges in this aspect of McMC application is the difficulty of characterising the full noise covariance matrix to encompass not only the data noise, but that of the model uncertainty as well (Bodin et al. 2012a; Dettmer et al. 2012). The issue of noise estimation is tackled in numerous ways in a probabilistic inversion context. In some applications it is appropriate to assume an i.i.d. Gaussian distribution (Hopcroft et al. 2009; Minsley 2011), while in other instances extensive knowledge of the processes involved in the data and model can enable a more rigorous estimation incorporating error covariances (Gouveia and Scales 1998). Other methods include sampling of a data error model as part of the McMC inversion (Dettmer et al. 2010; Gallagher et al. 2011), or through hyper-parameters in a Hierarchical Bayes framework (Malinverno and Briggs 2004; Bodin et al. 2012b). The process of estimating the noise characteristics of remote sensing data is explored in depth in appendix B, whilst the implications of noise in a probabilistic remote sensing raster data inversion are examined throughout the thesis.

3.4 A Trans-dimensional Inversion Approach

A more recent development in probabilistic inversion methods, and one which has been embraced by the geosciences in particular, belongs to a family of methods which treat the number of unknowns in the inversion model as an unknown itself. These are referred to as trans-dimensional inverse problems, where the dimension of the model space is treated as an unknown; or as put by Malinverno (2002), "the number of things you don't know is one of the things you don't know". Introductions to the general concepts of trans-dimensional inference are given by Sisson (2005), Green and Hastie (2009) and Sambridge et al. (2013).

A general algorithm for sampling a trans-dimensional inverse problem was first introduced by Green (1995), as a modification to the Metropolis-Hastings algorithm, and termed the reversible-jump Markov chain Monte Carlo (rj-McMC). Earlier approaches dealt with special cases of variable dimension models, e.g. birth-death McMC (Geyer and Moller 1994). The reversible aspect of the algorithm respects the underlying theory of McMC, which requires that any move made in the chain is able to be reversed to keep the chain in detailed balance (Gilks et al. 1996; Gallagher et al. 2011), including the change in dimension, which is reflected by the jump component. Since its introduction, trans-dimensional inversion has seen uptake within the geosciences for applications such as geostatistics (Stephenson et al. 2004), DC resistivity sounding (Malinverno 2002), seismic tomography (Bodin and Sambridge 2009), receiver function inversion (Agostinetti and Malinverno 2010; Bodin et al. 2012b), geoacoustics (Dettmer et al. 2010) and airborne electromagnetic inversion (Minsley 2011).

Some parallels exist between the nature of the remote sensing inverse problem and the rationale for which the trans-dimensional inversion method has been applied to these earth science applications. Even so, the application of such a method to remote sensing data also presents unique challenges primarily based on the size and dimensions of the data being considered. In the earth sciences, the trans-dimensional method is generally applied to inverse problems in which the dimension of the earth model being inferred is unknown and should be probabilistically inferred from the data. This can be implemented spatially in one dimensional problems, such as determining the number of layers in a particular earth model (Malinverno 2002; Agostinetti and Malinverno 2010; Dettmer et al. 2010; Minsley 2011), or the dimensions needed to fit time series data as a regression problem (Charvin et al. 2009; Gallagher et al. 2011). In two dimensions, the method can be used to infer the properties and dimensions of a spatially continuous model of a particular physical property of the earth, and it is within this paradigm that the remote sensing inverse problem is tackled.

In this two dimensional framework, Stephenson et al. (2004) assessed the rj-McMC algorithm as a spatial interpolation tool for scattered data, in comparison to the more commonly used kriging approach. Also using scattered data points, Hopcroft et al. (2009) examine a selection of borehole measurements across the UK to infer ground surface temperature histories. Common to these studies is the use of the rj-McMC to determine the dimension of a spatially partitioned model with the highest posterior probability, and to use the data contained (or sampled parameter models in the case of Hopcroft et al. (2009)) within each partition to derive the earth model for the partition. This approach, although creating a continuous model, is still prone to discontinuities at the partition boundaries, although particularly in the Stephenson et al. (2004) study this was seen as a potential benefit over the smoothing characteristics of kriging when actual discontinuities may exist in the data.

A different approach is taken by Bodin and Sambridge (2009), who tackle the seismic tomography inverse problem using a spatially partitioned rj-McMC algorithm. In this work, the data measurements consist of seismic rays irregularly distributed across the domain, and the inverse problem consists of inferring a continuous velocity model. The fundamental difference in the approach from the two aforementioned studies is the inference of the model from the full ensemble solution of partition models (and their associated velocity values) sampled during the rj-McMC procedure. This enables a generalisation of the velocity model solution that is not constrained to the boundaries of any one particular partition model, able to reflect gradients in the model, whilst still representing appropriate discontinuities and sharp boundaries in the data and model. In effect these studies use the rj-McMC method to create a form of self-regularisation in the tomographic solution, highlighting this intrinsic characteristic of the rj-McMC method (Bodin and Sambridge 2009; Dettmer et al. 2010).

Another relevant property of an rj-McMC method for remote sensing inversion is that it exhibits a natural parsimony, which is a characteristic of the Bayesian formulation (Malinverno 2002; MacKay 2003; Sambridge et al. 2006; Agostinetti and Malinverno 2010). In practical terms this means that when considering two models that fit the data equally well, the simpler model will have a higher posterior probability (Malinverno 2000). Extending this to the rj-McMC partition model formulation, for models that fit the data equally, the model with a lower dimension (i.e. fewer partitions) will be preferred. An example of this characteristic appears in the Hopcroft et al. (2009) study, where a configuration of 11 partitions was derived to characterise the inferred ground surface temperature history model based on 23 individual borehole measurements. The implication here is that only 11 partitions, which is not a best fit to each borehole, were needed to adequately fit the data given the data noise characteristics and forward model. The parsimonious aspect of the rj-McMC ensures that in these cases the model does not over-fit the data.

These characteristics of the rj-McMC can relate to the spatial coherency aspects of remote sensing image data identified in Chapter 2. As discussed, spatial coherency occurs at a variety of spatial scales and distributions across an image. The self-regularisation and natural parsimony of the rj-McMC may offer a flexibility to exploit this characteristic in a manner which is driven by the data, not imposed in a structured kernel style regularisation. Similarly, the parsimonious nature of the method combined with a partition structure may enable the dimensionality of the data and inverse problem to be reduced, by allowing the data to determine the spatial dimensionality of the model. Locally homogeneous areas in an image consisting of similar pixels have the potential to be represented by the partition structure, and the model parameter posterior PDF sampled within this framework, rather than at a pixel based dimension.


even when considering these parsimonious aspects, can effectively explore the high spatial and parameter dimensions of the model space in the remote sensing image inversion problem. This form of inversion method has not been attempted on continuous raster data of this size and dimensionality before, and therefore we are looking to examine the feasibility of the method and propose ways to improve its implementation. If we consider that the noise in the data is strongly linked to the concept of parsimony and the complexity of data fit allowed (Bodin et al. 2012b), how noise is considered spatially and spectrally in remote sensing data is a key aspect of exploring this research question.

Figure 3.2 – A Voronoi polygon tessellation structure.

3.4.1 The rj-McMC Algorithm

3.4.1.1 Partitioning and Parametrisation of the Model

The first step in employing the rj-McMC method in a spatial framework is to develop a spatial partition parametrisation of the domain, where the structural characteristics and dimensions of the partition are treated as unknowns. In this study we adopt the use of a Voronoi polygon tessellation (Okabe et al. 1992) to partition the spatial domain. Voronoi polygons create a non-overlapping unstructured mesh (figure 3.2) that is characterised by a number of discrete points, or nodes, that define partition boundaries and regions of each Voronoi cell (Aurenhammer 1991). All the points in the spatial domain contained within each Voronoi cell are closer to its node than any other Voronoi node in the tessellation.


This form of spatial partitioning has been shown to work effectively in an rj-McMC framework in the aforementioned studies utilising distributed data (Stephenson et al. 2004; Hopcroft et al. 2009), and in a self-regularisation context by Bodin et al. (2009). The flexible, unstructured nature of the Voronoi tessellation enables it to represent the spatial dimensionality of the data and model at various scales across the spatial domain, which relates closely to the spatial coherency issues discussed in the previous section. We now introduce the concept and framework of a trans-dimensional rj-McMC using the Voronoi partition model. The version of the algorithm described in this section was developed by Bodin and Sambridge (2009), and by outlining it here we detail the concepts and processes fundamental to its implementation. This provides a framework to develop and propose modifications specific to the remote sensing problem in later chapters of the thesis. In this formulation of the algorithm we parametrise a simple model for the estimation of one parameter over the spatial domain of the data and Voronoi tessellation

m = (n, c, h), \qquad (3.9)

where h is a vector of size n containing a single estimated Earth parameter for each Voronoi cell, c is a vector of size 2n denoting the locations of each Voronoi cell node, and n is the number of Voronoi cells. Thus we have an inverse problem of dimension 3n, consisting of the coordinate locations c_i = (x_i, y_i) and parameters h_i, i = 1, \ldots, n.
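As a concrete illustration, the following is a minimal sketch (in Python, assuming numpy and scipy; the array names are ours, not from any existing implementation) of how such a partition model can be stored and evaluated onto the pixel grid via nearest-node lookup, which is the defining property of the Voronoi cells:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)

# Partition model m = (n, c, h): n Voronoi nodes, node locations c (n x 2)
# on the pixel grid, and a single Earth parameter h per node.
n = 50
nx, ny = 300, 300                                  # image dimensions in pixels
c = rng.integers(0, [nx, ny], size=(n, 2)).astype(float)
h = rng.uniform(0.0, 1.0, size=n)                  # e.g. depths within the prior range

# Every pixel takes the parameter value of its nearest Voronoi node.
xx, yy = np.meshgrid(np.arange(nx), np.arange(ny), indexing="ij")
pixels = np.column_stack([xx.ravel(), yy.ravel()])
_, nearest = cKDTree(c).query(pixels)
model_image = h[nearest].reshape(nx, ny)           # the model evaluated on the grid
```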

3.4.1.2 The Prior

The prior we define is “relatively” uninformative, indicating that there is no spatial preference for cell positions in the model, or prior knowledge of the model parameters and model dimensionality. The prior takes the form of a product of conditional and marginal distribution functions such that

p(m) = p(c \mid n)\,p(h \mid n)\,p(n). \qquad (3.10)

The prior on the number of Voronoi cells p(n) is defined by a uniform distribution over an interval I = \{n \in \mathbb{N} \mid n_{min} < n \le n_{max}\}. Hence,


p(n) = \begin{cases} 1/\Delta n & \text{if } n \in I \\ 0 & \text{otherwise}, \end{cases} \qquad (3.11)

where \Delta n = n_{max} - n_{min}. The prior on the depth parameter values h_i at each node is defined using a uniform distribution over an interval J = \{h_i \in \mathbb{R} \mid h_{min} < h_i < h_{max}\}. Hence

p(h_i \mid n) = \begin{cases} 1/\Delta h & \text{if } h_i \in J \\ 0 & \text{otherwise}, \end{cases} \qquad (3.12)

where \Delta h = h_{max} - h_{min}. As the depth value in each cell is independent,

p(h \mid n) = \prod_{i=1}^{n} p(h_i \mid n). \qquad (3.13)

The prior for the location of the cell nodes c is represented probabilistically using the grid technique detailed by Bodin and Sambridge (2009). For n Voronoi cell nodes on an underlying grid of N possible positions, there are \binom{N}{n} = \frac{N!}{n!\,(N-n)!} possible configurations, where in a remote sensing framework N is the number of pixels. The prior for the cell node positions can be given by

p(c \mid n) = \left[\frac{N!}{n!\,(N-n)!}\right]^{-1}. \qquad (3.14)

Combining these three prior conditions (3.11, 3.13 & 3.14), the full prior probability density function is defined as:

p(m) = \begin{cases} \dfrac{n!\,(N-n)!}{N!\,(\Delta h)^n\,\Delta n} & \text{if } n \in I \text{ and } \forall i \in [1, n],\ h_i \in J \\[6pt] 0 & \text{otherwise}. \end{cases} \qquad (3.15)

We note that the use of a discrete grid by Bodin and Sambridge (2009) was for mathematical convenience to ensure the correct analytical form of the algorithm expressions. They show that as the variable N cancels out in the final acceptance terms of the algorithm (see equations 3.20, 3.25 & 3.26), in practice, cell nodes could be placed using a continuous distribution over the region of the model. In contrast, we use the native grid structure of the data (the pixels) to define N , and use this grid to define where cell nodes are located throughout the algorithm process. In


modifications made to the algorithm in later chapters, this discrete pixel grid becomes a key component driving the probabilistic placement of nodes based on image segmentation. Whilst we show that all variables describing the grid still cancel in our algorithm, the grid plays a fundamentally different role in our raster data inversion beyond the mathematical convenience of its initial implementation in Bodin and Sambridge (2009).

3.4.1.3 Proposing Model Samples

Proposing new models in the algorithm is completed using four different perturbations to the model vector m (a schematic sketch follows the list).

1. At even steps of the chain - randomly pick one cell node c_i and perturb the associated depth (h_i) value using a Gaussian proposal probability density function. In practice, implemented by:

h'_i = h_i + u\sigma_1, \qquad (3.16)

where \sigma_1 is the standard deviation of the proposal, and u is a random variable drawn from a normal distribution N(0, 1).

2. At odd steps of the chain - randomly select a perturbation of the Voronoi structure from three possible options:

(a) BIRTH - a new cell c'_{n+1} is created by randomly selecting an unoccupied pixel location in the spatial image domain. This new Voronoi cell node is assigned a depth value h'_{n+1} based on equation 3.16, where h_i is the depth value at the location in the current model, increasing the model dimension by 3.

(b) DEATH - a cell is deleted by randomly selecting a node from the n available in the current model. This is the opposite of the birth step, and reduces the model dimension by 3.

(c) MOVE - randomly select a node i from the n available in the current model and perturb its position c_i based on a two-dimensional Gaussian proposal probability density function of the form

q(c'_i \mid c_i) = \frac{1}{2\pi\sigma_c^2} \exp\left(-\frac{(c'_i - c_i)^T (c'_i - c_i)}{2\sigma_c^2}\right). \qquad (3.17)
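The proposal schedule can be summarised schematically as below; this is a sketch only, with sample_unoccupied_pixel and value_at as hypothetical helper functions, and with the acceptance stage and prior-bound checks omitted:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma1, sigma_c = 0.05, 5.0       # proposal widths for value change and node move

def propose(model, step):
    """One rj-McMC proposal step; model holds node locations 'c' and values 'h'."""
    c, h = model["c"].copy(), model["h"].copy()
    n = len(h)
    if step % 2 == 0:
        # Even step: perturb the value of one randomly chosen node (eq. 3.16).
        i = rng.integers(n)
        h[i] += rng.normal(0.0, sigma1)
    else:
        move = rng.choice(["birth", "death", "move"])
        if move == "birth":
            # Birth: new node at a random unoccupied pixel, its value perturbed
            # from the current model's value at that location (dimension +3).
            new_c = sample_unoccupied_pixel(c)                 # hypothetical helper
            new_h = value_at(model, new_c) + rng.normal(0.0, sigma1)
            c = np.vstack([c, new_c])
            h = np.append(h, new_h)
        elif move == "death":
            # Death: delete one randomly chosen node (dimension -3).
            i = rng.integers(n)
            c = np.delete(c, i, axis=0)
            h = np.delete(h, i)
        else:
            # Move: 2D Gaussian perturbation of one node position (eq. 3.17).
            i = rng.integers(n)
            c[i] += rng.normal(0.0, sigma_c, size=2)
    return {"c": c, "h": h}
```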


Figure 3.3 – Proposal changes in the Voronoi tessellation partition structure. The dark blue cell indicates the Voronoi cell chosen for a particular proposal perturbation. Light blue shading indicates cells in the partition model changed as a result of the perturbation.

These three moves which influence the Voronoi partition structural parameterisation are shown in figure 3.3.

3.4.1.4 Determining the Acceptance Probability

The proposal and prior ratios for these four model perturbations can now be calculated to allow evaluation of the acceptance probability (equation 3.8). The final expressions for each of these terms are shown below, with the subscript for each term denoting the type of model perturbation. Full derivations appear in Bodin and Sambridge (2009). For model perturbations where the number of cells and dimension of the model remains the same (a move or depth perturbation), it is shown that the prior and proposal ratios reduce to unity,

\left(\frac{p(m')}{p(m)}\right)_{move/depth} = \begin{cases} 1 & \text{if } \forall i \in [1, n],\ h_i \in J \\ 0 & \text{otherwise}, \end{cases} \qquad (3.18)

and

\left(\frac{q(m \mid m')}{q(m' \mid m)}\right)_{move/depth} = 1. \qquad (3.19)

Hence the acceptance probability is determined only by the likelihood ratio. From (3.8) we derive the acceptance term for a non-dimensional change as

\alpha(m' \mid m)_{move/depth} = \begin{cases} \min\left[1,\ \dfrac{p(d_{obs} \mid m')}{p(d_{obs} \mid m)}\right] & \text{if } \forall i \in [1, n],\ h_i \in J \\[6pt] 0 & \text{otherwise}. \end{cases} \qquad (3.20)

In the proposal moves where a change of dimension is involved (i.e. a birth where we have moved from n → n + 1 cell nodes, or in a death from n → n − 1), following (3.15) the prior ratios can be derived as

\left(\frac{p(m')}{p(m)}\right)_{birth} = \begin{cases} \dfrac{n+1}{(N-n)\,\Delta h} & \text{if } (n+1) \in I \text{ and } h'_{n+1} \in J \\[6pt] 0 & \text{otherwise}, \end{cases} \qquad (3.21)

and

\left(\frac{p(m')}{p(m)}\right)_{death} = \begin{cases} \dfrac{(N-n+1)\,\Delta h}{n} & \text{if } (n-1) \in I \\[6pt] 0 & \text{otherwise}. \end{cases} \qquad (3.22)

The proposal ratios based on a Gaussian proposal PDF with a standard deviation of σ1 are represented by 

\left(\frac{q(m \mid m')}{q(m' \mid m)}\right)_{birth} = \frac{\sigma_1\sqrt{2\pi}\,(N-n)}{(n+1)} \exp\left(\frac{(h'_{n+1} - h_i)^2}{2\sigma_1^2}\right), \qquad (3.23)

and

\left(\frac{q(m \mid m')}{q(m' \mid m)}\right)_{death} = \frac{n}{\sigma_1\sqrt{2\pi}\,(N-n+1)} \exp\left(-\frac{(h'_j - h_i)^2}{2\sigma_1^2}\right), \qquad (3.24)


where h'_j is the depth value in the new cell parameterisation at the location of previous cell i, and h_i is the parameter value in the current model at the location of the cell birth or death. Finally, we substitute (3.21, 3.23) and (3.22, 3.24) into (3.8), along with the appropriate likelihood functions (3.4) for the current (m) and proposed (m') models, to determine the full acceptance probability terms for the birth and death moves as:

\alpha(m' \mid m)_{birth} = \begin{cases} \min\left[1,\ \dfrac{\sigma_1\sqrt{2\pi}}{\Delta h} \exp\left\{\dfrac{(h'_{n+1} - h_i)^2}{2\sigma_1^2} - \dfrac{\phi(m') - \phi(m)}{2}\right\}\right] & \text{if } (n+1) \in I \text{ and } h'_{n+1} \in J \\[6pt] 0 & \text{otherwise}, \end{cases} \qquad (3.25)

and

\alpha(m' \mid m)_{death} = \begin{cases} \min\left[1,\ \dfrac{\Delta h}{\sigma_1\sqrt{2\pi}} \exp\left\{-\dfrac{(h'_j - h_i)^2}{2\sigma_1^2} - \dfrac{\phi(m') - \phi(m)}{2}\right\}\right] & \text{if } (n-1) \in I \\[6pt] 0 & \text{otherwise}. \end{cases} \qquad (3.26)
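In an implementation, the bracketed terms in (3.25) and (3.26) are usually evaluated in the log domain for numerical stability. A minimal sketch, assuming the misfits φ(m) and φ(m′) have already been computed, and again omitting the prior-bound checks:

```python
import numpy as np

rng = np.random.default_rng(2)

def accept_birth(phi_new, phi_old, h_new, h_old, sigma1, delta_h):
    # Log of the bracketed term in (3.25): h_new is the proposed value at the
    # new node, h_old the current model value at the birth location.
    log_alpha = (np.log(sigma1 * np.sqrt(2 * np.pi) / delta_h)
                 + (h_new - h_old) ** 2 / (2 * sigma1 ** 2)
                 - (phi_new - phi_old) / 2)
    return np.log(rng.uniform()) < min(0.0, log_alpha)

def accept_death(phi_new, phi_old, h_new, h_old, sigma1, delta_h):
    # Log of the bracketed term in (3.26).
    log_alpha = (np.log(delta_h / (sigma1 * np.sqrt(2 * np.pi)))
                 - (h_new - h_old) ** 2 / (2 * sigma1 ** 2)
                 - (phi_new - phi_old) / 2)
    return np.log(rng.uniform()) < min(0.0, log_alpha)
```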

At this point it is important to note that the full acceptance term in the rj-McMC developed by Green (1995) includes a Jacobian matrix |J| to account for dimensional state moves that change the scale of the model vector m. The Jacobian can also apply when the state dimension does not change, and may be needed to account for parameterisation changes. For the form of the rj-McMC detailed above and the single class of parameterisation we deal with, the changes in dimensional state are limited to transitions of one more or one fewer unknown, and Bodin et al. (2009) show that the Jacobian is unity and therefore the acceptance term in the form of (3.8) is appropriate. For detailed derivations of this version of the rj-McMC, the reader is referred to Bodin and Sambridge (2009). In the following chapters, we propose a number of modifications to this initial version of the rj-McMC to deal with the specific challenges of remote sensing data inversion, in which we comprehensively detail the development of the algorithm introduced in this thesis. For notational simplicity, the case statements used in this chapter to define the probability of moves outside of the parameter prior


bounds are omitted in future chapters.

3.5 Summary

In this chapter we have introduced some of the core concepts of Bayesian inversion theory and the McMC methods used to numerically sample the posterior probability density function under this framework. The application of these methods to the remote sensing inversion problem has been very limited. To our knowledge there has been no comprehensive attempt to tackle the remote sensing image based parameter estimation problem within a full probabilistic framework. We consider that there is great potential in investigating these methods, as features of a probabilistic inversion method, such as the characterisation of a full ensemble of models, are well suited to deal robustly with uncertainty estimation and non-uniqueness in the remote sensing problem. The limited uptake of sampling methods to date appears to be due to the large size and dimensionality of remote sensing data, and the associated practical (computing resource demands) and theoretical (effective parameter space exploration) issues that this presents. This has led us to investigate a trans-dimensional Bayesian inversion framework, which enables the spatial dimension of the model to become one of the unknowns in the inverse problem. The natural parsimony and self-regularisation characteristics of the partition model implementations of these methods have been identified as having the potential to address some of these remote sensing specific issues. In the next chapter we apply the formulation of the reversible-jump McMC detailed in this chapter to a simple spatially varying raster image. To the best of our knowledge, this method has not been applied to the remote sensing problem before. This first test will enable us to determine its capability in resolving spatially complex and varying features from raster data, and to discuss in more detail the implementation issues and developments that can be made to improve the algorithm.


Chapter 4

Resolving the Spatial Dimension

4.1 Introduction

In this chapter we detail the application of the trans-dimensional rj-McMC algorithm introduced at the conclusion of Chapter 3. These initial tests enable us to examine the specific issues encountered when applying the algorithm to a continuous raster data source, and provide a platform to discuss more general elements of McMC implementation theory and ensemble interpretation. Through assessment of the algorithm's ability to resolve the varying spatial complexity of the test raster data, we develop a hypothesis that considers the manner in which spatial coherency may provide a foundation for the effective guidance of the rj-McMC algorithm. To test this hypothesis we develop a guided version of the rj-McMC algorithm, utilising the outputs of object based image segmentation as an informative layer in the inversion. We conclude with an evaluation of the outputs generated from the two versions of the algorithm, and a discussion on the nature of noise and uncertainty inferred from the range of tests in the chapter.

4.2 Raster Data Test

The purpose of the tests in this chapter is to evaluate the ability of the rj-McMC algorithm we have introduced in Chapter 3 to retrieve and resolve structure and uncertainties from a continuous raster data source. By providing a simplified framework with no inverse problem, and focussing only on the retrieval of the original data values as the parameter of interest, we aim to isolate the ability of the algorithm to resolve the


spatial structure of the data. This is an important step in considering the feasibility of applying the algorithm to a full remote sensing inverse problem, where spatial structures of varying scales may need to be resolved for each parameter of interest in the inversion.

4.2.1 Synthetic Data Creation

First we need to generate synthetic data to test the application of the algorithm in a raster data setting. For the source data in these tests we generated a greyscale raster of 90,000 pixels (300 pixels x 300 pixels) with data values ranging from 0 to 1. Features were incorporated into the data to reflect a variety of spatial structures, including irregularly shaped regions with anomalous data values and a range of curved and straight transitions. Independent Gaussian noise was added to each pixel data value to create a low noise and a high noise image, with standard deviations of 0.02 and 0.15 respectively (figure 4.1). The aim of the rj-McMC application is therefore to estimate the true model from the input data sets with well characterised noise, generating an ensemble of models where the data value is the estimated parameter, i.e. g(m) = m. A spatially partitioned rj-McMC has previously been applied to data with a significantly lower number of observations than those encountered in remote sensing. Even at the small scale of this test problem, the complexity and scale is large compared to all previous applications of trans-dimensional imaging (Stephenson et al. 2004; Bodin and Sambridge 2009; Hopcroft et al. 2009). It is reasonable to expect a degree of spatial coherency in the underlying parameter models of a remote sensing inversion problem, and this is therefore reflected in the synthetic test data. However, even if we expect the true model to be of a much lower dimensional variability than that of the pixel resolution of the data, we have no prior knowledge of to what degree and at what scales the variability occurs. The challenge is whether sampling at the lower generalised spatial dimensions of the partition framework allows us to resolve the underlying true model, observed with many thousands of pixels representing the continuous surface.
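A stand-in sketch of this data generation step (the features below are illustrative only; the thesis data contains a wider variety of shapes and transitions):

```python
import numpy as np

rng = np.random.default_rng(3)

# A 300 x 300 greyscale surface with values in [0, 1]: a straight transition
# and a circular anomalous region stand in for the thesis test features.
true_model = np.full((300, 300), 0.3)
true_model[:, 150:] = 0.7
yy, xx = np.ogrid[:300, :300]
true_model[(yy - 80) ** 2 + (xx - 80) ** 2 < 30 ** 2] = 0.95

# Independent Gaussian pixel noise, as in figure 4.1.
low_noise = true_model + rng.normal(0.0, 0.02, true_model.shape)
high_noise = true_model + rng.normal(0.0, 0.15, true_model.shape)
```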

Figure 4.1 – Synthetic Greyscale Test Datasets - (a) True Model, (b) Low Noise data (σ = 0.02), (c) High noise data (σ = 0.15) and (d) High noise data in a 3D view. Note the expanded range and scale of the data in the high noise model.

4.2.2 Prior and Proposal Distributions

4.2.2.1 Uninformative Priors

The rj-McMC algorithm detailed in Chapter 3 can be considered a naive application to the raster surface problem we have described. By this we are referring to both the uninformative definition of the priors, as well as the manner in which the various proposal distributions are defined. We define priors on the number of Voronoi cells n and their associated parameters h as uniform intervals, such that Δn = (n_max − n_min) and Δh = (h_max − h_min). This gives no preference to either their initial values in the model, or the values they may take within these intervals in any proposed models. Similarly,


we give no preference to the initial locations (c) of the Voronoi cells within the spatial domain by defining the prior on the native grid formation of the pixels. For details on the prior of c see Bodin and Sambridge (2009) and also section 3.4.1.2. Effectively, we are imposing minimal prior knowledge on the dimensions, structure or parameter values of the model beyond their set upper and lower limits. For the tests in this section we set the range of the number of Voronoi cells to n_min = 3, n_max = 250 and the range of depths to h_min = 0, h_max = 1 to encompass the range of the data values we are seeking to retrieve. Outside of these ranges, prior probabilities for these parameters are set to zero.

4.2.2.2 Efficient Sampling and Proposal Distributions

The proposal distributions in the naive algorithm consist of three components: a change in the parameter value; a change in the number of cells; and/or a change in their locations. When adding a new cell in a birth move, the proposal on the location of the cell is governed by the underlying grid pixel structure. Again, no preference is given on the location of the newly added cell node, except that it cannot be added to a pixel that is already occupied by a cell node. The widths of the Gaussian proposal distributions used for a parameter value change, σ1, and a cell node move, σc, are free parameters to be chosen. The size of these parameters has an important influence on the efficiency of the chain, and how effective it is at exploring the parameter space. Too small a value for σ and the chain may become stuck in local minima, unable to make the jump to a region of higher probability effectively. Too large, and the chain may be unable to effectively sample the target distribution, as large proposals away from the region are likely to be rejected (MacKay 2003). A balance is generally sought that enables an efficient exploration of the parameter space to converge on the region of highest probability quickly, whilst still enabling proposals of a suitable size to effectively characterise the region. Achieving this in many inverse problems is non-trivial, particularly in trans-dimensional algorithms, and is the subject of ongoing research (Al-Awadhi et al. 2004; Haario et al. 2004; Rosenthal 2011). In practice, an appropriate proposal width can be estimated through a process of trial and error by examining the acceptance rate of the McMC algorithm. Gelman et al. (1996) proposed an optimal acceptance rate of between 0.23 and 0.44 to reflect an efficient sampling of the target distribution, noting a decrease to the lower bounds


of this range with an increase in dimension. Brooks et al. (2003) and Rosenthal (2011) found that a reasonable efficiency can be maintained with an acceptance rate of between 0.1 and 0.6, and a range of studies have used values within this broader range to determine an appropriate proposal distribution for the problem at hand (Bodin et al. 2009; Charvin et al. 2009; Agostinetti and Malinverno 2010; Gallagher et al. 2011). In trans-dimensional applications it can be common to see acceptance rates significantly lower as the dimensional space of the problem increases (King 2011), with a particular decrease in the rate for birth/death dimensional change moves.
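This trial-and-error process can be automated in a simple loop; a sketch, where run_chain is a hypothetical callable that runs a short pilot chain and returns its acceptance rate, and the target band follows Brooks et al. (2003) and Rosenthal (2011):

```python
def tune_proposal_width(run_chain, sigma, target=(0.1, 0.6), factor=1.5, max_iter=10):
    """Trial-and-error tuning of a Gaussian proposal width (sketch only)."""
    for _ in range(max_iter):
        rate = run_chain(sigma)
        if rate < target[0]:
            sigma /= factor      # too many rejections: propose smaller jumps
        elif rate > target[1]:
            sigma *= factor      # accepting nearly everything: jump further
        else:
            break                # rate within the target efficiency band
    return sigma
```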

\hat{\sigma} = \left(\frac{\phi(m)}{N}\right)^{1/2}, \qquad \phi(m) > 0. \qquad (5.12)

To determine the modified likelihood expression, we substitute this expression for σ into 5.9 and get

-\log p(d_{obs} \mid m) = N \log\left[\left(\frac{\phi(m)}{N}\right)^{1/2}\right] + \frac{N}{2} + \frac{1}{2}\log\left[(2\pi)^N |\tilde{C}_e|\right], \qquad (5.13)

\Rightarrow\ -\log p(d_{obs} \mid m) = \frac{N}{2}\log(\phi(m)) - \frac{N}{2}\log N + \frac{N}{2} + \frac{1}{2}\log\left[(2\pi)^N |\tilde{C}_e|\right], \qquad (5.14)

\Rightarrow\ -\log p(d_{obs} \mid m) = \frac{N}{2}\log(\phi(m)) + \left\{\frac{N}{2}(1 - \log N) + \frac{1}{2}\log\left[(2\pi)^N |\tilde{C}_e|\right]\right\}. \qquad (5.15)

This final term in curly brackets does not depend on the misfit φ(m) or the residuals r, so we can write

-\log p(d_{obs} \mid m) = \frac{N}{2}\log(\phi(m)) + \text{constant}, \qquad (5.16)

which gives our modified likelihood

p(d_{obs} \mid m) \propto \exp\left(-\frac{N}{2}\log(\phi(m))\right). \qquad (5.17)
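The practical effect of (5.17) is to replace the fixed-noise misfit term with one that depends only on φ(m) and N. A sketch of the two negative log-likelihoods, up to their constants, assuming the residuals have already been weighted by the scaled covariance:

```python
import numpy as np

def neg_log_likelihood_fixed(residuals, sigma):
    # Fixed-noise Gaussian likelihood: -log p = phi(m) / (2 sigma^2) + const.
    phi = np.sum(residuals ** 2)
    return phi / (2 * sigma ** 2)

def neg_log_likelihood_ml(residuals):
    # ML noise model of eq. (5.17): -log p = (N / 2) log(phi(m)) + const,
    # so sigma no longer appears as a fixed input to the inversion.
    phi = np.sum(residuals ** 2)
    return 0.5 * len(residuals) * np.log(phi)
```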


The interaction of this form of ML noise estimation within the framework of the SG algorithm can be examined initially through the results of the homogeneous test problem used to test the parsimonious nature of the algorithm in the previous chapter (see section 4.4.5.2). In this test we used a homogeneous surface with known data noise to show that the dimension of the model reduces to the minimum number of cells needed to characterise the homogeneous data. The form of this problem meant that the only uncertainty left in the inverse problem was that of the Gaussian noise added to each pixel of the data. The uncertainty based on fitting a generalised Voronoi structure to a varying surface is removed as the model converges to a parsimonious and homogeneous solution. Examining the misfit φ(m) as a sum of the squared residuals from the converged solutions of this problem can give an indication of the likely data noise. The misfit is shown to be stationary at φ(m) ≅ 36 in all chains of the run; and along with the number of data observations (N = 90,000), we can substitute these values into our maximum likelihood expression for sigma (5.12). This yields an estimate of σ = 0.02, identical to the noise which was added to the synthetic data.
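This check is a one-line calculation using the values quoted above:

```python
import numpy as np

# ML noise estimate (eq. 5.12) from the stationary misfit of the homogeneous test.
phi, N = 36.0, 90_000
sigma_hat = np.sqrt(phi / N)
print(sigma_hat)   # -> 0.02, identical to the noise added to the synthetic data
```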

5.5.1.1 Acceptance Terms with a ML Noise Estimation

To apply a ML estimation that attempts to encompass the noise in our full inverse problem we need to incorporate the modified likelihood expression shown in (5.17) into the acceptance terms of the SG algorithm. For a proposed move that does not involve a change in dimension, our acceptance term remains a ratio of the likelihood for the proposed model m′ and the current model m, in which case we can use our modified likelihood expression (5.17) in the original acceptance term (3.20). For a proposal move that involves a dimensional change we substitute the modified likelihood expression for m′ and m (5.17) into the full acceptance form (4.22), along with the relevant proposal and prior ratio terms for a birth (4.13 & 4.20) and death (4.14 & 4.21) move. This yields acceptance terms for the birth and death moves in the ML-SG algorithm as

\alpha(m' \mid m)_{birth} = \min\left[1,\ \frac{\sigma_1\sqrt{2\pi}}{\Delta h} \exp\left\{\frac{(h'_{n+1} - h_i)^2}{2\sigma_1^2} - \frac{N\log(\phi(m')) - N\log(\phi(m))}{2}\right\}\right], \qquad (5.18)

and

\alpha(m' \mid m)_{death} = \min\left[1,\ \frac{\Delta h}{\sigma_1\sqrt{2\pi}} \exp\left\{-\frac{(h'_j - h_i)^2}{2\sigma_1^2} - \frac{N\log(\phi(m')) - N\log(\phi(m))}{2}\right\}\right]. \qquad (5.19)

Figure 5.9 – No. of Cells vs Sampling Iteration - sample of 6 chains from the ML-SG runs showing an increasing dimension of the model over time in comparison to the fixed-noise model.

In this version of the algorithm, we are not changing the parameterisation, only the structure of the likelihood. Thus, as in the previous derivation of the acceptance terms, the Jacobian matrix remains as unity (see section 3.4.1.4).

5.5.2 Convergence and Sampling

The first runs of the ML-SG algorithm were completed with the same prior specifications as the SG runs in section 5.4.1, using the larger proposal of σ1 = 0.1. Run lengths were initially started at 1,000,000 and incremented by 200,000 for each subsequent run, observing the sampling of the partition dimension that has been used in the previous experiments to indicate convergence of the rj-McMC. In figure 5.9 we can see that the ML-SG runs take a considerably longer number of iterations to reach a near stationary state in terms of the dimension of the model, if we do indeed consider it by this diagnostic to have reached convergence at all.


The increasing number of cells in the ML-SG algorithm implementation may be explained by considering one of the properties of rj-McMC inversion as shown by Agostinetti and Malinverno (2010), who observed a clear relationship between the size of the data noise and the dimension of the sampled models. This was observed in a different context by Bodin et al. (2012b), who used a hierarchical rj-McMC approach to estimate a joint posterior distribution for the model dimension and data noise, both considered as unknowns in the inversion. Bodin et al. (2012b) noted that the dimension of the model increased as the estimated data noise decreased, but that the range of these parameter trade-offs was ultimately constrained by the data. In the increasing dimension of our ML-SG sampled models (figure 5.9) we consider that we are seeing the same effect as that observed by Bodin et al. (2012b). As the estimated data noise decreases in the ML-SG sampling with an improved fit to the data, the dimension of the partition models increases. The difference between our problem and that of Bodin et al. (2012b) is the underlying complexity of the data, and the dimension of model required to sample it effectively. In our model we require many thousands of cells to adequately characterise the spatial complexity of the raster data, whilst the receiver function inversion data of Bodin et al. (2012b) was characterised by a much smaller model dimension of 4-8 layers. Thus, it is likely that we are yet to reach a model dimension that is constrained only by the data noise, as we are still also dealing with the errors relating to fitting the partition to the high spatial complexity of the data. This results in an increase in cells for the model as the partition structure continues to adapt to the data. The ML estimation framework encourages more cells in the model until not only the data noise, but also the error relating to the spatial fit of the partition is minimised. We are tackling an inversion problem of significantly higher dimension than those which have used model dimension as a convergence diagnostic previously (Malinverno 2002; Malinverno and Leaney 2005; Bodin and Sambridge 2009; Gallagher et al. 2011). It appears that using the number of model cells to determine convergence in such a high-dimensional problem may be unsuitable, and hence we look to examine alternatives such as the stability of the solution over the duration of the chain to assess convergence of the ML-SG algorithm (figure 5.10).


Figure 5.10 – Ensemble Uncertainty Solutions for the ML-SG Algorithm at varied intervals - (a) run iterations 7×10^5 to 1.3×10^6, (b) 1.0×10^6 to 1.5×10^6, (c) 1.3×10^6 to 1.8×10^6. The consistency of the model uncertainty indicates the stability of the solution across a range of sampling iteration intervals.

5.5.3 Results

The assessment of the stability of the ML-SG runs leads us to accept the shortest possible run (see figure 5.10a) in which a stable solution has been reached and the acceptance rate of the algorithm maximised. The run of 1,300,000 samples with 700,000 burn-in samples discarded resulted in an acceptance rate of 9%, a slight improvement over that of the fixed-noise SG algorithm. In figure 5.11 we present the posterior solution ensemble results of the ML-SG algorithm in comparison to the true depth, showing an accurate and smooth model derived by each ensemble indicator. There are two important points we can draw from these ensemble results. Firstly, the mean, median and mode solutions are close to identical, indicating a good convergence of the solution in terms of the marginal posterior distributions of the parameter values at each pixel. Secondly, the variance solution (shown in figure 5.10a) shows an improvement in the uncertainty of the parameter solution in comparison to the fixed noise results (figure 5.7). The magnitude of the uncertainty at boundaries of varying bathymetry has been decreased, and the residual uncertainty resulting from outliers in the partition cell structure has been reduced. This indicates that the ML-SG algorithm has better characterised the full model and data noise, and enabled a more efficient and effective exploration of the high dimensional space. In figure 5.12 we present a comparison between the fixed noise and ML-SG algorithm results, visualised as a horizontal transect of the mean solution models. Both algorithms

Figure 5.11 – Ensemble Depth (m) Solutions for the ML-SG Algorithm Test - (a) True Depth Model, (b) Mean Ensemble Solution, (c) Median Ensemble Solution, (d) Mode Ensemble Solution. Comparable results for each ensemble indicator suggest a well constrained posterior distribution, with each ensemble indicator producing an accurate model solution.


Figure 5.12 – Depth Transect Model Comparisons - Mean ML-SG and Standard (Fixed Noise) SG solution models evaluated on a horizontal transect at pixel row 482: (a) Fixed Noise SG mean model transect in comparison to the true depth model; (b) ML-SG mean model transect in comparison to the true depth model. Shaded areas represent an uncertainty interval of 1σ.

do a comparable job of resolving a highly variable bathymetry surface; however, the increased uncertainty in areas of the fixed noise solution again reflects its difficulty in characterising the full noise of the inverse problem. Finally, figure 5.13 shows the marginal posterior parameter distributions for a selection of three locations along the horizontal transect. At point A we see an example of the uncertainty that is created as a function of the partition model being fitted in a region of highly variable small scale bathymetry features. This point is located in a deeper region between two small shallow features. Whilst the mode of the solution for both noise models is close to the true value, we can see the uncertainty that is created by sampling over the surrounding shallow features. Point B displays the same effect for a small shallow feature surrounded by deeper water. Here it is even clearer that the ML


noise model has been more successful in constraining the solution and avoiding outliers in the sampling. Point C is located in a more homogeneous shallow region, and the posterior distribution solutions for both models reflect this with a small variance and accurate recovery of the depth. The ensemble solutions at Points A & B in these results are of particular interest in assessing the benefit of the ML-SG approach, compared to the fixed noise algorithm. The improved constraint of the ML-SG marginal posterior solution at these locations of high spatial variability (figures 5.13b & 5.13d) is reflected in both the transect comparisons (figure 5.12), as well as the decrease in partition based variance artefacts in the ML-SG solution (figure 5.10). In the transect analysis we see an improved retrieval at all regions of high spatial variability, along with a corresponding decrease in the solution uncertainty. This illustrates the increased capability of the ML-SG approach to more effectively sample in a spatially complex and higher dimensional parameter space.


Figure 5.13 – Marginal Posterior Depth Parameter Solutions - comparison of the fixed-noise and ML-SG algorithm retrievals in regions of varied spatial complexity: (a) Point A, fixed-noise SG algorithm; (b) Point A, ML-SG algorithm; (c) Point B, fixed-noise SG algorithm; (d) Point B, ML-SG algorithm; (e) Point C, fixed-noise SG algorithm; (f) Point C, ML-SG algorithm (see text for point descriptions and analysis). True model depth values shown as the dashed line.

5.6 Summary

In this chapter, through introducing the shallow water inversion problem and radiative transfer model, we have begun to identify the aspects of uncertainty that will contribute to the full remote sensing inversion problems tackled later in this thesis. In this first application of the SG algorithm to spectral remote sensing data, we have sought to isolate components of the uncertainty in the model to evaluate the effects of a large increase in the spatial complexity and dimensionality of the problem. This application of the SG algorithm to a synthetic dataset representative of a coral reef environment has raised a number of computational and sampling efficiency issues directly related to the increased size of the data and underlying complexity of the model. With the data and radiative transfer uncertainties well characterised in the inversion, the primary contributor to the uncertainty of the parameter solution remains the uncertainty of fitting the generalised partition surface to a spatially complex data set. To encompass this aspect of the model uncertainty, we have proposed and developed an alternate noise model based on a maximum likelihood estimation. By incorporating the maximum likelihood model into the SG algorithm we are aiming to provide a framework to encompass other elements of uncertainty in the full remote sensing inversion problem, many of which are extremely difficult to estimate independently. To our knowledge this is the first application of a maximum likelihood noise model in a probabilistic inversion of remote sensing data. Initial tests of the ML-SG algorithm on the single parameter synthetic data have shown it to provide a more effective characterisation of parameter uncertainty, and, just as importantly, to enable a more efficient sampling of the very high dimensional model space. This was illustrated by the consistency of the ensemble solutions from a range of estimators, the smooth inferred uncertainty model and the successful retrieval of the spatially complex bathymetry model. In the next chapter we look to extend the shallow water problem to a full suite of five unknown parameters, and to test the ML-SG on an inverse problem of increasing dimensionality.


Chapter 6

The Full Shallow Water Inverse Problem

6.1 Introduction

The primary goal of this chapter is to extend the shallow water inversion problem to encompass estimation of the full range of parameters characterising the water column and bottom reflectance in the radiative transfer model. The addition of a further four unknown parameters in the inversion greatly increases the dimension of the problem that must be solved. We examine the capability of the segment guided and maximum likelihood developments we have made to the trans-dimensional algorithm to deal with these challenges. The creation of a synthetic data set provides a well characterised test-bed in which we can control the spatial scales of variability of each of the five environmental parameters in the inversion problem, and assess the implications of applying a single generalised partition model in this framework. Using this controlled environment also enables us to explore the interactions and sensitivities between model parameters, and to isolate and investigate the influence of the various forms of error and uncertainty. Combining these insights, we consider methods for interpreting the ensemble of solutions and appropriately reflecting the uncertainty of the estimated parameters in our problem. The high-dimensional nature of the problem we are now addressing leads us to consider options for regularising the inversion through the use of existing data or knowledge about the study area. We discuss the difficulties of using such information to inform the


prior distributions in our inversion framework, and propose a hybrid method which utilises the segmentation aspect of our model in conjunction with an optimisation routine. The encouraging results from this method are compared to the ML-SG application, showing it to be an effective tool in tackling the increased dimensions of the problem, and thus of significant potential benefit when we first apply the algorithm to a real world data problem in the following chapter.

6.2 Extending to Multiple Parameters

In this chapter we now extend the shallow water inversion problem to estimate all five parameters in the radiative transfer model (equation 5.4). In addition to the water column depth H, we free up four additional parameters held constant in the previous chapter. These consist of three parameters describing the concentrations of constituents in the water column (C_CHL, C_CDOM, C_NAP), and a single parameter (q) describing the proportion of two substrates contributing to the bottom reflectance. The primary effect of these additional parameters on the dimension of the problem is clear, and characteristic of many remote sensing inverse problems. With five free parameters now to solve for, we must explore a significantly larger model space and deal with the possibility of local minima and parameter combinations contributing to a non-unique solution. In previous chapters we have only been concerned with a local minima solution in the context of fitting the parametrised partition model to higher resolution underlying data. Now, we are encountering a model space exploration problem in which we must deal both with the parameter interactions and sensitivities in the RT model, as well as the fitting of the partition model. Recall that the dimension of our problem in previous chapters was 3n, consisting of n Voronoi nodes each with a location (x, y) and parameter h. With four additional parameters added to each node, we extend that model dimension to 7n, which amounts to a significantly larger parameter space when we consider the large n which we have found is needed to sample these problems effectively (~4000-6000 in the single parameter version of the previous chapter). Another factor that must be considered relates to the characterisation of the spatial domain by the partition model, and concerns the differing scales of variability and spatial coherency of each of the parameters. Thus far we have used the partition model framework as a regularisation tool that makes use of spatial coherency, at the scale of the spatial variability of the depth parameter. In our shallow water problem, whilst studies


have shown water column concentrations do exhibit variations related to depth and bottom type (Boss and Zaneveld 2003), it is acknowledged that environmental parameters associated with water column constituents show a lower scale of spatial variability in comparison to depth or substrate composition (Goodman et al. 2008). This means that while we might expect finer scale variations in depth or bottom type, we would generally not see the same degree of variation in the water column concentrations, as they vary more smoothly across the spatial domain. Furthermore, the interactions in the radiative transfer model mean that individual parameters each have a specific and wavelength dependent influence on the reflectance data. We can see this used in modelling features such as the substrate detectability index (SDI) of Brando et al. (2009), which works off the principle that the reflectance from the substratum has an exponentially decreasing influence on the remote sensing reflectance as depth and water column concentrations increase. The combined radiative transfer model and spatial variation influences mean that we are dealing with a significantly more complex modelling environment than in our previous single parameter tests. In this section we provide details of the construction of a synthetic data set designed to test these more challenging issues, and the modifications that are made to the ML-SG algorithm to address them.

6.2.1 Synthetic Data Construction

6.2.1.1 Rationale

The synthetic data for this chapter was developed using many of the same inputs and processing steps as detailed for the single parameter tests in section 5.3.1. We utilised the same SIOP and substrate parametrisation, and again the Heron Island derived bathymetry subset forms the foundation of the synthetic data set. The synthetic rrs data set was derived using five varying parameter input layers to the radiative transfer model (equation 5.4), retaining the same spectral resolution and noise characteristics as described in section 5.3.1.2. To increase the range of depths in the bathymetry input, the original data (figure 5.2) was used to create a bathymetry input with a depth range of 0.41m to 16.51m, increased from the original range of 0.62m to 4.37m. In this multi-parameter problem, we wish to examine the combined influences and interactions of the water constituents and substrate reflectance across a broader range of depths in our probabilistic inversion


framework. Input layers for the water constituent concentrations of CHL, CDOM and NAP were designed to reflect a lower frequency of variation across the spatial domain in comparison to the bathymetry, with different patterns of variation between each of the three concentration parameters. Finally, a substrate proportion q layer was created by coupling the proportion values to a broad seven class classification of the bathymetry layer, and smoothing the resulting layer with a 10 x 10 pixel median filter. The result of this process is an input layer of lower spatial frequency variation than the bathymetry, representing a higher proportion of coral in shallower regions and higher proportions of sand at depth. These input layers are presented in figure 6.1.

Figure 6.1 – Environmental parameter inputs to the construction of the multi-parameter synthetic data.

To create a reasonable representation of the water constituent concentrations, the ranges shown in figure 6.1 for the input data layers were derived after consideration of previous field work and studies observing these quantities in Australian coral waters (Wettle 2005; Brando et al. 2009; Dekker et al. 2011). The combined influence of these concentrations can be examined using Kd, the vertical attenuation coefficient for diffuse downwelling light, taken from the original form of the analytical radiative transfer equation (5.1). The water concentrations are related to Kd through the semi-analytical parametrisation detailed in Appendix A, by which we can generate a map of Kd at a wavelength of 490nm for our synthetic data (figure 6.2). Although we do not store samples for an image based ensemble solution of Kd in our algorithm, we do examine the joint interaction of the water constituent concentrations in section 6.5.2.1.

Figure 6.2 – Vertical attenuation coefficient for diffuse downwelling light at 490nm (Kd) - based on the parameter inputs shown in figure 6.1 and the radiative transfer model detailed in Appendix A.

One of the influences of having a varied Kd across the study area is that the attenuation of light in the water column it represents not only varies spatially, but is also wavelength dependent, based on the absorption and backscattering characteristics of the SIOPs used in the model parametrisation. This means that we have a spatially complex model that varies in five different dimensions based on our different input layers, and that for each band of the data generated by the model, this variability is realised at a different scale and pattern. This feature of the data is best visualised graphically, and can be seen by presenting three image bands at different wavelengths across the spectral range of the data (figure 6.3).
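A sketch of the q layer construction step, assuming numpy/scipy and an illustrative stand-in bathymetry; the class-to-q mapping shown is hypothetical, chosen only to give high q (coral) in shallow water and low q (sand) at depth:

```python
import numpy as np
from scipy.ndimage import median_filter

rng = np.random.default_rng(4)
bathymetry = rng.uniform(0.41, 16.51, size=(500, 500))   # stand-in depth layer (m)

# Broad seven-class classification of the bathymetry, mapped to substrate
# proportions and smoothed with a 10 x 10 pixel median filter.
edges = np.linspace(bathymetry.min(), bathymetry.max(), 8)   # 7 depth classes
classes = np.digitize(bathymetry, edges[1:-1])               # class labels 0..6
q = 1.0 - classes / 6.0              # hypothetical mapping: shallow -> coral-rich
q = median_filter(q, size=10)
```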

6.2.1.2 Data Segmentation

In the previous chapter, the scale of data segmentation was driven by the high frequency spatial variation and coherency of the depth parameter layer. To complete the segmentation of the new data, the same scale, shape and colour parametrisation as applied to the depth based data set was used (see section 5.3.2). With the increased range of depth, and a variability of water constituents at a lower spatial frequency in the new data, using the same segmentation parameters allows us to get another indication of the influence of these factors on the local scales of coherency of the data. In comparison to the depth driven data, the segmentation of the multi-parameter data results in fewer segments, and a clear difference in the density, structure and number of segments across different regions of the image (figure 6.4). In the lower half of the image, where the concentrations of water constituents and the resulting Kd are comparatively lower, we see a similar scale and structure of the segmentation, driven primarily by the structure of the bathymetry. In the upper portion of the image, where Kd

Figure 6.3 – Subsurface remote sensing reflectance (rrs) at varied wavelengths - (a) Band 10, 490nm; (b) Band 17, 560nm; (c) Band 33, 720nm; (d) profile of rrs values at row 500 of the synthetic data. Highlights the wavelength dependent responses to characteristics of the synthetic data parametrisation. Note the negligible reflectance in the NIR portion of the spectrum shown by Band 33 in all but the shallowest regions of the model.


Figure 6.4 – Multi-scale object based segmentation of the multi-parameter synthetic data into 526 segments - note the decreased density and number of segments in regions of higher water constituent concentrations.

is higher, we see fewer segments and much less complexity in the segmentation structure, with fewer segments particularly over shallow areas in comparison to the depth driven image data. This shows the combined influence of the increased water column depth and varied water constituent concentrations. The increased Kd in the upper portion has a noticeable effect on the scales of spatial data coherency in this region, with a decreased influence of the higher frequency spatial variation of the depth parameter layer and an increased influence of the lower frequency variation of the water concentration layers. This is one of the key challenges that the partition modelling algorithm faces, that of resolving the variable spatial complexity of each individual parameter, where there is a complex non-linear combined influence of these parameters on the spatial variability of the data.

6.2.2 Modifications to the Algorithm

To incorporate multiple parameters into our ML-SG algorithm we firstly define some new terms to encompass the parameter ranges and Gaussian proposal widths of the full suite of parameters. We define V and σ as vectors of size P containing the parameter values and Gaussian proposal widths (σ_k) of each parameter V_k respectively, where k = 1, . . . , P and P is the number of parameters to be solved for, being five in the case of our model.


We can then define the prior ranges of the parameters as ΔV_k = (V_{k,max} − V_{k,min}) for all parameters. The schedule of the proposed steps in the algorithm remains the same (section 3.4.1.3), with a parameter value change proposed at each even step of the chain, whilst at each odd step of the chain a Voronoi cell move, birth or death move is selected with equal probability. For a parameter change in the chain we randomly select a cell node c_i in the same manner as the original algorithm; however, we then also randomly select a parameter V_k to perturb using its associated proposal width σ_k, using the equivalent form of equation 3.16. In the case of a birth move, to ensure reversibility of the move, we require that all P parameters in the added cell node c_{n+1} are perturbed randomly and independently from their values in the current model V using the corresponding proposal widths σ_k.
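A minimal sketch of these multi-parameter proposal rules, with the proposal widths taken from table 6.1 and the data structures chosen for illustration only:

```python
import numpy as np

rng = np.random.default_rng(5)
P = 5                                                  # CHL, CDOM, NAP, q, H
sigma = np.array([0.015, 0.0007, 0.07, 0.06, 0.2])     # proposal widths (table 6.1)

def perturb_parameter(V):
    """Even step: perturb one randomly selected parameter of one node.

    V is the (n x P) array of parameter values at the n cell nodes."""
    Vp = V.copy()
    i = rng.integers(V.shape[0])     # random cell node
    k = rng.integers(P)              # random parameter, selected with probability 1/P
    Vp[i, k] += rng.normal(0.0, sigma[k])
    return Vp

def birth_values(V_at_location):
    # Birth: all P parameters of the new node are perturbed randomly and
    # independently from the current model's values at the birth location.
    return V_at_location + rng.normal(0.0, sigma)
```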

6.2.2.1 Non-Dimensional Change Moves

The priors for non-dimensional moves in the algorithm (parameter change or cell node move) can be formed using the original form of the prior (equations 3.10 & 3.15). For multiple parameters we can rewrite the decomposed form of the prior as

p(m) = p(c \mid n)\,p(V \mid n)\,p(n), \qquad (6.1)

where the priors on the cell positions and number of nodes remain the same. The probability of assigning the P parameter values to each of the n cell nodes in the model can be written as

p(V \mid n) = \left(\frac{1}{\prod_{k=1}^{P} \Delta V_k}\right)^{n}. \qquad (6.2)

This means that the term \prod_{k=1}^{P} \Delta V_k essentially replaces \Delta h in the full prior (3.15)

and the prior ratio for non-dimensional change moves again reduces to unity. The proposal distributions for these moves remain symmetrical, with no change in the structure of a cell position move with multiple parameters. In a parameter value change, as we select a parameter to perturb in V with a probability of 1/P, there is equal probability that we will select the same parameter in the reverse move. Thus, the probability of going from m to m′ is equal to that of going from m′ to m, resulting in a proposal ratio that reduces to unity for both types of move. This means that for


non-dimensional change moves in the algorithm the acceptance terms remain the same as previous single parameter versions, determined by a ratio of the current and proposed likelihood functions.

6.2.2.2 Birth and Death Moves

The prior ratios for the birth and death moves involving multiple parameters can be derived by replacing the single parameter prior range term Δh in equations 4.13 and 4.14 with the term \prod_{k=1}^{P} \Delta V_k. This gives a prior ratio in the ML-SG algorithm for a multiple parameter birth move as

\left(\frac{p(m')}{p(m)}\right)_{birth} = \frac{s_b + 1}{(r_b - s_b)\left(\prod_{k=1}^{P} \Delta V_k\right)}, \qquad (6.3)

with the prior ratio for a death move as

\left(\frac{p(m')}{p(m)}\right)_{death} = \frac{(r_b - s_b + 1)\left(\prod_{k=1}^{P} \Delta V_k\right)}{s_b}. \qquad (6.4)

For a birth move proposal, where we are perturbing all parameters to create the proposed cell node, we can first rewrite the proposal ratio as

\frac{q(m \mid m')}{q(m' \mid m)} = \frac{q(c \mid m')}{q(c' \mid m)} \cdot \frac{q(V \mid m')}{q(V' \mid m)}. \qquad (6.5)

The only term that changes in this equation when considering multiple parameters is q(V′ | m), which represents the probability of proposing the new parameter values given the current model. If we consider the probability of generating a full set of P parameters at the new cell node n + 1 as the combined probability of generating each individual parameter, we can write

q(V' \mid m) = \prod_{k=1}^{P} q(V'_k \mid m), \qquad (6.6)

where

q(V'_k \mid m) = \frac{1}{\sigma_k \sqrt{2\pi}} \exp\left(-\frac{(V'_{k,n+1} - V_{k,i})^2}{2\sigma_k^2}\right). \qquad (6.7)

Substituting 6.7 into 6.6 gives


q(V' \mid m) = \frac{1}{\left(\prod_{k=1}^{P} \sigma_k\right)(\sqrt{2\pi})^P} \exp\left(-\sum_{k=1}^{P} \frac{(V'_{k,n+1} - V_{k,i})^2}{2\sigma_k^2}\right). \qquad (6.8)

We can then use the previously derived proposal probabilities for the cell locations (4.15 & 4.16) and substitute these and the new equation 6.8 into 6.5 to derive the proposal ratio for a multiple parameter birth move using the ML-SG algorithm as



\left(\frac{q(m \mid m')}{q(m' \mid m)}\right)_{birth} = \frac{(\sqrt{2\pi})^P \left(\prod_{k=1}^{P} \sigma_k\right)(r_b - s_b)}{(s_b + 1)} \exp\left(\sum_{k=1}^{P} \frac{(V'_{k,n+1} - V_{k,i})^2}{2\sigma_k^2}\right). \qquad (6.9)

This leads to the death proposal ratio as the inverse of 6.9, where we have moved from n to (n − 1) cells, given by

\left(\frac{q(m \mid m')}{q(m' \mid m)}\right)_{death} = \frac{s_b}{(\sqrt{2\pi})^P \left(\prod_{k=1}^{P} \sigma_k\right)(r_b - s_b + 1)} \exp\left(-\sum_{k=1}^{P} \frac{(V'_{k,j} - V_{k,i})^2}{2\sigma_k^2}\right), \qquad (6.10)

where V′_{k,j} are the parameter values in the new cell parametrisation at the location of previous cell i, where i = 1, . . . , n − 1. For these dimensional change moves we substitute the likelihood expression for m′ and m (5.17) into (4.22), along with the newly derived multiple parameter proposal and prior ratio terms for a birth (6.3 & 6.9) and death (6.4 & 6.10) move. This yields acceptance terms for the multiple parameter birth and death moves in the ML-SG algorithm as

 0

\alpha(m' \mid m)_{birth} = \min\left[1,\ \frac{\left(\prod_{k=1}^{P} \sigma_k\right)(\sqrt{2\pi})^P}{\prod_{k=1}^{P} \Delta V_k} \exp\left\{\sum_{k=1}^{P} \frac{(V'_{k,n+1} - V_{k,i})^2}{2\sigma_k^2} - \frac{N\log(\phi(m')) - N\log(\phi(m))}{2}\right\}\right], \qquad (6.11)

and

\alpha(m' \mid m)_{death} = \min\left[1,\ \frac{\prod_{k=1}^{P} \Delta V_k}{\left(\prod_{k=1}^{P} \sigma_k\right)(\sqrt{2\pi})^P} \exp\left\{-\sum_{k=1}^{P} \frac{(V'_{k,j} - V_{k,i})^2}{2\sigma_k^2} - \frac{N\log(\phi(m')) - N\log(\phi(m))}{2}\right\}\right]. \qquad (6.12)

Parameter (k)    V_min     V_max     σ
CHL (µg/L)       0.02      0.22      0.015
CDOM             0.001     0.011     0.0007
NAP (mg/L)       0.4       2.2       0.07
q                0         1         0.06
H (m)            0         17        0.2

Table 6.1 – Parameter prior ranges and Gaussian proposal widths for the multi-parameter ML-SG algorithm.

6.3 Application of the ML-SG Algorithm

6.3.1 Parameterisation and Convergence

For the first application of the ML-SG algorithm to the multi-parameter problem, we allowed 72 individual chains to run in parallel for 1.7M steps. Prior parameter ranges and the final proposal widths for each parameter change are shown in table 6.1. These result in an acceptance rate of 20% for parameter value change proposal moves and 3.8% for dimensional change moves in the chain. To demonstrate convergence of the chains, we look at the stability of the variance of the solutions across chains, for all parameters. As discussed in section 5.5.2, we are unlikely to be reaching true convergence in these very high dimensional problems in the traditional global sense of the model. This is indicated by the increasing number of partitions throughout the duration of the chain sampling using the ML-SG algorithm. However, at a local scale we see that parameter estimation and sampling variance stabilises, and can provide an indication of a functional convergence of the algorithm. To assess this we recorded all parameter samples across a random selection of 10 chains in the model, for a single random pixel location in the data. Using a sliding window of 5,000 samples we calculate the variance across chains of each parameter in


the window, continuing through the length of the chain in steps of 1,000 samples. The results are shown in figure 6.5, with the variance of each parameter normalised against the maximum observed variance for that parameter. We can see from this figure that all parameter variances appear to have stabilised and reached constant values at around the 1.2M sample mark. Another method of visualising this information is as the sampling range of the combined trace plots, drawn for each parameter from the selection of independent chains, shown in figure 6.6. The analysis gives us an indication of when the variance of each parameter begins to stabilise across chains in the full model, and hence where we can consider the run to have converged. Based on these tests, we discard the first 1.2M samples as burn-in and retain the final 500,000 samples for each chain. Again, we must stress that in these high dimensional problems which we are tackling, full global convergence of the chain based on the full partition model is very difficult to assess, and may indeed never be reached. Hence we are looking at local convergence indicators to assess if sampling at this scale can be considered to have stabilised, and whether inferring information from the ensemble produced is indeed sensible.

Figure 6.5 – Parameter variance of 10 combined chains - sampled using a sliding window of 5,000 samples and normalised to the maximum observed variance of each parameter (end of burn-in period indicated by dashed line).
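A sketch of this sliding-window variance diagnostic, assuming the retained samples of one parameter at the chosen pixel are stacked one row per chain:

```python
import numpy as np

def sliding_variance(samples, window=5000, step=1000):
    """Across-chain variance through the run, normalised as in figure 6.5.

    samples has shape (n_chains, n_samples); for each window position the
    variance is taken over all chains and samples inside the window."""
    n_samples = samples.shape[1]
    starts = range(0, n_samples - window + 1, step)
    var = np.array([samples[:, s:s + window].var() for s in starts])
    return var / var.max()
```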


Figure 6.6 – Parameter sample trace plot ranges for 10 combined chains in the ML-SG algorithm run - indicating stability of the solution across all parameters at ~1.2M samples. (Panels: (a) CHL, (b) CDOM, (c) NAP, (d) q, (e) Depth.)


6.3.2 Full Ensemble Based Solutions

To represent the full posterior ensemble solutions for each parameter, we can again use an inferred pixel-based mean and uncertainty from the sample ensemble of each parameter. Interpreting these results against the true model input layers provides a number of initial insights regarding multi-parameter and scale issues in our ML-SG inversion framework. In the mean ensemble solutions and uncertainty maps (figures 6.7 & 6.8) we can clearly see the interactions of parameters in the radiative transfer model. In the model, the relative contribution of the bottom reflectance decreases exponentially as the depth of the water column increases. Hence we see the increased uncertainty in solutions for depth and q in deeper areas of the study area. This is consistent with the uncertainty propagation study of Hedley et al. (2012a), as is the relatively robust retrieval of the depth component in comparison to the bottom reflectance (or substrate composition) as the water column deepens. We see this in the deepest area of the ensemble solutions, where although both depth and q have a higher uncertainty, depth is resolved well in comparison to the true model whilst the mean solution for q is considerably higher. We can visualise this concept by comparing the full modelled spectra at test point 1 (see figure 6.9) to the modelled optically deep water spectra ($r_{rs}^{dp}$ in equation 5.1) computed using the known input parameters at this point (figure 6.10).

The modelled optically deep spectra consists only of contributions from the water column, with the close match to the full modelled spectra indicative of the minimal influence of the bottom reflectance, reflected in the subsequent uncertainty of the q parameter estimation. Bottom reflectance contribution can be characterised by the Substratum Detectability Index (SDI) of Brando et al. (2009), and its influence on the capability to discriminate between bottom types over a range of water column depths has been quantified in the study of Botha et al. (2013). Conversely, the uncertainty of the depth and q solutions decreases considerably over shallow regions, with the opposite effect evident in the water column concentration solutions (figure 6.8). Over these shallow regions we see a clear increase in the uncertainty and a poorer mean retrieval of all three concentrations in comparison to their true model values. The water depth is again the significant influencing factor in these retrieval accuracy and uncertainty issues.
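The pixel-based mean and uncertainty maps of figures 6.7 and 6.8 reduce, at each pixel, to simple moments of the retained post-burn-in samples. A minimal sketch, assuming the ensemble has been flattened to a stack of per-pixel parameter fields (all names, shapes and the synthetic ensemble are illustrative):

```python
import numpy as np

# Hypothetical ensemble: n_samples realisations of one parameter field
# (e.g. depth) over an ny x nx pixel grid, taken after burn-in.
rng = np.random.default_rng(2)
n_samples, ny, nx = 2000, 64, 64
true_depth = np.linspace(1.0, 14.0, nx)[None, :].repeat(ny, axis=0)
ensemble = true_depth + rng.standard_normal((n_samples, ny, nx)) \
           * (0.1 + 0.05 * true_depth)   # uncertainty grows with depth

mean_map = ensemble.mean(axis=0)    # ensemble mean solution (figure 6.7b)
sigma_map = ensemble.std(axis=0)    # ensemble uncertainty sigma (figure 6.7c)
print(mean_map.shape, sigma_map[:, [0, -1]].mean(axis=0))  # shallow vs deep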


Figure 6.7 – ML-SG Algorithm - Ensemble mean and uncertainty (σ) parameter solutions for Depth (m) and q. (Panels: (a) true environmental parameter inputs; (b) Depth mean ensemble solution; (c) Depth uncertainty ensemble solution; (d) q mean ensemble solution; (e) q uncertainty ensemble solution.)


Figure 6.8 – ML-SG Algorithm - Ensemble mean and uncertainty (σ) parameter solutions for CHL (µg/L), CDOM and NAP (mg/L). (Panels: mean and uncertainty ensemble solutions for each of CHL, CDOM and NAP.)


Figure 6.9 – Sample test points to examine parameter and model interactions - locations indicated on the true depth model.

In the shallow water radiative transfer system, as the water column gets shallower the fraction of upwelling radiance from the interactions of photons within the water column becomes smaller compared with the fraction derived from the substratum, and vice versa. Therefore, values of absorption and backscattering (a function of the water concentrations) become less reliable as the bottom reflectance contribution becomes larger (Dekker et al. 2011). In these estimated water concentration mean solutions we see that CDOM is the most accurately retrieved, showing less of the degradation in mean retrieval and uncertainty over the problematic shallow areas that is observed in both NAP and CHL. NAP is estimated with a similar spatial distribution of uncertainty to CDOM, although with a poorer accuracy over the shallower regions. CHL is the most poorly resolved of all the parameters, with clear accuracy and uncertainty effects particularly evident in shallow waters. These results are consistent with those of Jay and Guillaume (2011), who completed a sensitivity study on the same water constituent concentrations using a synthetic dataset and optimisation routine. They found CHL to be poorly resolved across all depths in comparison to NAP and CDOM, with NAP having the most consistent relationship between water column depth and the accuracy and uncertainty of retrieval.
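The depth dependence described above can be made explicit with a simplified two-flow shallow water reflectance model of the general form used by semi-analytical methods: a water-column term that saturates with depth plus a bottom term attenuated over the two-way path. This sketch is a schematic stand-in for the thesis's actual radiative transfer model, with entirely illustrative spectra and coefficients:

```python
import numpy as np

def rrs_shallow(H, q, kd, rrs_dp, rho_dark, rho_light):
    """Schematic shallow-water reflectance over wavelength: water-column
    contribution plus a substratum term attenuated by exp(-2 Kd H)."""
    rho = q * rho_light + (1.0 - q) * rho_dark    # mixed substratum
    atten = np.exp(-2.0 * kd * H)                 # two-way attenuation
    return rrs_dp * (1.0 - atten) + (rho / np.pi) * atten

wl = np.linspace(400, 750, 36)
kd = 0.15 + 0.35 * np.exp(-(wl - 420) ** 2 / 2e3)   # toy Kd spectrum
rrs_dp = 0.004 * np.exp(-(wl - 480) ** 2 / 8e3)     # toy deep-water r_rs
rho_dark = 0.08 * np.ones_like(wl)                  # toy dark substrate
rho_light = 0.35 * np.ones_like(wl)                 # toy light substrate

for H in (2.0, 8.0, 14.0):
    r = rrs_shallow(H, q=0.5, kd=kd, rrs_dp=rrs_dp,
                    rho_dark=rho_dark, rho_light=rho_light)
    water = rrs_dp * (1.0 - np.exp(-2.0 * kd * H))
    frac_bottom = 1.0 - water.sum() / r.sum()
    print(f"H = {H:4.1f} m: bottom contribution ~ {frac_bottom:.0%}")
```

Even with these toy inputs, the bottom term dominates at 2 m and becomes marginal by 14 m, mirroring the uncertainty behaviour of q and the water concentrations described above.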


Figure 6.10 – Optically deep component ($r_{rs}^{dp}$) of the modelled spectra at Test Point 1 - comparison to the full modelled spectra.

The ensemble solution maps also highlight another possible spatial scale artefact in the parameter retrievals. As in the single parameter problem, where we saw increased uncertainty at boundaries of varied bathymetry, we also see an increased uncertainty and poor retrievals over smaller shallow features where the surrounding water depth varies significantly. This is evident not only in the water concentrations, but also the estimated depth, suggesting possible effects of the varying partition model as it tries to resolve these highly detailed features with a more generalised structure that now defines all five parameters of interest. We examine the implications of this effect further in the results presented in section 6.5.2, whilst in the next section we focus further on the interactions between the parameters we have identified in the ensemble solutions, and appropriate methods for representing the solution uncertainties.

6.3.3 Parameter Interactions in the Shallow Water Model

One of the benefits of interpreting the posterior solution ensemble with a point based estimator such as the mean (and associated uncertainty), as we have in the previous section, is its computational simplicity. In the case of our generalised partition sampling


framework, we have also shown that it provides an effective way to infer a “better” solution, containing more information and detail than any one single model in the sampling chain. However, by choosing these estimators we are making some key assumptions about the Gaussian form of the posterior distributions at each pixel in the model. If these assumptions are wrong (for example, a bi-modal, flat or even one-tailed posterior parameter distribution), the choice of a mean and Gaussian uncertainty can mean an incorrect, or at best misleading, inferred solution from the ensemble. Similar arguments, again based on assumptions about the shape of the posterior, can also be made regarding the other point estimators such as the median or mode that we could use to represent our ensemble solutions (Debski 2010). Whilst these estimators are a practical method of representing the ensemble solution in an image format, we need to examine the structure of the posterior distributions to ensure that we are representing the solution and inferred uncertainty appropriately. We have identified various points in the study area (figure 6.9) that represent a range of parameter value and scale variability scenarios. By taking the marginal posterior probability distribution of the estimated parameters at these points we can examine parameter interactions in different physical scenarios, both in terms of the radiative transfer model and also the spatial variability of the environment. Assessing the structure of the individual PDFs also enables us to determine the appropriate form of the uncertainty to represent parameter variability, and whether our assumed Gaussian distribution is valid across these different points and scenarios. In figures 6.11 and 6.12 we see the parameter interactions for two points located in deeper regions of the study area, with higher relative Kd and water concentrations at test point 1 in comparison to point 3. At both of these points we see the effect of the increased relative contribution of the water column reflectance in comparison to that from the bottom substratum. Water column concentrations are retrieved accurately (with the exception of CHL), with a low uncertainty, whilst the depth and q estimations both exhibit an increased uncertainty. The depth mean retrieval accuracy remains relatively robust; however, the poorly constrained solutions for q result in a mean far from the true value of the parameter. The mean of the ensemble for q at point 1, whilst far from the true value, could still be considered an appropriate representation of the ensemble. However, at point 3 the mean is clearly a poor estimator, with the maximum probability of the ensemble occurring at the mode of the solution in a one-tailed distribution, very close to the true value. This kind of distribution also highlights


Figure 6.11 – Test Point 1 ML-SG Algorithm - Marginal posterior probability distributions (panels (a)-(e)) - Mean ensemble solution estimator indicated by the red dashed line, true parameter value indicated by the black dashed line. 90% HPD interval displayed in blue.


Figure 6.12 – Test Point 3 ML-SG Algorithm - Marginal posterior probability distributions (panels (a)-(e)) - Mean ensemble solution estimator indicated by the red dashed line, true parameter value indicated by the black dashed line. 90% HPD interval displayed in blue.


Figure 6.13 – Test Point 2 ML-SG Algorithm - Marginal posterior probability distributions (panels (a)-(e)) - Mean ensemble solution estimator indicated by the red dashed line, true parameter value indicated by the black dashed line. 90% HPD interval displayed in blue.


the issues discussed by Stark and Tenorio (2010) and raised earlier in section 4.2.4, regarding the influence of an uninformative prior in an interval form. If the true value of the parameter sits at the edge of the prior range, as in this case, uncertainty in the PDF can only be modelled in one tail of the distribution, with clear implications not only for the uncertainty but also for the relevance of the mean as an appropriate estimator. At test point 2 (figure 6.13) we see the corresponding effects of a shallower water column on the parameter estimations and uncertainties. In this shallow region, depth and q are very accurately resolved, with very low uncertainty. NAP and CDOM are resolved with less accuracy than at the deeper test points, and with a higher uncertainty, whilst CHL becomes highly unconstrained across the full prior range. Whilst many of these PDFs are localised and single peaked (suggesting a mean estimator is reasonable), it is clear when examining the range of PDFs across these parameters and sample points that a number do not conform to a Gaussian distribution. Dealing with these unbalanced, one-sided or unconstrained posterior distributions not only has ramifications for the type of point estimator we might consider, but also for the form of uncertainty that accurately reflects the solution. A routine practice in Bayesian inference is to summarise the marginal posterior distributions by the use of a posterior credible interval (Chen and Shao 1999). This approach determines the interval in which the defined percentage of posterior samples occur, with equal proportions of samples excluded beyond the lower and upper bounds of the interval. For instance, a 90% credible interval contains 90% of the posterior samples, excluding 5% of samples beyond each of the upper and lower bounds. However, when the posterior distributions are non-symmetrical (as in many of our observed cases), the use of a highest probability density (HPD) credible interval may be more appropriate. In the marginal posteriors of figures 6.11, 6.12 and 6.13 we calculated a 90% HPD credible interval based on the method of Chen and Shao (1999), represented by the blue sections of the histograms. The HPD differs from the standard credible interval by defining the shortest possible interval in the parameter space that represents the given probability content of the interval (Box and Tiao 1992). This means that for distributions such as that shown for q at point 3 (figure 6.12d), the HPD credible interval isolates a smaller interval over the highest probability section of the one-tailed distribution, rather than enforcing a specific percentage of samples at the lower probability end of the tail.
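The Chen and Shao (1999) estimator of the HPD interval is straightforward to compute from posterior samples: sort the samples and take the shortest window containing the required fraction. A minimal sketch, with a synthetic one-tailed ensemble standing in for the q samples at test point 3:

```python
import numpy as np

def hpd_interval(samples, prob=0.90):
    """Shortest interval containing `prob` of the posterior samples
    (the Chen-Shao empirical HPD estimator)."""
    x = np.sort(np.asarray(samples))
    n = len(x)
    m = int(np.ceil(prob * n))            # number of samples inside
    widths = x[m - 1:] - x[:n - m + 1]    # all candidate windows
    i = np.argmin(widths)                 # shortest one
    return x[i], x[i + m - 1]

# One-tailed example: posterior mass piled against the lower prior bound.
rng = np.random.default_rng(3)
q_samples = np.abs(rng.standard_normal(20_000)) * 0.15
lo, hi = hpd_interval(q_samples, 0.90)
eq_lo, eq_hi = np.quantile(q_samples, [0.05, 0.95])   # equal-tail interval
print(f"HPD:        [{lo:.3f}, {hi:.3f}]")
print(f"equal-tail: [{eq_lo:.3f}, {eq_hi:.3f}]")
```

For such a one-tailed case the HPD interval hugs the high-density end near the bound, whereas the equal-tail interval is forced to exclude 5% of samples at the lower bound.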


6.4 Error Sources and Sensitivity

In the previous chapter we discussed the three sources of uncertainty and error contributing to the shallow water inversion problem, and the difficulties of characterising or estimating them effectively for representation in the probabilistic inversion framework. This led to consideration of an alternate noise model, and the development of the ML-SG algorithm as an attempt to encompass this uncertainty in the inversion problem. In this section we use our well characterised synthetic data problem as an opportunity to isolate these individual uncertainty aspects, and assess their influence in the shallow water inversion. By doing so, we are able to evaluate the ability of the alternate noise model in the ML-SG algorithm to appropriately reflect the uncertainty these different error sources bring to the parameter estimations. The methods for determining parameter proposal scales and McMC convergence for the tests in this section were completed in the same manner as described for the initial ML-SG test in this chapter, and so are not described for each individual test.

6.4.1 Increased Data Noise

In section 4.5.1 we saw how the partition sampling aspect of the algorithm was particularly effective in estimating parameters from noisy data, utilising the spatial coherency constraints inherent to the partition method as a form of regularisation. For this test we increase the random Gaussian noise in the multi-parameter data from 0.0015 sr⁻¹ to 0.008 sr⁻¹, creating a data set with significantly less information content with which to reflect the spatial distribution of parameters (figure 6.14). In comparison to the spectral transects of the original dataset (figure 6.3), this noisy data shows far less definition between the spectral bands, and less distinct individual spectral response features to changes of the parameters in the model. Applying the ML-SG algorithm to this noisy data for the multiple parameter inversion, we see the same regularisation characteristics in the estimated solutions (figure 6.15). In particular we observe a very similar retrieval of the ensemble mean solutions to the low noise test, with only a slight (and expected) increase in the estimated uncertainty of the parameters. These features are also visible in the marginal posterior parameter distributions. Figures 6.15g & h show the marginal posteriors for depth and q at test point 3, displaying a similar structure of the posterior, with an increased uncertainty


Figure 6.14 – Profile of rrs values at row 500 of the increased noise (0.008 sr⁻¹) synthetic data.



(shown by the increased HPD interval range) when compared to the low noise data solution (figure 6.12).
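The noisy dataset for this test is produced simply by drawing independent Gaussian perturbations for each band; a brief sketch with the noise levels quoted above and a hypothetical reflectance cube (the array shape is illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
rrs = 0.01 * rng.random((100, 100, 36))       # hypothetical r_rs cube
noise_low, noise_high = 0.0015, 0.008         # sr^-1, as in the tests

rrs_low = rrs + rng.normal(0.0, noise_low, rrs.shape)
rrs_high = rrs + rng.normal(0.0, noise_high, rrs.shape)
# The signal-to-noise ratio drops by the ratio of the two noise levels.
print(f"noise increased by a factor of {noise_high / noise_low:.1f}")
```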

6.4.2 Empirical Parametrisation Errors

Another form of error that enters the inversion process is that resulting from the inclusion of fixed parameters as the empirical component of the semi-analytical radiative transfer model (see section 5.2.2.1). These can be referred to as “observational” or “hidden” parameters in the inversion framework, in that they may be measured in a different setting to the inversion problem itself, and are “hidden” as fixed parameters in the radiative transfer forward model (Debski 2010). An important consideration in an inverse problem context is that, for a given forward model, one man's “hidden” parameter may be another man's “model” parameter to be estimated, and vice versa. In our inversion problem we have parametrised our model with fixed “hidden” parameters consisting of the SIOPs and substrates used in the construction of the synthetic data. With the knowledge that these empirical components in our model were used to create the data, we have the opportunity to examine the effects of changing these parameters of the radiative transfer model in the inversion process. This simulates a likely scenario in a real data application, where the site specific SIOP and substrate parameters may be poorly known, and require estimation from previous work. One of the questions we address in the next section is whether, given that these parameters have no explicit uncertainty representation in our model, the flexibility of the noise estimation in the ML-SG algorithm assists in incorporating the uncertainty and errors introduced by an incorrect empirical parametrisation.


Figure 6.15 – Increased Noise Data - ML-SG Algorithm - Ensemble mean and uncertainty (σ) parameter solutions for Depth (m), q and NAP (mg/L), and marginal posterior distributions for depth and q at Test Point 3.


Figure 6.16 – Influence of varied SIOP sets used in the radiative transfer (RT) forward model - True model parameters at test point 3 used with a correct (Set 1) and incorrect (Set 2) empirical parametrisation.

6.4.2.1 Incorrect SIOP Parameters

In this test we replace the SIOP parameter set used in the construction of the data (Set 1) with a set characterised by different absorption and backscattering components (Set 2 - see Appendix A). With respect to our synthetic data and inverse problem, this means that for a “correct” set of model parameters, the modelled reflectance spectra using the RT model with the “incorrect” SIOP set is now of a different shape and lower magnitude than the synthetic data (figure 6.16). Through inversion of the data with the incorrectly parametrised RT model, we can examine how this error manifests itself in the estimated parameters to create a closer fit to the data, and importantly whether these changes are accommodated and reflected in the uncertainty estimates using the ML-SG algorithm. The mean ensemble solutions for the incorrect SIOP parametrisation are presented in figure 6.17. We see a number of compensations made across all the parameters to reflect what is essentially a change in the RT forward model. In low Kd deeper areas we see an increase in estimated q, with a corresponding decrease in the depth values. In deeper high Kd areas we see a smaller increase in q, with a decrease in depth and a noticeable decrease in CDOM. A decrease of q is observed in shallow high Kd regions, with an increase in NAP prevalent across all regions, and an estimate of CHL clearly


driven by the spatial distribution of Kd. Perhaps most importantly, we do not see a corresponding increase in the estimated uncertainty for any of the model parameters. An illustration of this is shown in the marginal posterior distributions of q and depth at test point 1 (figure 6.17f & g). These show a clearly incorrect estimation of both parameters at this point, although with a smaller uncertainty than that shown in our “correct” SIOP inversion solution (figure 6.11). This is a common feature of the distributions observed in this test utilising the incorrect SIOP parametrisation. This test demonstrates the importance of a correct SIOP parametrisation in the forward RT model. In their study examining the influence of various sources of error and uncertainty on the optimisation-based retrieval of water column constituent concentrations, Ampe et al. (2015) recognise SIOP variability as the highest contributor to inaccuracy and uncertainty. Our results point to the inability of this kind of model error to be reflected in the solution using the ML-SG noise formulation in the likelihood of the rj-McMC approach. One option is to account for the SIOP uncertainty in the structure of the prior, and in Chapter 8 we discuss further avenues to address these issues in the formulation of the ML-SG algorithm.

6.4.2.2 Incorrect Substrate Library

Another source of empirical parametrisation error in the RT model is the selection and definition of substrates, the proportion of which is defined by the model parameter q. To examine this sensitivity we replace the coral spectra used in the synthetic data creation with a seagrass spectra, characterised by a lower albedo and a different spectral shape (figure 6.18). This still provides a dark and a light substrate to the model for inversion, although with a varied spectral magnitude and shape. A test of this nature again simulates a situation which could easily occur in a real data application: the estimation of substrates for a study site that has no available in-situ measurements, or a variation in the substrate spectra arising from measurement and observation procedures. In the mean ensemble solutions (figure 6.19) we can clearly observe the effect of water column depth on the contribution of the substratum reflectance to the model. In deeper areas, where the contribution from the substratum reflectance is low (and thus, so is the contribution from our incorrectly parametrised substrates), we estimate model parameters comparable to those of the correct parametrisation. In shallow areas, where


Figure 6.17 – Varied SIOP Inputs - ML-SG Algorithm - Ensemble mean parameter solutions for CHL (µg/L), CDOM, NAP (mg/L), Depth (m) and q, and marginal posterior distributions for CDOM and depth at Test Point 1.


Figure 6.18 – Input Substrate Library - comparison of seagrass and coral spectra.

the substrate reflectance dominates the model, we see the most notable compensation in a much lower estimated value of CDOM across all shallow regions. Interestingly, we see a robust estimation of q and depth across these shallow areas. As we have still provided a dark and a light substrate from which to draw a proportion q, it would appear that the smaller changes in spectral magnitude and shape have been compensated for primarily through the concentration of CDOM, while the q proportion and depth remain “correct”. This may prove to be a significantly different scenario if we were to replace a “correct” substrate with one of a vastly different shape and magnitude. In that case, a compensation by water column constituents may prove insufficient, with the dominant shallow region influence of the depth parameter used to compensate for such a change in bottom reflectance magnitude. When we examine the uncertainty characteristics of a point in one of the affected shallow water regions (figure 6.19), we see a similar scenario to that of the SIOP parametrisation test. The compensation seen in the estimated CDOM parameter value is not reflected in the uncertainty of the solution. This again shows that the noise formulation in our ML-SG algorithm does not effectively account for this type of empirical parametrisation error in the model, and highlights the increased importance of a correct parametrisation of substrates in the RT model for the estimation of all parameters in shallower waters.


Figure 6.19 – Varied Substrate Inputs - ML-SG Algorithm - Ensemble mean parameter solutions for CHL (µg/L), CDOM, NAP (mg/L), Depth (m) and q, and marginal posterior distributions for CDOM and q at Test Point 4.


Figure 6.20 – Quickbird spectral response filter (a) and derived synthetic Quickbird 4-band data (b) - comparison to the original 36-band synthetic data.

6.4.3 Spectral Resolution Sensitivity

The concept of spectral resolution was discussed earlier in section 2.1.1, where it was defined as the number, position and bandwidth of the spectral channels of the remote sensing instrument across the electromagnetic spectrum (Camps-Valls et al. 2011). In this context, any remotely sensed spectra is a generalised representation of the true spectral signature of the observed object. The higher the resolution, the more closely the spectra will represent the true spectral signature. Even in our synthetic data tests, we have generalised the full 1nm spectral signature of the derived synthetic data to 36 bands across the 400nm to 750nm wavelength range. Studies have shown a clear decline in the spectral discrimination capabilities of sensors in aquatic environments as spectral resolution decreases (Hochberg and Atkinson 2003; Botha et al. 2013), as well as an increased uncertainty in inversion parameter retrievals in shallow water environments (Hedley et al. 2012b). In our probabilistic inversion framework we can examine this effect, and assess if this increased uncertainty is evident and represented effectively in the ML-SG inversion. To produce a data set of reduced spectral resolution we re-sample the original 1nm synthetic data set produced with our input parameters (figure 6.1) and the forward RT model, using the spectral response filter of the 4 band commercial Quickbird sensor (figure 6.20a). Random Gaussian noise of 0.0015 sr⁻¹ is again added to each band of the synthetic


Quickbird data. The resulting decrease in information content from this reduced spectral resolution can be seen clearly in a comparison of the 4 band spectra to the equivalent spectra from the original 36 band synthetic data set (figure 6.20b). Inversion of this 4 band data using the ML-SG algorithm enables a direct evaluation of the effects of this decrease and how it is reflected in the ensemble solutions. The ensemble mean solutions for the synthetic Quickbird data ML-SG inversion are shown in figures 6.21 and 6.22, along with the marginal posteriors for test point 3. We can observe a decrease in the accuracy of the mean solution retrievals for each parameter, along with a large increase in the uncertainty of each parameter solution. The accuracy of retrievals for all parameters follows the same pattern as the previous test, with a higher accuracy of water column concentration retrieval in deeper waters, and vice versa for depth and substrate proportion q. The inaccuracies and uncertainties associated with each estimated parameter are increased significantly from the original 36 band inversion. The marginal posterior parameter distributions for point 3 show NAP to be the only well constrained parameter, with the other four parameters displaying wide, poorly constrained distributions. Although these poorly constrained distributions again raise the issue of the appropriate point estimator to represent the ensemble solution, they also show the ability of the ML-SG algorithm to reflect this uncertainty appropriately in the solution. The poorly constrained solutions from the synthetic Quickbird data are to be expected, and result from the reduced information content in the data and the subsequent non-uniqueness of the inverse problem. In aquatic remote sensing the limitations of multispectral data of this kind have been identified for both physics-based (Botha et al. 2013) and empirical (Vahtmae and Kutser 2007) classification and retrieval applications. Often, a regularisation technique is needed to constrain the solution or classification, such as the contextual editing process described by Mumby et al. (1998), which limits substrate classification solutions to known physical characteristics of the environment. In our method we benefit from the self-regularisation provided by the partition modelling approach. Therefore, whilst the solution retrieval is poorer than that of the higher spectral resolution synthetic data test, we still produce a usable depth solution (figure 6.22a) that avoids pixel-to-pixel variation and accurately reflects the increased uncertainty of the model.
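Band resampling to a broadband sensor amounts to a response-weighted average of the fine-resolution spectrum over each band's spectral response function. A minimal sketch, in which made-up Gaussian response curves stand in for the real Quickbird filters of figure 6.20a (band centres, widths and the toy spectrum are all assumptions):

```python
import numpy as np

wl = np.arange(400, 751)                  # 1 nm grid, 400-750 nm
centres, widths = [485, 560, 660, 725], [35, 40, 30, 25]
# Hypothetical Gaussian response curves standing in for the real
# Quickbird spectral response filters.
resp = np.stack([np.exp(-0.5 * ((wl - c) / w) ** 2)
                 for c, w in zip(centres, widths)])

def resample(spectrum_1nm, response):
    """Response-weighted band averages of a 1 nm spectrum."""
    return response @ spectrum_1nm / response.sum(axis=1)

rng = np.random.default_rng(5)
spectrum = 0.005 + 0.003 * np.sin(wl / 40.0)        # toy 1 nm spectrum
bands4 = resample(spectrum, resp)
bands4_noisy = bands4 + rng.normal(0.0, 0.0015, 4)  # per-band noise
print(bands4.round(5), bands4_noisy.round(5))
```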


Figure 6.21 – Reduced Spectral Resolution data - ML-SG Algorithm - Ensemble mean and uncertainty (σ) parameter solutions for CHL (µg/L), CDOM and NAP (mg/L), and marginal posterior distributions at Test Point 3, from 4-band synthetic Quickbird data.


Figure 6.22 – Reduced Spectral Resolution data - ML-SG Algorithm - Ensemble mean and uncertainty (σ) parameter solutions for q and Depth (m), and marginal posterior distributions at Test Point 3, from 4-band synthetic Quickbird data.


6.5 A Hybrid Algorithm Approach

When dealing with the shallow water inversion problem in a real data application, there often exists some form of a-priori data or knowledge of the environment being studied. In the case of bathymetry, this could take the form of measurements from other survey methods, observations from hydrographic charts or even other coarse scale raster bathymetry products. In principle, the probabilistic framework offers an ideal means of incorporating this additional knowledge into the problem through the prior. Any additional information contributed through the prior acts as a form of regularisation in the problem, certainly more informative than the uniform parameter range priors we have established for our current tests. In practice, however, it is difficult to incorporate this kind of spatially based prior knowledge into a partition based trans-dimensional framework. In our algorithm, as we define a single prior for each parameter across the entire model, spatially based prior information holds little meaning in the continually shifting parametrisation of the spatial domain that occurs as the Voronoi structure changes. What this information does offer is the opportunity to estimate informed starting locations in the parameter space for our initial Voronoi parametrisation, and thus a point in the model space closer to the minima we are seeking. In a similar way to which the segmentation of the data informs the initial structure of the partition model, the segmentation can also be used as a flexible framework for transferring this prior parameter knowledge to the initial parametrisation. In this section we extend this concept by considering how the segmentation, and what it represents, can offer a method to develop an informed starting point for the algorithm, even without other existing data. This hybrid inversion approach develops a starting point for the parametrisation of the ML-SG algorithm utilising optimisation methods, simulating the incorporation of existing bathymetry information. In the standard ML-SG algorithm inversion, the increased dimension of the multi parameter problem and the increased depth parameter range have resulted in some chains and Voronoi cells possibly becoming stuck in local minima, manifesting as elements of the Voronoi partition structure visible in the ensemble uncertainty solution (figure 6.7). We show in this section that the hybrid method for deriving starting points in the parameter space can assist in preventing the algorithm from becoming stuck in these types of local minima.


6.5.1 Utilising a Segment Based Optimisation

If we consider the basis for the use of data segmentation so far in this work, it has been under the hypothesis that each segment represents a spatial grouping of data with similar characteristics. We have regarded this segmentation as a representation of the spatial complexity of the underlying data, which can be used as a guide to inform where more (or less) complexity may be required in our partition model. In this hybrid approach, we draw from both of these assumptions to develop a generalised starting parameter layer from which the segment placed partition nodes can take their initial starting values for the rj-McMC. Firstly we must draw from each segment a spectra which can be considered most representative of the ensemble contained in that segment. Whilst we can clearly see spectral variation at the scales at which we have segmented the data (figure 6.4), we can use a median spectra approach to mitigate against obvious outliers. We select a median spectra from the ensemble of each segment based on a reference wavelength of 520nm. It is important to select an existing spectra from the ensemble, rather than an interpolated spectra (e.g. one formed from medians in each band), as the latter has no valid physics-based relationship within the radiative transfer model. By selecting spectra in this manner the dimension of the data is effectively reduced from 287,182 to 526 spectra. This number of spectra is quite manageable within the optimisation framework of the SAMBUCA (Semi-Analytical Model for Bathymetry, Unmixing and Concentration Assessment) algorithm (Wettle and Brando 2006; Brando et al. 2009) (see Appendix A). The SAMBUCA algorithm was parametrised with identical values and inputs to the ML-SG algorithm, and each spectra processed to produce an optimised parameter set for each segment. These parameter values are mapped back to the individual pixels belonging to the segment to produce a segment based estimated parameter layer. In figure 6.23 we show the estimated bathymetry layer based on the 526 segment based spectra. The estimated bathymetry layer is clearly not an optimal representation of the original bathymetry. The results observed could be affected by a number of factors, including the structure of the segmentation, the individual pixel spectra selected, the reference wavelength used, or even SAMBUCA converging to a local minimum. The important feature of this method is not the accuracy with which the layer defines the final bathymetry, but how we can use this layer to inform the starting points in our probabilistic ML-SG


Figure 6.23 – Segment based SAMBUCA estimation of Depth (m) - Median spectra selected from each segment and inverted using the SAMBUCA optimisation algorithm.

inversion. As the estimated layer has the same structure as the segmentation, when we initialise the ML-SG algorithm by placing each cell node within a segment, we can use this layer to assign an initial bathymetry value, or any other derived parameter value, to that node (see the sketch below). This method offers considerable flexibility, as we are only seeking a reasonable starting value for the parameters in the chain, not altering the structure of the prior. Hence the subjectivity of the segmentation scale, the median spectra selection and the accuracy of the optimisation solution have no detrimental effect on the subsequent probabilistic inversion. Assigning values to the segment structure can even be carried out using existing types of bathymetry data, as described in the introduction to this section. With a better starting model, approximately in the high probability region of parameter space, we hope to achieve an improved parameter retrieval and accelerated convergence of the Markov chain. It seems plausible that even a poor, or highly approximate, estimated layer will provide more information than a random allocation of parameter values across the full range of the prior.
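A sketch of the segment-based initialisation pipeline described in this section: select the median spectrum per segment at the 520 nm reference band, invert each one (here a placeholder function stands in for the SAMBUCA optimisation), and map the result back to pixels so that Voronoi nodes can take informed starting values. All names, shapes and the placeholder inversion are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
n_pix, n_bands, n_seg = 2871, 36, 52             # scaled-down example
spectra = 0.01 * rng.random((n_pix, n_bands))    # per-pixel r_rs spectra
segment = rng.integers(0, n_seg, n_pix)          # segment id per pixel
band_520 = 12                                    # index of the 520 nm band

def sambuca_like(spectrum):
    """Placeholder for a per-spectrum optimisation (e.g. SAMBUCA),
    returning an estimated depth; purely illustrative."""
    return 17.0 * spectrum.mean() / 0.01

start_depth = np.empty(n_pix)
for s in range(n_seg):
    idx = np.flatnonzero(segment == s)
    # Median *existing* spectrum at the reference band, not an
    # interpolated band-wise median, so it stays physically valid.
    med = idx[np.argsort(spectra[idx, band_520])[len(idx) // 2]]
    start_depth[idx] = sambuca_like(spectra[med])

# A Voronoi node placed in segment s now starts at the estimated depth
# of that segment, rather than at a random draw from the prior.
print(start_depth[:5].round(2))
```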


6.5.2 Hybrid Algorithm Results

For the test of the hybrid ML-SG algorithm we run the rj-McMC with the same parameter proposal ranges and scales as the original algorithm test (table 6.1), with the initial starting points for the Voronoi cell depth parameter values defined by the segment-based SAMBUCA estimated bathymetry (figure 6.23). Convergence of the algorithm was assessed using the localised parameter variance method across a selection of chains, as described in the initial ML-SG run of this chapter. These results are presented in figure 6.24, with a burn-in of the first 800,000 samples discarded, and 500,000 samples retained for the ensemble solutions. The first benefit of the hybrid method is evident in the decreased burn-in period (1.2M down to 800,000 samples) required to reach a stable sampling state, as we have commenced the chain at a location in the parameter space of higher posterior probability. One could even argue that the parameter samples in figure 6.24 reach a stable state earlier still in the chain, showing the significant speed-up achieved by the informed parameter starting points. The MAP solutions for the hybrid algorithm and standard ML-SG inversion (figure 6.25) also show that the hybrid algorithm has been very successful in preventing the McMC chain from becoming stuck in local minima, which are still evident in the original ML-SG Voronoi partition as large depth value cells scattered across shallow regions. The MAP solutions also indicate the decrease in the dimension of the hybrid model's partition structure, with a less complex Voronoi model of fewer cells (mean = 2326) in comparison to the original ML-SG model (mean = 4352). This is a result of the hybrid algorithm having fewer birth proposals accepted (2.9% in comparison to 3.5%) in the chain. As we have discussed earlier (section 4.4.5.2), the acceptance term for the birth proposal encourages a larger change of the parametrisation in the newly created cell. In the hybrid algorithm, these large parameter changes that encourage a cell birth are balanced by a decrease in the fit to the data, as we are already close to the true parameter value. This decrease in the number of Voronoi cells using the hybrid algorithm has a positive implication for the dimension of the model in our inversion, again reflected in the decreased burn-in period required. The ensemble mean solutions for the hybrid algorithm show a generally comparable accuracy across all parameters to the original ML-SG algorithm; however, there are some notable features displayed in the solutions for the parameters in figure 6.26. The mean


Figure 6.24 – Parameter sample trace plot ranges for 10 combined chains in the Hybrid ML-SG algorithm run. (Panels: (a) CHL, (b) CDOM, (c) NAP, (d) q, (e) Depth.)


Figure 6.25 – MAP estimate solutions for the ML-SG (a) and Hybrid ML-SG (b) algorithm runs - note the number of solution artefacts present in the ML-SG MAP solution (highlighted with white circles) that have been avoided in the Hybrid solution.


solution for the depth parameter displays the biggest improvement in accuracy over the original ML-SG solution, in particular in the deep regions of the upper left of the image where the original algorithm showed a general overestimation of the depth. There is also an improved resolution of the shallow depths of small coral bommie features surrounded by deeper water, assisted by the initial segment based estimation being constrained to these types of highly delineated features. The most notable feature of these results is the improved convergence and structure of the uncertainty estimates for the depth and q parameters. The residual uncertainty produced by Voronoi cells stuck in local minima has been removed, with the resulting uncertainty smoother across the spatial domain and reflecting more closely the parameter interactions within the RT model that we have previously described. The parameter posterior marginal distributions for point 3 (figure 6.27) reflect what we can observe in the ensemble solutions. The CHL and CDOM estimations are comparable to those of the original ML-SG run, and hence are not presented in figure 6.26. The posterior distributions for depth and q display the decreased uncertainty evident in the ensemble solutions produced by the Hybrid algorithm, in comparison to the original ML-SG posterior distributions for the test point (figure 6.12).
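Returning to the MAP solutions of figure 6.25: a MAP estimate can be obtained from the sampling run by simply tracking the highest-posterior model encountered. A minimal sketch of that bookkeeping, in which a toy log-posterior and a fixed-dimension Metropolis step stand in for the full rj-McMC (both are hypothetical placeholders):

```python
import numpy as np

rng = np.random.default_rng(7)

def log_posterior(model):
    """Hypothetical stand-in for the rj-McMC posterior evaluation."""
    return -np.sum((model - 1.0) ** 2)

model = rng.standard_normal(5)
lp = log_posterior(model)
best_model, best_lp = model.copy(), lp
for _ in range(10_000):
    proposal = model + 0.1 * rng.standard_normal(5)
    lp_new = log_posterior(proposal)
    if np.log(rng.random()) < lp_new - lp:   # Metropolis accept
        model, lp = proposal, lp_new
        if lp > best_lp:                     # running MAP estimate
            best_model, best_lp = model.copy(), lp
print(best_model.round(2), round(best_lp, 3))
```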

6.5.2.1 Joint Posterior Probability Distributions

We have also presented the ensemble solution for NAP in figure 6.26 to illustrate the still present issue we first raised in section 6.3.2: an increased uncertainty of estimated parameters over smaller scale features and detailed regions. Whilst the informed start for the depth parameter has improved the estimation over these small bommie type features, the depth uncertainty is still inconsistently high compared with what we have observed in other regions of shallow water. The mean NAP estimates are also too high over these small bommie features, with a correspondingly high uncertainty. Examining the marginal posteriors for these two parameters at test point 5, over one of these features, clearly shows how these estimates are represented in the ensemble (figure 6.28). Both of the histograms display a strongly one-tailed distribution with a high associated uncertainty and a large 90% HPD interval range. The true value for the depth parameter (figure 6.28a) is identical to the mode of the solution, although the mean overestimates the depth due to the width of the distribution. In the case of the NAP


Figure 6.26 – Hybrid ML-SG Algorithm - Ensemble mean and uncertainty (σ) parameter solutions for Depth (m), q and NAP (mg/L).


Figure 6.27 – Hybrid Algorithm - marginal posterior probability distributions for parameters at Test Point 3 - Mean ensemble solution estimator indicated by the red dashed line, true parameter value indicated by the black dashed line. 90% HPD interval displayed in blue.


Figure 6.28 – Hybrid Algorithm - marginal posterior probability distributions for Depth (a) and NAP (b) at Test Point 5 - Mean ensemble solution estimator indicated by the red dashed line, true parameter value indicated by the black dashed line. 90% HPD interval displayed in blue.

parameter (figure 6.28b), the mode of the solution is at the opposite end of the parameter range, with the mean estimator pulling the estimated value a little closer to the true model value. These results are inconsistent with the physical characteristics of the test point; compare, for instance, the well contained posterior at the similarly shallow point 2 (figure 6.13), which lies in a region of less spatial parameter variability. For this reason we consider that the ensembles produced over these small shallow features arise from non-unique solutions permitted by the generalised partition structure. As the partition samples over the small shallow feature and the surrounding deeper water, a comparable fit to the data could be achieved by a number of parameter value combinations, depending on the proportions of deep and shallow pixels in the Voronoi cell. We can examine the relationship between NAP and depth during this sampling using a multi-parameter posterior density plot, which represents the joint posterior probability of the two parameters (figure 6.29). This joint probability plot shows the region of highest joint probability to be very close to the true values for both parameters, yet far from the mode of the one dimensional NAP posterior PDF ensemble solution. This is a clear example of how viewing the probability of the solution in this two dimensional form offers a different interpretation of the full solution, and may help to highlight the success or failure of the one dimensional solution. The other region of high joint probability along the upper portion of the plot


Figure 6.29 – Joint posterior probability density plot for NAP and Depth at Test Point 5 - Note the region of highest joint probability is located close to the true parameter values but far from the most probable single parameter NAP estimate.


Figure 6.30 – Joint posterior probability density plot for CDOM and Depth at Test Point 5 - Note the close estimate of the true parameter values given by the region of highest joint probability.

indicates possible non-unique solutions, as the partition samples over the neighbouring deep and shallow regions, and the high NAP concentration compensates in varying degrees for increases in the depth. This analysis of joint posterior probability is a powerful feature of the probabilistic ensemble interpretation approach, and enables another level of analysis to extract information from solutions which may otherwise appear unconstrained. In figure 6.30 we apply this analysis to the joint probability of the CDOM and depth parameters at point 5. We can see that the region of highest joint probability is very close to the true values for both parameters, and far from the mean of the depth parameter. Another way to examine the joint posterior probability of the multiple water concentration parameters in the RT model is to sample the spectral characteristics of the attenuation coefficient Kd. As all three water concentration parameters contribute to


Figure 6.31 – Sampling density of Kd spectra at Test Point 5 - Sampling density and posterior probability of the Kd spectra is shown by the colour scale in comparison to the true synthetic model.

Kd, at each step of the chain the combination of these parameters produces an individual spectral Kd. By analysing the posterior density of these Kd solutions in comparison to the true Kd, we are effectively assessing whether the region of highest joint posterior probability for these three parameters occurs at the true value for the water column attenuation. Figure 6.31 shows the sampling distribution of Kd throughout the hybrid algorithm run at point 5. We see that the highest probability density of Kd occurs very close to the true value of Kd, suggesting that the region of highest joint posterior probability of the three water concentrations is indeed close to the true parameter values in the model. This does not preclude the possibility that the highest probability density estimation of Kd is reached by a non-unique and “incorrect” parameter combination of the three water


concentrations, although the absence of another region of high posterior probability in the Kd distribution suggests this is unlikely. The posterior distribution analysis of Kd represents a three-dimensional joint posterior of the three most unconstrained parameters in our inversion, the water column concentrations. The ability to interpret the full ensemble in this way, along with the joint posteriors of other parameter combinations, enables a far greater scope for analysis of the solution. For instance, in the examples above, although we have estimated the true Kd quite closely with the joint posterior solution, the highest probability solution is still slightly lower than that of the true Kd (figure 6.31). We can further investigate this through the joint posteriors of figures 6.29 and 6.30, where we see that the slight underestimation of NAP by the highest probability region of the ensemble is the most likely contributor to this slight underestimation of Kd. Whilst we do not explore the estimation and sensitivities of Kd further in this work, this example provides an illustration of one of the strengths of the probabilistic approach and the flexible ways in which it can be used to infer information from an inversion solution.
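The joint posterior density plots of figures 6.29 and 6.30 are, computationally, just 2-D histograms of the pooled post-burn-in samples. A minimal sketch, with a synthetic correlated ensemble standing in for the (depth, NAP) samples at test point 5:

```python
import numpy as np

rng = np.random.default_rng(8)
# Synthetic stand-in for pooled (depth, NAP) samples at one pixel:
# a trade-off in which higher NAP compensates for extra depth.
depth = 2.0 + 10.0 * rng.random(100_000)
nap = np.clip(0.4 + 0.15 * depth + 0.1 * rng.standard_normal(depth.size),
              0.4, 2.2)

density, d_edges, n_edges = np.histogram2d(depth, nap, bins=60)
i, j = np.unravel_index(np.argmax(density), density.shape)
print("highest joint density near depth ="
      f" {0.5 * (d_edges[i] + d_edges[i + 1]):.2f} m,"
      f" NAP = {0.5 * (n_edges[j] + n_edges[j + 1]):.2f} mg/L")
```

The Kd sampling density of figure 6.31 follows the same pattern, with the per-sample Kd spectra binned per wavelength rather than per parameter pair.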

6.6 Summary

In this chapter, we have extended the shallow water inversion problem to the estimation of five free parameters in the spatial domain of the model, encompassing the concentrations of constituents in the water column, as well as depth and substratum composition. This has significantly increased the dimension of the inverse problem from the previous chapter, as well as introducing the possibility of non-unique minima in the solution, resulting from parameter combinations in the radiative transfer model. With the application of the ML-SG algorithm, we have shown it to be effective in representing the different uncertainty characteristics of each parameter in various physical scenarios of the synthetic data and radiative transfer model. We have shown a clear relationship between the accuracy of parameter retrieval and the depth of the water column, with more accurate depth and substrate composition retrievals as the water column depth decreases, and vice versa for water constituent concentrations. This is comparable to previous studies we have referred to; however, using our probabilistic approach we have also been able to directly infer the uncertainty and level of constraint associated with these parameter estimations. Being able to examine the nature of this uncertainty and the distributions of our parameter ensemble solutions has enabled us to assess and


discuss the appropriate use of ensemble estimators, particularly for parameters that have a non-Gaussian or poorly constrained posterior distribution. This has led us to adopt a Bayesian HPD credible interval method, which we consider more appropriately represents the varied posterior PDFs of the estimated parameters. The well-characterised test-bed environment of our synthetic data has enabled us to isolate and test a number of error sources within the inverse problem. The aim of these tests was two-fold: firstly, to assess the parameter interactions and sensitivities within the inversion as these error sources were introduced; and secondly, to evaluate the ability of the ML-SG algorithm to appropriately represent the uncertainty in the parameter ensemble solutions. For sources of error which directly changed the nature of the data (increased data noise / changed spectral resolution) we have shown the ML-SG algorithm to be very effective in accounting for this form of error. The algorithm shows strong spatial regularisation characteristics for data with high noise levels, and is able to infer a coherent spatial estimation of parameters with only a small comparative increase in the parameter uncertainty estimation. With a decrease in the spectral resolution of the data, the algorithm shows an expected decrease in the accuracy of the parameter retrievals. Most importantly, however, this is also reflected appropriately in the uncertainty characteristics of the parameters inferred from the ensemble solutions. In contrast, for error sources which influence the parametrisation of the radiative transfer forward model (SIOP and substrate variations), the ML-SG algorithm has proven unable to encompass this variation effectively in the ensemble parameter solutions. Whilst we have been able to examine parameter interactions and strong compensations between parameters to account for the introduced radiative transfer model error, the ML-SG algorithm does not reflect an appropriate level of uncertainty in these compensating parameters. Hence, in these error source scenarios we are often presented with ensemble parameter solutions which are quite precise, but overly optimistic. This is not unexpected, as the ML-SG algorithm infers noise based on the fit of the data and model, not from the parametrisation of the forward model itself. What these tests do demonstrate is the importance of a correct parametrisation of the forward model. In Chapter 8, we discuss effective methods of characterising parametrisation uncertainty within the probabilistic inversion framework. Finally, in this chapter, we have addressed the issue of the high-dimensionality of the multiple parameter problem, evident in the Voronoi based local minima that the ML-SG


algorithm has still shown itself susceptible to becoming trapped within. By considering how to incorporate prior bathymetric knowledge into the inversion, we have developed a segment based optimisation starting point algorithm, referred to as the Hybrid ML-SG algorithm. This approach has been devised to utilise the segmentation as a flexible framework for estimating non-random starting points in the parameter space, either through optimisation or from existing data. Here we have used a segment based bathymetry estimation from the SAMBUCA optimisation algorithm as an informed starting point for the ML-SG inversion. In principle this can easily be extended to a full suite of estimated parameters. In our initial tests, the Hybrid algorithm has been effective in improving parameter retrieval and assisting convergence, by commencing the sampling chain closer to the target region of highest posterior probability. We have also shown how the Hybrid approach reduces the possibility of Voronoi cells becoming stuck in local minima, resulting in a correspondingly smoother spatial variation of the parameter ensemble uncertainty solutions, less influenced by the structure of the sampling partition. In the following chapter, we apply our developed algorithms to a real hyperspectral data source located at Lee Stocking Island, Bahamas. This offers the first opportunity to evaluate the developments we have made to date, and how they translate to the sources of variability present in any real world data and scenario. In the selected study area, we are able to compare the results of our probabilistic framework algorithms to extensive previous work using optimisation methods, and to validate results using existing data and previous results.


Chapter 7

Case Study: Lee Stocking Island, Bahamas

7.1 Introduction

In this chapter we present the first application of our developed probabilistic ML-SG algorithms to a real hyperspectral data source over a shallow water coastal region in Lee Stocking Island, Bahamas. This region has been the subject of numerous aquatic remote sensing studies, including a study by Dekker et al. (2011), comparing the performance of a variety of state-of-the-art physics-based shallow water inversion algorithms. By basing our real data case study at this location, we can make a direct comparison and assessment of our developed algorithms. Application to a real data source and study area raises the question of the appropriate empirical parametrisation of our radiative transfer model. In particular, we discuss the implications of inclusion of the multiple substrate types that exist in the study area, and how this presents difficulties in sampling using an McMC approach. We assess the possible effects of this empirical parametrisation on the accuracy and uncertainty of our parameter estimates, and discuss the challenges of convergence presented by the high-dimensional problem space of the real study area and data. Through validation to acoustic depth data and comparison to the various algorithm results presented in Dekker et al. (2011), we show the ability of our algorithm to produce accurate depth estimates in line with the best performing algorithms in that comparative study. We illustrate the benefits of the spatial regularisation and smoothing features of 175


the algorithm, in contrast to the results from a pixel-based inversion approach, focusing specifically on the outputs of the SAMBUCA optimisation algorithm in the study. Also highlighted is the appropriate estimate of uncertainty that is provided for the bathymetry model by the ML-SG algorithm, an inversion output not provided by any of the methods applied in the comparative study. This leads to a discussion on the difficulty of appropriately reflecting the uncertainty of some of the less constrained parameters in the probabilistic framework, with a focus on the combined importance and influence of a correct parametrisation and appropriate prior parameter range. We conclude the chapter with a discussion of both the positive and challenging implications of applying the ML-SG algorithms to the shallow water inversion problem, based on the findings and observations from this first real data case study.

7.2 Background & Study Location

7.2.1 Study Site

The study site is located to the west of Lee Stocking Island (LSI), Bahamas, including an area known as Horseshoe Reef (figure 7.1). The site is characterised by a shallower shoreward region consisting of seagrass beds, clean to biofilmed ooid sands, patch coral reefs, consolidated carbonate substrates, macrophytes and turf algae (Lesser and Mobley 2007; Dekker et al. 2011). Bathymetry of the study area ranges from 1-2m over some of the shallow features in the western shoreward region, to ~13-14m towards the north-eastern oceanic waters, with bottom visibility evident down to ~13m in some regions (Dekker et al. 2011). The optically clear waters around Horseshoe Reef are characterised by relatively low chlorophyll concentrations of less than 0.2 µg/L (Lesser and Mobley 2007). Absorption properties of the water column are dominated by increased CDOM concentrations over the shallower western areas, a result of CDOM being derived from the decay of corals, seagrass and other benthic biota. This concentration varies both spatially and temporally, as high-CDOM water from the shallows exchanges with lower-CDOM oceanic waters (Boss and Zaneveld 2003; Mobley et al. 2005; Dekker et al. 2011).


Figure 7.1 – Horseshoe Reef, Lee Stocking Island, Bahamas - Study site and true colour subset of the PHILLS data shown by the red polygon. Depths range from ~1m in the west of the study site to ~13m in the north-east of the site.

7.2.2 Previous Studies

The Lee Stocking Island region is the location of the Caribbean Marine Research Centre (CMRC), and the surrounding waters have been used as test regions for a variety of remote sensing studies (Boss and Zaneveld 2003; Decho et al. 2003; Dierssen et al. 2003; Louchard et al. 2003; Zhang et al. 2003; Mobley et al. 2005; Lesser and Mobley 2007). In particular, the Horseshoe Reef study site was selected on the basis of previous work that has been completed using physics-based inversion approaches at the site.

To derive information on bathymetry, water optical properties and substratum composition, Lesser and Mobley (2007) applied the Comprehensive Reflectance Inversion based on Spectral matching and Table Lookup (CRISTAL) LUT inversion method developed by Mobley et al. (2005) to the Horseshoe Reef region. Using diver-observed transects as validation data, their study demonstrated the potential for benthic classification using the LUT method, hyperspectral data, and a large library of substrate reflectance spectra.

The CRISTAL method was again applied at Horseshoe Reef as part of an intercomparison study by Dekker et al. (2011), along with a range of other optimisation, LUT and empirical inversion methods, including the SAMBUCA algorithm (table 7.1).

Algorithm | Type of Inversion
Lyzenga (Lyzenga 1981) | Empirical
Hyperspectral Optimisation Process Exemplar model (HOPE) (Lee et al. 1998, 1999, 2001) | Physics-based Optimisation
Bottom Reflectance Un-mixing Computation of the Environment model (BRUCE) (Klonowski et al. 2007) | Physics-based Optimisation
Semi-analytical model for Bathymetry, Un-mixing and Concentration Assessment (SAMBUCA) (Brando et al. 2009) | Physics-based Optimisation
Comprehensive Reflectance Inversion based on Spectral matching and Table Lookup (CRISTAL) (Mobley et al. 2005) | Physics-based LUT
Model Inversion by Adaptive Linearised Look-up Trees (ALLUT) (Hedley et al. 2009) | Physics-based LUT

Table 7.1 – Inversion methods used in the Dekker et al. (2011) intercomparison study at Horseshoe Reef, LSI, Bahamas.

The Dekker et al. (2011) study is the first comprehensive evaluation and comparison of state-of-the-art inversion methods for shallow water remote sensing. By applying the ML-SG algorithm to the same input and validation data, we can directly compare the accuracy and functionality of our algorithm to these well established methods. We can also demonstrate how the ML-SG algorithm is effective in dealing with some of the issues encountered by these methods in this particular case study, and assess the additional aspect of uncertainty estimation that the probabilistic inversion method provides.

7.2.3 The Data

Data over the Horseshoe Reef site was acquired by the airborne Ocean Portable Hyperspectral Imager for Low-Light Spectroscopy (PHILLS) on 17th May 2000. The PHILLS instrument records 128 spectral channels between 400nm and 1000nm at a nominal bandwidth of 4.6nm and a ground sampling pixel size of between 1-2m square (Davis et al. 2002; Mobley et al. 2005). For this study we adopt the data specifications of the Dekker et al. (2011) study: 72 bands, 5nm wide, between 402 and 748nm, with a pixel size of 1.3m. The study area of ~0.4km2 shown in figure 7.1 consists of a total of 240,320 pixels in a 751 pixel by 320 pixel region. Pre-processing of the data, including atmospheric correction using the TAFKAA algorithm (Gao et al. 2000), is detailed in Mobley et al. (2005) and Lesser and Mobley (2007).


Figure 7.2 – Acoustic Depth Sounding Transects - individual observations corrected for tidal height at the time of the PHILLS data acquisition and positioned by differential GPS data.

To relate the PHILLS observations of above-surface remote sensing reflectance (Rrs, sr−1) to the sub-surface remote sensing reflectance (rrs, sr−1) modelled by our radiative transfer model, we use the model developed by Lee et al. (1998), where

\[
R_{rs} \approx \frac{0.5\, r_{rs}}{1 - 1.5\, r_{rs}}.
\tag{7.1}
\]
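Equation 7.1 and its algebraic inverse are simple to apply in practice. The sketch below (function names are ours) shows the forward relation and the inverse used to bring the PHILLS observations into the sub-surface quantity modelled by the radiative transfer model:

```python
import numpy as np

def rrs_to_Rrs(rrs):
    """Sub-surface to above-surface remote sensing reflectance, equation 7.1 (Lee et al. 1998)."""
    return 0.5 * rrs / (1.0 - 1.5 * rrs)

def Rrs_to_rrs(Rrs):
    """Algebraic inverse of equation 7.1: solve Rrs = 0.5*rrs / (1 - 1.5*rrs) for rrs."""
    return Rrs / (0.5 + 1.5 * Rrs)

# Round-trip check on a typical clear-water reflectance spectrum (sr^-1)
rrs = np.array([0.002, 0.005, 0.010])
assert np.allclose(Rrs_to_rrs(rrs_to_Rrs(rrs)), rrs)
```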

Boat-based acoustic survey validation data used in the Dekker et al. (2011) study, tide-corrected to the time of the image acquisition, covers a significant proportion of our study area (figure 7.2). Acquisition of this data is detailed in Lesser and Mobley (2007), including the linear interpolation used to create the bathymetric surface (figure 7.11) with which we compare the ML-SG algorithm outputs later in this chapter.

7.2.4 Parameterising the ML-SG Algorithms

To enable comparison and evaluation with the results from Dekker et al. (2011), we adopt the class of model parametrisation based on the SAMBUCA implementation as described in that study. The SIOP parameters used in the study were from estimations made in the coral reef environment of Heron Island, Australia (Wettle 2005; Wettle and Brando 2006). These correspond to the values of SIOP set 2 detailed in Appendix A.


The parametrisation of the substrates in our model, as compared to the original SAMBUCA optimisation algorithm, presents a problem that must be considered in the framework of our rj-McMC methodology. In the SAMBUCA optimisation algorithm (Appendix A), the model is parametrised with a full library of substrate reflectance spectra. In the case of the Dekker et al. (2011) study, this consisted of a total of 8 reflectance spectra characterising the various benthic substratum types in the LSI region (Mobley et al. 2005). In an optimisation framework, the parameter q represents the best matching set of two substrates from the library, and their respective proportions of q and (1 − q). In the SAMBUCA optimisation this is completed by assessing every combination and proportion of substrate contribution at each step of the model optimisation, and retaining the best fitting two substrates and associated q value (Brando et al. 2009). Not only does this create a combinatorial problem in the optimisation that increases computation time exponentially as the number of substrates in the library increases, but it frames q as a non-continuous variable in terms of an McMC framework.

In our ML-SG proposal recipe we propose perturbations to our model (m → m′) by changing the values of parameters in the model, such as q, based on a Gaussian proposal distribution q(m′ | m). As we have discussed in earlier chapters, the efficient sampling of the McMC algorithm is determined by the size and nature of this proposal distribution. In the non-continuous parameter case, a q value for substrates A and B has no defined relationship in the parameter (or model) space to a q value for substrates C and D. Therefore, a Gaussian type proposal distribution does not work for this form of problem, and establishing a proposal design that is efficient and ensures the reversibility of the rj-McMC algorithm is non-trivial. In Chapter 8 we discuss and propose options to address this issue of multiple substrates and non-continuous parameters as a pointer to future research.

In this chapter we retain the form of our ML-SG algorithm recipe and select two substrate spectra, representing a dark (seagrass) and a light (clean ooid sand) substratum, from the set used in Dekker et al. (2011) (figure 7.3). The methodology for the acquisition and measurement of these spectra is outlined in Mobley et al. (2005). Restricting our substrate model parametrisation to two substrates has potential implications for the ability of our RT model to accurately represent the benthic reflectance component of the model, particularly given that we have acknowledged that the study region contains a far wider range of substrate types.
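To make the continuous two end-member case concrete, the sketch below shows the linear substrate mixture and a Gaussian perturbation of q. The reflection at the prior bounds is one common way of keeping such a proposal symmetric, and is our assumption here rather than a statement of the exact proposal scheme implemented in the ML-SG code:

```python
import numpy as np

rng = np.random.default_rng(42)

def bottom_reflectance(q, r_sand, r_seagrass):
    """Benthic reflectance modelled as a linear mixture of two end-member
    spectra: proportion q of sand and (1 - q) of seagrass."""
    return q * r_sand + (1.0 - q) * r_seagrass

def propose_q(q, sigma=0.04):
    """Gaussian perturbation of the continuous mixing parameter q, with
    reflection at the bounds of the uniform prior [0, 1] (an assumption,
    chosen so the proposal remains symmetric)."""
    q_new = q + sigma * rng.standard_normal()
    while q_new < 0.0 or q_new > 1.0:  # reflect off the prior bounds
        q_new = -q_new if q_new < 0.0 else 2.0 - q_new
    return q_new
```

Because q is continuous, a small perturbation produces a small, well defined change in the modelled bottom reflectance; switching the substrate pair itself has no such local structure, which is the source of the proposal design difficulty described above.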


Figure 7.3 – Substrate reflectance spectra used in the ML-SG algorithm inversions of the LSI PHILLS data - Clean ooid sand and Seagrass as detailed in Mobley et al. (2005) and Dekker et al. (2011).

In Dekker et al. (2011), the optimisation inversion methods each deal with the issue of multiple substrates in a slightly different way. The HOPE algorithm uses the bottom albedo value at 550nm and the 550nm normalised spectral shape of a single bottom type for each pixel. Two possible bottom types are used, sand and seagrass, and these are selected for each pixel using an empirical approach prior to inversion. The BRUCE algorithm separates the benthic substrate spectra into three classes: sediment, coral and vegetation. In a manner similar to SAMBUCA, all linear proportion combinations of these three classes are evaluated in the optimisation, with the best fitting proportion of each class retained to estimate the bottom reflectance.

Considering the structure of these algorithms, our two substrate parametrisation of the SAMBUCA radiative transfer model sits somewhere between the HOPE and BRUCE approaches. Whilst working with only sand and seagrass, we enable a mixture of these two substrates within a pixel. Thus, unlike the HOPE algorithm, we can represent a variation of the spectral shape of the bottom reflectance, not constrained to that of one of the end member spectra. Evaluating how our parametrisation influences the accuracy of our inversion results in comparison to these methods of varying complexity is a key benefit of using the Horseshoe Reef study area for this case study.

Figure 7.4 – Multi-Scale Segmentation of the PHILLS study site data subset - 406 segments created using segmentation weighting criteria of Scale = 10, Shape = 0.1 and Compactness = 0.5.

7.3 Inversion using the ML-SG Algorithms

7.3.1 Segmentation

In the segmentation of the PHILLS data we follow the same iterative process, using the eCognition software package, as that used for our previous synthetic data sets. Weighting in the segmentation was equal across each of the 72 bands in the data, resulting in a segmentation of 406 segments as shown in figure 7.4. We can observe in the segmentation a clear increase in the size of the segments over regions of increased homogeneity, such as the deeper waters in the east of the study area. As expected, an increase in the detail of the segmentation is evident over shallower, more complex regions, particularly in the north-west corner of the image.

7.3.1.1 Optimisation for the Hybrid Algorithm

To develop informed segment based inputs for our Hybrid ML-SG algorithm introduced in section 6.5, we must first extract a representative spectrum from each segment. For this we use a reference wavelength of 520nm to select the median spectrum from the ensemble in each segment. In figure 7.5 we show the full ensemble of spectra from two different segments in the data, and the corresponding representative median spectra.


Figure 7.5 – Spectra ensembles extracted from individual segments in the PHILLS data - ensembles shown in blue and extracted from segments in a deeper (a) and shallower (b) region of the study area. Median representative spectra at a reference wavelength of 520nm shown in red.

We can see, particularly in the ensemble from the segment in a shallower region (figure 7.5b), that the distribution of spectra can vary significantly within a segment. This is to be expected, as we do not expect to capture pixel scale variations in the data with the segmentation scales required for effective implementation of the ML-SG algorithms. What is important to distinguish, however, is that by providing segment based parameter starting values and spatial proposal guidance, we are not preventing the algorithm from resolving these small scale variations if that is indeed what the data supports. Hence, a poor segmentation that captures none of the spatial or spectral variability of the data will at worst prove an inefficient guiding component of the algorithm, one that is in a sense “overruled” by the data in the inversion process.

To create segment based starting layers for the five parameters in the model, we invert each of the 406 extracted median segment spectra through the optimisation process of the SAMBUCA algorithm as detailed in Appendix A. We then map each of the estimated parameters back to each pixel in the segment from which the inverted median spectrum was drawn. This creates the informed starting value models with which the initial Voronoi cell tessellation will be parametrised in the Hybrid ML-SG algorithm run (figure 7.6). We can see from these segment based inversion estimates that there is a large segment-to-segment variation, particularly in the CHL and NAP concentration parameters of the solution.
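As an illustration of the 520nm median selection described above, a minimal sketch (names and data layout are ours) might look like:

```python
import numpy as np

def median_segment_spectrum(spectra, wavelengths, ref_wl=520.0):
    """Pick one representative spectrum per segment: the member of the
    segment's pixel ensemble whose reflectance at the reference wavelength
    is the median of the ensemble at that wavelength.

    spectra     : (n_pixels, n_bands) array of all spectra in one segment
    wavelengths : (n_bands,) band-centre wavelengths in nm
    """
    band = int(np.argmin(np.abs(wavelengths - ref_wl)))  # band nearest 520nm
    order = np.argsort(spectra[:, band])                 # rank pixels at that band
    return spectra[order[len(order) // 2]]               # median-ranked pixel's spectrum
```

Selecting an actual member of the ensemble, rather than a band-by-band median, guarantees the representative spectrum is a physically observed one.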


Figure 7.6 – Segment based SAMBUCA estimated parameter layers: (a) Depth (m), (b) Substrate Proportion (q), (c) CHL (µg/L), (d) CDOM, (e) NAP (mg/L) - estimated from the PHILLS data extracted 520nm median spectra in each segment using the SAMBUCA optimisation inversion algorithm.


Parameter | Vmin | Vmax | Proposal (σ)
CHL (µg/L) | 0 | 0.2 | 0.008
CDOM | 0.001 | 0.021 | 0.004
NAP (mg/L) | 0.01 | 0.3 | 0.03
q | 0 | 1 | 0.04
H (m) | 0 | 14 | 0.07

Table 7.2 – Parameter prior ranges and Gaussian proposal widths for the PHILLS data inversion.

This is in line with what we have learned about our radiative transfer model in the synthetic study of Chapter 6. In those tests we saw that CHL and NAP were consistently the two least constrained parameters in the model. As we discussed, small variations in these parameters, particularly in shallower regions, have a minimal influence on the modelled reflectance spectra. Hence, in the SAMBUCA optimisation this results in a wide range of estimates between segments for these parameters. Despite these variations, it is clear in the depth and substrate proportion estimates (figures 7.6a and 7.6b) that the optimised segment based layers provide a much better parameter starting point than we would achieve with a randomised start within the prior ranges of the parameters. Consequently, we expect a better performance of the McMC algorithm.

7.3.2 Probabilistic Inversion of PHILLS data

The PHILLS data was inverted by implementing both the ML-SG and Hybrid variations of our algorithms, described in earlier chapters of this thesis. The Terrawulf computing cluster (see section 4.2.3) was used to run 72 individual chains in parallel for each of the two algorithm inversions. Results from these individual chains are combined to form a full posterior ensemble solution, which we can use to estimate properties for comparison with the other physics-based inversion solutions in Dekker et al. (2011).

The Gaussian proposal widths for each parameter (table 7.2) were refined in an iterative manner to achieve overall acceptance rates of 21% and 14% for the ML-SG and Hybrid algorithms respectively. As we observed in the synthetic data examples, the high dimension of the inverse problem results in low acceptance rates for the birth and death dimensional change moves in the chain, with a combined rate of 7% for the ML-SG and 3% for the Hybrid algorithms.


Figure 7.7 – LSI PHILLS NE∆R Covariance matrix (Rrs, sr−1) - Estimated using the method described in Appendix B.

In the formulation of the misfit for the likelihood function (equation 5.5), we must now specify a covariance matrix C which reflects the nature of the spectral noise correlation in the data. In these runs we used the noise equivalent reflectance difference (NE∆R) covariance matrix (figure 7.7) estimated from the PHILLS data in Sagar et al. (2014); this paper is presented in full in Appendix B.
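Equation 5.5 is not reproduced here, but assuming the standard multivariate Gaussian form with data covariance C, the per-pixel log-likelihood evaluated at each step of the chain can be sketched as:

```python
import numpy as np

def log_likelihood(r_obs, r_model, C):
    """Log of a multivariate Gaussian likelihood for one observed spectrum,
    with the spectral noise correlation carried by the covariance matrix C
    (here the NE-delta-R covariance of figure 7.7). The log-determinant and
    2*pi terms are omitted: they are constant for a fixed C and cancel in
    the Metropolis-Hastings acceptance ratio."""
    residual = np.asarray(r_obs) - np.asarray(r_model)
    # Solve C x = residual rather than forming C^-1 explicitly (more stable)
    return -0.5 * residual @ np.linalg.solve(C, residual)
```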

7.3.2.1 Stability of the Solutions

To assess the degree of convergence of the algorithm, we again examine the stability of the parameter solutions over the iterations of the chain. Figure 7.8 shows the ranges of the marginal posterior trace plots for three of the estimated parameters in both the ML-SG and Hybrid algorithm tests at a single pixel. We can clearly see the benefit of the informed parameter starting values of the Hybrid algorithm approach in increasing the speed of convergence of the algorithm. In the Hybrid algorithm run we reach a stable solution at around 250,000 sampling iterations, whilst this is not achieved in the ML-SG algorithm until around 400,000 iterations. Retaining 500,000 samples per chain after this burn-in period for each algorithm reduced the processing time from 131hrs for the ML-SG algorithm to 120hrs for the Hybrid algorithm. It should be noted that this is not a linear decrease in the processing time in line with the reduced number of comparative samples in each run.


Figure 7.8 – Parameter sample trace plot ranges for 10 combined chains in the PHILLS standard ML-SG (a, c, e) and Hybrid (b, d, f) algorithms - Burn-in period indicated with the dashed black line. Note the decreased burn-in time required for the Hybrid algorithm.


This is because of the decreased number of cells in the Voronoi partition used in the sampling of the Hybrid algorithm, a feature of the method discussed in section 6.5.2. The processing time for each iteration in the chain is determined by the number of pixels that have been changed in the proposed model, and thus how many must be assessed in the proposed likelihood calculation or stored as a sample. The Hybrid algorithm in this test samples using a tessellation with an average of 1421 cells, whilst the ML-SG algorithm uses an average of 4856 cells. Therefore, at each step of the chain for the Hybrid model a proposed model involves, on average, a larger proportion of pixels in the data. This means that the processing time per sampling iteration increases, resulting in a balancing out of the efficiency benefits gained from the reduced burn-in period.

It is important to note that convergence speed is not the sole benefit of the Hybrid approach. As we have seen in the synthetic data tests, using the informed starting points also assists in avoiding local minima in the solution and decreasing the global misfit. This can be seen in these real data tests by comparing the depth MAP solutions (figure 7.9) for both algorithms, and the improved misfit/likelihood of the Hybrid solution compared to the standard ML-SG algorithm (figure 7.10). The MAP solutions clearly show the numerous local minima results across the smaller cells of the higher dimension partition model of the standard ML-SG algorithm. The Hybrid algorithm MAP solution illustrates the significant benefit that the informed starting parameter layers bring in reaching a region of higher posterior probability, avoiding localised minima and assisting convergence.

This is further illustrated in figure 7.10, where we compare the sampled likelihoods of chains from both of the algorithms. As expected, the Hybrid algorithm begins sampling at a much lower misfit, before reaching a relatively stable state very early in the chain. The standard ML-SG algorithm takes much longer to reach a comparable level of misfit, and even at the later stages of the chain we see this misfit still decreasing as the local minima artefacts in the model slowly get visited in the very high dimensional partition.

Convergence of the algorithm must again be questioned, as clearly the ML-SG algorithm has not reached a stable sampling state. We would consider that this may be an unavoidable by-product of the dimensions of the problem, determined by the number of parameters (5 model parameters and 2 cell locations) and the number of cells in the partition. In the case of the standard ML-SG algorithm run, we are now sampling a problem with an average dimension of over 33,000. It may be the case that we never reach convergence, in a traditional McMC definition, for problems of this size in remote sensing.


Figure 7.9 – Maximum Posterior (MAP) Depth (m) Solutions for the standard ML-SG (a) and Hybrid (b) algorithms - note the increased number of cells and local minima present in the standard ML-SG solution.


Figure 7.10 – Negative Log Likelihood for six selected chains in the standard (red) and Hybrid (blue) ML-SG algorithm runs - Note the improved speed and degree of convergence of the Hybrid algorithm.

In these high dimensional inversion problems, some type of informed starting parameter space, such as that achieved with the Hybrid optimisation algorithm, may be necessary for effective subsequent application of probabilistic inversion methods. Bearing in mind the local solution stability we have shown, we must also assess whether the solutions we can infer from the ensembles of these algorithm runs are sensible given these convergence challenges.
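For reference, pooling the post burn-in samples from the parallel chains into a single ensemble, for one pixel and one parameter, amounts to the following minimal sketch (the actual implementation carries this bookkeeping for every pixel and parameter):

```python
import numpy as np

def pool_chains(chains, burn_in):
    """Discard the burn-in iterations from each parallel chain and pool the
    remainder into a single posterior ensemble.

    chains  : list of (n_iterations,) arrays, one per parallel chain
    burn_in : number of initial samples to discard from each chain
    """
    return np.concatenate([c[burn_in:] for c in chains])

# e.g. 250,000 burn-in for the Hybrid run vs 400,000 for the standard ML-SG:
# ensemble = pool_chains(depth_chains_at_pixel, burn_in=250_000)
```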

7.3.2.2 Ensemble Solutions & Estimators

To examine the posterior solutions of the two algorithms, in figures 7.12 - 7.16 we first present the mean and standard deviation ensemble solutions for each of the five estimated parameters. Unlike the synthetic data studies which we have presented in previous chapters, we are unable to directly compare these estimated solutions to a known true spatial distribution for each parameter. For depth, we can make an initial evaluation using the interpolated bathymetry surface generated from the acoustic survey data, as shown in figure 7.11.


Figure 7.11 – Interpolated Acoustic Bathymetry (m) for the Horseshoe Reef Study Area - linear interpolation procedure from the individual acoustic soundings detailed in Lesser and Mobley (2007).

One of the most distinct differences between the two algorithm solutions is the degree of Voronoi cell driven variance in the uncertainty solutions for the standard ML-SG algorithm. We also saw this increase in the synthetic data tests as we increased the dimension of the problem, and it is related closely to the issues of convergence discussed in the previous section. The absence of these artefacts in the Hybrid uncertainty solutions is another indicator of the improved convergence using this algorithm. Overall uncertainty is higher for all parameters in the standard ML-SG algorithm solutions, suggesting poorer convergence, and we can confirm this through examination of the marginal posterior distributions later in figures 7.18, 7.20 and 7.21.

The ensemble solution of the substrate proportion q is very similar for both algorithms, as is the solution for the CDOM concentration. The CDOM solution corresponds to the observations of the study site made by Boss and Zaneveld (2003) and Dekker et al. (2011), with higher concentrations of CDOM estimated over the shallower areas of the shoreward region. CHL and NAP remain the most poorly constrained parameters, as we have seen in the segment based optimisation and the synthetic data studies.

To assess the ensemble solutions in more detail, we identify in figure 7.17 a selection of test points and transects, covering a range of depths across the study area. From a visual assessment of the mean ensemble depth parameter solutions (in comparison to the interpolated acoustic data) we can see that both algorithms have had difficulty estimating an accurate depth in the darker shallow regions of Horseshoe Reef itself (Point 2), and in the deeper sandy canyon heading east into oceanic waters (Point 3).

Figure 7.12 – Ensemble Depth (m) Solutions derived from the standard and Hybrid ML-SG algorithms - (a) mean and (b) standard deviation solutions for the standard ML-SG; (c) mean and (d) standard deviation solutions for the Hybrid ML-SG.

Figure 7.13 – Ensemble Substrate Proportion (q) Solutions derived from the standard and Hybrid ML-SG algorithms - (a) mean and (b) standard deviation solutions for the standard ML-SG; (c) mean and (d) standard deviation solutions for the Hybrid ML-SG.

Figure 7.14 – Ensemble CHL (µg/L) Solutions derived from the standard and Hybrid ML-SG algorithms - (a) mean and (b) standard deviation solutions for the standard ML-SG; (c) mean and (d) standard deviation solutions for the Hybrid ML-SG.

Figure 7.15 – Ensemble CDOM Solutions derived from the standard and Hybrid ML-SG algorithms - (a) mean and (b) standard deviation solutions for the standard ML-SG; (c) mean and (d) standard deviation solutions for the Hybrid ML-SG.

Figure 7.16 – Ensemble NAP (mg/L) Solutions derived from the standard and Hybrid ML-SG algorithms - (a) mean and (b) standard deviation solutions for the standard ML-SG; (c) mean and (d) standard deviation solutions for the Hybrid ML-SG.


Figure 7.17 – Test Points and Transects located in the Horseshoe Reef Study Area - note the visible data quality and glint variations in the data.

By examining the marginal posterior PDFs at these points we can begin to determine whether these inaccurate estimates are a product of the mean estimator we have used, or whether the true value is not contained within our ensemble solution at all and we need to consider alternate causes. In the depth solution PDFs (figure 7.18) we can see that the mean, shown as the dashed red line, is a reasonable estimator of the depth parameter solution distribution at both of these points. The Hybrid solutions show a lower uncertainty and a smaller HPD credible interval, which we have also inferred from the full image ensemble solutions. At Point 2, located on the low albedo Horseshoe Reef area, the Hybrid algorithm estimates a more accurate depth with a lower uncertainty, although both methods overestimate the true depth. The true depth at Point 2 sits at the lower boundary of the 90% HPD credible interval for both methods, suggesting we have been able to appropriately infer the level of uncertainty of these two algorithm estimates of varied accuracy. In contrast, the posterior depth PDFs at the brighter, deeper location of Point 3 show both algorithms to have underestimated the depth by approximately 2m. Importantly, this is not reflected in the uncertainty of the solutions, with the true depth located well outside the distributions of both the higher uncertainty standard ML-SG and the lower uncertainty Hybrid solutions.
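The 90% HPD credible intervals quoted here are the narrowest intervals containing 90% of the ensemble samples. A minimal sketch of this estimator, assuming a unimodal marginal, is:

```python
import numpy as np

def hpd_interval(samples, mass=0.90):
    """Highest Posterior Density (HPD) credible interval from ensemble
    samples: the narrowest interval containing the requested posterior
    mass. For skewed, non-Gaussian marginals this is more representative
    than a symmetric mean +/- k*sigma interval. Assumes a unimodal
    marginal, so a single contiguous interval is appropriate."""
    x = np.sort(np.asarray(samples))
    n = len(x)
    m = int(np.ceil(mass * n))          # samples inside the interval
    widths = x[m - 1:] - x[:n - m + 1]  # width of every contiguous window of m samples
    i = int(np.argmin(widths))          # narrowest window wins
    return x[i], x[i + m - 1]

# lo, hi = hpd_interval(depth_ensemble_at_point2, mass=0.90)
```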

Figure 7.18 – Marginal posterior distributions for test points where depth has been poorly estimated by the ML-SG algorithms: standard ML-SG at Point 2 (a) and Point 3 (b); Hybrid ML-SG at Point 2 (c) and Point 3 (d) - True depth value based on acoustic data shown by the black line, mean estimated depth shown in red. Extent of the 90% HPD credible interval shown by blue bars.


One criticism that may be levelled at our Hybrid algorithm is that the solution may be highly biased towards the informed starting points for the parameters that we provide. There are two main reasons why we believe we are avoiding this biased sampling and adequately exploring the parameter space: one based on the theoretical design of our algorithm, and one on the results we have observed.

Firstly, when we construct our initial parametrisation using the Voronoi partition, the model for each individual parallel chain is assigned random cell node locations based on the segmentation. This results in a different initial partition model of the spatial domain for each chain in the algorithm run. Thus, when each model draws informed parameter values from the segmentation, we end up with (in the case of the tests in this chapter) 72 different initial starting parameter models and spatial realisations in the parameter space.

Secondly, if we observe the mean solutions for parameters from the Hybrid algorithm run, we see that in many regions the mean solution has moved far from the segment based parameter layers used to initialise the models. In particular, if we examine the mean results for NAP (figure 7.16c) and CHL (figure 7.14c), we see that they are markedly different to the initial segment based model (figure 7.6). Even in the depth parameter solution, we see that the starting point model has produced anomalous starting values in some individual segments (figure 7.6a) that are resolved to a more coherent and smooth mean ensemble solution (figure 7.12c). These results illustrate that, if supported by the data, the Hybrid algorithm is able to explore the parameter space to reach a region of unbiased high posterior probability away from the initial informed starting values.
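A sketch of this initialisation logic for a single parameter layer (names and data layout are ours) is given below; in practice each of the 72 chains receives a different seed:

```python
import numpy as np

def initialise_chain(segment_ids, segment_values, n_nodes, seed):
    """Initialise one parallel chain of the Hybrid ML-SG run: draw random
    Voronoi node locations, then give each node the segment based SAMBUCA
    starting value of the segment it falls in. A different seed per chain
    gives a different initial partition, so the chains start from distinct
    spatial realisations of the same informed starting layers.

    segment_ids    : (ny, nx) integer map of segment membership
    segment_values : 1-D array indexed by segment id, one starting value
                     per segment for the parameter being initialised
    """
    rng = np.random.default_rng(seed)
    ny, nx = segment_ids.shape
    rows = rng.integers(0, ny, size=n_nodes)
    cols = rng.integers(0, nx, size=n_nodes)
    nodes = np.column_stack([rows, cols])
    values = segment_values[segment_ids[rows, cols]]
    return nodes, values
```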

7.3.2.3 Empirical Parameterisation of the RT Model

Both of the test points examined in the previous section raise questions regarding the appropriate empirical parametrisation of the model with SIOPs and substrates; questions that cannot be easily answered without validation data for water constituent concentrations and substrate composition at these points. We can, however, use the findings regarding empirical parametrisation sensitivity from our synthetic data study (section 6.4), and the observations on substrate compositions over Horseshoe Reef from Lesser and Mobley (2007), to examine this issue further.

Point 2 is located directly over the Horseshoe Reef area that was the focus of the benthic habitat mapping assessment of Lesser and Mobley (2007) utilising the CRISTAL inversion algorithm.


As part of this study, diver based observations and photographs were used to classify the reef area using 45 randomly placed 0.25m2 quadrats along three transect lines. Each photographic quadrat was assessed to derive an average percentage cover for each benthic substrate end member over the reef, resulting in an estimated cover of:

• 69.8% Sediment, grass, turf, or macrophytes
• 14.3% Dark sediment or sand and sparse grass
• 13.3% Pure coral
• 2.5% Bare sand

Given that we have only included sand and seagrass in our model parametrisation, this wide variety of substrata present in the study area means there is a high likelihood of this parametrisation manifesting as errors and compensations in the inversion solutions. In our synthetic study (section 6.4.2.2) we saw this incorrect substrate parametrisation manifest predominantly as a variation in the estimated water column constituents. Without knowledge of the true constituent values, and of the substrate and shape of the bottom reflectance at the sample point, it is difficult to directly transfer these findings to this real data case. We can observe a decrease in the NAP concentration over the reef area compared to the surrounding areas (figure 7.16), and an increase in CDOM (figure 7.15) in comparison to the neighbouring deeper waters. This CDOM increase, however, is in line with the increased CDOM over shallower areas observed by Boss and Zaneveld (2003) that we have discussed previously. Therefore, it is difficult to isolate whether the substrate parametrisation has indeed contributed to the overestimation of the depth.

With a simplified parametrisation of substrates over a complex area, we might expect this to be reflected in the uncertainty of the substrate proportion solution (figure 7.19). However, examining the q distribution at sample point 2 shows that we are unable to make this assessment due to the structure of the uniform prior. As we see in figure 7.19b, the distribution of q at point 2 is at the extreme lower bounds of the prior range, reflecting an estimate of almost 100% cover of seagrass. This raises the issue identified by Stark and Tenorio (2010) that we have discussed in other examples of our work: the ability to appropriately reflect uncertainty when a parameter estimate falls at the boundaries of a uniform prior interval.


Figure 7.19 – Marginal posterior distributions for substrate proportion (q) - Illustration of the underestimation of uncertainty when parameter estimates occur at the bounds of the prior ranges at Point 2 (b), shown in comparison to a shallower estimated proportion with higher uncertainty at Point 1 (a). Extent of the 90% HPD credible interval shown by blue bars.

Compare this to the uncertainty of the q distribution at point 1, located in a shallower region with a brighter substrate signal (figure 7.19a). In this scenario we would expect a lower uncertainty in the q distribution, as the substratum is contributing a greater proportion of reflectance to the observed signal. However, due to the truncation of the point 2 distribution at the interval bounds, we instead see a much lower uncertainty at the deeper and darker substratum location, contrary to what we expect and likely an under-representation of the true uncertainty in the model.

If we examine our mean estimated solutions at test point 3, we see a scenario similar to our synthetic study experiment in which we tested an incorrect SIOP parametrisation (section 6.4.2.1). In that example we observed the incorrect parametrisation to cause an under-estimation of depth in a 13m deep region, compensated by an increased NAP concentration and an increase in the proportion of darker substrate estimated over a known sand substratum. This is very similar to what we see at point 3, with an underestimation of the depth by approximately 2m at a true depth of 11m. We also see an increased estimated concentration of NAP compared to the surrounding regions, and a relatively low proportion of sand (figure 7.13) estimated for an area that appears to consist of a bright substrate.


Whilst this is based on visual observation of the deeper canyon region in which the point is located, these results do suggest that there may exist some errors in the SIOP parametrisation. This is not entirely unexpected, as the SIOP set used in our inversion and in the SAMBUCA implementation in Dekker et al. (2011) originates from field work completed in another representative environment, not at the study site (see section 7.2.4).

If there does exist an error in the SIOP parametrisation, it does not manifest in other regions of the study area, where the ensemble mean solutions estimate the true depth quite well. To examine this we present a selection of marginal posterior PDFs in a shallow (figure 7.20) and a deeper (figure 7.21) region of the study area, at points 1 and 4 respectively. We can see at both these points the reduction in uncertainty produced by the Hybrid algorithm, particularly in the deeper waters at point 4. CDOM remains a more highly constrained parameter in the water column compared to NAP, as we observed in our synthetic study applications. In the NAP concentration for point 1 in the Hybrid solution we see the same issue of parameter estimation at the prior bounds, suggesting that we may have an underestimation of uncertainty in this solution.

In summary, we do not see any conclusive evidence across these test points that we can attribute depth estimation inaccuracies to empirical parametrisation errors in the model. At points 2 and 3 we see some indicators that this may be the case, based on descriptions of the environment in the literature, assumptions made from image interpretation, and the results of our synthetic studies. However, at points 1 and 4 we see an accurate retrieval of depth, especially at the deeper sample point where we might expect the influence of an incorrect SIOP parametrisation to be more pronounced. Thus, whilst it remains important to parametrise the radiative transfer model with accurate data based on the study area, we cannot isolate our sub-optimal parametrisation as a driver of the inaccuracies we see in some areas of the estimated model solutions.

7.3.2.4 Uncertainty Estimators for Depth

Across all four test points the marginal PDFs for depth are far less influenced by the prior range boundary effects on uncertainty (figures 7.18, 7.20 and 7.21), a result of knowing with some certainty the upper bounds of the true depths from the existing acoustic data in the region.


Figure 7.20 – Test Point 1 Marginal posterior distributions for depth, CDOM and NAP by the standard (a, c, e) and Hybrid (b, d, f) algorithms - True depth value based on acoustic data shown by the black line, mean estimated depth shown in red. Extent of the 90% HPD credible interval shown by blue bars.


Figure 7.21 – Test Point 4 Marginal posterior distributions for depth, CDOM and NAP by the standard (a, c, e) and Hybrid (b, d, f) algorithms - True depth value based on acoustic data shown by the black line, mean estimated depth shown in red. Extent of the 90% HPD credible interval shown by blue bars.


Figure 7.22 – Pixel-based scatter plots of the Ensemble Mean vs the Ensemble Maximum Posterior Depth (a) and NAP (b) solutions - A better fit to the 1:1 line indicates a closer representation of a normal distribution and improved stability of the parameter estimate.

The distributions of these point depth estimates, particularly for the Hybrid algorithm, are much closer to a Gaussian distribution than those of the less constrained parameters such as NAP. This means that we can utilise the mean and standard deviation as an appropriate estimator of the solution and representation of uncertainty with more confidence for our bathymetry model. We can illustrate and confirm this by comparing the mean and mode depth solutions for each pixel to those of the less constrained NAP solution (figure 7.22), with a very strong correlation between the depth mode and mean solutions evident in the scatter plot.
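The mean versus mode comparison of figure 7.22 can be computed per pixel from the pooled ensemble; a minimal sketch, using a histogram mode, is:

```python
import numpy as np

def ensemble_mean_and_mode(samples, bins=50):
    """Mean and (histogram) mode of one pixel's marginal ensemble. Close
    agreement between the two estimators, as seen for depth in figure 7.22,
    indicates a near-symmetric, well constrained marginal for which the
    mean and standard deviation are reasonable summaries."""
    mean = float(np.mean(samples))
    counts, edges = np.histogram(samples, bins=bins)
    k = int(np.argmax(counts))
    mode = 0.5 * (edges[k] + edges[k + 1])  # centre of the most populated bin
    return mean, mode
```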

Given the local stability of the solutions shown earlier in section 7.3.2.1 and the structure of the marginal depth PDFs shown here, we would consider that even given the global convergence issues highlighted earlier in this chapter, it is appropriate to infer depth parameter estimates from the mean and standard deviations of the ensemble solutions. This is particularly true of the Hybrid algorithm solutions, which have been shown to have a higher degree of convergence and constraint of parameters within the solution.


7.4 Validation and Assessment

In this section we draw on the ensemble mean and standard deviation solutions presented and assessed in the previous section to validate the full model against the available acoustic data, and to make comparison to the inversion results of the methods presented in the Dekker et al. (2011) comparative study. This includes unpublished pixel-based image inversion solutions, produced using the SAMBUCA algorithm as part of that study and supplied through personal communication with the authors.

7.4.1 A Spatially Regularised Solution

In our synthetic data studies, we have seen one of the key benefits of the ML-SG algorithm to be its ability to create a spatially smooth solution whilst maintaining discontinuities where they are supported by the data. This minimises pixel-to-pixel variations in the estimated solutions that may be driven by noise in the data or by non-unique solutions and minima in the inversion process. This kind of pixel-to-pixel variation has been specifically noted in inversion results of PHILLS data in other study regions of Lee Stocking Island (Mobley et al. 2005; Filippi and Kubota 2008). In both of these studies spatial smoothing was applied at a fixed spatial scale, either through the use of a spatial constraint within the optimisation algorithm (Filippi and Kubota 2008), or in a post-processing kernel operation (Mobley et al. 2005). In the ML-SG algorithm this spatial regularisation is a fundamental component of the algorithm, at a spatial scale driven by the data, and realised as an inferred estimate from the full ensemble of solutions.

This feature of our algorithm can be illustrated by comparing our ensemble solution for depth to the pixel based optimisation solution derived using the SAMBUCA algorithm as part of the Dekker et al. (2011) inter-comparison study (figure 7.23). In Dekker et al. (2011) it is noted that there is significant sun glint present in the PHILLS data, most clearly visible in the deeper eastern extents of the study area (see figure 7.17). In the SAMBUCA pixel-based depth solution we can see the pixel-to-pixel variation in the bathymetry generated by these glinted pixels in both the shallow and deeper regions of the study area. To compare the SAMBUCA pixel based solution to our ensemble mean solutions from the Hybrid and standard ML-SG algorithms, we select a subset of the data and present the solutions along with the interpolated acoustic data in figure 7.24.


Figure 7.23 – SAMBUCA pixel-based optimisation depth (m) solution - completed for the comparative study of Dekker et al. (2011), unpublished and supplied via personal communication with the authors.

We can clearly see the pixel scale artefacts in the depth estimation of the pixel based solution, with very shallow pixel solutions being derived as a result of the increase in the spectral reflectance caused by the surface reflected sun glint. The ML-SG solutions illustrate the ability of the algorithm to deal with this form of pixel based artefact or noise, with a much smoother model derived, representative of the interpolated acoustic data surface. Interestingly, there exists a noise artefact from the flight line acquisition in the data, visible in the upper left of the image subset. Whilst this is not visible in the full scale version of the ensemble mean solution (figure 7.12), in this reduced scale subset we see that although the ML-SG and Hybrid algorithms account for and smooth the pixel-based noise, the continuous boundary created in the data by this artefact is recognised and resolved by both algorithms. Even though this is obviously a noise feature, it illustrates the adaptive spatial regularisation characteristics of the algorithm, resolving discontinuities in the data when required.

7.4.2 Accuracy and Uncertainty

One of the key aims of this research is not only to estimate parameters accurately through the inversion process, but to ensure that the inferred uncertainty of those parameters is appropriately reflected in the solutions. As discussed in section 7.3.2.2, we have already identified some key regions in the study area in which the difference between the estimated depth and the acoustic data is not reflected in the uncertainty of the solution, and some possible reasons for this in the empirical parametrisation of the model. In this section we look more holistically at the full depth model to assess the accuracy and uncertainty of our solutions in comparison to the pixel based SAMBUCA model and the acoustic data over the study area.


Figure 7.24 – Subset comparison of the pixel-based SAMBUCA depth solution to the ML-SG algorithms solution - note the visible glint over the shallow waters of the subset and the very shallow artefacts produced in the pixel-based depth solution. Both ML-SG algorithms display strong spatial regularisation and smoothing characteristics to deal with this form of pixel based artefact.

Algorithm | RMSE (m) | r²
BRUCE* | 0.86 | 0.91
Hybrid ML-SG | 0.87 | 0.92
Standard ML-SG | 0.97 | 0.89
HOPE* | 1.12 | 0.85
CRISTAL* | 1.14 | 0.88
SAMBUCA | 1.28 | 0.85
Lyzenga* | 1.68 | 0.72
ALLUT* | 2.36 | 0.81

Table 7.3 – Accuracy evaluation results for the standard and Hybrid ML-SG algorithm solutions and the inversion methods used in Dekker et al. (2011), which are marked with *.

To quantify the accuracy of our depth ensemble solutions, we validate the mean ensemble solution against the individual acoustic depth soundings distributed across the study area as shown in figure 7.2. These are presented for both ML-SG algorithm solutions, along with the SAMBUCA pixel based solution (figure 7.23), and a summary of the root mean square error (RMSE) and correlation coefficient for each solution, in the density scatter plots of figure 7.25. We can see the improved accuracy of both the standard and Hybrid ML-SG solutions over that of the pixel-based SAMBUCA solution, with a noticeable decrease in individual pixel solutions that deviate away from the 1:1 correlation line. This is another indication of the spatial regularisation and robust treatment of the pixel-to-pixel noise in the data by the ML-SG algorithms and model parametrisation. When we compare the validation of the ML-SG solutions to the other empirical, optimisation and LUT methods in Dekker et al. (2011) (table 7.3), we see that the solution for the Hybrid algorithm produces the highest r² value and the second smallest RMSE value (0.87m), very close to the best performing BRUCE solution (0.86m).

Interpretation of these results must also take into account potential relative inaccuracies in the acoustic data, caused by mismatches between acoustic sounding locations and image pixels due to imperfect geolocation (Dekker et al. 2011).
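The accuracy statistics of table 7.3 reduce to the following computation over the co-located estimate/sounding pairs (a sketch; variable names are ours):

```python
import numpy as np

def validate_depths(z_est, z_acoustic):
    """RMSE and squared correlation coefficient between estimated depths
    and the co-located acoustic soundings, as reported in table 7.3."""
    z_est = np.asarray(z_est)
    z_acoustic = np.asarray(z_acoustic)
    rmse = np.sqrt(np.mean((z_est - z_acoustic) ** 2))
    r2 = np.corrcoef(z_est, z_acoustic)[0, 1] ** 2
    return rmse, r2
```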


Figure 7.25 – Density scatter plots of the acoustic data vs the depth solutions for the pixel-based SAMBUCA optimisation (a), Standard ML-SG (b) and Hybrid ML-SG (c) algorithms - Increased density represented in red.


In terms of the absolute accuracy of the acoustic data, Mobley et al. (2005) assessed the cross tracks of a similar acoustic survey used in their study of LSI and found it to be accurate to within 0.1-0.2m for depths of 2-12m.

To examine the uncertainty of the ensemble solutions, we look at transects of the data at the two locations shown in figure 7.17. In these transects we plot the ensemble model solutions against the interpolated acoustic data, along with an inferred uncertainty range of ±2σ (figures 7.26 & 7.27). In transect A (figure 7.26) we see a comparable mean solution from the standard and Hybrid ML-SG algorithms over the shallow western section of the transect. Moving into deeper waters at around the 450 pixel point of the transect, we see the standard ML-SG overestimate the depth, whilst the Hybrid algorithm maintains a close fit to the data out to the maximum depth of 12-13m. Uncertainty of the standard ML-SG solution is significantly higher than that of the Hybrid solution, encompassing any misfit to the validation data at all points of the transect. The lower uncertainty of the Hybrid solution shows the expected increase in deeper waters (as does the standard solution), only failing to encompass the validation data misfit in a small number of locations along the transect.

The comparison of the solutions along transect B (figure 7.27) covers the regions that we earlier identified as containing depth estimation inaccuracies, including the dark substrates of Horseshoe Reef (pixels 400-450) and the deeper light substrate canyon directly east of the reef (pixels 450-540). Both solutions again accurately model the western shallower section of the study area, the anomaly between pixels 0 to 20 being a result of the lack of acoustic coverage of this small shallow feature, which is therefore not represented in the interpolated acoustic surface. The very low uncertainty of the Hybrid solution in this shallow section does fail to encompass the misfit to the acoustic data at some points; however, this difference is so small that we consider it within the errors and uncertainties of the acoustic data itself. Across the Horseshoe Reef section we observe the same overestimation of depth by both solutions which we saw in the ensemble image solutions and marginal posterior PDFs. In this section the Hybrid solution more closely models the true depth with a lower uncertainty, with the increased uncertainty ranges in both solutions appropriately reflecting the increased misfit to the acoustic data. In contrast, both solutions underestimate the depth of the adjacent eastern canyon, with the uncertainty of both solutions not adequately reflecting the respective solution misfit to the true depth.
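One simple way to quantify how well the inferred ±2σ envelopes encompass the validation data along a transect is the empirical coverage fraction; for a well calibrated Gaussian uncertainty this would be roughly 95%:

```python
import numpy as np

def coverage_2sigma(z_true, z_mean, z_std):
    """Fraction of validation points along a transect that fall inside the
    inferred +/- 2 sigma uncertainty envelope of the ensemble solution."""
    z_true, z_mean, z_std = map(np.asarray, (z_true, z_mean, z_std))
    inside = np.abs(z_true - z_mean) <= 2.0 * z_std
    return float(inside.mean())
```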


Figure 7.26 – Transect A results for the standard (a) and Hybrid (b) mean ensemble solutions in comparison to the Acoustic ground truth data - grey region illustrates an uncertainty range of ±2σ.


Figure 7.27 – Transect B results for the standard (a) and Hybrid (b) mean ensemble solutions in comparison to the Acoustic ground truth data - grey region illustrates an uncertainty range of ±2σ.

In the deeper waters past the canyon, we again see a general overestimation of depth by the standard ML-SG algorithm, and a close fit with appropriate uncertainty reflected by the Hybrid algorithm solution.

We see in these transects a clear difference in the nature and size of the uncertainties between the standard and Hybrid ML-SG algorithm solutions. There are two probable causes for this, which relate to the dimension of the model being sampled and the degree of convergence of the algorithm. Firstly, we have discussed earlier in this chapter the lack of convergence of the ML-SG algorithm, evident in the local minima remaining in many of the small Voronoi cells of the MAP solution (figure 7.9). This small scale spatial variation manifests itself in wider uncertainty limits, and produces a similar response shown in the peaks and troughs of the transect solutions.


Secondly, the higher number of Voronoi cells generated and used to sample in the standard ML-SG solution means that when evaluating a misfit to the data in each cell, a smaller number of data observations (pixels) needs to be evaluated. The degree of effective spatial regularisation of the potential pixel-based noise within this subset of pixels therefore decreases along with the number of pixels. As noted by Bodin and Sambridge (2009), as the number of data observations in an evaluated cell decreases, the noise of these observations maps more directly into the inferred solution. This results in an increased estimate of uncertainty, as evident in the standard ML-SG solutions sampled with a higher-dimension Voronoi partition. We demonstrated an extreme example of this effect in the earlier single value test of Chapter 4, where we saw the data noise in a pixel based McMC map directly into the retrieved pixel value of the solution (see section 4.5.1.2).

7.4.3 Comparison to a Pixel-Based Solution

We can use the transect approach to show a direct comparison of the Hybrid ML-SG mean solution to that of the pixel-based SAMBUCA solution and the acoustic data (figure 7.28). The pixel-to-pixel variation in the SAMBUCA solution is clearly apparent in these transects, particularly over the deeper areas of the study area. Causes of this variation may include the pixel based noise artefacts mapping directly into the pixel solution, or non-unique minima solutions being reached in the optimisation process, particularly as depth increases in the eastern extents of the study area. The transect comparison gives one of the clearest visual representations of the benefit, in producing a coherent and accurate bathymetric model, of the spatial regularisation characteristics of the Hybrid ML-SG algorithm.

Of particular note in the comparison of these solution transects is the poor depth recovery shown by the pixel-based SAMBUCA optimisation solution in the problem areas of Horseshoe Reef and the adjacent deeper canyon. If we examine the image based solution of the pixel based SAMBUCA algorithm (figure 7.23), we can clearly see that the optimisation inversion, even with the full range of substrates available to characterise this complex region, has overestimated the depth over the reef to a similar magnitude as the standard and Hybrid ML-SG algorithms. Similarly, the pixel based solution has been unable to retrieve an accurate depth in the adjacent canyon, underestimating the true depth by a similar amount as the ML-SG algorithms.


Figure 7.28 – Comparison of the Hybrid ML-SG mean ensemble solution to the pixel-based SAMBUCA solution and Acoustic Ground Truth Data - shown at transects A (a) and B (b).


In Dekker et al. (2011) a similar transect comparison is completed for all the inversion retrieval solutions, with each algorithm displaying a similar degree of difficulty in retrieving the true depth in these identified problem regions. This is noted in the study; however, the authors make no specific attempt to address why these areas have proved problematic. In a general statement the authors note that retrieval accuracies may be affected by a range of factors, including preprocessing steps such as atmospheric correction, geometric correction and air-water interface corrections (Dekker et al. 2011). This statement is backed up by the results of a study by Goodman et al. (2008), who found that variations in the methods of atmospheric correction and air-water interface correction (such as de-glinting) had the most significant influence on the outputs from a semi-analytical model inversion. Dekker et al. (2011) consider that the common spread of retrieval accuracies across all the compared methods means they have captured the general environmental conditions and empirical parametrisations of the radiative transfer model. They do, however, concede that spatially and temporally matched SIOP and benthic reflectance data may improve the accuracy of retrievals.

7.4.3.1 Assessment of Water Column Constituent Solutions

Assessment of the accuracy of our retrieved water column constituent solutions is difficult, as no validation data exist for the concentrations as parametrised in our form of the radiative transfer model. It is therefore difficult to say whether the spatial variations and ranges shown in our ensemble mean solutions represent the true values, or are compensating for other parameters in the inversion model. As noted by Lesser and Mobley (2007), a physics-based inversion model is more likely to retrieve an incorrect IOP (or water column concentration) than it is to retrieve a poor depth. For example, in deep regions where we see an unexpectedly low substrate uncertainty (sometimes a result of the parameter bound effect we have identified), it is possible that the estimated water column concentrations are compensating to enable the accurate depth retrievals we observe. One way we can assess the combined estimates of our water column constituent concentrations is through comparison of the absorption coefficient a, which describes the rate of attenuation of light in the water column due to absorption by water column constituents (Mobley 1994).


(a) Pixel-based SAMBUCA optimisation solution - completed for the comparative study of Dekker et al. (2011), unpublished and supplied via personal communication with the authors.

(b) Hybrid ML-SG mean ensemble solution.

Figure 7.29 – Absorption Coefficient at 440nm (a440) solutions.

In Dekker et al. (2011), the absorption coefficient a at a reference wavelength of 440nm (a440) is used to compare the water column optical property retrievals of all methods, and to make comparison with field observations of absorption. We can extract a solution estimate of a440 from the SAMBUCA radiative transfer model detailed in appendix A, using our derived water column constituents and the SIOP set used in the inversion. The a440 solution retrieved from the pixel-based SAMBUCA optimisation is shown in figure 7.29, in comparison to the solution retrieved from the Hybrid ML-SG algorithm inversion. We can see a very similar distribution in both a440 solutions, with higher values over the shallowest regions of the study area, dominated by the higher concentrations of CDOM in the water column (Boss and Zaneveld 2003; Dekker et al. 2011).
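Schematically, and with notation assumed here for illustration (the exact SAMBUCA parametrisation is given in appendix A), the total absorption at the reference wavelength is assembled from the retrieved concentrations and the specific absorption coefficients of the SIOP set:

$$a(440) \;=\; a_w(440) \;+\; \mathrm{CHL}\,a^*_{\phi}(440) \;+\; \mathrm{CDOM}\,a^*_{\mathrm{CDOM}}(440) \;+\; \mathrm{NAP}\,a^*_{\mathrm{NAP}}(440),$$

where $a_w$ is the pure water absorption. For a fixed SIOP set, the retrieved concentrations therefore map linearly into the a440 estimate.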


Figure 7.30 – Histogram distribution of Absorption at 440nm (a440) for pixel solutions from the Hybrid ML-SG Algorithm.

The Hybrid solution shows a generally higher absorption value across most areas, and again displays the self-smoothing characteristics of the method in dealing with the pixel-based variation in the data. To compare the derived a440 solution to the algorithm solutions in Dekker et al. (2011), we use the histogram method they employ to assess the distribution of a440 pixel values across the study area (figure 7.30). The histogram distributions for the compared algorithm a440 solutions in Dekker et al. (2011) show a general distribution between 0.001 and 0.18 m⁻¹, with a peak between 0.05 and 0.07 m⁻¹. This is consistent with our solution from the Hybrid algorithm, which shows a peak at approximately 0.075 m⁻¹. For comparison, Dekker et al. (2011) used field ac9 measurements (Zaneveld and Boss 2003; Mobley et al. 2005) which show values in the range of 0.05 to 0.06 m⁻¹ for the study area. Although these confirm the inversion solutions to be in the correct range, the authors advise using the field measurements with caution, as they were acquired either 1 km from the study site at the time of image acquisition, or at the study site on a different day. Bearing this in mind, the results do suggest that we have seen no significant compensation in the water constituent concentration solutions using the Hybrid ML-SG algorithm, with the a440 spatial distribution and ranges close to both the other inversion solution results and the available field data.

7.4.4 Implications & Discussion

The accuracy of the mean depth models estimated by our ML-SG algorithm compares very favourably with the estimates made by the algorithms in the Dekker et al. (2011) study, with the Hybrid ML-SG results producing an RMSE of very similar magnitude to that of the best performing BRUCE algorithm. One of the main drivers behind these positive accuracy results is the strong spatial regularisation characteristic that the partition modelling approach displays. In our synthetic studies we saw this characteristic of the ML-SG algorithm prove very effective in dealing with the stochastic form of pixel-to-pixel noise in the data. Moving to a real data scenario, we find it is also very effective in dealing with deterministic aspects of pixel-to-pixel noise, such as the glint present in the PHILLS LSI image. This is clearly seen when we compare the accuracy of the pixel-based SAMBUCA optimisation model to that of our Hybrid ML-SG algorithm (figure 7.25). This improved accuracy over the pixel-based inversion result could be considered somewhat unexpected if we compare the empirical parametrisation of the substrate component of the two algorithms. As we have discussed, whilst we use only a dark and light pair of substrates in our ML-SG algorithms, the full SAMBUCA optimisation allows all 8 substrates characterising the LSI region to be incorporated. This limited parametrisation does not manifest in a decrease in accuracy for the ML-SG algorithms, and both SAMBUCA and the ML-SG approach have comparable difficulty in resolving the same regions of the study area (as do all algorithms in the comparative study). One possible reason for this may again be the spatial generalisation and regularisation that comes from sampling the data using the partition framework. For each Voronoi cell, we are fitting our modelled data to a large number of observed pixels. We are therefore likely to be sampling a range of pixel-to-pixel variation of substrate types in any cell, even as the partition adapts to the general spatial coherency and complexity of the data. Thus, we are conceptually fitting our estimated model (based on a proportion of dark and light substrate) to an aggregate of the proportions of all the substrates in the cell. It appears that by encompassing the general reflectance magnitude range of the full substrate library (by using a dark and a light substrate), the depth model and algorithm are quite robust to this limited parametrisation, and not as dependent on a pure substrate end member as a pixel-based method is. Of course, this limits the application of our algorithm in terms of the mapping of benthic composition, and in the next chapter we discuss some possible directions to encompass this in future research.
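Conceptually, with $\rho_j(\lambda)$ the reflectances of the $j = 1, \ldots, 8$ library substrates and $p_j$ their (unknown) proportions within a cell (notation assumed for illustration), the two-substrate model is effectively fitting

$$q\,\rho_{\mathrm{light}}(\lambda) + (1-q)\,\rho_{\mathrm{dark}}(\lambda) \;\approx\; \sum_{j=1}^{8} p_j\,\rho_j(\lambda),$$

which is achievable whenever the cell-aggregated substrate reflectance lies within the magnitude range spanned by the dark and light end members.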


It has been difficult in this case study to isolate the effects of any potential parametrisation errors resulting from the non-site-specific SIOP set used. Given that we saw in the synthetic case studies that an incorrect parametrisation is not accounted for in the uncertainty estimates for the bathymetry model, we reiterate the importance of an accurate parametrisation of the radiative transfer model in order to increase confidence in the uncertainty solutions of the ML-SG algorithms. We have also seen in this case study a clear influence of the prior interval ranges on the estimated uncertainty of the parameters. Where we have estimated parameters (most often CHL, NAP and q) at the bounds of the parameter range, we also see a likely underestimation of the uncertainty of those parameters. We could tackle this issue by increasing the bounds of the prior for the water constituent parameters (or depth, if the upper limit is unknown); however, the 0 to 1 limit of the substrate proportion is a key component of our model, and this has clear implications for our ability to estimate an appropriate uncertainty for that parameter. Furthermore, increasing the prior range of the parameters increases the parameter space to be explored, an issue when we are already dealing with the difficulties of convergence in this high-dimensional problem. One of the key findings of Dekker et al. (2011) is that each of the inversion algorithms displays its own strengths and weaknesses, and it is recommended that they be applied on a fit-for-purpose basis. From the observations we have made in this chapter, we consider that this very much applies to the probabilistic ML-SG algorithm approach as well. We have shown that the algorithm has strong spatial regularisation characteristics, and that these result in an accurate and smooth bathymetry model. We have also shown that the uncertainty estimates of the depth are appropriate, and that the algorithm deals robustly with a range of pixel-based data noise and a limited substrate parametrisation when estimating the bathymetry model. Conversely, we have seen in both the real case study and the synthetic data examples that the estimated distributions of the water column parameters (in particular NAP and CHL) are often highly unconstrained. This makes choosing appropriate ensemble estimators for both the parameter values and uncertainty difficult, and potentially misleading in terms of the true parameter value. We have also seen the bounds of the prior ranges of these parameters (and the substrate proportion q) strongly influence the uncertainty estimates for the parameter solutions when they sit at the extremes of the prior bounds.


In line with considering the relative strengths and weaknesses of each algorithm, we must also consider the computational burden of each approach. In section 9.4 we discuss the challenges associated with our algorithm and the comparative computational loads of the algorithms in this study. Whilst it is clear that the ML-SG algorithm is considerably more computationally intensive than any optimisation-based algorithm, this must always be considered in the context of the trade-off between operational requirements and the considerable benefits we have shown the algorithm to provide.

7.5 Summary

In this chapter we have demonstrated, to the best of our knowledge, the first application of a trans-dimensional probabilistic inversion algorithm to remote sensing data, in the form of our ML-SG rj-McMC algorithms. We have tested our algorithms at a shallow water study site located at Lee Stocking Island, Bahamas, using hyperspectral data acquired from the PHILLS sensor on board an aircraft platform. Examining the performance of both the standard and Hybrid versions of our ML-SG algorithm, we have identified and discussed the difficulty of reaching convergence of the rj-McMC in these forms of high-dimensional problems. Using the informed segment-based starting parameter layers produced as part of the Hybrid algorithm process, we have demonstrated an improved speed and degree of convergence in comparison to the standard ML-SG algorithm. This leads us to conclude that this form of informed starting point assistance may indeed be necessary for the scale and dimension of this form of remote sensing problem. However, through assessment of the local parameter solution stability, the structure and constraint of the depth parameter marginal posterior PDFs, and the MAP depth solution, we are confident that we can use the Hybrid ML-SG estimated depth ensemble solutions as an appropriate and sensible estimate of the parameter value and uncertainty. Through validation against acoustic bathymetry observations at the study site we have been able to quantify the accuracy of our estimated mean ensemble bathymetry model. The bathymetry model produced by the Hybrid ML-SG algorithm was shown to fit the validation data with an RMSE of 0.87m, comparable with the best performing of the physics-based inversion methods employed in the cross-comparison study at the site by Dekker et al. (2011). Further evaluation against a pixel-based optimisation solution using the SAMBUCA algorithm shows the considerable benefits of the spatial regularisation characteristics of the ML-SG algorithm in dealing with all forms of pixel-based noise to create an accurate, smooth and spatially coherent bathymetry model.


Comparison of the standard and Hybrid ML-SG solutions clearly shows the decreased uncertainty estimated for the Hybrid depth solution, with the uncertainty solution appropriately reflecting the misfit to the data in the vast majority of locations. The simplified two-substrate parametrisation of our radiative transfer model appears to be dealt with robustly by the structure of the ML-SG algorithms, with no identifiable effect on the accuracy of the retrieved depths in comparison to algorithms with a more comprehensive substrate library parametrisation. Whilst we are confident in the estimations of depth in our inversion algorithm, we have found that the estimated ensemble solutions for the water column constituents are generally highly unconstrained, or have the potential to have their uncertainty underestimated due to the effects of parameter estimation close to the bounds of the prior range. This cause of uncertainty underestimation is also prevalent in the solutions for the substrate proportion q. We also acknowledge that the simple two-substrate parameterisation of our model is an insufficient representation of the substratum variability to enable the use of our algorithm for benthic composition mapping. Based on these observations, we consider the Hybrid ML-SG algorithm to have considerable strengths in retrieving an accurate bathymetry model and associated uncertainty from data with considerable pixel-to-pixel noise. In its current formulation, the algorithm is less successful in producing constrained estimations of water column concentrations, although we are unable to conclude whether this is influenced by site-specific or SIOP parametrisation issues. In the following chapter we deal with concepts and directions for future research, aiming to address some of the issues we have identified in this and earlier chapters. These include the incorporation of multiple substrates, and the use of multiple partition models to assist in the constraint of the water column constituent parameters in the inversion process. We have shown in this chapter the ability of our probabilistic ML-SG algorithm approach to retrieve a robust and accurate model of water column depth and uncertainty, and in the subsequent future research chapter we look to provide directions to extend this capability more effectively to the other components of the shallow water inversion problem.

Chapter 8

Future Directions

8.1 Introduction

In this chapter we identify three key areas in the implementation of our algorithm that we consider would benefit from future research, to increase its effectiveness for remote sensing and shallow water inversion problems. As these future directions stem from findings we have made throughout the development of our ML-SG algorithm, we first take a snapshot of the current state of our algorithm process, to draw together the components which have been developed in various chapters of this thesis. The inclusion of multiple substrate types within the inversion model has been identified as a key issue to be addressed in future development of our probabilistic inversion algorithm. In Chapter 7, this issue was particularly relevant when comparing the ML-SG to other inversion approaches using a comprehensive library of substrates. Even with a limited substrate library the ML-SG algorithm displayed a useful ability to estimate an accurate depth, although we recognise that this limits its application to substrate mapping. In this chapter we explore a simple McMC approach to cast the multiple substrate problem in a probabilistic framework, and present some initial results from this method. The second area for further study is the extension of our algorithm through multiple Voronoi partition models. By replacing the single partition model with individual partitions for each parameter of interest in the inversion, we allow the scale, dimension and spatial complexity of each parameter model to be driven by the data independently. Some promising early results are presented which demonstrate the effectiveness of this idea.


Computational and efficiency issues are highlighted as areas for future work. Finally, we discuss the issue of empirical parameterisation uncertainty, or uncertainty in the forward model. This issue has been highlighted in both the sensitivity studies of Chapter 6 and the Lee Stocking Island study of Chapter 7, and we consider it an important area for future research, as well as for applications of the rj-McMC approach. Whilst we present some initial results in this chapter, we must stress that these are only illustrative concepts at present. Further work is required either to incorporate them into the full ML-SG algorithm framework, in the case of the multiple substrate problem, or to develop a computationally efficient implementation, in the case of the multiple partition model.

8.2 The Current Algorithm

As we look to highlight some future directions for the research issues in our work, it is appropriate to summarise and bring together the algorithm developments we have made to date. To do so, we present the schematic flow of the current best-practice version of the ML-SG Standard and Hybrid algorithms (figure 8.1). This figure was previously presented in Chapter 1 as a visual summary of the components of this thesis. Each of the algorithm processes is identified, and the most relevant section/s of the thesis describing their development and implementation are noted. In the diagram, we have differentiated between the core developments made during the work in this thesis, and procedures that form part of the broader process of our algorithms. To the best of our knowledge, this is the first time that some of these state-of-the-art procedures in remote sensing (e.g. image segmentation) and probabilistic inversion (e.g. parallel McMC chain implementation) have been combined. When integrated with the core innovations in our ML-SG algorithm, we have developed a new approach to remote sensing inversion in a probabilistic framework.


Figure 8.1 – Current best-practice structure and processes of the Standard & Hybrid ML-SG Algorithm - With the core algorithm developments of this thesis, along with a novel combination of remote sensing and probabilistic inversion procedures, we have established a new approach to remote sensing data inversion.


8.3 Future Research Directions

8.3.1 Multiple Substrate Sampling

In the previous chapter we discussed the difficulties of incorporating more than two substrate types in the current formulation of the ML-SG algorithms, due to the non-continuous nature of the substrate proportion variable q when additional combinations of two substrates are available in the model. Whilst these additional combinations can be dealt with in an optimisation framework (e.g. Brando et al. (2009)), determining a valid proposal step between two discrete substrate combinations and proportions in a probabilistic sampling framework is an area that requires further consideration. With these limitations in mind, results obtained for bathymetry estimation using our two-substrate parameterisation were shown to be of comparable or higher accuracy than algorithms using a more complex substrate library; however, this feature of our model does restrict its application to substratum mapping. Additionally, it would be interesting to examine any potential improvements in the bathymetry estimation accuracy of the ML-SG algorithm with a more comprehensive substratum representation. Retrieving the substrate reflectance in shallow water inversion applications is often dealt with as a spectral unmixing problem (e.g. Goodman and Ustin (2007)), as it is in the SAMBUCA combinatorial parameterisation of the problem. By isolating this aspect of the model it is possible to approach spectral unmixing as its own inverse problem, to which we can apply probabilistic methods such as McMC. Eches et al. (2010) use a novel approach to a terrestrial spectral unmixing problem in hyper-spectral imagery, by considering the number of spectral end members that contribute to a measured spectra as an unknown parameter in a trans-dimensional rj-McMC framework. In that study, the dimension of the problem is determined by the number of end member spectra used in each proposed model, and a change of dimension is proposed by a birth (end member addition) or a death (end member removal) move in the algorithm. The authors implement this rj-McMC using a hierarchical Bayesian approach, requiring the sampling of additional hyperparameters in the inversion process. Whilst the results Eches et al. (2010) achieve are encouraging, we would prefer to avoid the complexity of additional parameters in our high-dimensional problem if possible, considering that in the shallow water problem spectral unmixing is only one component in the broader ML-SG rj-McMC framework.


In this section we look to illustrate a straightforward method for spectral unmixing in a probabilistic framework. The intention is to explore how an McMC approach can deal with aspects of this problem, with a view to implementing a more complex substrate parameterisation in the full rj-McMC algorithm in the future.

8.3.1.1 A Simplified rj-McMC

To cast this problem in a probabilistic framework, we implement a simple version of the Metropolis-Hastings McMC to unmix a single observed spectra, described by the following steps (a minimal code sketch of this sampler follows the list):

1. Randomly select $n$ substrates from a library of $N$ substrate end members.

2. Randomly assign proportion values $q_i$ to each of the $a_i$ selected substrates, such that $\sum_{i=1}^{n} q_i = 1$. This is achieved by taking a random deviate of $n-1$ integers between 1 and 100 to define the break points for each of the $q_i$ proportion values. For instance, for $n = 3$ substrates, an $n-1$ draw of $[23, 45]$ would result in proportion values of $q = [0.23, 0.22, 0.55]$.

3. Compute the initial modelled spectra $m$ at each wavelength $\lambda$ based on the model equation $m(\lambda) = \sum_{i=1}^{n} a_i(\lambda)\, q_i$.

4. Determine the likelihood $p(d^{obs} \mid m)$ of the initial model spectra $m$ in comparison to the observed spectra $d^{obs}$ using equation 3.4.

5. Propose a new model $m'$ and likelihood $p(d^{obs} \mid m')$ using steps 1 to 4.

6. Calculate the acceptance term $\alpha$, such that $\alpha(m' \mid m) = \min\left[1, \frac{p(d^{obs} \mid m')}{p(d^{obs} \mid m)}\right]$.

7. Evaluate $\alpha$ against a random number $r$ drawn from a uniform distribution between 0 and 1. If $r < \alpha$, then $m \rightarrow m'$: the new model $m'$ is accepted and becomes the current model for the next iteration of the chain. If $r > \alpha$, then $m'$ is rejected and the current model is unchanged for the next iteration.

8. Sample by repeating steps 5 to 7.
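Below is a minimal Python sketch of this prior-as-proposal sampler. It assumes `library` holds the $N$ end-member spectra as rows sampled at the image bands, and uses an i.i.d. Gaussian log-likelihood with fixed $\sigma$ as a stand-in for equation 3.4; all names are illustrative, not the thesis implementation.

```python
import numpy as np

def propose_model(library, rng):
    """Draw a model from the prior: a random number of end members,
    a random subset of the library, and break-point proportions (steps 1-2)."""
    N = library.shape[0]
    n = rng.integers(1, N + 1)                       # number of substrates
    idx = rng.choice(N, size=n, replace=False)       # which substrates
    breaks = np.sort(rng.integers(1, 100, size=n - 1))
    q = np.diff(np.concatenate(([0], breaks, [100]))) / 100.0
    return idx, q

def model_spectrum(library, idx, q):
    """Linear mixture m(lambda) = sum_i a_i(lambda) q_i (step 3)."""
    return q @ library[idx]

def log_likelihood(d_obs, m, sigma):
    """Gaussian i.i.d. log-likelihood per band (stand-in for equation 3.4)."""
    return -0.5 * np.sum((d_obs - m) ** 2) / sigma**2

def unmix(d_obs, library, sigma=0.02, n_samples=600_000, seed=0):
    """Prior-as-proposal Metropolis-Hastings sampler (steps 5-8)."""
    rng = np.random.default_rng(seed)
    idx, q = propose_model(library, rng)
    logL = log_likelihood(d_obs, model_spectrum(library, idx, q), sigma)
    samples = []
    for _ in range(n_samples):
        idx_p, q_p = propose_model(library, rng)     # independent proposal
        logL_p = log_likelihood(d_obs, model_spectrum(library, idx_p, q_p), sigma)
        # Prior and proposal ratios cancel: accept on the likelihood ratio alone
        if np.log(rng.random()) < logL_p - logL:
            idx, q, logL = idx_p, q_p, logL_p
        samples.append((idx.copy(), q.copy()))       # collect the ensemble
    return samples
```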


In this formulation we are using the same random assignment of the number of substrates and their proportions (steps 1 & 2) both to determine the initial model and to propose new models in the chain. By using the prior distribution $p(m)$ as the proposal distribution $q(m' \mid m)$ in this way, we are able to use the simplified acceptance term in step 6, as the prior and proposal ratios from the full acceptance term (equation 3.8) cancel to unity. This method also enables us to sample varied dimensions of the model (i.e. the number of substrates $n$ in the model) without the need for a specific birth or death proposal step.

Randomly proposing independent models from the parameter space, as distinct from models dependent on the previous model in the chain, is known as independence sampling. As noted by Brooks (1998), independence sampling and importance sampling (such as the random-walk McMC applications in previous chapters) characterise areas of high posterior probability in essentially different ways. An importance sampler samples points of high probability in parameter space by visiting these points regularly, whilst an independence sampler samples these high probability points by remaining there for long periods in the chain. This kind of random sampling has implications for exploring high-dimensional parameter spaces efficiently, both in randomly finding an initial region of high probability, and then effectively sampling around that region. However, in our substrate unmixing problem it provides a simple method to explore the potential of a probabilistic approach when incorporating multiple substrate types.

The use of the prior as the proposal distribution has also been explored as a component of a full rj-McMC inversion (Agostinetti and Malinverno 2010; Dosso et al. 2014). In a recent trans-dimensional inversion study to estimate a geoacoustic profile, Dosso et al. (2014) found that using a prior-based proposal scheme for parameters in a birth move of the algorithm improved convergence efficiency over a regional Gaussian parameter proposal scheme. These results suggest the potential of incorporating this kind of approach into other aspects of the ML-SG algorithm, beyond the substrate unmixing problem we are tackling here.

8.3.1.2 Initial Results

To test our prior-based McMC, we construct a synthetic substrate spectra based on a linear combination of three substrates drawn from the substrate library used in the Lee Stocking Island study of Dekker et al. (2011). In figure 8.2 we show the three selected substrates, representing our library of size N = 3 for this test, and the synthetic combination spectra created using proportions of 34% sand, 29% coral and 37% seagrass.


Figure 8.2 – Substrate library used for the creation of the combination synthetic spectra (shown by the dashed line) - Synthetic spectra consists of a linear combination of 34% Sand, 29% Coral and 37% Seagrass. Substrates drawn from the library of Mobley et al. (2005) as used in Dekker et al. (2011).

The McMC procedure detailed in the previous section was then used to sample the posterior probability distributions of the model parameters: the dimension of the model ($n$, the number of substrates) and the proportion $q_i$ of each substrate, where $i = 1, \ldots, N$. Data noise was considered as i.i.d. for each band in the likelihood calculation (equation 3.4) and set as $\sigma = 0.02$. We can see in figure 8.3 that, as we are independently sampling from the prior ranges of the parameters, there is no burn-in period for the 600,000 collected samples in the traditional sense of a random-walk importance sampling McMC algorithm. In our example, we propose a random model that quickly moves the chain into a region of low misfit and high posterior probability. However, by the very definition of the random proposal this is purely by chance, and finding this region of high probability quickly in a higher-dimensional problem, and sampling it effectively, is by no means guaranteed. On examining the ensemble solutions for our chain, the posterior probability distribution of the model dimension ($n$) was very strongly weighted towards the correct model dimension of three, with a total of 599,974 samples out of 600,000. To examine the posterior distribution of the proportions ($q$) of each of the substrates in the model, we present the marginal and joint marginal posteriors of each substrate combination in figures 8.4 to 8.6.

230

CHAPTER 8. FUTURE DIRECTIONS

Figure 8.3 – Misfit (φ(m)) of the spectral unmixing model run over all iterations of the chain.


We can see that the ensemble of solutions has estimated the true proportion values of each of the three substrates in the model very closely, with the regions of highest probability occurring at the true values for each of the substrates, both individually and in the joint posterior distributions. There is a noticeable increase in the uncertainty of the two darker substrate solutions (coral/seagrass) in comparison to that of the lighter sand substrate. This is consistent with the physics of the problem we are examining, and may be explained by features of the synthetic data model we have created. Firstly, the darker coral and seagrass spectra are of a similar magnitude of reflectance, with the main difference being the increased reflectance and different shape of the coral spectra in the region from 470 to 670nm. Thus, in figure 8.6, we see a linear negative gradient relationship between the two substrates in the model, as the two are sampled in varied proportions within the approximately 66% of the total model that they represent.


Figure 8.4 – Joint posterior solution for proportion of Sand & Coral substrate end-members.

Figure 8.5 – Joint posterior solution for proportion of Sand & Seagrass substrate end-members.


Figure 8.6 – Joint posterior solution for proportion of Coral & Seagrass substrate end-members.

This is also reflected in the joint posteriors of Sand/Coral (figure 8.4) and Sand/Seagrass (figure 8.5), where the small uncertainty of the sand parameter solution is largely independent of the uncertainty of either of the darker substrates. The reduced uncertainty of the sand parameter solution is expected, considering its contribution to the synthetic model spectra, which was created with similar proportions of each end member. As the brightest substrate in the model, the sand spectra is the most dominant and influential on the synthetic spectral shape and magnitude. It is therefore sampled as a very well constrained parameter in the McMC, while the less dominant, darker, and similar spectra of the coral and seagrass produce solutions which are less constrained and display higher uncertainties. We have used this example to show the potential of a probabilistic approach to the multiple substrate unmixing problem, although further testing of this simple approach is required. In particular, we need to examine the efficiency of the prior-as-proposal random approach to the unmixing problem when the size of the substrate library and the dimension of the problem increase.


Related to this increase in library size, further work is also needed in interpreting the ensemble results when only a subset of the full library contributes to the highest probability solution, which frames the inversion as a model selection problem.

8.3.2 A Multiple Voronoi Partition Approach

One of the driving motivations behind our algorithm development throughout this research has been to develop an inversion method which takes advantage of the varying degrees of spatial coherency and complexity observed in remote sensing data. This spatial complexity is of course a product of the underlying environmental parameters, and we have seen specifically in the shallow water inverse problem that these parameters each display different scales of spatial variation based on the natural environment. Earlier, in Chapter 2, we discussed studies in the literature that have proposed approaches to account for these issues of spatial coherency in the shallow water environment. For example, Goodman et al. (2008) proposed a kernel-based approach to constrain the spatial variation of the water column constituent parameters in the inversion whilst allowing a higher scale of spatial variation in the depth model. Filippi and Kubota (2008) describe a method which imposes a smoothing constraint across all parameters in the inversion solution. In our ML-SG algorithms we have developed a flexible spatial parametrisation of the model space that can adapt to the spatial coherency of the underlying data. However, we have seen that the scale of the partition models in our tests is strongly driven by the spatial complexity of the depth (and substrate composition) in the shallow water environment, as it is these parameters that most influence the variation in reflectance in a shallow water environment (Dekker et al. 2011). To further extend our algorithm we outline a new modification which enables each parameter in the inversion to be represented by an individual partition model, with a dimension and scale representative of the spatial complexity of that parameter. In doing so, we aim to allow the water column constituent parameters in particular to be sampled with partition models more suited to their less spatially complex distribution. This helps facilitate a self-regularisation of these parameters, and improves the constraint of the parameter solutions that currently display uncertainties driven by the spatial structure of the depth.


8.3.2.1 Implementation

The fundamental difference in this version of the algorithm is the independent spatial parameterisation of each of the five model parameters using the Voronoi partition model. In the previous versions of our algorithm, we parameterised the spatial domain with a single partition model, attributing each Voronoi cell and node in the partition with values for each of the five model parameters. In this modification we generate five independent partition models, with nodes and cells each attributed with a single parameter value, representing the spatial realisation of that parameter in the current model. Parametrising the spatial domain in this manner offers a number of flexibilities in the implementation of the ML-SG algorithm. Firstly, whilst we evaluate the full model using all the individual parameter partition models, when we propose any move in the algorithm we are only changing the model in that single parameter partition. Hence, we can use the formulation of the single parameter ML-SG algorithm detailed in sections 4.4 & 5.5.1.1. This enables us to choose the way in which we utilise the segmentation for each parameter partition model. For instance, as the segmentation is primarily driven by the spatial complexity of the depth and substrate parameters, it may only make sense to use the segment-guided (SG) approach when proposing node locations in the model for those parameters. The water column parameter node location proposals can be completed using the random approach of the naive version of the algorithm (section 3.4.1), or still with the SG approach if desired (as we have discussed previously, an uninformative segmentation will simply be "over-ruled" by the data). The informed starting point approach of the Hybrid algorithm can also be easily used, with the segment-based optimised parameters mapped to each initial parameter partition model. This flexibility is made possible by the fact that, in each version of our algorithm, the terms concerning the number and placement of nodes do not appear in the final derivation of any of the acceptance term equations. The model proposals of the rj-McMC algorithm are implemented with the following recipe (a minimal sketch of the proposal schedule follows the list):

1. Select a parameter partition model to propose a change to, with uniform random probability from the N partitions.

2. At every odd step of the chain, choose a node at random and propose a parameter value change.

8.3. FUTURE RESEARCH DIRECTIONS

235

3. At every even step of the chain, propose either a cell node move, a birth or a death, with equal probability. For dimensional change proposals (birth/death), the method of node placement or removal (segment-guided/random) is determined by the parameter partition model selected in step 1.
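The following is a minimal sketch of this proposal schedule; the parameter names and the choice of segment-guided parameters are illustrative assumptions, not the thesis implementation.

```python
import random

PARAMS = ["H", "q", "CHL", "CDOM", "NAP"]   # one Voronoi partition per model parameter

def propose_move(step, segment_guided=("H", "q")):
    """Return (partition, move type, node placement scheme) for one
    proposal of the multi-partition recipe."""
    param = random.choice(PARAMS)            # step 1: uniform over the N partitions
    if step % 2 == 1:
        move = "value"                       # step 2: odd steps perturb a node value
    else:
        # step 3: even steps change the partition structure with equal probability
        move = random.choice(["move", "birth", "death"])
    # Dimensional changes use the placement scheme tied to the chosen parameter
    placement = "segment-guided" if param in segment_guided else "random"
    return param, move, placement
```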

8.3.2.2 Efficiency Challenges

This multiple-partition version of the algorithm presents two major challenges in terms of computational and sampling efficiency. The first challenge is a result of the fundamentally different way in which the parameters are represented in the full model. In the previous versions of the algorithm, each node in the single Voronoi partition model was attributed with values for all parameters in the forward model. Using the localised stack algorithm (section 5.4.2) we could determine which partition cells had changed in a proposed model, and which pixels must then be evaluated in the likelihood for that proposed model. Therefore, to determine a modelled spectra for each pixel in a changed partition cell, only a single call to the forward model was required, based on the parameters for that cell. In the multiple partition model, the parameter values at a particular pixel are described by multiple Voronoi partitions, each with their own spatial structure. This means that even though the stack algorithm can still be used to identify the changed pixels of a proposed model, we cannot simply calculate a modelled spectra from a single node-based set of parameters. In the current cell node based formulation of the algorithm, for each pixel we must query each Voronoi partition to determine which cell the pixel is contained within, and what the parameter value of that cell node is. After determining the set of parameter values for the pixel, we must call the forward model for each pixel to calculate a modelled spectra. This does present a computational burden in the current formulation. We can, however, reorganise the structure of the algorithm to move from a node-based to a pixel-based parameter storage framework, significantly improving the efficiency of implementing a multi-partition model. The second challenge concerns the efficiency of sampling using the new recipe and multiple partitions. In the single partition model of the algorithm, proposal changes to the structure of the partition (birth/death/cell moves) involve a change to all parameter models. For example, this means that the probability of proposing a birth move that changes all parameter models at an even step of the chain is 1/3.


In this new recipe, where changes take place on individual parameter partitions, at every even step of the chain there is only a 1/15 chance (1/5 for the parameter selection multiplied by 1/3 for the move type) that a birth move for a particular parameter will be proposed. This means that we have to sample a chain five times as long to propose the equivalent number of partition change moves for each individual parameter as in the single partition algorithm. However, in this case individual (as opposed to collective) parameter perturbations are tested for acceptance in the Markov chain.
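To make the per-pixel query cost concrete, a sketch of the multi-partition lookup is given below, using a nearest-node query (a Voronoi cell is simply the set of pixels nearest its node). The class and function names are illustrative assumptions, not the thesis implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

class ParameterPartition:
    """One Voronoi partition: node locations plus one value per node."""
    def __init__(self, node_xy, node_values):
        self.tree = cKDTree(node_xy)          # nearest node <=> containing cell
        self.values = np.asarray(node_values)

    def values_at(self, pixel_xy):
        _, idx = self.tree.query(pixel_xy)    # cell index for each pixel
        return self.values[idx]

def pixel_parameters(pixel_xy, partitions):
    """Assemble the full parameter set for each pixel by querying every
    partition - the repeated per-pixel lookup discussed above."""
    return {name: p.values_at(pixel_xy) for name, p in partitions.items()}
```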

8.3.2.3 Initial Results

To evaluate the multi-partition approach, we applied the method of the previous section to the same synthetic multiple parameter data, problem and settings described in Chapter 6. We use the segmentation of the multi-parameter synthetic data (section 6.2.1.2) to inform the prior placement and proposals of the partition nodes for the depth (H) and substrate proportion (q), under the single parameter ML-SG formulation of the algorithm (sections 4.4 & 5.5.1.1). The node locations for the water column parameter partition models use the random prior placement and proposal moves of the naive algorithm (section 3.4.1). Parallel chains were run across 72 individual compute nodes, with a burn-in period of 400,000 samples and the following 300,000 samples retained for ensemble solution estimation. The computational challenges of the algorithm formulation are immediately apparent in this test run. Under the node-based parameter storage framework, compute times for the 700,000 samples of the run were in excess of 259 hours, approximately 6 times slower than a single-partition run of comparable length. However, it is incorrect to make a direct comparison of these processing times, not least because we are yet to move the structure of the algorithm to pixel-based storage. In the multi-partition framework, we are sampling the inverse problem with a fundamentally different parameterisation and proposal structure. It is therefore important that we improve the coding efficiency of the algorithm so as to more thoroughly examine the effects this new parameterisation has on algorithm convergence and sampling over longer runs. In figure 8.7 we see the degree of convergence of the algorithm over the time-restricted run of this initial test, with all parameters still clearly displaying changing trends. Despite these challenges, the parameter ensemble solutions for the test run are very encouraging, and confirm some early hypotheses that we made regarding the reasoning behind this approach.



Figure 8.7 – Parameter sample trace plot ranges for 10 combined chains using a Multiple Partition model approach.


(a) Depth MAP Solution (m)

(b) Substrate Proportion (q) MAP Solution

Figure 8.8 – Maximum A Posteriori (MAP) Solutions for Depth and Substrate Proportion - note the increased dimension of the partition model used to characterise the higher spatial complexity of the parameters (see true model values in figure 6.1).

In figure 8.8 we see the posterior ensemble MAP solutions for the two most spatially complex parameters in the model, depth and q, with the detail and number of Voronoi cells reflecting this complexity. In the single-partition versions of the algorithm, this complexity of partition is forced on the more spatially continuous parameters in the model, such as the water column parameters in our synthetic data model. Figure 8.9 shows that, by allowing each parameter to determine its own dimension and structure of partition model, the parsimonious nature of the rj-McMC results in partition models of lower complexity and far fewer cells when sampling parameters of lower spatial complexity. To examine how these partition models of varying dimension influence the ensemble solutions, in figures 8.10 and 8.11 we present the ensemble mean and standard deviation solutions for depth and NAP concentration. We contrast these against the ensemble solutions of the single partition ML-SG algorithm from Chapter 6. We can clearly see that the spatial structure of the depth parameter is resolved comparably by both methods, suggesting that the dimension of the single-partition model is indeed driven by the depth variability, as we have discussed. In comparison, we can see a distinct difference between the two methods in the estimated spatial structure of the less complex NAP parameter (figure 8.11).


(a) CHL (µg/L) MAP Solution

(b) CDOM MAP Solution

(c) NAP (mg/L) MAP Solution

Figure 8.9 – Maximum A Posteriori (MAP) Solutions for Water Column Concentrations - note the decreased dimension of the partition model used to characterise the smoother spatial complexity of the parameters (see true model values in figure 6.1).


(a) Single Partition Mean

(b) Multi-Partition Mean

(c) Single Partition Standard Deviation

(d) Multi-Partition Standard Deviation

(e) Single Partition Marginal Posterior at Test Point 5

(f) Multi-Partition Marginal Posterior at Test Point 5

Figure 8.10 – Comparison of Ensemble Solutions for Depth (m) using the multi-partition and single partition parameterisations.


(a) Single Partition Mean

(b) Multi-Partition Mean

(c) Single Partition Standard Deviation

(d) Multi-Partition Standard Deviation

(e) Single Partition Marginal Posterior at Test Point 5

(f) Multi-Partition Marginal Posterior at Test Point 5

Figure 8.11 – Comparison of Ensemble Solutions for NAP (mg/L) using the multi-partition and single partition parameterisations.


In the single partition method, the large number of smaller cells allows variability in the NAP parameter to be mapped at scales driven by the complexity of the depth layer. In particular, this results in high variability over small shallow features, with the water column parameters becoming highly unconstrained as the water column becomes shallower. In the multi-partition method, where the spatial distribution of the NAP parameter itself determines the dimension of the model, we see a much smoother estimated mean ensemble solution, with the variance less driven by the individual depth-related features. The larger partition cells reflect the lower spatial complexity of the water column parameter, and act as a form of spatial regularisation for the parameter over these smaller scale depth features that could otherwise contribute to increased uncertainties. We can see this effect not only in the ensemble image solutions, but also if we examine the marginal posteriors of test point 5 (figure 8.11), a location which we highlighted in our analysis of the results in section 6.5.2.1. This point is located over one of the small shallow features we have identified as problem areas in the single-partition method results. The marginal posterior of the single partition method shows this variability, with a large uncertainty and range in the estimated parameter. In comparison, the multi-partition marginal posterior at this point shows a much more constrained solution, suggesting the spatial constraints of the simpler, more representative parameter-based partition model are effective in dampening the uncertainties driven by the depth variation. Although the mean parameter estimate for the point is still not at the true value of the parameter, we consider that the algorithm in this run is still converging, and that all parameters in the model are moving towards the region of highest posterior probability (see figure 8.7). The developments proposed in this section are at the initial stage of testing, and further research is required to tackle some of the computational and efficiency issues discussed. However, the results of this initial work are encouraging, and we consider it a significant development path to be explored for multiple parameter inversion problems in remote sensing.

8.3.3 Uncertainty in the Empirical Parameterisation

A recurring issue we have seen in our tests is the influence of errors or uncertainties in the forward model on the posterior ensemble solutions. In Chapter 6 we examined this by changing the empirical inputs to the radiative transfer model from those which we used to create the synthetic data.


In Chapter 7, this problem was shown to be more difficult to isolate, as the empirical parameterisation of the model is completed using the best available data for the study location.

The empirical parameters in our radiative transfer model are also measurements, and therefore have a level of uncertainty, even if it is unknown or difficult to characterise. Ideally, in the probabilistic framework we should try to encompass this uncertainty in the inversion, along with the data noise, and hence have it reflected in the resulting ensemble solution. One of the ways this has been tackled in the literature is through the use of Hierarchical Bayes (see for example Gelman et al. (2003)). Through the use of hyperparameters which are included in the inversion process, Hierarchical Bayes has been shown to be an effective method of accounting for data errors in the form of unknown standard deviations and correlations between errors. Such approaches have been applied in both fixed and trans-dimensional inverse problems (Malinverno and Briggs 2004; Bodin et al. 2012a).
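As an illustration of the hierarchical idea, the sketch below treats the noise standard deviation as a sampled hyperparameter with a uniform prior. The bounds, step size and names are illustrative assumptions, and this is not the ML formulation adopted in Chapter 5.

```python
import numpy as np

def log_likelihood(d_obs, d_mod, sigma):
    """Gaussian log-likelihood with unknown sigma: the normalisation term
    no longer cancels between models, so it must be kept."""
    n = d_obs.size
    return -n * np.log(sigma) - 0.5 * np.sum((d_obs - d_mod) ** 2) / sigma**2

def update_sigma(d_obs, d_mod, sigma, rng, step=0.005, lo=1e-4, hi=0.1):
    """One Metropolis update of the noise hyperparameter, with a uniform
    prior on [lo, hi] (illustrative bounds)."""
    sigma_new = sigma + step * rng.standard_normal()
    if not (lo < sigma_new < hi):
        return sigma                      # zero prior probability: reject
    log_alpha = (log_likelihood(d_obs, d_mod, sigma_new)
                 - log_likelihood(d_obs, d_mod, sigma))
    return sigma_new if np.log(rng.random()) < log_alpha else sigma
```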

The Hierarchical Bayes approach was an option considered when looking to characterise the data noise in Chapter 5. We instead adopted a maximum likelihood (ML) approach, which conveniently handles the extra parameters required by the Hierarchical Bayes approach. Characterising the errors in the forward model parameters more effectively, in a form in which they can be reflected appropriately by the ML or Hierarchical Bayes approach, is an area that would benefit from future research.

Other research directions that could be pursued in this area involve better understanding the influences and accuracies of the range of empirical parameters used in the forward radiative transfer model. For example, by understanding the scale of influence of each empirical parameter on the estimated solution, we could better focus on which parameters in the radiative transfer model are most crucial to estimate correctly and accurately. Studies such as that conducted by Gerardino-Neira et al. (2008) can provide important information regarding the sensitivities of individual empirical parameters and their influence on the full inversion. This information may even provide pointers for field measurement processes and the estimation of SIOPs for regions with unknown properties.


8.4 Summary

In this chapter we have summarised the algorithm development component of the thesis, providing a snapshot of the current best-practice version of the ML-SG algorithm and outlining areas which are important to pursue in future research. In applying the trans-dimensional inversion approach to the high dimensions of the shallow water remote sensing problem, we have identified a number of challenges. By outlining some potential algorithm refinements and processes, we have provided a foundation on which the issues of multiple substrates and the varied spatial complexity of individual parameters may be tackled in a probabilistic framework. In a simple McMC implementation we have used a prior-as-proposal formulation to demonstrate how multiple substrates might be encompassed in a probabilistic estimation of bottom reflectance. Initial results from a spectral unmixing problem using three substrates show the method accurately resolves both the dimension of the problem and the true values and uncertainties of each substrate proportion in the model. Whilst these results are promising, they are intended to point the way for future research, and further work is needed to assess the efficiency of the method with a larger substrate library and in the full shallow water inversion problem. By proposing a multi-partition model approach to parameterising the spatial domain, we have demonstrated a method that allows the scale and dimension of each parameter model in the inversion to be determined independently. We feel this modification is significant, as it allows the self-regularisation and natural parsimonious characteristics of the rj-McMC method to be reflected in all parameter solutions, rather than driven by the dominant parameters in the forward model. From initial results we can already see the smoother spatial distributions of the water column concentration parameters more appropriately reflected in the estimated ensemble solutions. Using multiple partitions does negate some of the computational benefits of having all parameter values assigned to each node in the single partition, and this initial test has clearly highlighted new computational and efficiency challenges. While these challenges undoubtedly need to be addressed, the multi-partition approach is clearly an encouraging research direction to be explored.

Chapter 9

Synthesis and Conclusions

In this thesis we have cast the optical remote sensing inversion problem in a probabilistic framework, and developed the first application of a trans-dimensional inversion approach to raster-based remote sensing image data. In this chapter we summarise the major achievements, and synthesise our findings into three core components. We also present a critique of our method and consider the challenges that remain in the application of the algorithm. A key issue that we have needed to address in this work is the challenge of effectively and efficiently applying probabilistic inversion methods to the high dimensions of inverse problems in remote sensing. This issue has resulted in remote sensing inverse problems predominantly being addressed with single solution optimisation inversion methods. These methods can be ill-suited to the non-uniqueness of the inverse problem, and do not offer the benefits of uncertainty analysis inherent to an ensemble-based probabilistic approach. To address this problem we have investigated the use of a reversible jump McMC algorithm, coupled with an adaptive spatial partition modelling approach, a technique previously developed for geophysical data inversion. In applying this method to remote sensing data we encounter a dimension of inverse problem and data size far larger than any previous applications of the rj-McMC that we are aware of, and we have developed a number of innovations in this thesis to create an original algorithm to tackle this specific type of data and inversion. In the first instance we investigated the ability of the algorithm and Voronoi partitioning approach to resolve varied spatial features in a simple single value raster image retrieval test. Even at this reduced scale of image problem, we observed the difficulty the partition model encountered in adapting to and resolving smaller scale complex features in the data.


Utilising image segmentation techniques, we developed a new method informed by the spatial coherency of the data, called the segment-guided (SG) algorithm. We showed that our method displayed improved retrieval of complex spatial features, and assisted in the spatial adaptation and convergence of the algorithm. We then introduced the physics-based shallow water inversion problem which we have used as a foundation for the remaining tests and developments in this work. Using a derived bathymetry model representative of the type of spatial complexity that would be encountered in a coral reef environment, we constructed a synthetic hyper-spectral remote sensing image data set. This image data was inverted using the SG algorithm to retrieve the depth model, with our method showing an enhanced capability to resolve fine scale spatial variability in the data. To jointly address the issue of unknown data noise and the uncertainty of fitting a partition model to the highly complex data, we developed a maximum likelihood noise estimation technique (ML-SG) to be incorporated into the algorithm. We showed clear improvements in the accuracy of the parameter retrieval and the model uncertainties estimated with the ML-SG version of the algorithm. Increasing the dimension of the problem, we created a synthetic data set in which all five parameters in the shallow water radiative transfer model were allowed to vary spatially. We adapted the structure of the ML-SG algorithm for the estimation of multiple parameters, and demonstrated its ability to estimate parameter models each with their own independent spatial scale of variation. We also showed the potential of the solution ensembles produced by the method to be used to examine parameter sensitivities and uncertainties in the inversion problem. In these tests we showed the ML-SG algorithm to be very effective in encompassing noise in the data, but not suited to addressing theory errors or empirical parameterisation errors in the forward model. To improve the efficiency of the algorithm, and to minimise the potential for non-unique solutions in the multi-parameter problem, we introduced an extension to the ML-SG algorithm which we called the Hybrid algorithm. The aim of this approach was to enable parameter starting points for the algorithm to be estimated using supplementary data observations or an optimisation method, both based on the object-based segmentation. In the multi-parameter synthetic data case we showed that, by starting the chains in a region of higher posterior probability, we saw a large improvement in the degree of convergence and parameter retrieval accuracy. Finally, we tested the ML-SG and Hybrid algorithms in a real hyper-spectral data case study in the shallow water environment of Lee Stocking Island, Bahamas.


In comparison to a suite of optimisation inversion algorithms applied to the same data, we showed the ML-SG and Hybrid algorithms to produce a bathymetry model of comparable or improved accuracy relative to the best performing optimisation algorithms in the study of Dekker et al. (2011). Of particular note was the performance of our algorithms against the pixel-based optimisation inversion of the SAMBUCA algorithm, from which our radiative transfer forward model is taken. Our methods show a capability to constrain the pixel-to-pixel noise in the data, producing a smooth bathymetry model without the spurious depth features generated in the pixel-based optimisation solution. This produces a more accurate and usable bathymetry product, complete with depth uncertainty estimates that we have shown to accurately reflect the uncertainty in the parameter model.

9.1 Using Spatial Coherency to Assist the Inversion

One of the points of innovation in this thesis is the manner in which we have utilised the spatial coherency and complexity characteristics of the data as a guiding framework for the inversion process. We have shown our approach to be effective, and perhaps essential, in dealing with the very high dimensions of a physics-based remote sensing image inverse problem in a probabilistic framework. This has been achieved through novel developments to a trans-dimensional McMC partition-based inversion, which allow these spatial characteristics of the data to play a key role in guiding and informing the inversion at different stages of the process.

9.1.1 Tackling High Dimensions

One of the keys to our approach was to recognise that the underlying physical models which we are trying to retrieve in a remote sensing inverse problem vary at different spatial scales across an image. This varied structure is reflected in the data, and is inevitably at a coarser and more irregular scale than the native pixel-based resolution of the data. In theory, this situation lends itself well to the adaptive spatial partitioning features of the rj-McMC method, such as that described by Bodin and Sambridge (2009). In this type of algorithm, modifications to the structure and dimension of a Voronoi partition model are made using an McMC process, by proposing the placement, removal and movement of cell nodes randomly across the spatial domain during the Markov chain.


This approach has been shown by a number of authors to be an effective way of dealing with a spatial problem of unknown dimensions, with the partition model adapting to the underlying dimension of the parameters and the data (Stephenson et al. 2004; Bodin and Sambridge 2009; Hopcroft et al. 2009).

In our image-based problem, the dimension of the data and the underlying model is significantly larger than in the studies which have successfully applied this approach. Even in the simple single parameter recovery tests of Chapter 4, this results in a partition model dimension of approximately 400 cells, compared with the approximately 12 cells used in the similar single parameter retrieval problem of Bodin and Sambridge (2009). We have shown that working in this very high dimensional space presents difficulties for the random proposals of the standard rj-McMC, with the partition model unable to adapt efficiently and effectively to the more spatially complex regions of the data.

To deal with this issue we have developed a novel version of the algorithm called the segment-guided (SG) rj-McMC, which is based on the structure of an object-based segmentation of the remote sensing image data. Object-based segmentation is a method of characterising the spatial coherency of the data, creating a representation of the different scales of data complexity across the image. Using this segmentation structure, we have shown that we can provide guidance at the initialisation stage of the algorithm, and at trans-dimensional steps in the Markov chain. This guidance enables the algorithm to reach a region of higher posterior probability in the parameter space and improves the sampling efficiency and convergence of the rj-McMC algorithm.

We have shown in tests on synthetic data that the SG algorithm greatly improves the accuracy of the parameter retrieval in spatially complex regions of the data when compared to the standard rj-McMC method. In all of our tests, we can clearly observe the Voronoi partition structure adapting more effectively to the spatial structure of the data, with smaller cells allowing the spatial variability of the model parameters to be realised in these complex regions. As the scale, complexity and number of parameters in our data increase through the chapters of the thesis, we have shown that our SG algorithm framework is essential in dealing with the increasing dimensions of the problem whilst maintaining the parsimonious nature and benefits of the trans-dimensional rj-McMC inversion approach.
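To make the mechanics of this concrete, the following is a minimal, self-contained sketch of a single trans-dimensional step over a piecewise-constant Voronoi model, assuming a Gaussian likelihood, a uniform prior and symmetric proposals (under which the remaining reversible-jump ratio terms cancel; see Green 1995). The function names, prior bounds and simplified acceptance rule are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def assign_cells(nodes, px, py):
    """Nearest-node (Voronoi) assignment of pixel coordinates to cells."""
    d2 = (px[:, None] - nodes[:, 0]) ** 2 + (py[:, None] - nodes[:, 1]) ** 2
    return np.argmin(d2, axis=1)

def log_likelihood(nodes, values, px, py, data, sigma):
    """Gaussian misfit of a piecewise-constant partition model to the data."""
    pred = values[assign_cells(nodes, px, py)]
    return -0.5 * np.sum((data - pred) ** 2) / sigma ** 2

def rj_step(nodes, values, px, py, data, sigma, bounds=(0.0, 25.0)):
    """One birth/death/move/value proposal with a Metropolis acceptance."""
    kind = rng.choice(["birth", "death", "move", "value"])
    n2, v2 = nodes.copy(), values.copy()
    if kind == "birth":
        # standard rj-McMC: a uniform location over the domain; the SG
        # variant instead draws this location from within a randomly
        # chosen segmentation object, biasing births to coherent regions
        xy = rng.uniform([px.min(), py.min()], [px.max(), py.max()])
        n2 = np.vstack([n2, xy])
        v2 = np.append(v2, rng.uniform(*bounds))
    elif kind == "death" and len(n2) > 1:
        i = rng.integers(len(n2))
        n2, v2 = np.delete(n2, i, axis=0), np.delete(v2, i)
    elif kind == "move":
        n2[rng.integers(len(n2))] += rng.normal(0.0, 2.0, size=2)
    else:
        v2[rng.integers(len(v2))] += rng.normal(0.0, 0.5)
    dll = (log_likelihood(n2, v2, px, py, data, sigma)
           - log_likelihood(nodes, values, px, py, data, sigma))
    return (n2, v2) if np.log(rng.uniform()) < dll else (nodes, values)
```

The parsimony described above emerges naturally from this acceptance rule: a birth that does not improve the fit enough is rejected, so cells proliferate only where the data demand them.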

9.1.2 Informing the Inversion

By developing the Hybrid algorithm (Chapter 6) we have extended the utility of the object-based segmentation, adding a further component to the SG algorithm to deal with the high dimensional challenges of a multi-parameter remote sensing inverse problem. We have shown that the segmentation structure can be used as a flexible tool to create informed starting parameter value layers for the initialisation of the algorithm. The flexibility of the algorithm comes from the ability to determine the segment-based parameter layers with the method, or data, best suited to the study area or problem at hand. The segmentation can be used to sample distributed field data, or, as in the cases illustrated in this thesis, to aggregate the pixel-based data observations to create a segment-based set of spectra. In our tests we have shown how an optimisation inversion can be applied to this set of spectra, with the resulting parameter estimates mapped back to the segment objects to create the starting parameter layers for the rj-McMC initialisation. Applying the optimisation at this stage is quite efficient, as we only have a set of spectra of the same size as the number of segments. In the case of the LSI case study data of Chapter 7, this effectively reduces the data from 240,320 pixel spectra to 406 segment-based spectra.

By initialising each Voronoi cell node with the optimised parameter values of the segment in which it is placed, we start the SG algorithm at a location in the parameter space of much higher posterior probability than is achieved with a random parameter assignment within the uniform distribution of the prior. We have shown that this is very effective in avoiding local minima in the parameter space, especially in the non-unique setting of the multi-parameter problem, or as the dimension of the partition model increases.

The Hybrid algorithm works most effectively in conjunction with a parallel chain processing framework, as this process helps to ensure that we still explore the parameter space sufficiently and the informed parameter layers do not overly constrain the search. As we have a different partition parameterisation to initialise each parallel chain, this creates a different initial spatial representation of the starting parameters at the beginning of each chain. This ensures we explore the parameter space effectively, and across the ensemble of model samples produced we allow the data to determine the degree of constraint of the parameters. We saw this clearly in the LSI case study, where the final ensemble solution models of a number of parameters differed significantly from the initial segment-based parameter layers.


The Hybrid algorithm we have developed is a very practical and pragmatic way of tackling the high dimensions of the remote sensing problem, enabling the user to leverage existing data or optimisation techniques. This version of the algorithm is an important one to have in the user toolbox, especially as the size and dimension of remote sensing image data and inverse problems increase. We consider that it may be a necessary addition to enable the benefits of a probabilistic inversion approach to be realised in these very high dimensional problems. Already in the tests in this thesis we have demonstrated the significant improvements in parameter accuracy and speed of convergence when using the informed starting points of the Hybrid algorithm. A minimal sketch of the two Hybrid preprocessing steps is given below.
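The sketch illustrates aggregating pixel spectra into one mean spectrum per segment, and mapping per-segment parameter estimates back onto Voronoi node starting values. The function and variable names are hypothetical, and the per-segment optimisation itself (for example, a Levenberg-Marquardt fit of the forward model to each mean spectrum) is left abstract.

```python
import numpy as np

def segment_spectra(image, labels):
    """Aggregate an (rows, cols, bands) image into one mean spectrum per
    segment, e.g. reducing 240,320 pixel spectra to 406 segment spectra."""
    flat = image.reshape(-1, image.shape[-1])
    ids = np.unique(labels)
    means = np.array([flat[labels.ravel() == i].mean(axis=0) for i in ids])
    return ids, means

def hybrid_starting_values(nodes, labels, seg_ids, seg_params):
    """Give each Voronoi node the optimised parameter vector of the
    segment containing it: the informed starting layers of the Hybrid
    algorithm. `seg_params` holds one fitted parameter vector per segment."""
    lookup = dict(zip(seg_ids, seg_params))
    cols = nodes[:, 0].astype(int)   # node x coordinate -> image column
    rows = nodes[:, 1].astype(int)   # node y coordinate -> image row
    return np.array([lookup[labels[r, c]] for r, c in zip(rows, cols)])
```

Because each parallel chain starts from a different random partition, the same segment-based layers yield a different initial spatial representation per chain, which is what preserves exploration of the parameter space.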

9.2 An Ensemble of Image Solutions

The fundamental difference between the probabilistic approach and optimisation is the creation not of a single solution, but of an ensemble of solutions. In the case of the partition modelling framework we have employed, this creates an ensemble stack of solutions, each a generalised parameter model across the spatial domain. The ensemble stack enables us to infer a range of solutions at the original pixel resolution of the data, with novel and useful properties when compared to standard pixel-based single-solution methods.

9.2.1 Partition Modelling of Pixel-Based Data

In this thesis we have demonstrated the first application of a partition modelling inversion approach to remote sensing data. We have shown that the adaptive nature of the Voronoi partition, guided by the segmentation of the SG algorithm, offers significant benefits in estimating continuous parameter models from pixel-based data. In an rj-McMC inversion, the parsimonious nature of the algorithm ensures that the dimension of the partition model is only ever of sufficient size and complexity to fit the data and noise. In a homogeneous region of an image, for instance, one large cell will always be preferred to two smaller cells. Thus, as the partition cells adapt to the spatial complexity of the data, we have shown that they act as a very effective constraint on the pixel-based noise of the underlying data observations. Unlike a single pixel-based inversion, where the pixel data noise maps directly into the parameter solutions, the generalised partition model acts as an effective self-regularisation tool.


By using the partition model, we are fitting a spatially constrained parameter model to a number of pixels in each cell, effectively eliminating pixel-to-pixel noise-driven artefacts in the parameter solution. It is important to note that the spatial regularisation achieved by the partition model in our algorithm is driven by the data, and not imposed by the user. This is a key difference between our approach and other spatial constraints used in remote sensing inversion, such as kernel-based processing, as we are letting the spatial complexity of the data drive the partition structure.

Modelling the spatial distribution of parameters with a generalised partition model does mean that any single solution in the chain is unlikely to be an optimal retrieval of the spatial complexity of that parameter. We have seen this in the presentation of the maximum a posteriori (MAP) solutions for parameters throughout the thesis, with the resolution of the parameter model only ever as fine as the partition cells defining it. This is where we have demonstrated the strength of the ensemble stack, with the ability to derive an estimated solution from the ensemble of partition-based models of a far higher resolution and accuracy than any one model in the ensemble. Combining the self-regularisation of the partition model and the ability to estimate a pixel-resolution parameter solution from the ensemble stack, we can produce a naturally smooth parameter model whilst retrieving complex discontinuities in the model where supported by the data. This was demonstrated most clearly in the synthetic data tests of Chapter 6, where we illustrated the ability of the algorithm to characterise complex small-scale depth features and retrieve a smooth bathymetry model over homogeneous regions, even with an increasing level of data noise.

In the Lee Stocking Island case study we have shown the tangible benefits of the probabilistic partition modelling approach of the SG algorithm in comparison to the pixel-based optimisation of the SAMBUCA method (Brando et al. 2009). To illustrate these results we again present figures 7.25 and 7.28 here in this section. Not only did we achieve a significant increase in the accuracy of the derived depth model, as displayed in figure 9.1, we were able to visually illustrate the self-regularisation and noise constraint properties of the SG algorithms. This is seen in figure 9.2 from this case study, which displays the constraint of the pixel-based noise artefacts in the SG algorithm depth solutions, with spurious pixel-based depth features evident in comparable solutions from the pixel-based optimisation of the SAMBUCA algorithm.
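The pixel-resolution inference described above is computationally simple: each post-burn-in sample is a piecewise-constant partition model rasterised onto the pixel grid, and because the cell boundaries differ between samples, pixel-wise summaries over the stack are smooth at the native resolution. A minimal sketch, assuming the samples have already been rasterised into a (n_samples, rows, cols) array:

```python
import numpy as np

def ensemble_summaries(stack):
    """Per-pixel summaries over an ensemble stack of rasterised partition
    models with shape (n_samples, rows, cols)."""
    mean = stack.mean(axis=0)   # smooth pixel-resolution parameter model
    std = stack.std(axis=0)     # per-pixel uncertainty map
    return mean, std

# usage with a toy stack: 500 samples of a 100 x 100 depth model
stack = np.random.default_rng(0).normal(10.0, 0.5, size=(500, 100, 100))
depth_mean, depth_std = ensemble_summaries(stack)
```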


In Chapter 8 we introduced some early results from a multi-partition implementation of the algorithm, in which each unknown parameter in the inversion is defined by its own partition model. From these results we consider that the self-regularisation aspects of the partition model approach can be extended to the varied spatial scales of the individual parameters. By allowing the data and the spatial scale variation of each parameter to define an individual partition, we have already seen that the water constituent parameters are estimated by a much smoother model, more reflective of their true spatial variation. The larger cells and reduced dimension of these individual partitions act as a self-regularisation for the water constituent parameters, whereas in the single partition model we saw uncertainty and spatial variation in these solutions, driven by the finer scale and more influential depth parameter in the model.
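A minimal sketch of such a multi-partition state follows, assuming illustrative parameter names and prior bounds (one depth field and two water constituents) and a nearest-node Voronoi evaluation; because each parameter carries its own node set, birth and death moves can change the dimension of one field without touching the others.

```python
import numpy as np

rng = np.random.default_rng(0)

def new_partition(n_nodes, bounds, extent=(100.0, 100.0)):
    """One independent Voronoi partition: node positions plus one value
    per node drawn from a uniform prior."""
    return {
        "nodes": rng.uniform((0.0, 0.0), extent, size=(n_nodes, 2)),
        "values": rng.uniform(*bounds, size=n_nodes),
    }

# illustrative fields: names, initial dimensions and priors are assumptions
model = {
    "depth": new_partition(50, (0.0, 25.0)),
    "CHL": new_partition(5, (0.01, 1.0)),
    "CDOM": new_partition(5, (0.001, 0.5)),
}

def evaluate_fields(model, px, py):
    """Evaluate every parameter field at the pixel locations; the stacked
    per-pixel parameter vectors then feed the forward model."""
    out = {}
    for name, part in model.items():
        d2 = ((px[:, None] - part["nodes"][:, 0]) ** 2
              + (py[:, None] - part["nodes"][:, 1]) ** 2)
        out[name] = part["values"][np.argmin(d2, axis=1)]
    return out
```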

9.2.2 The Ensemble and Uncertainty

How we infer a solution from the ensemble is an important step in our algorithm process, and one which is crucial to one of the main aims of this thesis: estimating the uncertainty of the parameters in the solution. Underpinning this process is the manner in which noise is characterised in the probabilistic algorithm. In this work we developed a novel maximum likelihood (ML) formulation of the noise in the SG algorithm. This formulation derives a maximum likelihood estimate of the noise at each step of the Markov chain, based on the misfit of the model to the observed data. By structuring the noise model in this way, we encompass not only data noise, but also the uncertainty of fitting a generalised partition model to the fine-scale pixel variability of remote sensing data. This means that the user does not need to know the nature of the data noise (which can be difficult to characterise), as it is estimated as part of the inversion. The method is flexible, as we can input noise correlation characteristics of the data even if the magnitude of the noise is unknown. In the paper reproduced in Appendix B, we present our original approach to estimating the noise covariance structure of remote sensing data in a shallow water region. Using this method to estimate the noise covariance of the data in the Lee Stocking Island case study, we show how this information can be incorporated into the ML-SG algorithm to produce bathymetry uncertainty estimates from the ensemble that are an accurate representation of the depth model uncertainty, based on ground truth data.
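As a sketch of the idea: for a zero-mean Gaussian error model, the noise magnitude that maximises the likelihood of the current misfit is simply the RMS residual, and substituting it back gives a likelihood with the noise level profiled out. Correlated noise of known structure can be handled by first whitening the residuals with the Cholesky factor of the correlation matrix. The functions below are illustrative, not the thesis implementation.

```python
import numpy as np

def ml_sigma(residuals):
    """ML noise magnitude under a zero-mean Gaussian error model:
    sigma_hat is the RMS of the current model-data residuals."""
    return np.sqrt(np.mean(np.asarray(residuals) ** 2))

def log_likelihood_profiled(residuals, corr_chol=None):
    """Gaussian log-likelihood with sigma replaced by its ML estimate.
    If a Cholesky factor of a known noise correlation matrix is supplied,
    residuals are whitened first, so only the magnitude is estimated."""
    r = np.asarray(residuals, dtype=float)
    if corr_chol is not None:
        r = np.linalg.solve(corr_chol, r)
    n = r.size
    s2 = np.mean(r ** 2)
    # log L = -n/2 * (log(2*pi*s2) + 1) at the ML value of sigma
    return -0.5 * n * (np.log(2.0 * np.pi * s2) + 1.0)
```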



Figure 9.1 – Lee Stocking Island hyperspectral PHILLS data test - Density scatter plots of the acoustic data vs the depth solutions for the pixel-based SAMBUCA optimisation (a), Standard ML-SG (b) and Hybrid ML-SG (c) algorithms (Increased density represented in red).



Figure 9.2 – Lee Stocking Island hyperspectral PHILLS data test - Comparison of the Hybrid ML-SG mean ensemble solution to the pixel-based SAMBUCA solution and acoustic ground truth data, shown at two transect locations. Note the effective constraint by the Hybrid algorithm of the pixel-based spurious depth features evident in the SAMBUCA solution.


Working with an ensemble of solutions also enables us to look more closely at the interactions of parameters in the inversion. In the synthetic and real data tests we have been able to observe the degree of constraint of different parameters in a range of modelling scenarios. For example, we have consistently observed the water constituent parameters, especially CHL, to be highly unconstrained, with a high uncertainty in shallower waters. This is not surprising, and is consistent with the retrieval accuracy of these parameters in similar scenarios in the literature (Dekker et al. 2011; Jay and Guillaume 2011). The difference with the ensemble approach is that we are able to examine this explicitly through the marginal posterior PDF of the parameter, and identify the nature of the uncertainty and the interaction with other parameters in the model.

We have demonstrated the benefits of using a joint posterior analysis approach to infer meaningful estimates from the unconstrained water column parameters. Using the attenuation coefficient Kd as a multi-dimensional joint posterior distribution of the water column concentrations, we have illustrated how parameter interactions, contributions and local solution minima can be identified. This kind of information cannot always be inferred from a single parameter distribution, and shows the flexibility of the probabilistic ensemble approach.

Examining the posterior PDFs of the parameters has allowed us to determine the suitability of a range of ensemble estimators for particular scenarios. For instance, we have shown that characterising some of the poorly constrained parameters using a Gaussian uncertainty measure such as the standard deviation may be inappropriate, and we have illustrated examples, such as the HPD credible interval, that provide additional information to the user. We have also been able to demonstrate the potential pitfalls of using a uniform prior interval, where the uncertainty of the parameter can be underestimated if the distribution occurs at the bounds of the interval. In the Lee Stocking Island case study we have been able to show, through examination of the ensemble stack and parameter posterior PDFs, and validation against ground truth acoustic bathymetry data, that using a mean and standard deviation estimator for the depth parameter model is appropriate.
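For a marginal posterior represented by ensemble samples, a highest posterior density (HPD) credible interval is simply the shortest interval containing a given fraction of the samples, which is more informative than a mean and standard deviation for skewed or bound-hugging distributions. A minimal sketch, valid for unimodal marginals:

```python
import numpy as np

def hpd_interval(samples, mass=0.95):
    """Shortest interval containing `mass` of the samples: an HPD
    credible interval estimate for a unimodal marginal posterior."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(mass * n))           # samples the interval must cover
    widths = x[k - 1:] - x[: n - k + 1]  # width of every k-sample window
    i = np.argmin(widths)
    return x[i], x[i + k - 1]

# usage: a skewed CHL-like marginal where mean +/- std would mislead
chl = np.random.default_rng(0).lognormal(mean=-2.0, sigma=0.6, size=5000)
low, high = hpd_interval(chl, mass=0.90)
```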


9.3 A General Spatial Inversion Framework

One of the potential benefits of the way in which we have developed the algorithm in this work is the flexibility we have in defining the forward model. We have demonstrated the algorithm using a shallow water radiative transfer model with multiple unknown parameters. However, we can substitute into the algorithm any forward radiative transfer model which relates the unknown parameters of interest to the observed spectral data of a remote sensing image. We have therefore created a general probabilistic algorithm for remote sensing inversion, with the potential to be applied to other aquatic physics-based problems such as ocean colour inversion, or to terrestrial applications such as the characterisation of forest canopies. The components of ML noise estimation and a Hybrid starting point framework provide additional flexibility to the approach, removing the requirement for data noise characterisation, and allowing the incorporation of existing data or parameter estimation techniques to assist in high dimensional problems.

By estimating parameters from remote sensing data using our algorithm and framework, we have been able to demonstrate that meaningful uncertainty estimates can be made for the models produced as part of the inversion. Just as importantly, we have shown that the ensemble framework allows us to examine where these uncertainty estimates are valid, and where they have not appropriately reflected the model uncertainty. This is a significant development, and is a step towards aligning the estimated products from remote sensing data with products produced from other methods of data measurement. In the shallow water application area on which we have focused in this thesis, we have established a framework in which we can begin to align the uncertainty of shallow water bathymetry models derived from optical remote sensing data with the more established methods of hydrographic surveying, so that these models can function as a more complementary data source.
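In implementation terms, the generality described above amounts to the sampler depending only on a forward-model interface. The sketch below expresses that contract in Python; the names are illustrative rather than taken from the thesis code.

```python
from typing import Protocol
import numpy as np

class ForwardModel(Protocol):
    """Any callable mapping a per-pixel parameter vector to a modelled
    spectrum can be plugged into the sampler: a shallow water radiative
    transfer model, an ocean colour model, or a canopy reflectance model."""
    def __call__(self, params: np.ndarray,
                 wavelengths: np.ndarray) -> np.ndarray: ...

def residuals(model: ForwardModel, params: np.ndarray,
              wavelengths: np.ndarray, observed: np.ndarray) -> np.ndarray:
    """Model-data misfit consumed by the likelihood; nothing here is
    specific to the shallow water problem."""
    return observed - model(params, wavelengths)
```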

9.4 Criticisms and Challenges

There are two main criticisms that we make of our algorithm and method, as well as two challenges that we raised in the preceding chapter on future directions.

Firstly, the computational demands of our algorithm are high, even on the relatively small image subsets that we have used in the tests in this thesis. The single solution methods used in the Dekker et al. (2011) study processed the same PHILLS LSI data in times ranging from 48 minutes for the HOPE algorithm through to 1147 hours for the full eight-substrate implementation of SAMBUCA.


For a fair comparison to our algorithm implementation we must consider these run times on a hypothetical parallel cluster of 72 CPU nodes, reducing the HOPE and SAMBUCA processing times to 40 seconds and 15.9 hours respectively. In comparison, on processors of a similar speed, we inverted the data using the Hybrid algorithm in 120 hours. Although this processing time is not necessarily prohibitive, and we consider that numerous coding efficiencies could be realised in the algorithm, it is significantly higher than that of the range of optimisation methods. This is not unexpected, and it means a user must always evaluate the trade-offs between operational speed and the benefits we have shown the ML-SG to provide. This is an ongoing challenge in probabilistic inverse methods, and we must consider this factor as we look to expand the application of the algorithm to larger study sites and datasets. We have implemented localised search procedures to avoid processing anything other than the cells and pixels that change at any step of the chain; however, an increase in data size means an increase in the number of partition cells and in the dimension of the problem that must be sampled.

This ties into the second criticism of the current method: the efficiency of sampling and convergence of the algorithm. We have seen throughout this thesis the difficulty of assessing the convergence of the algorithm when applied to the high dimensions of the remote sensing problem. We must stress that this is not a problem unique to our McMC inversion, and it is one that is recognised as particularly difficult in a trans-dimensional framework (Sisson 2005). In most instances in the multi-parameter problem we have assessed convergence based on the local stability of the parameter estimates, although we have acknowledged that it is unlikely we have reached a global degree of convergence across the whole image inversion problem. At the scales and dimensions we are working with, as we continue to increase the iterations of the chain it is likely that there will always be some small area in a parameter model that has not reached a stable sampling state in a region of highest posterior probability. The issue will be compounded as the data size and study area increase. This has ramifications for the information we can confidently infer from the ensemble of solutions. However, on examination of the broad stability of the solution and uncertainty maps we have produced for the depth parameter in particular, we are confident that we can infer a relevant and appropriate bathymetry model from the ensemble produced with our algorithm.


Aside from these two critiques of the method, there are two key challenges which we feel should be addressed to improve the algorithm, and we have outlined these in more detail in Chapter 8. Firstly, the ability to quantify, or estimate as part of the inversion, the uncertainty contributed by errors in the forward model is an important component of ensuring that parameter uncertainty estimates remain relevant. We have discussed this as both errors in the empirical parameters of the model and errors in the theory of the forward model. As the current ML noise formulation in the algorithm does not encompass these forms of model error, further work is recommended in this area. Secondly, the issue of multiple substrates in the shallow water model has been raised many times in this thesis, and we have shown initial results of a simple approach to tackling additional substrates in an McMC setting in Chapter 8. The issue that the substrate problem represents is a broader challenge across probabilistic sampling methods in general: that of tackling non-continuous, discrete or combinatorial parameter problems in algorithms such as the trans-dimensional rj-McMC.
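As a generic illustration of the kind of move such a discrete extension involves (a sketch under the assumption of a finite substrate spectral library, not the Chapter 8 implementation), a substrate label can be perturbed with a symmetric categorical proposal inside an otherwise standard Metropolis step, so the proposal ratio cancels in the acceptance term.

```python
import numpy as np

rng = np.random.default_rng(0)

def propose_substrate(current, n_substrates):
    """Symmetric categorical proposal: move the discrete substrate index
    to any other library entry with equal probability 1/(n-1). Because
    the move is symmetric, its proposal ratio cancels in the Metropolis
    acceptance, leaving only the likelihood and prior terms."""
    step = rng.integers(1, n_substrates)    # 1 .. n-1
    return (current + step) % n_substrates  # any index except current

# usage: perturb a per-cell substrate label within a chain step
proposed_label = propose_substrate(current=3, n_substrates=8)
```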

Bibliography

Agostinetti, N. P., Malinverno, A., 2010. Receiver function inversion by trans-dimensional Monte Carlo sampling. Geophysical Journal International 181 (2), 858–872.
Al-Awadhi, F., Hurn, M., Jennison, C., 2004. Improving the acceptance rate of reversible jump MCMC proposals. Statistics and Probability Letters (2), 189–198.
Albert, A., Gege, P., 2006. Inversion of irradiance and remote sensing reflectance in shallow water between 400 and 800 nm for calculations of water and bottom properties. Applied Optics 45 (10), 2331–2343.
Ampe, E., Raymaekers, D., Hestir, E., Jansen, M., Knaeps, E., Batelaan, O., 2015. A Wavelet-Enhanced Inversion Method for Water Quality Retrieval From High Spectral Resolution Data for Complex Waters. IEEE Transactions on Geoscience and Remote Sensing 53 (2), 869–882.
Andrefouet, S., Kramer, P., Torres-Pulliza, D., Joyce, K. E., Hochberg, E. J., Garza-Perez, R., Mumby, P. J., Riegl, B., Yamano, H., White, W. H., Zubia, M., Brock, J. C., Phinn, S. R., Naseer, A., Hatcher, B. G., Muller-Karger, F. E., 2003. Multi-site evaluation of IKONOS data for classification of tropical coral reef environments. Remote Sensing of Environment 88 (1-2), 128–143.
Atkinson, P. M., 2002. Spatial Statistics. In: Stein, A., Meer, F. V. d., Gorte, B. (Eds.), Spatial Statistics for Remote Sensing. No. 1 in Remote Sensing and Digital Image Processing. Springer Netherlands, pp. 57–81.
Atzberger, C., 2004. Object-based retrieval of biophysical canopy variables using artificial neural nets and radiative transfer models. Remote Sensing of Environment 93 (1-2), 53–67.
Atzberger, C., Richter, K., 2012. Spatially constrained inversion of radiative transfer models for improved LAI mapping from future Sentinel-2 imagery. Remote Sensing of Environment 120, 208–218.
Aurenhammer, F., 1991. Voronoi diagrams - a survey of a fundamental geometric data structure. ACM Computing Surveys 23 (3), 345–405.
Aurin, D., Mannino, A., Franz, B., 2013. Spatially resolving ocean color and sediment dispersion in river plumes, coastal systems, and continental shelf waters. Remote Sensing of Environment 137, 212–225.
Baatz, M., Schäpe, A., 2000. Multiresolution Segmentation: an optimization approach for high quality multi-scale image segmentation. In: Angewandte Geographische Informationsverarbeitung XII. Beiträge zum AGIT-Symposium Salzburg 2000, Karlsruhe. pp. 12–23.
Baret, F., Buis, S., 2008. Estimating Canopy Characteristics from Remote Sensing Observations: Review of Methods and Associated Problems. In: Liang, S. (Ed.), Advances in Land Remote Sensing. Springer Netherlands, pp. 173–201.
Bayes, T., 1763. An Essay Towards Solving a Problem in the Doctrine of Chances. C. Davis, Printer to the Royal Society of London.
Bierwirth, P. N., Lee, T. J., Burne, R. V., 1993. Shallow sea-floor reflectance and water depth derived by unmixing multispectral imagery. Photogrammetric Engineering and Remote Sensing 59 (3), 331–338.
Blaschke, T., 2010. Object based image analysis for remote sensing. ISPRS Journal of Photogrammetry and Remote Sensing 65 (1), 2–16.
Bodin, T., Sambridge, M., 2009. Seismic tomography with the reversible jump algorithm. Geophysical Journal International 178 (3), 1411–1436.
Bodin, T., Sambridge, M., Gallagher, K., 2009. A self-parametrizing partition model approach to tomographic inverse problems. Inverse Problems 25 (5), 055009.
Bodin, T., Sambridge, M., Rawlinson, N., Arroucau, P., 2012a. Transdimensional tomography with unknown data noise. Geophysical Journal International 189 (3), 1536–1556.
Bodin, T., Sambridge, M., Tkalcic, H., Arroucau, P., Gallagher, K., Rawlinson, N., 2012b. Transdimensional inversion of receiver functions and surface wave dispersion. Journal of Geophysical Research: Solid Earth 117 (B2), B02301.
Boss, E., Zaneveld, J. R., 2003. The effect of bottom substrate on inherent optical properties: Evidence of biogeochemical processes. Limnology and Oceanography 48 (1), 346–354.
Botha, E. J., Brando, V. E., Anstee, J. M., Dekker, A. G., Sagar, S., 2013. Increased spectral resolution enhances coral detection under varying water conditions. Remote Sensing of Environment 131, 247–261.
Box, G. E. P., Tiao, G. C., 1992. Bayesian Inference in Statistical Analysis. Wiley, New York.
Brando, V., Dekker, A., 2003. Satellite hyperspectral remote sensing for estimating estuarine and coastal water quality. IEEE Transactions on Geoscience and Remote Sensing 41 (6), 1378–1387.
Brando, V. E., Anstee, J. M., Wettle, M., Dekker, A. G., Phinn, S. R., Roelfsema, C., 2009. A physics based retrieval and quality assessment of bathymetry from suboptimal hyperspectral data. Remote Sensing of Environment 113 (4), 755–770.
Brooks, S., 1998. Markov chain Monte Carlo method and its application. Journal of the Royal Statistical Society: Series D (The Statistician) 47 (1), 69–100.
Brooks, S., Giudici, P., 1998. Convergence Assessment for Reversible Jump MCMC Simulations. In: Bayesian Statistics. Vol. 6. Oxford University Press.
Brooks, S. P., Giudici, P., Roberts, G. O., 2003. Efficient Construction of Reversible Jump Markov Chain Monte Carlo Proposal Distributions. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 65 (1), 3–55.
Brown, C. W., Connor, L. N., Lillibridge, J. L., Nalli, N. R., Legeckis, R. V., 2005. An Introduction to Satellite Sensors, Observations and Techniques. In: Miller, R. L., Castillo, C. E. D., Mckee, B. A. (Eds.), Remote Sensing of Coastal Aquatic Environments. No. 7 in Remote Sensing and Digital Image Processing. Springer Netherlands, pp. 21–50.
Campbell, J. B., Wynne, R. H., 2011. Introduction to Remote Sensing, 5th Edition. Guilford Press, New York.
Camps-Valls, G., Tuia, D., Gomez-Chova, L., Jimenez, S., Malo, J., 2011. Remote Sensing Image Processing. Vol. 5 of Synthesis Lectures on Image, Video, and Multimedia Processing. Morgan & Claypool.
Carlin, B. P., Louis, T. A., 2000. Empirical Bayes: Past, Present and Future. Journal of the American Statistical Association 95 (452), 1286–1289.
Charvin, K., Gallagher, K., Hampson, G. L., Labourdette, R., 2009. A Bayesian approach to inverse modelling of stratigraphy, part 1: method. Basin Research 21 (1), 5–25.
Chen, M.-H., Shao, Q.-M., 1999. Monte Carlo Estimation of Bayesian Credible and HPD Intervals. Journal of Computational and Graphical Statistics 8 (1), 69–92.
Combal, B., Baret, F., Weiss, M., Trubuil, A., Mace, D., Pragnere, A., Myneni, R., Knyazikhin, Y., Wang, L., 2002. Retrieval of canopy biophysical variables from bidirectional reflectance: Using prior information to solve the ill-posed inverse problem. Remote Sensing of Environment 84 (1), 1–15.
Corner, B., Narayanan, R., Reichenbach, S., 2003. Noise estimation in remote sensing imagery using data masking. International Journal of Remote Sensing 24 (4), 689–702.
Cowles, M. K., Carlin, B. P., 1996. Markov Chain Monte Carlo Convergence Diagnostics: A Comparative Review. Journal of the American Statistical Association 91, 883–904.
Curran, P., Dungan, J., 1989. Estimation of signal-to-noise: a new procedure applied to AVIRIS data. IEEE Transactions on Geoscience and Remote Sensing 27 (5), 620–628.
Curtis, A., Lomax, A., 2001. Prior information, sampling distributions, and the curse of dimensionality. Geophysics 66, 372–378.
Davis, C., Bowles, J., Leathers, R., Korwan, D., Downes, T. V., Snyder, W., Rhea, W., Chen, W., Fisher, J., Bissett, P., Reisse, R. A., 2002. Ocean PHILLS hyperspectral imager: design, characterization, and calibration. Optics Express 10 (4), 210–221.
Day, J. C., Dobbs, K., 2013. Effective governance of a large and complex cross-jurisdictional marine protected area: Australia's Great Barrier Reef. Marine Policy 41, 14–24.
Debski, W., 2010. Chapter 1 - Probabilistic Inverse Theory. In: Renata Dmowska (Ed.), Advances in Geophysics. Vol. 52 of Advances in Geophysics. Elsevier, pp. 1–102.
Decho, A., Kawaguchi, T., Allison, M., Louchard, E., Reid, R., Stephens, F., Voss, K., Wheatcroft, R., Taylor, B., 2003. Sediment properties influencing upwelling spectral reflectance signatures: The "biofilm gel effect". Limnology and Oceanography 48, 431–443.
Dekker, A., Brando, V., Anstee, J., Pinnel, N., Kutser, T., Hoogenboom, E., Peters, S., Pasterkamp, R., Vos, R., Olbert, C., Malthus, T., 2002. Imaging Spectrometry of Water. In: Meer, F. D. v. d., Jong, S. M. D. (Eds.), Imaging Spectrometry. Vol. 4 of Remote Sensing and Digital Image Processing. Springer Netherlands, pp. 307–358.
Dekker, A. G., Peters, S. W. M., 1993. The use of the Thematic Mapper for the analysis of eutrophic lakes: a case study in the Netherlands. International Journal of Remote Sensing 14 (5), 799–821.
Dekker, A. G., Phinn, S. R., Anstee, J., Bissett, P., Brando, V. E., Casey, B., Fearns, P., Hedley, J., Klonowski, W., Lee, Z. P., 2011. Intercomparison of shallow water bathymetry, hydro-optics, and benthos mapping techniques in Australian and Caribbean coastal environments. Limnology and Oceanography: Methods 9, 396–425.
Dettmer, J., Dosso, S. E., Holland, C. W., 2010. Trans-dimensional geoacoustic inversion. The Journal of the Acoustical Society of America 128 (6), 3393.
Dettmer, J., Molnar, S., Steininger, G., Dosso, S. E., Cassidy, J. F., 2012. Trans-dimensional inversion of microtremor array dispersion data with hierarchical autoregressive error models. Geophysical Journal International 188 (2), 719–734.
Dierssen, H. M., Zimmerman, R. C., Leathers, R. A., Downes, T. V., Davis, C. O., 2003. Ocean Color Remote Sensing of Seagrass and Bathymetry in the Bahamas Banks by High-Resolution Airborne Imagery. Limnology and Oceanography 48 (1), 444–455.
Dosso, S., Wilmut, M., 2006. Data Uncertainty Estimation in Matched-Field Geoacoustic Inversion. IEEE Journal of Oceanic Engineering 31 (2), 470–479.
Dosso, S. E., Dettmer, J., Steininger, G., Holland, C. W., 2014. Efficient trans-dimensional Bayesian inversion for geoacoustic profile estimation. Inverse Problems 30 (11), 114018.
Dosso, S. E., Holland, C. W., Sambridge, M., 2012. Parallel tempering for strongly nonlinear geoacoustic inversion. The Journal of the Acoustical Society of America 132 (5), 3030–3040.
Eches, O., Dobigeon, N., Tourneret, J. Y., 2010. Estimating the Number of Endmembers in Hyperspectral Images Using the Normal Compositional Model and a Hierarchical Bayesian Algorithm. IEEE Journal of Selected Topics in Signal Processing 4 (3), 582–591.
Fearns, P., Klonowski, W., Babcock, R., England, P., Phillips, J., 2011. Shallow water substrate mapping using hyperspectral remote sensing. Continental Shelf Research 31 (12), 1249–1259.
Filippi, A. M., Kubota, T., 2008. Introduction of spatial smoothness constraints via linear diffusion for optimization-based hyperspectral coastal ocean remote-sensing inversion. Journal of Geophysical Research: Oceans 113.
Gallagher, K., Bodin, T., Sambridge, M., Weiss, D., Kylander, M., Large, D., 2011. Inference of abrupt changes in noisy geochemical records using transdimensional changepoint models. Earth and Planetary Science Letters 311 (1-2), 182–194.
Gallagher, K., Charvin, K., Nielsen, S., Sambridge, M., Stephenson, J., 2009. Markov chain Monte Carlo (MCMC) sampling methods to determine optimal models, model resolution and model choice for Earth Science problems. Marine and Petroleum Geology 26 (4), 525–535.
Gao, B.-C., 1993. An operational method for estimating signal to noise ratios from data acquired with imaging spectrometers. Remote Sensing of Environment 43 (1), 23–33.
Gao, B.-C., Montes, M. J., Ahmad, Z., Davis, C. O., 2000. Atmospheric correction algorithm for hyperspectral remote sensing of ocean color from space. Applied Optics 39 (6), 887–896.
Gao, L., Du, Q., Zhang, B., Yang, W., Wu, Y., 2013. A Comparative Study on Linear Regression-Based Noise Estimation for Hyperspectral Imagery. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 6 (2), 488–498.
Gao, L.-R., Zhang, B., Zhang, X., Zhang, W.-j., Tong, Q.-X., 2008. A New Operational Method for Estimating Noise in Hyperspectral Images. IEEE Geoscience and Remote Sensing Letters 5 (1), 83–87.
Gelman, A., Carlin, J. B., Stern, H. S., Rubin, D. B., 2003. Bayesian Data Analysis, 2nd Edition. Chapman and Hall/CRC, Boca Raton, Fla.
Gelman, A., Roberts, G., Gilks, W., 1996. Efficient Metropolis jumping rules. In: Bayesian Statistics. Vol. 5. Oxford University Press.
Gerardino-Neira, C., Goodman, J., Velez-Reyes, M., Rivera, W., 2008. Sensitivity Analysis of a Hyperspectral Inversion Model for Remote Sensing of Shallow Coastal Ecosystems. In: Geoscience and Remote Sensing Symposium, 2008. IGARSS 2008. IEEE International. Vol. 1. IEEE, Boston, USA.
Geyer, C. J., Moller, J., 1994. Simulation Procedures and Likelihood Inference for Spatial Point Processes. Scandinavian Journal of Statistics 21 (4), 359–373.
Giardino, C., Brando, V. E., Dekker, A. G., Strombeck, N., Candiani, G., 2007. Assessment of water quality in Lake Garda (Italy) using Hyperion. Remote Sensing of Environment 109 (2), 183–195.
Gilks, W. R., Richardson, S., Spiegelhalter, D. J., 1996. Markov chain Monte Carlo in practice. Chapman & Hall / CRC, London.
Goodman, J., Ustin, S. L., 2007. Classification of benthic composition in a coral reef environment using spectral unmixing. Journal of Applied Remote Sensing 1 (1), 011501.
Goodman, J. A., Lee, Z., Ustin, S. L., 2008. Influence of atmospheric and sea-surface corrections on retrieval of bottom depth and reflectance using a semi-analytical model: a case study in Kaneohe Bay, Hawaii. Applied Optics 47 (28), F1–F11.
Goodman, J. A., Purkis, S. J., Phinn, S. R., 2013. Coral Reef Remote Sensing: A Guide for Mapping, Monitoring and Management. Springer Science & Business Media.
Gouveia, W. P., Scales, J. A., 1998. Bayesian seismic waveform inversion: Parameter estimation and uncertainty analysis. Journal of Geophysical Research 103, 2759–2779.
Green, P. J., 1995. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82 (4), 711–732.
Green, P. J., 2003. Trans-dimensional Markov Chain Monte Carlo. In: Highly Structured Stochastic Systems. Oxford Statistical Sciences Series. Oxford University Press, Oxford U.K.
Green, P. J., Hastie, D. I., 2009. Reversible jump MCMC. Bioinformatics 24 (13), i407–13.
Green, R., Pavri, B., Chrien, T., 2003. On-orbit radiometric and spectral calibration characteristics of EO-1 Hyperion derived with an underflight of AVIRIS and in situ measurements at Salar de Arizaro, Argentina. IEEE Transactions on Geoscience and Remote Sensing 41 (6), 1194–1203.
Guo, Q., Dou, X., 2008. A Modified Approach for Noise Estimation in Optical Remotely Sensed Images With a Semivariogram: Principle, Simulation, and Application. IEEE Transactions on Geoscience and Remote Sensing 46 (7), 2050–2060.
Haario, H., Laine, M., Lehtinen, M., Saksman, E., Tamminen, J., 2004. Markov chain Monte Carlo methods for high dimensional inversion in remote sensing. Journal of the Royal Statistical Society - Series B (Statistical Methodology) 66 (3), 591–607.
Hamel, M. A., Andrefouet, S., 2010. Using very high resolution remote sensing for the management of coral reef fisheries: Review and perspectives. Marine Pollution Bulletin 60 (9), 1397–1405.
Hastings, W. K., 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57 (1), 97–109.
Hedley, J., Mumby, P., Joyce, K. E., Phinn, S. R., 2004. Spectral unmixing of coral reef benthos under ideal conditions. Coral Reefs 23 (1), 60–73.
Hedley, J., Roelfsema, C., Koetz, B., Phinn, S., 2012a. Capability of the Sentinel 2 mission for tropical coral reef mapping and coral bleaching detection. Remote Sensing of Environment 120, 145–155.
Hedley, J., Roelfsema, C., Phinn, S. R., 2009. Efficient radiative transfer model inversion for remote sensing applications. Remote Sensing of Environment 113 (11), 2527–2532.
Hedley, J. D., Harborne, A. R., Mumby, P. J., 2005. Simple and robust removal of sun glint for mapping shallow-water benthos. International Journal of Remote Sensing 26 (10), 2107–2112.
Hedley, J. D., Roelfsema, C. M., Phinn, S. R., Mumby, P. J., 2012b. Environmental and Sensor Limitations in Optical Remote Sensing of Coral Reefs: Implications for Monitoring and Sensor Design. Remote Sensing 4 (12), 271–302.
Hochberg, E. J., Andrefouet, S., Tyler, M. R., 2003. Sea surface correction of high spatial resolution Ikonos images to improve bottom mapping in near-shore environments. IEEE Transactions on Geoscience and Remote Sensing 41 (7), 1724–1729.
Hochberg, E. J., Atkinson, M. J., 2003. Capabilities of remote sensors to classify coral, algae, and sand as pure and mixed spectra. Remote Sensing of Environment 85 (2), 174–189.
Hofmann, P., Strobl, J., Blaschke, T., 2008. A method for adapting global image segmentation methods to images of different resolutions. In: Proceedings of XXIth ISPRS Congress. Beijing.
Hoogenboom, H., Dekker, A., Althuis, I., 1998. Simulation of AVIRIS Sensitivity for Detecting Chlorophyll over Coastal and Inland Waters. Remote Sensing of Environment 65 (3), 333–340.
Hopcroft, P. O., Gallagher, K., Pain, C. C., 2009. A Bayesian partition modelling approach to resolve spatial variability in climate records from borehole temperature inversion. Geophysical Journal International 178 (2), 651–666.
Hu, C., Feng, L., Lee, Z., Davis, C. O., Mannino, A., McClain, C. R., Franz, B. A., 2012. Dynamic range and sensitivity requirements of satellite ocean color sensors: learning from the past. Applied Optics 51 (25), 6045–6062.
Iglewicz, B., Hoaglin, D. C., 1993. How to Detect and Handle Outliers. ASQC Quality Press, Milwaukee.
Jay, S., Guillaume, M., 2011. Estimation of water column parameters with a maximum likelihood approach. In: 2011 3rd Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS). Lisbon.
Kabanikhin, S. I., 2008. Definitions and examples of inverse and ill-posed problems. Journal of Inverse and Ill-posed Problems 16 (4), 317–357.
Ke, Y., Quackenbush, L. J., Im, J., 2010. Synergistic use of QuickBird multispectral imagery and LIDAR data for object-based forest species classification. Remote Sensing of Environment 114 (6), 1141–1154.
King, R., 2011. Statistical Ecology. In: Handbook of Markov Chain Monte Carlo. Chapman & Hall / CRC, London, pp. 419–447.
Klonowski, W. M., Fearns, P. R., Lynch, M. J., 2007. Retrieving key benthic cover types and bathymetry from hyperspectral imagery. Journal of Applied Remote Sensing 1 (1), 011505.
Kutser, T., Dekker, A. G., Skirving, W., 2003. Modeling spectral discrimination of Great Barrier Reef benthic communities by remote sensing instruments. Limnology and Oceanography 48 (1-2), 497–510.
Kutser, T., Miller, I., Jupp, D., 2006. Mapping coral reef benthic substrates using hyperspectral space-borne images and spectral libraries. Estuarine, Coastal and Shelf Science (3), 449–460.
Kutser, T., Vahtmae, E., Praks, J., 2009. A sun glint correction method for hyperspectral imagery containing areas with non-negligible water leaving NIR signal. Remote Sensing of Environment 113 (10), 2267–2274.
Laurent, V. C., Verhoef, W., Damm, A., Schaepman, M. E., Clevers, J. G., 2013. A Bayesian object-based approach for estimating vegetation biophysical and biochemical variables from APEX at-sensor radiance data. Remote Sensing of Environment 139, 6–17.
Lauvernet, C., Baret, F., Hascoet, L., Buis, S., Le Dimet, F.-X., 2008. Multitemporal-patch ensemble inversion of coupled surface-atmosphere radiative transfer models for land surface characterization. Remote Sensing of Environment 112 (3), 851–861.
Le, C., Hu, C., Cannizzaro, J., English, D., Muller-Karger, F., Lee, Z., 2013. Evaluation of chlorophyll-a remote sensing algorithms for an optically complex estuary. Remote Sensing of Environment 129, 75–89.
Lee, Z., 2009. Applying narrowband remote-sensing reflectance models to wideband data. Applied Optics 48 (17), 3177–3183.
Lee, Z., Arnone, R., Hu, C., Werdell, P. J., Lubac, B., 2010a. Quantification of uncertainties in remotely derived optical properties of coastal and oceanic waters. In: SPIE 7678, Ocean Sensing and Monitoring II.
Lee, Z., Arnone, R., Hu, C., Werdell, P. J., Lubac, B., 2010b. Uncertainties of optical parameters and their propagations in an analytical ocean color inversion algorithm. Applied Optics 49 (3), 369–381.
Lee, Z., Carder, K., Chen, R., Peacock, T., 2001. Properties of the Water Column and Bottom Derived from Airborne Visible Infrared Imaging Spectrometer (AVIRIS) Data. Journal of Geophysical Research - Oceans, 11639–11651.
Lee, Z., Carder, K. L., Arnone, R. A., 2002. Deriving inherent optical properties from water color: a multiband quasi-analytical algorithm for optically deep waters. Applied Optics 41 (27), 5755–5772.
Lee, Z., Carder, K. L., Mobley, C. D., Steward, R. G., Patch, J. S., 1998. Hyperspectral Remote Sensing for Shallow Waters. I. A Semianalytical Model. Applied Optics 37 (27), 6329–6338.
Lee, Z., Carder, K. L., Mobley, C. D., Steward, R. G., Patch, J. S., 1999. Hyperspectral Remote Sensing for Shallow Waters. 2. Deriving Bottom Depths and Water Properties by Optimization. Applied Optics 38 (18), 3831–3843.
Lesser, M. P., Mobley, C. D., 2007. Bathymetry, water optical properties, and benthic classification of coral reefs using hyperspectral remote sensing imagery. Coral Reefs 26 (4), 819–829.
Levenberg, K., 1944. A method for the solution of certain non-linear problems in least squares. Quarterly of Applied Mathematics II (2), 164–168.
Liang, F., Liu, C., Carroll, R. J., 2010. Advanced Markov Chain Monte Carlo Methods: Learning from Past Samples. John Wiley & Sons, Ltd.
Liang, S., 2007. Recent developments in estimating land surface biogeophysical variables from optical remote sensing. Progress in Physical Geography 31 (5), 501–516.
Liang, S. (Ed.), 2008. Advances in Land Remote Sensing: System, Modeling, Inversion and Application. Springer Science & Business Media.
Louchard, E. M., Reid, R. P., Stephens, F. C., Davis, C. O., Leathers, R. A., Downes, T. V., 2003. Optical Remote Sensing of Benthic Habitats and Bathymetry in Coastal Environments at Lee Stocking Island, Bahamas: A Comparative Spectral Classification Approach. Limnology and Oceanography 48 (1), 511–521.
Lyons, M. B., Phinn, S. R., Roelfsema, C. M., 2012. Long term land cover and seagrass mapping using Landsat and object-based image analysis from 1972 to 2010 in the coastal environment of South East Queensland, Australia. ISPRS Journal of Photogrammetry and Remote Sensing 71, 34–46.
Lyzenga, D., 1981. Remote sensing of bottom reflectance and water attenuation parameters in shallow water using aircraft and Landsat data. International Journal of Remote Sensing 2 (1), 71–82.
Lyzenga, D. R., 1978. Passive remote sensing techniques for mapping water depth and bottom features. Applied Optics 17 (3), 379–383.
MacKay, D. J. C., 2003. Information Theory, Inference and Learning Algorithms. Cambridge University Press.
Malinverno, A., 2000. A Bayesian criterion for simplicity in inverse problem parametrization. Geophysical Journal International 140 (2), 267–285.
Malinverno, A., 2002. Parsimonious Bayesian Markov chain Monte Carlo inversion in a nonlinear geophysical problem. Geophysical Journal International 151 (3), 675–688.
Malinverno, A., Briggs, V. A., 2004. Expanded uncertainty quantification in inverse problems: Hierarchical Bayes and empirical Bayes. Geophysics 69 (4), 1005–1016.
Malinverno, A., Leaney, W. S., 2005. Monte Carlo Bayesian look ahead inversion of walkaway vertical seismic profiles. Geophysical Prospecting 53 (5), 689–703.
Maritorena, S., d'Andon, O. H. F., Mangin, A., Siegel, D. A., 2010. Merged satellite ocean color data products using a bio-optical model: Characteristics, benefits and issues. Remote Sensing of Environment 114 (8), 1791–1804.
Maritorena, S., Morel, A., Gentili, B., 1994. Diffuse Reflectance of Oceanic Shallow Waters: Influence of Water Depth and Bottom Albedo. Limnology and Oceanography 39 (7), 1689–1703.
Marquardt, D. W., 1963. An Algorithm for Least-Squares Estimation of Nonlinear Parameters. Journal of the Society for Industrial and Applied Mathematics 11 (2), 431–441.
Martin, G., Plaza, A., 2011. Region-Based Spatial Preprocessing for Endmember Extraction and Spectral Unmixing. IEEE Geoscience and Remote Sensing Letters 8 (4), 745–749.
Martin-Herrero, J., 2008. Comments on: A New Operational Method for Estimating Noise in Hyperspectral Images. IEEE Geoscience and Remote Sensing Letters 5 (4), 705–709.
Mengersen, K., Robert, C., Guihenneuc-Jouyaux, C., 1998. MCMC Convergence Diagnostics: A Review. In: Bernardo, J., Berger, O., Dawid, A., Smith, A. F. M. (Eds.), Bayesian Statistics. Vol. 6. Oxford University Press, Oxford U.K, pp. 415–440.
Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., Teller, E., 1953. Equation of State Calculations by Fast Computing Machines. The Journal of Chemical Physics 21 (6), 1087–1092.
Minsley, B. J., 2011. A trans-dimensional Bayesian Markov chain Monte Carlo algorithm for model assessment using frequency-domain electromagnetic data. Geophysical Journal International 187 (1), 252–272.
Mobley, C. D., 1994. Light and Water: Radiative Transfer in Natural Waters. Academic Press, San Diego.
Mobley, C. D., Sundman, L. K., Davis, C. O., Bowles, J. H., Downes, T. V., Leathers, R. A., Montes, M. J., Bissett, W. P., Kohler, D. D. R., Reid, R. P., Louchard, E. M., Gleason, A., 2005. Interpretation of hyperspectral remote-sensing imagery by spectrum matching and look-up tables. Applied Optics 44 (17), 3576–3592.
Mobley, C. D., Zhang, H., Voss, K. J., 2003. Effects of optically shallow bottoms on upwelling radiances: Bidirectional reflectance distribution function effects. Limnology and Oceanography 48, 337–345.
Morel, A., 1974. Optical properties of pure water and sea water. In: Jerlov, N. G. (Ed.), Optical Aspects of Oceanography. Academic Press, London.
Mosegaard, K., Sambridge, M., 2002. Monte Carlo Analysis of Inverse Problems. Inverse Problems 18 (3), R29–R54.
Mosegaard, K., Tarantola, A., 1995. Monte Carlo sampling of solutions to inverse problems. Journal of Geophysical Research 100 (B7), 12,431–12,447.
Mumby, P. J., Clark, C. D., Green, E. P., Edwards, A. J., 1998. Benefits of water column correction and contextual editing for mapping coral reefs. International Journal of Remote Sensing 19 (1), 203–210.
Mumby, P. J., Green, E. P., Edwards, A. J., Clark, C. D., 1999. The cost-effectiveness of remote sensing for tropical coastal resources assessment and management. Journal of Environmental Management 55 (3), 157–166.
Mumby, P. J., Skirving, W., Strong, A. E., Hardy, J. T., LeDrew, E. F., Hochberg, E. J., Stumpf, R. P., David, L. T., 2004. Remote sensing of coral reefs and their physical environment. Marine Pollution Bulletin 48 (3-4), 219–228.
Myint, S. W., Gober, P., Brazel, A., Grossman-Clarke, S., Weng, Q., 2011. Per-pixel vs. object-based classification of urban land cover extraction using high spatial resolution imagery. Remote Sensing of Environment 115 (5), 1145–1161.
Nelder, J. A., Mead, R., 1965. A Simplex Method for Function Minimization. The Computer Journal 7 (4), 308–313.
Noh, M., Wu, L., Lee, Y., 2012. Hierarchical likelihood methods for nonlinear and generalized linear mixed models with missing data and measurement errors in covariates. Journal of Multivariate Analysis 109, 42–51.
Okabe, A., Boots, B., Sugihara, K., 1992. Spatial Tessellations: Concepts and Applications of Voronoi Diagrams. John Wiley & Sons Ltd.
Phinn, S., Dekker, A., Brando, V., Roelfsema, C., 2005. Mapping water quality and substrate cover in optically complex coastal and reef waters: an integrated approach. Marine Pollution Bulletin 51 (1-4), 459–469.
Phinn, S. R., Roelfsema, C. M., Dekker, A., Brando, V. E., Anstee, J., Daniels, P., 2006. Remote sensing for coastal ecosystem indicators assessment and monitoring. SR 30.1 final report: Maps, techniques and error assessment for seagrass benthic habitat in Moreton Bay. CRC for Coastal Zone, Estuary and Waterway Management.
Phinn, S. R., Roelfsema, C. M., Mumby, P. J., 2011. Multi-scale, object-based image analysis for mapping geomorphic and ecological zones on coral reefs. International Journal of Remote Sensing 33 (12), 3768–3797.
Platt, T., Sathyendranath, S., 2008. Ecological indicators for the pelagic zone of the ocean from remote sensing. Remote Sensing of Environment 112 (8), 3426–3436.
Pope, R. M., Fry, E. S., 1997. Absorption spectrum (380-700 nm) of pure water. II. Integrating cavity measurements. Applied Optics 36 (33), 8710–8723.
Purkis, S., Kohler, K., Riegl, B., Rohmann, S., 2007. The Statistics of Natural Shapes in Modern Coral Reef Landscapes. The Journal of Geology 115 (5), 493–508.
Quadros, N., 2013. Bathymetry Acquisition - Technologies and Strategies: Investigating shallow water bathymetry acquisition technologies, survey considerations and strategies. Tech. rep., CRC-SI, Melbourne, Victoria.
Roelfsema, C., Phinn, S., Jupiter, S., Comley, J., Beger, M., Paterson, E., 2010. The application of object based analysis of high spatial resolution imagery for mapping large coral reef systems in the West Pacific at geomorphic and benthic community spatial scales. In: Geoscience and Remote Sensing Symposium (IGARSS), 2010 IEEE International. pp. 4346–4349.
Rosenthal, J. S., 2000. Parallel computing and Monte Carlo algorithms. Far East Journal of Theoretical Statistics 4, 207–236.
Rosenthal, J. S., 2011. Optimal Proposal Distributions and Adaptive MCMC. In: Handbook of Markov Chain Monte Carlo. Chapman & Hall / CRC, London, pp. 93–110.
Rouse, J., Hass, R., Schell, J., Deering, D., 1973. Monitoring Vegetation Systems in the Great Plains with ERTS. In: Third ERTS Symposium, NASA. Vol. SP-351 I. pp. 309–317.
RSES, 2014. Terrawulf Computational Cluster - Research School of Earth Sciences, ANU. URL http://rses.anu.edu.au/TerraWulf/index.php?p=specs
Sagar, S., Brando, V., Sambridge, M., 2014. Noise Estimation of Remote Sensing Reflectance Using a Segmentation Approach Suitable for Optically Shallow Waters. IEEE Transactions on Geoscience and Remote Sensing 52 (12), 7504–7512.
Sagar, S., Wettle, M., 2010. Mapping the fine-scale shallow water bathymetry of the Great Barrier Reef using ALOS AVNIR-2 data. In: OCEANS 2010 IEEE - Sydney, 24-27 May. IEEE, pp. 1–6.
Sambridge, M., 2013. A Parallel Tempering algorithm for probabilistic sampling and multimodal optimization. Geophysical Journal International 196, 357–374.
Sambridge, M., Bodin, T., Gallagher, K., Tkalcic, H., 2013. Transdimensional inference in the geosciences. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 371, 20110547.
Sambridge, M., Gallagher, K., Jackson, A., Rickwood, P., 2006. Trans-dimensional inverse problems, model comparison and the evidence. Geophysical Journal International 167 (2), 528–542.
Santos, V. J., Bustamante, C. D., Valero-Cuevas, F. J., 2009. Improving the fitness of high-dimensional biomechanical models via data-driven stochastic exploration. IEEE Transactions on Biomedical Engineering 56 (3), 552–564.
Scales, J. A., Snieder, R., 1998. What is noise? Geophysics 63 (4), 1122–1124.
Scales, J. A., Tenorio, L., 2000. Prior Information and Uncertainty in Inverse Problems. Geophysics 66, 389–397.
Schott, J. R., 1997. Remote Sensing: The Image Chain Approach. Oxford University Press.
Schoups, G., Vrugt, J. A., 2010. A formal likelihood function for parameter and predictive inference of hydrologic models with correlated, heteroscedastic, and non-Gaussian errors. Water Resources Research 46 (10), W10531.
Schowengerdt, R. A., 2006. Remote Sensing: Models and Methods for Image Processing. Academic Press.
Schroeder, T., Devlin, M. J., Brando, V. E., Dekker, A. G., Brodie, J. E., Clementson, L. A., McKinna, L., 2012. Inter-annual variability of wet season freshwater plume extent into the Great Barrier Reef lagoon based on satellite coastal ocean colour observations. Marine Pollution Bulletin 65 (4-9), 210–223.
Shi, C., Wang, L., 2014. Incorporating spatial information in spectral unmixing: A review. Remote Sensing of Environment 149, 70–87.
Sisson, S. A., 2005. Transdimensional Markov Chains: A Decade of Progress and Future Perspectives. Journal of the American Statistical Association 100 (471), 1077–1089.
Sisson, S. A., Fan, Y., 2007. A distance-based diagnostic for trans-dimensional Markov chains. Statistics and Computing 17 (4), 357–367.
Smith, A. F. M., 1991. Bayesian Computational Methods. Philosophical Transactions of the Royal Society of London. Series A: Physical and Engineering Sciences 337 (1647), 369–386.
Stark, P. B., Tenorio, L., 2010. A Primer of Frequentist and Bayesian Inference in Inverse Problems. In: Biegler, L., Biros, G., Ghattas, O., Heinkenschloss, M., Keyes, D., Mallick, B., Marzouk, Y., Tenorio, L., Waanders, B. v. B., Willcox, K. (Eds.), Large-Scale Inverse Problems and Quantification of Uncertainty. John Wiley & Sons, Ltd, pp. 9–32.
Stephenson, J., Gallagher, K., Holmes, C. C., 2004. Beyond kriging: dealing with discontinuous spatial data fields using adaptive prior information and Bayesian partition modelling. Geological Society, London, Special Publications 239 (1), 195–209. Stumpf, R. P., Holderied, K., Sinclair, M., 2003. Determination of water depth with high-resolution satellite imagery over variable bottom types. Limonology And Oceanography 48 (1/2), 547–556. Tamminen, J., 2004. Validation of nonlinear inverse algorithms with Markov chain Monte Carlo method. Journal of Geophysical Research 109 (D19). Tarantola, A., 2005. Inverse problem theory and methods for model parameter estimation. Society for Industrial and Applied Mathematics. Tierney, L., 1994. Markov Chains for Exploring Posterior Distributions. The Annals of Statistics 22 (4), 1701–1728. Tierney, L., Mira, A., 1999. Some adaptive Monte Carlo methods for Bayesian inference. Statistics in Medicine 18 (17-18), 2507–2515. Timmermans, J., Verhoef, W., van der Tol, C., Su, Z., 2009. Retrieval of canopy component temperatures through Bayesian inversion of directional thermal measurements. Hydrol. Earth Syst. Sci. 13 (7), 1249–1260. Trimble, 2012. eCognition Developer 8.7.1 User Guide. Tech. rep., Trimble Germany GmbH. Tu, Z., Zhu, S.-C., 2002. Image Segmentation by Data-Driven Markov Chain Monte Carlo. IEEE Transactions on Pattern Analysis and Machine Intelligence 24, 657–673. Uss, M. L., Vozel, B., Lukin, V. V., Chehdi, K., 2012. Maximum likelihood estimation of spatially correlated signal-dependent noise in hyperspectral images. Optical Engineering 51 (11), 1712. Vahtmae, E., Kutser, T., 2007. Mapping bottom type and water depth in shallow coastal waters with satellite remote sensing. Journal of Coastal Research (50), 185–189. Vanden Borre, J., Haest, B., Lang, S., Spanhove, T., Forster, M., Sifakis, N. I., 2011. Towards a wider uptake of remote sensing in Natura 2000 monitoring: Streamlining

BIBLIOGRAPHY

277

remote sensing products with users’ needs and expectations. In: 2011 2nd International Conference on Space Technology (ICST). pp. 1–4. Voss, K. J., Chapin, A., Monti, M., Zhang, H., 2000. Instrument to Measure the Bidirectional Reflectance Distribution Function of Surfaces. Applied Optics 39 (33), 6197–6206. Wang, P., Boss, E. S., Roesler, C., 2005. Uncertainties of inherent optical properties obtained from semianalytical inversions of ocean color. Applied optics 44 (19), 4074– 4085. Wang, Y., Yang, C., Li, X., 2008. Regularizing kernel-based BRDF model inversion method for ill-posed land surface parameter retrieval using smoothness constraint. Journal of Geophysical Research: Atmospheres 113 (D13), 1029. Weiss, M., Baret, F., Myneni, R. B., Pragnere, A., Knyazikhin, Y., 2000. Investigation of a model inversion technique to estimate canopy biophysical variables from spectral and directional reflectance data. Agronomie 20 (1), 3–22. Wettle, M., 2005. Monitoring Coral Reef Bleaching from Space - A feasibility study using a physics-based remote sensing approach. Ph.D. thesis, University of Hull, Hull, UK. Wettle, M., Brando, V. E., 2006. SAMBUCA - Semi-analytical model for Bathymetry, un-mixing and concentration assessment. CSIRO Land and Water Science Report 22/06, CSIRO. Wettle, M., Brando, V. E., Dekker, A. G., 2004. A methodology for retrieval of environmental noise equivalent spectra applied to four Hyperion scenes of the same tropical coral reef. Remote Sensing of Environment 93 (1-2), 188–197. Wettle, M., Daniel, P. J., Logan, G. A., Thankappan, M., 2009. Assessing the effect of hydrocarbon oil type and thickness on a remote sensing signal: A sensitivity study based on the optical properties of two different oil types and the HYMAP and Quickbird sensors. Remote Sensing of Environment 113 (9), 2000–2010. Yamano, H., Tamura, M., Kunii, Y., Hidaka, M., 2002. Hyperspectral remote sensing and radiative transfer simulation as a tool for monitoring coral reef health. Marine Technology Society Journal 36 (1), 4–13.

278

BIBLIOGRAPHY

Zaneveld, J. R. V., Boss, E., 2003. The influence of bottom morphology on reflectance: Theory and two-dimensional geometry model. Limnology and Oceanography 48 (1), 374–379. Zhang, H., Voss, K. J., Reid, R. P., Louchard, E. M., 2003. Bi-directional reflectance measurements of sediments in the vicinity of Lee Stocking. Limnology and Oceanography 48, 380–389. Zhang, Q., Xiao, X., Braswell, B., Linder, E., Baret, F., Moore III, B., 2005. Estimating light absorption by chlorophyll, leaf and canopy in a deciduous broadleaf forest using MODIS data and a radiative transfer model. Remote Sensing of Environment 99 (3), 357–371.

Appendix A

Radiative Transfer Model Parameterisation

This appendix consists of three components to detail the semi-analytical radiative transfer model that we have used in this work:

1. Details of the parameterisation of the Brando et al. (2009) SAMBUCA shallow water radiative transfer model introduced in Chapter 5.

2. Definition of the specific inherent optical properties (SIOPs) of the model, and the values used for these SIOP sets as empirical inputs to the model.

3. An outline of the optimisation inversion process used by the original SAMBUCA algorithm.

Parameterisation of the model

The Lee et al. (1998, 1999) model

If we recall from Chapter 5, the analytical model proposed by Maritorena et al. (1994) forms the foundation of the radiative transfer (RT) model, expressing $r_{rs}$ for an optically shallow water body:

\[ r_{rs} = r_{rs}^{dp} + \exp(-K_d H) \left[ A \exp(-\kappa_B H) - r_{rs}^{dp} \exp(-\kappa_C H) \right] \tag{A.1} \]

where $r_{rs}$ is the subsurface remote sensing reflectance spectrum; $r_{rs}^{dp}$ is the subsurface remote sensing reflectance of a hypothetical optically deep water column; $A$ is the bottom albedo or substratum reflectance; $H$ is the water column depth; $K_d$ is the vertical attenuation coefficient for diffuse downwelling light; $\kappa_B$ is the vertical attenuation coefficient for diffuse upwelling light originating from the bottom; and $\kappa_C$ is the vertical attenuation coefficient for diffuse upwelling light originating from each layer in the water column.

This analytical expression for $r_{rs}$ was subsequently modelled by Lee et al. (1998, 1999) using a series of semi-analytical relationships. These approximated the attenuation coefficients and the deep water component of equation A.1 as functions of the absorption ($a$) and backscattering ($b_b$) coefficients of the water column for a particular sun zenith angle ($\phi_w$) and off-nadir sensor viewing angle ($\phi$):

\[ K_d = \frac{\kappa}{\cos(\phi_w)}, \tag{A.2} \]

\[ \kappa_C = \kappa \left( \frac{D_u^C}{\cos(\phi)} \right), \tag{A.3} \]

\[ \kappa_B = \kappa \left( \frac{D_u^B}{\cos(\phi)} \right), \tag{A.4} \]

\[ r_{rs}^{dp} = u \left( 0.084 + 0.17u \right), \tag{A.5} \]

where

\[ \kappa = a + b_b, \tag{A.6} \]

\[ u = \frac{b_b}{a + b_b}, \tag{A.7} \]

and the optical path-elongation factors for photons from the water column ($D_u^C$) and the bottom ($D_u^B$) are approximated as

\[ D_u^C = 1.03 \left( 1 + 2.4u \right)^{0.5}, \tag{A.8} \]

\[ D_u^B = 1.04 \left( 1 + 5.4u \right)^{0.5}. \tag{A.9} \]
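To make the chain of semi-analytical approximations concrete, the following is a minimal Python sketch of equations A.1-A.9. The function name and the convention that $a$, $b_b$ and $A$ are supplied as per-wavelength NumPy arrays (with angles in radians) are our assumptions for illustration, not part of the original implementation.

import numpy as np

def lee_rrs_shallow(a, bb, A, H, phi_w, phi):
    """Subsurface reflectance of an optically shallow water column,
    following the semi-analytical relationships of Eqs. A.1-A.9.
    Illustrative sketch only; a, bb and A are per-wavelength arrays."""
    kappa = a + bb                           # Eq. A.6
    u = bb / (a + bb)                        # Eq. A.7
    DuC = 1.03 * (1.0 + 2.4 * u) ** 0.5      # Eq. A.8
    DuB = 1.04 * (1.0 + 5.4 * u) ** 0.5      # Eq. A.9
    Kd = kappa / np.cos(phi_w)               # Eq. A.2
    kappa_C = kappa * DuC / np.cos(phi)      # Eq. A.3
    kappa_B = kappa * DuB / np.cos(phi)      # Eq. A.4
    rrs_dp = u * (0.084 + 0.17 * u)          # Eq. A.5
    return rrs_dp + np.exp(-Kd * H) * (      # Eq. A.1
        A * np.exp(-kappa_B * H) - rrs_dp * np.exp(-kappa_C * H))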

The SAMBUCA parameterisation

In SAMBUCA, the algorithm of Lee et al. (1998, 1999) was modified by Brando et al. (2009) to retrieve the optically active constituents of the water column (CHL, CDOM and NAP). The absorption and backscattering coefficients are described by

\[ a = a_w + a_{phy} + a_{CDOM} + a_{NAP}, \tag{A.10} \]

\[ b_b = b_{bw} + b_{bphy} + b_{bNAP}, \tag{A.11} \]

where $a_w$ and $b_{bw}$ are the absorption and backscattering of pure water (Morel 1974; Pope and Fry 1997). The non-water absorption terms are parameterised as follows:

\[ a_{phy}(\lambda) = C_{CHL} \cdot a^*_{phy}(\lambda), \tag{A.12} \]

\[ a_{CDOM}(\lambda) = C_{CDOM} \cdot a^*_{CDOM}(\lambda_0) \exp\left[ -S_{CDOM} (\lambda - \lambda_0) \right], \tag{A.13} \]

\[ a_{NAP}(\lambda) = C_{NAP} \cdot a^*_{NAP}(\lambda_0) \exp\left[ -S_{NAP} (\lambda - \lambda_0) \right], \tag{A.14} \]

where $C_{CHL}$ is the concentration of chlorophyll-a and $a^*_{phy}(\lambda)$ is the chlorophyll-a specific absorption spectrum. As the concentration of CDOM ($C_{CDOM}$) is represented by $a_{CDOM}$(440 nm), the reference wavelength $\lambda_0$ was set at 440 nm for the CDOM absorption coefficient; $S_{CDOM}$ is the spectral decay constant for the CDOM absorption coefficient and $a^*_{CDOM}(\lambda_0)$ is set to 1. $C_{NAP}$ is the concentration of NAP; $a^*_{NAP}(\lambda_0)$ is the specific absorption of NAP at the reference wavelength; $S_{NAP}$ is the spectral slope constant for the NAP absorption coefficient; and the reference wavelength $\lambda_0$ was set at 550 nm for the NAP absorption coefficient. The non-water backscattering terms are parameterised as follows:

\[ b_{bphy}(\lambda) = C_{CHL} \cdot b^*_{bphy}(\lambda_0) \left( \frac{\lambda_0}{\lambda} \right)^{Y_{phy}}, \tag{A.15} \]

\[ b_{bNAP}(\lambda) = C_{NAP} \cdot b^*_{bNAP}(\lambda_0) \left( \frac{\lambda_0}{\lambda} \right)^{Y_{NAP}}, \tag{A.16} \]

where $b^*_{bphy}(\lambda_0)$ is the specific backscattering of algal particles at the reference wavelength and $Y_{phy}$ is the power law exponent for the algal particle backscattering coefficient; $b^*_{bNAP}(\lambda_0)$ is the specific backscattering of NAP at the reference wavelength and $Y_{NAP}$ is the power law exponent for the NAP backscattering coefficient. The reference wavelength $\lambda_0$ was set at 546 nm for both the algal and non-algal particle backscattering coefficients.

Thus the full parameterisation of the model consists of these water column specific inherent optical properties (SIOPs) and the environmental parameters of interest, where $C_{CHL}$ is the concentration of chlorophyll-a, $C_{CDOM}$ is the concentration of CDOM, $C_{NAP}$ is the concentration of NAP, $H$ is the water column depth, and $q_{ij}$ is the proportional contribution ($q_{ij}$ and $1-q_{ij}$) of any two substrates ($A_i$ and $A_j$) to the bottom reflectance:

\[ r_{rs}^{model} = f\left( C_{CHL}, C_{CDOM}, C_{NAP}, H, q_{ij}, A_i(\lambda), A_j(\lambda), S_{CDOM}, S_{NAP}, Y_{PHY}, Y_{NAP}, a^*_{PHY}(\lambda), a^*_{NAP}(\lambda_0), b^*_{bPHY}(\lambda_0), b^*_{bNAP}(\lambda_0) \right). \tag{A.17} \]
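The constituent parameterisation of equations A.10-A.16 can be sketched in the same spirit. The dictionary keys below are illustrative stand-ins for the SIOP inputs listed in Table A.1, and the reference wavelengths are hard-coded as stated above (440 nm for CDOM, 550 nm for NAP absorption, 546 nm for backscattering).

import numpy as np

def sambuca_iops(wl, C_chl, C_cdom, C_nap, siop):
    """Total absorption and backscattering of Eqs. A.10-A.11 from the
    constituent terms of Eqs. A.12-A.16. `wl` is wavelength in nm and
    `siop` is an assumed dict of empirical SIOP inputs."""
    a_phy = C_chl * siop["a_phy_star"]                              # Eq. A.12
    a_cdom = C_cdom * 1.0 * np.exp(-siop["S_cdom"] * (wl - 440.0))  # Eq. A.13
    a_nap = (C_nap * siop["a_nap_star_550"]
             * np.exp(-siop["S_nap"] * (wl - 550.0)))               # Eq. A.14
    bb_phy = C_chl * siop["bb_phy_star_546"] * (546.0 / wl) ** siop["Y_phy"]
    bb_nap = C_nap * siop["bb_nap_star_546"] * (546.0 / wl) ** siop["Y_nap"]
    a = siop["a_water"] + a_phy + a_cdom + a_nap                    # Eq. A.10
    bb = siop["bb_water"] + bb_phy + bb_nap                         # Eq. A.11
    return a, bb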

SIOP Parameterisation

As introduced in section 5.2.2.1, the optimal parameterisation and empirical estimation of the SIOP components of the semi-analytical RT model rely on field sampling of the water body of interest. Where this is not possible, the parameterisation can be achieved with appropriate values from the literature (Brando et al. 2009). In this work we have used two sets of SIOP measurements obtained during a fieldwork campaign in lagoon and open water environments at Heron Island, QLD, Australia (Wettle 2005; Wettle and Brando 2006). The values for each set of SIOPs, as required to parameterise the full radiative transfer model (equation A.17), are shown in Table A.1 and Figure A.1.

Parameter              Set 1 (open water)    Set 2 (lagoon)
S_CDOM                 0.0168                0.0183
S_NAP                  0.00977               0.0101
Y_PHY                  0.878                 1.178
Y_NAP                  0.878                 1.178
a*_NAP (550 nm)        0.0043                0.0017
b*_bPHY (546 nm)       0.0016                0.0008
b*_bNAP (546 nm)       0.0225                0.0118

Table A.1 – Specific inherent optical properties (SIOPs): values used for the parameterisation of the radiative transfer model in this thesis, measured at Heron Island, QLD, as detailed in Wettle (2005) and Wettle and Brando (2006).
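Continuing the sketch above, for illustration the Set 1 (open water) column of Table A.1 could be packed into the assumed dictionary as follows. The pure water and chlorophyll-a specific spectra are only stubbed here; in practice they come from the cited sources (Morel 1974; Pope and Fry 1997) and the measured spectrum of Figure A.1, and the concentrations are arbitrary example values.

import numpy as np

wl = np.arange(400.0, 751.0, 1.0)  # example wavelength grid, nm

# Set 1 (open water) values from Table A.1; spectral inputs stubbed with
# placeholder arrays for the purposes of this sketch.
siop_set1 = {
    "S_cdom": 0.0168,
    "S_nap": 0.00977,
    "Y_phy": 0.878,
    "Y_nap": 0.878,
    "a_nap_star_550": 0.0043,
    "bb_phy_star_546": 0.0016,
    "bb_nap_star_546": 0.0225,
    "a_water": np.zeros_like(wl),      # placeholder for Pope and Fry (1997)
    "bb_water": np.zeros_like(wl),     # placeholder for Morel (1974)
    "a_phy_star": np.zeros_like(wl),   # placeholder for the Fig. A.1 spectrum
}

a, bb = sambuca_iops(wl, C_chl=0.5, C_cdom=0.05, C_nap=1.0, siop=siop_set1)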

Figure A.1 – Chlorophyll-a specific absorption spectrum, $a^*_{phy}(\lambda)$.

SAMBUCA Optimisation

The inversion-optimisation scheme in SAMBUCA is completed using the Downhill Simplex method (Nelder and Mead 1965). As described in Brando et al. (2009), the modelled data $r_{rs}^{model}$ is compared to the input data ($r_{rs}^{input}$) using a goodness-of-fit, or error, function. The set of variables that minimises the difference between these two spectra is used to estimate the environmental variables being sought, e.g. water column depth, substratum composition or the concentrations of the optically active constituents of the water column.

The optimisation residuum, $\Delta$, is the measure of the difference between the measured and modelled spectra. In our use of SAMBUCA, $\Delta$ is estimated according to a least squares distance (LSQ) spectral magnitude matching function (e.g. in Lee et al. (1998, 1999); Mobley et al. (2005); Albert and Gege (2006)),

\[ \mathrm{LSQ} = \frac{\left[ \sum_{i=1}^{N} \left( r_{rs}^{model}(\lambda_i) - r_{rs}^{input}(\lambda_i) \right)^2 \right]^{1/2}}{\sum_{i=1}^{N} r_{rs}^{input}(\lambda_i)}, \tag{A.18} \]

where N is equal to the number of spectral bands in the data.
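A minimal sketch of this inversion-optimisation loop, with SciPy's Nelder-Mead routine standing in for the original Downhill Simplex implementation, might look as follows. The `forward` function and starting point `x0` are placeholders for a parameterised model such as equation A.17.

import numpy as np
from scipy.optimize import minimize

def lsq(rrs_model, rrs_input):
    """Least squares distance matching function of Eq. A.18."""
    return np.sqrt(np.sum((rrs_model - rrs_input) ** 2)) / np.sum(rrs_input)

def invert_pixel(rrs_input, forward, x0):
    """Downhill Simplex inversion of one pixel spectrum. `forward` maps
    the free parameters (e.g. C_CHL, C_CDOM, C_NAP, H, q) to a modelled
    spectrum; names here are illustrative, not the original API."""
    return minimize(lambda x: lsq(forward(x), rrs_input), x0,
                    method="Nelder-Mead")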

Appendix B

Noise Estimation in Shallow Water

To provide a foundation to the concept of noise estimation for remote sensing data and the method used to derive the covariance matrix used in Chapter 7, we present a reproduction of our paper recently published in IEEE Transactions on Geoscience and Remote Sensing as:

Sagar, S., Brando, V. & Sambridge, M., 2014. Noise Estimation of Remote Sensing Reflectance Using a Segmentation Approach Suitable for Optically Shallow Waters. IEEE Transactions on Geoscience and Remote Sensing, 52(12), pp. 7504–7512.

Abstract – This paper outlines a methodology for the estimation of the environmental noise equivalent reflectance in aquatic remote sensing imagery, using an object based segmentation approach. Noise characteristics of remote sensing imagery influence directly the accuracy of estimated environmental variables, and provide a framework for a range of sensitivity, sensor specification and algorithm design studies. The proposed method enables estimation of the noise equivalent reflectance covariance of remote sensing imagery through homogeneity characterisation using image segmentation. The method is first tested on a synthetic data set with known noise characteristics, and is successful in estimating the noise equivalent reflectance under a range of segmentation structures. Testing on a PHILLS hyperspectral image in a coral reef environment shows the method to produce comparable noise equivalent reflectance estimates in an optically shallow water environment to those previously derived in optically deep water. This method is of benefit in aquatic studies where homogenous regions of optically deep water were previously required for image noise estimation. The ability of the method to characterise the covariance of an image is of significant benefit when developing probabilistic inversion techniques for remote sensing.


INTRODUCTION

The ability to assess the noise characteristics of remote sensing imagery is an important step in establishing its capabilities and limitations in an operational and scientific context. Following Scales and Snieder (1998), noise can be defined as 'that part of the data that we do not expect the model to explain', by which we capture the deterministic components of the data in addition to noise characteristics that must be modeled as stochastic processes. Noise characteristics which encompass the full sensor-atmosphere-target system in the remote sensing imagery processing chain defined by Schott (1997) are an important component in sensitivity analysis and inversion studies (Gao 1993; Hoogenboom et al. 1998; Timmermans et al. 2009; Wettle et al. 2009; Camps-Valls et al. 2011; Hedley et al. 2012b; Botha et al. 2013), quantifying the theoretical levels of information that can be extracted from a given system and image. Accurate characterization of noise is essential in inversion methods which propagate uncertainty estimates to parameter retrievals (Brando and Dekker 2003; Wang et al. 2005; Lee et al. 2010b; Hedley et al. 2012a), provide confidence or reliability indicators on estimated parameters (Brando et al. 2009), or require a balance of the relative influence of data and prior information in Bayesian style cost functions (Tarantola 2005; Lauvernet et al. 2008; Laurent et al. 2013).

The importance of noise is also highlighted in studies that extend sensitivity analysis to encompass implications for algorithm and sensor design. Hedley et al. (2012b) use sensor noise properties to assess the capabilities of a full remote sensing system to accurately discriminate benthic classes in coral reef environments under a range of environmental and acquisition conditions. Alternatively, noise characteristics of a particular system can be used to optimise algorithm design for specific applications, such as water quality assessment (Brando and Dekker 2003; Giardino et al. 2007).

One such measure commonly used in aquatic applications (which is the focus of this paper) is the environmental noise equivalent reflectance difference (NEΔR). Incorporated in this measure is the signal-to-noise of the instrument, together with scene specific influences resulting from atmospheric variability, the air-water interface, and refractions of diffuse and direct sky and sunlight (Brando and Dekker 2003; Wettle et al. 2009). This target-level independent measure is in contrast to the commonly evaluated sensor-level signal-to-noise ratio (SNR) (Gao 1993; Hu et al. 2012). In this work the NEΔR is considered as a property of the sub-surface remote sensing reflectance (rrs, sr−1), defined as Lu/Ed, where Lu is the upwelling radiance and Ed is the downwelling irradiance evaluated just below the water surface.

Techniques for the estimation of NEΔR have largely focused on the derivation of a NEΔR spectrum with a standard deviation value assigned to each image band (Dekker and Peters 1993; Green et al. 2003; Wettle et al. 2004). These are based on the definition of image noise as additive with a zero mean and Gaussian distribution (Corner et al. 2003). Importantly, these methods assume independence of the noise estimates between bands, as they do not account for any spectral correlation of the image based NEΔR. This issue is particularly relevant when considering remote sensing data in a Bayesian inversion context, where an accurate representation of noise in the multi-dimensional space of the problem (spectral in this case) is crucial to the successful implementation of many sampling algorithms (Agostinetti and Malinverno 2010; Camps-Valls et al. 2011).

The kernel-based method described by Wettle et al. (2004) is an extension of the pixel growing approach developed by Dekker and Peters (1993). Here the standard deviation of each band is derived by sampling over an expanding number of pixels (by increasing a square kernel size) in a homogenous region of the image until a first asymptotic standard deviation level is reached. Assuming the homogeneity of the sampled region, these standard deviation values are taken as a NEΔR estimate, under the assumption that any spectral variation in the sampling region can be attributed to environmental noise. Both methods require a homogenous sampling area of sufficient size to allow robust standard deviation estimates. Guo and Dou (2008) implement a semi-variogram approach modified from Curran and Dungan (1989) which limits adverse effects of any heterogeneity of the sampling area. However, they recognize the need for an automated location tool, such as that used by Wettle et al. (2004), to minimize the errors introduced by spatial variability.

This kind of methodology has been applied in previous optically shallow water studies, defined by a quantifiable contribution of the substratum to the subsurface reflectance signal of the water body (Brando et al. 2009; Dekker et al. 2011; Hedley et al. 2012a). However, estimation in these studies has been made over optically deep water, where a homogenous signal can be assumed and a band wise standard deviation for the entire image estimated. The obvious restriction in these cases is then the availability of a suitable deep water area in the same image as the shallow water study area. Furthermore, the spatial features of wave-induced sun-glint in deep water areas can compromise the spectral homogeneity requirements of a sample area. These glint features can be considered deterministic components of the noise, and can be modeled and corrected for by image 'de-glinting' (Hedley et al. 2005, 2012a), though there remains conjecture on the suitability of this process in shallow water areas where a negligible near infra-red signal cannot be assumed (Kutser et al. 2009). Hence, we wish to avoid these features in both shallow and deep study areas, and focus on the elements of the noise that can be modeled stochastically, such as the pixel-to-pixel variations of the environmental noise components.


We propose a method that takes advantage of image based segmentation to define homogenous object segments in a remote sensing data set. By identifying the segments with the highest homogeneity, we show how an ensemble of residuals can be extracted that can be used to estimate the NEΔR covariance matrix from a highly heterogeneous optically shallow environment. The homogeneity testing of segments used in our method shares some similarities with the approach developed by Gao (1993), modified for an ocean colour sensitivity study by Hu et al. (2012). In that study, the authors assess the homogeneity of a number of small pixel kernels to develop standard deviation histograms for the estimation of the sensor SNR. Their technique is applied specifically to optically deep water, and focused on the image invariant estimation of a sensor SNR spectrum. In contrast, we apply a segmentation approach to allow estimation of the image based NEΔR covariance in an optically shallow environment.

The following sections describe the foundation of our method, the synthetic test data and airborne hyperspectral data to which it is applied, and then the results of its application, including, in the airborne hyperspectral study, a comparison of the results derived over a shallow section of the image to those obtained using a pixel growing technique in deep water. The paper is concluded with a discussion of the two case studies. The importance of a covariance noise matrix, which allows implementation of Bayesian style inversion methods for remote sensing data, is emphasised.

THE METHOD

The covariance noise matrix C has an important role in Bayesian style Markov chain Monte Carlo (McMC) inversion methods, such as the Metropolis-Hastings algorithm (Metropolis et al. 1953; Hastings 1970). In these methods the sampling is controlled by the likelihood, p(d | m), which is the probability of obtaining the data, d, from the model, m. For normally distributed correlated noise, it is given by:

\[ p(\mathbf{d} \mid \mathbf{m}) = \frac{1}{(2\pi)^{N/2} \, |\mathbf{C}|^{1/2}} \exp\left( -\frac{1}{2} \mathbf{e}^T \mathbf{C}^{-1} \mathbf{e} \right) \tag{B.1} \]

where N is the number of data observations, and e is an N-vector of data error residuals calculated from the observed and predicted data. In the case of a pixel in remote sensing data, C is a square data covariance matrix of dimension (B x B), where B is the number of spectral bands. If the data noise is band-wise uncorrelated, C will then be diagonal. A common determination of C is via the analysis of the error residuals (Gouveia and Scales 1998):

\[ \mathbf{C} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{r}_i \mathbf{r}_i^T \tag{B.2} \]

where N is the number of data observations (pixels), and $\mathbf{r}_i = (r_{i1}, \ldots, r_{iB})$ is a vector of data error residuals of size B. To evaluate the noise residuals in remote sensing data we must determine their deviation from a 'true' spectrum. In this manner we use the theory behind the pixel growing techniques described earlier (Dekker and Peters 1993; Wettle et al. 2004). Critically, we make the same assumption that a homogenous region of pixels is available with stationary zero mean Gaussian noise. Therefore the mean (or median) band values for a homogenous grouping of pixels (or segment) represent the 'true' values from which the noise residuals $r_{ij}$ can be determined for the ith pixel in the segment:

\[ r_{ij} = \bar{x}_j - d_{ij} \tag{B.3} \]

where, given a number of bands B, $\bar{x}_j$ is the mean value for band j (j = 1, ..., B) determined from all pixels in the segment, and $d_{ij}$ is the value of band j for pixel i in the segment. Each vector $\mathbf{r}_i$ for the N pixels in the segment can then be substituted into Eq. B.2 to produce the C matrix of dimension B x B.

To this point, the method differs little from the pixel growing technique. Indeed, the residuals could be determined from the best determined kernel size described in Wettle et al. (2004), and a covariance matrix calculated using Eq. B.2, from which the diagonal will be equivalent to the NEΔR spectra determined by pixel-growing. The method proposed in this work acquires the ensemble of noise residuals by assuming that environmental noise is not only additive, but also uncorrelated and uncoupled to signal magnitude (Dekker and Peters 1993; Wettle et al. 2004; Hedley et al. 2012a). In the literature, it is common for the single deep water derived NEΔR estimate to be assumed and adopted as invariant for the full image (Wettle et al. 2004; Brando et al. 2009; Dekker et al. 2011; Hedley et al. 2012a; Uss et al. 2012). Therefore, noise residuals from one homogenous region will be of the same distribution as noise residuals from any other region of the same degree of homogeneity, regardless of the magnitude of the signals in the respective regions. Hence, the residual vectors used in the final calculation of C (Eq. B.2) can be derived from a user-defined number of homogenous regions, combined to form the full ensemble of pixel based noise residuals. It is important to note that the assumption of additive noise independent of signal may not hold true over bright high-reflectance targets, and hence, as stated by Wettle et al. (2004), this methodology is primarily designed for low-radiance targets such as the aquatic applications covered in this paper.
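A compact sketch of equations B.1-B.3, assuming each segment arrives as an (N, B) NumPy array of pixel spectra, is given below; evaluating the likelihood in log form is our choice for numerical stability and is not part of the published formulation.

import numpy as np

def segment_residuals(pixels):
    """Eq. B.3: residuals of each pixel spectrum about the segment mean.
    `pixels` is an (N, B) array of N pixel spectra with B bands."""
    return pixels.mean(axis=0) - pixels

def covariance_from_residuals(residuals):
    """Eq. B.2: C = (1/N) sum_i r_i r_i^T, giving a (B, B) matrix."""
    N = residuals.shape[0]
    return residuals.T @ residuals / N

def log_likelihood(e, C):
    """Log of Eq. B.1 for a single pixel residual vector e of size B."""
    B = e.size
    sign, logdet = np.linalg.slogdet(C)
    return -0.5 * (B * np.log(2.0 * np.pi) + logdet
                   + e @ np.linalg.solve(C, e))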


Definition of homogenous segments was completed using the object based segmentation software eCognition, which produces multi-scale image segments by taking into account a weighted criteria based on the homogeneity of the 'colour' and 'shape' of the segment (Trimble 2012). In remote sensing analysis, colour is defined as the spectral signature of the data recorded in each pixel. Shape is further split into two sub-criteria, smoothness and compactness, which evaluate the relative length and structure of the object. Finally, the user can select a scale parameter to reflect the image environment being interpreted. The formulae underlying these criteria can be found in Hofmann et al. (2008), as well as in the documentation for the eCognition software (Trimble 2012). Segmentation of the data in our process is strongly weighted towards the 'colour' criteria, in order to produce as spectrally homogenous regions as possible; although, as with many software based object based segmentation packages, there remains a degree of user-specified subjectivity and iterative estimation, particularly in terms of the scale parameters used for a segmentation.

In previous work, Gao (1993) estimates a noise spectrum from hyperspectral imagery based on object seeking segmentation, using all segments (above a size threshold) in the subsequent algorithm. Thus, the segmentation procedure itself is key to the final noise estimation (Martin-Herrero 2008). In this work we develop a methodology which attempts to deal with this degree of subjectivity in the segmentation results. Segmentation results are assessed in two steps. Firstly, we calculate the standard deviation $S_j$ of each band j (j = 1, ..., B) for each segment:

\[ S_j = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( x_{ij} - \bar{x}_j \right)^2 } \tag{B.4} \]

where N is the number of pixels in the segment, $x_{ij}$ is the value of band j in pixel i, and $\bar{x}_j$ is the mean of band j for all N pixels. An average standard deviation over all bands is then calculated for each segment, providing an initial ranking of the level of homogeneity of all segments:

\[ S_{ave} = \frac{\sum_{j=1}^{B} S_j}{B} \tag{B.5} \]

Next, we exclude pixels within segments that contain extreme outlying values, to ensure the ensemble consists of normally distributed residuals. This is done by identifying outliers in the error residuals calculated for each band/pixel in a segment using a modified z-score proposed by Iglewicz and Hoaglin (1993):

\[ M_{ij} = \frac{0.6745 \left( x_{ij} - \tilde{x}_j \right)}{\mathrm{median}\left( \left| x_{ij} - \tilde{x}_j \right| \right)} \tag{B.6} \]

where $x_{ij}$ is the value of band j in pixel i, and $\tilde{x}_j$ is the median of band j from all pixels in the segment. This formulation is less susceptible to extreme values in the tails of the residual distribution than the standard z-score or mean based indicators used to detect outliers. Iglewicz and Hoaglin (1993) suggest an $M_{ij}$ threshold of 3.5 to flag potential outliers under a normality assumption.

By ranking the homogeneity of the segments, and by testing these segments under the assumption of a normal distribution, we minimise errors introduced by the segmentation procedure (allowing for an off-the-shelf package to be used). In this context, errors refer to groups of pixels that may be incorporated into an otherwise homogenous segment as a result of the segmentation scale, such as extended glint features or varied benthic features, which are not part of the noise processes we wish to model stochastically. To ensure noise estimation on the largest sample of residuals possible, the 10 most homogenous segments are selected, under an increasing pixel-per-segment threshold, beginning at 100 and increasing in steps of 100. The resulting residual ensemble sets, sized from min = (1000, 2000, ...), are tested under the modified z-score outlier test, and the largest ensemble in which the percentage of outliers detected remains under 5% is accepted. These homogenous segments, resulting in normally distributed residuals $r_{ij}$, are used for the final estimation of the covariance matrix, with the tested residuals substituted into Eq. B.2 to produce the between band covariance matrix C.
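The ranking, outlier testing and iterative ensemble selection just described could be sketched as follows. The hard-coded 3.5 flag threshold and 5% acceptance limit follow the description above, but some details (for example, applying the modified z-score test to the pooled residuals rather than segment by segment) are our simplification for illustration.

import numpy as np

def rank_segments(segments):
    """Rank segments by S_ave (Eqs. B.4-B.5): the per-band standard
    deviation averaged over bands. `segments` is a list of (N, B) arrays;
    the most homogenous segments sort first."""
    return sorted(segments, key=lambda s: s.std(axis=0).mean())

def modified_z(values):
    """Modified z-score of Eq. B.6, computed band-wise about the median.
    Assumes non-degenerate input (non-zero median absolute deviation)."""
    med = np.median(values, axis=0)
    mad = np.median(np.abs(values - med), axis=0)
    return 0.6745 * (values - med) / mad

def build_ensemble(segments, step=100, limit=0.05, keep=10):
    """Raise the pixels-per-segment threshold in steps of `step`, pool
    residuals from the `keep` most homogenous qualifying segments, and
    accept the largest ensemble whose flagged fraction (|M| > 3.5) stays
    under `limit`."""
    best, size = None, step
    while True:
        pool = [s for s in rank_segments(segments) if len(s) >= size][:keep]
        if len(pool) < keep:
            return best
        residuals = np.vstack([s.mean(axis=0) - s for s in pool])
        if np.mean(np.abs(modified_z(residuals)) > 3.5) >= limit:
            return best
        best, size = residuals, size + step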

DATA

Synthetic Data with Known Correlated Noise

To test the ability of the method to accurately retrieve noise characteristics, a synthetic test was constructed based on known noise statistics. The synthetic data was developed to reflect an optically shallow, spatially heterogeneous environment in which application of the kernel based methods of Wettle et al. (2004) and Dekker and Peters (1993) may encounter difficulties, due to their requirements of extended areas of homogeneity. A sub-surface remote sensing reflectance ($r_{rs}$) data set was constructed using the parameterised semi-analytical radiative transfer model detailed in Brando et al. (2009) (Eq. B.7), based on the model developed by Lee et al. (1998, 1999):

\[ r_{rs}^{model} = f\left( C_{CHL}, C_{CDOM}, C_{NAP}, H, q_{ij} \right) \tag{B.7} \]


Figure B.1 – Inputs for creation of the synthetic Heron Island data set, from left: Bathymetry, CHL concentration, CDOM concentration, NAP concentration, Q substrate ratio.

where $C_{CHL}$ is the concentration of Chlorophyll-a, $C_{CDOM}$ is the concentration of CDOM (Coloured Dissolved Organic Matter), $C_{NAP}$ is the concentration of NAP (Non-Algal Particles), $H$ is the water column depth, and $q_{ij}$ is the proportional contribution ($q_{ij}$ and $1-q_{ij}$) of any two substrates (in this case, coral and sand) to the benthic reflectance. Specific inherent optical properties (SIOPs) of the water column and its constituents required in the Brando et al. (2009) model were parameterised using values obtained in the Heron Island (Great Barrier Reef, Queensland, Australia) 2004 field work campaign detailed in Wettle (2005), and the subsequent technical report by Wettle and Brando (2006).

Initial spatial variability of the data set was represented using an input bathymetry (H) layer from Heron Island derived by Hedley et al. (2009) using an adaptive LUT approach from high resolution 1 m pixel CASI (Compact Airborne Imaging Spectrometer) imagery data. A subset of this image was selected from a section of the lagoon with a variable bathymetry, exhibiting features at various scales such as coral bommies and ridges, channels and shallow platforms. Based on this subset, a study area was constructed of 287,182 pixels with a depth range of 0.2-16.4 m over an area 511 m by 562 m. Spatial variation of the remaining four parameters was reflected by the layers shown in Figure B.1, using values within the range of other studies using this model (Wettle and Brando 2006; Brando et al. 2009). Distribution of the q substrate proportion was loosely correlated to the bathymetry layer to create a higher proportion of sand at depth, and a higher proportion of coral in shallower areas.

Using the parameterised Brando et al. (2009) forward model and the layers shown in Fig. B.1, a synthetic $r_{rs}$ image data set was constructed (Fig. B.2). Initial data was generated from 350-900 nm at 1 nm intervals, and then reduced to 4 bands using the spectral response filter of the multi-spectral Quickbird sensor (Fig. B.3a). Random, correlated Gaussian noise was then added to each band and pixel of the data set using a band based standard deviation of 0.002 sr−1 and a correlation matrix as shown in Fig. B.3b.
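This style of correlated noise addition can be reproduced with a Cholesky factor of the target covariance. The sketch below assumes a positive definite (B x B) band correlation matrix R, such as that of Fig. B.3b, and the 0.002 sr−1 band standard deviation quoted above; the function name is ours.

import numpy as np

def add_correlated_noise(rrs, R, sigma=0.002, seed=0):
    """Add zero-mean Gaussian noise with band correlation matrix R and
    per-band standard deviation `sigma` to an (npix, B) spectra array.
    R must be positive definite for the Cholesky factorisation."""
    rng = np.random.default_rng(seed)
    C = sigma ** 2 * R                  # target covariance matrix
    L = np.linalg.cholesky(C)           # C = L L^T
    noise = rng.standard_normal(rrs.shape) @ L.T
    return rrs + noise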

Figure B.2 – Synthetic Heron Island subset $r_{rs}$ image data at representative Quickbird spectral resolution. Example spectra shown for varied water column depths.

Figure B.3 – (a) Quickbird spectral response filter and (b) correlation matrix used to apply random Gaussian 0.002 sr−1 noise to the $r_{rs}$ synthetic data.


Figure B.4 – Lee Stocking Island PHILLS study area – polygon defining the optically shallow sub-section used for noise estimation.

Airborne Hyperspectral Data with Correlated Noise

We tested the algorithm on image data acquired from the Ocean PHILLS (Portable Hyperspectral Imager for Low-Light Spectroscopy) sensor [46] over Lee Stocking Island (LSI) in the Bahamas (Fig. B.4). This dataset is described in detail in Dekker et al. (2011), with the geometric, radiometric and atmospheric processing applied to the data outlined in Mobley et al. (2005). For comparison to the kernel-based NEΔR spectra derived by Dekker et al. (2011), we adopt the specifications of the dataset used in the kernel-based estimation: 83 bands, ~5 nm wide, between 402.4 and 800.4 nm. In the LSI study area, we define an optically shallow subsection of the image (0 to ~11 m depth) (Fig. B.4) based on a combination of the acoustic validation data and derived results detailed in Dekker et al. (2011). This is in contrast to the optically deep section in the northeast of the study area image, in which the kernel-based method of Wettle et al. (2004) was used for estimation of the spectral NEΔR (Dekker et al. 2011).

Segmentation

We completed three segmentations of the synthetic data, using a range of scale values in the eCognition process. These settings constrain the size and structure of the image object segments, as the process optimizes the homogeneity of the object segments created under these constraints. It can clearly be seen that an increase in the scale value produces a segmentation with larger object segments, with an accompanying increase in heterogeneity in some of these segments (Fig. B.5). Hence, a degree of user input and subjectivity is required in this process, and it is through this variation that we are examining the influence of these subjective segmentation parameter choices on the final estimation of the NEΔR for the data set.

Figure B.5 – Segmentation of synthetic data - illustration of the effect of scale on the size of derived segments: a) Scale 5, b) Scale 9.

To select the 10 segments for evaluation, the iterative size threshold and outlier testing process determined a size threshold of 500 pixels per segment for each scale segmentation, resulting in an ensemble of >5000 pixels for each noise estimate.

To examine the sensitivity of noise estimation to the segmentation procedure as applied to a real data set, the LSI data was also segmented under a number of scale regimes, ranging from scale 3 to 7 (Fig. B.6). Object segments were then ranked, tested and extracted as in the synthetic study, with a threshold of 100 pixels per segment reached for each of the segmentation scales before the residual ensemble produced >5% outliers. In order to compare and contrast the deep water kernel NEΔR of Dekker et al. (2011) and our estimation made in the shallow water region, segments were also extracted and residuals calculated for the eastern deep water area of the LSI data.


Figure B.6 – Influence of the lower and upper scale limits on the segmentation of LSI PHILLS data, shown over a small sub-set of the study area: (left) Scale 3, (right) Scale 7.

RESULTS

Synthetic Study

For each segmentation scale in the synthetic data test we were able to analyse a ten segment ensemble based on a minimum segment size of 500 pixels after outlier testing. The resulting NEΔR covariance matrices are shown in Fig. B.7, in comparison to the treatment covariance matrix which was applied to the original data. In these results we see no noticeable effect of the segmentation scale increase on the accuracy of the derived covariance matrices, with a close estimation of the original NEΔR covariance matrix across all three scales.

PHILLS Case Study

After the segment size threshold and outlier testing, residual ensembles for each segmentation scale of the PHILLS data were restricted to segments of a minimum of 100 pixels. A comparison of the covariance matrix C estimated from these four scales of segmentation and their respective top 10 segment residuals is shown in Fig. B.8. These estimates, having been drawn from the optically shallow areas of the data, can then be compared to the matrix estimated using the segmentation in the deep water region of the study area (Fig. B.8e). A stronger noise correlation between spectral bands can be identified between 400-600 nm in all estimations of C, with the magnitude of the correlations slightly decreased in the deep water matrix estimation.

Much of the correlated noise observed in Fig. B.8 occurs in the 450-550 nm region, where variability in depth or benthic composition will affect the local homogeneity of these bands for remote sensing imagery in optically shallow waters. An increase in covariance in this spectral region is noted for scale 7 (Fig. B.8d), where the large segments are likely to encompass more variability in the benthic composition than the smaller scales (Fig. B.8a-c) or than the optically deep water segments (Fig. B.8e). Visually this effect is evident in Fig. B.6, where the large segments show more variability in the true colour composite.

From these estimated covariance matrices, we extract the NEΔR spectra estimates for comparison to the kernel estimation spectra used in Dekker et al. (2011) (Fig. B.9). Consistent with the covariance matrix estimates, the larger scale segmentation (scale 7) generates a NEΔR spectrum higher than those estimated by the other three smaller segmentation scales. In comparison to the kernel based estimation, a marked decrease in the NEΔR estimate between ~400-700 nm is observed in all segmentation based spectra except the scale 7 estimation. This suggests a possible deviation from the normal distribution of residuals assumption in this larger scale segmentation.

It could be considered that the glint present in the LSI data (Dekker et al. 2011) would contribute to an increased noise estimation when using a kernel based method, as the method does not allow isolation or removal of these glint affected features. By estimating the NEΔR using the segmentation approach, glinted features can be accounted for at both the segmentation and outlier detection stages. We can illustrate this by examining the deep water segmentation derived NEΔR in comparison to the kernel based NEΔR derived in the same deep water section of the data. Comparison shows the deep segment based NEΔR to be of approximately the same magnitude as those estimated in shallow waters, and significantly lower than the deep water kernel based spectra.

Figure B.7 – Estimated Heron Island NEΔR covariance matrices – comparison between the original applied covariance and estimates made from segmentation scale variations.

Figure B.8 – LSI PHILLS NEΔR covariance matrices – estimated at four scaling levels in the optically shallow study area (a-d) in comparison to deep water estimation (e).

Figure B.9 – NEΔR spectra estimated for LSI PHILLS data – comparison between shallow water scale variations and deep water estimations (segment and kernel based).

DISCUSSION

In the synthetic data case study, our method has been shown to be robust in dealing with a range of segmentation regimes, with outlier testing enabling an accurate estimation of the NEΔR covariance, independent of the segmentation parameters chosen. Extending the method to the LSI data example, there remains a degree of threshold determination required by the user in terms of the scale of segmentation in relation to the data source. As the scale of the segmentation increases, and therefore the size of the object segments created, the homogeneity of the segments will be decreased to an extent where the normal distribution of residuals assumption may become invalid and unable to be accounted for using an outlier test. Hence, the NEΔR estimate will increase at this point in correlation with the increase in segmentation scale. This can be seen in Fig. B.9 in the increase of the NEΔR spectra from the smaller scales (3-5) to the scale 7 estimation.

One of the primary drivers of segment homogeneity on a pixel-to-pixel basis is the spatial variability of benthic composition, and the scales on which this occurs. The noise correlations observed in Fig. B.8 show some variations across even the smaller segmentation scales, and between shallow and deep water, suggesting that the outlier detection process was not fully able to account for this effect. Notwithstanding, the estimate of the NEΔR spectra (i.e. the diagonal of the covariance matrix) is consistent between the smaller scales and the deep water estimate in Fig. B.9, confirming the potential of the proposed approach to estimate noise properties across optically shallow and deep portions of the imagery. To mitigate against the extent of variability from benthic composition, it is recommended to use the smallest scale segmentation capable of producing object segments with a sufficient number of pixels for meaningful statistical properties to be obtained.

It is also worthwhile considering which elements of the data are being segmented in each of the test cases. In the synthetic data, the only non-deterministic component of the data is the random noise added; therefore partitioning is driven by the underlying subsurface remote sensing signal. In the real data case, other elements become present, such as pixel-to-pixel air-water interface variations, glint features and residual artifacts from data processing. Some of these features can often be corrected for deterministically (e.g. glint) and are hence avoided with the segmentation; however, the underlying pixel-to-pixel variation remains at all scales of segmentation, and a segment will only ever be as homogenous as these underlying processes. The influence of these added noise components in the real data study is shown by the decrease in the minimum size of the segments in the LSI study that are considered valid after the outlier process (from 500 in the synthetic study, to 100 in the real data study).

The use of multi-scale segmentation enables statistical testing of homogeneity over a larger sample of observations in comparison to kernel restricted methods, particularly in areas of high spatial variability. The irregular nature of the segments allows adaptation to the spatial structure of the data; a higher number of selected pixels then enables outlier testing and removal without discarding the remaining data. Hu et al. (2012) apply a fixed window of 3 x 3 pixels on an open ocean environment, testing each window for homogeneity using a maximum/minimum ratio threshold. The standard deviation (SD) of each band in the window is then calculated, and the SD histogram distribution of all the individual windows is used to iteratively determine the appropriate max/min threshold. In this method, the SD estimate of each band is drawn from a small sample of 9 pixels, and any outliers in the window require the full sample to be discarded.

It is important to note that, similar to Wettle et al. (2004), our method does not look to account for spatially correlated noise such as striping, and a natural property of the segmentation and outlier detection process is the avoidance of these features. This property of our method can be extended to other spatial features such as glint, which can be dealt with effectively in deep water areas where a kernel estimation may be forced into encompassing the spectral variation caused by glinted features into the NEΔR estimation. This is clearly shown in Fig. B.9, where the kernel based estimation of NEΔR is higher than that estimated using the segmentation method. The glint induced overestimation of NEΔR could have possible implications for pixel based inversion methods such as Brando et al. (2009), where one NEΔR for the entire image is used to characterise the noise for each individual pixel, glinted or un-glinted.

The ability of the method to extract an NEΔR estimate in optically shallow water, comparable to that derived in deep water, is of significant practical benefit in aquatic studies where optically deep water may not be available in the image data. Whilst the kernel based methods do not explicitly rule out estimation from these regions, the highly heterogeneous spatial nature of these areas often means a single square area of sufficient size and homogeneity is not available. This feature of our method is of potential benefit when considering aspects of spatial correlation in the noise processes being examined. As previously stated, we are working under a commonly used assumption of an image invariant noise; however, it is acknowledged that the pixel-to-pixel component of the NEΔR has the potential to vary across an image, which would require more sophisticated development of these kinds of image based noise estimators. If one were to look to account for this, we would consider it a positive development to be able to estimate noise more directly in the area in which further analysis is to be completed. In application to the LSI study area, containing spatial variability of depth, substrate and water column constituents as well as pixel-to-pixel components, we have produced NEΔR estimates from an optically shallow region comparable to those estimated from the deep water areas of the data set.

A significant feature of our method is that it not only estimates a spectral NEΔR, but also characterises the full noise covariance matrix of the data, including off-diagonal cross-covariances. Gao et al. (2008) acknowledge and treat the band spectral correlation in hyperspectral imagery using a multiple linear regression; however, the noise estimate produced thereafter does not seek to estimate any remaining correlation of noise. The estimation of noise as a spectral signature is common to all the techniques assessed in a comparative study of noise estimation by Gao et al. (2013), including those of Gao (1993) and Gao et al. (2008). Moving to an estimate of spectral noise covariance becomes particularly important in the application of Bayesian inversion methods to remote sensing data, an area that is much less explored than optimisation style inversion techniques. This is an important step in dealing with the high-dimensional challenges of many remote sensing inversion problems using a probabilistic approach.

CONCLUSIONS

We have introduced a segmentation based noise estimation methodology that can successfully estimate the noise equivalent reflectance spectra and noise covariance matrix from aquatic remote sensing imagery over both optically deep and optically shallow sites. The methodology has been shown to be insensitive to the parameters used in the segmentation when applied to a synthetic test data source. Extension to a hyperspectral data set over an optically shallow coral reef site has identified a minimal amount of user input required to relate the scale of the segmentation to the input data source, and to mitigate against the inclusion of benthic substrate variability.

Estimation of the noise equivalent spectra and covariance matrix has been completed in optically shallow water and glint affected regions of a hyperspectral airborne PHILLS data set, with these results shown to be comparable to those estimated using a commonly used kernel approach in optically deep water. This represents a potentially significant practical benefit in aquatic studies where optically deep water cannot be observed or significant sun-glint is present.


ACKNOWLEDGMENTS

The derived Heron Island bathymetry data were provided by John Hedley and Stuart Phinn, with funding for the original CASI dataset provided by Australian Research Council grants to S. Phinn. The Lee Stocking Island PHILLS data were collected and processed by Curtiss Davis and the PHILLS project team at the U.S. Naval Research Laboratory. Curtis Mobley provided the processed PHILLS data and ancillary data sets for use by the participants at the ARC/CSIRO/ONR Shallow Water Workshop held at Moreton Bay, Australia in 2009. The Lee Stocking Island work was funded through ONR as part of the Coastal Benthic Optical Properties (COBOP) accelerated research initiative. Calculations were performed on the TerraWulf III, a computational facility supported through the AuScope inversion laboratory. AuScope Ltd is funded under the National Collaborative Research Infrastructure Strategy (NCRIS) and the Education Infrastructure Fund (EIF), both Australian Commonwealth Government programmes. This work was partly funded by CSIRO's Wealth from Oceans Flagship. We thank Arnold Dekker and three anonymous reviewers for insightful comments on this manuscript.