DEVELOPMENT OF A PREDICTIVE BAYESIAN DATA-DERIVED MULTI-MODAL GAUSSIAN MAXIMUM LIKELIHOOD MODEL OF SUNKEN OIL MASS

A Progress Report Submitted to The Coastal Response Research Center

Submitted by
Dr. James Englehardt
Angelica Echavarria-Gregory
Department of Civil, Architectural and Environmental Engineering
University of Miami
1251 Memorial Drive, McArthur Engineering Building, Suite 321
Coral Gables, Florida, 33124-0630

Reporting Period: August 15, 2008 (start date) to October 15, 2008
Submission Date: October 15, 2008

This project was funded by a grant from NOAA/UNH Coastal Response Research Center. NOAA Grant Number(s): NA04NOS4190063. Project Number: 08-088


I. Accomplishments Since Project Start

I.A.) Scheduled Tasks / Objectives

Task 1.1: Literature Review
Task 1.2: Description of the Conceptual Model
Task 1.3: Identification of Data Sources
Task 1.4: Identification of Essential Needs of the Future Software
Task 1.5: Data Acquisition from Sources
Task 2.1: Programming Language Selection, Software Compiling, Testing for Compatibilities with Math and Theory
Task 2.2: Submission of First Detailed Report

I.B.) Progress on These Tasks / Objectives

Task 1.1: Literature Review

A literature review has been conducted with the objective of identifying bathymetric conditions for which the proposed predictive Bayesian superimposed Gaussian model would be expected to be effective for locating sunken oil on the basis of limited field data. A brief summary of the more relevant references is given below.

Protocols for NRDA Surveys (2001) provide a summary of the behavioral patterns and mechanisms associated with the sinking of released oils. This report also describes the interaction among some physical variables (e.g., velocity and oil density) to consider during the conceptual model development phase. Beegle-Krause et al. (2006) compared oil spill events and drew conclusions about the transport mechanisms of the released oil. One major finding was the effect of wave energy on oil transport. The study also formulates a mathematical model for estimating the total wave energy at the bottom of the ocean, which is required when studying how the oil moves in deep waters. Additional information concerning wave formation can be found in Dean and Dalrymple (1991).

Task 1.2: Description of the Conceptual Model

Three team meetings have been conducted to develop the conceptual basis of the model. Questions included (a) the type and quantity of information to be expected in the days following a spill, (b) the appropriate programming environment to support NOAA response activities and interface with other software products, (c) the format of output desired (e.g., mapping interface for users), and (d) the computational approaches appropriate given the preceding requirements.

The model form has now been identified, and is explained briefly here. Predictive Bayesian probability distributions are expressions of unconditional (with respect to parameter uncertainty) probability, obtained through use of the Total Probability Theorem.
The Total Probability Theorem states that the probability of a quantity r, f(r), unconditional upon the value of s [e.g., s may be a parameter of f(r)], is:

$$ f(r) = \int f(r \mid s)\, f(s)\, ds \qquad (1) $$
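As a numerical illustration of Equation (1), the following minimal Python sketch integrates the product f(r | s) f(s) over a grid of parameter values. The exponential sampling density, the gamma parameter density, and all numerical values are illustrative assumptions chosen because their predictive density has a known closed form for comparison; they are not the project model.

```python
import math
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule for samples y on grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def sampling_pdf(r, s):
    """f(r | s): exponential density with rate s (illustrative assumption)."""
    return s * np.exp(-s * r)

def parameter_pdf(s, a=3.0, b=2.0):
    """f(s): gamma density with shape a and rate b (illustrative assumption)."""
    return b**a * s**(a - 1.0) * np.exp(-b * s) / math.gamma(a)

def predictive_pdf(r, s_grid):
    """Equation (1): integrate f(r | s) f(s) over s numerically."""
    return trapezoid(sampling_pdf(r, s_grid) * parameter_pdf(s_grid), s_grid)

# Grid over the uncertain parameter s; the upper bound is chosen so that
# the neglected tail of the integrand is negligible for this example.
s_grid = np.linspace(1e-6, 40.0, 20001)
r = 0.5

numeric = predictive_pdf(r, s_grid)
# Closed-form predictive density for this pair: a * b**a / (b + r)**(a + 1).
closed_form = 3.0 * 2.0**3 / (2.0 + r) ** 4
print(numeric, closed_form)
```

The same grid-integration pattern extends to several uncertain parameters by summing over a multi-dimensional grid, at a cost that grows with the number of grid points per parameter.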

Thus, the predictive distribution is found by multiplying the traditional Bayesian posterior for the uncertain parameter vector by the sampling distribution (in this case the superimposed Gaussian model) and integrating the product over the uncertain parameter space (Aitchison and Dunsmore 1975).

The approach used in this project is to develop a predictive Bayesian superimposed Gaussian model expressing the relative probability of finding sunken oil at various locations on a bay bottom. That is, the unconditional probability of a particle of oil being at (x, y) = X is found by integrating over the uncertain parameter space as:

$$ f(X, t) = \sum_{j=1}^{J} \iiint \gamma_j\, f_j(X, t \mid \mu_j, \sigma_j, \gamma_j)\, f(\mu_j, \sigma_j, \gamma_j)\, L_j(C_{X,t,1}, C_{X,t,2}, \ldots, C_{X,t,I} \mid \mu_j, \sigma_j, \gamma_j)\, d\mu_j\, d\sigma_j\, d\gamma_j \qquad (2) $$

in which $f_j(X, t \mid \mu_j, \sigma_j, \gamma_j)$ is the j-th Gaussian puff given knowledge of parameters $\mu_j = (\mu_x, \mu_y)_j$, $\sigma_j$, and $\gamma_j$; $\mu_j = v_j t$; $v_j$ is the j-th average sunken oil velocity vector; t is time; $\sigma_j$, with $\sigma_j^2 = 2Dt$, is the standard deviation, a measure of the effective "breadth" of the puff at time t; D is the horizontal average sunken oil dispersion coefficient; $\gamma_j$ is the fraction of sunken oil found in the j-th puff, so that $\sum_{j=1}^{J} \gamma_j = 1$; and $L_j(C_{X,t,1}, C_{X,t,2}, \ldots, C_{X,t,I} \mid \mu_j, \sigma_j, \gamma_j)$ is the likelihood function of the observed concentration data, $C_{X,t,i}$. In the absence of subjective prior information, $f(\mu_j, \sigma_j, \gamma_j)$ is set equal to a constant (unity).

Also, because it will not be known how much of the total oil spilled sinks, f(X, t) will not be a concentration or a normalized fraction of the total oil spilled, but a concentration relative to an arbitrary concentration datum (that may be roughly converted to concentration later if sufficient concentration data are available). Because the sunken oil data to be collected will be in terms of concentration (approximate, e.g., high, moderate, low) rather than location, the likelihood function will be in terms of concentration, as follows:

$$ L_j(C_{X,t,1}, C_{X,t,2}, \ldots, C_{X,t,I} \mid \mu_j, \sigma_j, \gamma_j) = \prod_{i=1}^{I} \lambda \exp(-\lambda\, C_{X,t,i}) \qquad (3) $$

in which $1/\lambda = \sum_{j=1}^{J} \gamma_j\, f_j(X, t \mid \mu_j, \sigma_j, \gamma_j)$, evaluated at the location and time of each observation, and $C_{X,t,i}$ is the observed concentration at X, t. That is, it is assumed that the concentration sampled at a point in space and time with mean $\sum_{j=1}^{J} \gamma_j\, f_j(X, t \mid \mu_j, \sigma_j, \gamma_j)$ comes from an exponential distribution of sampling error, $f(C_{X,t,i}) = \lambda \exp(-\lambda\, C_{X,t,i})$, with rate parameter $\lambda$ (mean $1/\lambda$). This error distribution was chosen because it is the maximum entropy distribution for a quantity having a fixed mean value [equal to $\sum_{j=1}^{J} \gamma_j\, f_j(X, t \mid \mu_j, \sigma_j, \gamma_j)$] and a non-negative range, as the distribution of observed sunken oil concentration around a true mean would have.

A maximum entropy distribution is the true distribution of a random quantity with respect to a given set of testable information, by the Principle of Maximum Entropy (Jaynes 1957). Information is testable if consistency of the information with the distribution can be determined. Testable information in the case of Equation (3) is a mean equal to $1/\lambda$ and a range of zero to infinity. In that case the distribution is exponential ($\lambda$), because any other distribution would assume additional information and/or not account for the testable information just described.

Task 1.3: Identification of Data Sources

It was agreed in team meetings with NOAA/CRRC that the most useful data for this project comprise data collected by NOAA on historical spills and spills of opportunity, and that such data will be supplied by NOAA. Other potential data sources identified include wave reports with the wave energy spectrum at the surface for selected NOAA buoys, from the NOAA National Data Buoy Center (NDBC); and current vector data generated for bays along the Gulf of Mexico, from Dr. Robert Weisberg, Ocean Circulation Group, College of Marine Science, University of South Florida.

Task 1.4: Identification of the Essentials of the Future Software

It was decided to develop a stand-alone product, in a programming environment expected to be compatible with other NOAA models currently under development. User inputs will include the scale of the model, geographic coordinates of spill location, and data on concentration as a function of time and space. It was further decided that the model would be developed with capability to accept concentration data without regard to the distribution of sampling locations (that is, with no requirement for a regular sampling grid). Simple operation and rapid computation are goals, to support rapid response in a variety of potential spill locations.
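The requirement to accept concentration data at arbitrary sampling locations fits the form of Equation (3), which only needs the model mean evaluated at each observation point. The sketch below is one possible illustration, not the project code: it evaluates Gaussian puffs with mean v t and variance 2Dt, computes the exponential log-likelihood at irregular sample points, and forms a discretised version of Equation (2) with a flat prior. The single puff (J = 1), the known value of D, the isotropic puff shape, the velocity grid, and all data values are hypothetical assumptions.

```python
import numpy as np

def puff_density(xy, t, v, D):
    """Isotropic 2-D Gaussian puff centred at mu = v*t with variance 2*D*t."""
    t = np.asarray(t, dtype=float)
    mu = t[..., None] * np.asarray(v, dtype=float)
    var = 2.0 * D * t
    d2 = np.sum((np.asarray(xy, dtype=float) - mu) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * var)) / (2.0 * np.pi * var)

def mixture_mean(xy, t, velocities, D, gammas):
    """Sum over puffs of gamma_j * f_j(X, t): the mean 1/lambda in Equation (3)."""
    return sum(g * puff_density(xy, t, v, D) for g, v in zip(gammas, velocities))

def log_likelihood(conc, xy, t, velocities, D, gammas):
    """Log of Equation (3): exponential sampling error with rate 1/mean."""
    lam = 1.0 / mixture_mean(xy, t, velocities, D, gammas)
    return float(np.sum(np.log(lam) - lam * conc))

# Hypothetical observations at irregular (x, y) locations, all at time t_obs.
D = 0.01
t_obs = 10.0
true_v = (0.1, 0.05)
sample_xy = np.array([[0.8, 0.4], [1.3, 0.6], [0.2, 0.1], [1.0, 1.2], [1.6, 0.3]])
# Noise-free synthetic concentrations generated from the assumed "true" puff.
sample_conc = puff_density(sample_xy, t_obs, true_v, D)

# Flat prior over a grid of candidate velocities; single puff, gamma = 1.
candidates = [(vx, vy)
              for vx in np.linspace(0.0, 0.2, 21)
              for vy in np.linspace(0.0, 0.1, 11)]
log_l = np.array([log_likelihood(sample_conc, sample_xy, t_obs, [v], D, [1.0])
                  for v in candidates])
weights = np.exp(log_l - log_l.max())   # subtract the maximum for stability
weights /= weights.sum()

best_v = candidates[int(np.argmax(log_l))]

# Discretised Equation (2): likelihood-weighted average of the puff density.
query = np.array([1.0, 0.5])
predictive = sum(w * puff_density(query, t_obs, v, D)
                 for w, v in zip(weights, candidates))
print(best_v, predictive)
```

Because the likelihood is evaluated point by point, the sample locations never need to lie on a regular grid; only the candidate parameter values are gridded here, and that choice is a convenience of the sketch.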
Output will be presented graphically in geographic form, with capability for possible animation to show the predicted location of a sunken oil mass on a map as a function of time after a spill event.

Task 1.5: Data Acquisition from Sources

We are coordinating with NOAA/CRRC to receive data on previous spills, for model verification, where possible. Data are to be screened and sent when available.

Task 2.1: Programming Language Selection, Software Compiling, Testing for Compatibilities with Math and Theory

As suggested by NOAA, the model is being developed in the open-source Python 2.5 environment, along with the following companion packages, modules and/or their dependencies (not listed). All modules have been successfully located, compiled, and installed within the Python directory (Table 1):

Table 1. Python Modules or Compiled Software to be used in the Creation of the Program

Module / Package              Use
WinPy                         Python editor for Windows.
NumPy                         Multidimensional and numeric module.
SciPy                         Scientific library / functions.
Mayavi2 / tvtk, VTK, mlab     3D graphics and interactive tools, visualization tool kits and interface, respectively.
Matplotlib                    2D plotting library.
RPy                           Module to use R in Python; may be used for statistical analysis.

I.C.) Difficulties Encountered

Literature findings on the science of sunken oil mass transport and recovery are strictly general. Data on previous spills in which sunken oil was significant will be useful. The literature research task continues on schedule.

I.D.) Preliminary Data

Initial output from early model stages is under development.

I.E.) Discussion and Importance to Oil Spill Response / Restoration

We expect the initial model developed in this 18-month period to be useful for identifying sunken oil hotspots, addressing the need for tracking of sunken oil following a spill, to target cleanup activities and to support cleanup termination decisions. Sunken oil is difficult to “see” because sensing techniques (VSORS, ROVs) show only a small space at a point in time (Beegle-Krause et al. 2006). Moreover, the oil may re-suspend and sink with changes in salinity, sediment load, and temperature (Michel 2006), making fate and transport models difficult to deploy and calibrate when even the presence of sunken oil is difficult to assess. For these reasons, together with the expense of field data collection, there is a need for a statistical data-limited technique integrating field data collection with statistical fate and transport modeling. We expect that the beta version developed can later be enhanced to accept bathymetry and other information, to improve accuracy and resolution.

I.F.) Manuscripts, Reports, Presentations

A presentation was made to CRRC and NOAA on 29 May 2008 to present the proposed project, prior to the project start date (15 August 2008). In response to the proposal presentation, feedback was collected regarding the types and format of data that may be expected to be available as input to the model, and possible programming environments.


I.G.) Personnel

James Englehardt, Ph.D., P.E.     Principal Investigator (PI); Professor
Pedro Avellaneda, Ph.D.           Postdoctoral Research Fellow
Angelica Echavarria-Gregory       Research Assistant (RA); Ph.D. Student
Chris Barker, Ph.D.               NOAA Liaison; Oceanographer, NOAA/NOS/OR&R
Nancy Kinner, Ph.D.               Project Officer; Director, CRRC
Amy Merten, Ph.D.                 Project Officer; Director, CRRC

II. Tasks and Activities for Next Detailed Reporting Period

II.A.) Tasks for the Next Reporting Period to Meet Project Objectives

Task 1.1: Literature Review
Task 1.5: Data Acquisition from Sources
Task 2.3: Development of Simple Gaussian Model for location, test.
Task 2.4: Development of Conditional Gaussian Model and Likelihood Function for location, test.
Task 2.5: Add capability to accept concentration data, test.
Task 2.6: Development of Time-Dependent Functions and Capabilities
Task 2.7: Incorporation of graphing capabilities through interaction with Mayavi2 or VTK platforms.
Task 2.8: Submission of First Short Report
Task 2.9: Add Superposition Capability
Task 2.10: Model Verification with Data from Sources and Literature
Task 2.11: Checks on Normalization and Internal Consistency
Task 2.12: Evaluation of mapping options (interface software and usability)
Task 2.13: Programming for mapping

II.B.) Work Plan to Accomplish Tasks / Objectives

The UM team will continue to develop the software as proposed. The project RA is continuing coding of the software, including design, development, testing, and evaluation of model algorithms as specified in the consecutive tasks itemized in Section II.A above, and is preparing draft report and website material. The Postdoctoral Research Fellow will oversee and collaborate in this programming and development effort, prepare draft reports, and develop and maintain the project website. The Principal Investigator will continue to direct the work in cooperation with the Project Officers and the NOAA Liaison and submit reports in compliance with the project deliverables schedule. The complete research group of the PI, with shared background in risk analysis and environmental engineering, will continue to provide input in weekly research meetings to critique and discuss ideas within the scope of the project. In addition, monthly conference call meetings between NOAA, CRRC, and UM team members will be held to monitor progress, discuss data and model needs, and clarify direction. As appropriate, direct communication between the NOAA Liaison and the UM team will be continued to address programming issues.

II.C.) Concerns or Difficulties Anticipated

The UM team would like to use actual field data during model development, when data become available from NOAA.

III. EXPENDITURES

III.A) Percentage of Total Budget expended for the Current Reporting Period

Expenditures to date have been in the range anticipated in the project proposal for the work accomplished to date.

IV. SCHEDULE

IV.A) Original Timeline vs. Current and Projected Schedules

V. References

Aitchison, J. and I. Dunsmore (1975) Statistical Prediction Analysis, Cambridge University Press, New York, Chapter 2.

Beegle-Krause, C.J., C.H. Barker, G. Watabayashi, and W. Lehr (2006) "Long-Term Transport of Oil from T/B DBL-152: Lessons Learned for Oils Heavier than Seawater," NOAA Office of Response and Restoration, Seattle, Washington, U.S.A.

CRRC (Coastal Response Research Center) (2006) "Submerged Oil – State of the Practice and Research Needs," Durham, New Hampshire, December.

Dean, R.G. and R.A. Dalrymple (1991) Water Wave Mechanics for Engineers and Scientists, World Scientific, Singapore.

DeGroot, M.H. (1989) Probability and Statistics, 2nd ed., Addison-Wesley Publishing Co., Reading, MA, Chapter 6.

Jaynes, E. (1957) "Information Theory and Statistical Mechanics," Physical Review, 106, 620-630.

Michel, J. (2006) Assessment and Recovery of Submerged Oil: Current State Analysis, U.S. Coast Guard Research & Development Center, Groton, CT, 34 pp. + appendices.

NDBC (National Data Buoy Center), www.ndbc.noaa.gov

Protocols for NRDA Surveys (2001) Oil Behavior, Pathways and Exposure, October.