
Inverse Problems as Statistics

P.B. Stark
Department of Statistics, University of California, Berkeley, CA 94720-3860 USA
[email protected], WWW home page: http://www.stat.berkeley.edu/stark

Technical Report 552

Abstract. What mathematicians, scientists, engineers, and statisticians mean by "inverse problem" differs. For a statistician, an inverse problem is an inference or estimation problem. The data are finite in number and contain errors, as they do in classical estimation or inference problems, and the unknown typically is infinite-dimensional, as it is in nonparametric regression. The additional complication in an inverse problem is that the data are only indirectly related to the unknown. Standard statistical concepts, questions, and considerations such as bias, variance, mean-squared error, identifiability, consistency, efficiency, and various forms of optimality apply to inverse problems. This article discusses inverse problems as statistical estimation and inference problems, and points to the literature for a variety of techniques and results.

1 Introduction

This paper casts inverse problems as statistical estimation and inference problems. Along the way, it presents some statistical ideas that I find helpful in thinking about inverse problems.

Inverse problems are formulated differently and raise different questions for a statistician and an applied mathematician. For example, to a statistician, the number of data is finite, and the data contain errors that are modeled at least in part as stochastic.


To a statistician, bias, variance, identifiability, consistency, and similar notions figure prominently; the emphasis is on estimation and inference. Applied mathematicians often are more interested in existence, uniqueness, and construction, given an infinite number of noise-free data, and in stability, given data contaminated by a deterministic disturbance. These two viewpoints are related; for example, identifiability and uniqueness are similar notions. Moreover, there is a deep connection between statistical measures of the difficulty of some estimation problems (estimating a linear functional of an element of a Hilbert space from linear observations with Gaussian errors) and the theory of optimal recovery of a linear functional from linear data with deterministic errors [17]. Many of the mathematical tools employed are also the same: functional analysis, convex analysis, optimization theory, nonsmooth analysis, approximation theory, harmonic analysis, and measure theory.

Section 2 presents some probabilistic and statistical terminology in a framework sufficiently general that most inverse problems become special cases. Section 3 presents a canonical inverse problem and translates it into the language of Section 2. Section 4 introduces some ideas and notation from statistical decision theory. Section 5 applies the ideas of Section 4 to estimation, and presents some common loss functionals used to define optimal estimators. Section 6 does the same for confidence sets. Section 7 examines some estimators used in statistics and inverse problems, including the Backus-Gilbert method, Bayes estimation, maximum likelihood (including penalized likelihood [regularization] and the method of sieves), shrinkage, and strict bounds.
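Before proceeding, here is a minimal numerical sketch of the statistician's formulation just described. The toy problem is my illustration, not an example from the paper: the unknown is a discretized function, a smoothing forward operator K (an assumed Gaussian kernel) maps it to the data space, and only finitely many noisy values are observed, so the data are finite, stochastic, and only indirectly related to the unknown.

```python
import numpy as np

# Hypothetical toy inverse problem: finitely many noisy, indirect observations
# X = K(theta) + epsilon of a (discretized, effectively high-dimensional) unknown.
rng = np.random.default_rng(0)

m, n = 200, 20                   # fine grid for the unknown; number of data
t = np.linspace(0.0, 1.0, m)     # grid on which the unknown is represented
s = np.linspace(0.0, 1.0, n)     # observation points

theta = np.sin(2 * np.pi * t) + 0.5 * np.cos(6 * np.pi * t)   # "true" model

# Forward operator K: Gaussian smoothing, so each datum mixes many
# components of theta -- the data are only indirectly related to the unknown.
K = np.exp(-0.5 * ((s[:, None] - t[None, :]) / 0.05) ** 2)
K /= K.sum(axis=1, keepdims=True)

sigma = 0.05                                     # error scale
X = K @ theta + sigma * rng.standard_normal(n)   # finite, noisy data
```

With m much larger than n, many different models theta reproduce the data X equally well, which is the source of the identifiability and uniqueness questions discussed below.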

2 Preliminaries

This section introduces notation used throughout the rest of the paper. We will specialize to the case that the unknown model (parameter) is an element of a separable Banach space, n real data are observed, and the joint probability distribution of the data errors is known. These conditions can be relaxed in various ways; in particular, the assumption that the parameter is in a Banach space is often unnecessary.


In most of the development, it suffices for the data to take values in a separable Banach space, not necessarily $\mathbb{R}^n$.

The estimator $\gamma$ of $g(\theta)$ is consistent at $\theta$ if, for every $\varepsilon > 0$,

$$\lim_{n \to \infty} P_\theta\{\|g(\theta) - \gamma(X)\| < \varepsilon\} = 1. \qquad (19)$$

If a parameter is not identifiable, there is no estimator that is consistent whatever be $\theta \in \Theta$. (Obviously, the estimator $\gamma(X) = c$ is consistent when $g(\theta) = c$, but because $\#\{g(\Theta)\} \ge 2$, there is some $\theta \in \Theta$ for which $g(\theta) \ne c$.) The bias at $\theta$ of the estimator $\gamma$ is

$$\mathrm{bias}(\theta) = \mathrm{bias}(\theta; \gamma, g) \equiv E_\theta\big(\gamma(X) - g(\theta)\big). \qquad (20)$$
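As a numerical illustration of the two definitions just given, here is a minimal Monte Carlo sketch. It assumes a toy Gaussian location model (my example, not the paper's): the empirical probability in (19) climbs toward one as n grows, while the Monte Carlo estimate of the bias (20) stays near zero.

```python
import numpy as np

# Toy location model: X_1, ..., X_n i.i.d. N(theta, 1), with g(theta) = theta
# and estimator gamma(X) = sample mean (unbiased and consistent).
rng = np.random.default_rng(1)
theta, eps, reps = 2.0, 0.1, 2000

for n in (10, 100, 1000):
    X = theta + rng.standard_normal((reps, n))   # reps independent data sets
    gamma = X.mean(axis=1)                       # gamma(X) for each data set
    bias = (gamma - theta).mean()                # Monte Carlo version of (20)
    hit = (np.abs(theta - gamma) < eps).mean()   # the probability in (19)
    print(f"n={n:4d}  bias ~ {bias:+.4f}  P(|g(theta)-gamma(X)| < eps) ~ {hit:.3f}")
```

Because the sample mean has standard deviation $1/\sqrt{n}$ here, the reported probability rises from roughly 0.25 at n = 10 to nearly 1 at n = 1000, matching the limit in (19).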