Statistical inference, state distribution, and noisy data

Physica A 198 (1993) 514-537
North-Holland

SDI: 0378-4371(93)E0184-G

L. Rebollo Neira and A.G. Constantinides
Department of Electrical and Electronic Engineering, Imperial College, Exhibition Road, London SW7 2BT, UK

A. Plastino¹ and F. Zyserman²
Departamento de Física, Universidad Nacional de La Plata, C.C. 67, 1900 La Plata, Argentina

A. Alvarez¹, R. Bonetto¹ and H. Viturro²
Centro de Investigación y Desarrollo en Procesos Catalíticos, Calle 47 No. 257, C.C. 59, 1900 La Plata, Argentina

Received 13 November 1992
Revised manuscript received 16 April 1993

A method for finding the state distribution in a physical system, on the basis of a detailed analysis of the experimental response of the system to an external probe, is presented. The concomitant algorithm, based upon the maximum entropy principle, is specifically devised so as to take into account the presence of noise due to experimental errors in the response signal. The formalism is illustrated by recourse to two numerical problems and is applied to a realistic situation involving X-ray diffraction.

1. Introduction

Any preconcerted sign conveying information can be called a "signal". We wish here to address the problem of studying signals that contain information about the state distribution of physical systems, by recourse to a detailed analysis of the system's experimental response to an external probe. The basic idea here is that a well-known probe (e.g., electromagnetic radiation) impinges upon the system, interacts with it, and is afterwards analyzed via a convenient

¹ CICPBA-CONICET
² CONICET

0378-4371/93/$06.00 © 1993 - Elsevier Science Publishers B.V. All rights reserved


detection procedure. As a consequence of this interaction, the signals acquire information about the state distribution. As the study of the signals with which the experimenter probes the system necessarily consists of a finite number of noisy measurements, the maximum entropy principle (MEP) [1] will be employed in order to undertake the reconstruction of the state distribution on the basis of partial information#1.

In order to make the concomitant considerations independent of any specific detection procedure, a vectorial representation of signals is adopted, which establishes a unique correspondence between "observed" quantities and measurements performed upon the corresponding signal. This entails representing a signal f as a ket |f⟩ and a measurement as a mapping that assigns to it a real number. Considerations will be restricted to measurements that can be represented by linear functionals.

Some preliminary advances in this direction were reported in [3]. They were marred, however, by a serious limitation: a reduced subset of the measurements considered there was assumed to be noiseless, in the sense that the resultant state distribution was forced to account for it exactly. The present effort is, we believe, of a much more general character and encompasses a wider range of possible applications.

The paper is organized as follows: the problem to be solved is introduced in section 2, together with an algorithm that, based upon the MEP, allows one to determine state distributions from a noisy numerical representation of signals that account for the system's response to an external probe. Some idealized numerical tests are performed in section 3 and a realistic problem is tackled in section 4. Finally, some conclusions are drawn in section 5.
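As a minimal illustration of the MEP idea invoked above (a standard textbook instance, Jaynes' loaded-die example, not the algorithm developed in this paper), one maximizes the Shannon entropy of a discrete distribution subject to a linear mean-value constraint; the maximizer is exponential in the constrained quantity, with a Lagrange multiplier fixed by the data. The following Python sketch (function names and the bisection bracket are our own choices) finds that multiplier by one-dimensional bisection:

```python
import numpy as np

def maxent_distribution(values, mean_target, lo=-50.0, hi=50.0):
    """Maximum-entropy distribution over `values` with a prescribed mean:
    p_n ∝ exp(-lam * values[n]), the multiplier lam found by bisection."""
    values = np.asarray(values, dtype=float)

    def mean_for(lam):
        w = np.exp(-lam * (values - values.mean()))  # shift for numerical stability
        p = w / w.sum()
        return float(p @ values)

    # mean_for decreases monotonically in lam, so bisection converges
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_for(mid) > mean_target:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    w = np.exp(-lam * (values - values.mean()))
    return w / w.sum()

# Jaynes' loaded-die example: faces 1..6 with observed mean 4.5
p = maxent_distribution(np.arange(1, 7), mean_target=4.5)
```

With a mean above the uniform value 3.5 the multiplier comes out negative, so the resulting probabilities increase monotonically with the face value.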

2. Formalism

2.1. Statistical considerations

For the sake of definiteness we assume that the system S we are interested in consists of a (finite) number M of subsystems S_n (n = 1, ..., M) of known properties. Our purpose is that of finding out the relative populations in S (we denote the one corresponding to S_n by C_n > 0). Other instances can easily be accommodated so as to be well represented by the situation described above, without loss of generality. Their common factor is our ignorance, here described by the set of non-negative numbers C_n.

We take the view that in order to study S one interacts with it by means of

#1 The importance of accounting for noisy data within the MEP formalism has been beautifully illustrated by Jaynes with reference to problems in spectral analysis and image reconstruction [2].


an input signal (probe) |I⟩ whose properties are assumed to be known. The interaction between the signal |I⟩ and S results in a response signal |f⟩ which, conveniently analyzed, provides information concerning the state distribution; this makes it appropriate to call |f⟩ a "statistical signal". The corresponding process is represented according to

\hat{O}\,|I\rangle = |f\rangle \,,   (2.1)

where the linear operator \hat{O} (associated with S) portrays the effect that the system produces upon the input signal so as to originate the statistical response |f⟩. As S is a composition of the M subsystems S_n, it makes sense to decompose \hat{O} in the following fashion:

\hat{O} = \sum_{n=1}^{M} C_n \hat{O}_n \,,   (2.2)

where \hat{O}_n accounts for the action of S_n upon the probe |I⟩, i.e.,

\hat{O}_n |I\rangle = |n\rangle \,, \qquad n = 1, \dots, M \,,   (2.3)

|n⟩ being the (known!) response signal evoked by S_n when impinged upon by the input signal |I⟩. We shall work under the hypothesis that the set of signals |n⟩ gives rise to a linear space U_M of dimension M. We can thus write

|f\rangle = \sum_{n=1}^{M} C_n |n\rangle \,,   (2.4)

so it becomes quite clear that the response |f⟩ is contained within U_M and carries information concerning the numbers C_n we are trying to find out. In order to accomplish such a goal one needs to perform observations upon |f⟩ (a necessarily finite number N of them). The N figures {f_1, ..., f_N}, which arise as the result of having performed appropriate measurements (upon the output signal |f⟩), are to be thought of as being produced by a mapping from U_M upon the set of real numbers (see next subsection).

2.2. Treatment of a noisy numerical representation

We assume that the numerical representation of |f⟩ is obtained in such a way that measurements are performed as a function of a parameter x which adopts the values x_i (i = 1, ..., N), so that there is a one-to-one correspondence between the f_i and the x_i. If the measurements are performed independently, nothing prevents us from thinking of the x_i as defining an (orthogonal) set of vectors |x_i⟩ that span an N-dimensional linear space E. We now introduce the idea that the experimental figures f_i can be represented as bilinear forms [4] that constitute a connection between the spaces E and U_M. These forms can be regarded as a mapping F of E × U_M upon the field of the real numbers that, out of the ordered pair [x_i, f], produces the number f_i. We thus define the linear form {x_i|f} as

f_i = \{x_i|f\} \,, \qquad i = 1, \dots, N \,.
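In a numerical setting these linear forms are simply samples of the signal at the points x_i. The following Python sketch (the Gaussian response profiles, grid, weights and noise level are illustrative assumptions of ours, not the signals used later in the paper) builds M known responses |n⟩ sampled at N points, forms the response |f⟩ = Σ_n C_n |n⟩ of eq. (2.4), and produces its noisy experimental representation:

```python
import numpy as np

rng = np.random.default_rng(0)

M, N = 3, 100                      # number of subsystems and of measurements
x = np.linspace(0.0, 10.0, N)      # parameter values x_i

# Known responses |n> sampled on the x_i grid (illustrative Gaussian profiles)
centers = (2.5, 5.0, 7.5)
basis = np.stack([np.exp(-0.5 * ((x - c) / 0.8) ** 2) for c in centers])

C_true = np.array([0.5, 0.3, 0.2])   # unknown relative populations C_n
f_clean = C_true @ basis             # f_i = {x_i|f}, eq. (2.4) sampled at x_i
f_noisy = f_clean + 0.01 * rng.standard_normal(N)   # noisy figures f_i^0
```

Each entry of `f_noisy` plays the role of one measured figure affected by noise, i.e. the experimental representation of |f⟩ in E discussed next.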

The numerical representation of |f⟩ is then given by the set of the {x_i|f}, and it becomes tempting to write down the definitions

{}_p\langle f| \equiv \sum_{i=1}^{N} \{x_i|f\}\, \langle x_i|   (2.5)

and

|f\rangle_p \equiv \sum_{i=1}^{N} \{x_i|f\}\, |x_i\rangle \,.   (2.6)

Here |f⟩_p is the vector that an appropriate experimental procedure allows one to build up as a representation of |f⟩ in "the experimental space" E (or in its dual space [4]).

Notice that we have so far neglected to consider experimental errors. Our {x_i|f}-set constitutes an idealization: given |f⟩ and a value x_i for x, we assume that a unique number f_i exists that is represented by {x_i|f}. The experimental set-up, however, gives rise to errors ("noise"), and what we really have are numbers f_i⁰ affected by uncertainties Δf_i⁰. Thus, instead of (2.6) we have, for the representation of |f⟩ in E, vectors of the form

|f^{0}\rangle_p = \sum_{i=1}^{N} f_i^{0}\, |x_i\rangle \,, \qquad f_i^{0} = f_i + \Delta f_i^{0} \,.   (2.7)

The existence of noise implies that the experimental set-up does not yield a unique numerical representation of |f⟩ but rather a whole family of vectors "close" to it, out of which the selected figures f_i⁰ correspond to a certain ket |f⁰⟩_p. In other words, while it is true that to each vector in U_M we can univocally associate a vector in E, the converse statement does not hold. Due to the finite representation and experimental errors, each vector in E is mapped, not into a single vector, but into a whole "region" Ω in U_M. Here lies the crux of the problem: our experimental information refers to E (not to U_M), but, in order to make some progress, we need to convert this information into one referred to U_M. To this end, within Ω, we aim to select a special vector |f*⟩ on the basis of some appropriate criterion. The problem we thus face is that of building up a vector |f*⟩ ∈ U_M,

|f^{*}\rangle = \sum_{n=1}^{M} C_n^{*} |n\rangle \,,   (2.8)


out of the {x_i|f⁰}-set, such that the C_n* constitute a good approximation to the unknown ("true") C_n. One might in this connection introduce the following criterion: in order to fix |f*⟩, look within Ω ⊂ U_M for that vector such that the "distance" in E between |f*⟩_p and |f⁰⟩_p is minimized. For this purpose we start by writing down the representatives in E of |n⟩ and |f*⟩,

|n\rangle_p \equiv \sum_{i=1}^{N} \{x_i|n\}\, |x_i\rangle   (2.9)

and

|f^{*}\rangle_p \equiv \sum_{i=1}^{N} \{x_i|f^{*}\}\, |x_i\rangle = \sum_{n=1}^{M} C_n^{*} |n\rangle_p \,.   (2.10)

The distance between vectors of E is defined, as usual, with reference to the norm given by the scalar product,

d(f^{*}_p, f^{0}_p) = {}_p\langle (f^{*} - f^{0})|(f^{*} - f^{0})\rangle_p^{1/2} = {}_p\langle \Delta f^{*}|\Delta f^{*}\rangle_p^{1/2} \,,   (2.11)

where |Δf*⟩_p is the ordinary (vector) difference between |f*⟩_p and |f⁰⟩_p. We make a rather "natural" assumption: in order that |f*⟩_p and |f⁰⟩_p be as close as possible, we require that the vector difference be of such a nature that |f* − f⁰⟩_p does not belong to the projection of U_M onto E. In other words, we demand that |f* − f⁰⟩_p be orthogonal to the space of signals U_M. This leads to the set of conditions

{}_p\langle n|(f^{*} - f^{0})\rangle_p = 0 \,, \qquad n = 1, \dots, M \,,   (2.12a)

or, equivalently,

{}_p\langle n|f^{*}\rangle_p = {}_p\langle n|f^{0}\rangle_p \,, \qquad n = 1, \dots, M \,.   (2.12b)
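Numerically, conditions (2.12b) amount to the normal equations of a least-squares fit: inserting (2.10) into the left-hand side gives Σ_m ⟨n|m⟩_p C_m* = ⟨n|f⁰⟩_p, a linear system for the C_n*. A Python sketch (array names and the test signals are our own; this reproduces only this unconstrained least-squares step, not the MEP algorithm developed by the authors):

```python
import numpy as np

def ls_populations(basis, f_noisy):
    """Solve eqs. (2.12b): G C* = b, with Gram matrix G[n, m] = <n|m>_p
    and b[n] = <n|f0>_p, i.e. the least-squares normal equations."""
    G = basis @ basis.T          # G[n, m] = sum_i {x_i|n} {x_i|m}
    b = basis @ f_noisy          # b[n]    = sum_i {x_i|n} f_i^0
    return np.linalg.solve(G, b)

# Illustrative example: three Gaussian responses, noisy mixture
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 200)
basis = np.stack([np.exp(-0.5 * ((x - c) / 0.8) ** 2) for c in (2.5, 5.0, 7.5)])
C_true = np.array([0.5, 0.3, 0.2])
f_noisy = C_true @ basis + 0.01 * rng.standard_normal(x.size)

C_star = ls_populations(basis, f_noisy)   # close to C_true here, but in
                                          # general not guaranteed non-negative
```

As the text stresses, nothing in (2.12) forces C_n* ≥ 0, and for nearly collinear responses |n⟩ the Gram system becomes ill-conditioned; the MEP machinery is introduced precisely to cope with this.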

It is shown in appendix A that the set of conditions (2.12) allows one to select a vector |f*⟩_p with the property that its distance to |f⁰⟩_p is an absolute minimum. Thus we see that, as tackling the system (2.12) is equivalent to minimizing the distance (2.11), the solution is the one that would be obtained by following the well-known least-squares (LS) approach [5,6]. Unfortunately, the set (2.12) does not restrict the pertinent C_n* to the subset of positive real numbers. Moreover, eqs. (2.12) saddle us with an ill-posed problem. In other words, although there exists just one set {C_n*, n = 1, ..., M} that minimizes


the distance (2.11), there exist other sets {C_n, n = 1, ..., M} that, if we define

d_c(C^{*}, C) = \Bigl( \sum_{n=1}^{M} |C_n^{*} - C_n|^{2} \Bigr)^{1/2} \,,   (2.13)

have the following annoying property: even if d_c(C*, C) → 0 implies d(f*_p, f⁰_p) → 0, the converse does not hold. Let us denote by d* the distance between |f*⟩_p and |f⁰⟩_p, and by d the distance between, say, |f⟩_p and |f⁰⟩_p, where |f⟩ is constructed with the C_n. The ill-posed nature of our problem is reflected in the fact that, for arbitrary ε > 0, even if d* − ε ≤ d
