Time-domain semi-parametric estimation based on a ... - CiteSeerX

NMR IN BIOMEDICINE NMR Biomed. 2005;18:1–13 Published online 19 January 2005 in Wiley InterScience (www.interscience.wiley.com). DOI:10.1002/nbm.895

Time-domain semi-parametric estimation based on a metabolite basis set

H. Ratiney,1 M. Sdika,1 Y. Coenradie,1,2 S. Cavassila,1 D. van Ormondt2 and D. Graveron-Demilly1*

1 Laboratoire de RMN, CNRS UMR 5012, Université Claude Bernard Lyon I-CPE, France
2 Applied Physics, Delft University of Technology, PO Box 5046, 2600 GA Delft, The Netherlands

Received 26 January 2004; Revised 2 June 2004; Accepted 8 June 2004

ABSTRACT: A novel and fast time-domain quantitation algorithm, quantitation based on semi-parametric quantum estimation (QUEST), invoking optimal prior knowledge is proposed and tested. This nonlinear least-squares algorithm fits a time-domain model function, made up from a basis set of quantum-mechanically simulated whole-metabolite signals, to low-SNR in vivo data. A basis set of in vitro measured signals can be used too. The simulated basis set was created with the software package NMR-SCOPE, which can invoke various experimental protocols. Quantitation of 1H short echo-time signals is often hampered by a background signal originating mainly from macromolecules and lipids. Here, we propose and compare three novel semi-parametric approaches to handle such signals in terms of bias-variance trade-off. The performances of our methods are evaluated through extensive Monte-Carlo studies. Uncertainty caused by the background is accounted for in the Cramér–Rao lower bounds calculation. Valuable insight about quantitation precision is obtained from the correlation matrices. Quantitation with QUEST of 1H in vitro data, 1H in vivo short echo-time and 31P human brain signals at 1.5 T, as well as 1H spectroscopic imaging data of human brain at 1.5 T, is demonstrated. Copyright © 2005 John Wiley & Sons, Ltd.

KEYWORDS: magnetic resonance spectroscopy (MRS); time-domain quantitation; metabolite basis set; quantum mechanics; Cramér–Rao lower bounds

INTRODUCTION

Magnetic resonance spectroscopy (MRS) plays an important role in diagnosing major diseases, such as cancer, Alzheimer’s disease and multiple sclerosis, and in monitoring the effect of therapies.1–6 The present challenge is to quantify proton short echo-time spectra, which exhibit many metabolites, and to estimate the metabolite concentrations. Quantitation of such spectra is difficult. Three major problems are encountered: (1) strongly overlapping metabolite peaks (many hundreds); (2) low signal-to-noise ratio (SNR); (3) a broad, partially known background originating mainly from macromolecules and lipids that overlaps the metabolite peaks and hampers the quantitation. Moreover, the (residual) water peak needs to be removed.

*Correspondence to: D. Graveron-Demilly, Laboratoire de RMN, CNRS UMR 5012, Université Claude Bernard Lyon I-CPE, 3 Rue Victor Grignard, 69616 Villeurbanne, France.
Contract/grant sponsor: Philips Medical Systems.
Abbreviations used: AMARES, advanced method for accurate, robust and efficient spectral fitting; CRB, Cramér–Rao bounds; NMR-SCOPE, NMR spectra calculation using operators; QUEST, quantitation based on quantum estimation; SVD, singular value decomposition; VARPRO, variable projection.

Fitting of model functions to low-SNR in vivo data with strongly overlapping peaks requires invocation of ever more prior knowledge about the model parameters. Note that a (correct) model function itself already constitutes powerful prior knowledge. The challenge is how to obtain that prior knowledge and eventually the metabolite concentrations. An example of invoking extensive prior knowledge is to use measured spectra of selected metabolite solutions as numerical model functions in the frequency domain,7–9 as in LCModel, or in the time domain.10–12 Alternatively, one can analyze the model-solution data first in terms of spin Hamiltonian parameters.13–15 Given these spin Hamiltonian parameters, one can then compute theoretical metabolite signals/spectra quantum-mechanically for the measurement protocol used by a scanner. Using a ‘whole-metabolite’ model function rather than one based on individual peaks, as is done in VARPRO16 or AMARES,17,18 presents some advantages. In particular, it allows easier handling of prior knowledge.

Quantitation in the time-domain offers useful features. First, in MR, the time-domain is the measurement-domain. As a consequence, missing initial and/or final data points do not really hamper the quantitation: one can omit such points from the fit.19 In the frequency- or transform-domain, the effect of missing data is spread over the entire spectrum and hence is more difficult to handle. A second useful feature of the time-domain is that it enables one to automatically process water20 and background signals21,22 with SVD-based methods.

In this paper, we present a novel and fast time-domain quantitation algorithm, QUEST (quantitation based on quantum estimation),23–26 based on extensive prior knowledge obtained by quantum mechanics or from measured in vitro metabolite solutions. This nonlinear least-squares algorithm fits a time-domain model function, a combination of (quantum-mechanically simulated or in vitro measured) metabolite signals, to low-SNR in vivo data. The metabolite basis set was created with NMR-SCOPE,27 which can handle various experimental protocols at arbitrary magnetic field strength.

Quantitation of 1H short echo-time signals is often hampered by a background signal originating mainly from macromolecules and lipids. We use a semi-parametric approach. Three novel methods to handle such signals, each combined with the QUEST quantitation algorithm, are proposed. They apply to the case where no separately measured background signal is available. In this study, their performances in terms of bias-variance trade-off are evaluated through extensive Monte-Carlo studies and compared with the Cramér–Rao lower bounds (CRBs). Moreover, uncertainty caused by the background is accounted for by adding a correction term in the CRB calculation. Valuable insight about quantitation precision is derived from the correlation matrices.

Finally, we apply QUEST successfully to data at 1.5 T from 1H in vitro NMR, short echo-time 1H human brain MRS, 31P human brain MRS and long echo-time 1H MRSI. For the in vitro data, we compare the result with that obtained from LCModel. In the following, the term ‘background signal’ or ‘background’ pertains to non-metabolite features. This applies to both domains.

METHOD

An in vivo MRS signal, contaminated by a nondescript background signal, can be modelled by

$$x = \hat{x}_{\mathrm{Met}}(p) + b(\theta) + e \qquad (1)$$

where $\hat{x}_{\mathrm{Met}}$ is the metabolite part whose model function is known, $b$ the background signal whose model function is often only partially known, and $e$ Gaussian-distributed noise. The challenge is to obtain reliable estimates of the metabolite parameters in the presence of the ‘nuisance’ signal $b$ and the noise $e$. In a strictly semi-parametric approach, one considers a finite number of metabolite parameters and a nuisance function in an infinite-dimensional space. In practice, one often approximates (models) the nuisance function so as to render the nuisance parameter set finite. The wanted parameters of the metabolites and the nuisance parameters of the background will be denoted by the vectors $p$ and $\theta$, respectively. An indispensable further aspect is estimation of an error bar for each metabolite parameter that accounts for the detrimental presence of a background.28,29

The algorithm QUEST,23–26 for optimal fitting of equation (1) to (contaminated) in vivo MRS data, is made up of: (a) quantum-mechanical simulation of a time-domain basis set of ‘whole-metabolite’ signals; (b) time-domain semi-parametric estimation of metabolite and macromolecule parameters, $p$ and $\theta$; and (c) extension of the Cramér–Rao lower bounds (CRBs) to semi-parametric estimation. Details are given in the next three subsections.
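To make the three terms of equation (1) concrete, the following minimal sketch builds a synthetic signal from a slowly decaying metabolite part, a rapidly decaying background and complex white Gaussian noise. The function names, line positions, dampings and amplitudes are illustrative assumptions only; they are not taken from the paper, and simple damped sinusoids stand in for real metabolite and macromolecule signals.

```python
import numpy as np

def damped_sinusoid(t, amp, damping, freq_hz, phase=0.0):
    """Complex exponentially damped sinusoid (a Lorentzian line after FFT)."""
    return amp * np.exp((-damping + 2j * np.pi * freq_hz) * t + 1j * phase)

def simulate_signal(n_points=512, dt=1e-3, noise_sd=0.05, seed=0):
    """Return (t, x, x_met, b, e) with x = x_met + b + e, as in equation (1).

    All numerical values below are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    t = dt * np.arange(n_points)
    # Metabolite part: two narrow lines, i.e. slowly decaying signals.
    x_met = (damped_sinusoid(t, 1.0, 10.0, 50.0)
             + damped_sinusoid(t, 0.6, 12.0, -80.0))
    # Background: one broad line, i.e. a rapidly decaying signal standing in
    # for macromolecules and lipids.
    b = damped_sinusoid(t, 2.0, 150.0, -20.0)
    # Complex white Gaussian noise with standard deviation noise_sd per channel.
    e = noise_sd * (rng.standard_normal(n_points)
                    + 1j * rng.standard_normal(n_points))
    return t, x_met + b + e, x_met, b, e
```

In this toy signal the background has dropped to well under 1% of its initial magnitude after the first 50 samples, which is the property that the truncation-based methods described later rely on.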

(a) Metabolite basis set

A metabolite basis set comprising whole-metabolite signals was simulated by quantum mechanics with NMR-SCOPE27 using the spin Hamiltonian parameters (number of spins, chemical shifts, J-couplings) given in Govindaraju et al.15 NMR-SCOPE, based on the product-operator formalism, can handle various NMR pulse sequences such as STEAM and PRESS. It directly provides the corresponding time-domain signals. Signals of the main MR-observable 1H metabolites in the human brain were simulated: aspartate (Asp), choline (Cho), creatine (Cr), γ-aminobutyric acid (GABA), glucose (Glc), glutamate (Glu), glutamine (Gln), lactate (Lac), myo-inositol (mI), N-acetylaspartate (NAA), phosphocreatine (PCr) and taurine (Tau). Signals modelling the lipids (Lip) at 0.9 and 1.3 ppm were not included in the basis set, because their model function is insufficiently known. An example of a metabolite basis set computed with NMR-SCOPE for a PRESS sequence with an echo-time of 20 ms and used in QUEST is shown in Fig. 1. The main advantage of this approach is that it obviates repeating the tedious experimental work needed for acquiring signals of in vitro metabolite solutions for each new experimental protocol. Of course, a new metabolite basis set must then be computed, but note that QUEST can use an in vitro metabolite basis set as well.

(b) Semi-parametric estimation of p and θ

Often, the model function of an in vivo MRS signal consists of a parametric part (the metabolites) and a non-parametric part (the background). The handling of both parts leads to the notion of semi-parametric estimation. We devised the approach described below. First, we treat pure parametric estimation using a simulated or measured time-domain metabolite basis set. Subsequently, we treat semi-parametric estimation, taking the contaminating background into account too.

Figure 1. Fourier transform of a metabolite basis set at 1.5 Tesla, simulated by quantum mechanics with NMR-SCOPE for a PRESS sequence with an echo-time of 20 ms, that can be used in QUEST. Lorentzian lineshapes are used.

Parametric nonlinear least-squares fitting. In the absence of, or after removal of, the non-parametric part, a Levenberg–Marquardt algorithm is used to minimize the distance between the raw signal $x$ and the model function $\hat{x}$:

$$\lVert x - \hat{x} \rVert^{2} \rightarrow \min \qquad (2)$$

In fact, QUEST is based on the core of the VARPRO/AMARES algorithms.16,17 The complex-valued time-domain model samples $\hat{x}_n$, $n = 1, 2, \ldots, N$, where $N$ is the number of data points, are written as a linear combination of the $M$ weighted metabolite model samples $\hat{x}^{m}_{n}$, $m = 1, 2, \ldots, M$, $n = 1, 2, \ldots, N$, of the basis set (either quantum-mechanically simulated or measured in vitro):

$$\hat{x}_n = \exp(i\phi_0) \sum_{m=1}^{M} a_m \hat{x}^{m}_{n} \exp\left[(\Delta\alpha_m + i\Delta\omega_m)\,t_n + i\Delta\phi_m\right] \qquad (3)$$

where $\hat{x}^{m}$, $m$ being a superscript, can be modelled first as a sum of exponentially damped sinusoids or used as a whole. Modelling of $\hat{x}^{m}$ enables the use of basis metabolite signals sampled differently from $x$. The damping factors of $\hat{x}^{m}$ should be set close to, and preferably not greater than, the in vivo values. The $a_m$ are $M$ amplitudes to be estimated. Note that these amplitudes represent the relative proportions of the $M$ metabolite signals $\hat{x}^{m}$ in the signal $x$ rather than the amplitudes of individual spectral components. $\Delta\alpha_m$, $\Delta\omega_m$ and $\Delta\phi_m$ represent small changes of the damping factors, angular frequencies and phase shifts, respectively. These changes, relative to the initial values in the metabolite basis set, are included in the estimation procedure to automatically compensate for the effect of magnetic field inhomogeneities. In most cases $\Delta\phi_m = 0$. Soft constraints on $\Delta\alpha_m$ and $\Delta\omega_m$ have been used in the minimization procedure. $t_n = n t_s + t_0$, $n = 1, 2, \ldots, N$, are the sampling times, in which $t_0$ is the dead-time of the receiver, included in the estimation, and $t_s$ the sampling interval; $\phi_0$ is an overall phase, included in the estimation; and $i^2 = -1$. The vector $p$ of the metabolite parameters to be estimated is then

$$p = \left[(a_m, \Delta\alpha_m, \Delta\omega_m, \Delta\phi_m),\; m = 1, 2, \ldots, M;\; \phi_0,\; t_0\right]^{T}$$

where the superscript $T$ denotes transposition. Note that the number of parameters is only $4M + 2$.

Semi-parametric approaches. In this section, we present various currently used methods as well as three novel semi-parametric approaches to handle the background.
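The parametric and semi-parametric fits in this section all rely on evaluating the model function of equation (3) and the cost of equation (2). A minimal sketch of both is given below, assuming that the basis signals are stored as the rows of a complex array sampled on the same grid as the data. The function names `quest_model` and `cost` are hypothetical and do not refer to the QUEST or jMRUI implementation; the minimization itself (Levenberg–Marquardt with soft constraints) is not shown.

```python
import numpy as np

def _col(v):
    """Turn a length-M sequence into an (M, 1) float column for broadcasting."""
    return np.asarray(v, dtype=float)[:, None]

def quest_model(basis, a, d_alpha, d_omega, d_phi, phi0, t0, ts):
    """Evaluate equation (3) for all n = 1..N.

    basis   : complex array (M, N), row m holds the basis samples x^m_n
    a       : M amplitudes a_m
    d_alpha : M damping-factor changes (Delta alpha_m)
    d_omega : M angular-frequency changes (Delta omega_m, rad/s)
    d_phi   : M phase changes (Delta phi_m, rad)
    phi0    : overall phase (rad); t0: dead-time; ts: sampling interval
    """
    basis = np.asarray(basis, dtype=complex)
    n_points = basis.shape[1]
    t = ts * np.arange(1, n_points + 1) + t0          # t_n = n*ts + t0
    correction = np.exp((_col(d_alpha) + 1j * _col(d_omega)) * t[None, :]
                        + 1j * _col(d_phi))
    return np.exp(1j * phi0) * np.sum(_col(a) * basis * correction, axis=0)

def cost(x, x_hat):
    """Squared distance of equation (2) between data and model samples."""
    return float(np.sum(np.abs(np.asarray(x) - np.asarray(x_hat)) ** 2))
```

With all corrections set to zero the model reduces to an amplitude-weighted sum of the basis signals, so only the M amplitudes enter linearly; the remaining 3M + 2 parameters enter nonlinearly, which is why an iterative minimizer is needed.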

Currently used approaches.

Background in the basis set. If a physical background model function is available, the case is rendered parametric, as above. The background signal can then be added to the basis set.30–33 This approach is hereafter defined as the ‘InBase’ approach. If a physical background model function is not a priori available, i.e. not separately measured, two kinds of approaches can be distinguished.

Background part of an overall model function. The background signal is approximated by some mathematical function, e.g. a sum of splines, wavelets or polynomials with adjustable parameters, and then included in the parametric nonlinear least-squares fit above. In that case, the model function is

$$\hat{x}(p, \theta) = \hat{x}_{\mathrm{Met}}(p) + \hat{b}(\theta) \qquad (4)$$

where $\hat{b}(\theta)$ is the mathematical model function of the background and $\theta$ is the nuisance parameter vector mentioned above. This was done in the frequency-domain7,9,34,35 and in the time-domain.36,37 Often, a penalization term is added in equation (4).

Background separately handled. The background signal is handled in a pre-processing step. This can of course be done in either domain. In the frequency-domain, one can invoke the smoothness and broadness of the background. In the time-domain, one can exploit the fact that the background signal decays quickly. Among these methods, we cite: modelling of the background in the frequency-domain with wavelets and subsequent subtraction;34,38 measurement of the background signal and subtraction in the frequency-domain39 or in the time-domain;12 truncation of initial data points.40,42 This approach is hereafter defined as the ‘Trunc’ approach. Because macromolecule and lipid signals decay rapidly in the time-domain, the major part of the background signal
is in the very first points of the signal. Strategic truncation of these initial samples can separate the metabolite part from fast-decaying parts and thus decorrelate metabolite and nuisance parameters.43 In this process, part of the information about the metabolites is lost, as can easily be established by calculating the corresponding CRBs;41 SVD-based modelling and removal.21,22 SVD-based techniques enable automatic modelling of signals of unknown shape such as those of water and macromolecules. Moreover, by applying general prior knowledge about damping factors (line-widths), spectral components belonging to the background can easily be distinguished and then subtracted. Of course, the effect of overlap with wanted metabolite features cannot be completely eliminated, as with other semi-parametric techniques.

Proposed semi-parametric approach ‘Subtract’. A common problem with most of the mentioned methods is the strong correlation between background and metabolite parameters due to spectral overlap. To reduce this correlation, we devised a three-step time-domain pre-processing procedure, Subtract, that uses truncation but retrieves part of the concomitantly lost information.44,45 The steps of Subtract are illustrated in Fig. 2. Our procedure is as follows.

(1) Truncate (omit) an appropriate number of initial data points. The resulting signal $x_{\mathrm{trunc}}$ can be modelled by

$$x_{\mathrm{trunc}} \approx \hat{x}_{\mathrm{Met,trunc}} + e_{\mathrm{trunc}} \qquad (5)$$

After quantitating with QUEST, the residue approximates the noise because the background resides mainly in the omitted initial data points. Back-extrapolating $\hat{x}_{\mathrm{Met,trunc}}$ to $t = 0$ then yields an approximation of $\hat{x}_{\mathrm{Met}}$. Subtraction of the latter from the raw data in turn yields a good approximation $\tilde{b}$ of the background signal plus noise, i.e.

$$\tilde{b} \approx x - \hat{x}_{\mathrm{Met}} = b + e \qquad (6)$$

Back-extrapolation assumes that the metabolite parameters have been correctly estimated and that the major part of the metabolite signal near $t = 0$ will be recovered. This in turn leads to improved separation of metabolite and background signals when compared with SVD-based handling of the complete signal mentioned above.

(2) Model the background signal $\tilde{b}$ by Lorentzian or Gaussian components using either an SVD-based method with selection of the spectral components according to damping factor criteria, or AMARES, leading to

$$\tilde{b} \approx \hat{b}(\theta) + e \qquad (7)$$

This background quantitation yields the nuisance parameter vector $\theta = (\theta_1, \theta_2, \ldots, \theta_{N_\theta})$, where $N_\theta$ is finite, as mentioned earlier. The size of $N_\theta$ plays a role in the bias-variance trade-off. Parameterization of the background is also necessary to compute the extended CRBs on metabolites, taking into account the uncertainty caused by the background.

(3) Subtract the parameterized background $\hat{b}(\theta)$ from the raw signal. The resulting signal $x_{\mathrm{subt}}$, in which
the subscript stands for ‘subtraction of the background’, is then ‘background-free’ and can be modelled by

$$x_{\mathrm{subt}} \approx \hat{x}_{\mathrm{Met}} + e \qquad (8)$$

Note that $x_{\mathrm{subt}}$ has the original number of $N$ data points, contrary to $x_{\mathrm{trunc}}$, and therefore contains more metabolite information. After the three pre-processing steps, the resulting signal $x_{\mathrm{subt}}$ is ready for final metabolite quantitation by QUEST. The steps of the QUEST algorithm combined with Subtract as well as the corresponding fit results are illustrated in Fig. 2, using simulated brain metabolite signals mimicking in vivo signals observed at 1.5 T. The fits obtained with Trunc and Subtract do not look very different. However, after the untangling procedure, the background-related biases on parameters are reduced but their standard deviations increase. The goal of back-extrapolation and modelling is to retrieve lower standard deviations. In the following, Subtract-Ext corresponds to Subtract with extended CRBs, computed with equation (13).

Figure 2. The steps of the QUEST algorithm are illustrated using simulated brain metabolite signals mimicking observed in vivo signals at 1.5 T. Upper row, from left to right, time-domain signals: raw signal $x$ and truncated signal $x_{\mathrm{trunc}}$; estimated signals $\hat{x}_{\mathrm{Met}}$ and $\hat{x}_{\mathrm{Met,trunc}}$; background signals $\tilde{b}$ (grey) and $\hat{b}$; signal $x_{\mathrm{subt}}$. Lower row, from left to right: FFT of $x$; FFT of $\hat{x}_{\mathrm{Met}}$; FFT of $\tilde{b}$ (grey) and $\hat{b}$; FFT of $x_{\mathrm{subt}}$.

Proposed semi-parametric approaches ‘InBase-Single’ and ‘InBase-Multiple’. At the second step of the proposed pre-processing method, one gets an estimate of the background signal $\hat{b}$. The latter can then be included in the metabolite basis set in the same vein as with the mentioned InBase methods. The main difference between these alternative semi-parametric approaches is that $\hat{b}$ is estimated rather than measured. We distinguish two variations. In InBase-Multiple, the parameters of the individual components modelling $\hat{b}$ and resulting from the SVD decomposition are free in the following overall quantitation procedure. This has some similarities to the frequency-domain work of Seeger et al.,33 where the macromolecule spectra were modelled by several entities. In InBase-Single, $\hat{b}$ is considered as one entity in the same way as the metabolites, corresponding to the sum of the signals modelling $\hat{b}$, whose parameters ($a$, $\Delta\alpha$, $\Delta\omega$ and $\Delta\phi$) are free in the following overall quantitation procedure.
In the following, Subtract is compared in terms of bias-variance trade-off with Trunc, in which the initial data points are simply truncated (omitted), and with InBase-Single and InBase-Multiple. More details about the optimum number of truncated data points and the performances of these methods for background accommodation are given in Ratiney et al.43 Finally, we mention that the methods Trunc, Subtract and InBase-Single combined with QUEST have recently been implemented in the jMRUI software package.42,46
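The data flow of Trunc and of the three Subtract steps can be sketched as follows. This is a simplified illustration under strong assumptions, not the QUEST implementation: the nonlinear fit is replaced by a linear least-squares fit of the amplitudes only (frequencies, dampings and phases are taken as known), and the SVD-based or AMARES background modelling of step (2) is replaced by a linear fit onto a small, hypothetical dictionary of broad damped sinusoids `bg_dictionary`. All function names are illustrative.

```python
import numpy as np

def fit_amplitudes(basis, x, start=0):
    """Least-squares amplitudes using only samples start..N-1.

    basis : complex array (M, N) of basis signals; x : complex array (N,).
    Stand-in for the nonlinear QUEST fit (amplitudes only).
    """
    design = basis[:, start:].T                      # (N - start, M)
    amplitudes, *_ = np.linalg.lstsq(design, x[start:], rcond=None)
    return amplitudes

def trunc(basis, x, n_skip):
    """'Trunc': omit the first n_skip samples and fit the remainder."""
    return fit_amplitudes(basis, x, start=n_skip)

def subtract(basis, x, n_skip, bg_dictionary):
    """'Subtract' sketch; returns (final amplitudes, modelled background)."""
    # Step 1: fit the truncated signal, back-extrapolate to all N samples
    # and subtract, giving b_tilde ~ background + noise (equation (6)).
    a_trunc = fit_amplitudes(basis, x, start=n_skip)
    b_tilde = x - a_trunc @ basis
    # Step 2: parameterize the background (assumed dictionary fit here).
    theta = fit_amplitudes(bg_dictionary, b_tilde)
    b_hat = theta @ bg_dictionary
    # Step 3: subtract the modelled background and refit on all N samples.
    x_subt = x - b_hat
    return fit_amplitudes(basis, x_subt), b_hat
```

On a noise-free toy signal with a fast-decaying background, a fit of the raw signal without any background handling gives visibly biased amplitudes, whereas both sketches above recover them closely. The bias-variance behaviour discussed in the text only shows up once noise is added and the fits are repeated over many noise realizations.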


(c) Error estimation

Quantitation errors of metabolites are caused by measurement noise and inadequate modelling of the overlapping background signal. The errors are calculated by estimating the CRBs.41,47,48 The latter method is valid only if the model function is known. If part of the model function is not available, the CRB calculation is incomplete, and the resulting bounds are too low. We extended the traditional CRB calculation accordingly, following the treatment in Spall and Garner.28 For the metabolites (model functions assumed known), the CRBs are usually computed from the Fisher information matrix $F_p$,47 in which the subscript represents the metabolite parameter vector $p$ introduced above.

$$F_p = \frac{1}{\sigma_{\mathrm{Noise}}^{2}} \cdots$$
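As an illustration of the traditional (purely parametric) bounds, the sketch below computes CRBs and the associated correlation matrix from a numerically differentiated model. It assumes the standard textbook form of the Fisher information matrix for complex white Gaussian noise, $F_p = \Re(D^{H} D)/\sigma_{\mathrm{Noise}}^{2}$, with $D$ the Jacobian of the model samples with respect to $p$ and $\sigma_{\mathrm{Noise}}$ the noise standard deviation per real or imaginary channel. This form, and the function names, are assumptions of the sketch; the extension for the background uncertainty is not included.

```python
import numpy as np

def numerical_jacobian(model, p, eps=1e-6):
    """Central-difference Jacobian D[n, j] = d model(p)[n] / d p[j]."""
    p = np.asarray(p, dtype=float)
    n_samples = np.asarray(model(p)).size
    jac = np.empty((n_samples, p.size), dtype=complex)
    for j in range(p.size):
        step = np.zeros_like(p)
        step[j] = eps
        jac[:, j] = (model(p + step) - model(p - step)) / (2.0 * eps)
    return jac

def fisher_matrix(jac, noise_sd):
    """Assumed standard form F = Re(D^H D) / sigma^2 (complex white noise)."""
    return np.real(jac.conj().T @ jac) / noise_sd ** 2

def cramer_rao_bounds(jac, noise_sd):
    """Lower bounds on the standard deviations: sqrt(diag(F^-1))."""
    covariance = np.linalg.inv(fisher_matrix(jac, noise_sd))
    return np.sqrt(np.diag(covariance))

def correlation_matrix(jac, noise_sd):
    """Parameter correlations derived from the inverse Fisher matrix."""
    covariance = np.linalg.inv(fisher_matrix(jac, noise_sd))
    sd = np.sqrt(np.diag(covariance))
    return covariance / np.outer(sd, sd)
```

For a model with a single real amplitude multiplying a known signal $s$, this reduces to the bound $\sigma_{\mathrm{Noise}} / \sqrt{\sum_n |s_n|^2}$, which provides a quick sanity check of the implementation.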
