IEEE SIGNAL PROCESSING LETTERS, VOL. 5, NO. 8, AUGUST 1998
Application of Blind Source Separation to 1-D and 2-D Nuclear Magnetic Resonance Spectroscopy
Danielle Nuzillard and Jean-Marc Nuzillard
Abstract—In this letter, we present the application of blind source separation to the processing of mixtures of one-dimensional (1-D) and two-dimensional (2-D) nuclear magnetic resonance subspectra. A spectroscopic technique called distortionless enhancement by polarization transfer (DEPT) allows the generation of three spectra in which the response intensities of the CH, CH2, and CH3 fragments of organic molecules are modulated according to an experimental parameter. The blind source separation technique offers an attractive way of separating these responses automatically. A scheme that allows the processing of 2-D DEPT-like data sets is presented.
Index Terms—Blind source separation, nuclear magnetic resonance, spectroscopy.
I. INTRODUCTION
SOURCE separation offers a solution to a widespread problem in the field of engineering that consists of finding which signals have been produced by a set of sources using information recorded by sensors. Blind source separation is applied when little or nothing is known about the mixing process [1]–[4]. This letter presents an original application of blind source separation in the field of one-dimensional (1-D) and two-dimensional (2-D) nuclear magnetic resonance (NMR) spectroscopy. Blind source separation techniques are based on second- and/or fourth-order statistical properties of signals. Two algorithms were considered: JADE (joint approximate diagonalization of eigen-matrices [5]) and second-order blind identification (SOBI) [6]. The former requires very stationary signals, as proved by numerical simulations. The latter is based only on second-order statistics and has proved to be reliable in the context of noisy and nonstationary signals. Both methods work on statistically independent signals, even though this constraint can be relaxed to some extent. The following sections only deal with the SOBI algorithm.
II. PRINCIPLE OF THE SOBI ALGORITHM
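Let $\mathbf{s}(t)$ be a vector of source signals with $n$ components, $\mathbf{A}$ an $m \times n$ mixing matrix ($m \ge n$), and $\mathbf{n}(t)$ the noise introduced by the $m$ sensors in the vector $\mathbf{x}(t)$ of the detected signals: $\mathbf{x}(t) = \mathbf{A}\mathbf{s}(t) + \mathbf{n}(t)$. The goal of source separation is to find estimates $\hat{\mathbf{A}}$ and $\hat{\mathbf{s}}(t)$ of $\mathbf{A}$ and $\mathbf{s}(t)$ so that $\mathbf{x}(t) \approx \hat{\mathbf{A}}\hat{\mathbf{s}}(t)$. The SOBI algorithm resorts to the properties of the covariance matrix $\mathbf{R}_{v}(\tau)$ of a vector signal $\mathbf{v}(t)$, defined by $\mathbf{R}_{v}(\tau) = E[\mathbf{v}(t+\tau)\,\mathbf{v}^{H}(t)]$. The statistical independence of the sources ensures that $\mathbf{R}_{s}(\tau)$ is always a diagonal matrix. The detected signals being the only input of the separation process, the source signals can only be known as relative values. They are considered as normalized so that $\mathbf{R}_{s}(0) = \mathbf{I}$. The SOBI algorithm starts with the search for an $n \times m$ whitening matrix $\mathbf{W}$ so that the whitened data $\mathbf{z}(t) = \mathbf{W}\mathbf{x}(t)$ forms a set of orthogonal vectors: the estimated $\mathbf{R}_{z}(0)$ equals $\mathbf{I}$. As a consequence, $\mathbf{W}\mathbf{A}$ is a unitary transformation noted $\mathbf{U}$, and $\mathbf{R}_{z}(\tau) = \mathbf{U}\,\mathbf{R}_{s}(\tau)\,\mathbf{U}^{H}$ for any value of $\tau$. It means that $\mathbf{U}$ can be found as the unitary transformation that jointly diagonalizes any set of covariance matrices of the whitened data. Then $\mathbf{A}$ and $\mathbf{s}(t)$ are evaluated as $\hat{\mathbf{A}} = \mathbf{W}^{\#}\mathbf{U}$, where $\mathbf{W}^{\#}$ is the pseudo-inverse of $\mathbf{W}$, and $\hat{\mathbf{s}}(t) = \mathbf{U}^{H}\mathbf{W}\mathbf{x}(t)$.

Manuscript received November 30, 1997. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. T. S. Durrani.
D. Nuzillard is with Faculté des Sciences, LAM, 51687 Reims Cedex 2, France (e-mail: [email protected]).
J.-M. Nuzillard is with the Laboratoire de Pharmacognosie, CPCBAI, 51687 Reims Cedex 2, France (e-mail: [email protected]).
Publisher Item Identifier S 1070-9908(98)05913-6.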
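The steps of this section (whitening, joint diagonalization of lagged covariance matrices of the whitened data, then recovery of the mixing matrix and of the sources) can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the implementation of [6]: it handles real-valued signals only, ignores the noise term in the whitening step, and performs the joint diagonalization with Jacobi-type Givens rotations. The function names `lagged_cov`, `joint_diag`, and `sobi`, as well as the three-sinusoid test mixture, are illustrative choices.

```python
import numpy as np

def lagged_cov(z, lag):
    """Symmetrized sample covariance of the rows of z at the given time lag."""
    n = z.shape[1]
    r = z[:, lag:] @ z[:, :n - lag].T / (n - lag)
    return (r + r.T) / 2

def joint_diag(mats, sweeps=50, tol=1e-10):
    """Approximate joint diagonalization of real symmetric matrices.

    Returns an orthogonal u such that u.T @ m @ u is close to diagonal
    for every m in mats (Jacobi sweeps of Givens rotations)."""
    mats = [m.copy() for m in mats]
    n = mats[0].shape[0]
    u = np.eye(n)
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                # Closed-form rotation angle for the (p, q) plane, real case
                g = np.array([[m[p, p] - m[q, q] for m in mats],
                              [m[p, q] + m[q, p] for m in mats]])
                gg = g @ g.T
                ton = gg[0, 0] - gg[1, 1]
                toff = gg[0, 1] + gg[1, 0]
                theta = 0.5 * np.arctan2(toff, ton + np.hypot(ton, toff))
                if abs(theta) > tol:
                    rotated = True
                    c, s = np.cos(theta), np.sin(theta)
                    rot = np.array([[c, -s], [s, c]])
                    for m in mats:
                        m[:, [p, q]] = m[:, [p, q]] @ rot
                        m[[p, q], :] = rot.T @ m[[p, q], :]
                    u[:, [p, q]] = u[:, [p, q]] @ rot
        if not rotated:
            break
    return u

def sobi(x, n_sources, lags=(1, 2, 3, 4)):
    """Estimate the mixing matrix and the sources from the rows of x."""
    x = x - x.mean(axis=1, keepdims=True)
    # Whitening matrix from the eigen-decomposition of the zero-lag covariance
    vals, vecs = np.linalg.eigh(lagged_cov(x, 0))
    vals, vecs = vals[-n_sources:], vecs[:, -n_sources:]
    w = (vecs / np.sqrt(vals)).T
    z = w @ x
    # Rotation that jointly diagonalizes the lagged covariances of z
    u = joint_diag([lagged_cov(z, k) for k in lags])
    return np.linalg.pinv(w) @ u, u.T @ z

# Demo on a synthetic mixture (hypothetical test signals, not NMR data)
t = np.arange(4000)
s = np.vstack([np.sin(0.05 * t), np.sin(0.31 * t + 1.0), np.sin(0.9 * t + 2.0)])
a = np.array([[1.0, 0.5, 0.2], [0.3, 1.0, 0.4], [0.6, 0.2, 1.0]])
x = a @ s
a_hat, s_hat = sobi(x, n_sources=3)
# Absolute correlation between each true source (rows) and each estimate (columns)
match = np.abs(np.corrcoef(np.vstack([s, s_hat]))[:3, 3:])
print(np.round(match, 3))
```

The estimated sources are recovered only up to order, sign, and scale, which is why the demo compares absolute correlations rather than the signals themselves.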
III. TREATMENT OF 1-D CARBON-13 NMR SPECTRA
An NMR time-domain signal is modeled by a sum of damped harmonic functions to which Gaussian noise is superimposed. Signal processing is performed by Fourier analysis, resulting in spectra made of sums of Lorentzian-shaped resonance lines [7]. Statistical independence of two signals requires their scalar product to be zero, in the time domain as well as in the frequency domain. Therefore, the narrower the resonance lines, the lower the probability of signal dependence. Carbon-13 NMR spectroscopy is ideally suited for blind source separation, as it yields spectra made of narrow lines whose frequencies (NMR chemical shifts, noted $\delta$ and expressed in parts per million, or ppm) are spread over a relatively wide range. The main problem associated with carbon-13 NMR spectra is their low signal-to-noise ratio (SNR).
The distortionless enhancement by polarization transfer (DEPT) technique differentiates the NMR signals originating from carbons bearing one, two, or three hydrogen atoms. The sample excitation sequence is parameterized by a “flip angle” $\theta$, resulting in signal intensities proportional to $\sin\theta\,\cos^{n-1}\theta$ for a CHn group [8]. Using $\theta$ = 45°, 90°, and 135° produces three DEPT spectra that are linear combinations of the CH, CH2, and CH3 subspectra. The problem is to separate them. The mixing process could be completely determined if the parameter $\theta$ were known accurately. Its value depends on spectrometer calibration and may vary from sample to sample. This fact justifies the search for an efficient automatic procedure by blind source separation.
Three DEPT time-domain signals of an organic molecule of plant origin were recorded. A Lorentzian line-broadening
filter is applied in order to improve the SNR of the Fourier spectrum. A phase correction is then applied to the spectra so that their real parts display symmetrical Lorentzian line shapes (see Fig. 1). An inverse Fourier transformation (FT) yields the data submitted to the SOBI source separation algorithm. The number of covariance matrices to be jointly diagonalized is set to four, its default value [6]. Time lags correspond to shifts of one to four time-domain sampling points. The three spectra in Fig. 2 are obtained by FT of the data produced by the SOBI algorithm. A human operator could not achieve a better result by trial and error. It is interesting to note that groups of very close resonance lines are correctly handled, such as those located at 25.5, 37, and 67 ppm. However, some small unwanted peak residues are visible, like the one at 48 ppm in the CH3 subspectrum (Fig. 2, top trace). It corresponds to an incomplete cancellation of the CH signal that exists at this chemical shift. Such artifacts are caused by frequency glitches occurring during data recording.
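The DEPT mixing model of this section can be illustrated numerically. The sketch below assumes the standard DEPT intensity dependence, sin θ cos^(n-1) θ for a CHn group with constant factors omitted, and builds the 3 × 3 mixing matrix for the three flip angles. The function name `dept_mixing_matrix` and the 5% calibration error are hypothetical, chosen only to show how a miscalibrated flip angle changes the mixing coefficients that a fixed, precomputed linear combination would rely on.

```python
import math

def dept_mixing_matrix(angles_deg):
    """One row per flip angle; columns are the CH, CH2, and CH3 weights."""
    rows = []
    for deg in angles_deg:
        th = math.radians(deg)
        rows.append([math.sin(th) * math.cos(th) ** (n - 1) for n in (1, 2, 3)])
    return rows

nominal_angles = (45.0, 90.0, 135.0)
nominal = dept_mixing_matrix(nominal_angles)
# Hypothetical miscalibration: every flip angle is 5% larger than intended
miscal = dept_mixing_matrix([1.05 * a for a in nominal_angles])

for angle, row_nom, row_mis in zip(nominal_angles, nominal, miscal):
    print(angle,
          ["%+.3f" % v for v in row_nom],
          ["%+.3f" % v for v in row_mis])
```

At the nominal 90° setting the CH2 and CH3 weights vanish, so that spectrum contains CH responses only; with the miscalibrated angle they no longer vanish, which is the situation where a blind estimate of the mixing matrix becomes useful.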
Fig. 1. Three experimental DEPT spectra of a test organic molecule, combinations of its CH, CH2, and CH3 carbon-13 NMR subspectra.
IV. TREATMENT OF 2-D E-HSQC NMR SPECTRA
The most sensitive atomic nucleus for NMR spectroscopy is $^{1}$H, also named proton. Its main drawback is the relatively narrow range of chemical shifts compared to individual resonance line widths. This means that even for simple component molecules within mixtures, the statistical independence of their signals is practically never ensured. The interpretation of 1-D $^{1}$H and $^{13}$C NMR spectra is simplified by 2-D NMR spectroscopy [7]. Among the numerous available possibilities, the heteronuclear single quantum correlation (HSQC) technique produces 2-D spectra in which a peak occurs at coordinates equal to the $^{1}$H and $^{13}$C chemical shifts of directly bound H and C nuclei [9]. The proton signals that are superimposed on a single 1-D axis are spread onto a 2-D surface, making collisions much less likely.
Identification of responses from CH, CH2, and CH3 groups is achieved by the E-HSQC technique, adapted from E-HMQC [10], whose principle is similar to that of 1-D DEPT. A flip angle parameter whose calibration may be uncertain produces 2-D spectra in which the CH, CH2, and CH3 subspectra are mixed. An interactive separation is far less practical when manipulating 2-D spectra, and therefore an automatic procedure brings a real simplification of the separation task.
The SOBI algorithm is designed to deal with 1-D time-domain signals. Therefore, handling of 2-D spectra requires some pre- and postprocessing. The 2-D spectra are produced by 2-D FT of apodized time-domain signals. In order to treat a reasonable amount of significant data, rectangles around peak locations are drawn and the data points lying outside them are discarded. The rows within each rectangle are joined to form a single vector. A pseudo-dataset in the frequency domain is built from the concatenation of all these vectors. It is completed with zeros to bring its size to the next integer power of two. An inverse fast FT leads to the desired time-domain data set.
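The packing of selected 2-D regions into a pseudo-1-D data set can be sketched as follows in Python with NumPy. This is an illustration under stated assumptions rather than the authors' code: rectangles are given as hypothetical `(row_start, row_stop, col_start, col_stop)` index tuples, the spectrum is a real-valued 2-D array, and the function names `pack_rectangles` and `unpack_rectangles` are invented for the example. The second helper reverses the packing so that a round trip can be checked.

```python
import numpy as np

def pack_rectangles(spectrum2d, rects):
    """Join the rows of each rectangle, concatenate, zero-fill, inverse FFT."""
    pieces = [spectrum2d[r0:r1, c0:c1].ravel() for r0, r1, c0, c1 in rects]
    freq = np.concatenate(pieces)
    # Smallest power of two that can hold the concatenated points
    size = 1 << (len(freq) - 1).bit_length()
    padded = np.zeros(size, dtype=complex)
    padded[:len(freq)] = freq
    return np.fft.ifft(padded)

def unpack_rectangles(time_signal, rects, shape):
    """FFT a pseudo time-domain signal and put the pieces back in a blank 2-D array."""
    freq = np.fft.fft(time_signal).real
    out = np.zeros(shape)
    pos = 0
    for r0, r1, c0, c1 in rects:
        count = (r1 - r0) * (c1 - c0)
        out[r0:r1, c0:c1] = freq[pos:pos + count].reshape(r1 - r0, c1 - c0)
        pos += count
    return out

# Round trip on a small random "spectrum" (synthetic data, two rectangles)
rng = np.random.default_rng(1)
spec = rng.normal(size=(16, 32))
rects = [(1, 4, 2, 7), (8, 12, 20, 30)]
sig = pack_rectangles(spec, rects)
back = unpack_rectangles(sig, rects, spec.shape)
print(len(sig), np.allclose(back[1:4, 2:7], spec[1:4, 2:7]))
```

Because the FFT is linear, any linear combination applied to several packed signals (as a separation algorithm does) carries over unchanged to the unpacked rectangles.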
Preprocessing is performed on the three E-HSQC spectra and its result is submitted to separation through the SOBI algorithm. Postprocessing is achieved by FT of the separated pseudo-1-D time-domain signals, cutting of the pseudo-1-D spectra into pieces of the sizes of the initial rectangles, and placement of the rectangles into blank 2-D spectra.

Fig. 2. Separated CH, CH2, and CH3 subspectra, from bottom to top.

The E-HSQC spectra of our test molecule are submitted to the whole process. Zero filling, apodization of raw 2-D time-domain data by a cosine arch function, 2-D FT, and spectra phasing are performed in both dimensions. One of the resulting spectra is presented in Fig. 3(a) as a contour plot. From the 128 k points of the real part of each 2-D spectrum, 6868 points enclosed in 17 rectangles form the pseudo-1-D datasets whose inverse FTs are submitted to SOBI. The number of covariance matrices and the time lags are set as in the 1-D treatment. The result of the whole processing is presented in Fig. 3(b)–(d), showing stacked plots of the 2-D subspectra of the CH3, CH2, and CH groups, respectively. This representation is chosen to show the high quality of the separation.

Fig. 3. (a) One of the E-HSQC spectra of the test organic molecule. (b)–(d) Stacked plots of the separated HSQC subspectra of the CH3, CH2, and CH groups. The view is restricted to the upper right quarter of the spectral region shown in (a).

V. CONCLUSION
The first example of blind source separation applied to 1-D and 2-D NMR signals is presented. The described processing scheme allows good-quality automation of a procedure that is otherwise performed manually. Other applications of blind source separation in NMR spectroscopy can be foreseen, such as the analysis of physical mixtures of compounds or the analysis of chemical reaction kinetics.

REFERENCES
[1] C. Jutten and J. Hérault, “Blind separation of sources: An adaptive algorithm based on neuromimetic architecture,” Signal Process., vol. 24, pp. 1–10, 1991.
[2] E. Moreau and O. Macchi, “New self-adaptive algorithms for source separation based on contrast functions,” in Proc. IEEE Signal Processing Workshop on Higher Order Statistics, Lake Tahoe, NV, 1993, pp. 215–219.
[3] N. Delfosse and P. Loubaton, “Adaptive separation of independent sources: A deflation approach,” in Proc. ICASSP, 1994, vol. 4, pp. 41–44.
[4] P. Comon, “Independent component analysis, a new concept,” Signal Process., vol. 36, pp. 287–314, Apr. 1994.
[5] J.-F. Cardoso and A. Souloumiac, “Blind beamforming for non-Gaussian signals,” Proc. Inst. Elect. Eng. F, vol. 140, pp. 362–370, Dec. 1993.
[6] A. Belouchrani, K. Abed-Meraim, J.-F. Cardoso, and E. Moulines, “A blind separation technique using second order statistics,” IEEE Trans. Signal Processing, vol. 45, pp. 434–444, Feb. 1997.
[7] R. R. Ernst, G. Bodenhausen, and A. Wokaun, Principles of Nuclear Magnetic Resonance in One and Two Dimensions. Oxford, U.K.: Clarendon, 1987.
[8] M. R. Bendall, D. T. Pegg, and D. M. Doddrell, “Pulse sequences utilizing the correlated motion of coupled heteronuclei in the transverse plane of the doubly rotating frame,” J. Magn. Reson., vol. 52, pp. 81–117, 1983.
[9] G. Bodenhausen and D. J. Ruben, “Natural abundance nitrogen-15 NMR by enhanced heteronuclear spectroscopy,” Chem. Phys. Lett., vol. 69, pp. 185–189, 1980.
[10] X. Zhang and C. Wang, “Proton detected editable heteronuclear multiple quantum correlation experiment at natural abundance,” J. Magn. Reson., vol. 91, pp. 618–623, 1991.