
IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 53, NO. 3, JUNE 2006

Real Time Digital Signal Processing Implementation for an APD-Based PET Scanner With Phoswich Detectors

R. Fontaine, Member, IEEE, M.-A. Tétrault, F. Bélanger, N. Viscogliosi, R. Himmich, J.-B. Michaud, Student Member, IEEE, S. Robert, J.-D. Leroux, H. Semmaoui, P. Bérard, J. Cadorette, C. M. Pepin, and R. Lecomte, Member, IEEE

Abstract—Recent progress in advanced digital signal processing provides an opportunity to expand the computation power required for real time extraction of event characteristics in avalanche photodiode (APD)-based Positron Emission Tomography (PET) scanners. These developments are made possible by a highly parallel data acquisition (DAQ) system based on an integrated analog front-end and a high-speed fully digital signal processing section that directly samples the output of each preamplifier with a free-running, off-the-shelf, 45-MHz analog-to-digital converter that feeds the sampled data into a field programmable gate array (FPGA) VirtexII PRO from Xilinx. This FPGA features 31 000 logic cells and two PowerPC processors, which allows up to 64 channels to be processed simultaneously. Each channel has its own digital signal processing chain including a trigger, a baseline restorer and a timestamp algorithm. Various timestamp algorithms have been tested so far, achieving a coincidence timing resolution of 3.2-ns full-width at half-maximum (FWHM) for APD coupled to Lutetium Oxyorthosilicate (APD-LSO) and 11.4-ns FWHM for APD coupled to Bismuth Germanium Oxide (APD-BGO) detectors, respectively. Channels are then multiplexed into a DSP processor from Texas Instruments for crystal identification by an ARMAX recursive algorithm borrowed from identification and vector quantization theory. The system can sustain an event rate of 10 000 events/s/channel without electronic dead time.

Index Terms—Avalanche photodiodes (APDs), field programmable gate array (FPGA), Positron Emission Tomography (PET), real time digital signal processing.

I. INTRODUCTION

Positron Emission Tomography (PET) imaging of small animals offers several unique advantages for biomedical and pharmaceutical research. Most current PET scanners rely on detectors based on photomultiplier tubes (PMTs) because they supply high-quality signals with large noiseless amplification and are easier to implement. However, large arrays of scintillators must be coupled to a small number of bulky PMTs to achieve the required spatial resolution, resulting in high dead time in the detector front-end and a limited maximum count rate per mm². One way to avoid these problems is to directly couple each individual scintillator to an avalanche photodiode (APD). The design of PET scanners based on such APD detectors involves many specific technological challenges and tradeoffs that did not need to be addressed with PMT-based detectors, such as the high front-end electronic channel density [1]. In another respect, the smaller diameter ring of animal scanners exacerbates parallax error, which makes it more difficult to achieve the required high spatial resolution across an extended field-of-view. High-precision coincidence timing and low dead time processing electronics are also required to enable high-count-rate capabilities and a more selective discrimination of useful events, which are crucial issues for dynamic imaging of fast biological processes.

Although many solutions have been proposed to solve these problems [2]–[4], implementing them in an analog electronic data acquisition (DAQ) system requires a major integration effort which severely limits system flexibility and upgradeability. Digital signal processing methods make it possible to address most of these issues [5]–[8]. With the advent of new highly integrated programmable electronic devices and recent progress in advanced digital signal processing, new opportunities can be exploited to explore innovative solutions to the aforementioned problems [7], [9]. In particular, it is now possible to sample analog signals directly at the output of the charge sensitive preamplifier (CSP) with a high-speed free-running analog-to-digital converter (ADC) and to defer processing to digital processors [9]. Highly parallel, real time digital signal processing algorithms can also be implemented in high-capacity field programmable gate arrays (FPGAs) to cope with the stringent requirements of small animal PET scanners. Moreover, new digital solutions to event time stamping, crystal identification and energy discrimination can be implemented and executed for each individual detector channel in real time without electronic dead time. This paper describes the hardware and the digital signal processing solutions that were developed to implement such real time signal feature extraction in a PET scanner.

Manuscript received June 12, 2005; revised March 29, 2006. This work was supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Fonds Québécois de Recherche sur la Nature et les Technologies (FQRNT). R. Fontaine, M.-A. Tétrault, F. Bélanger, N. Viscogliosi, R. Himmich, J.-B. Michaud, S. Robert, J.-D. Leroux, and H. Semmaoui are with the Department of Electrical and Computer Engineering, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada (e-mail: [email protected]). P. Bérard, J. Cadorette, C. M. Pepin, and R. Lecomte are with the Sherbrooke Molecular Imaging Centre, Department of Nuclear Medicine and Radiobiology, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada. Digital Object Identifier 10.1109/TNS.2006.875441

II. HARDWARE IMPLEMENTATION

The digital architecture used for real time implementation is a legacy of an initial architecture presented by Université de Sherbrooke, Sherbrooke, QC, Canada [9]. The PET architecture consists of four main subsystems linked through high-speed serial links as depicted in Figs. 1 and 2. A printed circuit board (PCB), the analog board, includes a 16-channel analog front-end based on an integrated CSP [10] along with 500-V individual adjustable regulators for APD biasing. Four analog boards are stacked to produce a 64-channel block. The differential signals from the CSPs are sampled by one digital board containing 32 dual-channel, 8-bit, 45-MHz ADCs. An FPGA from Xilinx [11] handles individual channel buffering, digital processing, and data concentration. This FPGA contains 13 696 slices (a slice is the basic logic block of Xilinx FPGAs that contains a register, function generator, MUX, etc.) and implements the digital signal processing chain, including an adaptive leading edge detector, a signal normalization block, a timestamp detector, and a data multiplexer (Fig. 3). The data multiplexer is used to concentrate data to a digital signal processor (DSP) from Texas Instruments,¹ which performs crystal identification of phoswich detectors. The result of crystal identification is sent back to the FPGA for event data packaging and shipping to a data concentrator through a 2.4-Gbps serial transceiver included in the FPGA. The events from four digital boards are merged into one Data Concentrator, where they are sorted according to their timestamp. Events are then sent to a Digital Coincidence engine, which retains the relevant ones based on their timestamp, geometric address and energy. Twelve digital boards, arranged in a star-like fashion, make up a four-layer, 192-crystal ring of detectors. Up to four (or eight) such rings can be stacked to form the scanner.

Fig. 1. High-level architecture of a four-ring PET scanner and layout of the analog and digital boards.

Fig. 2. Photograph of the analog and digital front-end boards. The analog board on the left holds the 16-channel preamplifier ASIC and individual APD biasing circuits. The digital board on the right implements 32 dual-ADCs and digital signal processing.

Fig. 3. Architecture of one digital signal processing channel (DAQ channel) included in an FPGA.

¹TMS320C6414, TMS320C6415, TMS320C6416 Fixed-Point Digital Signal Processors (Rev. M), SPRS136M, Texas Instruments, Dallas, TX (www.ti.com).

0018-9499/$20.00 © 2006 IEEE

III. FPGA AND TIMESTAMP EXTRACTION

The digital signal processing chain in the DAQ FPGA consists of five cascaded blocks as depicted in Fig. 3: an ADC block, a First In First Out (FIFO) memory, a baseline restorer, a normalization block, and a timestamp algorithm. The ADC block is based on a free running ADC that samples data and stores them in the FIFO. When an event has energy higher than a predetermined leading edge level, the value from a 24-bit reference counter incremented synchronously with the 45-MHz ADC clock is latched as a rough timestamp. The DC level is evaluated from the mean value of the floor samples before the trigger flag. The maximum amplitude, DC level, timestamp, and 64 samples are stored in a FIFO, implemented in an 8-bit × 2048 block RAM. The latter can store up to 25 events before event loss occurs. This first DAQ section requires 150 slices per channel and can run at up to 150 MHz.

The Baseline Restorer block subtracts the DC level as samples are read from the FIFO. The digital signal processing algorithms that follow require normalized data. The Normalization block performs this step by dividing all event samples by the maximum value found in the sampled data stream (64 bytes). Division is performed with a lookup table residing in a block RAM and a hardware multiplier included in the FPGA, giving the result in a single clock cycle.

A customized PicoBlaze microcontroller [12] manages normalization and baseline restoration, and implements a more accurate timestamp calculation algorithm to refine the rough value. The PicoBlaze is a fully customizable, logic implemented 8-bit microcontroller. It uses about 100 slices of logic and an instruction block RAM. It can run at up to 100 MIPS. One PicoBlaze unit copes with eight DAQ channels.
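As a rough illustration of the baseline restoration and normalization steps described above, the following Python sketch models them in software. It is not the authors' FPGA logic. The number of pre-trigger floor samples (PRE_TRIGGER), the fixed-point width of the reciprocal table (FRAC_BITS), the output full scale (OUT_SCALE), and the clamping of negative values are all assumptions made for the example. Only the 8-bit sample width, the 64-sample event length, and the replacement of a division by a table lookup plus one multiplication come from the text.

```python
# Software model of baseline restoration followed by peak normalization.
# All constants below are illustrative assumptions, not values from the paper.
PRE_TRIGGER = 8    # assumed number of floor samples preceding the trigger
FRAC_BITS = 16     # assumed fractional bits in the reciprocal lookup table
OUT_SCALE = 255    # assumed full-scale value of a normalized 8-bit sample

# Reciprocal table: RECIP[m] approximates OUT_SCALE * 2**FRAC_BITS / m.
# Entry 0 is unused (a zero peak is handled separately).
RECIP = [0] + [round(OUT_SCALE * (1 << FRAC_BITS) / m) for m in range(1, 256)]


def restore_and_normalize(samples):
    """Return (dc_level, peak, normalized_samples) for one event."""
    # DC level: mean of the floor samples that precede the trigger.
    dc = sum(samples[:PRE_TRIGGER]) // PRE_TRIGGER
    # Baseline restoration: subtract the DC level, clamping at zero (assumed).
    restored = [max(s - dc, 0) for s in samples]
    peak = max(restored)
    if peak == 0:
        return dc, peak, [0] * len(samples)
    # Normalization: one table lookup, then one multiply per sample,
    # with rounding, instead of a division.
    recip = RECIP[peak]
    half = 1 << (FRAC_BITS - 1)
    normalized = [(s * recip + half) >> FRAC_BITS for s in restored]
    return dc, peak, normalized
```

With a 64-sample event whose floor sits at 10 and whose peak reaches 110, the function reports a DC level of 10, a restored peak of 100, and a normalized peak equal to OUT_SCALE. The lookup table trades a small block of memory for a constant-time division, which is the design choice the text attributes to the Normalization block.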
With the PET scanner designed to work at 10 000 events/s/channel, 20 MIPS are required for eight DAQ channels, which one PicoBlaze can deliver; eight PicoBlazes are therefore required for processing the 64 channels residing on one digital board.

The initial 24-bit timestamp, latched at event trigger from a 45-MHz reference counter, has a precision of 22 ns, which is much coarser than the desired timing resolution of 2–3-ns full-width at half-maximum (FWHM) for detectors in coincidence. More advanced digital signal processing is therefore needed to interpolate the timestamp between samples with a higher accuracy. Algorithms based on simple and advanced digital signal processing [13] as well as neural networks [14] have been tested so far. These algorithms refine the initial 24-bit estimate by appending five least significant bits to the final timestamp. This configuration leads to a 0.68-ns precision on the timestamp, which is accurate enough for PET coincidence resolution with the selected detectors. Among the tested algorithms, a straightforward digital constant fraction discriminator (DCFD) derived by simple interpolation can be easily implemented in the PicoBlaze:

(1)

where the result is the timing correction, evaluated at a fixed percentage of the maximum signal amplitude between two data samples. This computation requires only a few clock cycles, which is negligible compared to the normalization/baseline processes. At a sampling rate of 45 MHz, this simple algorithm produces a coincidence time resolution of 3.2-ns FWHM for APD-LSO detectors (Fig. 4) and 11.4 ns for APD-BGO detectors (not shown). An improvement in time resolution by a factor of two or more is expected using more advanced digital timing discrimination algorithms [13], [14]. A further timing correction originating from the calibration process will shift th. This further correction, which involves the dynamic features of individual detectors and variation of the signal skew in the PCB, is not considered in this paper.

Once the timestamp is determined, an 8-to-1 multiplexer with a FIFO register queues the samples for crystal identification in the digital signal processor. This multiplexer requires 300 slices while the FIFO takes up 200 more slices of the FPGA resources. Table I summarizes the slices needed for implementing each processing block for real time digital signal processing.

Fig. 4. Coincidence time resolution for opposite APD-LSO detectors measured with a simple DCFD algorithm.

TABLE I NUMBER OF SLICES AND SPEED OF EACH FPGA BLOCK

IV. CRYSTAL IDENTIFICATION PROCESS

Efficient crystal identification measurement for depth-of-interaction (DOI) or position decoding represents an additional challenge in PET scanners. Various detector designs based on PMTs or APDs have been proposed so far [3], [4], [15], [16]. The actual implementation of many of these methods is still problematic and most did not go beyond simulation studies or laboratory tests [4], [15], [16]. Stacks of crystals with different scintillation light responses have also been proposed to improve DOI or position resolution [2], [17]–[21]. This approach is generally simpler to implement and yields excellent results when appropriate signal processing is applied. Such techniques for crystal identification can be based on pulse shape discrimination [22], [23], statistical methods [24], or frequency domain transforms [5].

The method used here for real time processing consists of modeling the whole DAQ chain in the Laplace domain [6]. The Laplace transform is preferred because computation can be performed with real numbers, in contrast to the Fourier transform where complex representation is required. Each individual block of the DAQ (the detector, the CSP, the antialiasing filter, and the ADC, including all individual noise contributions) is modeled in the pole-zero space. All a priori known contributors to the model are cascaded to form the system model represented by the adaptive filter of Fig. 5. Poles and zeros that do not significantly affect the model can be dropped to simplify it. The only unknown in the model is the scintillating crystal, which can be approximated by a single pole [6].

Fig. 5. General system identification scheme.

The identification process, which is a legacy from identification theory [25], is based on an adaptive filtering scheme which compares the output sampled from the CSPs with the output of a digital filter with adjustable coefficients (DFAC) controlled by an adaptive algorithm (AA) (Fig. 5). A least mean square (LMS) or a recursive least square (RLS) method evaluates the coefficients of the DFAC, which models the unknown part of the model, i.e., the crystal. The AA recursively tries to minimize the error obtained by subtracting the estimated response of the adaptive filter from the sampled output. An exogenous variable, which represents the noise, is also added to the model. This recursive process is better known as an identification model with an AutoRegressive Moving Average with eXogenous variable (ARMAX).
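The recursive estimation step can be illustrated with a deliberately simplified model. The Python sketch below fits a first-order model y[n] = a·y[n−1] + b·u[n] to a noise-free, impulse-driven decay by recursive least squares, then converts the estimated pole into a decay time through a = exp(−T/τ), where T is the 45-MHz sampling period. It is a stand-in under stated assumptions rather than the authors' ARMAX filter: it has no noise (moving-average) term, it ignores the known poles and zeros of the CSP and antialiasing filter, and the function names, the initial covariance `delta`, and the forgetting factor are hypothetical choices made for the example.

```python
import math

SAMPLE_PERIOD_S = 1.0 / 45e6  # 45-MHz ADC clock, from the text


def rls_single_pole(y, u, forgetting=1.0, delta=1e6):
    """Estimate (a, b) in y[n] = a*y[n-1] + b*u[n] by recursive least squares."""
    a_hat, b_hat = 0.0, 0.0
    # 2x2 covariance matrix P, initialised to delta * identity (assumed).
    p00, p01, p10, p11 = delta, 0.0, 0.0, delta
    y_prev = 0.0
    for y_n, u_n in zip(y, u):
        x0, x1 = y_prev, u_n                    # regressor vector phi
        g0 = p00 * x0 + p01 * x1                # P * phi
        g1 = p10 * x0 + p11 * x1
        denom = forgetting + x0 * g0 + x1 * g1
        k0, k1 = g0 / denom, g1 / denom         # gain vector
        err = y_n - (a_hat * x0 + b_hat * x1)   # a priori prediction error
        a_hat += k0 * err
        b_hat += k1 * err
        h0 = x0 * p00 + x1 * p10                # phi^T * P
        h1 = x0 * p01 + x1 * p11
        p00, p01 = (p00 - k0 * h0) / forgetting, (p01 - k0 * h1) / forgetting
        p10, p11 = (p10 - k1 * h0) / forgetting, (p11 - k1 * h1) / forgetting
        y_prev = y_n
    return a_hat, b_hat


def decay_time_from_pole(a):
    """Map a discrete-time pole to a decay time, assuming a = exp(-T/tau)."""
    return -SAMPLE_PERIOD_S / math.log(a)


def simulate_decay(tau_s, n_samples=32, gain=1.0):
    """Noise-free single-exponential response to a unit impulse (test data)."""
    a = math.exp(-SAMPLE_PERIOD_S / tau_s)
    u = [1.0] + [0.0] * (n_samples - 1)
    y, prev = [], 0.0
    for u_n in u:
        prev = a * prev + gain * u_n
        y.append(prev)
    return y, u
```

On simulated 40-ns and 300-ns decays, the recovered decay times match the simulated ones closely, so a pole estimate alone already separates two such crystals in this idealized setting. With measured pulses, noise and the remaining stages of the chain would have to be modeled, which is what the exogenous term and the cascaded known model in the text provide.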
The identification process outputs an estimation of the remaining pole and gain of the crystal signal and an estimation of the noise. The digital filter poles and zeros are compared with known matrix data using a derivative of vector quantization (VQ) [26]. A population of 18 000 events is used to build subspaces according to the pole, gain and error of the crystal model in our case. As an example, the populations for BGO and LSO signals including Compton photons are represented in this three-dimensional (3-D) space in Fig. 6. A boundary plane can be traced to separate these populations for discrimination. Simulation of this digital method achieved a discrimination rate of 100% for LSO-BGO crystals (40-ns and 300-ns decay times, respectively). However, crystal pairs with faster decay times, chosen to improve the coincidence timing resolution, are harder to discriminate. A simulation performed with an LSO-LYSO pair (40 ns and 50 ns, respectively) showed that a discrimination rate better than 99.5% can be achieved with Compton photons included (Table II) [6].

Fig. 6. Clusters created by populations of LSO and BGO events in the poles subspaces and boundary plane used for discrimination.

TABLE II ERROR RATES OF CRYSTAL IDENTIFICATION WITH THE ARMAX METHOD

Fig. 7. Crystal identification algorithm implemented in the digital signal processor.

Crystal identification of phoswich detectors is implemented in a digital signal processor from Texas Instruments. This DSP can handle eight 8-bit multiply and accumulate (MAC) operations per cycle, for a total of 5760 million MACs per second at a clock frequency of 720 MHz. Two external memory interfaces (EMIFs), EMIFA and EMIFB, are used for data transfer between the FPGA and the DSP for real time implementation of DOI measurement (Fig. 7). After the transfer of the samples with their timestamp, the ARMAX identification process is executed. Initial values are injected into the algorithm to increase the convergence rate. The RLS iterations are made on 32 samples. A derivative of VQ is then executed for crystal identification. Identifying a crystal requires 70 MACs per sample; hence 2240 MACs are needed per event, or nearly 1500 million MACs per second per digital board for an estimated event rate of 10 000 events/s/detector.
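The processing budget quoted above can be checked with a few lines of arithmetic. The Python sketch below uses only figures given in the text (eight MACs per cycle, a 720-MHz clock, 70 MACs per sample, 32 samples per event, 10 000 events/s per detector, 64 channels per digital board). The function name and the assumption that all 64 channels run at the full event rate simultaneously, with the peak MAC rate sustained, are illustrative.

```python
# Figures taken from the text.
MACS_PER_CYCLE = 8
CLOCK_HZ = 720_000_000
MACS_PER_SAMPLE = 70
SAMPLES_PER_EVENT = 32
EVENT_RATE_PER_CHANNEL = 10_000   # events/s per detector
CHANNELS_PER_BOARD = 64


def mac_budget():
    """Return (capacity, macs_per_event, load, utilization) for one board."""
    capacity = MACS_PER_CYCLE * CLOCK_HZ                  # peak MACs per second
    macs_per_event = MACS_PER_SAMPLE * SAMPLES_PER_EVENT  # MACs per event
    # Worst case assumed here: every channel at the full event rate.
    load = macs_per_event * EVENT_RATE_PER_CHANNEL * CHANNELS_PER_BOARD
    return capacity, macs_per_event, load, load / capacity
```

The peak capacity comes out at 5760 million MACs per second and the per-event cost at 2240 MACs. The full-rate load is 1433.6 million MACs per second, which the text rounds to nearly 1500 million, or roughly a quarter of the DSP's peak capacity under these assumptions.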


V. CONCLUSION

Real time extraction of energy, timestamp, and crystal identification can be performed using dedicated high-performance algorithms implemented in programmable digital signal processing PET electronics. The tested algorithms achieved a coincidence timing resolution of 3.2-ns FWHM for APD-LSO detectors in coincidence and a crystal identification rate of nearly 100%, even with scintillators having almost identical light responses. A sustained rate of up to 10 000 events/s/detector can be supported without electronic dead time or event losses.

ACKNOWLEDGMENT

The authors want to thank Réjean Bernier, Daniel Rouleau, Olivier Lessard, Joël Riendeau, and François Lepage for their contribution in the scanner development and tests. They are also grateful to Xilinx, Inc., which provided the FPGA and the tools for conducting tests, and to PerkinElmer Optoelectronics (Vaudreuil, QC, Canada) for the supply of prototype APDs and APD detectors. Finally, the authors want to thank the Canadian Microelectronics Corporation for its contribution to the fabrication of the ASIC.

REFERENCES

[1] R. Lecomte, “Initial results from the Sherbrooke APD positron tomograph,” IEEE Trans. Nucl. Sci., vol. 43, no. 3, pp. 1952–1957, Jun. 1996.
[2] A. Saoudi and R. Lecomte, “A novel APD-based detector module for multi-modality PET/SPECT/CT scanners,” IEEE Trans. Nucl. Sci., vol. 46, no. 3, pp. 479–484, Jun. 1999.
[3] Murayama, “A depth encoding scintillation detector unit with a position-sensitive photomultiplier tube,” IEEE Trans. Nucl. Sci., vol. 47, pp. 1045–1050, 2000.
[4] W. Moses, “A room temperature LSO/PIN photodiode PET detector module that measures depth of interaction,” IEEE Trans. Nucl. Sci., vol. 42, no. 4, pp. 1085–1089, Aug. 1995.
[5] M. Streun, “Pulse shape discrimination of LSO and LuYAP scintillators for depth of interaction detection in PET,” in IEEE NSS/MIC Conf. Rec., 2002, vol. 3, pp. 1636–1639.
[6] J.-B. Michaud, “Experimental results of identification and vector quantization algorithms for DOI measurement in digital PET scanners with phoswich detectors,” presented at the IEEE NSS/MIC Conf., Rome, Italy, Oct. 2004.
[7] M. Streun, G. Brandenburg, H. Larue, C. Perl, and K. Ziemons, “The data acquisition system of ClearPET Neuro—A small animal PET scanner,” IEEE Trans. Nucl. Sci., vol. 53, no. 3, pp. 700–704, Jun. 2006.
[8] M.-A. Tétrault, “Real-time digital coincidence detection system for high resolution APD-based animal PET scanner,” in Proc. NSS/MIC 2005, Oct. 2005.
[9] R. Fontaine, “Preliminary results of a mixed signal data acquisition sub-system for a distributed, digital, computational, APD-based, dual-modality PET/CT architecture for small animal imaging,” presented at the IEEE NSS/MIC Conf., Rome, Italy, Oct. 2004.
[10] J.-F. Pratte, S. Robert, G. De Geronimo, P. O’Connor, S. Stoll, C. M. Pepin, R. Fontaine, and R. Lecomte, “Design and performance of 0.18-μm CMOS charge preamplifiers for APD-based PET scanners,” IEEE Trans. Nucl. Sci., vol. 51, no. 5, pp. 1979–1985, Oct. 2004.
[11] Virtex II PRO and Virtex II PRO X Platform FPGAs: Complete Datasheet, DS083, V4.2, Xilinx, Inc., San Jose, CA, Mar. 1, 2005 [Online]. Available: www.xilinx.com
[12] PicoBlaze 8-bit Embedded Microcontroller User Guide for Spartan-3, Virtex-II, and Virtex-II Pro FPGAs, UG129, V1.1, Jun. 10, 2004 [Online]. Available: www.xilinx.com
[13] J.-D. Leroux, “Time determination of BGO-APD detectors by digital signal processing for positron emission tomography,” presented at the IEEE NSS/MIC Conf., Portland, OR, Oct. 2003.
[14] J.-D. Leroux, “Time discrimination techniques using artificial neural networks for positron emission tomography,” presented at the IEEE NSS/MIC Conf., Rome, Italy, Oct. 2004.
[15] R. Bruyndonckx, S. Leonard, J. Liu, S. Tavernier, P. Szupryczynski, and A. Fedorov, “Study of spatial resolution and depth of interaction of APD-based PET detector modules using light sharing schemes,” IEEE Trans. Nucl. Sci., vol. 50, no. 5, pp. 1415–1419, Oct. 2003.
[16] P. A. Dokhale, “Performance measurements of a depth-encoding PET detector module based on position-sensitive avalanche photodiode read-out,” Phys. Med. Biol., vol. 49, no. 18, pp. 4293–4304, 2004.
[17] N. Inadama, “A depth-of-interaction detector for PET with GSO crystals doped with different amounts of Ce,” Proc. IEEE NSS/MIC, pp. M.2–M.3, 2001.
[18] R. Lecomte, “A new dual crystal depth sensitive detector for high resolution PET cameras,” J. Nucl. Med., vol. 27, p. 974, 1986.
[19] L. R. MacDonald and M. Dahlbom, “Depth of interaction for PET using segmented crystals,” IEEE Trans. Nucl. Sci., vol. 45, no. 4, pp. 2144–2148, Aug. 1998.
[20] S. Holte, H. Ostertag, and M. Kesselberg, “A preliminary evaluation of a dual crystal positron camera,” J. Comput. Assist. Tomogr., vol. 11, no. 4, pp. 691–697, 1987.
[21] R. Lecomte, A. Saoudi, D. Rouleau, H. Dautet, D. Waechter, M. Andreaco, M. Casey, L. Eriksson, and R. Nutt, “An APD-based quad scintillator detector module with pulse shape discrimination coding for PET,” in Proc. IEEE NSS/MIC Conf., 1999, vol. 3, pp. 1445–1447.
[22] J. Seidel, J. J. Vaquero, S. Siegel, W. R. Gandler, and M. V. Green, “Depth identification accuracy of a three layer phoswich PET detector module,” IEEE Trans. Nucl. Sci., vol. 46, no. 3, pp. 485–490, Jun. 1999.
[23] A. Saoudi, C. M. Pepin, F. Dion, M. Bentourkia, R. Lecomte, M. Andreaco, M. Casey, R. Nutt, and H. Dautet, “Investigation of depth-of-interaction by pulse shape discrimination in multicrystal detectors read out by avalanche photodiodes,” IEEE Trans. Nucl. Sci., vol. 46, no. 3, pp. 462–467, Jun. 1999.
[24] T. A. DeVol, “Monte Carlo optimization of depth-of-interaction resolution in PET crystals,” IEEE Trans. Nucl. Sci., vol. 40, no. 2, pp. 170–174, Apr. 1993.
[25] L. Ljung, System Identification: Theory for the User, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, 1998, p. 609.
[26] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Norwell, MA: Kluwer, 1992, p. 732.
