AN FPGA BASED GENERIC PROTOTYPING PLATFORM EMPLOYED IN A CMOS LASER DOPPLER BLOOD FLOW CAMERA
Yiqun Zhu, Barrie R Hayes-Gill, Steve P Morgan, and Nguyen Hoang
School of Electrical and Electronic Engineering, The University of Nottingham, University Park, Nottingham, NG7 2RD, United Kingdom
Email:
[email protected]
ABSTRACT
This paper presents an FPGA based generic prototyping platform employed in a CMOS laser Doppler blood flow imaging application. The proposed platform consists of a Xilinx Spartan-3 FPGA device, Texas Instruments 125MSPS, 14-bit ADCs, Cypress 167MHz, 2Mx18bits QDR memory, and a high-speed (480Mbit/s) USB 2.0 link. In this platform, the FPGA device not only acts as the controller for the ADCs, the QDR memory and the USB link, but also provides a high-speed FFT processing capability, which is a key computation in laser Doppler blood flowmetry. Based on this platform, a full-field laser Doppler blood flowmetry imaging system can potentially deliver flow image frames (256x256 pixels) every 0.0065 to 1.7 seconds, depending on the number of FFT points (32 to 1024), which is up to seven times faster than existing systems. Furthermore, the FPGA device offers 403 reconfigurable I/O pins, which can accommodate other applications requiring a large number of I/O pins.
1. INTRODUCTION
The two most widely deployed image sensor technologies are CCD (charge coupled device) and CMOS (complementary metal oxide semiconductor). It is well known that the CCD camera offers higher image quality, but suffers from a lower read-out speed. By contrast, current CMOS camera technology has slightly lower image quality, but enjoys high-speed random access to the image pixels of interest, which means that data can be rapidly transferred to a digital signal processor. One of the key research projects we are investigating is full-field laser Doppler blood flowmetry, in which an in-house designed CMOS camera route has been followed. Such an approach offers high-speed random access to individual pixels. Laser Doppler blood flowmetry [1] [2] [3] [4] has been used to measure microcirculation in superficial tissue for over three decades, since the first tissue experiment performed by Stern [5]. In the existing
laser Doppler blood flowmetry systems, depending on whether FFT processing is used or not, the system either suffers from low accuracy, in terms of blood flow and concentration, or has a low frame rate when used for full-field applications. For example, in laser Doppler blood flowmetry with FPGA processing [6], a significant degradation of accuracy has been observed because digital filters with a limited number of taps were used rather than FFT processing. Alternatively, the moorLDI2™ imager [7] can take up to 5 minutes to obtain an image of 256x256 pixels, because of the nature of the scanning system. Recently, frame rates have been increased significantly by using full-field CMOS camera systems instead of scanning systems. For instance, Serov [8] [9] used a commercial CMOS image sensor for laser Doppler blood flowmetry that delivered high-resolution flow images of 256x256 pixels every 1.2 seconds. However, this involves compromising the sampling frequency of the system. Conventional laser Doppler systems usually require sampling frequencies of at least 40KHz, whereas Serov used a sampling frequency of only 8KHz to achieve a high frame rate. In addition, the low modulation depth (~1%) of laser Doppler blood flow signals means that a high number of bits is usually required, and Serov compromised this to 8 bits to achieve a high frame rate. It is known from [8] that the key time-consuming FFT signal processing was executed on a general purpose CPU, on which an image of 256x256 pixels takes 2.9 seconds when using a 512-point FFT; in other words, a single 512-point FFT takes 44 microseconds. The FFT calculation is therefore the limiting factor of the frame rate, so in order to increase the overall frame rate, the FFT processing engine must be sped up. On the other hand, it is well known that FFT cores implemented on Xilinx FPGA devices can offer very high-speed processing.
This is because an FPGA uses a concurrent architecture in which several processing elements operate in parallel, thereby greatly increasing the processing speed. For example, it will be seen in this paper that a 512-point FFT takes only 7 microseconds on a Xilinx Spartan-3 FPGA, XC3S1500FG676-4 [10], implemented with a pipeline architecture. It is more than six times
faster than that in [8], so it is apparent that the Xilinx FPGA will speed up the overall frame rates. More specifically, one of the key roles of the Xilinx FPGA in the platform is to act as an FFT computation accelerator. In addition to the Xilinx FPGA device, the platform that we have developed integrates: up to four high-speed (up to 125MSPS), high-resolution (14 bits) ADCs; a high-speed (up to 167MHz), high-capacity (2Mx18bits) QDR memory; and a high-speed USB 2.0 link to the host PC. The multi-channel high-speed, high-resolution ADCs are required to sample the analogue pixels. For example, in full-field (64x64 pixels) laser Doppler blood flowmetry, a sampling rate of 40KSPS is required for each pixel, so the ADC sampling rate must be approximately 40MSPS, 80MSPS or 160MSPS when using 4, 2 or 1 ADCs, respectively. Moreover, in laser Doppler blood flowmetry, a high-resolution ADC (14 bits) is required to pick up a very weak laser Doppler shift signal from a very large DC background. In addition, high-speed, high-capacity memory units are required to buffer the sampled data between the ADCs and the FPGA data-processing units. Furthermore, in order for the sampled data to be sent to a host PC for initial data analysis and processing algorithm development, a high-speed communication link between the ADCs and the host PC is highly desirable. As a result, the Xilinx FPGA device is used not only as an FFT computation accelerator, but also as the controller for the four ADCs, the high-speed, high-capacity QDR memory, and the high-speed USB link. The attraction of using an FPGA in this platform is that, unlike in conventional CPU or DSP based systems, all functional units, such as the FFT processing, the ADC controller, the memory controller and the USB controller, can be implemented separately and run in parallel.
In other words, this concurrent behavior allows each processing element to proceed at its own speed without being slowed down by any other processing element. Another attraction of this type of platform is that, owing to the reprogrammable nature of the FPGA, the presence of the high-speed ADCs and high-capacity memory, and the 403 reconfigurable I/O pins offered by the FPGA device, the platform can be used in other high-speed data acquisition applications. For example, we are deploying this same system in a power system fault location instrument that requires a rapid computation time, such that at the onset of a system fault the high-speed transient that occurs can be
sampled and its exact location can be determined [11]. Although this paper describes the system employed for a laser Doppler blood flow camera, the platform can easily be used in many other applications and hence offers a generic platform for high-speed data acquisition and true real-time signal processing. The paper is organized as follows. A brief exploration of the generic platform design approach and design objectives is given in Section 2. The FPGA based platform design and implementation are described in Section 3. A case study is presented in Section 4, in which a reconfigurable full-field laser Doppler blood flowmetry prototyping system has been implemented based on the platform. Discussion and conclusions are presented in Sections 5 and 6, respectively.
2. GENERIC PLATFORM DESIGN APPROACH AND OBJECTIVES
In order for the platform to be as generic as possible, it is built from separate PCB modules rather than a single PCB module. Physically, the platform is composed of one FPGA module, one USB module and up to four ADC modules. The FPGA module is classified as the motherboard, while the USB module and ADC modules are referred to as the daughter boards. The Xilinx Spartan-3 FPGA device and two Cypress QDR memory devices, CY7C1303AV25 [12], are located on the FPGA motherboard, into which the USB and ADC daughter boards are inserted via IDC connectors. There are seven digital interface IDC connectors available on the FPGA module: one is used to connect to the USB module, four connect to the four ADC modules, and the other two are used as general digital I/O. Therefore, compared with a single module system, the multi-module based platform has much more flexibility. For example, in applications where only one ADC module is required, three more connectors become available for general I/O.
These four connectors, which are designed to interface with the ADC modules, can easily be reconfigured as general I/O because the Xilinx Spartan-3 FPGA device is reconfigurable. Fig. 1 shows the block diagram of our generic FPGA-based prototyping platform deployed in the laser Doppler blood flow CMOS camera application. The generic FPGA platform is enclosed by the broken line.
[Fig. 1 block diagram: prototype pixels, or the pixels from the CMOS camera chip under test, feed up to four ADC modules; the ADC modules connect to the FPGA module, which connects through the USB module to the host PC.]
Fig. 1. FPGA-based laser Doppler flow CMOS camera prototyping platform
From the viewpoint of system design, the multi-module architecture accommodates two main phases in the development of a laser Doppler blood flow CMOS camera, as follows:
Algorithm Development: In this phase, the platform acts as a high-speed, high-resolution data acquisition system to sample individual prototype single pixels (before fabrication of an array of pixels) or individual pixels from a prototype custom-made CMOS array. In other words, the pixel analogue signals are digitized and then sent to a host PC via the high-speed USB link, so that pixel characterization and data analysis can be conducted on the host PC. The purpose of this phase is to give the designer an opportunity to improve the design of a single pixel and also to develop data-processing algorithms when testing the custom-made blood flow CMOS camera array.
FPGA Prototyping: In the FPGA prototyping phase, the high-level data processing algorithms for calculating blood flow and concentration (in the form of C or Matlab code) developed in the algorithm development phase are converted into VHDL code and implemented on the Xilinx Spartan-3 FPGA device in the generic platform. In the case of a final commercial system based on the FPGA device, the output of the FPGA prototyping phase will be a ready-to-use system prototype, in which all modules will be integrated into a single PCB module providing a compact, hand-held blood flow camera.
3. PLATFORM DESIGN AND FPGA IMPLEMENTATION
Although the entire generic platform has been built up with an FPGA module, a USB module and up to four ADC
modules, this section focuses on the FPGA design and implementation aspects of the controller cores for the ADC, QDR memory and USB modules, rather than on the PCB modules themselves. The section concludes with a description of the FFT processing units.
3.1. ADC Controller Core
The ADC module has been designed around the Texas Instruments 14-bit 125MSPS analog-to-digital converter, ADS5500 [13]. The analogue signal is sampled at the falling edge of the sampling clock, and the digital data for this sample is not available until 17.5 sampling clock cycles after the sampling point. For example, if the analogue signal is sampled at the falling edge of clock 1, the corresponding digital data will be valid after the rising edge of approximately clock 17. The figure is approximate because the timing of the input sampling clock, “ClockToADC”, is not accurately matched to the digital data output. However, the ADC provides a clock output, “ClockFromADC”, which is accurately matched to the digital data output. Therefore, “ClockFromADC” is used to latch the digital data output to ensure reliability. In addition, “ClockToADC” is also used to generate the address of the pixel to be sampled. More specifically, the pixel address is switched on the rising edge of “ClockToADC”; because the analogue signal is sampled on the falling edge, this maximizes the sampling window of the analogue signal. A VHDL/FPGA controller core, as shown in Fig. 2, has been designed to control the ADC module from the FPGA device.
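One consequence of the pipeline latency described above is that each digital word leaves the ADC many cycles after its pixel address was issued. The following Python sketch illustrates one possible way to re-associate words with addresses using a simple delay line. It is a behavioural illustration only, not the paper's VHDL: the function name `tag_samples`, the constant `ADC_LATENCY_CYCLES` and the rounding of 17.5 cycles to 17 whole cycles are all assumptions made for this example.

```python
from collections import deque

# Assumption: the 17.5-cycle latency is treated as 17 whole output-clock
# cycles; the real alignment depends on the ADC timing.
ADC_LATENCY_CYCLES = 17


def tag_samples(pixel_addresses, adc_words, latency=ADC_LATENCY_CYCLES):
    """Pair each ADC output word with the pixel address that was selected
    when the matching analogue sample was taken, `latency` cycles earlier."""
    in_flight = deque()  # addresses whose samples are still inside the ADC pipeline
    tagged = []
    for address, word in zip(pixel_addresses, adc_words):
        in_flight.append(address)
        if len(in_flight) > latency:
            # The word emitted on this cycle belongs to the oldest address.
            tagged.append((in_flight.popleft(), word))
    return tagged
```

For instance, with a shortened latency of 3 cycles, the first three output words are discarded and the fourth word is paired with pixel address 0.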
[Fig. 2 block diagram. Blocks: 50MHz OSC, DCM, ADC State Machine, FIFO (16x14). Signals: Start, Reset, ClockToADC, Pixel Address, WriteEnable, ReadEnable, ClockFromADC, DataFromADC (14 bits), FifoHalfFull, DataToQDRController (14 bits).]
Fig. 2. ADC controller core
It can be seen from Fig. 2 that the ADC controller core is mainly composed of a state machine and a FIFO with a size of 16x14 bits. The FIFO has been generated with the Xilinx FIFO Generator [14], which produces a FIFO component with truly independent clocks for the input and output ports. With independent clocks, the FIFO can be used as an ADC data buffer, which resolves the synchronization issue between the clocks “ClockToADC” and “ClockFromADC”. A global programmable clock is generated by the Xilinx Digital Clock Manager (DCM) [15]. As each member of the Xilinx Spartan-3 family, with the exception of the XC3S50, has four DCMs, a programmable clock with a wide range of frequencies can be obtained. For instance, with only two DCMs, one doubling the frequency and the other working as a frequency divider with a division ratio of 2.5, a global clock of 40MHz can be produced from the 50MHz clock, which in turn is obtained from an oscillator module. The global clock is used as the ADC sampling clock (“ClockToADC”), the clock of the ADC state machine, and the clock for the FIFO output. “ClockFromADC” is connected directly to the FIFO input clock to ensure that the sampled data from the ADC is reliably latched into the FIFO. With this ADC controller core, the data-receiving operation becomes very simple. Once the start signal has been issued, a block of 8x14 bits of data can be read out of the FIFO after the FifoHalfFull flag is set, and sent to the QDR memory controller to be saved in the QDR memory.
3.2. QDR Memory Controller Core
Quad Data Rate (QDR™) SRAM technology, jointly developed for high-performance applications by Cypress, Renesas, IDT, NEC, and Samsung, represents one of the most advanced memory technologies in terms of data throughput rate and capacity. QDR memory has two key features. Firstly, it uses both the rising edge and the falling edge of the memory clock for memory operations, so Double Data Rate (DDR) is achieved at the same clock frequency. Secondly, it has separate read and write data ports, which are independent of each other and support concurrent data transactions. Therefore, compared with a conventional SRAM with a single data port and single data rate, QDR memory offers a fourfold improvement in data rate. There are two key reasons for choosing QDR for this platform. Firstly, the separate, independent read and write ports allow us to connect the write port to the ADC and the read port to the processing units, such as an FFT core. Secondly, Double Data Rate on each port allows us to use a lower clock frequency to achieve the same data rate; in other words, we can use a low-speed PCB to achieve a high throughput rate, which avoids the highly challenging task of high-speed PCB design. One consequence, however, is that because the QDR memory uses both clock edges, it cannot be connected directly to circuits that use a single clock edge, such as the ADC controller core, FFT core and USB core in the platform. A QDR memory controller core therefore has to be developed to interface between the single clock edge system and the double clock edge system. The Xilinx Spartan-3 FPGA has a DDR feature for double clock edge systems, which was one of the reasons for choosing it for this platform. Based on the Virtex-II QDR interface reference design [16], a QDR VHDL/FPGA memory controller core, as shown in Fig. 3, has been developed for the Cypress QDR memory, CY7C1303AV25.
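The fourfold figure quoted above follows from two independent factors of two (two clock edges, two ports). A minimal Python sketch of that arithmetic is given below; the function name `sram_bandwidth_bits_per_s` is hypothetical, and the 167MHz clock and 18-bit port width are simply the headline figures quoted in this paper, not measured throughput.

```python
def sram_bandwidth_bits_per_s(clock_hz, width_bits, edges_per_cycle, ports):
    """Peak aggregate bandwidth: one word per active clock edge on every port."""
    return clock_hz * edges_per_cycle * width_bits * ports


CLOCK_HZ = 167e6   # maximum QDR clock quoted in the text
WIDTH_BITS = 18    # data width of each port

# Conventional SRAM: one shared port, one transfer per clock cycle.
conventional = sram_bandwidth_bits_per_s(CLOCK_HZ, WIDTH_BITS, edges_per_cycle=1, ports=1)
# QDR SRAM: separate read and write ports, each transferring on both edges.
qdr = sram_bandwidth_bits_per_s(CLOCK_HZ, WIDTH_BITS, edges_per_cycle=2, ports=2)

if __name__ == "__main__":
    print(f"conventional: {conventional / 1e9:.2f} Gbit/s")
    print(f"QDR:          {qdr / 1e9:.2f} Gbit/s")
    print(f"ratio:        {qdr / conventional:.0f}x")
```

These are peak figures under the stated assumptions; sustained throughput in a real design depends on the access pattern and on the controller.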
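[Fig. 3 block diagram. User-side signals of the controller core: Reset, SystemClock, WriteAddress, WriteEnable, WriteData, ReadAddress, ReadEnable, ReadData, DataValid. Memory-side signals: QdrK, QdrKn, QdrRefCLK, QdrBW, QdrWPS, QdrRPS, QdrA, QdrD, QdrQ, connected to the QDR Memory (1Mx18). Buses are marked as 18 or 36 bits wide.]
Fig. 3. QDR memory controller core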
Table 1. Resource Utilization of Pipelined FFT Cores on Xilinx XC3S1500FG676-4
Point Size            32          64          128         256         512         1024
Input Data Width      16          16          16          16          16          16
Phase Factor Width    16          16          16          16          16          16
MULT18x18             8 (25%)     8 (25%)     12 (37%)    12 (37%)    16 (50%)    16 (50%)
Block RAM             1 (1.5%)    1 (1.5%)    1 (1.5%)    1 (1.5%)    1 (1.5%)    1 (1.5%)
Slices                1160 (8%)   1536 (12%)  1856 (14%)  2327 (18%)  2839 (22%)  3694 (28%)
The QDR CY7C1303AV25 has an 18-bit wide address bus which is shared between the read and write ports. With the QDR memory controller core, completely separate and independent read and write ports, each with an 18-bit address bus and a 36-bit data bus, are available to interface with the single clock edge systems. More specifically, in the case of our platform, it can be seen from the left side of Fig. 3 that SystemClock is the global clock of the single clock edge system. WriteAddress, WriteEnable and WriteData are interfaced with the ADC controller core shown in Fig. 2, while ReadAddress, ReadEnable and ReadData are the read-out interface signals, which can be connected to either an FFT core or the USB controller core described in the following sub-section.
3.3. USB Controller Core
[Fig. 4 block diagram. User-side signals of the controller core: SystemReset, SystemClock, BufferHalfEmpty, BufferEmpty, BufferFull, BufferWriteStart, BufferData (16 bits). Signals to the USB module: RDY, INT, FLAGA, FLAGB, FLAGC, IFCLK, SLRD, SLWR, SLOE, PKTEND, FIFOADR (3 bits), FD (16 bits).]
Fig. 4. USB controller core
In order to send either the raw sampled pixel data to the host PC for algorithm development, or the post-processed data to the host PC for the CMOS camera test stage, the Cypress high-speed USB controller, CY7C68001 [17], is used in the USB module to build a high-speed communication link between the generic platform and the host PC. As a result, a VHDL/FPGA USB controller core, as shown in Fig. 4, has been designed and implemented on the Xilinx Spartan-3 FPGA as a micro-controller to configure this high-speed
USB link, because the CY7C68001 must be configured externally before the USB link is enabled. It can be seen from Fig. 4 that, with the USB controller core, the data interface with other units is quite straightforward. Depending on the state of the flag signals (i.e. BufferHalfEmpty, BufferEmpty and BufferFull), the 16-bit data to be sent (“BufferData”) is written into the USB buffer at the appropriate time by using the write enable signal “BufferWriteStart”.
3.4. FFT Processing Unit
With Xilinx Spartan-3 FPGA devices, FFT processing units can easily be built from FFT IP cores generated automatically by the Xilinx FFT generator [14]. More specifically, without writing a single line of VHDL code, an FFT core can be created to meet particular requirements, such as transform length (number of points), resolution (number of bits) and implementation option (Pipelined Streaming I/O, Radix-4 Burst I/O or Radix-2 Minimum Resources). Therefore, the key task in designing FFT processing units is to understand the interface signals, in terms of functionality and timing, rather than to write an FFT core itself. Although pipelined streaming I/O consumes more resources than the other two implementations, it is selected for the platform because it allows easy interfacing with the QDR controller core to achieve streaming FFT processing. Table 1 shows the resource utilization of the pipelined FFT cores on the Xilinx XC3S1500FG676-4. It can be seen from Table 1 that, even with a 1024-point FFT, no more than half of any hardware resource is consumed. It is also known from the FPGA implementation that the maximum frequency at which the FFTs in Table 1 can run is 73MHz.
Therefore, FFT calculations with 512 and 1024 points can be completed within 7 and 14 microseconds, respectively, which is usually not achievable even with high-end DSP processors.
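The 7 and 14 microsecond figures can be reproduced with a short calculation, sketched below in Python. The sketch assumes that the pipelined streaming core accepts one sample per clock cycle in steady state, so that the time per transform is simply the number of points divided by the clock frequency; this throughput model and the function names are assumptions for illustration, not taken from the core's specification. The 44 microsecond CPU figure is the one quoted from [8] in the Introduction.

```python
FFT_CLOCK_HZ = 73e6       # maximum FFT core clock reported in the text
CPU_FFT_TIME_S = 44e-6    # single 512-point FFT on the CPU of [8]
PIXELS = 256 * 256        # one FFT per pixel


def streaming_fft_time_s(points, clock_hz=FFT_CLOCK_HZ):
    """Time to clock `points` samples through the core at one sample per cycle."""
    return points / clock_hz


def cpu_frame_time_s(pixels=PIXELS, fft_time_s=CPU_FFT_TIME_S):
    """Total FFT time for one frame when the FFTs are executed one after another."""
    return pixels * fft_time_s


if __name__ == "__main__":
    for n in (512, 1024):
        print(f"{n}-point FFT: {streaming_fft_time_s(n) * 1e6:.2f} us")
    print(f"CPU FFT time per 256x256 frame: {cpu_frame_time_s():.2f} s")
    print(f"per-FFT speed-up: {CPU_FFT_TIME_S / streaming_fft_time_s(512):.1f}x")
```

Under these assumptions the sketch gives about 7.0 and 14.0 microseconds for 512 and 1024 points, about 2.9 seconds of CPU time per 256x256 frame (matching the figure quoted from [8]), and a per-FFT speed-up of about 6.3.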
[Fig. 5 block diagram, data paths only. Four blocks of 16x16 pixels are each sampled by an ADC with its own ADC controller (14-bit data). Inside the FPGA device, multiplexers route the samples through two QDR memory controllers to two QDR memories (1M x 18 each); the read-out data passes to a 1024-point FFT unit and a further-processing block, and a multiplexer feeds the USB controller, which connects through the USB module to the host PC.]
Fig. 5. 64x64 full-field laser Doppler blood flowmetry system based on the generic FPGA platform
Table 2. The frame rate for full-field laser Doppler blood flow CMOS cameras based on the platform, showing the frame rate and the ADC sampling rate for different sized camera arrays, different numbers of FFT points and different numbers of ADCs.

32x32 pixel array
                 The number of FFT Points
No. of ADCs   32             64           128          256          512          1024
1             1.22KHz/40M    610Hz/40M    305Hz/40M    153Hz/40M    76Hz/40M     38Hz/40M
2             1.22KHz/20M    610Hz/20M    305Hz/20M    153Hz/20M    76Hz/20M     38Hz/20M
4             1.22KHz/10M    610Hz/10M    305Hz/10M    153Hz/10M    76Hz/10M     38Hz/10M

64x64 pixel array
                 The number of FFT Points
No. of ADCs   32             64           128          256          512          1024
1             610Hz/80M      305Hz/80M    153Hz/80M    76Hz/80M     38Hz/80M     9.5Hz/40M
2             1.22KHz/80M    610Hz/80M    305Hz/80M    153Hz/80M    38Hz/40M     9.5Hz/20M
4             1.22KHz/40M    610Hz/40M    305Hz/40M    153Hz/40M    38Hz/20M     9.5Hz/10M

128x128 pixel array
                 The number of FFT Points
No. of ADCs   32             64           128          256          512          1024
1             153Hz/80M      76Hz/80M     38Hz/80M     19Hz/80M     9.5Hz/80M    2.4Hz/80M
2             305Hz/80M      153Hz/80M    76Hz/80M     38Hz/80M     9.5Hz/40M    2.4Hz/20M
4             610Hz/80M      305Hz/80M    153Hz/80M    38Hz/40M     9.5Hz/20M    2.4Hz/10M

256x256 pixel array
                 The number of FFT Points
No. of ADCs   32             64           128          256          512          1024
1             38Hz/80M       19Hz/80M     9.5Hz/80M    4.8Hz/80M    2.4Hz/80M    0.6Hz/40M
2             76Hz/80M       38Hz/80M     19Hz/80M     9.5Hz/80M    2.4Hz/40M    0.6Hz/20M
4             153Hz/80M      76Hz/80M     38Hz/80M     9.5Hz/40M    2.4Hz/20M    0.6Hz/10M
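As a cross-check on Table 2, the tabulated frame rates appear to be consistent with a simple throughput relation: the aggregate ADC sampling rate divided by the number of samples needed per frame. The Python sketch below spot-checks a few entries. The relation is inferred from the numbers in the table rather than stated in this form in the paper, and the function name `frame_rate_hz` is hypothetical; how the ADC sampling rate for each entry is chosen is not modelled here.

```python
def frame_rate_hz(n_adcs, adc_rate_sps, array_side, fft_points):
    """Frames per second when every pixel needs `fft_points` samples per frame
    and `n_adcs` converters each run at `adc_rate_sps` samples per second."""
    samples_per_frame = array_side * array_side * fft_points
    return n_adcs * adc_rate_sps / samples_per_frame


if __name__ == "__main__":
    # (number of ADCs, ADC rate, array side, FFT points, value listed in Table 2)
    spot_checks = [
        (4, 10e6, 64, 1024, "9.5Hz"),
        (4, 80e6, 256, 32, "153Hz"),
        (1, 40e6, 32, 32, "1.22KHz"),
        (4, 10e6, 256, 1024, "0.6Hz"),
    ]
    for n_adcs, rate, side, points, listed in spot_checks:
        computed = frame_rate_hz(n_adcs, rate, side, points)
        print(f"{side}x{side}, {points} points, {n_adcs} ADC(s): "
              f"{computed:.2f} Hz (Table 2: {listed})")
```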
4. CASE STUDY
The generic platform will initially be used for a number of full-field laser Doppler blood flowmetry systems that allow blood flow changes to be measured reliably and non-intrusively. In this section, a CMOS camera with a frame size of 64x64 and a 1024-point FFT is used as an example to describe how the platform interfaces with the CMOS camera, in terms of data path, and what maximum frame rate can be achieved. Towards the end of this section, the maximum frame rates are presented for a variety of frame sizes and numbers of FFT points. A 64x64 full-field laser Doppler blood flowmetry system based on the generic FPGA platform is shown in Fig. 5, in which four ADCs are adopted. For the sake of simplicity, only data paths are shown. In laser Doppler blood flowmetry the measured frequencies are typically in the range of 0 to 20KHz [8]; in other words, the pixel bandwidth is 20KHz, so, according to the Nyquist-Shannon sampling theorem, a sampling rate of 40KSPS is required for each pixel. In order for the data paths from the ADC modules to the FFT processing units not to be blocked, the QDR memory must operate in what is called a “ping-pong” fashion. In other words, one half of the memory receives data from the ADC controller cores while the data in the other half is read out and sent to the FFT processing units, and vice versa. The total QDR memory size is 2Mx18 bits; to simplify the interfacing with the 14-bit data bus from the ADC controller cores, each 18-bit word stores one 14-bit sample, so the total effective memory size is 2Mx14bits and half of it is 1Mx14bits. However, when a 1024-point FFT is used for each pixel, 1024 samples have to be collected for each pixel before any further operation, so the frame size is 1024 x 64 x 64 x 14bits, or 4Mx14bits. Therefore, to allow the “ping-pong” processing, each frame needs to be divided into four stages.
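The staging arithmetic above can be sketched in a few lines of Python. The sketch assumes, as in the text, that one sample occupies one memory word and that each half of the memory holds 1M words; the function names and the constant `HALF_QDR_WORDS` are hypothetical and introduced only for this illustration.

```python
import math

HALF_QDR_WORDS = 1 << 20   # 1M words: one half of the 2M-word QDR memory


def ping_pong_stages(array_side, fft_points, half_memory_words=HALF_QDR_WORDS):
    """Number of stages needed so that each stage fits in one half of the memory."""
    samples_per_frame = array_side * array_side * fft_points
    return math.ceil(samples_per_frame / half_memory_words)


def pixels_per_stage(fft_points, half_memory_words=HALF_QDR_WORDS):
    """Number of pixels whose complete sample records fit in one half of the memory."""
    return half_memory_words // fft_points


if __name__ == "__main__":
    print("stages:", ping_pong_stages(64, 1024))       # 4
    print("pixels per stage:", pixels_per_stage(1024))  # 1024
```

For the 64x64 array with a 1024-point FFT this gives four stages of 1024 pixels each, in agreement with the four-stage division described above.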
With four ADC channels involved, the total data sampled by each ADC at each stage will be 256Kx14bits (16x16 pixels). Since each ADC is shared by 256 pixels and each pixel must be sampled at roughly 40KSPS, the required ADC sampling rate is 10MSPS. In the first stage, each ADC samples 1024 points per pixel for the first block of 16x16 pixels, and the sampled data is stored in the first half of the QDR memory. In the second stage, each ADC samples 1024 points per pixel for the second block of 16x16 pixels and the data is sent to the second half of the QDR memory, while the data in the first half of the QDR memory is read out and sent for FFT processing. In this way, four stages complete the full-field processing. In order to keep the data
flowing smoothly, the FFT processing unit needs to run in “streaming” fashion at a frequency of at least 40MHz. As we have seen, a 1024-point FFT can run at 73MHz in pipeline fashion, so one FFT core is more than enough to handle the data rate. The frame rate can be worked out to be 9.5Hz, which is 10MHz divided by 256 (each 10MSPS ADC is shared by a block of 16x16 pixels in each stage), then divided by 1024 (no. of points per pixel) and then divided by 4 (no. of stages). As can be seen from the above description, the number of ADC channels could be reduced to two or one by increasing the ADC sampling rate to 20MSPS or 40MSPS, respectively. However, the overall frame rate remains the same, because the QDR memory size controls the throughput rate in each case. Using the same method, the frame rates for full-field laser Doppler blood flow CMOS cameras based on this platform can be calculated for frame sizes from 32x32 pixels to 256x256 pixels and for 32 to 1024 FFT points. The results are shown in Table 2, in which the overall frame rate is given to the left of the slash and the ADC sampling rate to the right. For example, a 32-point FFT using 4 ADCs on a 256x256 pixel array has a frame rate of 153Hz with each ADC sampling at 80MSPS.
5. DISCUSSION
The FPGA platform presented here is a generic platform that can be applied to all stages of our camera designs. In addition, it can be seen from Section 4 that the FPGA based platform has a significant advantage over existing systems. Based on the platform, a laser Doppler blood flow camera imager with a size of 256x256 pixels can deliver overall frame rates of 153Hz, 76Hz, 38Hz, 9.5Hz, 2.4Hz, and 0.6Hz for acquired time-domain signals of 32 points, 64 points, 128 points, 256 points, 512 points, and 1024 points, respectively.
However, with the same size of 256x256 pixels, the frame rate obtained in [3] is 0.34Hz for the case of 512 points, with both a reduced sampling frequency (8KHz) and a reduced ADC resolution (8 bits). Hence the proposed platform based system, at 2.4Hz for 512 points, has the potential to be seven times faster than the existing system in [3], while having a much higher sampling frequency (40KHz) and improved ADC resolution (14 bits), which will lead to higher quality images.
6. CONCLUSIONS
In this paper, a new FPGA based generic prototyping platform for laser Doppler blood flow imaging has been proposed. In the proposed platform, the pixels from in-house designed CMOS camera chips under test can be
sampled with a maximum sampling frequency of 80MSPS and a nominal resolution of 14 bits. The sampled data can then be sent either to the host PC via the high-speed USB 2.0 link for high-level algorithm development, or to the data processing units developed in the FPGA device for the required data processing, such as FFT processing. The platform can be used not only for rapid prototyping of any type of CMOS camera system, but also for high-level algorithm development or CMOS chip test. To confirm that the proposed platform is not only generic but also has very high performance, full-field laser Doppler blood flowmetry systems based on the platform, with the Xilinx FPGA device as a high-speed FFT co-processor, have been presented. These show that frame rates up to seven times faster than those of existing systems can be reached. For instance, the platform based full-field laser Doppler blood flowmetry system can deliver flow images of 256x256 pixels every 1.7 seconds with an FFT of 1024 points. More generally, the system presented here offers the researcher/engineer a generic platform that is well suited to many other high-speed data acquisition applications.
Acknowledgements
The authors would like to acknowledge the financial support of the Engineering and Physical Sciences Research Council (EPSRC, Swindon, UK) who provided funds for this work.
7. REFERENCES
[1] G. E. Nilsson, T. Tenland, and P. A. Oberg, “Evaluation of a laser Doppler flowmeter for measurement of tissue blood flow,” IEEE Transactions on Biomedical Engineering, BME27, no. 10, pp. 597-604, Oct. 1980.
[2] A. Serov, B. Steinacher, and T. Lasser, “Full-field laser Doppler perfusion imaging and monitoring with an intelligent CMOS camera,” Optics Express, vol. 13, no. 10, 16 May 2005.
[3] A. Serov and T. Lasser, “High-speed laser Doppler perfusion imaging using an integrating CMOS image sensor,” Optics Express, vol. 13, no. 18, 5 Sep. 2005.
[4] R. Michaely, A. Serov, P. Jacquot, and T. Lasser, “Laser Doppler blood-flow imaging combined with topographical imaging of the sample,” Proc. of SPIE, vol. 6081, pp. 5461, Multimodal Biomedical Imaging, Feb. 2006.
[5] M. D. Stern, “In vivo evaluation of microcirculation by coherent light scattering,” Nature, vol. 254, pp. 57-58, 1975.
[6] C. Kongsavatsak, D. He, S. P. Morgan, B. R. Hayes-Gill, J. A. Crowe, and M. Clark, “Laser Doppler blood flowmetry with FPGA processing,” Proceedings of the SPIE - The International Society for Optical Engineering, vol. 5486, no. 1, pp. 141-147, 2003.
[7] “Laser Doppler Imagers: The moorLDI2™ - Imager,” http://www.moor.co.uk.
[8] A. Serov and T. Lasser, “Full-field high-speed laser Doppler imaging system for blood-flow measurements,” Proc. of SPIE, vol. 6080, pp. 15-22, Advanced Biomedical and Clinical Diagnostic Systems IV, Feb. 2006.
[9] A. Serov and T. Lasser, “Combined laser Doppler and laser speckle imaging for real-time blood flow measurements,” Proc. of SPIE, vol. 6094, pp. 15-22, Optical Diagnostics and Sensing VI, Feb. 2006.
[10] “Spartan-3 FPGA Family: Complete Data Sheet,” http://www.xilinx.com.
[11] D. W. P. Thomas, R. J. O. Carvalho, and E. T. Pereira, “Fault location in distribution systems based on traveling waves,” IEEE Bologna PowerTech (IEEE Cat. No.03EX719), vol. 2, pp. 5, 2003, Piscataway, NJ, USA.
[12] “CY7C1303AV25, 18-Mb burst of 2 pipelined SRAM with QDR architecture,” http://www.cypress.com.
[13] “ADS5500, 14-bit 125MSPS analogue-to-digital converter datasheet,” http://www.ti.com.
[14] “FIFO Generator V2.1 Product Specification,” DS317, http://www.xilinx.com.
[15] “Using Digital Clock Managers (DCMs) in Spartan-3 FPGAs,” XAPP462, http://www.xilinx.com.
[16] “Synthesizable QDR SRAM Interface,” XAPP262, http://www.xilinx.com.
[17] “CY7C68001, Cypress EZ-USB SX2 high-speed USB interface device,” http://www.cypress.com.