A Full Parallel Event Driven Readout Technique ... - Semantic Scholar

sensors Article

A Full Parallel Event Driven Readout Technique for Area Array SPAD FLIM Image Sensors Kaiming Nie, Xinlei Wang, Jun Qiao and Jiangtao Xu * School of Electronic Information Engineering, Tianjin University, 92 Weijin Road, Nankai District, Tianjin 300072, China; [email protected] (K.N.); [email protected] (X.W.); [email protected] (J.Q.) * Correspondence: [email protected]; Tel./Fax: +86-22-2789-0832 Academic Editor: Vittorio M. N. Passaro Received: 4 December 2015; Accepted: 22 January 2016; Published: 27 January 2016

Abstract: This paper presents a full parallel event driven readout method which is implemented in an area array single-photon avalanche diode (SPAD) image sensor for high-speed fluorescence lifetime imaging microscopy (FLIM). The sensor only records and reads out effective time and position information by adopting full parallel event driven readout method, aiming at reducing the amount of data. The image sensor includes four 8 ˆ 8 pixel arrays. In each array, four time-to-digital converters (TDCs) are used to quantize the time of photons’ arrival, and two address record modules are used to record the column and row information. In this work, Monte Carlo simulations were performed in Matlab in terms of the pile-up effect induced by the readout method. The sensor’s resolution is 16 ˆ 16. The time resolution of TDCs is 97.6 ps and the quantization range is 100 ns. The readout frame rate is 10 Mfps, and the maximum imaging frame rate is 100 fps. The chip’s output bandwidth is 720 MHz with an average power of 15 mW. The lifetime resolvability range is 5–20 ns, and the average error of estimated fluorescence lifetimes is below 1% by employing CMM to estimate lifetimes. Keywords: SPAD; FLIM; CMM; event driven readout; image sensor

1. Introduction Fluorescence lifetime imaging microscopy (FLIM) is a rather new and effective tool that can be used to analyze complex biological samples, either at the microscopic or macroscopic level [1,2]. The progress of confocal microscopy improves the time resolution. The map of fluorescence lifetime allows one to discriminate different fluorophores and to acquire valuable insights into the behavior of emitting molecules, thus obtaining information like local pH and oxygen concentration in cells, etc. [3]. The two most common techniques for measuring the fluorescence lifetime are the modulated frequency-domain technique and the time-domain technique [4]. Time-domain techniques include time-correlated single-photon counting (TCSPC) and time-gated technique [5]. TCSPC allows for high accuracy in measuring lifetime and it is the most photon-efficient technique. In commercial TCSPC systems, one detector, typically an avalanche photodiode (APD) or photomultiplier tube (PMT), with one time-to-digital converter (TDC) measurement channel is raster scanned across a sample. At each point in the image, a laser is pulsed and the arrival time of the first fluorescent photon relative to the laser pulse is measured. With repeated laser pulses, the arrival time of these individual photons is collected and the lifetime is extracted from the exponentially distributed decay curves fitted to the resulting histogram of photon arrival times. Although TCSPC has been developed for many years, several drawbacks still need to be solved. The imaging system is expensive and cumbersome. Compact, high-speed, and portable system-on-chip FLIM solutions which are robust and easy to operate are in increasing demand, especially in clinical and commercial applications [6].

Sensors 2016, 16, 160; doi:10.3390/s16020160

www.mdpi.com/journal/sensors

Sensors 2016, 16, 160

2 of 14

Recently, compact photon counting devices have emerged, known as single-photon avalanche diodes (SPADs) [7–9]. As solid-state devices, SPADs can operate in normal atmosphere and at room temperature. The device will not get damaged in detecting strong lights compared with PMT. More importantly, with the development of the SPAD, it can be fabricated in standard CMOS process [10]. The integration of SPADs into standard CMOS process could result in significant system capability of highly parallel single photon detection [11]. Therefore, the SPAD FLIM image sensor is becoming an attractive research field [12]. There are a host of excellent reported works [13–15], and most of them are the products of MEGAFRAME project [16–18]. Although improved imaging speed has been demonstrated in some of these designs, the parallel acquisition channels lead to very high data throughput that limits the achievable readout frame rates. For photon starved applications such as FLIM, most of the data generated in one frame is useless. So, for higher speed FLIMs, the data compression is necessary. Shepard et al. [19] proposed the fastest TCSPC-based fluorescence lifetime imaging system, which is capable of acquiring lifetime images at 100 fps. A data-compression data-path provides a mechanism for efficiently transmitting data off-chip in an event-driven manner. In addition to the time information, recording precise location is also of importance to the SPAD chip applied in FLIM, and accordingly it is hard to implement event driven readout based on area array image sensors [20]. Henderson et al. proposed several high-throughput time-resolved mini-silicon photomultipliers based on the event driven readout technique [21]. Zappa et al. proposed a linear 60 ˆ 1 SPAD FLIM array based on the event driven readout [22]. In this work, we propose an area array 16 ˆ 16 SPAD FLIM image sensor, which adopts full parallel event driven readout method by recording the column information, the row information and time information individually. The chip has a fill factor of 5.7%. It can achieve maximum imaging frame rate of 100 fps with a 10 MHz repetition rate excitation laser. The power dissipation of the sensor is 15 mW. The remainder of this paper is organized as follows: Section 2 describes an overview of the image sensor architecture, and the detailed design of each on-chip component. Section 3 shows the analysis of measurement accuracy, highlighting the influence of pile-up effect due to the unique structure. Section 4 shows the circuits simulation results. Section 5 concludes this work. 2. TCSPC FLIM Imaging System 2.1. Circuits Architecture The diagram of the entire system is shown in Figure 1. The resolution is 16 ˆ 16. Internally, each pixel contains a SPAD device, a quenching circuit [23], a calibration circuit and a mono-stable circuit. In each 8 ˆ 8 pixel array, there are corresponding processing circuits, including a TDC array of 4 ˆ 1, a column address recording module and a row address recording module. The entire chip contains two PLLs, in which voltage controlled oscillator (VCO) keeps the same structure and frequency with gated ring oscillator (GRO) in time-to-digital converters (TDCs). The phase-locked loops (PLLs) are in charge of stabilizing the oscillation frequency of GRO, and Process Voltage Temperature (PVT) variations are thus eliminated. Since the probability of each pixel detecting photons is about 1%, only a few pixels can actually detect photons successfully after one excitation laser pulse. In each excitation period, for the 8 ˆ 8 SPAD array, only four groups of time and position information are recorded. When the chip detects a photon, two OR-trees will lump the time data and the position data from 8 ˆ 8 pixels.

Sensors 2016, 16, 160 Sensors 2016, 16, 160 Sensors 2016, 16, 160

3 of 14 3 of 14 3 of 14

8×8 SPAD Array 8×8 SPAD Array



OR-tree Column Address OR-tree Record Address Circuit Column Record Circuit

OR-tree Column Address OR-tree Record Address Circuit Column Record Circuit

OR-tree OR-tree


OR-tree OR-tree

TDC TDC TDC TDC TDC Row Address Record Circuit Row Address Record Circuit

Column Address Record Address Circuit Column OR-tree Record Circuit OR-tree

OR-tree OR-tree

Timing &Readout Timing Circuit &Readout Circuit

TDC TDC TDC TDC TDC PLL PLL TDC TDC TDC

Column Address Record Address Circuit Column OR-tree Record Circuit OR-tree OR-tree OR-tree

Row Address Record Circuit Row Address Record Circuit TDC TDC TDC

Row Address Record Circuit Row Address Record Circuit TDC TDC TDC TDC TDC TDC TDC TDC PLL PLL TDC TDC TDC

Timing &Readout Timing Circuit &Readout Circuit

TDC TDC TDC TDC TDC Row Address Record Circuit Row Address Record Circuit

Figure 1. 1. Circuit architecture. Figure Circuit architecture. Figure 1. Circuit architecture.

The operating principle of the event record circuits is shown in Figure 2. A group of interleaved The operating operating principle principle of the the event event record record circuits is in Figure 2. A A group group of of interleaved interleaved The circuits is shown shown Figure token-passing shift registersof are used to distribute events to theinarray of2.TDCs. The time data of token-passing shift registers areare used to distribute events to thetoarray TDCs. The time data of detected token-passing shift used to distribute events the of array ofarray TDCs. The time data of detected photons areregisters transmitted into TDCs in turn, and consequently TDC can record photons’ photons are transmitted into TDCs in TDCs turn, andturn, consequently TDC array can record photons’ arrival detected photons are transmitted into and position consequently array canby record photons’ arrival time successively [24]. The columninand row data TDC is obtained two encoders time successively [24]. The column and row position data is obtained by two encoders respectively. arrival time successively [24]. data The iscolumn and corresponding row position data is obtained two encoders respectively. Then the position stored into registers, which isby controlled by the Then the position data is stored into corresponding registers, which is controlled by the output of respectively. Then the position data is stored into corresponding registers, which is controlled by the output of the OR-tree. In order to guarantee the validity of data, the data in one frame is valid only the OR-tree. In order to guarantee the validity of data, the data in one frame is valid only when the output of amounts the OR-tree. In order to guarantee the validity of are data, dataBy inemploying one frame event is valid only when the of column data, row data and time data thethe same. driven amounts of columnofdata, rowdata, data row and data time and datatime are the same. Bysame. employing event driven readout when the amounts column data are the By employing event driven readout instead of reading out all data, the data rate is reduced to 1/16 of the initial rate, which instead of reading all data, rate isdata reduced of the readout instead ofout reading outthe alldata data, rate to is 1/16 reduced to initial 1/16 ofrate, thewhich initial significantly rate, which significantly mitigate the requirement for the input/output (I/O) bandwidth. mitigate the requirement for input/output (I/O) bandwidth. significantly mitigate the requirement for input/output (I/O) bandwidth. 4-bit Encoder Encoder

Row OR-tree Row OR-tree ` `

` ` ` ` ` ` ` `

Column OR-tree Column OR-tree

... ...

8×8 8×8 Pixel Pixel Array Array

8-bit column bus 8-bit column bus (a) (a)

Time

8-bit row busbus 8-bit row

... ...

(a) (a)

No.1 Row

No.1 No.2 Row Row Time Time No.2 Row No.3 Row Time Time No.3 Row No.4 Row The ResultTime of Photon Detection No.4 Row The Result of No.5 Row Photon Detection No.5 No.6 Row Row Time No.6 No.7 Row Row Time Time No.7 No.8 Row Row Time Time 8-bit No.8 Row Time 8-bit 8-bit column bus 8-bit column bus

Register1 Register2 Register1 Register3 Register2 Register3

4-bit Ctrl OR-tree

Ctrl

TDC1 TDC2 TDC1 TDC3 TDC2 TDC3

OR-tree

Output Output

Time

Output Data Output Data

Time Time Time 4-bit Encoder Encoder OR-tree OR-tree

(b) (b)

4-bit

Ctrl Ctrl

Register1 Register2 Register1 Register3 Register2 Register3

Time Time

Figure 2. (a) The in-pixel OR-tree; (b) the operating principle of the event record circuits. Figure circuits. Figure 2. 2. (a) (a) The The in-pixel in-pixel OR-tree; OR-tree; (b) (b) the the operating operating principle principle of of the the event event record record circuits.

2.2. SPAD Pixel Array 2.2. SPAD Pixel Array 2.2. SPAD Pixel Array In Figure 3, the implementation of the in-pixel circuits is shown. An active quenching circuit In Figure 3, the implementation of the in-pixel circuits is shown. An active quenching basedInon current sensing is implemented instead of using quenching resistors. After pixels arecircuit reset, Figure 3, the implementation of the in-pixel circuits is shown. An active quenching circuit based on current sensing is implemented instead of using quenching resistors. After pixels are reset, transistors M1, M2, M3 and M4 form a positive which canresistors. sense the sudden increase of based on current sensing is implemented instead feedback, of using quenching After pixels are reset, transistors M1, M2, M3 and M4 form a positive feedback, which can sense the sudden increase of cathode current of the and quench the current promptly In addition, the logic circuit transistors M1, M2, M3SPAD and M4 form a positive feedback, which [25]. can sense the sudden increase of cathode current ofM5 the SPAD and quench the current promptly [25]. In addition, the logic circuit controls transistor speedand up the quenching process. After finishing SPAD does not cathode current of thetoSPAD quench the current promptly [25]. Inquenching, addition, the logic circuit controls transistor M5 to speed up the quenching process. After finishing quenching, SPAD does not go back transistor to GeigerM5 mode instantly, waits for the global signal to openSPAD M6. The controls to speed up thebut quenching process. AfterRESET finishing quenching, doesoutput not go go back of to Geiger mode instantly, waits forinto the aglobal RESET signal to open M6.through The output voltage the quenching circuit isbut converted sub-nanosecond voltage pulse the voltage of the quenching circuit is converted into a sub-nanosecond voltage pulse through the mono-stable circuit. mono-stable circuit.

Sensors 2016, 16, 160

4 of 14

Sensors 2016, 16, 160

4 of 14

Sensors 2016, 16, 160

4 of 14 back to Geiger instantly, but waits for theofglobal RESET signal open M6. voltage of Then the mode pulse is sent through OR-trees row and column toto provide rowThe andoutput column outputs the quenching circuit is converted into a sub-nanosecond voltage pulse through the mono-stable circuit. of the 8 × 8the pixel array. Thethrough whole 8OR-trees × 8 pixel array 8-bit row bus and 8-bit column bus as outputs outputs. Then pulse is sent of rowhas and column to provide row and column Then thea calibration pulse is sentmodule through OR-trees ofinto rowpixels and column totoprovide row and column outputs Inthe addition, order calibrate TDC quantization error of 8 × 8 pixel array. The whole is 8 ×embedded 8 pixel array has 8-bitinrow bus and 8-bit column bus as outputs. of the 8 ˆby 8 pixel array. The whole 8 ˆ 8 pixel array has 8-bit row bus and 8-bit column bus as outputs. induced signal path delay. In addition, a calibration module is embedded into pixels in order to calibrate TDC quantization error In addition, a calibration module is embedded into pixels in order to calibrate TDC quantization error induced by signal path delay. induced by signal path delay.

VDD

M1

Calibrate_Enable Calibrate Pixel Calibrate_Enable M3 Calibrate Signal Monostable Output Pixel M3 Signal Circuit Output Monostable Circuit M4

M1

M4

VDD

MUX MUX

M2 M6 M2 M6

Pixel_Reset Logical Pixel_Reset Pixel_Enable Circuit Logical Pixel_Enable Circuit

M5

SPAD

M5

SPAD Anode

GND

Anode

GND Figure 3. The in-pixel circuits. Figure 3. The in-pixel circuits. Figure 3. The in-pixel circuits.

2.3. Time-to-Digital Converters Array 2.3. Time-to-Digital Converters Array 2.3. Time-to-Digital Converters The TDC array, which Array is similar with the structure shown in [14], has a time resolution of The TDC array, which is similar with the structure shown in [14], has a time resolution of 97.6 The ps. TDC The structure of TDC is shown 4. In orderinto[14], minimize the TDC’s power array, which is similar with in theFigure structure shown has a time resolution of 97.6 ps. The structure of TDC is shown in Figure 4. In order to minimize the TDC’s power consumption, consumption, the conversion is achieved in reverse START-STOP The GROthe begins to oscillate 97.6 ps. The structure of TDC is shown in Figure 4. In ordermode. to minimize TDC’s power the conversion is achieved in reverse START-STOP mode. The GRO begins to oscillate as soon as the as soon as the the START signal occurs. The global STOP signal is synchronized excitation laser. consumption, conversion is achieved in reverse START-STOP mode. The with GROthe begins to oscillate START signal occurs. The global STOP signal is synchronized with the excitation laser. In addition, Insoon addition, oscillator can start oscillating the node voltage the GRO laser. being as as thethe START signal occurs. The global immediately, STOP signal isafter synchronized with theofexcitation the oscillator can start oscillating immediately, after the node voltage of the GRO being reset. A reset. A 7-bitthe counter records the number of cycles of the GRO seven most-significant-bits In addition, oscillator can start oscillating immediately, afterasthe node voltage of the GRO(MSBs) being 7-bit counter records the number of cycles of the GRO as seven most-significant-bits (MSBs) of the of theAmeasurement. The three (LSBs) are by the eight states(MSBs) of the reset. 7-bit counter records the least-significant-bits number of cycles of the GRO as provided seven most-significant-bits measurement. The three least-significant-bits (LSBs) are provided by the eight states of the GRO. The GRO. The 10-bit timeThe information is stored into Register A and Register Bby alternately, theofother of the measurement. three least-significant-bits (LSBs) are provided the eight and states the 10-bit time information is stored into Register A and Register B alternately, and the other register’s register’s be read out. GRO. The data 10-bitwaits timeto information is stored into Register A and Register B alternately, and the other data waits to be read out. register’s data waits to be read out. B0 (a) STOPB0 (a) START

Vbias

Vbias Vbias START STOP START STOP B4 B4

STOPB1 START STOP START Vbias Reset Vbias Reset Reset Vbias Reset Vbias START STOP START STOP B5 B5

B2 STOPB2 START STOP START Vbias Vbias

B3 STOPB3 START STOP START Vbias Reset Vbias Reset

Vbias Vbias START STOP START STOP B6 B6

Reset Vbias Reset Vbias START STOP START STOP B7

(b) (b) 8-bit to 3-bit Decoder 8-bit to 3-bit Decoder START 8-bit START

8-bit GRO

STOP

GRO

RegisterA 3-bit 3-bit

RegisterB RegisterA RegisterB 7-bit

MUX MUX

STOP START Vbias

B1

10-bit Output 10-bit Output

7-bit 7-bit Counter 7-bit Counter

STOP

B7

Figure Figure 4. 4. (a) (a) The The GRO GRO circuit; circuit; (b) (b) the the structure structureof ofthe the10-bit 10-bitTDC. TDC.

Figure 4. (a) The GRO circuit; (b) the structure of the 10-bit TDC.

Figure 55 shows the RESET signal is used to reset TDCs and Figure the timing timing diagram diagramofofthe thescheme. scheme.The The RESET signal is used to reset TDCs pixels. The READ1, READ2, RESET1 and RESET2 signals are used to control the two groups of and pixels. READ1, READ2, RESET1ofand are used to is control thereset TDCs and of Figure The 5 shows the timing diagram the RESET2 scheme. signals The RESET signal used to registers ineach eachTDC TDCREAD2, alternately. registers in alternately. pixels. The READ1, RESET1 and RESET2 signals are used to control the two groups of registers in each TDC alternately.

Sensors 2016, 16, 160 Sensors2016, 2016,16, 16,160 160 Sensors

5 of 14 55 of 14

READOUT_CLK READOUT_CLK Readout_valid Readout_valid READ1 READ1 READ2 READ2 RESET1 RESET1 RESET2 RESET2 STOP(10M) STOP(10M) RESET RESET

Figure 5. The timing diagram of the scheme. Figure5.5.The Thetiming timingdiagram diagramof ofthe thescheme. scheme. Figure

3. Fluorescence Lifetime Imaging Error Analysis for the Proposed Readout Method FluorescenceLifetime LifetimeImaging ImagingError ErrorAnalysis Analysisfor forthe theProposed ProposedReadout ReadoutMethod Method 3.3.Fluorescence 3.1. Pile-Up Effect during Events Readout 3.1.Pile-Up Pile-UpEffect Effectduring duringEvents EventsReadout Readout 3.1. The readout circuits of this chip lead to the result that not every pulse can be detected by TDC Thereadout readoutcircuits circuitsof ofthis thischip chiplead leadto tothe theresult resultthat thatnot not every everypulse pulsecan canbe bedetected detectedby byTDC TDC The arrays or position recording circuits. The pile-up effect of TCSPC is classified into three types. The arrays or position recording circuits. The pile-up effect of TCSPC is classified into three types. The first arrays or position recording circuits. The pile-up effect of TCSPC is classified into three types. The first one is the traditional pile-up effect and it can be alleviated by reducing photon-rate to 1%, and one one is the pile-up effect andand it can be alleviated by reducing photon-rate to 1%, andand the first is traditional the traditional pile-up effect it can be alleviated by reducing photon-rate to 1%, the photon-rate is directly proportional to the emission intensity. The second one is caused by the photon-rate is directly proportional to the emission intensity. The second one is caused by the dead the photon-rate is directly proportional to the emission intensity. The second one is caused by the dead time of SPAD devices. Nonetheless, thanks to the global reset of the active quenching circuits, time time of SPAD devices. Nonetheless, thanks to the resetreset of the quenching circuits, this dead of SPAD devices. Nonetheless, thanks to global the global of active the active quenching circuits, this part can be neglected. The last one is the dead time of the processing circuits of the chip, i.e., the partpart can can be neglected. TheThe last last one one is the timetime of the circuits of the i.e., the this be neglected. is dead the dead of processing the processing circuits ofchip, the chip, i.e.,time the time interval between the photon’s arrival and being detected. The chip proposed in this paper is interval between the photon’s arrival and being detected. The chip proposed in this paper is influenced time interval between the photon’s arrival and being detected. The chip proposed in this paper is influenced mostly by the third kind of pile-up effect. Pulses from each pixel are shortened by the mostly by the thirdbykind pile-up Pulses fromPulses each pixel shortened the mono-stable influenced mostly the of third kindeffect. of pile-up effect. fromare each pixel arebyshortened by the mono-stable circuit but still have a finite length tp. For the 8 × 8 pixel array, if the time interval of any circuit but still havebut a finite length tp . Forlength the 8 tˆp. 8For pixel if thearray, time interval of any two photon mono-stable circuit still have a finite thearray, 8 × 8 pixel if the time interval of any two photon events is less than tp apart, two pulses will merge together at the output of the OR-tree. events is less than t apart, two pulses will merge together at the output of the OR-tree. Consequently, p less than tp apart, two pulses will merge together at the output of the OR-tree. two photon events is Consequently, only the first event will get processed further and the second event is missed only the first event further and the second event missed completely. Consequently, only will the get firstprocessed event will get processed further andisthe second event is Figure missed6 completely. Figure 6 shows the process of an 8 × 8 pixel array detecting photons. It can be seen that shows the process of an 8 ˆ 8 pixel array detecting photons. It can be seen that the electrons labelled completely. Figure 6 shows the process of an 8 × 8 pixel array detecting photons. It can be seen that the electrons labelled as 2, 3 and 7 will not be detected due to the dead time of the OR-tree. Thus, the as 2, 3 and 7 labelled will not be detected thebedead time due of the thethe photons’ arrival the electrons as 2, 3 and 7due willtonot detected toOR-tree. the deadThus, time of OR-tree. Thus,time the photons’ arrival time histogram is modulated by the readout circuits. histogram is modulated by the readout circuits. photons’ arrival time histogram is modulated by the readout circuits. T T

SPAD1 SPAD1 SPAD2 SPAD2 SPAD3 SPAD3 Or-tree Output Or-tree Output

1 1

4 4

6 6 5 5

2 2

7 7

3 3 Detected Detected 1 1

Detected Detected 4 4



8 8 Detected Detected 8 8

Figure6. 6.The Theschematic schematicdiagram diagramof ofpile-up pile-up effect. effect. Figure Figure 6. The schematic diagram of pile-up effect.

Another Another influencing influencing factor factor to to be be considered considered in in the the detecting detecting behavior behavior is is the the interactions interactions among among Another influencing factor to be considered in the detecting behavior is the interactions among pixels within the same sub-array and it is different from the common cross-talk phenomenon in pixels within the same sub-array and it is different from the common cross-talk phenomenon pixels within the same sub-array and it is different from the common cross-talk phenomenon in device level. It derives from thethevariation TDCs. The in device level. It derives from variationofofthe theconnecting connecting probability probability towards towards TDCs. The device level. It derives from the variation of the connecting probability towards TDCs. The probability is modulated by the neighboring pixels. If the sensor is supposed to work as a Mini-Silicon probability is modulated by the neighboring pixels. If the sensor is supposed to work as a Mini-Silicon probability is modulated by the neighboring pixels. If the sensor is supposed to work as a Mini-Silicon Photomultiplier Photomultiplier (MSP), (MSP), the the lifetimes lifetimes that that each each pixel pixel needs needs to to measure measure are are the the same; same; therefore, therefore, the the Photomultiplier (MSP), the lifetimes that each pixel needs to measure are the same; therefore, the interactions interactions among among the the sub-array sub-array are are also also the the same. same. Hence, Hence, the the influence influence of of interactions interactions can can be be interactions among the sub-array are also the same. Hence, the influence of interactions can be neglected. However, if the sensor is supposed to work as an area array detector, each pixel may need neglected. However, if the sensor is supposed to work as an area array detector, each pixel may need neglected. However, if the sensor is supposed to work as an area array detector, each pixel may need to means a rapid decay and most photons areare expected to be to detect detectdifferent differentlifetimes. lifetimes.Short Shortlifetime lifetime means a rapid decay and most photons expected to to detect different lifetimes. Short lifetime means a rapid decay and most photons are expected to be detected in a short period after excitation. Long lifetime is in the opposite condition. This results in a be detected in a short period after excitation. Long lifetime is in the opposite condition. This results detected in a short period after excitation. Long lifetime is in the opposite condition. This results in a larger probability that the TDC channels are occupied by pixels that detect short lifetimes. The larger probability that the TDC channels are occupied by pixels that detect short lifetimes. The

Sensors 2016, 16, 160

6 of 14

in a larger probability that the TDC channels are occupied by pixels that detect short lifetimes. The pile-up effect is thus deteriorated. Furthermore, the modulated influence along the whole detecting period is not the same. In order to simplify the analysis, the fluorescence is assumed to have a single-exponential decay. The influence is analyzed and simulated in Maximum Likelihood Estimator (MLE), Center-of-Mass Method (CMM) and Least-Squares Method (LSM). When the sensor works as a MSP, the average number of photons that one pixel can detect is [26]: 1 c aver pµ; tq “ 2 expp´t{τq ˆ N τ

#

expp´µp1 ´ e´t{τ qq expp´µe´t{τ pet p {τ ´ 1qq

t ă tp t ě tp

(1)

where τ is the lifetime, N2 is the number of pixels and µ is the expected number of photons the pixel array detects. When the sensor works as an area array detector, we only analyze one 8 ˆ 8 pixel array because each 8 ˆ 8 pixel array is independent. Assuming that τ i,j is the lifetime which pixel (i, j) needs to detect, and then the probability of photon detection on pixel (i, j) is: Pi,j ptq “

P0i,j expp´t{τi,j q τi,j

(2)

where P0i,j is the photon-rate of pixel (i, j). Then the probability failing to detect photon on pixel (i, j) is: # 1

Pi,j pt; t p q “

1 ´ P0i,j p1 ´ expp´t p {τi,j qq 1 ´ P0i,j pexpp´pt ´ t p q{τi,j q ´ exppt{τi,j qq

t ă tp t ě tp

(3)

If the interaction of other pixels is neglected, then for a certain pixel, which is assumed as pixel (1, 1), the probability of photon detection along with time is: 1

P1,1 pt, t p q “

n ź n ź

1

Pi,j pt, t p qP1,1 ptq

(4)

i“2 j“2

To simulate the influence of pile-up effect, the Monte Carlo simulation of the operation process of the chip is done in MATLAB. The MATLAB random number generator is used to simulate individual photon events with the appropriate statistical distribution. For a single-exponential decay, the probability that the pile-up occurs can be analyzed. Figure 7a,b is the simulation results of the counts error of events in different measuring windows on conditions that the pixel working under different τ MSP /tp and τ ARRAY /τ 1,1 , respectively, where τ MSP is the lifetime being measured by the pixel array when it is used as a MSP, τ 1,1 is the lifetime being measured by the pixel (1, 1) when the SPAD array is used as an array sensor, and τ ARRAY is the lifetime being measured by other pixels. The dashed line is the simulated data, while the solid line is the theoretical curve. From the two figures, it can be seen that the simulation results keep in accordance with the theory anticipation. During the period close to the excitation, it is more probable for the pixel array to detect photons, so the pile-up effect is more apparent and the counts error increases. After a while, the probability of photon detection falls, so the distribution gradually approaches to single-exponential curve. In PMT working mode, as τ MSP /tp goes smaller, the pile-up effect gets degraded. But when the sensor works as an array imager, the influence is modulated along time. When τ ARRAY /τ 1,1 = 0.1, pixel (1,1) detects longer lifetimes than other pixels do. Then at the beginning of the detection, pile-up effect from pixel (1,1) is less and the counts error is comparatively small. However, in later detection time, pile-up effect from pixel (1,1) becomes strong and the counts error is comparatively large. In the simulation process of the chip circuits, tp is found to be 360 ps.

Sensors 2016, 16, 160 Sensors 2016, 16, 160

7 of 14 7 of 14

Figure7.7.(a) (a)The Thecounts countserrors errorsofofevents eventsalong alongnormalized normalized time under different τMSP (b) the thecounts counts Figure time under different τ MSP /t/tpp;; (b) errorsof ofevents eventsalong alongnormalized normalizedtime timeunder underdifferent differentττARRAY ARRAY/τ . . errors /τ1,11,1

3.2.Fluorescence FluorescenceLifetime LifetimeImaging ImagingAlgorithm Algorithm 3.2. Inthis thissection, section,the theinfluence influenceof ofpile-up pile-upeffect effectto to MLE, MLE, CMM CMM and and LSM LSM is is simulated simulated and and analyzed. analyzed. In For a fluorescence histogram with a single-exponential decay f(t) = Aexp(−t/τ) in a measurement For a fluorescence histogram with a single-exponential decay f (t) = Aexp(´t/τ) in a measurement window00ď≤ tt ď ≤T by the the 10-bit 10-bit TDC TDC(M (M==1024), 1024),the thelifetime lifetimeestimated estimatedusing usingMLE, MLE,τ MLE τMLE,,can can window T recorded recorded by be obtained by [27]: be obtained by [27]: M

1  {exp(h /  MLE )  1}1  M (exp(Mh /  MLE )  1)1   jN j / Nÿ M c

1 ` texpph{τMLE q ´ 1u´1 ´ MpexppMh{τMLE q ´ 1qj 1´1 “

jNj {Nc

(5) (5)

“1the LSB of TDC, Nj is the where Nc is the total signal counts within the measurement window, h jis number of recorded counts in the jth time bin (j = 1, 2, ... , M). where Nc is the total signal counts within the measurement window, h is the LSB of TDC, N j is the The CMM can be viewed as a hardware implementation algorithm of the MLE although their number of recorded counts in the jth time bin (j = 1, 2, ... , M). physical definitions are not the same. The CMM is easy to be implemented by FPGA [28], and then it The CMM can be viewed as a hardware implementation algorithm of the MLE although their is possible to achieve video-rate fluorescence lifetime imaging by employing CMM to estimate physical definitions are not the same. The CMM is easy to be implemented by FPGA [28], and then fluorescence lifetimes. The lifetime estimated using CMM, τCMM, can be obtained by: it is possible to achieve video-rate fluorescence lifetime imaging by employing CMM to estimate fluorescence lifetimes. The lifetime estimated using  M CMM, τ CMM  , can be obtained by:

 ¨

jN  M

˛

j

1 1  jř  CMM  ˚ jNj  ‹ h τCMM

 j“1N c ˚ “˚  ˚  Nc ˝

(6)

12‹ ` ‹ h 2‹ ‚

(6)

The LSM minimizes the chi-square, ∑{(o − e)2/e}, where o is the statistics value and e is the ř LSM, τLSM, can be obtained by: expected value. The lifetime estimated using The LSM minimizes the chi-square, {(o ´ e)2 /e}, where o is the statistics value and e is the M expected value. The lifetime estimated using LSM, τ LSM , can obtained jT /  LSMby: )  jNbei2 exp( exp(T /  LSM ) M   exp(T /  LSM )  1 exp( MT /  LSM )  1

j 1 M

/ )  Nřexp( jNi2jTexppjT{τ LSM q M 2 i

j 1

(7)

LSM

j “1 exppT{τLSM q M ´ “ (7) M lifetime using MLE, CMM and LSM Figure 8 illustrates the impact of pile-up effect on estimating exppT{τLSM q ´ 1 exppMT{τLSM q ´ 1 ř 2 exppjT{τ N q LSM itheoretical respectively, when the pixel array measures uniform lifetime.j“The results marked as solid 1 lines are compared to Monte Carlo simulations marked with asterisks (scattered points). It can be seenFigure that there is not much difference between MLEon and CMM under condition that the lifetime 8 illustrates the impact of pile-up effect estimating lifetime using MLE, CMM and being LSM measured is when short. the Butpixel whenarray the lifetime being measured getsThe longer, it cannot be guaranteed all respectively, measures uniform lifetime. theoretical results marked asthat solid photons triggered by the laser cansimulations be detectedmarked by thewith sensor due to(scattered the limited detection lines are compared to Monte Carlo asterisks points). It cantime be window. As a isresult, the measuring error of CMM increases. From Figure 8, itthat alsothe can be seen that seen that there not much difference between MLE and CMM under condition lifetime being tp is the dominant factor that influences the measurement accuracy.

Sensors 2016, 16, 160

8 of 14

measured is short. But when the lifetime being measured gets longer, it cannot be guaranteed that all photons triggered by the laser can be detected by the sensor due to the limited detection time window. As a result, the measuring error of CMM increases. From Figure 8, it also can be seen that tp is the Sensors 2016, 16, 160 8 of 14 dominant factor that influences the measurement accuracy.

Figure while using using (a) (a) MLE; MLE; Figure 8. 8. The The relationship relationship between between measuring measuring error error of of different different lifetimes lifetimes and and ttpp while (b) (b) CMM; CMM; (c) (c) LSM LSM to to estimate estimate lifetimes. lifetimes.

Figure effect of of pixels under condition thatthat the chip works as an as area Figure 99shows showsthe theinteraction interaction effect pixels under condition the chip works anarray area image sensor.sensor. In this In figure, is assumed that τ 1,1that is 1 ns, 10 ns and1015nsnsand respectively. Also, the array image this it figure, it is assumed τ1,1 5isns, 1 ns, 5 ns, 15 ns respectively. measuring error is relative τ 1,1 . When short, the pile-up effecteffect becomes degraded and Also, the measuring error isto relative to τ1,1.τWhen ARRAY is short, the pile-up becomes degraded ARRAY τis the is serious. Considering the the situation thatthat τ 1,1τis and τ ARRAY anddeviation the deviation is serious. Considering situation 1,1 5 isns 5 ns and τARRAYisis11ns, ns,the the deviation deviation reaches reaches 9% 9% by by employing employing LSM. LSM. To To maintain maintainthe the average averageestimated estimatederror errorbelow below1% 1%in in CMM CMM algorithm, algorithm, the detecting detecting range range of of the the pixels pixels are are 5–20 5–20 ns. ns. MLE MLE maintains maintains its its accuracy accuracy below below 1% 1% in in full full range. range. But MLE MLE requires requires massive massive calculations calculations and and is is not not practical practicalthrough throughhardware. hardware. The hold hold time time of encoders encoders also influences influences the the accuracy accuracy of measurements. measurements. When When the the time interval interval The two events events is is shorter shorter than than the the hold hold time time of of encoders, encoders, the the encoders encoders used to record record the the column column and and of two row data data of of events events cannot cannot get get the the correct correct position position code code of of the the first first event. event. Once that occurs, occurs, the output row code of of encoder encoder is is “1000”, “1000”, the the data data of of this this time time of of fluorescence fluorescencetrigger triggerisisabandoned. abandoned. code The resolvability resolvability range range of of this this TCSPC TCSPC design design depends depends on the error from the estimation algorithm algorithm The and the the influencing influencingfactors factorsfrom fromdetecting detectingoperation. operation.CMM CMMtypically typically has ideal resolvability range and has anan ideal resolvability range of of T/4 T/100with withpost postsoftware softwarecalibrations calibrationswhere whereTTisisthe theperiod periodof ofthe thelaser laserpulse. pulse. The The origination origination T/4 to to T/100 of this this algorithm algorithm error error is is Poisson Poisson noise noise [28]. [28]. The The influencing influencing factors, factors,such suchas as the the pulse pulsewidth widthttpp output of from mono-stable circuit shorten thethe detecting range. In circuit and andthe theinteraction interactioneffect effectamong amongpixels, pixels,also also shorten detecting range. this design, after ananoverall under In this design, after overallconsideration, consideration,the theresolvability resolvabilityrange rangeisis approximately approximately 5–20 5–20 ns under 10 MHz MHz laser laser excitation excitation rate. rate. 10 detecting range in this this design design is suitable suitable in applications applications of long decay lifetimes. However, The detecting actual biomedical biomedical applications applications such as Indocyanine Indocyanine Green (ICG), the resolvability resolvability range range of of in actual of great great importance. importance. The design should be optimized to satisfy satisfy the the extending extending sub-nanosecond is of arrival time time of of photons photons range. First, the quantization range of TDC should be enough as the reverse arrival becomes longer longer under under rapid rapid decays. decays. Then, Then, the module module delays delays also need to be minimized to accelerate becomes the response. response.Additionally, Additionally, number of TDCs shared thesub-array same sub-array be the thethe number of TDCs shared amongamong the same should beshould enlarged enlarged topile-up alleviate pile-up effects. to alleviate effects.

Figure 9. The relationship between measuring errors and τARRAY while using (a) MLE; (b) CMM;

in actual biomedical applications such as Indocyanine Green (ICG), the resolvability range of sub-nanosecond is of great importance. The design should be optimized to satisfy the extending range. First, the quantization range of TDC should be enough as the reverse arrival time of photons becomes longer under rapid decays. Then, the module delays also need to be minimized to accelerate the response. Additionally, the number of TDCs shared among the same sub-array should Sensors 2016, 16, 160 9 of be 14 enlarged to alleviate pile-up effects.

whileusing using(a) (a) MLE; MLE; (b) (b) CMM; CMM; Figure 9. The relationship between measuring errors and ττARRAY ARRAY while Sensors estimate lifetimes. (c)2016, LSM16,to160

9 of 14

In actual FLIM measurements, the existence of background and DCR will restrict the accuracy In actual FLIM measurements, the existence of background and DCR will restrict the accuracy of of measurement. As background and DCR are rather lower than the signal intensity, as seen in measurement. As background and DCR are rather lower than the signal intensity, as seen in Figure 10, Figure 10, so the influence to the pile up effect can be ignored. Background and DCR are taken into so the influence to the pile up effect can be ignored. Background and DCR are taken into account by account by the method mentioned in [28]. the method mentioned in [28].

Figure Figure 10. 10. The The events events distribution distribution under under condition condition that that background background and and DCR DCR are are considered. considered.

4. SimulationResults Resultsof of Circuits Circuits 4. Simulation The The circuit circuit design design is based based on the 0.13 µm μm 1P3M 1P3M CIS CIS process process and and SPADs SPADs are are substituted substituted by by aa Verilog The Verilog A model has the of directly generating random random events obeying VerilogAAmodel. model. The Verilog A model hasadvantage the advantage of directly generating events the exponential distribution. Synchronized to 10 MHz to laser proposed the structure of a obeying the exponential distribution. Synchronized 10excitations, MHz laserthe excitations, proposed 16 ˆ 16 array is simulated detect various lifetimes with 1% detecting probability. The resolution × 16 array to is simulated to detect various lifetimes with 1% detecting probability. The structure of a 16 of two column-level TDCs is 97 ps. We should take note that the resolution of the SPAD array is not resolution of two column-level TDCs is 97 ps. We should take note that the resolution of the SPAD constrained 16 ˆ 16.below The SPAD canSPAD be extended more 8 ˆ sub-arrays with array is notbelow constrained 16 × array 16. The array by canplacing be extended by8 placing more 8 the ×8 same TDC/SPAD ratio. However, too large may betoo problematic layout The design of sub-arrays with the same TDC/SPAD ratio.array However, large arrayinmay be routing. problematic in layout 16 ˆ 16 SPAD array in this work is to testify the proposed readout method in fast imaging mode. routing. The design of 16 × 16 SPAD array in this work is to testify the proposed readout method in The pixelmode. is influenced by process, voltage and temperature variations. Figure 11 is the process fast imaging corner simulation result of the circuit, where tdelay is the time delay between the process output The pixel is influenced by quenching process, voltage and temperature variations. Figure 11 is the pulse pixels andresult the photon’s arrival timecircuit, and twidth is the the output pulse. The cornerofsimulation of the quenching where tdelaywidth is theoftime delay between theprocess output corner (NMOS: Slow and PMOS: Slow)’ tois‘ffthe (NMOS: Fast PMOS: Fast)’ the pulse ofcovers pixels‘ss and the photon’s arrival time and twidth width of theand output pulse. Theand process ˝ C to 80 ˝ C. The simulation results show that t temperature is swept from ´40 varied from 90 ps corner covers ‘ss (NMOS: Slow and PMOS: Slow)’ to ‘ff (NMOS: Fast and width PMOS: Fast)’ and the to 170 ps, andisthe tdelayfrom varied 400 to The 650 ps. The deviation the process corner contributes to temperature swept −40from °C to 80ps°C. simulation resultsofshow that twidth varied from 90 ps the estimation of lifetimes. to 170 ps, and error the tdelay varied from 400 ps to 650 ps. The deviation of the process corner contributes to the estimation error of lifetimes.

corner simulation result of the quenching circuit, where tdelay is the time delay between the output pulse of pixels and the photon’s arrival time and twidth is the width of the output pulse. The process corner covers ‘ss (NMOS: Slow and PMOS: Slow)’ to ‘ff (NMOS: Fast and PMOS: Fast)’ and the temperature is swept from −40 °C to 80 °C. The simulation results show that twidth varied from 90 ps to 1702016, ps, and the tdelay varied from 400 ps to 650 ps. The deviation of the process corner contributes Sensors 16, 160 10 of 14 to the estimation error of lifetimes.

(a)

(b)

Figure11. 11. The process corner simulation result of the quenching circuit: (a) twidth; (b) tdelay. Figure Sensors 2016, 16, 160 The process corner simulation result of the quenching circuit: (a) twidth ; (b) tdelay .

10 of 14

anan input of of ramp signal. Figure 12 is12the The non-linearity non-linearity of ofthe theproposed proposedTDC TDCisisanalyzed analyzedwith with input ramp signal. Figure is linearity result of the proposed TDC. As the actual time range that the TDC needs to quantify the linearity result of the proposed TDC. As the actual time range that the TDC needs to quantify is 0–95 ns, the digital output of the proposed TDC is limited to 950. The The differential differential nonlinearity nonlinearity (DNL) (DNL) is ´0.082LSB −0.082LSB\0.102LSB, and the integral nonlinearity (INL) is −0.205\0.282 LSB. \0.102LSB, ´0.205\0.282 theFLIM FLIMsimulation, simulation,it is it assumed is assumed the fluorescence intensity of the picture whole During the that that the fluorescence intensity of the whole picture is uniform, which means that the probability of every to detect photons is 1%. This leads is uniform, which means that the probability of every pixelpixel to detect photons is 1%. This leads to to the fact that the simulation result is worse than the actual measured result because the impact the fact that the simulation result is worse than the actual measured result because the impact of pile-up is maximized. The The maximum maximum imaging imaging frame frame rate rate is is 100 fps, in the situation that a lifetime map can be obtained by handling the information of about 1000 photons each pixel. If the accurate imaging is needed, the imaging frame rate can be decreased to reduce the standard deviation of estimated lifetimes.

Figure 12. The TDC. Figure 12. The DNL DNL and and INL INL of of the the proposed proposed TDC.

The No. 1 picture (the initial lifetime map in Figure 13) used as fluorescence source has a The No. 1 picture (the initial lifetime map in Figure 13) used as fluorescence source has a fluorophore whose lifetime is 14 ns, and the background lifetime is 4 ns. The total exposure time is fluorophore whose lifetime is 14 ns, and the background lifetime is 4 ns. The total exposure time is 10 ms, 100 ms and 2 s, respectively, and the output data of the circuits is handled by MLE, CMM and 10 ms, 100 ms and 2 s, respectively, and the output data of the circuits is handled by MLE, CMM and LSM, respectively. The simulation results are shown in Figure 13. It can be seen that the outline of LSM, respectively. The simulation results are shown in Figure 13. It can be seen that the outline of the the fluorophore is obvious when the total exposure time is 10 ms, which means that the imaging fluorophore is obvious when the total exposure time is 10 ms, which means that the imaging frame frame rate can be 100 fps. rate can be 100 fps. The No. 2 picture (the initial lifetime map in Figure 14) used as fluorescence source has a The No. 2 picture (the initial lifetime map in Figure 14) used as fluorescence source has a fluorophore whose lifetime is 10.5 ns. The simulation results are shown in Figure 14. The outline of fluorophore whose lifetime is 10.5 ns. The simulation results are shown in Figure 14. The outline of fluorophores becomes difficult to distinguish when the imaging frame rate approaches 100 fps, the fluorophores becomes difficult to distinguish when the imaging frame rate approaches 100 fps, the variance of fluorescence lifetime imaging results no longer stands negligible. In the circumstance of 10 fps, namely 100 ms exposure time, the fluorophore’s profile can be distinguished. Table 1 shows the mean values, variances and FoMs of detecting results of pixel arrays with 10.5 ns lifetime. The Figure of Merit (FoM) of fluorescence lifetime imaging can be expressed as follows: FoM 



(8)

Sensors 2016, 16, 160

11 of 14

variance of fluorescence lifetime imaging results no longer stands negligible. In the circumstance of 10 fps, namely 100 ms exposure time, the fluorophore’s profile can be distinguished. Table 1 shows the mean values, variances and FoMs of detecting results of pixel arrays with 10.5 ns lifetime. The Figure of Merit (FoM) of fluorescence lifetime imaging can be expressed as follows: FoM “ a

τ

(8)

στ ` ∆τ 2 2

where στ is the standard deviation of estimated lifetimes, and ∆τ is the average offset of estimated lifetimes. Table 1. The simulation results of lifetimes of 10.5 ns.

Actual Lifetime [ns] Estimated Lifetime [ns] Average Error [%] Standard Deviation@(10 ms)[ps] Standard Deviation@(100 ms)[ps] Standard Deviation@(2 s)[ps] Sensors 2016, 16, 160 Imaging FoM@(2 s)

MLE

CMM

LSM

10.5 10.537 0.35 413 34.9 23 241

10.5 10.53 0.28 410 34.6 22.9 279

10.5 10.589 0.84 672 115 46 105

11 of 14

Table 2 summarizes the detailed simulation results of the estimated fluorophore’s lifetime. The Table 2 summarizes the detailed simulation results of the estimated fluorophore’s lifetime. The volatility of FoM is attributed to the lack of sample volume. Since the simulation of circuits takes a volatility of FoM is attributed to the lack of sample volume. Since the simulation of circuits takes a lot lot of time, only 20 M cycles are simulated. It can be concluded that the FoM of LSM is rather smaller of time, only 20 M cycles are simulated. It can be concluded that the FoM of LSM is rather smaller than than MLE and CMM. As the fluorescence lifetime falls in between 5 ns and 14 ns, CMM is close to MLE and CMM. As the fluorescence lifetime falls in between 5 ns and 14 ns, CMM is close to MLE in MLE in terms of FoM, but when it comes to 14 ns or longer, CMM’s FoM presents apparent decrease. terms of FoM, but when it comes to 14 ns or longer, CMM’s FoM presents apparent decrease. Table 2. The simulation results of lifetimes of 14 ns. Table 2. The simulation results of lifetimes of 14 ns.

MLE Actual Lifetime [ns] 14 MLE Estimated Lifetime [ns] [ns] 14.03 14 Actual Lifetime Estimated Lifetime [ns] Average Error [%] 0.24 14.03 Average Error [%] Standard Deviation@(10 ms) [ps] 442 0.24 Standard Deviation@(10 ms) [ps] 442 Standard Deviation@(100 ms) [ps] ms) [ps] 119 119 Standard Deviation@(100 Standard Deviation@(2 s) [ps] s) [ps] 36.9 36.9 Standard Deviation@(2 Imaging s) FoM@(2 s) Imaging FoM@(2 280 280

CMM CMM 14

13.95 14 13.95 0.28 0.28 423 423 114 114 35.3 35.3 242 242

Figure13. 13.The Thesimulation simulationresult resultof ofNo. No.11picture. picture. Figure

LSM 14 14.24 14 14.24 1.69 1.69 487 487 158 158 45 45 59 59 LSM

Sensors 2016, 16, 160

12 of 14

Figure 13. The simulation result of No. 1 picture.

Figure Figure 14. 14. The The simulation simulation result result of of No. No. 22 picture. picture.

The simulation results above indicate that moderate measurement accuracy is attainable The simulation results above indicate that moderate measurement accuracy is attainable using full using full parallel event driven readout method. The CMM and MLE algorithm can be put in use parallel event driven readout method. The CMM and MLE algorithm can be put in use for estimating for estimating fluorescence lifetimes. MLE is capable of extending the lifetime resolvability range fluorescence lifetimes. MLE is capable of extending the lifetime resolvability range with a high with a high performance computer. But the CMM shows its advantage in the easy implementation performance computer. But the CMM shows its advantage in the easy implementation of hardware. of hardware. Table 3 summarizes the performance of the chip and shows the comparisons with other works. A slow repetition rate excitation laser is used in order to extend the lifetime resolvability range. The low power dissipation is attributed to the reduced number of TDCs. The chip’s maximum readout sample rate is 10 MSps, which is faster than that reported in all except one of the previously published articles that are referenced in Table 3. The frame rate is defined as the lifetime generation rate. The lifetime is estimated from the histograms consisting of large number of photons. The maximum frame rate in this work is 100 fps. The less number of collected photons means a lower signal-to-noise ratio compared with previous work in Reference [14] or [18]. Meanwhile, the output bandwidth is reduced to 720 Mbps. The Figure of Merit (FoM) of the proposed chip can be expressed as follows: FoM “

Power Resolution ¨ Readout_Sample_Rate

(9)

The FoM of the proposed structure is 5.86 pJ/(sample¨ pixel). The extremely low chip FoM is attributed to the implementation of full parallel event-driven readout, the small resolution and the reduction of measuring accuracy. Table 3. Comparisons of different architectures. Parameters

Reference 14

Reference 18

Reference 15

Reference 19

This Work

Resolution CMOS Technology Fill Factor [%] Number of TDC LSB of TDC [ps] Laser Pulse Rate [MHz] Sampling Rate [Sps] Lifetime frame rate [fps] @ histogram photons Bandwidth [bps] Power [W] Chip FoM [pJ/(sample¨ pixel)]

160 ˆ 128 0.13 um 1 1/1 55 40 50 k 0.08 @ « 30,0000 51.2 G 550 m 537

32 ˆ 32 0.13 um 2 1/1 119 40 1M 1.8 @ « 22,0000 10.24 G 90 m* 87

64 ˆ 64 0.35 um 0.93 4096/1 350 N/A 718 3.9 @ « 200 N/A 1.4 476 k

64 ˆ 64 0.13 um 0.77 1/1 62.5 20 20 M 100 @ « 600 42 G 8.79 107

16 ˆ 16 0.13 um 5.7 64/4 97 10 10 M 100 @ = 1000 720 M 15 m** 5.86

* Without I/O pads. ** Simulation result of core power consumption.

Sensors 2016, 16, 160

13 of 14

5. Conclusions In this work, we have proposed a full parallel event driven readout method which is implemented in the area array SPAD image sensor for high-speed FLIM. The maximum imaging frame rate is 100 fps, and the number of TDCs used declined to 16. The average power consumption is 15 mW. The output bandwidth is reduced to 720 Mbps. We did the analysis of imaging error caused by pile-up effect which is induced by the readout method. The lifetime resolvability range is 5–20 ns, and the average error of estimated fluorescence lifetimes is below 1%. The proposed readout method is suitable in video-rate FLIM with the resolvability of long decay lifetimes. Acknowledgments: This work is supported by the National Natural Science Foundation of China (61434004, 61274021), and the Postdoctoral Science Foundation of China (2015M581297). Author Contributions: Kaiming Nie, Xinlei Wang and Jun Qiao conceived the ideas and innovations, performed the simulation and analyzed the data; Jiangtao Xu provided supervision and guidance in this work. Conflicts of Interest: The authors declare no conflict of interest.

References 1.

2.

3. 4. 5.

6.

7.

8. 9. 10. 11. 12.

13.

Siegel, J.; Elson, D.S.; Webb, S.E.D.; Lee, K.C.B.; Vlandas, A.; Gambaruto, G.L.; Lévêque-Fort, S.; Lever, M.J.; Tadrous, P.J.; Stamp, G.W.H.; et al. Studying biological tissue with fluorescence lifetime imaging: Microscopy, endoscopy, and complex decay profiles. Appl. Opt. 2003, 42, 2995–3004. [CrossRef] [PubMed] Fatakdawala, H.; Poti, S.; Zhou, F.; Sun, Y.; Bec, J.; Liu, J.; Yankelevich, D.R.; Tinling, S.P.; Gandour-Edwards, R.F.; Farwell, D.G.; et al. Multimodal in vivo imaging of oral cancer using fluorescence lifetime, photoacoustic and ultrasound techniques. Biomed. Opt. Express. 2013, 4, 1724–1741. [CrossRef] [PubMed] Cubeddu, R.; Comelli, D.; D’Andrea, C.; Taroni, P.; Valentini, G. Time-resolved fluorescence imaging in biology and medicine. J. Phys. D Appl. Phys. 2002, 35, R61–R76. [CrossRef] Pawley, J.B. Handbook of Biological Confocal Microscopy; Springer: New York, NY, USA, 1995; pp. 516–534. Mosconi, D.; Stoppa, D.; Pancheri, L.; Gonzo, L.; Simoni, A. CMOS Single Photon Avalanche Diode Array for Time-Resolved Fluorescence Detection. In Proceedings of the 32nd European Solid-State Circuits Conference (ESSCIRC), Montreaux, Switzerland, 19–21 September 2006; pp. 564–567. Li, D.; Arlt, J.; Richardson, J.; Walker, R.; Buts, A.; Stoppa, D.; Charbon, E.; Henderson, R. Real-time fluorescence lifetime imaging system with a 32 ˆ 32 0.13 µm CMOS low dark-count single-photon avalanche diode array. Opt. Express. 2010, 18, 10257–10269. [CrossRef] [PubMed] Savuskan, V.; Gal, L.; Cristea, D.; Javitt, M.; Feiningstein, A.; Leitner, T.; Nemirovsky, Y. Single photon avalanche diode collection efficiency enhancement via peripheral well controlled field. IEEE Trans. Electron Dev. 2015, 62, 1939–1945. [CrossRef] Cova, S.; Ghioni, M.; Lacaita, A.; Samori, C.; Zappa, F. Avalanche photodiodes and quenching circuits for single-photon detection. Appl. Opt. 1996, 35, 1956–1976. [CrossRef] [PubMed] Richardson, J.A.; Webster, E.A.G.; Grant, L.A.; Henderson, R.K. Scaleable Single-Photon Avalanche Diode Structures in Nanometer CMOS Technology. IEEE Trans. Electron Dev. 2011, 58, 2028–2035. [CrossRef] Palubiak, D.P.; Deen, M.J. CMOS SPADs: Design Issues and Research Challenges for Detectors, Circuits, and Arrays. IEEE J. Sel. Top. Quantum Electron. 2014, 20, 409–426. [CrossRef] Stoppa, D.; Mosconi, D.; Pancheri, L.; Gonzo, L. Single-Photon Avalanche Diode CMOS Sensor for Time-Resolved Fluorescence Measurements. IEEE Sens. J. 2009, 9, 1084–1090. [CrossRef] Li, D.D.; Ameerbeg, S.M.; Arlt, J.; Tyndall, D.; Walker, R.J.; Matthews, D.R. Time-Domain Fluorescence Lifetime Imaging Techniques Suitable for Solid-State Imaging Sensor Arrays. Sensors 2012, 12, 5650–5669. [CrossRef] [PubMed] Vitali, M.; Bronzi, D.; Krmpot, A.J.; Nikolić, S.N.; Schmitt, F.J.; Junghans, C.; Tisa, S.; Friedrich, T.; Vukojević, V.; Terenius, L.; et al. A Single-Photon Avalanche Camera for Fluorescence Lifetime Imaging Microscopy and Correlation Spectroscopy. IEEE J. Sel. Top. Quantum Electron. 2014, 20, 1–10. [CrossRef]

Sensors 2016, 16, 160

14.

15. 16.

17. 18.

19. 20. 21.

22. 23. 24.

25. 26. 27. 28.

14 of 14

Veerappan, C.; Richardson, J.; Walker, R.; Li, D.D.U.; Fishburn, M.W.; Maruyama, Y.; Stoppa, D.; Borghetti, F.; Gersbach, M.; Henderson, R.K.; et al. A 160 ˆ 128 single-photon image sensor with on-pixel 55 ps 10 b time-to-digital converter. In Proceedings of the IEEE International Solid-state Circuits Conference (ISSCC), San Francisco, CA, USA, 20–24 February 2011; pp. 312–314. Schwartz, D.E.; Charbon, E.; Shepard, K.L. A Single-Photon Avalanche Diode Array for Fluorescence Lifetime Imaging Microscopy. IEEE J. Solid State Circuits 2008, 43, 2546–2557. [CrossRef] [PubMed] Stoppa, D.; Borghetti, F.; Richardson, J.; Walker, R.; Grant, L.; Henderson, R.K.; Gersbach, M.; Charbon, E. A 32 ˆ 32-pixel array with in-pixel photon counting and arrival time measurement in the analog domain. In Proceedings of the European Solid-state Circuits Conference (ESSCIRC), Athens, Greece, 14–18 February 2009; pp. 204–207. Niclass, C.; Favi, C.; Kluter, T.; Gersbach, M.; Charbon, E. A 128 ˆ 128 single-photon image sensor with column-level 10-bit timeto-digital converter array. IEEE J. Solid State Circuits 2008, 43, 2977–2989. [CrossRef] Gersbach, M.; Maruyama, Y.; Trimananda, R.; Fishburn, M.W.; Stoppa, D.; Richardson, J.A.; Walker, R.; Henderson, R.; Charbon, E. A Time-Resolved, Low-Noise Single-Photon Image Sensor Fabricated in Deep-Submicron CMOS Technology. IEEE J. Solid State Circuits 2012, 47, 1394–1407. [CrossRef] Field, R.M.; Realov, S.; Shepard, K.L. A 100 fps, Time-Correlated Single-Photon-Counting-Based Fluorescence-Lifetime Imager in 130 nm CMOS. IEEE J. Solid State Circuits 2014, 49, 867–880. [CrossRef] Homulle, H. Development of a Multichannel TCSPC System in a Spartan 6 FPGA. Ph.D. Thesis, Technical University of Delft, Delft, Netherland, 2014. Dutton, N.A.W.; Gnecchi, S.; Parmrsan, L.; Holmes, A.J.; Rae, B.; Grant, L.A.; Henderson, R.K. A Time-Correlated Single-Photon Counting Sensor with 14 GS/s Histogramming Time-to-Digital Converter. In Proceedings of the IEEE International Solid-state Circuits Conference (ISSCC), San Francisco, CA, USA, 22–26 February 2015; pp. 1–3. Villa, F.; Lussana, R.; Tosi, A.; Zappa, F. High fill-factor 60 ˆ 1 SPAD array with 60 sub-nanosecond integrated TDCs. IEEE Photonic Tech. Lett. 2015, 27, 1261–1264. [CrossRef] Stipˇcević, M. Active quenching circuit for single-photon detection with Geiger mode avalanche photodiodes. Appl. Opt. 2009, 48, 1705–1714. [CrossRef] [PubMed] Tyndall, D.; Rae, B.R.; Li, D.D.U.; Arlt, J.; Johnston, A.; Richardson, J.A.; Henderson, R.K. A High-Throughput Time-Resolved Mini-Silicon Photomultiplier With Embedded Fluorescence Lifetime Estimation in 0.13 µm CMOS. IEEE Trans. Biomed. Circuits Syst. 2012, 6, 562–570. [CrossRef] [PubMed] Mita, R.; Palumbo, G. High-Speed and Compact Quenching Circuit for Single-Photon Avalanche Diodes. IEEE Trans. Instrum. Meas. 2008, 57, 543–547. [CrossRef] Arlt, J.; Tyndall, D.; Rae, B.R.; Li, D.D.U.; Richardson, J.A.; Henderson, R.K. A study of pile-up in integrated time-correlated single photon counting system. Rev. Sci. Instrum. 2013, 84, 103105. [CrossRef] [PubMed] Hall, P.; Sellger, B. Better Estimates of Exponential Decay Parameters. J. Phys. Chem. 1981, 85, 2491–2496. [CrossRef] Li, D.D.U.; Arlt, J.; Tyndall, D.; Walker, R.; Richardson, J.; Stoppa, D.; Charbon, E.; Henderson, R.K. Video-rate fluorescence lifetime imaging camera with CMOS single-photon avalanche diode arrays and high-speed imaging algorithm. J. Biomed. Opt. 2011, 16. [CrossRef] [PubMed] © 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons by Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

A Full Parallel Event Driven Readout Technique ... - Semantic Scholar

A Full Parallel Event Driven Readout Technique ... - Semantic Scholar

Suggest Documents

GATEWAYS: A TECHNIQUE FOR ADDING EVENT-DRIVEN ...

Realizing Event-Driven SOA - Semantic Scholar

Towards proactive event-driven computing - Semantic Scholar

Event-driven Adaptation in COP - Semantic Scholar

from event-driven workflows - Semantic Scholar

SenaaS:An Event-driven Sensor Virtualization ... - Semantic Scholar

and Event-Driven Dynamic Systems - Semantic Scholar

A Parallel Search-and-Learn Technique for ... - Semantic Scholar

Eve: A parallel event-driven programming language

A distributed, event-driven control architecture for ... - Semantic Scholar

A Framework for Event-driven Demonstration ... - Semantic Scholar

LuaTS â A Reactive Event-Driven Tuple Space - Semantic Scholar

LuaTS â A Reactive Event-Driven Tuple Space - Semantic Scholar

A Real-Time, Event Driven Neuromorphic System ... - Semantic Scholar

Ubiquitous Nature of Event-Driven Approaches: A ... - Semantic Scholar

A Centralized Cache Miss Driven Technique to ... - Semantic Scholar

a "fingerprinting technique" - Semantic Scholar

issues in parallel discrete event simulation for an ... - Semantic Scholar

parallel discrete event simulation of queuing - Semantic Scholar

Event-B Decomposition for Parallel Program ... - Semantic Scholar

Scalable Event-Driven Native Parallel Processing: The ... - APT

Efficient Event-driven Simulation of Parallel Processor Architectures

Event- and Time-Driven Techniques Using Parallel ...

Lookup Table Powered Neural Event-Driven ... - Semantic Scholar

A Full Parallel Event Driven Readout Technique ... - Semantic Scholar