PET system synchronization and timing resolution using high-speed data links
Ramón J. Aliaga, José M. Monzó, Michele Spaggiari, Néstor Ferrando, Rafael Gadea, Ricardo J. Colom
Instituto de Instrumentación para Imagen Molecular, Universidad Politécnica de Valencia
17th Real-Time Conference 25/05/2010

Contents
• Introduction
• System architecture
• High-speed data links
  ◦ PTP synchronization scheme
  ◦ Phase correction
  ◦ Hardware requirements
• Digitization and timing resolution
  ◦ Trigger delay compensation
• First test results
• Conclusion

Introduction
• The analog section in PET systems is shrinking due to early digitization and/or front-end integration
• Tendency to digitize trigger signals directly on the photodetector board
• Eventually, we want to integrate the whole analog front-end and digitization in a single ASIC and minimize front-end card size
• We favor complete physical separation of the digital front-end (digitization and channel-specific processing) from central processing (coincidence processing, etc.)
• But digitizer cards still need to be synchronized for event timestamping…
• …and sync resolution specifications get more restrictive due to continuously improving coincidence timing resolution

Introduction
[Figure: block diagrams of the acquisition chain. Labels: analog front-end, analog signals, digitization, coincidence detection; front-end + digitization, digital links, coincidence detection]
• Approaches to synchronization:
  ◦ Backplane with precise clock delivery network
  ◦ Proposal: Synchronization over data links

System architecture for remote synchronization
[Figure: system block diagram. Labels: DETECTORS, synchronization path]
• Central processing unit:
  ◦ Data collection
  ◦ Coincidence processing
• Front-end cards:
  ◦ Analog conditioning
  ◦ Digitization
  ◦ Single event detection
• Hierarchic (master-slave) data links transfer clock frequency and synchronize local references
• Short tree structure to minimize time uncertainty

High-speed data links between FPGAs
[Figure: link between Master FPGA and Slave FPGA. Labels: sync logic, TX, RX, 8B/10B, Word Aligner, SerDes, CRU, sample, clock adapt., local clock, extracted clock]
• Data coding with embedded clock
• Receivers extract the clock from signal transitions and use it for deserialization
• Extracted clock can be used for external logic
• Special physical coding is needed (8B/10B) so daughter cards do not lose the clock

Timestamp synchronization: IEEE 1588
• Precision Time Protocol (PTP): Two-frame synchronization
• Each end has a timestamp counter with unknown initial value
• Tx and Rx local timestamps are recorded
• Offset can be computed if propagation time is equal or difference Δt is known

\[ \text{total flight time} = (T_{M2} - T_{M1}) - (T_{S2} - T_{S1}) \]

\[ \Delta TS = \frac{1}{2}\left[ (T_{M1} + T_{M2}) - (T_{S1} + T_{S2}) - \Delta t \right], \qquad \Delta t = t_{SM} - t_{MS} \ \text{(must be known)} \]
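
The two formulas above can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: the function name `ptp_offset`, the sample timestamps and the sign convention (offset = master counter minus slave counter, which is what the formula yields) are assumptions made for illustration.

```python
def ptp_offset(tm1, ts1, ts2, tm2, delta_t=0.0):
    """Return (offset, total_flight_time) from one two-frame exchange.

    tm1: master counter when frame 1 is sent      (master time base)
    ts1: slave counter when frame 1 is received   (slave time base)
    ts2: slave counter when frame 2 is sent       (slave time base)
    tm2: master counter when frame 2 is received  (master time base)
    delta_t: known asymmetry t_SM - t_MS (0 if both directions are equal)
    """
    flight = (tm2 - tm1) - (ts2 - ts1)
    offset = 0.5 * ((tm1 + tm2) - (ts1 + ts2) - delta_t)
    return offset, flight


# Hypothetical scenario, all values in ns: the master counter leads the
# slave counter by 1000, with t_MS = 40 and t_SM = 46 (delta_t = 6).
true_offset, t_ms, t_sm = 1000.0, 40.0, 46.0
tm1 = 5000.0
ts1 = tm1 + t_ms - true_offset   # frame 1 arrival, read on the slave counter
ts2 = ts1 + 200.0                # slave replies 200 ns later
tm2 = ts2 + true_offset + t_sm   # frame 2 arrival, read on the master counter

offset, flight = ptp_offset(tm1, ts1, ts2, tm2, delta_t=t_sm - t_ms)
print(offset, flight)
```

Calling `ptp_offset` without `delta_t` in this scenario shifts the result by half the asymmetry, which is why Δt has to be known.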

Synchronization timing diagram
[Figure: timing diagram of the two-frame exchange. Labels: t_TX, t_p, t_RX for each direction; t_MS, t_SM]
• Clock domain crossing adds timing errors due to phase
• Phase difference has to be measured and considered in new timestamp computation

Phase measurement
• DMTD (Dual-Mixer Time Difference) method
• Clocks are sampled by a generated clock with a similar frequency
• We obtain clocks with lower frequency but the same phase relationship
[Figure: waveforms of the clocks and the sampling clock]

Phase measurement implementation
• Sampling clock is synthesized so that

\[ f_D = \frac{N}{N+1} f_{ref} \]

• Phase difference and edge distance are multiplied by N
• Can be implemented entirely inside the FPGA
[Figure: implementation inside the master FPGA. Labels: reference clock, TX, RX, extracted clock, freq synth, sampling, edge detection (two blocks), counter, phase estimation]
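
The sampling scheme can be illustrated with an idealized simulation. This is a sketch under stated assumptions, not the FPGA design: both clocks are jitter-free square waves with 50% duty cycle, the function names are hypothetical, and the values N = 1000 and an 8000 ps reference period are chosen only for the example. Exact rational arithmetic is used so that the sampled levels are not affected by rounding.

```python
from fractions import Fraction


def sampled_level(k, n, delay):
    """Level of a 50%-duty clock (period 1) seen at the k-th sampling edge.

    The sampling period is (n + 1) / n reference periods, i.e. the sampling
    frequency is n / (n + 1) times the reference frequency. `delay` is the
    lag of the observed clock as a fraction of the reference period.
    """
    phase = (Fraction(k * (n + 1), n) - delay) % 1
    return phase < Fraction(1, 2)


def first_rising_edge(n, delay):
    """Index of the first sample where the sampled clock goes low -> high."""
    k = 1
    while not (sampled_level(k, n, delay) and not sampled_level(k - 1, n, delay)):
        k += 1
    return k


def dmtd_delay(n, delay, t_ref):
    """Estimate the lag between two clocks from the sampled edge distance."""
    ref_edge = first_rising_edge(n, Fraction(0))
    rx_edge = first_rising_edge(n, delay)
    cycles = (rx_edge - ref_edge) % n   # edge distance in sampling cycles
    return cycles * t_ref / n           # one cycle corresponds to t_ref / n


T_REF_PS = 8000   # hypothetical reference clock period, in ps
N = 1000          # hypothetical ratio for the synthesized sampling clock

# True lag: 0.1234 of a period, i.e. 987.2 ps.
estimate = dmtd_delay(N, Fraction(1234, 10000), T_REF_PS)
print(estimate)
```

In this idealized model the edge distance counted in sampling cycles equals N times the lag expressed in reference periods, rounded up, so the resolution of a single measurement is one reference period divided by N.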

Deterministic latency
• We need to know the difference Δt exactly

\[ \Delta t = (t_{TX,S} - t_{TX,M}) + \underbrace{(t_{p,SM} - t_{p,MS})}_{\text{assumed null}} + (t_{RX,M} - t_{RX,S}) + \underbrace{\frac{\Delta\varphi}{2\pi} T_{clk}}_{\text{phase error}} \]

• TX and RX latency must be deterministic across different devices and power cycles
• Restricted choice: Virtex-5 LXT+, Stratix IV GX+
• Example: on a Virtex-5, external logic is required to phase-shift internal transceiver clocks and to manually shift bits for word alignment

\[ t_{RX} = C + \frac{T_{clk}}{10} \cdot \text{bit shift} \]
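
As a worked example of the two expressions above, the following Python sketch evaluates the receiver latency for a given bit shift and then the resulting asymmetry Δt. The function names and all numeric values (constant C, clock period, bit shifts, phase) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def rx_latency(c, t_clk, bit_shift):
    """Receiver latency t_RX = C + (T_clk / 10) * bit_shift."""
    if not 0 <= bit_shift < 10:
        raise ValueError("bit_shift must be in 0..9")
    return c + t_clk / 10 * bit_shift


def path_asymmetry(t_tx_s, t_tx_m, t_rx_m, t_rx_s, phase_rad, t_clk,
                   t_p_sm=0.0, t_p_ms=0.0):
    """Delta t = t_SM - t_MS, split into TX, propagation, RX and phase terms.

    The propagation terms default to 0, matching the 'assumed null' note.
    """
    return ((t_tx_s - t_tx_m) + (t_p_sm - t_p_ms) + (t_rx_m - t_rx_s)
            + phase_rad / (2 * math.pi) * t_clk)


# Hypothetical values, in ps.
T_CLK = 8000.0
C = 20000.0
t_rx_master = rx_latency(C, T_CLK, bit_shift=3)
t_rx_slave = rx_latency(C, T_CLK, bit_shift=7)

delta_t = path_asymmetry(t_tx_s=15000.0, t_tx_m=15000.0,
                         t_rx_m=t_rx_master, t_rx_s=t_rx_slave,
                         phase_rad=math.pi / 2, t_clk=T_CLK)
print(t_rx_master, t_rx_slave, delta_t)
```

With equal TX latencies, the asymmetry in this example comes only from the difference in bit shifts and from the measured phase.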

ADC and timing resolution
• Timing resolution for digital trigger depends on ADC sampling rate:
  ◦ Minimum peaking time estimated to be equal to 2 samples
  ◦ Faster sampling allows tighter shaping with sharper edges and better time resolution
• Digitization of many channels means ADCs with serial output are more practical
• Higher rate (160 MHz) + serial output = high-speed link
• Gigabit receivers introduce nontrivial delay

ADC to FPGA delay compensation
• One ADC channel is reserved for compensation
• Channels are aligned by programming test patterns
• A ramp signal is started and total delay is estimated by fitting lines
• Ramp delay is assumed constant
[Figure: multi-channel ADC connected to the FPGA over high-speed links. Labels: trigger signals, ADC (four channels), RX (four receivers), channel alignment, trigger estimation, delay estimation, ramp generator]
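
One possible reading of the line-fitting step is sketched below in Python. It is an assumption-laden illustration rather than the authors' algorithm: the ramp is taken to start at time 0 in the FPGA time base, the delay is taken as the zero crossing of a least-squares line fitted to the received samples, and the ramp slope, true delay and threshold are hypothetical values. Only the 160 MHz sampling rate comes from the slides.

```python
def fit_line(ts, ys):
    """Ordinary least-squares fit y = a*t + b; returns (a, b)."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_t


def estimate_delay(ts, ys, threshold):
    """Time at which the fitted ramp crosses zero.

    Only samples above `threshold` are fitted, so the flat baseline before
    the ramp arrives does not bias the line.
    """
    pts = [(t, y) for t, y in zip(ts, ys) if y > threshold]
    a, b = fit_line([t for t, _ in pts], [y for _, y in pts])
    return -b / a


# Synthetic data: 160 MHz sampling, hypothetical slope and delay.
T_SAMPLE_NS = 6.25     # 1 / 160 MHz
SLOPE = 2.0            # ADC counts per ns (hypothetical)
TRUE_DELAY_NS = 23.7   # hypothetical total delay

times = [k * T_SAMPLE_NS for k in range(32)]
# Integer ADC codes: zero before the ramp arrives, then a quantized ramp.
codes = [max(0, round(SLOPE * (t - TRUE_DELAY_NS))) for t in times]

delay = estimate_delay(times, codes, threshold=5)
print(delay)
```

Because the line is fitted over many samples, the estimate can be much finer than the 6.25 ns sampling period, which is the point of using a ramp instead of a single edge.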

First measurements
[Figure: histogram of measured phase difference. x-axis: Phase difference (ps), 0 to 6400; y-axis: Frequency (normalized), 0 to 1. Peaks are labeled 0 to 9, where nº = (slave rx bit shift + master rx bit shift) mod 10. Annotation: matches latency specification]
• First measurements connecting two Xilinx ML505 evaluation boards
• Latency determinism was tested by repetition across power cycles
• 150 ps FWHM phase difference resolution
• Better resolution expected after statistical processing

Conclusion
• We propose:
  ◦ A PET system architecture with integrated analog and digital front-end and high-speed ADCs
  ◦ A method for high-resolution system synchronization over data links
  ◦ A method for compensation of added ADC delay
• First tests indicate that overall synchronization resolution may be below 300 ps, so it won't affect timing resolution
• Results are tentative, no real confirmation yet (custom FPGA boards needed)