Fixed-Point Implementation of Active Disturbance Rejection Control for Superconducting Radio Frequency Cavities*
Shen Zhao, Senior Member, IEEE, Nathan Usher, Dan Morris, and John Vincent
Abstract— A fixed-point implementation of the previously explored active disturbance rejection control (ADRC) for superconducting radio frequency (SRF) cavities is presented in this paper. The developed fixed-point algorithm is very suitable for being implemented on a field programmable gate array (FPGA) allowing high sampling rate (>20 MHz), which may be needed in special applications. The design tradeoffs are discussed as well to ensure its implementation on a low end FPGA with limited resources. Through both simulations and hardware tests, it is demonstrated that the fixed-point algorithm provides performance as good as the floating-point algorithm.
*This material is based upon work supported by the U.S. Department of Energy Office of Science under Cooperative Agreement DE-SC0000661, and also by the National Science Foundation under Grant PHY-06-06007. All authors are with the Facility for Rare Isotope Beams (FRIB) and the National Superconducting Cyclotron Laboratory (NSCL), Michigan State University, East Lansing, MI 48824 USA. Shen Zhao is the corresponding author, and can be reached at: 517-908-7234 (phone); 517-908-7126 (fax); [email protected] (email).

I. INTRODUCTION
The linear accelerator (LINAC) has some advantages over the ring-type accelerator. For example, it is able to accelerate heavy ions to higher energy and to accelerate electrons to speeds close to the speed of light. The superconducting radio frequency (SRF) cavity, as the basic module of the LINAC, is a key component. The ongoing Facility for Rare Isotope Beams (FRIB) project at the National Superconducting Cyclotron Laboratory (NSCL) at Michigan State University (MSU) will build a LINAC that requires over 340 SRF cavities [1]. Since the cavity operates in the superconducting condition and has very little loss, it has an extremely high quality factor ($10^8$ to $10^{10}$). The lightly loaded SRF cavity still has a very narrow bandwidth (around 70 Hz or less), which makes it sensitive to detuning forces such as microphonics and the Lorentz force. Hence the design of the low level RF (LLRF) control for SRF cavities is very challenging.

Previous research [2] found that active disturbance rejection control (ADRC) can reject the microphonics effectively. A two- to fourfold performance improvement over the existing PID control has been reported in hardware tests. Since early 2011, ADRC has been applied to control the amplitude and the phase of the SRF cavities in the 3 MeV/u re-accelerator (ReA3), which is under construction at NSCL [3], and it has been chosen as the main LLRF control algorithm for FRIB.

A recent effort in the LLRF design for SRF cavities at NSCL and FRIB is to use the self-excited loop (SEL) to resonate the cavity in the absence of a tuner [4]. In the SEL mode, the frequency of the driving RF signal needs to match the cavity natural frequency, which is achieved by adding an offset to the reference frequency. The offset is produced by adding a phase shift to the reference signal every sampling period. Experimental results have shown that the phase shift added per sampling period cannot exceed 30 degrees; otherwise it causes waveform distortion in the driving RF signal [4]. Hence the maximum frequency offset is limited to one twelfth (30/360) of the sampling frequency. Currently, the ADRC algorithm is implemented using floating-point math on a fixed-point digital signal processor (DSP) running at a 50 kHz sampling rate, which provides a maximum frequency offset of around 4 kHz. If the cavity natural frequency is more than 4 kHz away from the reference frequency, the cavity will not be resonated. To achieve a higher frequency offset, the sampling rate has to be increased, and this has been the biggest motivation for a field programmable gate array (FPGA) based implementation. Other factors, such as removing the DSP chip to reduce cost and increase reliability, are also taken into consideration.

ADRC has mostly been implemented on microprocessors, although implementations on other platforms, for example the programmable logic controller (PLC) [5] and the FPGA, have been reported. Previous efforts to implement the ADRC algorithm on an FPGA can be found in [6]-[8]. However, floating-point calculation is still used in those cases, which requires a lot of FPGA resources and coding effort and generally results in greater latency. In addition, a fixed-point algorithm is a natural choice, since in most digital control applications the analog signal has already been digitized by the analog-to-digital converter (ADC) and is, in fact, represented by integers.

The paper is organized as follows. The system dynamics of the SRF cavity and the corresponding continuous ADRC design are briefly introduced in Section II.
The fixed-point algorithm is developed in Section III. The implementation and verification of the fixed-point algorithm are presented in Section IV. Finally, concluding remarks are given in Section V.

II. SRF CAVITY MODEL AND ADRC DESIGN
The simplified SRF cavity dynamics can be represented by the following voltage vector model [9]
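$$
\begin{aligned}
\dot{V}_{cI} + \omega_{1/2} V_{cI} + \Delta\omega V_{cQ} &= \omega_{1/2} V_{gI} \\
\dot{V}_{cQ} + \omega_{1/2} V_{cQ} - \Delta\omega V_{cI} &= \omega_{1/2} V_{gQ}
\end{aligned}
, \tag{1}
$$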
where $V_{cI}$ and $V_{cQ}$ are the in-phase (I) and quadrature (Q) components of the cavity voltage, which is the system output; $V_{gI}$ and $V_{gQ}$ are the I and Q components of the generator voltage, which is the system input; $\omega_{1/2}$ is the cavity half bandwidth; and $\Delta\omega$ is the cavity detuning frequency, i.e., the difference between the generator RF frequency and the cavity natural frequency. It can be seen from (1) that the cavity dynamics are two coupled first order systems.

The development of the continuous ADRC has been described in [2] and elsewhere in the literature [10], [11]. Therefore, only the result is presented here, using the relevant first order ADRC as an example. For a general first order system
$$
\dot{y} = f(y, w, t) + bu , \tag{2}
$$
the corresponding ADRC design is given below as
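$$
\begin{aligned}
\dot{\hat{x}}_1 &= \hat{x}_2 + \hat{b}u + l_1 (y - \hat{x}_1) \\
\dot{\hat{x}}_2 &= l_2 (y - \hat{x}_1) \\
u &= \frac{k_1 (r - y) - \hat{x}_2}{\hat{b}}
\end{aligned}
, \tag{3}
$$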
where $r$ is the reference signal; $y$ is the system output; $u$ is the system input; $w$ is the external disturbance; $t$ is time; $f(\cdot)$ is the total disturbance, which includes both unknown internal dynamics and external disturbances; $\hat{x}_1$ and $\hat{x}_2$ are the observer states estimating the system output and the total disturbance, respectively; $l_1$ and $l_2$ are the observer gains; $k_1$ is the controller gain; $b$ is a system related constant, which is the cavity half bandwidth $\omega_{1/2}$ in this application; and $\hat{b}$ is an estimate of $b$. The coupling between the I and Q components is treated as an external disturbance in this application. According to the parameterization technique [12], the observer and controller gains are chosen as $l_1 = 2\omega_{ob}$, $l_2 = \omega_{ob}^2$ and $k_1 = \omega_c$, where $\omega_{ob}$ is the observer bandwidth and $\omega_c$ is the controller bandwidth. Furthermore, let $\omega_{ob} = \alpha\omega_c$, where $\alpha$ is a preselected constant (normally from one to ten); then $\omega_c$ becomes the only tuning parameter.
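As a small illustration of the parameterization above, the following sketch computes the three gains from the single tuning parameter. The function name `adrc_gains` and the numerical values are illustrative assumptions and are not taken from the paper.

```python
def adrc_gains(omega_c, alpha):
    """Return (l1, l2, k1) for the first order ADRC in (3).

    omega_c : controller bandwidth (rad/s), the only tuning parameter
    alpha   : preselected ratio omega_ob / omega_c (normally 1 to 10)
    """
    omega_ob = alpha * omega_c   # observer bandwidth
    l1 = 2 * omega_ob            # observer gain l1 = 2 * omega_ob
    l2 = omega_ob ** 2           # observer gain l2 = omega_ob^2
    k1 = omega_c                 # controller gain k1 = omega_c
    return l1, l2, k1


# Hypothetical values chosen only for illustration.
l1, l2, k1 = adrc_gains(omega_c=100.0, alpha=4)
```

With these assumed values the observer bandwidth is four times the controller bandwidth, so changing `omega_c` alone retunes both the observer and the controller.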
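For the derivation of the cavity model and the detailed ADRC design, readers are referred to [2]. Also note that the amplitude (A) and the phase (P) are controlled directly instead of the I and Q components. The relation between AP and IQ is
$$
\begin{aligned}
A &= \sqrt{I^2 + Q^2}, \qquad I = A\cos P, \qquad Q = A\sin P, \\
P &= \begin{cases}
\tan^{-1}(Q/I) & I > 0 \\
\pi/2 & I = 0,\ Q > 0 \\
0 \text{ or undefined} & I = 0,\ Q = 0 \\
-\pi/2 & I = 0,\ Q < 0 \\
\tan^{-1}(Q/I) + \pi & I < 0,\ Q > 0 \\
\tan^{-1}(Q/I) - \pi & I < 0,\ Q \le 0
\end{cases}
\end{aligned}
. \tag{4}
$$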
Since (4) is a static relation, the amplitude and phase share the same dynamics as the I and Q components.

III. FIXED-POINT ALGORITHM DEVELOPMENT
The discrete algorithm used previously is the current discrete extended state observer (CDESO) developed in [13]. The benefit of the CDESO is that it takes advantage of the current measurement and uses it to correct the current state estimate. This benefit becomes negligible, however, when the sampling rate is very high. Since the CDESO uses the pole matching method, calculating the observer gains involves an exponential function, which poses some difficulty for a fixed-point implementation. Hence the much simpler Euler's method is used instead, as in [7]. Basically, when the sampling rate is very high the derivative can be approximated by the difference quotient, and (3) becomes
$$
\begin{aligned}
\hat{x}_1[k+1] &= \hat{x}_1[k] + T\left(\hat{x}_2[k] + \hat{b}u[k] + l_1 (y[k] - \hat{x}_1[k])\right) \\
\hat{x}_2[k+1] &= \hat{x}_2[k] + T l_2 (y[k] - \hat{x}_1[k]) \\
u[k+1] &= \frac{k_1 (r[k+1] - y[k+1]) - \hat{x}_2[k+1]}{\hat{b}}
\end{aligned}
, \tag{5}
$$
where $T$ is the sampling period; $[k+1]$ represents the current sample of the variables; and $[k]$ represents the previous sample.

A. Simplification and Optimization
Define $\omega = \omega_c T$, $x_1'[k] = \hat{x}_1[k]/\omega$, $x_2'[k] = \hat{x}_2[k]\,T/\omega^2$ and $b' = \hat{b}/\omega_c$; (5) can then be rewritten as
$$
\begin{aligned}
x_1'[k+1] &= x_1'[k] + \omega x_2'[k] + b'u[k] + 2\alpha (y[k] - \omega x_1'[k]) \\
x_2'[k+1] &= x_2'[k] + \alpha^2 (y[k] - \omega x_1'[k]) \\
u[k+1] &= \frac{r[k+1] - y[k+1] - \omega x_2'[k+1]}{b'}
\end{aligned}
. \tag{6}
$$
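The change of variables leading from (5) to (6) can be checked numerically. The sketch below, in floating point, performs one step of each form and undoes the scaling of the states; the function names and all numerical values are hypothetical and serve only to illustrate the substitution.

```python
import math


def step_eq5(x1, x2, u, y, r_next, y_next, T, b_hat, omega_c, alpha):
    """One Euler step of (5) with l1 = 2*alpha*omega_c, l2 = (alpha*omega_c)^2."""
    l1 = 2 * alpha * omega_c
    l2 = (alpha * omega_c) ** 2
    k1 = omega_c
    x1_next = x1 + T * (x2 + b_hat * u + l1 * (y - x1))
    x2_next = x2 + T * l2 * (y - x1)
    u_next = (k1 * (r_next - y_next) - x2_next) / b_hat
    return x1_next, x2_next, u_next


def step_eq6(x1p, x2p, u, y, r_next, y_next, omega, b_p, alpha):
    """One step of the rescaled form (6)."""
    err = y - omega * x1p
    x1p_next = x1p + omega * x2p + b_p * u + 2 * alpha * err
    x2p_next = x2p + alpha ** 2 * err
    u_next = (r_next - y_next - omega * x2p_next) / b_p
    return x1p_next, x2p_next, u_next


# Hypothetical parameters and signals (not from the paper).
T, b_hat, omega_c, alpha = 5e-8, 220.0, 1e5, 4
omega = omega_c * T          # omega = omega_c * T
b_p = b_hat / omega_c        # b' = b_hat / omega_c
x1, x2, u = 0.3, -50.0, 0.1
y, r_next, y_next = 0.31, 0.35, 0.32

res5 = step_eq5(x1, x2, u, y, r_next, y_next, T, b_hat, omega_c, alpha)
res6 = step_eq6(x1 / omega, x2 * T / omega ** 2, u, y, r_next, y_next,
                omega, b_p, alpha)
# Undo the scaling of the states so both results are directly comparable.
res6_unscaled = (res6[0] * omega, res6[1] * omega ** 2 / T, res6[2])
```

Up to floating-point rounding, `res5` and `res6_unscaled` agree, which is the sense in which (6) is a rewriting of (5) rather than a new algorithm.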
Division takes a very long time to compute and is normally converted to multiplication by the reciprocal. A direct count shows that (5) and (6) each need seven additions/subtractions and six multiplications. Compared to addition or subtraction, multiplication is more time consuming, so it is preferable to use as few multiplications as possible. In (6), if $\alpha$ is chosen to be an integer power of two, such as one, two, four or eight, the multiplications by $2\alpha$ and $\alpha^2$ can be accomplished easily by left shifts. Furthermore, the first equation in (6) can be simplified to
$$
x_1'[k+1] = x_1'[k] + r[k] - y[k] + 2\alpha (y[k] - \omega x_1'[k]) , \tag{7}
$$
if the actual applied system input (controller output) is exactly the value calculated from the third equation in (6). This is the case for amplitude control when the system input does not saturate, and for phase control. In phase control, the calculated system input is applied directly even though the effective system input may be different due to the wrap-around effect. For more details on the wrap-around effect in phase control, please refer to [2], [14].
If the system input does saturate in amplitude control, the $b'u[k]$ term in the first equation in (6) can still be pre-calculated as $b'u_{max}$ and $b'u_{min}$, saving one multiplication, where $u_{max}$ and $u_{min}$ are the upper and lower saturation bounds of the system input. In summary, with the above simplification and optimization, phase control needs only six additions and three multiplications, and amplitude control needs seven additions and four multiplications.
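The two simplifications discussed in this subsection can be illustrated with a short sketch: the first equation of (6) collapses to (7) when the applied input is the one computed by the third equation of (6), and the multiplications by $2\alpha$ and $\alpha^2$ reduce to left shifts when $\alpha = 2^{n_\alpha}$. Function names and numbers are hypothetical.

```python
import math


def control_eq6(r, y, x2p, omega, b_p):
    """Third equation of (6), evaluated at sample k."""
    return (r - y - omega * x2p) / b_p


def x1_update_eq6(x1p, x2p, u, y, omega, b_p, alpha):
    """First equation of (6)."""
    return x1p + omega * x2p + b_p * u + 2 * alpha * (y - omega * x1p)


def x1_update_eq7(x1p, r, y, omega, alpha):
    """Simplified update (7); no x2' and no b'u term is needed."""
    return x1p + r - y + 2 * alpha * (y - omega * x1p)


def times_two_alpha(v, n_alpha):
    """Multiply the integer v by 2*alpha, with alpha = 2**n_alpha, using a shift."""
    return v << (n_alpha + 1)


def times_alpha_squared(v, n_alpha):
    """Multiply the integer v by alpha**2, with alpha = 2**n_alpha, using a shift."""
    return v << (2 * n_alpha)


# Hypothetical values (alpha = 4, i.e. n_alpha = 2).
omega, b_p, alpha = 0.005, 0.0022, 4
x1p, x2p, r, y = 200000.0, 3.0, 1010.0, 1000.0
u = control_eq6(r, y, x2p, omega, b_p)           # unsaturated input
full = x1_update_eq6(x1p, x2p, u, y, omega, b_p, alpha)
short = x1_update_eq7(x1p, r, y, omega, alpha)
```

Here `full` and `short` agree up to rounding because $\omega x_2'[k] + b'u[k] = r[k] - y[k]$ whenever the computed input is applied unchanged.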
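B. Scaling
Fixed-point calculation is equivalent to integer calculation with the position of the radix point adjusted accordingly, so any fractional numbers need to be scaled to integers to facilitate the calculation. In (6), $y[k+1]$ comes from the ADC, hence it is normally an 8 to 24 bit integer depending on the resolution of the ADC; $r[k+1]$ is the reference for $y[k+1]$ and should have the same data type; $u[k+1]$, which is applied to the digital-to-analog converter (DAC), is also an integer; $\alpha$ has been chosen to be an integer; only $\omega$, $b'$ and $1/b'$ can be fractional numbers. Since in the calculation $b'$ is multiplied by $u_{max}$ or $u_{min}$, which is normally a very large integer, no further scaling is necessary for $b'$.

Now left shift $\omega$ by $n_\omega$ bits such that $2^{m-1} \le \overleftarrow{\omega}^{\,n_\omega} < 2^m$, and left shift $b_{inv} = 1/b'$ by $n_{binv}$ bits such that $2^{m-1} \le \overleftarrow{b_{inv}}^{\,n_{binv}} < 2^m$. Thus $\overleftarrow{\omega}^{\,n_\omega}$ and $\overleftarrow{b_{inv}}^{\,n_{binv}}$ can be represented by $m$-bit unsigned integers. The notation $\overleftarrow{p}^{\,q}$ here means that variable $p$ is left shifted by $q$ bits. After scaling, (6) and (7) become
$$
\begin{aligned}
\overleftarrow{x_1'}^{\,n_\omega}[k+1] &= \overleftarrow{x_1'}^{\,n_\omega}[k] + \overleftarrow{\omega}^{\,n_\omega} x_2'[k] + \overleftarrow{b'}^{\,n_\omega} u_{max\ or\ min}[k] \\
&\quad + \overleftarrow{y}^{\,(n_\omega + n_\alpha + 1)}[k] - \overleftarrow{\omega}^{\,(n_\omega + n_\alpha + 1)} x_1'[k] \\
\overleftarrow{x_2'}^{\,n_\omega}[k+1] &= \overleftarrow{x_2'}^{\,n_\omega}[k] + \overleftarrow{y}^{\,(n_\omega + 2n_\alpha)}[k] - \overleftarrow{\omega}^{\,(n_\omega + 2n_\alpha)} x_1'[k] \\
\overleftarrow{u}^{\,(n_\omega + n_{binv})}[k+1] &= \overleftarrow{b_{inv}}^{\,n_{binv}} \left( \overleftarrow{r}^{\,n_\omega}[k+1] - \overleftarrow{y}^{\,n_\omega}[k+1] \right) - \overleftarrow{b_{inv}}^{\,n_{binv}}\, \overleftarrow{\omega}^{\,n_\omega} x_2'[k+1]
\end{aligned}
\tag{8}
$$
and
$$
\begin{aligned}
\overleftarrow{x_1'}^{\,n_\omega}[k+1] &= \overleftarrow{x_1'}^{\,n_\omega}[k] + \overleftarrow{r}^{\,n_\omega}[k] - \overleftarrow{y}^{\,n_\omega}[k] \\
&\quad + \overleftarrow{y}^{\,(n_\omega + n_\alpha + 1)}[k] - \overleftarrow{\omega}^{\,(n_\omega + n_\alpha + 1)} x_1'[k]
\end{aligned}
, \tag{9}
$$
where $n_\alpha = \log_2 \alpha$. In (8) and (9), $y[k]$ is not multiplied by a number that is less than one, which means that no precision from the measurement is lost.

C. Parameter Resolution Analysis
The resolution for the parameters, however, will be limited. Above, $\overleftarrow{\omega}^{\,n_\omega}$ has a resolution of one, i.e.,
$$
\Delta\overleftarrow{\omega}^{\,n_\omega} = 2^{n_\omega} (\omega_c + \Delta\omega_c) T - 2^{n_\omega} \omega_c T = 1 . \tag{10}
$$
Hence,
$$
\frac{1}{2^m} < \frac{\Delta\overleftarrow{\omega}^{\,n_\omega}}{\overleftarrow{\omega}^{\,n_\omega}} \le \frac{1}{2^{m-1}} \;\Rightarrow\; \frac{\omega_c}{2^m} < \Delta\omega_c \le \frac{\omega_c}{2^{m-1}} , \tag{11}
$$
where $\Delta\omega_c$ is the resolution for the controller bandwidth. Similarly, the resolution for the system related constant $\hat{b}$ satisfies the following condition:
$$
\frac{\hat{b}}{2^m - 1} < \Delta\hat{b} \le \frac{\hat{b}}{2^{m-1} - 1} . \tag{12}
$$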
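As a rough numerical illustration of the scaling rule and of (10) and (11), the sketch below picks the shift that places a positive parameter in the range $[2^{m-1}, 2^m)$ and compares the resulting bandwidth resolution with the bounds in (11). The helper names, the word length $m = 18$, and the values of $\omega_c$, $T$ and $\hat{b}$ are assumptions made for this example only.

```python
def scale_to_m_bits(value, m):
    """Return (n, scaled): scaled = floor(value * 2**n), 2**(m-1) <= scaled < 2**m.

    A negative n would correspond to a right shift.
    """
    if value <= 0:
        raise ValueError("value must be positive")
    n = 0
    while value * 2.0 ** n < 2 ** (m - 1):
        n += 1
    while value * 2.0 ** n >= 2 ** m:
        n -= 1
    return n, int(value * 2.0 ** n)


def omega_c_resolution(n_omega, T):
    """Step in omega_c corresponding to one LSB of the scaled omega, from (10)."""
    return 1.0 / (2.0 ** n_omega * T)


def omega_c_resolution_bounds(omega_c, m):
    """Lower (exclusive) and upper (inclusive) bounds on the step, from (11)."""
    return omega_c / 2 ** m, omega_c / 2 ** (m - 1)


# Hypothetical numbers: 18-bit parameters, 20 MHz sampling.
m, T, omega_c, b_hat = 18, 5e-8, 1e5, 220.0
n_omega, omega_scaled = scale_to_m_bits(omega_c * T, m)      # omega = omega_c * T
n_binv, binv_scaled = scale_to_m_bits(omega_c / b_hat, m)    # binv = 1/b' = omega_c / b_hat
step = omega_c_resolution(n_omega, T)
lower, upper = omega_c_resolution_bounds(omega_c, m)
```

For these assumed values the shift is 25 bits for $\omega$ and 9 bits for $b_{inv}$, and the bandwidth step of roughly 0.6 rad/s lies between the two bounds of (11).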
It is obvious from (11) and (12) that to get a higher resolution for the parameters, $m$ needs to be increased, which means that more resources are needed. The choice between high resolution and low resource utilization is a design tradeoff.

IV. IMPLEMENTATION AND VERIFICATION
There are two versions of the LLRF controller. The one for the ReA3 LINAC uses the Xilinx Spartan 3E FPGA (XC3S500E-4), which has 20 built-in 18-by-18 multipliers (primitive MULT18X18SIO) [15]. The newer version for the FRIB LINAC uses the Xilinx Spartan 6 FPGA (XC6SLX150T-3), which has 180 DSP48A1 slices (configurable as 18-by-18 multipliers) [16]. Since the Spartan 3E FPGA has a limited number of multipliers, two different procedures, introduced below, are developed to meet different needs.

A. 4-Cycle Procedure
This procedure tries to put everything in parallel to minimize the latency. It takes four clock cycles to finish all the calculations. Hence the maximum sampling rate will be one fourth of the maximum clock rate. The detailed steps are as follows.
Cycle 1: a) Update the reference and measurement signals; b) determine the saturation condition (not necessary for phase control) and apply the control signal; c) calculate the error $e[k+1] = r[k+1] - y[k+1]$; d) update the part of $\overleftarrow{x_1'}^{\,n_\omega}[k+1]$ which is independent of the saturation condition; e) update $\overleftarrow{x_2'}^{\,n_\omega}[k+1]$;
Cycle 2: Update the rest of $\overleftarrow{x_1'}^{\,n_\omega}[k+1]$ based on the saturation condition (not necessary for phase control);
Cycle 3: a) Save $e[k+1]$ for later use; b) calculate $\overleftarrow{\omega}^{\,n_\omega}\, \overleftarrow{x_1'}^{\,n_\omega}[k+1]$; c) calculate $\overleftarrow{\omega}^{\,n_\omega}\, \overleftarrow{x_2'}^{\,n_\omega}[k+1]$ (not necessary for phase control); d) calculate $\overleftarrow{b_{inv}}^{\,n_{binv}}\, \overleftarrow{\omega}^{\,n_\omega}\, \overleftarrow{x_2'}^{\,n_\omega}[k+1]$; e) calculate $\overleftarrow{b_{inv}}^{\,n_{binv}} e[k+1]$;
Cycle 4: Calculate the control signal $u[k+1]$.
The problem with the 4-cycle procedure is the resource utilization. Though it suits the Spartan 6 chip very well, it will take almost all of the multipliers available in the Spartan 3E chip to implement both amplitude and phase control. The 8-cycle procedure is then developed to resolve this problem.
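The arithmetic scheduled in the cycles above can be sketched in software for the phase loop, where (9) applies and no saturation handling is needed. This is only a minimal integer model under assumed parameters, not the authors' Verilog: the constants `N_W`, `W`, `N_B`, `BINV` and `N_ALPHA`, the function names, and the signal values are all hypothetical. A floating-point reference using the same quantized parameters is included for comparison.

```python
# Hypothetical scaled parameters (alpha = 4, 18-bit omega and binv).
N_W = 25        # n_omega: shift applied to omega = omega_c * T (about 0.005)
W = 167772      # omega left shifted by N_W bits, stored as an integer
N_B = 9         # n_binv: shift applied to binv = 1 / b' (about 454.5)
BINV = 232727   # binv left shifted by N_B bits, stored as an integer
N_ALPHA = 2     # alpha = 2**N_ALPHA


def phase_step_fixed(X1, X2, r_k, y_k, r_next, y_next):
    """One sample of (9) and the last two equations of (8) using integers only.

    X1 and X2 hold x1' and x2' left shifted by N_W bits.
    """
    wx1 = (W * X1) >> N_W                        # (omega << N_W) * x1'[k]
    X1_next = (X1 + ((r_k - y_k) << N_W)
               + (y_k << (N_W + N_ALPHA + 1)) - (wx1 << (N_ALPHA + 1)))
    X2_next = X2 + (y_k << (N_W + 2 * N_ALPHA)) - (wx1 << (2 * N_ALPHA))
    wx2 = (W * X2_next) >> N_W                   # (omega << N_W) * x2'[k+1]
    e = r_next - y_next                          # error e[k+1]
    u_scaled = BINV * (e << N_W) - BINV * wx2    # u shifted by N_W + N_B bits
    return X1_next, X2_next, u_scaled >> (N_W + N_B)


def phase_step_float(x1p, x2p, r_k, y_k, r_next, y_next):
    """Floating-point (7) and (6) with the same quantized omega and binv."""
    omega, binv, alpha = W / 2 ** N_W, BINV / 2 ** N_B, 2 ** N_ALPHA
    err = y_k - omega * x1p
    x1p_next = x1p + r_k - y_k + 2 * alpha * err
    x2p_next = x2p + alpha ** 2 * err
    return x1p_next, x2p_next, binv * (r_next - y_next - omega * x2p_next)


# Hypothetical integer ADC/reference samples and observer states.
X1, X2 = 200000 << N_W, 3 << N_W
fixed = phase_step_fixed(X1, X2, 1010, 1000, 1010, 1001)
ref = phase_step_float(X1 / 2 ** N_W, X2 / 2 ** N_W, 1010, 1000, 1010, 1001)
```

In this model the observer states match the floating-point reference almost exactly, and the integer control output differs from the reference by less than one count, which comes from the final right shift.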
B. 8-Cycle Procedure
This procedure reduces the multiplier usage by adopting the multiplexing technique. For each control loop only three multipliers are needed in this procedure.
Cycle 1: a) Update the reference and measurement signals; b) determine the saturation condition (not necessary for phase control) and apply the control signal; c) calculate the error $e[k+1] = r[k+1] - y[k+1]$; d) update the part of $\overleftarrow{x_1'}^{\,n_\omega}[k+1]$ which is independent of the saturation condition; e) update $\overleftarrow{x_2'}^{\,n_\omega}[k+1]$;
Cycle 2: Update the rest of $\overleftarrow{x_1'}^{\,n_\omega}[k+1]$ based on the saturation condition (not necessary for phase control);
Cycle 3: Calculate $\overleftarrow{\omega}^{\,n_\omega}\, \overleftarrow{x_1'}^{\,n_\omega}[k+1]$;
Cycle 4: Calculate $\overleftarrow{\omega}^{\,n_\omega}\, \overleftarrow{x_2'}^{\,n_\omega}[k+1]$ (not necessary for phase control);
Cycle 5: Calculate $\overleftarrow{b_{inv}}^{\,n_{binv}}\, \overleftarrow{\omega}^{\,n_\omega}\, \overleftarrow{x_2'}^{\,n_\omega}[k+1]$;
Cycle 6: Calculate $\overleftarrow{b_{inv}}^{\,n_{binv}} e[k+1]$;
Cycle 7: Save $e[k+1]$ for later use;
Cycle 8: Calculate the control signal $u[k+1]$.
The 8-cycle procedure reduces the resource utilization at the cost of increased latency, which leads to a decreased maximum sampling rate. It is more suitable for the Spartan 3E chip.

C. Performance and Comparison
The resource utilization and achieved performance of the proposed fixed-point implementation are summarized in Table I and Table II. The coding is done in Verilog. Each module takes around 100 lines of code. Compared to the floating-point implementation in [8], which takes 27 pages of code, the coding effort is trivial. Also, in [8] the algorithm is finished in 48 clock cycles, while here only 4 or 8 clock cycles
Amplitude 4-cycle 328 7 11 55 74 18.5
Procedure # of Logic Cell Used % of Utilization # of Multiplier a Used % of Utilization Max Clock Rate (MHz) Max Sample Rate (MHz)
Phase
8-cycle 380 8 3 15 74 9.25
4-cycle 249 5 8 40 77 19.25
a. MULT18X18SIO
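The relation between clock rate, cycles per sample and sampling rate stated for the two procedures, together with the one-twelfth rule for the SEL frequency offset from the Introduction, can be written as a short calculation. The function names below are illustrative assumptions; the clock rates passed in are the ones listed in Table I.

```python
def max_sample_rate(clock_rate, cycles_per_sample):
    """Maximum sampling rate when one sample takes a fixed number of clock cycles."""
    return clock_rate / cycles_per_sample


def max_sel_frequency_offset(sample_rate, max_phase_step_deg=30.0):
    """Largest SEL frequency offset for a given per-sample phase step limit."""
    return sample_rate * max_phase_step_deg / 360.0


# Rates in MHz for the FPGA procedures, in kHz for the DSP case.
rate_4_cycle = max_sample_rate(74.0, 4)        # 4-cycle procedure at a 74 MHz clock
rate_8_cycle = max_sample_rate(74.0, 8)        # 8-cycle procedure at a 74 MHz clock
dsp_offset_khz = max_sel_frequency_offset(50.0)  # 50 kHz DSP sampling rate
```

The first two results reproduce the 18.5 MHz and 9.25 MHz entries of Table I, and the last one gives the roughly 4 kHz offset limit quoted for the DSP implementation.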
TABLE II. PERFORMANCE ON XILINX XC6SLX150T-3 CHIP Amplitude
Controlled Variable
4-cycle 330