IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 57, NO. 4, AUGUST 2010
CCD Base Line Subtraction Algorithms
Ivan V. Kotov, Alexandra I. Kotov, James Frank, Paul O’Connor, Victor Perevoztchikov, and Peter Takacs
Abstract—High statistics astronomical surveys require photometric accuracy at the level of a few percent. The accuracy of sensor calibration procedures should match this goal. The first step in calibration procedures is the base line subtraction. The accuracy and robustness of different base line subtraction techniques used for Charge Coupled Device (CCD) sensors are discussed. A specialized algorithm for base line subtraction in CCD images containing sparse signals was developed. This algorithm does not require taking separate bias exposures. Statistical properties of the algorithm and the algorithm performance on $^{55}$Fe data are presented. This algorithm is compared with the bias exposures approach. Details of the bias exposure analysis are also discussed.
Index Terms—Calibration, charge coupled devices, error analysis.
Manuscript received December 18, 2009; revised March 22, 2010; accepted April 22, 2010. Date of publication June 28, 2010; date of current version August 18, 2010. This work was supported by the Department of Energy under Contract DE-AC02-76SF00515 with the Stanford Linear Accelerator Center, Contract DE-AC02-98CH10886 with Brookhaven National Laboratory, and Contract W-7405-ENG-48 with Lawrence Livermore National Laboratory. Additional support was provided by private donations, grants to universities, and in-kind support at Department of Energy laboratories and other LSSTC institutional members.
The authors are with the Brookhaven National Laboratory, Upton, NY 11973 USA (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TNS.2010.2049660

I. INTRODUCTION
The usual approach to base line subtraction in CCD images is to use an overscan region. The overscan region is formed by readings of the CCD readout node, ideally without shifting any charge into it. Thus, a pure base line is recorded. Usually, a number of such readings is added to the image at the end of each row and/or column. The subtraction of the average overscan amplitude from the image active area removes the base line.

The base level subtraction using the overscan region is an accurate and “low cost” technique. The additional error introduced by the application of this technique can be made as small as desired by increasing the number of overscan readings. The time for image recording increases only by the amount of time needed to read out the overscan pixels, which is usually small in comparison to the exposure time plus the readout time of the active part of the row/column. But this technique does not work well when images have fixed pattern features.

The base line level for each pixel can be measured directly by recording separate bias exposures, hereafter referred to as the “measured base line method”. With enough averaging, stable electronic offsets can be measured and subsequently subtracted without introducing additional noise. The drawback of this approach is the measurement time spent on bias exposures. The “cost” of a bias image is the same as the “cost” of a regular image during CCD testing (it could be less in normal operation on a telescope).

Quite often CCD images contain sparse signals. In such situations, the base line is recorded in many pixels of the image itself. An algorithm was developed that uses data from the same image to generate a base line subtraction. The idea is to use pixels containing no signal to calculate the local base line level and subtract this from the signal pixels.

All these algorithms were applied to data sets obtained during LSST sensor characterization [1], and results are presented below. More discussion of base line subtraction can be found, for example, in [2], and an interpolation method is outlined in [3].

II. BIAS EXPOSURE ANALYSIS DETAILS
The base line can be calculated for each pixel simply by averaging over all available bias frames. The statistical error in the average decreases with the number of frames, $N$, as $1/\sqrt{N}$. However, the bias frame approach is complicated by the presence of background signals, and it is usually impossible to eliminate the background completely. Particle events from cosmic rays and natural radioactivity are typical examples (collectively referred to as cosmic rays). For pixels that contain these signals, a simple average is biased toward higher values. To eliminate this bias, cosmic ray signals should be rejected.

The truncated mean approach effectively removes high amplitudes caused by cosmic ray events and requires only a single pass over the data. Other robust algorithms such as median filtering and n-sigma outlier rejection are implemented in the IRAF “ccdproc” package [4] and their properties are discussed, for example, in [5], [6]. More specialized algorithms and discussions can be found in [7]. These algorithms either require multiple data passes (for example, the n-sigma outlier rejection algorithm) or are less accurate than the mean estimate (for example, the accuracy of median filtering is lower than that of the mean [5]).

In our application of the truncated average, the minimum and maximum amplitudes for each pixel are removed from the averaging. This estimate is robust and unbiased in the case of a symmetrical noise distribution (the case for pixels unaffected by cosmic rays). Truncation, however, causes a small bias in pixels affected by cosmic ray events. The maximum amplitude, produced by a cosmic ray, is rejected correctly, but removing the minimum amplitude adds a bias. This bias can be estimated using the probability density function of the minimum amplitude,

$$f_{\min}(x) = N f(x)\,[1 - F(x)]^{N-1} \qquad (1)$$

where $f(x)$ is the amplitude probability density function, $F(x)$ is the cumulative distribution function, and $N$ is the number of amplitudes in the sample. An example of a similar density function can be found in [5], and more examples are in references therein.

The bias values for different $N$ were computed numerically for a normal noise distribution with r.m.s. $\sigma$. The bias value depends on $N$ as $k\sigma/N$, where $k$ is a proportionality factor that is a weak function of $N$. For $N$ in the 10–200 range, $k$ changes from 2 to 3 correspondingly. The bias value quickly becomes insignificant as $N$ increases. This bias shift is always smaller than the accuracy of the truncated average, $\sigma/\sqrt{N}$. For our data sample $N = 20$ and this shift is approximately $0.1\sigma$.

Twenty bias frames were acquired for the bias data set. Properties of this data set and results obtained with overscan subtraction and with measured base line subtraction are shown in Figs. 1 and 2. In this data set the base line has a gradient at the beginning of each row. It starts from a high point in the first pixel of the row, gradually decreases, and reaches a constant level at pixel 7. Higher amplitude peaks (one peak per column) in Fig. 1(a) correspond to these electronic offsets. To illustrate the spatial positions of these features, one of the bias images is shown in Fig. 2.

Fig. 1. Bias frames analysis. Amplitude distributions shown on panels are: (a) active pixel average amplitudes; (b) overscan pixel average amplitudes; (c) active pixel amplitudes after overscan amplitude subtraction; (d) active pixel amplitudes after simple average subtraction; (e) active pixel amplitudes after truncated average subtraction.

Fig. 2. The 110 × 40 portion of one bias image. (a) Raw image shows the gradient at the beginning of each row; (b) same image processed using the base line subtraction algorithm. A cosmic ray hit is present at x ≈ 100, y ≈ 360.

0018-9499/$26.00 © 2010 IEEE
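As an illustration of the truncated average and of the bias estimate based on (1), a minimal Python/NumPy sketch is given below. It is not the authors' code. The function names, the normal noise model, the integration grid, and the assumption that exactly one frame per affected pixel contains a cosmic ray (so that the rejected minimum is the smallest of the remaining $N-1$ noise samples) are illustrative choices made here.

```python
import math
import numpy as np

def truncated_average(frames):
    """Per-pixel mean over a stack of bias frames, shape (N, rows, cols),
    with the single minimum and single maximum of each pixel excluded.
    Uses running sum, min and max only, so one pass over the data suffices."""
    frames = np.asarray(frames, dtype=float)
    n = frames.shape[0]
    if n < 3:
        raise ValueError("need at least 3 frames")
    total = frames.sum(axis=0)
    return (total - frames.min(axis=0) - frames.max(axis=0)) / (n - 2)

def expected_minimum(n, sigma=1.0, grid=20001, span=10.0):
    """E[min] of n independent normal(0, sigma) amplitudes, obtained by
    integrating x * f_min(x) with f_min(x) = n f(x) [1 - F(x)]^(n-1), eq. (1)."""
    x = np.linspace(-span * sigma, span * sigma, grid)
    dx = x[1] - x[0]
    pdf = np.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    erf = np.vectorize(math.erf)
    cdf = 0.5 * (1.0 + erf(x / (sigma * math.sqrt(2.0))))
    f_min = n * pdf * (1.0 - cdf) ** (n - 1)
    return float(np.sum(x * f_min) * dx)   # simple Riemann sum, tails are ~0

def truncation_bias(n, sigma=1.0):
    """Assumed model: one of the n amplitudes is a cosmic ray (rejected as the
    maximum); the minimum of the other n-1 is rejected too, and the remaining
    n-2 amplitudes are averaged. Returns the resulting upward shift."""
    return -expected_minimum(n - 1, sigma) / (n - 2)
```

Under these assumptions `truncation_bias(20)` comes out at roughly $0.1\sigma$, and the product `n * truncation_bias(n)` grows slowly from about 2 to about 3 between `n = 10` and `n = 200`.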
There are 16 overscan pixels at the end of each row. For every row, the overscan average was calculated and subtracted from each pixel of that row. Fig. 1(c) shows the amplitude distribution of all the pixels contained in the bias data set after subtraction of the average amplitude computed in the overscan region. As expected, the overscan method of base line subtraction is not very effective, as shown in this plot. A more sophisticated overscan fitting method [4] was not used because a time-consuming stability and robustness study would be required.

Fig. 1(d) shows the amplitude distribution of all the pixels after subtraction of the pixel average amplitude. The error in the pixel average amplitude is $\sqrt{N}$ times less than the read out noise. The measured base line subtraction with a simple average works better than the overscan subtraction, but there is still a tail to the left of the true bias value, as shown in Fig. 1(d). This tail is a reflection of the upward shift of the base level for pixels affected by cosmic ray events. This tail disappears in the truncated average approach, as can be seen in Fig. 1(e).

The fraction of pixels affected by cosmic ray hits depends on exposure time. For 6 sec exposures this fraction is […]. Truncation removes 10% of the base line data sample in our case. This data “loss” hardly affects the statistical accuracy of the base line values. The width of the “zero” peak after either pixel average subtraction or pixel truncated average subtraction does not change (within fitting errors, 0.1%).

III. ALGORITHM FOR IMAGES WITH SPARSE SIGNALS
A specialized base line subtraction algorithm was developed for analysis of CCD images containing sparse signals. The algorithm can best be described by separating it into two procedural steps:
1) subtract the row average from each element of the row;
2) subtract the column average from each element of the column.

For convenience, the overscan signal is subtracted first to get rid of an overall constant. Using a matrix representation of the image data, with the first index being the row index and the second index being the column index, the result of the algorithm application can be written as

$$S_{ij} = A_{ij} - \frac{1}{N_c}\sum_{k=1}^{N_c} A_{ik} - \frac{1}{N_r}\sum_{l=1}^{N_r} A_{lj} + \frac{1}{N_r N_c}\sum_{l=1}^{N_r}\sum_{k=1}^{N_c} A_{lk} \qquad (2)$$

where $S_{ij}$ is the resulting bias subtracted amplitude, $A_{ij}$ is the raw amplitude, $N_r$ is the number of rows in the image, and $N_c$ is the number of columns.

This algorithm exactly subtracts a very broad class of base “surfaces”. Let the base line vary across the image as an arbitrary function of either X or Y coordinates, or as a linear combination of both functions, $B_{ij} = \alpha f(x_j) + \beta g(y_i)$. Then the recorded pixel amplitude is $A_{ij} = \alpha f(x_j) + \beta g(y_i) + \varepsilon_{ij}$, where $\varepsilon_{ij}$ represents the random variation of the base line, i.e., noise. Application of the algorithm, i.e., plugging this raw amplitude into (2), produces the following result

$$S_{ij} = \varepsilon_{ij} - \frac{1}{N_c}\sum_{k=1}^{N_c} \varepsilon_{ik} - \frac{1}{N_r}\sum_{l=1}^{N_r} \varepsilon_{lj} + \frac{1}{N_r N_c}\sum_{l=1}^{N_r}\sum_{k=1}^{N_c} \varepsilon_{lk} \qquad (3)$$

The base “surface” is completely subtracted out.

Next we calculate the error matrix of the noise modified by the algorithm application. We assume the usual properties of the input noise $\varepsilon_{ij}$: zero mean, $\langle \varepsilon_{ij} \rangle = 0$, the same magnitude $\sigma$ for each pixel, and independence from other pixels. The input noise covariance matrix can then be written as
$$\langle \varepsilon_{ij}\,\varepsilon_{kl} \rangle = \sigma^2\,\delta_{ik}\,\delta_{jl} \qquad (4)$$

where $\delta_{ik}$ is the Kronecker delta. It is easy to see that the mathematical expectation of the bias subtracted noise is zero, $\langle S_{ij} \rangle = 0$. The noise covariance matrix after algorithm application becomes

$$\langle S_{ij}\,S_{kl} \rangle = \sigma^2 \left(\delta_{ik} - \frac{1}{N_r}\right)\left(\delta_{jl} - \frac{1}{N_c}\right) \qquad (5)$$

for an image with $N_r$ rows and $N_c$ columns. The pixel variance is practically the same as in the original image, reduced only by the negligible factor $(1 - 1/N_r)(1 - 1/N_c)$. Very small correlation is introduced within the rows and columns.

Pixels containing signals should be eliminated from the averaging over rows and columns, but the calculated averages should be subtracted from all the pixels. The usual way to decide if there is a signal in a pixel is to compare the pixel amplitude with some threshold value above the base line, typically a multiple of the noise $\sigma$. The decision accuracy improves with more accurate base line subtraction. In turn, base line calculations improve with more accurate signal removal. This leads to an iterative procedure with multiple passes. On our data sets the convergence is fast: no more than 2–4 iterations are needed. The result of the algorithm application to the image with a cosmic hit is shown in Fig. 2(b). The performance of the algorithm is demonstrated below on a data set acquired for absolute gain determination.

Fig. 3. X-ray amplitude and cluster size distributions for: (a) one pixel clusters; (b) two pixel clusters; (c) three pixel clusters; (d) four pixel clusters; (e) five or more pixel clusters; (f) cluster size. Raw data have been processed using the sparse signal base line subtraction algorithm.

IV. ALGORITHM PERFORMANCE ON DATA AND COMPARISON WITH BASE LINE MEASUREMENT METHOD
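The iterative procedure of Section III (row and column average subtraction, with signal pixels excluded from the averages but still corrected) can be sketched in Python/NumPy as follows. This is a hedged illustration rather than the authors' implementation: the function name, the default threshold of five noise r.m.s. (`n_sigma`), the use of the plain standard deviation of the currently unmasked pixels as the noise estimate, and the stopping rule (mask unchanged or `max_iter` reached) are all assumptions made here. It also assumes signals are positive and sparse enough that every row and column keeps some signal-free pixels.

```python
import numpy as np

def subtract_baseline(image, n_sigma=5.0, max_iter=4):
    """Row/column average subtraction with signal pixels masked out of the
    averages. Returns the base line subtracted image (same shape as input)."""
    raw = np.asarray(image, dtype=float)
    mask = np.ones(raw.shape, dtype=bool)      # True = treated as signal-free
    result = raw
    for _ in range(max_iter):
        # Step 1: row averages over signal-free pixels, subtracted from all pixels.
        row_avg = np.nanmean(np.where(mask, raw, np.nan), axis=1, keepdims=True)
        step1 = raw - row_avg
        # Step 2: column averages of the row-corrected image, same masking.
        col_avg = np.nanmean(np.where(mask, step1, np.nan), axis=0, keepdims=True)
        result = step1 - col_avg
        # Re-decide which pixels carry signal, using the improved base line.
        sigma = result[mask].std()
        new_mask = result <= n_sigma * sigma
        if np.array_equal(new_mask, mask):
            break                               # signal selection has converged
        mask = new_mask
    return result
```

With every pixel unmasked (a single pass), the function reduces to the double centering of (2), so any base line of the form $\alpha f(x_j) + \beta g(y_i)$ is removed to rounding precision.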
The $^{55}$Fe $K_\alpha$ and $K_\beta$ lines provide for absolute gain calibration of the entire electronics chain in a very straightforward way. X-rays absorbed in silicon generate electrons in amounts proportional to the energy. For example, $K_\alpha$ X-rays produce 1620 electrons. The conversion gain, in e-/a.d.u., can be determined from the $K_\alpha$ peak position. The accuracy of the base line subtraction directly affects the gain value. The relative gain error for base line subtraction accuracy $\Delta$ and $K_\alpha$ peak position $A$ is $\Delta / A$.

The $^{55}$Fe data were obtained by multiple CCD exposures to a $^{55}$Fe source. The source swings over the CCD surface on a motorized arm mounted inside the cryostat. The swing time was reduced to 6 sec to minimize pile up of X-ray clusters. The $^{55}$Fe data were accompanied by a series of bias exposures. This data set allows for direct comparison of the base line subtraction algorithm with the base line measurement method.

The X-ray spectra obtained using these two base line subtraction methods are shown in Figs. 3 and 4. The results in Fig. 3 are obtained with the base line algorithm described in Section III. Fig. 4 shows results obtained with the base line measurement method. Spectra are shown for different cluster sizes. The cluster size is the number of pixels in a 3 × 3 search box above threshold. The search box is centered on the pixel with the maximum amplitude. The distribution of pixels among different cluster sizes is shown in panel (f) of each figure.
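The cluster definition and the gain relations above can be illustrated with a short Python/NumPy sketch. It is only a sketch under stated assumptions: the function names are hypothetical, the cluster amplitude is taken here as the sum of the above-threshold pixels in the 3 × 3 box (the text does not specify how the amplitude is formed), and tied maxima within one box are each reported as a cluster. The value of 1620 electrons per $K_\alpha$ X-ray is the one quoted in the text.

```python
import numpy as np

KALPHA_ELECTRONS = 1620.0   # electrons per K-alpha X-ray, as quoted in the text

def find_clusters(image, threshold):
    """Return a list of (row, col, size, amplitude) tuples, one per cluster.
    A cluster is a 3x3 box centred on a pixel that is above threshold and is
    the maximum of its box; size counts the box pixels above threshold."""
    img = np.asarray(image, dtype=float)
    # Pad with -inf so boxes at the image edge need no special handling.
    padded = np.pad(img, 1, mode="constant", constant_values=-np.inf)
    clusters = []
    for r, c in zip(*np.nonzero(img > threshold)):
        box = padded[r:r + 3, c:c + 3]          # 3x3 box centred on (r, c)
        if img[r, c] < box.max():
            continue                             # a neighbour is larger
        above = box > threshold
        # Assumption: amplitude = sum of the above-threshold pixels in the box.
        clusters.append((int(r), int(c), int(above.sum()), float(box[above].sum())))
    return clusters

def conversion_gain(peak_adu):
    """Gain in e-/a.d.u. from the fitted K-alpha peak position in a.d.u."""
    return KALPHA_ELECTRONS / peak_adu

def relative_gain_error(baseline_error_adu, peak_adu):
    """Relative gain error caused by a base line error, Delta / A."""
    return baseline_error_adu / peak_adu
```

For example, a base line error of 2 a.d.u. on a peak at 400 a.d.u. corresponds to a relative gain error of 0.5%.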
Fig. 4. X-ray amplitude and cluster size distributions produced using the measured base line subtraction method. Figure layout is the same as in Fig. 3.

Fig. 5. Amplitude distributions at the base line subtraction steps: (a) raw data; (b) after overscan subtraction; (c) after column average subtraction; (d) after row average subtraction.

TABLE I FITTED $K_\alpha$ PEAK POSITION

TABLE II ALGORITHM CONVERGENCE SPEED
Solid lines show results of the $K_\alpha$, $K_\beta$ line fit. Values of the $K_\alpha$ and $K_\beta$ energies, the Fano factor, and the average energy needed to create an e-/hole pair were fixed parameters. Fit parameters are: the number of detected X-rays for each line, the r.m.s. read-out noise, and the $K_\alpha$ peak position. The background under the $K_\alpha$, $K_\beta$ peaks is on the level of 1% or less, depending on the number of pixels in the cluster. Background was not included in the fit model. Results of the fit to the $K_\alpha$ peak position and the statistical errors are shown in Table I. Systematic errors associated with the cluster finding method and background presence are not included.

The X-ray amplitude and cluster size distributions obtained using the measured base line subtraction method are very similar to Fig. 3. The difference in $K_\alpha$ peak position between these two methods is less than 0.2%, in good agreement with the fit errors. Thus both methods produce statistically identical results.

The algorithm convergence is demonstrated in Fig. 5. Fig. 5(a) shows the raw data. The overscan subtracted data are shown in Fig. 5(b). Other plots show the pixel amplitude distribution for one iteration of the column and row average amplitude subtraction. The procedure convergence speed can be assessed using the mean and sigma of the amplitude distribution for pixels without signals. Peaks on the left side of all plots correspond to such pixels. These peaks are fit by a Gaussian function. The results are summarized in Table II. On our data set, changes in mean and sigma values become smaller than fit errors after the second iteration. So, two iterations are enough to subtract the base line in this case. It is also interesting to note that, in this particular case, the overscan subtraction applied alone would cause a 0.9% error in the gain value even in the “flat” area of the image.

V. SUMMARY
Three base line subtraction approaches have been used on a calibration data set. The overscan subtraction method does not work well on our data set because of the presence of fixed pattern features and a systematic shift of the average overscan amplitude relative to the active pixel base line. Both the base line measurement approach and the algorithm for images with sparse signals produce very similar results. The algorithm does not require taking bias exposures and thus can be used when bias exposures are not available, or when the bias pattern is changing over time and bias exposures would not provide a reliable base line subtraction. Other advantages of the algorithm over the base line measurement approach are measurement time reduction and raw data reduction.
This algorithm can also be used as a spatial high pass filter. For example, illumination non-uniformities can be largely removed from flat field images [8] processed using the algorithm. When base line measurements are required, the truncated average of bias frames provides better results than the simple average.

ACKNOWLEDGMENT
The authors are grateful to Veljko Radeka for support of this work and to Roger M. Smith for critical reading of the manuscript and useful suggestions.

REFERENCES
[1] V. Radeka et al., “LSST sensor requirements and characterization of the prototype LSST CCDs,” J. Instrum., vol. 4, p. P03002, 2009.
[2] S. B. Howell, Handbook of CCD Astronomy. Cambridge, U.K.: Cambridge Univ. Press, 2000.
[3] E. Bertin, SExtractor v2.5 User’s Manual [Online]. Available: http://terapix.iap.fr/IMG/pdf/sextractor.pdf
[4] P. Massey, A User’s Guide to CCD Reductions with IRAF, Feb. 1997 [Online]. Available: http://iraf.noao.edu/docs/photom.html
[5] B. I. Justusson, “Median filtering: Statistical properties,” in Topics in Applied Physics: Two-Dimensional Digital Signal Processing II. Berlin/Heidelberg, Germany: Springer, 1981, vol. 43.
[6] V. J. Hodge and J. Austin, “A survey of outlier detection methodologies,” in Computer Science: Artificial Intelligence Review. Houten, Netherlands: Springer, 2004, vol. 22, pp. 1573–7462.
[7] M. T. Schuster, M. Marengo, and B. M. Patten, “IRACproc: A software suite for processing and analyzing Spitzer/IRAC data,” in Proc. SPIE, 2006, vol. 6270E, p. 65S.
[8] I. V. Kotov, A. I. Kotov, J. Frank, P. O’Connor, V. Radeka, and P. Takacs, “Study of pixel area variations in fully depleted thick CCD,” in Proc. SPIE Symp. Astronomical Telescopes and Instrumentation: Observational Frontiers of Astronomy for the New Decade, San Diego, CA, Jun./Jul. 2010, paper 7742–6.