PUBLICATIONS Journal of Geophysical Research: Atmospheres RESEARCH ARTICLE 10.1002/2014JD021455 Key Points: • A cloud mask algorithm is developed for GOES imager radiance assimilation • Local variability of cloud can affect detection accuracy • An optimal set of the threshold determination is described
Correspondence to: X. Zou,
[email protected]
Citation: Zou, X., and C. Da (2014), An objective regional cloud mask algorithm for GOES infrared imager radiance assimilation, J. Geophys. Res. Atmos., 119, 6666–6680, doi:10.1002/2014JD021455. Received 2 JAN 2014 Accepted 19 MAY 2014 Accepted article online 23 MAY 2014 Published online 9 JUN 2014
An objective regional cloud mask algorithm for GOES infrared imager radiance assimilation
Xiaolei Zou¹ and Cheng Da¹
¹Department of Earth, Ocean and Atmospheric Sciences, Florida State University, Tallahassee, Florida, USA
Abstract
A local, regime-dependent cloud mask (CM) algorithm is developed for isolating cloud-free pixels from cloudy pixels for Geostationary Operational Environmental Satellite (GOES) imager radiance assimilation using mesoscale forecast models. In this CM algorithm, thresholds for six different CM tests are determined by a one-dimensional optimization approach based on probability distribution functions of the nearby cloudy and clear-sky pixels within a 10° × 10° box centered at a target pixel. It is shown that the optimal thresholds over land are, in general, larger and display more spatial variations than over ocean. The performance of the proposed CM algorithm is compared with Moderate Resolution Imaging Spectroradiometer (MODIS) CM for a 1 week period from 19 to 23 May 2008. Based on MODIS CM results, the average Probability of Correct Typing reaches 92.94% and 91.50% over land and ocean, respectively.
1. Introduction A Geostationary Operational Environmental Satellite (GOES) provides nearly time continuous evolution of weather phenomena over the instrument’s full-disk domains with high horizontal resolution. Direct assimilation of imager radiance measurements from the GOES in numerical weather prediction (NWP) models happened much later than the assimilation of radiances from polar-orbiting satellites. Preliminary numerical experiments on the assimilation of the geostationary radiances were conducted by Köpken et al. [2004] for the Meteosat Visible and InfraRed Imager onboard Meteosat-7 and by Szyndel et al. [2005] and Stengel et al. [2009] for the Spinning Enhanced Visible and InfraRed Imager (SEVIRI) onboard Meteosat-8. The study of GOES imager radiance data in global data assimilation by Su et al. [2003] showed neutral or slightly degraded impact on the performance of the forecast skills. Recently, Zou et al. [2011] and Qin et al. [2013] investigated the benefit of directly assimilating GOES radiance data from GOES-11 and GOES-12 for improving Gulf coast quantitative precipitation forecasts using the National Centers for Environmental Prediction (NCEP) Gridpoint Statistical Interpolation (GSI) system. In these studies, GOES data were thinned to a resolution that is much coarser than its original observation resolution, and only the cloud-free radiance data were assimilated. In this study, a local, regime-dependent cloud mask (CM) algorithm is developed for maximizing the usage of cloud-free pixels from GOES imager instruments in mesoscale forecast models. In GOES imager radiance assimilation, cloudy pixels must be detected reliably so that the radiance can be uniquely related to atmospheric temperature, water vapor, and surface temperature and emissivity. The proposed CM algorithm is thus intended for removing radiance observations contaminated by clouds in the Quality Control (QC) procedure before imager radiance assimilation. 
In the NCEP GSI system, quality control for imager radiance assimilation must be based on the imager radiance observations input to the system, i.e., GOES imager channels 2–4 and 6. It therefore differs from retrieving cloud properties, which could require multisensor measurements [Huang et al., 2005, 2006]. Since clouds have distinct effects on atmospheric radiation at various infrared wavelengths, the spectral differences between two channels and/or spatial differences among neighboring points are often utilized for cloud detection. Research on deriving an accurate CM from geostationary and polar-orbiting satellite data started with the launch of the first Earth observing satellite, TIROS-1, in 1960, and it remains a challenging problem. The performance of various cloud detection algorithms varies regionally and with time of day [Merchant et al., 2005; Reuter et al., 2009]. A single CM test based on either spectral or spatial differences may suit some situations but not others because of variations in surface emissivity and boundary layer temperature and limitations of sensor spatial resolution.
ZOU AND DA
©2014. American Geophysical Union. All Rights Reserved.
Different CM algorithms were developed for advanced very high resolution radiometer (AVHRR) [Stowe et al., 1999], Moderate Resolution Imaging Spectroradiometer (MODIS) [Ackerman et al., 2006], and SEVIRI [Hocking et al., 2010]. A CM algorithm consists of a set of CM test indices and a corresponding set of thresholds required for an implementation of each of these CM tests. Most CM algorithms are threshold-based test approaches in which thresholds are often set empirically and are of the same values in all types of clouds. Since an inappropriate threshold for a CM test could degrade the overall performance of a CM algorithm, Heidinger [2011] proposed a new CM algorithm. First, an 8 week training data set was established in which Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) data were collocated with SEVIRI data. Then, the CM thresholds were determined for the CM algorithm for SEVIRI data by allowing no more than a 2% false cloud detection rate. The collocated CALIPSO data in this training data set were used as the “truth.” Thresholds over land were set differently from those over ocean. Hocking et al. [2010] proposed to use different thresholds not only over ocean and land but also over different surface vegetation types, as well as different time periods (e.g., daytime or nighttime). Such a surface type- and time-dependent threshold setting made the CM tests more adaptive. This study aims at developing a local, regime-dependent CM algorithm in order to decrease the leakage rate (LR) and decrease the False Alarm Rate (FAR). The thresholds in the Objective Regional CM (ORCM) algorithm are determined objectively at the pixel level. First, two local, first-guess “clear-sky” and “cloudy” data sets are established using data within a small region centered at a target pixel whose CM is to be determined. 
The threshold for each CM test is then determined by maximizing the sum of the total number of the clear pixels found by this CM test in the first-guess clear-pixel data set and the total number of the cloudy pixels found by this CM test in the first-guess cloudy-pixel data set. Such a procedure is repeated for a total of five CM tests used by the Advanced Baseline Imager (ABI), MODIS, Met Office, and European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) CM algorithms. Finally, a composite CM is generated. A pixel is determined as cloud-contaminated if at least one test flags this pixel as cloudy. By considering the spatial and temporal variations of clouds and maximizing the success rates of identifying both cloudy and clear pixels, the CM tests could be more adaptive. This paper is organized as follows: Section 2 gives a brief description of GOES imagery observations and model simulations. Six CM tests adopted in this study are described in section 3. Section 4 describes how the two local, first-guess clear-sky and cloudy data sets are generated. Mathematical formulations, graphical illustrations, and numerical results of an objective method for determining an “optimal” set of thresholds are presented in section 5. Summary and conclusions are provided in section 6.
2. Data Description and Model Setting
The GOES imager instruments onboard GOES-12, GOES-13, GOES-14, and GOES-15 provide visible and infrared measurements in five channels. Channel 1 is a visible channel located at 0.65 μm, channel 2 is an infrared channel with a central wavelength of 3.9 μm, channel 3 is a water vapor channel at 6.5 μm, and channels 4 and 6 are two infrared channels with central wavelengths located at 10.7 μm and 13.3 μm, respectively. The GOES visible channel has the highest resolution, 1 km. The resolution of channels 2–4 is approximately 4 km. The spatial resolution of channel 6 is approximately 8 km for GOES-12 and 4 km for GOES-13 to GOES-15. The GOES-East imager completes four scans over the continental U.S. every 30 min and a full disk scan every 3 h. In this study, infrared channel observations from the GOES-12 imager are utilized. Other information used in the ORCM algorithm includes model simulations of brightness temperature, solar zenith angle, satellite zenith angle, surface types (land, ocean, or coastal), and terrestrial elevation. Model simulations of brightness temperatures for GOES imager channels 2, 3, 4, and 6 required for the CM tests in the ORCM algorithm are calculated by the Community Radiative Transfer Model (CRTM) [Weng, 2007; Han et al., 2007]. The Weather Research and Forecasting (WRF) Advanced Research WRF (ARW) model forecasts at 10 km resolution and 30 min frequency are used as input to CRTM. The WRF-ARW forecasts are made with initial conditions generated by the National Centers for Environmental Prediction (NCEP) GSI data analysis system, in which conventional observations and GOES-12 imager radiance observations are assimilated [Zou et al., 2011; Qin et al., 2013].
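Table 1. Six CM Tests Used by ABI, MODIS, Met Office, and EUMETSAT CM Algorithms

Name      Condition for Cloudy Pixels                                                                                                                                   Organization
GROSST    $B_{10.7\,\mu\mathrm{m}} - O_{10.7\,\mu\mathrm{m}} > \varepsilon_{\mathrm{GRST}}$                                                                           Met Office CM
TUT       $\sigma_{10.7\,\mu\mathrm{m}} - 3\gamma\sigma_z > \varepsilon_{\mathrm{TUT}}$                                                                                ABI CM
RTCT      $O^{\max}_{10.7\,\mu\mathrm{m}} - O_{10.7\,\mu\mathrm{m}} - 3\gamma(z_{\max} - z) > \varepsilon_{\mathrm{RTCT}}$                                             ABI CM
CH46T     $(O_{10.7\,\mu\mathrm{m}} - O_{13.3\,\mu\mathrm{m}}) - (B_{10.7\,\mu\mathrm{m}} - B_{13.3\,\mu\mathrm{m}}) < \varepsilon_{\mathrm{46T}}$                     EUMETSAT CM
CH42T     $O_{10.7\,\mu\mathrm{m}} - O_{3.9\,\mu\mathrm{m}} < \varepsilon_{\mathrm{42T}}$ (daytime)                                                                    MODIS CM
          $O_{10.7\,\mu\mathrm{m}} - O_{3.9\,\mu\mathrm{m}} > \varepsilon_{\mathrm{42T}}$ (nighttime)
WtrVprT   $(O_{10.7\,\mu\mathrm{m}} - O_{6.5\,\mu\mathrm{m}}) - (B_{10.7\,\mu\mathrm{m}} - B_{6.5\,\mu\mathrm{m}}) < \varepsilon_{\mathrm{WVT}}$                       EUMETSAT CM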
3. A Brief Description of CM Tests in ORCM Algorithm
Six CM tests employed in this study are listed in Table 1. They are called the Gross Test (GROSST), Thermal Uniformity Test (TUT), Relative Thermal Contrast Test (RTCT), Channel 4 Minus Channel 6 Test (CH46T), Channel 4 Minus Channel 2 Test (CH42T), and Water Vapor Channel Test (WtrVprT). The GROSST was used at the Met Office [Hocking et al., 2010]. The TUT and RTCT were used in the ABI CM algorithm [Heidinger, 2011]. The CH46T and WtrVprT were employed by the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) [2007]. The CH42T was used in the MODIS CM algorithm [Ackerman et al., 2006]. A brief description of each of the six CM tests is provided below.

3.1. Gross Test
A pixel is determined as cloudy by GROSST if

$$B_{10.7\,\mu\mathrm{m}} - O_{10.7\,\mu\mathrm{m}} > \varepsilon_{\mathrm{GRST}} \tag{1}$$

where $B_{10.7\,\mu\mathrm{m}}$ and $O_{10.7\,\mu\mathrm{m}}$ represent brightness temperature at 10.7 μm from model simulations under clear-sky conditions and from observations, respectively. This test is based on the assumption that, in the presence of cloud, the observed brightness temperature at the 10.7 μm window channel is much colder than the simulated brightness temperature, which is not the case for clear-sky pixels. This test is used for all surface types. The GROSST is widely used and was incorporated in several CM algorithms [e.g., EUMETSAT, 2007; Hocking et al., 2010].

3.2. Thermal Uniformity Test
A pixel is flagged as cloudy by TUT if

$$\sigma_{10.7\,\mu\mathrm{m}} - 3\gamma\sigma_z > \varepsilon^{\mathrm{raw}}_{\mathrm{TUT}} \tag{2}$$
where $\sigma_{10.7\,\mu\mathrm{m}}$ and $\sigma_z$ are the standard deviations of 10.7 μm brightness temperature and terrain height in a 3 × 3 pixel box, respectively, and $\gamma$ is the lapse rate, which is set to a value of 7 K km$^{-1}$. The term $3\gamma\sigma_z$ is an offset accounting for the contribution of the terrain height variation to $\sigma_{10.7\,\mu\mathrm{m}}$ [Heidinger, 2011]. In the ORCM algorithm, the TUT index for identifying cloudy pixels is modified to

$$\sigma_{10.7\,\mu\mathrm{m}} - \gamma\sigma_z > \varepsilon_{\mathrm{TUT}} \tag{3}$$
in order to avoid a negative value of the TUT index in (2). This test aims at detecting cloud edges, multilayer clouds, and isolated clouds. It is used in the advanced very high resolution radiometer (AVHRR) CM algorithm [Stowe et al., 1999] and the ABI CM algorithm [Heidinger, 2011]. Due to the discontinuity of temperature along the coast and the banks of rivers and lakes, observations at these locations are not included when this test is applied.

3.3. Relative Thermal Contrast Test
A pixel is identified as cloudy by RTCT if

$$O^{\max}_{10.7\,\mu\mathrm{m}} - O_{10.7\,\mu\mathrm{m}} - 3\gamma(z_{\max} - z) > \varepsilon^{\mathrm{raw}}_{\mathrm{RTCT}} \tag{4}$$
where $O_{10.7\,\mu\mathrm{m}}$ represents the observed 10.7 μm brightness temperature of the target pixel whose CM is to be determined, $O^{\max}_{10.7\,\mu\mathrm{m}}$ is the highest 10.7 μm brightness temperature in the 3 × 3 box surrounding the target pixel, and $3\gamma(z_{\max} - z)$ is an offset to account for the terrain effect [Heidinger, 2011]. The RTCT in the ORCM algorithm is also modified as follows:

$$O^{\max}_{10.7\,\mu\mathrm{m}} - O_{10.7\,\mu\mathrm{m}} - \gamma(z_{\max} - z) > \varepsilon_{\mathrm{RTCT}} \tag{5}$$

This test is designed to find a pixel within a cloud that causes a relatively larger spatial variation of brightness temperature at 10.7 μm [Heidinger, 2011]. Coastal pixels and pixels whose minimum $O_{10.7\,\mu\mathrm{m}}$ in the neighboring 3 × 3 box is greater than 300 K are excluded from this test.

3.4. Channel 4 Minus Channel 6 Test
The formula for identifying cloudy pixels by CH46T is as follows:

$$(B_{10.7\,\mu\mathrm{m}} - B_{13.3\,\mu\mathrm{m}}) - (O_{10.7\,\mu\mathrm{m}} - O_{13.3\,\mu\mathrm{m}}) > \varepsilon_{\mathrm{46T}} \tag{6}$$
where $O_{10.7\,\mu\mathrm{m}}$ and $O_{13.3\,\mu\mathrm{m}}$ are observed brightness temperatures at 10.7 μm and 13.3 μm, respectively, and $B_{10.7\,\mu\mathrm{m}}$ and $B_{13.3\,\mu\mathrm{m}}$ are model-simulated brightness temperatures at 10.7 μm and 13.3 μm under clear-sky conditions. This test is based on the consideration that a negative difference of large magnitude between the 10.7 μm and 13.3 μm brightness temperatures suggests the existence of clouds. The term $(B_{10.7\,\mu\mathrm{m}} - B_{13.3\,\mu\mathrm{m}})$ serves as an offset for the difference between these two channels, representing the temperature difference even under a clear-sky condition. This test is included in the EUMETSAT CM algorithm [EUMETSAT, 2007].

3.5. Channel 4 Minus Channel 2 Test
A pixel is deemed cloudy by CH42T during daytime if

$$O_{3.9\,\mu\mathrm{m}} - O_{10.7\,\mu\mathrm{m}} > \varepsilon^{\mathrm{raw}}_{\mathrm{42T}} \quad \text{(daytime)} \tag{7}$$

At nighttime, this test is changed to

$$O_{10.7\,\mu\mathrm{m}} - O_{3.9\,\mu\mathrm{m}} > \varepsilon^{\mathrm{raw}}_{\mathrm{42T}} \quad \text{(nighttime)} \tag{8}$$
where $O_{10.7\,\mu\mathrm{m}}$ and $O_{3.9\,\mu\mathrm{m}}$ are observed brightness temperatures at 10.7 μm and 3.9 μm, respectively. A daytime (nighttime) pixel has a solar zenith angle ($\theta$) less (greater) than 85°. During daytime, the solar radiation reflected from clouds causes the brightness temperature at 3.9 μm to be greater than that at 10.7 μm, resulting in a negative difference $O_{10.7\,\mu\mathrm{m}} - O_{3.9\,\mu\mathrm{m}}$ between these two channels. During nighttime, the difference between these two channels, $O_{10.7\,\mu\mathrm{m}} - O_{3.9\,\mu\mathrm{m}}$, is, in general, positive due to the lack of solar reflection and a larger emissivity at the longer wave band [Ackerman et al., 2006]. This test is included in the MODIS CM algorithm [Ackerman et al., 2006]. Although it is positive for opaque clouds (e.g., thick water clouds and fog), $(O_{10.7\,\mu\mathrm{m}} - O_{3.9\,\mu\mathrm{m}})$ could be negative in the presence of thin ice clouds during nighttime [Jedlovec et al., 2008]. Based on this consideration, CH42T during nighttime is modified to

$$O_{10.7\,\mu\mathrm{m}} - O_{3.9\,\mu\mathrm{m}} > \varepsilon_{\mathrm{P42T}} \quad \text{(nighttime)} \tag{9a}$$

or

$$O_{3.9\,\mu\mathrm{m}} - O_{10.7\,\mu\mathrm{m}} > \varepsilon_{\mathrm{N42T}} \quad \text{(nighttime)} \tag{9b}$$
The inequalities (9a) and (9b) are applied to all pixels during nighttime.

3.6. Water Vapor Channel Test
Using the water vapor channel at 6.5 μm, a cloudy pixel can be detected based on the following inequality:

$$(B_{10.7\,\mu\mathrm{m}} - B_{6.5\,\mu\mathrm{m}}) - (O_{10.7\,\mu\mathrm{m}} - O_{6.5\,\mu\mathrm{m}}) > \varepsilon_{\mathrm{WVT}} \tag{10}$$

This CM test can be used to identify high clouds. If high clouds exist, the observed brightness temperature $O_{10.7\,\mu\mathrm{m}}$ will be close to the cloud top temperature, thus causing a smaller difference between 10.7 μm and 6.5 μm than that under cloud-free conditions. This test is included in the EUMETSAT CM algorithm [EUMETSAT, 2007]. It is applied to all pixels during both daytime and nighttime.
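The test indices of this section, in their ORCM forms (equations (1), (3), (5), (6), (7), (9a), (9b), and (10)), can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation. The function names are hypothetical. It assumes a 3 × 3 box passed as a flat list of nine values with the target pixel at index 4, a population standard deviation, and that $z_{\max}$ is the highest terrain in the box (the text does not define it explicitly). The exclusions for coastal pixels and for warm boxes are not modeled. The composite rule at the end follows the statement in section 1 that a pixel is cloud-contaminated if at least one test flags it.

```python
from statistics import pstdev

GAMMA_K_PER_KM = 7.0  # lapse rate quoted in the text


def gross_index(b107, o107):
    """Equation (1): simulated clear-sky minus observed 10.7 um BT."""
    return b107 - o107


def tut_index(o107_box, z_box_km, gamma=GAMMA_K_PER_KM):
    """Equation (3): BT spread in the 3x3 box minus a terrain offset."""
    # Assumption: population standard deviation over the nine pixels.
    return pstdev(o107_box) - gamma * pstdev(z_box_km)


def rtct_index(o107_box, z_box_km, center=4, gamma=GAMMA_K_PER_KM):
    """Equation (5): warmest BT in the box minus target BT, minus a terrain offset."""
    # Assumption: z_max is the highest terrain height in the box.
    return (max(o107_box) - o107_box[center]
            - gamma * (max(z_box_km) - z_box_km[center]))


def ch46_index(b107, b133, o107, o133):
    """Equation (6): simulated minus observed channel 4 - channel 6 difference."""
    return (b107 - b133) - (o107 - o133)


def ch42_indices(o107, o039, daytime):
    """Equation (7) for daytime; equations (9a) and (9b) for nighttime."""
    if daytime:
        return {"CH42T": o039 - o107}
    return {"CH42T_P": o107 - o039, "CH42T_N": o039 - o107}


def wv_index(b107, b065, o107, o065):
    """Equation (10): simulated minus observed channel 4 - channel 3 difference."""
    return (b107 - b065) - (o107 - o065)


def is_cloudy(indices, thresholds):
    """Composite rule: cloudy if at least one test index exceeds its threshold."""
    return any(value > thresholds[name] for name, value in indices.items())


# Usage: flat terrain, one cold pixel at the center of the 3x3 box.
box = [290.0] * 9
box[4] = 280.0
flat = [0.0] * 9
indices = {"TUT": tut_index(box, flat), "RTCT": rtct_index(box, flat)}
# Thresholds below are the land-pixel values quoted in section 5.2,
# used here purely for illustration.
print(is_cloudy(indices, {"TUT": 2.1, "RTCT": 2.3}))
```

Keeping each index as a separate function mirrors the structure of the algorithm, in which every test has its own threshold and the composite mask is formed only at the end.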
4. The First-Guess CM Data Sets
A CM algorithm succeeds only if it does not fail to detect cloud-affected radiance observations at any GOES imager channel, despite the different regional and physical behaviors of clouds. In order to increase the applicability of various CM tests, an explicit dynamic set of thresholds was generated by simply adding various offsets to the CM test formulas (1)–(10) described in the above section. Examples are an offset caused by terrain height and an offset related to the diurnal cycle [EUMETSAT, 2007]. As mentioned before, Heidinger [2011] succeeded in accurately inferring the properties of clouds over multiple channels by collocating observations from SEVIRI with those from CALIPSO. However, CALIPSO observations are available from polar-orbiting satellites, each of which only provides observations twice daily. GOES imager radiance assimilation for regional numerical weather forecasts requires a CM algorithm that can capture cloud features realistically in all cases. The CM thresholds in a CM algorithm for GOES imager radiance assimilation in regional NWP models must be updated as frequently as the data assimilation cycle (e.g., ≤ 6 h) and rely only on GOES imager observations themselves, without involving other instruments. Under these constraints, a local, regime-dependent set of CM thresholds is developed and described in this section. The first CM test of the proposed ORCM algorithm is GROSST, which produces the first-guess CM data sets required for determining an optimal set of CM thresholds of the remaining five CM tests. GROSST was reported to have a Probability of Correct Typing (PCT) of about 80% and had the highest PCT among all the CM tests used in the EUMETSAT CM algorithm [EUMETSAT, 2007]. In addition, over 80% of cloudy pixels can be successfully detected by this test [EUMETSAT, 2007; Hocking et al., 2011].
GROSST examines the differences of brightness temperature of GOES imager channel 4 (10.7 μm) between model simulations and observations (i.e., $B_{10.7\,\mu\mathrm{m}} - O_{10.7\,\mu\mathrm{m}}$). Two examples are provided in Figures 1a and 1b: The first is a daytime case (Figure 1a, 1745–1747 UTC on 22 May 2008), and the second is a nighttime case (Figure 1b, 0633–0635 UTC on 23 May 2008). Cloud distributions for both cases are shown by simultaneous measurements of GOES-12 imager visible channel 1 at 0.65 μm (Figure 1c) and channel 2 at 3.9 μm (Figure 1d). It is seen that the observed brightness temperatures at the 10.7 μm window channel are much colder than the simulated brightness temperatures in cloudy areas. Hocking et al. [2010] proposed two empirical threshold values for $\varepsilon^{\mathrm{raw}}_{\mathrm{GRST}}$ (see equation (1)), 3.5 K over land and 2.5 K over ocean, for GROSST to identify cloudy pixels. The thresholds are modified to 4.5 K over land and 2.5 K over ocean in the ORCM algorithm. By examining the differences of brightness temperatures of GOES imager channel 4 at 10.7 μm between model simulations and observations (B-O) for both cases (Figures 1a and 1b), it is seen that these two thresholds for $\varepsilon^{\mathrm{raw}}_{\mathrm{GRST}}$ can capture large areas of observed clouds in both cases. In order to better detect cloud in a target region, the first-guess clear-sky and cloudy data sets are generated in the ORCM algorithm for each target GOES imager pixel based on spatial and temporal variations within its 10° × 10° area. The 10° × 10° size is chosen based on the largest horizontal scale of different types of clouds [Wang and Sassen, 2013].
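The construction of the two first-guess data sets can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the `Pixel` class and `first_guess_sets` function are hypothetical, only land and ocean surface types are handled, the temporal dimension mentioned above is ignored, and longitude wrap-around is not treated. The GROSST thresholds (4.5 K over land, 2.5 K over ocean) and the 10° × 10° box are taken from the text.

```python
from dataclasses import dataclass

# ORCM GROSST thresholds quoted in the text (K)
GROSST_THRESHOLD_K = {"land": 4.5, "ocean": 2.5}
BOX_HALF_WIDTH_DEG = 5.0  # half of the 10 x 10 degree box


@dataclass
class Pixel:
    lat: float
    lon: float
    surface: str  # "land" or "ocean" (coastal pixels are not handled here)
    b107: float   # simulated clear-sky 10.7 um brightness temperature (K)
    o107: float   # observed 10.7 um brightness temperature (K)


def first_guess_sets(target, pixels):
    """Split pixels inside the box centered on `target` into (clear, cloudy)."""
    clear, cloudy = [], []
    for p in pixels:
        outside = (abs(p.lat - target.lat) > BOX_HALF_WIDTH_DEG
                   or abs(p.lon - target.lon) > BOX_HALF_WIDTH_DEG)
        if outside:
            continue
        # GROSST, equation (1): cloudy when B - O exceeds the surface threshold.
        if p.b107 - p.o107 > GROSST_THRESHOLD_K[p.surface]:
            cloudy.append(p)
        else:
            clear.append(p)
    return clear, cloudy


# Usage with four made-up pixels; the last one lies outside the box.
target = Pixel(30.0, -90.0, "land", 295.0, 294.0)
pixels = [
    target,
    Pixel(31.0, -91.0, "land", 296.0, 280.0),
    Pixel(28.0, -88.0, "ocean", 298.0, 295.0),
    Pixel(40.0, -90.0, "land", 290.0, 250.0),
]
clear, cloudy = first_guess_sets(target, pixels)
print(len(clear), len(cloudy))
```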
5. Threshold Optimization

5.1. Probability of Occurrence of Clear-Sky and Cloudy Pixels
The probability of occurrence of cloudy pixels detected by the ith CM test based on the first-guess CM data sets is defined as follows:

$$P_i^{\mathrm{cld}}(\varepsilon_j \le \varepsilon < \varepsilon_j + 0.1) = \frac{N\left[G^{\mathrm{cld}} \cap T_i(\varepsilon_j \le \varepsilon < \varepsilon_j + 0.1)\right]}{N\left[G^{\mathrm{cld}} \cap T_i\right]} \tag{11}$$

where $P_i^{\mathrm{cld}}(\varepsilon_j \le \varepsilon < \varepsilon_j + 0.1)$ is the probability of occurrence of cloudy pixels when the ith CM test index is within the interval $[\varepsilon_j, \varepsilon_j + 0.1)$, $T_i(\varepsilon_j \le \varepsilon < \varepsilon_j + 0.1)$ represents all pixels for which the ith CM test index is within $[\varepsilon_j, \varepsilon_j + 0.1)$, and $G^{\mathrm{cld}}$ represents the first-guess cloudy pixel data set determined by GROSST. $N$ represents the total number of data count. Similarly, the probability of occurrence of clear pixels detected by the ith CM test can be defined as

$$P_i^{\mathrm{clr}}(\varepsilon_j \le \varepsilon < \varepsilon_j + 0.1) = \frac{N\left[G^{\mathrm{clr}} \cap T_i(\varepsilon_j \le \varepsilon < \varepsilon_j + 0.1)\right]}{N\left[G^{\mathrm{clr}} \cap T_i\right]} \tag{12}$$
Figure 1. Differences of brightness temperature of GOES imager channel 4 (10.7 μm) between model simulations and observations (B-O) at (a) a daytime period of 1745–1747 UTC on 22 May 2008 and (b) a nighttime period of 0633–0635 UTC on 23 May 2008. (c) Reflectance of GOES-12 imager channel 1 (visible channel, 0.65 μm) during the daytime period of Figure 1a. (d) Brightness temperature distribution of GOES-12 imager channel 2 (infrared channel, 3.9 μm) during the nighttime period of Figure 1b.
where $P_i^{\mathrm{clr}}(\varepsilon_j \le \varepsilon < \varepsilon_j + 0.1)$ is the probability of occurrence of clear-sky pixels when the ith CM test index is within the interval $[\varepsilon_j, \varepsilon_j + 0.1)$ and $G^{\mathrm{clr}}$ represents the first-guess clear-sky pixel data set determined by GROSST.

Figure 2 shows variations of the probability of occurrence of clear-sky pixels ($P_i^{\mathrm{clr}}$) and the probability of occurrence of cloudy pixels ($P_i^{\mathrm{cld}}$) found by TUT (Figures 2a and 2b), RTCT (Figures 2c and 2d), CH46T (Figures 2e and 2f), Negative CH42T (Figures 2g and 2h), and WtrVprT (Figures 2i and 2j) for two target regions, one located over land and the other over ocean, based on the first-guess clear-sky and cloudy pixel data sets. For the TUT metric, most clear-sky pixels have a TUT index of less than 1 K (Figures 2a and 2b). The probability of occurrence has a maximum at about 0.2 K, decreases rapidly as the TUT index increases, and reduces to zero at about 4 K. On the contrary, the probability of occurrence of cloudy pixels increases as the TUT index increases from zero to about 2 K. The clear-sky pixels over ocean (Figure 2b) have a narrower frequency distribution than those over land (Figure 2a). This is because the ocean surface is more uniform than land. The probability distribution of cloudy pixels is much broader than that of clear-sky pixels, meaning that the variability of the TUT index under cloudy conditions is much larger than that under clear-sky conditions. In addition, the probability of occurrence of cloudy pixels is less (greater) than that of clear-sky pixels when the TUT index is less (greater) than 1.2 K over land and 0.5 K over ocean. Significant differences of probability distributions between clear-sky pixels and cloudy pixels are also found for the RTCT, CH46T, Negative CH42T, and WtrVprT indices (Figures 2c–2j). For each CM index, there exists a threshold above which most pixels are cloudy. However, the clear-sky and cloudy distributions of each CM index overlap over some range, meaning that a CM index could have the same value under some clear-sky conditions as in clouds. Too large a threshold will miss cloudy pixels, thus increasing the LR. On the contrary, too small a threshold will incorrectly classify clear-sky pixels as cloudy, resulting in an increase of FAR. In the proposed ORCM algorithm, a one-dimensional optimization procedure is used to obtain a set of optimal thresholds in which the LR and FAR can be constrained.

Figure 2. Probability of occurrences of clear-sky pixels (blue) and cloudy pixels (red) determined by the first-guess CM near a pixel over (a, c, e, g, and i) land or (b, d, f, h, and j) ocean for TUT, RTCT, CH46T, CH42T, and WtrVprT indices.

5.2. Mathematical Formulation
An objective function is defined for each CM test as follows:

$$f_i^{\mathrm{PCT}}(\varepsilon) = \frac{N\left[G^{\mathrm{clr}} \cap T_i^{\mathrm{clr}}(\varepsilon)\right] + N\left[G^{\mathrm{cld}} \cap T_i^{\mathrm{cld}}(\varepsilon)\right]}{N\left[T_i\right]} \tag{13}$$

where $T_i^{\mathrm{clr}}(\varepsilon)$ and $T_i^{\mathrm{cld}}(\varepsilon)$ represent the clear-sky and cloudy pixels determined by the ith CM test if the threshold value is set to $\varepsilon$, respectively, $T_i$ is the sum of $T_i^{\mathrm{clr}}(\varepsilon)$ and $T_i^{\mathrm{cld}}(\varepsilon)$, and $N$ represents the total number of data count. In fact, the objective function in equation (13), $f_i^{\mathrm{PCT}}(\varepsilon)$, is a measure of the PCT for the ith CM test with a threshold $\varepsilon$ when verified with the two first-guess CM data sets.
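The one-dimensional search can be sketched as below. This is a minimal illustration under stated assumptions rather than the procedure actually used: the search method is not specified in the text, so a plain scan over a grid in steps of 0.1 K is assumed; a pixel is taken as test-cloudy when its index exceeds $\varepsilon$; and the two limits are the constraints that the section goes on to define in equations (14) and (15), with the 10% value stated there. All function names and the sample index values are hypothetical.

```python
def confusion(clear_idx, cloudy_idx, eps):
    """Counts at threshold eps; a pixel is test-cloudy when its index exceeds eps."""
    clr_cld = sum(v > eps for v in clear_idx)   # N[G_clr and T_cld(eps)]
    cld_cld = sum(v > eps for v in cloudy_idx)  # N[G_cld and T_cld(eps)]
    clr_clr = len(clear_idx) - clr_cld          # N[G_clr and T_clr(eps)]
    return clr_clr, clr_cld, cld_cld


def f_pct(clear_idx, cloudy_idx, eps):
    """Objective of equation (13)."""
    clr_clr, _, cld_cld = confusion(clear_idx, cloudy_idx, eps)
    return (clr_clr + cld_cld) / (len(clear_idx) + len(cloudy_idx))


def f_rfar(clear_idx, cloudy_idx, eps):
    """Constraint function of equation (14); taken as 0 if nothing is flagged."""
    _, clr_cld, cld_cld = confusion(clear_idx, cloudy_idx, eps)
    flagged = clr_cld + cld_cld
    return clr_cld / flagged if flagged else 0.0


def f_rlr(clear_idx, cloudy_idx, eps):
    """Constraint function of equation (15)."""
    _, clr_cld, _ = confusion(clear_idx, cloudy_idx, eps)
    return clr_cld / len(clear_idx)


def optimal_threshold(clear_idx, cloudy_idx, eps_grid, alpha1=0.10, alpha2=0.10):
    """Grid scan: maximize f_pct over thresholds satisfying both constraints.

    Returns (None, -1.0) if no grid value is feasible.
    """
    best_eps, best_pct = None, -1.0
    for eps in eps_grid:
        if (f_rfar(clear_idx, cloudy_idx, eps) > alpha1
                or f_rlr(clear_idx, cloudy_idx, eps) > alpha2):
            continue
        pct = f_pct(clear_idx, cloudy_idx, eps)
        if pct > best_pct:
            best_eps, best_pct = eps, pct
    return best_eps, best_pct


# Usage with made-up index values (K) for one test in one 10 x 10 degree box.
clear = [0.15, 0.25, 0.25, 0.35, 0.35, 0.45, 0.55, 0.75, 0.85, 0.95, 1.15, 2.65]
cloudy = [0.65, 1.45, 1.95, 2.45, 2.95, 3.45, 3.95, 4.45, 4.95, 5.45, 5.95, 6.45]
grid = [k / 10 for k in range(0, 71)]
best_eps, best_pct = optimal_threshold(clear, cloudy, grid)
print(best_eps, round(best_pct, 4))
```

A grid scan is used here only because it is the simplest way to honor both constraints at once; any one-dimensional optimizer that respects them would serve the same purpose.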
Figure 3. Variations of $f_i^{\mathrm{PCT}}$ (red), $f_i^{\mathrm{RFAR}}$ (blue), and $f_i^{\mathrm{RLR}}$ (orange) at (a, c, e, g, and i) a target pixel over land and (b, d, f, h, and j) a target pixel over ocean for TUT, RTCT, CH46T, CH42T, and WtrVprT indices. The optimal thresholds are indicated by vertical dashed lines (black).
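The objective function $f_i^{\mathrm{PCT}}(\varepsilon)$ is maximized subject to the following two constraints:

$$f_i^{\mathrm{RFAR}}(\varepsilon) = \frac{N\left[G^{\mathrm{clr}} \cap T_i^{\mathrm{cld}}(\varepsilon)\right]}{N\left[T_i^{\mathrm{cld}}(\varepsilon)\right]} \le \alpha_1 \quad (i = 1, 2, \ldots, 5) \tag{14}$$

$$f_i^{\mathrm{RLR}}(\varepsilon) = \frac{N\left[G^{\mathrm{clr}} \cap T_i^{\mathrm{cld}}(\varepsilon)\right]}{N\left[G^{\mathrm{clr}}\right]} \le \alpha_2 \quad (i = 1, 2, \ldots, 5) \tag{15}$$

where $f_i^{\mathrm{RFAR}}(\varepsilon)$ describes the percentage of cloudy pixels identified by the ith CM test that were initially flagged as clear-sky pixels by GROSST in the first-guess CM, $f_i^{\mathrm{RLR}}(\varepsilon)$ represents the percentage of clear-sky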
Figure 4. Distribution of optimal thresholds over (a, c, e, g, i, and k) land and (b, d, f, h, j, and l) ocean for TUT, RTCT, CH46T, Negative CH42T, Positive CH42T, and WtrVprT for a case study at 0415 UTC on 22 May 2008.
pixels in the first-guess CM that are identified as cloudy by the ith CM test, and $\alpha_1$ and $\alpha_2$ are constants. The constraints (14) and (15) are imposed to suppress the FAR and LR of the resulting CM, respectively. In this study, $\alpha_1$ and $\alpha_2$ are set to 10%. The two constraints (14) and (15) are related to FAR and LR, which can be written as

$$f_i^{\mathrm{FAR}}(\varepsilon) = \frac{N\left[G^{\mathrm{clr}} \cap T_i^{\mathrm{cld}}(\varepsilon)\right]}{N\left[T_i^{\mathrm{cld}}(\varepsilon) + T_i^{\mathrm{clr}}(\varepsilon)\right]} \tag{16}$$

$$f_i^{\mathrm{LR}}(\varepsilon) = \frac{N\left[G^{\mathrm{cld}} \cap T_i^{\mathrm{clr}}(\varepsilon)\right]}{N\left[T_i^{\mathrm{cld}}(\varepsilon) + T_i^{\mathrm{clr}}(\varepsilon)\right]} \tag{17}$$

if the first-guess CM were used as the truth. The difference between $f_i^{\mathrm{FAR}}(\varepsilon)$ and $f_i^{\mathrm{RFAR}}(\varepsilon)$ is the denominator. The reason to use $N\left[T_i^{\mathrm{cld}}(\varepsilon)\right]$ is to deal with an extreme case in which most pixels within a 10° × 10° box are either cloudy or clear sky. If $f_i^{\mathrm{FAR}}(\varepsilon)$ were used as a constraint, the optimal threshold would be extremely large or small in these extreme cases. The larger $f_i^{\mathrm{RFAR}}(\varepsilon)$ is, the more the CM generated by the ith CM test differs from the first-guess CM. Placing an upper limit on $f_i^{\mathrm{RFAR}}(\varepsilon)$ could help ensure that the ith CM test does not
find too many cloudy pixels, so as to keep the FAR low. Since the GROSST cannot flag all cloudy pixels, the ith CM test is allowed to flag some percentage of initially clear-sky pixels as cloudy, thus decreasing the LR. This is achieved by setting a value greater than 0 in the second constraint (15). In addition, an upper limit is imposed in the second constraint (15) to prevent the ith CM test from converting too many clear-sky pixels to cloudy pixels.

Figure 3 shows variations of $f_i^{\mathrm{PCT}}(\varepsilon)$, $f_i^{\mathrm{RFAR}}(\varepsilon)$, and $f_i^{\mathrm{RLR}}(\varepsilon)$ at a target pixel over land and a target pixel over ocean for the TUT, RTCT, CH46T, CH42T, and WtrVprT indices, as well as the optimal thresholds satisfying equations (13)–(15). In general, $f_i^{\mathrm{PCT}}(\varepsilon)$ has a maximum for all indices (i = 1, 2, …, 5). Both $f_i^{\mathrm{RFAR}}(\varepsilon)$ and $f_i^{\mathrm{RLR}}(\varepsilon)$ decrease as $\varepsilon$ increases. Results in Figure 3 can be explained as follows: when $\varepsilon$ is near zero, nearly all clear-sky pixels are incorrectly flagged as cloudy pixels. As $\varepsilon$ increases, more clear-sky pixels are correctly interpreted, resulting in an increase of $f_i^{\mathrm{PCT}}(\varepsilon)$. Due to an overlap of the probability distributions of clear-sky and cloudy pixels (see Figure 2), more clear-sky pixels are correctly identified while some cloudy pixels are mistakenly taken as clear-sky pixels. This slows down the increase of $f_i^{\mathrm{PCT}}(\varepsilon)$. When the total number of missed cloudy pixels is larger than that of correctly flagged clear-sky pixels, $f_i^{\mathrm{PCT}}(\varepsilon)$ starts to decrease. The optimal threshold obtained under the constraints (14) and (15) related to FAR and LR is
Figure 5. The (a, c) GOES-12 CM obtained by this study and (b, d) MODIS CM at 0500 UTC on 23 May as shown in Figures 5a and 5b and at 0420 UTC on 22 May 2008 as shown in Figures 5c and 5d. Cloudy and clear-sky pixels are indicated in red and blue, respectively.
greater than the value at which $f_i^{\mathrm{PCT}}(\varepsilon)$ reaches the maximum for all five CM tests. The optimal thresholds for the land pixel are 2.1 K for TUT, 2.3 K for RTCT, 2.1 K for CH46T, 2.9 K for nighttime CH42T, and 5.1 K for WtrVprT. The optimal thresholds for the ocean pixel are 0.6 K for TUT, 0.7 K for RTCT, 1.3 K for CH46T, 3.7 K for nighttime CH42T, and 4.3 K for WtrVprT.

5.3. Numerical Results
The optimal thresholds for each CM test obtained by the ORCM algorithm over ocean and land are shown by the histograms in Figure 4. There are several major differences between the optimal thresholds over land and those over ocean. First, the optimal threshold values are, in general, smaller over ocean than over land. Second, the optimal thresholds for TUT, RTCT, CH46T, and negative CH42T vary more over land than over ocean. Third, the optimal thresholds for positive CH42T vary less over land than over ocean due to the small number of pixels that are applicable for this CM test. Finally, the variations of the thresholds for WtrVprT are similar over land and ocean. Note that these optimal thresholds are directly generated by the optimization procedure without any additional tuning. Since the ORCM algorithm intrinsically generates a set of pixel-dependent, implicitly dynamic thresholds, it avoids the need to refresh the threshold values periodically in order to obtain a relatively consistent performance. The level 2 CM products from the Moderate Resolution Imaging Spectroradiometer (MODIS) on board the polar-orbiting satellites Terra and Aqua were of widely accepted quality [Ackerman et al., 2006] and were commonly utilized to evaluate new CM algorithms [Hocking et al., 2011; Heidinger, 2011]. The MODIS CM was provided with a spatial resolution of 1 km at nadir, along with the following four types of information: high-confidence
Figure 6. The (a, c) GOES-12 CM obtained by this study and (b, d) MODIS CM at 1530 UTC on 23 May as shown in Figures 6a and 6b and at 0415 UTC on 22 May 2008 as shown in Figures 6c and 6d. Cloudy and clear-sky pixels are indicated in red and blue, respectively.
clear, confidently clear, uncertain, and cloudy. In this study, the Terra MODIS CMs are compared with the CMs generated by the ORCM algorithm during 19–23 May 2008. Each MODIS pixel is first collocated with the nearest GOES imager pixel within a temporal difference of 15 min. To further reduce validation uncertainty, a GOES imager pixel is flagged as clear if over 90% of collocated MODIS pixels are of the high-confidence clear type. A GOES imager pixel is identified as cloudy when more than 90% of collocated MODIS pixels are cloudy. The GOES imager pixels that meet neither of the above two criteria are excluded from the comparison. In total, 5,616,090 GOES pixels are collocated for evaluating the persistent performance of the ORCM algorithm. Figures 5 and 6 show the performance of the ORCM algorithm under four different sky conditions: a large portion of clear sky (Figures 5a and 5b), overcast clouds (Figures 5c and 5d), isolated clouds over ocean (Figures 6a and 6b), and congested clouds mixed with isolated clouds over ocean (Figures 6c and 6d). The CM generated by the ORCM algorithm (Figure 5a) matches the MODIS CM (Figure 5b) quite well except for a small amount of false clouds in Mexico. A PCT of 94.52%, FAR of 2.52%, and LR of 2.96% are achieved for the case with a large portion of clear sky. Three additional experiments are conducted to test the sensitivity of CM results to the box size chosen for the CM test of a target pixel. It is found that the PCT reaches 94.55%, 94.50%, 94.50%, and 94.52% when the box size is 4° × 4°, 6° × 6°, 8° × 8°, and 10° × 10°, respectively. This suggests the robustness of the proposed CM algorithm to the local box size. Under an overcast cloud condition (Figures 5c and 5d), the PCT, FAR, and LR are 95.06%, 2.23%, and 2.71%, respectively. One additional experiment is
Table 2. PCT, FAR, and LR Based on Collocated MODIS CM for GOES-12 Imager Pixels Over Land During 19–23 May 2008

Period      Counts of Observations   PCT (%)   FAR (%)   LR (%)
Daytime     1,411,504                92.73     4.36      2.91
Nighttime   1,443,261                93.15     2.51      4.34
Total       2,854,765                92.94     3.42      3.64
Table 3. Same as Table 2 Except for Over Ocean

Period      Counts of Observations   PCT (%)   FAR (%)   LR (%)
Daytime     1,623,800                91.15     2.94      5.91
Nighttime   1,137,525                92.01     4.72      3.27
Total       2,761,325                91.50     3.68      4.82
conducted with a GROSST threshold of 3.5 K over land, and the FAR of the final CM increases from 2.23% to 2.73%. Results for the two cases shown in Figure 5 suggest that the ORCM algorithm can handle extreme conditions with an average PCT over 94%. One explanation for such a relatively high PCT is that the constraints on $f_i^{\mathrm{RFAR}}(\varepsilon)$ and $f_i^{\mathrm{RLR}}(\varepsilon)$ are set to depend on the threshold $\varepsilon$, which imposes a strong constraint on the threshold optimization when the number of clear-sky pixels is either far greater or far less than that of cloudy pixels. Clouds over ocean can severely degrade the sea surface temperature (SST) retrieval with infrared data. Elimination of cloud-contaminated radiance is the biggest challenge when retrieving infrared SST [Reynolds et al., 2004], so it is of great importance to check the performance of the ORCM algorithm under such circumstances. The ability of the ORCM algorithm to detect isolated clouds over ocean is shown in Figures 6a and 6b, and its ability to simultaneously detect congested and isolated clouds is shown in Figures 6c and 6d. The GOES CMs compare favorably with the MODIS CMs. The PCT, FAR, and LR are 92.27%, 2.98%, and 4.74%, respectively, for the case shown in Figures 6a–6c. The PCT is as high as 93.54%, and the FAR and LR are as low as 4.17% and 2.29%, respectively. The corresponding histogram distribution of the optimal thresholds for the case study at 0415 UTC on 22 May 2008 (Figures 6c and 6d) is shown in Figure 4.
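The validation procedure described in this section (the 90% agreement rule for labeling a GOES pixel from its collocated MODIS pixels, followed by scoring) can be illustrated with a minimal Python sketch. The category names, function names, and data layout are hypothetical. The sketch also assumes that FAR counts clear pixels flagged as cloudy, that LR counts cloudy pixels flagged as clear, and that all three scores are normalized by the total number of validated pixels, which is consistent with PCT, FAR, and LR summing to 100% in Tables 2 and 3.

```python
from collections import Counter


def label_goes_pixel(modis_flags, min_fraction=0.90):
    """Return 'clear', 'cloudy', or None for one GOES pixel.

    modis_flags holds the MODIS CM category of every MODIS pixel
    collocated with the GOES pixel (category names are hypothetical).
    """
    if not modis_flags:
        return None
    counts = Counter(modis_flags)
    n = len(modis_flags)
    if counts["confident_clear"] / n > min_fraction:
        return "clear"
    if counts["cloudy"] / n > min_fraction:
        return "cloudy"
    return None  # meets neither criterion: excluded from the comparison


def scores(pairs):
    """Return (PCT, FAR, LR) in percent from (goes_cm, modis_label) pairs.

    Assumed definitions: FAR = clear pixels flagged cloudy,
    LR = cloudy pixels flagged clear, both divided by all validated pixels.
    """
    valid = [(g, m) for g, m in pairs if m is not None]
    n = len(valid)
    if n == 0:
        raise ValueError("no validated pixels")
    correct = sum(g == m for g, m in valid)
    false_alarm = sum(g == "cloudy" and m == "clear" for g, m in valid)
    leak = sum(g == "clear" and m == "cloudy" for g, m in valid)
    return tuple(100.0 * k / n for k in (correct, false_alarm, leak))
```

With these definitions, every validated pixel falls into exactly one of the three groups, so the three scores always add up to 100%.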
Figure 7. The (a) PCT, (b) FAR, and (c) LR of the CM determined by the ORCM algorithm for all the GOES-12 imager pixels collocated with MODIS data during the 5 day period from 19 May to 23 May 2008.
The average performance for the 5 day period from 19 May to 23 May 2008 is provided in Tables 2 and 3 and is also shown in Figure 7. The PCT over land is 92.73% during daytime and 93.15% during nighttime, giving an average over land of 92.94%. The optimal thresholds over land show more variation (Figure 4), so they can better capture local temporal and spatial variations of cloud than a prespecified constant threshold. For cloud detection over ocean, the PCT is 91.15% during daytime and 92.01% during nighttime. The average performance for all 5,616,090 collocated GOES imager pixels is a PCT of 92.23%, a FAR of 3.55%, and an LR of 4.22%. These results indicate that the ORCM algorithm agrees well with the MODIS CM over both land and ocean, during both daytime and nighttime.
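The overall averages quoted above are consistent with pooling the land and ocean totals of Tables 2 and 3, weighted by the number of observations. The short Python check below illustrates this arithmetic; the function name `pooled` is illustrative only, and count-weighted averaging is an assumption about how the overall figures were obtained.

```python
# (count, PCT %, FAR %, LR %) taken from the "Total" rows of Tables 2 and 3
land = (2_854_765, 92.94, 3.42, 3.64)
ocean = (2_761_325, 91.50, 3.68, 4.82)


def pooled(*groups):
    """Count-weighted mean of each score across the given groups."""
    total = sum(g[0] for g in groups)
    means = tuple(
        sum(g[0] * g[i] for g in groups) / total for i in (1, 2, 3)
    )
    return total, means


total, (pct, far, lr) = pooled(land, ocean)
print(total, round(pct, 2), round(far, 2), round(lr, 2))
```

The pooled count is 5,616,090, and the weighted scores round to a PCT of 92.23%, a FAR of 3.55%, and an LR of 4.22%, matching the values stated in the text.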
6. Summary and Conclusions

A geostationary satellite provides a time-continuous view of the evolution of weather phenomena over its instrument-observing domain. Direct assimilation of geostationary radiance has been shown in recent studies to improve the forecast skill of numerical models. Removing cloud-contaminated radiance data is a critical step in geostationary radiance assimilation since the current data assimilation system can only ingest clear-sky radiances. The ORCM algorithm is developed to remove cloud-contaminated observations in a quality control step prior to data assimilation. It employs an optimization procedure for determining a set of dynamic thresholds that account for local temporal and spatial variations of clouds. Specifically, the ORCM algorithm determines implicit dynamic thresholds based on the local distribution of clear-sky and cloudy pixels. First, a GROSST is used for generating an approximate distribution of clear-sky and cloudy pixels. The thresholds used in each CM test are then objectively determined by a one-dimensional optimization approach based on the information provided by the first-guess data sets. Two constraints are imposed to ensure the correctness of the ORCM algorithm under extreme sky conditions (e.g., overcast clouds and a large portion of clear sky). A pixel is identified as cloudy if it is flagged as cloudy by any CM test in the algorithm. The distribution of optimal thresholds over land shows more variation than that over ocean. In addition, the average threshold values over land are greater than those over ocean for most CM tests. A total of 5,616,090 GOES-12 infrared imager pixels are collocated with MODIS CM during a 5 day period from 19 May to 23 May 2008 in order to evaluate the performance of the ORCM algorithm. A high PCT (above 92%), a low FAR (below 4%), and a low LR (4%) are achieved during daytime or nighttime over land or ocean.
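The two ideas summarized above, a one-dimensional search for a test threshold from first-guess clear and cloudy samples and the rule that a pixel is cloudy if any CM test flags it, can be sketched in Python as follows. This is a simplified illustration and not the cost function of the ORCM algorithm: it scans a list of candidate thresholds and keeps the one with the highest fraction of correctly typed samples, it omits the two constraints for extreme sky conditions, and it assumes that a test value above the threshold indicates cloud. All names are hypothetical.

```python
def best_threshold(clear_vals, cloudy_vals, candidates):
    """Return (threshold, fraction_correct) for the best candidate.

    clear_vals / cloudy_vals: test values of first-guess clear and cloudy
    pixels in the local box. A value above the threshold is called cloudy
    (assumed test direction).
    """
    n = len(clear_vals) + len(cloudy_vals)
    best_eps, best_frac = None, -1.0
    for eps in candidates:
        correct = sum(v <= eps for v in clear_vals)
        correct += sum(v > eps for v in cloudy_vals)
        frac = correct / n
        if frac > best_frac:  # keep the first candidate reaching the maximum
            best_eps, best_frac = eps, frac
    return best_eps, best_frac


def is_cloudy(test_values, thresholds):
    """Union rule: cloudy if any CM test exceeds its own threshold."""
    return any(v > eps for v, eps in zip(test_values, thresholds))


# Toy first-guess samples for a single test (made-up numbers)
clear = [0.5, 1.0, 1.5, 2.0]
cloudy = [2.5, 3.0, 4.0, 1.8]
eps, frac = best_threshold(clear, cloudy, [1.0, 2.0, 3.0])
```

Because the search is repeated for every target pixel using only the samples in its local box, the resulting thresholds vary in space and time, which is the behavior the summary attributes to the dynamic thresholds.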
The proposed CM algorithm can easily be implemented for other infrared imager data, such as those from SEVIRI and ABI onboard geostationary satellites. This is our third study on GOES imager radiance assimilation following the work by Zou et al. [2011] and Qin et al. [2013], who investigated the assimilation of GOES imager radiance observations at a rather coarse resolution through data thinning. Further work on identifying cloudy radiances is intended to involve more CM tests and to develop a principal component analysis based CM algorithm in which correlations between different CM tests can be taken care of. The final CM algorithm will be incorporated into the GSI system to remove cloud-contaminated radiance data. It represents part of the work toward realizing the full potential of GOES data with high temporal (3–15 min) and spatial (4–8 km) resolutions.
Acknowledgments

This work was jointly supported by the Chinese Ministry of Science and Technology under the 973 project (2010CB951600) and the NOAA GOES-R Risk Reduction Program. We also thank the NOAA National Environmental Satellite, Data, and Information Service (NESDIS) and the Comprehensive Large Array-Data Stewardship System (CLASS) for providing observational data.
References

Ackerman, S., K. Strabala, P. Menzel, R. Frey, C. Moeller, L. Gumley, B. Baum, S. W. Seemann, and H. Zhang (2006), Discriminating clear-sky from cloud with MODIS algorithm theoretical basis document (MOD35), 129 pp.
European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) (2007), Cloud detection for MSG: Algorithm theoretical basis document, 26 pp.
Han, Y., F. Weng, Q. Liu, and P. van Delst (2007), A fast radiative transfer model for SSMIS upper atmosphere sounding channels, J. Geophys. Res., 112, D11121, doi:10.1029/2006JD008208.
Heidinger, A. (2011), NOAA NESDIS Center for Satellite applications and research algorithm theoretical basis document ABI cloud mask, 93 pp.
Hocking, J., P. Francis, and R. Saunders (2010), Cloud detection in Meteosat second generation imagery at the Met Office, Forecasting R&D Tech. Rep. 540, 43 pp.
Hocking, J., P. Francis, and R. Saunders (2011), Cloud detection in Meteosat Second Generation imagery at the Met Office, Meteorol. Appl., 18, 307–323.
Huang, J., P. Minnis, B. Lin, Y. Yi, M. M. Khaiyer, R. F. Arduini, and G. G. Mace (2005), Advanced retrievals of multilayered cloud properties using multispectral measurements, J. Geophys. Res., 110, D15S18, doi:10.1029/2004JD005101.
Huang, J., P. Minnis, B. Lin, Y. Yi, S. Sun-Mack, T.-F. Fan, and J. R. Ayers (2006), Determination of ice water path in ice-over-water cloud systems using combined MODIS and AMSR-E measurements, Geophys. Res. Lett., 33, L21801, doi:10.1029/2006GL027038.
Jedlovec, J. G., L. S. Haines, and F. J. LaFontaine (2008), Spatial and temporal varying threshold for cloud detection in GOES Imagery, IEEE Trans. Geosci. Remote Sens., 28, 879–885.
Köpken, C., G. Kelly, and J.-N. Thépaut (2004), Assimilation of Meteosat radiance data within the 4D-Var system at ECMWF: Assimilation experiments and forecast impact, Q. J. R. Meteorol. Soc., 130, 2277–2292.
Merchant, C. J., A. R. Harris, E. Maturi, and S. MacCallum (2005), Probabilistic physically based cloud screening of satellite infrared imagery for operational sea surface temperature retrieval, Q. J. R. Meteorol. Soc., 131, 2735–2755.
Qin, Z., X. Zou, and F. Weng (2013), Evaluating added benefits of assimilating GOES Imager radiance data in GSI for coastal QPFs, Mon. Weather Rev., 141, 75–92.
Reuter, M., W. Thomas, P. Albert, M. Lockhoff, R. Weber, K.-G. Karlsson, and J. Fischer (2009), The CM-SAF and FUB cloud detection schemes for SEVIRI: Validation with synoptic data and initial comparison with MODIS and CALIPSO, J. Appl. Meteorol. Climatol., 48, 301–316.
Reynolds, R. W., C. L. Gentemann, and F. Wentz (2004), Impact of TRMM SSTs on a climate-scale SST analysis, J. Clim., 17, 2938–2952.
Szyndel, M. D. E., G. Kelly, and J.-N. Thépaut (2005), Evaluation of potential benefit of SEVIRI water vapour radiance data from Meteosat-8 into global numerical weather prediction analyses, Atmos. Sci. Lett., 6, 105–111.
Stengel, M., P. Undén, M. Lindskog, P. Dahlgren, N. Gustafsson, and R. Bennartz (2009), Assimilation of SEVIRI infrared radiances with HIRLAM 4D-Var, Q. J. R. Meteorol. Soc., 135, 2100–2109.
Stowe, L. L., P. A. Davis, and E. P. McClain (1999), Scientific basis and initial evaluation of the CLAVR-1 global clear/cloud classification algorithm for the Advanced Very High Resolution Radiometer, J. Atmos. Oceanic Technol., 16, 656–681.
Su, X., J. C. Derber, J. A. Jung, and Y. Tahara (2003), The usage of GOES imager clear-sky brightness temperatures in the NCEP global data assimilation system. Preprints, 12th Conf. on Satellite Meteorology and Oceanography, Long Beach, CA, Amer. Meteor.
Wang, Z., and K. Sassen (2013), Level 2 combined radar and lidar cloud scenario classification product process description and interface control document, 61 pp.
Weng, F. (2007), Advances in radiative transfer modeling in support of satellite data assimilation, J. Atmos. Sci., 64, 3799–3807.
Zou, X., Z. Qin, and F. Weng (2011), Improved coastal precipitation forecasts with direct assimilation of GOES 11/12 imager radiances, Mon. Weather Rev., 139, 3711–3729.