Optimization of an Instance-Based GOES Cloud Classification Algorithm

RICHARD L. BANKERT
Naval Research Laboratory, Monterey, California

ROBERT H. WADE
Science Applications International Corporation, Monterey, California

(Manuscript received 27 December 2005, in final form 18 April 2006)

ABSTRACT

An instance-based nearest-neighbor algorithm was developed for a Geostationary Operational Environmental Satellite (GOES) cloud classifier. Expert-labeled samples serve as the training sets for the various GOES image classification scenes. The initial implementation of the classifier using the complete set of available training samples has proven to be an inefficient method for real-time image classifications, requiring long computational run times and significant computer resources. A variety of training-set reduction methods were examined to find smaller training sets that provide quicker classifier run times with minimal reduction in classifier testing-set accuracy. General differences within real-time image classifications as a result of using the various reduction methods were also analyzed. The fast condensed nearest-neighbor (FCNN) method reduced the size of the individual training sets by 68.3% (fourfold cross-validation testing average) while the average overall accuracy of the testing sets decreased by only 4.1%. Training sets resulting from these reduction methods were also applied within a real-time classifier using a one-nearest-neighbor subroutine. Using the FCNN-reduced set, the subroutine run time on a 30° latitude × 30° longitude image (GOES-10 daytime) with 11 289 600 total pixels decreased by an average of 60.7%.
1. Introduction

Following the motivation and method described in Tag et al. (2000), a one-nearest-neighbor Geostationary Operational Environmental Satellite (GOES) cloud classifier was developed to analyze imagery in terms of specific and general cloud types as well as snow cover, sun glint, and clear-sky (no clouds, ground snow, or sun glint) conditions. The classifier algorithm can be applied to both day and night imagery from either the GOES-10 or GOES-12 satellite, as opposed to the daytime-only Advanced Very High Resolution Radiometer (AVHRR) classifier described in Tag et al. (2000). Also, all visible and infrared channels (excluding the 13.3-μm channel from GOES-12) were used in the classifier development. An instance-based classification algorithm employs a machine learning technique in which training datasets
are stored in their entirety and a distance function is used to make predictions. For most purposes, operational users rely on the availability of near-real-time cloud classification data or imagery. This requirement can be difficult to meet with instance-based classifier applications that use a large training set of samples, as is the case for the one-nearest-neighbor classifier used in this study. Enhancing the classifier to optimize speed and accuracy is the primary motivation of this research.

In the recent past, various automated cloud classification algorithms have been developed for a variety of purposes. Such uses include comparisons of the cloud detection and classification abilities of different sensors (Li et al. 2005), enhancement of rain-rate algorithms (Hong et al. 2004; Miller and Emery 1997; Uddstrom and Gray 1996; Liu et al. 1995), and classification of cloud and surface types to benefit cloud information retrievals (Li et al. 2003). Cloud classification algorithms have been applied to data from various satellites and sensors, including the Advanced Very High Resolution Radiometer (AVHRR) on National Oceanic and Atmospheric Administration
TABLE 1. Classification types for both day and night.

Day classes                               Night classes
Stratus (St)                              Low clouds
Stratocumulus (Sc)                        Midlevel clouds
Cumulus (Cu)                              High, thin clouds
Altocumulus (Ac)                          High, thick clouds
Altostratus (As)                          Cumulus congestus
Cirrus (Ci)                               Deep convection
Cirrocumulus (Cc)                         Thin, high over low clouds
Cirrostratus (Cs)                         Open-celled cumulus
Cumulus congestus (CuC)                   Snow-covered ground
Cs associated with deep convection        Clear (ground or ocean)
Cumulonimbus (Cb)
Clear (ground or ocean)
Snow-covered ground
Haze (smoke, dust, etc.)
Sun glint
(NOAA) satellites (Pavolonis et al. 2005; Derrien and Le Gleau 1999; Baum et al. 1997; Miller and Emery 1997; Uddstrom and Gray 1996), the Moderate Resolution Imaging Spectroradiometer (MODIS) on the Earth Observing System (EOS) Terra and Aqua satellites (Li et al. 2005; Lee et al. 2004; Li et al. 2003), the Imager on GOES I–M (Hong et al. 2004; Tian et al. 1999), the Multispectral Radiometer (MSR) on the European Meteorological Satellite (Meteosat) (Lewis et al. 1997), and the Visible Infrared Spin Scan Radiometer (VISSR) on the Japanese Geostationary Meteorological Satellite (GMS) (Liu et al. 1995). Cloud classifier methodologies are varied and not limited to nearest-neighbor and other instance-based algorithms. These methodologies have included thresholding techniques (Pavolonis et al. 2005; Derrien and
Le Gleau 1999; Liu et al. 1995), maximum likelihood (Li et al. 2005; Li et al. 2003), unsupervised clustering (Hong et al. 2004), multicategory support vector machines (Lee et al. 2004), neural networks (Tian et al. 1999; Miller and Emery 1997; Lewis et al. 1997), fuzzy logic (Baum et al. 1997), and Bayesian discriminant functions (Uddstrom and Gray 1996). An extensive review of earlier cloud classification research can be found in Pankiewicz (1995).

A one-nearest-neighbor procedure (Cover and Hart 1967) serves as the instance-based classification methodology (Aha et al. 1991) for the classifier described here. Similar to the methods in Tag et al. (2000), eight training sets of expert-labeled samples were created to serve as the instances from which the nearest neighbor is found for an unclassified (or testing) image sample. Three weather and imagery analysis experts classified samples independently, and only those samples for which all experts agreed were used in the training sets. The eight datasets are differentiated by a unique combination of three characteristics of the unknown sample: 1) GOES-10 or GOES-12, 2) day or night, and 3) land or sea background. Daytime samples were defined by solar zenith angles less than 85°, and nighttime samples required solar zenith angles greater than 90°. The output classes discriminated within the classifier are listed in Table 1. The lack of the higher-resolution visible channel necessitates that the nighttime classes be more general in nature and that the nighttime sample size be 32 × 32 pixels, as opposed to the 16 × 16 pixel size used for the daytime classifier. The characteristic features, or attributes, described in Tag et al. (2000) were used to represent each training sample.
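For illustration only (this is not the operational classifier code), the one-nearest-neighbor decision rule can be expressed in a few lines of Python; the array names and the use of Euclidean distance over the selected, presumably normalized, features are assumptions of the sketch:

```python
import numpy as np

def one_nearest_neighbor(sample, train_X, train_y):
    """Hypothetical sketch: label an unknown sample with the class of
    its single nearest training sample in the selected feature space."""
    # Euclidean distance (an assumption) to every stored training sample.
    dists = np.linalg.norm(train_X - sample, axis=1)
    return train_y[np.argmin(dists)]
```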
TABLE 2. Selected features used to represent each training-set sample for the GOES-10 dataset. Feature definitions and descriptions can be found in Tag et al. (2000). For GOES imagery, channel 1 (Ch1): 0.65 μm; channel 2 (Ch2): 3.9 μm; channel 3 (Ch3): 6.7 μm; channel 4 (Ch4): 11.0 μm; channel 5 (Ch5): 12.0 μm.

G10 day, land: Lat; Date; Ch2 min; Ch2 median; Ch3 median; Ch4 max; Ch4 range; Ch4 median; Ch5 min; Ch5 median; Ch1 SD (texture mean); Ch1 entropy; Ch1 mean sum; Ch1 mean sum (SD)

G10 day, sea: Lat; Date; Ch1 median; Ch1 std dev; Ch2 max; Ch2 mean; Ch3 max; Ch4 min; Ch4 range; Ch4 mean; Ch5 median; Ch4–Ch5 mean; Ch1 mean difference; Ch1 entropy; SST

G10 night, land: Lat; Date; Ch2 max; Ch3 min; Ch3 mean; Ch4 min; Ch2–Ch4 median; Ch4–Ch5 min; Ch4–Ch5 contrast

G10 night, sea: Ch2 range; Ch3 min; Ch4 range; Ch2–Ch4 min; Ch2–Ch4 max; Ch2 mean difference; Ch2 entropy; Ch2 cluster shade; Ch4 mean difference; Ch5 mean sum; Ch2–Ch4 SD (texture); SST–Ch4 mean; SST–Ch4 min
TABLE 3. Selected features used to represent each training-set sample for the GOES-12 dataset. Feature definitions and descriptions can be found in Tag et al. (2000). For GOES imagery, channel 1 (Ch1): 0.65 μm; channel 2 (Ch2): 3.9 μm; channel 3 (Ch3): 6.7 μm; channel 4 (Ch4): 11.0 μm; channel 5 (Ch5): 12.0 μm.

G12 day, land: Lat; Date; Ch1 min; Ch2 median; Ch3 max; Ch4 mean; Ch4 median; Ch4 std dev; Ch1 mean diff; Ch1 SD (texture); Ch1 angular second moment (mean); Ch2 reflection min

G12 day, sea: Date; Ch1 mean; Ch3 mean; Ch4 min; Ch4 mean; Ch4 median; Ch4 std dev; Ch1 mean diff (SD); Ch1 SD (texture: max); Ch1 SD (texture: SD); Ch1 contrast (SD); Ch1 entropy; Ch2 reflection max; SST

G12 night, land: Lat; Date; Ch3 mean; Ch3 std dev; Ch2–Ch4 median; Ch2 SD (texture); Ch2 mean sum; Ch4 SD (texture); Ch2–Ch4 mean sum

G12 night, sea: Lat; Date; Ch2 min; Ch2 std dev; Ch3 min; Ch3 max; Ch3 mean; Ch2–Ch4 max; Ch2–Ch4 mean; Ch2–Ch4 median; Ch2 mean sum; Ch4 mean diff; SST–Ch4 min
These spectral, textural, and ancillary features were derived from the GOES visible and infrared channels or from other specific data (e.g., latitude, date). From the large feature set (100+ features), a feature selection algorithm was applied to each of the eight training sets. The selection algorithm used was the beam variant of the backward sequential selection (BSS) described in Bankert and Aha (1996). The selected set of features for each of the eight training sets can be found in Tables 2 and 3. Note that in addition to individual channel features being extracted, channel difference (e.g., channel 2 − channel 4) features were also computed. Also, the sea surface temperature (SST) was taken from a National Centers for Environmental Prediction (NCEP) climatological database. For detailed information on the computation of the textural features [mean difference, mean sum, contrast, angular second moment, entropy, cluster shade, and standard deviation (SD)], see Tag et al. (2000) and Welch et al. (1992).
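As a rough sketch of how a beam-variant backward sequential selection can be organized (a simplified stand-in for the Bankert and Aha (1996) algorithm, not a reproduction of it; the use of scikit-learn for the cross-validated 1-NN scoring and the particular stopping rule are assumptions):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def beam_bss(X, y, beam_width=3, min_features=5, cv=4):
    """Beam-search backward selection (sketch): starting from the full
    feature set, drop one feature at a time, keeping the beam_width
    best-scoring subsets alive; subsets are scored by cross-validated
    1-NN accuracy."""
    def score(subset):
        clf = KNeighborsClassifier(n_neighbors=1)
        return cross_val_score(clf, X[:, sorted(subset)], y, cv=cv).mean()

    beam = [frozenset(range(X.shape[1]))]
    best = (score(beam[0]), beam[0])
    while len(next(iter(beam))) > min_features:
        # All subsets reachable by removing one feature from a beam member.
        candidates = {s - {f} for s in beam for f in s}
        scores = {s: score(s) for s in candidates}  # expensive: one CV per subset
        beam = sorted(candidates, key=scores.get, reverse=True)[:beam_width]
        if scores[beam[0]] >= best[0]:
            best = (scores[beam[0]], beam[0])
    return sorted(best[1]), best[0]
```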
The sizes of the eight training sets employed by the GOES cloud classifier are listed in Table 4. In an operational setting, when a real-time image is presented to the classifier, the one-nearest-neighbor algorithm computes the distance (within the appropriate feature space) between a given sample within the image and each of the training-set samples. This is repeated for all image samples and, for large images, can result in long run times and intense use of computer resources. Mitigating these potential problems by employing training-set reduction methods, without compromising classifier accuracy, is the goal of this research. The focus here is not on developing new training-set reduction techniques but on how existing techniques can be applied to a cloud classification dataset.
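The per-image cost can be seen in a sketch of this classification loop; the channel-array layout, the hypothetical extract_features helper, and the 16 × 16 daytime box size are illustrative assumptions:

```python
import numpy as np

def classify_image(channels, extract_features, train_X, train_y, box=16):
    """Classify every box-by-box sample of a multichannel image with a
    one-nearest-neighbor search over the stored training set (sketch)."""
    n_rows, n_cols = channels.shape[1] // box, channels.shape[2] // box
    labels = np.empty((n_rows, n_cols), dtype=train_y.dtype)
    for i in range(n_rows):
        for j in range(n_cols):
            tile = channels[:, i*box:(i+1)*box, j*box:(j+1)*box]
            feats = extract_features(tile)  # spectral/textural/ancillary features
            # Full scan of the training set for every image sample: this is
            # the cost that training-set reduction is meant to shrink.
            d = np.linalg.norm(train_X - feats, axis=1)
            labels[i, j] = train_y[np.argmin(d)]
    return labels
```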
TABLE 4. Number of samples for each complete training set.

GOES-10 day, land     5313
GOES-10 day, sea      5937
GOES-10 night, land   4539
GOES-10 night, sea    5518
GOES-12 day, land     2832
GOES-12 day, sea      2454
GOES-12 night, land   3078
GOES-12 night, sea    2511
FIG. 1. Two-class, two-feature example of training-set sample distribution within feature space.
JANUARY 2007
39
BANKERT AND WADE
FIG. 2. Example of a cross-validation fold for the GOES-10 day, land dataset. Each of the five training sets serves as the search space for the one-nearest-neighbor classification of the testing set. SD1: samples with all features within one standard deviation of the class/feature mean; SD2: samples with all features within two standard deviations of the class/feature mean.
Section 2 is a description of the training-set reduction methods studied. A comparison of the methods as applied to each training set and real-time data is presented in section 3. Conclusions and discussion are in section 4.
2. Training-set reduction methods

The most basic learning method to apply to an instance-based algorithm is to store all training-set samples. For large training sets, this method requires a relatively large amount of computer memory, resulting in slow execution of the classifier. Classifier generalization may also be degraded by the inclusion of noisy samples (Wilson and Martinez 2000). The long computer run time makes a quick turnaround for near-real-time image classifications costly, if not impossible. To provide potential users with near-real-time classifications, a number of training-set reduction methods were tested. The goal of these tests was to make the training set as small as possible with minimal effect on the classifier accuracy.
From the application of various training-set reduction techniques, retaining central samples in each class or finding those training-set samples that reside near feature space decision boundaries are two of the possible results (Wilson and Martinez 2000). Figure 1 is an example of a training set within feature space for a two-class (class X and class O), two-feature application of a one-nearest-neighbor algorithm. When the complete training set is reduced to central samples, only the red samples would be included. Conversely, only the green samples would be included when the training set is reduced to samples near decision boundaries.

TABLE 5. Average (fourfold cross validation) overall accuracy for the testing set given the nonreduced training set.

GOES-10 day, land     89.7%
GOES-10 day, sea      89.9%
GOES-10 night, land   85.7%
GOES-10 night, sea    77.9%
GOES-12 day, land     88.6%
GOES-12 day, sea      90.1%
GOES-12 night, land   87.8%
GOES-12 night, sea    84.9%
TABLE 6. Comparison of the testing-set results using each training-set reduction method with the results using the complete training set. Cent: centroids; SD1: samples with all features within one standard deviation of the class/feature mean; SD2: samples with all features within two standard deviations of the class/feature mean. Run-time reduction is the percent decrease in computer run time for a 30° latitude × 30° longitude image (land and sea both included in a given image). G10DL: GOES-10 day, land; G10DS: GOES-10 day, sea; G10NL: GOES-10 night, land; G10NS: GOES-10 night, sea; G12DL: GOES-12 day, land; G12DS: GOES-12 day, sea; G12NL: GOES-12 night, land; G12NS: GOES-12 night, sea.

              Size reduction (%)          Accuracy reduction (%)
           Cent   SD1   SD2   FCNN       Cent   SD1   SD2   FCNN
G10DL      99.6  86.6  24.5  72.4        29.9  14.6   3.2   4.0
G10DS      99.7  89.5  29.1  72.0        29.7  18.1   4.3   3.4
G10NL      99.7  76.0  17.1  65.1        36.9  16.5   4.0   5.5
G10NS      99.8  87.9  27.8  61.2        13.1   4.5   0.9   4.3
G12DL      99.3  86.7  28.0  71.6        25.8  12.4   3.1   2.4
G12DS      99.3  85.2  27.2  72.0        23.1  12.9   4.5   3.3
G12NL      99.6  76.5  21.9  67.8        33.9  12.1   5.3   5.5
G12NS      99.5  77.5  20.2  64.4        42.8  16.6   4.8   4.3

              Run-time reduction (%)
            Cent   SD1   SD2   FCNN
G10 day     71.9  68.0  23.8  60.7
G10 night   52.1  46.2  11.5  33.4
G12 day     45.9  43.4  16.7  39.1
G12 night   31.5  25.8   7.1  24.2
If testing sample A is being classified, it would be classified as an O class for the red training set and as an X class for the green training set. Samples B and C would be classified as O and X, respectively, regardless of whether the reduced training set consisted of central samples or boundary samples.

One methodology for creating a reduced training set of central feature space samples is to use the feature space centroid of each class as the training samples. This method also creates the smallest possible training set, with a size equal to the number of classes (i.e., one training sample per class). Each class centroid for this study consisted of the average value of each feature over that class. Another methodology for creating a reduced training set of central samples is to remove any training-set sample that has at least one feature falling more than σ standard deviations from the class/feature mean. The smaller σ is, the smaller the cluster that represents each class. The values σ = 1 and σ = 2 were used here.
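Minimal sketches of these two central-sample reductions follow, assuming the training samples are rows of a NumPy feature matrix X with class labels y (the names and array layout are assumptions):

```python
import numpy as np

def centroid_reduction(X, y):
    """Reduce each class to its feature-space centroid: one sample per class."""
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    return centroids, classes

def sd_reduction(X, y, n_sd=1.0):
    """Keep only samples whose every feature lies within n_sd standard
    deviations of its class/feature mean (SD1: n_sd=1; SD2: n_sd=2)."""
    keep = np.zeros(len(y), dtype=bool)
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        mu, sigma = X[idx].mean(axis=0), X[idx].std(axis=0)
        inside = np.all(np.abs(X[idx] - mu) <= n_sd * sigma, axis=1)
        keep[idx[inside]] = True
    return X[keep], y[keep]
```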
FIG. 3. Graphical comparison of average size, accuracy, and run-time reduction percentages over all eight datasets.
FIG. 4. GOES-10 (1900 UTC 5 Dec 2005) visible channel image.
Hart (1968) introduced a training-set reduction method called the condensed nearest-neighbor (CNN) rule, which produces a training set of samples close to class boundaries in feature space. This method results in a subset of training samples such that every training sample not in the subset would be classified correctly by the subset in a one-nearest-neighbor classifier. CNN begins with a random selection of one training sample from each class to be placed in the subset. If a sample from the remaining training set is misclassified using only this subset of training samples, that sample is added to the subset immediately. This process continues until no more samples can be added to the subset. This method should result in training samples that are closest to the class boundaries in feature space. A variant of this method, known as the fast condensed nearest-neighbor (FCNN) rule (Angiulli 2005), starts the subset selection with the class centroids instead of a random selection.
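The following is a sketch of Hart's CNN loop with an optional centroid-based seeding in the spirit of the FCNN initialization described above. Note that the full FCNN rule of Angiulli (2005) also batches its additions per iteration; this sketch keeps the simpler one-at-a-time CNN loop for clarity:

```python
import numpy as np

def _nn_label(x, sub_X, sub_y):
    # Label of x under one nearest neighbor over the current subset.
    return sub_y[np.argmin(np.linalg.norm(sub_X - x, axis=1))]

def cnn_reduction(X, y, seed_centroids=False, seed=0):
    """Condensed nearest-neighbor reduction (sketch of Hart 1968)."""
    classes = np.unique(y)
    if seed_centroids:
        # FCNN-style start: for each class, seed with the training sample
        # closest to that class's feature-space centroid.
        seeds = []
        for c in classes:
            idx = np.where(y == c)[0]
            centroid = X[idx].mean(axis=0)
            seeds.append(idx[np.argmin(np.linalg.norm(X[idx] - centroid, axis=1))])
    else:
        # Hart's CNN: one randomly chosen training sample per class.
        rng = np.random.default_rng(seed)
        seeds = [rng.choice(np.where(y == c)[0]) for c in classes]
    in_subset = np.zeros(len(y), dtype=bool)
    in_subset[seeds] = True
    changed = True
    while changed:  # sweep until a full pass adds nothing
        changed = False
        for i in np.where(~in_subset)[0]:
            if _nn_label(X[i], X[in_subset], y[in_subset]) != y[i]:
                in_subset[i] = True  # misclassified: add to the subset immediately
                changed = True
    return X[in_subset], y[in_subset]
```

Every sample left out of the returned subset is, by construction, classified correctly by the subset under the one-nearest-neighbor rule.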
3. Method comparison

Four training-set reduction methods were compared with the nonreduced training set by performing a fourfold cross validation of each of the eight complete training datasets. For a fourfold cross validation, the data are randomly split into four subsets; each subset is iteratively held out as the testing set while the remaining three subsets are combined to form the training data. Each training set was reduced by each of the training-set reduction methods, creating a total of five training sets (four reduced sets and one nonreduced set) for each of the eight original datasets. The four reduced training sets are 1) centroids only, 2) samples with all features within one standard deviation of the class/feature mean (SD1 hereinafter), 3) samples with all features within two standard deviations of the class/feature mean (SD2 hereinafter), and 4) samples selected using the FCNN rule. To clarify this description, Fig. 2 is a diagram of the data distribution for a given training dataset.
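A sketch of this evaluation loop, with the reduction method passed in as a function (the Euclidean 1-NN and the identity baseline are assumptions of the sketch):

```python
import numpy as np

def fourfold_cv_accuracy(X, y, reduce_fn, k=4, seed=0):
    """Average 1-NN testing-set accuracy over k folds, applying a
    training-set reduction (centroid, SD1/SD2, FCNN, ...) to each
    fold's training portion before classification (sketch)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accs = []
    for f in range(k):
        test = folds[f]
        train = np.concatenate([folds[g] for g in range(k) if g != f])
        red_X, red_y = reduce_fn(X[train], y[train])  # reduced training set
        # Classify each testing sample against the reduced set only.
        preds = [red_y[np.argmin(np.linalg.norm(red_X - X[t], axis=1))]
                 for t in test]
        accs.append(np.mean(np.array(preds) == y[test]))
    return float(np.mean(accs))
```

The nonreduced baseline corresponds to an identity reduction, e.g., reduce_fn=lambda X, y: (X, y).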
FIG. 5. Same as Fig. 4, but for the infrared (11 μm) image.
A one-nearest-neighbor algorithm was performed on each testing-set sample using the reduced training set as the search space for a nearest neighbor. The overall accuracy (average of the fourfold cross validation) for each testing set when the nonreduced training set was used is listed in Table 5. The classification accuracy at night was predictably lower for all cases because of the lack of visible channel data, but all accuracies were greater than 77%. For individual classes (not shown here), cirrocumulus (Cc) clouds
were the most misclassified (by percentage) for daytime classifications, while the mixed and high, thin cloud classes were consistently the most misclassified for nighttime classifications. In comparison with the nonreduced training set, the average (over the fourfold cross validation) percent training-set size reduction, the average decrease in testing-set accuracy, and the average percent decrease in run time (using near-real-time images) for each reduced training set are presented in Table 6. The size and accuracy reductions had fairly small variability among the eight datasets for any given training-set reduction method. Typical run times when using the complete training sets ranged from 8 to 12 min on a single-processor machine for the image size used here (30° latitude × 30° longitude).
FIG. 6. Same as Fig. 4, but for the cloud classification image using all training samples for the one-nearest-neighbor classifier.
Longer run times were observed for the classification of larger areas. The differences in run-time reduction between GOES-10 and GOES-12 and between day and night need further investigation. Differences in the total number of training samples (Table 4), with the complete GOES-10 datasets being much larger than the GOES-12 datasets, and the day versus night sample-size differences within real-time image classifications may be contributing factors.

For three of the methods (excluding SD2), the size of the training set was reduced by at least 61%. The following questions need to be answered: 1) Do these large training-set reductions result in a large cost in accuracy for the one-nearest-neighbor algorithm, and 2) do they reduce the run time significantly? As expected, the greater the size reduction, the greater the run-time reduction. The non-SD2 methods had at least a 24% average run-time reduction over all of the datasets, with all three providing more than a 60% run-time reduction for the GOES-10 day real-time classifications. While the SD1 and FCNN methods have similar run-time reductions, the FCNN method performed better in terms of minimizing the accuracy losses over the testing sets, with a maximum decrease of 5.5% overall.
FIG. 7. Same as Fig. 6, but using centroids as the training sets for the one-nearest-neighbor classifier.
The centroid method provides the greatest benefit in terms of computer run time, with an average 50.4% reduction over all datasets, but at a much greater cost in accuracy (29.4% average decrease). The SD2 method accuracies were higher than those of the centroid and SD1 methods for all datasets. However, despite its much larger training-set size, the SD2 method did not provide a significant accuracy improvement over the FCNN method: the FCNN method produces less accuracy reduction (i.e., higher accuracy) than SD2 for half of the eight datasets, with a much quicker run time. A graph of the size, accuracy, and run-time reduction averages over all eight datasets is displayed in Fig. 3. The SD2 method provided the smallest average decrease in accuracy (3.8%), but the FCNN method had a similar decrease (4.1%) and a much greater reduction in computer run time (39.4% versus 14.8%).

While a direct linear relationship between training-set size reduction and decrease in computer run time appears evident in Fig. 3, examining the individual datasets (Table 6) provides a different observation. For example, the GOES-10 day centroid datasets were created from a 99.6% (land) and 99.7% (sea) reduction of the total training dataset.
FIG. 8. Same as Fig. 6, but using training samples with feature values within one standard deviation of class/feature mean (SD1 method) for the one-nearest-neighbor classifier.
The run-time reduction for GOES-10 day (land-and-sea-combined image) was 71.9%. The GOES-12 day centroid datasets were similar in size reduction to the GOES-10 day datasets, but the run-time reduction was only 45.9%. Variability in dataset size and/or in the number of selected features could be among the reasons for these differences.

Although an in-depth interpretation is beyond the scope of this paper, some tendencies appeared in the individual class accuracies throughout the various testing-set results. For the daytime land datasets and all four nighttime datasets, the convective classes [day: cumulonimbus (Cb); night: deep convection (DC)] have higher accuracies when using the SD2 training set than when using the nonreduced training set. One possible explanation is that these classes are unique within their respective feature spaces and the SD2 training-set cluster becomes very isolated. This type of feature space distribution would allow any Cb or DC testing sample, whether it falls within, near, or far outside the cluster, to be classified correctly. When the complete dataset was used, training samples closer to class decision boundaries for all classes were included, thus allowing more Cb or DC testing samples to be misclassified.
FIG. 9. Same as Fig. 6, but using training samples with feature values within two standard deviations of class/feature mean (SD2 method) for the one-nearest-neighbor classifier.
This explanation was reinforced by the fact that, for all eight datasets, Cb (day) or DC (night) testing samples had higher accuracies using the SD2 training sets than using the FCNN training sets. The samples in the FCNN training sets were closer to class boundaries than those within a cluster near the centroid. Conversely, stratus-type clouds [day: stratus (St) and altostratus (As); night: high thick (mainly cirrostratus)] had higher accuracies when the FCNN training sets were used as opposed to the SD2 training sets. The chosen feature sets may limit the distance between the SD2 clusters for these classes, requiring the inclusion of boundary or outlier samples to maintain the class accuracies in the testing-set results.

The statistics generated from the various training/testing-set combinations provide a measure of the potential performance of a given algorithm. Examining images generated with the reduced training sets is also informative. As an example, a GOES-10 daytime dataset (with both land and sea in the image) was classified five times: once with the complete training set and once with each of the four reduced sets created using the methods described here. The visible and infrared images as well as the classification images for this example are displayed in Figs. 4–10.
FIG. 10. Same as Fig. 6, but using training samples found through application of the FCNN rule for the one-nearest-neighbor classifier.
The differences among the classification images are visually distinguishable and illustrate how the classification can be affected by changes in the training set. To quantify the differences for real-time examples, a set of 32 images was classified using all of the appropriate reduced training sets along with a classification using the complete training set. Using the complete-training-set classification image as the benchmark, the average percent pixel differences are listed in Table 7. As seen in the testing data, SD2 and FCNN provided the closest approximation to the benchmark. Taking into account the significant differences in run time, the FCNN method would appear to be the preferred method for training-set reduction. While all 32 images were from the same time of year (late autumn/early winter), similar results should occur throughout the year because the same training data and classes are applied through all seasons.
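The benchmark comparison statistic itself is simple; a sketch, assuming the classification images are equal-shape arrays of class labels:

```python
import numpy as np

def percent_pixel_difference(benchmark, candidate):
    """Percent of pixels whose class label differs from the
    complete-training-set (benchmark) classification image."""
    return 100.0 * float(np.mean(benchmark != candidate))
```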
4. Discussion

For a successful instance-based cloud classifier, collecting as much training-set data as is needed to represent the universe of possibilities for the given classes appears to be a goal.
TABLE 7. Comparison of each training-set reduction method with the complete training set in terms of average percent pixel difference for 32 real-time images (8 per dataset), where G10 denotes GOES-10 and G12 denotes GOES-12.

             Cent   SD1   SD2   FCNN
G10 day      17.3  13.4   6.6   7.2
G10 night    10.6   8.1   4.7   3.5
G12 day      13.4   8.4   5.9   4.2
G12 night    20.9   6.6   3.9   5.9
However, reaching this goal may not always be desirable. Not only could a large training dataset introduce noise that degrades classifier performance, it may also make near-real-time analysis cumbersome at a minimum and, because of excessive computer run times, nearly impossible in certain situations. The motivation behind this research was to find methods that reduce the size of the training sets, and the associated computation run time, with minimal impact on accuracy.

While there appears to be some individual class preference for clustered central training samples (i.e., the SD2 method) rather than class-boundary training samples (i.e., the FCNN method), the overall accuracy differences were minimal for nearly all of the eight datasets. For this type of imagery analysis, employing the FCNN-reduced training sets provided the greatest overall benefit: minimal accuracy reductions, significantly reduced computer run times, and faster near-real-time classifier output. It is worth noting that the FCNN method is not necessarily the best method for every one-nearest-neighbor problem. Data type, data amount, and the features used will also have an impact. In addition, a different training-set reduction method may work best if a different instance-based classification algorithm (other than nearest neighbor) were used on the same dataset. The choice of the training-set reduction method, as well as the choice of the classification method, is guided by user needs. Additional supervised machine learning algorithms and/or training-set reduction methods could be examined in future research.

Acknowledgments. The support of the sponsor, the Federal Aviation Administration (FAA) through the Aviation Weather Research Program (AWRP), is gratefully acknowledged. The views expressed are those of the authors and do not necessarily represent the official policy or position of the FAA. Past support for cloud classification research from the Office of
Naval Research and Oceanographer of the Navy is greatly appreciated. Thanks are extended to those past and present members of the Satellite Meteorological Applications Section of the Naval Research Laboratory Marine Meteorology Division for their advice and support.

REFERENCES

Aha, D. W., D. Kibler, and M. K. Albert, 1991: Instance-based learning algorithms. Mach. Learn., 6, 37–66.
Angiulli, F., 2005: Fast condensed nearest neighbor rule. Proc. 22d Int. Conf. on Machine Learning, Bonn, Germany, International Machine Learning Society, 25–32.
Bankert, R. L., and D. W. Aha, 1996: Improvement to a neural network cloud classifier. J. Appl. Meteor., 35, 2036–2039.
Baum, B. A., V. Tovinkere, J. Titlow, and R. M. Welch, 1997: Automated cloud classification of global AVHRR data using a fuzzy logic approach. J. Appl. Meteor., 36, 1519–1540.
Cover, T. M., and P. E. Hart, 1967: Nearest neighbor pattern classification. IEEE Trans. Inf. Theory, 13, 21–27.
Derrien, M., and H. Le Gleau, 1999: Cloud classification extracted from AVHRR and GOES imagery. Proc. 1999 EUMETSAT Meteorological Satellite Data User's Conf., Copenhagen, Denmark, EUMETSAT, 545–553.
Hart, P. E., 1968: The condensed nearest neighbor rule. IEEE Trans. Inf. Theory, 14, 515–516.
Hong, Y., K.-L. Hsu, S. Sorooshian, and X. Gao, 2004: Precipitation estimation from remotely sensed imagery using an artificial neural network cloud classification system. J. Appl. Meteor., 43, 1834–1853.
Lee, Y., G. Wahba, and S. A. Ackerman, 2004: Cloud classification of satellite radiance data by multicategory support vector machines. J. Atmos. Oceanic Technol., 21, 159–169.
Lewis, H. G., S. Cote, and A. R. L. Tatnall, 1997: Determination of spatial and temporal characteristics as an aid to neural network cloud classification. Int. J. Remote Sens., 18, 899–915.
Li, J., W. P. Menzel, Z. Yang, R. A. Frey, and S. A. Ackerman, 2003: High-spatial-resolution surface and cloud-type classification from MODIS multispectral band measurements. J. Appl. Meteor., 42, 204–226.
Li, Z., J. Li, P. Menzel, and T. J. Schmit, 2005: Imager capability on cloud classification using MODIS. Extended Abstracts, 21st Int. Conf. on Interactive Information Processing for Meteorology, Oceanography, and Hydrology, San Diego, CA, Amer. Meteor. Soc., CD-ROM, P1.30.
Liu, G. S., J. A. Curry, and R. S. Sheu, 1995: Classification of clouds over the western equatorial Pacific Ocean using combined infrared and microwave satellite data. J. Geophys. Res., 100, 13 811–13 826.
Miller, S. W., and W. J. Emery, 1997: An automated neural network cloud classifier for use over land and ocean surfaces. J. Appl. Meteor., 36, 1346–1362.
Pankiewicz, G. S., 1995: Pattern recognition techniques for identification of cloud and cloud systems. Meteor. Appl., 2, 257–271.
Pavolonis, M. J., A. K. Heidinger, and T. Uttal, 2005: Daytime global cloud typing from AVHRR and VIIRS: Algorithm description, validation, and comparisons. J. Appl. Meteor., 44, 804–826.
Tag, P. M., R. L. Bankert, and L. R. Brody, 2000: An AVHRR multiple cloud-type classification package. J. Appl. Meteor., 39, 125–134.
Tian, B., M. A. Shaikh, M. R. Azimi-Sadjadi, T. H. Vonder Haar, and D. L. Reinke, 1999: A study of cloud classification with neural networks using spectral and textural features. IEEE Trans. Neural Networks, 10, 138–151.
Uddstrom, M. J., and W. R. Gray, 1996: Satellite cloud classification and rain-rate estimation using multispectral radiances and measures of spatial texture. J. Appl. Meteor., 35, 839–858.
Welch, R. M., S. K. Sengupta, A. K. Goroch, P. Rabindra, N. Rangaraj, and M. S. Navar, 1992: Polar cloud and surface classification using AVHRR imagery: An intercomparison of methods. J. Appl. Meteor., 31, 405–420.
Wilson, D. R., and T. R. Martinez, 2000: Reduction techniques for instance-based learning algorithms. Mach. Learn., 38, 257–286.