
Using Digital Image Processing for Counting Whiteflies in Soybean Leaves

Jayme Garcia Arnal Barbedo
Embrapa Agricultural Informatics, Av. André Tosello, 209 - Barão Geraldo
C.P. 6041 - 13083-886 - Campinas, SP, Brazil
E-mail: [email protected]

This work was funded by the Brazilian Agricultural Research Corporation, under grant n. 03.12.01.002.00.00.

ABSTRACT

This paper presents a new system, based on digital image processing, to quantify whiteflies on soybean leaves. This approach allows counting to be fully automated, considerably speeding up the process in comparison with the manual approach. The proposed algorithm is capable of detecting and quantifying not only adult whiteflies, but also specimens in the nymph stage. A complete performance evaluation is presented, with emphasis on the conditions and situations for which the algorithm succeeds, and also on the circumstances that need further work. Although this proposal was entirely developed using soybean leaves, it can be easily extended to other kinds of crops with little or no change in the algorithm. The system employs only widely used image processing operations, so it can be easily implemented in any image processing software package.

Publisher version: https://doi.org/10.1016/j.aspen.2014.06.014
Citation: Barbedo, J. G. A. (2014). Using digital image processing for counting whiteflies on soybean leaves. Journal of Asia-Pacific Entomology, 17(4), 685-694.


INTRODUCTION

The whitefly is a small insect that feeds on the sap of a wide variety of plants (Flint, 2002). According to Martin and Mound (2007), there are more than 1500 identified species of whiteflies. It is one of the main pests affecting agriculture, with damage coming both from sap loss and from the transmission of a variety of diseases carried by the insects. In order to reduce the losses caused by whiteflies, two kinds of action are normally carried out: 1) monitoring crops in order to detect infestations as soon as possible, so control measures can be implemented more effectively; 2) research on more effective means to monitor crops and control the pest. In both cases, counting the number of insects (nymphs and adults) is a fundamental part of the process.

The most direct way of measuring whitefly infestation is to manually identify and count the insects inside a selected region. In general, this approach does not require sophisticated apparatus and, more importantly, relies on the remarkable human ability to resolve ambiguities and unclear situations, even under far from ideal conditions. On the other hand, human beings are susceptible to physiological and psychological phenomena that may be important sources of error: fatigue, visual illusions, and boredom, among others. Also, humans are usually much slower than machines in performing simple tasks like counting.

Two main strategies for automatically counting whiteflies can be found in the literature, one using sticky traps and the other using plant leaves directly. The first strategy employs a trap consisting of a piece of paper coated with a sticky substance (Boissard et al., 2008), which is placed in the greenhouse or field; the insects caught by the trap are then counted. The advantage of this approach is that it provides a smooth and neutral surface, making the counting easier. On the other hand, only specimens capable of flying are captured, which implies that insects in the early stages of development (nymphs) are not taken into account. The method proposed by Cho et al. (2007) explores size and color features to identify and count whiteflies, aphids and thrips, using color transformations, simple mathematical morphology operations, and thresholding. Boissard et al. (2008) proposed a cognitive vision system that combines image processing, neural learning and knowledge-based techniques to detect and count pests. Building on this system, Martin et al. (2011) proposed a video camera network for detecting and counting whiteflies and aphids, with the objective of reducing the amount of pesticides used in greenhouse crops; the detection is based on visual cues such as size, shape and color. Video cameras were also used by Bechar and Moisan (2010) to count whiteflies; the insects are identified by means of a sophisticated parametric approach, which employs techniques like Mixture of Gaussians and Principal Component Analysis to extract small spots present in the scene.

In the second strategy, the count is performed directly on the leaves, which may or may not be removed from the plant. In this case, non-flying individuals (nymphs and eggs) may be taken into consideration. The drawback is that imperfections, lesions and veins present in the leaves may hamper the count by mimicking or concealing the objects of interest (insects). Huddar et al. (2012) proposed a method capable of counting insects not only in greenhouses, but also in open farms; the algorithm has four steps: color conversion, segmentation based on the relative difference in pixel intensities, noise reduction by erosion, and counting based on Moore neighbor tracing and Jacob's stopping criterion. The method proposed by Pokharkar and Thool (2012) has two main stages: the first extracts objects through a sequence of operations such as background subtraction, filtering (Gaussian blur and Laplacian), and segmentation; the second extracts both local and regional features based on color, shape and size. The system proposed by Mundada and Gohokar (2013) begins by smoothing the image with a mean filter; a large number of features are then extracted and fed to a Support Vector Machine (SVM) classifier, which reveals whether the plant is infested, and finally a second SVM is used to classify the pests as either whiteflies or aphids. Faithpraise et al. (2013) proposed a system capable of detecting and classifying a wide variety of pests; the main focus of the algorithm is to compensate for perspective distortions so the insects can be correctly identified regardless of their position with respect to the sensor (camera).

Despite those efforts, the methods proposed in the literature still cannot deal with many of the challenges involved in counting whiteflies, which are described in the next section. As a result, their use is limited to specific conditions or to complete hardware systems specially designed to work together with the developed algorithm (e.g. Bauch and Rath, 2005). In this context, this paper has two main objectives. The first is to describe whitefly characteristics in each phase of the lifecycle, with emphasis on the challenges resulting from their peculiarities, as presented in the "Whitefly Characteristics" section. The same section presents a study in which different color spaces are considered in order to determine the representation that best favors the segmentation and identification of the objects of interest (nymphs and adult whiteflies). The second objective is to propose a new system for detecting and counting whiteflies capable of overcoming some of the challenges imposed by the problem. The system is an evolution of the algorithm proposed in Barbedo (2013), and its improvements are based on techniques described in Barbedo (2012). The main difference between the current work and previous studies is the ability to detect and count young insects (nymphs). This is important for rapid implementation of control measures before nymphs reach the adult winged stage and spread. The resulting algorithm is simple and easy to implement in any of the available image processing packages. The results are presented with special emphasis on the conditions under which the algorithm tends to work, and also on those under which it tends to fail. The final remarks focus on possible solutions that may be explored in the future to improve the algorithm's performance in more complex situations.

MATERIALS AND METHODS

Whitefly Characteristics

Color Properties and Morphological Characteristics

The lifecycle of whiteflies is composed of six stages, as shown in Fig. 1. Each of those stages is described in the following, based on information extracted from Souza (2004).


Fig. 1 - Whitefly lifecycle. (Images of the eggs and first, second and third instars are reproduced with permission of Dr. Surendra Dara, from University of California Cooperative Extension).

The whitefly eggs are elongated, about 0.2 mm long and approximately 0.08 mm wide. At first the eggs are white, but they acquire a yellow hue as the embryo develops, reaching a reddish hue just before eclosion. Due to their small size, magnification is required for the eggs to be detected in digital images. In this work, only images captured directly by conventional digital cameras, without magnification, are used, hence the eggs are not taken into consideration. In the first instar, the nymphs are 0.3 mm long and have a pale yellow coloration. Their detection in conventional images is difficult because of their small size and, more importantly, because they are mostly translucent. In the second and third instars, the nymphs grow to up to 0.6 mm, but their color and opacity change little. Thus, although detection becomes somewhat easier due to the increased size, their translucency makes them hard to detect even by visual inspection (Fig. 1). Fourth instar nymphs are larger and much more opaque than younger specimens. In most cases, this significantly increases their contrast with respect to their surroundings, as can be observed in Fig. 1. The translucent structure in the image representing the fourth instar is an empty exoskeleton, shed by a nymph to allow growth. Those exoskeletons pose an additional challenge for the algorithm, as they have the same size and shape as actual nymphs. When they finally reach the adult stage, whiteflies are about 1 mm long, their abdomens are yellow, and their wings are white. Those characteristics make this the easiest stage to detect, although the high mobility of the insects in this phase may result in distortions and aberrations in the captured image.


Color Models

The color model normally used to store images is RGB (Red-Green-Blue). This widely used format does not always convey enough information to allow adequate image segmentation. The objective of this section is to show how whiteflies are represented in different color models, in order to infer which color transformations should be applied prior to segmentation. Fig. 2a shows the reference image used in the comparisons, which contains third and fourth instar nymphs, adult whiteflies, and exoskeletons.

None of the channels of the RGB color model provides an adequate representation for a successful image segmentation. However, subtracting the blue channel from the green channel results in a more suitable representation (Fig. 2b). As can be observed, in this representation the yellow regions become very bright, while white regions become dark, which is very useful for distinguishing between nymphs and exoskeletons.

The second color model tested was CMYK (Cyan-Magenta-Yellow-Key). In the K channel of this color model, adult whiteflies and nymphs appear dark (Fig. 2c). The problem here is that veins and trichomes (epidermal hair cells) are also dark. As a consequence, shape and size rules would have to be applied after thresholding in order to identify the correct objects.

The third color model tested was CIELab, also known simply as Lab (Lightness and two channels representing opposing color dimensions). The b channel of this representation is useful for identifying empty exoskeletons, which appear dark in Fig. 2d. The wings of adult whiteflies also appear dark, which, combined with a lighter region corresponding to the head and abdomen of the insect, may be a good cue for the presence of adult whiteflies.

The fourth color model studied was HSV (Hue-Saturation-Value). This model is a cylindrical-coordinate representation of the RGB color model, created as an attempt to better approximate human perception (Smith, 1978). In the H channel of this color model, nymphs and the abdomens of adult insects present a dark shade, which sets them apart from the rest of the scene, while empty exoskeletons appear almost transparent (Fig. 2e).

The last color model studied was XYZ, which is composed of three channels based on the concept of relative luminance. None of the channels of this color model was able to highlight the objects of interest.

Although some channels of the tested color spaces present characteristics favorable to the segmentation of whiteflies, they all have weaknesses that may cause them to fail under certain conditions. For that reason, further studies were carried out in order to determine how chaining color transformations could improve the results. When a color transformation is applied, the resulting channels reveal new information that might not be visible in the previous representation. In this work, a further step in this "information mining" was taken by chaining color transformations. However, naively chaining color transformations would make no sense: no matter the number of intermediate transformations, the final result would always be simply the transformation from the original to the last color representation. What was done here was to always use the RGB-related equations to perform each transformation. For example, consider the RGB-HSV-Lab transformation chain. The first transformation is done normally, using the RGB-HSV transformation equations. In the second transformation, however, instead of using the HSV-Lab equations, the RGB-Lab equations are applied over the HSV representation. In other words, the H, S and V channels are treated as if they were the R, G and B channels. This results in a representation of the image that is completely different from any that would be obtained by the conventional transformations, and such a new representation often carries new information that can be explored to identify objects. This process is possible because, with the exception of CMYK, all the color spaces considered have three channels; in the case of CMYK, the K channel is discarded prior to a new transformation. Fig. 3 illustrates the example above.
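As an illustration, the chained transformation of Fig. 3 can be written in a few lines. The sketch below is a minimal illustration of the idea, assuming an 8-bit RGB input and OpenCV's standard conversion routines; it is not the exact implementation used in this work.

    import cv2

    def chain_rgb_hsv_lab(img_rgb):
        # Step 1: ordinary RGB -> HSV conversion.
        hsv = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2HSV)
        # Step 2: treat the H, S and V channels as if they were R, G and B,
        # and apply the RGB -> Lab equations to them.
        return cv2.cvtColor(hsv, cv2.COLOR_RGB2LAB)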

Fig. 2 - a) Example of an image containing adult whiteflies and nymphs. b) Representation with the blue channel subtracted from the green channel. c) Channel K of the color space CMYK. d) Channel b of the color space Lab. e) Channel H of the color space HSV.


Fig. 3 - The color transformation process. The equations used to go from one color model to the next always consider RGB as the reference model.

The adopted chaining process was inspired by empirical observations. Because of that, all possible chains of up to three transformations were investigated for each type of object, considering all the color models described before. The best chains were selected for inclusion in the proposed algorithm, as described in the next section.

Image Capture

The proposed system is sensitive to the conditions under which the images are captured. In order to ensure the best results, the following conditions should be met:

- The illumination should be as diffuse as possible, so prominent features on the surface of the leaf do not cast shadows that could harm the segmentation process. This also prevents specular reflections, which can be very damaging. This condition can easily be met in the laboratory, as long as direct illumination is avoided, that is, the leaf is not placed directly beneath the light source. In the field, if the weather is overcast, the illumination is usually naturally diffuse. Sunny conditions, however, require a semi-transparent or opaque screen to cast a shadow over the leaf. The screen should be distant enough from the leaf that the image can be comfortably captured.

- The image should be captured perpendicularly to the leaf, that is, the lens axis must be normal to the surface of the leaf. This condition aims to prevent size and shape distortions due to perspective. It is worth highlighting that this condition is usually met even if the person capturing the image was not instructed to do so, as the perpendicular position normally provides the best depiction of the leaf's condition. However, if the leaf is in an awkward position, it may be necessary to mechanically move it to a better angle of capture. This may have the undesirable effect of disturbing the insects, so leaves that are already in an appropriate position should be preferred as targets whenever possible. It is worth noting that, as will be seen in the results section, deviations of up to 30° from the ideal 90° angle cause only minor drops in the algorithm's accuracy, which means that only on very rare occasions will the leaf have to be manually moved.

- If the leaf is rugged, it should be carefully flattened to avoid perspective distortions and focus problems. As it may be impossible to do this without disturbing the insects and displacing the very elements that should be counted, an alternative solution is to take into account only the regions of the leaf that are perfectly focused and have little perspective distortion.


It is important to emphasize that small to moderate deviations from those ideal conditions will not have a significant impact on the accuracy of the system. More importantly, even if one or more of those conditions are not observed, this does not mean that the algorithm will inevitably fail. However, the farther the capture conditions are from ideal, the less reliable the estimates yielded by the system will be. The "Results and Discussion" section presents some tests regarding how sensitive the system is to variations in those conditions.

Database

The main database used in the tests is composed of 748 images of soybean leaves. The degree of infestation of those leaves varies from completely healthy and pest free to highly infested. In a few images all whiteflies are at the same stage of development, while most of the images contain specimens at various stages of the whitefly lifecycle. About a quarter of the images were captured under close-to-ideal conditions, while the conditions for the other 75% were varied so the sensitivity of the system to those conditions could be investigated (see "Results and Discussion" section). The images were captured using a 10-megapixel consumer-level compact camera (Panasonic Lumix DMC-LZ10). Although higher-end camera models would likely provide better quality images, the idea here was to keep the test conditions as close as possible to those expected to be produced by potential users of the system. The captured images are 3648 pixels wide and 2736 pixels high. They were stored in the JPEG format (best quality setting) using 8 bits per pixel for each color channel (RGB). Being a lossy compression algorithm, JPEG always introduces some undesirable artifacts. On the other hand, it is unlikely that users of the proposed algorithm will capture images in the uncompressed RAW format, and many cameras simply do not offer such an option, justifying the format choice.

For some of the images, a translucent plastic pipe 10 cm in diameter was attached to the camera in order to define a circular region of interest. This is a common practice in studies involving whiteflies, as it compensates for leaf size variations by ensuring that the counts are always performed over regions with similar areas. An additional benefit is that, with the diameter of the pipe known, it is possible to convert measurements from pixels to any other length unit, as illustrated below. Fig. 4 shows an example of an image captured using the pipe. This image will be used as the example throughout the paper. It was chosen because it contains several elements of interest: nymphs, adult whiteflies, exoskeletons, and also some dark structures caused by fungi. Although the leaf in the image was removed from the host plant and taken to a laboratory, this is not mandatory in order to meet the conditions described in the previous section. In fact, this was done here to ensure maximum control over environmental variables, which is necessary for rigorous and meaningful tests. The leaf removal was done very carefully to avoid disturbing the insects' positions, but eventual disruptions of the original positions have no impact on the results presented here. However, in recognition of the importance of having some results using images captured in the field, a smaller set containing 15 images of leaves under sunlight and 15 images of leaves taken on a cloudy day was also built.
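As a simple illustration of this conversion, the ratio between the pipe's known diameter and its diameter in pixels gives the image scale. The sketch below is hypothetical; pipe_diameter_px would be measured from the image, for example from the circular region of interest.

    PIPE_DIAMETER_MM = 100.0  # known physical diameter of the pipe (10 cm)

    def mm_per_pixel(pipe_diameter_px):
        # Scale factor for converting pixel measurements to millimeters.
        return PIPE_DIAMETER_MM / pipe_diameter_px

    # Example: if the pipe spans 3600 pixels, an object 36 pixels long
    # measures about 1 mm.
    print(mm_per_pixel(3600.0) * 36)  # -> 1.0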


Fig. 4 - Image to be used as an example in the description of the algorithm.

Description of the System

The system presented in the following is strongly based on color heuristics and mathematical morphology. The primary reason for this choice was to keep the algorithm simple to implement and to ensure that the resulting code would have low computational complexity, allowing real-time processing and implementation on devices with low computational power. Additionally, using more sophisticated techniques, such as artificial neural networks (ANN), support vector machines (SVM), and deep learning, did not statistically improve the results for the dataset used in the tests. The results of the statistical analysis are shown in the "Results and Discussion" section. The system is composed of four main parts: delimitation of the region of interest (ROI), application of the color transformations, threshold-based segmentation, and detection of young nymphs (first to third instars).

Delimitation of the Region of Interest (ROI)

The ROI delimitation procedure described in this section may be skipped if the user chooses to perform the delimitation manually. In the context of this work, the delimitation was always performed automatically. In order to isolate the ROI, the image is transformed to the CMYK color space, from which only the C and Y channels are used (Fig. 5a and 5b). The C and Y channels are then binarized using, respectively, the values 150 and 100 as thresholds, that is, all pixels above those levels are made white, and all others are made black. The resulting binary masks are shown in Fig. 5c and 5d. Next, the binary operation AND is performed between both masks, and all holes (black regions totally surrounded by white pixels) are filled (Fig. 5e). Finally, the connected objects are identified, and only the one with the largest area is kept, resulting in the final mask (Fig. 5f). Fig. 6 shows the result of applying the mask to the reference image shown in Fig. 4.

Color Space Transformation

As mentioned above, different color transformation chains were tested. In the end, the following chains led to the best results:

- Nymphs: RGB - Lab - XYZ; from the final result, the third channel (Z) was considered (Fig. 7a).
- Adult whiteflies: RGB - XYZ - XYZ - CMYK; from the final result, the first channel (C) was considered (Fig. 7b).
- Exoskeletons: RGB - CMYK - XYZ; from the final result, the third channel (Z) was considered (Fig. 7c).
- Leaf lesions and fungi: RGB - Lab - CMYK; from the final result, the second channel (M) was considered (Fig. 7d).

Fig. 5 - a) Representation of the image in the channel C of the CMYK color space. b) Representation of the image in the channel Y of the CMYK color space. c) Binary mask of channel C. d) Binary mask of channel Y. e) Combination of the two masks using the binary operation AND. f) Final mask that defines the region of interest.
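A minimal sketch of the ROI delimitation illustrated in Fig. 5 is given below. It assumes an 8-bit RGB image and uses the simple device-independent RGB-to-CMYK formula, which may differ slightly from the conversion used in this work.

    import numpy as np
    from scipy.ndimage import binary_fill_holes, label

    def roi_mask(img_rgb):
        rgb = img_rgb.astype(np.float64) / 255.0
        # Simple RGB -> CMYK conversion, with C and Y scaled to [0, 255].
        k = 1.0 - rgb.max(axis=2)
        denom = np.where(k < 1.0, 1.0 - k, 1.0)  # avoid division by zero
        c = (1.0 - rgb[..., 0] - k) / denom * 255.0
        y = (1.0 - rgb[..., 2] - k) / denom * 255.0
        # Binarize C and Y (thresholds 150 and 100) and AND the masks.
        mask = (c > 150) & (y > 100)
        # Fill enclosed holes and keep only the largest connected object.
        mask = binary_fill_holes(mask)
        labeled, n = label(mask)
        if n == 0:
            return mask
        sizes = np.bincount(labeled.ravel())[1:]
        return labeled == (1 + sizes.argmax())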

Delimitation of the Objects

The images resulting from the color transformations are thresholded according to the following rules:

- Nymphs: all pixels with value above 230 are made white (Fig. 7e).
- Adult whiteflies: all pixels with value below 13 are made white (Fig. 7f).
- Exoskeletons: all pixels with value below 64 are made white (Fig. 7g).
- Lesions and fungi: all pixels with value below 128 are made white (Fig. 7h).
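As an example, the pipeline for nymphs (the RGB - Lab - XYZ chain followed by the 230 threshold on the Z channel) might be sketched as follows, again assuming OpenCV's 8-bit conversions and the chaining convention of Fig. 3:

    import cv2

    def nymph_candidates(img_rgb):
        lab = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2LAB)  # RGB -> Lab
        # Lab channels treated as R, G and B (chained transformation).
        xyz = cv2.cvtColor(lab, cv2.COLOR_RGB2XYZ)
        return xyz[..., 2] > 230  # third channel (Z), pixels above 230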


Fig. 6 - Masked reference image.

These threshold values were determined using a single image containing all types of objects considered. An obvious problem with this approach is that different lighting conditions will probably change the object/background relations, in which case those threshold values may no longer properly highlight the objects. To partially compensate for this, the pixel values of the entire image are shifted so the peak of the green channel histogram is located at the value 140. For example, if the peak of the histogram of a given image is at 130, the values of all pixels in all channels of the image are increased by 10. Pixels whose resulting values fall below 0 or above 255 are made equal to 0 and 255, respectively.

The procedure described above works very well if the leaf under scrutiny has a shade of green close to that of the leaf used to determine the threshold values. As this shade departs from the reference one, the relation between the colors of the leaf and the objects changes, in which case the effectiveness of the pixel value shift may be reduced. In practice, it was observed that this becomes a problem only when the leaf presents a considerably lighter shade of green. Problems may also arise when the color of the leaf is not homogeneous: if considerably different shades of green are present within the leaf, the algorithm will have problems detecting the objects in some regions. A possible solution would be to apply the pixel value shift locally; however, tests have shown that the side effects of this approach cause more damage than the problem it is meant to mitigate. Thus, this remains an open problem to be tackled in future research. All considerations made in this paper assume that the leaves are completely, or at least mostly, green. As the hue departs from green toward yellow and brown, the algorithm will most likely fail. It is important to highlight, however, that the method is capable of dealing with small lesions and diseased regions. A final observation regarding the thresholding and pixel value shift adopted in this work is that all tests were performed using soybean leaves. The leaves used in the tests presented a quite wide range of shades of green (this variation being due both to lighting variations and to leaf maturity), which makes the observations made here comprehensive. In other words, it is likely that the results hold for other plant species; however, for some plants, a new algorithm optimization may be required.

After the thresholding, the objects of interest are highlighted in the binary images, along with a large number of spurious objects. In order to eliminate those undesirable elements, all objects smaller than 10% of the size of the largest object are removed (Fig. 7i-l); in manually annotated images, the observed ratio between the largest and smallest objects was 9.3, 1.2, 5.5, and 157.1 for nymphs, adult whiteflies, exoskeletons, and lesions and fungi, respectively. As can be seen, the ratio for lesions is well above the adopted threshold, but very small lesions usually cannot be distinguished from debris, so their removal is inevitable. Objects that touch the borders of the ROI are also discarded, because that region normally contains distortions that can lead to error. After that, all connected objects are identified and counted.
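A sketch of the histogram-based compensation described above, assuming an 8-bit RGB image stored as a NumPy array:

    import numpy as np

    def shift_to_reference(img_rgb, ref_peak=140):
        # Locate the peak of the green channel histogram.
        hist = np.bincount(img_rgb[..., 1].ravel(), minlength=256)
        shift = ref_peak - int(hist.argmax())
        # Apply the same shift to all channels, clipping to [0, 255].
        shifted = img_rgb.astype(np.int16) + shift
        return np.clip(shifted, 0, 255).astype(np.uint8)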

Fig. 7 - a-d) Images after color transformations. e-h) Thresholded images. i-l) Images after purging of spurious objects.
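The purge-and-count step shown in Fig. 7i-l can be expressed compactly. The sketch below uses scikit-image's connected-component tools and the 10% size rule described above (the removal of border-touching objects is omitted for brevity):

    from skimage.measure import label, regionprops

    def count_objects(mask, min_frac=0.10):
        # Identify connected objects, drop those smaller than min_frac
        # of the largest one, and count the survivors.
        regions = regionprops(label(mask))
        if not regions:
            return 0
        largest = max(r.area for r in regions)
        return sum(1 for r in regions if r.area >= min_frac * largest)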

Detection of Young Nymphs

The strategy for detecting young nymphs takes as its starting point the image resulting from the color transformation adopted for nymphs (Fig. 7a). As in the case of fourth instar nymphs, a thresholding is carried out, but here all pixels with value above 100, rather than 230, are made white. Fig. 8a shows the result of this threshold applied to Fig. 2a, in which nymphs of third and fourth instar, parts of adult whiteflies, and a large number of spurious elements, particularly leaf veins, are highlighted. In order to clean the image, a rule is applied which eliminates all connected objects whose eccentricity is greater than 0.9; empirical tests using manually annotated images revealed that young nymphs have eccentricities between 0.7 and 0.85. The eccentricity is calculated by first identifying the smallest ellipse that encloses the object, then measuring the distance between the foci of the ellipse, and finally dividing this distance by the length of the longest axis of the ellipse. A perfect circle has an eccentricity of zero, while a line has an eccentricity of 1. Fig. 8b shows the result of applying this rule. As can be seen, elongated objects are removed from the image. An undesirable side effect of this operation is that all nymphs touching veins are also discarded. Next, all objects that were detected in previous steps, particularly fourth instar nymphs and adult whiteflies, are discarded (Fig. 8c). Finally, all objects smaller than 10% of the size of the largest object are eliminated (Fig. 8d).
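The eccentricity filter might be sketched as follows. Note that scikit-image's regionprops derives eccentricity from the ellipse with the same second moments as the object, which approximates, but is not identical to, the enclosing-ellipse procedure described above.

    import numpy as np
    from skimage.measure import label, regionprops

    def remove_elongated(mask, max_ecc=0.9):
        labeled = label(mask)
        out = np.zeros_like(mask, dtype=bool)
        for region in regionprops(labeled):
            # Keep only objects that are not too elongated.
            if region.eccentricity <= max_ecc:
                out[labeled == region.label] = True
        return out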

Fig. 8 - Process for detecting nymphs of first to third instar.

As can be observed, some spurious objects still persist after the operations, and not all nymphs are detected.


This happens for two main reasons: 1) the high degree of translucency of the nymphs requires the adoption of a threshold value that inevitably includes a large number of spurious objects; 2) the size variation between first and third instar nymphs makes it difficult for a simple size criterion to remove spurious objects. The eccentricity criterion successfully eliminates a large number of undesirable objects, but it is not able to remove them all. In practice, it was observed that, if the leaf has little debris and there is low variation in the size of the nymphs (no more than 50%), the error levels, even for first instar nymphs, are quite low. On the other hand, less than ideal conditions will lead to large error rates. Because of that, the implementation of the algorithm includes a step in which the detected objects are marked over the original image. The resulting image is presented to the user, who can, if necessary, manually correct the results. This correction is quick and simple: with the left mouse button, the user can mark undetected objects, and with the right button, he/she can unmark falsely detected elements. This correction step adds some time to the entire process but, as will be seen in the next section, even in this case the proposed system is faster than manual counting.

RESULTS AND DISCUSSION

The main results for the tests performed with the algorithm are shown in Tables 1 and 2. The deviations between the estimated and actual numbers of objects are shown in Table 1. Sometimes those deviation values can be very low due to false positives and false negatives mutually cancelling out: if the number of false positives and false negatives is similar, the estimated number of objects will be very close to the actual value despite a large number of misclassifications. To provide more meaningful information, Table 1 also presents the F-measure for each class. This value, the harmonic mean of precision (a measure of false positives) and recall (a measure of false negatives) (Powers, 2011), quantifies the amount of mistakes made by the algorithm, 0 being the worst and 1 the best possible value. The minimum acceptable F-measure depends on the type of application, but a threshold of 0.8 may be taken as a rule of thumb. Finally, Table 2 presents a confusion matrix that discriminates the errors made by the algorithm; for example, first instar nymphs were correctly classified 219 times, classified as second instar 33 times, identified as third instar once, and so on.

Table 1 - Relative accuracy for the count estimate.

Object             Actual  Estimated  Deviation  F-Measure
1st Instar            705        360       -49%       0.68
2nd Instar            898        705       -21%       0.79
3rd Instar           1226        981       -20%       0.85
4th Instar           1520       1550        +2%       0.94
Adult Whiteflies     1367       1353        -1%       0.95
Exoskeletons         2126       2210        +4%       0.93
Fungi and Lesions     256        276        +8%       0.81
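For reference, the F-measure reported in Tables 1, 3 and 4 is the standard harmonic mean of precision P and recall R:

    F = 2PR / (P + R)

For instance, a class with precision 0.9 and recall 0.7 would score F = 2(0.9)(0.7)/(0.9 + 0.7) ≈ 0.79.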

The main remarks that can be drawn from the tables are presented next. A large number of first and second instar nymphs were not detected at all, and the algorithm also had some problems differentiating between these two classes. This poor performance is due to two main factors: the high degree of translucency and the small size of the nymphs. The translucency causes the nymphs to blend with the leaf, making detection difficult. The small size may cause two problems. First, one of the strategies adopted for eliminating spurious objects is the removal of elements much smaller than the largest detected object. Although the size threshold was carefully determined so it would not discard actual nymphs, if part of a nymph is not detected, its apparent size may fall below the threshold, in which case it is discarded. Second, if only small objects are present, that is, if the largest detected object is itself small, the size threshold may be so low that a large number of spurious objects end up being included in the count. As a result of those observations, manual correction may be necessary in approximately 85% of the cases in which first and second instar nymphs are present.

Table 2 - Confusion matrix obtained using the proposed algorithm. The values represent numbers of observations; rows are actual classes and columns are predicted classes.

Actual \ Predicted  1st Instar  2nd Instar  3rd Instar  4th Instar  Adults  Exoskel.  Lesions  Leaf
1st Instar                 219          33           1           0       0        12        3   437
2nd Instar                  99         557          50           0       0        15        3   174
3rd Instar                  13          89         883          37       3       139        7    55
4th Instar                   0           0          28        1445      32        12        1     2
Adults                       0           0           1          50    1283         8       21     4
Exoskel.                    17          20          16          11      14      2008       29    11
Lesions                      1           2           1           7      21        10      207     7
Leaf                        11           4           1           0       0         6        5     *

The number of false positives for third instar nymphs is relatively small, with most of the confusion happening with the second instar. On the other hand, the number of missed objects (false negatives) is quite high. The main reason is that the exoskeletons often have characteristics similar to those of third instar nymphs. A second major reason is that, as commented before, the procedure for detecting first to third instar nymphs has the side effect of removing all objects that touch the leaf veins. In this case, manual corrections may be necessary in about 35% of the cases.

The results for fourth instar nymphs are significantly better due to the greater opacity of the nymphs. Most errors, either false positives or false negatives, occur due to some degree of overlap in characteristics between the classes. For example, parts of the body of adult whiteflies have colors identical to those of fourth instar nymphs, which, depending on the angle of the insect with respect to the camera's sensor, may lead to error. This class also has some similarities with third instar insects (size) and with some lesions (color). This may cause nymphs to remain undetected when they are located in diseased areas of the leaf, and also when the image was captured under inadequate lighting conditions.

The results for adult whiteflies are also good, with a few errors due to characteristics overlapping with other classes (especially the fourth instar). It was observed that some errors were caused by whiteflies moving at the instant of image capture. In the case of exoskeletons, depending on the lighting during image capture, third instar nymphs may be misclassified as exoskeletons, causing a large number of false positives for this class. Fungi and lesions may vary considerably in size, shape and color; in some cases, they may emulate the characteristics of other objects of interest, causing error. The relatively low F-measure for this class is most likely because it has the fewest samples, making it more subject to outliers. In general, the observed results can be considered good.

Some tests were carried out in order to determine the sensitivity of the algorithm to non-ideal conditions:

- Specular reflection: when directly illuminated at certain angles, some surfaces may reflect almost all light, producing a glaring effect that may obfuscate an entire region of the image. In such cases, the detection of the objects of interest is nearly impossible. To avoid this problem, the illumination should be as diffuse as possible, as stated before.

- Shadows: if the leaves are directly illuminated at a given angle, prominent structures, like veins and the insects themselves, may cast shadows, which are yet another visual structure that must be filtered out. Small shadows, caused by high-angle illumination, are not very damaging. On the other hand, low-angle illumination may produce extensive shadows that cover objects, making their detection more difficult. Additionally, the algorithm may perceive the shadows as lesions or fungi, which can severely damage the whole identification process. This is yet another reason to use diffuse illumination; if that is not possible, the illumination should be as perpendicular to the leaf's surface as possible.

- Angle of capture: as stated before, the images should be captured with the central axis of the sensor as perpendicular to the leaf's surface as possible (90°). Table 3 shows the effects of the angle of capture on the F-measures for each type of object. As can be seen, small deviations from the 90° angle have limited impact on the results. However, as the deviation increases, a significant drop in accuracy is observed, mostly due to perspective aberrations and focus problems.

Table 3 - Effects of the angle of capture on the performance of the method, in terms of F-measures.

Object              90°    75°    60°    45°
1st Instar          0.68   0.67   0.60   0.50
2nd Instar          0.79   0.77   0.71   0.60
3rd Instar          0.85   0.82   0.76   0.67
4th Instar          0.94   0.93   0.88   0.77
Adult Whiteflies    0.95   0.94   0.90   0.77
Exoskeletons        0.93   0.90   0.83   0.65
Fungi and Lesions   0.81   0.81   0.76   0.66


- Rugged leaves: when the leaves are severely rugged or irregular, some parts of their surface may not be properly focused, that is, they appear blurred in the captured image. This is particularly damaging for the detection of small objects, like young nymphs, as they cannot be resolved under such conditions. To avoid this kind of problem, the leaf should be carefully flattened (if possible), or cameras and lenses with a large depth of field should be used.

- Leaves with varying shades of green: as stated before, the reference leaf has the peak of its green histogram at the value 140, and a compensation must be applied to any leaf that does not match that. Fig. 9 plots the combined error for adult whiteflies and fourth instar nymphs. As can be seen, the error grows as the green histogram peak moves farther from the reference value, and the effect is more pronounced for leaves with lighter shades of green (larger peak values).

Fig. 9 - Influence of the leaf's shade of green on the results. The vertical axis quantifies the error (in %), and the horizontal axis organizes the shades of green from dark (left) to light (right).

- Images captured in the field: Table 4 shows the F-measures obtained using the main database, the 15 images captured under sunny conditions, and the 15 images captured under cloudy weather. As can be seen, the results obtained under cloudy conditions were only slightly worse than those obtained for the original database. This is because the clouds scatter light, reducing specular reflections and making shadows less prominent. On the other hand, the system had trouble dealing with the images captured under direct sunlight, not only due to the occurrence of shadows and specular reflections, but also because the colors are "washed out" by the intense lighting. Therefore, if the images are to be captured in the field, cloudy weather is preferable. If that is not possible, using some kind of screen to cast a shadow over the leaf may reduce the undesirable effects.


Table 4 - Effects of weather conditions on the performance of the method, in terms of F-measures.

Object              Main Database  Cloudy  Sunny
1st Instar                   0.68    0.66   0.35
2nd Instar                   0.79    0.77   0.39
3rd Instar                   0.85    0.79   0.47
4th Instar                   0.94    0.95   0.69
Adult Whiteflies             0.95    0.95   0.62
Exoskeletons                 0.93    0.89   0.55
Fungi and Lesions            0.81    0.84   0.78

As stated before, statistical tests were performed in order to compare the heuristic-based approach with more sophisticated techniques such as artificial neural networks (ANN), support vector machines (SVM), and deep learning. This analysis was performed by means of Fisher's randomization test, which evaluates the probability that randomly choosing either of the two compared approaches would lead to better results than deliberately choosing the one with the best average results. If this probability is too low (P