A Direction-Guided Ant Colony Optimization Method for ... - IEEE Xplore

Thisarticlehasbeenacceptedforinclusioninafutureissueofthisjournal.Contentisfinalaspresented,withtheexceptionofpagination. IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING

1

A Direction-Guided Ant Colony Optimization Method for Extraction of Urban Road Information From Very-High-Resolution Images Dandong Yin, Shihong Du, Shaowen Wang, and Zhou Guo

Abstract—Typical object-based classification methods only take image object properties as criteria to classify roads, leaving the associated edge information unused. These methods often lead to fragmented road areas and inconsistent road widths and smoothness. Meanwhile, very-high-resolution (VHR) images contain a large amount of edge information and different types of geographic objects, thus, it is challenging to extract roads by typical edge-based extraction or grouping methods. In this study, a globally optimized method is developed to integrate both object and edge features to extract urban road information from VHR images. This novel method extends ant colony optimization (ACO) through deploying and moving ants (artificial agents) along roads with the guidance of comprehensive object and edge information. As ants spread pheromone along their paths, roads are recognized based on aggregated pheromone levels. A set of experiments on VHR images showed that our method significantly outperforms object-based classification methods with not only improved road extraction quality but also enhanced stability when applied to large and complex images. Index Terms—Ant colony optimization (ACO), image classification, object-based image analysis (OBIA), road extraction.

I. I NTRODUCTION

R

OAD TRAFFIC is a major transportation problem in urban environments with dense population. Road network planning, construction, and management have significant impacts on urban development and sustainability. It has long been recognized as an important research topic to acquire road information rapidly, accurately, and efficiently. After the emergence of remote sensing technologies, deriving road information from satellite images has soon become an active area of research. Since 1976, extensive methodological research has Manuscript received March 30, 2015; revised July 23, 2015; accepted August 26, 2015. This work was supported in part by the National Natural Science Foundation of China under Grant 41471315 and in part by the U.S. National Science Foundation under Grant 1354329 and Grant 1443080. (Corresponding author: Shihong Du.) D. Yin are with the Institute of Remote Sensing and GIS, Peking University, Beijing 100871, China, and also with the Department of Geography and Geographic Information Science and National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Champaign, IL 61801 USA (e-mail: [email protected]). S. Du and Z. Guo are with the Institute of Remote Sensing and GIS, Peking University, Beijing 100871, China. S. Wang is with the Department of Geography and Geographic Information Science and National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Champaign, IL 61801 USA (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/JSTARS.2015.2477097

been pursued [1], [2]. However, most, if not all, of the existing methods address particular data types and road condition scenarios while having generality limitations [3]. During the past decade or so, very-high-resolution (VHR) images have become increasingly available and been widely used for road information extraction. While VHR images provide incredibly abundant details of land surface, challenges remain for systematically and automatically recognizing large, complicated objects like roads [4]. With low-resolution remote sensing images, road width is not a major factor because roads can be represented as linear objects. With VHR images, however, many details inside or beside roads including road markings, vehicles, street trees, and shadows of roadside buildings exist as explicit features. In the VHR case, it is no longer appropriate to represent roads as linear objects [2], especially in complex urban situations, where roads are often occluded by heavy traffic and shadows [5]. In general, existing road extraction methods suffer when they encounter irregular geometric deformations and radiometric changes due to various scenarios of road junctions, traffic jams, shadows, and road markings. [6]. It is, therefore, of significant interest to resolve this challenge for extracting road information from VHR images based on the following two complementary strategies. One strategy is to represent roads as complex objects, which expand at large scale and are interrelated to many other related objects. The other is to establish a holistic object reference framework to capture and utilize heterogeneous contextual objects and link them to road areas. This paper combines these two strategies to establish a novel method for road information extraction based on VHR images with a particular focus on urban settings. Our major contribution is two-fold: 1) a direction-guided approach with roads represented as “smooth direction fields” containing a set of features, which leverages the advantages of both edge- and object-based methods and 2) an extended ant colony optimization (ACO) algorithm to intelligently organize these features into a topological model of road areas. A set of experiments demonstrated that our method outperforms object-based methods for extracting roads under different road conditions (e.g., street trees) and image scenarios (e.g., shadow occlusion), and has the ability of taking advantage of multiple information sources extracted from VHR images. This research also tackles associated data intensive challenges in the following three specific aspects: 1) computational intensity [7], [8]; 2) noise immunity; and 3) global connection

1939-1404 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Thisarticlehasbeenacceptedforinclusioninafutureissueofthisjournal.Contentisfinalaspresented,withtheexceptionofpagination. 2

IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING

optimization, i.e., to link separately detected local objects into a composite object that matches human definition of a road. Existing road extraction methods such as edge-based geometric grouping and region-based classification (cf. Section II) were not developed to resolve these data intensity aspects. Edgebased methods suffer in the first aspect as they tend to be entangled at bottom-level information and such disadvantage aggravates as image resolution reaches a high level. On the other hand, object-based classification methods perform reasonably well in the first aspect, but fail in the second and third, leading to fragmented and broken results. The performance of object-based classification may be improved by employing more intelligent classification methods [9]–[11] or involving more human supervision. However, such improvements have not addressed the issues of building inter-object connections and capturing global information since these methods only worked on individual objects. In comparison, our direction-guided ACO method aims to resolve the challenges in all of the three aspects. Regarding the first aspect, computational intensity is reduced because each ant’s decision relies only on local contexts without the need to compute global information intensively. As a result, the method is scalable to large images. Furthermore, since ants are loosely coupled agents, the extended ACO algorithm is suitable to be parallelized for taking advantage of high-performance parallel computing resources. As for the second, the method is generic to incorporate extensive information needed for road extraction, as long as related pheromone rules and heuristic functions can be defined. In terms of the third, inter-object connections can be captured spontaneously by ants’ movements. One single step of an ant connects two sets of features, whereas one complete ant path is naturally an object structure containing a group of rich features. It is worth noting that our method not merely groups atomic units for classification as object-based classification does, but also produces final results from competitions as well as recombinations of parts of these paths based on pheromone level. Therefore, the method is able to build hybrid object structures with flexibility to split and recombine these structures. To the best of our knowledge, this is the first work using ACO to extract road information from VHR images. II. ROAD E XTRACTION R EVIEW Most of existing methods for road information extraction start from the characteristics of roads. In general, major characteristics of road areas in VHR images include geometrical, radiation, topological, and contextual characteristics [12]. These methods exploit such characteristics because they widely exist in various roads and are relatively easy to be captured in computer data structures. Topological characteristics are often combined with geometric ones in low-resolution images at large spatial scales. Contextual characteristics are the most difficult to be defined mathematically, but they help improve final results if handled properly. Based on these characteristics, existing methods using VHR images fall into two categories: 1) edge-based geometric grouping and 2) region-based classification. Edge-based methods generally represent roads with sets of image edges based on geometric and topological characteristics of roads to find and group the edges belonging to the same

roads from a large number of edges of other roads and nonroads. A typical method in this category is perceptual grouping [13] that uses the principles of Gestalt psychology to generate a set of grouping rules to measure the connections between different edges. The edges are then grouped hierarchically to build up complete roads according to these rules. This method is highly sensitive to the quality of detected edges and can be easily disturbed by irrelevant details in VHR images. Dynamic programming-based approaches [14] or active contour model (Snake) [15], [16] iteratively adjust the current results to obtain an accurate contour of roads by minimizing an energy function, measuring both the pixel conformity inside road areas and edge sharpness and smoothness of road contours. While effective in certain cases, these methods suffer a lot in quality and performance when processing VHR images due to the mixture of road and nonroad contours and massive on-road objects. It is difficult to distinguish road contours from nonroad ones and group the found contours only by the information of connection and energy functions. In addition, all of these edge-based methods are often computationally intensive when finding road contours from a huge number of edges, leading to unacceptably long processing time when applied to abundant feature details in VHR images. Mainstream region-based methods for road extraction are object-based image classification [17] that focuses on image object properties (e.g., spectrums, shapes, and textures). The main idea is first to segment an image into image objects with image segmentation techniques like threshold segmentation [18], mean-shift segmentation [19], and FNEA (fractal net evolution approach) [20], then calculate various properties [21] of these objects, and finally classify these objects with what have been calculated and classification methods including decision trees [9], fuzzy classification [10], and support vector machine (SVM) [11], [22]. These kinds of methods take full advantage of computing on image objects rather than separate pixels, and extending the spectra space to feature space with flexible dimensions. However, when applied to large and complex objects such as roads, these methods still show insufficiency in employing spatial connections of roads as they only recognize image objects of roads from those of other types of geographical features. That is, they overlook road connections embedded between image objects. In addition, region-based methods also suffer greatly from the disturbances originated from VHR image details. For example, they will make falsenegative errors on shadows, street trees, cars, and road markings as well as false-positive errors on nearby building ceilings. More importantly, there is a lack of description of the relationships between objects. Therefore, the extracted results often appear to be scattered into irregular fragments, with loss of connections between parts and widely spread false appendices across the entire image rather than compact, connected, smooth shapes. Based on aforementioned observations, both edge- and region-based methods have serious issues when applied to VHR images, respectively. This is why we fuse the edge and image segmentation information to gain cross-validation advantage. In our work, both kinds of information are treated as “features” with common properties as well as specific ones. Furthermore, an extended ACO algorithm is designed to globally organize

Thisarticlehasbeenacceptedforinclusioninafutureissueofthisjournal.Contentisfinalaspresented,withtheexceptionofpagination. YIN et al.: DIRECTION-GUIDED ACO METHOD FOR EXTRACTION OF URBAN ROAD INFORMATION

3

Fig. 1. Self-adaptive behaviors of natural ants. (a) There are two routes with different lengths between the ant colony and the food. (b) Initially the ants have equal preference on each route. However, it takes a longer time to complete a round trip on the longer route, leading to a lower frequency of ant visiting for the spots on the longer route than those on the shorter route. As a result, the pheromone level of the longer route becomes lower than the shorter due to natural decaying. (c) The difference in pheromone level further changes the ants’ distribution. Finally, all ants choose the shorter route.

the information by pheromone feedback and acquire reasonable shapes by built-in restrictions of connectivity and smoothness. III. ACO R EVIEW Entomologists have discovered that ants are able to find the shortest path between their nests and food sources without global guidance (Fig. 1). The key element that guides a colony of ants to exhibit such an intelligent behavior is the pheromone they spread along the routes [23], [24]. Enlightened by such ants’ behaviors, Dorigo [24] proposed a new optimization algorithm in 1991. Later, he applied this algorithm into a series of combinatorial optimization problems and gained remarkable results [25], [26]. From then on, ACO has attracted significant attention in numerous research areas, and many extensions of ACO were developed including, e.g., MMAS [27] and COAC [28]. The fundamental mechanisms of all of these methods are universal and are summarized as follows. A. Local Vision and Decision Strategy For one ant at a specific state, it can only “sense” its neighborhood information, make a local decision with a certain strategy (usually with a degree of randomness) using the information, and move on to another state. B. Past Experience Recording Apart from the local information, ants are usually set to have a memory of their past states, which is also considered when making a local decision. In addition, the memory could also provide heuristic information according to characteristics of certain global problems. C. Pheromone Update and Feedback The major function of pheromone update is to measure the quality of one certain solution made by a single ant in the amount of pheromone to attract more ants closer to the its track. The feedback pattern then determines to what extent the existing pheromone affects the decisions of future ants. Among these mechanisms, the pheromone update and feedback rules are usually the most important to ACO methods. They are also the main differences between different ACO

extensions. These rules provide mechanisms not only for individual, lower level intelligence (ant) to communicate with each other to form a collective, higher level intelligence (ant colony), but also for “external” intelligence (system designer) to interact with “internal” intelligence (ant colony) to guide the search process toward a desirable state of problem solving. In addition, the rules also determine the system’s preference in the tradeoff between the speed of convergence and the breadth of search. Apparently, a broader searching space stands for a higher expected quality of the solution, meanwhile it results in a longer expected time for the system to converge or even may not converge at all within a reasonable amount of time. In our algorithm, ants’ decision strategy is connected with the probability of being parts of a road. Higher-level direction information is extracted from basic edge and polygon information to feed this collective intelligence [29] method, which further calculates and balances local and global characteristics of direction-information distribution. After generations of exploration, evaluation, and feedback, the ant colony as a whole is able to generate and accumulate knowledge about possible road areas in an image and represent them with pheromone distributions. The road areas then reveal themselves by their outstanding pheromone levels. IV. M ETHODOLOGY Our method involves two stages for road detection: edge and image information fusion and the application of ACO algorithm on the integrated edge and image information (Fig. 2). Information is drawn from VHR images via an integration of edge detection and image segmentation, and further passed on to a direction-guided ACO. This ACO algorithm then organizes the former information and extracts road areas out of them, marking the prospective areas with pheromone. Finally, roads are selected by threshold filtering. The detailed processes of this method are described in this section. A. Extraction of Features from VHR Images Starting from a corrected VHR image, the following steps are used to gather the needed information: 1) segmenting VHR images into image objects using FNEA [20] embedded in eCognition Developer 8.64.1 [30], and calculating the geometric features for each object (e.g., main direction, ratio of length to width, and area);



Fig. 3. State of an ant is mapped to a geographical point (marked as s) on the image. At each state, the ant’s local environment consists of a square vision box vs and a set of features fs that intersect with the vision box.

Fig. 2. Overall workflow: information extracted by edge detection and image segmentation is integrated by ACO to produce final results in the form of pheromone distribution.

2) detecting edges using EDISON [31], [32] operator with appropriate parameters, and calculating geometric features for each edge (e.g., the total length and direction). The outputs of these two steps are two sets of regions and edges, respectively, containing the segmentations and edges with their properties (mainly their direction). These two sets are input into the direction-guided ACO algorithm for extracting road information. B. Direction-Guided ACO In our model, directional properties of regions and edges are the key connections between this new ACO algorithm and road detection. The idea is based on the fact that most of objects in road contexts, despite their vast diversity in types, are likely to share a common direction: the direction of a road they are related to. This is true for road markings, traffic, and the edges of roads, even buildings near roads. All of these objects are most likely to expand along the direction of a road. Therefore, by explicitly capturing the direction consistency of edges and image objects and linking them with decision strategies, ants can be managed to search out road areas. In addition, topological and geometric characteristics of roads are represented by carefully defining the states of ants and the moves between states. C. Ant States Ants are placed in a continuous 2-D Euclidean space within the extent of a VHR image. Each ant is spawned from a fixed set of points called birth points and keeps moving in the 2-D space until it reaches the boundary of the image. Each ant at each state s has a coordinate (x, y) associated with it. All ants have the same fixed vision radius r and an atomic moving distance step. Square area vs = [x − r, x + r] × [y − r, y + r] is defined as the vision of an ant (Fig. 3). Whenever an ant moves,

it chooses a direction angle d, and moves toward that direction with a length of step, i.e., (x, y) → (x + cos(d), y + sin(d)). There is also a restriction when selecting moving direction that sharp turns are always forbidden for ants. Therefore, ants would naturally generate consecutive and smooth paths just like real roads. D. Quality of Direction Distribution All the features intersecting the ant’s vision have their own direction. All these directions contribute to the direction distribution of this local environment. The next step is to measure the consistency of the distribution that is strongly correlated to the possibility of this area being part of a road. In this study, this consistency is defined as the quality of a direction distribution. Before calculating the quality of a direction distribution, it is necessary to suppress noise. In VHR images, either edge detection or image segmentation generates a large number of small, meaningless fragments. The directions of these fragments are almost random and have no relationship with roads. Therefore, a measurement of significance is needed to suppress such noise. For polygon feature p, its significance is related to its ratio of length to width, i.e., wp = θ · Length_to_width(p); for edge feature e, the significance is related to its total length: we = ϕ · Length(e); where θ and ϕ are normalized coefficients. The idea is to emphasize linear objects that are more likely to be parts of roads, and to suppress those noises that have low performance in the above properties. Based on the quantified significance, the quality of a direction distribution can be defined. Let feature set F = {f } be all features intersected with a certain vision box, df and wf respectively refer to the direction and significance of feature f , and dmean the significance weighted average direction, then the quality of a direction distribution Q(F ) is defined as f ∈F wf Q(F ) = σ0 (1) 2. f ∈F wf (df − dmean ) In (1), σ0 is a scaling coefficient. The Q value positively correlates to the consistency of the direction field. An area with a higher Q value means a higher direction consistency, and thus a higher probability of being a part of a road [Fig. 4(a)];


5

keep accumulating and increasing to an even higher level. Those features with a wrong direction will not receive sufficient pheromone update, thus their pheromone level will keep decaying. Accordingly, these two kinds of features can be distinguished by pheromone levels for recognizing road areas. In addition to the main strategy, two refinement rules are defined for better performance: 1) the pheromone level is restricted to a certain interval to prevent early convergence [27]; 2) a tiny constant amount of pheromone τfootprint (τfootprint = 0.01τmax in practice) is rewarded to any features “sensed” by any ant in its lifespan, which is called “footprint pheromone.” This is necessary to expand pheromone from the chosen feature to other features on the target road and, thus, distinguish them from the background. F. Heuristic Function

otherwise, not [Fig. 4(b)]. A set of spatially consecutive areas with high Q values is likely to be a segmentation of a complete road. With the quality of direction distribution, the next objective is to drive the ants to search the areas with high Q values.

When the ACO algorithm is initialized, there is little difference in pheromone levels among all of the features. Therefore, ants need a starting strategy to choose the direction without pheromone guidance. This is implemented by introducing a heuristic function. The direction information is still of key importance in the heuristic function. As described above, the pheromone information lays an emphasis on qualities of areas. In the real world, however, not only every part of a road has a direction itself but also each road as a whole should be consistent (or smooth at least) in direction. Therefore, the heuristic function is defined to achieve global consistency. Assume an ant passes through n areas (states) with the quality of each area {Qi }, and the chosen direction of each area being {di }, then the ant is set to have an expected direction de : n i=1 Qi · di . (3) de = n i=1 Qi

E. Pheromone Rule

For each feature f in the current vision box with a direction df , its heuristic information ηf is defined as

Fig. 4. Samples of low-quality areas (left) and high-quality areas (right); (a) and (b): raw images; (c) and (d): features colored by direction; (e) and (f): direction distribution graphs.

When an ant finishes its life span (i.e., reaches the edges of an image boundary), it updates the pheromone along its path according to the quality of the path. Since ants are encouraged to go through the areas with high Q values, they should lay more pheromone on the spot that leads them to a high- Q area so that this experience could be passed on to future ants. Based on the idea above, the pheromone update rule is as follows. Assume an ant has passed through n areas (states), the set of chosen features of each area is {fi∗ }, the quality of each area is {Qi }, then this ant will, respectively, add the following amount of pheromone to each chosen feature fi∗ : n j=i+1 Qj (2) Δτfi∗ = k n−i where k is a scaling coefficient, i.e., the reward of a chosen feature is proportional to the average quality of areas after choosing this feature. Since road areas are expected to be of high quality, a feature whose direction leads to a nearby road is likely to be followed by a high-quality path, thus it will be rewarded a high level pheromone update and future ants are more likely to choose it when it is inside their visions. As this positive feedback goes on, its pheromone level will

ηf = cos (λ · (df − de ))

(4)

where λ isa scaling coefficient used to map the range |df − de | into 0, π2 , so that the heuristic function reaches its peak for a feature in the expected direction de and decays when the direction df deviates from the expected direction, as is shown in Fig. 5. The expected direction is the quality-weighted average direction of an ant’s past states, and thus it is likely to be the direction of a road that the ant might be on. Combining the heuristic function and pheromone update rule together, the ants are encouraged to get a consistent direction both within separate parts of the path and the path as a whole. G. Decision Rule Based on all information that an ant should consider when making local decisions, the final decision rule is defined as follows: assuming ant a “senses” a set of features F = {fi } in its current vision box, the possibility for the ant to choose a certain feature fi is Pfi =

τfαi ηfβi wfγi l∈F

τlα ηlβ wlγ

(5)



Fig. 6. Test Images. (a) Image #1. (b) Image #2. Fig. 5. Expected direction and corresponding heuristic weight distribution on one spot of a sample path.

where α, β, and γ are systematic parameters to balance the weights of different factors, and denoted by decision vector D = (α, β, γ) for convenience. All other notations are in accordance with those above. The reason of considering the significance information wf here again is to suppress the effects of noise as before. The significance information also serves as an intuition to prevent ants from being misguided by noise at the very beginning stages, before the pheromone and heuristics are well accumulated. The final result is a probability, which means that ants have various choices at one state instead of greedily picking the locally best feature. This gives the algorithm the flexibility to explore all possibly paths and to jump out of local optima for a globally better solution.

H. Pseudo Algorithm Algorithm 1. Pseudo code of the direction-guided ACO. Initialize pheromone distribution WHILE (number_of_ants_generated < Maximum_Ant_ Number) Generate a new ant Ai at a valid birth point Gi WHILE (Ai has not reached the image’s bounding box) Calculate Ai ’s vision box Vij at Ai current location j Search all features Fij = {fijk } that intersect with Vij Calculate the selection weights for all features in Fij Choose a feature fj∗ in Fij by weighted random selection Get the direction d∗j of fj∗ , move Ai one step towards d∗j Append the local situation (fj∗ , Fij ) into Ai ’s path Pi Update j with the new location LOOP END Update footprint pheromone to every feature in Pi Update feedback pheromone to the chosen features in Pi Increase number_of_ants_generated by 1 Decay all pheromone by a factor ρ LOOP END

V. E XPERIMENTS A. Data and Accuracy Assessment A set of experiments were conducted on QuickBird images with a spatial resolution of 0.61 m. The images contain four bands (R, G, B, and NIR). Two typical road regions (Fig. 6) in Beijing City were selected. Image #1 (1000 × 1000 pixels) shows a complex road context, where large shadows and heavy traffic on major road areas can be seen clearly. A larger and more comprehensive scene is captured in Image #2 (5000 × 5000 pixels) with rivers, complicated road intersections, denser buildings, more shadows, and on-road traffic. Image #1 was used to demonstrate the workflow and the corresponding results of our algorithm while the Image #2 serves the purpose of demonstrating the scalability of the algorithm to large images. Reference maps consisting of road areas were produced manually by qualified human experts to assess the accuracy of our method. It should be noted that the roads partially occluded by shadows and street trees in the two images were corrected and included in the reference map. To evaluate the overall performance of the method, the extracted road map was compared with the reference map pixel-by-pixel. Each pixel falls into the following four categories [33]: true positive (TP), true negative (TN), false positive (FP), and false negative (FN), where true/false stands for the road/nonroad areas on the reference map and positive/negative means the road/nonroad classification result of the applied method. Based on the comparison results between the two datasets, three quality metrics were employed to provide quantitative assessment |T P | |T P | + |F N | |T P | Correctness (% ) = 100 × |T P | + |F P | 2 · Completeness · Correctness F1 (% ) = Correctness + Correctness Completeness (% ) = 100 ×

(6) (7) (8)

where operator |·| denotes the number of pixels in each category. Both the completeness and correctness measure the


Fig. 7. Results of Image #1. (a) Extracted edge map. (b) Direction color map. (c) Ant trajectory map. (d) Pheromone map. (e) Result of the direction-guided ACO algorithm. (f) Result of object-based classification.

quality of road extraction. F1 -score integrates completeness and correctness into one measurement. An object-based road extraction experiment was also conducted by using the e-Cognition 8.64.1 software. The performance of the ACO method was compared with that of the object-based classification. It is worth mentioning that in object-based classification, the classification rules and related parameters were tried many times to obtain the best results. B. Results For Image #1, the extracted edge map [Fig. 7(a)] and direction color map [Fig. 7(b)] are first obtained by using the EDISON system and ArcGIS tools, respectively. By running the direction-guided ACO algorithm, the ant trajectory map [Fig. 7(c)] and the pheromone distribution map [Fig. 7(d)] were obtained. Fig. 7(e) and (f) shows the extracted roads by our method and the object-based classification, respectively. In Fig. 7(e) and (f), green pixels refer to the correctly extracted road pixels (TP) with red pixels indicating the incorrectly extracted (FP) road pixels, and blue pixels indicating the missed road pixels (FN). In addition to visual illustration, corresponding numerical results are listed in Table I. As shown in Fig. 7(e), (f), and Table I, our method outperforms the object-based classification method in e-Cognition. More road pixels were successfully extracted without being influenced by shadows and street trees. This can further be verified by Table I. The presented approach shows the performance

7

of 91.83% and 83.76% in completeness and correctness, indicating that the number of missed road pixels is less than that of the incorrectly detected road pixels. The F1 -score for the Image #1 is 87.61%, indicating a significant improvement of road extraction quality over the object-based classification method (56.87%). Visual comparison of Fig. 7(e) and (f) shows that the object-based classification produced more fragmental and broken roads. Such drawbacks are overcome in our directionguided ACO method, presented as significant improvement in completeness. As Table I shows, our method is better than the object-based classification in all three metrics. The completeness, correctness, and F1 -score of the latter are 43.39%, 82.50%, and 56.87%, respectively, which are much smaller than the results of our method. This performance improvement attributes to both the fusion of edge- and object-information and the extended ACO algorithm. The performance gap increases in the larger Image #2. As the diversity of ground objects increases, the result of object-based classification is more fragmented and incomplete [Fig. 8(d)]. Meanwhile, the ants show their immunity to various noise and stay mainly on the road areas [Fig. 8(a)]. Superior results [Fig. 8(c)] can be easily derived from the pheromone distribution [Fig. 8(b)]. Metrics in Table I further highlights the advantage of our method. By comparing the metrics between Image #1 and Image #2, it is clear that the extended ACO method keeps stable and superior performance in different image contexts while the performance of object-based classification drops severely in all aspects when applied to a larger image. For example, the completeness of ACO stays above 90% (91.83% and 90.63%), whereas the completeness of the object-based classification drops from 43.39% to 27.64%. Similar trends can be observed in correctness and F1 -score as well. Therefore, the effectiveness and scalability of our method are evident. The intelligence of the method can be illustrated by Fig. 8(a) and (b), in which several ants appeared running outside the roads but there were few follow-ups since these areas have irregular direction distribution. The pheromone feedback mechanism prevented future ants from running into these paths again. Therefore, the pheromone level in these areas stays in an insignificant level and, thus, could be effectively excluded from final results. As argued in previous sections, computational intensity of our method is relatively low. Although theoretical estimation of the computation intensity is nontrivial due to the randomness in individual ant behavior and interactions between ants, our empirical observation indicates that the computational intensity of the extended ACO algorithm is insignificant. For example, for the image with 5000 × 5000 pixels, where there are about 100 000 features in total, it took about 500 ants to reach convergence and the total execution time on a regular personal computer is around 50–70 s. C. Discussion The results described above have shown superior performance of our method. To further explain the results, some other aspects of the method are discussed as follows. First, the



TABLE I P ERFORMANCE OF D IRECTION - GUIDED ACO V ERSUS O BJECT-BASED C LASSIFICATION

Fig. 8. Results of Image #2. (a) Ant trajectory map. (b) Pheromone map. (c) Result map of the extended ACO algorithm. (d) Result of object-based classification. It can be seen that despite of some individual distractions, the colony as a whole is able to find the road areas effectively. It is also worth noting that the experiment image has never been down-sampled and the on-road traffic is as heavy as shown on Image #1. It is demonstrated that the object-based classification suffers more in this larger image due to the increased diversity of ground objects while our method remains to be effective, proving its scalability to large images.

preferred spatial resolution of the source image is specified. Since our algorithm does not operate on pixels but the detected edges and segmentations, the spatial resolution of image is not

directly influential and the algorithm is supposed to be applicable to any resolution. However, we assume the effectiveness of the algorithm is most significant when applied to VHR images


9

of classification processes, our method needs much less supervised guidance information (only an estimated road width and several birth points are needed), whereas object-based classification methods need to be supervised extensively. Experiments using large images demonstrated that our method can be applied to large urban areas with little human involvement while achieving its high performance. In sum, our method is superior in both the quality of results and automation level. As a consideration of future work, the pheromone information does provide a good clue to extract road areas. However, the threshold-based extraction adopted in this study does not fully exploit its potential. Further improvement can be made by combining pheromone information with other object attributes for enhancing comprehensive classification methods [9]–[11]. Fig. 9. Quality value map of test image #1 with segmentation boundary (extreme high quality is suppressed).

since there is more abundant information on road surface for the algorithm to exploit, and thus outperform traditional methods. Second, as the direct input of the extended ACO algorithm, both segmentation and edge detection results could affect final results. However, as long as the methods for edge and segmentation detection are considered effective in general, the final road extraction results should be consistent. Furthermore, the spatial resolution of segmentation results should not be too coarse (under-segmentation) not only for avoiding the incorrectness of directions but also because segmentation polygons are the basic units to derive final results. For example, there would be inevitable errors if shadows that partially cover some roads but still have a large area outside the roads are segmented as one single object. Finally, since the “quality value,” representing the likelihood for certain areas to be part of roads in our method, plays a key role in the entire extended ACO algorithm, one may argue that the quality value alone may be sufficient to extract roads. However, as shown in Fig. 9, although the quality values serve as a good indicator for road areas, there are still lots of FP and FN responses in the whole area, similar to what objects-based properties reveal. Therefore, it is safe to say that our method intrinsically differs from traditional methods and the effectiveness is attributed to the organic combination of the intelligence of each ant and the holistic global optimization mechanism.

VI. C ONCLUSION The direction-guided ACO method described in this paper considers both edge and object information available in VHR images and exploits the ACO to extract high-quality road information. Experiments showed that our method can effectively resolve the disturbances of shadows, road markings, on-road traffic, and other noise. Due to the built-in geometric constrictions imposed by ants’ movements, road extraction results achieved by our method are more integrated, consecutive, and smoother in comparison with the results achieved by the e-Cognition object-based classification method. In terms

R EFERENCES [1] R. Bajcsy and M. Tavakoli, “Computer recognition of roads from satellite pictures,” IEEE Trans. Syst. Man. Cybern., vol. 6, no. 9, pp. 623–637, Sep. 1976. [2] J. B. Mena, “State of the art on automatic road extraction for GIS update: A novel classification,” Pattern Recognit. Lett., vol. 24, pp. 3037–3058, 2003. [3] C. Poullis, “Tensor-cuts: A simultaneous multi-type feature extractor and classifier and its application to road extraction from satellite images,” ISPRS J. Photogramm. Remote Sens., vol. 95, pp. 93–108, 2014. [4] T. Peng, I. H. Jermyn, V. Prinet, and J. Zerubia, “Incorporating generic and specific prior knowledge in a multiscale phase field model for road extraction from VHR images,” IEEE J. Sel. Topics. Appl. Earth Observ. Remote Sens., vol. 1, no. 2, pp. 139–146, Jun. 2008. [5] T. Peng, I. H. Jermyn, V. Prinet, J. Zerubia, and B. H. B. Hu, “Urban road extraction from VHR images using a multiscale approach and a phase field model of network geometry,” in Proc. Urban Remote Sens. Joint. Event, 2007, pp. 1–5. [6] X. Lin, J. Zhang, Z. Liu, J. Shen, and M. Duan, “Semi-automatic extraction of road networks by least squares interlaced template matching in urban areas,” Int. J. Remote Sens., vol. 32, pp. 4943–4959, 2011. [7] S. Wang and M. P. Armstrong, “A theoretical approach to the use of cyberinfrastructure in geographical analysis,” Int. J. Geogr. Inf. Sci., vol. 23. pp. 169–193, 2009. [8] S. Wang, “Formalizing computational intensity of spatial analysis,” in Proc. 5th Int. Conf. Geogr. Inf. Sci., 2008, pp. 184–187. [9] L. Rokach and O. Maimon, “Decision tree,” in Data Mining and Knowledge Discovery Handbook. Berlin, Germany: Springer, 2005, pp. 165–192. [10] J. Amini, C. Lucas, and M. Saradjian, “Fuzzy logic system for road identification using Ikonos images,” Photogramm. Rec., vol. 17, pp. 493–503, 2002. [11] S. Das, T. T. Mirnalinee, and K. Varghese, “Use of salient features for the design of a multistage framework to extract roads from highresolution multispectral satellite images,” IEEE Trans. Geosci. Remote Sens., vol. 49, no. 10, pp. 3906–3931, Oct. 2011. [12] G. Vosselman and J. de Knecht, “Road tracing by profile matching and Kalman filtering,” Automatic Extraction of Man-Made Objects from Aerial and Space Images. Berlin, Germany: Springer, 1995, pp. 265–275. [13] P. Gamba, F. Dell’Acqua, and G. Lisini, “Improving urban road extraction in high-resolution images exploiting directional filtering, perceptual grouping, and simple topological concepts,” IEEE Geosci. Remote Sens. Lett., vol. 3, no. 3, pp. 387–391, Jul. 2006. [14] A. P. Dal Poz, R. A. B. Gallis, J. F. C. Da Silva, and É. F. O. Martins, “Object-space road extraction in rural areas using stereoscopic aerial images,” IEEE Geosci. Remote Sens. Lett., vol. 9, no. 4, pp. 654–658, Jul. 2012. [15] Y. Hu and K. J. Zu, “Road extraction from remote sensing imagery based on road tracking and ribbon snake,” in Proc. Pac.-Asia Conf. Knowl. Eng. Software Eng. (KESE’09), 2009, pp. 201–204. [16] X. Niu, “A semi-automatic framework for highway extraction and vehicle detection based on a geometric deformable model,” ISPRS J. Photogramm. Remote Sens., vol. 61, pp. 170–186, 2006. [17] T. Blaschke, “Object based image analysis for remote sensing,” ISPRS J. Photogramm. Remote Sens., vol. 65, pp. 2–16, 2010.



[18] A. K. Shackelford and C. H. Davis, “Fully automated road network extraction from high-resolution satellite multispectral imagery,” in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS’03), 2003, vol. 1, pp. 461–463. [19] D. Comaniciu and P. Meer, “Mean shift: A robust approach toward feature space analysis,” Pattern Anal. Mach. Intell., vol. 24, no. 5, pp. 603–619, 2002. [20] M. Baatz and A. Schape, “Multiresolution segmentation: An optimization approach for high quality multi-scale image segmentation,” J. Photogramm. Remote Sens., vol. 58, no. 3–4, pp. 12–23, 2000. [21] J. Hu, A. Razdan, J. C. Femiani, M. Cui, and P. Wonka, “Road network extraction and intersection detection from aerial images by tracking road footprints,” IEEE Trans. Geosci. Remote Sens., vol. 45, no. 12, pp. 4144– 4157, Dec. 2007. [22] X. Huang and L. Zhang, “Road centreline extraction from high-resolution imagery based on multiscale structural features and support vector machines,” Int. J. Remote Sens., vol. 30, no. 8, pp. 1977–1987, 2009. [23] J. L. Deneubourg, J. M. Pasteels, and J. C. Verhaeghe, “Probabilistic behaviour in ants: A strategy of errors?,” J. Theor. Biol., vol. 105, pp. 259–271, 1983. [24] A. Colorni, M. Dorigo, and V. Maniezzo, “Distributed optimization by ant colonies,” in Proc. Eur. Conf. Artif. Life, 1991, pp. 134–142. [25] E. Bonabeau, M. Dorigo, and G. Theraulaz, “Swarm intelligence: from natural to artificial systems,” J. Artif. Soc. Soc. Simul., vol. 4, p. 320, 1999. [26] M. Dorigo, G. Di Caro, and L. M. Gambardella, “Ant algorithms for discrete optimization,” Artif. Life, vol. 5, pp. 137–172, 1999. [27] T. Stützle and H. H. Hoos, “MAX-MIN ant system,” Future Gener. Comput. Syst., vol. 16, pp. 889–914, 2000. [28] X. M. Hu, J. Zhang, and Y. Li, “Orthogonal methods based ant colony search for solving continuous optimization problems,” J. Comput. Sci. Technol., vol. 23, pp. 2–18, 2008. [29] A. Pentland, “Collective intelligence,” IEEE Comput. Intell. Mag., vol. 1, no. 3, pp. 9–12, Aug. 2006. [30] U. C. Benz, P. Hofmann, G. Willhauck, I. Lingenfelder, and M. Heynen, “Multi-resolution, object-oriented fuzzy analysis of remote sensing data for GIS-ready information,” ISPRS J. Photogramm. Remote Sens., vol. 58, pp. 239–258, 2004. [31] P. Meer and B. Georgescu, “Edge detection with embedded confidence,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 12, pp. 1351–1365, Dec. 2001. [32] G. Steger, “An unbiased detector of curvilinear structures,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 2, pp. 113–125, Feb. 1998. [33] D. S. Lee, J. Shan, and J. S. Bethel, “Class-guided building extraction from Ikonos imagery,” Photogramm. Eng. Remote Sens., vol. 69, pp. 143– 150, 2003.

Dandong Yin received the Bachelor’s degree in geographic information science from the School of Earth and Space Sciences, Peking University, Beijing, China, in 2014. He is now pursing the Ph.D. degree in geography at the Department of Geography and Geographic Information Science, University of Illinois at Urbana-Champaign, Champaign, IL, USA. His research interests include multiagent system, complex network, and spatial pattern recognition

Shihong Du received the B.S. and M.S. degrees in cartography and geographic information system from the Wuhan University, Wuhan, China, and the Ph.D. degree in cartography and geographic information system from the Institute of Remote Sensing Applications, Chinese Academy of Sciences, Beijing, China, in 1998, 2001, and 2004, respectively. He is currently an Associate Professor with the Peking University, Beijing, China. His research interests include qualitative knowledge representation, reasoning and its applications, and semantic understanding of spatial data.

Shaowen Wang received the B.S. degree in computer engineering from Tianjin University, in 1995, the M.S. degree in geography from Peking University, Beijing, China, in 1998, and the M.S. degree in computer science, and the Ph.D. degree in geography from the University of Iowa, Iowa City, IA, USA, in 2002 and 2004, respectively. He is a Professor of geography and GIScience with the University of Illinois at Urbana-Champaign (UIUC), Champaign, IL, USA, where he is named a Centennial Scholar. He is also an Associate Director of the National Center for Supercomputing Applications for CyberGIS, and Founding Director of UIUC’s CyberGIS Center for Advanced Digital and Spatial Studies. He is the Principal Investigator of several multi-institution NSF projects for advancing cyberGIS and related scientific problem solving. He has authored many peer-reviewed papers including more than 20 journals. Dr. Wang is currently serving as the President-Elect of the University Consortium for Geographic Information Science, and as a member of the U.S. National Research Council’s Board on Earth Sciences and Resources. He was the recipient of National Science Foundation (NSF) CAREER Award.

Zhou Guo received the B.S. degree from the School of Remote Sensing and Information Engineering, Wuhan University, China. He is currently pursuing the Ph.D. degree in geographic information system at Peking University, Beijing, China. He is awarded by the China Scholarship Council in 2015 as a Visiting Scholar and will study in Purdue University from September 2015 to September 2016. His research interests include integration of RS and GIS data.