The novel technologies used in different application domains allow obtaining ... AI approaches (used in the hybrid systems) to solve more complex feature .... different choices performed in these two main issues drive the capability, adaptability, reliability and .... applied, such as: supervised, unsupervised and reinforcement.
Chapter I
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation Danilo Avola Institute of Research on Population and Social Policies - National Research Council, Italy Fernando Ferri Institute of Research on Population and Social Policies - National Research Council, Italy Patrizia Grifoni Institute of Research on Population and Social Policies - National Research Council, Italy
ABSTRACT The novel technologies used in different application domains allow obtaining digital images with a high complex informative content, which can be exploited to interpret the semantic meaning of the images themselves. Furthermore, it has to be taken into account that the complex informative content extracted from the images, that is, the features, need of flexible, powerful, and suitable ways to be represented and managed. The metadata through which a set of images can be described are directly tied to the quality and quantity of the extracted features; besides the efficient management of the metadata depend on the practical and capable feature representation. The more used approaches to analyze the image content do not seem able to provide an effective support to obtain a whole image understanding and feature extraction process. For this reason, new classes of methodologies that involve computational intelligent approaches have been developed. In particular, genetic algorithms (GAs) and other artificial intelligent(AI) based approaches seem to provide the best suitable solutions. The artificial intelligent technologies allow for the obtaining of a more semantically complex metadata image representation through which to develop advanced systems to retrieval and to handle the digital images. This new method to conceive a metadata description allows the user to make queries in a more natural, detailed, and semantically complete way. As a result it can overcome the always more sophisticated duties caused by the use of wide local and/or distributed databases with heterogeneous complex images.
Copyright © 2009, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
INTRODUCTION Digital images, that describe objects and/or situations in the real world, are frequently used to support complex human being activities. Furthermore, the pervasive use of image capture devices, and the increasing relevance of the image analysis in specialized fields (such as: medical, military, satellite, engineering, and so on) have brought an exponential growth of the image data sets into the local and/or distributed image databases. This growth makes the database management, and in particular the image retrieval, a complex task. Moreover, the entropy level of the images contained in these databases makes the optimal exploitation of the images and the performance of several operations (such as: intelligent indexing, comparisons, inference, special query, analysis, processing, and so on) not easy. All these factors contribute to the data mining processes inefficiency. This kind of problems can be overcome exploiting the images themselves. In fact, the novel technologies used in the different application domains allow to obtain images with high level details. These images often possess an informative content that goes beyond the simple visual representation. That is, by observing the relationships among pixels (or clusters of pixels) meaningful features of what is represented in the image can be brought out. This information content is connected to features of an image (such as: textures, colours, patterns, local and global characteristics, and so on). For this reason, the feature extraction matter is one of the main issues in image processing, since the understanding of the meaningful image features implies understanding the image (and parts of it) with its own semantic content. These features can be exploited, by the image analysis processes, to evaluate deeply the semantic informative content of every image, that is, to understand the morphological structure of every object contained in the image and the whole image itself. By the semantic content, and the connected features, it is possible to obtain an exhaustive and great metadata image representation through which to accomplish more effectively the mentioned database operations. In particular, store and retrieval operations can be performed in powerful, efficient and effectiveness way. It is important to observe that the complex features extracted from images with high level details need new approaches to be represented. In fact, as well known, high level semantic characteristics (such as: textures, objects, shapes, patterns, local and global characteristics, and so on) need of more complex logical structures than low level semantic characteristics (such as: colours, statistical content information, gradients, and so on) to be represented. Moreover, it has to be taken in account that a single high level semantic characteristic on an image (as the texture) can require several feature images (called feature space) to be expressed. The quality and the accuracy of the feature space allow to classify every image according to the connected semantic characteristic, while the capacity to manage of the feature space (that is, the related feature space representation) drive the effectiveness with which the developed systems can store and retrieve the images into the databases. This chapter introduces and discusses the different Artificial Intelligent (AI) approaches used to extract and to represent the features of any image. In particular, the role of the Genetic Algorithms (GAs) has been highlighted. The chapter starts from a brief discussion about the feature extraction process, then it introduces a general description of some of the most interesting AI approaches and their application in image feature extraction problems. A more complete and exhaustive description of the GAs is given. Finally the possibility of combined AI approaches (used in the hybrid systems) to solve more complex feature extraction problems is faced. Therefore, some of the most recent and powerful applications exploiting the AI image feature extraction are shown.
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
Feature Extraction Process The image feature extraction process (Konstantinos et al., 2006; Woods, 2006) is the step through which it is possible to create the feature maps able to explain the semantic meaning of the image itself. These maps are represented by numerical, logical or symbolic elements. Every map identify a specific characteristic of the image (low level feature), several maps are needed to obtain a more complex features (high level feature). Images with high level of colours, objects and details (that is, with several complex features) need of several different set of maps to explain their own semantic content. Obviously, the complexity of the features extracted from the images depends on two main factors. The first one regards the intrinsic complexity of the image. In fact the content of an image can be more or less ambiguous and understandable. This depends on several factors that can influence the image layout, such as: the number and the complexity of the objects represented inside the image, the spatial relationships between these objects, the logical connections between these objects, the textures and the patterns through which these objects are expressed, the complex patterns, structures and shapes represented inside the image, and so on. Moreover, it has to be into account that there are graphical domains (such as: industrial, mechanical, and so on), which are less complex than others (such as: medical, biological, and so on), in fact, as well known (Zhang, 2006; Sarfraz, 2005), the image understanding in artificial domains have a low level of complexity than natural domains. The second factor related to the complexity of the features extracted from the images regards the kind of features that the user wants to obtain from the images. A digital image has a semantic information content distributed on several levels. In fact in an image it is possible to analyze simple features (as the colour scale) or complex features (as the morphological characteristics that make up the object inside the image). Likewise more deep analysis related the context of the image as well as conjectural analysis regarding the semantic meaning of the image can be performed. The quality, quantity and depth of the features extracted from the image depend on how the user intends to use the information. In particular, an image classification system based on simple colour variation (where a simple RGB query is used to identify images that have the specified amount of colours) is much less complex than one based on textural or semantic information (where a descriptive query is used to identify images that have, with a fixed degree of uncertainty, the required semantic features). Moreover, it is important to observer that the metadata description obtained by complex features allow to conceive store and retrieval image systems based on complex query. For example, a system based on texture analysis approach can adopt complex ways to formulate a query, such as: the natural language to describe a scene, the use of pattern template (query by example), the use of a mathematical description of a complex features, and so on. A further aspect that has to be taken into account regards the way through which the features are represented. This aspect is tightly tied to both the semantic meaning of each extracted feature and how this feature will be used by the user. In fact a same feature (for example, the corners measure of the objects in an image) can be represented in different ways (numerical representation, expressed by constraints between the strokes that make up the corner, vectorial representation, and so on). To explain the just mentioned concepts it is necessary to introduce the following example (see Figure 1). In Figure 1 are represented twenty-one segments. A first set of simple extracted features may regard the identification of the segments and related spatial positions inside the graphical layout, moreover the texture of each element may be “easily” detected. Indeed, a more careful analysis can highlight more complex features. For example by a simple approach of pattern recognition two main patterns can be identified, as shown in Table 1.
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
Figure 1. An example of image understanding
Table 1. Two different patterns Brief Geometrical Description of the two Patterns
Representations of the two Patterns
First Reference Pattern
Variation of the First Pattern
Second Reference Pattern
Variation of the Second Pattern
Table 1 shows two different patterns, in particular the first pattern is made up by three segments in a triangular geometrical relationship, while the second pattern is made up by a more complex geometrical relationship. In fact, it is composed by four segments where: the two external verticals segments and one horizontal segment are equidistant in considering of a horizontal measure, besides the fourth segment is inner to the two external horizontal segments. Other geometrical consideration could be made for the elements involved in the second pattern. Indeed, the second pattern is an interpretation about some set of segments identified into the graphical layout, it is the more likely but not the only possible. These two patterns can be used to describe the semantic meaning of the image, other features, as shown in Figure 2, could be extracted to support the just extracted features. In Figure 2 are shown two additional features that could be considered in the image understanding process. In particular, two different triangular relationships are identified for the three shapes belonging to the first and second pattern. Moreover, it can be observed that three segments (belonging to the shapes related to the first pattern) have the same texture. Obviously, other features (simple and/or complex) could be extracted. It depends from several factors, such as: application domain, type of image classification method, structure of the queries with which it is possible to interact with the store/retrieval system, and so on.
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
Figure 2. An example of further extracted features
. Also the representation of the extracted features covers a main role. In fact, every feature can be represented in several ways depending from how the user wants to apply the information. In this case, the bi-dimensional geometrical domain, the simple textures and the intuitive geometrical relationships between the elements suggest to use a simple numerical or vectorial representation. In Table 2 is shown a simple tree representation, expressed in XML like standard that can be used to describe every kind of 2D graphical shape (Quin, 1999), which resumes the main characteristics of the shape belonging to the first reference pattern. The vectorial description provided in Table 2 highlights some main characteristics of the related shape. In particular, the spatial constraints (SpatialOnPrimitives) fix the spatial relationships between the three elements that make up the shape. The linkage constraints (LinkedTextures) identify specific textures associated (by a hash-table) to the elements. Furthermore, the approximate constraints (ApproximableSpatialOnPrimitives) identify the rotation degree of the elements respect the horizontal axis (x axis). Indeed, the constraints shown in Table 2 represent only a sub-set of the real complex constraints that need to describe the related shape in a vectorial way. Moreover, additional information regarding the relationships about all the shapes in the graphical layout can be needful to describe the whole shapes’ situation. All the described factors (that is, the features extracted from an image and the related representation) make up the theoretical basis through which to analyze and to create the store/retrieval systems. The different choices performed in these two main issues drive the capability, adaptability, reliability and goodness of the related implemented image store/retrieval systems. The purpose of this kind approach is to provide a suitable and adaptable system for the definite application domain and to implement store and retrieve operations in effective and efficient way. All these kinds of reasoning can be applied on every type of image both in natural domain and artificial domain. The main goal of these efforts is to obtain a semantic description that goes beyond the simple explanation of the sum of the parts that make up the graphical layout of the image. The last factor that has to be taken into account regards the “link” between the approach used to extract the image information and the way with which the images are stored and recovered in the system. In fact, these two aspects that compose the image retrieval framework are tightly connected. For example,
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
Table 2. XML-like shape representation Shape
Vectorial XML like Representation …………………………………………………………………………………………………………………………………………… …………………………………………………………………………………………………………………………………………… ……………………………………………………………………………………………………………………………………………
a query by example (Belkhatir, 2005; Vu et al., 2003) and a query by lexical description (Belkhatir et al., 2005; Dagli et al., 2004) systems require two completely different logical structures. Also the kind of information extracted from the images and the data structures in which the information are stored have to be conceived differently. In the next two sections both a general description of some of the most interesting AI approaches applied in image feature extraction and a more exhaustive description of the GAs are introduced.
AI Approaches in Image Feature Extraction The current approaches to perform the feature extraction activity (from different images in different domains) can be aided by new technological software approaches (Saad et al., 2007). For this reason, in the early years, there has been much effort to adopt novel computer science approaches on feature extraction matter. The Artificial Intelligence is a branch of Computer Science made up by several different complex areas. These areas have the common characteristic to exploit, from different points of view, the concepts that drive the capacities (and/or mechanisms) of the human intellect to solve different problems in different
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
application domains (advanced human problem solving). In this context, the AI allows a careful feature extraction activity from the image to obtain a higher (respect the conventional approaches) information content description. Indeed, the extracted features and their representations are usually distinguished concepts. In fact, the first one (extracted features) are tied up to both the image application domain (that determines complexity and quantity of the features that can be extracted from the given images) and the approach used to perform the feature extraction process. While the second one (related representations) are tied up according to the specific logical algorithms and logical data structure used to reach the prefixed aim. For example, it is possible to represent a particular image feature by several different logical data structure, such as: graphs, trees, set of mathematical constraints, statistical models, and so on. Therefore, the strategies and the algorithms used and/or created to manage the image representation are deeply tied up to the performed choice. This kind of choice is often driven by the way in which both the indexing of the images and the modality of the queries (such as: visual query, textual query, graphical query, and so on). An interesting powerful approach used to extract meaningful features from an image belongs to the Intelligent Agents (IAs) area. Commonly, the IAs based approaches use an entity (the agent) which is aware of the assigned task, and this entity knows how to accomplish it (Jennings et al., 1998). This kind of algorithms is based on single elaboration entities, which work together on the same task and/or separately on different tasks (or parts of the same task). Each entity has to respect several characteristics such as (but not only): reactivity (the ability to interact timely with the environment and the events that occur in it), pro-activity (the ability to work in expected way, but if necessary it has to be also able to take initiatives), autonomy (the ability to work without external presence), social ability (the ability to interact with both other entities and humans). In particular, this last mentioned main property of the entities has a huge relevance in the image feature extraction. In fact, complex image domains require a large amount of feature maps to be described, therefore it is necessary to employ a large amount of agents able to understand the different peculiarity into the images. These agents have to interact each other to accomplish the relative assigned tasks. The complexity of the communication mechanism occurs according to the number and the complexity of the agents involved in the process. For this reason the features of whatever image are usually extracted by Multi-Agent Systems which work separately (but in
Figure 3. Simple image feature extraction conceptual scheme
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
constant contact) on different aspects of the image (Liu et al., 2002). In Figure 3 a simple conceptual scheme of a common MAS architecture is given. Figure 3 shows several blocks, each one made up by several IAs. Each block deals with a set of simple and/or complex features (or also a single feature), every agent inside a block can be considered an expert entity regarding a specific feature (such as: colours, corners, textures, and so on) or part of it. Every agent in a block can interact only with the agents of the same block. The coordination and interaction activities between blocks are performed by the mediator agents (MD) and a unique master mediator agent (MMD). The IA systems are particularly used in complex natural domain, such as: medical fields, biological fields, and so on. The reason is that these systems are particularly congenial to the statistical computation, that is, to identify: statistical patterns, stochastic situations, approximate models, and so on. An interesting approach belonging to the IAs area, that follows the mentioned MAS architecture, is performed by Pattern Recognition Agents (PRA) (Konar, 2000; Liu et al., 2000). In this approach each agent is learned to recognize a particular feature (or parts of it) of the image (such as: set of primitive patterns, set of correlated colours, set or sub-set of textures, and so on). The agents can be grouped in teams, where every team deals with a specific feature (or set of features). The ability of interaction among PRAs drives the quality of extracted features. In fact a single pattern can be composed by several different aspects. Every agent belongs to a team has to interact with the other team agents to accomplish accuracy the connected feature extraction process. This reasoning can be used in a higher level to highlight the importance of interaction between teams. The representation of the features extracted by agents depends on the kind of agents themselves. Several agents store the features to provide them to the other agents, in other cases the features have to be provided to the whole image retrieval system. Consequently, the data structures can have a wide range of representations, however the features exchange purely between agents have commonly a double-precision numerical representation. Another considerable (and charming) application area regards the Neural Networks (NNs) also called Artificial Neural Networks (ANNs) (Hopgood, 2005). The applied paradigm, in this area, is connected to the cognitive information processing structure based on models of human brain function. The ANNs have been developed “to mimic” the elaboration way with which the central nervous system performs the information processing activity. ANNs (likewise to the hardware ANNs) can be considered like interconnected entities that work together to produce an output function. An ANN software architecture is composed by a number of interconnected units (called also neurons or nodes). Each unit can receive one (or more) input and can perform one output. Moreover the output performed by a unit can be used either as an input to other units or it can be part of the output of the whole network (I/O characteristics). Each unit implements a local mathematical or computational model for information processing. The output of any unit depends on several factors, such as (but not only): local model, interconnection (weighted) among units, I/O characteristics, external input, score feedback, and so on. The number and the kind of the factors (as well the complexity of the network topology) are related to the kind and the complexity of the specific application. The topology of an ANN image feature extraction depends on both the application domain and the specified filters (that is, the algorithms) used to extract the features. In general, there are several topologies of an ANN, both in learning and analysis phase, such as: State Space Neural Network (SSNN) (Zayan, 2006), Multi-Layer Perceptron Network (MLPN) (Eleuteri et al., 2005), Modular Feed-Forward Network (MFFN) (Boerrs et al., 1992), and so on. These topologies
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
Figure 4. Hierarchical ANN topology
. can be applied on one or more conceptual layers through which to implement the whole steps regarding the image retrieval system. Moreover there are several paradigms with which these topologies can be applied, such as: supervised, unsupervised and reinforcement. Indeed, as shown in Figure 4, in image feature extraction the hierarchical topology is widely used. Every image is firstly analyzed by several neurons (nodes belonging to the first level) that catch the lowest possible features, such as: colours, simple primitives, simple patterns, local characteristics, and so on. Afterwards, this set of simple features is analyzed by the neurons of the next level (in the example, second level) to make up a new set of more complex and semantically complete features. This activity is performed until the last ANN level (in the example, third level) to obtain a definitive complex version of the extracted features, such as: textures, complex patterns, objects relationships, and so on. This kind of step by step “refinement” can be performed using several different approaches. For example, a typical simple strategy can be performed catching simple statistical pattern about the image (as colour histograms about prefixed portion of the image) that, level by level, are correlated (by spatial information about cluster of pixels with the same histogram) to make up a more complex statistical scenario describing the object inside the images. The final aim of the whole network is to provide a complete semantic description of the analyzed images. Indeed, the hierarchical network can have several variations to accomplish the requirements of a specific domain or a complex image retrieval system. The functionality of the algorithms inside each node is strongly subdivided according to the related level. In particular, while the algorithms belonging to the “low” levels work deeply on the image features, ones belonging to the “high” levels have to join and to interpret the information of the previous levels to provide, at the following level, high level descriptions. A common algorithm based on ANN hierarchical topology is performed by Complex Adaptive System (CAS) (Arunkumar, 2004) paradigms. These systems can able to change their structure (topology, by adding nodes and functionalities) according to the external or internal information that flow through the network. The employment of the ANNs in image feature extraction occurs usually after a hard phase of training process. In fact, these nets have to learn to recognize every single feature by a set of training
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
data set. In particular, the net parameters are trained to recognize a first specific feature, afterwards the parameters are adjusted and managed to recognize another features, and so on. This phase can be very hard because every new feature that can be recognized can “invalidate” the previous choices. During this process both the network topology and the complexity of the model inside each net node can undergo changes. In particular, it can be necessary to introduce new nodes into the net otherwise to “complicate” (or to change) the model that drive the node task. There are several approaches based on ANNs. The reason is that the same goal (for example the research of a particular roughness in a texture model) can be performed by several different nets. This involves possible different patterns of net topologies, different learning approaches (supervised or unsupervised), different computational models inside the nodes, and so on. All of this makes the classification of image feature extraction based on ANNs approaches hard. A common ANN approach is based on Bayesian networks (Dol et al., 2006) (Bayesian Neural Network, BNN). This kind of networks is designed to learn very quickly and incrementally. They are based on probabilistic graphical models. In the graph nodes represent random variables and arcs represent dependencies between random variables with conditional probabilities at nodes. Frequently, BNNs are used to obtain many advantages over traditional methods of determining causal relationships (Limin, 2006). For example, the BNNs parameters of the models are expressed as a probability distribution rather than a single set of values. Other advantages regard the possibility (by the model) to support a powerful supervised ate learning phase and the opportunity to have a well knowledge feedback mechanisms to ensure an acceptable range of errors. In image features extraction these causal relationships represent the desired image features. The BNNs (but also other ANNs) are made up by “reasoning layers” (that is, layers able to understand the effective structure of the image according to the layer knowledge) in which each layer attends to a specific feature level (for example a BNN with three levels can have: first level - global geometrical features, second level - stroke features, third level – shape/object features). Obviously, each layer is connected to each other, but the ability and the “knowledge” of nodes is tied up to the specific layer. Differently from the previously introduced artificial intelligent areas, the input and output of the approaches belonging to the ANNs area are usually mathematical vectors. An interesting AI area very frequently used for feature extraction is Artificial Vision (AV). The approaches belonging to this area aim to identify image features by “mimicking” human visual perception. The AV process tends to identify simple (such as: colours, simple shapes/patterns, and so on) and complex (such as: complex shapes/patterns, objects/models, relationship among objects/models, and so on) features exploiting the “mimicked” human ability to recognize them in the image. Unlike the aforementioned thematic areas, the AV does not have specific common approaches. It includes a set of methods and techniques that are used in a lot of different ways in the image feature extraction matter. A common element of these approaches is the Image Operator (IO) concept (called also: image-filter, elaboration-mask, feature-extractor, etc., or simply feature). The IO is usually the computational unit that deals with a single feature (or part of it in complex features). The joined use of more IOs is performed to extract more complex feature and/or a set of simple/complex features. The Spatial Gray Level Dependence Matrixes (SGLDM) are commonly used to perform IOs able to recognize complex features (Haralick et al., 1973). These matrixes work on the spatial relationships that occur among pixels belonging to predefined portions of the image. There are several ways to effectively perform the matrixes, that is there are several ways to consider the distance between two pixels or cluster of pixels (such as: Euclidean, circular, eight neighbours, four neighbours, and so on). A common
10
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
approach is to use the Co-Occurrence Matrixes (Vadivel, 2007). Through an opportune mathematical interpretation of these matrixes (that is, a mathematical norm that gives a single value from an entire matrix) the IOs can recognize complex textural visual features such as: regularity, roughness, coarseness, directionality, and so on. Indeed, the image understanding/learning process by AV is also used to extract these features, which are visible only by an artificial sophisticated visual approach (advanced visual perception that goes beyond the common visual perception). There are others AI areas that can be adopted in several ways to support the image feature extraction, such as: Case-Base Reasoning (CBR), Knowledge-Based Systems (KBSs), Fuzzy Logic, Decision Trees, and so on.
GENETIC ALGORITHMS IN IMAGE FEATURE EXTRACTION One of the most recent interesting AI areas in the image feature extraction regards the Evolutionary Algorithms (EAs). Commonly, these kinds of algorithms use a heuristic approach to find approximate solutions to optimization and search problems. The EAs are inspired by Darwin’s theory about biological evolution. For this reason they use techniques driven by evolutionary biology (such as: reproduction, inheritance, mutation, selection, and crossover -also called recombination- survival of the fittest, and so on) to reach a solution. These algorithms start from a non-optimal solution and reach the optimal one toward their evolution. Indeed, there are several kinds of EAs (Freitas, 2002; Korolev, 2007; Zheng et al., 2007), such as: Genetic Algorithms (GAs), Genetic Programming (GP), Evolutionary Programming (EP), Evolution Strategies (ESs), and others. All these EAs follow the Darwin inspiration but, at the same time, they are distinguished by the different ways with which they have been applied, such as: different interpretation of evolution, different role of the solutions in evolutionary mechanism, different concept of mutation, and so on. The EAs adopt different conceptual models (paradigms) with which the Darwin evolution is performed. Between the aforementioned algorithms the GAs use the paradigm that better than others adapts itself for the image feature extraction problems (Li et al., 2004; Brumby et al., 1999; Brumby et al., 2002) that is the “concept” behind the GA paradigm has a natural empathy with the image feature extraction problems. This paradigm can be expressed as follow: the GA is started with a set of abstract candidate solutions population. These solutions are represented by chromosomes (genotypes or genomes). The Solutions from one population are taken and used to form a new population. This last process is motivated by the hope that the new population will be better than the old one. In each generation, the fitness of each candidate solution is evaluated, multiple candidate solutions are stochastically selected from the current solutions (based on their fitness), and modified (recombined and/or mutated and/or others “genetic” operations) to form a new population of candidate solutions. The new population is then used in the next iteration of the algorithm. The GAs, adopted with this paradigm, are generally used to perform optimization tasks in several different application domains. In image feature extraction this family of algorithms is used to increase both the quality and quantity of the features extracted from the image. They are used to catch particular kinds of features (such as: complex patterns or complex textures) that are hardly achievable with common approaches. All these factors play a relevant role in graphical data mining applications.
11
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
The GAs (and the EAs in general) applied to the image feature extraction allow to exploit and improve the just imposed conventional techniques based on statistical/synthetic pattern recognition, edge detection, shapes/objects analysis, model characterization/extraction, and so on. In fact, the GAs allow creating a framework through which the well known potentialities of the conventional techniques. Inside the GA framework, well known feature vectors otherwise specific mathematical image operators or even whole classical applications of image feature extraction are hosted. The framework allows improving the basic techniques through the GAs optimal research process. In Figure 5 a simple graphical explication about the GA paradigm applied to the image feature extraction is given. As it is possible to observe by Figure 5, initial features are used to provide a start up to the retrieval system. These features can be randomly or empirically discovered in previous studies and/or analysis. The “temporary” solution is evaluated usually by a matching function on well-known pattern and/or configuration belonging to the images application domain. The best features are selected, modified, recombined and mutated to provide a new generation of features to restart the whole process. Usually, a rank on the matching function is used to decide the end of the process. The candidate solutions of the GAs, differently from the case of their use for the image features extraction, are usually (but not only) represented by binary strings. The initial population may be created at random or some knowledge about previously known solutions may be used. In image feature extraction the candidate solutions tend to be considerably more complex. They are conveyed by several different heterogeneous ways such as: mathematical functions, formal descriptions, languages (such as the natural language), numerical feature vectors, and so on. For example, several GAs approaches use, as candidate solutions, selected (depending on a specific domain) image operators connected to the desired features. Other approaches to reach the same aim use well known experimental numerical feature vectors. In more complex approaches the candidate solutions can be also conveyed by high level description languages (such as: natural language, visual language, and so on), otherwise by autonomy algorithms (Cheng et al., 2005; Leveson et al., 2004) each one dedicated to catch a particular feature. An autonomy algorithm has the capability to improve the ability in problem solving activity, in particular it is able to face new situations and to integrate the developed behaviour as a new functionality. Figure 5. General scheme of GA paradigm in image feature extraction
12
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
Seldom, the initial population in image feature extraction, is created in random way, because this could bring about problems in system’s performance (more than in the common GAs application domains). In fact, a solution (such as a vector) for a single feature (such a kind of texture) can involve several vectors having high dimension. These algorithms start from low level (or not sufficient) feature description and reach (by evolution) high level (or acceptable) feature description. The candidate solution is essentially driven by several factors: complexity of the image domain, methods through which to perform the evolutionary mechanism, quality and quantity of the desired features, and so on. The evolution of solutions (included: the specific used “genetic” operators) and the fitness function can be performed in several different ways. They respectively depend on both: the solution’s description and the way with which to measure the quality of the population (solutions). Some GAs set the evolution process based on objects/shapes identification by edge detection recognition (segmentation technique). The candidate solution, in this case, uses the first and second order derivatives to catch the edges in the images. Other kinds of GAs use a completely heuristic approach to find the best set of features able to describe the image. Usually they start from features expressed by numerical vectors, which should represent the meaningful features of the image application domain. Operators (usually: mutation and crossover) “change” the solutions with the aim to perform a more suitable segmentation on training image data set. Also in this case the fitness function works to guarantee a high level of the visual segmentation of the approach. In general, the GAs will not necessarily achieve the optimal solution to the feature extraction problems, but they permit it by means of a careful manipulation of the solutions and they increase the chances of success choosing a suitable description to the image features. There is an advanced way that tries to improve the GAs characteristics, the Genetic Programming (GP). In this technique not only the solution, but the whole GAs evolves. In this way the algorithm can adapt itself to different kinds of problems and give creative solutions for them. GP is often used to construct image-processing operators for specific tasks. The fitness of the parse tree is usually evaluated by comparison with training examples, where the task to be achieved has been performed manually. There are several ways to perform the output of the AI image feature extraction approaches. Indeed, sometimes may be convenient to have a “standard” output structure contained the set point of interest about the features of an image, such as: graphs, trees, vectors, and so on. In other situations it could be convenient to have more complex “structures” that describe the extracted features, such as: raster images, diagrams, mathematical functions, and so on.
hybrid systems & recent applications The collaborative use of more AI image feature extraction algorithms (hybrid systems) can be performed to extract features from complex domains otherwise to improve the performance of a specific AI algorithm. For example, GAs have been used to support the training process of the ANNs. The GAs can be also used to drive the whole structure of the net. Usually, the ANNs engage collaborative relationships with several kinds of EAs (not only GAs). Several hybrid systems adopt at the same time more than one approach such as" fuzzy logic (a basic AI area) IAs, ANNs, EAs, and so on. A common strategy of hybrid systems is to use the different involved systems by a layer strategy. For example it is possible to use an ANN to recognize the main
13
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
features of an image, and after, on the extracted features, it can be convenient to adopt a set of IAs to refine the aforementioned features. Bellow are described some interesting feature extraction applications based on AI methods. For example GENIE (GENetic Imagery Exploration) (Perkins et al., 2000) uses a genetic programming approach to produce automatic feature extraction (AFE) tools for broad-area features in multispectral, hyperspectral, panchromatic, and multi-instrument imagery. Both spectral and spatial signatures of features are discovered and exploited. Another interesting application, mentioned in (Del Buono, 2007) is Face Recognition ActiveX DLL. This application uses neural net back propagation algorithm with more artificial intelligence tools added for imaging optimization. Library works great even for a low resolution Web cam image and requires the user to align to a mirror frame on screen. POOKA application (Porter et al., 2003) combines reconfigurable computing hardware with evolutionary algorithms to allow rapid prototyping of image and signal processing algorithms implemented at the chip level. This enables POOKA to rapidly produce customized automated feature extraction algorithms the run hundreds of times faster than equivalent algorithms implemented in software. The efforts aimed to improve the actual systems in feature extraction field are in increasing growth.
conclusion The novel technologies, tied to the digital image environment, used to support the complex human being activities have brought an exponential growth of the image data sets into the local and/or distributed image databases. The entropy of this kind of information makes the management of the images a hard task, moreover the high detail level of the actual images suggests that it is possible to explain the image content to develop innovative feature extraction mechanisms through which to perform store and retrieval activities in efficient and effective way. In particular, genetic algorithms (GAs) and other artificial intelligent (AIs) based approaches seem to provide the best suitable solutions in these feature extraction issues. In this way advanced systems both to retrieve and to handle the images in local and/or distributed databases can be developed.
Future Research Directions Whole just described image feature extraction approaches have to use a training data set to learn particular patterns and/or specific features that have to be recognized by the related system. In the last years, the emerging trend is to use the training data set by Gaussian Pyramidal Approach (Heeger, 1995; Lowe, 2004). In this approach an image (source image) belonging to the training data set is analyzed to identify the desiderate patterns/features, moreover a new image (sub-sampled image) is generated. The sub-sampled image comes from the source image, the sub-sampling can depend on several factors, such as: lowering of resolution, replacing of pixel sets in the source image with pixel sets less dense (for example, due to an elaboration step), application of mathematical transformation, and so on. This process can be repetitively performed on the obtained image (sub-sampled image) to provide another
14
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
sub-sampled image respect to the previous one. In this way from a single image, in training data set, it is possible to obtain several images, where every obtained image represents a level of the Gaussian Pyramid (source image is the pyramid base, and the other sub-sampled images represents respectively the first level, the second level, and so on). All these images are used to understand the variation, quality and quantity of patterns/features into the image application domain. In fact in this way it is possible to obtain a more rich informative content about researched patterns/features. For example, by compared analysis of the different plans (obtained by a simple sub-sampled resolution) it is possible to observe information about the repetitions of specific patterns represented by different scale values. An interesting opportunity that has to be taken into account in image feature extraction of complex and high resolution images is mathematical representation of the main patterns/features. That is, every pattern/feature (independently from the approach used to find it) can be expressed by a mathematical model, which can be exploited to perform provisional studies about the image application domain. For example, in medical image analysis mathematical models of different patterns representing the human tissues are used to estimate provisional models regarding: diagnosis, analysis, zones delineation, abnormal masses identification and irradiation, and so on.
References Arunkumar, S. (2004). Neural network and its application in pattern recognition (Dissemination Report). Department of Computer Science and Engineering, Indian Institute of Technology, Bombay. Belkhatir, M. (2005). A symbolic query-by-example framework for the image retrieval signal/semantic integration. In Proceedings of the 17th IEEE International Conference on Tools with Artificial Intelligence, ICTAI (pp. 348-355). Washington, DC, IEEE Computer Society Press. Belkhatir, M., Chiaramella, Y., & Mulhem, P. (2005). A signal/semantic framework for image retrieval. In Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL ‘05 (pp. 368368). Denver, CO, USA, ACM Press. Boerrs, E. J., & Kuiper, H., (1992). Biological metaphors and the design of modular artificial neural networks. Master’s thesis, Department of Computer Science and Experimental Psychology, Leiden University. Brumby, S. P., Theiler, J., Perkins, S., Harvey, N. R., & Szymanski, J. J. (2002). Genetic programming approach to extracting features from remotely sensed imagery. Paper presented at the internal meeting of Space and Remote Sensing Sciences, Los Alamos National Laboratory, Los Alamos, New Mexico. Brumby, S. P., Theiler, J., Perkins, S. J., Harvey, N., Szymanskia, J. J., Bloch, J. J., & Mitchellb, M. (1999). Investigation of image feature extraction by a genetic algorithm. Paper presented at the internal meeting of the Los Alamos National Laboratory, Space and Remote Sensing Sciences, Santa Fe Institute. Cheng, M. Y. K., Micacchil, C., & Cohen, R. (2005). Adjusting the autonomy of collections of agents in multiagent Ssystems. In Advances in Artificial Intelligence, 3501, 2005 (pp. 33-37). Springer Berlin / Heidelberg Publisher.
15
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
Dagli, C., & Huang, T. S. (2004). A framework for grid-based image retrieval. 17th International Conference on Pattern Recognition, 2, ICPR’04 (pp. 1021-1024). IEEE Computer Society Press. Del Buono, A. (2007). An overview about sophisticatedfFace recognition systems. (Tech. Rep No. 1). National Research Centre, Computer Science and Knowledge Laboratory (CSK Lab). Dol, Z., Salam, R. A., & Zainol, Z. (2006). Face feature extraction using Bayesian network. Proceedings of the 4th International Conference on Computer Graphics and Interactive Techniques (pp. 261-264). ACM Press. Eleuteri, A., Tagliaferri, R., & Milano, L. (2005). A novel information geometric approach to variable selection in MLP Networks. International Journal on Neural Network, 18(10), 1309-1318, Elsevier Science Publishing. Freitas, A. A. (2002). A survey of evolutionary algorithms for data mining and knowledge discovery. Advances in evolutionary computation: Theory and application, natural computing series (pp. 819845). Springer Publisher. Haralick, R. M., & Shanmugam, K. (1973). Textural features for image classification. IEEE Transaction on Systems Man Cysbern, 3(6), 610-621. IEEE Press. Heeger, D. J., & Bergen, J. R. (1995). Pyramid-based texture analysis/synthesis. In Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques (pp. 229-238). ACM Press. Hopgood, A. A. (2005). The state of artificial intelligence. In M. V.Zelkowitz (Ed.), Advances in Computers (pp. 1-75). Elsevier Publishing. Jennings, N. R., Sycara, K., & Wooldridge M. (1998). A roadmap of agent research and development. International Journal on Autonomous Agents and Multi-Agent Systems, 1(1), 7-38. Boston, Kluwer Academic Publishers. Konar, A. (Ed.). (2000). Artificial intelligence and soft computing - Behavioral and cognitive modeling of the human brain. CRC Press. Konstantinos, N. P., & Rastislav, L. (Eds.). (2006). Color image processing: Methods and applications. CRC Press Korolev, L. N. (2007). On evolutionary algorithms, neural-network computations, and genetic programming. Mathematical problems. International Journal on Automation and Remote Control, 68(5), 811-821. Plenum Press. Li, Q., Hu, H., & Shi, Z. (2004). Semantic feature extraction using genetic programming in image retrieval. Proceedings of the 17th International Conference on Pattern Recognition, ICPR 2004, 1(23-26), pp. 648-651. IEEE Computer Society Press. Leveson, N. G., & Weiss, K. A. (2004). Making embedded software reuse practical and safe. Proceedings of the Twelfth International Symposium on ACM SIGSOFT (pp. 171-178). ACM Press.
16
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
Limin, W. (2006). Learning Bayesian-neural network from mixed-mode data. In Neural information processing (pp. 680-687). Springer Publisher. Liu, J., & Yin, J. (2000). Multi-agent integer programing. Proceedings of the Second International Conference on Intelligent Data Engineering and Automated Learning (pp. 301-307), Springer Press. Liu, J., & Zhao, Y. (2002). On adaptive agentlets for distributed divide-and-conquer: A dynamical systems approach. IEEE Transactions on Systems, Man and Cybernetics, 32(2), 214-227. IEEE Press. Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91-110. Perkins, S., Theiler, J., Brumby, S. P., Harvey, N. R., Porter, R., Szymanski, J. J., & Bloch, J. J. (2000). GENIE - A hybrid genetic algorithm for feature classification in multi-spectral images. Los Alamos National Laboratory Internal Proceeding, SPIE 4120 (pp. 52-62). Porter, R., Eads, D., Hush, D., & Theiler, J. (2003). Weighted order statistic classifiers with large rankorder margin. Proceedings of the Twentieth International Conference on Machine Learning, ICML 20, 600-607. Los Alamos National Laboratory, Washington DC. Quin, L. (Ed.). (1999). XML Specification Guide. John Wiley Press. Saad, A., Avineri, E., Dahal, K., Sarfraz, M., & Roy, R. (Eds.). (2007). Soft computing in industrial applications: Recent and emerging methods and techniques. Springer Verlag Press. Sarfraz, M. (Ed.). (2005). Computer-aided intelligent recognition techniques and applications. John Wiley Press. Vadivel, A., Sural, S., & Majumdar, A. K. (2007). An integrated color and intensity co-occurrence matrix. Pattern Recognition Letter, 28, 974-983. Elsevier Science Inc. Publisher. Vu, K., Hua, K. A., & Jiang, N. (2003). Improving image retrieval effectiveness in query-by example environment. Proceedings of the 2003 ACM Symposium on Applied Computing, (pp. 774-781). Melbourne, Florida, USA, ACM Press. Woods, J. W. (Ed.). (2006). Multidimensional signal, image, and video processing and coding. Elsevier Press. Zayan, M. A. (2006). Satellite orbits guidance using state space neural network. Aerospace Conference, 4, AERO 2006. 16-22. IEEE Press. Zhang, Y. J. (Ed.). (2006). Advances in image and video segmentation. Idea Group Inc (IGI) Press. Zheng, B., & Li, Y. (2007). New model for multi-objective evolutionary algorithms. Computational Science – ICCS 2007, 4490, 2007, 037-1044. Springer Berlin/Heidelberg Publisher.
17
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
Additional Reading Bazin, P. L., & Pham, D. L. (2007). Topology Correction of Segmented Medical Images Using a Algorithm. In Computer Methods Prog. Biomed., Vol 88, Issue 2 (pp. 182-190). PubMed Press. Campilho, A., & Kamel, M. (Eds.). (2006). Image Analysis and Recognition. Springer Press. Cyganek, B. (2007). Road Signs Recognition by the Scale-Space Template Matching in the Log-Polar Domain. In Pattern Recognition and Image Analysis, Vol. 4477/2007 (pp. 330-337). Springer Berlin / Heidelberg Publisher. Guyon, I., Gunn, S., Nikravesh, M., & Zadeh, L. A. (Eds.). (2006). Feature Extraction: Foundations and Applications (Studies in Fuzziness and Soft Computing). Springer Press. Haikonen, P. O. (Ed.). (2007). Robot Brains: Circuits and Systems for Conscious Machines. Wiley Press. Julesz, B. (Ed.). (1995). Dialogues on Perception. Cambridge: Bradford/MIT Press. Koskela, M., Sjöberg, M., Laaksonen, J., Viitaniemi, V., & Rushes, H. M. (2007). Summarization with Self-Organizing Maps. In Proceedings of the TRECVID Workshop on Video Summarization, TVS’07 (pp. 45-49). Augsburg, Germany. ACM Press. Koskela, M., & Smeaton., A. F. (2007). An Empirical Study of Inter-Concept Similarities in Multimedia Ontologies. In Proceedings of the 6th ACM international conference on image and video retrieval, CIVR 2007 (pp 464-471). Amsterdam, The Netherlands, ACM Press. Koskela, M., Smeaton, A. F, & Laaksonen J. (2007). Measuring Concept Similarities in Multimedia Ontologies: Analysis and Evaluations. In IEEE Transactions on Multimedia, Vol. 9, Issue 5 (pp. 912922). IEEE Computer Society Press. McIntosh, C., & Hamarne, G. (2006). Genetic Algorithm Driver Statistically Deformed Models for Medical Image Segmentation. In ACM Workshop on Medical Applications of Genetic and Evolutionary Computation Workshop (pp. 1-8). ACM Press. Pratt, W. K. (Ed.). (2007). Digital Image Processing. Wiley Press. Roli, F., & Vitulano, S. (Eds.). (2005). Image Analysis and Processing. Springer-Verlag Publisher. Yeung, D. Y., Kwok, J. T., Fred, A., Roli, F., & de Ridder, D. (Eds.). (2006). Structural, Syntactic, and Statistical Pattern Recognition. Springer Press. Yixin, C., Jia, L., & James Z.W. (Eds.). (2004). Machine Learning and Statistical Modeling Approaches to Image Retrieval. Springer/Kluwer Press. Zhirong, Y., & Laaksonen, J. (2006). A Fast Fixed-Point Algorithm for Two-Class Discriminative Feature Extraction. In Proceedings of 16th International Conference on Artificial Neural Networks, ICANN 2006 (pp. 330-339). Athens, Greece, Springer Press. Zhirong, Y., & Laaksonen, J. (2007). Face Recognition Using Parzenfaces. In Proceedings of International Conference on Artificial Neural Networks, Vol. 4669, ICANN’07 (pp. 200-209). Porto, Portugal, Springer Verlag Publisher. 18
Genetic Algorithms and Other Approaches in Image Feature Extraction and Representation
Zhirong, Y., Zhijian, Y., & Laaksonen, J. (2007). Projective Non-Negative Matrix Factorization with Applications to Facial Image Processing. International Journal of Pattern Recognition and Artificial Intelligence, 21(8), 1353-1362. Zwiggelaar, R., Blot, L., Raba, D., & Denton., E. R. E. (2003). Set-Permutation-Occurrence Matrix Based Texture Segmentation. In Pattern Recognition and Image Analysis, IbPRIA, Vol. 2652 (pp. 1099-1107). Springer Berlin / Heidelberg Publisher.
19