
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 33, NO. 1, FEBRUARY 2003

New Approach to Intelligent Control Systems With Self-Exploring Process

Liang-Hsuan Chen and Cheng-Hsiung Chiang, Student Member, IEEE

Abstract—This paper proposes an intelligent control system called the self-exploring-based intelligent control system (SEICS). The SEICS comprises three basic mechanisms, namely, the controller, the performance evaluator (PE), and the adaptor. The controller is constructed from a fuzzy neural network (FNN) to carry out the control tasks. The PE determines whether or not the controller's performance is satisfactory. The adaptor, comprising two elements, the action explorer (AE) and the rule generator (RG), plays the main role in the system by generating new control behaviors to enhance the control performance. The AE operates through a three-stage self-exploration process to explore new actions, which is realized by a multiobjective genetic algorithm (GA). The RG transforms control actions into fuzzy rules based on a numerical method. The adaptor can make a control system more adaptive in various environments. A simulation of robotic path-planning is used to demonstrate the proposed model. The results show that the robot reaches the target point from the start point successfully in information-poor and changing environments.

Index Terms—Fuzzy neural network (FNN), fuzzy rules, genetic algorithm (GA), intelligent control systems, path-planning.

I. INTRODUCTION

Recently, the concept of human intelligence has been studied in the fields of cognitive science, soft computing, artificial life (A-life), computational intelligence, artificial intelligence, and intelligent control [1]. The three basic functions of an intelligent control system are: 1) perception; 2) decision-making; and 3) action. An intelligent control system therefore requires the ability to sense changes in the environment, to make decisions, and then to generate new control actions. Higher levels of intelligence may include the ability to recognize objects and events, to represent knowledge, and to reason about and plan for the future [2]. "Intelligence" consists of multiple elements, such as perception, reasoning, emotion, knowing, caring, planning, and acting. The intelligent control system is therefore an integrated system that combines more than one artificial-intelligence-based technology, such as fuzzy logic, neural networks (NNs), evolutionary algorithms, case-based reasoning, expert systems, and affective computing, to realize multielement intelligence. Concerning research on intelligent control, some studies were aimed at conceptual architectures [2]–[9], whereas others proposed computing technology-based models [10]–[17].

Manuscript received July 9, 2001; revised March 25, 2002. This paper was recommended by Associate Editor H. Takagi. The authors are with the Department of Industrial Management Science, National Cheng Kung University, Tainan, Taiwan 702, R.O.C. (e-mail: [email protected]; [email protected]). Digital Object Identifier 10.1109/TSMCB.2003.808192

Most intelligent control systems represent control actions by knowledge such as fuzzy rules [1], [10], [13], [17], fuzzy decision trees [15], and symbols [10]. Among them, fuzzy rules are easy to implement and are often applied to construct the fuzzy controller in intelligent control systems. The decision-maker, often constructed by genetic algorithms (GAs) or NNs, as in [1], [11]–[15], and [17], is designed to modify the controller's behaviors so that the system adapts to various environments.

Similar to an intelligent control system, a reinforcement learning system is specified by four elements, namely, a policy, a reward function, a value function, and a model of the environment [18]. All reinforcement learning agents have explicit goals, can sense aspects of their environments, and can choose actions to influence their environments. The policy, which may be stochastic, defines the learning agent's way of behaving at a given time. A reward indicates the intrinsic desirability of a given state for the agent. A value function specifies how good a state is for the system in the long run. Given a state and an action, the model predicts the resulting state and reward. However, in order to select proper actions, a reinforcement learning agent has to carry out trial-and-error searches based on past experience. This may lead to spending more time on finding better actions.

Albus [2] proposed a conceptual architecture of intelligent systems with four functional elements. Although he offered an exhaustive introduction to an intelligent model that imitates human intelligence, the way to realize the intelligent system with actual algorithms, such as artificial intelligence technologies, is not introduced. Shibata and Fukuda [10] presented a hierarchical intelligent control, including an NN and knowledge-based approximation, for robotic manipulators. The knowledge is constructed from symbols and attributes.
This model is easy to design, but the prior knowledge has to be determined in advance. Furthermore, different symbolic knowledge has to be designed for different applications; however, they did not offer a general method for this. Fukuda and Kubota [1] developed the architecture of structured intelligence for robotic path-planning in 1999. Although their model showed better efficiency and generality, it is aimed at robotic path-planning; for other applications, suitable ways to redesign the algorithms (such as the sensory network) have to be found. For generating new behaviors, most intelligent control systems are designed based on pre-designed mathematical formulations, such as reinforcement learning. Such systems may not be applicable in all environments due to their lack of flexibility. If the environments are unpredictable, the behaviors generated by the current methods may not be suitable for the control purpose, and more searching time is therefore required. In addition, to adapt to environments, some studies were aimed at special applications rather than generality, such as the sensory network [1].

To deal with these problems, we investigated a new intelligent control system called the self-exploring-based intelligent control system (SEICS). A three-stage adaptive exploration process is presented to generate new behaviors so that the control system can adapt to various environments. SEICS is comprised of three functions: 1) performance evaluator (PE); 2) controller; 3) adaptor. These functions correspond to perception, action, and decision-making, respectively, as described in [1]. A fuzzy neural network (FNN) is applied to realize the controller, and the adaptor is proposed to generate new behaviors so that the system can adapt to different environments. Two elements, the action explorer (AE) and the rule generator (RG), are constructed in the adaptor to explore new actions and then transform these actions into fuzzy rules. An application of the proposed control system to robotic path-planning is used to demonstrate the adaptive ability of SEICS. Simulation results indicate that the robot can reach the target point successfully in various environments.

This paper is organized as follows. In Section II, we propose the architecture of SEICS and describe its new concepts. Section III introduces the structure of the FNN-based controller. The core function of SEICS, the adaptor, is described in Section IV, where we also introduce the methods to produce new behaviors in the adaptive exploration process and to transform these behaviors into fuzzy rules. Section V develops the GA that realizes the adaptive exploration process. Section VI shows the simulation results of robotic path-planning. Finally, we provide conclusions in Section VII.

1083-4419/03$17.00 © 2003 IEEE

II. THE ARCHITECTURE OF SEICS

People usually take actions through decision-making procedures based on sensed information and the current situations.
In addition, people can learn by acquiring or perceiving information concerning the rewards or penalties for different behaviors. By imitating human adaptive behaviors, this paper proposes an intelligent control system (SEICS) with three features: 1) an adaptive exploration process; 2) fuzzy rules-based knowledge; 3) an FNN-based controller. The adaptive exploration process generates new control actions for the control system; these generated actions are transformed into fuzzy rules, and the associated rules in the rule base are then updated. An FNN is presented for the control task. Fig. 1 shows the three functions of SEICS, i.e., the PE, the adaptor, and the controller. The PE perceives the external environment and evaluates the performance of the system. Equation (1) defines the PE's representation of the performance in the current control situation:

r = 0, if the controller's performance is unsatisfactory
r = 1, if the controller's performance is satisfactory   (1)


Fig. 1. Architecture of SEICS.

where r is the reward signal with two values, 0 and 1, specifying the unsatisfied and satisfied situations, respectively. In the example of robotic path-planning, a robot has to move toward the target point without collisions. At each time step, the robot moves forward a fixed distance, and the PE judges whether a collision occurs. If the robot runs into an obstacle, then r = 0; otherwise, r = 1. A switch is used in the proposed system to change the signal loop. If the performance is satisfactory, the switch turns to the control loop (to the controller); otherwise, it turns to the adaptation loop (to the adaptor). Explicitly, when the controller's performance is satisfactory, the controller keeps on with its task; when the performance is unsatisfactory, the adaptor is enabled to explore new control actions and to update the rule base.

The adaptor comprises the AE and the RG. Following the adaptive exploration process, the AE explores new control actions for the controller through the following three-stage procedure.

1) The return stage determines the number of actions we should go back. This stage imitates the human ability of self-examination in examining past actions.
2) The lead stage determines the number of actions that should be explored.
3) The exploration stage explores the new actions (denoted as the vector A) to adapt to various environments.

The RG generates fuzzy rules by a numerical algorithm [19] using numerical data, which are produced by the AE and treated as the new control actions. The generated fuzzy rules update the associated rules in the fuzzy rule base. The RG algorithm is also applied to construct the initial fuzzy rules in the FNN. The fuzzy rules represent the control behaviors of the system; therefore, when the rule base is changed, the behavior of the controller is also changed. In the following, we introduce how to realize the controller in SEICS based on the FNN.
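The switching between the control loop and the adaptation loop can be sketched as follows. This is an illustrative sketch only; all function and class interfaces here are hypothetical, not from the paper.

```python
# Hypothetical sketch of the SEICS outer loop: the PE emits a reward signal
# r (0 = unsatisfactory, 1 = satisfactory) as in Eq. (1), and a switch routes
# the signal either to the controller (control loop) or to the adaptor
# (adaptation loop: AE exploration followed by RG rule updating).

def performance_evaluator(collision: bool) -> int:
    """Eq. (1): r = 0 when performance is unsatisfactory, 1 when satisfactory."""
    return 0 if collision else 1

def seics_step(state, controller, adaptor, rule_base, env):
    action = controller(state, rule_base)
    next_state, collision = env(state, action)
    r = performance_evaluator(collision)
    if r == 1:
        return next_state, rule_base          # control loop: keep acting
    # adaptation loop: explore new actions and update the rule base
    new_actions = adaptor.explore(state)      # three-stage self-exploration (AE)
    rule_base = adaptor.generate_rules(rule_base, new_actions)  # RG
    return state, rule_base
```

Note how an unsatisfactory step returns the unchanged state with an updated rule base, so the controller retries with its new behavior.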
III. FNN-BASED CONTROLLER

More than a decade ago, neuro-fuzzy systems were applied to many fields successfully. Shann and Fu [21] proposed a five-layer FNN with three learning phases. They applied the gradient descent learning algorithm [22] to train the membership functions, and a heuristic method is presented to train the parameters of the AND/OR operators in the fuzzy inference system. The fuzzy rules are tuned by another heuristic algorithm. Kim and Kasabov [23] proposed a similar FNN structure, HyFIS. They also applied the gradient descent learning algorithm to learn the membership functions, but the fuzzy rules are generated by the numerical method proposed by Wang and Mendel [19]. However, the applications of the above two methods are limited by the complexity of the methods and their computational inefficiency. In this study, an FNN with a five-layer structure is also used, which is easy to implement based on two major types of learning: 1) structure learning to generate the initial fuzzy rules; 2) parameter learning to fine-tune the membership functions.

A. Network Structure

The proposed five-layer FNN is shown in Fig. 2, consisting of the input, fuzzification, rule, OR-operation, and defuzzification layers. It handles multiple inputs and multiple outputs. Nodes in Layer 1 are input nodes that transmit the input signals to the next layer directly. Nodes in Layers 2 and 4 are linguistic term nodes treated as membership functions to express the fuzzy linguistic variables. Each node in Layer 3 is a rule node representing a fuzzy rule. The nodes in Layer 5 carry out the defuzzification to obtain crisp values for the output variables. All weights assigned to the connections between two layers in the whole network are one. The input to a node of a given layer and its output are denoted by u and O, respectively, with superscripts indicating the layer; for notation, the subscripts of the nodes in Layers 2-4 are j, k, and l, respectively. The outputs of each layer of the proposed FNN are introduced as follows.

Fig. 2. Network structure of the five-layered FNN.

1) Layer 1: The input layer, which passes the input vector x = (x_1, ..., x_n) to the second layer directly. Each node in this layer connects to the corresponding input.

2) Layer 2: The fuzzification layer, which transfers the crisp values to membership degrees through membership functions. The layer consists of the term nodes, and the activation function in each node serves as a membership function. For each node j in this layer, with input u_j^(2), the output is

O_j^(2) = exp( -(u_j^(2) - m_j)^2 / sigma_j^2 )   (2)
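As a concrete, simplified illustration of a forward pass through Layers 1-5, the following sketch assumes Gaussian membership functions, product AND, max OR, and correlation-product/centroid defuzzification, as above. The tiny two-input rule base (two terms per variable, four rules, two output terms) is invented for illustration and is not from the paper.

```python
import math

def gauss(x, m, s):
    """Layer 2, Eq. (2): Gaussian membership degree."""
    return math.exp(-((x - m) ** 2) / (s ** 2))

def fnn_forward(x, in_mfs, rules, out_mfs):
    # Layers 1-2: fuzzify each input with its term nodes
    mu = [[gauss(xi, m, s) for (m, s) in terms] for xi, terms in zip(x, in_mfs)]
    # Layer 3, Eq. (3): product AND over the antecedent terms of each rule
    fire = [mu[0][i] * mu[1][j] for (i, j, _) in rules]
    # Layer 4, Eq. (4): max OR over the rules sharing a consequent term
    strength = []
    for l in range(len(out_mfs)):
        linked = [f for f, (_, _, out) in zip(fire, rules) if out == l]
        strength.append(max(linked) if linked else 0.0)
    # Layer 5, Eq. (5): centroid of correlation-product inference
    num = sum(m * s * o for (m, s), o in zip(out_mfs, strength))
    den = sum(s * o for (_, s), o in zip(out_mfs, strength))
    return num / den if den else 0.0

in_mfs = [[(0.0, 1.0), (1.0, 1.0)], [(0.0, 1.0), (1.0, 1.0)]]  # (mean, sigma)
out_mfs = [(-0.5, 1.0), (0.5, 1.0)]
rules = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1)]  # (term_x1, term_x2, term_y)
y = fnn_forward([0.2, 0.8], in_mfs, rules, out_mfs)
```

The output stays within the span of the output-term means, as a centroid should.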

B. Structure Learning

We consider the simple and straightforward method proposed by Wang and Mendel [19] for generating initial fuzzy rules from numerical input-output data pairs. It has been used in several studies [23], [25]. We briefly introduce the main steps of the algorithm.
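The Wang-Mendel procedure can be sketched compactly as follows. The triangular membership functions and the sample data here are assumptions for illustration (the paper's FNN uses Gaussian functions); the sketch keeps only the core idea: one rule per data pair, conflicts resolved by importance degree.

```python
def tri(x, a, b, c):
    """Triangular membership degree of x for the fuzzy set (a, b, c)."""
    if x <= a or x >= c:
        return 1.0 if x == b else 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def best_term(x, terms):
    """Steps 4/6: index and degree of the term with maximal membership."""
    degrees = [tri(x, *t) for t in terms]
    i = max(range(len(degrees)), key=degrees.__getitem__)
    return i, degrees[i]

def wang_mendel(data, in_terms, out_terms):
    """Steps 3-7: one candidate rule per data pair; keep the most important."""
    rules = {}
    for xs, y in data:
        picks = [best_term(x, t) for x, t in zip(xs, in_terms)]
        out_i, importance = best_term(y, out_terms)
        for _, d in picks:
            importance *= d                      # Step 5: product of maxima
        antecedent = tuple(i for i, _ in picks)
        old = rules.get(antecedent)
        if old is None or importance > old[1]:   # Step 7: keep strongest rule
            rules[antecedent] = (out_i, importance)
    return {ant: out for ant, (out, _) in rules.items()}

terms = [(-0.5, 0.0, 0.5), (0.0, 0.5, 1.0), (0.5, 1.0, 1.5)]  # 3 sets per variable
rules = wang_mendel([((0.1, 0.9), 0.5), ((0.8, 0.2), 1.0)], [terms, terms], terms)
```

Each resulting entry maps an IF-part (a tuple of term indices) to a THEN-part term index.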


Step 1) Determine the input and output variables: The input variables x_1, ..., x_n and the output variables y_1, ..., y_m of the FNN are determined.

Step 2) Determine the membership functions for each input/output variable: Suppose that the numbers of membership functions for each input variable and each output variable in the FNN are p and q, respectively.

Step 3) Form the numerical data set: The t-th numerical input-output data set, i.e., the t-th training sample of the FNN, can be formed as

(x_1^(t), ..., x_n^(t); y_1^(t), ..., y_m^(t)).   (6)

Step 4) Calculate the membership degrees: For each input value x_i^(t) and the corresponding output value y_j^(t), the membership degrees are collected for each variable as row vectors

mu(x_i^(t)) = [mu_1(x_i^(t)), ..., mu_p(x_i^(t))] for the inputs   (7)

and

mu(y_j^(t)) = [mu_1(y_j^(t)), ..., mu_q(y_j^(t))] for the outputs   (8)

where mu(x_i^(t)) and mu(y_j^(t)) specify the membership degrees of x_i^(t) and y_j^(t), respectively, based on the membership functions.

Step 5) Assign an importance degree to each data pair: To calculate the importance degree, first pick the maximum value of each vector mu(x_i^(t)) and mu(y_j^(t)). Next, multiply these maxima to obtain the importance degree of the data pair in (6):

D^(t) = prod_i max mu(x_i^(t)) * prod_j max mu(y_j^(t)).   (9)

Step 6) Construct a fuzzy rule from each numerical data pair: For each input/output variable, find the linguistic term associated with the maximal membership degree, and use these terms to form the rule. The t-th fuzzy rule then has the form

IF x_1 is A_1^(t) and ... and x_n is A_n^(t) THEN y_1 is B_1^(t) and ... and y_m is B_m^(t)   (10)

where A_i^(t) and B_j^(t) are the linguistic terms selected for the inputs and outputs, respectively.

Step 7) Delete the conflicting fuzzy rules: Two rules that have the same fuzzy sets in the IF part but different fuzzy sets in the THEN part are called conflicting rules. To resolve a conflict, only the rule with the highest importance degree is retained.

C. Parameters Learning

After the structure learning, the whole fuzzy rule base of the FNN is established. In the control process, the parameters of the membership functions in the FNN have to be adjusted. The parameters in (2) and (5) are denoted as the set W. The parameters in Layers 2 and 4 are updated to minimize the following error measure using the gradient descent learning algorithm:

E = (1/2) sum_t sum_j (d_j^(t) - y_j^(t))^2   (11)

where d_j^(t) and y_j^(t) are the desired and actual outputs of the j-th node in Layer 5 for the t-th training sample, and the total number of training data is T. The updating process can be expressed as

w(tau + 1) = w(tau) - eta * dE/dw   (12)

where eta is the learning rate of the FNN and w is an element of the parameter set W. The adaptation of the FNN is performed by applying the chain rule; the procedure is given in Appendix A.

IV. ARCHITECTURE OF ADAPTOR

In order to realize "intelligence" for control systems successfully, it is necessary to develop a computational approach that imitates human behavior. Human adaptive behavior usually includes three steps: 1) forming an adaptive motivation when the current behavior is judged unacceptable; 2) making decisions through thinking and reasoning; 3) generating feasible actions to be carried out. Following these steps, a three-stage exploration process is presented in the next subsection.

A. Three-Stage Exploration Process for the Action Explorer (AE)

We propose the adaptive exploration process illustrated in Fig. 3 for the AE to explore new actions. The process involves three critical points, namely, the failure, return, and lead points. The failure point occurs when the controller's performance is poor, i.e., when the performance index r = 0. Following human adaptive behavior, we first return to some past action (called the return point). Fig. 3 indicates that the failure point is action 6 and the return point is action 4 (return two actions, n_r = 2). Next, the lead point is the newest action to be found; here the lead point is action 7', so seven new actions have to be found (n_l = 7). The seven actions found in the exploration stage are A = (a_1, a_2, ..., a_7), where a_i denotes the i-th explored action. The AE can be formulated as a multiobjective optimization problem as follows:

Max./Min.  f_k(n_r, n_l, A), k = 1, ..., K
Subject to n_r and n_l within their lower and upper bounds, A in S   (13)

Fig. 3. Illustration of the three-stage adaptive exploration process.

Fig. 4. Illustrations of variable-length and discrete-encoded chromosomes. (a) A chromosome with eight genes. (b) A chromosome with ten genes.

In the above formulation, there are K objective functions and three decision variables, n_r, n_l, and A. As an example, the objective functions may be specified to minimize the moving distance and to avoid collisions in the robotic path-planning problem (here, two objectives). n_r and n_l specify the number of returned steps and the number of explored steps, respectively; their lower and upper bounds are determined according to the problems in the environment. A is a vector of the new actions a_1, ..., a_{n_l}, restricted to the feasible solution space S. The new actions are the controller's outputs for the output variables; therefore, A is represented as a set when there is more than one output variable.

B. Rule Generator (RG)

After exploring new actions, the RG is applied to generate fuzzy rules using numerical input-output data pairs containing the actions in the set A and their corresponding inputs:

Pair_i = (x_1^(i), ..., x_n^(i); a_i)   (14)

where a_i and x_j^(i) indicate the output and the j-th input of data pair i, respectively. For example, suppose we have two input variables and one output variable, and the AE finds n_l = 7 and A = (a_1, ..., a_7); thus, we can generate seven fuzzy rules. If the corresponding inputs are (x_1^(i), x_2^(i)), i = 1, ..., 7, then we can apply (14) to construct the numerical data pairs as follows:

Pair_1 = (x_1^(1), x_2^(1); a_1)
Pair_2 = (x_1^(2), x_2^(2); a_2)
...
Pair_7 = (x_1^(7), x_2^(7); a_7).
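The optimization in (13) is realized in Section V by a GA over variable-length, discrete-encoded chromosomes. A stripped-down sketch follows; the objectives, weights, bounds, and action encoding are illustrative assumptions, and for brevity only elitism and the shift operator are kept (selection, crossover, mutation, swap, insertion, and deletion are omitted).

```python
import random

def random_chromosome(rng, n_actions=(6, 10), action_values=(1, 5)):
    """Variable-length chromosome: [n_r, n_l, a_1, ..., a_{n_l}] (assumed layout)."""
    n_r = rng.randint(1, 3)                 # return steps
    n_l = rng.randint(*n_actions)           # explored steps (variable length)
    return [n_r, n_l] + [rng.randint(*action_values) for _ in range(n_l)]

def fitness(chrom, objectives, weights):
    """Weighted sum of the objective functions, as in Eq. (16)."""
    return sum(w * f(chrom) for w, f in zip(weights, objectives))

def shift(chrom, rng):
    """Shift operator: rotate the action genes by one position left or right."""
    actions = chrom[2:]
    k = rng.choice([-1, 1])
    return chrom[:2] + actions[k:] + actions[:k]

def evolve(objectives, weights, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [random_chromosome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: fitness(c, objectives, weights), reverse=True)
        elite = pop[:2]                      # elitism: best survive unchanged
        children = [shift(rng.choice(elite), rng) for _ in range(pop_size - 2)]
        pop = elite + children
    return max(pop, key=lambda c: fitness(c, objectives, weights))
```

A usage example: prefer shorter new paths (smaller n_l) while rewarding a hypothetical "straight ahead" gene value of 3, with weights chosen arbitrarily.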

Using the above data pairs, we can get seven new fuzzy rules by applying the algorithm described in Section III-B.

V. MULTIOBJECTIVE GENETIC ALGORITHM FOR ACTION EXPLORER

In this section, we present a GA to implement the AE. The algorithm is developed with three features: 1) multiple objectives; 2) variable-length chromosomes with discrete-encoded genes; 3) multiple operators. Four major properties are introduced as follows.

A. Representation Mechanism

The solution of a problem can be represented as an artificial chromosome consisting of numerous artificial genes. A gene (or several genes) is applied to express a parameter. The variable-length chromosome is suitable in uncertain environments, because the number of actions is not known in some situations [28]. Genes can be encoded in several ways for specific problems, such as binary-encoded genes [29], discrete-encoded genes [30], real-encoded genes [31], gray-encoded genes [32], symbol-encoded genes [33], and hybrid-encoded (combining numbers and symbols) genes [34]. In order to reduce the complexity of the GA, the discrete-encoded type is suitable for our problem, since the storage space of the discrete-encoded type is smaller than that of the binary-encoded type, and the required computational time is less than that of the real-encoded type. We present variable-length and discrete-encoded chromosomes, illustrated in Fig. 4, in this study. In Fig. 4(a), the AE explores six new actions; thus, there are eight genes in Fig. 4(a). As another example, Fig. 4(b) shows that eight new actions are expressed in the chromosome. Each discrete number in the chromosome indicates a specific action, such as move forward or turn left in the robotic path-planning.

B. Fitness Function for the Evaluation Mechanism

The fitness function is used to evaluate the suitability of a chromosome for the real problem. The fitness function is a function of the multiple objectives:

fit = F(f_1, ..., f_K)   (15)

where f_k denotes the k-th objective function. One way to implement (15) is the weighted sum of the objectives, such as

fit = sum_k w_k * f_k   (16)

where the nonnegative weight w_k is decided by prior knowledge or according to the requirements of the environment.

C. Genetic Operation Strategies

In general, the conventional GA performs only three basic genetic operators: 1) selection; 2) crossover; and 3) mutation. Nearchou [29] proposed three other operators: 1) swap; 2) insertion; and 3) deletion. The proposed GA includes a new operator, the shift operator, in addition to the above six. An illustration of the shift operator is shown in Fig. 5.

Fig. 5. Illustration of the shift operator used in the proposed GA.

1) Selection: The roulette wheel method [27] is used for selection. The higher the fitness value of a chromosome, the higher its chance of being selected.
2) Crossover: According to a probability, a two-point crossover is used: the genes between the two selected cross points are interchanged between the two strings.
3) Mutation: The value of each gene of a chromosome is altered randomly with a small probability.
4) Swap: The genes on the two sides (left and right) of a swap point are exchanged with a given probability.
5) Insertion: Several genes (the number is determined randomly) are inserted into the existing sequence of genes with a small probability.
6) Deletion: Several genes (the number is determined randomly) are deleted from the sequence of genes with a given probability.
7) Shift: According to a given probability, a sequence of genes is shifted by one or more positions to the left or right.

D. Parameters Design and Replacement Strategy

A significant problem in designing a GA is the determination of proper control parameters. Traditionally, this determination is achieved through exhaustive experimental work. Since the environments of intelligent control systems are dynamic and changeable, engineering exact control parameters is very costly; thus, determining the parameters through experiments is a practical way. An experimental design method, the Taguchi approach [35], can be employed to determine the optimal levels of the control parameters to reach higher system performance. Also, an elitism strategy [27] is applied to select individuals for reproduction: one or two of the best individuals are directly copied into the new population. Elitism guarantees that the maximal fitness of the new generation is greater than or equal to that of the previous one. After carrying out the genetic operations, the new generation is born to replace the old. Such operations are repeated until the STOP condition is satisfied. The stop condition can be specified as either a maximum number of generations (e.g., 100 generations) or the emergence of the best solutions in the population (e.g., the best vehicle path).

VI. APPLICATION: ROBOTIC PATH-PLANNING

In this section, we introduce simulations of robotic path-planning to demonstrate the proposed method. The path-planning problem is solved to obtain the shortest path for the robot from a given start point to a target point without collisions [1], [17], [29], [33]. We aim at generating an approximately shortest path for a robot without collisions to demonstrate the adaptation ability of the proposed approach. The environments for the planning are changeable and complex, so that solving the planning problem appropriately by an intelligent control system demonstrates its efficiency [1], [33].
A. Problem Definition for Robotic Path-Planning

In all experiments, a simulated robot-like vehicle must navigate in the moving space from a start point to a target point, as illustrated in Fig. 6 (the obstacles in the moving space are generated randomly). The robot has a cylindrical structure, and the target is a circular zone. Given a start point, which specifies the center of the robot, the robot has to move toward the target point. The area of the moving space is 100 x 100 cm^2. The radii of the robot and the target are fixed in advance. Rectangular obstacles are generated randomly, and the width and height of each obstacle lie within a fixed interval. The robot does not know the positions of these obstacles; we assume only that the robot can recognize its own position and the target's position. At each discrete time step, the robot moves a fixed distance of 2.25 cm with a steering angle and judges whether a collision occurs. The robot has touch sensors and can recognize the collided obstacle.

Fig. 6. Illustration of the robotic path-planning problem. (a) Sketches of the robot and the target zone. (b) Moving space of a robot.

B. Apply SEICS to Robotic Path-Planning

As an initial step, the FNN-based controller is trained in the obstacle-free moving space; thus, the robot will move from the start point to the target along a straight line. When a collision occurs, the adaptor is enabled. After that, the AE in the adaptor explores a new feasible path. Then, the RG in the adaptor transforms the new path into fuzzy rules and updates the corresponding rules in the fuzzy rule base. After the rule base of the controller is changed, the controller's behavior becomes more suitable for the environment.

Two fuzzy input variables, x_1 and x_2, specify the distance and the included angle (in radians) between the robot and the target point, respectively. One fuzzy output variable, y, represents the steering angle (in radians) of the robot. The designs of the two main modules of SEICS, the controller and the adaptor, are described next.

1) Construct the FNN for the Controller: For structure learning, we built the initial fuzzy rule base for the FNN based on the relationship between the input and output variables, since the FNN-based controller is trained in the obstacle-free moving space. Here, we set nine linguistic terms for each input and output variable, and 81 fuzzy rules are produced, because each rule has two input variables. The fuzzy set for input variable x_1 is GF (great far), VF (very far), MF (medium far), SF (slightly far), FA (fair), SC (slightly close), MC (medium close), VC (very close), and GC (great close). The group of fuzzy sets for input variable x_2 is LB (left big), LM (left medium), LS (left small), LT (left tiny), FR (forward), RT (right tiny), RS (right small), RM (right medium), and RB (right big). The output variable y has the same fuzzy set as x_2. The general fuzzy rule can be written as

IF x_1 is A_1 and x_2 is A_2 THEN y is B   (17)

where A_1, A_2, and B are any linguistic terms of x_1, x_2, and y, respectively.
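The moving-space mechanics described in Section VI-A can be sketched as follows. The 2.25 cm step size is from the paper; the robot radius value below is an assumption, since the paper fixes the radii without showing their values here.

```python
import math

STEP = 2.25          # cm per time step (from the paper)
ROBOT_RADIUS = 2.0   # cm, assumed for illustration

def advance(x, y, heading, theta):
    """One time step: steer by theta (radians), then move STEP cm forward."""
    heading += theta
    return x + STEP * math.cos(heading), y + STEP * math.sin(heading), heading

def collides(x, y, rect):
    """Circle-vs-rectangle test; rect = (x_min, y_min, width, height)."""
    rx, ry, w, h = rect
    cx = min(max(x, rx), rx + w)   # closest point of the rectangle to the robot
    cy = min(max(y, ry), ry + h)
    return (x - cx) ** 2 + (y - cy) ** 2 <= ROBOT_RADIUS ** 2
```

The PE of Eq. (1) would simply call `collides` against every obstacle after each `advance`.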

For parameter learning, we generated 625 data pairs uniformly as training samples. After 60 epochs, the root mean square error (RMSE) is 0.0044, indicating that the training is satisfactory.

2) Design the Adaptor: The three main components of the adaptor are specified as follows.

a) The PE: According to (1), r = 0 denotes a collision, while r = 1 indicates that no collision occurs at the current time step.

b) The AE: In order to simplify the search space of the GA, we conducted preliminary experiments to specify the two variables n_r and n_l. Based on the outcomes, the number of returning actions is fixed at three (n_r = 3), while n_l is encoded as an integer between six and ten. Each explored action a_i is encoded as a discrete gene value that indicates a specific steering command (18). Based on the characteristics of the system studied in this paper, we modified the fitness function proposed by Nearchou [29] for the GA, shown in (19) and (20): the fitness decreases with the distance between the target point and the lead point of the new path, and with the number of collisions along the path. The parameters of the GA are determined by experiments as follows: the population size is 200, the number of generations is 30, and the operator probabilities are set by experiment.

c) The RG: The numerical data pair is formed as in (14), pairing each explored action with its corresponding inputs (21). Then, the numerical method introduced in Section III-B is applied to transform the data into fuzzy rules.

C. Simulation Results

Fig. 7 illustrates the original behavior of the robot in the obstacle-free environment. Given the start point (10, 26) and the target point (76, 68), the robot moves to the target along a straight line. Next, we show the adaptive behaviors of the robot in environments with obstacles.
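The closed form of the modified fitness in (19) and (20) is not reproduced here, so the following is an assumed stand-in, not the paper's formula. It captures only the stated behavior: fitness decreases with the distance between the lead point of the new path and the target, and with the number of collisions; the collision penalty weight is hypothetical.

```python
import math

def path_fitness(lead_point, target, n_collisions, penalty=10.0):
    """Assumed surrogate for Eqs. (19)-(20): higher is better."""
    d = math.dist(lead_point, target)     # distance from lead point to target
    return 1.0 / (1.0 + d + penalty * n_collisions)
```

Under this surrogate, a collision-free path whose lead point sits on the target scores 1.0, and every extra centimeter of distance or collision lowers the score.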


Fig. 7. Original behavior of the robot. (a) Trajectory of a robot in the obstacle-free environment. (b) Control surface.

Fig. 8. Case I: Adaptive behavior of the robot in the environment with ten obstacles. (a) Adaptive trajectory after a collision. (b) Evolution of the GA searching for a new path. (c) Control surface after changing the fuzzy rules.

1) Case I: Initially, a simpler environment with ten randomly produced obstacles is generated for the robot. The start and target points are (20, 20) and (60, 60). Fig. 8(a) shows that there is a collision and that the robot found a new path through the three-stage exploration process. Here, the robot moves back three steps (n_r = 3) and goes forward by seven new actions (n_l = 7) based on the results from the GA; thus, the new path consists of seven moving steps. The GA carries out the exploration process, and the evolution is shown in Fig. 8(b). Two fuzzy rules are changed by the RG, as shown in (22) and (23).

Fig. 8(c) shows the changed control surface of the controller.

2) Case II: Twenty obstacles are produced in this case. The start and target points are given as (10, 20) and (60, 60). Fig. 9(a) shows the trajectory of the robot after the occurrence of two collisions. The adaptor is enabled twice, and the return and lead steps in these two events are the same. Two fuzzy rules are updated after the two exploration processes, and the changed control surface is shown in Fig. 9(b). The corresponding updated fuzzy rules are given in (24) and (25).

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 33, NO. 1, FEBRUARY 2003

Fig. 9. Case II: Adaptive behavior of the robot in the environment with 20 obstacles. (a) Adaptive trajectory of two collisions. (b) Control surface after changing fuzzy rules. (c) First evolution of GA. (d) Second evolution of GA.

For the first collision, the adaptor is enabled and the GA is applied to find a new path based on the information about the obstacle. After finding six new moving steps, the robot keeps moving, but a collision happens again after several steps; thus, the adaptor is enabled a second time. The evolutions of the GA for the two exploration processes are shown in Fig. 9(c) and (d).

3) Adaptive behaviors of Cases I and II after changing fuzzy rules: We would like to know the new behaviors of the robot after changing its rule base in Cases I and II. In Case I, in order to avoid the obstacle shown in Fig. 8(a), the steering angle has to increase so that the robot can move past the obstacle without collision. Thus, the control surface [Fig. 8(c)] and the new fuzzy rules show that the new steering angles are larger than the original ones. Using the new fuzzy rule base and performing Case I again, Fig. 10(a) shows that the robot avoids the obstacle successfully. In Case II, the adaptor is enabled twice and two fuzzy rules are updated. The new steering angles of the robot are smaller than the originals to avoid the collisions, as shown in Fig. 9(a). Thus, the new control surface of Case II is hollow over the corresponding area, as shown in Fig. 9(b). The new path after carrying out Case II again also avoids the obstacles successfully, as shown in Fig. 10(b). These two cases demonstrate that fuzzy rules can represent the behaviors of the robot successfully. Furthermore, SEICS can control the robot to adapt to various environments with little information by changing the rule base.
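The effect described above, in which enlarging a rule's consequent enlarges the steering output over the corresponding region of the control surface, can be sketched with a toy two-rule weighted-average controller. The rule base, memberships, and consequent values below are illustrative assumptions, not the paper's actual rules.

```python
def tri(x, center, width=1.0):
    """Triangular membership centered at `center`."""
    return max(0.0, 1.0 - abs(x - center) / width)

def steer(angle_err, rules):
    """Sugeno-style weighted-average defuzzification over (center, consequent) rules."""
    fired = [(tri(angle_err, c), out) for c, out in rules]
    total = sum(w for w, _ in fired)
    return sum(w * out for w, out in fired) / total if total else 0.0

original = [(0.0, 0.0), (1.0, 10.0)]   # small error -> 0 deg, large error -> 10 deg
updated  = [(0.0, 0.0), (1.0, 25.0)]   # RG enlarged the second rule's consequent

print(steer(0.6, original))  # 6.0: moderate steering
print(steer(0.6, updated))   # 15.0: larger steering for the same input
```

Changing one consequent lifts the output over every input region where that rule fires, which is exactly how a local rule update reshapes the control surface in Fig. 8(c).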

VII. CONCLUSION

The close linkage of perception, decision-making, and action plays a very important role in achieving high intelligence in intelligent control systems. To apply these concepts to an intelligent control system, this paper has proposed an integrated architecture, called SEICS, with three functions: 1) PE; 2) adaptor; and 3) controller. The adaptor generates new control actions when the performance of the controller is poor. The proposed three-stage adaptive exploration process, realized by a multiobjective GA, is the core of the adaptor for exploring new actions. After exploring new control actions, the adaptor transforms them into fuzzy rules and updates the fuzzy rule base of the controller. The controller executes the control task based on an FNN, which involves two learning procedures: structure learning generates proper initial fuzzy rules, and parameter learning adjusts the membership functions. The main features and advantages of the SEICS developed in this paper are as follows: 1) it is a general framework for realizing "intelligence" in control systems based on well-known technologies such as fuzzy systems, NNs, and GAs;

Fig. 10. Trajectories of a robot after changing fuzzy rules in Cases I and II. (a) Adaptive trajectory of Case I. (b) Adaptive trajectory of Case II.

2) it contains an adaptive mechanism, the adaptor, which generates new behaviors and transforms them into fuzzy rules to adapt to various environments; 3) an adaptive exploration process is presented to explore new control actions in lack-of-information environments; and 4) the proposed FNN can be accomplished easily through parameter and structure learning. To confirm the applicability of the proposed approach, SEICS is applied to a robotic path-planning problem. Simulation results demonstrated that the mobile robot could reach a target point successfully without collisions. This verifies that SEICS can be applied in lack-of-information environments, modifying its rule base to adapt to various environments and avoid repeating the same mistakes.
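The SEICS loop summarized above (the controller acts, the PE evaluates, and the adaptor explores and rewrites rules when performance is poor) can be sketched as follows. The class name, threshold, and callback signatures are assumptions for illustration, not the paper's interfaces; in the paper, the controller is the FNN and the exploration callback is the three-stage GA process.

```python
class SEICS:
    """Minimal sketch of the controller / PE / adaptor loop."""

    def __init__(self, controller, threshold=1.0):
        self.controller = controller  # maps state -> action (an FNN in the paper)
        self.threshold = threshold    # PE's acceptable-error bound (assumed)

    def evaluate(self, error):
        """PE: performance is satisfactory when the error is within the bound."""
        return abs(error) <= self.threshold

    def step(self, state, error, explore, to_rules):
        """One control cycle: act; if performance is poor, let the adaptor
        explore new actions (AE) and rebuild the rule base (RG), then re-act."""
        action = self.controller(state)
        if not self.evaluate(error):
            new_actions = explore(state)             # AE: GA-based exploration
            self.controller = to_rules(new_actions)  # RG: rules -> new controller
            action = self.controller(state)
        return action

ctrl = SEICS(controller=lambda s: 0.0)
# Satisfactory performance: the original controller's action is kept.
print(ctrl.step(state=0.5, error=0.2, explore=lambda s: [], to_rules=lambda a: (lambda s: 1.0)))
```

The key design point the sketch captures is that adaptation is event-driven: the rule base changes only when the PE flags poor performance, so satisfactory behavior is never disturbed.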

APPENDIX A
PARAMETER LEARNING FOR THE PROPOSED FNN

Let the Delta value of a node in the network be defined as the influence of the output of the node on the error measure. The Delta values of Layers 2–5 are defined as follows.
1) Layer 5: For a node in Layer 5, the Delta value is defined as in (A1).
2) Layer 4: For an OR node, the Delta value is defined as in (A2).
3) Layer 3: For a rule node, the Delta value is defined conditionally as in (A3).
4) Layer 2: For a fuzzification node, the Delta value is defined conditionally as in (A4).

The gradients for adjusting the learnable parameters in Layer 4 and Layer 2 are defined as follows.
1) For each Layer-4 parameter, the gradients of the error measure are given in (A5) and (A6).
2) For each Layer-2 parameter, the corresponding gradients are given in (A7) and (A8).

Applying (A1)–(A8), the updated parameters at time t+1 can be computed as in (12).

ACKNOWLEDGMENT
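The layer-wise gradients above all feed the same update form as (12): each learnable parameter moves against its gradient. The following is a minimal sketch under an assumed quadratic loss and a hypothetical learning rate; it illustrates the update rule only, not the FNN's actual gradient expressions.

```python
def update(params, grads, eta=0.1):
    """theta(t+1) = theta(t) - eta * dE/dtheta, applied elementwise."""
    return {k: params[k] - eta * grads[k] for k in params}

# Example: E = (w*x - y)^2 with x = 1, y = 2, w = 1, so dE/dw = 2*(w*x - y)*x = -2.
params = {"w": 1.0}
grads = {"w": -2.0}
print(update(params, grads))  # {'w': 1.2}
```

Iterating this step with the chain-ruled Delta values of Layers 2–5 is what adjusts the membership-function parameters during parameter learning.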

The authors wish to thank J. Yuan and J.-M. Yeh for offering many useful suggestions concerning the research in this paper.


REFERENCES

[1] T. Fukuda and N. Kubota, "An intelligent robotic system based on a fuzzy approach," Proc. IEEE, vol. 87, pp. 1448–1470, Sept. 1999.
[2] J. S. Albus, "Outline for a theory of intelligence," IEEE Trans. Syst., Man, Cybern., vol. 21, pp. 473–509, May–June 1991.
[3] T. Fukuda and T. Arakawa, "Intelligent systems: Robotics versus mechatronics," Annu. Rev. Control, vol. 22, pp. 13–22, 1998.
[4] D. A. Linkers and M.-Y. Chen, "Expert control systems—2. Design principles and methods," Eng. Appl. Artif. Intell., vol. 8, pp. 527–537, 1995.
[5] J. S. Albus, "The engineering of mind," Inf. Sci., vol. 117, pp. 1–18, 1999.
[6] B. Hayes-Roth, "Integrating real-time AI techniques in adaptive intelligent agents," Annu. Rev. Automat. Program., vol. 19, pp. 1–11, 1994.
[7] ——, "An architecture for adaptive intelligent systems," Artif. Intell., vol. 72, pp. 329–365, 1995.
[8] H.-M. Huang, "An architecture and a methodology for intelligent control," IEEE Expert, vol. 11, pp. 46–55, Apr. 1996.
[9] J. Dean, "Animats and what they can tell us," Trends Cogn. Sci., vol. 2, pp. 60–67, Feb. 1998.
[10] T. Shibata and T. Fukuda, "Hierarchical intelligent control for robotic motion," IEEE Trans. Neural Networks, pp. 823–832, Sept. 1994.
[11] D. A. Linkers, M. F. Abbod, A. Browne, and N. Cada, "Intelligent control of a cryogenic cooling plant based on blackboard system architecture," ISA Trans., vol. 39, pp. 327–343, 2000.
[12] T. Morimoto, W. Purwanto, J. Suzuki, and Y. Hashimoto, "Optimization of heat treatment for fruit during storage using neural networks and genetic algorithms," Comput. Electron. Agr., vol. 19, pp. 87–101, 1997.
[13] T. Morimoto and Y. Hashimoto, "An intelligent control for greenhouse automation, oriented by the concepts of SPA and SFA—An application to a post-harvest process," Comput. Electron. Agr., vol. 29, pp. 3–20, 2000.
[14] ——, "AI approaches to identification and control of total plant production systems," Control Eng. Practice, vol. 8, pp. 555–567, 2000.
[15] T. Shibata, T. Abe, K. Tanie, and M. Nose, "Skill-based motion planning in hierarchical intelligent control of a redundant manipulator," Robotics Autom. Syst., vol. 18, pp. 65–73, 1996.
[16] C. D. Stylios and P. P. Groumpos, "Fuzzy cognitive maps: A model for intelligent supervisory control systems," Comput. Ind., vol. 39, pp. 229–238, 1999.
[17] M.-R. Akbarzadeh-T, K. Kumbla, E. Tunstel, and M. Jamshidi, "Soft computing for autonomous robotic systems," Comput. Electr. Eng., vol. 26, pp. 5–32, 2000.
[18] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 2000.
[19] L. X. Wang and J. Mendel, "Generating fuzzy rules by learning from examples," IEEE Trans. Syst., Man, Cybern., vol. 22, pp. 1414–1427, July 1992.
[20] J.-S. R. Jang, C.-T. Sun, and E. Mizutani, Neuro-Fuzzy and Soft Computing—A Computational Approach to Learning and Machine Intelligence. Englewood Cliffs, NJ: Prentice-Hall, 1997.
[21] J. J. Shann and H. C. Fu, "A fuzzy neural network for rule acquiring on fuzzy control system," Fuzzy Sets Syst., vol. 71, pp. 345–357, 1995.
[22] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning internal representations by error propagation," in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, D. E. Rumelhart and J. L. McClelland, Eds. Cambridge, MA: MIT Press, 1986, vol. 1, pp. 318–362.
[23] J. Kim and N. Kasabov, "HyFIS: Adaptive neuro-fuzzy inference systems and their application to nonlinear dynamical systems," Neural Netw., vol. 12, pp. 1301–1319, 1999.
[24] B. Kosko, Neural Networks and Fuzzy Systems: A Dynamic Systems Approach to Machine Intelligence. Englewood Cliffs, NJ: Prentice-Hall, 1992.
[25] P. J. C. Branco and J. A. Dente, "On using fuzzy logic to integrate learning mechanisms in an electro-hydraulic system—Part I: Actuator's fuzzy modeling," IEEE Trans. Syst., Man, Cybern. B, vol. 30, pp. 305–316, Aug. 2000.
[26] J. H. Holland, Adaptation in Natural and Artificial Systems. Ann Arbor, MI: Univ. of Michigan Press, 1975.
[27] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, MA: Addison-Wesley, 1989.
[28] Y. Davidor, Genetic Algorithms and Robotics: A Heuristic Strategy for Optimization. Singapore: World Scientific, 1991.
[29] A. C. Nearchou, "Adaptive navigation of autonomous vehicles using evolutionary algorithms," Artif. Intell. Eng., vol. 13, pp. 159–173, 1999.
[30] M. L. More, J. T. Musacchio, and K. M. Passino, "Genetic adaptive control for an inverted wedge: Experiments and comparative analyses," Eng. Appl. Artif. Intell., vol. 14, pp. 1–14, 2000.
[31] L.-H. Chen, C.-H. Chiang, and J. Yuan, "New approach to adaptive control architecture based on fuzzy neural network and genetic algorithm," Proc. IEEE Int. Conf. Syst., Man, Cybern., pp. 347–352, 2001.
[32] J. Andre, P. Siarry, and T. Dognon, "An improvement of the standard genetic algorithm fighting premature convergence in continuous optimization," Adv. Eng. Softw., vol. 32, pp. 49–60, 2000.
[33] H. Juidette and H. Youlal, "Fuzzy dynamic path-planning using genetic algorithms," Electron. Lett., vol. 36, pp. 374–376, Feb. 2000.
[34] B. Lazzerini and F. Marcelloni, "A genetic algorithm for generating optimal assembly plans," Artif. Intell. Eng., vol. 14, pp. 319–329, 2000.
[35] M. S. Phadke, Quality Engineering Using Robust Design. Englewood Cliffs, NJ: Prentice-Hall, 1989.

Liang-Hsuan Chen received the B.S. and M.S. degrees in industrial management from National Cheng-Kung University, Taiwan, R.O.C., in 1980 and 1982, respectively, and the Ph.D. degree in industrial engineering from the University of Missouri, Columbia, in 1991. Currently, he is a Professor of industrial management science at National Cheng-Kung University. His major research interests include intelligent systems, fuzzy systems, fuzzy set theory and its applications in decision-making problems, decision analysis, and robust system design. Dr. Chen received the Outstanding Research Award in the field of management science from the National Science Council of Taiwan in 2000. He is an Editorial Board Member of the Journal of the Chinese Institute of Industrial Engineers. He was the Managing Editor of Asia Pacific Management Review—An International Journal in 2001.

Cheng-Hsiung Chiang (S’00) was born in Hsinchu, Taiwan, R.O.C., in 1972. He received the M.S. degree in industrial engineering from National Tsing Hua University, Hsinchu, in 1997. Since September 1997, he has been pursuing the Ph.D. degree in the Department of Industrial Management Science at National Cheng-Kung University, Tainan, Taiwan. Since February 2000, he has been a Lecturer with the Department of Industrial Management Science, National Cheng-Kung University. His research interests include cognitive science in engineering, intelligent control system, machine intelligence, application of artificial intelligence, and soft computation techniques such as fuzzy theory, neural networks, evolutionary computation, etc. Mr. Chiang received an IEEE Neural Networks Society Travel Grant at FUZZ-IEEE in 2002. He is a Member of the Chinese Institute of Industrial Engineers.