An Adaptive SOM Neural Network Method to Distributed ... - IEEE Xplore

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TIE.2018.2807368, IEEE Transactions on Industrial Electronics IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS

An Adaptive SOM Neural Network Method to Distributed Formation Control of a Group of AUVs Xin Li, and Daqi Zhu

Abstract—An adaptive self-organizing map (SOM) neural network method is proposed for distributed formation control of a group of autonomous underwater vehicles (AUVs). This method controls the AUVs holding their positions in the formation when the formation moves as a whole. The group of AUVs can reach the desired locations in an expected formation shape along pre-planned trajectories. The proposed control law is distributed in the sense that the controller of each AUV only uses its own information and limited information of its neighboring AUVs. Formation control strategies based on self-organizing competitive calculations are carried out with workload balance taken into consideration, so that a group of AUVs can reach the desired locations on the premise of workload balance and energy sufficiency. Moreover, the formation can avoid obstacles and change its shape as needed. The formation is in a distributed leader-follower like structure, but there is no need to designate the leader and the followers explicitly. All the AUVs in the formation are treated equal to be the leader and the followers, so that important characteristics such as adaption and fault tolerance are achieved. Comparison results with traditional method and experiments demonstrate the effectiveness of the proposed method. Index Terms—SOM, AUV, formation control, leaderfollower, formation keeping, formation tracking.

I. I NTRODUCTION UV is a highly autonomic device which has no physical connection to the mother ship and can accomplish the given tasks with its own control and guide system in the underwater environment [1]. With the increasing complexity of tasks, it is necessary to carry out underwater missions which are beyond the capacity of a single AUV through cooperation of multiple AUVs. In certain missions, the AUVs should move collectively as a formation. Formation control is a technology to control a group of vehicles, including ground robots, aerial crafts, surface vehicles and AUVs [2], [3], moving along a desired path as the task requires, at the same time maintaining desired formation patterns, and adapting to the environmental

A

Manuscript received July 25, 2016; revised June 24, 2017 and August 29, 2017; accepted February 1, 2018. This work was supported in part by the National Natural Science Foundation of China (U1706224, 51575336, 91748117), the National Key Project of Research and Development Program (2017YFC0306302). (Corresponding author: D. Zhu.) X. Li and D. Zhu are with the Lab of Underwater Vehicles and Intelligent Systems, College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China (e-mail: [email protected]; [email protected]).

constraints, such as obstacles, limited space, ocean current and communication constraints [4], [5]. In order to work in specific environments or meet the demands of some jobs, a variety of geometric formations could be adopted by multi-AUV systems. Typical geometric formations include line, triangle, diamond, wedge, and polygon. As a fundamental problem of multi-AUV systems, formation control has become a research focus in recent years. Various formation control methods reported in former literature could be mainly categorized into four groups: virtual structure schemes; behavior-based methods; artificial potential field techniques and leader-follower schemes. In the virtual structure approach, the entire formation is treated as a single rigid body. The motion of each agent is derived from the trajectory of a corresponding point on the structure [6], [7]. This method realizes the formation control through defining the rigid body’s behavior. When the navigation environments change, the rigid body formation can’t change, so that the application scope is limited. In the behavior-based methods, the control action for each vehicle is derived by a weighted average of each desired behavior, such as formation keeping, goal seeking and obstacle avoidance [8]–[10]. The advantage of the behavior-based approach is its relative simplicity, explicit feedback, and availability in obstacle avoidance; the shortcoming is the lack of a clear definition of group behaviors, which makes it difficult to analyze the local control rules with mathematical methods. The approach based on artificial potential field concept was presented by O. Khatib [11]. The basic idea of this approach is that the vehicles move in a field of forces which is similar to the electric field generated by the positive and negative charges. The position to be reached is an attractive charge and obstacles are repulsive charges. Obstacles produce repulsive forces on the vehicles while the target pulling on them so that the vehicles could move along the gradient direction of the potential field. The advantage of the artificial potential field method is that the definition is explicit, and it is suitable for real-time control, especially for the collision avoidance problem in the environment with obstacles; the deficiency is that the potential field functions are sometimes difficult to design, and the local minimum problem should be taken into consideration [12], [13]. Leader-follower approach is widely adopted in formation control for ground, aerial or underwater robots [4], [14], [15]. This approach divides the entire formation into several small groups and each group has two AUVs at least. In an arbitrary small double

0278-0046 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


AUV group, one AUV is the leader and the other is the follower [3], [16], [17]. The follower AUV maintains a certain angle and distance with the leader AUV. A follower in one group could be the leader in another group. The referenced trajectory of the follower is generated as the leader moves and a virtual vehicle is designed to track this trajectory. The position tracking control is formulated for the follower to track the virtual vehicle’s trajectory. The leader-follower approach has a clear definition to the whole formation, and it could be analyzed using mathematics, which ensures the stability of the formation [18]–[22]. This approach is widely used to supply a framework for robots’ formation control [15], [23]– [26]; however, the leader-follower scheme encounters a critical issue in actual applications: when a leader AUV breaks down, its follower also fail to continue moving. If the broken AUV happens to be the formation’s leader, the whole team will fail, which may cause disasters. In most cases, the leader’s information, such as heading angle, linear velocity, and linear acceleration, can’t be obtained in real-time. The followers only have access to the local information and the information of its neighbors in an underwater formation [27]–[30]. In this paper, an adaptive SOM method is proposed for the distributed formation control of multi-AUV systems. The contributions of the proposed method can be summarized as follows. 1) The SOM-based approach is improved to solve the distributed formation control problem. Although the proposed approach is based on leader-follower framework, the AUVs in the formation are randomly distributed and each AUV is treated equal to be a leader or a follower. Whether an AUV serves as a leader or a follower in the formation only depends on the adaptive algorithm itself unless it’s designated manually. 2) New learning speed parameters α and αj are introduced in the proposed formation strategy to achieve adaption and fault tolerance. Due to these characteristics, this method deals with the formation transformation in a more efficient way and handles the situation that some AUVs break down. This method overcomes the shortcoming of the traditional leader-follower scheme, i.e., in the leader-follower formation, when a leader AUV breaks down, its followers also fail to continue moving. Mathematical proof of stability and convergence of the proposed algorithm is given in this paper. 3) This strategy emphasizes on the reduction of the total traveling length and the workload balance. New workload balance strategy is introduced in the algorithm. In the process of formation tracking, the total traveling length, i.e., the consumption of energy of the formation is optimized. Compared with the traditional leader-follower approach, the proposed method provides better navigation efficiency. The rest of the paper is organized as follows. Section II introduces the AUV underwater formation control problem and the thinking of applying SOM to formation control. Section III presents the adaptive SOM-based algorithm for distributed formation control and formation tracking. Simulation results with experiments are provided in Section IV. Concluding

remarks of the proposed method are given in Section V. II. P ROBLEM S TATEMENTS

AND

P RELIMINARIES

A main challenge in multi-AUV systems is the underwater formation control which is the foundation of coordination and cooperation of the AUVs in performing a task. Kalantar and Zimmer proposed a method considering the formation control into two decoupled problems: the boundary and the interior [31], but this method is only suitable for large scale swarms. The positions of AUVs in the interior layer are not fixed, either. It is too complicated to be used when the formation includes a small amount of AUVs. Nowadays, in most of underwater tasks, the number of AUVs is less than 10. Thus, we propose a SOM neural network approach to solve the underwater formation problem, which could deals with a random number of AUVs and there is no need to distinguish the boundary and the interior layers. In the distributed formation, all the AUVs are held in position using this method with only local and neighboring information. A. AUV Underwater Formation Control In this paper, firstly, we assume that a group of homogeneous AUVs are working and moving in a general workspace, which could be a two-dimensional (2-D) plane or three-dimensional (3-D) space. For the convenience of visual demonstration, 2-D figures are used in this paper, although the algorithm is explained with the x, y and z axes considered. These AUVs are randomly distributed at the initial state in the bounded area and they are expected to form a corresponding formation style. Each AUV has a position in the formation. The leader and the followers are chosen by the algorithm. After the formation is formed, the positions of AUVs are relatively fixed. The whole formation moves without distortion or deformation as the task needs. As mentioned above, the formation could be any shape as needed. Some typical formation shapes are triangle, diamond, wedge and polygon. In this paper, we assume any formation shape comprise of points which are corresponding to the AUVs. What the SOM approach deals with is to assure that every point has an AUV to reach. What’s more, the AUVs occupy all the positions within a minimal total expense and minimum energy consuming. Here, for each AUV, the expense is evaluated by the traveling distance from its initial position to its final position in the formation. A typical workspace with the AUVs and the desired formation is shown in Fig. 1, where three filled diamonds represent randomly distributed AUVs at the starting location and the unfilled diamonds are the points which form a triangular formation shape. The dashed line represents the desired moving path of the formation to reach the destination. In addition, we assume that the all the AUVs are homogeneous and have basic capabilities for navigation so that the kinematic constraints are free of consideration in this paper. Firstly, the AUVs are expected to reach the positions (where the black edged unfilled diamonds represent) to form a trianglar formation. Secondly, the whole formation moves along the desired path to the destination. The reduction of the total traveling cost



50 Desired formation points

AUVs

45

Desired formation path

Destination

40

Output Layer

R jk

Obstacle K

Input Layer

T1

20

The Proposed SOM [R 21 , R 22 , ×××, R 2 K ]- AUV2's path Algorithm

T2

...

Tl

...

TL

Tl = (xl , yl , zl ) Self-organizing [R j1 , R j 2 , ×××, R jK ]- AUVj's path process obtains:

TL = (xL, yL, zL)

W jk = ( x, y, z )

(a)

Obstacle

Start

T2 = (x2, y2, z2)

R [R11, R12 , ×××, R1K ]- AUV1's path

J

W jk

30 25

Output Layer

T T1 = (x1, y1, z1)

35

y(m)

Input Layer

15

[R J 1 , R J 2 , ×××, R JK ]- AUVJ's path

(b)

Fig. 2. The SOM neural network, (a) the structure (Tl : coordinates of the lth point; Rjk : coordinates of AUV at a certain time instant), (b) mapping relationship of the two layers.

10 5 0 0

10

20

30

40

50

x(m)

Fig. 1. Plan view of a triangular formation example with 3 randomly distributed AUVs.

and the workload balance of every single AUV are taken into consideration. B. Thinking in SOM The SOM neural network method was first introduced by T. Kohonen in the 1980s and extended later [32]–[34]. It is based on the idea that there is a special order of processing units in the mammalian brain. Each unit is dedicated to a specific task and each group of neurons is sensitive to a particular type of input signals. The units are determined by parameters that can be changed in certain processes to produce meaningful organizations. This algorithm soon became a valuable tool and was used to solve many kinds of problems. In recent years, this method has been applied to solve task assignment problems and control of multi-robot systems [35]–[38]. In this paper, we develop a SOM based approach which could be applied to the formation control problems by improving the previous SOM method. The idea comes from the similar characteristics and phenomena between a multi-AUV system and a SOM network. First, a multi-AUV formation system is the same as a self-organizing system which only changes its inner structure. Second, the SOM algorithm has the competitive, cooperative and self-organizing characteristics that are attractive for a multi-AUV system. Thus, the SOM network could be adapted to an unordered team of AUVs to automatically achieve the designated locations with cooperation and competition. The AUV formation could be achieved in this way. By the formation tracking strategy, the movement and tracking of the formation is also realized. In this paper, we propose an improved workload function, which makes sure the traveling length of a single AUV is balanced within a limited range. A SOM based neural network model is shown in Fig. 2a, and the mapping relationship is shown Fig. 2b. The proposed algorithm could be executed within a reasonable computation time, emphasizing on workload balance for each AUV in the team. At the initial state, there are J AUVs randomly distributed in the workspace, where J ∈ N+ . L points are set at the key positions of the expected formation shape.

For instance, there are 3 points to form a triangle as Fig. 1 illustrates, so that we get L = 3. In this way, the SOM model has two layers of neurons. The first layer is the input layer including L neurons, where L ∈ N+ . These neurons represent the Cartesian coordinates of the points in the 3-D workspace. T is a 3 × L matrix which denotes the coordinates of the target points as input neurons. Each input neuron has three parameters (xl , yl , zl ) according to the coordinates of its corresponding point, where l = 1, 2, ..., L. All the coordinates of the points form the input data set. The second layer is the output layer. Neurons in the output layer represent the coordinates of the AUVs and the corresponding path for each AUV. Each neuron of the output layer is fully connected to the neurons of the input layer. There are K neurons to establish an optimal path for each AUV, where K ∈ N+ , so that there are K × J neurons in all. The connection weight of each output neuron, i.e. the weight Wjk = (x, y, z) of that neuron, is given by a 1×3 weight vector, which is initialized as the coordinates of the initial AUV position. The network is initialized with the weight vector Rjk = Wjk = (wjkx , wjky , wjkz ), where j = 1, 2, ..., J and k = 1. Rjk changes with Wjk during every computation iteration. After the iterations, we get a 3 × J × K matrix R which denotes the coordinates of the output neurons, i.e., the trajectories of the AUVs. The input data set is given sequentially to the network in a random order during the iterations, i.e., targets’ coordinates are input to the network one by one until the last target. This input strategy with the random order of the input data results in the robustness of the algorithm and reduces its dependence on initial workspace configuration. During this process of formation, the visiting sequence of the points by AUVs is gradually worked out, and the points would attract output neurons to form a formation for the AUVs. The iterations end until all of the AUVs reach the points. In order to assure that each point in the formation gets an AUV, we set L = J in the algorithm. Note that the relationship between L and J are defined by the algorithm, which means the correspondence between AUVs and the target points are not fixed. III. M AIN A LGORITHM In order to further explain how the SOM method is applied on formation control of multiple AUV system, mathematical models and formulas with some discussions are given in this section.



Notations and definitions: for ∑ a vector w ∈ Rn , the norm n 2 1 | · | is defined as |w| = |w|2 = ( i=1 |wi | ) 2 . A. Winner Selection and Neighborhood Function Design The first step in the SOM approach is to select the winner node. For an input neuron (the target point in formation), the output neurons compete to become the winner node [35]. Let Dkjl be a weight value at a time instance related to Euclid distance between the jth AUV neuron and lth input node neuron in the kth iteration. The winner is chosen by the following general equation: [Nj ] ⇐ min{Dkjl }

(1)

where [Nj ] represents that the jth neuron (AUV) from the kth group is the selected winner to the lth input node, with integers k ∈ [1, K], j ∈ [1, J], and l ∈ [1, L]. The winner is selected through iterations. Let S be the safety length a single AUV can travel without considering running out of energy and Smax be the maximal distance that a single AUV can travel. The weight variable Dkjl is given by  0 ≤ Pj < S  |Tl −Rjk | , |Tl −Rjk | (1 + V ) , S ≤ Pj < Smax (2) Dkjl =  ∞, Pj ≥ Smax where |Tl −Rjk | √ 2 2 2 = (xl − wjkx ) + (yl − wjky ) + (zl − wjkz ) .

(3)

Equation (3) gives the Euclid distance between Tl and Rjk . Here, Tl = (xl , yl , zl ) denotes the location of lth input neural node; Rjk = (wjkx , wjky , wjkz ) is the coordinates of the jth neuron from the kth group of the output neurons, which represents the location of a certain AUV at a certain time instant. The work load balance control variable V is introduced in (2) in order to make sure that each AUV is capable to move in the formation without running out energy. The jth AUV’s traveling length Pj as well as S, Smax are variables related to workload balance, which are further introduced in the next subsection. After the winner neuron is chosen, the next step is to design the neighborhood function to decide the neighbors’ weights. The neighborhood should be a sphere in 3-D space with radius γ. The center of the neighborhood sphere is the winner node. The neighborhood function determines the influence of the input target on the winner neuron and its neighbor nodes. The winner was put on the highest attractive force which is diminishing as the distances of the neighbor neurons increase to the winner. This function should have no effect on the neurons outside the neighborhood. The neighborhood function f (·) is described as { 2 2 e−dm /G (t) , dm < γ f (dm , G) = . (4) 0, others Due to the characteristics of the exponential function, it’s easy to get 0 ≤ f (dm , G) ≤ 1, where dm = |Nm − Nj |. dm is the distance between the mth output neuron Nm and the

winner Nj from the corresponding group of the output neurons , where m = 1, 2, ..., J. γ is a constant indicating the range of winner’s neighborhood. The function G(t) = (1 − µ)t Go is a nonlinear function, where t is the number of iterations; µ and Go are constants that affect the computation time and computational accuracy through adjusting step length. Neighborhood function f (·) decreases as the iteration times t increases. B. Strategy for Workload Balance of the AUVs The benefit of work load balance control is that the energy consumption of each AUV could be equalized, which could makes the maximum use of the AUV team as a whole formation. Workload balance control is mainly determined by the AUVs’ actual moving length and the safety length preset. Under this workload balance strategy, the proposed method achieves a good balance between the total traveling distance and average workload of each AUV under the premise that the energy is sufficient. That is an improvement to conventional SOM neural network method. Let Pj be the actual moving length of the jth AUV in the certain process of formation; S is the safety length a single AUV can travel without considering running out of energy; Smax is the maximal distance that a single AUV can travel. If Pj ≥ Smax , the AUV is in the danger of running out of energy and must stop its task to move back. Conventionally, S is set in the range S ∈ [0.2 · Smax , 0.8 · Smax ]. Then the parameter controlling the workload of a certain AUV could be defined as Pj − v (5) V = S+v where v is the average moving length of the team of AUV in a certain task. Equation (2) describes the calculation process of neural weights distance Dkjl , which is defined in the following three different cases: 1) 0 ≤ Pj < S, i.e., the AUV’s moving length is short when it is forming the formation, as a result it has sufficient energy to continue traveling. Calculating |Tl −Rjk | is enough to find the winner in (2), which guarantees that the total cost of traveling is minimized. 2) S ≤ Pj < Smax , i.e., the AUV’s moving length is longer than the safety length and energy problem should be taken into consideration. The AUV’s workload may be too heavy to have sufficient energy to travel, so that the workload balance algorithm should be executed. From (5): a. Pj > v, meaning that the jth AUV’s workload is above the average level. It generates V > 0 and 1 + V > 1, then Dkjl > |Tl − Rjk | from (2). Dkjl increases reducing the chance of the jth AUV to be selected as the winner. b. Pj < v, meaning the jth AUV’s workload is below the average level. It generates V < 0 and 1 + V < 1, then Dkjl < |Tl − Rjk |. Dkjl decreases resulting in the growing possibility of the jth AUV to be selected as the winner. c. Pj = v: there is no priority change in this case.



3) Pj ≥ Smax , i.e., the AUV’s traveling distance is longer than the maximal distance a single AUV can travel in a mission. This AUV is in the danger of running out of energy and must stop moving, otherwise it may break down or don’t have enough energy to come back. In this case, Dkjl is set to be ∞ and this neural won’t be selected as the winner. Under the premise of fixed Smax , the total path cost tends to be minimal when the value of S increases, but the balance of workloads tend to be worse; otherwise, when S decreases, the total path cost tends to increase, but the workloads are better balanced. All the AUVs’ workloads are balanced in this way.

Algorithm 1 Iterative algorithm based on SOM method for formation control 1: Initialize the formation points matrix T, and AUVs’ random location matrix R; 2: repeat 3: input T to the SOM neural network; 4: calculate the weights matrix W by the competitive algorithm; 5: find out the neural with minimal weight as the winner; 6: 7: 8:

C. Neural Weights Updating Function After the winner neuron and its neighbors are selected, the next step is to move the winner neuron and its neighbors toward the input neuron (the target) whereas the other neurons stay still. The update rule is defined as  Dkjl < Dmin  Tl , Rjk + α · f (dm , G) Rj(k+1) = . (6)  · (Tl − Rjk ) , others The plus symbol in (6) means the addition of two vectors. Dmin is the termination condition of operation, which could reduce computation time. The weight values are decided not only by the position of the winner neuron and the neighboring neurons, but also by the neighborhood function and the network learning speed α. Note that we use αj to substitute for α in (6) in the algorithm. Equation (7) is generated from (6) as:  wj(k+1)x = wjkx + αj · f (dm , G) · (xl − wjkx )  wj(k+1)y = wjky + αj · f (dm , G) · (yl − wjky ) (7)  wj(k+1)z = wjkz + αj · f (dm , G) · (zl − wjkz ) where  0,    α, αj = s · α,    f · α,

the jth target is already achieved ordinary situation . the jth target needs slower achievement the jth target needs faster achievement (8) Equation (7) and (8) give out a convenient and novel method to obtain the weights of the neural network. Note that (xl , yl , zl ) is the Cartesian coordinate of the lth input neural node, and s, f and α are positive variables satisfying s · α < α < f · α. Remark 3.1: Setting L = J in the algorithm, once the lth target is achieved by an AUV, this target should not be visited again by any other AUVs. When a target is already achieved by an AUV, αj is set to zero so that the selected AUV and its neighbors will not move any more. Using this schema, there is no need to delete the already achieved points from the SOM network. The iteration is bounded to end after an acceptable times. In other cases, when some targets should be achieved by a relatively faster or slower speed, using the parameter αj is an adaptive method to implement the solution. Then we present the SOM based method in Algorithm 1.

9:

calculate the neighborhood function; update matrix R of the SOM network by (6) and (7); update matrix R by setting Rjk = ∞ after the jth AUV already reached its position; until each point in the formation has an AUV within an acceptable neighboring range.

D. Formation Tracking Strategy Remark 3.2: After all the AUVs get to their position in the formation, the formation moves along a trajectory and the members of the formation also move to keep their positions. This procedure is named as formation tracking. In order to implement the formation tracking based on the SOM approach, we assume a virtual formation which is totally the same as the actual one. The virtual formation has the same key points as the real formation. When formation tracking is ′ implemented, the virtual formation moves first. Tl denotes the lth AUV’s location in the virtual formation. We have the moving rule as bellow: ′

′

Tl (n) = Tl (n − 1) + ∆T(n), n = 1, 2, ..., Nstep .

(9)

In (9), Nstep is the total moving steps of the formation, ′ ′ ′ ′ and Tl (0) = (xl (0), yl (0), zl (0)) is the initial location of the lth AUV before the virtual formation moving. ∆T(n) is an arbitrary 3-D vector satisfying 0 < |∆T(n)| ≤ λ. The angle of ∆T(n) changes with n and the magnitude of ∆T(n) equals to |∆T|. We set λ a reasonable constant as the maximal single step length of the virtual formation. Based on (9), the formation can track an arbitrary path with the virtual formation’s single step length |∆T| no more than λ. After the virtual formation moves, the virtual points in the formation is set as the input vector of the SOM network by ′

Tl = Tl (n), n = 1, 2, ..., Nstep .

(10)

After the value of Tl is updated, the SOM algorithm is executed again. The AUVs move to the new positions and the actual formation also shifts as the virtual formation do. The moving formation is kept in this way until the virtual formation reaches the destination. Remark 3.3: In the formation tracking procedure, transformation may occur to avoid obstacles or to change the formation shape. The value of Tl is also updated after the formation transforms. Algorithm 1 is executed again to form the new formation shape according to the new value of Tl . The new formation can be kept and move as a whole again. In this sense, transformation is a part of the formation tracking rule and can be realized in this way.



F. Stability and Convergence Analysis Although in most cases the SOM based neural network is convergent, it does not guarantee convergence. To demonstrate

2: 3: 4: 5: 6: 7:

8: 9: 10:

repeat ′ the virtual formation moves one step and update Tl (n) using (9); update matrix Tl as the virtual targets move using (10); calculate the APF repulsive force to avoid obstacles and adjust the desired path [41]; update matrix Tl if transformation occurs from Remark 3.3; if at initial state or transformation occurs, set APF repulsive force and set a range to avoid collision of AUVs inside formation [40]; use Algorithm 1 to recover the whole formation; relieve APF force between AUVs in the formation; until the formation reaches the final position and stops moving.

25

20

25 Desired Formation Points AUVs

Neighborhood of the winner AUV1 (inside the circle)

Virtual formation points Formation shape 20 Virtual formation points

15

Output neurals (Coordinates of AUVs)

15

AUV1 AUV2

10

Actual formation points at initial status

y(m)

The algorithm is a loop procedure which repeats until all of the points in the formation are achieved by an AUV. All the AUVs move to its corresponding position in the formation spontaneously. The procedure is done through iterations to calculate all the paths of the AUVs to form the formation. In order to keep the formation tracking the path, we present Algorithm 2 for formation tracking which is based on Algorithm 1. Although the proposed method is more efficient for large AUV team, detailed working process is explained with 3 AUVs for easier understanding. As shown in Fig. 3, the red diamonds represent the three AUVs and the black edged diamonds represent the point positions as input neurons. AUV1 is the winner as it is the nearest one to the first input neuron, which is the top point P1 of the triangular formation; according to the rule in (4), the neighbor is selected, where AUV2 is the only neighbor of the winner, as shown in Fig. 3a. Remark 3.4: Both the winner (AUV1) and its selected neighbor (AUV2) moves toward the winner’s nearest point (P1) by changing the weights vectors, whereas the other neurons do not move. It is easy to find that the winner moves a longer distance than its neighbor. The closer the point is to the winner, the longer distance the winner moves, on the premise of maximal single step length, which is exactly the rule described in (4) and (6). The loop repeats until all of the points are achieved by an AUV. All the AUVs move to its corresponding position in the formation spontaneously. After the formation is completed, the whole formation moves to a specified target or tracks a specified trajectory. As shown in Fig. 3b, the virtual formation moves n steps and the maximum step length is set to λ = 3m. Note that λ is proportional with that in the real-world experiments as well as the single step length of AUVs and the workspace. The formation could track an arbitrary path with the single step length no more than λ. This parameter should not be set too large to make sure the formation moves as a whole. The procedure is done again through iterations introduced above to calculate all the paths for the AUVs to reach the new formation positions. Note that collision free is important to large scale multiagent systems and there have been some research results about it [39]. In this paper, during the formation control process, once the formation is already formed and moves in shape without transformation, the locations of AUVs are restricted by the algorithm and the AUVs in team will not collide with each other. Collision free strategy is only needed at the initial state and the state of transformation. We employ the simple but efficient Artificial Potential Field (APF) method which are similar to the obstacle avoidance strategy [40], [41]. The AUVs are treated as the same electric charges with mutual repulsion, so that the AUVs in the formation will not collide. Please refer to Algorithm 2.

Algorithm 2 Formation tracking in 3-D workspace 1: Initialize the formation shape at the start location, and generate a preliminary path using path planning algorithm;

y(m)

E. The Whole Procedure of the Proposed Algorithm

Tl (n)

10

AUV3

… 5

5

Tl (1)

Input neural (points in formation)

Tl (0) 0 0

5

10

15

20

25

0 0

5

10

15

x(m)

x(m)

(a)

(b)

20

25

Fig. 3. Working process of the proposed approach in plan view, (a) the SOM computational process with 3 AUVs in a triangular formation, (b) virtual formation moves n steps along an arbitrary trajectory.

the convergence behaviour of the proposed method, (1) the stability of the weights update algorithm with maximal learning rate is investigated, and (2) local convergence is proved for the improved SOM network. Note that the convergence of the network is local but not global, which will be explained later in this section. Remark 3.5: As the weights matrix Rjk is updated based on the Euclidean distance of input neurons and output neurons, the weights update rule (6) performs gradient descent on an energy function E (or termed error, cost, or distortion objective function): Rjk ∝ −

∂E . ∂Wjk

(11)

In (6), assuming Dkjl ≥ Dmin , let j be an positive integer no more than J, and replace variable k with x ∈ N+ . Rjk is then written as a function R(x) of argument x. R(x) is a 3 × L matrix at any value of x. Theorem 3.1: If a SOM neural network uses a gradient descent learning strategy as proposed in this paper, the condi1 , where ξmax tion of stability is 0 < α · f (dm , G) < ξmax



is the maximum eigenvalue of matrix [∆(x)∆T (x)], with ∆(x) = [Tl − R(x)]. Proof: To evaluate the the update rule in (6), the error function of the AUVs’ relative position is defined as: T

e(x) = Tl T Tl − [Tl − R(x)] R(x).

(12)

From Remark 3.5, the weights update rule is gradient descending. According to [42], the gradient descent weights’ error evaluation function based on (6) is defined as R(x + 1) = R(x) + 2α · f (dm , G)∆(x)e(x).

(13)

Take the expectation of both side and substitute (12) into (13): E[R(x + 1)] = [I − 2α · f (dm , G)∆(x)∆T (x)]E[R(x)] + 2α · f (dm , G)E[Tl T Tl ∆(x)]. (14) This is a linear dynamic system, which will be stable if the eigenvalues of the matrix [I − 2α · f (dm , G)∆(x)∆T (x)] are less than 1 in magnitude, falling inside the unit circle [43]. The condition for stability of the proposed method is then |1 − 2α · f (dm , G)ξi | < 1,

(15) T

where ξi are the eigenvalues of matrix [∆(x)∆ (x)]. Equation 1 (15) reduces to 0 < α · f (dm , G) < ξmax . Although f (dm , G) is related to the Euclidean distance of the input and output neurons from (4), an appropriate value of α can be found to satisfy Theorem 3.1. The proof is complete. Lemma 3.1: A function f (x) is said to be Lipschitz near x ∈ Rn if there exist ℓ, ϵ > 0, such that we have |f (x1 ) − f (x2 )| ≤ ℓ |x1 − x2 |, for all x1 , x2 ∈ Rn satisfying |x1 − x| < ϵ and |x2 − x| < ϵ. If f (x) is Lipschitz near any point x ∈ Rn , then it is said to be locally Lipschitz in Rn . Lemma 3.2: According to former literature [44] on neural networks’ convergence, if the weights update function is locally Lipschitz, the neural network is locally convergent. As long as the value of α·f (dm , G) is finite and positive, the update rule function in (6) is locally Lipschitz at any winner j with k ∈ N+ , so that convergence to a local minimum in E in (11) can be easily shown. For the formation control problem in this paper, global convergence is not needed, because local convergence is enough to guarantee the end of the whole procedure, i.e., to make sure each target point gets an AUV. Each target point in formation is a local minimum. However, there still are two other cases needed to be discussed separately: 1) There are more than one targets competing for one AUV, and this AUV has the same weights to these targets. All the other AUVs are not in this AUV’s neighborhood. In this case, we have to enlarge this AUV’s neighborhood γ to make sure that at least one other AUV is in the range of γ. When the AUV team is sparse, γ is set to a larger value. 2) There are more than one AUVs move to one target, but other targets are far away from any AUV. In this case,

AUVs moves fast to this target. According to the rules in (7) and (8), αj should be adjusted. αj is set to zero as far as one AUV reaches the target, that means this target’s corresponding neuron’s input and output are set to zero. After that, the rest AUVs moves to other targets normally until the iterations end, since L = J, which makes sure each target has one AUV. The proposed SOM network is stable and convergent with restricted conditions mentioned above. In defense of the lack of an energy function, Kohonen emphasizes that there is no theoretical reason why the SOM algorithm should ensue from such a function [32]. In conclusion, based on the analysis in this section, convergence of this neural network is not a problem in practice. IV. S IMULATION R ESULTS

AND

D ISCUSSIONS

This section is to demonstrate effectiveness of the algorithm described above. Simulations are carried out with several AUVs randomly distributed at the initial state. Based on the proposed method, programs in MATLAB language are implemented. Parameters are modified during our simulation to get the best effects and the results from the figures and tables are explained. Note that as the 3-D figures with many lines and curves can not be shown clearly, we use plan view of the figures in this paper. To verify that the proposed method can be applied to the real system, the hardware based experiment is also conducted in this section. The map of the workspace is generated by sonar or laser based simultaneous localization and mapping (SLAM). The moving path of the whole formation is generated by the APF method. A. Formation Keeping and Formation Tracking of 6 AUV Team Firstly, the proposed method is applied to a case where the workspace has 6 AUVs to form a triangular shape and move to the destination. The proposed method is used to control the formation move along the trajectory to the target. The whole time-dependent process based on adaptive self-organizing behavior of the multi-AUV system is illustrated in Fig. 4. The workspace is a bounded area where (x, y, z) ∈ [0, 50m]. In real world practice, the tracking path is generated by the SLAM module and path planning algorithms. As there are obstacles in the environment, the AUV team transforms to avoid collision. At the start place, a team of 6 AUVs are randomly distributed as the filled yellow diamonds represent. Based on the SOM method, the triangular formation is formed. Then the formation moves as a whole along the path. After the team gets through the narrow place, the formation recovers and moves to the destination as illustrated in Fig. 4. At the narrow position with obstacles, a virtual formation shape is set to make sure the formation could get through and the SOM based method direct the AUVs to form the virtual formation shape. After that the triangular formation shape can be recovered by the same strategy. Unlike traditional leader-follower based algorithms, any AUV could be the leader here, because the SOM method doesn’t depend on the fixed leader-follower relationship. Each



50

TABLE I S YSTEM PARAMETERS

Destination

Formation transforms and restores 45 40

Symbol 30 Obstacle 25

Typical value

S

safety traveling length of an AUV

60m

60m

Smax

maximal traveling length of an AUV

120m

120m

γ 20 15

Start

10

Obstacle

5 0 0

5

10

15

20

25 x(m)

30

Dmin

Path of AUV1 Path of AUV2 Path of AUV3 Path of AUV4 Path of AUV5 Path of AUV6 35

40

45

λ

50

Fig. 4. Formation control of 6 AUV team with obstacles: tracking trajectories and transformation to avoid obstacles.

Optimized value for 6 AUV formation

Description

35

radius of the neighborhood acceptable distance between AUV and virtual point maximal step length of the virtual formation

3m

5m

0.1m

0.1m

3m

3m

Go

neighborhood function coefficient

0.5

0.2

µ

neighborhood function coefficient

5

5

|∆T|

virtual formation’s step length

2.5m

1.0m

α

normal velocity of AUV

0.5m/s

0.5m/s

s

acceleration factor of AUV

(0.0 1.0)

0.7

f

acceleration factor of AUV

(1.0 2.0)

1.2

tracking error(m)

15 AUV1

AUV2

AUV3

AUV4

10

AUV5

AUV6

B. Large Swarm of AUVs in Formation and Fault Tolerance

|∆T| = 2.5m

5 0 0

25

50

75

100

125

time(s)

tracking error(m)

10 AUV1

AUV2

AUV3

AUV4

AUV5

AUV6

|∆T| = 1.0m 5

0 0

25

50

75

100

125

time(s)

Fig. 5. Formation tracking errors with different step length |∆T|.

AUV in the formation is treated equal. The virtual points move first to generate the desired trajectory. Virtual formation’s instantaneous step length is set as |∆T| from (9). ∆T can be an arbitrary vector, which means the formation moves freely if there isn’t an preset path. As mentioned in the preceding section, |∆T| is bounded to achieve good performance. All the parameters of the adaptive controller are given in Table I. Radius of the neighborhood γ in (4) is set greater than the typical value because the 6 AUV team is sparse, as discussed in section III. F. In next subsection, with a large swarm of AUVs, γ is set as typical value. |∆T| and the acceleration factors are also adjusted. Remark 4.1: Formation tracking errors are measured by the distances between the AUVs and the desired virtual formation ′ points, i.e., |Rjk − Tl (n)|. The tracking errors increase when the virtual formation moves one step. If all the AUVs reach the virtual points as expected, the tracking errors decrease to zero. Formation tracking errors are shown in Fig. 5. As |∆T| changes from 2.5m to 1.0m, tracking errors decrease and converge to zero faster, which means that smaller |∆T| generates smoother formation path.

Nowadays, in real practice, a formation doesn’t have too many AUVs, often less than 6, due to the high expense of the underwater vehicles. In the long run, as the cost of the vehicle reduces, it is meaningful to investigate large AUV team. Remark 4.2: If the number of AUVs in a formation is greater than 10, i.e., J > 10 in the algorithm, this formation is named as a large swarm. Otherwise the formation is a small swarm. In a large swarm of AUVs, the advantage of proposed method is more significant. Table II is the comparison results on traveling lengths. Average traveling length of the formation is calculated to show the energy efficiency and the standard deviation is used to evaluate the workload balance of the AUVs. It is easy to find from the data table that the proposed method results in shorter traveling length as well as better workload balance. Note that the proposed method is more effective with larger swarm of AUVs. As for large swarms, the decrement of standard deviation of the traveling length by the proposed method is much smaller than that of small swarms, which means workload balance is much more efficient. It is important in real world practice. Due to page limitations, the traditional leader-follower approach isn’t presented here. Readers can refer to literature [17] for more details. Another important characteristic of the SOM-based method is fault tolerance. Remark 4.3: Fault tolerance is important for multi-AUV formations. In traditional leader-follower formation, the team stops working if the leader breaks down, because the leaderfollower chain is cut down. The proposed method doesn’t need to assign leader and followers, so that it solves the brokenleader problem. In a large team where the AUVs are redundant, when an AUV breaks down, just removing one point in the formation, the rest AUVs continue working well under the SOM strategy waiting a substitution for the broken one. In the 6-AUV team’s triangular formation tracking, we assume that one AUV breaks down halfway, as depicted in



C OMPARISON

OF THE

TABLE II P ROPOSED AND T RADITIONAL M ETHODS

Communication

Experiment

Method

Average traveling length of an AUV

Standard deviation of the traveling length

Small swarm of 6 AUVs

Proposed Traditional

63.87m 64.32m

3.7127 4.1949

77.02m 77.24m

Proposed Traditional

Large swarm of 15 AUVs

150Mbps Wifi Module

115200Baud UART

Arduino ATMega 2560

200Kbps I2C

50 45

Leader of the whole formation

20

y(m)

y(m)

30

10

Front and Rear Servos f60 degree)

120 degree Wide Angle Camera

(b)

25 20 This AUV will break down at next step (after 10 tracking steps)

15 5

5

10

10

15

20

25

0 0

x(m)

(a)

AUV2

AUV3

AUV4

AUV5

AUV6

4 10

20

30

40

50

x(m)

3.5 tracking error(m)

5

AUV1 4.5

5

Followers of the broken AUV 0 0

Brushless DC Motor with 40cm Propeller

480Mbps USB2.0

Laser SLAM Module

Fig. 7. Hardware test platform, (a) AUVs for experiment, (b) hardware structure of the AUV.

35

The broken AUV

400Kbps SPI

IMU Module

(a)

AUVs Formation shape Formation trajectory

40

15

200Kbps I2C

Depth Module

Surface Neighbor Vessles AUVs

Formation shape AUVs in formation

Node.js Server on BeagleBone Black

Controller Board

8.8700 13.2503

25

User Interface

Underwater Electrical Field (Power Line Ethernet)

(b)

Fig. 6. Fault tolerance based on the adaptive SOM method: (a) formation shape, (b) an AUV breaks down after 10 tracking steps and the formation keeps moving.

3 2.5 2 1.5 1 0.5 0 0

Fig. 6. The broken AUV is the follower of the formation leader, and at the same time it is the leader of another two AUVs. With the traditional leader-follower method, the broken AUV stops and its followers stop moving, too. If this AUV is the whole team’s leader, all the AUVs in formation will break down. Fortunately, based on the improved SOM method, the formation could continue work despite the failure of a certain AUV. Note that the broken AUV could be randomly chosen. In this case, after 10 steps in the formation tracking, an AUV breaks down, i.e., this AUV stops at n = 10, where n denotes the moving steps in (9). Simulation results in Fig. 6 show that the formation keeps moving with the formation shape unchanged. All the other AUVs in the team keep moving in formation along the expected trajectory, waiting for a spare AUV to join the formation. C. Test Platform and Experiment The proposed method is applied on the AUV model Neptune-1, shown in Fig. 7a, which is modified based on the R remote control submarine produced by the ThunderTiger⃝ Corporation. We rebuild the remote control submarine by adding some modules so that it can act as an AUV. Hardware structure is shown in Fig. 7b. With the thruster installed, the speed of the AUV is about 2km/h, or 1.08kn. User interface of the AUV is easily accessed by web browser. WiFi floater provides communication channel between AUVs and surface vessels. As the underwater electrical field communication technology is not mature yet, a two wire tether with 9V power is used instead for inter-AUV communication and the connection of the WiFi floater to the AUV. In the distributed formation control process, each AUV only needs to know limited information such as its own pose and status, and the relative positions of its neighbor AUVs. As a result,

25

50

75

100

125

time(s)

(a)

(b)

Fig. 8. Real time in-system simulation result: (a) formation control and trajectories, (b) tracking errors of each AUV.

there is no need to transfer the status information or other information between the AUVs. Note that because the AUVs are nonholonomic, only limited information is assumed to be acquired. As the image sonar for SLAM is too heavy to be installed on our vehicle, the laser based SLAM module is used for discovering the neighborhood and the AUVs move on the water surface as z = 0. In real world practice, ocean current influence is inevitable. We sample and simplify the surface current model data provided by [13], [45]. These ocean current data are used in the real time in-system simulations. Remark 4.4: As for the ocean current influence, we adopt velocity synthesis method proposed in [37], [46]. Therefore the AUVs conquer the water flow disturbance and move to the desired direction. The anti-disturbance job is handed to the dynamic control unit of the AUV, which is not the focus of this paper. Due to the limitation of the experimental environment, these AUVs are used for the real time in-system simulations. The constraints of linear velocity and angular velocity are simulated according to the hardware, which are nearly the same as that in the real-world experiments. Fig. 8 shows the experiment result when the proposed controller is applied. The AUVs converge to the desired formation and move to the destination as in Fig. 8a. The tracking errors shown in Fig. 8b converge in almost the same way as in Fig. 5. Therefore, it can be concluded that the proposed method is demonstrated by performing simulation and experiment.



V. C ONCLUSION In this paper, a SOM-based neural network approach is proposed for formation control of multi-AUV systems in 3-D working environment. The calculation process includes special definition of the initial neural weights of the network, the rule to select the winner, the workload balance law, the method to update the weights, and the formation tracking strategy. Based on the SOM neural network method, formation keeping as well as formation tracking could be implemented. This adaptive approach guides a group of nonholonomic AUVs to move to the expected destination in formation along desired paths. In addition, it can deal with complicated cases such as that the formation changes its geometrical shape to avoid obstacles. The total traveling length and balance of workloads are rationally adjusted on the premise of sufficient energy. Moreover, this approach has the fault-tolerant characteristic. Simulation results validate the proposed method, and comparisons with the traditional leader-follower approach show the benefit of it, and it was shown that the control law can be applied to real systems. In real world, there are many more uncertainties such as moving obstacles. The algorithm should be capable of performing dynamic collision free as well as handling severe ocean current influence, which will be investigated in our future study. R EFERENCES [1] X. S. Jiang, X. S. Feng, and D. T. Wang, Unmanned underwater vehicles, 1st ed. Shenyang, China: Liaoning Science and Technology Publishing House, 2000. [2] F. Fahimi, “Sliding-mode formation control for underactuated surface vessels,” IEEE Trans. Robot., vol. 23, no. 3, pp. 617–622, Jun. 2007. [3] X. P. Wu, Z. P. Feng, and J. M. Zhu, “A novel method for formation control of multiple autonomous underwater vehicles,” Ship Sci. Technol., vol. 30, no. 2, pp. 128–134, Apr. 2008. [4] R. Cui, S. S. Ge, B. V. E. How, and Y. S. Choo, “Leader–follower formation control of underactuated autonomous underwater vehicles,” Ocean Eng., vol. 37, no. 17, pp. 1491–1502, Dec. 2010. [5] N. E. Leonard and E. Fiorelli, “Virtual leaders, artificial potentials and coordinated control of groups,” in Proc. 40th IEEE Conf. Decision and Control, vol. 3, pp. 2968–2973, Orlando, FL, Dec. 2001. [6] M. A. Lewis and K.-H. Tan, “High precision formation control of mobile robots using virtual structures,” Autonomous Robots, vol. 4, no. 4, pp. 387–403, Oct. 1997. [7] W. Ren and R. Beard, “Decentralized scheme for spacecraft formation flying via the virtual structure approach,” Journal of Guidance, Control, and Dynamics, vol. 27, no. 1, pp. 73–82, Feb. 2004. [8] T. Balch and R. C. Arkin, “Behavior-based formation control for multirobot teams,” IEEE Trans. Robot. Autom., vol. 14, no. 6, pp. 926– 939, Dec. 1998. [9] B. Liu, R. Zhang, and C. Shi, “Formation control of multiple behaviorbased robots,” in Int. Conf. Computational Intelligence and Security, vol. 1, pp. 544–547, Guangzhou, China, Nov. 2006. [10] R. M. Kuppan Chetty, M. Singaperumal, and T. Nagarajan, “Behavior based multi robot formations with active obstacle avoidance based on switching control strategy,” Advanced Materials Research, vol. 433, pp. 6630–6635, Jan. 2012. [11] O. Khatib, “Real-time obstacle avoidance for manipulators and mobile robots,” The international journal of robotics research, vol. 5, no. 1, pp. 90–98, Mar. 1986. [12] M. Zhang, Y. Shen, Q. Wang, and Y. Wang, “Dynamic artificial potential field based multi-robot formation control,” in Conf. Instrumentation and Measurement Technology (I2MTC), pp. 1530–1534, Austin, TX, May. 2010. [13] E. Fiorelli, N. E. Leonard, P. Bhatta, D. A. Paley, R. Bachmayer, and D. M. Fratantoni, “Multi-auv control and adaptive sampling in monterey bay,” IEEE J. Ocean. Eng., vol. 31, no. 4, pp. 935–948, Oct. 2006.

[14] X. Chen, A. Serrani, and H. Ozbay, “Control of leader-follower formations of terrestrial uavs,” in Proc. 42nd IEEE Conf. Decision and Control, vol. 1, pp. 498–503, Maui, HI, 2003. [15] J. Chen, D. Sun, J. Yang, and H. Chen, “A leader-follower formation control of multiple nonholonomic mobile robots incorporating recedinghorizon scheme,” The International Journal of Robotics Research, vol. 29, no. 6, pp. 727–747, May. 2009. [16] D. B. Edwards, T. A. Bean, D. L. Odell, and M. J. Anderson, “A leaderfollower algorithm for multiple auv formations,” in Proc. IEEE/OES Autonomous Underwater Vehicles, pp. 40–46, Sebasco, ME, Jun. 2004. [17] G. Ding, D. Zhu, and B. Sun, “Formation control and obstacle avoidance of multi-auv for 3-d underwater environment,” in Proc. IEEE 33rd Chinese Control Conference (CCC), pp. 8347–8352, Nanjing, China, 2014. [18] J. Shao, G. Xie, and L. Wang, “Leader-following formation control of multiple mobile vehicles,” IET Control Theory & Applications, vol. 1, no. 2, pp. 545–552, Mar. 2007. [19] L. Consolini, F. Morbidi, D. Prattichizzo, and M. Tosques, “Leaderfollower formation control of nonholonomic mobile robots with input constraints,” Automatica, vol. 44, no. 5, pp. 1343–1349, May. 2008. [20] Y. Dai and S. G. Lee, “The leader-follower formation control of nonholonomic mobile robots,” International Journal of Control, Automation and Systems, vol. 10, no. 2, pp. 350–361, Apr. 2012. [21] H. Xiao, Z. Li, and C. Chen, “Formation control of leader-follower mobile robots systems using model predictive control based on neurodynamics optimization,” IEEE Trans. Ind. Electron., vol. 1, no. 99, pp. 1–11, Mar. 2016. [22] K. Yoshida, H. Fukushima, K. Kon, and F. Matsuno, “Control of a group of mobile robots based on formation abstraction and decentralized locational optimization,” IEEE Trans. Robot., vol. 30, no. 3, pp. 550– 565, Jun. 2014. [23] S. M. Kang and H. S. Ahn, “Design and realization of distributed adaptive formation control law for multi-agent systems with moving leader,” IEEE Trans. Ind. Electron., vol. 63, no. 2, pp. 1268–1279, Feb. 2016. [24] X. Chen and Y. Jia, “Adaptive leader-follower formation control of nonholonomic mobile robots using active vision,” IET Control Theory & Applications, vol. 9, no. 8, pp. 1302–1311, May. 2015. [25] S. C. Liu, D. L. Tan, and G. J. Liu, “Robust leader-follower formation control of mobile robots based on a second order kinematics model,” Acta Automatica Sinica, vol. 33, no. 9, pp. 947–955, Sep. 2007. [26] H. Yang, C. Wang, and F. Zhang, “Brief paper: a decoupled controller design approach for formation control of autonomous underwater vehicles with time delays,” IET Control Theory & Applications, vol. 7, no. 15, pp. 1950–1958, Oct. 2013. [27] Y. N. Zolotukhin, K. Y. Kotov, A. Mal’tsev, A. Nesterov, M. Sobolev, M. Filippov, and A. Yan, “Robust leader-follower formation control of mobile robots by the structural synthesis method,” Optoelectronics, Instrumentation, and Data Processing, vol. 51, no. 5, pp. 496–504, Dec. 2015. [28] J. Kim, H. Joe, S. C. Yu, J. S. Lee, and M. Kim, “Time-delay controller design for position control of autonomous underwater vehicle under disturbances,” IEEE Trans. Ind. Electron., vol. 63, no. 2, pp. 1052–1061, Feb. 2016. [29] X. Yu and L. Liu, “Distributed formation control of nonholonomic vehicles subject to velocity constraints,” IEEE Trans. Ind. Electron., vol. 63, no. 2, pp. 1289–1298, Feb. 2016. [30] G. Antonelli, Underwater Robots, 3rd ed. Berlin, Germany: SpringerVerlag, 2014. [31] S. Kalantar and U. R. Zimmer, “Distributed shape control of homogeneous swarms of autonomous underwater vehicles,” Autonomous Robots, vol. 22, no. 1, pp. 37–53, Jan. 2007. [32] T. Kohonen, Self-organizing maps, 2nd ed. Heidelberg, Germany: Springer, 1995. [33] T. Kohonen, “The self-organizing map,” Neurocomputing, vol. 21, no. 3, pp. 1–6, 1998. [34] A. Zhu and S. X. Yang, “An improved self-organizing map approach to traveling salesman problem,” in Proc. IEEE Int. Conf. Robot., Intell. Syst., and Signal Processing, vol. 1, pp. 674–679, 2003. [35] A. Zhu and S. X. Yang, “A neural network approach to dynamic task assignment of multirobots,” IEEE Trans. Neural Netw., vol. 17, no. 5, pp. 1278–1287, Sep. 2006. [36] D. Zhu, X. Li, and M. Yan, “Task assignment algorithm of multi-auv based on self-organizing map,” Control and Decision, vol. 27, no. 8, pp. 1201–1205+1210, Aug. 2012.



[37] D. Zhu, H. Huang, and S. X. Yang, “Dynamic task assignment and path planning of multi-auv system based on an improved self-organizing map and velocity synthesis method in three-dimensional underwater workspace,” IEEE Trans. Cybern., vol. 43, no. 2, pp. 504–514, Apr. 2013. [38] H. Huang, D. Zhu, and F. Ding, “Dynamic task assignment and path planning for multi-auv system in variable ocean current environment,” Journal of Intelligent & Robotic Systems, vol. 74, no. 3–4, pp. 999–1012, Jun. 2014. [39] P. Long, W. Liu, and J. Pan, “Deep-learned collision avoidance policy for distributed multiagent navigation,” IEEE Robotics and Automation Letters, vol. 2, no. 2, pp. 656–663, Apr. 2017. [40] S. W. Ekanayake and A. P. N. Pathirana, “Formations of robotic swarm: An artificial force based approach,” International Journal of Advanced Robotic Systems, vol. 7, no. 3, pp. 173–190, 2010. [41] S. S. Ge and Y. J. Cui, “Dynamic motion planning for mobile robots using potential field method,” Autonomous Robots, vol. 13, no. 3, pp. 207–222, 2002. [42] M. T. Hagan, H. B. Demuth, and M. Beale, Neural network design. Beijing, China: China Machine Press, 2002. [43] W. L. Brogan, Modern control theory, 3rd ed. Eaglewood Cliffs, NJ: Prentice-Hall, 1990. [44] X. B. Liang and J. Wang, “Absolute exponential stability of neural networks with a general class of activation functions,” IEEE Trans. Circuits Syst. I, vol. 47, no. 8, pp. 1258–1263, Aug. 2000. [45] E. P. Chassignet, H. E. Hurlburt et al., “Global ocean prediction with the hybrid coordinate ocean model (HYCOM),” Oceanography, vol. 22, no. 2, pp. 64–75, Jun. 2009. [46] X. Cao, D. Zhu, and S. X. Yang, “Multi-auv target searching under ocean current based on ppso and velocity synthesis algorithm,” Underwater Technology, vol. 33, no. 1, pp. 31–39, Jul. 2015.

Xin Li was born in Anhui, China. He received his B.S. degree from Shanghai Jiaotong University, Shanghai, China, in 2007, and the M.S. degree from Shanghai Maritime University, Shanghai, China, in 2010. He is now pursuing the doctor’s degree in power electronics and power transmission in the Lab of Underwater Vehicles and Intelligent Systems, Shanghai Maritime University, under the supervision of Prof. D. Zhu. His current research interests include neural networks, robotics, and control of autonomous underwater vehicles. Daqi Zhu was born in Anhui, China. He received the B.Sc. degree in physics from Huazhong University of Science and Technology, Wuhan, China, in 1992, and the Ph.D. degree in electrical engineering from Nanjing University of Aeronautics and Astronautics, Nanjing, China, in 2002. He is currently a Professor with the Information Engineering College and the Head of the Laboratory of Underwater Vehicles and Intelligent Systems, Shanghai Maritime University, Shanghai, China. His current research interests include neural networks, fault diagnosis, and control of AUVs.