Research Article

Low-power mapping algorithm for three-dimensional network-on-chip based on diversity-controlled quantum-behaved particle swarm optimization

Journal of Algorithms & Computational Technology 2016, Vol. 10(3) 176–186
© The Author(s) 2016
Reprints and permissions: sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/1748301816649070
act.sagepub.com

Cui Huang, Dakun Zhang and Guozhi Song

Abstract

Three-dimensional network on chip (3D NoC) builds on three-dimensional integrated circuits, system on chip, and two-dimensional network on chip. It is mainly used to relieve problems such as the communication bottleneck of highly integrated chips. Mapping is a key problem in 3D NoC research. The authors previously proposed a low-power mapping algorithm based on quantum-behaved particle swarm optimization. Simulation results showed that the algorithm can significantly reduce power consumption, but for large-scale application characteristic graphs the efficiency of power optimization cannot be improved very much. To solve this problem, this paper proposes a low-power mapping algorithm for 3D NoC based on diversity-controlled quantum-behaved particle swarm optimization. Simulation results show that, for large-scale application characteristic graphs, this algorithm maintains a stable power optimization efficiency (4.08–8.04%) and converges much faster.

Keywords
Three-dimensional network on chip, mapping algorithm, quantum-behaved particle swarm optimization, diversity-controlled, low-power

Date received: 31 July 2015; accepted: 5 December 2015

Introduction

Three-dimensional network-on-chip (3D NoC) is an effective solution to the problem of complex interconnection in large-scale system on chip (SoC). By adopting integrated circuit stacking technology, 3D NoC breaks the performance and size limitations of two-dimensional NoC (2D NoC). Mapping is a key issue in the 3D NoC research field. The mapping result directly affects the communication power consumption, delay, and other properties of the system.1

In terms of how the optimal solution is searched for, 3D NoC mapping algorithms can be divided into three categories: search algorithms, construction algorithms, and enumeration algorithms. Search algorithms are the most commonly used and can be further divided into heuristic search and systematic search.2 Heuristic search based on particle swarm optimization (PSO) is a typical approach to the 3D NoC mapping problem. Zhang Zhen at Guangdong University of Technology improved PSO to handle the multi-objective, discrete-space problem of on-chip network mapping for chip multiprocessors (CMP).3,4 Pradip Kumar Sahu et al. at the Indian Institute of Technology Kharagpur presented a discrete particle swarm optimization (DPSO)-based strategy to map applications onto both 2D and 3D mesh-connected NoC.5 Bhardwaj and Mane at BITS Pilani, Goa Campus, India proposed centralized 3D mapping (C3Map), which uses a new octahedral traversal technique, and realized H-C3Map,6 which combines C3Map with attractive and repulsive PSO (ARPSO).7

We have applied quantum-behaved particle swarm optimization (QPSO) to the 3D NoC mapping problem for the first time, which reduced the power consumption of mapping. However, good power optimization efficiency can only be maintained when the scale of the application characteristic graph (APCG) is within a certain range. To solve this problem, this paper proposes a low-power mapping algorithm for 3D NoC based on diversity-controlled quantum-behaved particle swarm optimization (DCQPSO). Simulation results show that, for large-scale APCG, this algorithm maintains a stable power optimization efficiency (4.08–8.04%) and converges faster.

School of Computer Science and Software Engineering, Tianjin Polytechnic University, Tianjin, China

Corresponding author: Dakun Zhang, School of Computer Science and Software Engineering, Tianjin Polytechnic University, Tianjin 300387, China. Email: [email protected]

Creative Commons CC-BY-NC: This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 3.0 License (http://www.creativecommons.org/licenses/by-nc/3.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).

Basic questions of 3D NoC mapping

Basic concept of 3D NoC mapping

Mapping of 3D NoC decides which node of the topology architecture graph (TAG) of the 3D NoC hosts each IP core of a given APCG, so as to minimize an objective function such as power consumption or packet latency.8,9 The mapping problem is a quadratic assignment problem and is NP-hard.10 Figure 1 shows a mapping instance of NoC.11

Power consumption model

Mapping of 3D NoC is to find, through an algorithm, the optimal mapping relationship between IP cores and resource nodes of the 3D NoC, so that one or more performance metrics, such as lowest power consumption, shortest application execution time, or the most balanced network load, can be optimized. Among these metrics, power consumption is a major concern in 3D NoC mapping.11 Using the power consumption model given in Yang et al.,12 the power consumed when node $t_i$ sends 1 flit to node $t_j$ can be formulated as:

\[ E_{bit}^{t_i,t_j} = \eta \times E_{Rbit} + \eta_H \times E_{LHbit} + \eta_V \times E_{LVbit} \tag{1} \]

In equation (1), $\eta$ represents the number of routers between $t_i$ and $t_j$. $\eta_H$ is the number of edges between $t_i$ and $t_j$ in the horizontal direction and $\eta_V$ is the number of edges between $t_i$ and $t_j$ in the vertical direction. $E_{Rbit}$ represents the power consumption of a flit passing through a router. $E_{LHbit}$ represents the power consumption of a flit passing through a link in the horizontal direction and $E_{LVbit}$ represents the power consumption of a flit passing through a link in the vertical direction.

Mapping of 3D NoC is to find a mapping function, for a given APCG and TAG, such that the conditions in equations (2) and (3) are met and the total power consumption is lowest, i.e. the objective function in equation (4) is minimized ($w_{i,j}$ represents the communication volume between vertex $v_i$ and vertex $v_j$):

\[ \forall v_i \in V,\ map(v_i) \in R \tag{2} \]

\[ \forall v_i \neq v_j,\ map(v_i) \neq map(v_j) \tag{3} \]

\[ \min\left( \sum_{v_i, v_j \in V} w_{i,j} \times E_{bit}^{map(v_i),\,map(v_j)} \right) \tag{4} \]

For a given topology, $E_{Rbit}$, $E_{LHbit}$, and $E_{LVbit}$ are fixed. According to Matsutani et al.,13 the power consumption of a link is proportional to its length, so the link length between two communicating nodes is a key factor in the power consumption of 3D NoC. When Through Silicon Via (TSV) technology is adopted, transferring the same amount of data consumes much less power in the vertical direction than in the horizontal direction.
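To make the objective concrete, the following is a minimal sketch of equations (1) to (4) for a 3D mesh. It rests on several assumptions that are not stated in the paper: nodes are addressed by `(x, y, z)` coordinates, packets follow a minimal path, the number of routers on a path is taken as the hop count plus one, and the three energy constants are arbitrary placeholder values. The names `bit_energy` and `mapping_energy` are hypothetical.

```python
# Placeholder per-flit energies in arbitrary units (assumed, not from the paper).
E_RBIT = 1.0    # one flit through a router
E_LHBIT = 0.5   # one flit over one horizontal link
E_LVBIT = 0.1   # one flit over one vertical (TSV) link


def bit_energy(src, dst):
    """Equation (1) for two mesh coordinates (x, y, z) on a minimal path."""
    eta_h = abs(src[0] - dst[0]) + abs(src[1] - dst[1])  # horizontal links
    eta_v = abs(src[2] - dst[2])                         # vertical links
    eta = eta_h + eta_v + 1                              # routers (assumed: hops + 1)
    return eta * E_RBIT + eta_h * E_LHBIT + eta_v * E_LVBIT


def mapping_energy(traffic, mapping):
    """Objective (4). traffic: {(i, j): w_ij}; mapping: {core: (x, y, z)}."""
    nodes = list(mapping.values())
    if len(set(nodes)) != len(nodes):  # constraint (3): one IP core per node
        raise ValueError("two IP cores share a node")
    return sum(w * bit_energy(mapping[i], mapping[j])
               for (i, j), w in traffic.items())


# Toy traffic pattern: core 0 talks heavily to core 1, core 1 lightly to core 2.
traffic = {(0, 1): 10, (1, 2): 4}
adjacent = {0: (0, 0, 0), 1: (0, 0, 1), 2: (1, 0, 1)}
spread = {0: (0, 0, 0), 1: (1, 1, 0), 2: (0, 0, 1)}
print(mapping_energy(traffic, adjacent), mapping_energy(traffic, spread))
```

In this toy case the `adjacent` placement puts the heavily communicating pair on a single vertical link, so it scores lower than `spread`, which is the behaviour the objective function is meant to reward.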

Review of mapping algorithm of 3D NoC based on QPSO

Description of the problem

The QPSO algorithm was proposed by Sun Jun of Jiangnan University, China.14 QPSO has advantages such as global convergence, fewer control parameters, faster convergence, and stronger optimization ability. The authors applied the QPSO algorithm to the mapping problem of 3D NoC for the first time, to optimize the mapping positions of the IP cores of an APCG on the resource nodes of the 3D NoC.

Realization of mapping algorithm of 3D NoC based on QPSO

Figure 1. A mapping instance of NoC.

In this part, we show the realization of mapping algorithm based on QPSO. The steps are shown in Algorithm 1.

Algorithm 1. Mapping algorithm of 3D NoC based on QPSO

Input: Particle swarm randomly generated.
Output: Optimum position best of particle swarm.

Step 1: Parameters initialization. Define the number of particles as MAX_NP and the maximum number of iterations as MAX_GENERS. Randomly generate MAX_NP initial particles.
Step 2: Calculate the average optimal position of the particle swarm according to equation (10).
Step 3: Calculate the fitness value of each particle according to equation (5). If the particle's fitness is better than that of indibest, assign its value to indibest and update the best position of the individual particle.
Step 4: If the fitness of indibest is better than that of best, update the value of best.
Step 5: For every dimension of the particle, calculate a random position according to equation (8).
Step 6: Update the position of the particles according to equation (7).
Step 7: Correct the obtained position and velocity information of the particles according to the permitted range of position and velocity, so that they do not leave the available space.
Step 8: Decide whether the current iteration number has reached the maximum number MAX_GENERS. If so, output best; otherwise increase the iteration number by 1 and jump to step 2.
Step 9: Output best and end the algorithm.
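The steps of Algorithm 1 can be sketched as a short loop. This is an illustrative sketch only, not the authors' implementation: it optimizes a continuous toy function instead of an NoC mapping, and the function name `qpso_maximize`, the default parameters, and the clamping used for Step 7 are assumptions. The update formulas follow equations (7) to (10) as given later in the paper, with a larger fitness treated as better.

```python
import math
import random


def qpso_maximize(fitness, dim, lo, hi, n_particles=20, n_iter=100, seed=0):
    """Toy QPSO loop following the step order of Algorithm 1."""
    rng = random.Random(seed)
    # Step 1: random initial swarm; each particle starts as its own best.
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    indibest = [p[:] for p in x]
    indibest_fit = [float("-inf")] * n_particles
    best, best_fit = x[0][:], float("-inf")

    for count in range(n_iter):
        # Step 2: average of the personal best positions, equation (10).
        c = [sum(p[j] for p in indibest) / n_particles for j in range(dim)]
        # Steps 3-4: evaluate fitness, update personal and global bests.
        for i in range(n_particles):
            f = fitness(x[i])
            if f > indibest_fit[i]:
                indibest[i], indibest_fit[i] = x[i][:], f
            if f > best_fit:
                best, best_fit = x[i][:], f
        # Contraction-expansion coefficient, linear from 1.0 towards 0.5.
        alpha = 1.0 - (1.0 - 0.5) * count / n_iter
        for i in range(n_particles):
            for j in range(dim):
                # Step 5: random position between personal and global best.
                phi = rng.random()
                p_ij = phi * indibest[i][j] + (1.0 - phi) * best[j]
                # Step 6: move around that position, equation (7).
                u = 1.0 - rng.random()  # lies in (0, 1], so the log is finite
                step = alpha * abs(c[j] - x[i][j]) * math.log(1.0 / u)
                x_ij = p_ij + step if rng.random() < 0.5 else p_ij - step
                # Step 7: keep the coordinate inside the search range.
                x[i][j] = min(hi, max(lo, x_ij))
    return best, best_fit


# Usage on a toy objective whose maximum (0) is at the origin.
best, best_fit = qpso_maximize(lambda v: -sum(t * t for t in v),
                               dim=3, lo=-5.0, hi=5.0)
print(best_fit)
```

For the real mapping problem each particle would have to encode an assignment of IP cores to nodes, and the fitness would be equation (5); how positions are decoded into valid mappings is not described in this section, so it is left out here.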

Study of mapping algorithm of 3D NoC based on DCQPSO

DCQPSO algorithm

The diversity of the QPSO swarm is relatively high at the beginning of the search, right after the particle swarm is initialized. In the subsequent search process, the diversity of the population continues to decline, resulting in stronger local search capability and much weaker global search capability. In the early and middle stages of the search, reducing the diversity of the particle swarm is necessary for search efficiency. However, the diversity of the particle swarm is very low at the late stage of the search because the particles are gathered in a relatively small area. At this point, the global search capability becomes very weak and the possibility of large-scale searching is very small. If the global best position is then a locally optimal or suboptimal solution, the algorithm converges prematurely. In order to effectively avoid prematurity and improve the performance of the QPSO algorithm, Sun proposed diversity-controlled QPSO (DCQPSO).14

In the DCQPSO algorithm, the search is guided by a diversity measurement $diversity(S_x)$. The particle swarm enters a convergence state after initialization. In this process, the contraction–expansion coefficient $\alpha$ decreases linearly from 1.0 to 0.5, which is the same parameter control method as in the QPSO algorithm. During convergence, once the diversity $diversity(S_x)$ falls below a predetermined limit $d_{low}$, the contraction–expansion coefficient $\alpha$ is set to a preset value $\alpha_0$ ($\alpha_0 > 1.782$, for which the particle swarm enters a divergent state). The particle swarm then diverges and its diversity increases temporarily, until it exceeds $d_{low}$ again.
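The control rule described above can be sketched in a few lines. This is a hedged illustration rather than the authors' code: the diversity measure follows the form of equation (6) given later in the paper, the function names `diversity` and `contraction_expansion` are hypothetical, and the default `alpha0=2.0` is only an example value satisfying $\alpha_0 > 1.782$; the paper does not report the values of $d_{low}$ or $\alpha_0$ in this section.

```python
import math


def diversity(swarm, diagonal):
    """Mean distance of the particles from the swarm centroid, scaled by |A|."""
    m, dim = len(swarm), len(swarm[0])
    centroid = [sum(p[j] for p in swarm) / m for j in range(dim)]
    total = sum(
        math.sqrt(sum((p[j] - centroid[j]) ** 2 for j in range(dim)))
        for p in swarm
    )
    return total / (m * diagonal)


def contraction_expansion(count, n_iter, div, d_low, alpha0=2.0):
    """Linear 1.0 -> 0.5 schedule, replaced by alpha0 when diversity is too low."""
    if div < d_low:
        return alpha0  # divergent state: the swarm spreads out again
    return 1.0 - (1.0 - 0.5) * count / n_iter


# Two particles at distance 1 from their centroid, search-space diagonal 4.
swarm = [[0.0, 0.0], [2.0, 0.0]]
div = diversity(swarm, diagonal=4.0)
print(div, contraction_expansion(50, 100, div, d_low=0.1))
```

Because the override is checked on every iteration, the swarm returns to the linear schedule as soon as the measured diversity is back above `d_low`.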

Proposal and realization of mapping algorithm of 3D NoC based on DCQPSO

Theory of mapping algorithm of 3D NoC based on DCQPSO. The fitness function used by the mapping algorithm based on DCQPSO is shown in equation (5):

\[ Cost = MAXFit - (d_x \times w_x + d_y \times w_y + d_z) \tag{5} \]

In equation (5), $MAXFit$ is the largest fitness value, set in advance. $d_x$, $d_y$, and $d_z$ represent the distance between communicating nodes along the x-axis, y-axis, and z-axis, respectively. $w_x$ and $w_y$ represent the traffic between nodes along the x-axis and y-axis, respectively. Because TSV technology is used, the power consumption of 3D NoC in the vertical direction is much less than that in the horizontal direction when the same amount of data is transferred, so the z-axis term is not multiplied by a weight. From equation (5), the fitness value decreases as the traffic of the 3D NoC increases: the greater the traffic, the lower the fitness value. The power consumption model above shows that the greater the traffic, the higher the power consumption. The power consumption produced by the algorithm can therefore be judged by the fitness value, i.e. the greater the fitness value, the lower the power consumption.

The diversity metric of the algorithm is shown in equation (6):

\[ diversity(S_x) = \frac{1}{M \cdot |A|} \sum_{i=1}^{M} \sqrt{\sum_{j=1}^{DIM} \left( x_{i,j}(t) - \bar{x}_j(t) \right)^2}, \qquad \bar{x}_j(t) = \frac{1}{M} \sum_{i=1}^{M} x_{i,j}(t) \tag{6} \]

In equation (6), $S_x = (x_1, x_2, \ldots, x_M)$ is the population composed of the current positions of the particles. $M$ is the size of the particle swarm. $|A|$ is the length of the longest diagonal of the search space. $DIM$ is the number of dimensions of the optimization problem, which is also the number of IP cores of the APCG. $x_{i,j}(t)$ is the current position of the $i$th particle in the $j$th dimension. $\bar{x}_j(t)$ is the average position of all particles in the $j$th dimension.

The evolutionary formula of the particle position is shown in equation (7).

\[ x_{i,j}(t+1) = p_{i,j}(t) \pm \alpha \times \left| c_j(t) - x_{i,j}(t) \right| \times \ln(1/u) \tag{7} \]

In equation (7), $p_{i,j}(t)$ is a random position between the particle's own best position and the swarm's best position, calculated by equation (8), in which $\varphi$ is a random number between 0 and 1. $indibest_{i,j}(t)$ represents the best position of the particle itself and $best_j(t)$ represents the optimum position of the particle swarm. $\alpha$ is the contraction–expansion coefficient, calculated by equation (9), where $count$ is the current number of iterations and $N$ is the maximum number of iterations. $c_j(t)$ is the average optimal position of the particle swarm, calculated by equation (10). $u$ is a random number between 0 and 1.

\[ p_{i,j}(t) = \varphi \times indibest_{i,j}(t) + (1 - \varphi) \times best_j(t) \tag{8} \]

\[ \alpha = 1.0 - \frac{(1.0 - 0.5) \times count}{N} \tag{9} \]

\[ c(t) = \left( c_1(t), \ldots, c_j(t), \ldots, c_{DIM}(t) \right) = \frac{1}{M} \sum_{i=1}^{M} indibest_i(t) = \left( \frac{1}{M} \sum_{i=1}^{M} indibest_{i,1}(t), \ldots, \frac{1}{M} \sum_{i=1}^{M} indibest_{i,j}(t), \ldots, \frac{1}{M} \sum_{i=1}^{M} indibest_{i,DIM}(t) \right) \tag{10} \]

Design and realization of mapping algorithm of 3D NoC based on DCQPSO. In this part, we show the realization of the mapping algorithm based on DCQPSO proposed in this paper. The steps are shown in Algorithm 2.

Algorithm 2. Mapping algorithm of 3D NoC based on DCQPSO

Input: Particle swarm randomly generated.
Output: Optimum position best of particle swarm.

Step 1: Parameters initialization. Define the number of particles as MAX_NP and the maximum number of iterations as MAX_GENERS. Randomly generate MAX_NP initial particles.
Step 2: Calculate the average optimal position of the particle swarm according to equation (10).
Step 3: Calculate the fitness value of each particle according to equation (5). If the particle's fitness is better than that of indibest, assign its value to indibest and update the best position of the individual particle.
Step 4: If the fitness of indibest is better than that of best, update the value of best.
Step 5: Calculate the diversity $diversity(S_x)$ of the particle swarm according to equation (6).
Step 6: Calculate the contraction–expansion coefficient $\alpha$ according to equation (9).
Step 7: If the diversity $diversity(S_x)$ is below the predetermined limit $d_{low}$, set the contraction–expansion coefficient $\alpha$ to the preset value $\alpha_0$.
Step 8: For every dimension of the particle, calculate a random position according to equation (8).
Step 9: Update the position of the particles according to equation (7).
Step 10: Decide whether the current iteration number has reached the maximum number MAX_GENERS. If so, output best; otherwise increase the iteration number by 1 and jump to step 2.
Step 11: Output best and end the algorithm.

Experiments and performance analysis

Simulation platform and parameter design

1. Simulation platform

In this paper, Access Noxim 0.2, developed by Kai-Yuan Jheng at National Taiwan University, is adopted as our simulation tool.15 Access Noxim is a co-simulation platform for 3D NoC systems that couples a network model, a power model, and a thermal model. It integrates Noxim (a cycle-accurate SystemC NoC simulator) with HotSpot (which provides the architecture-level thermal model), and adopts the power model of Intel's 80-core processor.

2. Choice of topology and routing algorithm

The 3D mesh architecture is formed by extending the 2D mesh directly into 3D space16 and has a regular structure. The network nodes of a 3D mesh are homogeneous and the wiring is simple. TSV technology is used in the vertical direction, which reduces the overall wiring length. The 3D mesh is therefore used in our simulations. As the routing algorithm, we adopt XYZ routing, in which data packets are passed first along the x-axis, then along the y-axis, and finally along the z-axis. The algorithm is easy to understand and implement, and it is the most commonly used routing algorithm.

3. Parameter initialization

We initialized the parameters of the algorithms and the simulation tool as follows. The population size is set to 200 and the number of iterations is set to 100. Data packets are injected according to a memoryless Poisson distribution. The packet injection rate is set to 0.02. The total power consumption over 5000 cycles is recorded. The size of a data packet is between 2 and 10 flits. The buffer size of each router channel is 8 flits.
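The XYZ routing order described in item 2 can be illustrated with a short sketch. It assumes nodes are addressed by `(x, y, z)` mesh coordinates and that packets move one hop at a time; the function name `xyz_route` is hypothetical and the code is independent of Access Noxim.

```python
def xyz_route(src, dst):
    """Return the nodes visited from src to dst (inclusive) under XYZ routing."""
    path = [tuple(src)]
    cur = list(src)
    for axis in range(3):  # 0 = x first, then 1 = y, finally 2 = z
        step = 1 if dst[axis] > cur[axis] else -1
        while cur[axis] != dst[axis]:
            cur[axis] += step
            path.append(tuple(cur))
    return path


route = xyz_route((0, 0, 0), (2, 1, 1))
print(route)
```

Because the order of the axes is fixed, every source and destination pair has exactly one route, which is why the hop counts in the power model depend only on the coordinates of the two nodes.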

Experiment results and performance analysis

In order to compare our proposed mapping algorithm of 3D NoC based on DCQPSO with the original mapping algorithm of 3D NoC based on QPSO, we run both algorithms on the same simulation platform. To reflect the generality of the algorithms, we use both classic APCG and random APCG in our experiments. The classic APCGs are PIP (8-core), MWD (12-core), VOPD (16-core), and DVOPD (32-core). The random APCGs are generated with the task graph generator TGFF17 and have different numbers of IP cores. For each APCG, simulation experiments were conducted separately using the mapping algorithms based on DCQPSO and QPSO. The communication mode files with the best solutions were then generated and fed into Access Noxim to start the simulation.

In this paper, both the convergence speed and the power consumption of the two algorithms are compared.

Comparison on convergence speed of mapping algorithms based on DCQPSO and QPSO.

1. Comparison on convergence speed of the two algorithms for classic APCG

For classic APCG, the convergence speed over 100 iterations of the mapping algorithms based on DCQPSO and QPSO is compared in Figure 2, where the abscissa represents the number of iterations, the ordinate represents the fitness value, and the solid circle marks the convergence point of the algorithm. Taking Figure 2(b) as an example, for MWD the mapping algorithm based on DCQPSO converges to the optimal fitness value 2220 at the 35th iteration, while the mapping algorithm based on QPSO converges to the optimal fitness value 2189 at the 42nd iteration. From Figure 2, we can observe that for every classic APCG the mapping algorithm based on DCQPSO reaches its optimal solution within a few iterations, and its convergence speed improves noticeably over the mapping algorithm based on QPSO (by about 16.67–66.67%). Also, except for PIP, the optimal solution of the mapping algorithm based on DCQPSO has a better fitness than that of the mapping algorithm based on QPSO. Because PIP has few IP cores, the QPSO algorithm does not easily fall into prematurity and reaches the same optimal solution as the DCQPSO algorithm. It can also be seen that for APCG with a large number of IP cores, the advantages of the mapping algorithm based on DCQPSO are more obvious, i.e. it finds better solutions within a short period of time.

Figure 2. Comparison on convergence speed for different classic APCG. (a) Comparison on convergence speed for PIP, (b) MWD, (c) VOPD, and (d) DVOPD.

2. Comparison on convergence speed of the two algorithms for random APCG

Figure 3. Comparison on convergence speed for APCGs with different numbers of IP cores. (a) Comparison on convergence speed for 26-core APCG, (b) 51-core APCG, (c) 75-core APCG, (d) 101-core APCG, (e) 127-core APCG, and (f) 151-core APCG.

In this paper, random APCGs with 26, 51, 75, 101, 127, and 151 cores are generated using TGFF. For the different random APCGs, the convergence speed over 100 iterations of the mapping algorithms based on DCQPSO and QPSO is compared in Figure 3. Taking Figure 3(e) as an example, for the random APCG with 127 cores the mapping algorithm based on DCQPSO converges to the optimal fitness value 164,330 at the 49th iteration, while the mapping algorithm based on QPSO converges to the optimal fitness value 154,395 at the 77th iteration. From Figure 3, we can observe that for every random APCG the convergence speed of the mapping algorithm based on DCQPSO improves noticeably relative to the mapping algorithm based on QPSO (by about 12.05–50.52%). The final solution of the mapping algorithm based on DCQPSO is also better than that of the mapping algorithm based on QPSO.

Comparison on power consumption of mapping algorithms based on DCQPSO and QPSO. In our simulation experiments, for each APCG, each of the two algorithms is repeated 10 times and the best individual is used to generate the communication mode files.

1. Comparison on power consumption of the two algorithms for classic APCG

Figure 4. Comparison on the average power consumption for classic APCG.

Figure 5. Comparison on the maximum power consumption for classic APCG.

For the different classic APCGs, the mapping algorithms based on DCQPSO and QPSO are each repeated 10 times and the power consumption values are obtained from the simulation platform. The average, maximum, and minimum power consumption of the two algorithms for the same APCG over the 10 simulation trials are then compared separately.

The comparison on the average power consumption of the mapping algorithms based on DCQPSO and QPSO is shown in Figure 4. From Figure 4 we can observe that for all classic APCGs the power consumption of the mapping algorithm based on DCQPSO is lower than that of the mapping algorithm based on QPSO, with a maximum reduction of 5.29% (for DVOPD) and a minimum reduction of 0.36% (for PIP). The comparison on the maximum power consumption is shown in Figure 5. From Figure 5 we can see that for all classic APCGs the power consumption of the mapping algorithm based on DCQPSO is lower than that of the mapping algorithm based on QPSO, with a maximum reduction of 6.85% (for DVOPD) and a minimum reduction of 0.25% (for PIP). The comparison on the minimum power consumption is shown in Figure 6. From Figure 6 we can see that for all classic APCGs the power consumption of the mapping algorithm based on DCQPSO is lower than that of the mapping algorithm based on QPSO, with a maximum reduction of 3.89% (for VOPD) and a minimum reduction of 0.70% (for PIP).

In summary, power consumption is reduced to varying degrees by the mapping algorithm based on DCQPSO compared to the original mapping algorithm based on QPSO. The reduction ratio of the power consumption of the mapping algorithm based on DCQPSO, relative to the mapping algorithm based on QPSO, is shown in Figure 7, where the abscissa represents the different classic APCGs and the ordinate represents the ratio by which the mapping algorithm based on DCQPSO reduces power consumption compared to the mapping algorithm based on QPSO. When the number of IP cores of the APCG is small, the QPSO algorithm rarely falls into prematurity and the advantage of the DCQPSO algorithm is not very obvious. When the number of IP cores of the APCG is large, the QPSO algorithm tends to fall into prematurity and the optimization ability of the DCQPSO algorithm is stronger.

Figure 6. Comparison on the minimum power consumption for classic APCG.

Figure 7. The reduction ratio of power consumption of mapping algorithm based on DCQPSO for classic APCG.

2. Comparison on power consumption of the two algorithms for random APCG

In this paper, random APCGs with 26, 51, 75, 101, 127, and 151 cores are generated using TGFF. For the different random APCGs, the mapping algorithms based on DCQPSO and QPSO are each repeated 10 times and the power consumption values are obtained from the simulation platform. The average, maximum, and minimum power consumption of the two algorithms for the same APCG over the 10 simulation trials are then compared separately.

The comparison on the average power consumption of the mapping algorithms based on DCQPSO and QPSO is shown in Figure 8. From Figure 8 we can see that for all random APCGs the power consumption of the mapping algorithm based on DCQPSO is lower than that of the mapping algorithm based on QPSO, with a maximum reduction of 5.51% (for 127-core) and a minimum reduction of 4.26% (for 101-core).

Figure 8. Comparison on the average power consumption for random APCG.

Figure 9. Comparison on the maximum power consumption for random APCG.


Figure 10. Comparison on the minimum power consumption for random APCG.

Figure 11. The reduction ratio of power consumption of mapping algorithm based on DCQPSO for random APCG.

The comparison on the maximum power consumption of the mapping algorithms based on DCQPSO and QPSO is shown in Figure 9. From Figure 9 we can see that for all random APCGs the power consumption of the mapping algorithm based on DCQPSO is lower than that of the mapping algorithm based on QPSO, with a maximum reduction of 5.57% (for 75-core) and a minimum reduction of 3.56% (for 26-core). The comparison on the minimum power consumption is shown in Figure 10. From Figure 10 we can see that for all random APCGs the power consumption of the mapping algorithm based on DCQPSO is lower than that of the mapping algorithm based on QPSO, with a maximum reduction of 8.04% (for 75-core) and a minimum reduction of 4.08% (for 151-core).

In summary, for random APCG, power consumption is reduced to varying degrees and the performance is more stable with the mapping algorithm based on DCQPSO than with the original mapping algorithm based on QPSO. This is because once the mapping algorithm based on QPSO falls into prematurity, it produces poor results. In the comparison of the mapping algorithms based on QPSO and PSO, the mapping algorithm based on QPSO is most suitable for APCG with about 60 IP cores, but for large-scale APCG its power optimization efficiency cannot be improved very much. For APCG with a large number of IP cores (100-core to 150-core), the optimal results of the proposed mapping algorithm based on DCQPSO are still relatively stable and power consumption can decrease by 4.08–8.04%. The reduction ratio of the power consumption of the mapping algorithm based on DCQPSO, relative to the mapping algorithm based on QPSO, is shown in Figure 11, where the abscissa represents the different random APCGs and the ordinate represents the ratio by which the mapping algorithm based on DCQPSO reduces power consumption relative to the mapping algorithm based on QPSO.
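The summary statistics and the reduction ratio plotted in Figures 7 and 11 can be computed as in the sketch below. The formula for the ratio, the QPSO value minus the DCQPSO value divided by the QPSO value, is an assumption inferred from the phrase "relative to the mapping algorithm based on QPSO"; the paper does not state it explicitly. The function names are hypothetical and the trial values are placeholders, not measurements from the paper.

```python
from statistics import mean


def summarize(trials):
    """Average, maximum, and minimum power over repeated trials."""
    return {"avg": mean(trials), "max": max(trials), "min": min(trials)}


def reduction_ratio(qpso_power, dcqpso_power):
    """Percentage by which DCQPSO lowers power relative to QPSO (assumed formula)."""
    return 100.0 * (qpso_power - dcqpso_power) / qpso_power


# Placeholder numbers for illustration only; they are NOT results from the paper.
qpso_trials = [10.0, 12.0, 11.0]
dcqpso_trials = [9.5, 11.4, 10.45]

q, d = summarize(qpso_trials), summarize(dcqpso_trials)
ratios = {key: reduction_ratio(q[key], d[key]) for key in q}
print(ratios)
```

Comparing the average, maximum, and minimum separately, as the paper does, shows whether an improvement holds across all trials or only in the best or worst run.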

Conclusions

In this paper, mapping algorithms based on both DCQPSO and QPSO are used to solve the problem of 3D NoC mapping. Simulation results show that the mapping algorithm based on DCQPSO can maintain diversity from start to finish and is less prone to fall into prematurity. Convergence speed is significantly improved (12.05–66.67%) and the power consumption of mapping is effectively reduced (4.08–8.04%). For APCG with different numbers of IP cores, the optimization efficiency is very stable. In future studies, we will continue to optimize the objective function and improve the efficiency of 3D NoC mapping in different ways. In addition, we will study 3D NoC mapping algorithms for different topologies and routing schemes, such as heterogeneous topologies and personalized routing, so that the proposed low-power mapping algorithm based on DCQPSO can be applied more generally.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

1. Liu Y. Research of NoC mapping problem based on particle swarm optimization algorithm. Dalian University of Technology, 2010.
2. Wang T. Low energy NoC mapping algorithm based on modified electromagnetism-like mechanism. Xidian University, 2014.
3. Zhang Z. Improved particle swarm optimization algorithm based mapping algorithm for 3D-mesh CMP. Guangdong University of Technology, 2012.
4. Yang W, Zhang Z and Liu YJ. Improved particle swarm optimization algorithm based mapping algorithm for 3D-mesh CMP. Application Research of Computers 2013; 30: 1345–1348.
5. Sahu PK, Shah T, Manna K, et al. Application mapping onto mesh-based network-on-chip using discrete particle swarm optimization. In: IEEE transactions on very large scale integration (VLSI) systems, vol. 22, No. 2, 2014, pp.300–312.
6. Bhardwaj K and Mane PS. C3Map and ARPSO based mapping algorithms for energy-efficient regular 3-D NoC architectures. In: Technical papers of 2014 international symposium on VLSI design, automation and test, 2014.
7. Riget J and Vesterstrom J. A diversity-guided particle swarm optimizer the ARPSO. Dept. Comput. Sci., Univ. of Aarhus, Aarhus, Denmark, Tech. Rep 2002.
8. Jiawen W, Li L, Wei Y, et al. A dynamic ant colony optimization algorithm for 3D NoC mapping. J Comp Aid Des Comp Graphics 2011; 23: 1614–1620.
9. Wang J, Li L, Pan H, et al. Latency-aware mapping for 3D NoC using rank-based multi-objective genetic algorithm. In: 2011 IEEE 9th international conference on ASIC, 25–28 October 2011, Xiamen, China, pp.413–416.
10. Sahni S and Gonzalez T. P-complete approximation problems. JACM 1976; 23: 555–565.
11. Xingsheng LV. Study on mapping algorithm of 3D NoC. Qufu Normal University, 2014.
12. Yang W, Zhang Z and Liu Y. Improved particle swarm optimization algorithm based mapping algorithm for 3D-mesh CMP. Appl Res Comp 2013; 5: 1345–1348.
13. Matsutani H, Koibuchi M and Amano H. Tightly-coupled multi-layer topologies for 3-D NoCs. In: ICPP, 10–14 September 2007, XiAn, China, p.75.
14. Sun J. Study on quantum-behaved particle swarm optimization. Jiangnan University, 2009.
15. Jheng K-Y, Chao C-H, Wang H-Y, et al. Traffic-thermal mutual-coupling co-simulation platform for three-dimensional network-on-chip. VLSI Design Autom Test (VLSI-DAT) 2010; 54: 135–138.
16. Chen Y, Hu J and Ling X. Study on 3D NoC topology. Telecommun Sci 2009; 25: 39–44.
17. Dick RP, Rhodes DL and Wolf W. TGFF: task graphs for free. In: Proceedings of the 6th international workshop on hardware/software codesign, IEEE Computer Society, 15–18 March 1998, Seattle, WA, USA, pp.97–101.
