Bi-Directional Synthesis of 4-Bit Reversible Circuits - CiteSeerX

The Computer Journal Advance Access published July 12, 2007 # The Author 2007. Published by Oxford University Press on behalf of The British Computer Society. All rights reserved.

For Permissions, please email: [email protected] doi:10.1093/comjnl/bxm042

Bi-Directional Synthesis of 4-Bit Reversible Circuits GUOWU Y ANG 1, XIAOYU SONG 2, W ILLIAM N. N. HUNG 2,3,*

AND

M AREK A. PERKOWSKI 2

1

School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 610054, China 2 Department of Electrical and Computer Engineering, Portland State University, Portland, OR 97207, USA 3 Synplicity Inc., Sunnyvale, CA 94086, USA *Corresponding author: [email protected] Reversible circuits play an important role in quantum computing, which is one of the most promising emerging technologies. In this paper, we investigate the problem of optimally synthesizing 4-bit reversible circuits. We present an enhanced bi-directional synthesis approach. Owing to the exponential nature of the memory and run-time complexity, all existing methods can only perform four steps for the Controlled-Not gate NOT gate, and Peres gate library. Our novel method can achieve 12 steps. As a result, we augment the number of circuits that can optimally be synthesized by over 5 3 106 times. We synthesized 1000 random 4-bit reversible circuits. The statistical analysis result supports our estimation. The quantum cost of our result is also better than the quantum cost of other approaches. The promising experimental results demonstrate the effectiveness of our approach. Keywords: reversible logic; quantum circuits; minimization; algorithm Received 12 March 2006; revised 29 March 2007

1.

INTRODUCTION

Reversible circuits play an important role in quantum computing, which is one of the most promising emerging technologies. There has recently been some research effort on the synthesis of reversible circuits [1 – 9]. Exact minimal results were given only for 3-bit circuits. There are no exact results on 4-bit circuits and all published results for more than 3-bit circuits are only heuristics with no evaluation of distance from the minimal solution. Moreover, even these heuristic results are usually slow since they depend on optimization algorithms such as evolutionary programming or simulated annealing. It has been known that any 3-bit reversible gate can be synthesized using the Controlled-Not, Not and Toffoli gates (CNT) gate library [1, 2]. In [8, 9], an optimal approach was proposed for synthesizing 3-bit reversible gates with an average of 5.87 gates with CNT. Miller et al. [6] give the average of 5.63 gates for optimally synthesizing three-line reversible circuits based on a bigger library Not, CNot, Toffoli and Swap gates. Group theory has been demonstrated as a powerful tool for analysis in many applications. GAP [10] is a mathematical analysis package for group theory applications. It is composed

of a set of efficient and fast algorithms for manipulating set and group operations. It was used to prove the universality of some given sets of reversible gates [7, 11]. Recently, more and more works on using group theory for reversible logic synthesis are being proposed [2, 3, 7 – 9, 11]. Miller et al. [6] presented a transformation-based algorithm for synthesizing reversible circuits. They demonstrated a basic greedy algorithm, and then used a bi-direction [12] algorithm to improve the greedy result. Although the method tried to optimize the circuit in two directions, their result is not optimal. It should be pointed out that bi-directional search originated from the artificial intelligence (AI) community, and many other search algorithms, such as iterative deepening and IDA* [12], have also been studied in AI and operations research. In this paper, we investigate the problem of optimally synthesizing 4-bit reversible circuits. We apply the bi-direction idea into search-based synthesis, and present an enhanced bi-directional synthesis approach. Owing to the super-exponential increase on the memory requirement, all the existing methods can only perform four steps for the Controlled-Not, Not and Peres gates (CNP) library [9]. Our

THE COMPUTER JOURNAL, 2007

Page 2 of 9

G. YANG

novel method outperforms them by eight steps, i.e. we are able to execute 12 steps. It is also the orders of magnitude faster than the existing approaches. The number of circuits that can be optimally synthesized has increased over 5 106 times. The promising experimental results demonstrate the effectiveness of our approach.

2. BACKGROUND In this section, we introduce some basic concepts and results on permutation group theory from [13] and binary reversible logic from [3, 6, 11]. DEFINITION 2.1 (Binary reversible gate) Let B ¼ f0, 1g. A binary logic circuit f with w inputs and outputs is denoted by a binary multiple-output function f : B w ! B w. Let kB1, . . . , Bwl [ B w and kP1, . . . , Pwl [ B w be the input and output vectors, where B1, . . . , Bw are input variables and P1, . . . , Pw are output variables. There are 2w different assignments for the input vectors. A binary logic circuit f is reversible if it is a one-to-one and onto function (bijection). A binary reversible logic circuit with w inputs and w outputs is also called a w-bit binary reversible gate especially when they are used to synthesize reversible circuits (for quantum computing). There are a total of (2w)! different w-bit binary reversible circuits. Now we introduce permutation group and its relationship with reversible functions. DEFINITION 2.2 (Permutation). Let M ¼ fd1, d2, . . . , dng. A bijection (one-to-one, and onto mapping) of M onto itself is called a permutation on M. The set of all permutations on M forms a group under composition of mappings, called a symmetric group on M. It is denoted by Sn [13]. A permutation group is simply a subgroup [13] of a symmetric group. A mapping a: M ! M can be written as a product of disjoint cycles as an alternative notation for a mapping [13]. For example,

d1 ; d2 ; d3 ; d4 ; d5 ; d6 ; d7 ; d8 ; d9 d1 ; d4 ; d7 ; d2 ; d5 ; d8 ; d3 ; d6 ; d9

ð1Þ

can be written as (d2, d4)(d3, d7)(d6, d8). Denote ‘( )’ as the identity mapping (i.e. direct wiring) and call this the unity element in a permutation group. The inverse mapping of mapping f is denoted as f 21. Per convention, a product a b of two permutations applies mapping a before b. To establish a one-to-one correspondence between a reversible function and a permutation, we encode a w-bit binary input (output) vector kBw, Bw21, . . . , B1l as a unique integer value index (kBw, Bw21, . . . , B1l) ¼ B1 þ B2 . 21 þ B3 . . . 22 þ . . . þ Bw . 2w21 þ 1. We add one due to the following two reasons. First, in most of the permutation group books,

ET AL.

M begins from one, instead of zero. Second, in GAP, M also begins from one, instead of zero. Therefore, we have the following relation: kBw, Bw21, . . . , B1l2 ¼ index(kBw, Bw21, . . . , B1l) 2 1. Using the integer coding, we consider a permutation as a bijection function f: f1, 2, . . . , 2wg ! f1, 2, . . . , 2wg. Cascading two gates is equivalent to multiplying two permutations. In what follows, we will not distinguish a reversible gate from a permutation in S2w. If A and B are subsets of a symmetric group, then A B is defined as a bja [ A^b [ B. We use jSj to denote the size of S. DEFINITION 2.3 (Library). w-library is the set of w w reversible gates, which are used to synthesize w w reversible gates, denoted as Lw, or simply as L. A reversible circuit g can be synthesized by L means that there are some gates in L such that g is the product of these gates. We use T(L) to denote a set of all reversible circuits that can be synthesized using gates from library L. DEFINITION 2.4 (Minimum length). A minimum length l(a) of any element a in T(L) means that there exist l(a) gates in L (the gates can be of the same type) such that a is a cascade of these l(a) gates, and there does not exist k gates in L such that k , l(a) and a is a cascading of these k gates.

3.

BI-DIRECTION SEARCH ALGORITHM

In this section, we will answer these questions: given a library L and an arbitrary reversible function g, can g be synthesized using gates from L? If yes, how to synthesize g with minimal length in limited memory space? When we use breadth-first search algorithms [9] to deal with 4-bit reversible problems, the memory becomes quickly exhausted. In our computer, the breadth-first search algorithm can only go four steps (the depth of search which corresponds to the length of the reversible cascade) using the CNP gate library (there are 40 permutations in it, see Section 5), and only about 1 million (106) 4-bit reversible functions can be realized (see Table 1). The bread th-first search method is a forward search (FS) from identity until the permutation g that represents the specification of the given reversible circuit is found. Now we propose a bi-directional search method: an FS from identity and a backward search from the specification g. The idea is to compute the inverse of the library L, L 21 ¼ faja 21 [ Lg. Set s is the maximum number of steps which a computer can go when using breadth-first TABLE 1. Number of minimum synthesized circuits.

Lib CNP

Forward one diretion

Enhanced bi-direction s ¼ 4, t ¼ 2

Length 4

Length 10


# circuits 1.0106

# circuits .51012

BI-DIRECTIONAL SYNTHESIS search method in the computer. The forward set Af(k) is the set of reversible circuits which can be realized by no more than k gates in the library L. The backward set Ab(k) is the set of reversible circuits, which can be constructed by no more than k inverse of the gates in the library L from the given circuit. Compute the forward sets Af ð1Þ ¼ L; Af ð2Þ ¼ Af ð1Þ L; .. .

Page 3 of 9

(10) Else i ¼ i þ 1; (11) End while. THEOREM 3.1. In the algorithm BDS, the parameter leng is the minimum length l(g), and g ¼ h1 * h2 * . . . * hleng is the minimum length design of circuit g, where hj [ L for j ¼ 1, 2, . . . , leng. Proof. According to the algorithm BDS, g ¼ h1 * h2 * . . . * hleng, where hj [ L for j ¼ 1, 2, . . . , leng. Thus l(g) leng. In the following, we will prove l(g) ¼ leng. Then the proof of the theorem finishes.

Af ðsÞ ¼ Af ðs 1Þ L and the backward sets Abð1Þ ¼ L1 g; Abð2Þ ¼ L1 Abð1Þ; .. . AbðsÞ ¼ L1 Abðs 1Þ: If Af(j) > Ab(j 2 1) ¼ ;, but Af(j) > Ab(j) = ;, then the minimum number of steps of g is 2j. If Af(j) > Ab(j) ¼ ;, but Af(j þ 1) > Ab(j) = ;, then the minimum number of steps of g is 2j þ 1. In the bi-directional search approach, the needed memory is only doubled, but the steps are doubled as well, up to 2s. The number of the reversible circuits, which can be constructed by library L, is dramatically increased with the number of steps. The bi-directional algorithm is also significantly faster, because the number of the searched reversible circuits is much less than the FS only (see Table 1). For instance, when we synthesize a circuit with minimum length four, we need to remember jAf(4)j ¼ 1115774 permutations if we use FS. But if we use bi-direction search, we only need to remember jAf(2)j þ jAb(2)j ¼ 2734 permutations. ALGORITHM 1: Bi-direction Search (BDS) Inputs: Library L, Reversible Function g Outputs: leng, h1, h2, . . . , hleng (1) Af(0) ¼ f( )g; Ab(0) ¼ fgg; Flag ¼ 0; i ¼ 1; (2) While Flag ¼ 0 do (3) Af(i) ¼ Af(i 2 1) L; Ab(i) ¼ L 21 Ab(i 2 1); (4) If Af(i)> Ab(i 2 1) = ; then (5) Flag ¼ 1; leng ¼ 2i 2 1; (6) Select a in Af(i) > Ab(i 2 1), find h1, h2, . . . , hi, hiþ1, . . . , h2i21 in L such that a ¼ h1 h2 . . . hi ¼ g * (h2i21)21 * . . . * (hiþ1)21; (7) Elseif Af(i) > Ab(i) = ; then (8) Flag ¼ 1; leng ¼ 2i; (9) Select a in Af(i) > Ab(i), find h1, h2, . . . , hi, hiþ1, . . . , h2i in L such that a ¼ h1 * h2 * . . . . . . * hi ¼ g * (h2i)21 * . . . * (hiþ1)21;

(1) Suppose l(g) ¼ 2k 2 1, an odd number. Then there exist b1, b2, . . . , b2k21 [ L such that g ¼ b1 * b2 * . . . * bk * bkþ1 * . . . * b2k21. So g * (b2k21)21 * . . . * (bkþ1)21 ¼ b1 * b2 * . . . * bk. According to the computation of Af(k) and Ab(k), b1 * b2 * . . . * bk [ Af(k), g * (b2k21)21 * . . . * (bkþ1)21 [ Ab(k 2 1). Thus leng 2k 2 1 ¼ l(g). Combining with l(g) leng, thus leng ¼ 2k 2 1 ¼ l(g). (2) l(g) ¼ 2k, an even number. Similar to case 1, leng ¼ l(g). A We use a simple strategy to enhance the bi-directional method such that the length or cost grows up to 2s þ t, where t s. We do FS from identity and backward search from the targeted circuit. In our enhanced bi-directional search (EBDS) method, we FS from the set Bf(s), where Bf(k) ¼ Af(k) 2 Af(k 2 1), k ¼ 1, . . . , s, is the set of circuits that need k and at least k gates in the library L to be realized. The idea is: if Bf(s)[i] * Bf(k)[j], k ¼ 1, 2, . . . , s, is in the backward set Ab(s), then stop. The mini-length is equal to 2s þ k, where k ¼ 1, 2, . . . , s. We do not compute new sets, so the memory is not increased. That is to say, the memory needed in the EBDS is the same as the memory needed in the bi-directional search. Let s be the length bound in bi-direction search. ALGORITHM 2: Enhanced bi-direction search (1) k ¼ 1; (2) While k s do (3) if (Bf(s) * Bf(k)) > Ab(s) ¼ ; then (4) k ¼ k þ 1; (5) else leng ¼ 2s þ k; (6) Back track from Bf(s) to Bf(1), from Bf(k) to Bf(1), . . . and from Ab(s) to Ab(1) respectively to get h1, . . . , hleng . . . such that g ¼ h1 * . . . * hleng; (7) end if; (8) end while.


Page 4 of 9

G. YANG

(4) Else if BCf(bound) * ACb(k) is not empty then cost ¼ bound þ k; (5) Back track from BCf(bound) to BCf(1) and from ACb(k) to ACb(1), respectively, to get h1, . . . , hm in L such that g ¼ h1 * h2 * . . . * hm; (6) Else if (BCf(bound) * BCf(k)) > ACb(bound) is not empty then cost ¼ 2 * bound þ k; (7) Back track from BCf(bound) to BCf(1), from BCf(k) to BCf(1) and from ACb(bound) to ACb(1) respectively, to get h1, . . . , hm in L such that g ¼ h1 * h2 * . . . * hm; (8) End if.

4. REDUCING QUANTUM COST WITH CNP The algorithms presented in Section 3 were designed to minimize the number of binary reversible gates. We can modify the algorithm to reduce quantum cost [14] instead. Calculate the sets ACf(k), which is the set of the circuits with cost no more than k, where k bound. Then BCf(k) ¼ ACf(k) 2 ACf(k 2 1) is the set of the circuits with cost k, where k bound. If input circuit is in BCf(k), then cost ¼ k. Otherwise, calculate the sets ACb(kb), which is the set of the circuits back from input circuit g with cost no more than kb, where kb bound. If BCf(bound) > ACb(kb) = ;, then cost ¼ bound þ kb. Else if (BCf(bound) * BCf(k)) > ACb(bound) is not empty then cost ¼ 2 * bound þ k. When we use library CNP to synthesize 4-bit reversible circuits with minimum quantum cost, there are four connections for NOT gate whose cost is 1, 12 connections for CNot gate whose cost is 1 and 24 connections for Peres gate whose cost is 4. Therefore, there are 16 permutations in the set L1 composed of gates with cost 1, and there are 24 permutations in the set L4 composed of gates with cost 4. The computation of ACf(k) and ACb(k) is ACf ACf ACf ACf

ð0Þ ¼ fðÞg; ð1Þ ¼ L1; ð2Þ ¼ ACf ð1Þ < ACf ð1Þ L1; ð3Þ ¼ ACf ð2Þ < ACf ð2Þ L1:

For k 4, ACf ðkÞ ¼ ACf ðk 1Þ < ACf ðk 1Þ L1 < ACf ðk 4Þ L4; ACbð0Þ ¼ fðÞg; ACbð1Þ ¼ input ðL1Þ1 ; ACbð2Þ ¼ ACbð1Þ < ACbð1Þ ðL1Þ1 ; ACbð3Þ ¼ ACbð2Þ < ACbð2Þ ðL1Þ1 : For k 4, ACbðkÞ ¼ ACbðk 1Þ < ACbðk 1Þ ðL1Þ1 < ACbðk 4Þ ðL4Þ1 : ALGORITHM 3: Enhanced bi-direction search for quantum cost Inputs: Library L, Reversible Function g Outputs: cost, h1, h2, . . . , hm

ET AL.

The outputs cost, h1, h2, . . . , hm satisfy: g ¼ h1 * h2 * . . . * hm and, sum of the costs of h1, h2, . . . , hm is cost.

5.

EXPERIMENTS

In this section, we present some experiments on 4-bit synthesis using CNP. All experiments are running on a 1.6 GHz Pentium(R) Micro processor with 512 M Memory. We first introduce how many permutations will be in the library CNP when we use GAP. Not gates Ni: Pi ¼ Bi0 ; Pm ¼ Bm, if m = i. (The subscript number i means the 1-bit NOT gate is connected to wire i, see Fig. 1.) There are four connections, so there are four permutations of NOT gate in our permutation library. For instance, N1 ¼ (1, 2) (3, 4) (5, 6) (7, 8) (9, 10) (11, 12) (13, 14) (15, 16). Control-Not gate Ci,j: Pi ¼ Bi Bj; Pm ¼ Bm, if m = i. (The subscript numbers i and j mean that this 2-bit ControlNot gate is connected to the wire i and j, and the XOR gate is connected to the wire i, the first number of these subscript numbers, see Fig. 2.) There are 12 connections, so there are 12 permutations of Control-Not gate in our permutation library. For instance, C2,3 ¼ (5, 7) (6, 8) (13, 15) (14, 16). Peres gate Pi,j,k: Pi ¼ Bi BjBk, Pj ¼ Bj Bk, Pm ¼ Bm, if m = i, j. (see Fig. 3.) There are 24 connections, thus there are 24 permutations of Peres gate in our permutation library. For instance, P2,3,4 ¼ (9, 13, 11, 15) (10, 14, 12, 16). Only half of all 4-bit reversible circuits (even reversible circuits) can be realized by CNP (which is the alternating group A16). The number jA16j is about 1 trillion (1013). In our computer, the traditional algorithm with FS can only go four steps, and only about 1 million (106) 4-bit reversible circuits can be realized (see Table 1). When using enhanced bi-directional search, we set t ¼ 2 for time balance. The algorithm can use up to eight steps. The number of reversible circuits with the minimum length becomes much larger. Denote Bf(k) ¼ Af(k) 2 Af(k 2 1), the set of circuits with minimum length k. From the second step, the ratios

Compute BCf(1), . . . , BCf(bound), and ACb(1), . . . , ACb(bound); (2) If g in BCf(k) then cost ¼ k; (3) Back track from BCf(k) to BCf(1) to get h1, . . . , hm in L such that g ¼ h1 * h2 * . . . * hm;

(1)


FIGURE 1. Not gate Ni.

BI-DIRECTIONAL SYNTHESIS

Page 5 of 9

For time reason, we use bi-direction search. There are 15 circuits that can be constructed within eight steps of Not, Controlled-Not and Peres gates, namely 1.5% circuits can be constructed within eight steps of Not, Controlled-Not and Peres gates. Our estimated probability p ¼ 1.4 1011/ (1.04 1013) ¼ 1.3%, which is our hypothesis H0: p ¼ 1.3%. The statistical hypothesis testing shows that this hypothesis H0 is accepted with significance level a ¼ 0.05 [15, 16]. That is say that our estimation is over 95% reliable. Therefore, our estimation is reasonable. Using the enhanced bi-directional search, we present two 4-bit reversible circuits with minimum lengths 8 and 9, respectively.

FIGURE 2. Controlled-Not gate Ci,j.

EXAMPLE 1: Given f1 as P ¼ AB þ AC 0 þ A0 CD0 ;

FIGURE 3. Peres gate Pi,j,k.

Q ¼ A0 C þ B0 CD0 þ BCD þ BC0 D0 þ BCD; R ¼ BD þ A0 B0 C þ ABC þ AC 0 D;

of jBf(j þ 1)j/jBf(j)j, j ¼ 1, 2, 3, are 33.2, 29.9, 27.1 (see Table 2). Therefore, before the number of the represented circuits reaches half of the all even circuit number jA16j, it is reasonable to assume that the ratio jBf(j þ 1)j/jBf(j)j is about 10% reduced forward. On the basis of this assumption, the estimation of the circuits with the minimum length represented by bi-direction method is 24 21 18 16 106 ¼ 1.4 1011, about 1.4 105 times more than using only the FS. Using the enhanced bi-direction search, we calculate two more steps. 1.4 1011 14 12 ¼ 2.4 1013 bigger than half of jA16j, which means over 50% of even 4-bit reversible circuits can optimally be realized within the length of 10 (Table 1); after step 10, the size of Bf(j) will decrease; and the maximum minimum length is less than 20. To demonstrate our idea, we synthesized 1000 random 4-bit reversible circuits. According to Section 2, the functional (input – output) behavior of each 4-bit reversible circuit can canonically be represented by a permutation of integers 1, 2, . . . , 16. To produce such random 4-bit reversible circuit, we use a random number generator to assign a random weight for every integer 1, 2, . . . , 16. We then sort these integers according to their randomly generated weight. The sorted result is the permutation of the corresponding reversible circuit.

S ¼ CD0 þ B0 D:

The permutation of f1 is f1 ¼ ð5; 16; 8; 14; 9; 10; 13; 15; 7; 12Þ ð6; 11Þ: We get the minimum length of f1 is 8 and the synthesis f1 ¼ CB;C PC;B;D PB;D;C PC;B;D CA;D PC;A;D PA;B;C PB;A;D : The result is shown in Fig. 4. Notice that we use boxes (with one CNOT and one Toffoli inside each box) to represent functional equivalent Perez gates. EXAMPLE 2: Given f2 as P ¼ AC 0 þ AB0 D þ A0 BC þ A0 CD0 ; Q ¼ BD0 þ CD0 þ A0 B0 C þ AB0 C 0 D; R ¼ BD þ AC 0 D þ ABC þ A0 B0 C; S ¼ AC þ CD0 þ ABD þ A0 B0 C 0 D:

The permutation of f2 is f2 ¼ ð5; 16; 13; 7; 12; 14; 10; 8; 15; 6; 11Þ:

TABLE 2. Number of minimum synthesized circuits. k

1

2

3

4

jAf(k)j jBf(k)j jBf(k)j/jBf(k 2 1)j

41 40

1367 1326 33.2

40967 39600 29.9

1115774 1074807 27.1

The minimum length of f2 is 9 and we have the synthesis (see Fig. 5) f2 ¼ PB;A;D PA;B;D PB;A;C PC;A;D CD;B PB;C;D PD;C;B PA;D;C CB;D :


Page 6 of 9

G. YANG

ET AL.

FIGURE 4. Optimal realization of f1.

Our EBDS is more efficient and powerful than a single FS. Not only the number of minimum synthesized circuits that can be handled by EBDS is 5 106 times bigger than that by FS, but also the speed of EBDS is much faster than that of FS (see Table 3). Time is measured in seconds. The follows are the corresponding permutations of the input circuits gi, i ¼ 1, 2, . . . , 11 and the synthesis by CNP library, respectively.

g5 ¼ ð5; 13; 12; 9; 7; 15; 16; 14; 10Þ; ¼ C2;3 P2;1;4 P1;2;4 P2;3;1 P1;4;3 P4;3;1 C2;4 :

g6 ¼ ð5; 14; 9; 7; 16; 13; 11; 12; 10Þ; ¼ P1;3;4 P2;1;4 P1;2;4 P3;1;4 P2;3;4 P4;1;3 P1;4;3 :

g1 ¼ ð11; 14; 16Þ ð12; 13; 15Þ; ¼ P3;1;4 P1;2;4 P3;1;4 P2;3;4 : g7 ¼ ð6; 14; 10; 13; 12; 9; 15; 8; 16Þ; ¼ C1;2 P3;2;4 P2;1;4 C1;2 P1;3;4 P4;1;3 C1;3 :

g2 ¼ ð10; 11; 13; 16Þ ð12; 15Þ; ¼ P3;2;4 P2;3;4 C3;4 P1;2;4 :

g3 ¼ ð9; 12; 14Þ ð11; 16; 13Þ; ¼ P3;2;4 P2;1;4 P1;2;4 P1;3;4 P2;3;4 :

g8 ¼ ð5; 15; 13; 9; 7; 14; 12; 11; 10Þ; ¼ C1;2 P3;2;4 P2;1;4 C1;2 P1;3;4 C1;3 P4;1;3 P1;2;4 ; ¼ P2;1;4 C2;3 P1;2;4 P2;3;1 P1;4;3

g4 ¼ ð5; 14; 13; 15; 11; 12; 10Þ ð7; 16; 9Þ; ¼ P2;1;4 P1;2;4 P4;1;3 P3;1;4 C2;4 P2;4;3 :

FIGURE 5. Optimal realization of f2.


P4;3;1 C2;4 P1;2;4 :

BI-DIRECTIONAL SYNTHESIS TABLE 3. Time of minimum length synthesis.

Page 7 of 9

g2 ¼ ð10; 11; 13; 16Þ ð12; 15Þ;

Forward search



Input

Length

Time

Length

Time

Length

Time

g1 g2 g3 g4 g5 g6 g7 g8 g9 g10 g11 g12 f1 f2

4 Exploded Exploded Exploded Exploded Exploded Exploded Exploded Exploded Exploded Exploded Exploded Exploded Exploded

90

4 5 6 6 7 7 7 8 8 8 9 10 8 9

,1 3 4 4 165 166 165 166 169 167 180 289 165 194

4 5 6 6 7 7 7 8 8 8 9

,1 3 4 4 5 5 5 8 8 8 118

8 9

8 168

¼ C1;3 P3;2;4 C1;2 P2;1;4 C1;2 C1;3 : g3 ¼ ð9; 12; 14Þ ð11; 16; 13Þ; ¼ C2;1 P1;2;4 C2;1 C1;3 P3;1;4 C1;3 : g4 ¼ ð5; 14; 13; 15; 11; 12; 10Þ ð7; 16; 9Þ; ¼ C1;4 C1;2 P2;1;4 C1;3 C1;2 C1;3 P4;1;3 P3;1;4 C4;3 : g5 ¼ ð5; 13; 12; 9; 7; 15; 16; 14; 10Þ; ¼ C1;2 P3;2;4 P2;1;4 C1;2 P1;3;4 C1;3 P4;1;3 : g6 ¼ ð5; 14; 9; 7; 16; 13; 11; 12; 10Þ; ¼ C1;2 P3;2;4 P2;1;4 C1;2 C3;4 P4;1;3 P1;4;3 :

g9 ¼ ð1; 5; 9; 3; 7; 11; 15; 12; 13; 16; 10Þ ð2; 6Þ ð4; 8Þ;

g11 ¼ C2;3 P3;2;4 P2;4;3 P3;2;4 C1;4 P3;1;4 P1;2;3 P2;1;4 P1;2;4 :

¼ C1;2 P3;2;4 P2;1;4 C1;2 P1;3;4 C1;3 P4;1;3 N3 ; ¼ P2;1;4 P1;2;4 P4;1;3 P3;1;4 C4;3

g12 ¼ C3;2 P1;3;4 P3;1;4 P4;2;3 P1;3;4 P2;3;4 P3;2;1 P2;3;4 P4;2;1 C4;1 :

N3 P2;3;4 P3;1;4 :

g10 ¼ ð3; 11; 4; 12; 9; 15; 7; 8; 16; 14; 10; 5; 13Þ; ¼ C1;2 P3;2;4 P2;1;4 C1;2 P1;3;4 C1;3 P4;1;3 P3;4;2 :

g11 ¼ ð5; 13; 14; 11; 6; 10; 15; 7; 9; 12Þ ð8; 16Þ; ¼ P4;2;3 C3;4 P4;2;3 P2;1;3 P1;2;4 P2;1;3 P1;2;3 P3;1;4 P2;1;4 :

g12 ¼ ð3; 4; 11; 5; 7Þ ð6; 12; 14; 10; 15Þ ð8; 16; 13Þ; ¼ P1;4;2 P3;4;1 P1;4;3 P3;1;2 P2;1;3 C4;1 P3;2;4 P2;1;3 P1;3;4 P3;2;4 : We also synthesized the circuits g1, g2, . . . , g6, g11, g12 and f1 using the quantum cost minimization technique described in Section 4. The results are as follows. g1 ¼ ð11; 14; 16Þ ð12; 13; 15Þ; ¼ C1;3 P3;2;4 C1;3 C3;4 P2;3;4 :

f1 ¼ C2;3 P3;2;4 P2;4;3 C2;1 P3;2;4 C2;1 C3;4 P1;2;3 P2;1;4 : To further showcase the necessity for optimality, we compare the quantum cost of the synthesized circuits between CNP library using our algorithm and CNT library using Maslov’s template-based algorithm [17, 18]. The results are listed in Table 4. Notice that the quantum cost using CNP is less than or equal to the quantum cost using CNT if we compare the sum of gate costs. Although the two methods are using different binary reversible gates (CNP vs. CNT), there are prior published method [14, 19] to measure the quantum cost of individual binary reversible gates. Hence, we use the quantum cost as the comparison basis for the two methods. The quantum cost of our CNP implementation is computed by adding the quantum cost of each binary reversible gate. The quantum cost of the CNT implementation is also computed by the Maslov’s software, which is related to the quantum cost of each binary reversible gate (with optimization). Notice that it is possible that certain gate patterns for the CNP and CNT implementations may have quantum cost better than the sum of these gate costs. For example, a few gates arranged together may


Page 8 of 9

G. YANG TABLE 4. Quantum cost comparison.

Input

CNP using EBDS

CNT [17, 18]

g1 g2 g3 g4 g5 g6 g7 g8 g9 g10 g11 g12 f1 f2

11 12 12 18 19 19 20 23 20 23 30 34 24 30

12 17 15 22 25 25 24 27 26 29 49 55 30 49

be implemented using a special quantum circuit [14] that has lower quantum cost than the sum of each gate’s quantum cost. Such possibility exists for both gate libraries. To simplify the comparison, we use the sum of gate costs in this table. The enhanced bi-direction approach cannot be used directly in the synthesis of reversible functions with five or more bits, but we can first use the approach of Maslov et al. [17, 18] through CNT library, and then divide the synthesis result into some blocks, such that there are at most 4 bits related to the changed output bits in each block. For each block, we use our enhanced bi-direction approach and CNP library to optimize the cost. The following example shows the benefit. The function 2 or 5 can be realized using 7 bits a, b, c, d, e, f, g through CNT library as: Tof,a,b * Cb,a * Tog,c,f * Tof,b,c * Cc,b * Tog,d,f * Tof,c,d * Cd,c * Tog,e,f * Tof,d,e * Ce,d * Cf,g (The meaning of Toffli gate Toa,b,c is the first bit a is controlled by the following two bits b and c.). There are seven Toffoli gates and five CNOT gates. The total cost is 7 5 þ 5 1 ¼ 40. Using CNP library and EBDS method, we can optimize the result as: Pef,a,b * Ca,b * Cb,a * Peg,c,f * Cc,f * Pef,c,b * Peg,d,f * Cd,f * Pef,d,c * Pef,g,e * Cf,g * Cf,e * Pef,e,d. There are seven Peres gates and six CNOT gates. The cost is 7 4 þ 6 1 ¼ 34.

ET AL.

Proof. We know that all even n-bit reversible circuits can be realized by 1-bit NOT gates, 2-bit CNOT gates and 3-bit Toffoli gates [8]. Toffoli gate can be realized by a Peres gate cascaded with a CNOT gate. Therefore, all even n-bit reversible circuits can be realized by CNP. A Now we deal with any logic circuit f, a non-reversible logic circuit or odd reversible circuit. THEOREM 6.1. Suppose that f is a non-reversible logic circuit. There are t inputs B1, . . . , Bt, s outputs P1, . . . , Ps (s t). The truth table of f is M. In M, there are at most r rows that have the same outputs. Then by adding dlog 2re 2 (t 2 s) (if dlog 2re . (t 2 s), else 0) inputs with constant zero, f can be realized by CNP with dlog 2re garbage outputs. Proof. Consider the case: t ¼ s ¼ n, r ¼ 2. We add 1 bit for input and output, and consider a new (n þ 1)-bit circuit with truth table (Table 5). h i The column vector CC12 is a binary vector, and the number of zeros and the number of ones are the same. We set N1h¼ M,i

and construct C1, C2 and N2 such that the rows of matrix

is an even permutation of the rows of the input matrix. Then the (n þ 1)-bit circuit is an even reversible circuit. According to Lemma 6.1, this circuit can be realized by CNP. Therefore, by adding a zero constant bit, the circuit f can be realized by CNP. Other cases can be dealt with in similar fashion. A Note, the transformation in Theorem 6.1 is not unique and that the choice made will have significant effect on the cost of the resulting circuit since one is embedding a non-reversible problem into one of many choices of a reversible one. Even fully optimal synthesis of the reversible function will not guarantee an optimal implementation of the desired non-reversible function. TABLE 5. Truth table of added input/output of circuit f. Input Bnþ1

6. SYNTHESIS OF NON-REVERSIBLE CIRCUITS OR EVEN REVERSIBLE CIRCUITS This section shows how to realize any n-bit non-reversible logic circuit or even reversible circuit (the corresponding permutation is even) using a library of 1-bit NOT gates, 2-bit CNOT gates and 3-bit Peres gates. LEMMA 6.1. All even n-bit reversible circuits can be realized by CNP.

C1 N1 C2 N2

0 .. . 0

1 .. . 1


Output Bn

...

0 .. .. . . 1

0 .. .. . . 1

B1

Pnþ1

0 .. .

C1

N1

C1

N1

Pn

...

1

0 .. . 1

P1

BI-DIRECTIONAL SYNTHESIS 7.

CONCLUSION

We presented an efficient bi-direction synthesis approach to the synthesis of 4-bit even binary reversible circuits using CNP library. Our proposed method outperforms all existing search-based methods in both speed and scalability. For the first time, we are able to find exact minimal solutions to some subset of four-variable functions. The number of circuits that can be optimally synthesized is increased over 5 106 times by our method comparing to standard breadth-first search algorithms. We synthesized 1000 random 4-bit reversible circuits. For five or more bit reversible circuits, we can first use the approach of Maslov et al. through CNT library, and then divide the synthesis result into some blocks. In each block, there are at most 4 bits related to the changed output bits. For each block, we use the enhanced bi-direction approach and CNP library to optimize the cost. The statistics analysis supports our estimation. The further experimental results demonstrated the effectiveness of our approach. The method can also be used to invent new reversible 4-bit gates to be next realized in quantum [7] and used in hierarchical heuristic-driven synthesis [4].

ACKNOWLEDGEMENTS The authors would like to thank Dmitri Maslov for providing his synthesis software for comparison purposes.

REFERENCES [1] Toffoli, T. (1980) Reversible computing. Technical Report MIT/LCS/TM-151. MIT Lab for Computer Science. [2] Song, X., Yang, G., Perkowski, M. and Wang, Y. (2006) Algebraic characterization of reversible logic gates. Theory Comput. Syst., 39, 311 –319. [3] De Vos, A., Raa, B. and Storme, L. (2002) Generating the group of reversible logic gates. J. Phy. A: Math. Gen., 35, 7063–7078. [4] Perkowski, M., Lukac, M., Pivtoraiko, M., Kerntopf, P. and Folgheraiter, M. (2003) A hierarchical approach to computer aided design of quantum circuits. Proc. sixth Int. Symp. Representations and Methodology of Future Computing Technology (RM2003), Trier, Germany. [5] Bullock, S.S. and Markov, I.L. (2003) An arbitrary two-qubit computation in 23 elementary gates. Proc. Design Automation Conf., Anaheim, CA, pp. 324 –329. ACM/IEEE.

Page 9 of 9

[6] Miller, D.M., Maslov, D. and Dueck, G.W. (2003) A transformation based algorithm for reversible logic synthesis. Proc. Design Automation Conf., Anaheim, CA, pp. 318–323. ACM/IEEE. [7] Yang, G., Hung, W.N.N., Song, X. and Perkowski, M. (2005) Majority-based reversible logic gates. Theor. Comput. Sci., 334, 295–274. [8] Shende, V.V., Prasad, A.K., Markov, I.L. and Hayes, J.P. (2003) Synthesis of reversible logic gates. IEEE Trans. CAD, 22, 710– 722. [9] Yang, G., Song, X., Hung, W.N.N. and Perkowski, M. (2005) Fast synthesis of exact minimal reversible circuits using group theory. Asia and South Pacific Design Automation Conf. (ASP-DAC), Shanghai, China, pp. 1002– 1005. ACM/IEEE. [10] Scho¨nert, M. et al. (1995) GAP—Groups, Algorithms, and Programming (5th edn). Lehrstuhl, D and fu¨r, Mathematik Rheinisch Westfa¨lische Technische Hoch-schule. Aachen, Germany. [11] Storme, L., De Vos, A. and Jacobs, G. (1999) Group theoretical aspects of reversible logic gates. J. Univers. Comput. Sci., 5, 307–321. [12] Korf, R. (1999) Artificial intelligence search algorithms. Algorithms and Theory of Computation Handbook. CRC Press, Boca Raton, FL. [13] Dixon, J.D. and Mortimer, B. (1996) Permutation Groups. Springer, New York. [14] Hung, W.N.N., Song, X., Yang, G., Yang, J. and Perkowski, M. (2006) Optimal synthesis of multiple output boolean functions using a set of quantum gates by symbolic reachability analysis. IEEE Trans. Comput. Aided Des., 25, 1652–1663. [15] Alder, H.L. and Roessler, E.B. (1972) Introduction to Probability and Statistics. W.H. Freeman and Company, San Francisco. [16] Bailey, D.E. (1971) Probability and Statistics. John Weily and Sons Inc., New York, London. [17] Maslov, D., Young, C., Miller, D.M. and Dueck, G.W. (2005) Quantum circuit simplification using templates. Design Automation and Test in Europe (DATE), Munich, Germany, pp. 1208–1213. IEEE. [18] Maslov, D., Miller, D.M. and Dueck, G.W. (2006). Techniques for the synthesis of reversible toffoli networks. eprint arXiv:quant-ph/0607166. [19] Hung, W.N.N., Song, X., Yang, G., Yang, J. and Perkowski, M. (2004) Quantum logic synthesis by symbolic reachability analysis. Proc. Design Automation Conf., San Diego, CA, pp. 838–841. ACM/IEEE.