Hierarchical Watermarking in IC Design

Edoardo Charbon
Cadence Design Systems Inc., San Jose, California, 95134

Abstract A formalization of the watermarking problem is presented and basic algorithms for its generation and detection at several abstraction levels are proposed. The concepts of robustness against forgery and theft tracing are analyzed in light of the proposed algorithms.

1 Introduction

Watermarking is a well-known technique traditionally used in banknotes and other documents to discourage counterfeiting. It usually consists of semi-transparent symbols embedded in paper. Recently, similar concepts have been applied to digital audio-visual Intellectual Properties (IPs) [1, 2, 3]. With the explosion of electronic IP commerce, the EDA industry will soon find itself in a situation similar to that of the record industry. Unlike in the record industry, however, electronic IPs will be developed simultaneously at several levels of abstraction. It is therefore essential that every phase of conventional IC design flows be considered.

In this paper a method, called hierarchical watermarking, is proposed to independently process the multiple abstraction levels present in a design flow. First, this approach is robust, since the deletion of a watermark at a certain abstraction level leaves the watermarks at most other abstraction levels intact. Second, forgery can be traced to the source, since watermarks at the lowest abstraction levels are associated with the last “legal” IP buyers who ultimately caused the breach.

A complete IP protection scheme based on watermarking consists of two phases: watermark synthesis and watermark detection. The synthesis phase is fully characterized by (a) a set of algorithms translating design features onto a unique watermark, (b) the worst-case time required to forge and/or delete the watermark, and (c) the odds that a design carries an unintended watermark in part or in its totality. The detection phase is fully characterized by (d) the probability of a miss and (e) the probability of a false alarm. Typical specifications of a complete IP protection scheme could be: 2 10 30 10 6

In general, watermarks can be designed to be inherently redundant, in order to combat partial forging and/or deletion. Redundancy is designed to boost the confidence in positive watermark identification, by reducing the probabilities of a miss and of a false alarm.

The paper is organized as follows. The problem is formulated in Section 2. Sections 3 and 4 present techniques for watermark synthesis and detection at various abstraction levels. Examples are shown in Section 5.

2 Problem Formulation

Let us fix a finite alphabet $\Sigma$, e.g. $\Sigma = \{0, 1\}$, and let $\Sigma^*$ be the set of all strings in the alphabet. Define the watermark set $W \subset \Sigma^*$ as a collection of strings over $\Sigma$ that identify the design. Let us now select a string $k \in \Sigma^*$, the key of the identification process; then algorithm $A$ is defined as

$$A : (s, k) \mapsto w, \quad w \in W \qquad (1)$$

where $s \in \Sigma^*$ is a compact representation, or signature, of the design and its watermark.

Consider a certain abstraction level and define $F$ as the set of all design features for that abstraction and $D$ as the set of all design implementations over $F$, with $d \in D$ one such implementation. In $d$, every object is characterized by a set of parameters defining its location, orientation, geometric and technological parameters, etc. Assume there exists a compact representation $s \in \Sigma^*$ for the design. Let $M$ be the mapping of the design features onto such a representation:

$$M : D \to \Sigma^*, \quad s = M(d) \qquad (2)$$

Note that $M$ is an abstraction-dependent, possibly lossy mapping, while $A$, and consequently $W$, are general in nature. Let us define $T$ as a mapping which transforms implementation $d$ onto a new implementation $d'$:

$$T : D \to D, \quad d' = T(d) \qquad (3)$$

If $M(T(d)) = M(d)$, then $T$ is said to be signature-invariant. If $T$ is signature-invariant over a subset of $D$, it is partially signature-invariant.

3 Watermarking Abstractions

3.1 Topology

Most design implementations are associated with a topology. A topology describes the relative position and orientation of any pair of objects $(o_1, o_2)$. A mapping which does not modify the topology of a design is said to be topology-invariant. The topological information of a design can be uniquely mapped onto a sequence of symbols, as suggested in [4]. Call such a sequence the topological signature.

Consider the terminals of the design in Figure 1a. Figure 1b shows the boundaries discriminating the spatial relations around a single terminal, i.e. left, right, top, bottom, in terms of two 45° cones. The final sequence is computed by enumerating the literals associated with the cones of all the terminals as they result from the superimposition in Figure 1c. For clarity, the enumerated left-hand side (lhs) and right-hand side (rhs) literals are separated by a “;”. In this case, the topological signature is (a c b e d; e c a d b). Notice that the above code enables the reconstruction of the original topology; however, it does not hold any information about the exact location of the objects, and is therefore unaffected by topology-invariant mappings. In other words, in this case a topology-invariant mapping is also signature-invariant.

In the case of repetitive structures, the topology can usually be represented more compactly. Consider for instance the row-based layout in Figure 2. The topology can be represented here as a single sequence of symbols, one for each standard cell type. Consequently, in this case, the signature is 1 2 3 1 1 3 3 2 1 1 3 1 2 2 2 3 3 3 3 2 1 1 3 3 1 3 2 1 2 1. Cell connectivity can be coded in a similar fashion.
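The idea of a location-free topological signature can be illustrated with a minimal Python sketch. It assumes point-like objects, a y axis pointing upwards, and a sequence-pair style convention in the spirit of [4]; the function name `topological_signature` and the sample coordinates are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]

def topological_signature(objs: Dict[str, Point]) -> Tuple[List[str], List[str]]:
    """Return the (lhs, rhs) literal orders of point-like objects.

    Assumed convention (45-degree cones, y axis pointing up):
      p before q in both lists          -> p lies in the left cone of q
      p before q in lhs, after q in rhs -> p lies in the top cone of q
    """
    # Rotating the axes by 45 degrees turns cone membership into plain ordering.
    lhs = sorted(objs, key=lambda name: objs[name][0] - objs[name][1])
    rhs = sorted(objs, key=lambda name: objs[name][0] + objs[name][1])
    return lhs, rhs

# Hypothetical layout: b is to the right of a, c is above a.
layout = {"a": (0, 0), "b": (10, 1), "c": (1, 12)}
# Small translation of b that keeps every cone relation (topology-invariant).
shifted = {"a": (0, 0), "b": (9, 0), "c": (1, 12)}
# Large translation of b to the left of a (topology changes).
moved = {"a": (0, 0), "b": (-10, 0), "c": (1, 12)}

print(topological_signature(layout))
print(topological_signature(shifted) == topological_signature(layout))
print(topological_signature(moved) == topological_signature(layout))
```

In this sketch the small translation leaves the signature unchanged, while moving `b` across `a` reorders the literals, which is the behaviour the text attributes to topology-invariant mappings.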

Figure 2: Watermark in instantiation schemes

3.2 Networks and Unstructured Connectivity

Consider the network of Figure 3a. If non-reconvergent, such a net can be mapped onto a tree (see Figure 3b), whose nodes are terminals, bends (B1) and joints (J1, J2) and whose arcs are the Manhattan segments connecting them. The tree can then be mapped onto an ad hoc topological signature. Alternatively, the network can be partitioned into vertical and horizontal segments. The center of each segment, marked with an “X” in Figure 3c, is labeled in increasing order based on its relative position (from top to bottom, from left to right). The centers are then mapped onto a topological signature. The signature of the example in Figure 3c is (1 3 5 4 6 2 7 8 9; 9 7 6 1 5 3 8 4 2).

Designs at RTL and gate level can be represented in terms of a graph $G$, whose nodes represent general blocks, or single gates, as well as nets, and whose (directed) edges define connectivity. Let $B$ be the set of all blocks or gates in $G$ and $N$ the set of all nets. For each net $n \in N$ there exists a set $O_n$ of edges connected to an output, a set $I_n$ of edges leading to an input, and a set $Z_n$ of edges connected to a high-impedance or pass-transistor gate. Note that in our model exactly one edge can be connected to an output, i.e. $|O_n| = 1$; this condition is, however, not necessary. The total pin number $|n|$ and the type and port of the gates connected by $n$ are necessary but not sufficient properties to uniquely identify the net. However, one can impose a set of constraints on the sets $O_n$, $I_n$ and $Z_n$ for each net, so as to make these properties define the net uniquely for all practical purposes.

As an illustration, consider the gate-level circuit in Figure 4a. The corresponding connectivity graph is shown in Figure 4b. Define the subset of nets $n \in N$ for which $|n|$ is prime. For each net in this subset, the constraints (4) prescribe, for each of the sets $O_n$, $I_n$ and $Z_n$, a number of gates of a given type, where the numbers are net-size-dependent parameters, generated using, for example, a parametrized pseudorandom sequence determined by the key. The signature is the set of Equations (4) compactly written for each net. The resulting signature for net in Figure 4 is: ; 3; 0 .

Figure 1: (a) Original design topology; (b) Topology cones; (c) Construction of topological signature

Figure 4: (a) Gate level circuit; (b) Connectivity graph

Table 1: Topological constraints vs. topological signatures

topological constraint            effect on lhs and rhs
horizontal alignment/symmetry     same literal order
vertical alignment/symmetry       reversed literal order
firm group                        invariant literal order
device parameters                 -
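The selection of prime-size nets and the per-net bookkeeping described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the netlist encoding (a dictionary of `(gate_type, role)` pins), the role names `"out"`, `"in"`, `"hiz"`, and the shape of the resulting tuple are all hypothetical and are not the paper's Equations (4).

```python
from collections import Counter
from typing import Dict, List, Tuple

Pin = Tuple[str, str]  # (gate type, role), role in {"out", "in", "hiz"} (assumed encoding)

def is_prime(n: int) -> bool:
    """Trial-division primality test, sufficient for small net sizes."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def net_signature(pins: List[Pin]) -> tuple:
    """Count gate types separately for output, input and high-impedance edges."""
    by_role = {"out": Counter(), "in": Counter(), "hiz": Counter()}
    for gate_type, role in pins:
        by_role[role][gate_type] += 1
    # Sort the counts so that the signature does not depend on pin order.
    return tuple(tuple(sorted(by_role[r].items())) for r in ("out", "in", "hiz"))

def watermark_nets(netlist: Dict[str, List[Pin]]) -> Dict[str, tuple]:
    """Keep only nets whose pin count is prime and attach a signature to each."""
    return {name: net_signature(pins)
            for name, pins in netlist.items() if is_prime(len(pins))}

# Hypothetical two-net example: n1 has 3 pins (prime), n2 has 4 pins (not prime).
netlist = {
    "n1": [("NOT", "out"), ("NAND2", "in"), ("NAND2", "in")],
    "n2": [("NAND2", "out"), ("NOR3", "in"), ("NOT", "in"), ("NOT", "in")],
}
selected = watermark_nets(netlist)
print(selected)
```

Only `n1` survives the selection here; a key-dependent pseudorandom rule could replace `is_prime` without changing the rest of the sketch.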

3.3 Use of Topological Constraints

Topological constraints can be viewed as restrictions to the degrees of freedom of a circuit topology, thus reducing the implementation space to a subset of the original one. A mapping which preserves all the constraints in a design is said to be constraint-invariant. Consider the following topological constraints.

Alignment: the set of all objects constrained to share the same x- or y-coordinate of the center/edge reference and the same orientation.

Symmetry: the set of all object pairs $(o_1, o_2)$ symmetric to a given axis, i.e. aligned perpendicularly to the axis and with mirrored orientations.

Firm Groups: the set of all objects with a fixed relative position and orientation with respect to each other.

Every constraint is associated with a certain topology which will necessarily be present in the final topology of the entire circuit. The effects of the above constraints on the signature's lhs and rhs are summarized in Table 1. Due to the loss of orientation information, symmetry and alignment are essentially identical, except for the necessary presence of the axis. Firm groups do not restrict lhs and rhs, but the respective literal order is invariant. Suppose now that the configuration of Figure 1a is constrained as follows:

1: horizontal align: top_edge(a), top_edge(b);
2: vertical align: left_edge(a), left_edge(c);
3: horizontal symmetry wrt. axis: center(c), center(d); axis = left_edge( ).

Then, the following partial topological signatures are necessarily true:

[1.1] (a,b; a,b) or [1.2] (b,a; b,a)
[2.1] (a,c; c,a) or [2.2] (c,a; a,c)
[3.1] (c,axis,d; c,axis,d) or [3.2] (d,axis,c; d,axis,c)

One can recognize that partial signatures [1.1], [2.1] and [3.1] are included in the original signature (a c b e d; e c a d b): a precedes b on both sides, a precedes c on the lhs while c precedes a on the rhs, and c precedes d on both sides.

Furthermore, partial signatures [1.2], [2.2] and [3.2], or any combination of them, also satisfy the original constraints, hence they can be used as alternative signatures. This is useful to create different watermarks for different customers using the same design.
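The inclusion of partial signatures in the full signature can be checked mechanically. The following Python sketch treats each side of a signature as a string of literals and tests order-preserving inclusion; the helper names are hypothetical, and the axis literal of constraint 3 is left out because its symbol is not available.

```python
def is_subsequence(partial: str, full: str) -> bool:
    """True if the literals of `partial` appear in `full` in the same order."""
    remaining = iter(full)
    # `sym in remaining` consumes the iterator up to the first match.
    return all(sym in remaining for sym in partial)

def contains_partial(signature, partial) -> bool:
    """Check the lhs and the rhs of a partial signature independently."""
    (lhs, rhs), (p_lhs, p_rhs) = signature, partial
    return is_subsequence(p_lhs, lhs) and is_subsequence(p_rhs, rhs)

original = ("acbed", "ecadb")  # signature of Figure 1 as given in the text

partials = {
    "1.1": ("ab", "ab"), "1.2": ("ba", "ba"),
    "2.1": ("ac", "ca"), "2.2": ("ca", "ac"),
    "3.1": ("cd", "cd"), "3.2": ("dc", "dc"),  # axis literal omitted
}
included = {tag: contains_partial(original, p) for tag, p in partials.items()}
print(included)
```

Under these assumptions only [1.1], [2.1] and [3.1] are reported as included, in agreement with the statement above.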

Figure 3: (a) Network; (b) Tree representation; (c) Partition in Manhattan segments

3.4 Redundant and Dead DNA Sequences

Let $L$ be the set of all possible blocks in a given library. Every block $b \in L$ has a finite set of implementations $D_b$. Applying a given constraint set to each block $b$, $D_b$ is reduced to a subset $D_b' \subseteq D_b$. Assume that for each block $b$ there exists a mapping $M$ defined as in Equation (2). Then, for each block one can construct a set $S_b$ of signatures by applying $M$ to all the implementations in $D_b'$. The multiplicity of signatures for each block can be exploited when complex circuits are designed using library $L$. One or more phases of physical design can be modified so as to use a block instantiation sequence corresponding to a secret key.

As an illustration, consider the circuit in Figure 2. Suppose that $L = \{b_1, b_2, b_3\}$ and the circuit is implemented using $n_1 = 11$ instances of type 1, $n_2 = 8$ of type 2 and $n_3 = 11$ of type 3. Moreover, assume that each block type admits three signatures, $|S_{b_1}| = |S_{b_2}| = |S_{b_3}| = 3$. Then the instantiation sequence is

1 2 3 1 1 3 3 2 1 1 3 1 2 2 2 3 3 3 3 2 1 1 3 3 1 3 2 1 2 1    (5)

There exist $|S_{b_1}|^{n_1} |S_{b_2}|^{n_2} |S_{b_3}|^{n_3} = 3^{11} \cdot 3^{8} \cdot 3^{11} = 3^{30}$ possible signatures for the topology of Figure 2. Coding the entire layout topology into a single signature is obviously not a safe practice, since the removal or addition of even a single component would result in the destruction of the watermark. Alternatively, one can generate a watermark from the combination of large numbers of partial signatures, or genes, so that the swapping of unconstrained components does not affect the detection. One can think of the unused partial signatures in terms of dead DNA.

A similar technique can be used in the circuit of Figure 2. Here, one can pick partitions of the signature to be included in the watermark. A possible scheme consists of choosing partitions starting and ending in locations associated with prime numbers. Formally, algorithm $A$ is defined here as a $1 \times n$ matrix $[a_j]$ with

$$a_j = \begin{cases} 1 & \text{if } p_{2i-1} \le j \le p_{2i} \\ 0 & \text{if } p_{2i} < j < p_{2i+1} \end{cases} \qquad (6)$$

where $p_i$ is the $i$-th prime. Applied to the sequence (5), this yields the watermark (1 2; 3 1 1; 3 2 1 1 3; 2 2 2 3 3; 3 2 1 1 3; 2 1), which corresponds to the partitions [1, 2], [3, 5], [7, 11], [13, 17], [19, 23] and [29, 30], i.e. to counting 1 as the first entry of the prime sequence.

Prime numbers can easily be substituted with a key-dependent pseudorandom sequence.
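The prime-based selection of Equation (6) can be reproduced with a short Python sketch. Two assumptions are made and should be read as such: position 1 is treated as the first mark (this is inferred from the watermark partitions listed in the text, not stated by the paper), and a partition left open at the end of the sequence is closed at the last position. The function names are hypothetical.

```python
from typing import List, Tuple

def is_prime(n: int) -> bool:
    """Trial-division primality test, sufficient for short signatures."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def prime_partitions(signature: List[int]) -> Tuple[List[int], List[List[int]]]:
    """Return the 0/1 selection mask and the selected partitions (genes)."""
    n = len(signature)
    # Assumption: 1 counts as the first mark, followed by the primes up to n.
    marks = [1] + [p for p in range(2, n + 1) if is_prime(p)]
    if len(marks) % 2:
        marks.append(n)  # close an unfinished last partition at the end
    mask = [0] * n
    genes = []
    for start, end in zip(marks[0::2], marks[1::2]):
        genes.append(signature[start - 1:end])  # positions are 1-based
        for j in range(start, end + 1):
            mask[j - 1] = 1
    return mask, genes

# Instantiation sequence of Equation (5).
sequence = [1, 2, 3, 1, 1, 3, 3, 2, 1, 1, 3, 1, 2, 2, 2,
            3, 3, 3, 3, 2, 1, 1, 3, 3, 1, 3, 2, 1, 2, 1]
mask, genes = prime_partitions(sequence)
print(genes)
```

With this convention the selected genes coincide with the watermark given in the text. Replacing `marks` by a key-dependent pseudorandom list of positions would give the variant mentioned in the last sentence above.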

4 Watermark Detection

A watermark should be detectable in a design which has been deprived of any information other than geometries and connectivity. Hence, detection must be performed in two steps: topology and connectivity extraction, and signature identification. The first step is abstraction-dependent; the second is not, hence it is general in nature.

4.1 Topology and Connectivity Extraction

The layout is sliced into horizontal areas encompassing exactly one feature (features not used in the watermarking process are ignored), and every object is labeled and catalogued in order from left to right and from top to bottom. Call $O'$ the set of all scanned objects. Subsequently, a mapping is performed between every object $o' \in O'$ and the original ones in $O$, based on the set of all relevant properties of each original object, which is also part of the watermark. The mapping is defined as:

$$\mu : O' \to O \qquad (7)$$

Finally, the topological signature is extracted using the techniques of Section 3.1. Given a netlist or a schematic circuit, every net is identified and collected into bins by size. No bin is ignored, since some nets may have been tampered with, thus resulting in the augmentation of the edge count per net by one or more. Each net is associated with its signature, using the technique outlined in Section 3.2.

4.2 Signature Identification

If no manipulation has occurred, applying the original algorithm $A$ and key $k$ to the extracted signature $s'$ yields the extracted watermark $w'$, which will be identical to the original watermark $w$. Let us now assume that the design has been tampered with. Since $w$ was originally synthesized using algorithm $A$, only some sections of the signature were used to generate $w$; however, these sections may be shifted and/or partially scrambled. To cope with this problem, a technique known as genome search is used for identifying and counting the number of fragments in $s'$ which also appear in $w$. Let $G$ be the set of all the signature fragments present in the original watermark $w$.

    genome_search(s', G)
      foreach (g in G)
        m = best_match(s', g, length(g))
        overlap += overlap(m, g) / length(g)

Function best_match(s', g, l) selects the subsequence of s' of length l which best matches g. The matching criterion usually employed is the number of identical symbols. length(g) returns the length of g, and overlap(m, g) computes the number of identical symbols in m and g. This algorithm returns an estimate of the probability that the design in fact contains watermark $w$.
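A runnable Python sketch of the genome search is given below. It follows the pseudocode above with one addition that is an assumption of this sketch: the accumulated overlap is divided by the number of fragments so that the result lies between 0 and 1. Fragments are assumed to be non-empty, and ties in `best_match` are resolved in favour of the first window.

```python
from typing import List, Sequence

def overlap(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which the two sequences carry the same symbol."""
    return sum(x == y for x, y in zip(a, b))

def best_match(signature: Sequence[int], fragment: Sequence[int]) -> Sequence[int]:
    """Window of the signature, as long as the fragment, with the largest overlap."""
    k = len(fragment)
    windows = [signature[i:i + k] for i in range(max(1, len(signature) - k + 1))]
    return max(windows, key=lambda w: overlap(w, fragment))

def genome_search(signature: Sequence[int], fragments: List[Sequence[int]]) -> float:
    """Average fraction of each fragment recovered from the extracted signature."""
    if not fragments:
        return 0.0
    total = sum(overlap(best_match(signature, g), g) / len(g) for g in fragments)
    return total / len(fragments)  # normalisation is an assumption of this sketch

# Hypothetical fragments and signatures (one literal altered in `tampered`).
fragments = [[1, 2], [3, 1, 1]]
intact = [1, 2, 3, 1, 1, 3]
tampered = [1, 2, 3, 2, 1, 3]

print(genome_search(intact, fragments))
print(genome_search(tampered, fragments))
```

The exhaustive sliding window costs on the order of the signature length times the fragment length per fragment, which is adequate for illustration; the paper does not specify how best_match is implemented.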

4.3 Confidence Analysis

Topology: Let $n$ be the number of objects in the design. The total number of possible topologies is $(n!)^2$. Since neither the algorithm nor the key are known, the time required to forge the watermark must necessarily be proportional to this number. For instance, suppose that exploring an implementation requires 1 second and that $n = 10$; then the required time is about $10^{13}$ seconds. Concurrent topological constraints reduce this number by introducing restrictions to the lhs and rhs of the signature. For a given number of constrained objects, it is bounded from both sides as: !2 ! !. The lower bound represents the case when the topology is completely fixed by a single firm group on the constrained objects or by the combination of multiple constraints. The upper bound corresponds to the case when a restriction exists between lhs and rhs. The probability of a miss is $P_m = 0$ if the IP was not tampered with or a topology-invariant mapping was applied to it; otherwise three types of modifications can occur to a signature: (1) literal deletions, (2) literal additions and (3) literal swaps.

Let us assume that layout objects are included in the watermark with a fixed probability $P_i$, independently of their spatial location. Assume that an object must be translated by a minimum amount to cause a topological change, i.e. one or more literal swaps in the signature. Moreover, assume that in a watermark deletion attempt, each object is translated by a random amount. Let us model this amount as an independent random variable with common statistical properties for each object. Define $P_r$ as the probability of a literal swap in the signature due to this translation; $P_r$ can be derived from the geometries surrounding the object. Under these assumptions, the probability that a section of length $N$ in the watermark mutates is P

Pr Pi

j

1

Pr 1

Pi

N j

(8)

j 1

Table 2: Constraints in “latch”

type                 # constraints    # objects (avg.)
alignment            9                2.8
symmetry             4                2.0
firm group           4                4.0
device parameters    21               1.0

Suppose that, in order to identify a watermark, at most $m$ of all symbols in it can be corrupted. Then the counterfeiting is successful, i.e. identification is missed, with probability $P_m$. For instance, for $N = 10$, $P_r = 0.5$, $P_i = 0.01$ and $m = 1$, $P_m = 1.7 \cdot 10^{-5}$. In the case of repetitive structures, the number of possible topologies reduces to $n!$, since only one dimension is used to produce the signature. This number cannot be reduced by applying the constraints described in Section 3.3. Moving an object always implies a change in the signature. Given the same assumptions as above, the mutation probability is computed as in Equation (8).

Connectivity: Assuming that every node has exactly one arriving and one leaving edge, the number of possible graphs is Ng

j

2

1

1

(9)

j 2

For instance, suppose 100, 40, then 4 54 10 16. Connectivity may be altered in the following ways: (1) edge augmentation, (2) net partitioning and (3) net consolidation. Manipulation (1) adds connectivity to dead or unused nodes, or even extra circles in graph . Assuming that dead and unused nodes can be detected as extraneous components to the original set and that circles can be found in at relative low cost, this type of counterfeiting can be eliminated as a pre-processing step. Manipulations (2) and (3) may be difficult to perform at no cost of performance, since they imply adding/removing extra components (a buffer, for example) in possibly critical nets. The effect is the same in both manipulations since new nets are formed from the original ones. For instance, net in Figure 4b can be partitioned in: , connecting with and , connecting with through a buffer. In general, a net connects one output with 1 inputs or highimpedance gates. Assuming that a -edge net is part of the watermark with probability , then the probability that a partition in an edge net is responsible for a mutation in the watermark is L

1

P

Pi j 1 Pi j 1

Pi

j Pi

j

4 (10)

j 1

where 1 2 if is odd, 2 otherwise. is the probability that a section of length cut from a segment of length are identical. Suppose that a subset of nets in of a given size is partitioned, then probability . Suppose 4 Pi j z Pi j Pi 10 3 j y z, then Pm P4 6 10 6 .

5 Examples

The proposed watermarking techniques have been applied to a number of examples at three abstraction levels. Consider the library cell “latch”. It consists of 80 objects, 50 of which are instantiated, 12 nets, 4 I/O pins and 4 P/G pins. Table 2 lists the constraints imposed on the design. The layout was performed using the DEVICE-LEVEL EDITOR in the VIRTUOSO environment (see Figure 5a). Note that the odds of creating an identical watermark by chance, or the probability of a false alarm, are at most $9 \cdot 10^{-81}$. Subsequently, the objects in the design were translated at random, with a given probability, by a random amount which caused a literal change with probability $P_r$. A sample of these transformations is shown in Figure 5b. In this example, the translation was chosen so as to result in $P_r = 0.5$, and the watermark included 1% of the literals present in the signature, i.e. $P_i = 10^{-2}$. The extraction phase was performed as described in Section 4. The rate of successful forgery is listed below as a function of the size of the genomes considered during extraction.

genome size    rate of successful forgery
1              $2.99 \cdot 10^{-25}$
2              $4.19 \cdot 10^{-25}$
3              $4.49 \cdot 10^{-25}$
4              $4.55 \cdot 10^{-25}$

Figure 5: (a) Layout carrying watermark; (b) Tampered layout

Consider now design “c3540” from the ISCAS 85 benchmark list, consisting of 1304 instances and 72 I/O pins. The connectivity of the design was first treated so that the following constraints be satisfied NOT

if 1 : NOT with 1 if 3 : NOT 2 NAND2 with 3 if 5 : 2 NAND2 3 NOR3 with 5 don’t care, otherwise

where the net size is as defined in Section 3.2, NAND2 represents a two-port NAND and NOR3 a three-port NOR. The watermark (A) was then constructed using the techniques outlined in Section 3.2. The circuit was mapped to a SCMOS technology and laid out using TIMBERWOLFSC-4.1. Note that watermarked latches (B) had been added at all I/O ports and a topological watermark (C) was added using a scheme identical to (6). As expected, the circuit and the corresponding three watermarks were successfully extracted. The circuit was then re-laid out by re-running the non-deterministic place-and-route tool, thus simulating engineering changes or technology migration. Watermarks (A) and (B) were still intact, while (C) had been destroyed. Had one tried to tamper with the connectivity watermark (A) by partitioning one of the 5-edge nets into two 3-edge nets, then the odds of a mutation in the watermark would have been Pm 1 n P5 12 10 8 3 5 for n 10 Pi 3 10 Pi 2 Pi 4 0.

6 Conclusions

The concept of hierarchical watermarking was presented for IC design. Techniques have been proposed for its synthesis and detection, and to quantify its resilience to engineering modifications.

References

[1] M. D. Swanson, B. Zhu and A. H. Tewfik, “Transparent Robust Image Watermarking”, in Proc. IEEE International Conference on Image Processing, volume 3, pp. 211–214, September 1996.
[2] L. Boney, A. H. Tewfik and K. N. Hamdy, “Digital Watermarks for Audio Signals”, in Proc. IEEE International Conference on Multimedia Computing and Systems, pp. 473–480, June 1996.
[3] M. Eisenhart, “Digital Watermarks”, MicroTimes, n. 171, pp. 98–114, November 1997.
[4] H. Murata, K. Fujiyoshi, S. Nakatake and Y. Kajitani, “Rectangle-Packing-Based Module Placement”, IEEE Trans. on Computer Aided Design, vol. CAD-15, n. 12, pp. 1518–1524, December 1997.