Des. Codes Cryptogr. DOI 10.1007/s10623-017-0420-y

New upper bounds for parent-identifying codes and traceability codes

Chong Shangguan$^{1,2}$ · Jingxue Ma$^{2}$ · Gennian Ge$^{1}$

Received: 5 June 2017 / Revised: 14 September 2017 / Accepted: 22 September 2017 © Springer Science+Business Media, LLC 2017

Abstract In the last two decades, parent-identifying codes and traceability codes have been introduced to protect copyrighted digital data from unauthorized use. They have important applications in scenarios such as digital fingerprinting and broadcast encryption. A major open problem in this research area is to determine upper bounds on the cardinalities of these codes, and this paper focuses on that theme. Consider a code of length $N$ defined over an alphabet of size $q$. Let $M_{IPPC}(N, q, t)$ and $M_{TA}(N, q, t)$ denote the maximal cardinalities of $t$-parent-identifying codes and $t$-traceability codes, respectively, where $t$ is known as the strength of the codes. We show that $M_{IPPC}(N, q, t) \le r q^{\lceil N/(v-1) \rceil} + (v-1-r) q^{\lfloor N/(v-1) \rfloor}$, where $v = \lfloor (t/2+1)^2 \rfloor$, $0 \le r \le v-2$ and $N \equiv r \pmod{v-1}$. This new bound improves two previously known bounds, one due to Blackburn and one due to Alon and Stav. On the other hand, $M_{TA}(N, q, t)$ is still unknown for almost all $t$. In 2010, Blackburn, Etzion and Ng asked whether $M_{TA}(N, q, t) \le c q^{\lceil N/t^2 \rceil}$, where $c$ is a constant depending only on $N$, and they proved this bound for $t = 2$, the only case previously known. Using combinatorial counting arguments, we prove this bound for $t = 3$. This is the first non-trivial upper bound in the literature for traceability codes of strength three.

Keywords Traitor tracing schemes · Parent-identifying codes · Traceability codes

Communicated by C. J. Colbourn.


Gennian Ge [email protected] Chong Shangguan [email protected] Jingxue Ma [email protected]

1 School of Mathematical Sciences, Capital Normal University, Beijing 100048, China

2 School of Mathematical Sciences, Zhejiang University, Hangzhou 310027, Zhejiang, China


C. Shangguan et al.

Mathematics Subject Classification 68R05 · 97K20 · 94B25

1 Introduction

The concept of a traitor tracing scheme was introduced in 1994 by Chor et al. [9] as a method to discourage piracy. Traitor tracing schemes are useful in scenarios such as digital fingerprinting and broadcast encryption, where the distributed content should be accessible only to authorized users. In [16], Stinson et al. discussed in detail four types of traitor tracing schemes, namely frameproof codes, secure frameproof codes, parent-identifying codes and traceability codes. These codes have different tracing capabilities and are used for different purposes. For example, $t$-frameproof codes can be used to prevent a coalition of at most $t$ traitors from framing a legitimate user outside the coalition. However, they are widely considered to have no traceability for generic digital fingerprinting (it is worth mentioning that Cheng and Miao [8] showed that frameproof codes have very good traceability for multimedia fingerprinting). Therefore, in order to trace the origin of pirated digital content, parent-identifying codes and traceability codes were introduced, with different tracing algorithms. In recent years, the properties and applications of these codes have been studied extensively; see for instance [1,2,4–10,14,16].

There are two major problems in this research area. The first is to design good detection algorithms that identify the traitors as quickly as possible. For example, Silverberg, Staddon and Walker [15] applied list-decoding algorithms to traceability codes, and Barg and Kabatiansky [3] introduced a similar strategy for parent-identifying codes. The second major problem is to determine bounds on the cardinalities of these codes. Many papers have addressed this question; see for example [2,4–6,14,16,17]. This paper also focuses on this theme.

Consider a code $\mathcal{C} \subseteq F^N$, where $F$ denotes an alphabet of size $q$. Without loss of generality, we can set $F = \{0, 1, \ldots, q-1\}$.
We call $\mathcal{C}$ an $(N, n, q)$ code if $|\mathcal{C}| = n$. To save space, the formal definitions of $t$-frameproof codes ($t$-FP codes for short), $t$-parent-identifying codes ($t$-IPP codes for short, where IPP abbreviates "identifiable parent property" [11]) and $t$-traceability codes ($t$-TA codes for short) are postponed to Sect. 2. With the code length $N$, the alphabet size $q$ and the strength $t$ fixed, we use $M_{FPC}(N, q, t)$, $M_{IPPC}(N, q, t)$ and $M_{TA}(N, q, t)$ to denote the maximal cardinalities of FP codes, IPP codes and TA codes, respectively. In what follows, we collect the previously known upper bounds for these codes. The best current bound for FP codes is due to Blackburn [4], who proved the following theorem:

Theorem 1.1 [4] Let $r \in \{0, \ldots, t-1\}$ be an integer such that $r \equiv N \pmod{t}$. Then it holds that
$$M_{FPC}(N, q, t) \le \max\{q^{\lceil N/t \rceil},\ r(q^{\lceil N/t \rceil} - 1) + (t-r)(q^{\lfloor N/t \rfloor} - 1)\}.$$

Note that the constants $r$ and $t-r$ can be reduced in many cases. For example, Corollary 9 of [4] gives a slightly better bound with an improved constant in front of $q^{\lceil N/t \rceil}$, and relates this constant to a question in the theory of set systems. When $r = 1$, [17] gives a much cleaner bound, namely $M_{FPC}(N, q, t) \le q^{\lceil N/t \rceil}$. For IPP codes, Alon and Stav proved the following bound:

Theorem 1.2 [2] Denote $v = \lfloor (t/2+1)^2 \rfloor$. Then it holds that $M_{IPPC}(N, q, t) \le (v-1) q^{\lceil N/(v-1) \rceil}$.

Almost at the same time, a slightly worse bound, $M_{IPPC}(N, q, t) \le \frac{1}{2} v(v-1) q^{\lceil N/(v-1) \rceil}$, was proved in [5].


Bounds for IPP codes and TA codes

Compared with FP codes and IPP codes, upper bounds for TA codes are much harder to determine. Apart from the trivial bounds deduced from FP codes and IPP codes, the only known upper bound for TA codes is the one given by Blackburn et al. in [6]:

Theorem 1.3 [6] $M_{TA}(N, q, 2) \le c q^{\lceil N/4 \rceil}$, where $c$ is a constant depending only on $N$.

Unfortunately, this upper bound is not as good as one might hope, since the constant $c$ is very large (larger than $N\binom{N}{\lceil N/4 \rceil}$; one may compare it with the constants appearing in Theorems 1.1 and 1.2). A cleaner bound $M_{TA}(4, q, 2) \le 4q$ was later obtained in [12], but only for 2-TA codes of length 4. The authors of [6] also posed the following question:

Question 1.4 [6] Does there exist a constant $c$, depending only on the code length $N$, such that $M_{TA}(N, q, t) \le c q^{\lceil N/t^2 \rceil}$?

Note that for sufficiently large $q$, the bounds stated in Theorems 1.1 and 1.2 and conjectured in Question 1.4 are quite strong. For example, for sufficiently large $q$ there are constructions (see, for example, [16]) from Reed–Solomon codes (which exist when $q$ is a prime power no less than $N$) showing that $M_{FPC}(N, q, t) \ge q^{\lceil N/t \rceil}$ and $M_{TA}(N, q, t) \ge q^{\lceil N/t^2 \rceil}$. Combining these constructions with the bounds in Theorem 1.1 and Question 1.4, one can conclude that $\lim_{q \to \infty} \log_q M_{FPC}(N, q, t) = \lceil N/t \rceil$ and $\lim_{q \to \infty} \log_q M_{TA}(N, q, t) = \lceil N/t^2 \rceil$ (the latter if there is a positive answer to Question 1.4). Moreover, in [13] the first and the third authors of this paper showed that $\lim_{q \to \infty} M_{IPPC}(v-1, q, t)/q = v-1$, which implies that Theorem 1.2 is asymptotically optimal for $N = v-1$.

The motivation of this paper is to present improved upper bounds for IPP codes and TA codes. For IPP codes, Theorem 1.7 of [13] implies that the constant $v-1$ in Theorem 1.2 is tight for $N = v-1$. However, the next theorem shows that in most cases this constant is not tight.

Theorem 1.5 Denote $v = \lfloor (t/2+1)^2 \rfloor$ and let $0 \le r \le v-2$ be the integer such that $N \equiv r \pmod{v-1}$. Then it holds that
$$M_{IPPC}(N, q, t) \le r q^{\lceil N/(v-1) \rceil} + (v-1-r) q^{\lfloor N/(v-1) \rfloor}.$$

Our theorem is an improvement of Theorem 1.2 whenever $(v-1) \nmid N$, since the coefficient $v-1$ of the leading term $q^{\lceil N/(v-1) \rceil}$ is replaced by the constant $r < v-1$.

For TA codes, by using some complicated combinatorial counting arguments, we answer Question 1.4 positively for $t = 3$. Our result is the first non-trivial upper bound in the literature for TA codes of strength three, and it can be stated as the following theorem:

Theorem 1.6 Let $N$ be a positive integer. Then it holds that $M_{TA}(N, q, 3) \le c q^{\lceil N/9 \rceil}$, where $c$ is a constant depending only on $N$.

The rest of this paper is organized as follows. In Sect. 2 we present the necessary definitions and notation. In Sect. 3 we prove Theorem 1.5. In Sect. 4 we prove Theorem 1.6. We conclude the paper in Sect. 5.
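To make the comparison between Theorems 1.2 and 1.5 concrete, the following Python sketch evaluates both bounds for given parameters. The function name `ipp_bounds` and the sample parameters are illustrative assumptions and do not come from the paper; the quantity $v = \lfloor (t/2+1)^2 \rfloor$ is computed with integer arithmetic as `(t + 2) ** 2 // 4`.

```python
def ipp_bounds(N, q, t):
    """Return (v, r, alon_stav_bound, new_bound) for t-IPP codes.

    Hypothetical helper, for illustration only.
    """
    v = (t + 2) ** 2 // 4      # floor((t/2 + 1)^2), exact for integer t
    m = v - 1                  # number of parts used in the bound
    r = N % m                  # N = r (mod v - 1), with 0 <= r <= v - 2
    lo = N // m                # floor(N / (v - 1))
    hi = -(-N // m)            # ceil(N / (v - 1))
    alon_stav = m * q ** hi                 # bound of Theorem 1.2
    new = r * q ** hi + (m - r) * q ** lo   # bound of Theorem 1.5
    return v, r, alon_stav, new


if __name__ == "__main__":
    for N in (3, 4, 5, 6):
        print(N, ipp_bounds(N, q=5, t=2))
```

For $t = 2$ (so $v - 1 = 3$) and $q = 5$, the two bounds coincide when $3 \mid N$; for $N = 4$ the sketch gives 75 for Theorem 1.2 and 35 for Theorem 1.5.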

2 Preliminaries

Let $F = \{0, 1, \ldots, q-1\}$ be an alphabet of size $q$ and let $\mathcal{C} \subseteq F^N$ be an $(N, n, q)$ code. Each codeword $c \in \mathcal{C}$ can be represented as a column vector $c = (c_1, \ldots, c_N)^T$, where $0 \le c_i \le q-1$ for all $1 \le i \le N$. If a vector is denoted by $c_1$, then we use $c_{1,i}$ to denote its $i$-th coordinate. It is sometimes more convenient to describe a code by a matrix: an $(N, n, q)$ code can be depicted as an $N \times n$ matrix over $q$ symbols, where each column of the matrix corresponds to one of the codewords. This matrix is called the representation matrix of the code. Representation matrices of codes are used frequently in this paper.

For any subset of codewords $D \subseteq \mathcal{C}$ and every $1 \le i \le N$, we denote $D(i) = \{c_i : c \in D\}$. The set of descendants of $D$ is defined as
$$\mathrm{desc}(D) = \{x \in F^N : x_i \in D(i),\ 1 \le i \le N\}.$$
One can also view $\mathrm{desc}(D)$ as the direct product $\mathrm{desc}(D) = D(1) \times D(2) \times \cdots \times D(N)$. The set $D \subseteq \mathcal{C}$ is said to be a parent set of a vector $x \in F^N$ if $x \in \mathrm{desc}(D)$. We use $P_t(x)$ to denote the collection of parent sets $D$ of $x$ such that $|D| \le t$ and $D \subseteq \mathcal{C}$.

For two arbitrary vectors $x, y \in F^N$, the Hamming distance $d(x, y)$ is defined to be the number of coordinates in which they differ: $d(x, y) = |\{1 \le i \le N : x_i \ne y_i\}|$. It is sometimes more convenient to use $I(x, y) = N - d(x, y)$, the number of coordinates in which $x$ and $y$ agree. The minimum distance of a code $\mathcal{C} \subseteq F^N$ is defined to be $d(\mathcal{C}) = \min\{d(x, y) : x, y \in \mathcal{C},\ x \ne y\}$. For a vector $x \in F^N$ and a subset $D \subseteq \mathcal{C}$, the group distance $d(x, D)$ is defined to be
$$d(x, D) = |\{i : 1 \le i \le N,\ x_i \notin D(i)\}|.$$
Similarly, we use $I(x, D) = N - d(x, D)$ to denote the number of coordinates $i$, $1 \le i \le N$, such that $x_i \in D(i)$. Now we are ready to present the definitions of the codes discussed in this paper.

Definition 2.1 Suppose $\mathcal{C}$ is an $(N, n, q)$ code and $t \ge 2$ is an integer. Let $D \subseteq \mathcal{C}$ with $|D| \le t$.

(1) We call $\mathcal{C}$ a $t$-frameproof code if it holds that $\mathrm{desc}(D) \cap \mathcal{C} = D$. $\mathcal{C}$ will also be denoted as a $t$-FPC$(N, n, q)$.

(2) We call $\mathcal{C}$ a $t$-parent-identifying code if for all $x \in F^N$, it holds that either $P_t(x) = \emptyset$ or $\bigcap_{D \in P_t(x)} D \ne \emptyset$. $\mathcal{C}$ will also be denoted as a $t$-IPPC$(N, n, q)$.

(3) We call $\mathcal{C}$ a $t$-traceability code if for arbitrary $D \subseteq \mathcal{C}$ with $|D| \le t$ and arbitrary $x \in \mathrm{desc}(D)$, it holds that
$$\min_{c \in D} d(x, c) < \min_{y \in \mathcal{C} \setminus D} d(x, y).$$
$\mathcal{C}$ will also be denoted as a $t$-TA$(N, n, q)$.
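The three properties in Definition 2.1 can be checked by brute force on very small codes. The Python sketch below is only an illustration under stated assumptions: codewords are tuples with 0-indexed coordinates, only non-empty coalitions $D$ are enumerated, and all function names (`desc`, `is_frameproof`, `is_ipp`, `is_ta`) are hypothetical helpers rather than anything defined in the paper. The running time is exponential, so it is meant for toy examples only.

```python
from itertools import combinations, product


def hamming(x, y):
    """Hamming distance d(x, y)."""
    return sum(a != b for a, b in zip(x, y))


def in_desc(x, D):
    """True if x is a descendant of D, i.e. x_i lies in D(i) for every i."""
    return all(any(c[i] == x[i] for c in D) for i in range(len(x)))


def desc(D):
    """desc(D) as the direct product D(1) x ... x D(N)."""
    N = len(next(iter(D)))
    return set(product(*[sorted({c[i] for c in D}) for i in range(N)]))


def small_subsets(C, t):
    """All non-empty subsets of C of size at most t (assumption: D is non-empty)."""
    for k in range(1, t + 1):
        for D in combinations(sorted(C), k):
            yield frozenset(D)


def is_frameproof(C, t):
    C = set(C)
    return all(desc(D) & C == set(D) for D in small_subsets(C, t))


def is_ipp(C, q, t):
    C = set(C)
    N = len(next(iter(C)))
    for x in product(range(q), repeat=N):
        parents = [D for D in small_subsets(C, t) if in_desc(x, D)]
        # P_t(x) must be empty or have a non-empty common intersection.
        if parents and not frozenset.intersection(*parents):
            return False
    return True


def is_ta(C, t):
    C = set(C)
    for D in small_subsets(C, t):
        rest = C - D
        if not rest:
            continue
        for x in desc(D):
            if min(hamming(x, c) for c in D) >= min(hamming(x, y) for y in rest):
                return False
    return True


if __name__ == "__main__":
    repetition = [(0, 0, 0), (1, 1, 1), (2, 2, 2)]
    full_binary = list(product(range(2), repeat=2))
    print(is_ta(repetition, 2), is_ipp(repetition, 3, 2), is_frameproof(repetition, 2))
    print(is_frameproof(full_binary, 2))
```

On the ternary repetition code of length 3 all three checks succeed for $t = 2$, while the full binary code of length 2 already fails the frameproof check, since the coalition $\{00, 11\}$ generates every vector.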

It is well known that the $t$-traceability property implies the $t$-identifiable parent property, and that the $t$-identifiable parent property implies the $t$-frameproof property. See [16] for a more detailed description of the relations among these three codes. We have mentioned before that both $t$-IPP codes and $t$-TA codes can trace at least one traitor if the number of traitors is at most $t$. Generally speaking, assume that we are given a secure code $\mathcal{C}$ and a coalition $D \subseteq \mathcal{C}$ of at most $t$ traitors. If $x \in \mathrm{desc}(D)$ is the pirate data, then our goal is to find some traitor $c \in D$. If $\mathcal{C}$ is a $t$-IPP code, we can determine $P_t(x)$ by simply examining all subsets of $\mathcal{C}$ of size at most $t$; the non-empty set $\bigcap_{E \in P_t(x)} E$ is then contained in $D$. If $\mathcal{C}$ is a $t$-TA code, we can find some traitor $c \in D$ by computing all distances $\{d(x, c) : c \in \mathcal{C}\}$; the codewords with the smallest distance must belong to $D$. To sum up, for fixed $N$ and $t$, if we are given $n$ codewords of length $N$ and a coalition of at most $t$ traitors, then IPP codes and TA codes can trace at least one traitor in time $O(\binom{n}{t})$ and $O(n)$, respectively. Note that the tracing time can be further reduced to $O(\log^k n)$ for some positive constant $k$ if a list-decoding algorithm is employed [3,15].
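The two tracing procedures just described can be sketched in Python as follows. This is a minimal illustration, assuming codewords are tuples with 0-indexed coordinates; the names `trace_ta` and `trace_ipp` are hypothetical, and the sketch implements only the exhaustive procedures from the text, not the list-decoding algorithms of [3,15].

```python
from itertools import combinations


def hamming(x, y):
    """Hamming distance d(x, y)."""
    return sum(a != b for a, b in zip(x, y))


def trace_ta(C, x):
    """TA tracing: return the codewords of C closest to the pirate word x (O(n) distances)."""
    best = min(hamming(x, c) for c in C)
    return {c for c in C if hamming(x, c) == best}


def trace_ipp(C, x, t):
    """IPP tracing: intersect all parent sets of x of size at most t.

    Returns the empty set when x has no parent set of size at most t.
    """
    parents = [
        set(D)
        for k in range(1, t + 1)
        for D in combinations(sorted(C), k)
        if all(any(c[i] == x[i] for c in D) for i in range(len(x)))
    ]
    return set.intersection(*parents) if parents else set()


if __name__ == "__main__":
    C = [(0, 0, 0), (1, 1, 1), (2, 2, 2)]
    pirate = (0, 1, 0)  # a descendant of the coalition {(0,0,0), (1,1,1)}
    print(trace_ta(C, pirate))
    print(trace_ipp(C, pirate, 2))
```

In this toy example the nearest-codeword rule accuses $(0,0,0)$, while the IPP procedure returns both members of the coalition, because $\{(0,0,0), (1,1,1)\}$ is the only parent set of size at most 2.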

3 Proof of Theorem 1.5

Some preparations are needed before presenting the proof. For a vector $x \in F^N$ and a set $V \subseteq [N]$, where $[N] = \{1, \ldots, N\}$, the $|V|$-pattern of $x$ restricted to $V$ is defined to be the ordered $|V|$-tuple $x|_V = (x_{i_1}, \ldots, x_{i_{|V|}})$, where $i_j \in V$ for $1 \le j \le |V|$ and $1 \le i_1 < \cdots < i_{|V|} \le N$. Let $\mathcal{C}$ be an $(N, n, q)$ code and let $c$ be a codeword of $\mathcal{C}$. We say that $c|_V$ is a private pattern of $c$ if no other member of $\mathcal{C}$ coincides with $c$ simultaneously in all coordinates of $V$. In other words, $c|_V$ is private if there does not exist $x \in \mathcal{C} \setminus \{c\}$ such that $c|_V = x|_V$. Now we can prove Theorem 1.5 as follows.

Proof of Theorem 1.5 Let $\mathcal{C}$ be an arbitrary $t$-IPPC$(N, n, q)$ and denote by $M$ the representation matrix of $\mathcal{C}$. Then $M$ is an $N \times n$ $q$-ary matrix. Let us partition the rows of $M$ into $v-1$ disjoint parts denoted by $V_1, \ldots, V_{v-1}$, with the property that $|V_1| = \cdots = |V_r| = \lceil N/(v-1) \rceil$ and $|V_{r+1}| = \cdots = |V_{v-1}| = \lfloor N/(v-1) \rfloor$. One can check that $r \lceil N/(v-1) \rceil + (v-1-r) \lfloor N/(v-1) \rfloor = N$, and hence $\{V_i : 1 \le i \le v-1\}$ is indeed a partition of the rows of $M$. We say a codeword $x \in \mathcal{C}$ is special (with respect to $\mathcal{C}$) if it contains some private pattern with support set $V_i$ for some $1 \le i \le v-1$. Suppose that $|\mathcal{C}| \ge r q^{\lceil N/(v-1) \rceil} + (v-1-r) q^{\lfloor N/(v-1) \rfloor} + 1$; our goal is to find a subset of $\mathcal{C}$ which violates the $t$-identifiable parent property.

We claim that there must exist a nonempty set $\hat{\mathcal{C}} \subseteq \mathcal{C}$ that contains no special (with respect to $\hat{\mathcal{C}}$) codewords. Let us delete the special codewords in $\mathcal{C}$ and denote the collection of the remaining codewords by $\mathcal{C}^{(1)}$. Next, we delete the codewords that are special with respect to $\mathcal{C}^{(1)}$ and denote the collection of the remaining codewords by $\mathcal{C}^{(2)}$. In general, whenever there is a special codeword (special among the codewords that have not been deleted yet), we delete it. We continue this procedure until we get a code $\hat{\mathcal{C}}$ with no special codewords in it. We claim that $\hat{\mathcal{C}}$ is not empty.

On the one hand, any pattern (with a given support set $V_j$) can be deleted as a private pattern of some codeword at most once. On the other hand, any deleted codeword (which is special in $\mathcal{C}^{(i)}$ for some $i \ge 1$) contains at least one private pattern (with respect to $\mathcal{C}^{(i)}$) with support set $V_j$ for some $j$. Consequently, at most $r q^{\lceil N/(v-1) \rceil} + (v-1-r) q^{\lfloor N/(v-1) \rfloor}$ special codewords can be deleted, since each $V_j$ is responsible for at most $q^{|V_j|}$ distinct patterns. Taking the assumption $|\mathcal{C}| \ge r q^{\lceil N/(v-1) \rceil} + (v-1-r) q^{\lfloor N/(v-1) \rfloor} + 1$ into account, our deletion cannot remove all codewords of $\mathcal{C}$. Therefore, after the deletion, we are left with a nonempty set such that no codeword in it contains a private pattern (with respect to the remaining codewords) with support set $V_j$ for any $1 \le j \le v-1$. This set satisfies the property required in the claim. Let us take this set to be $\hat{\mathcal{C}}$.
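The deletion procedure used in the claim above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: coordinates are 0-indexed, the partition simply takes consecutive blocks of rows, and the names `partition_rows` and `prune_special` are hypothetical. It removes all currently special codewords in rounds, mirroring the passage from $\mathcal{C}$ to $\mathcal{C}^{(1)}$, $\mathcal{C}^{(2)}$ and so on.

```python
def partition_rows(N, m):
    """Split positions 0..N-1 into m consecutive parts: r parts of size ceil(N/m),
    followed by m - r parts of size floor(N/m), where r = N mod m."""
    r, lo = N % m, N // m
    sizes = [lo + 1] * r + [lo] * (m - r)
    parts, start = [], 0
    for s in sizes:
        parts.append(list(range(start, start + s)))
        start += s
    return parts


def prune_special(C, parts):
    """Repeatedly delete codewords that have a private pattern on some part.

    A codeword c has a private pattern on V if no other remaining codeword
    agrees with c on every position of V.
    """
    C = set(C)
    while True:
        special = {
            c for c in C
            if any(sum(1 for y in C if all(y[i] == c[i] for i in V)) == 1 for V in parts)
        }
        if not special:
            return C  # no special codewords are left (possibly the empty set)
        C -= special


if __name__ == "__main__":
    from itertools import product
    parts = partition_rows(3, 3)                  # t = 2 gives v - 1 = 3 parts
    full = set(product(range(2), repeat=3))       # 8 > 3 * 2 codewords
    print(len(prune_special(full, parts)))
    print(prune_special({(0, 0, 0), (1, 1, 1), (2, 2, 2)}, parts))
```

The full binary code of length 3 exceeds the bound $3q = 6$ and survives pruning intact (it is of course not a 2-IPP code), whereas the ternary repetition code, which stays below the bound, is deleted completely in the first round.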


Suppose first that $t$ is even; then $v - 1 = (t/2+1)^2 - 1 = t^2/4 + t$. Our goal now is to pick a specific subset of $\hat{\mathcal{C}}$ that yields the desired contradiction. We start by picking some codeword $x_1 \in \hat{\mathcal{C}}$. Next, we pick a codeword $x_2 \in \hat{\mathcal{C}} \setminus \{x_1\}$ such that $x_1|_{V_{t/2+1}} = x_2|_{V_{t/2+1}}$. Note that the property of $\hat{\mathcal{C}}$ guarantees the existence of such an $x_2$. Set $m_1 = t/2+1$. To choose $x_3$, we consider the pattern of $x_2$ with support set $V_{2(t/2+1)} = V_{t+2}$. We check whether $x_1|_{V_{t+2}} = x_2|_{V_{t+2}}$. If so, we move to the pattern with support set $V_{t+3}$ and check it. We do so until we find the first $V_i$ such that $i \ge t+2$ and $x_1|_{V_i} \ne x_2|_{V_i}$. We choose a codeword $x_3$ coinciding with $x_2$ on $V_i$ and set $m_2 = i$. We continue this procedure. The $(k+1)$-th codeword $x_{k+1}$ is chosen as follows. Let $m_k$ be the first integer such that $m_k \ge m_{k-1} + (t/2+1)$ and $x_k|_{V_{m_k}} \ne x_i|_{V_{m_k}}$ for all $1 \le i \le k-1$. Then we choose $x_{k+1}$ as a codeword coinciding with $x_k$ on $V_{m_k}$. If no such $m_k$ exists, we say that $m_k$ is undefined, and we stop. Note that at most $t/2+1$ codewords can be chosen in this way, since each time we skip at least $t/2+1$ patterns and there are only $v - 1 = t^2/4 + t = (t/2+1)t/2 + t/2 < (t/2+1)^2$ patterns; thus we can never pick a $(t/2+2)$-th codeword. Finally, we have picked a set $X \subseteq \hat{\mathcal{C}}$ satisfying the following properties: $|X| \le t/2+1$, and $x_i|_{V_{m_i}} = x_{i+1}|_{V_{m_i}}$ for all $1 \le i \le |X|-1$.

The descendant $s \in \mathrm{desc}(X)$ is chosen as follows. The first $m_1 = t/2+1$ patterns of $s$ (i.e. the coordinates in $V_1 \cup \cdots \cup V_{m_1}$) are taken from $x_1$, the following patterns up to $V_{m_2}$ (i.e. the coordinates in $V_{m_1+1} \cup \cdots \cup V_{m_2}$) are taken from $x_2$, and so on. The last member of $X$ contributes at most $t/2$ patterns that do not belong to the other members of $X$.

The following observation is the core of this proof: any $x_i \in X$ contributes at most $t/2$ patterns which do not belong to the other members of $X$. To see this, fix an arbitrary $x_i \in X$. The patterns of $s \in \mathrm{desc}(X)$ taken from $x_i$ are those on $V_{m_{i-1}+1}, \ldots, V_{m_{i-1}+t/2}, \ldots, V_{m_i}$ (for $i = 1$, let $m_0 = 0$). By our definition of $m_i$ and $x_i$, only the first $t/2$ of these, namely those on $V_{m_{i-1}+1}, \ldots, V_{m_{i-1}+t/2}$, can be "private" patterns of $x_i$ in $X$. Therefore, since $x_i \in \hat{\mathcal{C}}$, and by definition no codeword in $\hat{\mathcal{C}}$ contains a private pattern with support set $V_j$ for any $1 \le j \le v-1$, there exists a set $Y_i = \{y_1, \ldots, y_{t/2}\} \subseteq \hat{\mathcal{C}}$ of at most $t/2$ codewords, each different from $x_i$, such that $y_j|_{V_{m_{i-1}+j}} = x_i|_{V_{m_{i-1}+j}}$ for all $1 \le j \le t/2$. So the new set $X_i = (X \setminus \{x_i\}) \cup Y_i$ can also produce the same descendant $s$, that is, $s \in \mathrm{desc}(X_i)$. Since $|X_i| = |X| - 1 + |Y_i| \le t/2 + 1 - 1 + t/2 = t$, it holds that $X_i \in P_t(s)$. We can do the replacement similarly for every $x_k \in X$, leading to the corresponding sets $Y_k$ and the newly defined sets $X_k$. Set $X_0 = X$; then according to the discussion above, $s \in \mathrm{desc}(X_k)$ for all $0 \le k \le |X|$. Therefore, our desired contradiction follows from the simple fact that $\bigcap_{0 \le k \le |X|} X_k = \emptyset$ and $|X_k| \le t$, which violates the $t$-identifiable parent property.

For odd $t$, we do exactly the same thing, only taking $m_{k+1} \ge m_k + (t+1)/2$, which gives $|X| \le (t+1)/2 + 1$, and hence the size of the parent sets is also at most $|X| - 1 + (t+1)/2 - 1 \le t$. □

4 Proof of Theorem 1.6

For a code $\mathcal{C}$ of length $N$, a codeword $x \in \mathcal{C}$ and a subset $V \subseteq [N]$ of coordinate positions, define
$$F_{\mathcal{C}}(x, V) = |\{y \in \mathcal{C} : y|_V = x|_V\}|.$$
Observe that $F_{\mathcal{C}}(x, V) = 1$ for some $V \subseteq [N]$ is equivalent to saying that $x|_V$ is a private $|V|$-pattern of $x$.
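The counting function $F_{\mathcal{C}}(x, V)$ translates directly into code. The short Python sketch below assumes codewords are tuples, positions are 0-indexed (the paper uses $1, \ldots, N$), and $x$ is itself a member of $\mathcal{C}$; the names `F` and `is_private` are illustrative only.

```python
def F(C, x, V):
    """F_C(x, V): number of codewords of C that agree with x on every position in V."""
    return sum(1 for y in C if all(y[i] == x[i] for i in V))


def is_private(C, x, V):
    """x restricted to V is a private pattern of x exactly when F_C(x, V) == 1."""
    return F(C, x, V) == 1


if __name__ == "__main__":
    C = [(0, 0, 0), (0, 1, 1), (1, 1, 1)]
    print(F(C, (0, 0, 0), [0]))           # codewords starting with 0
    print(is_private(C, (0, 0, 0), [1]))  # only (0,0,0) has a 0 in position 1
```

Here $(0,0,0)$ shares its first coordinate with $(0,1,1)$ but is the only codeword with a 0 in the second position, so its pattern on that position is private.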


Let us begin with a lemma.

Lemma 4.1 Let $l$ be a fixed positive integer, and let $N = 9l$. Suppose that $\mathcal{C}$ is a $q$-ary 3-traceability code of length $N$ containing three or more codewords. Then there is a set $X$ of at most $c' q^l$ codewords such that the subcode $\mathcal{C}' = \mathcal{C} \setminus X$ of $\mathcal{C}$ satisfies $d(\mathcal{C}') \ge d(\mathcal{C}) + 1$, where $c'$ is a constant depending only on $N$.

Proof The proof is divided into several cases according to the minimum distance $d(\mathcal{C})$. We adopt the convention $d(\emptyset) = \infty$.

Case 1 $d(\mathcal{C}) > N - l = 8l$. The Singleton bound implies $|\mathcal{C}| \le q^l$. Thus we can simply take $X = \mathcal{C}$ and hence $\mathcal{C}' = \emptyset$. The lemma follows trivially under this choice of $X$.

Case 2 $d(\mathcal{C}) \le 2l$. Define a subcode $\mathcal{C}'$ of $\mathcal{C}$ by removing all codewords in $\mathcal{C}$ that contain a private $l$-pattern. In other words,
$$X = \{x \in \mathcal{C} : F_{\mathcal{C}}(x, V) = 1 \text{ for some } l\text{-subset } V \subseteq [N]\},$$
and
$$\mathcal{C}' = \{x \in \mathcal{C} : F_{\mathcal{C}}(x, V) \ge 2 \text{ for all } l\text{-subsets } V \subseteq [N]\}.$$
It is easy to see that $|X| = |\mathcal{C} \setminus \mathcal{C}'| \le \binom{N}{l} q^l$, since there are at most $\binom{N}{l} q^l$ different $l$-patterns in a $q$-ary code of length $N$, every codeword $x \in X$ contains at least one private $l$-pattern with positions $V$ such that $F_{\mathcal{C}}(x, V) = 1$, and such an $l$-pattern belongs to exactly one $x \in X \subseteq \mathcal{C}$. We only need to show that there are no distinct codewords $x, y \in \mathcal{C}'$ with $d(x, y) = d(\mathcal{C})$. Assume, to the contrary, that there exist $x \ne y \in \mathcal{C}'$ such that $d(x, y) = d(\mathcal{C}) \le 2l$. Let $V^*$ be a subset of $[N]$ of size at most $2l$ that contains all positions where $x$ and $y$ disagree. Then we can choose $V_1$ and $V_2$ such that $V^* \subseteq V_1 \cup V_2$ and $|V_1| = |V_2| = l$. By the definition of $\mathcal{C}'$, it holds that $F_{\mathcal{C}}(x, V_i) \ge 2$ for each $i \in \{1, 2\}$, so we can choose $y_1, y_2 \in \mathcal{C} \setminus \{x\}$ such that $x|_{V_i} = y_i|_{V_i}$ for $i = 1, 2$. But then $x \in \mathrm{desc}(\{y, y_1, y_2\})$, which contradicts the fact that $\mathcal{C}$ is a 3-traceability code. Thus we can conclude that $d(\mathcal{C}') > d(\mathcal{C})$, and the lemma follows in this case.

It remains to consider the case $2l < d(\mathcal{C}) \le N - l = 8l$. We need some preparations before continuing the proof. Write $d(\mathcal{C}) = N - (l + \delta)$ with $0 \le \delta < 6l$. Define
$$X = \left\{x \in \mathcal{C} : F_{\mathcal{C}}(x, V) \le 2^{\delta+1}\binom{N-l}{\delta+1} \text{ for some } l\text{-subset } V \subseteq [N]\right\},$$
and
$$\mathcal{C}' = \left\{x \in \mathcal{C} : F_{\mathcal{C}}(x, V) > 2^{\delta+1}\binom{N-l}{\delta+1} \text{ for all } l\text{-subsets } V \subseteq [N]\right\}.$$
Note that $|X| = |\mathcal{C} \setminus \mathcal{C}'| \le 2^{\delta+1}\binom{N-l}{\delta+1}\binom{N}{l} q^l < 2^{3N} q^l$, since there are at most $\binom{N}{l} q^l$ different $l$-patterns in a $q$-ary code of length $N$, every codeword $x \in X$ contains at least one $l$-pattern with positions $V$ such that $F_{\mathcal{C}}(x, V) \le 2^{\delta+1}\binom{N-l}{\delta+1}$, and such an $l$-pattern belongs to at most $2^{\delta+1}\binom{N-l}{\delta+1}$ codewords $x \in X \subseteq \mathcal{C}$. To prove $d(\mathcal{C}') \ge d(\mathcal{C}) + 1$, it is sufficient to show that there are no distinct codewords $x, y \in \mathcal{C}'$ with $d(x, y) = d(\mathcal{C})$. Assume, to the contrary, that there exist $y_0 \ne y_1 \in \mathcal{C}'$ such that $I(y_0, y_1) = l + \delta$. Define $V_1 = \{i \in [N] : y_{0,i} = y_{1,i}\}$; in other words, $y_0|_{V_1} = y_1|_{V_1}$. Take $V_2$ such that $V_1 \cap V_2 = \emptyset$ and $|V_2| = l$. We claim that there exists $y_2 \in \mathcal{C}$ such that $y_0|_{V_2} = y_2|_{V_2}$ and $I(y_1, y_2) \le \delta$. In fact, the minimum distance of $\mathcal{C}$ implies that any


codeword is uniquely determined by $l + \delta + 1$ of its coordinates. Once $V_2$ is fixed, it holds that
$$|\{y \in \mathcal{C} : I(y_1, y) \ge \delta + 1,\ y_0|_{V_2} = y|_{V_2}\}| \le \binom{N-l}{\delta+1} < F_{\mathcal{C}}(y_0, V_2).$$
The value $\binom{N-l}{\delta+1}$ counts the ways to choose $\delta + 1$ coordinates from $[N] \setminus V_2$ on which $y_1$ and $y$ are equal; these coordinates together with $V_2$ uniquely determine $y$. So there is at least one choice for $y_2 \in \mathcal{C}$ satisfying the conditions in the claim above. Now we redefine $V_2 = \{i \in [N] \setminus V_1 : y_{0,i} = y_{2,i}\}$, and write $|V_2| = l + \delta_2$ with $0 \le \delta_2 \le \delta$. Note that $y_1$ and $y_2$ have no identical coordinates in $V_2$, since on any such coordinate $y_0$, $y_1$ and $y_2$ would all agree, and the coordinate would then belong to $V_1$. Let $D := [N] \setminus (V_1 \cup V_2)$. If $N - |V_1| - |V_2| \le l$, then by the definition of $\mathcal{C}'$ one can choose $y_3 \in \mathcal{C} \setminus \{y_0\}$ such that $y_3|_D = y_0|_D$. Thus $y_0 \in \mathrm{desc}(\{y_1, y_2, y_3\})$, which contradicts the fact that $\mathcal{C}$ is a 3-traceability code, and so we can always assume that $|D| > l$.

Set $J = \{i \in [N] \setminus V_1 : y_{1,i} = y_{2,i}\}$. We have $I(y_0, \{y_1, y_2\}) = |V_1| + |V_2| = 2l + \delta + \delta_2$ and $I(y_1, \{y_0, y_2\}) = |V_1| + |J| = l + \delta + |J|$. We may assume $|J| \le l + \delta_2$, since otherwise we can exchange the roles of $y_0$ and $y_1$. Take an $l$-subset $V_3 \subseteq [N]$ such that $V_3 \cap (V_1 \cup V_2) = \emptyset$, and make it cover as many elements of $J$ as possible. We claim that there exists some $y_3 \in \mathcal{C}$ such that $y_0|_{V_3} = y_3|_{V_3}$ and $I(y_3, \{y_1, y_2\}) \le \delta$. As mentioned before, any codeword of $\mathcal{C}$ is uniquely determined by $l + \delta + 1$ of its coordinates. Once $V_3$ is fixed, it holds that
$$|\{y \in \mathcal{C} : I(y, \{y_1, y_2\}) \ge \delta + 1,\ y_0|_{V_3} = y|_{V_3}\}| \le 2^{\delta+1}\binom{N-l}{\delta+1} < F_{\mathcal{C}}(y_0, V_3),$$
where the multiplier $2^{\delta+1}$ reflects that there are at most two choices for each chosen coordinate $i \in [N] \setminus V_3$: either $y_i = y_{1,i}$ or $y_i = y_{2,i}$. So there is at least one choice for $y_3 \in \mathcal{C}$ satisfying the conditions in the claim above. Now we redefine $V_3 = \{i \in [N] \setminus (V_1 \cup V_2) : y_{0,i} = y_{3,i}\}$, and write $|V_3| = l + \delta_3$ with $0 \le \delta_3 \le \delta$.

It is not hard to show that neither $y_1, y_3$ nor $y_2, y_3$ have identical coordinates on $V_3$. For indices $i \in \{1, 2, 3\}$ and $j, k \in \{0, 1, 2, 3\}$, we denote $V_{i,j,k} := \{v \in V_i : y_{j,v} = y_{k,v}\}$; then one can immediately deduce that $|V_{1,0,2}| \le I(y_1, y_2) \le \delta$ and $|V_{1,0,3}| + |V_{2,0,3}| \le I(y_3, \{y_1, y_2\}) \le \delta$. Write $E := [N] \setminus (V_1 \cup V_2 \cup V_3)$. It is easy to see that $|E| = 6l - \delta - \delta_2 - \delta_3$ and $|E| > 0$, since otherwise $y_0 \in \mathrm{desc}(\{y_1, y_2, y_3\})$, which contradicts the definition of a 3-traceability code. The remaining part of the proof is divided into two cases, $2l < d(\mathcal{C}) \le 4l$ (i.e., $4l \le \delta < 6l$) and $4l < d(\mathcal{C}) \le 8l$ (i.e., $0 \le \delta < 4l$).

Case 3 $2l < d(\mathcal{C}) \le 4l$ ($4l \le \delta < 6l$). We take a vector $w \in \mathrm{desc}(\{y_1, y_2, y_3\})$ with $w|_E = y_1|_E$ and $w|_{V_i} = y_i|_{V_i}$ for $i = 1, 2, 3$. Such a choice of $w$ is well defined since $E \cup (\bigcup_{i=1}^{3} V_i) = [N]$ and these four sets are pairwise disjoint. See Fig. 1 for an illustration of our notation. Note that for the sake of convenience, the codewords are depicted as row vectors.

                                       V_1          V_2          V_3          E
y_0^T ∈ C                              0000···00    0000···00    0000···00    0000···00
y_1^T ∈ C                              0000···00    1111···11    1111···11    1111···11
y_2^T ∈ C                              ∗∗∗∗···∗∗    0000···00    1123···15    ∗∗∗∗···∗∗
y_3^T ∈ C                              ∗∗∗∗···∗∗    ∗∗∗∗···∗∗    0000···00    ∗∗∗∗···∗∗
w^T ∈ desc({y_1^T, y_2^T, y_3^T})      0000···00    0000···00    0000···00    1111···11
Block sizes: |V_1| = l + δ, |V_2| = l + δ_2, |V_3| = l + δ_3, |E| = 6l − δ − δ_2 − δ_3

Fig. 1 When 2l < d(C) ≤ 4l (4l ≤ δ < 6l)


                                       J_{1,2}      J_{1,3}\J_{1,2,3}   J_{2,3}\J_{1,2,3}   H
y_0^T ∈ C                              0000···00    0000···00           0000···00           0000···00
y_1^T ∈ C                              1111···11    1111···11           1111···11           1111···11
y_2^T ∈ C                              1111···11    2222···22           2222···22           2222···22
y_3^T ∈ C                              1231···23    1111···11           2222···22           3333···33
w^T ∈ desc({y_1^T, y_2^T, y_3^T})      1111···11    2222···22           1111···11           ∗∗∗∗···∗∗
Block sizes: |J_{1,2}| ≤ δ_2, |J_{1,3}\J_{1,2,3}| ≤ 2l + δ_3, |J_{2,3}\J_{1,2,3}| ≤ 2l + δ_3, |H| = |E| − |J_{1,2} ∪ J_{1,3} ∪ J_{2,3}|

Fig. 2 When 0 ≤ δ < 4l, |J_{1,3}\J_{1,2,3}| ≤ 2l + δ_3 and |J_{2,3}\J_{1,2,3}| ≤ 2l + δ_3

It is easy to compute the following inequalities:
$$\begin{aligned}
I(y_0, w) &= |V_1| + |V_2| + |V_3| = 3l + \delta + \delta_2 + \delta_3, \\
I(y_1, w) &= |V_1| + |E| = (l + \delta) + (6l - \delta - \delta_2 - \delta_3) = 7l - \delta_2 - \delta_3 \\
          &\le 7l \le 3l + \delta \le I(y_0, w), \\
I(y_2, w) &\le |V_{1,0,2}| + |V_2| + |E| \le \delta + (l + \delta_2) + (6l - \delta - \delta_2 - \delta_3) = 7l - \delta_3 \\
          &\le 7l \le 3l + \delta \le I(y_0, w), \\
I(y_3, w) &\le |V_{1,0,3}| + |V_{2,0,3}| + |V_3| + |E| \le \delta + (l + \delta_3) + (6l - \delta - \delta_2 - \delta_3) = 7l - \delta_2 \\
          &\le 7l \le 3l + \delta \le I(y_0, w).
\end{aligned}$$

Since $y_0 \notin \{y_1, y_2, y_3\}$, this contradicts the 3-traceability property of $\mathcal{C}$, as required.

Case 4 $4l < d(\mathcal{C}) \le 8l$ ($0 \le \delta < 4l$). As above, we take a vector $w \in \mathrm{desc}(\{y_1, y_2, y_3\})$ with $w|_{V_i} = y_i|_{V_i}$ for $i = 1, 2, 3$. However, we have to be more careful about the choice of $w|_E$. For $1 \le i < j \le 3$, define $J_{i,j} := \{v \in E : y_{i,v} = y_{j,v}\}$ and $J_{1,2,3} := J_{1,2} \cap J_{1,3} \cap J_{2,3}$. We have $|J_{1,2,3}| \le |J_{1,2}| \le \max\{|J| - l, 0\} \le \delta_2$, since $J_{1,2,3} \subseteq J_{1,2} \subseteq J \setminus V_3$ and we have chosen $V_3$ to cover as many elements of $J$ as possible. Taking into account the fact that $|J_{1,3} \setminus J_{1,2,3}| + |J_{2,3} \setminus J_{1,2,3}| \le I(y_3, \{y_1, y_2\}) \le \delta < 4l$, we consider the following two subcases separately.

Subcase 4.1 $|J_{1,3} \setminus J_{1,2,3}| \le 2l + \delta_3$ and $|J_{2,3} \setminus J_{1,2,3}| \le 2l + \delta_3$. For $i \in E$, $w_i$ is defined by the following steps. See Fig. 2 for an illustration of our notation. Note that the codewords are depicted as row vectors.

1. Take $w_i = y_{1,i}$ when $i \in J_{1,2} \cup J_{2,3}$.
2. Take $w_i = y_{2,i}$ when $i \in J_{1,3} \setminus J_{1,2,3}$.
3. The remaining coordinates are those in $H := E \setminus (J_{1,2} \cup J_{1,3} \cup J_{2,3})$. Since no two of $y_1|_H$, $y_2|_H$ and $y_3|_H$ have an identical coordinate, we may partition $H$ into three disjoint parts $H_1$, $H_2$, $H_3$ satisfying $|H_1| \le 2l + \delta_3 - |J_{2,3} \setminus J_{1,2,3}|$, $|H_2| \le 2l + \delta_3 - |J_{1,3} \setminus J_{1,2,3}|$, and $|H_3| = |H| - |H_1| - |H_2| \le 2l$. To see that such a partition exists, note that the first two inequalities can be met by our assumption on the sizes of $J_{1,3} \setminus J_{1,2,3}$ and $J_{2,3} \setminus J_{1,2,3}$, and the third inequality holds since $|H_3| \le |E| - (|H_1| + |J_{2,3} \setminus J_{1,2,3}|) - (|H_2| + |J_{1,3} \setminus J_{1,2,3}|)$, $|E| = 6l - \delta - \delta_2 - \delta_3$, and one can choose $|H_1| + |J_{2,3} \setminus J_{1,2,3}|$ and $|H_2| + |J_{1,3} \setminus J_{1,2,3}|$ as large as $2l + \delta_3$. For these coordinates of $w$, we take $w|_{H_i} = y_i|_{H_i}$ for $i = 1, 2, 3$.

One can verify that such a choice of $w$ is well defined. Now we compute the values $I(w, y_i)$ for $i \in \{0, 1, 2, 3\}$. Recall that $|J_{1,2}| \le \delta_2$, $|V_{1,0,2}| + |J_{1,2}| \le I(y_1, y_2) \le \delta$ and $|V_{1,0,3}| + |V_{2,0,3}| + |J_{1,2,3}| \le I(y_3, \{y_1, y_2\}) \le \delta$; then we have

[The display bounding $I(y_0, w)$, $I(y_1, w)$, $I(y_2, w)$ and $I(y_3, w)$ in Subcase 4.1 is truncated in the source.]

Subcase 4.2 $|J_{1,3} \setminus J_{1,2,3}| > 2l + \delta_3$ or $|J_{2,3} \setminus J_{1,2,3}| > 2l + \delta_3$. Without loss of generality, we can assume $|J_{2,3} \setminus J_{1,2,3}| > 2l + \delta_3$. Then we define $J'_{2,3}$ such that $J'_{2,3} \subseteq J_{2,3} \setminus J_{1,2,3}$ and $|J'_{2,3}| = 2l + \delta_3$. For $i \in E$, $w_i$ is defined by the following steps. See Fig. 3 for an illustration of our notation. Note that the codewords are depicted as row vectors.

1. Take $w_i = y_{1,i}$ when $i \in J'_{2,3}$.
2. Take $w_i = y_{2,i}$ when $i \in E \setminus J'_{2,3}$.

                                       J'_{2,3}     E\J'_{2,3}
y_0^T ∈ C                              0000···00    0000···00
y_1^T ∈ C                              1111···11    ∗∗∗∗···∗∗
y_2^T ∈ C                              2222···22    2222···22
y_3^T ∈ C                              2222···22    ∗∗∗∗···∗∗
w^T ∈ desc({y_1^T, y_2^T, y_3^T})      1111···11    2222···22
Block sizes: |J'_{2,3}| = 2l + δ_3, |E\J'_{2,3}| < 2l

Fig. 3 When 0 ≤ δ < 4l and |J_{2,3}\J_{1,2,3}| > 2l + δ_3

Now we compute the values $I(w, y_i)$ for $i \in \{0, 1, 2, 3\}$. Since $2l + \delta_3 < |J_{2,3} \setminus J_{1,2,3}| \le I(y_3, \{y_1, y_2\}) \le \delta < 4l$ and $|E \setminus J'_{2,3}| = \max\{4l - \delta - \delta_2 - 2\delta_3, 0\} < 2l$, we have
$$\begin{aligned}
I(y_0, w) &= |V_1| + |V_2| + |V_3| = 3l + \delta + \delta_2 + \delta_3, \\
I(y_1, w) &\le |V_1| + |J_{1,2}| + |J'_{2,3}| \le (l + \delta) + \delta_2 + (2l + \delta_3) = 3l + \delta + \delta_2 + \delta_3 = I(y_0, w), \\
I(y_2, w) &= |V_{1,0,2}| + |V_2| + |E \setminus J'_{2,3}| < \delta + (l + \delta_2) + 2l = 3l + \delta + \delta_2 \le I(y_0, w), \\
I(y_3, w) &\le |V_{1,0,3}| + |V_{2,0,3}| + |V_3| + |E \setminus J'_{2,3}| < \delta + (l + \delta_3) + 2l = 3l + \delta + \delta_3 \le I(y_0, w).
\end{aligned}$$
Since $y_0 \notin \{y_1, y_2, y_3\}$, this contradicts the 3-traceability property of $\mathcal{C}$. □

 

Proof of Theorem 1.6 Write $N = 9l - r$, where $l \in \mathbb{Z}$ and $0 \le r \le 8$, so that $l = \lceil N/9 \rceil$. By concatenating all codewords with the vector $0^r$, we may regard $\mathcal{C}$ as a traceability code of length $9l$. So we may assume that $N$ is divisible by 9. Let $d = d(\mathcal{C})$. By applying Lemma 4.1 at most $N - d$ times, we obtain a code $\mathcal{C}'$ with minimum distance $N$, which has at most $q$ codewords. We have removed at most $(N - d) c' q^l$ codewords to obtain $\mathcal{C}'$, and so $|\mathcal{C}| \le (N - d) c' q^l + q \le c q^l$, where we define $c = N c'$. So the theorem follows. □

5 Concluding remarks

In this paper, we present improved upper bounds for IPP codes and TA codes. It is worth mentioning that we obtain the first non-trivial upper bound for TA codes of strength three. An interesting problem that remains open is to answer Question 1.4 in general. An important property that both FP codes and IPP codes satisfy is the composition law, which states that $M_{FP}(aN, q, t) \le M_{FP}(N, q^a, t)$ and $M_{IPP}(aN, q, t) \le M_{IPP}(N, q^a, t)$ hold for every positive integer $a$. This property says that a $t$-FPC$(N, n, q^a)$ (resp. a $t$-IPPC$(N, n, q^a)$) exists whenever a $t$-FPC$(aN, n, q)$ (resp. a $t$-IPPC$(aN, n, q)$) exists. This composition law can be proved directly by splitting a codeword of length $aN$ into $N$ blocks of $a$ coordinates each and then viewing this codeword as a vector of length $N$ over an alphabet of size $q^a$. Unfortunately, because of the minimum distance condition required in their definition, TA codes do not seem to satisfy such a law. This may be one reason why upper bounds for TA codes are so hard to obtain. It seems that our method of proving Theorem 1.6 can be generalized further, at the cost of a more complicated discussion.

Acknowledgements Research supported by the National Natural Science Foundation of China under Grant Nos. 11431003 and 61571310, Beijing Scholars Program, Beijing Hundreds of Leading Talents Training Project of Science and Technology, and Beijing Municipal Natural Science Foundation.

References

1. Alon N., Fischer E., Szegedy M.: Parent-identifying codes. J. Comb. Theory A 95(2), 349–359 (2001).
2. Alon N., Stav U.: New bounds on parent-identifying codes: the case of multiple parents. Comb. Probab. Comput. 13(6), 795–807 (2004).
3. Barg A., Kabatiansky G.: A class of I.P.P. codes with efficient identification. J. Complex. 20(2–3), 137–147 (2004).
4. Blackburn S.R.: Frameproof codes. SIAM J. Discret. Math. 16(3), 499–510 (2003) (electronic).
5. Blackburn S.R.: An upper bound on the size of a code with the k-identifiable parent property. J. Comb. Theory A 102(1), 179–185 (2003).
6. Blackburn S.R., Etzion T., Ng S.: Traceability codes. J. Comb. Theory A 117(8), 1049–1057 (2010).
7. Boneh D., Shaw J.: Collusion-secure fingerprinting for digital data. IEEE Trans. Inf. Theory 44(5), 1897–1905 (1998).
8. Cheng M., Miao Y.: On anti-collusion codes and detection algorithms for multimedia fingerprinting. IEEE Trans. Inf. Theory 57(7), 4843–4851 (2011).
9. Chor B., Fiat A., Naor M.: Tracing traitors. In: Advances in Cryptology – CRYPTO '94, pp. 257–270. Springer, Berlin (1994).
10. Chor B., Fiat A., Naor M., Pinkas B.: Tracing traitors. IEEE Trans. Inf. Theory 46(3), 893–910 (2000).
11. Hollmann H.D.L., van Lint J.H., Linnartz J.P., Tolhuizen L.M.G.M.: On codes with the identifiable parent property. J. Comb. Theory A 82(2), 121–133 (1998).
12. Owen S., Ng S.-L.: A note on an upper bound of traceability codes. Australas. J. Comb. 62, 140–146 (2015).
13. Shangguan C., Ge G.: Separating hash families: a Johnson-type bound and new constructions. SIAM J. Discret. Math. 30(4), 2243–2264 (2016).
14. Shangguan C., Wang X., Ge G., Miao Y.: New bounds for frameproof codes. IEEE Trans. Inf. Theory. doi:10.1109/TIT.2017.2745619.
15. Silverberg A., Staddon J., Walker J.L.: Applications of list decoding to tracing traitors. IEEE Trans. Inf. Theory 49(5), 1312–1318 (2003).
16. Staddon J.N., Stinson D.R., Wei R.: Combinatorial properties of frameproof and traceability codes. IEEE Trans. Inf. Theory 47(3), 1042–1049 (2001).
17. van Trung T.: A tight bound for frameproof codes viewed in terms of separating hash families. Des. Codes Cryptogr. 72(3), 713–718 (2014).
