ALGORITHMS FOR ORBIT CLOSURE SEPARATION FOR INVARIANTS AND SEMI-INVARIANTS OF MATRICES

arXiv:1801.02043v1 [math.RA] 6 Jan 2018

HARM DERKSEN AND VISU MAKAM

Abstract. We consider two group actions on m-tuples of n × n matrices. The first is simultaneous conjugation by GLn and the second is the left-right action of SLn × SLn . We give efficient algorithms to decide if the orbit closures of two points intersect. We also improve the known bounds for the degree of separating invariants in these cases.

1. Introduction

In this paper, K will denote an algebraically closed field. For a vector space V over the field K, let K[V] denote the ring of polynomial functions on V. Suppose that a group G acts on V by linear transformations. A polynomial f ∈ K[V] is called an invariant polynomial if it is constant along orbits, i.e., f(g · v) = f(v) for all g ∈ G and v ∈ V. The invariant polynomials form a graded subalgebra K[V]^G = ⊕_{d=0}^∞ K[V]^G_d, where K[V]^G_d denotes the degree d homogeneous invariants. We will call K[V]^G the invariant ring or the ring of invariants.

For a point v ∈ V, its orbit G · v = {g · v | g ∈ G} is not necessarily closed with respect to the Zariski topology. We say that an invariant f separates two points v, w ∈ V if f(v) ≠ f(w). It follows from continuity that any invariant polynomial must take the same value on all points of the closure of an orbit. Hence invariant polynomials cannot separate two points whose orbit closures intersect. We can ask the converse question: if v, w ∈ V are such that \overline{G · v} ∩ \overline{G · w} = ∅, is there an invariant polynomial f ∈ K[V]^G such that f(v) ≠ f(w)? The answer to this question is in general negative (see [3, Example 2.2.8]). However, if we impose additional hypotheses, we get a positive answer, as the theorem below shows (see [25]).

Theorem 1.1. Let V be a rational representation of a reductive group G. Then for v, w ∈ V, there exists f ∈ K[V]^G such that f(v) ≠ f(w) if and only if \overline{G · v} ∩ \overline{G · w} = ∅.

Henceforth, we shall assume that V is a rational representation of a reductive group G.
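As a concrete illustration (a standard example, not taken from the paper): for the conjugation action of GL_2 on a single matrix, the zero matrix lies in the orbit closure of a nilpotent matrix, so invariants cannot separate the two even though their orbits are disjoint. A minimal numpy sketch:

```python
import numpy as np

# Standard illustration (not from the paper): under conjugation of a single
# 2x2 matrix, the zero matrix lies in the orbit closure of the nilpotent N,
# so no invariant polynomial can separate the two.
N = np.array([[0.0, 1.0], [0.0, 0.0]])
Z = np.zeros((2, 2))

# Conjugating N by g = diag(t, 1/t) scales the off-diagonal entry by t^2,
# so g N g^{-1} -> 0 as t -> 0.
def conjugate(t):
    g = np.diag([t, 1.0 / t])
    return g @ N @ np.linalg.inv(g)

assert np.linalg.norm(conjugate(0.01)) < 1e-3   # orbit points approach 0

# Consistently, basic invariants (trace, determinant) agree on N and Z.
assert np.isclose(np.trace(N), np.trace(Z))
assert np.isclose(np.linalg.det(N), np.linalg.det(Z))
```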

Problem 1.2 (orbit closure problem). Decide whether the orbit closures of two given points v, w ∈ V intersect.

Definition 1.3. Two points v, w ∈ V are said to be closure equivalent if \overline{G · v} ∩ \overline{G · w} ≠ ∅. We write v ∼ w if v and w are closure equivalent, and we write v ≁ w if they are not closure equivalent.

By Theorem 1.1, we have v ∼ w if and only if f(v) = f(w) for all f ∈ K[V]^G. So ∼ is clearly an equivalence relation. Since closure equivalence can be detected by invariant polynomials, the existence of a small generating set of invariants, each of which can be computed efficiently, would give an algorithm for the orbit closure problem. Fortunately, the invariant ring K[V]^G is finitely generated (see [16, 17, 18, 26]).

Definition 1.4. We define β(K[V]^G) to be the smallest integer D such that invariants of degree ≤ D generate K[V]^G, i.e.,

β(K[V]^G) = min{D ∈ N | ⋃_{d=1}^{D} K[V]^G_d generates K[V]^G},

where N = {1, 2, ...}.

(The authors were supported by NSF grant DMS-1601229.)

We are not just interested in deciding whether orbit closures intersect – when they do not, we want to provide an explicit invariant that separates them. To be able to do this efficiently, there must exist an invariant of small enough degree that separates the two given points. A strong upper bound on β(K[V]^G) would provide evidence that such invariants exist. Such a bound can be obtained for any rational representation V of a linearly reductive group G (see [2]), but this is often too large. For the cases of interest to us, stronger bounds exist, and we recall them in Section 2. Despite having strong degree bounds, it is a difficult problem to extract a small set of generators. On the other hand, we may only need a subset of the invariants to detect closure equivalence, prompting the definition of a separating set of invariants.

Definition 1.5. A subset of invariants S ⊂ K[V]^G is called a separating set of invariants if for every pair v, w ∈ V such that v ≁ w, there exists f ∈ S such that f(v) ≠ f(w).

We make another definition.

Definition 1.6. We define β_sep(K[V]^G) to be the smallest integer D such that the invariants of degree ≤ D form a separating set of invariants, i.e.,

β_sep(K[V]^G) = min{D ∈ N | ⋃_{d=1}^{D} K[V]^G_d is a separating set of invariants}.

We now turn to a closely related problem, and to describe this we need to recall the null cone.

Definition 1.7. The null cone is N(G, V) = {v ∈ V | 0 ∈ \overline{G · v}}.

For a set of polynomials I ⊂ K[V ] we define its vanishing set

V(I) = {v ∈ V | f (v) = 0 for all f ∈ I}.

The null cone can also be defined by N(G, V) = V(K[V]^G_+), where K[V]^G_+ = ⊕_{d=1}^∞ K[V]^G_d (see [3, Definition 2.4.1, Lemma 2.4.2]).

Problem 1.8 (null cone membership problem). Decide whether a given point v ∈ V lies in the null cone N(G, V).

Since 0 is a closed orbit, a point v ∈ V is in the null cone if and only if 0 ∼ v, and hence the null cone membership problem can be seen as a subproblem of the orbit closure problem. So, the null cone membership problem could potentially be easier than the orbit closure problem. On the other hand, an algorithm for the null cone membership problem may provide a stepping stone for the orbit closure problem.

In this paper, we are interested in giving efficient algorithms for the orbit closure problem in two specific cases – matrix invariants and matrix semi-invariants. These two cases have generated considerable interest over the past few years due to their connections to computational complexity, see [24, 11, 5, 21, 22, 15, 19].

Remark 1.9. For analyzing the run time of our algorithms, we will use the unit cost arithmetic model. This is also often referred to as algebraic complexity.

1.1. Matrix invariants. Let Mat_{p,q} be the set of p × q matrices. The group GL_n acts by simultaneous conjugation on the space V = Mat^m_{n,n} of m-tuples of n × n matrices. This action is given by

g · (X_1, X_2, ..., X_m) = (gX_1g^{-1}, gX_2g^{-1}, ..., gX_mg^{-1}).

We set S(n, m) = K[V]^G. The ring S(n, m) is often referred to as the ring of matrix invariants. We will write ∼_C for the orbit closure equivalence relation ∼ with respect to this simultaneous conjugation action.

Given any separating set S, an obvious algorithm for the orbit closure problem would be to evaluate every invariant function in S at the two given points. In characteristic 0, Forbes and Shpilka construct a quasi-polynomial sized set of explicit separating invariants in this case (see [11]), but this is not sufficient to get a polynomial time algorithm. Nevertheless, Forbes and Shpilka give a deterministic parallel polynomial time algorithm for the orbit closure problem. Given an input X ∈ Mat^m_{n,n}, they construct in polynomial time a noncommutative polynomial P_X with the feature that the coefficients of the monomials in P_X are the evaluations of a generating set of invariants on X. Hence, to check if the orbit closures of two points X, Y ∈ Mat^m_{n,n} intersect, one needs to determine whether the noncommutative polynomial P_X − P_Y is the zero polynomial. There is an efficient algorithm to test whether P_X − P_Y is the zero polynomial (see [28]).

Forbes and Shpilka's algorithm has two drawbacks. First, it is not easily adapted to positive characteristic. Second, when the orbit closures of X and Y do not intersect, this algorithm does not provide a specific invariant f ∈ S(n, m) such that f(X) ≠ f(Y). In this paper, we provide an algorithm without these drawbacks.

Theorem 1.10. The orbit closure problem for the simultaneous conjugation action of GL_n on Mat^m_{n,n} can be decided in polynomial time. Further, if A, B ∈ Mat^m_{n,n} and A ≁_C B, then we can provide an invariant f ∈ S(n, m) that separates A and B.

Our algorithm does not, however, provide a small set of separating invariants. Our approach also gives better bounds for the degree of separating invariants than the existing ones in the literature, see [24].

Theorem 1.11. We have β_sep(S(n, m)) ≤ 2n²√n. If we assume char(K) = 0, then we have β_sep(S(n, m)) ≤ 2n√n.

The bound in characteristic 0 is especially interesting because there are quadratic lower bounds for the degree of generating invariants in this case, see [12]. This also improves the bound in [6] for the degree of invariants defining the null cone.

1.2. Matrix semi-invariants. We consider the left-right action of G = SL_n × SL_n on the space V = Mat^m_{n,n} of m-tuples of n × n matrices. This action is given by

(P, Q) · (X_1, X_2, ..., X_m) = (PX_1Q^{-1}, PX_2Q^{-1}, ..., PX_mQ^{-1}).

We set R(n, m) = K[V]^G. The ring R(n, m) is often referred to as the ring of matrix semi-invariants. We will write ∼_LR for the equivalence relation ∼ with respect to this left-right action.

The null cone membership problem for matrix semi-invariants has attracted a lot of attention due to its connections to non-commutative circuits and identity testing, see [5, 15, 19, 21, 22]. In characteristic 0, Gurvits' algorithm gives a deterministic polynomial time algorithm, see [5, 15]. There is a different algorithm which works for any sufficiently large field in [22].

Theorem 1.12 ([5, 15, 22]). The null cone membership problem for the left-right action of SL_n × SL_n on Mat^m_{n,n} can be decided in polynomial time.

Remark 1.13. We will say the null cone membership problem and orbit closure problem for matrix invariants (resp. matrix semi-invariants) to refer to the corresponding problem for the simultaneous conjugation action of GL_n (resp. left-right action of SL_n × SL_n) on Mat^m_{n,n}.

Using the above theorem, we are able to show a polynomial time reduction from the orbit closure problem for matrix semi-invariants to the orbit closure problem for matrix invariants. In fact, the converse also holds, i.e., there is a polynomial time reduction from the orbit closure problem for matrix invariants to the orbit closure problem for matrix semi-invariants. As a consequence, we have a polynomial time algorithm for the orbit closure problem for matrix semi-invariants as well. Moreover, due to the nature of the reduction, we will be able to find a separating invariant when the orbit closures of two points do not intersect.

Theorem 1.14. The orbit closure problem for the left-right action of SL_n × SL_n on Mat^m_{n,n} can be decided in polynomial time. Further, for A, B ∈ Mat^m_{n,n}, if A ≁_LR B, we can provide an invariant f ∈ R(n, m) that separates A and B.

In characteristic 0, an analytic algorithm for the orbit closure problem for matrix semi-invariants has also been obtained by Allen-Zhu, Garg, Li, Oliveira and Wigderson in [1]. Our algorithm is algebraic, independent of characteristic, and provides a separating invariant when the orbit closures do not intersect.
In [6], bounds on β_sep(R(n, m)) were given. In this paper, we give better bounds using a reduction to matrix invariants.

Theorem 1.15. We have β_sep(R(n, m)) ≤ n² β_sep(S(n, mn²)).

Using the bounds on matrix invariants in Theorem 1.11, we get bounds for matrix semi-invariants.

Corollary 1.16. We have β_sep(R(n, m)) ≤ 2n⁴√n. If we assume char(K) = 0, then we have β_sep(R(n, m)) ≤ 2n³√n.

1.3. Organization. In Section 2, we collect a number of preliminary results on matrix invariants and matrix semi-invariants. In Section 3, we show polynomial time reductions in both directions between the orbit closure problems for matrix invariants and matrix semi-invariants. We give a polynomial time algorithm for finding a basis of a subalgebra of matrices in Section 4. We collect some results on norms that we need in Section 5. In Section 6, we give the algorithm for the orbit closure problem for matrix invariants, and prove bounds on separating invariants. Finally, in Section 7, we prove Theorem 1.15.

2. Preliminaries on matrix invariants and matrix semi-invariants

2.1. Matrix invariants. Let us recall that the ring of matrix invariants S(n, m) is the invariant ring for the simultaneous conjugation action of GL_n on Mat^m_{n,n}, the space of m-tuples of n × n matrices. Procesi showed that in characteristic 0, the ring S(n, m) is generated by traces of words in the matrices.

A word in an alphabet set Σ is an expression of the form i_1 i_2 ... i_k with i_j ∈ Σ. We denote the set of all words in an alphabet Σ by Σ* (the Kleene closure of Σ). The set Σ* includes the empty word ε. For a word w = i_1 i_2 ... i_k, we define its length l(w) = k. For a positive integer m, we write [m] := {1, 2, ..., m}, the set of all positive integers less than or equal to m. For a word w = i_1 i_2 ... i_k ∈ [m]*, and for X = (X_1, ..., X_m) ∈ Mat^m_{n,n}, we define X_w = X_{i_1} X_{i_2} ··· X_{i_k}. The function T_w : Mat^m_{n,n} → K given by T_w(X) := Tr(X_w) is an invariant polynomial. Formally stated, Procesi's result is the following.

Theorem 2.1 ([27]). Assume char(K) = 0. The invariant functions of the form T_w, w ∈ [m]*, generate S(n, m).

Razmyslov studied trace identities, and as a consequence of his work, we have:

Theorem 2.2 ([29]). Assume char(K) = 0. Then β(S(n, m)) ≤ n².

In positive characteristic, generators of the invariant ring were given by Donkin in [13, 14]. In simple terms, we have to replace traces with coefficients of the characteristic polynomial. For an n × n matrix X, let c(X) = det(I + tX) = Σ_{j=0}^n σ_j(X) t^j denote its characteristic polynomial. The function X ↦ σ_j(X) is a polynomial in the entries of X, and is called the j-th characteristic coefficient of X. Note that σ_0 = 1, σ_1(X) = Tr(X) and σ_n(X) = det(X). For any word w, we define the invariant polynomial σ_{j,w} ∈ S(n, m) by σ_{j,w}(X) := σ_j(X_w) for X = (X_1, X_2, ..., X_m) ∈ Mat^m_{n,n}.

Theorem 2.3 ([13, 14]). The set of invariant functions {σ_{j,w} | w ∈ [m]*, 1 ≤ j ≤ n} is a generating set for the invariant ring S(n, m).
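To make the invariants T_w concrete, here is a small numpy sketch (our own illustration, not code from the paper) that evaluates T_w at a tuple of matrices and checks numerically that it is constant on conjugation orbits:

```python
import numpy as np

rng = np.random.default_rng(0)

def T_w(word, X):
    """Evaluate the invariant T_w(X) = Tr(X_{i_1} ... X_{i_k}) at a tuple X."""
    n = X[0].shape[0]
    prod = np.eye(n)
    for i in word:
        prod = prod @ X[i]
    return np.trace(prod)

# Our own sanity check: T_w is constant on simultaneous-conjugation orbits.
n, m = 3, 2
X = [rng.standard_normal((n, n)) for _ in range(m)]
g = rng.standard_normal((n, n))          # generically invertible
g_inv = np.linalg.inv(g)
gX = [g @ Xi @ g_inv for Xi in X]        # g . (X_1, X_2)

word = [0, 1, 1, 0]  # the word 1221 in the paper's 1-based notation
assert np.isclose(T_w(word, X), T_w(word, gX))
```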
In a radically different approach from the case of characteristic 0, we recently proved a polynomial bound on the degree of generators.

Theorem 2.4 ([6]). We have β(S(n, m)) ≤ (m + 1)n⁴.

2.2. Matrix semi-invariants. The ring of matrix semi-invariants R(n, m) is the ring of invariants for the left-right action of SL_n × SL_n on Mat^m_{n,n}. There is a determinantal description for semi-invariants of quivers, see [7, 10, 30]. Matrix semi-invariants is a special case – it is the ring of semi-invariants for the generalized Kronecker quiver, for a particular choice of a dimension vector, see for example [5].

Given two matrices A = (a_{ij}) of size p × q and B = (b_{ij}) of size r × s, we define their tensor (or Kronecker) product A ⊗ B ∈ Mat_{pr,qs} to be the block matrix whose (i, j) block of size r × s is a_{ij}B:

A ⊗ B =
[ a_{11}B  a_{12}B  ···  a_{1q}B ]
[ a_{21}B  a_{22}B  ···  a_{2q}B ]
[    ⋮                      ⋮    ]
[ a_{p1}B  a_{p2}B  ···  a_{pq}B ]

Associated to each T = (T_1, T_2, ..., T_m) ∈ Mat^m_{d,d}, we define a homogeneous invariant f_T ∈ R(n, m) of degree dn by

f_T(X_1, X_2, ..., X_m) = det(T_1 ⊗ X_1 + T_2 ⊗ X_2 + ··· + T_m ⊗ X_m).
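The invariants f_T are easy to evaluate with a Kronecker-product routine such as numpy's `np.kron`. A small sketch (our own check, not from the paper) evaluating f_T and verifying its left-right invariance and its degree dn homogeneity on random inputs:

```python
import numpy as np

rng = np.random.default_rng(1)

def f_T(T, X):
    """Evaluate f_T(X) = det(T_1 (x) X_1 + ... + T_m (x) X_m)."""
    return np.linalg.det(sum(np.kron(Ti, Xi) for Ti, Xi in zip(T, X)))

def random_sl(n):
    # product of unipotent triangular matrices, so the determinant is 1
    a = np.eye(n) + np.triu(rng.standard_normal((n, n)), 1)
    b = np.eye(n) + np.tril(rng.standard_normal((n, n)), -1)
    return a @ b

n, m, d = 2, 2, 3
X = [rng.standard_normal((n, n)) for _ in range(m)]
T = [rng.standard_normal((d, d)) for _ in range(m)]

# f_T is invariant under the left-right action of SL_n x SL_n ...
P, Q = random_sl(n), random_sl(n)
PXQ = [P @ Xi @ np.linalg.inv(Q) for Xi in X]
assert np.isclose(f_T(T, X), f_T(T, PXQ))

# ... and is homogeneous of degree dn in X.
assert np.isclose(f_T(T, [2 * Xi for Xi in X]), 2 ** (d * n) * f_T(T, X))
```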

Theorem 2.5 ([7, 10, 30]). The invariant ring R(n, m) is spanned by all f_T with T ∈ Mat^m_{d,d} and d ≥ 1.

In particular, notice that if d is not a multiple of n, then there are no degree d invariants. In other words, we have R(n, m) = ⊕_{d=0}^∞ R(n, m)_{dn}. A polynomial bound on the degree of generators in characteristic 0 was shown in [5], and the restriction on the characteristic was removed in [6].

Theorem 2.6 ([5, 6]). We have β(R(n, m)) ≤ mn⁴. If char(K) = 0, then β(R(n, m)) ≤ n⁶.

Let N(n, m) denote the null cone for the left-right action of SL_n × SL_n on Mat^m_{n,n}. The following is proved in [5].

Theorem 2.7 ([5]). For X ∈ Mat^m_{n,n}, the following are equivalent:
(1) X ∉ N(n, m);
(2) For some d ∈ N, there exists T ∈ Mat^m_{d,d} such that f_T(X) ≠ 0;
(3) For any d ≥ n − 1, there exists T ∈ Mat^m_{d,d} such that f_T(X) ≠ 0.

The above theorem relies crucially on the regularity lemma proved in [21]. A more conceptual proof of the regularity lemma is given in [4] using universal division algebras, although it lacks the constructiveness of the original proof. An algorithmic version of the above theorem appears in [22].

Theorem 2.8 ([22]). For X ∈ Mat^m_{n,n}, there is a deterministic polynomial time (in n and m) algorithm which determines if X ∉ N(n, m). Further, for X ∉ N(n, m) and any n − 1 ≤ d ≤ poly(n), the algorithm provides, in polynomial time, an explicit T ∈ Mat^m_{d,d} such that f_T(X) ≠ 0.

Remark 2.9. We will henceforth refer to the algorithm in Theorem 2.8 above as the IQS algorithm.

For 1 ≤ j, k ≤ d, we define E_{j,k} ∈ Mat_{d,d} to be the d × d matrix which has a 1 in the (j, k)-th entry, and 0 everywhere else.

Definition 2.10. If X = (X_1, ..., X_m) ∈ Mat^m_{n,n}, we define X^[d] = (X_i ⊗ E_{j,k})_{i,j,k} ∈ Mat^{md²}_{nd,nd}, where the tuples (i, j, k) ∈ [m] × [d] × [d] are ordered lexicographically.

Proposition 2.11. The following are equivalent:
(1) There exists f ∈ R(n, m) such that f(A) ≠ f(B);
(2) There exists g ∈ R(nd, md²) such that g(A^[d]) ≠ g(B^[d]) for either d = n − 1 or d = n.

Proof. We first show (1) =⇒ (2). We can assume f = f_T for some T ∈ Mat^m_{e,e} for some e ≥ 1. Without loss of generality, assume f(A) ≠ 0. Then we have µ = f(B)/f(A) ≠ 1. For any µ ≠ 1, the powers µ^{n−1} and µ^n cannot both be 1. Hence for at least one of d ∈ {n − 1, n}, we have µ^d = f(B)^d/f(A)^d ≠ 1, and hence f(A)^d ≠ f(B)^d. Now, it suffices to show the existence of g ∈ R(nd, md²) such that g(A^[d]) = f(A)^d for all A ∈ Mat^m_{n,n}.

But now, consider

f_T(A)^d = det(Σ_{i=1}^m T_i ⊗ A_i)^d
         = det(Σ_{i=1}^m T_i^{⊕d} ⊗ A_i)
         = det(Σ_{i=1}^m Σ_{k=1}^d T_i ⊗ E_{k,k} ⊗ A_i)
         = det(Σ_{i,k} T_i ⊗ (A_i ⊗ E_{k,k}))

(swapping the last two tensor factors is conjugation by a permutation matrix, which does not change the determinant). Let S ∈ Mat^{md²}_{e,e} be given by S_{i,j,k} = δ_{j,k} T_i. We can take g = f_S.

We now show (2) =⇒ (1). Indeed, we can choose g = f_S for some S ∈ Mat^{md²}_{e,e}, e ≥ 1. We have

f_S(A^[d]) = det(Σ_{i,j,k} S_{i,j,k} ⊗ (A^[d])_{i,j,k})
           = det(Σ_{i,j,k} S_{i,j,k} ⊗ A_i ⊗ E_{j,k})
           = det(Σ_i (Σ_{j,k} S_{i,j,k} ⊗ E_{j,k}) ⊗ A_i)
           = det(Σ_i S̃_i ⊗ A_i),

where S̃_i = Σ_{j,k} S_{i,j,k} ⊗ E_{j,k}. Let S̃ = (S̃_1, ..., S̃_m) ∈ Mat^m_{de,de}. Then the above calculation tells us that f_S̃(A) = f_S(A^[d]) = g(A^[d]). Hence we have

f_S̃(A) = g(A^[d]) ≠ g(B^[d]) = f_S̃(B).

We can take f = f_S̃. □
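The key identity behind the proof, f_T(A)^d = f_S(A^[d]) with S_{i,j,k} = δ_{j,k} T_i, can be verified numerically on random inputs. A minimal numpy sketch (the sizes n = e = 2, d = 3 are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, e, d = 2, 2, 2, 3   # e: size of T, d: blow-up parameter

A = [rng.standard_normal((n, n)) for _ in range(m)]
T = [rng.standard_normal((e, e)) for _ in range(m)]

def E(j, k, d):
    M = np.zeros((d, d)); M[j, k] = 1.0; return M

# A^[d] = (A_i (x) E_{j,k}), ordered lexicographically in (i, j, k)
A_d = [np.kron(A[i], E(j, k, d)) for i in range(m) for j in range(d) for k in range(d)]

# S_{i,j,k} = delta_{j,k} T_i, in the same ordering
S = [(T[i] if j == k else np.zeros((e, e))) for i in range(m) for j in range(d) for k in range(d)]

fT_A = np.linalg.det(sum(np.kron(T[i], A[i]) for i in range(m)))
fS_Ad = np.linalg.det(sum(np.kron(Si, Ai) for Si, Ai in zip(S, A_d)))

# the identity f_T(A)^d = f_S(A^[d]) from the proof of Proposition 2.11
assert np.isclose(fT_A ** d, fS_Ad)
```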

Corollary 2.12. The orbit closures of A and B do not intersect if and only if the orbit closures of A^[d] and B^[d] do not intersect for at least one choice of d ∈ {n − 1, n}.

2.3. Commuting action of another group. Let G be a group acting on V. Suppose we have another group H acting on V, and the actions of G and H commute. To distinguish the actions, we will denote the action of H by ⋆. Closure equivalence for the action of G on V is then compatible with the action of H. More precisely, we have the following:

Lemma 2.13. Let v, w ∈ V and h ∈ H. Then v ∼ w if and only if h ⋆ v ∼ h ⋆ w.

We have a natural identification of V = Mat^m_{n,n} with Mat_{n,n} ⊗ K^m. The latter viewpoint makes apparent an action of GL_m on V that commutes with the left-right action of SL_n × SL_n, as well as with the simultaneous conjugation action of GL_n. In explicit terms, for P = (p_{i,j}) ∈ GL_m and X = (X_1, ..., X_m) ∈ Mat^m_{n,n}, we have

P ⋆ (X_1, ..., X_m) = (Σ_j p_{1,j}X_j, Σ_j p_{2,j}X_j, ..., Σ_j p_{m,j}X_j).

Corollary 2.14. The orbit closure problem for both the left-right action of SL_n × SL_n and the simultaneous conjugation action of GL_n on Mat^m_{n,n} commutes with the action of GL_m.

2.4. A useful surjection. We consider the map

φ : Mat^m_{n,n} → Mat^{m+1}_{n,n}
(X_1, ..., X_m) ↦ (I, X_1, ..., X_m).

This gives a surjection on the coordinate rings φ* : K[Mat^{m+1}_{n,n}] → K[Mat^m_{n,n}], which descends to a surjective map on invariant rings as below (see [9, 6]).

Proposition 2.15. The map φ* : R(n, m + 1) ↠ S(n, m) is surjective.

Proof. We first want to show that φ*(R(n, m + 1)) ⊆ S(n, m). Since {f_T | T ∈ Mat^{m+1}_{d,d}} is a spanning set for the ring R(n, m + 1), it suffices to show that φ*(f_T) ∈ S(n, m) for all T ∈ Mat^{m+1}_{d,d}. By definition, we have φ*(f_T)(X_1, X_2, ..., X_m) = f_T(I, X_1, X_2, ..., X_m). For g ∈ GL_n, we have

φ*(f_T)(g · (X_1, X_2, ..., X_m)) = φ*(f_T)(gX_1g^{-1}, gX_2g^{-1}, ..., gX_mg^{-1})
    = f_T(I, gX_1g^{-1}, ..., gX_mg^{-1})
    = det(T_1 ⊗ I + T_2 ⊗ gX_1g^{-1} + ··· + T_{m+1} ⊗ gX_mg^{-1})
    = det((I ⊗ g)(T_1 ⊗ I + T_2 ⊗ X_1 + ··· + T_{m+1} ⊗ X_m)(I ⊗ g)^{-1})
    = det(T_1 ⊗ I + T_2 ⊗ X_1 + ··· + T_{m+1} ⊗ X_m)
    = f_T(I, X_1, ..., X_m)
    = φ*(f_T)(X_1, ..., X_m).

Now, we show that φ* surjects onto S(n, m). For each f ∈ S(n, m), we need to find an f̃ such that φ*(f̃) = f. Define f̃ by

f̃(X_1, ..., X_{m+1}) = f(Adj(X_1)X_2, Adj(X_1)X_3, ..., Adj(X_1)X_{m+1}),

where Adj(X) denotes the adjugate of a matrix. It is easy to see that f̃ is invariant for the action of SL_n × SL_n. Further, we have

(φ*(f̃))(X_1, ..., X_m) = f̃(I, X_1, ..., X_m) = f(Adj(I)X_1, ..., Adj(I)X_m) = f(X_1, ..., X_m). □

In fact, from the above proof, we can see that for f ∈ S(n, m), we can construct a pre-image easily. We record this as a corollary.

Corollary 2.16. For f ∈ S(n, m), the invariant polynomial f̃ ∈ R(n, m + 1) defined by

f̃(X_1, ..., X_{m+1}) = f(Adj(X_1)X_2, Adj(X_1)X_3, ..., Adj(X_1)X_{m+1})

is a pre-image of f under φ*, i.e., φ*(f̃) = f.
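The pre-image construction of Corollary 2.16 is concrete enough to test numerically. A small numpy sketch (our own illustration; the word-trace choice of f and the helper names are ours), computing the adjugate as det(X)·X^{-1}, which is valid for invertible X:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 3, 2

def adjugate(X):
    # Adj(X) = det(X) X^{-1} for invertible X; enough for a polynomial
    # identity check on random, hence generic, inputs.
    return np.linalg.det(X) * np.linalg.inv(X)

# Take f in S(n, m) to be the word trace f(X1, X2) = Tr(X1 X2).
def f(X):
    return np.trace(X[0] @ X[1])

# Its pre-image under phi^*, as in Corollary 2.16:
def f_tilde(Y):  # Y = (Y1, ..., Y_{m+1})
    adj = adjugate(Y[0])
    return f([adj @ Yi for Yi in Y[1:]])

X = [rng.standard_normal((n, n)) for _ in range(m)]
# phi^*(f~) = f: evaluating f~ at (I, X1, ..., Xm) recovers f(X),
# since Adj(I) = I.
assert np.isclose(f_tilde([np.eye(n)] + X), f(X))
```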

3. Time complexity equivalence of orbit closure problems

In this section, we will show polynomial time reductions between the orbit closure problem for matrix invariants and the orbit closure problem for matrix semi-invariants. We will in fact show a more robust reduction. Let G be a group acting on V.

Definition 3.1. An algorithm for the orbit closure problem with witness is an algorithm that decides if v ∼ w for any two points v, w ∈ V, and if v ≁ w, provides a witness f ∈ K[V]^G such that f(v) ≠ f(w).

3.1. Reduction from matrix invariants to matrix semi-invariants. Let A, B ∈ Mat^m_{n,n}. We can consider φ(A), φ(B) ∈ Mat^{m+1}_{n,n}, where φ : Mat^m_{n,n} → Mat^{m+1}_{n,n} is the map described in Section 2.4.

Proposition 3.2. The following are equivalent:
(1) There exists f ∈ S(n, m) such that f(A) ≠ f(B);
(2) There exists g ∈ R(n, m + 1) such that g(φ(A)) ≠ g(φ(B)).

Proof. Let us first prove (1) =⇒ (2). Given f ∈ S(n, m) such that f(A) ≠ f(B), take g to be a preimage of f, i.e., φ*(g) = f. Now,

g(φ(A)) = φ*(g)(A) = f(A) ≠ f(B) = φ*(g)(B) = g(φ(B)).

To prove (2) =⇒ (1), simply take f = φ*(g). □

Corollary 3.3. Let A, B ∈ Mat^m_{n,n}. Then we have A ∼_C B if and only if φ(A) ∼_LR φ(B).

Corollary 3.4. There is a polynomial time reduction from the orbit closure problem with witness for matrix invariants to the orbit closure problem with witness for matrix semi-invariants.

Proof. Given A, B ∈ Mat^m_{n,n}, we construct φ(A) and φ(B). Appeal to the orbit closure problem with witness for matrix semi-invariants with input φ(A) and φ(B). There are two possible outcomes. If φ(A) ∼_LR φ(B), then we conclude that A ∼_C B. If φ(A) ≁_LR φ(B) and f ∈ R(n, m + 1) separates φ(A) and φ(B), then φ*(f) is an invariant that separates A and B. The reduction is clearly polynomial time. □

3.2. Reduction from matrix semi-invariants to matrix invariants. We will show that the orbit closure problem for matrix semi-invariants can be reduced to the orbit closure problem for matrix invariants. We will need some preparatory lemmas before we give the algorithm.

Lemma 3.5. Assume A = (I, A_2, ..., A_m) and B = (I, B_2, ..., B_m). If we denote Ã = (A_2, ..., A_m) and B̃ = (B_2, ..., B_m), then we have

A ∼_LR B ⟺ Ã ∼_C B̃.

Proof. This follows from Corollary 3.3 applied to Ã and B̃. □

The above lemma paves the way for a slightly more general result.

Proposition 3.6. Assume A, B ∈ Mat^m_{n,n} are such that det(A_1) = det(B_1) ≠ 0. If we denote Ã = (A_1^{-1}A_2, ..., A_1^{-1}A_m) and B̃ = (B_1^{-1}B_2, ..., B_1^{-1}B_m), then we have

A ∼_LR B ⟺ Ã ∼_C B̃.

Proof. Let us first suppose that det(A_1) = det(B_1) = 1. Then for g = (A_1^{-1}, id) ∈ SL_n × SL_n, we have g · A = (I, A_1^{-1}A_2, ..., A_1^{-1}A_m) = φ(Ã). Similarly, for h = (B_1^{-1}, id) ∈ SL_n × SL_n, we have h · B = φ(B̃). Now, we have

A ∼_LR B ⟺ φ(Ã) ∼_LR φ(B̃) ⟺ Ã ∼_C B̃.

The general case for det(A_1) ≠ 0 follows because the orbit closures of A and B intersect if and only if the orbit closures of λ · A = (λA_1, ..., λA_m) and λ · B = (λB_1, ..., λB_m) intersect for any λ ∈ K^*; choosing λ with λ^n det(A_1) = 1 (possible since K is algebraically closed) reduces to the previous case. □

Lemma 3.7. For any non-zero row vector v = (v_1, ..., v_m), we can construct efficiently a matrix P ∈ GL_m such that the top row of the matrix P is v.

Proof. This is straightforward and left to the reader. □
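One explicit construction for Lemma 3.7 (our own sketch of the argument left to the reader; the helper name is ours): put v in the top row, and fill the remaining rows with standard basis vectors that avoid the column of a nonzero entry of v:

```python
import numpy as np

def complete_to_gl(v):
    """Construct P in GL_m whose top row is the nonzero row vector v."""
    v = np.asarray(v, dtype=float)
    m = v.size
    k = int(np.flatnonzero(v)[0])      # first index with v[k] != 0
    P = np.zeros((m, m))
    P[0] = v                           # top row is v
    # Fill the remaining rows with standard basis vectors e_i, i != k;
    # expanding det along column k shows det(P) = +/- v[k] != 0.
    rows = [i for i in range(m) if i != k]
    for r, i in zip(range(1, m), rows):
        P[r, i] = 1.0
    return P

P = complete_to_gl([0.0, 2.0, 5.0])
assert np.allclose(P[0], [0.0, 2.0, 5.0])
assert abs(np.linalg.det(P)) > 1e-12   # P is invertible
```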



Algorithm 3.8. Now we give an algorithm to reduce the orbit closure problem with witness for matrix semi-invariants to the orbit closure problem with witness for matrix invariants.

Input: A, B ∈ Mat^m_{n,n}

Step 1: Check if A or B are in the null cone by the IQS algorithm. If both of them are in the null cone, then A ∼_LR B. If precisely one of them is in the null cone, then A ≁_LR B and the IQS algorithm gives an invariant that separates A and B. If neither is in the null cone, then we proceed to Step 2.

Step 2: Neither A nor B is in the null cone. Now, for d ∈ {n − 1, n}, the IQS algorithm constructs T(d) ∈ Mat^m_{d,d} such that f_{T(d)}(A) ≠ 0 in polynomial time. We denote f_d := f_{T(d)}. If f_d(A) ≠ f_d(B), then A ≁_LR B and f_d is the separating invariant. Else f_d(A) = f_d(B) for both choices of d ∈ {n − 1, n}, and we proceed to Step 3.

Step 3: For d ∈ {n − 1, n}, we have f_d(A) = det(Σ_{i,j,k} (T(d)_i)_{j,k} (A_i ⊗ E_{j,k})). We can construct efficiently a matrix P ∈ Mat_{md²,md²} whose first row is ((T(d)_i)_{j,k})_{i,j,k} by Lemma 3.7. Consider U = P ⋆ A^[d], V = P ⋆ B^[d] ∈ Mat^{md²}_{nd,nd}. This has the property that det(U_1) = f_d(A) ≠ 0. Since we did not terminate in Step 2, we know that det(U_1) = det(V_1). Let us recall that by Corollary 2.12, A ∼_LR B if and only if A^[d] ∼_LR B^[d] for both d = n − 1 and d = n. By Lemma 2.13, A^[d] ∼_LR B^[d] if and only if U ∼_LR V.

To decide whether U ∼_LR V, we do the following. Let Ũ = (U_1^{-1}U_2, ..., U_1^{-1}U_{md²}) and Ṽ = (V_1^{-1}V_2, ..., V_1^{-1}V_{md²}). By Proposition 3.6, we have U ∼_LR V if and only if Ũ ∼_C Ṽ. But this can be seen as an instance of an orbit closure problem with witness for matrix invariants. Also note that if we obtain an invariant separating Ũ and Ṽ, the steps can be traced back to get an invariant separating A and B.
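The claim in Step 3 that det(U_1) = f_d(A) can be checked numerically: the first row of P applied to A^[d] yields Σ_{i,j,k} (T_i)_{j,k} A_i ⊗ E_{j,k} = Σ_i A_i ⊗ T_i, which is permutation-similar to Σ_i T_i ⊗ A_i. A small numpy sketch of ours (sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, d = 2, 2, 2

A = [rng.standard_normal((n, n)) for _ in range(m)]
T = [rng.standard_normal((d, d)) for _ in range(m)]

# f_d(A) = det(sum_i T_i (x) A_i)
f_d = np.linalg.det(sum(np.kron(T[i], A[i]) for i in range(m)))

def E(j, k, d):
    M = np.zeros((d, d)); M[j, k] = 1.0; return M

# First component of U = P * A^[d]: the first row ((T_i)_{j,k}) of P
# applied to the tuple (A_i (x) E_{j,k}) gives
# U_1 = sum_{i,j,k} (T_i)_{j,k} A_i (x) E_{j,k}.
U1 = sum(T[i][j, k] * np.kron(A[i], E(j, k, d))
         for i in range(m) for j in range(d) for k in range(d))

# U_1 equals sum_i A_i (x) T_i, permutation-similar to sum_i T_i (x) A_i,
# so det(U_1) = f_d(A) as asserted in Step 3.
assert np.isclose(np.linalg.det(U1), f_d)
```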

Corollary 3.9. There is a polynomial time reduction from the orbit closure problem with witness for matrix semi-invariants to the orbit closure problem with witness for matrix invariants.

4. A polynomial time algorithm for finding a subalgebra basis

Let {C_1, ..., C_m} ⊆ Mat_{n,n} be a finite subset of Mat_{n,n}. Consider the (unital) subalgebra C ⊆ Mat_{n,n} generated by C_1, ..., C_m. In other words, C is the smallest subspace of Mat_{n,n} containing the identity matrix I and the matrices C_1, ..., C_m that is closed under multiplication. For a word w = i_1 i_2 ... i_b we define C_w = C_{i_1} C_{i_2} ··· C_{i_b}. We also define C_ε = I for the empty word ε. We will describe a polynomial time algorithm for finding a basis for C.

First observe that C is spanned by {C_w | w ∈ [m]*}. While this is an infinite spanning set, we will extract a basis from it in polynomial time. We define a total order on [m]*.

Definition 4.1. For words w_1 = i_1 i_2 ... i_b and w_2 = j_1 j_2 ... j_c, we write w_1 ≺ w_2 if either (1) l(w_1) < l(w_2), or (2) l(w_1) = l(w_2) and for the smallest integer r for which i_r ≠ j_r, we have i_r < j_r.

Remark 4.2. If w ≺ w′, we will say w is smaller than w′.

We call a word w a pivot if C_w does not lie in the span of all C_u, u ≺ w. Otherwise, we call w a non-pivot.

Lemma 4.3. Let P = {w | w is a pivot}. Then {C_w | w ∈ P} is a basis for C. We will call this the pivot basis.

Definition 4.4. For words w = i_1 i_2 ... i_b and w′ = j_1 j_2 ... j_c, we define the concatenation ww′ = i_1 i_2 ... i_b j_1 j_2 ... j_c.

Lemma 4.5. If w is a non-pivot, then xwy is a non-pivot for all words x, y ∈ [m]*.

Proof. If w is a non-pivot, then C_w = Σ_k a_k C_{w_k} for some w_k ≺ w and a_k ∈ K. Then we have C_{xwy} = Σ_k a_k C_{x w_k y}. Hence, xwy is a non-pivot as well. □

Corollary 4.6. Every subword of a pivot word is a pivot.

Lemma 4.7. For any word u ∈ [m]* \ {ε}, u^n is not a pivot.

Proof. By the Cayley–Hamilton theorem, C_{u^n} = (C_u)^n lies in the span of all C_{u^j} = (C_u)^j with 0 ≤ j < n. Since u is not the empty word, u^j ≺ u^n for j < n. Therefore, u^n is not a pivot. □

Corollary 4.8. Any word w containing a nonempty subword of the form u^n is not a pivot.

Proposition 4.9. Suppose that w = w_1 w_2 ··· w_d ∈ [m]* is a word of length d, and w has no subword of the form u^n (i.e., no nonempty subword that is repeated ≥ n consecutive times). Then the total number of subwords of w is at least d(d+1)/(2n).
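The Cayley–Hamilton step can be checked numerically: adding the flattened matrix M^n to the flattened lower powers does not increase the rank of the span. A small numpy sketch (our own check):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
M = rng.standard_normal((n, n))

# Flatten I, M, M^2, ..., M^n into vectors and compare ranks: by
# Cayley-Hamilton, M^n is a linear combination of the lower powers,
# so adding it does not increase the rank.
powers = [np.linalg.matrix_power(M, j).ravel() for j in range(n + 1)]
rank_lower = np.linalg.matrix_rank(np.stack(powers[:-1]))
rank_all = np.linalg.matrix_rank(np.stack(powers))
assert rank_all == rank_lower
```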

Proof. Suppose that w_{j+1} w_{j+2} ··· w_l = w_{i+1} w_{i+2} ··· w_k with j < i. Then we have w_{j+1} w_{j+2} ··· w_i = w_{i+1} w_{i+2} ··· w_{2i−j} (if 2i − j ≤ k), w_{i+1} w_{i+2} ··· w_{2i−j} = w_{2i−j+1} w_{2i−j+2} ··· w_{3i−2j} (if 3i − 2j ≤ k), etc. The subword w_{j+1} w_{j+2} ··· w_i appears at least ⌊(k − j)/(i − j)⌋ consecutive times inside w_{j+1} w_{j+2} ··· w_k. If i ≤ k/n, then this subword appears at least

⌊(k − j)/(i − j)⌋ ≥ ⌊(ni − j)/(i − j)⌋ ≥ ⌊ni/i⌋ = n

consecutive times. This contradicts the assumption. Let S be the set of all subwords of the form w_{i+1} w_{i+2} ··· w_k with k ≤ d and i ≤ k/n. These words are all distinct. For every k with 0 ≤ k ≤ d there are ⌊k/n⌋ + 1 choices for i. So the total number of subwords is at least

Σ_{k=0}^d (⌊k/n⌋ + 1) > Σ_{k=0}^d k/n = d(d+1)/(2n). □

Corollary 4.10. If the length of the longest pivot word is d, then we must have d(d+1)/(2n) ≤ dim(C).

Proof. This follows from the above proposition because the number of pivots is at most dim(C). □

Corollary 4.11. The length of the longest pivot is less than √(2n³).

Proof. We have d²/(2n) < d(d+1)/(2n) ≤ dim(C) ≤ n². This gives d² < 2n³. □
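The pivot search that the paper describes next as Algorithm 4.12 can be sketched in Python. This is a minimal implementation of ours, with numerical rank checks (a tolerance-based `matrix_rank`) standing in for exact linear algebra over K:

```python
import numpy as np

def subalgebra_basis(mats, tol=1e-9):
    """Return the pivot (word, matrix) pairs spanning the unital
    subalgebra of Mat_{n,n} generated by `mats` (words are 0-based)."""
    n = mats[0].shape[0]
    pivots = [((), np.eye(n))]            # the empty word, C_eps = I
    basis = np.eye(n).reshape(1, -1)      # flattened pivot matrices
    frontier = [((), np.eye(n))]          # pivots of the current length
    while frontier:
        new_frontier = []
        # extend each pivot of length t-1 on the right by every generator
        for word, C in frontier:
            for i, Ci in enumerate(mats):
                Cw = C @ Ci
                cand = np.vstack([basis, Cw.reshape(1, -1)])
                # pivot test: does C_w lie in the span of smaller pivots?
                if np.linalg.matrix_rank(cand, tol=tol) > basis.shape[0]:
                    pivots.append((word + (i,), Cw))
                    new_frontier.append((word + (i,), Cw))
                    basis = cand
        frontier = new_frontier
    return pivots

# Example: the algebra generated by E_{1,2} and E_{2,1} is all of Mat_{2,2};
# the pivot words are (), 1, 2, 12 (0-based: (), (0,), (1,), (0,1)).
E12 = np.array([[0.0, 1.0], [0.0, 0.0]])
E21 = np.array([[0.0, 0.0], [1.0, 0.0]])
P = subalgebra_basis([E12, E21])
assert len(P) == 4  # dim Mat_{2,2} = 4
```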

Now, we describe an efficient algorithm to construct the set of pivots.

Algorithm 4.12 (Finding a basis for a subalgebra of Mat_{n,n}).
Input: n × n matrices C_1, C_2, ..., C_m
Step 1: Set t = 1 and P = P_0 = [(ε, I)].
Step 2: If P_{t−1} = [w_1, w_2, ..., w_s], define P_t = [w_1 1, ..., w_1 m, w_2 1, ..., w_2 m, ..., w_s 1, ..., w_s m].
Step 3: Proceeding through the list P_t, check if an entry (w, C_w) is a pivot. This can be done in polynomial time, as we have to simply check if C_w is a linear combination of smaller pivots. If it is a pivot, add it to P. If it is not a pivot, then remove it from P_t. Upon completing this step, the list P_t contains all pivots of length t.
Step 4: Set t = t + 1. If P_{t−1} ≠ [], go back to Step 2. Else, return P and terminate.

Corollary 4.13. There is a polynomial time algorithm to construct the set of pivots. Further, this algorithm also records the word associated to each pivot.

Proof. To show that the above algorithm runs in polynomial time, it suffices to show that the number of words we consider is at most polynomial. Indeed, if there are k pivots of length d, then we only consider km words of length d + 1. Since k ≤ n², the number of words we consider in each degree is at most n²m. We only consider words of length up to √(2n³). Hence, the number of words considered is polynomial (in n and m). □

5. Norms

In order to justify our algorithm for the orbit closure problem for matrix invariants in positive characteristic, we need some results on general norms. If the reader is only interested in characteristic 0 or is willing to accept Theorem 6.2, then this section can be skipped.

Suppose that K is an algebraically closed field, R is a finite dimensional associative K-algebra and N : R → K is a norm, meaning that it has the following properties:
(1) N is a polynomial;
(2) N(1) = 1;
(3) N(ab) = N(a)N(b) for all a, b ∈ R.
We define a trace function by Tr(a) = d/dt N(1 + at) |_{t=0}. The map Tr is K-linear.

Lemma 5.1. We have d/dt N(a + bt) = N(a + bt) Tr((a + bt)^{-1} b) for all t ∈ K for which a + bt is invertible.

Proof. We have

 (d/dt) N(a + bt) = (d/ds) N(a + bt + bs) |_{s=0} = N(a + bt) (d/ds) N(1 + (a + bt)⁻¹ bs) |_{s=0} = N(a + bt) Tr((a + bt)⁻¹ b),

where the middle equality uses the multiplicativity of N. □


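To make these definitions concrete: for R = Mat₂(K) with N = det, the trace function Tr(a) = (d/dt) N(1 + at)|_{t=0} recovers the usual matrix trace, and multiplicativity of N is just multiplicativity of det. Below is a small sketch over Q (a hypothetical illustration, not from the paper; it treats the 2 × 2 case only, and extracting the linear coefficient by evaluating at t = ±1 is an implementation shortcut):

```python
from fractions import Fraction

def norm_2x2(M):
    """N = det on Mat_2: a polynomial with N(I) = 1 and N(ab) = N(a)N(b)."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def trace_from_norm(A):
    """Tr(a) = d/dt N(1 + at)|_{t=0}: the coefficient of t in det(I + tA).
    Since det(I + tA) = 1 + c1*t + c2*t^2 for a 2x2 matrix A, we recover
    c1 = (p(1) - p(-1)) / 2 from two exact evaluations of p(t) = det(I + tA)."""
    def p(t):
        return norm_2x2([[1 + t * A[0][0], t * A[0][1]],
                         [t * A[1][0], 1 + t * A[1][1]]])
    return (p(Fraction(1)) - p(Fraction(-1))) / 2
```

As the lemma's proof suggests, the resulting Tr is visibly K-linear in A.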

Lemma 5.2. Suppose that N = N1^{k1} N2^{k2} ⋯ Nr^{kr}, where N1, N2, . . . , Nr are distinct irreducible polynomials, k1, k2, . . . , kr are positive and N1(1) = N2(1) = ⋯ = Nr(1) = 1. Then N1, N2, . . . , Nr are norms as well.

Proof. We have

 N1^{k1}(ab) N2^{k2}(ab) ⋯ Nr^{kr}(ab) = N(ab) = N(a)N(b) = N1^{k1}(a) ⋯ Nr^{kr}(a) N1^{k1}(b) ⋯ Nr^{kr}(b).

If the irreducible polynomial Ni(a) divides Nj(ab), then it also divides Nj(a) by setting b = 1. It follows that Ni(a) and Nj(a) have to be the same up to a scalar. Since Ni(1) = Nj(1) = 1, we have Ni = Nj and i = j. So Ni(a) and Ni(b) divide Ni(ab), and Ni(ab) = λNi(a)Ni(b) with λ ∈ K. For a = b = 1 we get λ = 1. □

Theorem 5.3. Suppose that N1 , N2 : R → K are two norms, a1 , a2 , . . . , ak ∈ R span R as a K-vector space and N1 (1 + tai ) = N2 (1 + tai ) for all i and all t. Then we have N1 = N2 .

Proof. Without loss of generality, we may assume that N1(a) and N2(a) do not have a common factor as polynomials. If N1 and N2 are both constant, then N1(a) = N1(1) = 1 = N2(1) = N2(a) and we are done. So let us assume that N1 and N2 are not both constant. If the characteristic of K is p > 0, then we can also assume that N1 and N2 are not p-th powers (because otherwise we replace N1 and N2 by their unique p-th roots). For a, b ∈ R and t ∈ K with a + tb ∈ R× we have

 (d/dt) (N1(a + tb)/N2(a + tb)) = (N1(a + tb)/N2(a + tb)) (Tr1((a + tb)⁻¹b) − Tr2((a + tb)⁻¹b)).

For a = 1, b = ai and t = 0 we get Tr1(ai) = Tr2(ai) for all i. Because Tr1 and Tr2 are linear and a1, . . . , ak span R, we get Tr1 = Tr2. It follows that (d/dt) N1(a + tb)/N2(a + tb) = 0 for all a, b. In characteristic 0, a rational function whose derivative vanishes in every direction is constant, so N1/N2 = N1(1)/N2(1) = 1 and N1 = N2; since they have no common factor, both must be constant, a contradiction. In characteristic p > 0, this implies that N1(a)/N2(a) is a p-th power of a rational function. It follows that both N1(a) and N2(a) are p-th powers because they do not have a common factor. This contradicts our assumptions. □

6. Orbit closure problem for matrix invariants

Let A, B ∈ Mat_{n,n}^m with A = (A1, . . . , Am) and B = (B1, . . . , Bm). Define

 Ci = [ Ai 0 ; 0 Bi ]

for all i. Let C be the algebra generated by C1, C2, . . . , Cm. Let Z1, Z2, . . . , Zs be the pivot basis of C and write

 Zj = [ Xj 0 ; 0 Yj ]

for all j.

Proposition 6.1. Suppose char(K) = 0. Then we have A ∼C B if and only if Tr(Xj) = Tr(Yj) for all j.

Proof. Two orbit closures do not intersect if and only if there is an invariant that separates them. By Theorem 2.1, the invariant ring is generated by invariants of the form X ↦ Tr(Xw) for some word w in the alphabet {1, 2, . . . , m}. Note that C is the span of all

 Cw = [ Aw 0 ; 0 Bw ],

where w is a word. Now the proposition follows by linearity of trace. □
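As an illustration, the pivot construction of Algorithm 4.12 and the trace criterion of Proposition 6.1 combine into a short program. This is a minimal sketch over Q with exact arithmetic; the function names and the input matrices are illustrative, not from the paper:

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(1) if i == j else Fraction(0) for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def flatten(M):
    return [Fraction(x) for row in M for x in row]

class Span:
    """A subspace of Q^d kept in echelon form, supporting membership tests."""
    def __init__(self):
        self.rows = []  # rows with strictly increasing leading indices

    def add(self, v):
        """Reduce v against the basis; if a nonzero residual remains, adjoin it
        and return True (v was independent), else return False."""
        v = list(v)
        for r in self.rows:
            lead = next(i for i, x in enumerate(r) if x != 0)
            if v[lead] != 0:
                c = v[lead] / r[lead]
                v = [a - c * b for a, b in zip(v, r)]
        if all(x == 0 for x in v):
            return False
        self.rows.append(v)
        self.rows.sort(key=lambda r: next(i for i, x in enumerate(r) if x != 0))
        return True

def pivot_basis(Cs):
    """Algorithm 4.12: return the pivots (w, C_w); the matrices C_w form a
    basis of the algebra generated by the identity and the matrices in Cs."""
    n = len(Cs[0])
    span = Span()
    span.add(flatten(identity(n)))
    pivots = [((), identity(n))]
    current = [((), identity(n))]
    while current:
        nxt = []
        for w, Cw in current:
            for k, Ck in enumerate(Cs):
                Cwk = mat_mul(Cw, Ck)
                if span.add(flatten(Cwk)):  # C_{wk} is not a combination of earlier pivots
                    pivots.append((w + (k,), Cwk))
                    nxt.append((w + (k,), Cwk))
        current = nxt
    return pivots

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def block_diag(A, B):
    n = len(A)
    Z = [[Fraction(0)] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            Z[i][j] = Fraction(A[i][j])
            Z[n + i][n + j] = Fraction(B[i][j])
    return Z

def closures_intersect_char0(As, Bs):
    """Proposition 6.1 (char 0): the orbit closures of A and B intersect
    iff Tr(X_j) = Tr(Y_j) for every pivot Z_j = diag(X_j, Y_j)."""
    n = len(As[0])
    Cs = [block_diag(A, B) for A, B in zip(As, Bs)]
    for w, Z in pivot_basis(Cs):
        X = [row[:n] for row in Z[:n]]  # upper-left block, X_j = A_w
        Y = [row[n:] for row in Z[n:]]  # lower-right block, Y_j = B_w
        if trace(X) != trace(Y):
            return False  # Tr over the word w separates the orbit closures
    return True
```

For example, a single nilpotent matrix and the zero matrix have intersecting orbit closures (all traces Tr(Aw) vanish), while a rank-one idempotent and the zero matrix are separated by the trace of a word of length one.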



In order to get a version of the above proposition in arbitrary characteristic, we use the results from Section 5.

Theorem 6.2. We have A ∼C B if and only if det(I + tXj) = det(I + tYj) as polynomials in t for all j.

Proof. We define two norms N1, N2 on C by

 N1([ A 0 ; 0 B ]) = det(A),  N2([ A 0 ; 0 B ]) = det(B).

Suppose that for every j we have

 N1(I + tZj) = det(I + tXj) = det(I + tYj) = N2(I + tZj).

Since Z1, . . . , Zs spans C, we have N1 = N2 by Theorem 5.3. This means that

 det(I + tX) = N1(I + tZ) = N2(I + tZ) = det(I + tY)

for every Z = [ X 0 ; 0 Y ] ∈ C. In particular, if w is a word, we have

 Cw = [ Aw 0 ; 0 Bw ] ∈ C,

so

 det(I + tAw) = det(I + tBw)

for all words w ∈ [m]⋆. Hence σj,w(A) = σj,w(B) for all 1 ≤ j ≤ n and w ∈ [m]⋆. By Theorem 2.3, these form a set of generating invariants, and hence A ∼C B.

For the other direction, suppose det(I + tXj) ≠ det(I + tYj) for some j. Since Z1, . . . , Zs is a pivot basis, Zj = Cw for some w ∈ [m]⋆. So we have Xj = Aw and Yj = Bw, and det(I + tAw) ≠ det(I + tBw). In particular, σj,w(A) ≠ σj,w(B) for some j. For this j, σj,w is an invariant that separates A and B, and hence A ≁C B. □

Proof of Theorem 1.10. Given A, B ∈ Mat_{n,n}^m, let Ci = [ Ai 0 ; 0 Bi ], and let C be the subalgebra generated by C1, . . . , Cm. Construct the pivot basis Z1, . . . , Zs of C. For all j, write Zj = [ Xj 0 ; 0 Yj ]. Further, for each j we have Zj = Cwj for some word wj ∈ [m]⋆, and consequently Xj = Awj and Yj = Bwj.

If char(K) = 0, we only need to check whether Tr(Xj) = Tr(Yj). If they are equal for all j, then we have A ∼C B. Else, we have Tr(Xj) ≠ Tr(Yj) for some j, i.e., Twj(A) ≠ Twj(B), and A ≁C B. For arbitrary characteristic, we need to check instead whether det(I + tXj) = det(I + tYj) as a polynomial in t for each j. But this can be done efficiently. When A ≁C B, the algorithm finds j with 1 ≤ j ≤ n and w ∈ [m]⋆ such that σj,w(A) ≠ σj,w(B). This means that σj,w ∈ S(n, m) is an invariant that separates A and B. □

We will now prove the bounds for separating invariants. Let A, B ∈ Mat_{n,n}^m with A ≁C B and write Ci = [ Ai 0 ; 0 Bi ]. Let C ⊆ Mat_{2n,2n} be the subalgebra generated by C1, . . . , Cm. Since the upper-right and lower-left quadrants are always zero for every matrix in C, the algebra C is at most 2n²-dimensional.

Lemma 6.3. Suppose that d is the length of a pivot w. Then we must have d ≤ 2n√n.

Proof. Since dim(C) ≤ 2n², by Corollary 4.10 we have

 d²/(2n) ≤ binom(d+1, 2)/n ≤ dim(C) ≤ 2n².

This gives d² ≤ 4n³, and hence d ≤ 2n√n. □
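In positive characteristic, the trace comparison is replaced by comparing the coefficients σj,w of det(I + tAw) and det(I + tBw). The following is a standalone sketch of that comparison over Q (the words w are assumed to be supplied, e.g. by Algorithm 4.12; the Leibniz expansion is used only because the matrices in this illustration are tiny):

```python
from fractions import Fraction
from itertools import permutations

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def perm_sign(p):
    """Sign of a permutation via its inversion count."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def char_like_poly(M):
    """Coefficients [c_0, ..., c_n] of det(I + tM) as a polynomial in t,
    so c_j is sigma_{j,w} evaluated at M = A_w. Computed by the Leibniz
    expansion over all permutations (fine for small n)."""
    n = len(M)
    total = [Fraction(0)] * (n + 1)
    for perm in permutations(range(n)):
        term = [Fraction(1)]
        for i in range(n):
            # (I + tM) entry at (i, perm[i]), as a linear polynomial in t
            const = Fraction(1) if perm[i] == i else Fraction(0)
            term = poly_mul(term, [const, Fraction(M[i][perm[i]])])
        s = perm_sign(perm)
        total = [a + s * b for a, b in zip(total, term)]
    return total

def separates(Aw, Bw):
    """Return the smallest j with sigma_{j,w}(A) != sigma_{j,w}(B), or None
    if det(I + tA_w) = det(I + tB_w) as polynomials in t."""
    pa, pb = char_like_poly(Aw), char_like_poly(Bw)
    for j in range(1, len(pa)):
        if pa[j] != pb[j]:
            return j
    return None
```

When separates returns some j, the invariant σj,w separates the two tuples, exactly as in the proof of Theorem 1.10.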



Proof of Theorem 1.11. Given A, B ∈ Mat_{n,n}^m with A ≁C B, we construct the set of pivots of C. By the above lemma, the length of any pivot is at most 2n√n. If char(K) = 0, then an invariant Tw separates A and B for some pivot w. This means there is an invariant of degree deg(Tw) = ℓ(w) ≤ 2n√n that separates them. If char(K) > 0, we must have det(I + tAw) ≠ det(I + tBw) for some pivot w. Hence for some 1 ≤ j ≤ n, σj,w(A) ≠ σj,w(B). This gives an invariant of degree ≤ 2n²√n that separates them. □

Remark 6.4. Suppose L is a subfield of K, and suppose A, B ∈ Mat_{n,n}^m(L). Then we observe that the entire algorithm works over L. However, we should point out that the algorithm does not check whether the orbit closures of A and B for the action of GLn(L) intersect; instead, it checks whether the orbit closures of A and B for the action of GLn(K) intersect. The reduction from the orbit closure problem for matrix semi-invariants to the orbit closure problem for matrix invariants will continue to use scalars only from L if we assume L is infinite (or at least that L is large enough), because the IQS algorithm requires this. Finally, if we take L = Q, it is easy to see that the running times of our algorithms for matrix invariants as well as matrix semi-invariants are polynomial in the bit length of the input.

Remark 6.5. The null cone for the simultaneous conjugation action of GLn on Mat_{n,n}^m is in fact defined by invariants of degree ≤ n√n in characteristic 0. To see this, we use the same argument as in the proof of Theorem 1.11 above, with B = 0. In particular, this means that dim C ≤ n² instead of 2n², giving us the required bound of n√n instead

of 2n√n. In positive characteristic, we can get a bound of n²√n, but better bounds are already known; see [6].

7. Bounds for separating matrix semi-invariants

The reduction given in Section 3.2 is good enough for showing that the orbit closure problems for matrix invariants and matrix semi-invariants are in the same complexity class. In this section we give a stronger reduction, with the aim of finding better bounds for the degree of separating invariants for matrix semi-invariants. This reduction can also be made algorithmic, and can replace the reduction in Section 3.2. However, we will only focus on obtaining bounds for separating invariants.

Let T ∈ Mat_{d,d}^m be such that fT ≠ 0. For X ∈ Mat_{n,n}^m, consider

 LT(X) = Σ_{k=1}^m Tk ⊗ Xk =
  [ L1,1(X) . . . L1,d(X) ]
  [    .    . . .    .    ]
  [ Ld,1(X) . . . Ld,d(X) ],

where Li,j(X) represents an n × n block. From the definition of the Kronecker product of matrices, one can check that Li,j(X) = Σ_{k=1}^m (Tk)i,j Xk, i.e., a linear combination of the Xk. By definition, fT(X) = det(LT(X)). Let

 MT(X) = Adj(LT(X)) =
  [ M1,1(X) . . . M1,d(X) ]
  [    .    . . .    .    ]
  [ Md,1(X) . . . Md,d(X) ],

where Mi,j(X) represents an n × n block. The entries of MT(X) are not linear in the entries of the matrices Xk; instead, the entries are polynomials of degree dn − 1 in the entries (Xk)i,j. For X ∈ Mat_{n,n}^m, let us define Xi,j,k = Xk Mi,j(X), for 1 ≤ k ≤ m and 1 ≤ i, j ≤ d. Consider the map ζ : Mat_{n,n}^m → Mat_{n,n}^{md²} given by X ↦ (Xi,j,k)i,j,k. This gives a map on the coordinate rings ζ* : K[Mat_{n,n}^{md²}] → K[Mat_{n,n}^m].

Lemma 7.1. The map ζ* descends to a map on invariant rings ζ* : S(n, md²) → R(n, m).

Proof. Observe that Li,j(X) is a linear expression in the matrices Xk. Hence for σ = (P, Q⁻¹) ∈ SLn × SLn, we have Li,j(σ · X) = Li,j(P X1 Q, P X2 Q, . . . , P Xm Q) = P Li,j(X) Q. In particular, we see that LT(σ · X) = (P ⊗ Id)LT(X)(Q ⊗ Id). For any two square matrices A, B of the same size, we have Adj(AB) = Adj(B) Adj(A). Hence, we have

 MT(σ · X) = Adj(LT(σ · X))

 = Adj((P ⊗ Id)LT(X)(Q ⊗ Id))
 = Adj(Q ⊗ Id) MT(X) Adj(P ⊗ Id)
 = (Q⁻¹ ⊗ Id) MT(X) (P⁻¹ ⊗ Id).

The last equality follows from the fact that for a matrix whose determinant is equal to 1, the inverse and the adjugate are the same. We deduce that Mi,j(σ · X) = Q⁻¹ Mi,j(X) P⁻¹.

Hence, we have (σ · X)i,j,k = (P Xk Q)(Q⁻¹ Mi,j(X) P⁻¹) = P Xi,j,k P⁻¹, and so for g ∈ S(n, md²) we have g(ζ(σ · X)) = g(ζ(X)). Now observe that ζ*(g) ∈ R(n, m), since ζ*(g)(σ · X) = g(ζ(σ · X)) = g(ζ(X)) = ζ*(g)(X). □

Corollary 7.2. If g ∈ S(n, md²) is such that g(ζ(A)) ≠ g(ζ(B)), then A ≁LR B.

Assume A, B ∈ Mat_{n,n}^m and assume A ≁LR B. We want to show the existence of an invariant f ∈ R(n, m) with f(A) ≠ f(B) such that deg(f) is as small as possible. Indeed, since A ≁LR B, there is a choice of S ∈ Mat_{k,k}^m, for some k ≥ 1, such that fS(A) ≠ fS(B). Without loss of generality, assume fS(B) ≠ 0. Hence fS(A)/fS(B) ≠ 1. We must then have fS(A)^d/fS(B)^d ≠ 1 for at least one choice of d ∈ {n − 1, n}: if both ratios were 1, their quotient fS(A)/fS(B) would be 1 as well. In particular, for such a d, (fS)^d is an invariant of degree dkn that separates A and B.
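The objects entering this reduction are easy to compute explicitly. The following sketch forms LT(X) = Σk Tk ⊗ Xk and evaluates fT(X) = det(LT(X)) over Q; the matrix sizes and inputs are illustrative, and the determinant routine is ordinary fraction-based elimination rather than anything from the paper:

```python
from fractions import Fraction

def kron(T, X):
    """Kronecker product T ⊗ X: the (i, j)-th n×n block is T[i][j] * X."""
    d, n = len(T), len(X)
    return [[Fraction(T[i // n][j // n]) * X[i % n][j % n]
             for j in range(d * n)] for i in range(d * n)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def L_T(Ts, Xs):
    """L_T(X) = sum_k T_k ⊗ X_k; its n×n block at (i, j) is sum_k (T_k)_{i,j} X_k."""
    M = kron(Ts[0], Xs[0])
    for Tk, Xk in zip(Ts[1:], Xs[1:]):
        M = mat_add(M, kron(Tk, Xk))
    return M

def det(M):
    """Exact determinant over Q by Gaussian elimination with pivoting."""
    M = [[Fraction(x) for x in row] for row in M]
    n, sign, prod = len(M), 1, Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fraction(0)  # a zero column: the determinant vanishes
        if p != c:
            M[c], M[p] = M[p], M[c]
            sign = -sign
        prod *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return sign * prod
```

The n × n block of L_T(X) at position (i, j) is Σk (Tk)i,j Xk, as stated above, and fT(X) is obtained as det(L_T(Ts, Xs)).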

Lemma 7.3. Let A, B ∈ Mat_{n,n}^m and let T ∈ Mat_{d,d}^m be such that fT(A) = fT(B) ≠ 0. Then there exists g ∈ S(n, md²) such that ζ*(g)(A) ≠ ζ*(g)(B).

Proof. We have a degree dkn invariant f such that f(A) ≠ f(B) by the above discussion. There exists U ∈ Mat_{dk,dk}^m such that fU(A) ≠ fU(B), since such invariants are a spanning set for the invariants of degree dkn. Now, for X ∈ Mat_{n,n}^m, consider

 N(X) = (Σ_{k=1}^m Uk ⊗ Xk)(MT(X) ⊗ Ik).

We observe that N(X) is a dk × dk block matrix, where each block has size n × n and is a linear combination of the Xi,j,k. In other words, the (p, q)-th block N(X)p,q is a linear combination Σ_{i,j,k} λ^{i,j,k}_{p,q} Xi,j,k for some λ^{i,j,k}_{p,q} ∈ K. Now we can define an invariant g ∈ S(n, md²): for Z = (Zi,j,k)i,j,k ∈ Mat_{n,n}^{md²}, we define NZ to be the dk × dk block matrix whose (p, q)-th block is Σ_{i,j,k} λ^{i,j,k}_{p,q} Zi,j,k, and we let g(Z) = det(NZ). It is easy to check that g ∈ S(n, md²). Moreover, we have

 ζ*(g)(X) = g(ζ(X)) = det(N_{ζ(X)}) = det(N(X)) = fU(X) det(MT(X))^k.

Since fT(A) = fT(B) ≠ 0, we have det(MT(A)) = det(MT(B)) ≠ 0. In particular, since fU(A) ≠ fU(B), we have ζ*(g)(A) ≠ ζ*(g)(B) as required. □

Now we can finally prove Theorem 1.15.

Proof of Theorem 1.15. Suppose A, B ∈ Mat_{n,n}^m with A ≁LR B. A and B cannot both be in the null cone. If A is in the null cone, then we have an invariant f with deg(f) = n(n − 1) such that f(B) ≠ 0 = f(A), and similarly if B is in the null cone. Now, let us assume that neither A nor B is in the null cone. By the above discussion, for at least one choice of d ∈ {n − 1, n}, we either have T ∈ Mat_{d,d}^m such that fT(A) ≠ fT(B), or we have an invariant of the form f = ζ*(g) such that f(A) ≠ f(B). In the former case, we have an invariant of degree nd ≤ n² that separates A and B. The latter implies that ζ(A) ≁C ζ(B). Hence, we have an invariant g ∈ S(n, md²) of degree ≤ βsep(S(n, md²)) such that g(ζ(A)) ≠ g(ζ(B)). Now, since ζ is a map of degree dn, ζ*(g) ∈ R(n, m) is a polynomial of degree deg(g)·dn ≤ n² βsep(S(n, md²)) ≤ n² βsep(S(n, mn²)) that separates A and B. □

Acknowledgements. We thank the authors of [1] for sending us early versions of their papers.

References

[1] Z. Allen-Zhu, A. Garg, Y. Li, R. Oliveira and A. Wigderson, Operator scaling via geodesically convex optimization, invariant theory and polynomial identity testing, preprint.
[2] H. Derksen, Polynomial bounds for rings of invariants, Proc. Amer. Math. Soc. 129 (2001), no. 4, 955–963.
[3] H. Derksen and G. Kemper, Computational Invariant Theory, Invariant Theory and Algebraic Transformation Groups I, Encyclopaedia of Mathematical Sciences 130, Springer-Verlag, 2002.
[4] H. Derksen and V. Makam, On noncommutative rank and tensor rank, Linear and Multilinear Algebra, published online (2017).
[5] H. Derksen and V. Makam, Polynomial degree bounds for matrix semi-invariants, Adv. Math. 310 (2017), 44–63.
[6] H. Derksen and V. Makam, Generating invariant rings of quivers in arbitrary characteristic, J. Algebra 489 (2017), 435–445.
[7] H. Derksen and J. Weyman, Semi-invariants of quivers and saturation of Littlewood-Richardson coefficients, Journal of the American Math. Soc. 13 (2000), 467–479.
[8] H. Derksen and J. Weyman, On Littlewood-Richardson polynomials, Journal of Algebra 255 (2002), 247–257.
[9] M. Domokos, Relative invariants of 3 × 3 matrix triples, Linear and Multilinear Algebra 47 (2000), 175–190.
[10] M. Domokos and A. N. Zubkov, Semi-invariants of quivers as determinants, Transformation Groups 6 (2001), 9–24.
[11] M. A. Forbes and A. Shpilka, Explicit Noether normalization for simultaneous conjugation via polynomial identity testing, Approximation, Randomization, and Combinatorial Optimization, Lecture Notes in Comput. Sci., vol. 8096, 527–542.
[12] E. Formanek, Generating the ring of matrix invariants, in: F. M. J. van Oystaeyen (ed.), Ring Theory, Lecture Notes in Mathematics 1197, Springer, Berlin–Heidelberg, 1986, 73–82.
[13] S. Donkin, Invariants of several matrices, Invent. Math. 110 (1992), 389–401.
[14] S. Donkin, Invariant functions on matrices, Math. Proc. Cambridge Philos. Soc. 113 (1993), 23–43.
[15] A. Garg, L. Gurvits, R. Oliveira and A. Wigderson, A deterministic polynomial time algorithm for non-commutative rational identity testing, arXiv:1511.03730, 2015.
[16] W. Haboush, Reductive groups are geometrically reductive, Ann. of Math. 102 (1975), 67–85.
[17] D. Hilbert, Über die Theorie der algebraischen Formen, Math. Ann. 36 (1890), 473–534.
[18] D. Hilbert, Über die vollen Invariantensysteme, Math. Ann. 42 (1893), 313–370.
[19] P. Hrubeš and A. Wigderson, Non-commutative arithmetic circuits with division, ITCS'14, Princeton, NJ, USA, 2014.
[20] G. Ivanyos, M. Karpinski, Y. Qiao and M. Santha, Generalized Wong sequences and their applications to Edmonds' problems, J. Comput. System Sci. 81 (2015), 1373–1386.
[21] G. Ivanyos, Y. Qiao and K. V. Subrahmanyam, Non-commutative Edmonds' problem and matrix semi-invariants, arXiv:1508.00690 [cs.DS], 2015.
[22] G. Ivanyos, Y. Qiao and K. V. Subrahmanyam, Constructive noncommutative rank computation in deterministic polynomial time over fields of arbitrary characteristics, arXiv:1512.03531 [cs.CC], 2016.
[23] A. D. King, Moduli of representations of finite-dimensional algebras, Quart. J. Math. Oxford Ser. 45 (1994), no. 180, 515–530.
[24] K. Mulmuley, Geometric Complexity Theory V: Equivalence between blackbox derandomization of polynomial identity testing and derandomization of Noether's normalization lemma, arXiv:1209.5993.
[25] D. Mumford, J. Fogarty and F. Kirwan, Geometric Invariant Theory, third edition, Springer-Verlag, Berlin, 1994.
[26] M. Nagata, Invariants of a group in an affine ring, J. Math. Kyoto Univ. 3 (1963/1964), 369–377.
[27] C. Procesi, The invariant theory of n × n matrices, Adv. in Math. 19 (1976), 306–381.
[28] R. Raz and A. Shpilka, Deterministic polynomial identity testing in non-commutative models, Comput. Complexity 14 (2005), 1–19.

[29] Y. Razmyslov, Trace identities of full matrix algebras over a field of characteristic zero, Math. USSR Izv. 8 (1974), 727–760.
[30] A. Schofield and M. Van den Bergh, Semi-invariants of quivers for arbitrary dimension vectors, Indag. Math. (N.S.) 12 (2001), 125–138.

