and, compared to the exact roots, they are correct to at least 7 decimal digits. .... 0.08. 1.35. 2:15 10?7. 0.30. 1.67. 1:14 10?4. 0.28. 1.17. 4:17 10?3. 15. 1.38. 1.48 ... 3:1 10?11. 0.35. 0.68. 1.87. 3:0 10?12. 0.20. 25.38. 1.02. 1:48 10?4. 20. 0.20.
Computer algebra methods for studying and computing molecular conformations Ioannis Z. Emiris and Bernard Mourrain Institut National de Recherche en Informatique et Automatique (INRIA) 2004, route des Lucioles, B.P. 93, 06902 Sophia-Antipolis, France. {emiris,mourrain}@sophia.inria.fr
June 15, 1997
Abstract
A relatively new branch of computational biology has been emerging as an eort to supplement traditional techniques of large scale search in drug design by structure-based methods, in order to improve eciency and guarantee completeness. This paper studies the geometric structure of cyclic molecules, in particular the enumeration of all possible conformations, which is crucial in nding the energetically favorable geometries, and the identication of all degenerate conformations. Recent advances in computational algebra are exploited, including distance geometry, sparse polynomial theory, and matrix methods for numerically solving nonlinear multivariate polynomial systems. Moreover, we propose a complete array of computer algebra and symbolic computational geometry methods for modeling the rigidity constraints, formulating the problems in algebraic terms and, lastly, visualizing the computed conformations. The use of computer algebra systems and of public domain software is illustrated, in addition to more specialized programs developed by the authors, which are also freely available. Throughout our discussion, we show the relevance of successful paradigms and algorithms from geometry and robot kinematics to computational biology. Keywords: Molecular conformations, structure-based design, geometric and kinematic constraints, computer algebra, equation solving.
1 Introduction Identifying molecular structure is a critical question in pharmaceutical drug design and discovery for human medicine as well as veterinary products, insecticides and herbicides [BMB94]. More specically, geometric structure is essential in function identication concerning intramolecular and intermolecular properties [HK88], docking of relatively small exible ligands to macromolecules, especially when the structure and the chemically predominant features at the receptor site's are unknown [LK92] and, lastly, pharmacophoric pattern matching [CJW+ 94]. A contemporary eort is to supplement traditional techniques of large scale search by structure-based design methods in order to accelerate the screening process or guarantee its completeness. To realize the eort and cost involved in these domains we note that human drugs take an average of 8 to 12 years of research until a new product arrives in the market [BMB94]. This paper proposes computer algebra and symbolic computational geometry methods for modeling, computing and visualizing all spatial congurations, or conformations, of molecules. These methods are applied to small or medium-sized molecules, a domain that has been very active lately [Lea91, BMB94, FHK+ 96]. In particular, we concentrate on cyclic, or ring, molecules; more generally, we consider constraints on cyclic portions of molecules. Enumeration of all possible conformations provides a complete and rigorous way to nd an optimal or a few acceptable congurations with respect to energy minimization. The basic hypothesis is that, at a rst approximation, energy depends only upon the torsion dihedral angles about certain bonds while bond angles and bond lengths may
1
be considered as rigid [GS70, CJW+ 94]. Analogous methods are used to examine the intimately related problem of identifying singular or unstable conformations. An immediate generalization of these problems is to computing local deformations, where conditions are sought on the rotatable bonds connecting two rigid bodies so that they are in a given relative position [SNW94], and to embedding molecules in 3-dimensional Euclidean space. We propose general methods for modeling each problem in algebraic terms so that the resulting polynomial system has small dimension. We provide a brief introduction to relatively new and very powerful algorithms for studying basic properties of systems of simultaneous polynomial equations, in particular for counting and computing their common roots and identifying their singularities. We illustrate the use of computer algebra packages, such as Maple, which are straightforward to use, as well as software libraries and specialized implementations, which include algebraic and numeric algorithms of higher performance. Most of the implementations referred to in this paper are already publicly available, otherwise they can be obtained by the authors. Lastly, we demonstrate the use of two freely available visualization tools. The principal merits of the proposed methods are their completeness, their mathematical rigor and the limited need for user intervention. Furthermore, our methods are able to deal with approximate inputs or inputs of limited accuracy and produce the best possible output under some measure. Lastly, although not required, these methods can incorporate probabilistic techniques such as those in [FHK+ 96] for dealing with large molecules. An overview of the paper follows. The next section points to related work. Section 3 examines algorithms to identify all possible conformations of a molecule, given some rigidity constraints. The problem is translated into algebraic terms in sections 3.1.1 and 3.1.2, rst by a ap angle formulation and then by distance geometry, both approaches yielding a reduction in the apparent problem dimension. Sections 3.2 and 3.3 apply recent advances in computational algebra that exploit the sparseness of the equations in order to estimate the number of solutions, then compute them numerically. We compare dierent alternatives with respect to their speed and accuracy. Section 3.4 extends our methods from 6-atom molecules to 5- and 7-atom molecules, which is interesting algebraically, since the respective polynomial systems are over- and under-constrained. Section 4 proposes constructive methods for identifying the innitesimally unstable conformations. Section 5 discusses the software used, especially an object-oriented Maple modeler of geometric constraints and the visualization tools employed. A summary of our contributions concludes the paper in section 6.
2 Related work Several algorithmic approaches have been proposed for modeling, then solving problems of geometric nature concerning molecules. For conformational search (nding all conformations minimizing a given energy function), dierent search approaches have mostly been used, besides molecular dynamics, randomized and optimization techniques [HK88, Lea91, CJW+ 94]. Such methods suer from lack of completeness and are not always suciently fast. The problem addressed here is a critical subproblem of conformational search. We focus on constraints entailed by rings, which has been extensively studied by G o and Scheraga in the 1970's [GS70, GS73]. They established the viewpoint adopted in this paper, namely of considering the dihedral torsion angles as the only non-rigid parameters. At a rst level of approximation, the search in the space of dihedral angles yields all minimum-energy conformations. Formalizing the rigidity constraints into algebraic terms may be harder than one imagines, especially since we try to minimize the dimension of the resulting polynomial system. Our rst approach discussed in section 3.1.1 uses ap angles and has been proposed by Parsons [PC94] for the cyclohexane, but had never been generalized. Distance geometry, used in section 3.1.2, provides a quite general approach and has been also used in [Hav91, HN95, HN97]. Although we do not develop these applications in detail, our methods can be used to solve kinematic and geometric constraints appearing in ligand docking as well as in searching for pharmacophores (3-dimensional structures exhibiting certain chemical properties); see [LK92, CJW+ 94, PC94, FHK+ 96]. The underlying premise is that such constraints are expressed by polynomial equations whose solution indicates either the position or the space transformation for the molecule in question. In solving the algebraic system, several ad hoc methods have been proposed, including [HN95]; see also [HN97]. The main contribution of [MZW95] is to use matrix methods for eciently solving geometric problems regarding cyclic molecules with 6 to 8 rotational degrees of freedom. In particular, extending to molecules with more than six varying dihedrals has been approached by a grid search of the space of the two last dihedrals, each grid point giving
2
rise to a six-dimensional subproblem [GS70, MZW95]. For molecules with six free dihedrals and an arbitrary number of atoms, the problem is identical to the inverse kinematics of a general serial manipulator with six rotational joints, which is a settled question for roboticists [MC92]. We suggest new methods that are general and guarantee the computation of all solutions, while exploiting the sparseness of the polynomial equations.
Figure 1: One conformation of a cyclic 6-atom molecule. We point out the relevance of our work with respect to a relatively recent eort for applying successful paradigms from combinatorial geometry, robot kinematics, mechanism theory as well as vision to predicting the structure of molecules, embedding them in Euclidean space and nding the energetically favorable conformations [PC94, FHK+ 96]. The main premise for this interaction is the observation that various structural requirements on molecules can be modeled as macroscopic geometric or kinematic constraints. This work focuses on algebraic algorithms that have been very useful in the areas mentioned above but are not well-known among structural biologists and computational chemists. Incorporating energy minimization into the conformational analysis is the nal objective and is most often performed independently, after the conformational analysis [HK88, FHK+ 96]. However, this is a hard enough problem to warrant separate consideration.
3 Molecular conformations This section examines the problem of computing all conformations of cyclic molecules of 6 atoms, then section 3.4 extends to rings of 5 and 7 atoms. Conformations specify the 3-dimensional structure of the molecule under the ring closure requirement. Energy minima are typically found among the conformations obtained by allowing only the torsion dihedral angles to vary about single covalent bonds, while keeping bond lengths and bond angles xed. A torsion dihedral angle is the solid angle between two consecutive planes, each dened by an atom and its two bonds. The medium of the molecule is ignored. Figure 1 shows a molecule drawn by izic; see section 5. The relationship to geometry and robotics is obvious once bonds are thought of as rigid joints and atoms as the mechanism's links or articulations. Figure 2 by D. Parsons [Par94] illustrates the kinematic equivalence between a cyclic 6-atom molecule and a robot with six rotational degrees of freedom (6R). The only movement allowed is rotation around the bonds' axes, so the question of identifying a conformation reduces to nding the respective pose. In kinematic terms, the molecule is equivalent to a serial mechanism in which each pair of consecutive axes intersects at a link. This implies that the link osets are zero for all six links, which will allow us to reduce the
3
Figure 2: A cyclic 6-atom molecule and the equivalent 6R robot. 6-dimensional problem to a system of 3 polynomials in 3 unknowns. The product of all link transformation matrices is the identity matrix, since the end-eector is at the same position and orientation as the base link.
3.1 Algebraic formulation
The molecule has a cyclic backbone of 6 atoms, typically of carbon. They determine primary structure, the object of our study. Carbon-hydrogen or other bonds outside the backbone are ignored. The bond lengths and angles provide the constraints while the six dihedral angles are allowed to vary.
3.1.1 A formulation using angles
We adopt an approach proposed by Parsons [PC94, Par94]. His approach has been generalized and used on other molecules, including section 3.4, since it possesses the merit of capturing the eective dimension of the underlying system and of facilitating the visualization of the molecule once all parameters have been computed. Notation is dened in gure 3. Backbone atoms are regarded as points p1 ; : : : ; p6 2 R3 ; the unknown dihedrals are the angles !1 ; : : : ; !6 about axes (p6 ; p1 ) and (pi?1 ; pi ) for i = 2; : : : ; 6. Each of triangles T1 = 4(p1; p2 ; p6 ), T2 = 4(p2 ; p3; p4 ) and T3 = 4(p4; p5 ; p6) is xed for constant bond lengths L1; : : : ; L6 and bond angles 1 ; 3 ; 5 . Then the lengths of (p2 ; p6 ), (p2 ; p4 ) and (p4 ; p6 ) are constant, hence base triangle 4(p2; p4 ; p6 ) is xed in space, dening the xy-plane of a coordinate frame. Let 1 be the (dihedral) angle between the plane of T1 and the xy-plane. Clearly, for any conformation, 1 is well-dened. Similarly we dene angles 2 and 3 . We call them ap (dihedral) angles to distinguish them from the bond dihedrals. Figure 3 attempts to capture the 3-dimensional nature of the molecule by showing the normal vectors to the triangles Ti , denoted by N (Ti) in the gure, i = 1; 2; 3. Conversely, given lengths Li , angles i for i = 1; : : : ; 6 and ap angles i for i = 1; : : : ; 3 the coordinates of all points pi are uniquely determined and hence the bond dihedral angles and the associated conformation are all well-dened. We have therefore reduced the problem to computing the three ap angles i which satisfy the constraints on bond angles 2 ; 4 ; 6 . Hence we obtain polynomial system 11 + 12 cos 2 + 13 cos 3 + 14 cos 2 cos 3 + 15 sin 2 sin 3 = 0; 21 + 22 cos 3 + 23 cos 1 + 24 cos 3 cos 1 + 25 sin 3 sin 1 = 0; 31 + 32 cos 1 + 33 cos 2 + 34 cos 1 cos 2 + 35 sin 1 sin 2 = 0; (1)
4
b2
ω2
L2
p1
11111 00000 00000 11111 00000 11111 00000 11111 00000 11111 00000 11111 θ1 00000 11111
φ1
N(T2)
r
ω1
N(T1)
L1
a3
11 00 00 11 θ 311 00 00 11 00 11 00 11 00 11
a2
p6
N(T3)
φ6 a1
φ5
L6
ω6
111111111111 000000000000 000000000000 111111111111 000000000000 111111111111 000000000000 111111111111 000000000000 111111111111 000000000000 111111111111 000000000000 111111111111 000000000000 111111111111 000000000000 111111111111 00000 11111 000000000000 111111111111 00000 11111 000000000000 111111111111 00000 11111 000000000000 111111111111 00000 11111 000000000000 111111111111 000000000000 111111111111 000000000000 111111111111 000000000000 111111111111 000000000000 111111111111 000000000000 111111111111 000000000000 111111111111 000000000000 111111111111 000000000000 111111111111 000000000000 111111111111 000000000000 111111111111
p5
b1
Figure 3: The parameters of the cyclohexane. cos2 1 + sin2 1 ? 1 = 0; cos2 2 + sin2 2 ? 1 = 0; cos2 3 + sin2 3 ? 1 = 0; where the ij are input coecients. For our resultant solvers we prefer an equivalent formulation with a smaller number of polynomials, obtained by applying the standard transformation to half-angles that gives rational equations in the new unknowns ti : ? t2i ; sin = 2ti ; i = 1; 2; 3: ti = tan 2i : cos i = 11 + i 1 + t2 t2i i This transformation captures automatically the last three equations in (1). By multiplying both sides of the i-th equation by (1 + t2j )(1 + t2k ), where (i; j; k) is a permutation of f1; 2; 3g, the polynomial system becomes
f1 = 11 + 12 t22 + 13 t23 + 14 t2 t3 + 15 t22 t23 = 0 f2 = 21 + 22 t23 + 23 t21 + 24 t3 t1 + 25 t23 t21 = 0 f3 = 31 + 32 t21 + 33 t22 + 34 t1 t2 + 35 t21 t22 = 0
(2)
where ij are input coecients.
3.1.2 A formulation using distances
Another formulation of the problem relying on the geometry of distances is presented here. It aims to show how Computational Symbolic Geometry can help in easily deriving constraints for such problems. Another application of this approach can be found in [Hav91, HN95]. Let us rst have a glimpse on the space of spheres. A sphere has a unique equation (up to a nonzero scalar) of the form u0(x2 + y2 + z 2 ) ? 2 u1 x ? 2 u2 y ? 2 u3 z + u4 = 0; (3) 4 therefore we can identify the space of spheres with a subset of the projective space P of classes (u0 : u1 : : : : : u4 ). We also denote this projective space by S .
5
Conversely, given a point (u0 : u1 : : : : : u4 ) 2 S = P4, if u0 6= 0, the quadric (3) denes a sphere of 4) where Q(u0 ; : : : ; u4 ) = center (1; uu10 ; uu02 ; uu30 ) (or (u0 : : u3) in P3) and radius R such that R2 = Q(u0u;:::;u 2 0 2 2 2 u1 + u2 + u 3 ? u 0 u4 : If u0 = 0, the quadric obtained by homogenization of (3) is the union of an ane plane and the plane at innity of P3 (dened by t = 0, where t is the homogenization variable in P3). It corresponds to a point at innity in P4. Moreover, if u0 = u1 = : : : = u3 = 0, we obtain the special sphere ! = (0 : : : : : 0 : 1), whose center is nowhere. The special spheres of radius 0, will be called point-spheres. To any ane point A = (1 : a1 : a2 : a3 ) 2 A 3 , we associate the point-sphere A_ = (1 : a1 : a2 : a3 : a21 + a22 + a23 ). This gives an embedding of A 3 in S . The space of spheres S has a natural inner-product given by the following formula: for any spheres S = (u0 : : u4); S 0 = (u00 : : u04) in S , (S jS 0 ) = u1 u01 + u2 u02 + u3 u03 ? 12 (u0 u04 + u4u00 ): It is not hard to check that if u0 = u00 = 1, then (S jS 0 ) = 12 (R2 + R02 ? d2 (O; O0 )) where R; R0 are the radii of the spheres, O; O0 their centers and d2 (O; O0 ) the square of the distance between the two centers. Therefore, for two points A; B 2 A 3 , the inner-product is (4) (A_ jB_ ) = ? 21 d2 (A; B ) and we also have (A_ jA_ ) = 0. We also check that for any point A 2 A 3 , (5) (A_ j!) = ? 21 : See, for instance, [MS94, DH91] for more information. Let us come back now to the congurations of p1 ; : : : ; p6 and let us denote by L the list of point-spheres (!; p_1 ; : : : ; p_6 ). Consider the 7 7 symmetric matrix whose coecient (i; j ) is the inner-product of the i-th element of L and its j -th element. It is the Gramm-Schmidt matrix, constructed with the list L and the inner-product ( j ). It is also called the Cayley-Menger matrix of the points p1 ; : : : ; p6; see, for instance, [Ber77]. The spheres, being represented by a vector with 5 coordinates, it is not hard to check that this matrix is at most of rank 5 (spheres are represented by vectors with 5 coordinates). Using the equations (4), (5) and dividing by ? 21 , we obtain a matrix of the form
0 0! p11 p12 p13 p14 p15 p61 1 ! p1 B B 11 c 0 c10;2 cc1;3 cu1 cu1;5 cc1;6 CCC p2 B B 1 c1;2 c 20;3 c2;4 c 2 u2;6 CC : p3 B B 1 u1;3 c2;3 c 30;4 c3;5 c 3 CC p4 B 1 2;4 3;4 4;5 4;6 C B p5 @ 1 c1;5 u2 c3;5 c4;5 0 c5;6 A p6 1 c1;6 c2;6 u3 c4;6 c5;6 0 In this matrix, all the coecients ci;j = d(pi ; pj )2 are known from the input parameters Li , i and there are only 3 unknowns u1 ; u2 ; u2 (just as in the previous section). Once these unknowns are known, we can recover the geometry
of the molecule (up to global translations and rotations). As any 6 6 minor of this matrix vanishes, we easily derive new constraints on the parameters ui . Taking for instance the diagonal minor of column (resp. row) indices (1; 2; 3; 4; 5; 6) of this matrix, we obtain an equation P1 (u1 ; u2 ) = 0 in the variables u1 ; u2 , which is of degree 2 in u1 and u2 . In a similar way, we can derive a constraint P2 (u1 ; u3 ) = 0 (resp. P3 (u2 ; u3 ) = 0) of the same type by considering the diagonal minor of indices (1; 2; 3; 4; 5; 7) (resp. (1; 2; 3; 5; 6; 7)). This yields a system of 3 equations in 3 unknowns P1 (u1 ; u2 ) = 0; P2 (u1 ; u3 ) = 0; P3 (u2 ; u3 ) = 0 which has exactly the same set of monomials as system (2).
6
3.2 Counting all conformations
A rst step in the analysis of a polynomial system consists in studying the geometry of common roots and in nding an accurate bound for the number of common solutions. We refer here to some recent advances in this area which exploit a priori knowledge of which monomials appear in the equations. We consider the algebraic closure of the given coecient eld, which is typically the eld of complex numbers. The classical theory provides the bound by Bézout's theorem on the number of roots in the projective complex space, which is simply the product of the polynomials' total degrees [CLO92, CLO97]. The bound given by Bézout's theorem here is 4 4 4 = 64. This bound is too large, for the system has a very special shape. Not all monomials of degree 4 in the variables t1 ; t2 ; t3 are present in the equations and the degree of each equation in each variable is bounded by 2. In other words, regarding only total degree overestimates the total number of solutions. Sparse elimination theory, initiated with Bernstein's theorem, provides a context for exploiting the monomial structure of the system, when roots at innity are of no interest. To model sparseness, we regard each 3-dimensional exponent vector corresponding to a non-vanishing monomial in t1 ; t2 ; t3 as a point in Z3. The convex hull of all such points denes the Newton polytope of the polynomial. The mixed volume of 3 polytopes C1 ; C2 ; C3 2 R3 is a real-valued function that generalizes Euclidean volume, formally dened as the coecient of 1 2 3 in the polynomial expressing the volume of 1 C1 + 2 C2 + 3 C3 . More generally, letting V () express the Euclidean volume function, we have the following denition: For 1 ; : : : ; n 2 R0 and convex polytopes A1 ; : : : ; An Rn , their mixed volume is precisely the coecient of 1 2 n in V (1 A1 + + n An ) expanded as a polynomial in 1 ; : : : ; n . A complete account of sparse elimination, including equivalent denitions for mixed volume in general dimension an an ecient algorithm for its computation, can be found in [PS93, EC95, Emi96] and their references. See section 5 for the location at which the code is publicly available. In our case the polytopes Ci are squares of size 2. Bernstein's theorem bounds the number of common roots with no zero coordinates by the mixed volume of the Newton polytopes. The mixed volume of the present system is 16, so Bernstein's theorem says that the number of isolated roots of system (2), counting multiplicities, with t1 6= 0; t2 6= 0; t3 6= 0 is at most 16. In fact, this bound is optimal, as shown in the next section by exhibiting an example with 16 solutions, which are moreover all real. This method is, in our case, equivalent to a variant of Bézout's theorem for multi-homogeneous polynomials; see [Sha77, CLO97]. If we homogenize each equation with respect to each variable t1 ; t2 ; t3 , we obtain a system of 3 multi-homogeneous equations in P1 P1 P1 (where P1 is the projective space of dimension 1). The multi-degrees of these equations are (0; 2; 2); (2; 0; 2); (2; 2; 0): Using Bézout's theorem in P1 P1 P1, we can bound the number of isolated roots (not necessarily with no zero coordinates) of this system by the coecient of h1 h2 h3 in the polynomial (2h1 + 2h2 ) (2h1 + 2h3) (2h2 + 2h3) which also yields 16.
3.3 Computing all conformations
This section computes all common solutions for the algebraic system expressing the molecule, thus yielding all possible conformations. Our preferred method uses resultants in order to reduce the nonlinear problem derived in the previous section to a question in linear algebra. Matrix computations are then sucient, for which we can rely on ecient and stable public implementations. An introduction can be found in [KL92, CLO92, CLO97]. Let us concentrate on zero-dimensional varieties dened by a polynomial system of n equations in n unknowns. The crucial step is to dene a matrix expressing the resultant (or the Bezoutian, or the Chow form). From this, we obtain a square matrix whose eigenvalues correspond to a coordinate in every solution vector. The rest of the coordinates are obtained by the eigenvectors of the multiplication matrix. Root nding is therefore reduced to an eigenproblem and then existing algorithms are employed from numerical linear algebra; sometimes this problem is regarded as the generalized eigenproblem or the problem of computing the matrix kernel vectors. It is not a new result in computational algebra, but it has not widely been used for practical purposes; the interested reader may consult any of [KL92, EC95, NR95, Emi96, CM96, CLO97]. An important feature of our method is precisely that it
7
reduces to matrix operations for which powerful and accurate implementations already exist in the public domain, such as [ABB+ 95] which is used in the sparse resultant solver implemented in C [Emi97]; see section 5 for getting this software. There are several algorithms to construct the matrices whose eigenvalues will give us the roots. A rst class of methods is based on Macaulay-type matrices which rely on the idea that the system is generic in some sense [KL92], and whose entries are linear in the polynomial coecients. In the classical context, the resultant's degree is a function of the classical Bézout bound, i.e., the product of total degrees. The Macaulay matrix is dened for homogenized polynomials (obtained by introducing a homogenization variable), and its size depends again on Bézout's bound. In sparse elimination, the sparse resultant has degree dependent on the mixed volume of the Newton polytopes of the equations [PS93]. The sparse resultant approach generalizes the well-known Sylvester resultant for two univariate polynomials into several multivariate polynomials as well as Macaulay's algorithm for dense polynomials; see [KL92] for a survey. Several algorithms have been proposed in the last few years for constructing sparse resultant matrices (also called Newton matrices), whose determinant is a nontrivial multiple of the sparse resultant and which express multiplication in the quotient ring. The size of these matrices scales with mixed volume. More recently, linear algebra methods for reducing the matrix size and removing some genericity requirements have been proposed. This area is covered in [PS93, CE93, EC95, NR95, Emi96, CM96] and the references thereof. The second class of methods relies on the theory of residues, and does not assume that the system is generic,but only that it is a complete intersection (the codimension of the set of solutions is the number of equations). The constructed matrices generalize the constructions proposed by Bézout for the resultant of two polynomials in one variable in as early as 1779 (see [KL92, CM96, CLO97] and the references thereof) and are typically smaller than sparse resultant matrices but the coecients are no longer linear in the coecients of the input polynomials. This technique can be used for any polynomials, either homogeneous or not, including sparse polynomials. It relies on compression algorithms, reducing the size of the matrix to the optimal size (i.e. the number of roots, counted with multiplicity). The connections between algebraic residue theory and matrix methods for polynomial system solving are another interesting feature of this approach; see, for instance, [CM96, EM96]. Thirdly, we simply apply Sylvester's resultant successively until all but one variables are eliminated. All three algebraic methods are illustrated in the next sections. The overall algorithm we propose can be outlined as follows: Input: The physical description of a molecular ring in terms of point coordinates, distances and/or angles. Output: A complete description of every possible conformation, in particular the values of the torsion dihedral angles. Alternatively, we may simply output a bound on the number of possible conformations, as discussed in section 3.2.
Major steps:
1. Formulate the problem in terms of polynomial equations. 2. Optionally compute bounds on the number of possible common solutions. 3. Use one of the 3 matrix methods sketched above and illustrated below in order to reduce the nonlinear polynomial system to a matrix problem and the solution of a single univariate polynomial. 4. Solve numerically this univariate polynomial (or compute the eigenvalues of the corresponding square matrix). This supplies one coordinate in every solution vector. 5. Use this information to compute all coordinates of the solution vectors. We treat three systems of type (2), described immediately below. Then, in the following sections, we apply the three matrix methods for system solving. The input data are the ij , dened at the end of section 3.1.1. 1. For the rst one, we consider the cyclohexane molecule which has 6 carbon atoms at equal distances and equal bond angles. Usually noise enters in the process that produces the coecients. To illustrate this phenomenon, starting with the pure cyclohexane, we randomly perturb the data by about 10% to obtain ij as the entries of matrix 2 ?310 959 774 1389 1313 3 4 ?365 755 917 1451 1269 5 : ?413 837 838 1655 1352
8
In this case, we nd the following 4 solutions, visualized by izic; see section 5 for details on izic. A chemist will immediately recognize two chair and two twisted boat, or crown, conformations
corresponding to the following roots.
t1
t2
t3
0.3684363946 0.3197251270 0.2969559367 - 0.3684363946 - 0.3197251270 - 0.2969559367 0.7126464332 - 0.01038413185 - 0.6234532743 - 0.7126464332 0.01038413185 0.6234532743 2. The second instance has the maximum number of roots, namely 16, and furthermore, they are all real. The ij coecients are given by matrix
2 ?13 ?1 ?1 24 ?1 3 4 ?13 ?1 ?1 24 ?1 5 : ?13 ?1 ?1 24 ?1
Among the 16 real roots, four correspond to eigenvalues of unit (geometric) multiplicity, while the rest form four groups, each corresponding to a triple eigenvalue. For the latter the eigenvectors give us no valid information, so we recover the values of t1 ; t2 by substituting t3 in the original system and by solving the resulting univariate equations.
9
t1
t2
10.85770360 - 10.85770360 0.3320730984 - 0.3320730984 0.7795480449 0.7795480449 0.7795480440 4.625181601 4.625181601 4.625181600 - 0.7795480440 - 0.7795480449 - 0.7795480449 - 4.625181600 - 4.625181601 - 4.625181601
t3
0.7795480449 - 0.7795480449 4.625181601 - 4.625181601 0.7795480449 10.85770360 0.7795480457 4.625181601 0.3320730984 4.625181613 - 0.7795480457 - 0.7795480449 - 10.85770360 - 4.625181613 - 4.625181601 - 0.3320730984
3. The third system is dened by the following matrix of ij .
2 p 64 ?? p p
0.7795480451 - 0.7795480451 4.625181601 - 4.625181601 0.7795480451 0.7795480451 10.85770360 4.625181601 4.625181601 0.3320730984 - 10.85770360 - 0.7795480451 - 0.7795480451 - 0.3320730984 - 4.625181601 - 4.625181601
3
p
2 p23 2 p23 75 : ? 2 23 It is not a zero-dimensional system since its solution set is the union of a curve of degree 4 and of 16 points. Among these 16 points, 4 points lie on on the line t1 = t2 = t3 and are isolated (2 of them are real), while and the other 12 are embedded in the curve. By observing that equation fi in system (2) is of degree 4 in tj ; tk only, for distinct i; j; k 2 f1; 2; 3g, we deduce that the curve's projection on the coordinate plane dened by ti = 0 is the curve dened by the equation fi = 0, for all i 2 f1; 2; 3g. Here is a picture of the real part of the curve's projection on any one of the coordinate planes: 3 2 3 2 3 2
1 2 1 2 1 2
1 2 1 2 1 2
2
1
-2
-1
00
1 x
2
-1
-2
The system that we consider and actually solve is obtained by taking the oating-point approximations of the coecients ij with 5 digits. There are then 16 complex roots, of which the real ones are the following:
10
t1
t2
t3
0.5176444559 0.5176444559 0.5176444563 -0.5176444567 -0.5176444567 -0.5176444555 0.5176444559 -1.931851652 0.5176444563 -0.5176444567 1.931851652 -0.5176444555 1.931851652 -0.5176444567 -0.5176444555 -1.931851652 0.5176444559 0.5176444563 0.5176444561 0.5176444561 -1.931851652 -0.5176444561 - 0.5176444561 1.931851652 The rst two correspond to isolated points, whereas the last 6 points are perturbations of points that lie in the curve, when taking rational inputs. We describe now the methods that we have used.
3.3.1 Sparse resultant matrices
Sparse multivariate resultants eliminate one fewer variables than the number of polynomials, all at once, and exploit any a priori knowledge on certain coecients being zero. Since not all the monomials of degree 2 are present in the given equations, the sparse resultant has lower degree, and hence reduced complexity compared to the classical multivariate resultant; see [VdW48, KL92, CLO97]. We apply this approach to all three systems and report on experimental results in section 3.3.4. The rst approach in applying resultants for solving well-constrained systems hides one variable in the coecient eld. If we consider variable t3 as part of the coecient eld, the 3 equations can be viewed as bivariate:
f1 = c11 + c12 t2 + c13 t22 = 0; f2 = c21 + c22 t1 + c23 t21 = 0; (6) 2 2 2 2 f3 = c31 + c32 t2 + c33 t1 t2 + c34 t1 + c35 t1 t2 = 0: The resultant matrix of this system has dimension 16 and its entries cij are naturally functions of t3 ; in fact, they are quadratic polynomials in t3 . The sparse resultant has total degree in the coecients equal to the sum of the mixed volumes of the bivariate well-constrained subsystems. Each of the 3 mixed volumes (in t1 ; t2 ) is equal to 4, therefore the sparse resultant degree is 12 in the variables ci;j and the size of the constructed matrix will be at least 12. The shown 16 16 matrix Ms is the resultant matrix constructed by both greedy and incremental algorithms, described in [EC95] and its references. Matrix Ms expresses a quadratic matrix polynomial in t3 , therefore its determinant is quadratic in t3 and a multiple of the sparse resultant. Solving this determinant for t3 gives us the values of t3 at the roots, while the kernel vectors correspond to the values of t1 ; t2 . More specically, each kernel vector expresses the evaluation of the following vector of monomials at the roots of t1 ; t2 . The monomial vector is precisely the sequence of monomials indexing the columns of Ms :
1; t ; t ; t ; t ; t t ; t t ; t t ; t ; t t ; t t ; t t ; t ; t t ; t t ; t t : 2
2 2
3 2
1
1 2
2 1 2
3 1 2
2 1
2 1 2
2 2 1 2
2 3 1 2
3 1
3 1 2
3 2 1 2
3 3 1 2
To recover the value of t1 ; t2 at the roots, it suces to divide certain vector entries. To nd, for instance, the value of t2 , we divide the second entry by the rst, i.e., t2 =1. However innocuous this operation may seem, the choice of which entries to use is an important numerical issue that has to be studied further. Results vary signicantly according to this choice, and certain simple strategies examined dopnot work. We have, namely, tried averaging over several ratios, e.g. (t2 =1 + t1 t2 =t1 )=2; or taking roots, e.g. t22 =1. For the second system the specialized determinant is of degree 24: ?t 4 ? 22 t 2 + 133 ?t 2 + 14 ?t 4 ? 118 t 2 + 13 : ? 186624 3 3 3 3 169 3
11
2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 Ms = 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4
c11
c12
c13
0
0
c11
c12
c13
0
0
0
0
0
c11
0
0
0
0
0
c11
c31
0
c32
0
0
0
c31
0
c32
0
0
0
0
0
0
0
0
c21
0
0
c21
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
c12
c13
0
0
0
0
0
0
0
0
0
c12
c13
0
0
0
0
0
0
0
0
c33
0
0
c34
0
c35
0
0
0
0
0
0
c33
0
0
c34
0
c35
0
0
0
0
c31
0
c32
0
0
c33
0
0
c34
0
c35
0
0
0
c31
0
c32
0
0
c33
0
0
c34
0
c35
0
0
c22
0
0
0
c23
0
0
0
0
0
0
0
0
0
0
c22
0
0
0
c23
0
0
0
0
0
0
0
c21
0
0
0
c22
0
0
0
c23
0
0
0
0
0
0
0
0
c21
0
0
0
c22
0
0
0
c23
0
0
0
0
0
0
0
0
c21
0
0
0
c22
0
0
0
c23
0
0
0
0
0
0
0
0
c21
0
0
0
c22
0
0
0
c23
0
0
0
0
0
0
0
0
c21
0
0
0
c22
0
0
0
c23
0
0
0
0
0
0
0
0
c21
0
0
0
c22
0
0
0
c23
3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5
The second way to dene an overconstrained system from the given one is to add to the given system any
polynomial of our choice. This gives the u-resultant approach if the extra polynomial is linear in the variables with indeterminate constant term u. This method distinguishes among all simple roots at the expense of increasing the problem's dimension by 1. If we choose polynomial a + b t1 , with a; b indeterminate, the mixed volumes of the 3-polynomial systems are 4; 4; 4; 16, hence the sparse resultant degree is 4 in the variables i;1 ; : : : ; i;5 , 16 in the variables a; b and its total degree is 28 The incremental algorithm was applied and the smallest matrix found is of size 48. Its determinant, which is a multiple of the resultant, when specialized to the coecients of the second system, yields the following polynomial of degree 22:
?
?
?
3869835264 b2 a4 ? 118 b2a2 + 13 b4 4 a2 + b2 2 a4 ? 22 b2a2 + 13 b4 3 : The mixed volume information indicates that the degree of the specialized sparse resultant in each of a; b is 16. In fact, the specialized sparse resultant is
?a ? 118 b a + 13 b ?a ? 22 b a + 13 b ; 4
2 2
4
4
4 3
2 2
which shows that 12 roots will come in triplets with t1 constant. In the rest of the section we report on the application of the public-domain solver in
;
http://www.inria.fr/safir/emiris
where we have chosen to hide t3 in the coecient eld. This solver is in C therefore it runs faster than the Maple solver examined in detail below. It constructs the matrix by the incremental algorithm with generic coecients, therefore independent of the instance and the precision. The time for constructing the resultant matrix is less than half a second on the DEC AlphaStation of section 3.3.4. Then, the program constructs a 32 32 companion matrix whose eigenvalues and eigenvectors are the t3 values and the kernel vectors of Ms . Since there are 32 eigenvalue and eigenvector pairs, exactly half of them will not correspond to solutions. The program applies the standard eigendecomposition routines of [ABB+ 95] over double-precision oating-point numbers, which have 53 binary digits reserved for the mantissa. It yields the following results on the rst two instances; more information is included in [Emi97]. 1. For the rst system, after rejecting false candidates, the recovered roots cause the maximum absolute value of the input polynomials to be 10?5 . We check the computed solutions against those obtained by an exact
12
Mb =
2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4
?14 t1 2 ? 13 ? t1 4 ?7488 t1 ? 864 t1 3
?2 t1 2 ? 1 ? t1 4
24 t1 + 24 t1 3
0
0
312 + 24 t1 4 + 336 t1 2
7488 t1 2
24 t1 + 24 t1 3
?14 t1 2 ? 13 ? t1 4
0
?240 t1 2 + 24 t1 4 + 312 0 2 4 2 13822 t1 + 11 t1 ? 13 ?14 t1 ? 13 ? t1 4 3 ?552 t1 ? 264 t1 24 t1 + 24 t1 3 24 t1 + 24 t1 3
?
?
? 264 t1 ? 528 t1 2 ?2 t1 2 ? 1 ? t1 4
0
0
0
? 13 ? ?24 ? 24 t1 4 ? 48 t1 2
t1 4
?14 t1 2 ? 13 ? t1 4 24 t1 3 + 312 t1 0
?
2 t1 2
?
288 t1 3
14 t1 2
24 t1 + 24 t1 3
11 t1 4
24 t1 + 24 t1 3
?14 t1 2 ? 13 ? t1 4
? 13 ?
t1 4
0
? 13 ? ?240 t1 2 + 24 t1 4 + 312
24 t1 + 24 t1 3
24 t1 + 24 t1 3
312 + 24 t1 4 + 336 t1 2
0
11 t1 4
24 t1 + 24 t1 3
? 13 ? ?576 t1 2
?
24 + 24 t1 4
24 t1 + 24 t1 3
24 t1 3 + 312 t1
14 t1 2
0 552 t1 3
0
576 t1 2
? 13 ? 2 t1 2 ?288 t1 3 ?7488 t1 ? 864 t1 3
11 t1 4
2 t1 2
0 24 t1 + 24 t1 3
0
?2 t1 2 ? 1 ? t1 4
0
0
0
0
?2 t1 2 ? 1 ? t1 4
24 t1 + 24 t1 3
?14 t1 2 ? 13 ? t1 4
0
3 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5
Gröbner bases computation over the integers and observe that each contains at least 8 correct digits. The total CPU time on a sparc is 0:2 seconds on average for transforming the matrix and computing its eigenvalueeigenvector pairs. 2. For the second system, the computed roots evaluate the input polynomials to values no larger than 10?6 and, compared to the exact roots, they are correct to at least 7 decimal digits. The average CPU time for computing the roots from the resultant matrix is 0:2 seconds on a sparc.
3.3.2 Bézout resultant matrices
We consider here two methods, analogous to those of the previous section. First, we consider t3 as a parameter and compute the Bezoutian of the 3 polynomial in the variables t1 ; t2 . It yields the shown 8 8 matrix, whose entries are polynomials of degree 4 in t3 . For the second system, we obtain the matrix Mb given below: The determinant of matrix Mb is of degree 32:
?
?
?
186624 t3 4 ? 118 t32 + 13 t3 4 ? 22 t32 + 13 3 t3 2 + 1 8 The correct candidates for the solutions of the system are given by the roots of the rst two factors. Four of them will be of multiplicity 3. This is the formulation used in the experiments reported in section 3.3.4. Secondly, we consider the variables t1 ; t2 ; t3 , but we add a generic linear form a + b t1 and we compute the Bezoutian of these 4 polynomials. We obtain a 32 36 matrix, whose rank is 20. Using Bareiss's method, we obtain on the last row, the polynomial of degree 20 :
?
?
?2579890176 b 13 b ? 118 b a + a 13 b ? 22 b a + a : 4
4
2 2
4
4
2 2
4 3
This is a multiple of the resultant, which has degree 16. The sparse resultant method, described above, had given another multiple of degree 22. The time for computing the roots from the Bezoutian (by compression and eigenvector computations) is 0:1 seconds on a sparc. We have used the library ALP, still under development, at
:
http://www.inria.fr/safir/SAM/ALP/
13
3.3.3 Sylvester resultants
A simpler approach applies Sylvester's resultant in cascade. This is the resultant of two univariate polynomials and is implemented in most modern computer algebra systems, including Axiom, Maple and Mathematica. The cascade method applies Sylvester's resultant several times, each time eliminating one variable. In general it produces matrices larger than with the multivariate resultant, which eliminates all the variables in one step. The example at hand, though, is favorable for this algorithm since each polynomial is independent of one variable. We can, thus, dene S23 as the Sylvester matrix of f2 and f3 , regarded as univariate polynomials in t1 in the second instance of system (2), 3 2 ?1 ? t 2 24 t ?13 ? t 2 0 3 3 3 66 0 ?1 ? t 2 24 t ?13 ? t 2 77 3 3 3 77 S23 = 666 2 2 75 0 24 t ? 13 ? t ? 1 ? t 2 2 2 4 0 ?1 ? t2 2 24 t2 ?13 ? t2 2 so the determinant R23 (t2 ; t3 ) = det S23 is a polynomial in t2 ; t3 :
576 t24 t3 2 ? 1152 t23 t3 3 + 576 t22 t3 4 + 144 t24 ?8064 t23 t3 + 15840 t22 t3 2 ? 8064 t2t3 3 + 144 t34 + 7488 t22 ? 14976 t2t3 + 7488 t32 : Then, the Sylvester resultant of R23 and f1 , regarded as polynomials in t2 , is a polynomial in t3 and a multiple of the resultant of f1 ; f2 ; f3 . Letting S be the last Sylvester matrix and R its determinant,
?
?
R = 186624 t3 4 ? 118 t32 + 13 t3 4 ? 22 t32 + 13 3 ; we can solve numerically for t3 , thus obtaining a superset of the values of t3 at the common roots of the original
system. Thanks to the special shape of this system, in this special case we get exactly the minimal condition in t3 , i.e., R above is the sparse resultant within a scalar factor. However, this is not usually the case. Recall that the sparse resultant has degree 16; note that the Newton and Bézout matrices have, respectively, led to nontrivial multiples of degrees 24 and 32. For each root of R, we compute the kernel of S , yielding the corresponding values of t2 at the roots. Knowing t2 and t3 , we recover t1 by computing the kernel of S23 . This is the approach on which we present experimental results in section 3.3.4. If we wish to avoid solving univariate polynomial R(t3 ) (which is a rather unstable operation), we may build the companion matrix of S and compute its eigenvalue-eigenvector pairs. In practice, the kernel can be computed by a Singular Value Decomposition (SVD) of a singular matrix, namely S with t3 specialized to the roots of R. These are essentially the pairs of eigenvalues and kernel vectors of S , and we obtain again the values of t1 ; t2 . The same limitation as with resultant matrices in the previous subsection is also present here concerning eigenspaces of high dimension.
3.3.4 Numeric results
We report on experimental results obtained on Maple V, on a DEC AlphaStation 200 4/233 with Spec ratings 157.7 92Int, 183.9 92FP and 96 MBytes of memory. All relevant Maple code is available by the authors or publicly available as indicated in section 5. Several C programs used in this work can also be found in these locations. D is the number of decimal digits used in the computation. (precision), T1 is the time for constructing the matrix(ces) in seconds, including simplifying these matrices through either Gaussian or Bareiss's fraction-free elimination and, nally, computing the determinant as a univariate polynomial. For the sparse resultant method this includes 6:12 seconds for building the generic-coecient matrix, once for all systems. T2 is the time for solving the univariate resultant polynomial in the hidden variable t3 (in this case with fsolve) in seconds.
14
T1
D
7.97
T2
S1 T3
T1
7.44
?7 ?11 ?16 ?25 ?45
T2
S2 T3
5
0.13
1.33
0.31
0.18
1.70
10
0.08
1.35
0.30
1.67
15
1.38
1.48
20
1.67
18.92
30
1.27
21.68
50
0.20
26.17
2:15 10 1:49 10 5:58 10 1:87 10 1:93 10
0.22
1.68
0.25
17.07
0.25
19.37
0.30
23.87
T1
7.74
?4 ?9 ?15 ?24 ?44
12.61
1:14 10 1:55 10 7:51 10 2:73 10 3:51 10
T2
S3 T3
0.35
0.85
0.28
1.17
0.38
1.13
1.23
10.88
0.80
11.75
0.55
15.28
4:11 10 4:17 10 4:17 10 4:17 10 4:17 10 4:15 10
?3 ?3 ?3 ?3 ?3 ?3
Table 1: Time and accuracy with sparse resultant matrices (method 3.3.1) D
T1
T2
S1 T3
5
0.19
2.0
0.35
10
0.18
2.05
0.50
15
0.18
2.02
0.50
20
0.20
2.02
4.27
30
0.37
2.04
5.02
50
0.37
2.12
6.45
10?6 3:1 10?11 4:5 10?16 3:9 10?26 2:1 10?46 0.18
S2 T3
T1
T2
0.21
0.72
2.05
0.20
0.88
1.92
0.35
0.68
1.87
0.22
0.88
13.05
0.35
0.70
14.32
0.20
0.88
16.63
9 10?7 3:0 10?12 6:5 10?16 9:0 10?26 4:4 10?46 0.32
S3 T3
T1
T2
0.30
2.82
0.43
0.25
3.15
1.0
0.20
25.38
1.02
0.20
8.00
7.83
0.20
3.10
4.30
0.22
2.90
5.57
? ? ? ? ? ?
2 10 4 1:48 10 4 1:48 10 4 1:48 10 4 1:48 10 4 1:48 10 4
Table 2: Time and accuracy with Bézout's matrices (method 3.3.2)
T is the time for computing the matrix kernel and then the other coordinates of the roots, or directly the 3
latter, in seconds. is the maximal absolute value of the input polynomials at the computed roots. For the method of sparse resultants, the matrix is computed only once by the greedy algorithm, for generic coecients, in 6:12 seconds; this, obviously, does not depend on the chosen precision. For every system the coecients are specialized, respectively, to integer polynomials in the hidden variable t3 . The time T1 in the rst line of table 1 includes the 6:12 seconds to construct the generic matrix, and the time of Gaussian elimination on the specialized matrix, which yields the determinant. The incremental algorithm for sparse resultant matrices is expected to be considerably faster in constructing the generic matrices, i.e. with respect to the timings in column T1, which correspond to the original subdivision-based algorithm. Our preliminary Maple implementation suggests that this time would be less than half of the reported running times. This is intuitively obvious because the main step in the incremental method is a rank test on the resultant matrix, which takes about 2 seconds. Gaussian pivoting is slightly slower than fraction-free elimination, but does not increase the size of the entries, so that the triangular matrix may be used in computing the kernel. To increase accuracy, the kernel is computed for the original matrix, which leads to an increase in running time. In all three methods, we solve for the real roots of the univariate polynomial expressing a multiple of the resultant. We use the maple command fsolve, whose eciency can probably be ameliorated. It takes about 2 seconds to solve a polynomial of degree 16 and for the last system, with 15 digits, more than 25 seconds. This strange variation is probably due to testing out several methods (by try and error) in the function fsolve. For each root, we substitute in the matrix and compute its kernel vector by SVD. When the root projections on the t3 -axis are not simple, the vectors do not necessarily correspond to the values of t1 ; t2 . In this case, we substitute t3 in the original system and solve the equations in t1 and t2 only. Some experience is required in choosing the S1 T3
D
T1
T2
5
0.45
0.08
0.48
10
0.40
0.083
0.32
15
0.60
0.10
0.30
20
0.43
0.083
2.4
30
0.42
0.10
2.3
50
0.62
0.12
2.7
4:9 10?5 4:8 10?5 1:8 10?8 8:6 10?19 2:6 10?38 0.025
S2 T3
T1
T2
0.20
0.10
1.4
0.18
0.12
1.5
0.18
0.13
1.4
0.37
0.15
6.9
0.38
0.15
8.1
0.20
0.17
9.5
8 10?7 2:2 10?9 5:2 10?13 5:3 10?24 7:6 10?44 0.23
S3 T3
T1
T2
0.55
0.15
0.42
0.55
0.17
0.43
0.58
0.82
0.28
0.38
0.33
1.8
0.38
0.15
2.3
0.40
4.6
2.9
1:3 10 1:5 10 1:5 10 1:5 10 8:4 10 2:2 10
Table 3: Time and accuracy with Sylvester resultants (method 3.3.3) 15
?3 ?4 ?4 ?4 ?3 ?9
ω2
p1
L2
φ1
ω1
p2 φ2
θ1
L1
L3
p3 p5
θ2
φ5
φ4
L5
ω5
φ3
L4
ω4
p4
Figure 4: The parameters of the cyclopentane. threshold value that decides whether the vectors are valid. If not, we substitute directly in the original system and solve the resulting equations. Then a second threshold is used for deciding which candidate solutions to keep. In conclusion, we observe that the Sylvester cascade method is fastest for the particular problem. This is explained by the equations' structure. However, it is not expected to happen in larger systems, unless they possess some similar strong structure. Nonetheless, the Sylvester method oers no advantage in terms of accuracy. The Bézout's matrices method yields smaller matrices, which are easier and faster to manipulate than the sparse resultant matrices. The accuracy of Bezout's method is slightly better than the sparse resultant method on the example. On the other hand sparse resultant matrices are easier to instantiate than the Bézout matrices. The higher running times of the sparse resultant method are due to the large matrix size.
3.4 Extension to other cyclic molecules
This section briey illustrates the application of the above matrix methods to computing all conformations of cyclic molecules with less and more than 6 atoms. This is interesting because it leads to algebraic systems with more or less equations than unknowns, in other words, overconstrained or underconstrained systems, respectively.
3.4.1 Cyclopentane
We show that the above matrix methods generalize directly to the cyclopentane and nd all conformations, even in the presence of data given with limited accuracy or inputs that do not admit an exact solution. We consider the cyclopentane molecule with 5 carbon atoms and 5 single covalent bonds between them. All 5 torsion dihedrals may vary, whereas bond lengths and angles stay xed. Following the methods of ap angles developed in section 3.1.1, we can model every conformation by only two parameters. Referring to gure 4, a conformation is fully described by the dihedral angles 1 ; 2 whereas the torsion angles are denoted !i , i = 1; : : : ; 5. We obtain, from experimental evidence, the common values of bond angles and bong lengths [HC87], so Li = 1, i = 105, for i = 1; : : : ; 5: The derivation of the algebraic equations follows the same lines as in section 3.1.1. Letting p5 be the origin and using 1 = 4 = 105, we can compute the positions of p2 ; p3 , which form the base triangle, assumed to lie on xy-plane with p3 on the x-axis. Then the positions of p1 ; p4 are functions of 1 ; 2 and yield certain constraints on 2 ; 3 and 5 . It is easy to realize that the procedure applied here is the same as in the case of the cyclohexane and can also serve in general. These three constraints dene the following polynomial system of 3 equations in the two unknowns ti = tan(i =2), i = 1; 2.
f1 = 11 + 12 t21 + 13 t22 + 14 t1 t2 + 15 t21 t22 f2 = 21 + 22 t21
16
(7)
t1
t2
0.1653780264 0.1653780264 -0.1653780264 -0.1653780264
0.6273518034 -0.6273518034 0.6273518034 -0.6273518034
Table 4: The rigid parameters of the cyclopentane. f3 = 31 + 32 t22 System (7) does not have an exact solution neither for the nominal values above nor for the perturbed ones in table 4. However, we can use a matrix similar to the sparse resultant matrix of section 3.3.1 and to the Sylvester matrix of section 3.3.3 to nd the conformations that are closest to solving all three constraints. An analogous approach is also possible using a Bézout matrix as in section 3.3.2. The main idea is to construct a matrix that eliminates all variables, unlike section 3.3.1 where one variable is present in the matrix entries. The sparse resultant matrix eliminating t1 ; t2 from (7) is
2 3 66 0 0 0 77 0 7 M = 66 0 0 4 0 0 0 75 11
12
21
22
13
31
0 31
14
15
21
22
32
0 32
0
with column monomials [1; t21; t22 ; t1 t2 ; t21 t22 ]. Given this matrix, we use SVD to compute a vector in its kernel. The lack of exact solution is illustrated by the fact that the kernel vector entries do not correspond exactly to an evaluation of the column monomials at a value. Nonetheless, if we treat these entries as such and compute values for t1 ; t2 , we do nd the following approximate solutions. Here we have randomly perturbed the nominal length and angle parameters given above in order to have a more natural context, so i 1 2 3 4 5 Li 1.2 0.9 1 0.8 1.1 i [ ] 105.1 105.2 105.1 105.3 105 The maximum absolute value upon evaluation of polynomials (7) at these values is 0:37373, which is comparable to 0:16842, the value by which the kernel vector fails to express an evaluation of the column monomials. The rst solution is visualized below by Povray; see section 5 for more information on this ray-tracer.
3.4.2 Cycloheptane
We consider now the case of 7 atoms, connected by links of equal length that we normalize to 1. We also assume here that the angles at each atom between the links are equal. We use again the formulation of section 3.1.2, on squares of distances between points. The square of the distance between two consecutive points is 1 and the distance between two points Pi ; Pi+2 is 2 (1 ? cos()). The distance
17
between two other points is unknown. Thus the Cayley-Menger matrix associated to these seven points is of the form
! p1 p2 p3 p4 p5 p6 p7
2 !0 66 1 66 66 1 66 1 66 66 1 66 1 66 64 1
p1 1 0 1 c x1 y1 1 1 c
p2 1 1 0 1 c x2 y2 1
p3 1 c 1 0 1 c x3 y3
p4 1 x1 c 1 0 1 c x4
p5 1 y1 x2 c 1 0 1 c
p6 1 1 y2 x3 c 1 0 1
p7 1 c 1 y3 x4 c 1 0
3 77 77 77 77 77 77 77 77 75
where c = 2 (1 ? cos()). For the computation shown in this section, we have chosen 107:46 so that c = 135 . We also assume that two distinct points Pi , Pj (with i 6= j ) cannot be in the same position. The unknown entries x1 ; x2 ; x3 ; x4 ; y1 ; y2 ; y3 of this matrix correspond to the unknown square of distances between the points. They cannot vanish. Once these quantities are known, we can recover the geometry of the molecule, up to a simultaneous displacement of all the points. p1 ; : : : ; p7 . Let us denote by C the set of vectors [x1 ; x2 ; x3 ; x4 ; y1; y2 ; y3 ], for which the molecule satises the geometric constraints on the length of the links and on the angles. Counting the degrees of freedom, we verify that C is a curve. This matrix M must be of rank 5, because it corresponds to a Gramm-Schmidt matrix of inner-products in the space of spheres. Therefore any 6 6 minor of M must be zero and yields (if not zero) a polynomial constraint on our indeterminates. Examining carefully the matrix M , we remark that it can be decomposed in 4 4 blocks as follows: A B M= C D
where the left-upper 44 block A is is independent the variables x1 ; x2 ; x3 ; x4 ; y1 ; y2 ; y3 and invertible. Consequently, we have I 0 A B A B ?C A?1 I C D = 0 D ? C A?1 B
and M is of rank 5 if and only if S = D ? C A?1 B is of rank 1. Any 2 2 minor of the matrix S is a polynomial relation on the variables x1 ; x2 ; x3 ; x4 ; y1 ; y2 ; y3 . Instead of working with the 6 6 minors of M , we will deal equivalently with the 2 2 minors of S . Given 4 of these 2 2 minors, we deduce the values of x3 ; x4 ; y2 ; y3 as rational functions of the variables x1 ; x2 ; y1 as follows: s :={det(submatrix(S,[1,2],[1,3])), det(submatrix(S,[1,2],[1,4])), det(submatrix(S,[1,3],[1,4])), det(submatrix(S,[1,4],[1,4]))}; rg := solve(s,{y[2],y[3],x[3],x[4]});
We obtain two branches, but one of these branches implies that two points are in a same place. Removing this degenerate case, we can express the variables x3 ; x4 ; y2 ; y3 as rational fraction of x1 ; x2 ; y1 :
2411500 x12 x2 3 + 4114824 x2x1 y1 + x3 = ??2411500 x12 x2 3 ? 28129080 x2x1 y1 + y1x1 3 x2 + 2250 x13 y1 + x4 = ?1400?x1250 2 2 2 1 ? 1250 x2 x1 y1 ? 1250 x1 y1 +
18
(8)
?7665000 x12 x2 3 + 18545184 x2x1 y1 + y2 = ? 2411500 x12 x2 3 ? 28129080 x2x1 y1 + x12 + +2500 y12 x1 2 + y3 = ?196 1400 x12 ? 1250 x2x1 2 y1 + Substituting these values in the matrix S yields a matrix S 0 , whose entries depend only on x1 ; x2 ; y1 . We check that the 2 2 diagonal minors t1 := [1; 2j1; 2]; t2 := [1; 3j; 1; 3] 0 of S are not zero, so that the yield 2 non-zero constraints of the 3 variables x1 ; x2 ; y1 . Eliminating y1 between these two equations yields the following equation in x1 ; x2 : R(x1 ; x2 ) = x1 12 x2 10
12 x 9 2
x1 ? 13158335
?
11 x 10 2
x1 ? 1162082345
14622443292074164099219456 x1 357818603515625
?
22815512 x1 11 x2 9 518076 x1 10 x2 10 + + 1675 11725 469 12295458288934920183087104 x2 1970268575670553886588928 + : 357818603515625 357818603515625 +
1060499 x1 12 x2 8
It is of degree 12 in x1 , of degree 10 in x2 and of total degree 22. This equation denes the projection of the curve C on the plane (x1 ; x2 ). In order to compute a point on the curve C , we compute the a point (x1 ; x2 ) satisfying the equations R(x1 ; x2 ) = 0. We substitute these values in the Sylvester matrix associated to t1 (x1 ; x2 ; y1 ) and t2 (x1 ; x2 ; y1 ) and compute its kernel. From this kernel, we deduce the value of y1 (as in section 3.3) and we compute the values of x3 ; x4 ; y2; y3 from the values of x1 ; x2 and y1 using the rational fractions (8).
4 Critical conformations This section examines the geometrically unstable conformations of the 6-atom cyclic molecule, in other words, the locus of the critical points in the space of all conformations. Obviously, there exists an analogy between molecular conformations and robot congurations or mechanical arrangements. Various structural requirements on molecules can be modeled as geometric or kinematic constraints, as in robotics. This analogy consists in associating a serial robot to a molecule, by associating the rotation !i around axis Li to a rotating link between two solids Si?1 (centered at pi?1 ) and Si (centered at pi ). This is illustrated in the following gure; for notation refer to gure 3.
The circles indicate the articulations of rotation in this robot. It has the particularity that two consecutive axes of articulation are intersecting at a point. The cyclic molecule corresponds to a situation where the last solid is in a xed conformation with respect to the rst one.
19
This analogy can be carried on further, in order to analyze the conformations that are innitesimally unstable. For any solid S moved by a displacement D(t) (where t is, for instance, time), the velocity of a point M (t) of this solid is given by dt (M (t)) = V0 + OM (t)
where O is the origin, V0 the velocity of the origin, considered as a point of the solid, is the vector product of the physicists and the angular velocity. In other words, (V0 ; ) represents the cinematic torque. We will identify it with the element I of ^2 E , where E is the 4-dimensional vector space underlying the ane space A 3 and ^E the exterior algebra on E , see [Lan80, Bou70]:
I =V e ^e ?V e ^e +V e ^e + e ^e + e ^e + e ^e ; 1
2
3
2
1
3
3
1
2
1
1
4
2
2
4
3
3
4
where the ei = (0; : : : ; 0; 1; 0; : : :; 0) are the vectors in the canonical basis. It represents the innitesimal movement of the solid. For instance, for a rotation along an axis L containing A = (1 : a1 : a2 : a3 ); B = (1 : b1 : b2 : b3 ), we ?! have = AB and VA = 0. Thus the innitesimal movement for this rotation is A ^ B 2 ^2 E . If M = (1 : m1 : m2 : m3 ), we check that the rst 3 coordinates of I ^ M (i.e. the coecients of e0 ^ e2 ^ e3 , e0 ^ e1 ^ e3 , e0 ^ e1 ^ e2 ) are precisely the coordinates of dt (M (t)). As we have I ^ M ^ M = 0, the last coordinate (the coecient of e1 ^ e2 ^ e3 ) of I ^ M is determined when we know its rst 3 coordinates and M . Thus, the velocity dt (M (t)) can be identied with the vector I ^ M 2 ^3 E . Consider now several solids connected by articulations of rotation along axes L1 ; : : : ; Ln. The velocity of any point M of the last solid is the sum of the velocities of M attached to each solid. It is given by the formula
dt (M (t)) =
N X i=1
i Li ^ M:
Coming back to our problem of molecule (or robot), the serial robot will be unstable (see [MS94, Mer90]), if the velocity of any point M in the last solid is 0, but the innitesimal movement of the 6 solids is not zero. This means that we can x the rst and last solids and still nd an innitesimal motion of the chain of solids. P In other words, the chain will be innitesimally unstable if there exists (1 ; : : : ; 6 ) 6= (0; : : : ; 0) such that 8M; Ni=1 i Li ^ M or equivalently 1 L1 + + 6 L6 = 0: 2 where the Li = pi ^ pi+1 2 ^ E represent the axes of rotation of the robot. Therefore, the cyclic molecule is innitesimally unstable if these axes of rotation are linearly dependent in ^2 E . We are going now to characterize geometrically, the conformations of the molecule which are innitesimally unstable. p4
P
p6
p2 p5
p1
p3
20
Let P be the plane (p2 ; p4 ; p6 ). The linear system generated by p1 ^ p2 , p1 ^ p3 is the same as the one generated by p1 ^ p2 , p1 ^ p03 where p03 is any point of the plane p1 ; p2 ; p3 outside the line (p1 ; p2 ). In particular, we can take p03 on the plan P . Let us call this line L01 = p2 ^ p03 . By the same operations at the points p4 ; p6 , we can replace the system (pi ^ pi+1 )i=1:::6 by an equivalent system
p1 ^ p2 ; p3 ^ p4 ; p5 ^ p6 ; L01 ; L02; L03 where L01 (resp. L02, L03 ) is the intersection of (p1 ; p2 ; p3) (resp. (p3 ; p2 ; p4 ), (p5 ; p4 ; p6 )) with P . Note that if these 3 lines are concurrent, then they are linearly dependent in ^2 E , for they are in a same plane. In this case, the 6 elements p1 ^ p2 ; p3 ^ p4 ; p5 ^ p6 ; L01 ; L02; L3 . are also linearly dependent. By construction, if L01 ; L02 ; L03 meet in a common point, so do the planes (p2 ; p4 ; p6 ); (p1 ; p2 ; p3 ); (p3 ; p4 ; p5 ); (p5 ; p6 ; p1 ):
We can translate this condition, using the Cayley operators (see [BBR85]), in a polynomial in the determinants of the points pi as follows: (p2 ; p4 ; p6) \ (p1 ; p2 ; p3 ) = p2 ^ p4 [p6 ; p1 ; p2 ; p3 ] ? p2 ^ p6 [p4 ; p1 ; p2 ; p3 ] + p4 ^ p6 [p2 ; p1; p2 ; p3 ]
(p2 ; p4 ; p6 ) \ (p1 ; p2 ; p3 ) \ (p3 ; p4 ; p5 ) = ?p4 [p6 ; p1 ; p2 ; p3 ][p2 ; p3 ; p4 ; p5 ] ? p2 [p4 ; p1 ; p2 ; p3 ][p6 ; p3 ; p4 ; p5 ] +p6 [p4 ; p1 ; p2 ; p3 ][p2 ; p3 ; p4 ; p5 ] + p4 [p2 ; p1 ; p2 ; p3 ][p6 ; p3 ; p4 ; p5 ]
which yields [p4 ; p1 ; p5 ; p6 ][p6 ; p1 ; p2 ; p3 ][p2 ; p3 ; p4 ; p5 ] + [p2 ; p1 ; p5 ; p6 ][p4 ; p1 ; p2 ; p3 ][p6 ; p3 ; p4 ; p5 ] (9) ? [p6 ; p1 ; p5 ; p6][p4 ; p1 ; p2; p3 ][p2 ; p3 ; p4 ; p5 ] ? [p4 ; p1 ; p5 ; p6 ][p2 ; p1 ; p2 ; p3][p6 ; p3 ; p4; p5 ] = 0 where [A; B; C; D] is the determinant (or oriented volume) of the 4 ane points A; B; C; D. The determinant of the 6 axis pi ^ pi+1 and this polynomial (9) are of the same degree (i.e. 2) in the coordinates each point pi , and vanish on the same open subset of the unstable conformations (p1 ; : : : ; p6). Therefore, they are equal within multiplication by a nonzero scalar. Therefore, the necessary and sucient conditions for which the molecule is unstable is given by the equation (9).
5 Details on the implementation One goal of this work is to illustrate the simplicity and eectiveness of standard computer algebra tools for identifying and manipulating molecules. This section describes the dierent systems and implementations used in some detail, in particular those related to translating the biology problem into algebraic terms and those applied for visualizing the results. All experiments presented in this paper were performed on Maple, a general computer algebra system [CGG+ 92]. We used this system in three dierent ways: 1. First, in modeling the molecular problem, the step where we derive the polynomial equations from the geometric constraints. 2. The step of simultaneously solving these equations numerically. 3. Lastly, the step of visualizing the results. For the second step, we implemented the procedures for computing the sparse resultant and the Bezoutian directly in Maple. We also used existing routines of Maple, including those for computing the Sylvester matrix, the eigenvectors of a matrix and its Singular Value Decomposition (SVD). These routines, in addition to the C implementations of sections 3.2 and 3.3.1 can all be found at one of the following locations:
;
http://www.inria.fr/safir/emiris
:
http://www.inria.fr/safir/whoswho/Bernard.Mourrain/
21
5.1 Modeling
We have already discussed the algorithmic aspects of the modeling process in sections 3.1.1 and 3.1.2 and have mentioned the innovative aspects of both procedures. Here we discuss the implementation on Maple, which adopts an original computer algebra approach which should be of independent interest to researchers in the latter eld and those working in mathematical education. Our approach is based on an object-oriented methodology which allows, for instance, to hide the internal representation of the objects, so that we can apply a function with the same name on dierent objects. This is closely related to object-oriented programming languages, such as C++, in which methods provide the analogue to these functions. The behavior of the computations will be deduced from the type of the object on which this function is applied. Thus, we can change, if necessary, the internal representation of the objects, without changing the main part of the algorithm. Moreover, pieces of code can be easily reused through the mechanism of inheritance. This mechanism does not exist by default in Maple, and we describe below its implementation; we use Maple syntax. It is straightforward to export this implementation to any general computer algebra system which supports symbolic computation. The objects that we manipulate are inert functions, in other words they require no denition. For instance points (which shall represent atoms) are expressed as follows: P:= Point(coord = [x,y,x]);
where the name of the (inert) function (here Point) denes the type of the object P and the arguments of this function are values associated to this object. Here, the value of the eld coord is the list [x,y,z]. Another type commonly used in molecular modeling is the type Angle. To give a concrete example of a function operating on two objects of type Angle and returning such an object, consider the application of the rule of cosines, when we have a triangle where we know two angles and two edge lengths, including that of the edge between the known angles. Then, the following function returns an angle (in radians) expressing the third angle of the triangle. Angle(MTH)(cosrule) := proc(a:Angle, b:Angle, inbetween, opposite) local xx, l; xx := evalf(inbetween^2+opposite^2-2*inbetween*opposite*cos(b&.radvalue())); l := {evalf(2*inbetween*sqrt(xx)*cos(a&.radvalue())-inbetween^2-xx+opposite^2)}; if nops(a) = 1 then Angle(op(1,a),l); else Angle(op(1,a),{op(op(2,a)),op(l)}); fi; end:
We omit the full description of the conventions governing the representation of angles in terms of constraints, such as those obtained by the application of the cosine rule above. We also omit the precise denitions of all subroutines used here for the sake of simplicity; the full package can be obtained by contacting the authors. Having dened objects such as points, angles and lines, we wish to apply functions whose behavior depends on the type of the object. For this we implemented a special operator &., which gets the type of the object and applies the corresponding function, according to this type. For instance, if we want to apply a translation to the point P, we will use Pt :=
P &. translate([1,2,3]);
where Pt is the translated point obtained by application of function translate. If we have dened a line L Line(points=[P1, P2]);, we may also use Lt :=
:=
L &. translate([1,2,3]);
to get translated line Lt. In both cases, the operator &. gets the type of the object (Point for P and Line for L) and applies either the procedure Point(MTH)(translate), with the arguments (P,[1,2,3]), or Line(MTH)(translate), with the arguments (L,[1,2,3]). Therefore, we need to dene the procedures Point(MTH)(translate) := proc(p,v) ... end Line(MTH)(translate) := proc(p,v) ... end
22
in order to use this mechanism. Symbolic object-oriented computing on computer algebra packages is still under development. We aim at extending the classes of geometrical objects that can be used for computation in other contexts as well, including robotics and computer vision.
5.2 Visualization
Visualization transforms the symbolic and numeric results to visual geometric information, hence its importance in tactile and visual sciences such as computational biology. Here, we illustrate the use of two specialized tools with dierent merits and limitations. The rst tool is izic [FKM94], a stand-alone high-quality 3D graphics tool driven by a command language, which allows ecient curve and surface manipulations using a virtual graphics device. They include management of pseudo-colors and true colors, manipulation of the illumination model, shading and transparency. As an interactive tool, izic is run as a Unix server which can be driven from one or more computer algebra systems, including Maple, Mathematica, and Reduce. This is an important advantage, which is not shared by most stand-alone graphics packages. Connecting izic with a dierent system is a very simple task which can be achieved at runtime and requires no compilation. Another important feature is the possibility to drive izic both through its freely-recongurable menu-button user interface, and through its command language. This brings a high level of programmability and interoperability, allowing for instance the remote animation of surfaces from the symbolic computation engine. Furthermore, there are library routines for drawing some basic objects, including spheres and cylinders, from within Maple. These capabilities were used in rendering the various conformations of the cyclohexane in section 3. A piece of code would include: Sph[i] := izsphere(A[i],r1,color = c0); Len[i] := izcylinder(A[i],A[i+1],r2,color = c2);
The second tool, called Povray, is a ray-tracer, which yields excellent quality color images with the notion of depth in three-dimensional space, but with no possibility to operate interactively on the scene [POV]. It oers a wide range of pre-dened objects which are very easy to use and can be customized by a small number of parameters. For instance, the cyclopentane of section 3.4 is dened by a text le including commands of the following type. sphere { , .3 texture { pigment { color Green } } } cylinder { , , .08 }
Besides, there is an array of colors, background scenes and textures.
6 Conclusions We have examined molecular closure, which is encountered in a variety of problems in computational biology, including docking of exible ligands to large molecules and the search for energetically favorable conformations, both crucial questions in structure-based drug design. The geometric problem we have studied lies in the bottleneck of various algorithms, hence eciency and accuracy are critical. Certain geometric and algebraic methods are compared in order to address these concerns. Furthermore, they illustrate a technology transfer from computer algebra and symbolic computational geometry to formalizing the problem in algebraic terms, counting and then solving for all possible molecular conformations. Lastly, we have demonstrated the use of various public domain software tools, such as computer algebra packages and their use in the modeling and visualization process, as well as the performance of more specialized programs, which are all freely available. By illustrating the simplicity and eciency of our algorithms, we wish to expose a modern application area to researchers in computer algebra and, at the same time, motivate biologists to use advanced algebraic and geometric methods.
23
7 Acknowledgment This work is partially supported by the European project FRISCO (LTR 21.024).
References [ABB+ 95] E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, S. Ostrouchov, and D. Sorensen. LAPACK Users' Guide. SIAM, Philadelphia, 2nd edition, 1995. [BBR85] M. Barnabei, A. Brini, and G.C. Rota. On the exterior calculus of invariant theory. J. of Algebra, 96:120160, 1985. [Ber77] M. Berger. Géométrie, volume 5, L'espace des sphères. Cedic/Fernand Nathan, 1977. [BMB94] L.M. Balbes, S.W. Mascarella, and D.B. Boyd. A perspective of modern methods in computer-aided drug design. In K.B. Lipkowitz and D.B. Boyd, editors, Reviews in Computational Chemistry, volume 5, pages 337379. VCH Publishers, New York, 1994. [Bou70] N. Bourbaki. Eléments de Mathématiques, Algèbre, Ch. 3. Hermann, Paris, 1970. [CE93] J. Canny and I. Emiris. An ecient algorithm for the sparse mixed resultant. In G. Cohen, T. Mora, and O. Moreno, editors, Proc. Intern. Symp. on Applied Algebra, Algebraic Algor. and Error-Corr. Codes, Lect. Notes in Comp. Science 263, pages 89104, Puerto Rico, 1993. Springer. + [CGG 92] B.W. Char, K.O. Geddes, G.H. Gonnet, B.L. Leong, M.B. Monagan, and S.M. Watt. First Leaves: A Tutorial Introduction to Maple V. Springer, 1992. See also http://www.maplesoft.com. + [CJW 94] D.E. Clark, G. Jones, P. Willett, P.W. Kenny, and R.C. Glen. Pharmacophoric pattern matching in les of three-dimensional chemical structures: Comparison of conformational searching algorithms for exible searching. J. Chem. Inf. Comput. Sci., 34:197206, 1994. [CLO92] D. Cox, J. Little, and D. O'Shea. Ideals, Varieties, and Algorithms. Undergraduate Texts in Mathematics. Springer, New York, 1992. [CLO97] D. Cox, J. Little, and D. O'Shea. Using Algebraic Geometry. Springer, New York, 1997. [CM96] J.-P. Cardinal and B. Mourrain. Algebraic approach of residues and applications. In J. Renegar, M. Shub, and S. Smale, editors, The Mathematics of Numerical Analysis, volume 32 of Lectures in Applied Math., pages 189210. AMS, 1996. [DH91] A.W.M. Dress and T.F. Havel. Distance geometry and geometric algebra. Foundations of Physics, 23(10):13571374, 1991. [EC95] I.Z. Emiris and J.F. Canny. Ecient incremental algorithms for the sparse resultant and the mixed volume. J. Symbolic Computation, 20(2):117149, August 1995. [EM96] M. Elkadi and B. Mourrain. Approche eective des résidus algébriques. Technical Report 2884, INRIA, Sophia-Antipolis, France, 1996. [Emi96] I.Z. Emiris. On the complexity of sparse elimination. J. Complexity, 12:134166, 1996. [Emi97] I.Z. Emiris. A general solver based on sparse resultants: Numerical issues and kinematic applications. Technical Report 3110, INRIA Sophia-Antipolis, France, 1997. + [FHK 96] P.W. Finn, D. Halperin, L.E. Kavraki, J.-C. Latombe, R. Motwani, C. Shelton, and S. Venkatasubramanian. Geometric manipulation of exible ligands. In Applied Computational Geometry, volume 1148 of LNCS, pages 6778. Springer, 1996. [FKM94] R. Fournier, N. Kajler, and B. Mourrain. Visualization of mathematical surfaces: the IZIC server approach. J. Symbolic Computation, 19:159173, 1994. [GS70] N. G o and H.A. Scheraga. Ring closure and local conformational deformations of chain molecules. Macromolecules, 3(2):178187, 1970.
24
N. G o and H.A. Scheraga. Ring closure in chain molecules with cn or s2n symmetry. Macromolecules, 6(2):273281, 1973. [Hav91] T.F. Havel. An evaluation of computational strategies for use in the determination of protein structure from distance constraints obtained by nuclear magnetic resonance. Pog. Biophys. Molec. Biol., 56:4378, 1991. [HC87] A. Hart and J.-M. Conia. Introduction en Chimie Organique. Inter-Editions, 1987. [HK88] A.E. Howard and P.A. Kollman. An analysis of current methodologies for conformational searching of complex molecules. J. Medic. Chem., 31(9):16691675, 1988. [HN95] T.F. Havel and I. Najfeld. A new system of equations, based on geometric algebra, for ring closure in cyclic molecules. In J. Fleischer, J. Grabmeier, and W. Hehl, editors, Computer Algebra in Science and Engineering. World Scientic Publishing, 1995. To appear. [HN97] T.F. Havel and I. Najfeld. Embedding with a rigid substructure. Technical Report 04-97, Center for Research in Computing Technology, Harvard University, 1997. [KL92] D. Kapur and Y.N. Lakshman. Elimination methods: An introduction. In B. Donald, D. Kapur, and J. Mundy, editors, Symbolic and Numerical Computation for Articial Intelligence, pages 4588. Academic Press, 1992. [Lan80] S. Lang. Algebra. Addison-Wesley, 1980. [Lea91] A.R. Leach. A survey of methods for searching the conformational space of small and medium size molecules. In Reviews in Computational Chemistry, volume 2, pages 155. VCH Publishers, 1991. [LK92] A.R. Leach and I.D. Kuntz. Conformational analysis of exible ligands in macromolecular receptor sites. J. Computational Chem., 13(6):730748, 1992. [MC92] D. Manocha and J. Canny. Real time inverse kinematics of general 6R manipulators. In Proc. IEEE Conf. Robotics and Automation, pages 383389, Nice, May 1992. [Mer90] J.P. Merlet. Les robots parallèles. Traités de nouvelles technologiques. Hermes, 1990. [MS94] B. Mourrain and N. Stol. Invariants methods in Discrete and Computational Geometry, chapter Computational Symbolic Geometry, pages 107139. Kluwer Acad. Pub., 1994. [MZW95] D. Manocha, Y. Zhu, and W. Wright. Conformational analysis of molecular chains using nanokinematics. Computer Applications of Biological Sciences, 11(1):7186, 1995. [NR95] J. Nielsen and B. Roth. Elimination methods for spatial synthesis. In J.-P. Merlet and B. Ravani, editors, Computational Kinematics '95, pages 5162. Kluwer, 1995. [Par94] D. Parsons, 1994. Personal Communication. [PC94] D. Parsons and J. Canny. Geometric problems in molecular biology and robotics. In Proc. 2nd Intern. Conf. on Intelligent Systems for Molecular Biology, pages 322330, Palo Alto, CA, August 1994. [POV] POV-Ray Team. http://www.povray.org/ or ftp.povray.org. See also http://stef.u-picardie.fr/povray/. [PS93] P. Pedersen and B. Sturmfels. Product formulas for resultants and Chow forms. Math. Zeitschrift, 214:377396, 1993. [Sha77] I.R. Shafarevich. Basic Algebraic Geometry. Springer, Berlin, 1977. [SNW94] B. Sandak, R. Nussinov, and H.J. Wolfson. 3-D exible docking of molecules. In Proc. 1st IEEE Workshop on Shape and Pattern Recognition in Computational Biology, Seattle, 1994. [VdW48] B.L. Van der Waerden. Modern algebra Vol II. Frederick Ungar Publishing Co, 1948.
[GS73]
25