Computational Methods for Dynamics and Mobility ...

3 downloads 0 Views 6MB Size Report
Mechanisms Program) [3], MSC ADAMS (Automated Dynamics of ... For example, although the MSC ADAMS software ...... TR-2001-2, Hamilton College, Clinton,.
Computational Methods for Dynamics and Mobility Analysis of Multiloop Mechanisms and Robotic Systems By KRISTOPHER WEHAGE B.S. Mechanical Engineering, University of Illinois, Urbana-Champaign 2002 M.S. Materials Science, University of California, Davis 2014 THESIS Submitted in partial satisfaction of the requirements for the degree of DOCTOR OF PHILOSOPHY in MECHANICAL ENGINEERING in the OFFICE OF GRADUATE STUDIES of the UNIVERSITY OF CALIFORNIA, DAVIS

Approved:

Prof. Bahram Ravani, Chair

Prof. Harry Cheng

Prof. Steven Velinsky Committee in Charge 2016 i

Contents 1 Introduction

1

2 Notation

9

2.1

Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

9

2.2

Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

11

2.3

Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

13

3 Preliminaries 3.1

3.2

3.3

15

Numerical Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

15

3.1.1

Pipelining / Vectorization . . . . . . . . . . . . . . . . . . . . . . . .

15

3.1.2

Memory Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

17

3.1.3

Vector Stride . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

18

Geometric Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

18

3.2.1

Transforming Screws . . . . . . . . . . . . . . . . . . . . . . . . . . .

19

3.2.2

Coscrews . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

25

3.2.3

Influence Coefficient Matrices . . . . . . . . . . . . . . . . . . . . . .

28

3.2.4

Constraint Forces . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

30

3.2.5

Transforming the Inertia Matrix . . . . . . . . . . . . . . . . . . . . .

31

3.2.6

More derivative operator matrices . . . . . . . . . . . . . . . . . . . .

34

Dynamic Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

37

3.3.1

Lagrange’s Equations of the First Kind . . . . . . . . . . . . . . . . .

38

3.3.2

Euler-Lagrange Equations . . . . . . . . . . . . . . . . . . . . . . . .

38

3.3.3

Index-1 Euler-Lagrange Equations . . . . . . . . . . . . . . . . . . . .

39

3.3.4

Null-Space Methods . . . . . . . . . . . . . . . . . . . . . . . . . . .

39

3.3.5

Pseudo–Inverse Methods . . . . . . . . . . . . . . . . . . . . . . . . .

40

3.3.6

Kane’s Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

40

3.3.7

Maggi’s Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . .

41

ii

3.4

Generalized Coordinate Partitioning . . . . . . . . . . . . . . . . . . . . . . .

42

3.5

Graph theory for multibody systems . . . . . . . . . . . . . . . . . . . . . .

44

3.5.1

Spatial Equations of Motion . . . . . . . . . . . . . . . . . . . . . . .

45

3.5.2

Comparison to Newton-Euler Equations . . . . . . . . . . . . . . . .

46

3.5.3

Properties of Spatial Accelerations . . . . . . . . . . . . . . . . . . .

47

3.5.4

Lagrange’s Spatial Equations of Motion . . . . . . . . . . . . . . . . .

48

4 Preconditioned Spatial Equations of Motion

54

4.1

Preconditioning method . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

54

4.2

Equivalence to Maggi’s Equations . . . . . . . . . . . . . . . . . . . . . . . .

55

4.3

Advantages to the Preconditioned Spatial Equations of Motion . . . . . . . .

57

5 Kinematic Substructuring

58

5.1

Motivation for substructuring . . . . . . . . . . . . . . . . . . . . . . . . . .

58

5.2

Representing Topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

59

5.3

Kinematic Substructuring Algorithm . . . . . . . . . . . . . . . . . . . . . .

66

5.3.1

Identify the level of each node or body in the spanning tree . . . . . .

69

5.3.2

Identify loop members . . . . . . . . . . . . . . . . . . . . . . . . . .

71

5.3.3

Order loops by level . . . . . . . . . . . . . . . . . . . . . . . . . . .

73

5.3.4

Merge loops into kinematic substructures . . . . . . . . . . . . . . . .

74

5.3.5

Reorder bodies and joints . . . . . . . . . . . . . . . . . . . . . . . .

76

5.4

Modified algorithm for level–order sort and identification of loops and loop members . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

79

5.5

Numerical Experiment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

83

5.6

Dynamics Equations with Substructures . . . . . . . . . . . . . . . . . . . .

88

6 Numerical Solution of Dynamics Equations with Substructures

96

6.1

The Schur Complement Method . . . . . . . . . . . . . . . . . . . . . . . . .

96

6.2

Recursive Schur Complement with Multiple Substructures . . . . . . . . . .

98

iii

6.3

Sparse Matrix Methods and Matrix Fills . . . . . . . . . . . . . . . . . . . . 102

6.4

Permutation Orders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

6.5

6.6

6.7

6.4.1

Permutation Order 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

6.4.2

Permutation Order 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

6.4.3

Permutation Order 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

Representing Permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 6.5.1

Inverse Permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

6.5.2

Evaluating Permutations . . . . . . . . . . . . . . . . . . . . . . . . . 108

Evaluating the Generalized Equations of Motion . . . . . . . . . . . . . . . . 111 6.6.1

Matrix Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

6.6.2

Evaluating the Generalized Mass Matrix . . . . . . . . . . . . . . . . 111

6.6.3

Evaluating Right Hand Side . . . . . . . . . . . . . . . . . . . . . . . 119

6.6.4

Evaluating Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . 124

6.6.5

Setting up Sparse Factor and Solve . . . . . . . . . . . . . . . . . . . 129

Numerical Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 6.7.1

Overview of numerical solution process . . . . . . . . . . . . . . . . . 143

7 Numerical Examples

146

7.1

Slider Crank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

7.2

Bricard’s Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150

7.3

Excavator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155

7.4

Vehicle Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158 7.4.1

7.5

Overview of model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170

8 Overview of Transmission Performance and Mobility 8.1

171

Transmission metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 8.1.1

Transmission Angle and Pressure Angle . . . . . . . . . . . . . . . . . 171

8.1.2

Screw-Based Methods . . . . . . . . . . . . . . . . . . . . . . . . . . 172

iv

8.1.3

Transmission Quality from the Jacobian Matrix . . . . . . . . . . . . 174

8.2

Computing the TWS from the Jacobian Matrix . . . . . . . . . . . . . . . . 176

8.3

Mobility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

8.4

8.3.1

Chebyshev-Gr¨ ubler-Kutzbach criteria . . . . . . . . . . . . . . . . . . 179

8.3.2

Mobility based on the rank of the Jacobian . . . . . . . . . . . . . . . 180

8.3.3

Numerical computation of mobility . . . . . . . . . . . . . . . . . . . 180

Transmission Performance, Manipulability and Mobility . . . . . . . . . . . . 182

9 Mobility Numbers

184

9.1

Computing Mobility Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . 184

9.2

Connection to Singular Value Decomposition . . . . . . . . . . . . . . . . . . 185

10 Mobility Numbers – Examples

187

10.1 Single Loop Mechanisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 10.1.1 Slider–crank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 10.1.2 Bricard’s mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . 190 10.2 Multiloop Mechanisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192 10.2.1 Sarrus linkage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192 10.2.2 Klann’s linkage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195 10.3 Mechanisms with Substructures . . . . . . . . . . . . . . . . . . . . . . . . . 198 10.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202 11 Mobility Numbers – Applications

203

11.1 Mechanical advantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203 11.2 Computing Static Force Balance . . . . . . . . . . . . . . . . . . . . . . . . . 209 11.3 Mobility Numbers in a CAD Environment . . . . . . . . . . . . . . . . . . . 210 11.4 Application of Mobility Numbers for Redundancy Resolution . . . . . . . . . 214 11.4.1 Prior art for resolving redundancy . . . . . . . . . . . . . . . . . . . . 216 11.4.2 Redundancy resolution based on mobility . . . . . . . . . . . . . . . . 218

v

11.4.3 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223 11.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226 12 Software Design

227

12.1 Design Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227 12.2 File Format and Data Structures . . . . . . . . . . . . . . . . . . . . . . . . 229 12.2.1 External File Format . . . . . . . . . . . . . . . . . . . . . . . . . . . 230 12.2.2 Internal Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . 237 12.3 Force Modules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239 12.4 Equivalent Graph Representation . . . . . . . . . . . . . . . . . . . . . . . . 241 12.4.1 Directed Graphs vs. Spanning Trees . . . . . . . . . . . . . . . . . . 242 12.4.2 Equivalent Graph Representation . . . . . . . . . . . . . . . . . . . . 245 12.5 Memoization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246 12.6 Graphical User Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248 12.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249

vi

List of Figures 1

A three cycle adder, adapted from [18] . . . . . . . . . . . . . . . . . . . . .

15

2

Simplified representation of memory hierarchy, adapted from [18] . . . . . . .

17

3

A Hydraulic Excavator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

59

4

Spanning tree for hydraulic excavator . . . . . . . . . . . . . . . . . . . . . .

64

5

Modified spanning tree for hydraulic excavator with chords converted to arcs

68

6

Loop v members and nodes in kinematic path of loop v’s base node. . . . . .

73

7

Run times for 10000 gaussian elimination evaluations with complete row and column pivoting for A: Block–diagonalization via kinematic substructuring, B: Sparse factorization using HSL MA48 algorithm, and C: Full, dense gaussian elimination

8

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

86

Run times for 10000 gaussian elimination evaluations without pivoting for A: Block–diagonalization via kinematic substructuring, B: Sparse “fast–factorization” using HSL MA48 algorithm, and C: Full, dense gaussian elimination without pivoting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

9

87

Overview of sparse matrix solution process for preconditioned equations of motion with substructures . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

10

Slider Crank Model shown in Graphical User Interface . . . . . . . . . . . . 146

11

Simulation results for slider–crank . . . . . . . . . . . . . . . . . . . . . . . . 147

12

Bricard’s mechanism moving from (a.) a first planar configuration to (b.) a square configuration to (c.) a second planar configuration . . . . . . . . . . . 151

13

States for Bricard’s Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . 154

14

Constraint error, number of evaluations per reporting interval, energy and Baumgarte constraint stabilization parameter (ωn ) used for Bricard’s mechanism. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155

15

States for excavator simulation. . . . . . . . . . . . . . . . . . . . . . . . . . 157

16

Constraint error, number of function evaluations per time step, energy and Baumgarte constraint stabilization parameter (ωn ) used for excavator. . . . . 158 vii

17

Screen capture of a vehicle rolling chassis . . . . . . . . . . . . . . . . . . . . 159

18

Directed Graph of Vehicle Model . . . . . . . . . . . . . . . . . . . . . . . . 162

19

Fiala model validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167

20

x and y position of vehicle chassis at a constant forward velocity with steering rack displaced 10cm for (a.) 15 m/s and (b.) 25 m/s forward velocity . . . . 168

21

Linear and translational velocities expressed in vehicle chassis frame for (a.) 15 m/s and (b.) 25 m/s forward velocity and linear and translational accelerations expressed in the vehicle chassis frame for (c.) 15 m/s and (d.) 25 m/s forward velocity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

22

A four bar mechanism in an instantaneous stationary position [103] . . . . . 172

23

A 7H Spatial Linkage [103] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176

24

Parameterized slider–crank mechanism . . . . . . . . . . . . . . . . . . . . . 187

25

Mobility numbers for a slider–crank mechanism . . . . . . . . . . . . . . . . 188

26

Bricard mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190

27

Mobility numbers for Bricard’s mechanism . . . . . . . . . . . . . . . . . . . 191

28

Sarrus linkage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

29

Directed graph for the Sarrus linkage . . . . . . . . . . . . . . . . . . . . . . 193

30

Directed graph for the Sarrus linkage with virtual bodies . . . . . . . . . . . 194

31

Klann mechanism (a.)–(c.) leg is in contact with the ground. (d.)–(f.) leg is lifted and moved into position for next “step” . . . . . . . . . . . . . . . . . 195

32

Directed graph for the Klann linkage . . . . . . . . . . . . . . . . . . . . . . 196

33

Directed graph for the Klann linkage with virtual bodies . . . . . . . . . . . 196

34

Mobility numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197

35

Mobility numbers for excavator model

36

Excavator model in graphical user interface . . . . . . . . . . . . . . . . . . . 201

37

Caliper brake utilizing a rocker–link mechanism . . . . . . . . . . . . . . . . 204

38

Directed graph of caliper brake . . . . . . . . . . . . . . . . . . . . . . . . . 205

39

Mobility numbers for bicycle brake mechanism . . . . . . . . . . . . . . . . . 206

viii

. . . . . . . . . . . . . . . . . . . . . 199

40

Mechanical advantage for bicycle brake mechanism . . . . . . . . . . . . . . 208

41

A screen capture of the front steering linkage shown in the graphical user interface. Moving the slider allows repositioning a joint and all joints and bodies in the kinematic substructure containing it. . . . . . . . . . . . . . . . 211

42

Scaling movement by joint mobility . . . . . . . . . . . . . . . . . . . . . . . 212

43

A 7R Robot Arm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214

44

One and two dimensional manifold visualization . . . . . . . . . . . . . . . . 223

45

Double Pendulum Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232

46

directed graph with arcs and chords identified . . . . . . . . . . . . . . . . . 242

47

Spanning–tree representation . . . . . . . . . . . . . . . . . . . . . . . . . . . 245

48

Screenshot of graphical user interface . . . . . . . . . . . . . . . . . . . . . . 249

ix

List of Tables 1

Summary of Notational Conventions . . . . . . . . . . . . . . . . . . . . . .

9

2

Description of Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

11

3

Detail of transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

13

4

Minimal floating point operations required for spatial transformations . . . .

24

5

Floating point operations required for Adjoint transformations using full, dense matrix algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

6

24

Floating point operations required for similarity transformations using full, dense matrix algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

25

7

Identification of Bodies and Joints in the Excavator Example . . . . . . . . .

61

8

Virtual and embedding body id for each kinematic loop . . . . . . . . . . . .

67

9

Compact representation of spanning tree using parent–child list and binary tree 69

10

levels list for the excavator model . . . . . . . . . . . . . . . . . . . . . . .

11

Members of each kinematic loop with loops listed in monotonically increasing

71

order by level in the spanning tree. The base node associated with each loop is the first member of each list. . . . . . . . . . . . . . . . . . . . . . . . . .

73

12

Kinematic substructures and their corresponding chord–loops . . . . . . . . .

76

13

Quantities accessed in P O(1) . . . . . . . . . . . . . . . . . . . . . . . . . . 105

14

Quantities accessed in P O(2) . . . . . . . . . . . . . . . . . . . . . . . . . . 105

15

Quantities accessed in P O(3) . . . . . . . . . . . . . . . . . . . . . . . . . . 106

16

Indices used for matrix factorization . . . . . . . . . . . . . . . . . . . . . . . 130

17

Indices for backward elimination/forward substitution . . . . . . . . . . . . . 130

18

Counters for size of index arrays . . . . . . . . . . . . . . . . . . . . . . . . . 131

19

Stability limits for varying order Adams–Bashforth Adams–Moulton PECE numerical integration (from [83]) . . . . . . . . . . . . . . . . . . . . . . . . 142

20

Body parameters of slider–crank simulation

21

Joint parameters of slider–crank simulation . . . . . . . . . . . . . . . . . . . 147

22

Simulation parameters for slider–crank model . . . . . . . . . . . . . . . . . 148 x

. . . . . . . . . . . . . . . . . . 146

23

Geometric and inertial properties for each link in Bricard example model . . 152

24

Initial configuration of Bricard’s mechanism . . . . . . . . . . . . . . . . . . 153

25

Simulation parameters for Bricard’s mechanism . . . . . . . . . . . . . . . . 153

26

Geometric parameters of vehicle model . . . . . . . . . . . . . . . . . . . . . 159

27

Parameters of vehicle model . . . . . . . . . . . . . . . . . . . . . . . . . . . 160

28

Joint substructure assignments . . . . . . . . . . . . . . . . . . . . . . . . . . 161

29

Fiala tire model parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

30

Joint substructure assignments for the hydraulic excavator model . . . . . . 198

31

Identification of Bodies and Joints in the caliper brake . . . . . . . . . . . . 205

32

Shape matrices for 7R robot . . . . . . . . . . . . . . . . . . . . . . . . . . . 219

xi

Abstract In the present work, a systematic method based on graph theoretic concepts is presented that allows setting up a general mechanism’s governing equations and analyzing transmission performance for a wide range of parametric and topological variations. The algorithms and methods described in this work are designed to be both fully automatic – requiring minimal supervision from an analyst for successful execution, robust – capable of handling instantaneous bifurcations and end-of-stroke conditions, and numerically efficient – through the application of numerical reduction strategies, custom sparse matrix methods and vectorization. In Preliminaries, background on a 6×6 matrix algebra for transforming velocities and accelerations (i.e. twists) and forces and torques (i.e. wrenches) is provided. As numerical efficiency is a central theme in the present work, discussion of data storage and sparse matrix methods is provided, implementation details which are not commonly discussed in other texts on the subject. In the first primary section, the focus is on automatic, graph-theoretic methods for setting up a mechanism’s constraint equations and solving the dynamic equations of motion. A multibody system’s constraint equations, i.e. the Jacobian matrix, plays a central role in the equations of motion, and is almost never full-rank, which complicates the solution process even for relatively simple systems. Therefore, Generalized Coordinate Partitioning (GCP), a numerical method based on LU decomposition applied to the Jacobian matrix is applied to find the optimal set of independent, generalized coordinates to describe the system. To increase the efficiency of the GCP algorithm, a new general purpose graph-partitioning algorithm, referred to as Kinematic Substructuring is introduced and numerical results are provided. Furthermore, a new numerical implementation of solving the equations of motion, referred to as the Preconditioned Spatial Equations of Motion is presented and new sparse matrix solver is described and demonstrated in several numerical examples. In the second primary section, it is shown how a simple numerical procedure applied to a mechanism’s constraint equations can be used as a measure of transmission performance. The xii

metric, referred to as mobility numbers provides an indication of a joint’s ability to affect a change on a mechanism’s overall configuration and is directly related to a mechanism’s instantaneous mobility. The relationship between mobility, transmission and manipulability is discussed. Unlike many other measures of transmission performance, mobility numbers are normalized and bound between 0 and 1, and can be computed simply and efficiently from the Jacobian matrix using LU and QR matrix decomposition methods. Examples of applications of mobility numbers are provided. Finally, in the last section, aspects of software design, including external and internal storage formats and memoization programming methods are discussed.

xiii

Acknowledgments The success of any endeavor depends largely on the encouragement, support and guidance of many others. I take this opportunity to express my gratitude to those who have proven instrumental in the successful completion of this project. I extend my thanks and appreciation to my advisor, Prof. Bahram Ravani and to my committee members Profs. Harry Cheng and Steven Velinsky for providing valuable feedback on the preparation of this manuscript. Special thanks to the staff at Full Speed Ahead (FSA), and particularly to Ron Correa, for kindly providing the 3D model of the brake used in Section 11.1. Thank you to Raeita Mehraban Teymouri for helping to collect geometric parameters of the vehicle chassis used in Section 7.4. I also thank Alex Kiani for drawing the 3D model of the Klann linkage shown in Section 10.2.2 as part of a summer student internship. Furthermore, I acknowledge the support of the International Research and Teaching Group (IRTG 2057). The IRTG 2057 program provided me with a unique opportunity to present my work at conferences at the Technical University of Kaiserslautern, Germany and Monterrey, California and to engage in interesting discussions with international students during their exchanges here at the University of California, Davis and during my exchange in Kaiserslautern, Germany. Finally, I extend my gratitude to my parents, Roger and Patricia Wehage for providing me with the means and encouragement to follow my dreams and to my wife, Teri, and daughter, Willow Wehage, for their unending support, love and patience.

xiv

1

Introduction

Starting in the 1950s and 60s with the advent of mainframe computers, there has been increasing interest in the application of computational multibody dynamics algorithms for the design and performance assessment of machines and mechanisms. Most early applications of computational multibody dynamics involved a small number of experts who formulated hand–optimized code for a particular problem. With the successful application of these programs and rapid advancement in computing power came the demand for general–purpose programs that could handle a broader range of mechanical systems and therefore be accessible to a broader range of design engineers and analysts. One of the earliest general–purpose programs capable of solving time histories for 2D planar systems undergoing large displacement dynamic motion was DAMN (Dynamic Analysis of Mechanical Networks) in 1969 [1], which led to the later development of DRAM (Dynamic Response of Articulated Machinery) starting in 1971 [2]. Starting in the 1970s and early 1980s, general–purpose software for 3D mechanical systems, such as IMP (Integrated Mechanisms Program) [3], MSC ADAMS (Automated Dynamics of Mechanical Systems) [4, 5], and LMS DADS (Dynamic Analysis Design Systems) [6] became prevalent. More recently, a number of general–purpose multibody dynamics analysis software tools, both free and commercial, have become available to engineers and analysts interested in assessing the performance of their designs. With the widespread integration of multibody dynamics software into the product development process, general–purpose multibody dynamics software can be viewed as a “commodity”. Simply by specifying a number of bodies and connecting them together with joints, design engineers can rapidly assemble complex virtual mechanisms and deploy simulations without a detailed understanding of the underlying algorithms. Many larger organizations typically have “champion” users to assist design engineers in model development and interpretation of simulation results, but smaller organizations may not. As dynamic models are increasingly being set up by non–specialists, it is critical that general–purpose software be robust – i.e. capable of automatically selecting a suitable set of generalized coordinates to 1

describe any arbitrary system topology and reliably handling events like bifurcations and end–of–stroke conditions. Along with the need for generality comes the need for algorithm performance. Although computer hardware has become faster in recent years, modern design engineers are finding new applications that push the capabilities of general–purpose multibody dynamics software to the limits. Some example applications include design sensitivity analysis, Monte Carlo simulation, motion planning and hardware or human–in–the–loop simulations. In these various applications it is often desired that algorithms run in real–time or many times faster than real–time. The drive for generality and execution speed are often at odds with one another. Some methods sacrifice generality for speed. For example, hand–optimized code for a particular problem is typically the fastest. The disadvantage to hand–optimized code is that it is far from being general, typically requiring a specialist to rewrite the code for each new problem encountered. Another example of sacrificing generality for speed are the many O(n) and and O(nlog(n)) algorithms for serial chain and tree topologies, most based on the work of Featherstone [7, 8]. Although some progress has been made on adapting these algorithms to handle topologies with multiple closed–loops (see for example [9, 10]), many of the adaptations to handle loop constraints require significant additional overhead. Furthermore, care must be taken to apply the methods in a reliable, robust and automatic way. Due to the additional overhead required to adapt O(n) or O(nlogn) algorithms to handle systems with closed loops, Featherstone advocates solving systems of equations with loop constraints using the index–1 Euler–Lagrange equations [11], discussed in Section 3. On the other hand, some algorithms sacrifice speed for generality. For example, although the MSC ADAMS software does benefit from sparse matrix methods, the states of the multibody system are represented as a maximal number of generalized coordinates (i.e. six cartesian coordinates for each body), which requires solving a larger system of equations at each time step than if joint coordinates were used. If carefully applied, methods that set up and solve a system of equations for a minimal

2

set of independent, generalized coordinates have the potential to offer an excellent combination of generality and execution speed. The primary motivation for using a minimal set of independent variables (i.e. joint coordinates) to describe the system is that the resulting equations are simpler and more efficient to solve. One of the primary objections to using joint formulation is that the methods are conceptually more difficult to understand and implement for complex topologies – fortunately the tools of graph theory provide not only a systematic framework to set up and solve the equations, but also provide valuable insight to their underlying structure. Another objection to using joint formulation is that the resulting equations are typically differential–algebraic (DAE) in nature – however modern numerical techniques, such as Generalized Coordinate Partitioning (GCP) [12, 13], can be applied to efficiently and automatically identify an optimal set of independent generalized coordinates, thereby converting the equations to ordinary differential equations and simplifying the solution process. This thesis is focused on the development and implementation of algorithms that maximize both the generality and execution speed of joint–formulation approaches. A common theme in this thesis is that all techniques are designed to take as input a representation of a mechanical system as a list of bodies connected together by a number of joints. Each body may contain a number of shape matrices which specify attachment points for other bodies. In this way, a user has created a directed–graph representation of their multibody system. This representation of a mechanical system should be familiar to most engineers, who are accustomed to assembling mechanisms in a Computer Aided Design (CAD) environment in a similar fashion. In this thesis, extensive preprocessing is performed on a user’s directed graph representation. The process begins with automatic identification of loops, loop members and the root– oriented directed graph of the mechanism (i.e. the spanning tree), thereby reducing burden on the analyst, eliminating potential user-error in the model setup process and increasing simulation throughput. The preprocessing phase continues by automatically traversing the user’s topology to identify clusters of overlapping chord loops, which are referred to in this

3

thesis as Kinematic Substructures. With the substructures identified, a numerical reduction can be applied to the Generalized Coordinate Partitioning algorithm and solution of the equations of motion. In the final preprocessing step, the minimum number of operations required to set up and solve the equations of motion are recorded (i.e. memoized ) so that they can be executed at run–time with minimal overhead. The goal of the extensive preprocessing is to simplify the equations of motion so that the resulting execution simplicity and speed approaches that of hand–optimized code. In addition to the preprocessing steps described above, another important technique presented in this thesis is the heuristic application of Generalized Coordinate Partitioning. After a simulation is launched, the optimal split of dependent and independent variables is subject to change, so the simulation is monitored at run–time to determine if GCP should be reapplied. If it is determined to reapply GCP, a new split of independent and dependent variables is obtained automatically, and the minimal number of operations required to solve the equations are reevaluated and stored for subsequent playback. For many mechanisms, the split of independent and dependent variables remains valid throughout the entire simulation and GCP need never be called again after an initial application. For other simulations, the split of independent and dependent variables changes occasionally, and GCP is only applied on an as–needed basis, significantly reducing computational overhead. For paradoxical mechanisms such as Bricard’s mechanism, discussed in Section 7.2, the split of dependent and independent variables changes at every cycle of mechanism motion. The paradoxical nature of Bricard’s mechanism causes problems for some multibody dynamics software, but the heuristic application of GCP presented in this thesis efficiently and automatically processes the equations of motion for Bricard’s mechanism, demonstrating an excellent combination of generality and execution speed. The first part of the thesis focuses on methods to automatically set up and solve the equations of motion for complex mechanisms. The second part of the thesis provides tools for design engineers and analysts to set up simulations and analyze the transmission performance of the models they create. Mobility, i.e. the total degrees of freedom of a mechanism, is an

4

important property of a multibody system as it indicates how many inputs are required to control the system. Over the years, many authors have attempted to create simple formulas to compute mobility based on the number of bodies and joints. Unfortunately these simple formulas fail to compute the correct mobility for some paradoxical and exceptional linkages. In his seminal review paper on mobility, Gogu [14] states that perhaps the only reliable way to compute the instantaneous mobility of a system is by subtracting the rank of the mechanism’s constraint equations from the number of joints. Gogu goes on to say that the method is generally not accessible for analysis of complex mechanisms due to the difficulty in setting up constraint equations. Fortunately, the algorithms presented in the first part of this thesis can be used to automatically set up constraint equations, even for very complex systems. Furthermore, the kinematic substructuring algorithm is applied so that mobility analysis can be performed locally on each substructure. The method suggested by Gogu is equivalent to computing the rank of the null–space of the constraint equations. In this thesis, it will be shown that much more information can be obtained from the null–space of the constraint equations, including a new metric of transmission performance called mobility numbers. Mobility numbers can be computed efficiently as a byproduct of Generalized Coordinate Partitioning. A number of examples and practical applications of mobility numbers are provided. The chapters of this thesis are arranged as follows. In Chapter 2, an overview of the notation used in this work is provided. Chapter 3 introduces some preliminary information for the development of concepts in this thesis. First, a brief introduction to processorbased information involved in achieving better performance of multibody dynamics code is provided. Then a brief overview of geometric and kinematic concepts and rules are given in preparation for developing the spatial equations of motion in Cartesian space and converting them to Lagrangian form in joint space. Finally the concept of Generalized Coordinate Partitioning is introduced, which will play an integral part of obtaining robust and efficient solver algorithms. Chapter 4 introduces the concept of preconditioning the constraint equations. The pre-

5

conditioning process is used to reduce the factorization and solution complexity when using sparse matrix methods. Some algorithmic advantages of preconditioning the equations are summarized. Chapter 5 introduces the concept of Kinematic Substructuring. Automating the process of identifying the kinematic substructures and reorganizing the constraint equations and joint variables is a complex and important part of the algorithm optimization process. Chapter 5 also demonstrates how reordering variables into substructures impacts setting up the generalized equations of motion. Chapter 6 gives a more detailed description of the sparse matrix methods used to factor and solve the preconditioned equations of motion. Note that the Kinematic Substructuring algorithm identifies a permutation of bodies and joints into level–increasing order within each substructure. Generalized Coordinate Partitioning identifies a new permutation order with dependent variables first within each substructure, followed by independent variables. To minimize matrix fills during sparse factorization and solution of the preconditioned equations of motion, a third permutation order with independent variables moved to the beginning of each substructure, followed by dependent variables in their original level–increasing order within each substructure is used. The ordering identified by Kinematic Substructuring is called permutation order 1, the order identified by Generalized Coordinate Partitioning is called permutation order 2; and the order necessary for robust sparse factorization and solution of the constrained equations of motion is called permutation order 3. Chapter 6 introduces a number of tools to efficiently book–keep the various permutations and efficiently work between them. Additionally, Chapter 6 describes how to set up the sparse factorization and solve process. In Chapter 7 a number of real-world example problems with kinematic loops are presented to demonstrate the automatic setup and solution process. The standard slider-crank model is simulated to demonstrate adaptive Baumgarte stabilization techniques. Bricard’s mechanism, with its paradoxical loop constraint equations, is simulated. The techniques that have been developed to reliably and accurately simulate Bricard’s mechanism serve to

6

make all simulations more reliable and accurate. Additional hydraulic excavator and steered four-wheel vehicle models were set up and simulated to illustrate solving the equations with multiple kinematic substructures. Chapter 8 provides an overview of a number of techniques that have been developed over the past decades to quantify transmission performance, i.e. the ability of various one degreeof-freedom mechanisms to transfer effort from an input link or joint to another output link or joint, including some of their advantages and shortcomings. Additionally, the concept of its mobility is introduced and its connection to other concepts such as manipulability are discussed. In Chapter 9 the concept of computing joint mobility numbers from the right null-space matrix of a substructures Jacobian matrix is introduced. The right null-space matrix may be obtained efficiently as a byproduct of the Generalized Coordinate Partitioning method, and joint mobility numbers may be obtained by first orthonormalizing the columns of this matrix and squaring the norm of individual rows of the resulting matrix. The method can be considered a short–cut to obtaining the orthonormalized right null–space basis compared to the more expensive SVD algorithm, which could also be used to obtain the same information. In Chapters 10, examples of computing mobility numbers for several single and multiloop mechanisms are provided including the slider–crank mechanism, Bricard’s mechanism and Klann’s walking linkage. In Chapter 11, it is shown that mobility numbers have a number of practical applications including positioning of multiloop models in a CAD environment, computing mechanical advantages, obtaining a static force balance and as a tool to analyze redundacy in multidegree–of–freedom systems. All of the methods and techniques presented in this work are demonstrated in a software tool with an intuitive graphical user interface that is inspired by modern CAD software. Chapter 12 highlights some of the unique aspects of the software, including information about the datastructures and a new external file–format based on the JSON syntax. The software is designed to be modular with support for force modules that allow feeding back forces into a running simulation as a function of displacements, velocities or accelerations

7

between coordinate systems or within joints.

8

2

Notation

2.1

Conventions Table 1: Summary of Notational Conventions example

description

reserved for

A

capital symbols

Matrices

a

lowercase symbols

Scalar quantities

a

underlined symbols

Vector quantities, or a column matrix of stacked vector quantities



overhead dot

First time derivative

a ¨ h i

overhead double dot

Second time derivative

square brackets

m × n matrices where both m and n are typically greater than 1

n o

curly brackets or braces

m × 1 column matrix

vff b,f

bold

6×1 screw quantities, such as spatial velocities

lined

face/under-

and accelerations. The superscript is used to convey the coordinate system that the quantity is expressed in. The value after the comma in the subscript indicates the point that the quantity is expressed at. The letters f and b denote that the quantity is from f to b in the active sense.

9

∗f f ∗f f b,f , f b,f

bold face, under-

6 × 1 coscrew quantities, such as forces. The

lined, asterisk

superscript f indicates that the quantity is expressed in the fixed inertial reference frame. The quantity after the comma in the subscript indicates that the quantity is expressed at the origin of the fixed inertial reference frame. The notation f b before the superscript indicates that the quantity is applied from the fixed inertial reference frame to the body frame b (such as a spatial wrench acting from the inertial reference frame to body b). In the second case, the notation b indicates that the quantity is a property of body b.

10

2.2

Symbols Table 2: Description of symbols symbol

name in program

description

q

q

generalized coordinates



dq

first time derivative of generalized coordinates (generalized velocities)

¨q

ddq

second time derivative of generalized coordinates (generalized accelerations)

PC

pcl

parent child list

λp

lambda p

lagrange multipliers

e

e

constraint error

f

f

working forces applied within joints

γp

gamma p

quadratic velocity terms (−J˙q˙ − e(t)) ˙

M`

Ml

generalized mass matrix

λd

lambda d

preconditioned lagrange multipliers

γd

gamma d

preconditioned quadratic velocity terms

f`

fl

generalized forces

J

J

Jacobian matrix, J(q), a mapping of body velocities to generalized velocities

N

N

The right null–space of the Jacobian matrix

B di

B

A useful intermediate quantity used to compute the right null–space of the Jacobian matrix. B di = (U pi )−1 UR . U pi and UR are found using LU decomposition with complete row and column pivoting.

Z

Z

An orthonormalized right null–space basis of the Jacobian matrix

11

p∗

p

spatial momentum

h

h

In this work, h is referred to as an influence coefficient matrix. In the literature h is sometimes referred to as the joint’s subspace map or partial velocity. h is a unit velocity screw

vff b,f

v

(i.e. twist) associated with    joint motion. v f  f b,f . a 6 × 1 velocity screw. vff b,f =  ωf  fb

The superscript is used to convey the coordinate system that the quantity is expressed in. The value after the comma in the subscript indicates the point that the quantity is expressed at. The letters f and b denote that the velocity is from the inertial reference frame f to the body frame b. g∗f b,f

g

externally applied spatial wrenches, such as from gravity, contact or dampers  springsand   ff  b  = that are not in a joint g∗f b,f τ f  b,f



PO

linear permutation array

P

N/A

two–dimensional row and column permutation matrix

12

2.3

Transformations Table 3: Detail of transformations

symbol

operation

description

Rab

R ∈ R3x3 , an SO(3) rotation matrix, specifying a rotation of a vector from reference frame a to b in the active sense and from b to a in the passive sense

R

ab−1

R

abT

raab

raab ∈ R3 , a translation from a to b in the active sense, expressed in reference frame a

ra−1 ab

˜r

ab a Φab ab (R , r ab )

−raab  0 −rz ry    rz 0 −rx  −ry rx 0   Rab raab   0 1

 r˜ ∈ R3×3 , the skew-symmetric cross product

   

operator matrix

ab a Φab ab (R , r ab ) ∈ SE(3), a general rigid body

displacement consisting of rotation Rab and translation raab .

Φab−1 ab

[Ad]ab ab

[Ad]ab−1 ab

[Ad]∗ab ab

  abT abT a R −R rab  = Φba  ba 0 1   a ab ab R ˜rab R   0 Rab   RabT −˜rbab RabT   abT 0 R   ab R 0   a ab ab ˜rab R R

13

Adjoint operator for transforming screws

inverse of Adjoint operator for transforming screws

Coadjoint operator for transforming coscrews

[ad]aab,a , [ad]vaab,a

∗ [ad]∗a ab,a , [ad]va ab,a

  a a ω ˜ v˜  ab ab,a  0 ω ˜ aab   a ω ˜ 0  ab  a a ˜ ab v˜ab,a ω

time derivative operator of [Ad], or left screw cross product operator time derivative operator of [Ad]∗ , or left coscrew cross product operator

14

3

Preliminaries

This chapter reviews geometric concepts necessary for analyzing kinematics and dynamics of mechanisms. Many of the operations described in this chapter are presented elsewhere in the literature (see for example the work of Selig [15] or Murray et al. [16]), however these texts focus primarily on geometric and algebraic concepts, rather than on numerical efficiency and computer implementation. Featherstone (see for example [11, 17]) uses a matrix algebra similar to Selig and Murray et al. and is well-known for his focus on numerical efficiency. Note, however that Featherstone’s primary focus is on reduction in Floating Point Operations (FLOPs), and he does not discuss other important numerical aspects, such as data storage schemes, cache efficiency and pipelining of arithmetic operations (i.e. vectorization). Therefore, this section briefly introduces important numerical concepts in the context of a geometric framework that will be used throughout the work.

3.1 3.1.1

Numerical Preliminaries Pipelining / Vectorization

Pipelining, or vectorization, of arithmetic operations is the primary reason why vector computers are fast. Golub and Van Loan[18] describe floating point arithmetic as analagous to an assembly line production. For example, the addition of 2 floating point numbers, x and y, requires multiple operations, as illustrated in Fig. 1.

Figure 1: A three cycle adder, adapted from [18]

In Fig. 1, the sum is computed after three cycles, spending one cycle at each of the three stations. When a single scalar addition is performed, only one of the three stations is active at any given time. However, if multiple additions are chained together, each of the stations remain active and when the pipeline is full the speed of vector addition is approximately 3 15

times faster than scalar addition. The stations in Fig. 1 are known as vector registers. The registers are the fastest functional units of memory and are of a fixed size. On recent Intel processors at the time of this writing, vector registers generally have a size of 256 bits. Vector registers of this size can process four double precision numbers simultaneously, resulting in up to a twelve–fold speedup over scalar operations when the pipeline is full. A vector pipeline computer comes with a collection of vector instructions for performing arithmetic operations, including vector add, vector multiply, vector swap, dot product, saxpy y ← αx + y and others. Intel’s set of vector instructions is known as SSE (Streaming SIMD Extensions) [19]. Unfortunately, SSE vector instructions are typically accessible only from Assembly language. Higher level languages such as C and C++ do not provide direct access to the register. When using a higher level language such as C or C++, it is the job of the compiler to convert the user’s code to vectorized instructions. Modern compilers are good, although not perfect, at analyzing access patterns within a program to identify potential opportunities for vectorization. If more direct control of data movement to the register is desired, it is possible to write inline Assembly code within a C/C++ program or use wrapper functions to SSE instructions. A portable way to achieve vectorization is to use the well-known BLAS (Basic Linear Algebra Subroutine) library [20]. BLAS provides a set of core functions for vector/vector algebra (Level 1 BLAS), matrix/vector algebra (Level 2 BLAS), and matrix/matrix algebra (Level 3 BLAS) and optimized BLAS libraries are available for many platforms. For example, Intel offers a version of BLAS as part of their MKL (Math Kernel Library). Apple offers an optimized version of BLAS as part of their Accelerate framework. The BLAS library is a convenient and portable way to write vectorized code that is compatible across multiple computing environments. For these reasons, BLAS is commonly used in many high-performance scientific computing applications. In the present work, BLAS is used for vectorization of memory swap, daxpy operations used in dense matrix factorization and most matrix–vector operations. Many of the Level 1

16

BLAS (i.e. vector–vector operations) for 3×1 vectors and 6×1 screws do not benefit from BLAS – in these cases hand–written functions in C with unrolled for loops performed about 20% faster than the BLAS equivalents. Writing SSE vector instructions or using Intel’s Integrated Performance Primitive (IPP) Library [21] in a multibody dynamics framework is a candidate for future improvement to the code. 3.1.2

Memory Access

On modern computers, the Central Processor Unit (CPU) clock speed is measured in gigahertz, whereas memory access is typically measured in the megahertz range. To resolve the approximate order of magnitude speed difference between a computer’s main working memory (i.e. Random Access Memory or RAM ) and the fast functional units on the CPU (i.e. the register ), modern processors are equipped with a cache, as shown in Fig. 2. The cache is a fast region of memory that is used as a buffer between the main memory and functional units.

Figure 2: Simplified representation of memory hierarchy, adapted from [18]

At each level in the memory hierarchy, speed increases, but for economic reasons, the size decreases. There is a cost to moving data between levels, roughly equivalent to the cost associated with performing a saxpy/daxpy or dot product operation [18]. To minimize the cost associated with moving data between the main working memory and the cache, when data is requested from the computer’s main working memory for the first time, the unit of data and the surrounding data, adjacent in contiguous working memory (i.e. a cache block or cache line) [22], are moved to the cache. On the next function call, the cache is checked to 17

see if the requested data is already there. If the data is present, it is loaded from the cache to the register. This case is known as a cache hit. If the data is not present on the cache, it must be loaded from the computer’s main working memory to the cache, then to the register. The second case incurs a greater cost in speed and is known as a cache miss. Arranging data used for performance-critical operations in contiguous, closely-packed memory is an important strategy to avoid cache misses [23]. 3.1.3

Vector Stride

Stride is the distance in logical memory locations between a vector’s components. In Fortran, two-dimensional arrays are stored in column–major order. Therefore, column components are accessed with a unit stride and row components are accessed with a stride equal to the column dimension. In the C/C++ programming language, the convention is the opposite, with data stored in row–major order. Non-unit stride operations interfere with a computer’s pipelining capability, and will degrade performance [18]. For general matrix–vector and matrix–matrix operations, there is not a penalty for accessing quantities in row vs. column– major order as the vector instructions can be rearranged to access quantities with a unit stride in either case. There are two motivations for choosing column–major ordering in a multibody dynamics simulation environment: 1. OpenGL matrices are specified in columnmajor order [24] and 2. if using full, dense algebra to process the Jacobian matrix, elements of the Jacobian matrix can be vector copied with a unit stride.

3.2

Geometric Preliminaries

In the early 1800s, Chasles showed that any change in the configuration of a rigid body can be parameterized in terms of a helical motion along a fixed axis [25]. Poinsot is credited with the discovery that a system of forces acting on a rigid body can be resolved as a force acting along a line combined with a torque acting about the same line [26]. Chasles’ and Poinsot’s work was later developed by Sir Robert Ball into the theory of screws [27]. Screws represent angular and translational quantities in a single 6×1 vector quantity. Screws 18

provide a systematic framework for kinematics and dynamics of mechanisms. Furthermore, their compact representation is well–suited for computer storage and facilitates vectorization. In the present work, screws are discussed with reference to 4×4 matrices for transforming homogeneous coordinates, which are commonly used to describe rigid body displacements [28]. In group theory, these 4×4 matrices are referred to as the Special Euclidian group, SE(3). Another important group is the Special Orthogonal group, SO(3), the set of 3×3 orthogonal matrices for rigid body rotation with a determinant of one. Note that screws can be formalized in terms of group theory and Lie algebra. A detailed discussion of group theory and Lie algebra is beyond the scope of the present work, but some of the symbols for groups and operators are used to be consistent with the literature. The interested reader is referred to [29, 15, 16, 30] for more information. 3.2.1

Transforming Screws

Linear and angular velocity can be combined into a single 6×1 quantity as:

vbf b,b =

  v b  f b,b

(1)

 ωb  fb

Here the superscript denotes which Cartesian reference frame the quantity is expressed in. In this work, the letter f is used to designate the fixed inertial reference frame. The f b in the subscript indicates that the linear and angular velocities are from the inertial reference frame f to the body frame b. Velocities and accelerations may be expressed at any point – the b after the comma in f b, b indicates that the linear velocity is taken at the body reference frame, b. The b in superscript indicates that the quantities are expressed in the body frame coordinates. In 1873, Clifford introduced the word motor to refer to this screw quantity (i.e. a combination of motion and vector ) [31]. Later, Ball referred to this screw quantity as a twist [27]. The term twist is more common in modern literature, and is therefore used in the present work.

19

The 6×6 adjoint transformation matrix, [Ad], for transforming twists from one coordinate system to another can be formed from SE(3) matrices. The derivation is briefly summarized as follows. Consider a rotation about the z axis given by the Special Orthogonal SO(3) matrix 

raab = Rab rbab

 cosθz −sinθz 0     = sinθz cosθz 0 rbab   0 0 1

(2)

Then r˙ aab,a

h i ab b ab abT ˙ ˙ = R rab = R R Rab rbab

when rbab is constant in frame b, and      0 −θ˙z 0 cosθz sinθz 0 −θ˙z sinθz −θ˙z cosθz 0     h i       ˜z R˙ab RabT =  θ˙z cosθz −θ˙z sinθz 0 −sinθz cosθz 0 = θ˙z 0 0 = ω      0 0 0 0 0 0 0 0 1 The tilde operator here indicates a skew–symmetric matrix formed from a vector as   0 −ωz ωy     ω ˜ =  ωz 0 −ωx    −ωy ωx 0 In Eq. (4), ω ˜ z is the skew symmetric matrix formed by the angular velocity ω z      0    0          ωz = 0 = 0          ˙      θz ωz The derivation can be extended to SE(3) matrices for rigid body transformations as        ab a ra  r b  r b  R r ac bc ab bc ab   = = Φab 1 1 0 1 1 and with additional manipulation the following can be obtained        abT abT a ab a r˙ a  h r b  i ˙ ab r˙ a R r R R −R r ac ab,a ab ab bc ab b ab ab−1 ˙       = Φab Φab Φab rbc = 0 0 0 0 1 0 1 1 20

(3)

(4)

(5)

(6)

(7)

(8)

Simplifying yields 

ab−1 Φ˙ ab ab Φab



0 −ωz ωy vx         a a a ab abT ab abT a   ˙ ˙ ωz 0 −ωx vy  ω ˜ ab v ab,a R R −R R rab + r˙ ab,a =  =  =   0 0 0 0 0 vz  −ωy ωx   0 0 0 0 (9)

The quantities in Eq. (9) can be arranged into a 6×1 column matrix, i.e. a velocity screw, also known as a twist, vaab,a . The superscript indicates that the quantity is expressed in the coordinate system of frame a. The subscript indicates that the quantity is expressed at point a.

vaab,a =

  a     v(x)ab,a          a   v   (y)ab,a          v a (z)ab,a

  a   ω(x)ab           a   ω   (y)ab        a   ω(z)ab 

=

  v a  ab,a

(10)

 ωa  ab

In the language of modern differential geometry, velocity screws are known as the Lie algebra of the SE(3) group; their group is designated as se(3) [30]. The quantity in Eq. (10) can be expressed in other coordinate systems, such as the inertial reference frame, f , by a similarity transformation.       ω ˜ fab v fab,f Rf a rff a ω ˜ aab v aab,a Rf aT −Rf aT rff a   =   0 0 0 1 0 0 0 1   f aT fa a f aT f fa a fa a R ω ˜ ab R R v ab,a − R ω ˜ ab R rf a  = 0 0

(11)

In computer implementations of the spatial velocity transformation, it is not recommended to perform the sequence of 4×4 matrix operations. Not only can one exploit matrix sparsity, but it is also possible to take advantage of the following identity to reduce the number of operations. As the quantity Rf a ω ˜ aab Rf aT is skew symmetric, it is only necessary to compute 21

the terms ω fab as f fa fa fa fa fa fa fa fa fa fa fa fa a a a ω(x)ab = (R(yy) R(zz) −R(yz) R(zy) )ω(x)ab +(R(yz) R(zx) −R(yx) R(zz) )ω(y)ab +(R(yx) R(zy) −R(yy) R(zx) )ω(z)ab

(12) fa fa fa fa fa fa fa fa f fa fa fa fa a a a +(R(xx) R(zz) −R(xz) R(zx) )ω(y)ab +(R(xy) R(zx) −R(xx) R(zy) )ω(z)ab ω(y)ab = (R(xz) R(zy) −R(xy) R(zz) )ω(x)ab

(13) fa fa fa fa fa fa fa fa f fa fa fa fa a a a +(R(xx) R(yy) −R(xy) R(yx) )ω(z)ab +(R(xz) R(yx) −R(xx) R(yz) )ω(y)ab ω(z)ab = (R(xy) R(yz) −R(xz) R(yy) )ω(x)ab

(14) The matrix Rf a is orthonormal and therefore the second row of Rf a crossed with the third row yields the first row, and the following is true fa fa fa fa fa = R(xx) R(zy) − R(yz) R(zz) R(yy)

(15)

fa fa fa fa fa R(yz) R(zx) − R(yx) R(zz) = R(xy)

(16)

fa fa fa fa fa R(yx) R(zy) − R(yy) R(zx) = R(xz)

(17)

fa fa fa a a a a ω(z)ab ω(y)ab + R(xz) ω(x)ab + R(xy) ω(x)ab = R(xx)

(18)

fa fa fa a a a a ω(y)ab = R(yx) ω(x)ab + R(yy) ω(y)ab + R(yz) ω(z)ab

(19)

fa fa fa a a a a ω(z)ab = R(zx) ω(x)ab + R(zy) ω(y)ab + R(zz) ω(z)ab

(20)

ω fab = Rf a ω aab

(21)

and therefore

and       f fa a f a ω a ] Rf a v a ^ f a ω a ]r f ^ f a ω a ] Rf a v a ω ˜ fab v fab,f [R^ − [R [R + ˜ r R ω ab,a ab,a ab ab ab f a ab fa  = =  0 0 0 0 0 0 (22) Using Eq. (22), the 6×6 Adjoint transformation matrix for transforming twists can be generated as vfab,f

  v f 

     f fa fa  a  v a  R ˜ r R v ab,a ab,a ab,f fa  = = [Ad]ff aa = [Ad]ff aa vaab,a =  ωf    fa a  a  0 R ω ab ω ab ab 22

(23)

In the language of differential geometry, [Ad] is known as the Adjoint representation of a similarity transformation – Eq. (23) replaces the similarity transformation in Eq. (11) with an equivalent left matrix multiplication. [Ad] provides a linear mapping of elements of a Lie algebra and forms a 6×6 screw basis. A similar matrix operator for transforming spatial velocities was introduced by Yuan and Freudenstein in 1970 [32]. Some authors put ω above v as Yuan did. The form of spatial velocity specified in Eq. (23) was chosen in this work as it is the original order specified in Clifford’s Algebra [33] and is commonly used in flexible multibody dynamics formulations [34]. Some authors refer to velocity screws expressed at a point on a body (i.e. point a in Eq. (10)) as the true velocity, and velocity screws expressed at the inertial reference frame (i.e. point 0 in Eq. (23)) as the spatial velocity [16]. Although the many subscripts and superscripts may seem cumbersome, the notation is useful to clearly convey the point at which the screw quantities are expressed. The primary advantage to working with screw quantities that have been transformed to a common stationary inertial reference frame is that relative velocities can be found simply by a vectorized subtraction of one screw quantity from another. In this work, spatial and true screw quantities are clearly identified by the point identifier symbol following the comma in the subscript. The 6×6 screw basis, [Ad], provides a convenient and compact notation for transforming screws, and therefore in computer implementation of the transformation, it is tempting to carry out full 6×6 matrix algebra to transform twists. However, it is strongly emphasized that the full 6×6 matrix algebra should never be carried out in computer code, and additionally that the full 6×6 [Ad] adjoint transformation matrix need not be stored or created in computer memory. In the present work, all rigid body displacements are stored as a 12×1 array with the entries of a SO(3) rotation matrix stored in column–major order in the first 9 entries of the array, followed by a translation vector in the last 3 entries. When computing v fab,f and ω fab in Eq. (23), it is only necessary to multiply the non-zero terms of the various quantities as

23

  v f  ab,f

 ωf 

=

ab

 R f a v a

ab,a

+ fa

ω aab

R





˜rff a Rf a ω aab 

(24)



The number of floating point operations required to compute Eq. (24) are shown in Table 4. On the other hand, carrying out full, dense matrix algebra requires the operations shown in Table 5. The operations required for using full dense matrix algebra to perform the similarity transformation in Eq. (11) are summarized in Table 6. Table 4: Minimal floating point operations required for spatial transformations Operation

Multiplies

Adds

Rf a ω aab

9

6

˜rff a Rf a ω aab

6

3

Rf a v aab,a

9

6

˜rff a Rf a ω aab + Rf a v aab,a

0

3

total

24

18

Notes

Rf a ω aab computed above

Table 5: Floating point operations required for Adjoint transformations using full, dense matrix algebra Operation

Multiplies

Adds

Notes

˜rff a Rf a

27

18

Counts refers to the full, dense version.

Full, dense Eq. (23)

36

30

total

63

48

24

Table 6: Floating point operations required for similarity transformations using full, dense matrix algebra Operation

Multiplies

Adds

Notes

−Rf aT raf a

27

18

Counts refers to the full, dense version.

Full, dense Eq. (11)

128

96

total

155

114

Carrying out the minimal operations to transform screw quantities requires 62% fewer operations compared to full dense algebra applied to an adjoint transformation and 85% fewer operations compared to full dense algebra applied to a similarity transformation. Furthermore, using the minimal number of operations requires 66% less memory and many of the matrix–matrix and matrix–vector operations benefit from vectorization. The above example illustrates that although the notational convention of Lie algebra provides a convenient and compact representation of kinematic and dynamic quantities, care should be taken in the computer implementation of the operations. In the present work, a custom kinematics library consisting of over 200 functions for processing screw quantities and SE(3) displacements has been developed that uses the minimal set of operations as described above. The library takes advantage of vectorization and matrix sparsity as appropriate. 3.2.2

Coscrews

Just as the linear and angular velocities can be combined into a single 6×1 column vector, forces and moments can be combined into a single quantity called a wrench. A wrench is dual to a screw, i.e., it is a linear operator on a screw. In other words, screws describe quantities related to the configuration of a mechanism (i.e. velocities, accelerations, etc.) and coscrews describe quantities that are linear operators on screws (i.e. forces, momentum, etc.). These two quantities can be considered as operating in two spaces: configuration space and function space. In older literature, a wrench is often referred to as another kind of

25

screw. In modern literature, wrenches are referred to as coscrews [15]. To clearly convey the difference between the two spaces, a superscript



is used to distinguish coscrew quantities

from screws. A wrench, f ∗ , is defined as

f ∗b f b,b =

   fb  fb

τ b  f b,b

=

  b    f(x)f b            fb    (y)f b           fb  (z)f b

(25)

  b   τ(x)f  b,b          b   τ   (y)f b,b          τ b (z)f b,b

As with screws, the superscript here indicates that the coscrew is expressed in reference frame f . The f b in the subscript indicate that the force is applied from the fixed frame f to body b and the term after the comma in the subscript indicates that the wrench is expressed at point b. If a force or torque is applied directly to a body, then only a single body identifier could be used. In coscrews, the translational component is bound and the angular component is free. Note that screws are exactly the opposite, with the linear components free and the angular components bound. Referencing the free quantities to a point effectively specifies the bound quantities. For example, once a point has been chosen to express the translational velocity, the rotation axis must pass through the point. On the other hand, torques applied to bodies are free because when the bodies are rigid they produce the same effect no matter where they are applied. However, forces applied to a body are bound to any point lying on their line of action. Although the notation may seem cumbersome at first, it very clearly indicates which quantities are fixed and free, and at what point quantities are expressed at, removing any possibility for ambiguity. The power, p, imparted by a linear combination of a wrench and screw is invariant, regardless of the common coordinate system in which they are expressed in, and it is given as ∗bT b b p = f bf b · v bf b,b + τ bf b,b · ω bf b = f ∗b f b,b · vf b,b = f f b,b vf b,b

(26)

Recall that the operator for passively transforming screw quantities from reference frame 26

b to the fixed inertial reference frame f is

[Ad]ff bb

  f fb fb R ˜rf b R  = fb 0 R

(27)

Therefore

f b−1 fb b T f ∗bT p = f ∗f f b,f vf b,f = f f b,b [Ad]f b [Ad]f b vf b,b

   f f bT f bT f fb fb R −R ˜ r R ˜ r R fb fb    vbf b,b = f ∗bT f b,b 0 Rf bT 0 Rf b (28)

which indicates that coscrew quantities can be transformed by right multiplication with ([Ad]ff bb )−1 or by left multiplication with ([Ad]ff bb )−T as: 

fb



R 0 f b −T b b ∗b   f ∗b f ∗f f f b,b = [Ad]∗f f b,b f b,f = ([Ad]f b ) f b f f b,b = f ˜rf b Rf b Rf b Therefore, the 6×6 Coadjoint transformation matrix for transforming coscrews is 

Rf b

0

1



b   [Ad]∗f fb = f fb fb ˜rf b R R 1

(29)

(31)

Note that the linear components of force and velocity are found in the top position of the velocity screw

and wrenches; the angular components are found in the lower position. This convention allows power to be computed using the linear algebra dot product operator. Many authors swap the angular and linear terms in screw and wrench quantities, in which case the dot product does not yield power, and a special “reciprocal product” operator, denoted as is used. For example, if angular terms are placed above the velocity screws, a reciprocal product is used to combine screw and coscrew quantities as:  T f  p=f ·v+τ ·ω =W $= τ 



0

 I

  I ω   0 v 

(30)

Certainly, no convention is more correct than another, however keeping linear and angular components in the same position in screws and coscrews does have its advantages: (1) it obviates the need to introduce an additional reciprocal product operator, and (2) when formulating the equations of motion, the generalized mass matrix is symmetric, allowing the equations to be solved more efficiently using standard numerical methods.

27

3.2.3

Influence Coefficient Matrices

An important tool in the analysis of mechanisms and multibody systems is a map of a joint’s velocity to body velocity. From Eqs. (6) and (10), it is possible to find the velocity screw associated with a revolute joint. In this example, joint a is aligned with z axis of the parent frame j + and its child frame k − . Here frame j + corresponds to the distal frame on body j, which in general is not the reference frame of body j. The frame k − is the proximal reference frame on body k located closest to the root of the spanning tree. In other words, joint a is negatively incident with body j at frame j + and positively incident with body k at body reference frame k − .

+

vjjk,j +

          0 0                         0 0                     0 0 + q˙ = ha q˙ , hjjk,j + q˙ = =   0   0                         0 0                     q˙ 1

(32)

Here ha is the influence coefficient matrix [35] or joint subspace map associated with the joint. The extra superscripts and subscripts are dropped when referring to influence coefficient matrices for convenience. In this example, the influence coefficient matrix taken at the joint’s parent frame, j + , is simply a unit twist about the z axis. Usually it is desirable to express the influence coefficient matrix with respect to other coordinate systems, such as the inertial reference frame. Using the [Ad] operator in Eq. (10)

28

yields

vfjk,f

 Rf a = hf q˙ = [Ad]ff aa ha q˙ =  0

     0             0          f fa   ˜rf a R 0  q˙   fa   0 R            0           1

(33)

These influence coefficient matrices taken with respect to the inertial reference frame form elements of the Jacobian matrix for multibody systems. As a mechanism or multibody system moves through varying configurations, hf changes, and may require updating many thousands or even millions of times in the course of a simulation. The structure of Eq. (33) reveals important insight that allows one to reduce the number of floating point operations (FLOPs) required to update the influence coefficient matrix of primitive joints.

hf

  fa fa  f f   ry R(zz) − rz R(yz)            f a f a f f   R R − r r  x z (zz)  (xz)         r f R f a − r f R f a  x (yz) y (xz) (34) = fa     R   (xz)         fa   R     (yz)       f a   R(zz)    fa fa fa f fa f fa f fa f fa f fa f fa  0 R R(xy) R(xz) ry R(zx) − rz R(yx) ry R(zy) − rz R(yy) ry R(zz) − rz R(yz)    (xx)       fa      f a f a f a f a f a f a f a f a f f f f f f R    R R r R − r R r R − r R r R − r R 0   z x z x z x (yx) (yy) (yz) (xx) (zx) (xy) (zy) (xz) (zz)       fa     fa fa f fa f fa f fa f fa f fa f fa    R  R R r R − r R r R − r R r R − r R 0 x (yx) y (xx) x (yy) y (xy) x (yz) y (xz)   (zx) (zy) (zz) =   fa fa fa  0   0 0 R(xx) R(xy) R(xz)   0            fa fa fa   0 0 R(yx) R(yy) R(yz) 0  0            fa fa fa  0 0 0 R(zx) R(zy) R(zz) 1

Once the position and orientation of each joint in the mechanism is computed, the influence coefficient matrix can be computed simply by forming the corresponding column of [Ad], 29

which requires only 9 additional FLOPs for revolute joints and no additional FLOPs for prismatic joints. Additionally, if SE(3) displacements and the Jacobian matrix are stored in column-major order, updating the influence coefficient matrices and elements Jacobian matrix can be vectorized with a stride of 1, which is optimal in terms of memory access. 3.2.4

Constraint Forces

Consider two bodies pinned together by a revolute joint aligned with the z axis of its local coordinate frame, a. If the joint is modeled without friction, then no torques about the joint axis can be transmitted from the first body to the second body. The only forces that can be imparted from one joint to the next are those orthogonal to the joint’s axis. If a spatial wrench f ∗a is applied to the joint, the non–working joint reaction forces are given as   1 0 0 0 0 0     0 1 0 0 0 0     0 0 1 0 0 0   ∗a ¯ a∗ f ∗a H f ab,a =  0 0 0 1 0 0 ab,a       0 0 0 0 1 0   0 0 0 0 0 0

(35)

The Coadjoint transformation matrix, [Ad]∗ , from Eq. (31) provides a convenient and compact method to express constraint forces in other coordinate systems.    1 0 0 0 0 0 1 0 0       0 1 0 0 0 0 0 1 0         fa 0 0 1 0 0 0 0 0 1 R 0    ¯ f∗ = [Ad]∗f a    H =   fa f 0 0 0 1 0 0 fa fa  R R ˜ r   0 0 0 fa       0 0 0 0 1 0 0 0 0    0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 1 0 0 1 0 0

 0   0   0   0    0  0

(36)

Second, consider the case where working forces may be applied in a joint, such as in a robotic actuator. The torque provided by the motor required to resist the applied wrench f ∗f a,f is given by 30

Hf∗ f ∗f a,f

 0   0   0 a = [Ad]∗f fa  0    0  0

  0 0 0 0 0 0     0 0 0 0 0 0      fa 0 0 0 0 0 0 R 0  ∗a    = f  a,a  f  0 0 0 0 0 ˜rf a Rf a Rf a  0     0 0 0 0 0 0   0 0 0 0 1 0

 0 0 0 0 0   0 0 0 0 0   0 0 0 0 0  ∗a  f a,a 0 0 0 0 0    0 0 0 0 0  0 0 0 0 1

(37)

From Eqs. (36) and (37) it follows that ¯ f∗ ∪ Hf∗ ] [Ad]∗f a = [H

(38)

Alternatively, the Coadjoint transformation matrix in Eq. (38) can be formed by replacing ¯ ∗ with the non–zero column of H ∗ corresponding to the joint degree the all–zero column of H f of freedom (i.e. h∗f ) as ¯ f∗ ∪ h∗f ] [Ad]∗f a = [H

(39)

¯ T h∗ = 0 H f f

(40)

Additionally, it follows that

¯ f are orthogonal to h∗f . as each of the columns of H 3.2.5

Transforming the Inertia Matrix

cc In the body center of mass frame, the 6×6 spatial inertia matrix, Mb,cc , is expressed as

cc Mcc

  mI 0  = cc 0 I¯b,cc

(41)

cc where m is the body mass, I is a 3×3 identity matrix and I¯b,cc is the 3×3 rotational inertia

matrix. The value before the comma in the subscript indicates that the inertia corresponds to body b. The value after the comma in the subscript indicates the inertia properties are expressed at point c, the centroid of body b. The superscripts indicate that the inertia terms are expressed with respect to the principal axes of body b. When solving the equations of 31

motion in a fixed inertial reference frame, it is necessary to transform these local inertia matrices to the origin of the fixed inertial reference frame, f . To understand how the inertia matrix is transformed to other coordinate systems, consider the equation for spatial momentum, p∗ . The asterisk indicates that p∗ is a coscrew quantity. The 6×6 inertia matrix provides a mapping of body velocity to momentum.

cc p∗c = Mb,cc vcf b,c b,c

(42)

As noted above, coscrew quantities are transformed by left multiplication with the Coadjoint operator, [Ad]∗ . ∗f c f c −1 fc c c ∗c cc [Ad]∗f f c pb,c = [Ad]f c Mb,cc ([Ad]f c ) [Ad]f c vf b,c

(43)

c f c −1 fc c ∗f c f c −1 f cc cc = [Ad]∗f p∗f f c Mb,cc ([Ad]f c ) [Ad]f c vf b,c = [Ad]f c Mb,cc ([Ad]f c ) vf b,f b,f

(44)

ff ∗cf −1 f c −1 cc Mb,f f = ([Ad]cf ) Mb,cc ([Ad]f c )

(45)

ff ∗f c cf cc Mb,f f = [Ad]f c Mb,cc [Ad]cf

(46)

yielding

therefore

and equivalently

The 6×6 Adjoint and Coadjoint matrices provide a convenient and compact notation to describe transformations, but again it is emphasized that full dense matrix algebra need not be applied to carry out transformation of the inertia matrix, noting that   c cf fc cf fc R mIR −R mI˜ r R fc ff ∗f c cf cc   Mb,f f = [Ad]f c Mb,cc [Ad]cf = f c c fc cf fc cf f c ¯cc cf ˜rf c R mIR −R m˜rf c I˜rf c R + R Ib,cc R

(47)

which can be further simplified as 

−mRf c ˜rcf c Rcf



mI ff   Mb,f f = ff f 2 fc c cf ¯ Ib,cc − m(˜rf c ) mR ˜rf c R

32

(48)

and 

−m˜rff c



mI ff   Mb,f f = f ff f 2 ¯ m˜rf c Ib,cc − m(˜rf c )

(49)

Based on Eq. (49), it is only necessary to store ten terms to reconstruct the inertia matrix: mc , mc r, and the terms on and below the diagonal of I¯ (i.e. Ixx , Iyy , Izz , Ixy , Iyz and Izx ). Storing only these 10 terms requires approximately 70% less storage than the full 6×6 inertia matrix. If the reference frame for the body is not taken as the principal axes at the body center of mass, then in addition to the inertia matrix stored in the center of mass coordinate system, a vector from the body’s reference frame to the center of mass frame, rbbc , is stored. Here, b denotes the body reference frame and c denotes the center of mass frame. ff ∗f b bf ∗bc cc cb Mb,f f = [Ad]f b [Ad]bc Mb,cc [Ad]cb [Ad]bf

(50)

Equation (50) can be further decomposed into ff ∗f b bf ∗bb ∗bc cc cb bb Mb,f f = [Ad]f b [Ad]bc [Ad]cc Mb,cc [Ad]cc [Ad]cb [Ad]bf b bf ∗bb bb bb = [Ad]∗f f b [Ad]bc Mb,cc [Ad]cb [Ad]bf   mI 0 b bf ∗bb   [Ad]bb = [Ad]∗f cb [Ad]bf f b [Ad]bc bb 0 I¯b,cc

(51)

where the transformation [Ad]∗bb bc corresponds to a translation, but no rotation. bb The quantity I¯b,cc is constant and can be computed once in a preprocessing step. The ff quantity I¯b,cc is computed as ff bb I¯b,cc = Rf b I¯b,cc Rbf

(52)

A vector from the fixed reference frame to the body centroid is found as rff c = rff b + Rf b rbbc

(53)

To reduce operations required to compute Eq. 50, the quantity rbbc can be computed once and stored in a preprocessing step. The final transformed mass matrix is expressed using the temporary quantity ttemp ttemp = mrff c 33

(54)

as 

−˜ttemp



mI ff   Mb,f f = f ff ˜ttemp I¯cc ˜ ˜ − ttemp tf c

(55)

f Then the following identity is used to simplify the matrix product ˜ttemp˜tf c

a ˜˜b = baT − (aT b)I

(56)

The product a ˜˜b is symmetric, and therefore only the terms on and below the diagonal of baT are computed using 6 floating point operations as    b     0 n o result ← b1 a0 a1 a2         b2

(57)

The contribution from (aT b)I is found as result ← result − trace(result) ∗ I

(58)

requiring two additions to compute the trace, and three subtractions to subtract off the terms along the diagonal. Finally, the inertia transformed to the coordinate system of the inertial reference frame and expressed at the origin of the reference frame is found as

f ¯b,f f − result I¯fb,f f = Icc

3.2.6

(59)

More derivative operator matrices

In Eq. (9), the derivative operator of the SE(3) group was found. Similarly, left and right derivative operators for the 6×6 transformation matrices, [Ad] and [Ad]∗ , exist. Let the left derivative operator of [Ad] be defined as

˙ [Ad]−1 [ad]` = [Ad] `

(60)

and the right derivative operator of [Ad] be defined as ˙ [ad]r = [Ad]−1 [Ad] r 34

(61)

Let [Ad]ab ab

= [Ad](raab ) ∗ [Ad](Rab ) = [Ad](Rab ) ∗ [Ad](rbab )       a b ab ab I ˜rab R 0 R 0 I ˜rab   =   = ab ab 0 I 0 R 0 R 0 I

Using Eqs. (62) and (63) to differentiate [Ad]ab ab with respect to time yields       a a a ab ab ˜ 0 r˙ ab R 0 I ˜rab ω ˜ R 0 ˙ ab =   +   ab  [Ad] ab a ab ab 0 0 0 R 0 I 0 ω ˜ ab R       b 0 r˜˙ab 0 I ˜rbab Rab 0 Rab ω ˜ bab   +  = ab b ab 0 R ω ˜ ab 0 R 0 I 0 0

(62) (63)

(64)

(65)

The time derivatives of the displacement vector terms raab and rbab may be expressed relative to the frames a and b as r˙ aab = r˙ aab,a − r˜aab ω aab

(66)

r˙ bab = r˙ bab,b − r˜bba ω bab

(67)

and

Substituting Eqs. (66) and (67) into Eqs. (63) and (65) gives       a a a a a ab ab ˜ ^ 0 r˙ ab,a − ˜rab ω ab 0 R 0 ω ˜ R I ˜rab ˙ ab =     ab +  [Ad] ab a ab ab 0 ω ˜ ab R 0 0 0 I 0 R       b b b 0 Rab 0 0 r˜˙ab,b − ˜r^ ω Rab ω ˜ bab I r˜bab ba ab +    = ab b ab 0 R ω ˜ ab 0 I 0 R 0 0

(68)

(69)

Substituting the identities a a ˜raab ω ˜ aab − ˜r^ ˜ aab r˜aab ab ω ab = ω

(70)

b b ω ˜ bab ˜rbab − ω ˜^ rbab ω ˜ bab ab r ab = ˜

(71)

and

into Eqs. (68) and (69) yields       a a a b b b a ab ab b ˜ ˜ R 0 ω ˜ r ˙ + ω ˜ ˜ r R 0 ω ˜ r ˙ + ˜ r ω ˜ ab ab ab ab ˙ ab =  ab ab,a   =   ab ab,b  (72) [Ad] ab a b ab ab 0 ω ˜ ab 0 R 0 R 0 ω ˜ ab 35

which may be factored as 

   a ab I r˜ab R 0    ω ˜ aab 0 I 0 Rab

(73)

    b b ab b ˜ R 0 r I r ˜ ω ˜ ˙ ab ˙ ab =     ab ab,b  [Ad] ab 0 ω ˜ bab 0 Rab 0 I

(74)

˙ ab =  [Ad] ab

ω ˜ aab 0

a r˜˙ab,a

or

The left derivative operator matrix [ad]aab,a is obtained from Eq. (73) and is formed from the spatial velocity vaab,a . It is sometimes referred to as the spatial cross product operator [36] [ad]aab,a = [ad]vaab,a

  ω ˜ aab v˜aab,a  = a 0 ω ˜ ab

(75)

The right derivative operator [ad]bab,b is obtained from Eq. (74) and is formed from the velocity screw vbab,b . [ad]bab,b = [ad]vbab,b

  ω ˜ bab v˜bab,b  = b 0 ω ˜ ab

(76)

In summary both the left and right derivative operators [ad] can be formed from the spatial velocity v. The primary difference between the two variants is the coordinate system and point that the spatial velocity is expressed in and with respect to.



˙ ab ([Ad]ab )−1 [ad]aab,a , [ad]vaab,a = [Ad] ab ab

(77)

−1 ˙ ab [ad]bab,b , [ad]vbab,b = ([Ad]ab ab ) [Ad]ab

(78)

0

   ωz   −ωy  [ad] =   0     0  0

−ωz

ωy

0

−vz

0

−ωx

vz

0

ωx

0

−vy

vx

0

0

0

−ωz

0

0

ωz

0

0

0

−ωy

ωx

36

vy



  −vx       0  ω ˜ v˜  =  ωy  0 ω ˜   −ωx   0

(79)

A similar development can be applied to the Coadjoint transformation matrix [Ad]∗ . For brevity, only the end result is shown. The left derivative operator for [Ad]∗ matrices is   0 −ωz ωy 0 0 0      ωz 0 −ωx 0 0 0        a −ωy ωx  0 0 0 0 0  ω ˜ ∗ ˙ ∗ab ([Ad]∗ab )−1 =    ab = [Ad] [ad]∗a =   ab ab,a = [ad]va ab ab,a  0  a a v˜ab,a ω ˜ ab −vz vy 0 −ωz ωy       0 −vx ωz 0 −ωx   vz   −vy vx 0 −ωy ωx 0 (80) Similarly, the right derivative operator for [Ad]∗ matrices is   0 −ωz ωy 0 0 0      ωz 0 −ωx 0 0 0        b  −ωy ωx 0 0 0 0  ω ˜ ab 0 ∗ab −1 ∗ ˙ ∗ab   [ad]∗b = ab,b = [ad]vbab,b = ([Ad]ab ) [Ad]ab =    0 b b v ˜ ω ˜ −v v 0 −ω ω  z y z y  ab,b ab     0 −vx ωz 0 −ωx   vz   −vy vx 0 −ωy ωx 0 (81) Previously, it was discussed how the derivative operators form the Lie algebra of the SE(3) group. Similarly, the 6×6 [Ad] and [Ad]∗ matrices themselves fulfill the necessary requirements for a Lie group [15]. The [ad] and [ad]∗ matrices comprise their respective Lie algebras.

3.3

Dynamic Preliminaries

In this section, a brief overview of popular methods for setting up dynamics equations is provided and computer implementation details are discussed. The dynamics of constrained multibody systems has its roots in classical mechanics. The simplest element of a multibody system is the free particle, which can be modeled by Newton’s equations of motion [37]. In 1775, Leonard Euler introduced the concept of a rigid 37

body and the well–known Newton–Euler equations [38]. A systematic analysis of constrained mechanical systems was established by Joseph-Louis Lagrange in 1788 in his seminal work, Mecanique Analytique, where he reformulated classical dynamics equations in terms of a minimal set of generalized coordinates [39]. 3.3.1

Lagrange’s Equations of the First Kind

Lagrange’s equations require taking derivatives of an energy functionional (i.e. the Lagrangian, L), with respect to generalized coordinates and generalized velocities and time. Lagrange’s formulation typically results in a large set of differential algebraic equations with no exploitable structure. Therefore Lagrange’s equations are not well–suited to computer implementation, particularly for complex mechanisms. Many authors have applied numerical reduction strategies to simplify the equations. 3.3.2

Euler-Lagrange Equations

In 1966, Roberson and Wittenberg [40, 41] showed how Lagrange’s equations can be derived from the Newton-Euler equations using graph theoretic methods and relative (joint) coordinates. Garc´ıa de Jal´on developed a similar formulation using natural coordinates [42]. The formulations of both authors result in a system of equations as

M ` q¨ + J T λ = f `

(82)

J q˙ = e(t)

(83)

Here, M ` is a symmetric, positive-semidefinite generalized mass matrix whose sparsity pattern is precisely determined by the mechanism topology and f ` is a vector of dynamic generalized forces. Equations (82) and (83) are often referred to as the Euler-Lagrange equations.

38

3.3.3

Index-1 Euler-Lagrange Equations

Many authors have further simplified Eqs. (82) and (83) by replacing the velocity level constraints with acceleration level constraints [43, 44, 45].  d J q˙ + e(t) = J¨q + J˙q˙ + e˙ (t) = J¨q − γ c dt

(84)

γ c = −J˙q˙ − e˙ (t)

(85)

where

With this new set of constraints, the Euler-Lagrange equations can be written in matrix form as

     M ` J T  ¨q  f `    = J 0 λ   γ c 

(86)

The system of equations has now been reduced to a set of index-1 DAEs. Later in this chapter, a detailed derivation of Eq. (86) based on graph theoretic concepts and formalized in the modern language of 6 × 1 screws and wrenches [15, 16, 31] is provided. If the coefficient matrix in Eq. (86) is nonsingular, a unique solution can be obtained using the method of Cholesky factorization [46, 45]. Unfortunately, the Jacobian matrix is almost always singular, which complicates the solution process even for relatively simple mechanisms. 3.3.4

Null-Space Methods

Hemami and Weimer [43] introduced the orthogonal complement, N , of the constraint Jacobian matrix, J. N is the kernel of J, where JN = 0 Premultiplying the top block row of Eq. (86) with N T yields     N T f `  NT M`   q¨ =  γ  J 39

(87)

(88)

Note that Hemami and Weber did not provide a systematic method to find the null space matrix, N . Garc´ıa de Jal´on et al. also derived Eq. (88) [42, 47]. Although the complexity of the system of equations has been reduced, the Jacobian matrix is still not full row rank, which complicates the solution process. 3.3.5

Pseudo–Inverse Methods

One of the first attempts to handle a singular Jacobian matrix is to assume that all constraints and variables contribute equally to the solution and use the root–mean–squared (RMS) or Moore–Penrose Pseudo–Inverse (MPPI) method as shown below    ¨q  λ where the

+

 =

+   f `   γ  0 c

M` JT J

(89)

notation indicates the pseudo–inverse operation. For example, if the matrix A

has full column rank, its pseudo–inverse can be obtained as A+ = (AT A)−1 AT

(90)

Udwadia and Kalaba applied the MPPI to solve Eq. (88) in [48]. Unfortunately, the MPPI approach not only requires many additional floating–point operations, but it also may fail to converge in conditions of poor mechanical advantage [49, 28]. 3.3.6

Kane’s Equations

In 1965 Kane introduced an expression for the equations of motion that, unlike the form of Lagrange’s equations of the first kind, does not require differentiating energy functionals. Kane’s method requires identifying a set of generalized coordinates and generalized speeds [50]. Typically, the generalized speeds are taken as the first derivative of the generalized coordinates but need not necessarily be. In Kane’s method, the system of differential equations is more compact than the result of expanding Lagrange’s equations. However, Kane’s method has the disadvantage that it requires an analyst to manually choose a set of in-

40

dependent generalized coordinates and is therefore not well–suited to automatic computer implementation. 3.3.7

Maggi’s Equations

Maggi’s work was first published in Italian in 1896 [51] and 1901 [52], but was not made available to an English audience until 1972 [53]. Like Kane’s method, Maggi’s method also requires manual identification of a mechanism’s independent generalized coordinates and results in a reduced system of differential equations where only independent accelerations appear. The Jacobian matrix can be manually permuted and block–partitioned into coefficients associated with its primary and secondary constraint equations and dependent and independent coordinates as   pd pi J J  J = J sd J si

(91)

h i pd pi Note that J forms the row–space of J. Using the first block row of Eq. (91), a J replacement for the generalized accelerations can be found as      −1        pd −1 pi p pd pi pd −1 pd −1 pi  p  pd −1  ¨q d  (J ) J γc J J γc (J ) −(J ) J (J )  q¨i    γ p − = = = c    ¨q i  i i I ¨q ¨q 0 I 0 I 0 (92) It also follows that   pd −1 pi −(J ) J  = JN = 0 J I

(93)

  −(J pd )−1 J pi  is a basis for the null space of the Jacobian matrix, introduced and therefore,  I in Eq. (87). Combining the results of Eqs. (88) and (92) with some additional algebraic manipulation yields   pd −1 (J )  γp N T M ` N q¨i = N T f ` − N T M `  c 0

41

(94)

In Eq. (94), the terms of the generalized mass matrix, M ` must be permuted to match the ordering specified in Eq. (91). Borri and coworkers have discussed the equivalence of Maggi’s and Kane’s equations [54].

3.4

Generalized Coordinate Partitioning

Of the various methods for processing dynamics equations with constraints, the cutset methods are appealing in that they offer a reduced system of equations to solve and avoid applying the expensive Moore–Penrose Pseudo Inverse. A primary disadvantage to the cutset methods is that they require an analyst to manually select a set of independent variables to describe the system. Historically, specialists have chosen the cutset of primary constraints and generalized coordinates based on their experience with similar classes of problems. Manual selection, however, has its limitations. For complex systems, the manual approach is prone to error and may impose a substantial burden on the analyst. Bifurcations, end–of–stroke conditions, or instantaneous alignment of joints may occur as a system changes configuration, and therefore a suitable set of generalized coordinates depends on a system’s posture and may require periodic updating during a simulation. In addition, a mechanism’s changing posture may periodically require a different set of primary constraint equations, or even a different number of primary equations when the Jacobian matrix rank changes. Robust solution methods must reliably handle these situations with minimal demands on the analyst, who may lack the training to intervene. While the cutset approach is appealing from an efficiency standpoint, ensuring robust solutions requires care in selecting the primary equations and independent variables. Particularly, choosing variables as independent that would result in large changes in system configuration for small incremental inputs must be avoided, because this may cause numerical instability and inaccuracy problems. Therefore, the optimal independent generalized coordinates are those that induce the smallest changes in system configuration. It follows that this set of generalized coordinates will have roughly the greatest mechanical advantage. In recent decades numerical methods have been devised to automatically identify optimal 42

sets of independent generalized coordinates and primary constraint equations. Generalized Coordinate Partitioning (GCP) [12, 3, 55] is a row and column permutation and partitioning method applied to the Jacobian matrix. The GCP method is a byproduct of LU decomposition with complete row and column pivoting (i.e. Gaussian Elimination with Complete Pivoting (GECP)). GECP factors a matrix into row–echelon form, reliably revealing its rank [56]. At each successive factorization step, GECP permutes the largest remaining pivotal element to the upper–left position in the residual matrix, thereby selecting the next dependent variable with possibly the smallest mechanical advantage. After factorization is complete, the Jacobian matrix will have been permuted so that the rows of the largest nonsingular matrix J pd identify the primary constraint equations, and the columns identify the dependent variables. GCP has demonstrated an order of magnitude speedup over the RMS method [13] and has been successfully used in general–purpose commercial multibody dynamics software such as LMS DADS. Other methods have been proposed to automatically determine a set of independent generalized coordinates. In 1985, Mani and Haug used Singular Value Decomposition (SVD) [57]. In 1986, Kim and Vanderploeg used QR decomposition [58]. In both the SVD and QR methods, the resulting J pd matrix was shown to have a lower condition number near singular configurations. A primary disadvantage of these methods is poor numerical efficiency of the partitioning process compared to the LU algorithm. For example, QR decomposition requires 2-4 times as many floating point operations as the LU factorization algorithm. In turn, the SVD algorithm requires 2-10 times more operations than the QR decomposition algorithm [18]. Despite the advantages of GCP, the primary disadvantage is that it is an expensive process to apply, requiring O(n3 ) memory copy and floating point operations, where n is the number of joints in the system. One of the primary objectives of this thesis is to reduce the overhead of applying the Generalized Coordinate Partitioning technique, calling it only when necessary. See Chapter 5 for more information.

43

3.5

Graph theory for multibody systems

An engineer specifies a mechanical system by enumerating a number of bodies and connecting them together with joints. In this way, the engineer has created a representation of the system topology as a network with bodies as nodes and joints as edges. Roberson, Wittenberg [40], Sheth and Uicker [49] were among the first to apply the rich field of graph theory to setting up dynamics equations in a general–purpose multibody dynamics code. Graph theoretic methods were successfully applied in Sheth and Uicker’s IMP multibody dynamics software. Graph theory provides a number of tools to process a mechanical system represented as a network or graph. One important tool is the incidence matrix, which is used to describe the connectivity of bodies and joints, as well as the directionality of the joints. Another useful tool is the loop, or circuit matrix, found from the right null space of the incidence matrix transposed, ΓT [59]. For example, Sheth showed that the Jacobian matrix for a complex system can be automatically obtained from the circuit matrix, where each block six row of the Jacobian corresponds to an independent chord loop. Another important tool is the spanning tree of the system (i.e. the subset of graph edges required to reach all nodes in the system exactly once). The primary advantage to graph theoretic methods is that they provide a systematic framework to organize and process multibody dynamics data that is well–suited to computer automation. Once a graph theoretic method has been successfully programmed, it is suitable to set up equations for any arbitrary topology described as a list of bodies and joints. Methods to automatically identify spanning trees, loops and loop members are important as they obviate the need for users to perform the analysis, thereby eliminating human operator error and facilitating faster model setup and deployment. In Chapter 5, it will be shown that additional information can be obtained from a representation of a mechanical system as a graph that allows for numerical reduction of the equations of motion. A new, fully–automatic algorithm – referred to as Kinematic Substructuring – is used to traverse the topology of a multibody system represented as a graph to identify clusters of overlapping loops, which allows numerical reduction in the equations of 44

motion. 3.5.1

Spatial Equations of Motion

The form of the equations of motion used in this thesis are sometimes referred to as the spatial equations of motion. The spatial equations of motion are based on pioneering work of Featherstone, Park, Bobrow, Ploen and others [11, 30]. For a more recent discussion, refer to [60]. The spatial equations of motion have a few interesting properties. First, since all spatial velocities and accelerations are expressed at the origin of the inertial reference frame with coordinates projected onto this frame, relative velocities and accelerations between bodies can be found simply by substracting one quantity from another, which simplifies traversal of the spanning tree for forward kinematics. Second, the form of the spatial equations of motion is invariant under coordinate transformation, allowing a simulation to easily be reparameterized in terms of a different inertial reference frame. For completeness, the spatial equations of motion are derived in this section. The spatial equations of motion for a rigid body expressed in the fixed reference frame can be found from the time derivative of spatial momentum.

= g∗f b,f

d ∗f d  ff f  pb,f = Mb,f f vf b,f dt dt

(95)

where g∗f is a wrench acting on body b, expressed at the origin of the fixed inertial frame b,f f and in the inertial reference frame coordinates. The wrench g∗f may be due to gravity or b,f other externally applied forces and torques. The quantity p∗f represents spatial momentum. b,f Using the chain rule, the right hand side is expanded as ff f ff ˙ ff b,f g∗f = M˙ b,f f vf b,f + Mb,f f v b,f

(96)

bb Mb,bb is the 6×6 inertia expressed in body b’s reference frame b and referenced to the origin ff bb of frame b. Noting that Mb,bb in the body reference frame is constant, M˙ b,f f is found using

the chain rule as ∗f b ff bf bb ˙ bf ˙ ∗f b bb M˙ b,f f = [Ad]f b Mbb [Ad]bf + [Ad]f b Mbb [Ad]bf

45

(97)

and ff ∗f ∗f b bf ∗f b bf f bb bb M˙ b,f f = [ad]f b,f [Ad]f b Mb,bb [Ad]bf + [Ad]f b Mb,bb [Ad]bf [ad]bf,f

(98)

yielding   ∗f ∗f b bf ∗f b bf f b bf f ∗f bb bb bb gf c,f = [ad]f b,f [Ad]f b Mb,bb [Ad]bf + [Ad]f b Mb,bb [Ad]bf [ad]bf,f vff b,f +[Ad]∗f ˙ f b,f f b Mb,bb [Ad]bf v (99) which is further simplified as ( ff ff f ff ((f( f = [ad]∗f ˙ ff b,f g∗f (100) M( f b,b Mb,f f vf b,f + ( b,f( f [ad]bf,f vf b,f + Mb,f f v b,f    f f f  ω ˜ f b v˜f b,f v f b,f  ff f f   is zero. The term Mb,f [ad] v cancels because the matrix product f bf,f f b,f f f   ω ˜ ω ((

fb

fb

The final form of the spatial equations of motion may be written as ff ff f − [ad]∗f Mb,f ˙ ff b,f = g∗f fv f b,f Mb,f f vf b,f f b,f

3.5.2

(101)

Comparison to Newton-Euler Equations

Expanding the spatial equations of motion and translating to the body center of mass frame yields        c v˙ c  v c  mI 0 0 ω ˜ mI 0 f c,c fc f c,c b    = +  ω˙ c   ωc  τ c  cc cc 0 I¯b,cc v˜cf c,c ω ˜ cf c 0 I¯b,cc fc b,c fc          fc  v˙ c   m˜ ω cf c v cf c,c  mI 0 f c,c b   + = τ c  0 I¯cc  ω˙ c  ω c I¯cc ω c     fc 

b,c

fc

b,cc

f c b,cc

(102)

(103)

fc

whereas the Newton-Euler equations of motion are        c   fc     mI 0 v˙ f c,c 0 b  = + τ c   ω˙ c  ω c I¯cc ω c  cc 0 I¯b,cc fc b,c f c b,cc f c

(104)

The translational component of spatial acceleration differs from the true, or classical linear acceleration in Eq. (104) by ω ˜ cf c v cf c,c . Care must be taken when interpreting spatial accelerations – the spatial acceleration is only equal to the true acceleration at the origin of the inertial reference frame. The difference between true accleration and spatial acceleration is discussed further in the next section. 46

3.5.3

Properties of Spatial Accelerations

Earlier in this section, it was discussed how a velocity screw expressed in the body reference frame b can be expressed in another frame, such as the fixed inertial reference frame f using the Adjoint transformation [Ad] as vff b,f = [Ad]ff bb vbf b,b        f fb fb  b  v f  ˜rf Rf b ω b + Rf b v b  R ˜ r R v f b,b fb f b,b f b,f fb fb  = =  ωf   Rf b ω b 0 Rf b  ω b   fb

fb

(105) (106)

fb

Taking the time derivative in the stationary inertial reference frame f yields   f f f   ˜ r ω + v d fb fb f b,b v˙ ff b,f =  dt  ω ff b   v˜f ω f + ˜rf ω˙ f + v˙ f  f b,b f b fb fb f b,b = f   ω˙

(107)

fb

The spatial acceleration in Eq. (107) is the true acceleration. Transforming the result of Eq. (107) back to frame b yields v˙ bf b,b = [Ad]bf ˙ ff b,f = bf v    f f f f  b bf  f bf v˜f b,b ω f b + ˜rf b ω˙ f b + v˙ f b,b R r˜bf R  =   0 Rbf ω˙ f

(108) (109)

fb

  r˜b Rbf ω˙ f + Rbf v˜f ω f + Rbf ˜rf ω˙ f + Rbf v˙ f  bf fb f b,b f b fb fb f b,b =   Rbf ω˙ ff b    f f b  bf f f b bf f bf bf f  r˜bRbf ω˙ f b + R v˜f b,b R R ω f b −  ˜rbfR ω˙ f b + R v˙ f b,b bf =   Rbf ω˙ ff b   v˙ b + v˜b ω b  f b,b f b,b f b =   ω˙ bf b   b b  v˙ b − ω ˜ f b v f b,b f b,b =   ω˙ b fb

47

(110)

(111)

(112)

(113)

Equation (113) reveals that while spatial velocities transform using the 6×6 Adjoint operator matrix, true accelerations do not. When transforming true acceleration in the fixed frame f to the body frame b, after transformation the spatial acceleration differs from the true acceleration by the centripetal acceleration quantity −˜ ω bf b v bf b,b . With this knowledge, the true acceleration at point b can be recovered by transforming to the body frame using the Adjoint transformation and adding the quantity ω ˜ bf b v bf b,b to (or subtracting the centripetal acceleration from) v˙ bf b,b . A useful property of working with spatial accelerations is that the spatial acceleration at any point on a rigid body may be obtained from the spatial accelerations at any other point on the body using Adjoint transformations [11]. Furthermore, when all spatial accelerations are expressed at the same point and in the same reference frame, relative spatial accelerations between bodies can be found simply by subtracting the spatial acceleration of one body from another. 3.5.4

Lagrange’s Spatial Equations of Motion

In the following discussion, the strict adherance to the subscript and superscript notational convention in Eq. (101) is relaxed when dealing with stacked column matrices of screw quantities as M v˙ f = g∗f − [ad]∗vf M vf

(114)

Here the shortened notation is used with the understanding that coordinates of all spatial quantities are expressed in the fixed inertial reference frame and referenced to the origin of that frame. Lagrange’s spatial equations of motion can be derived from Eq. 114 and graph theoretic methods. An important tool in formulating the equations of motion is the incidence matrix, Γ, a representation of an oriented graph of a mechanism [28], with columns corresponding

48

joints and rows corresponding to bodies. An entry of row b and column h of Γ is defined as    +1 if joint h is positively incident with body b     Γ(b, h) = −1 if joint h is negatively incident with body b (115)      0 if joint h is not incident with body b For a detailed, worked example of an incidence matrix, refer to the next section. A corollary to the incidence matrix, the connectivity matrix, is formed by element–wise replacement of each +1 and -1 in ΓT with a positive or negative 6×6 identity matrix. The partial velocity matrix, H, is a block–diagonal matrix containing 6×1 influence coefficient matrices along the diagonal and provides a mapping of joint velocities to relative velocities across joints. vrel = Cvabs = H q˙

(116)

Taking the time derivative of Eq. (116) yields C v˙ abs = H¨q + H˙ q˙ = H¨q + γ c

(117)

where γ c = H˙ q˙ . Equation (117) can be expressed in factored matrix form as   h i v˙  abs = γc C −H  q¨ 

(118)

H T k∗ = f

(119)

Additionally the equation

maps joint reaction wrenches, k∗ onto joint space. The column matrix f represents working generalized forces (such as from springs, dampers, actuators, friction, etc.) applied within the joints. If no such effects exist, f will be zero. The equation C T k∗ = γ c represents equal and opposite joint reaction forces acting between bodies.

49

(120)

Equations (118), (119) and (120) can be appended to Eq. (114) as      ∗ ∗ T    ˙ M 0 C gf − [ad]vf M vf   vabs              T = 0 0 −H  q¨ f                ∗  −k C −H 0 γc

(121)

If the connectivity matrix is structurally nonsingular and invertible, then the absolute spatial accelerations, v˙ abs and reaction wrenches, k∗ can be symbolically eliminated to obtain the generalized equations of motion as H T C −T M C −1 H q¨ = f + H T C −T (g∗f − [ad]∗vf M vf − M C −1 γ c )

(122)

which can be expressed as M ` q¨ = f `

(123)

M ` = H T C −T M C −1 H

(124)

f ` = f + H T C −T (g∗f − [ad]∗vf M vf − M C −1 γ c )

(125)

where

and

The symmetric, positive semidefinite M ` corresponds to the generalized mass matrix. It contains the same quantities that would have been obtained by expanding a Lagrangian, hence the superscript ` [61]. f ` represents the generalized forces, and additionally contains quantities that would have been obtained from expanding a Lagrangian. While equation (123) is suitable to describe the dynamic behavior of mechanisms with no closed loops (i.e. mechanisms with star or tree configurations), if closed kinematic loops are present it is necessary to append constraints to the system of equations. Setting up kinematic–loop constraint equations for a multibody system requires identifying a spanning tree, i.e. a root-oriented directed graph that provides a unique path between each body and the fixed inertial reference frame. The set of graph edges comprising the spanning tree are known as the arcs of the directed graph. The remaining edges are classified as chords. Once the arcs and chords of the system have been identified, Eq. (116) can 50

be block partitioned as 





C H 0  arc  vabs =  arc Cchord 0 Hchord

   q˙  arc  q˙ 

(126)

chord

which yields −1 Harc q˙arc vabs = Carc

(127)

Cchord vabs = Hchord q˙chord

(128)

and

Substituting Eq. (127) into Eq. (128) and rearranging yields −1 − Cchord Carc Harc q˙arc + Hchord q˙chord = 0

(129)

Equation (129) represents the chord-loop velocity constraints for a mechanism, and from it the constraint Jacobian matrix is identifed as i h −1 J = −Cchord Carc Harc Hchord

(130)

Uicker [49] and Wittenberg [40] showed that the Jacobian matrix can be generated from the circuit, or loop matrix, which is obtained from the right null space of the incidence matrix transposed, ΓT [59]. Note that the result of Eq. (130) and the methods of Uicker and Wittenberg are equivalent. A mechanism’s constraint equations may include explicit time varying constraints, e(t). Taking the time derivative of the constraint equations yields d (J q˙ + e(t)) = J¨q + J˙q˙ + e˙ (t) = J¨q − γ c = 0 dt

(131)

γ c = −J˙q˙ − e˙ (t)

(132)

where

Appending the Lagrange multipliers and Eq. (131) to Eq. (123) yields a compact form of Lagrange’s equations for constrained dynamical systems      M ` J T  ¨q  f `    = J 0 λ   γ c  51

(133)

Note that the terms M ` , J and γ c are dependent on a mechanism’s present configuration and must be updated at every time step. The overhead in updating these quantities is offset by using only the relative joint coordinates during numerical integration. Furthermore, efficient recursive methods exist to carry out the operations for updating the generalized mass matrix (i.e. Eq. (124)) such as described in [62, 63]. Another advantage to the form of Eq. (133) is that the coefficient matrix is symmetric – therefore, only the terms on and below the diagonal need be processed, requiring fewer operations to factor and solve the system of equations. Although the form of Eq. (133) has advantages, as discussed above, there are challenges with the solution proocess. If the coefficient matrix in Eq. (133) is nonsingular, a unique solution can be obtained using basic linear algebra methods [46]. However, in practice, the Jacobian matrix almost never has full row rank, so the coefficient matrix in Eq. (133) is generally singular. Therefore, Generalized Coordinate Partitioning is applied to the constraint equations. After LU decomposition with complete row and column pivoting, the Jacobian will have been partitioned as        T    d pd pi  d  d p γ p   q¨ J J q¨ P P c    eJ  v = =      J sd J si q¨i q¨i γ sc  Pvi Pes

(134)

The matrices Pep and Pes represent two pieces of an orthonormal permutation matrix that rearrange and partition rows of the Jacobian matrix and equations into primary and secondary components, and the matrices Pvd and Pvi represent two pieces of an orthonormal permutation matrix that rearrange and partition columns of the Jacobian matrix and rows of the column matrix of variables into dependent and independent components. Thus the column matrices q¨ and γ c have been appropriately permuted and partitioned according to the equation and variable permutation matrices. As the GECP algorithm factors the Jacobian matrix, the equation and variable permutation information is stored and available for subsequent reuse.

52

The primary constraint equations are extracted from Eq. (134) as   h i  q¨i  = γ pc J pi J pd q¨d 

(135)

where this reduced Jacobian matrix now has full row rank. The secondary constraint equations are removed from the system of equations, yielding      M `ii M `id J piT     f `i    q¨i            `di  `d d `dd pdT = M f M J  q¨               p  p  pi pd γc λ J J 0

(136)

The reduced system of equations in Eq. (136) can now be solved using standard numerical methods such as reordered Cholesky factorization. In the next chapter, a new preconditioning method will be applied that improves robustness and allows for further numerical reduction.

53

4

Preconditioned Spatial Equations of Motion

In this chapter a new formulation of the equations of motion, referred to as the Preconditioned Spatial Equations of Motion is presented. It will be shown that the preconditioned spatial equations of motion are equivalent to a special form of Maggi’s equations as presented in [64], but the numerical implementation of the equations is different. Furthermore, the preconditioned spatial equations of motion are amenable to numerical reduction stratetegies, which will be further discussed in the next chapter. The size and complexity of Eq. (136) in Section 3.4 has now been reduced compared to the form of Eq. (133). Furthermore, the coefficient matrix will be nonsingular, and all unknowns may be solved for. However, further numerical reduction can be obtained by preconditioning the primary constraints.

4.1

Preconditioning method

From Eq. (134) an alternate set of primary constraint equations can be formed. Equation (134) can be expressed as          pd pi  d  q¨d  γ p  h i L q¨ J J c     = = U UR   i sd si  i  q¨ LR q¨ γ sc  J J

(137)

From Eq. (137) an alternative set of primary constraint equations can be extracted as   h i q¨d  L U UR = γ pc (138)  q¨i  Since L and U are invertible, Eq. (138) may be revised to     h i q¨d  h i q¨d  = I B di = U −1 L−1 γ pc = γ d I U −1 UR  q¨i   q¨i  Equation (139) identifies a replacement for the primary equations in Eq. (135) as   h i  q¨i  = γd B di I q¨d  54

(139)

(140)

This new set of constraint equations can be used in place of Eq. (138), yielding a preconditioned form of dynamics equations as     M `ii M `id B diT      f `i   q¨i            `di  `d d `dd = f M M I  q¨                 d d di γ B I 0 λ 

(141)

This preconditioned form of the spatial equations of motion is used throughout this work. Numerical aspects related to the solution process are discussed in the next two chapters.

4.2

Equivalence to Maggi’s Equations

The system of preconditioned dynamics equations is equivalent to a special form of Maggi’s equations for holonomic systems, as presented in [64]   pd −1 (J )  γp N T M ` N q¨i = N T f ` − N T M `  c 0

(142)

Here N is the right null space of the Jacobian matrix. In the previous chapter, it was shown how the row–space of the Jacobian matrix can be found using LU factorization with complete row and column pivoting. Considering the row–space of the velocity–level constraints   h i q˙d  = e(t) ˙ (143) J pd J pi  q˙i  where e(t) ˙ are time varying velocity level constraints. If e(t) ˙ = 0, then LU factorization yields h

L U UR

  i q˙d 

=0

(144)

 q˙i 

Multiplying both sides of Eq. (144) by U −1 L−1 yields       h i q˙d  h i q˙d  h i q˙d  U −1 L−1 L U UR = I U −1 UR = I B di =0  q˙i   q˙i   q˙i 

55

(145)

Equation (145) allows one to trivially solve for the right null–space of the Jacobian matrix as   i i −B di h h  = I B di N = 0 I B di  I

(146)

  di −B  N = I

(147)

where

To illustrate how the two methods are equivalent, consider how Maggi’s method can be block–partitioned as      `dd `di pd −1 `dd `di M M (J ) M M  N q¨i = N T f ` − N T   γp  NT  c M `id M `ii 0 M `id M `ii

(148)

Substituting Eq. (147) into (148) yields         `d  h `dd `di  d  di  i h i h i M `dd M `di γ M M f −B    q¨i = −B diT I − −B diT I  −B diT I   f `i  M `id M `ii  0  M `id M `ii I (149) Expanding the left–hand side of Eq. (149) yields h

 M `ii − B diT M `di − M `id B di + B diT M `dd B di q¨i = −B diT

    i f `d  h i M `dd  γd − −B diT I  I  f `i  M `id (150)

Equation (150) can then be block–partioned, resulting in Eq. (141) above. Starting from the lower–right block 2×2 submatrix of the coefficient matrix in Eq. (141), one application of the Schur complement can be applied (see Section 6.1 for further discussion of the Schur complement) as follows          `di h i 0 h i 0 I M I f `d M `ii − M `id B diT    q¨i = f `i − M `id B diT     I −M `dd γd I −M `dd B di (151) Expanding Eq. (151) yields       h i M `di h i f `d M `ii − B diT M `id − B diT M `dd   q¨i = f `i − B diT M `id − B diT M `dd   di B γd (152) 56

Further algebraic manipulation yields   M `ii − B diT M `di − M `id B di + B diT M `dd B di q¨i = f `i − B diT f `d + M `id − B diT M `dd γ d (153) Finally, Eq. (153) is rearranged as h

 M `ii − B diT M `di − M `id B di + B diT M `dd B di q¨i = −B diT

    i f `d  h i M `dd  γd + −B diT I  I  f `i  M `id (154)

Equations (150) and (154) are equivalent.

4.3

Advantages to the Preconditioned Spatial Equations of Motion

Compared to the form of the equations presented by Garcia de Jalon et al., Eq. (141) offers a number of advantages: • With the preconditioned form of the dynamics equations one need never carry out the full matrix product N T M ` N . Evaluating the quantity N T M ` N involves more multiplications which increases numerical roundoff errors. • The preconditioned equations are amenable to numerical reduction strategies (see Chapter 5). • In the method presented by Garcia de Jalon et al., only the independent variables are integrated. Therefore, if GCP indicates a new split of independent variables during the solution process, numerical integration must be stopped and restarted. In the preconditioned spatial equations of motion, all variables are numerically integrated. If a new split is indicated, there is no need to stop and restart numerical integration, resulting in increased efficiency of the integration process – all variables maintain their time history during numerical integration. Refer to Section 7.2 for additional information.

57

5

Kinematic Substructuring

5.1

Motivation for substructuring

In the previous chapter, a compact formulation of the equations of motion was derived and an efficient solution method was discussed. One bottleneck in the solution process is the Gaussian Elimination with Complete Pivoting (GECP) algorithm itself, which is a tripleloop procedure requiring r(6mn−3(m+n)r+2r2 )/3 FLOPS, where m is the number of rows, n is the number of columns and r is the rank of the matrix. Furthermore, GECP requires both row and column permutations during factorization, which necessitates non-unit stride vector copies, which is not optimal for memory access. Note that other algorithms have been proposed for the automatic identification of independent and dependent variables, for example, based on Singular Value Decomposition (SVD) [57, 65], QR Decomposition [58] or Congruency Transformations [66]. The run–time of these alternative methods is increased over GECP; for example, SVD requires m2 n + 22n3 FLOPS using the R-SVD algorithm [18]. Although the modified GECP algorithm is more efficient than alternative partitioning strategies, due to its O(n3 ) complexity and memory access requirements, performing the GECP algorithm should be avoided whenever possible. Certainly, one strategy to improve performance would be to call GECP only once at the beginning of a program’s execution. As kinematic bifurcations are common in even relatively simple mechanisms, and the primary system of constraints are subject to change, this approach comes at the expense of robustness and is not recommended. Alternatively, one could conservatively call GECP at every time step—and pay a substantial performance penalty. Other approaches call GECP at regular intervals or use a heuristic method to determine when to repartition the Jacobian matrix. Regardless of the specific method used, the robust numerical solution of dynamic equations may require performing GECP many times in the course of a simulation, and strategies to improve GECP algorithm performance are important. Block–diagonalization is a well–known technique for breaking a large system of equations into smaller, more manageable pieces, thereby increasing computational efficiency. Several 58

algorithms exist in the literature to block partition matrices, such as described in [67, 68, 69, 70, 71]. However, these methods are generally unsuitable for block partitioning Jacobian matrices for mechanisms because they involve taking linear combinations of variables in addition to row and column permutations; are designed for special–purpose matrices (i.e. square or hermitian); or do not fully uncouple substructures (i.e. the resulting partitioning scheme has overlap or is in single–bordered, K–way form [70]). This section introduces a simple technique to automatically identify and uncouple independent kinematic substructures within a mechanism. After the kinematic substructuring process is complete, the Jacobian matrix will have been permuted to a block–diagonal form, allowing each block submatrix to be processed independently, thereby reducing the overhead of applying the GECP algorithm. In other words, this method takes advantage of the kinematic substructures to block–diagonalize the Jacobian matrix. Throughout this section, kinematic substructuring is applied to a representative example of a complex mechanism, a hydraulic excavator, shown in Fig. 3, which contains five kinematic loops.

Figure 3: A Hydraulic Excavator

5.2

Representing Topology

For completeness, the topology of the excavator is described using graph theory [72, 41]. The preliminary process requires setting up a directed graph that describes how bodies, i.e. nodes, are connected by joints, i.e. edges. For illustrative purposes, in this section, 59

the directed graph is described using the incidence matrix, a useful tool from graph theory. Later, it will be shown that other more compact methods can be used to represent the excavator’s topology. The excavator’s incidence matrix, Γ, is shown in Eq. (155). Manual generation of incidence matrices for complex mechanisms presents a burden to the analyst, and the process is prone to errors; therefore, automated methods are preferred. It is more convenient for an analyst to provide body–joint–body descriptions and shape matrices [28], details which are discussed more later in this chapter and in Chapter 12. For the present example, the bodies and joints of a hydraulic excavator are provided in an arbitrary, user–defined order as described in the Table 7. Each body is assigned a unique number, and each joint is assigned a unique letter. As an aid to following the development in this work, the arbitrary letter assigned to each joint has an associated descriptive name shown in Table 7. Each name includes an arrow to indicate the user–defined sense or direction of the respective joint. Note that here the zero identifier in the list is reserved for the fixed inertial reference frame.

60

Table 7: Identification of Bodies and Joints in the Excavator Example ID

Body Name

ID

Joint Name

1

undercarriage

A

fixed → undercarriage

2

cabin

B

undercarriage → cabin

3

boom

C

cabin → boom

4

stick

D

boom → stick

5

bucket

E

stick → bucket

6

bucket linkage 1

F

stick → bucket linkage 2

7

bucket linkage 2

G

bucket → bucket linkage 1

8

boom cylinder 1

H

cabin → boom cylinder 1

9

boom cylinder 2

I

cabin → boom cylinder 2

10

boom rod 1

J

boom → stick cylinder

11

boom rod 2

K

stick → bucket cylinder

12

stick cylinder

L

boom cylinder 1 → boom rod 1

13

stick rod

M

boom cylinder 2 → boom rod 2

14

bucket cylinder

N

stick cylinder → stick rod

15

bucket rod

O

bucket cylinder → bucket rod

P

boom rod 1 → boom

Q

boom rod 2 → boom

R

stick rod → stick

S

bucket rod → bucket linkage 1

T

bucket linkage 1 → bucket linkage 2

From the body–joint–body information in Table 7, the incidence matrix for the excavator is given in Eq. (155). Zeros are removed from sparse matrices to improve clarity.

61

A



  B   C   D    E   F   G   H   I      J T  Γ =  K   L   M    N   O   P   Q   R    S  T

0

1

−1

1 −1

2

3

4

5

6

7

8

9

10

11

12

13

14

15 

      −1 1    −1 1    −1 1     −1 1    −1 1    −1 1    −1 1     −1 1    −1 1    −1 1    −1 1    −1 1    −1 1     1 −1    1 −1    1 −1    1 −1   −1 1 1

(155)

Each row of ΓT in Eq. (155) corresponds to a joint A through T. Each column is associated with a body. A -1 in a row indicates that the corresponding joint is negatively incident with the body at the column index. A positive 1 indicates the joint is positively incident. For example, joint A goes from the fixed frame (0) to the undercarriage (1). In Eq. 156, the bodies and joints have been permuted into a natural, level–increasing order using row and column permutations, and the sense of some joints are reversed so that all are oriented away from the fixed inertial reference frame. A technique for automating this process based on a more compact representation of a mechanism’s topology is described later in this chapter. The direction of some joints are reversed for the convenience of obtaining positive entries on the diagonal of the matrix shown in Eq. (156). To effectively reverse joints without changing the sign of their variables, the sign of their influence coefficient matrices are reversed when evaluated. The sign of variables are not changed to remain consistent with any user–defined constraint equations and force models. The first column of the incidence 62

matrix, which corresponds to the fixed inertial reference frame, has been dropped because the fixed frame is not movable and has no state variables. After the body and joint reordering process, the revised incidence matrix becomes 1 A

2

3

8

9

11

4

10

12

13



1   B  −1 1  C −1 1    H −1 1   I  −1 1   Q −1   D −1   P −1   J  −1    R   T Γ = F   E   K    T   O      L −1    M −1   N   G  S

7

5

14

6

15 

1 1 1 1 −1

1

−1 −1 −1

1 1 −1

1

                                1    1    1    −1 1    −1 1                  −1 1  1 −1

(156)

The subset of graph edges containing all nodes, but no loops, is known as the kinematic tree or spanning tree [28]. The edges comprising this subset are often referred to as the arcs of the directed graph. A fundamental property of a spanning tree is that it provides a unique kinematic path between any two nodes or bodies, including the root node. The remaining edges required to reconnect the spanning tree are the chords of the directed graph. Consequently, the number of chords is equal to the number of independent loops in a directed graph [28]. The new incidence matrix in Eq. (156) has 15 joints comprising the arcs of the directed graph shown above the dashed line; they represent the row–space of ΓT . The 5 joints below 63

Figure 4: Spanning tree for hydraulic excavator the dashed line are the chords of the directed graph. The sense of joints P, Q, R and T have been reversed to align with the spanning tree and to obtain positive entries on the diagonal of ΓT . The upper block has been converted to lower triangular form with 1s along the diagonal. Note that there is not a unique choice of arcs and chords in a given mechanism. In the present example, the permutations were chosen to maximize the breadth of the spanning tree, as shown in Fig. 4. The solid black arrows correspond to the joints that form the arcs in the mechanism, and the dashed blue arrows correspond to the chords. A corollary of the incidence matrix, the connectivity matrix, C, is used to navigate the topology of the mechanism and form the Jacobian matrix. The connectivity matrix is formed by element–wise replacement of ±1s in ΓT with positive or negative 6×6 identity matrices, or screw bases. The connectivity matrix yields the relative velocities across joints as

Cvabs = vrel

(157)

where vabs and vrel are respective column matrices of absolute and relative body twists across the joints that form the arcs and chords. Inspecting rows of the incidence matrix

64

in Eq. (156) reveals that the product in Eq. (157) effectively takes the difference between absolute body twists to obtain relative body twists across the arcs and chords. Additionally, the Partial Velocity Matrix, H, a block diagonal matrix of 6×1 joint influence coefficient matrices [35] (i.e. unit velocity screws or twists transformed to the fixed inertial reference frame) provides the mapping of joint rates, q, ˙ to relative twist velocities expressed in the fixed inertial reference frame as

Cvabs = H q˙

(158)

Equation (158) may be block partitioned into 

Carc





Harc

  vabs =  Cchord 0

0 Hchord

   q˙  arc   q˙

(159)

chord

Equation (159) yields −1 vabs = Carc Harc q˙arc

(160)

Cchord vabs = Hchord q˙chord

(161)

and

Substituting Eq. (160) into Eq. (161) and rearranging yields

−1 − Cchord Carc Harc q˙arc + Hchord q˙chord = 0

(162)

Equation (162) represents the chord-loop velocity constraints for the mechanism, and from it is identified the constraint Jacobian matrix h i −1 J = −Cchord Carc Harc Hchord

(163)

which was introduced in Eq. (133). Using the mapping for vabs and vrel defined in Eq. (162), the Jacobian matrix for the

65

excavator becomes A B 

i   ii    J = iii    iv   v

C

H

−hC

hH

−hC

I

Q

D

P

J

R

F

E

K

T

O

−hP hI

L

M

N

G

S 

hL

−hQ

hM −hD

hJ

−hR

hN −hF −hF

−hT

hE hK

−hT

hG hO

hS

          

(164)

where the hi are 6×1 influence coefficient matrices for each of the respective joints. The block row identifier labels, i through v, correspond to a minimal set of kinematic loops in the mechanism, where the number of loops is equal to the number of chords in the directed graph.

5.3

Kinematic Substructuring Algorithm

At this point in the analysis, information regarding the system topology required to solve Eq. (133) has been determined. Performing GECP on Eq. (164) would require approximately 9000 FLOPs, assuming the Jacobian matrix has the dimensions m = 30, n = 20 and r = 15 and using the formula for number of operations described in Section 5.1. In the following section, it will be shown how kinematic substructures can be identified and uncoupled, allowing for an increase in computational efficiency. The kinematic substructuring algorithm requires traversing the directed graph of the mechanism, and the spanning tree is an essential tool for this purpose. Note, however, that the spanning tree contains only the subset of user–defined joints that were classified as arcs in the previous section. A simple modification to Eq. (156) will effectively convert all user– defined joints to arcs of the directed graph, which is useful in the following discussion. Cut the child-side Cartesian frame of each chord joint from its embedding body and consider this released frame as a virtual, massless body. Each chord joint will then have its own unique child body and will become an integral part of an expanded spanning tree. This process will have effectively cut open all kinematic loops and replaced all chords by arcs. For each virtual body, six chord–loop constraint equations are then imposed to connect the virtual body back to its original embedding body. 66

In the excavator example, the child–side frames of the five chord joints are cut from their embedding bodies and converted into virtual, massless bodies. The virtual bodies are then constrained so that their position, orientation, velocity, and acceleration match the node locations in the respective embedding bodies from which they were cut. With this modification, the chords of the directed graph have been converted to rigid constraints. This simple modification requires no changes to Eq. (164). When the chord loops are cut, the virtual body number and the original chord’s embedding (positively oriented) body id are recorded for later reuse. Table 8 shows this information for the hydraulic excavator model. Table 8: Virtual and embedding body id for each kinematic loop loop id

virtual

embedding

body id

body id

i

16

10

ii

17

11

iii

18

13

iv

19

6

v

20

6

A new spanning tree, modified to include the virtual bodies, is shown in Fig. 5. The solid arrows correspond to the joints that form the arcs in the mechanism, and the dashed lines correspond to the new constraints imposed. Virtual, massless bodies are designated by octagons.

67

Figure 5: Modified spanning tree for hydraulic excavator with chords converted to arcs

While the incidence matrix in Eq. (156) provides all the necessary information for traversing the directed graph, navigating through its many ones and zeros is cumbersome. To avoid this problem the excavator’s graph may be represented more compactly as a parent–child list and binary tree [73]. The parent–child list facilitates traversal from any node of the spanning tree upward, toward the root element. The index into the parent–child list, P, is each body or node’s unique ID number, and the value at that index is the node’s unique parent node ID number. A binary tree facilitates traversal of the graph from any node of the directed graph away from the root element, towards the leaves of the spanning tree. Two lists, L and R, are used to represent the binary tree. The first list, L, is called the left or descendent list and contains the index of each node’s first descendant node, if present. The second list, R, is called the right or sibling list and contains the node’s first sibling, if present. Additionally, an index list contains the body indices in original user–defined order. For the excavator model, the four lists are shown in Table 9.

68

Table 9: Compact representation of spanning tree using parent–child list and binary tree list

node id

index

0

1

2 3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

P

-1

0 1

2

3

4

7

4

2

2

3

3

3

4

4

14

8

9

12

5

15

L

1

2

3 4

5

19

0

6

16

17

0

0

18

0

15

20

0

0

0

0

0

R

0

0

0 8

10

7

0 13

9

0

11

12

0

14

0

0

0

0

0

0

0

The kinematic substructuring process is divided into the following five steps: 5.3.1

Identify the level of each node or body in the spanning tree

Using the parent–child list and binary tree, the increasing level of each body in the spanning tree, moving away from the fixed inertial reference frame, is identified and stored in a node– level list. The parent–child list allows one to easily determine if node k is in the kinematic path of node j using Algorithm 1. Algorithm 1 InP ath(j, k, P ) inP ath ← f alse while j ≥ 0 do if j is k then inP ath ← true break end if j ← P [j] end while returninP ath

Using an iterative preorder binary tree traversal [73] and a temporary stack variable, S, a levels list can be generated. The levels list is of equal length to the index list introduced above. In the following algorithm descriptions, list manipulation is described as follows: list.pop() indicates removing an element from the end of the list, and list.pop(j) indi69

cates removing an element from the list at the specified index, j. Similarly, list.push(1) indicates appending the value 1 to the end of the list. All pseudocode is described using the C/Python programming convention for indexing. Therefore, list.pop(0) indicates removing the first element from the list. Additionally, list[1:end] represents a subset of the list containing elements from the second value up to and including the last element. The levels list is generated according to Algorithm 2. The levels list for the excavator model is shown in Table 10. Algorithm 2 detectLevels(P, L, R) Initialize empty stack, S Initialize empty list, levels levels[0] ← 0 k ← L[0] while not S.isempty() or k > 0 do if k > 0 then levels[k] = levels[P [k]] + 1 if R[k] > 0 then S.push(R[k]) end if k ← L[k] else k ← S.pop() end if end while returnlevels

70

Table 10: levels list for the excavator model list

5.3.2

node id

index

0

1

2

3

4

5

levels

0

1

2

3

4 5

6

7

8

9

10

11

12 13

14

15

16

17

18 19

20

6 5

3

3

4

4

4

5

6

4

4

5

7

5

6

Identify loop members

Each loop’s base node and the nodes in the loop can also be readily identified from the parent–child list and Table 8. Let the respective kinematic paths from the spanning tree’s root to each chord joint’s virtual body and its user–defined embedding body represent the positive and negative branch segments of the loop, as shown for loop v in Fig. 6. In Fig. 6, edges F and T are oriented negatively with respect to the loop direction, edges K, O and S are oriented positively with respect to the loop direction, node 4 is loop v’s unique base node, and nodes 0, 1, 2 and 3 are in the ancestor path of node 4. From Table 8, let the virtual body id for loop j be stored in positiveN odes[j] and the corresponding embedding body id stored in negativeN odes[j]. For each loop in Table 8, the loop members can be identified using Algorithm 3 and stored in the members list, an initially empty list of lists equal to the number of chord loops, i-v. The first and second while loops are used to decrement each node index in the kinematic path until node indices in the positive and negative branches of the loop are at the same level in the spanning tree. Once the decremented node indices have reached the same level, they will be decremented together by level until the loop’s unique base node is identified. During this process, the node indices are stored in a temporary stack variable. At the end of the process, the last node index that was pushed to stack is removed from the end of the list, as the loop’s base node will have been written twice. The list sequence is then reversed and stored in members[j]. The final members list contains each loop’s base node first, followed by the loop members in monotonically increasing order by level.

71

Algorithm 3 detectLoopM embers(P, levels, negativeN odes, positiveN odes) Initialize empty stack, S Initialize empty list of lists, loopM embers for j ← 0 to number of loops do negative ← negativeN odes[j] positive ← positiveN odes[j] S.push(negative) S.push(positive) while levels[positive] > levels[negative] do positive ← P [positive] S.push(positive) end while while levels[negative] > levels[positive] do negative ← P [negative] S.push(negative) end while while positive is not negative do negative ← P [negative] positive ← P [positive] S.push(negative) S.push(positive) end while S.pop() while not S.isempty() do members[j].push(S.pop()) end while end for returnmembers

72

Figure 6: Loop v members and nodes in kinematic path of loop v’s base node. 5.3.3

Order loops by level

At this point the loops are rearranged by sorting their base nodes into monotonically increasing order by level in the spanning trees. After processing by Algorithm 3 and sorting, the loop ordering sequence and loop member lists are shown in Table 11. Table 11: Members of each kinematic loop with loops listed in monotonically increasing order by level in the spanning tree. The base node associated with each loop is the first member of each list. loop ID

base node

loop

level

members

i

2

2, 3, 8, 10, 16

ii

2

2, 3, 9, 11, 17

iii

3

3, 4, 12, 13, 18

iv

4

4, 7, 5, 6, 19

v

4

4, 7, 14, 15, 6, 20

In the excavator example, no reordering of the loops was necessary, as their base nodes were already arranged in monotonically increasing level order. 73

5.3.4

Merge loops into kinematic substructures

Any given mechanism will consist of at least one kinematic substructure, i.e. the entire mechanism. Therefore, the loop–merging process begins by assigning the first loop, i, to the first substructure. Each successive loop is then inspected to determine if it should be added to any of the existing kinematic substructures or if the loop should be assigned to a new substructure according to Algorithm 4. A candidate chord–loop will be merged with a kinematic substructure if the substructure and the merge candidate have at least one body and one joint in common. Note, however, that the graph edge (joint) labels are not required for merging. The merging process can be performed using only the parent–child list and loop member assignments. A candidate loop will be merged with a kinematic substructure if the substructure’s base node matches the candidate’s base node or if it is in the candidate base node’s ancestor path, and if the candidate loop and substructure have at least one descendant node in common. The term descendant node refers to any loop member that is not the loop’s base node, i.e. the subset of loop members, loop[1:end]. This applies to kinematic substructures as well, where loop[1:end] is the union of all merged chord-loop bodies, excluding the unique base body at the lowest level in the spanning tree.

74

Algorithm 4 detectSubstructures(members,P) initialize empty list of lists, substructures substructures.push( empty list ) substructures[0].push(members.pop(0)) while not members.isempty() do mergeCandidate ← members.pop(0) merge ← f alse for substructure in substructures do for loop in substructure do if InP ath(mergeCandidate[0], loop[0], P ) then if any mergeCandidate[1 : end] in loop[1 : end] then substructure.push(mergeCandidate) merge ← true break end if end if end for if merge then break end if end for if not merge then initialize empty list, S S.push(mergeCandidate) substructures.push(S) end if end while returnsubstructures

75

In the present example, the base node of loop i is found to be in the ancestor path of loop ii’s base node. Inspecting the descendant members of loop ii in Fig. 5 reveals that node 3 is also a child of loop i, indicating that loop ii should be merged into the first kinematic substructure. Moving on to loop iii, an ancestor of loop iii’s base node is found to be the base node of the first kinematic substructure. However, none of the descendant members of loop iii are members of loops i or ii, and therefore loop iii is not a member of the first kinematic substructure. Thus loop iii is assigned to a second kinematic substructure. Loops iv and v are then analyzed. An ancestor of the base node of loop iv is found to be in the kinematic path of the base node of loops i and ii in the first kinematic substructure, however, none of the children of loop iv’s base node is a member of loops i and ii. Therefore, loop iv is not a member of the first substructure. Similarly loop iv is not a member of the second substructure. Thus a third substructure is designated, to which loop iv is assigned. Similarly, loop v is not a member of the first and second substructures. However, one child of loop v’s base node is a member of the third substructure, so loop v is assigned to it. The substructures and their corresponding loops are shown in Table 12. Table 12: Kinematic substructures and their corresponding chord–loops

5.3.5

substructure id

loops

1

i, ii

2

iii

3

iv, v

Reorder bodies and joints

In this last step the joints are ordered by level in the spanning tree and permuted according to their kinematic substructure assignments, which requires creating a short list of bodies that are either: • not a member of any kinematic substructure, or 76

• are the base node of a kinematic substructure In the excavator example, the short list of bodies, ordered by level in the spanning tree becomes: (0) fixed, (1) undercarriage, (2) cabin, (3) boom, and (4) stick. From the bodies list, a short list of joints is also created. Each joint in the joint short list is the parent joint associated with the corresponding body in the body short list. The (0) fixed body has no parent, so the joints in the joint short list for the present example are: A, B, C, and D. From the joint short list, an expanded joint list is created. For each joint in the joint short list, perform the following operations: • if the joint is not a member of a kinematic substructure, append the joint to the expanded joint list, otherwise: • append the joint associated with the base node of the kinematic substructure to the expanded joint list only if it has not been appended in a previous step. Then, append the remaining joints associated with nodes in the kinematic substructure, ordered by level in the spanning tree. After the final sorting and expansion operation, the revised Jacobian matrix becomes

A B 

i   ii    J= iii    iv   v

C

H

−hC

hH

−hC

I

hI

Q

P

L

−hP

hL

−hQ

M

D

J

R

N

F

E

K

T

O

G

S 

hM −hD

hJ

−hR

hN −hF

−hT

hE

−hF

hK

−hT

hG hO

hS

          

(165)

and the block submatrices J1 –J3 associated with each kinematic substructure become C J1 =



−hC  ii −hC

i

H

I

Q

P −hP

hH −hQ

hI

77

L

M 

hL

 hM

(166)

D

J

h

J2 = iii −hD

hJ

R

N

−hR

hN

i

(167)

and F  iv −hF  J3 = v −hF

E

K

T

O

−hT

hE hK

−hT

G

S 

hG

 hO

(168)

hS

The permuted Jacobian matrix in Eq. (165) now shows the block–diagonalized structure. Joints A and B are not in any substructure and are therefore always independent. Chord loops i and ii and joints C, H, I, Q, P, L and M comprise kinematic substructure 1, chord loop iii and joints D, J, R, and N comprise kinematic substructure 2, and chord loops iv and v and joints F, E, K, T, O, G and S comprise substructure 3. The revised spanning tree shown in Fig. 5 has been assigned colors according to the kinematic substructure assignments: (Red: Substructure 1, Green: Substructure 2, and Blue: Substructure 3). The arcs, chords, and nodes are colored according to which kinematic substructure they appear in, with the exception that base nodes are colored according to their associated kinematic substructures. Likewise, labels J1 , J2 and J3 are assigned to each of the respective Jacobian submatrices. GECP can then be applied independently to each of the Jacobian submatrices. Assuming that each Jacobian submatrix has full row rank, GECP would require a total of 1008 FLOPs according to the formula provided in Section 5.1, an approximate order of magnitude reduction in floating point operations. As the operations count is not always a direct measure of numerical efficiency, in the next section a numerical experiment is carried out to compare the kinematic substructuring technique to other methods. Although the process for automatically determining kinematic substructures is described in specific context to mechanisms, the algorithm is a general graph partitioning method that may have applications in other disciplines. To facilitate transposition to other subject areas, note that the Jacobian matrix, J, is a byproduct of the right null space of the directed graph,

78

ΓT , as described in [12]. See also Eq. (163). In graph theory, this null space matrix is often referred to as the loop matrix [59]. Therefore, the methods described in this section may be adapted to other subjects by replacing any reference to the Jacobian matrix with the more general loop matrix.

5.4

Modified algorithm for level–order sort and identification of loops and loop members

In the preceding sections, all algorithms were presented in terms of the large sparse incidence and connectivity matrices to show connections to well–established graph theoretic concepts. It is important to note, however, that these large, sparse matrices need never be computed or stored in multibody dynamics code. The mechanism topology represented as a parent– child–list and the cycle basis, which forms the basis of the Jacobian matrix, can be identified using a simple and numerically efficient algorithm that requires O(n2 ) operations, with n as the number of joints in the mechanism. This modified approach, which is described in Algorithm 5, also obviates the need for the binary tree representation of system topology to traverse from the root of the spanning tree out to the leaves, as this representation was only necessary for determining the body or joint’s level away from the root node in the previous discussion. The simple sorting operation generates a breadth–first spanning tree from a user’s list of bodies, b0, and joints, j0, specified in arbitrary order from the input file and reoders in monotonically level increasing order b1 and j1. To initiate the process, the user’s input file is loaded as a directed–graph representation from the input file into computer working memory. Memory pointers to each joint’s parent (negatively oriented) body and child (positively oriented) body are set by matching string names specified in the input file. The bodies and joints are then reordered according to Algorithm 5. Note that here the index -1 is reserved for the fixed inertial reference frame in the parent–child list.

79

Algorithm 5 levelOrderSort(j0) Initialize temporary empty stack, S Initialize empty (second) list of joints, equal to number of joints j1 Initialize empty parent–child list equal to total number of active joints, P Initialize empty levels list, equal in length to number of joints Initialize empty sense list, equal in length to number of joints Find joints that are positively or negatively incident with the fixed inertial reference frame and append to S for j in j0 do if j.parent body is “fixed” then j.parent index = -1 j.level ← 0 j.sense ← +1 S.append(j0.pop(j)) else if j.child body is -1 then j.parent index = “fixed” j.level ← 0 j.sense ← -1 S.append(j0.pop(j)) end if end for Pop joints off the front of the list S, add to new list j1 while S do j ← S.pop(0) if j.sense is +1 then scan for joints connected to the child body of j in list j0 for k in j1 do if j.child body is k.parent body then k.parent index = i k.level ← j.level + 1 k.sense ← +1 S.append(j0.pop(k)) end if if j.child body is k.child body then k.parent index = i k.level ← j.level + 1 k.sense ← -1 S.append(j0.pop(k)) end if end for else if j.sense is -1 then if j.parent body is k.parent body then k.parent index = i k.level ← j.level + 1 k.sense ← +1S S.append(j0.pop(k)) end if if j.parent body is k.child body then k.parent index = i k.level ← j.level + 1 k.sense ← -1 S.append(j0.pop(k)) end if end if levels[i] ← j.level P[i] ← j.parent index j1[i] ← j index ← index + 1 end while returnlevels, sense, P, j1

80

The algorithm starts out by removing any joints with a connection to the fixed reference frame from the first list of joints and appending them to the end of the temporary list S, flipping the sense of the joints as necessary to align with the spanning tree. Once all joints with a connection to the reference frame are detected, the algorithm continues by popping a joint from the front of the temporary list S and adding it to the second list j2. Any children of the joint being processed are appended to the end of the temporary list S. The process continues similarly, flipping the sense of joints as necessary until there are no more joints to process. With the joint order specified by Algorithm 5, the bodies are reordered to follow the joint order using Algorithm 6 and constraints are identified. This algorithm is also used to set up the virtual bodies described in Table 8 and Fig. 5. Algorithm 6 reorderBodies(j1, b0) initialize empty list negativeNodes initialize empty list positiveNodes for i in length(j2) do if j2[i].sense is +1 then b ← b1.pop(j2[i].child body) else if j2[i].sense is -1 then b ← b1.pop(j2[i].parent body) end if if b then b.index ← i b2[i] ← b else initialize massless virtual body v v.index = i b2[i] ← v positiveNodes.append(v.index) if j2[i].sense is +1 then negativeNodes.append(j2[i].child body.index] else if j2[i].sense is -1 then negativeNodes.append(j2[i].parent body.index] end if end if end for returnnegativeNodes, positiveNodes, b2

In the first version of the kinematic substructuring algorithm described starting in Section 81

5.3.1, the binary–tree was used to find the level of each node in the spanning tree. If the revised algorithm is used, there is no need to create a binary–tree to traverse the mechanism topology. The resulting parent–child list can be used for forward and backward traversal of the topology. The kinematic substructuring algorithm can then begin with Step 2 (identifying loop members) as described in Section 5.3.2. Additionally, the matrix C need never be formed or evaluated in multibody dynamics code. The inverse of the connectivity matrix, C −1 , which is also sometimes referred to as the path matrix [28], plays a central and important role in the dynamics equations (see for example Eq. (122)). Operations such as y ← C −1 x, where x and y are two stacked column matrix of screw quantities, can be efficiently, recursively carried out as shown in Algorithm 7. Algorithm 7 productCinversex(P, x, y) y←x for j in range(length(P), -1, -1) do i ← P[j] y[i] ← y[i] + y[j] end for returny

The above operation could also be applied in–place if the quantity x were not used again for another purpose, thereby avoiding the memory copy at the beginning of Algorithm 7. Each independent kinematic loop corresponds to a block 6 row of the Jacobian matrix. With the loop members identified, the Jacobian matrix can be found simply and efficiently without computing the null–space basis of the incidence matrix as described in [28, 3]. Furthermore, the parent–child list, which is linear in storage requirements and O(1) in access time, contains all the necessary information to traverse the mechanism. While the highly sparse incidence, path and connectivity matrices described above serve as useful instructive tools to convey the system topology, their quadratic storage requirements and linear access time make them inefficient in the computer implementation of storing or processing system topology. This section has demonstrated that the large, sparse matrices need never be formed

82

to represent the topology of a multibody system. These concepts will be discussed further in Chapter 6.

5.5

Numerical Experiment

To investigate the numerical efficiency of kinematic substructuring, GECP of the block– diagonalized Jacobian matrix in Eq. (165) is computed using (a) the kinematic substructuring method applied to each of the three block submatrices, J1 –J3 , (b) the sparse matrix algebra MA48 factorization algorithm (part of the Harwell Subroutine Library (HSL) [74]) applied only to the nonzero terms of the 30×20 Jacobian matrix, J and (c) the full, dense matrix algebra applied to the full 30×20 Jacobian matrix, J. In sparse matrix methods, only the non–zero terms of the coefficient matrices are stored. Therefore, memory usage and floating point operations required to carry out sparse-matrix algebra is reduced compared to full, dense algebra. Indeed, many large scientific problems, such as the numerical solution of finite element analysis (FEA) equations, would be intractable without the use of sparse-matrix methods. Note, however, that in sparse-matrix methods, the set of instructions to perform the operations is more complicated than full, dense matrix algebra. Furthermore, matrix coefficients may not be stored or accessed in contiguous memory, which may increase cache misses [75]. The use of sparse-matrix algebra is expected to offer little or no benefit when the coefficient matrices are small and/or dense. In contrast to the large, highly sparse equations found in other disciplines, the equations of motion for many multibody systems are typically smaller and moderately sparse. Nonetheless, Gonzalez et al. [76] showed that sparse-matrix methods do offer some performance benefit over full, dense matrix algebra, considering the special structure of dynamics equations, even for small to moderate problem sizes. Block–diagonalization via kinematic substructuring is essentially a hybrid of full and sparse matrix methods. As discussed in the previous section, full, dense matrix algebra can be applied to each of the Jacobian matrices J1 –J3 at a small fraction of the operations required to apply full matrix algebra to the entire Jacobian matrix, J. Furthermore, these block 83

submatrices are dense, and sparse-matrix algebra is expected to have increased complexity compared to full, dense algebra applied to them individually. The HSL MA48 factorization algorithm was chosen as the reference sparse implementation because HSL is well–known for its efficient, well–tested, and high–quality numerical algorithms. HSL subroutines are commonly used in many commercial and research–oriented multibody dynamics solvers [76, 77], and the MA48 algorithm is suitable for nonsymmetric, rectangular matrices. The HSL MA48 algorithm consists of three primary components. The first analyze component analyzes the specific structure of a coefficient matrix and generates an initial factorization. The second factorize component is used to compute the factors of a matrix with a known sparsity pattern determined in the analyze phase. In factorize, a “fast” flag can be set. If the fast flag evaluates to true, the factorization sequence from the previous call to analyze or factorize is used; if the fast flag evaluates to false, the factorization sequence may change. The third component of HSL MA48 is a solve phase, which uses the matrix factors to solve a set of linear equations. For benchmarking purposes, only the first and second components of HSL MA48 are considered. The analyze component of HSL MA48 is analogous to block–diagonalization via kinematic substructuring as described in Section 5.3. The analyze phase would be called once at the beginning of a simulation for the particular sparsity pattern of the Jacobian matrix. Analyze would not be called again unless a pivot has come too close to zero or the system’s topology has changed. Factorize, with the “fast” flag set to false, is analogous to gaussian elimination with partial pivoting, and factorize, with the “fast” flag set to true, is analogous to gaussian elimination without pivoting. To compare the efficiency of kinematic substructuring to sparse and full matrix factorization, a computer program was written in the C programming language that closely follows the development of this paper. The bodies and joints of the excavator model are specified in an arbitrary user–defined order in a text file. When the program is launched, the bodies and joints are loaded into computer memory using a linked–list data structure. Bodies and joints

84

are permuted according to Section 5.3 using memory pointers. Once the permutation process is complete, quantities are copied to contiguous memory, and Newton–Raphson iteration is applied to eliminate constraint errors, if present. Finally, the spanning tree is traversed to update the terms of either (a) the block diagonal submatrices J1 –J3 , (b) a sparse matrix representation of the Jacobian, or (c) the full dense Jacobian matrix, J. Once the terms are computed, the Jacobian matrix is passed to one of the three algorithms for matrix factorization. The Jacobian update and factorization steps are repeated 10000 times for each method. The numerical experiments were carried out on a 6–core AMD Phenom IITM X6 1090T Processor with 16GB of RAM, running the 64 bit 3.18.6-1 Arch Linux kernel. The program was compiled using the GNU Compiler Collection (GCC) C compiler, version 4.9.2-4. The HSL MA48 algorithm is used in conjunction with the ATLAS (Automatically Tuned Linear Algebra Software) [78] implementation of BLAS (Basic Linear Algebra Subprograms), whereas the GECP algorithm for full, dense matrix factorization was programmed wholly without the use of BLAS. Refer to reference [76] for more details of various BLAS implementations in multibody dynamics. Results of the experiment are provided in Fig. 7. Kinematic substructuring provides an approximate 5× speedup over full, dense matrix algebra. The speedup is not directly proportional to the 9× reduction in FLOPS calculated in Section 5.3 due to the increased number of function evaluations. Nonetheless the 5× speedup represents a considerable savings over full, dense GECP. MA48’s performance was between the two methods. Kinematic substructuring was more than two times faster than MA48, even without the use of BLAS.

85

A B

0.036214 0.090873 0.188134

C

0 0.05 0.1 0.15 0.2 time (seconds) for 10000 evaluations, less is better Figure 7: Run times for 10000 gaussian elimination evaluations with complete row and column pivoting for A: Block–diagonalization via kinematic substructuring, B: Sparse factorization using HSL MA48 algorithm, and C: Full, dense gaussian elimination

The results depicted in Fig. 7 represent the portion of simulation time required to determine the set of independent and dependent generalized coordinates if GECP were conservatively called at every time step during numerical integration. As discussed in the introduction, GECP need not be called at every time step. If the set of dependent and independent variables remains unchanged between two integration steps, the factorization pivot sequence may be reused. However, it is strongly emphasized that if this approach is taken, for robustness, GECP must be reapplied as the mechanism approaches a bifurcation configuration. In the following example and timings, a heuristic method for monitoring the stability of the choice of dependent and independent variables based on [79] was used to determine when to reapply GECP. The ratio of largest to smallest pivots of the largest nonsingular submatrix of J in full, dense GECP or of the block–diagonal matrices J1 –J3 in block–diagonalized GECP is monitored. If the pivot ratio becomes greater than a user–specified threshold obtained from the initial starting value, GECP is applied again. The logic to check the pivot ratio is not added to the HSL MA48 “fast factorization” subroutine. Instead, MA48 returns an error when a pivot approaches machine roundoff error during the fast factorization process, which may be used as a basis to call factorization with the “fast” flag set to false again to determine a new set of dependent and independent variables. Although the timings described in the following section are based on pivot ratios, an alternative heuristic method was ultimately used when setting up and solving the equations

86

of motion. The alternative method monitors the infinity norm of the B di matrix (i.e. the largest absolute value in the B di matrix) in the preconditioned constraint equations, which was introduced in Eq. (139). If the infinity norm increases to more than 2 times its initial starting value, LU factorization with complete row and column pivoting is reapplied. The infinity norm of the B di matrix was used as a heuristic because it provides a mapping of the dependent variables to independent degrees of freedom. On an initial call to LU decomposition with complete pivoting, the terms of the B di matrix are . 1. If the terms of the B di matrix increase greater than the initial starting value, it is an indication that the mechanism has moved to a posture where the variables that were initially selected as dependent in an initial call to GCP now have a greater influence on changes of the overall system configuration, which signals that GCP may need to be reapplied. See Section 7.2 for more implementation details. A B C

0.009111 0.064939 0.026275

0 0.05 0.1 0.15 0.2 time (seconds) for 10000 evaluations, less is better Figure 8: Run times for 10000 gaussian elimination evaluations without pivoting for A: Block–diagonalization via kinematic substructuring, B: Sparse “fast–factorization” using HSL MA48 algorithm, and C: Full, dense gaussian elimination without pivoting

When the pivot sequence is reused from the initial call to GECP, block–diagonalization via kinematic substructuring is approximately 7× faster than MA48’s fast factorize routine, even accounting for the additional check on pivot ratio. The factorization and pivot ratio calculation for the entire full, dense Jacobian matrix was also more than twice as fast as sparse factorization with the “fast” flag set to true. Another implication of using a heuristic method, together with the kinematic substructuring technique, is that when the mechanism approaches a bifurcation configuration, GECP 87

need not be reapplied simultaneously to all three block submatrices, J1 –J3 . Each kinematic substructure can be monitored individually, and full GECP can be reapplied on an as–needed basis to each submatrix in approximately 1/3 the time indicated in Fig. 7A. On the other hand, if the mechanism approaches a bifurcation configuration when using full, dense algebra or sparse-matrix methods, GECP must be reapplied to the entire Jacobian matrix, and the time required to identify a new set of dependent and independent variables shown in Figs. 7B and 7C will always be roughly the same. The results of Figs. 7 and 8 indicate that block–diagonalization via kinematic substructuring requires a minimal computational expense to select a set of dependent and independent variables, which is a critical part of a robust solution of dynamics equations. The next section summarizes the role of kinematic substructures in setting up and processing the dynamic equations of motion.

5.6

Dynamics Equations with Substructures

In this section, the introduction of incidence and connectivity matrices into the dynamic equations of motion is discussed; a minimal representation of the equations of motion is derived, accounting for each of the kinematic substructures; and a preconditioning process, hereafter referred to as the method of Preconditioned Primary Constraints (PPC), is applied to the primary constraint equations to simplify the equations of motion. With the addition of virtual bodies, the incidence matrix, ΓT , is now square and invertible. Accounting for the joint and body ordering and block partitioning identified in Section 5.3, ΓT becomes

88

A B C H I Q P L M ΓT = D J R N F E K T O G S

1 2 1  −1 1   −1   −1   −1                            

3

8

9 11 10 16 17

4 12 13 18

7

5 14

6 15 19 20 



                                  

1 1 1 −1 −1

1 1 −1

1 −1

−1 −1

1 1 1 −1

1 −1

−1 −1 −1

1 1 1 1 −1

1 −1

1

−1

1 −1

(169)

1

The block pattern in Eq. (169) results because the kinematic substructures are coupled to each other only through their base bodies. Recall that the connectivity matrix, C, must also be updated by element–wise replacement of the ones and zeros in Eq. (169) with positive or negative 6×6 identity matrices. The column matrix v˙ abs in Eq. (158) is modified to include the absolute twist velocities of the virtual bodies. With these modifications, Eq. (158) is differentiated to yield

C v˙ abs = H q¨ + H˙ q˙ = H q¨ + γ

(170)

γ = H˙ q˙

(171)

where

Additionally, the constraints, denoted as e, and their first and second time derivatives must satisfy e(q, t) = 0

(172)

e˙ = J q˙ + et = 0

(173)

89

and ¨ e = J q¨ − γ c = 0

(174)

γ c = −J˙q˙ − e˙ t

(175)

where

In Eqs. (172)–(174), the symbol t represents time, which may appear explicitly in some constraint equations. Generally the chord-loop constraint equations don’t depend on time, but Eqs. (172)–(174) may also include additional constraints imposed on joint variables that do depend on time. After kinematic substructuring has been performed, the mass matrix M will be block– diagonal with each body’s 6×6 spatial inertia matrix appearing along the diagonal, arranged according to the body ordering in Eq. (169). Interspersed along the block–diagonal submatrices of M will be zero or more 6×6 all–zero submatrices, one for each floating child frame of a chord joint variable and one for each floating child frame of selected arc joint variables. Thus M will generally be singular, and this singularity generally propagates into the generalized mass matrix M ` . As the bodies are now grouped according to their kinematic substructure assignments, as illustrated in Fig. 5, the mass matrix may be block–partitioned as   Mii     M11   M =  M 22   M33

(176)

where Mii consists of inertia matrices associated with bodies 1 and 2, M11 consists of inertia matrices associated with bodies 3, 8, 9, 11, 10, 16 and 17, M22 consists of inertia matrices associated with bodies 4, 12, 13 and 18, and M33 consists of inertia matrices associated with bodies 7, 5, 14, 6, 15, 19 and 20. Note that a body may technically be a member of more than one kinematic substructure, but its parent joint will be included in at most one kinematic substructure. Therefore, to better understand the related body ordering and substructure assignments in Eq. (176), refer to Fig. 5 and the ordering of each body’s unique parent joint in Eq. (165) or to the ordering within each kinematic substructure in Eqs. (166) to (168). 90

Mapping of the sparse, block–diagonal matrix, M , onto the generalized mass matrix, M ` , is determined by the congruency transformation shown in Eq. (124) and follows a unique structure precisely defined by the mechanism’s topology. Efficient recursive strategies for computing M ` have been discussed in the literature, for example in [62, 63]. In Eq. (169), the order of bodies was permuted to follow the order of joints. Recall that the generalized mass matrix is obtained from Eq. (124). As suggested by the block partitioning in Eq. (169), the quantities in matrix ΓT are partitioned according to the kinematic substructuring algorithm and its inverse is obtained as

A B C H

Γ−T

1 2 3 8 9 11 10 16 17 = 4 12 13 18 7 5 14 6 15 19 20

                                   

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

I Q P

L M D

J

R N

F

E K

T

O G

S 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

                                  

1 1 1 1 1

1 1 1

1 1

1

1 1 1 1 1 1 1 1 1 1 1

The inverse connectivity matrix, C

1 1 1

1 1

1 1 1 1 1 1 1 −1

1 1 1 1 1

1 1

1

1

1

1

1

(177)

1

is obtained from element–wise replacement of the

1s with 6×6 identity matrices and after carrying out the projection H T C −T M C −1 H, the block–partitioned generalized mass matrix, M ` , is obtained as   Mii` Mi1` Mi2` Mi3`    ` ` ` `   M1i M11 M12 M13   M` =   ` ` ` `   M2i M21 M22 M23    ` ` ` ` M3i M31 M32 M33

(178)

` ` ` where Mii` , M11 , M22 , and M33 each have dimensions 1/6th the size of the respective subma-

trices Mii , M11 , M22 , and M33 in Eq. (176). 91

The block–partitioned Jacobian submatrices in Eq. (165) and the constraint equations may be interspersed within the generalized  Mi2` M ` Mi1`  ii  ` ` `  M1i M11 JIT M12    J1   `  M` M` J2T M22  2i 21   J2    ` ` `  M3i M31 M32 

equations of motion as     `   q¨i  f `i  Mi3                     `     ` q ¨      f M13    1 1                 γ     λ     1 1          ` `  M23   q¨2  =  f 2            γ2     λ2                      ` ` T      q ¨ f    M33 J3   3     3                 γ λ3 J3 3

(179)

The kinematic substructures appear as the three block 2 × 2 submatrices along the diagonal of the coefficient matrix in Eq. (179). Coupling between these substructures is through the off-diagonal submatrices, which are relatively sparse. As discussed earlier, if the Jacobian submatrices in Eq. (179) do not have full row rank, then it will be just as difficult to process this system of equations. Therefore, GECP, GCP and PPC are applied to each Jacobian submatrix to eliminate these problems. First consider how GECP would handle the constraint equations for a typical kinematic structure. The permuted and partitioned equations may be represented as    T        Pvd q¨d  J pd J pi q¨d  γ pc  Pep   J   = = γ s   i i sd si  i  s Pv q¨ J J q¨ Pe c

(180)

In Eq. (180) the matrices Pep and Pes represent two pieces of an orthonormal permutation matrix that rearrange and partition rows of the Jacobian matrix and equations into primary and secondary components, and the matrices Pvd and Pvi represent two pieces of an orthonormal permutation matrix that rearrange and partition columns of the Jacobian matrix and rows of the column matrix of variables into dependent and independent components. Thus the column matrices q¨ and γ c have been appropriately permuted and partitioned according to the equation and variable permutation matrices. As the GECP algorithm factors each sample Jacobian matrix, it generates the appropriate equation and variable permutation information, which is stored and available for subsequent use. 92

The primary constraint equations are extracted from Eq. (180) as   i  q¨i  h = γ pc J pi J pd q¨d 

(181)

where this reduced Jacobian matrix now has full row rank. Because the primary Jacobian matrix in Eq. (181) for each kinematic substructure has full row rank, the primary constraint equations may be appended to a partitioned and permuted version of the generalized equations  `ii  Mii   M `ii  1i   M `di  1i      `ii  M2i   `di  M2i       M `ii  3i   M `di  3i 

of motion instead of the full Jacobian matrix as was done in Eq. (179), yielding     `    `ii `id `ii `id `ii `id    q¨i  fi  Mi1 Mi1 Mi2 Mi2 Mi3 Mi3                      i `i piT `ii `id `ii `id `ii `id      q ¨ f    M11 M11 J1 M12 M12 M13 M13     1 1            `d  d pdT `di `dd `di `dd `di `dd     q ¨ f     M11 M11 J1 M12 M12 M13 M13   1 1             d      pi pd d     γ λ J1 J1  1      1           i  `i  piT `ii `id `ii `id `ii `id    f2   q¨2 M21 M21 M22 M22 J2 M23 M23    d  =  `d  (182) pdT `di `dd `di `dd `di `dd   q¨2  f2  M21 M21 M22 M22 J2 M23 M23                  d     pi pd d     γ λ J2 J2    2 2                i `i piT   `ii `id `ii `id `ii `id    q ¨ f    M31 M31 M32 M32 M33 M33 J3       3 3             `d d pdT      `di `dd `di `dd `di `dd     q ¨ f M31 M31 M32 M32 M33 M33 J3     3 3             d     pi pd d     γ λ3 J3 J3 3

For well–posed models the coefficient matrix in Eq. (182) will be nonsingular, and all unknowns may be solved for. Starting at the lower right corner of the large matrix in Eq. (182), the submatrix J3pd is nonsingular and the block 2×2 submatrix containing it and its `dd transpose is also nonsingular, regardless of the condition of M33 , which could be entirely

zero. Thus one may proceed to eliminate the unknowns q¨d3 and λp3 from Eq. (182) and then q¨i3 . With these quantities eliminated, the same processes may be applied to the second kinematic substructure, then the first kinematic structure, and finally q¨i may be computed. One drawback of this approach is that inverting the block 2×2 submatrices in Eq. (182) and computing the Schur complement [80] matrices can be a somewhat complicated procedure. Therefore, the preconditioning process described in Chapter 4 is applied to each of the substructures individually. 93

To demonstrate the preconditioning method with substructures, recall that Eq. (180) can be expressed as 

J

pd

J

sd



        d  q¨d  γ p  h i q¨ J L c  =   U UR =      J si γ sc  q¨i q¨i LR pi

(183)

where L represents an invertible lower–triangular matrix with unity diagonal, U represents an invertible upper–triangular matrix with pivots on the diagonal, and LR and UR represent residual matrices. From Eq. (183) an alternate set of primary constraint equations may be extracted as h

L U UR

  i q¨d   q¨i 

= γ pc

(184)

Since L and U are invertible, Eq. (184) may be revised to     h i q¨d  h i q¨d  = I B di = U −1 L−1 γ pc = γ d I U −1 UR  q¨i   q¨i 

(185)

Equation (185) identifies a replacement for the primary equations in Eq. (181) as

h

B di I

  i  q¨i 

= γd

(186)

q¨d 

Equation (186) is the preconditioning template that is applied for each kinematic substructure. Using the notation B diT = B id , the system of equations for the entire mechanism may be simplified as 

`ii  Mii  `ii M  1i   M `di  1i      `ii  M2i   `di  M2i       M `ii  3i   M `di  3i 

Mi1`ii Mi1`id

Mi2`ii Mi2`id

Mi3`ii Mi3`id

`id id `ii `id `ii M11 B11 M12 M12 M11

`ii `id M13 M13

`dd `di M12 M12

`di `dd M13 M13

`di `dd M11 M11 di B11

I

I

`ii `id M21 M21

`ii `id id `ii `id M22 M22 B22 M23 M23

`di `dd M21 M21

`di `dd M22 M22

`ii M31

`id M31

`di `dd M31 M31

di B22

I

`ii M32

`id M32

I

`di `dd M23 M23

`ii `id id M33 M33 B33

`di `dd M32 M32

`di `dd M33 M33 di B33

94

I

I

      q¨i      f `i                     i  `i    q¨1  f1                  `d  d      q ¨ f         1 1            d   λd      γ1      1                 i `i   q¨2   f 2     d  =  `d    f2   q¨2                d     d    γ λ     2 2              i  `i       q ¨ f         3 3              q¨d     f `d         3    3        d    d  λ3   γ   3

(187)

In this form, inverting the block 2×2 pivotal submatrices containing the identity matrices is trivial. Refer to the next chapter for details of the solution process.

95

6

Numerical Solution of Dynamics Equations with Substructures

6.1

The Schur Complement Method

The Schur Complement Method (SCM) [80] is a block matrix factorization algorithm that can be used to recursively uncouple and solve linear systems of matrix-based equations. The basic premise of the method of the Schur complement is that the rows and columns of a large matrix may be permuted and partitioned into a block two by two matrix, where a small matrix on the diagonal is square and easily invertible. If the small matrix is in the upper-left position, Forward SCM (FSCM) produces a forward elimination and backward substitution algorithm, otherwise Reverse SCM (RSCM) produces a backward elimination and forward substitution algorithm. In this chapter RSCM will be used to recursively uncouple and solve the generalized equations of motion with appended preconditioned constraint equations split into kinematic substructures. In its coarsest form the block-partitioned equations may be recursively uncoupled and solved one kinematic substructure at a time to provide a birds-eye view of the whole process. In its more practical and efficient granular form the block-partitioned equations may be recursively uncoupled and solved one or two variables at a time. When applying RSCM to the preconditioned equations of motion, the coarse and granular uncoupling algorithms have essentially identical algorithmic forms. Before illustrating how RSCM is applied to recursively uncouple and solve the preconditioned equations of motion, a simple example is used to demonstrate its application to a general system of equations with a symmetric coefficient matrix. Only one step of the recursive uncoupling and solution process is presented, but the resulting template equations can be repeated on the residual matrix and reduced system of equations until the final residual matrix is small enough to directly invert, which terminates the recursive uncoupling process.

96

Consider the block partitioned system of equations      A C T x a   = C D y   b 

(188)

where the block matrices A and D are symmetric, and D is easily invertible. The first step of RSCM is summarized as follows. Multiply the second block equation in Eq. (188) by C T D−1 , and then subtract this modified equation from the first block equation in Eq. (188) to obtain  T T −1 Dy y − C T D−1 Cx −  C D = a − C T D−1 b Ax + C  

(189)



which reduces to  A − C T D−1 C x = a − C T D−1 b

(190)

The operations in Eq. (190) are performed using in–place assignment as A ← A − C T D−1 C

(191)

a ← a − C T D−1 b

(192)

and

where A − C T D−1 C is the Schur complement of D. Now the smaller system of equations can be used to solve for x as x = A−1 a

(193)

Once x is known, y can be found from the second block equation in Eq. (188). y = D−1 (b − Cx)

(194)

With these basic templates, consider how the method of the Schur complement can be used to solve the preconditioned dynamics equations. Consider a simple example, consisting of only one kinematic substructure  `ii M `id B diT  M   M `di M `dd I   B di I 0 97

          f `i   ¨q i               ¨q d = f `d            d   λd    γ 

(195)

The lower right block submatrix of the coefficient matrix in Eq. (195) has a trivial inverse, given as  −1   `dd M I 0 I   =  `dd I 0 I −M

(196)

and therefore the submatrix shown in Eq. (196) has an inverse, even if Mdd is singular.    h i 0 I M `di   Carrying out the operations in Eq. (191) projects the quantities M `id B diT  `dd di I −M B    f `d  h i 0 I `id diT   and M onto M `ii and f `i respectively as B I −M `dd  γ d   h

M `ii − M `id

        `di f `d  h i i 0 0 I M I `i i `id diT diT    q¨ = f − M    B B I −M `dd  γ d  B di I −M `dd (197)

Refer to Chapter 4 for more details. Again, no inversion of mass terms is necessary to carry out the projection detailed in Eq. (197). The resulting coefficient matrix in Eq. (197) will be positive-definite if all bodies associated with independent variables have mass or if massless bodies associated with independent variables have mass projected onto them by dependent variables in application of the Schur complement. For this to occur, any zero mass pivotal elements associated with independent joint variables should lie in the kinematic path of joints associated with dependent variables that have nonzero mass. After the projection step, Eq. (197) could be solved using standard methods such as reordered LT L or LT DL matrix decomposition. Once ¨q i is known, ¨q d can be found from the third block row of Eq. (195) as ¨q d = γ d − B di q¨i

6.2

(198)

Recursive Schur Complement with Multiple Substructures

Although the reverse Schur complement method was described in context to a system of equations with one substructure, the recursive method generalizes to multiple substructures, 98

such as the excavator example. The excavator’s preconditioned system of equations is shown in Eq. (199) below with additional block partitioning. The notation B id = B diT is used to match the  M `ii  ii   M1i`ii    M `di  1i       M2i`ii   `di  M2i       M `ii  3i   `di  M3i 

superscripts of the B di matrix terms to the generalized mass matrix terms.       Mi1`ii Mi1`id Mi2`ii Mi2`id Mi3`ii Mi3`id  q¨i     f `i                     `i  `ii `id id `ii `id `ii `id i      M11 M11 B11 M12 M12 M13 M13 q ¨ f         1 1             `d d `di `dd `di `dd `di `dd      M11 M11 I M12 M12 M13 M13 q¨1  f1                        d di d     B11 I λ γ     1    1               `ii `id `ii `id id `ii `id    q¨i2   f `i M21 M21 M22 M22 B22 M23 M23 2    d  =  `d  (199) `di `dd `di `dd `di `dd   f2  q¨2  M21 M21 M22 M22 I M23 M23                     d d di          λ γ B22 I 2   2          i      `i `ii `id `ii `id `ii `id id     q ¨ f M31 M31 M32 M32 M33 M33 B33    3    3                `d  d `dd `di `dd `di `dd `di       q ¨ f I  M33 M33 M32 M32 M31 M31  3 3                 d d di λ3 γ3 B33 I

In the following discussion, only the operations applied to the coefficient matrix are shown. Note that additional operations must be applied to the right–hand–side of Eq. (199), as described in Section 6.1. To solve the system of equations using the reverse Schur complement, start from the lower right of the coefficient matrix. Using the labels from Eq. (188) let            A=          

Mii`ii

Mi1`ii

Mi1`id

M1i`ii

`ii M11

`id M11

`di `dd M1i`di M11 M11

M2i`ii

di B11

I

`ii M21

`id M21

id B11

I

Mi2`ii

Mi2`id

Mi3`ii

`ii M12

`id M12

`ii M13

`di `dd M12 M12

`di M13

`ii M22

`di `dd M2i`di M21 M21

`id M22

`di `dd M22 M22 di B22

id B22

`ii M23

I

`di M23

I

`ii `id `ii `id `ii M3i`ii M31 M31 M32 M32 M33   `di `dd `di `dd `di M3i`di M31 M31 0 M32 M32 0 M33  C= di B33

99

                     

(200)

(201)

and   `dd M33 I  D= I

(202)

The matrix product C T D−1 C is given as            C T D−1 C =           



di Mi3`id B33 `id di M13 B33 `dd di B33 M13

0 `id di M23 B33 `dd di B33 M23

0 id B33 M3i`di

id `di B33 M31

id `dd B33 M31

0

id `di B33 M32

id `dd B33 M32

0

id `di B33 M32

+

id di M23 B33

id `dd di − B33 M33 B33

                    

(203)

Carrying out one step of the reverse Schur complement as described in Eq. (191) for the coefficient matrix yields A ← A − C T D−1 C =  di Mi2`id Mi3`ii − Mi3`id B33 Mi2`ii Mii`ii Mi1`ii Mi1`id   `ii `id di `id `id id `ii `ii `ii  M13 − M`13 B33 M12 M12 B11 M1i M11 M11    `dd di `di `dd `di `di `dd M13 − M13 B33 M12 M11 I M12 M11 M1i`di    di  B11 I   `ii `id `ii `id id `ii `id di  M2i`ii M21 M21 M22 M22 B22 M23 − M23 B33   `dd di `dd `di `di `di `di `dd  I M − M M M M M M 23 B33 22 23 22 2i 21 21    di B22 I   id `ii id `di `id id `dd `ii id `di `id id `dd `ii id `di `id di id `dd di M3i`ii − B33 M3i`di M31 − B33 M31 M31 − B33 M31 M32 − B33 M32 M32 − B33 M32 M33 − (B33 M32 + M23 B33 − B33 M33 B33 )

                     

(204)

Note that the block sparsity pattern of the residual matrix is unchanged after the first step is applied. Now, consider a second application of the reverse Schur complement method. Assuming that the various quantities in the original storage space have been overwritten by the

100

assignments in Eq. (204) shown in red, then starting again with Eq. (188) as a template, let   `ii `ii `id `ii `id Mi2 Mi2   Mii Mi1 Mi1     M `ii M `ii M `id B id M `ii M `id   1i 11 11 11 12 12     `di `di `dd `di `dd M M M I M M   1i 11 11 12 12     di (205) A=  I B11     `ii `id id   M `ii M `ii M `id M22 M22 B22 21 21   2i     M `di M `di M `dd `di `dd M M I   2i 21 21 22 22   di I B22 `ii id dd dd di id `dd di D = M33 − (B33 M33 + M23 B33 − B33 M33 B33 )

(206)

h i id `ii id `di `id id `dd `ii id `di `id id `dd C = M3i`ii − B33 M3i` M31 − B33 M31 M31 − B33 M31 0 M32 − B33 M32 M32 − B33 M32 0 (207) or i h 0`id 0`ii 0`id 0`ii C = M3i0`ii M31 M31 0 M32 M32 0

(208)

and 

0`id 0`ii Mi30`ii D−1 M32 Mi30`ii D−1 M32

0`id 0`ii Mi30`ii D−1 M31 Mi30`ii D−1 M3i0`ii Mi30`ii D−1 M31

  0`ii −1 0`ii  M13 D M3i   0`di −1 0`ii M D M 3i  13  T −1  C D C=   0`ii −1 0`ii  M23 D M3i   0`di −1 0`ii M23 D M3i 

0`ii −1 0`ii 0`ii −1 0`id M13 D M31 M13 D M31

0`ii −1 0`ii 0`ii −1 0`id M13 D M32 M13 D M32

0`id 0`di −1 0`ii 0`di −1 D M31 D M31 M13 M13

0`ii 0`di −1 0`id 0`di −1 M13 D M32 M13 D M32

0 0`ii −1 0`ii 0`ii −1 0`id M23 D M31 M23 D M31

0`ii −1 0`ii 0`ii −1 0`id M23 D M32 M23 D M32

0`di −1 0`ii 0`di −1 0`id M23 D M31 M23 D M31

0`di −1 0`ii 0`di −1 0`id M23 D M32 M23 D M32

Projecting the quantity C T D−1 C onto the residual matrix, A, yields

101

                  0

(209)

          A ← A − C T D−1 C =         

Mii0`ii

Mi10`ii

Mi10`id

Mi20`ii

Mi20`id

M1i0`ii

0`ii M11

0`id id 0`ii M11 B11 M12

0`id M12

0`di 0`dd M1i0`di M11 M11

M2i0`ii

di B11

I

0`ii M21

0`id M21

0`di 0`dd M2i0`di M21 M21

I

0`di 0`dd M12 M12

0`ii M22

0`id M22

0`di 0`dd M22 M22 di B22

I

             id  B22   I   

(210)

The quantities that have been modified during the projection are highlighted in red. In the second application of the reverse Schur complement method, note that no projections were made onto the preconditioned constraint terms. The lower right block of Eq. (210) 0`dd is singular. The has the same structure as Eq. (202) and again has an inverse, even if M22

process of backward elimination continues similarly for the residual matrix. Once backward elimination is complete, the generalized accelerations and lagrange multipliers can be found by forward substitution.

6.3

Sparse Matrix Methods and Matrix Fills

Up to this point, the solution process was described using full–dense matrix algebra. Note, however, that the inverse connectivity matrix, C −1 (see for example Eq. 177), and the corresponding coefficient matrix in Eq. (199) are moderately sparse, and therefore the factorization algorithm benefits from sparse matrix methods. Additionally, the coefficient matrix is symmetric, and therefore only the terms on and below the diagonal need to be processed or stored. Furthermore, knowledge of where the projections illustrated in Eqs. (204) and (210) occur allows one to reduce the number of operations. A sparse matrix solution algorithm can be tailored to the unique structure of the preconditioned equations of motion. In sparse matrix methods, a matrix fill indicates that during the factorization process, a position that was initially zero in the coefficient matrix is filled with a non-zero value. When bodies and joints are arranged in monotonically level-increasing order, the Parent-Child list 102

can be used to factor the generalized mass matrix with zero fills using sparse LT L or LT DL decomposition [81]. Additionally, when joints are arranged in monotonically increasing levelorder, the Parent-Child list can be used to update the topology of the mechanism or evaluate the generalized mass matrix. Note that the joint ordering provided by the kinematic substructuring algorithm is not strictly monotonically, level–increasing. However, the joints are arranged in monotonically level–increasing order within each substructure. Therefore, the Parent-Child list can still be used to update quantities such as influence coefficient matrices and the generalized mass matrix. Partitioning the generalized coordinates into independent and dependent sets typically requires permuting some variables out of optimal, level–increasing order, which results in unavoidable matrix fills. When GCP is carried out on each substructure independently, any permutations of variables dictated by the algorithm remain local to a given substructure. A primary advantage to permuting variables locally within each substructure is that matrix fills are minimized.

6.4

Permutation Orders

The various permutations used to solve the preconditioned equations of motion are described in this section. Let the order of bodies and joints specified by the user be Permutation Order 0, or P O(0). The user-specified order is not used in the solution process. When the kinematic substructuing process is performed, bodies and joints are arranged in monotonically level increasing order within each substructure. Let the joint ordering provided by the kinematic substructuring algorithm be called Permutation Order 1, or P O(1). Kinematic substructuring would only be performed once at the beginning of a simulation, unless the topology of the system changes, and therefore it is typically not necessary to work between P O(0) and P O(1). The Generalized Coordinate Partitioning (GCP) algorithm, based on applying LU decomposition with complete pivoting to the kinematic substructure Jacobian matrices, provides 103

a new joint-variable ordering, with generalized coordinates ordered approximately by their increasing mechanical advantage within each substructure. From the LU decomposition with complete pivoting algorithm, the dependent variables come first, followed by the independent variables. The order provided by the GCP takes the variables out of monotonically increasing level order. Let the order provided by GCP be called Permutation Order 2, or P O(2). To eliminate the need to work between three levels during the solution process, when the matrix preconditioning process is applied, the results are stored in Permutation Order 3, which is described next. The system of equations in Eq. (199) has the independent coordinates arranged first, followed by dependent coordinates within each substructure, which indicates a final reordering of variables. Let this order used in the final solution process be called Permutation Order 3, or P O(3). To minimize matrix fills when factoring the coefficient matrix, within each substructure the independent variables are arranged in monotonically level increasing order first, followed by the dependent variables also in monotonically level increasing order. In the excavator example, each kinematic substructure has one independent variable, which for this particular example implies that, at most, one variable in each substructure is taken out of its optimal level-increasing order. The permutation orders of the various quantities used to solve the dynamics equations are indicated in the following subsections. 6.4.1

Permutation Order 1

Permutation Order 1, P O(1), is the order of variables determined by kinematic substructuring. Bodies and joints are arranged in monotonically increasing level order within each substructure. In P O(1), the parent-child list can be used to generate the generalized mass matrix. The values that are accessed in P O(1) are summarized in Table 13.

104

Table 13: Quantities accessed in P O(1) variable name

name in program

description

¨q

ddq

second time derivative of generalized coordinates



dq

first time derivative of generalized coordinates

q

q

generalized coordinates

PC

PC

parent child list

λp

lambda p

lagrange multipliers associated with primary constraint equations

f

6.4.2

f

working forces applied in the joints

Permutation Order 2

Permutation Order 2, P O(2), is the order of variables determined by generalized coordinate partitioning with dependent variables first and independent variables last within each substructure. Table 2 contains a summary of quantities which are accessed in P O(2). Table 14: Quantities accessed in P O(2) variable name

name in program

γ d , γ pc

gamma d

description relative quadratic velocity terms across constraints, note that both γ d , γ pc are evaluated and stored in the same location in computer memory in P O(2)

J

J

using fast factorization, the Jacobian matrix is set in permuted order and factored without pivoting

6.4.3

Permutation Order 3

Permutation Order 3, P O(3), is the order of variables used to solve the system of dynamics equations with independent variables first and dependent last within each kinematic substructure. Within each independent and dependent grouping in a kinematic substructure, 105

Table 15: Quantities accessed in P O(3) variable name

name in program

description

M`

Ml

λd

lambda d

f`

fl

generalized forces

B di

b3

a sparse representation of part of the preconditioned

generalized mass matrix preconditioned lagrange multipliers

constraint equations variables are arranged in level increasing order to minimize fills. The quantities stored in P O(3) are summarized in Table 15. To avoid working between levels when sparse factoring and solving the equations, the B di matrix, which forms part of the preconditioned constraint equations, is evaluated from the full dense Jacobian matrix in P O(2) and stored in a sparse representation in P O(3).

6.5

Representing Permutations

In linear algebra, permutations are represented using square, orthogonal binary matrices [82]. A column vector with quantities in P O(1) can be permuted to P O(3) by left multiplication with a suitable permutation matrix, P31

f `P O(3) = P31 f `P O(1)

(211)

Note the annihilation property of the subscripts associated with the permutation matrix. The subscripts on P31 indicate a permutation from P O(1) to P O(3). A permutation matrix contains a single 1 in each row and zeros elsewhere. To eliminate the need to process the many zeros in permutation matrices, a more compact and efficient representation is used in the computer implementation of applying permutations. Consider the permutation of variables from P O(1) to P O(2) in the example below:

106

   5       1         3

P O(2)

  0 0 1     = 1 0 0   0 1 0

21

   1       3         5

(212) P O(1)

n oT n oT where 1 3 5 is a column matrix of quantities in P O(1) and 5 1 3 is a column   0 0 1     matrix of quantities in P O(2). Then P21 = 1 0 0 represents the permutation from   0 1 0 P O(1) to P O(2). The information in P21 can be represented more compactly using a linear permutation array, ℘12 , where:

℘12

n

o = 2 0 1

(213)

The index into ℘12 corresponds to an index into the array P O(2). The value found at the index indicates where to find the data in P O(1). Consider the example of transforming generalized forces from P O(1) to P O(2) as shown in Eq. (214). f `P O(2) (i) = f `P O(1) (℘12 (i))

(214)

Here i is an index into f `P O(2) in P O(2) and the operation is carried out for all indices of f `P O(2) , in this case for i = 0:length(f l) is equivalent to range(length(f l)) using the C/Python indexing convention. In pseudocode, Eq. (214) is used to reorder f `1 according to Algorithm 8. Algorithm 8 reorderArray(℘12 , fl 1, fl 2) for i in range(length(fl 2)) do fl 2[i] ← fl 1[℘12 [i]] end for Note that the subscripts of ℘12 are reversed from those of P21 . Reversing the indices for ℘12 preserves the annihilation property of indices in Eq. (214). The index i is in P O(2), and the operation ℘12 (i) effectively converts the index i in P O(2) to P O(1). 107

6.5.1

Inverse Permutations

Permutation matrices are orthogonal, and therefore the inverse of the permutation matrix, P21 , can be found by taking its transpose as −1 T P12 = P21 = P21

(215)

The ability to convert indices in one permutation order to another can be used to find an inverse permutation array. For example, ℘−1 12 (i.e. ℘21 ) can be found according to Algorithm 9. Algorithm 9 inversePermutation(℘12 , ℘21 ) for i in range(length(℘12 )) do ℘21 [℘12 [i]] ← i end for

6.5.2

Evaluating Permutations

In the following discussion, permutation arrays will be used to set up the solution method. The generalized coordinate partitioning algorithm yields the matrix rank of each block subjacobian and the local row and column permutation arrays, ℘r12 and ℘c12 where superscripts r and c indicate respective row and column operations. A local column permutation array, ℘c12 , permutes the generalized coordinates locally within each substructure, and must be converted to a global permutation array for processing the entire multibody system, as described in Algorithms 10–14. The global variable permutation array is denoted as ℘v12 . The superscript v indicates that the permutation is applied to variables. The length of the permutation array is equal to the number of generalized coordinates, n. Converting the local permutation arrays to a single global permutation array requires knowledge of the offset of each substructure in the global list of generalized coordinates, which is computed and stored when kinematic substructuring is initially performed. Identity permutation arrays are initialized according to Algorithm 10.

108

Algorithm 10 setIdentity(℘, n) for i in range(n) do ℘[i] ← i end for A global ℘v12 array is initialized to identity using Algorithm 10 and evaluated based on the result of generalized coordinate partitioning for each substructure according to Algorithm 11. Algorithm 11 setPV12(substructures, pv 12) for substructure in substructures do for j in range(substructure.dq count) do pv 12[substructure.dq offset + j] ← substructure.pc 12[j] + substructure.dq offset end for end for

An is dependent array is initialized to false and then evaluated according to Algorithm 12. Algorithm 12 setIsDependent(is dependent, substructures, pv 12) for substructure in substructures do for j in range(substructure.rank) do is dependent[pv 12[substructure.dq offset + j]] ← true end for end for

In Algorithm 12 dq offset is the offset for each substructure and rank is the matrix rank of the local block Jacobian found from LU decomposition with complete row and column pivoting. Next, the global permutation array, ℘v13 is initialized using Algorithm 10 and for each substructure is evaluated according to Algorithm 13. In Algorithm 13, the index k corresponds to a global index in the variable permutation array in P O(3). 109

The algorithm moves all independent variables first within each substructure, followed by dependent variables within each substructure. Algorithm 13 setPV13(is dependent, substructures, pv 13) k ← offset for i in range(substructure.dq offset, substructure.dq count + substructure.dq offset) do if not is dependent[i] then pv 13[k] ← i k←k+1 end if end for for i in range(substructure.dq offset, substructure.dq count + substructure.dq offset) do if is dependent[i] then pv 13[k] ← i k←k+1 end if end for In Algorithm 13, dq count refers to the column dimension of the block submatrix of the current kinematic substructure being processed. Now that ℘v12 and ℘v13 are known, all other permutation arrays necessary to work between the various permutation orders can be found. First ℘21 is found from ℘12 using Algorithm 10. Then the permutation arrays ℘v23 , ℘v32 and ℘v13 can be found using the annihilation of indices properties of permutation arrays according to Algorithm 14. Algorithm 14 setPermutations(pv 31, pv 13, pv 32, pv 23, pv 21, pv 12) for i in range(n) do pv 31[pv 13[i]] = i pv 23[i] = pv 21[pv 13[i]] pv 32[pv 23[i]] = i end for 110

6.6 6.6.1

Evaluating the Generalized Equations of Motion Matrix Storage

As the coefficient matrix, M ` is symmetric, memory for only the lower triangular portion need be allocated. Memory space for the lower triangular portion of the coefficient matrix M ` is allocated as a linear array with a size of ((n + 1) ∗ (n + 2))/2. An array of length (n(n + 1))/2 is sufficient to hold the terms on and below the diagonal of a full, dense array. In the solution process, the extra row is used as a workspace. 6.6.2

Evaluating the Generalized Mass Matrix

Recall that the generalized mass matrix, M ` is obtained as M ` = H T C −T M C −1 H

(216)

where the matrix M is block diagonal with 6×6 body inertia matrices along its diagonal, H is a block diagonal matrix with influence coefficient matrices along its diagonal, and C is a square, nonsingular connectivity matrix. The sparse matrix C is never set up, but may be envisioned as formed by element–wise replacement of 1s and -1s in ΓT with positive or negative 6 × 6 identity matrices. If Γ is square and nonsingular, Γ−T is the oriented path matrix, where each row shows joints that appear in the kinematic path from the base body to the corresponding body [28]. The path matrix Γ−T contains the following entries    +1 if joint h is contained in the path from the base body to body b,        and if the orientation of joint h is aligned with the orientation of the path     Γ−T (b, h) = −1 if joint h is contained in the path from the base body to body b,        and if joint h is aligned opposite the orientation of the path       0 if joint h is not contained in the path from the base body to body b (217) If the sense of all joints are aligned with the root-oriented spanning tree, as described in Chapter 5, the terms of Γ−T will consist of only positive ones and zeros. Recall that in 111

Chapter 5, a method to automatically flip the user–defined sense of joints to align with the spanning tree was presented that ensures that all entries in Γ−T and C −T are positive. Consider the example of the brake mechanism discussed in Chapter 11.1. The brake mechanism consists of a number of massless bodies to represent multi-degree of freedom joints. After applying the cut–body approach described in Chapter 5, and with the addition of three virtual, massless bodies, the block diagonal mass matrix has the form  M  0   M1    M2    0    0    M = M5    M6    0    0     0 

                               0

(218)

Here the subscript corresponds to the body’s index in the spanning tree with members in

112

level increasing order. The path matrix, C −1 is   I     I I      I  I     I  I I       I I I      C −1 = I  I     I  I I     I I  I     I  I I       I I I I    I I I I and the matrix product C −T M C −1 has the form  M + M1 + M2 + M5 + M6 M1 M2 + M6  0   M1 M1 0    M2 + M6 0 M2 + M6    0 0 0    0 0 0    C −T M C −1 =  M5 0 0    M6 0 M6    0 0 0    0 0 0     0 0 0  0 0 0

(219)

 0 0 M5 M6 0 0 0 0   0 0 0 0 0 0 0 0   0 0 0 M6 0 0 0 0   0 0 0 0 0 0 0 0    0 0 0 0 0 0 0 0   0 0 M5 0 0 0 0 0   0 0 0 M6 0 0 0 0   0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0    0 0 0 0 0 0 0 0  0 0 0 0 0 0 0 0 (220)

The terms on the diagonal of Eq. (220) are sometimes referred to as accumulated inertia matrices. Note that the sparsity pattern of Eq. (220) reproduces symmetrically the sparsity pattern of C −1 , with the exception of the massless bodies (both virtual and user– 113

specified), which have no contributions. When a user–specified massless body has one or more descendant bodies with mass and inertia, there will be projected inertia matrices from its descendant bodies into the block row and column corresponding to that massless body. In the brake example, the massless bodies have no descendant bodies and no mass is projected onto them. The next step in evaluating the generalized mass matrix involves carrying out the matrix product H T C −T M C −1 H, which effectively projects inertia expressed in the cartesian frame onto the joint subspace. The matrix product for the brake example is M ` = H T C −T M C −1 H =  hT (M0 + M1 + M2 + M5 + M6 )h0 hT0 (M1 )h1 hT0 (M2 + M6 )h2  0   hT1 (M1 )h1 0 hT1 (M1 )h0   T T  h2 (M2 + M6 )h0 0 h2 (M2 + M6 )h2    0 0 0    0 0 0    T  h5 (M5 )h0 0 0   T T  h6 (M6 )h0 0 h6 (M6 )h2    0 0 0    0 0 0     0 0 0  0 0 0

 0 0 hT0 (M5 )h5 hT0 (M6 )h6 0 0 0 0   0 0 0 0 0 0 0 0   0 0 0 hT2 (M6 )h6 0 0 0 0   0 0 0 0 0 0 0 0    0 0 0 0 0 0 0 0   0 0 hT5 (M5 )h5 0 0 0 0 0   0 0 0 hT6 (M6 )h6 0 0 0 0   0 0 0 0 0 0 0 0   0 0 0 0 0 0 0 0    0 0 0 0 0 0 0 0  0 0 0 0 0 0 0 0

(221)

While the connectivity matrix and its inverse contain all the information necessary to update the generalized mass matrix, processing its many zeros is inefficient. As discussed in Chapter 5, a more compact representation of a mechanism’s topology is the parent-child list, P C. As noted in the tables above, the coefficient matrix is evaluated in P O(3), but P C and computational quantities related to bodies and joints are stored in P O(1). To take advantage of the parent–child list in evaluating the generalized mass matrix, all operations are carried out in P O(1) and the final assignments are made in P O(3). Note that virtual, massless bodies (as well as any user defined massless bodies) have no contribution to the generalized mass matrix, and therefore no additional operations are required to process those bodies. A boolean has mass array equal in length to the number 114

of bodies is used to determine if a body requires processing. Additionally, the 6 × 6 inertia matrices (for example, M1 -M5 in Eq. (218)) are stored in a linear array called inertia of length n ∗ 10, with ten entries for each body including the mass, m, mass times a vector to the body’s center of mass, mr, and the rotational inertia terms (Ixx , Iyy , Izz , Ixy , Iyz and Izx ), expressed relative to the center of mass. An array called inertia p is used to store the accumulated inertia matrices shown in Eq. (220). A temporary array called M H product contains intermediate quantities in Eq. (221) in P O(1) order. The intermediate quantities stored in M H product are of the form Mip hi . The unique nonzero spatial inertia matrices stored in the inertia p array correspond to those accumulated matrices on the block diagonal of the matrix in Eq. (220). For each nonzero accumulated mass matrix term (Mip ) on the diagonal of Eq. (220), multiply the corresponding quantity stored in the inertia p array with the corresponding hi matrix stored in the H array and store the result in the M H product array. There is no recursion here so this process is easily vectorized. The following section describes the minimum operations required to evaluate the lower triangular portion of the matrix product H T C −T M C −1 H in P O(1). First the generalized mass matrix, M ` (M l) is initialized to zero and the accumulated inertia matrices are computed using the Parent Child list according to Algorithm 15. Algorithm 15 accumulatedInertias(P C, inertia, inertia p, n) copy inertia to inertia p for j in range(n - 1, -1, -1) do i = PC[j] if i > -1 then inertia p[i] += inertia p[j] end if end for

The inertia array copied into the inertia p array in Algorithm 15 is an array of inertia matrices that have been transformed to the inertial reference frame according to Section 3.2.5. An additional savings can be achieved by noting that processing influence coefficient matrices for translational joints in Eq. (221) requires fewer operations – as the last three 115

entries will always be zero. Therefore, a boolean array called is translation, equal in length to the number of joint variables, is used to specify whether a variable corresponds to a translational (true) or rotational (false) degree of freedom. Note that any operation that makes use of C −1 implies a traversal of the mechanism’s spanning tree. The above operation projects inertia onto the parent of a given body, starting from the leaves of the spanning tree and working towards the root, or fixed body. Note that parallelism may be applied when traversing a multibody system’s spanning tree, however threads must be collected at each branch of the tree, requiring significant overhead in synchronization. Large multibody systems with many branches in the spanning tree, such as molecular dynamics simulations, may benefit from parallelism, however in the author’s experience, due to the overhead in synchronizing the traversal of the spanning tree, applying parallelism to tree traversal is expected to offer little benefit for the moderately complex mechanisms typically of interest to industry. There is, however, an opportunity for parallelism in setting up dynamics equations that requires minimal synchronization between threads. The operations described below that do not use the Parent Child list may be performed in parallel. In the operations that do not require the parent child list for synchronization, the n bodies and joints may be distributed between multiple processors, requiring no synchronization during processing. The threads must be joined before proceeding to the next operation. In the simulation results presented later in the document, all of the operations are performed serially, but the opportunity for parallelism is mentioned here as a candidate for speed improvement in a future generation of the code. In Algorithm 16, the terms along the diagonal of the matrix product (C −T M C −1 )H are evaluated. Full dense matrix algebra is not necessary to carry out any of the products. Algorithm 16 describes the minimum number of operations required to carry out the matrix product. The matrix product H T C −T M C −1 H is evaluated in P O(1) and stored in P O(3) using the ℘v31 (pv 31) permutation array. This operation is carried out in level decreasing order in 116

Algorithm 16 setMhProduct(mhProduct, inertia p, n, h, is rotation) for j in range(n) do mhProduct[j][0] ← mhProduct[j][0] + inertia p.m * h[j][0] mhProduct[j][1] ← mhProduct[j][1] + inertia p.m * h[j][1] mhProduct[j][2] ← mhProduct[j][2] + inertia p.m * h[j][2] mhProduct[j].m5 ← mhProduct[j].m5 - inertia p[j].mt.y * h[j][0] mhProduct[j].m4 ← mhProduct[j].m4 + inertia p[j].mt.z * h[j][0] mhProduct[j].m3 ← mhProduct[j].m3 - inertia p[j].mt.z * h[j][1] mhProduct[j].m5 ← mhProduct[j].m5 + inertia p[j].mt.x * h[j][1] mhProduct[j].m3 ← mhProduct[j].m3 + inertia p[j].mt.y * h[j][2] mhProduct[j].m4 ← mhProduct[j].m4 - inertia p[j].mt.x * h[j][2] if is rotation(j) then mhProduct[j][1] ← mhProduct[j][1] - inertia p[j].mt.z * h[j][3] mhProduct[j][2] ← mhProduct[j][2] + inertia p[j].mt.y * h[j][3] mhProduct[j][3] ← mhProduct[j][3] + inertia p[j].Ixx * h[j][3] mhProduct[j][4] ← mhProduct[j][4] + inertia p[j].Ixy * h[j][3] mhProduct[j][5] ← mhProduct[j][5] + inertia p[j].Izx * h[j][3] mhProduct[j][0] ← mhProduct[j][0] + inertia p[j].mt.z * h[j][4] mhProduct[j][2] ← mhProduct[j][2] - inertia p[j].mt.x * h[j][4] mhProduct[j][3] ← mhProduct[j][3] + inertia p[j].Ixy * h[j][4] mhProduct[j][4] ← mhProduct[j][4] + inertia p[j].Iyy * h[j][4] mhProduct[j][5] ← mhProduct[j][5] + inertia p[j].Iyz * h[j][4] mhProduct[j][0] ← mhProduct[j][0] - inertia p[j].mt.y * h[j][5] mhProduct[j][1] ← mhProduct[j][1] + inertia p[j].mt.x * h[j][5] mhProduct[j][3] ← mhProduct[j][3] + inertia p[j].Izx * h[j][5] mhProduct[j][4] ← mhProduct[j][4] + inertia p[j].Iyz * h[j][5] mhProduct[j][5] ← mhProduct[j][5] + inertia p[j].Izz * h[j][5] end if end for

117

P O(1) from the leaves of the mechanism to the root node using the Parent Child list and the final assignment is stored in the generalized mass matrix M ` in P O(3) Algorithm 17 performs a number of important tasks to insure that entries of M ` evaluated in P O(1) are stored correctly in P O(3) in the linear array Ml. In addition it sets up a linear Boolean array Ml b to record the sparsity pattern of Ml in P O(3) as it is evaluated. Note that the boolean matrix Ml b need only be evaluated when a new split of dependent and independent variables is indicated by the heuristic application of generalized coordinate partitioning described in Chapter 5, and therefore a second, simpler version of Algorithm 17 can be applied to set the mass matrix when the fast–factorization method is used. In the interest of brevity, the second simpler version is not shown; it can be easily obtained from Algorithm 17 by removing any operations associated with assignments to the Boolean array Ml b. The outer for loop in Algorithm 17 evaluates one row of M ` at a time from bottom to top and from diagonal left in P O(1). If hasMass[j] is false then the jth row and column of M ` are entirely zero and no terms will be evaluated. Otherwise the process begins to set up for evaluating the jth row of M ` in P O(1) and storing it in P O(3). m 3 points to the row (and possibly column) of M ` in P O(3) where the permuted entries will be stored. ks 3 points to the position in Ml and Ml b corresponding to the first column in row m 3 of M ` in P O(3). k 3 points to the position in Ml and Ml b corresponding to the diagonal entry in row m 3 of M ` in P O(3). Ml b[k 3] is set to true because there will always be a nonzero entry on the diagonal. mhProduct[j] contains inertia p[j] * h[j] evaluated in Algorithm 16. The minimum number of operations required to perform an inner product between a screw and a coscrew are carried out based on whether h[i] is associated with a rotational or translational joint. Once the diagonal entry is set the algorithm works its way leftward in row j in P O(1) by first setting i ← P C[j] to the parent of j and then recursing in the while loop with i ← P C[i] until reaching the left edge of M ` , which finishes the row. Index i always represents the current column in M ` being processed in P O(1) and i 3 identifies the column in P O(3).

118

Row m 3 and column i 3 give location in M ` in P O(3) where the current entry in row j and column i in M ` in P O(3) will go. If row m 3 is greater than column i 3 the permuted entry lies below the diagonal of M ` , and it will be stored at k 3 ← ks 3 + i 3 in Ml. Otherwise row m 3 is less than column i 3, and the permuted entry must be reflected across the diagonal. This alternate assignment will be stored at k 3 = (i 3 * (i 3 + 1)) / 2 + p v 31[j] in Ml. To minimize storage requirements for the system of equations, the components of the coefficient matrix (i.e. the generalized mass matrix and the constraint equations for each substructure) are stored in separate datastructures. 6.6.3

Evaluating Right Hand Side

Similarly, the right hand side of the equations of motion can be evaluated in P O(1) and stored in P O(3). The generalized forces, f ` , are found as

f ` = f + H T C −T (g∗f − [ad]∗vf M vf − M C −1 γ)

(222)

The operation in Eq. (222) can be broken down into the following steps. Initialize the generalized forces, f ` (f l), to zero. Then, carry out the operation

γ ← C −1 γ

(223)

according to Algorithm 18 using in–place assignment. Next, carry out the operation g∗ ← g∗ − M γ

(224)

These operations are carried out using the minimal number of operations according to Algorithm 19.

119

Algorithm 17 setMl(mhProduct, inertia p, n, h, is rotation, Ml, PC, p v 31) zero lower triangular matrix Ml for j in range(n - 1, -1, -1) do if has mass[j] then m 3 ← p v 31[j] ks 3 ← (m 3 * (m 3 + 1)) / 2 k 3 ← ks 3 + m 3 Ml b[k 3] ← true for l in range(3) do Ml b[k 3] ← Ml b[k 3] + mhProduct[m 1][l] * h[j][l] if not is translation[j] then for l in range(3, 6) do Ml[k 3] ← Ml[k 3] + mhProduct[m 1][l] * h[j][l] i ← PC[j] while i > -1 do i 3 ← p v 31[i] if m 3 > i 3 then k 3 ← ks 3 + p v 31[i 1] else ks alt = (i 3 * (i 3 + 1)) / 2 k 3 = ks alt + p v 31[m 1] end if Ml b[k 3] ← true for l in range(3) do Ml[k 3] ← Ml[k 3] + mhProduct[m 1][l] * h[i][l] if not is translation[i] then for l in range(3, 6) do Ml[k 3] ← Ml[k 3] + mhProduct[m 1][l] * h[i][l] end for end if end for i = PC[i] end while end for end if end for end if end for

120

Algorithm 18 accumulatedGamma(P C, gamma d, n) for j in range(n - 1, -1, -1) do i = PC[j] if i > -1 then gamma d[i] += gamma d[j] end if end for

Algorithm 19 subtractMGamma(inertia, n, aq, g, f l zero array f l for j in range(n) do if has mass[j] then g[j][0] ← g[j][0] - inertia[j].m * aq[j].m0 g[j][1] ← g[j][1] - inertia[j].m * aq[j].m1 g[j][2] ← g[j][2] - inertia[j].m * aq[j].m2 g[j][4] ← g[j][4] - inertia[j].mt.z * aq[j].m0 g[j][5] ← g[j][5] + inertia[j].mt.y * aq[j].m0 g[j][3] ← g[j][3] + inertia[j].mt.z * aq[j].m1 g[j][5] ← g[j][5] - inertia[j].mt.x * aq[j].m1 g[j][3] ← g[j][3] - inertia[j].mt.y * aq[j].m2 g[j][4] ← g[j][4] + inertia[j].mt.x * aq[j].m2 g[j][1] ← g[j][1] + inertia[j].mt.z * aq[j].m3 g[j][2] ← g[j][2] - inertia[j].mt.y * aq[j].m3 g[j][3] ← g[j][3] - inertia[j].Ixx * aq[j].m3 g[j][4] ← g[j][4] - inertia[j].Ixy * aq[j].m3 g[j][5] ← g[j][5] - inertia[j].Izx * aq[j].m3 g[j][0] ← g[j][0] - inertia[j].mt.z * aq[j].m4 g[j][2] ← g[j][2] + inertia[j].mt.x * aq[j].m4 g[j][3] ← g[j][3] - inertia[j].Ixy * aq[j].m4 g[j][4] ← g[j][4] - inertia[j].Iyy * aq[j].m4 g[j][5] ← g[j][5] - inertia[j].Iyz * aq[j].m4 g[j][0] ← g[j][0] + inertia[j].mt.y * aq[j].m5 g[j][1] ← g[j][1] - inertia[j].mt.x * aq[j].m5 g[j][3] ← g[j][3] - inertia[j].Izx * aq[j].m5 g[j][4] ← g[j][4] - inertia[j].Iyz * aq[j].m5 g[j][5] ← g[j][5] - inertia[j].Izz * aq[j].m5 end if end for

Next, the generalized momentum is updated as p∗f ← M vf 121

(225)

This operation is also carried out in P O(1) and summarized in Algorithm 20. Algorithm 20 momentum(p, inertia, n, aq, g) zero array p for j in range(n) do if has mass[j] then p[j][0] ← p[j][0] + inertia[j].m * v[j][0] p[j][1] ← p[j][1] + inertia[j].m * v[j][1] p[j][2] ← p[j][2] + inertia[j].m * v[j][2] p[j][4] ← p[j][4] + inertia[j].mt.z * v[j][0] p[j][5] ← p[j][5] - inertia[j].mt.y * v[j][0] p[j][3] ← p[j][3] - inertia[j].mt.z * v[j][1] p[j][5] ← p[j][5] + inertia[j].mt.x * v[j][1] p[j][3] ← p[j][3] + inertia[j].mt.y * v[j][2] p[j][4] ← p[j][4] - inertia[j].mt.x * v[j][2] p[j][1] ← p[j][1] - inertia[j].mt.z * v[j][3] p[j][2] ← p[j][2] + inertia[j].mt.y * v[j][3] p[j][3] ← p[j][3] + inertia[j].Ixx * v[j][3] p[j][4] ← p[j][4] + inertia[j].Ixy * v[j][3] p[j][5] ← p[j][5] + inertia[j].Izx * v[j][3] p[j][0] ← p[j][0] + inertia[j].mt.z * v[j][4] p[j][2] ← p[j][2] - inertia[j].mt.x * v[j][4] p[j][3] ← p[j][3] + inertia[j].Ixy * v[j][4] p[j][4] ← p[j][4] + inertia[j].Iyy * v[j][4] p[j][5] ← p[j][5] + inertia[j].Iyz * v[j][4] p[j][0] ← p[j][0] - inertia[j].mt.y * v[j][5] p[j][1] ← p[j][1] + inertia[j].mt.x * v[j][5] p[j][3] ← p[j][3] + inertia[j].Izx * v[j][5] p[j][4] ← p[j][4] + inertia[j].Iyz * v[j][5] p[j][5] ← p[j][5] + inertia[j].Izz * v[j][5] end if end for

Next perform the screw cross product operation [ad]∗vf p∗f and subtract the results from g∗f as g∗f ← g∗f − [ad]∗vf p∗f

(226)

using Algorithm 21. Again this operation is performed in P O(1) and may be parallelized with minimum synchronization.

122

Algorithm 21 subtractVCrossG(g, v, p, n) for j in range(n) do if has mass[n] then g[j][0] ← g[j][0] - v[j][4] * p[j][2] g[j][1] ← g[j][1] - v[j][5] * p[j][0] g[j][2] ← g[j][2] - v[j][3] * p[j][1] g[j][3] ← g[j][3] - v[j][1] * p[j][2] g[j][4] ← g[j][4] - v[j][2] * p[j][0] g[j][5] ← g[j][5] - v[j][0] * p[j][1] g[j][3] ← g[j][3] - v[j][4] * p[j][5] g[j][4] ← g[j][4] - v[j][5] * p[j][3] g[j][5] ← g[j][5] - v[j][3] * p[j][4] g[j][0] ← g[j][0] + v[j][5] * p[j][1] g[j][1] ← g[j][1] + v[j][3] * p[j][2] g[j][2] ← g[j][2] + v[j][4] * p[j][0] g[j][3] ← g[j][3] + v[j][2] * p[j][1] g[j][4] ← g[j][4] + v[j][0] * p[j][2] g[j][5] ← g[j][5] + v[j][1] * p[j][0] g[j][3] ← g[j][3] + v[j][5] * p[j][4] g[j][4] ← g[j][4] + v[j][3] * p[j][5] g[j][5] ← g[j][5] + v[j][4] * p[j][3] end if end for

123

Then perform the operation g∗f ← C −T g∗f

(227)

using the parent child list according to Algorithm 22. Algorithm 22 accumulateG(g, n, PC) for j in range(n - 1, -1, -1) do i ← PC[j] if i > -1 then for k in range(6) do g[i][k] ← g[i][k] + g[j][k] end for end if end for

The final operation f ` ← f + H T g∗f

(228)

required to update the right hand side is stored in P O(3) using the ℘v31 (pv 31) permutation array. The pseudocode to carry out Eq. (228) is summarized in Algorithm 23. Algorithm 23 setfl(fl, f, h, g, n, pv 31) for j in range(n) do k ← pv 31[j] fl[k] ← f[j] for l in range(3) do fl[k] ← fl[k] + h[j][l] * g[j][l] if not is translation[j] then for l in range(3, 6) do fl[k] ← fl[k] + h[j][l] * g[j][l] end for end if end for end for

6.6.4

Evaluating Constraints

The next step in setting up the equations of motion involves evaluating the constraints for each substructure. As mentioned above, each Jacobian matrix is initially evaluated in 124

P O(1). Generalized coordinate partitioning yields P O(2) and the equations of motion are solved in P O(3). In the following discussion, it is assumed that a first LU decomposition with complete row and column pivoting has been performed, and the various global permutation arrays have been generated according to Section 6.5.2. As described in Chapter 5, after an initial call to generalized coordinate partitioning, subsequent evaluations of the Jacobian matrix are performed in P O(2). Preconditioning of the constraint equations is performed according to Section 3.4, and the preconditioned results are stored in P O(3). As noted in Section 3.4, each set of preconditioned constraint equations contains a B di matrix with a row dimension equal to the number of dependent variables and a column dimension equal to the number of independent variables. To take advantage of sparsity, only the nonzero terms of each substructure’s B di matrix are stored in a linear array in row major order. In order to evaluate the B di matrix’s sparsity pattern in P O(3), it is first necessary to determine the sparsity pattern of each block subjacobian in P O(1), which is performed using Algorithm 24 for each substructure. Algorithm 24 is called only once after LU decomposition with complete row and column pivoting. Thereafter, the mapping of the B di matrix terms to the sparse B di array is available for subsequent reuse. In Algorithm 24, J is the local block subjacobian corresponding to the given substructure, J b is a boolean matrix with True (or a nonzero value) indicating a nonzero value and False (or a zero value) indicating a zero value at the corresponding position in the block subjacobian. row dim and col dim indicate the block subjacobian’s row and column dimension, respectively. rank, pr 21 and pc 21 are the inverse linear row and column arrays generated by an initial call to LU decomposition with complete row and column pivoting. Additionally, the influence coefficient matrices that comprise the elements of the Jacobian matrix are evaluated and stored in the positive loop members and negative loop members data structures and an offset into the global list of joint variables is stored. Refer to Chapter 5 for additional information. Note that pr 21 and pc 21 are used to map positive loop members and negative loop members (which are stored in P O(1)) to J and J b, which are stored in P O(2).

125

Algorithm 24 evaluateJacobianBoolean(substructures) for substructure in substructures do zero substructure.J b for i in range(length(substructure.positive loop members)) do for for j in range(length(substructure.positive loop members.count)) do for k in range(6) do joint ind ← substructure.positive loop members[i][j] i2 ← pr 21[i * 6 + k] j2 ← pc 21[joint ind] - substructure.dq offset substructure.J b[i2, j2] ← true end for end for end for for i in range(length(negative loop members)) do for j in range(negative loop members[i])) do for k in range(6) do joint ind ← negative loop members[i][j] i2 ← pr 21[i * 6 + k] j2 ← pc 21[joint ind] - substructure.dq offset substructure.J b[i2][j2] ← true end for end for end for end for

126

Once the Jacobian matrices are evaluated in P O(2), they are symbolically factored without pivoting, and matrix fills are recorded in J b according to Algorithm 25. Algorithm 25 symbolicFactor(substructures) for substructure in substructures do for i in range(substructure.rank) do for j in range(i + 1, substructure.rank) do if substructure.J b[j, i] then for k in range(i + 1, substructure.col dim) do if substructure.J b[i, k] then substructure.J b[j, k] ← true end if end for end if end for end for for i in range(substructure.rank - 1, -1, -1) do for j in range(substructure.rank - 1, i, -1) do if substructure.J b[i, j] then for k in range(substructure.rank, substructure.col dim) do if substructure.J b[j, k] then substructure.J b[i, k] ← true end if end for end if end for end for end for

Once the sparsity pattern of the factored J b has been determined for each substructure, the mapping of the full dense representation in P O(2) to a sparse representation of B di with nonzero entries stored in contiguous memory accessible in P O(3) can be found according to Algorithms 26 and 27. First, the number of nonzero entries of each substructure’s B matrix are found using Algorithm 26.

127

Algorithm 26 countNonZeroB(substructures) b offset ← 0 for substructure in substructures do substructure.b offset ← b offset nnzb ← 0 for i in range(substructure.rank) do for j in range(substructure.rank, substructure.col dim) do if substructure.J b[i][j] then nnzb ← nnzb + 1 end if end for end for substructure.nnzb ← nnzb b offset ← b offset + (substructure.col dim - substructure.rank) * substructure.rank end for b3 ← zeros(b offset)

Memory is then allocated for the arrays ij mapping, rb3, and cb3. The array ij mapping contains the mapping of the position of nonzero entries of the B di matrix of the full dense Jacobian matrix stored in P O(2). The array rb3 contains row ranges of the nonzero entries in the b3 matrix. The array cb3 contains the column index at the specified index. After the analysis is complete, the mapping of full dense B di in PO(2) to a sparse representation in P O(3) is set. In Algorithm 27, ij mapping po3 is a temporary array used to set up the various permutations.

128

Algorithm 27 setMapping(substructures, rb3, b3, cb3) for substructure in substructures do initialize ij mapping po3[substructure.dqCount * substructure.dqCount] initialize substructure.ij mapping[substructure.row dim * substructure.col dim] set all values in ij mapping po3 and substructure.ij mapping to -1 generate a local permutation array from PO(3) to PO(2): pv local 23[substructure.dqCount] for i in range(substructure.dqCount) do pv local 23[i] = pv 23[i + substructure.dqOffset] - substructure.dqOffset end for k ← substructure.bOffset for i3 in range(substructure.dqCount - substructure.rank, substructure.dqCount) do i2 ← pv local 23[i3] rb3 index ← i3 + substructure.dqOffset rb3[rb3 index][0] ← k for j3 in range(substructure.dqCount - substructure.rank) do j2 ← pv local 23[j3] if substructure.J b[i2][j2]) then ij mapping po3[i3][j3] ← k cb3[k] ← j3 + substructure.dqOffset k←k+1 end if end for rb3[rb3Index][1] ← k end for for i3 in range(substructure.dqCount - substructure-¿rank, substructure.dqCount) do i2 ← pv local 23[i3] for j3 in range(substructure.dqCount - substructure.rank) do j2 ← pv local 23[j3] substructure.ij mapping[i2][j2] ← ij mapping po3[i3][j3] end for end for end for

6.6.5

Setting up Sparse Factor and Solve

In this section, the mechanics of setting up the sparse matrix factorization and solve algorithm for the preconditioned equations of motion with substructures is described. These operations need only be carried out at the beginning of a simulation or after a reconfiguration event has occurred and a new split of dependent and independent variables is specified as

129

described in Chapter 5. Setting up the sparse matrix solution method requires the Boolean mass matrix that was set up in the previous sections. The following indices and corresponding operations used to sparse factor and solve the preconditioned equations of motion with substructures are described in Tables 16 and 17. The arrays i1[] – i8[], j1[] – j6[] and p1[] - p3[] hold indices used in the factorization process. A first pass through the factorization and solve process determines the size of these index arrays using the counters described in Table 18. Table 16: Indices used for matrix factorization index

operation

l1

Ml[i1[l1]] -= M[i2[l1]] * B[i3[l1]]

l2

Ml[p] = M[i5[l2]], M[i5[l2]] *= Ml[i4[i]]

l3

Ml[i6[l3]] -= M[i7[l3]] * M[i8[l3]]

Table 17: Indices for backward elimination/forward substitution index

operation

m1

fl[j1[m1]] -= B[j2[m1]] * fl[i], ddq[i] -= B[j2[m1]] * ddq[j1[m1]]

m2

fl[j3[m2]] -= Ml[j4[m2]] * ga[i], la[i] -= M[j4[m2]] * ddq[j3[m2]]

m3

fl[j5[m3]] -= Ml[j6[m3]] * fl[i], ddq[i] -= Ml[j6[m3]] * ddq[j5[m3]]

130

Table 18: Counters for size of index arrays counter

index arrays

cl1

i1[], i2[], i3[]

cl2

i5[]

cl3

i6[], i7[], i8[]

cm1

j1[], j2[]

cm2

j3[], j4[]

cm3

j5[], j6[]

The quantities ni and nd represent the number of independent and dependent variables found in an initial call to LU decomposition with complete row and column pivoting for each substructure. The index arrays n1[]–n3[] and p1[]–p3[] are sized according to ni and nd as n1[nd], n2[ni], n3[ni], p1[nd], p2[nd] and p3[ni]. The arrays n1[], p1[] and p2[] hold operation counts for each dependent variable. The arrays n2[], n3[] and p3[] hold operation counts for each independent variable. The generalized mass matrix is initialized as a lower triangular matrix with size ((n + 1) ∗ (n + 2))/2 where n is the number of bodies. The last row of Ml is used as a work array during the factorization process.

131

Algorithm 28 countOperations(Mlb, n, ni, nd) initialize a list of independent variables in a substructure that are coupled to the current dependent variable being eliminated: coupledList[dqCount] copy Mlb to Mlbs for later reuse: Mlbs ← Mlb initialize a list of independent variables in a substructure that are not coupled to the current dependent variable being eliminated: nonCoupledList[dqCount] set counters cl1, cl2, cl3, cm1, cm2 and cm3 to zero startOfWorkInM ← (n * (n + 1)) / 2 for i in range(dq count - 1, -1, -1) do startOfRowiInM ← i * (i + 1) / 2 endOfRowiInM ← startOfRowiInM + i i4[i] ← endOfRowiInM if is dependent po3[i] then Initialize incremental counters cl1, cm1 and cm2 for current variable: cl1 ← 0, cm1 ← 0, cm2 ← 0 Initialize counter for number of independent variables in substructure that are coupled to current dependent variable being eliminated: coupled dq count ← 0 Initialize counter for number of independent variables in substructure that are not coupled to current dependent variable being eliminated: noncoupled dq count ← 0 for k in range(rb3[i][0], rb3[i][1]) do col index ← cb3[k] if Mlb[startOfRowiInM + colIndex] then coupledList[coupled dq Count] ← col index coupled dq count ← coupled dq count + 1 else nonCoupledList[noncoupled dq count] ← col index noncoupleddqCount ← noncoupled dq count + 1 end if end for unprocesseddqCount ← coupleddqCount + nonCoupleddqCount while unprocesseddqCount do if coupleddqCount then coupleddqCount ← coupleddqCount - 1 k ← coupledList[coupleddqCount] else nonCoupleddqCount ← nonCoupleddqCount - 1 k ← nonCoupledList[nonCoupleddqCount] end if unprocesseddqcount ← unprocesseddqCount - 1 startOfRowkInM = k * (k + 1) / 2

132

if Mlb[endOfRowiInM] then for j in range(i - 1, -1, -1) do i2s ← startOfRowiInM + j if Mlb[i2s] then if k >= j then i1s ← startOfRowkInM + j else i1s ← j * (j + 1) / 2 + k end if Mlb[i1s] ← true l1 ← l1 + 1 cl1 ← cl1 + 1 end if end for i1s ← startOfRowiInM + k Mlb[i1s] ← true i2s ← startOfRowiInM + i l1 ← l1 + 1 cl1 ← cl1 + 1 i1s ← startOfRowkInM + k Mlb[i1s] ← true l1 ← l1 + 1 cl1 ← cl1 + 1 end if m1 ← m1 + 1 cm1 ← cm1 + 1 j←0 while j < nonCoupleddqCount do k ← nonCoupledList[j] if Mlb[startOfRowiInM + k] then coupledList[coupleddqCount] ← k coupleddqCount ← coupleddqCount + 1 for k in range(j, nonCoupleddqCount - 1) do nonCoupledList[k] ← nonCoupledList[k + 1] end for nonCoupleddqCount ← nonCoupleddqCount - 1 j←j-1 end if j←j+1 end while end while

133

for j in range(i - 1, -1, -1) do pMij ← startOfRowiInM + j if Mlb[pMij] then m2 ← m2 + 1 cm2 ← cm2 + 1 end if end for n1[nd] ← cl1 p1[nd] ← cm1 p2[nd] ← cm2 nd ← nd + 1 else cl2 ← startOfWorkInM cl3 ← 0 cm3 ← 0 for j in range(i - 1, -1, -1) do if Mlb[startOfRowiInM + j] then startOfRowjInM ← j * (j + 1) / 2 for k in range(j, -1, -1) do i8s ← startOfRowiInM + k if Mlb[i8s] then i6s ← startOfRowjInM + k Mlb[i6s] ← true l3 ← l3 + 1 cl3 ← cl3 + 1 end if end for p←p+1 m3 ← m3 + 1 cm3 ← cm3 + 1 l2 ← l2 + 1 cl2 ← cl2 + 1 end if end for n2[ni] ← cl2 n3[ni] ← cl3 p3[ni] ← cm3 ni ← ni + 1 end if end for size index arrays Mlb ← Mlbs

134

Algorithm 29 recordOperations(Ml) initialize counters: l1 ← 0, l2 ← 0, l3 ← 0, m1 ← 0, m2 ← 0, m3 ← 0 initialize coupledList[n] initialize coupledIndexList[n] initialize nonCoupledList[n] initialize nonCoupledIndexList[n] Begin backward recursion for matrix factorization for i in range(n - 1, -1, -1) do startOfRowiInM ← i * (i + 1) / 2 endOfRowiInM ← startOfRowiInM + i if is dependent po3[i] then coupleddqCount ← 0 nonCoupleddqCount ← 0 for k in range (rb3[i][0], rb3[i][1]) do col index ← cb3[k] if Mlb[startOfRowiInM + colIndex] then coupledList[coupleddqCount] ← colIndex coupledIndexList[coupleddqCount] ← k coupleddqCount ← coupleddqCount + 1 else nonCoupledList[nonCoupleddqCount] ← colIndex nonCoupledIndexList[nonCoupleddqCount] ← k nonCoupleddqCount ← nonCoupleddqCount + 1 end if end for unprocesseddqCount ← coupleddqCount + nonCoupleddqCount

135

while unprocesseddqCount do if coupleddqCount then coupledddqCount ← coupleddqCount - 1 k ← coupledList[coupleddqCount] kIndexInB ← coupledIndexList[coupleddqCount] else nonCoupleddqCount ← nonCoupleddqCount - 1 k ← nonCoupledList[nonCoupleddqCount] kIndexInB ← nonCoupledIndexList[nonCoupleddqCount] end if unprocesseddqCount ← unprocesseddqCount - 1 startOfRowkInM ← k * (k + 1) / 2 if Mlb[endOfRowiInM] then for j in range(i - 1, -1, -1) do i2[l1] ← startOfRowiInM + j if Mlb[i2[l1]] then if k >= j then i1[l1] ← startOfRowkInM + j else i1[l1] ← j * (j + 1) / 2 + k end if Mlb[i1[l1]] ← true i3[l1] ← kIndexInB l1 ← l1 + 1 end if end for i1[l1] ← startOfRowiInM + k Mlb[i1[l1]] ← true i2[l1] ← startOfRowiInM + i i3[l1] ← kIndexInB l1 ← l1 + 1 i1[l1] ← startOfRowkInM + k Mlb[i1[l1]] ← true i2[l1] ← startOfRowiInM + k i3[l1] ← kIndexInB l1 ← l1 + 1 end if j1[m1] ← k j2[m1] ← kIndexInB m1 ← m1 + 1 j←0

136

while j < nonCoupleddqCount do k ← nonCoupledList[j] if Mlb[startOfRowiInM + k] then coupledList[coupleddqCount] ← k coupleddqCount ← coupleddqCount + 1 for k in range(j, nonCoupleddqCount - 1) do nonCoupledList[k] ← nonCoupledList[k + 1] end for nonCoupleddqCount ← nonCoupleddqCount - 1 j ← j - 1 end if j ← j + 1 end while end while for j in range(i - 1, -1, -1) do pMij ← startOfRowiInM + j if Mlb[pMij] then j3[m2] ← j j4[m2] ← pMij m2 ← m2 + 1 end if end for else p ← startOfWorkInM for j in range(i - 1, -1, -1) do if Mlb[startOfRowiInM + j] then i5[l2] ← startOfRowiInM + j startOfRowjInM ← j * (j + 1) / 2 for k in range(j, -1, -1) do i8[l3] ← startOfRowiInM + k if Mlb[i8[l3]] then i6[l3] ← startOfRowjInM + k Mlb[i6[l3]] ← true i7[l3] ← p l3 ← l3 + 1 end if end for p ← p + 1 j5[m3] ← j j6[m3] ← i5[l2] m3 ← m3 + 1 l2 ← l2 + 1 end if end for end if end for

137

At runtime, the indices stored in the index arrays i1[]–i7[] and j1[]–j6[] can be used to evaluate the nonzero entries in the generalized mass matrix to sparse factor the matrix and to solve the generalized equations of motion. The sparse factorization step is described in Algorithm 30. Algorithm 30 sparseFactor(Ml, ni, nd, i1, i2, i3, i4, i5, i6, i7, i8) initialize counters: l1 ← 0, l2 ← 0, l3 ← 0, ni ← 0, nd ← 0 for i in range(n - 1, -1, -1) do if is dependent po3[i] then for j in range(n1[nd] do Ml[i1[l1]] ← Ml[i1[l1]] - Ml[i2[l1]] * b3[i3[l1]] l1 ← l1 + 1 end for nd ← nd + 1 else Ml[i4[i]] ← 1.0 / Ml[i4[i]] copy the nonzero entries in row i of the Ml[] to the left of the diagonal into the work array. for j in range(startOfWorkInM, n2[ni]) do ML[j] ← Ml[i5[l2]] l2 ← l2 + 1 end for l2 ← l2 - n2[ni] - startOfWorkInM for j in range(startOfWorkInM, n2[ni]) do Ml[i5[l2]] ← Ml[i5[l2]] * Ml[i4[i]] l2 ← l2 + 1 end for for j in range(n3[ni]) do Ml[i6[l3]] ← Ml[i6[l3]] - Ml[i7[l3]] * Ml[i8[l3]] l3 ← l3 + 1 end for ni ← ni + 1 end if end for

The solve component consists of a sparse backward elimination and forward substitution step, which are described in Algorithms 31 and 32

138

Algorithm 31 sparseBackwardElimination(Ml, fl, la, gamma d, b3, ddq, n, nd, ni, la, p1, p2, i4, j1, j2, j3, j4, j5, j6, p v 23) k ← nd - 1 initialize counters: nd ← 0, ni ← 0 for i in range(n - 1, -1, -1) do if is dependent po3[i] then for j in range(p1[nd] do fl[j1[m1]] ← fl[j1[m1]] - b3[j2[m1]] * fl[i] m1 ← m1 + 1 end for for j in range(p2[nd]) do fl[j3[m2]] ← fl[j3[m2]] - Ml[j4[m2]] * gamma d[p v 23[k]] m2 ← m2 + 1 end for nd ← nd + 1 ddq[i] ← gamma d[p v 23[k] la[k] ← fl[i] if Ml[i4[i]] then la[k] ← la[k] - Ml[i4[i]] * ddq[i] end if k←k-1 else for j in range(p3[ni]) do fl[j5[m3]] ← fl[j5[m3]] - Ml[j6[m3]] * fl[i] m3 ← m3 + 1 end for ddq[i] ← Ml[i4[i]] * fl[i] ni ← ni + 1 end if end for

139

Algorithm 32 sparseForwardSubstitution(Ml, b3, ddq, n, nd, ni, la, gamma d, p1, p2, p v 23, m1, m2, j1, j2, j3, j4, j5, j6) k←0 for i in range(n) do if is dependent po3[i] then nd ← nd - 1 for j in range(p1[nd]) do m1 ← m1 - 1 ddq[i] ← ddq[i] - b3[j2[m1]] * ddq[j1[m1]] end for for j in range(p2[nd]) do m2 ← m2 - 1 la[k] ← la[k]- Ml[j4[m2]] * ddq[j3[m2]] end for k←k+1 else ni ← ni - 1 for j in range(p3[ni]) do m3 ← m3 - 1 ddq[i] ← ddq[i] - Ml[j6[m3]] * ddq[j5[m3]] end for end if end for

Algorithms 30–32 already require significantly fewer operations to factor and solve the system of equations compared to full–dense algebra, but the process still requires a number of conditional statements to be evaluated at run–time. The logic can be further simplified by recording the operations as arrays of function pointers and source and destination memory pointers, a process sometimes referred to as memoization. In the interest of brevity, the process to memoize the operations in Algorithms 30–32 is not described here. Refer to Section 12 for further discussion of memoization.

6.7

Numerical Integration

The equations are integrated using a C implementation of the Adams–Bashforth Adams– Moulton predict, evaluate, correct, evaluate (PECE) method [83]. Numerical integration

140

methods such as from Euler, Heun, Taylor and Runge–Kutta are called single–step methods because they use the information at a single time step to determine the solution at the next step. In contrast, the Adams–Bashforth Adams–Molton integrator is a multi–step method. An advantage to the Adams–Bashforth Adams–Moulton method is that it uses several points prior to the current time step to predict the next time interval. The method allows the local truncation error (LTE) to be determined and a correction term can be applied to improve the accuracy at each time step. It is also possible to determine if the time step is small enough to obtain an accurate solution for a desired error tolerance, yet large enough so that unecessary calculations are eliminated. The integrator adjusts the step size during a simulation to take as large a step as possible for a desired error tolerance. Note that in the preconditioned equations of motion, only the acceleration level constraints are appended to the system of equations. Due to the limited precision of computer arithmetic, it is well–known that appending only acceleration level constraints can lead to drift in constraint error. Fortunately, growth in constraint error can easily be remedied using Baumgarte constraint stabilization [84, 85]. In Baumgarte constraint stabilization, an error term proportional to the velocity and position level constraints is fed back into the model. Recall that the acceleration level constraints are J q¨ + J˙q˙ = 0

(229)

The key to driving down the acceleration level constraints is to feed back the error term proportional to the velocity and position level constraints such that J q¨ + J˙q˙ + 2ωn e˙ + ωn2 e = 0

(230)

which has the form of the characteristic equation for a critically damped oscillator, where ωn is the frequency of oscillation. When used in conjunction with the Adams–Bashforth Adams–Moulton numerical integration scheme, higher values of ωn require more iterations for a given time interval and may slow down a simulation. On the other hand, choosing a lower value for ωn may not fully stabilize constraint growth. In practice, the choice of ωn is a compromise. 141

In the present work, ωn is either a (fixed) parameter selected by the user in the input file or is set adaptively using a heuristic method similar to the method described by Chiou et al. in [86]. For the Adams–Bashforth method, Chiou and coauthors set ωn proportional to the stability limit, SL of the integration method such that ωn =

SL(k) ∗s h

(231)

Chiou applied the method to the Adams–Bashforth integrator, which has different stability limits than the Adams–Bashforth Adams–Moulton method used in the present work. The stability limits for varying predictor orders are provided in Table 19 below and were obtained by reading the Nyquist plots in [83]. The variable s is an adjustable safety factor and was set to 0.005 for all numerical experiments. The parameter h is the current step size, which adapts through the course of a simulation. The stability limit is a function of the order of the polynomial extrapolation, k. Table 19: Stability limits for varying order Adams–Bashforth Adams–Moulton PECE numerical integration (from [83]) order

limit

1

2.0

2

2.26

3

1.86

4

1.4

5

1.03

6

0.78

7

0.58

8

0.44

9

0.34

10

0.26

11

0.21

12

0.06

Additionally, a stabilization term proportional to the integral of constraint error is fed back into the model, a method which was originally suggested by Ostermeyer [85, 87]. See also [88] for additional information. Recall that earlier the term J˙p q˙ was defined as −γ p , 142

where the superscript p indicates the primary constraints identified by generalized coordinate partitioning. If the stabilization terms, −γ pc and the integral error term are moved to the right hand side of the constraint equations, the constraint equations have the form p

J q¨ =

γ pc

− 2ωn e˙ −

ωn2 e

Z − KI

edt

(232)

where KI was set to 1500. Although a comprehensive study to obtain optimal parameters for Eqs. (232) and (231) is beyond the scope of the present work, in general, the parameters used above demonstrated good results for the models that were simulated. The maximum step size of numerical integration, h, was restricted to 0.01 seconds. For many simple models, only 2 evaluations were required per 0.01 second time interval, which is optimal for the Adams–Bashforth Adams–Moulton PECE method. Obtaining optimal settings for ωn is an ongoing topic of research (see for example: [89]). 6.7.1

Overview of numerical solution process

An overview of the numerical solution process is shown in Fig. 9. The process begins by preprocessing the topology, identifying loops and loop members from the user’s directed graph representation, reordering bodies and joints according to the Kinematic Substructuring algorithm, and evaluating the Jacobian matrices in permutation order 1. Newton–Raphson iteration is applied to remove constraint violations in the model, if present. These preprocessing steps are only performed once at the beginning of program execution. When the simulation begins, LU decomposition with complete row and column pivoting, referred to in Fig. 9 as full factor, is performed and the infinity norm of the B di matrix is stored. The full factor step determines permutation order 2, and in the process establishes generalized coordinate partitioning. From the variable splitting into dependent and independent sets is established indirectly permutation order 3. The process then proceeds to the analyze equations phase, where the operations required to sparse factor and solve the system of equations are stored. 143

In the evaluate equations phase, the generalized mass matrix and right–hand side are set in permutation order 3. The accelerations are then solved and saved in permutation order 1, corresponding to bodies and joints arranged in level–increasing order by substructure. If the current simulation time is less than the total simulation time, the simulation proceeds and the block Jacobian submatrices are evaluated in permutation order 2, which was determined in the full factor step. If the infinity norm of the B di matrix becomes more than 2 times greater than the value recorded in the most recent execution of full factorization with complete row and column pivoting, then full–factor is performed again to generate a new split of dependent and independent variables and new permutation orders. In most cases, the infinity norm will be less than two times the initial starting value, and the simulation will proceed on the inner loop, thereby avoiding the expensive memory copy operations associated with LU decomposition with complete row and column pivoting. For many models, the split of dependent and independent variables obtained in an initial full factor step remains valid for the entire simulation, and the outer loop in Fig. 9 is never executed.

144

evaluate topology level order sort identify loops and loop members start simulation

kinematic substructuring remove constraint violations forward kinematics and set Jacobian(s)

full factor yes LU Decomposition with full pivoting b0 ← infinity norm of B matrix

analyze equations evaluateJacobianBoolean() symbolicFactor() countNonZeroB() setSparsity() countOperations() recordOperations()

evaluate equations no

set generalized mass matrix set right–hand side

sparse solve sparseFactor() sparseBackwardElimination() sparseForwardSubstitution() write states to file or memory

yes t >= tSpan ?

end simulation

no fast factor forward kinematics and set Jacobian(s) LU Decomposition with no pivoting

infinity norm > 2 * b0 ?

Figure 9: Overview of sparse matrix solution process for preconditioned equations of motion with substructures

145

7

Numerical Examples

To verify the integrity of the sparse matrix solution of the Preconditioned Spatial Equations of Motion, four numerical examples are provided.

7.1

Slider Crank

The first numerical example is a single degree of freedom slider–crank consisting of three bodies and four joints, shown in Fig. 10.

Figure 10: Slider Crank Model shown in Graphical User Interface

The parameters associated with the bodies of the slider–crank model are summarized in Table 20. The center of masses of the crank and connecting rod are located at the midpoint of the links. The center of mass of the piston is located at its origin. Table 20: Body parameters of slider–crank simulation ID

name

mass (kg)

length (m)

Ixx

Iyy

Izz

Ixy

Iyz

Izx

1

crank

1.0

0.6

1.0

0.1

1.0

0.0

0.0

0.0

2

connecting rod

1.0

1.5

1.0

1.0

0.5

0.0

0.0

0.0

3

piston

1.0

n/a

1.0

1.0

1.0

0.0

0.0

0.0

146

The joints are assigned the letters A–D. The direction of the joints, initial displacements, velocities and damping are summarized in Table 21. Table 21: Joint parameters of slider–crank simulation ID

name

q0 (m, rad)

q˙0 (m/s, rad/s)

damping (Ns/m)

A

fixed → crank

5.3311

0.0

0.0

B

fixed → piston

1.7661

0.0

1.0

C

connecting rod → piston

2.8097

0.0

0.0

D

crank → connecting rod

2.8540

0.0

0.0

The crank is oriented with its y–axis aligned from joint A to joint D. The connecting rod is oriented with its -z–axis aligned from joint D to C. The simulation parameters for the slider–crank mechanism are summarized in Table 22. The simulation was performed with gravity pointing in the +z direction. Simulation results are provided in Fig. 11. Figure 11: Simulation results for slider–crank

147

Table 22: Simulation parameters for slider–crank model parameter

setting

time span

20 seconds

requested time interval

0.03 seconds

relative error tolerance

1e-4

absolute error tolerance

1e-4

As discussed in Chapter 6, at each step of numerical integration, the PECE method performs a predict, evaluate, correct, and evaluate step, where the prediction and correction steps involve evaluating polynomials and the two evaluation steps require setting up and solving the system of dynamics equations. As the simulation proceeds, the Adams–Bashforth Adams–Moulton method adjusts its step size to take as large of a step as possible while maintaining the requested error tolerance. In all simulations in this thesis, the maximum step size of the Adams–Bashforth Adams–Moulton numerical integrator was limited to 0.01 seconds. The adaptive method described in Chapter 6 was used to set the Baumgarte stabilization term, ωn . Simulation results were requested at 0.03 second time intervals. Each 0.03 second reporting interval required an average of 6 evaluations of the system of equations, which indicates that the numerical integrator was taking the maximum allowable time step for each requested reporting interval. Additionally, the constraint error was monitored during the simulation as E = Φf a Φ−1 fb − I

(233)

where Φf a ∈ SE(3) specifies the absolute displacement of the frame associated with the chord constraint’s virtual body and Φf b ∈ SE(3) specifies the absolute displacement of the constraint’s original embedding frame. The nonzero terms of E are put in screw coordinate

148

form e as described in [28].

e=

              

E[0,3] E[1,3] E[2,3]

              

(234)

   E[2,2] + E[2,1] + E[1,1]              E + E + E   [0,0] [0,2] [2,2]         E  [1,1] + E[1,0] + E[0,0] Finally, the root mean square constraint error, e, is reported as r eT e e= 6

(235)

Using the adaptive stabilization method, the maximum root mean square constraint error observed during the simulation was e = 1.4536 ∗ 10−10 at the slider crank’s snap– through configuration. After passing through the snap–through position, Baumgarte constraint stabilization rapidly damped out constraint error. An average RMS constraint error of e = 9.1287 ∗ 10−12 was recorded for the simulation. The combination of low constraint violation throughout the simulation and maximum time step taken by the numerical integrator indicate that the adaptive Baumgarte stabilization technique described in Chapter 6 is working well for the slider–crank model. In Section 5.5, a heuristic method to determine when to reapply LU factorization with complete row and column pivoting was described. The infinity norm of the B di matrix (largest absolute value) was monitored. If the infinity norm increased greater than 2 times its initial starting value, full factorization would be reapplied. Generalized coordinate partitioning selected joint D (crank → connecting rod) as the optimal independent coordinate. The heuristic method was used to monitor the infinity norm of the B di matrix. Joint D remained a valid choice of independent variable throughout the simulation – therefore, full factorization with complete row and column pivoting was only called once at the beginning of the simulation. A comparison of the execution speed of the sparse matrix solution of the preconditioned equations of motion with substructures to other solution methods is beyond the scope of 149

this thesis and will be the focus of future research. In the interim, to convey that the setup and solution of an abritrary mechanism’s governing equations is suitable for real– time applications, the ratio of total simulation time to computation time is reported (i.e. how many times faster than real–time the solver runs). The simulation was performed on a desktop PC with a 4.0 GHz Intel Skylake Core i7 processor, 32GB of RAM, running Arch Linux and the Linux 4.8.13 kernel. This simple mechanism ran 1700 times faster than real–time (average of 10 runs).

7.2

Bricard’s Mechanism

Bricard’s mechanism is an example of an overconstrained, paradoxical mechanism. When the mechanism assumes a planar configuration (as shown in Fig. 12a), the joints at the three corners have no ability to impart a change on overall system configuration, yet the system still has the ability to move a linear combination of the joints located along the edges of the triangle. Therefore, the joints at the corners of the triangle are not suitable choices for independent variables in this configuration. As the mechanism moves through its range of motion, it assumes a square configuration, as shown in Fig. 12b where all joints contribute equally to mechanism motion, and then to a second planar configuration as shown in Fig. 12c. In the second planar configuration, the joints that were located at the edges in Fig. 12a have transitioned to becoming joints located at the corners of the triangle. Therefore, unlike the slider–crank mechanism, it is not possible to parameterize the kinematics or dynamics of Bricard’s mechanism using any single independent joint variable for the entire range of motion of the mechanism. Note that some of these details will be discussed more in the context of mobility in Section 10.1.2. The paradoxical nature of Bricard’s mechanism poses a modelling challenge to multibody dynamics software. For example, any multibody dynamics software that applies generalized coordinate partitioning only once at the beginning of simulation execution would fail for Bricard’s mechanism. Certainly one conservative procedure to handle the need to select a different set of independent variables for Bricard’s mechanism would be to apply the LU, QR or SVD de150

(a) planar configuration 1

(b) square configuration

(c) planar configuration 2

Figure 12: Bricard’s mechanism moving from (a.) a first planar configuration to (b.) a square configuration to (c.) a second planar configuration composition methods for generalized coordinate partitioning at every time interval during numerical integration. Unfortunately, as discussed in Chapter 5 applying these methods at every time interval introduces substantial overhead – with the LU decomposition method requiring the least overhead of the three. Other authors have reparameterized the Bricard mechanism in terms of another set of variables that is valid for the entire range of motion [90]. The primary drawback to the reparameterization described by Arponen et al. is that it is difficult to generalize the method to arbitrary models. Arponen et al. note that their reparameterization is only possible for Bricard’s mechanism as the Jacobian matrix does not change rank through its range of motion. In contrast to the above methods, the heuristic method for determining when to reapply generalized coordinate partitioning as described in Chapter 5 offers an excellent combination of execution speed and generality. During simulation runtime, the infinity norm of the B di matrix is monitored. If the infinity norm increases more than 2 times from its initial starting value, LU decomposition with complete row and column pivoting is reapplied. As noted in Chapter 4, when solving the preconditioned spatial equations of motion, all variables (both independent and dependent) are numerically integrated. Unlike other methods, such as described in [64], if the heuristic method indicates to apply a new split of dependent and independent variables, there is no need to stop/restart numerical integration. When reapplying LU factorization with complete row and column pivoting, it is important

151

to minimize constraint violations to improve the probalilty of computing the correct rank of the Jacobian matrix. To that end, if the heuristic method indicates to reapply GCP, then before LU factorization with complete row and column pivoting is applied to determine a new split of independent and dependent variables, the current joint variables are saved and Newton-Raphson iteration is applied to reduce constraint violations to an error tolerance of 1e-12. Once LU factorization with complete pivoting has been applied to determine a new split of dependent and independent variables, the variables are reset back to their initial values before Newton–Raphson iteration was applied to ensure continuity in the kinematic and dynamic equations. The mechanism’s parameters are summarized in Table 23. To verify the functioning of the heuristic method for successfully obtaining a new split of dependent and independent variables, the mechanism is set to an initial starting configuration as described in Table 24 with one of the links fixed to ground. The numerical integrator settings are summarized in Table 25. The error tolerances were set exceptionally tight to demonstrate Baumgarte stabilization’s ability to maintain tight constraint error control and the ability of the numerical integrator to accurately conserve energy in this conservative system model. Generally the integration error tolerance would be set several orders of magnitude larger, the simulation would run up to two orders of magnitude faster, and simulation results would still be sufficiently accurate for most applications. Table 23: Geometric and inertial properties for each link in Bricard example model parameter

value

length

1.5m

center of mass

0.75m

Ixx

0.1875 kgm2

Iyy

0.1875 kgm2

Izz

0.0100 kgm2

Ixy

0.0 kgm2

Iyx

0.0 kgm2

Ixz

0.0 kgm2

mass

1.0 kg

152

Table 24: Initial configuration of Bricard’s mechanism joint

angle (rad)

angular velocity (rad/s)

even

0.776

0.0262

odd

-2.000

0.100

Table 25: Simulation parameters for Bricard’s mechanism parameter

setting

time span

100 seconds

requested time interval

0.01 seconds

relative error tolerance

1e-11

absolute error tolerance

1e-11

The simulation results are summarized in Figs. 13-14. The vertical dashed red lines indicate a step in the simulation where the heuristic method indicated that full factorization should be reapplied to find a new split of independent and dependent variables. In Fig. 13 the states of the model are shown for the duration of the simulation. Only the even and odd joint angles and velocities are shown as all even quantities are identical and all odd quantities are identical.

153

Figure 13: States for Bricard’s Mechanism

Due to the very tight error tolerance specified for the simulation, the first time interval required 24,198 evaluations of the dynamics equations to build a time history in the numerical integrator. An average of 367 evaluations was required for subsequent time reporting intervals. While the number of function evaluations required for a given time reporting interval fluctuated throughout the course of simulation, there was no significant increase in evaluations required after a new set of independent variables was chosen. The plots in Fig. 14 demonstrate that the method allows smoothly reparameterizing the model in terms of a new set of independent variables during a simulation without disrupting the total energy or constraint error. Total energy was conserved to 1 ∗ 10−16 . Because all variables are numerically integrated, the integrator maintains its time history as it transitions between a new

154

split of independent and dependent variables. If only the independent variables were integrated – as is done in other methods – it would be necessary to stop and start the numerical integrator, requiring in this case approximately 24000 function evaluations to develop a new time history.

Figure 14: Constraint error, number of evaluations per reporting interval, energy and Baumgarte constraint stabilization parameter (ωn ) used for Bricard’s mechanism.

7.3

Excavator

The first two examples discussed in this chapter consisted of only one loop and a single degree of freedom. Therefore, the first two models contained only a single kinematic substructure. While mechanisms like Bricard’s linkage have their own modeling challenges, the size and complexity of the models remain small. To demonstrate the solution of the preconditioned dynamics equations with substructures, two additional numerical results are provided. For benchmarking purposes, the excavator example from Chapter 5 is allowed to drop from the configuration shown in Fig. 3 and swing like a triple pendulum. For this nu-

155

merical experiment, all bodies were assigned a mass of 2kg, Ixx = Iyy = Izz = 5kgm2 and Ixy = Iyx = Ixz = 0.0kgm2 . The simulation was numerically integrated using the Adams– Bashforth Adams–Moulton PECE integrator with a 1e-4 relative and absolute error tolerance and a 0.03 second maximum step size. During the simulation, full factorization (i.e. LU decomposition with complete row and column pivoting) was performed twelve times for the third substructure containing the bucket. The split of dependent and independent variables in the other two substructures remained valid throughout the simulation. Integrating the equations required an average of 7.5 evaluations of the system of equations per requested time interval. The simulation was carried out on a desktop PC with a 4.0 GHz Intel Skylake Core i7 processor, 32GB of RAM, running Arch Linux with the Linux 4.8.13 kernel. The simulation ran 356 times faster than real time (an average of 0.0281 CPU seconds required per 10 seconds of simulation time – average of 10 timings). The twelve full–factor events corresponded to the bucket linkage flipping a full 360 degrees with accompanying high snap–through accelerations, events which would not occur in ordinary operation of the excavator. Nonetheless, the combination of adaptive time step and heuristic application of GCP is capable of efficiently and robustly modeling these exceptional events. The simulation results are summarized in Figs. 15 and 16.

156

Figure 15: States for excavator simulation.

157

Figure 16: Constraint error, number of function evaluations per time step, energy and Baumgarte constraint stabilization parameter (ωn ) used for excavator.

7.4

Vehicle Model

In this section, a rolling chassis model of a passenger vehicle with 51 joints and 45 bodies is set up to demonstrate the solution of the preconditioned dynamics equations for a complex mechanism with multiple substructures. The chassis is representative of a typical problem that would be of interest to industry. The primary focus of this section is to demonstrate the automatic detection of substructures using the kinematic substructuring algorithm presented in Chapter 5 and the sparse matrix solution method discussed in Chapter 6. 7.4.1

Overview of model

An image of the vehicle chassis is shown in Fig. 17.

158

Figure 17: Screen capture of a vehicle rolling chassis

The geometric and inertial properties of the model are chosen to be representative of a light–weight, high–performance vehicle. The rolling chassis has McPherson front suspension with Ackermann steering geometry and semi–trailing link rear suspension with realistic camber, kingpin and caster angles (summarized in Table 26). The mass and inertia properties of the various bodies comprising the rolling chassis, as well as a list of joints are provided in Table 27. Table 26: Geometric parameters of vehicle model description

value

Caster angle

2.9◦

Camber Angle

0.67◦

Kingpin angle

10.9◦

Wheelbase

2.78 m

Front Tread

1.64 m

Rear Tread

1.63 m

159

Table 27: Parameters of vehicle model ID

joint

type

ID

body

mass

I¯xx

I¯xy 2

(kg)

(kgm )

I¯yy 2

(kgm )

I¯yz 2

(kgm )

I¯zz 2

(kgm )

I¯zx 2

(kgm )

(kgm2 )

A

f → 1

Tx

1

massless body 1

0.0

0.0

0.0

0.0

0.0

0.0

0.0

B

1 → 2

Ty

2

massless body 2

0.0

0.0

0.0

0.0

0.0

0.0

0.0

C

2 → 3

Tz

3

massless body 3

0.0

0.0

0.0

0.0

0.0

0.0

0.0

D

3 → 4

Rx

4

massless body 4

0.0

0.0

0.0

0.0

0.0

0.0

0.0

E

4 → 5

Ry

5

massless body 5

0.0

0.0

0.0

0.0

0.0

0.0

0.0

F

5 → 26

Rz

6

massless body 6

0.0

0.0

0.0

0.0

0.0

0.0

0.0

G

26 → 37

Rx

7

massless body 7

0.0

0.0

0.0

0.0

0.0

0.0

0.0

H

23 → 39

Rx

8

massless body 8

0.0

0.0

0.0

0.0

0.0

0.0

0.0

I

26 → 29

Rx

9

massless body 9

0.0

0.0

0.0

0.0

0.0

0.0

0.0

J

26 → 10

Rx

10

massless body 10

0.0

0.0

0.0

0.0

0.0

0.0

0.0

K

26 → 8

Rx

11

massless body 11

0.0

0.0

0.0

0.0

0.0

0.0

0.0

L

26 → 27

Ty

12

massless body 12

0.0

0.0

0.0

0.0

0.0

0.0

0.0

M

26 → 28

Rx

13

massless body 13

0.0

0.0

0.0

0.0

0.0

0.0

0.0

N

25 → 38

Rx

14

massless body 14

0.0

0.0

0.0

0.0

0.0

0.0

0.0

O

26 → 36

Rx

15

massless body 15

0.0

0.0

0.0

0.0

0.0

0.0

0.0

P

15 → 40

Tz

16

massless body 16

0.0

0.0

0.0

0.0

0.0

0.0

0.0

Q

24 → 40

Rx

17

massless body 17

0.0

0.0

0.0

0.0

0.0

0.0

0.0

R

36 → 44

Ry

18

massless body 18

0.0

0.0

0.0

0.0

0.0

0.0

0.0

S

37 → 45

Ry

19

massless body 19

0.0

0.0

0.0

0.0

0.0

0.0

0.0

T

22 → 41

Rx

20

massless body 20

0.0

0.0

0.0

0.0

0.0

0.0

0.0

U

17 → 41

Tz

21

massless body 21

0.0

0.0

0.0

0.0

0.0

0.0

0.0

V

28 → 6

Rx

22

massless body 22

0.0

0.0

0.0

0.0

0.0

0.0

0.0

W

6 → 7

Ry

23

massless body 23

0.0

0.0

0.0

0.0

0.0

0.0

0.0

X

7 → 32

Rz

24

massless body 24

0.0

0.0

0.0

0.0

0.0

0.0

0.0

Y

8 → 9

Ry

25

massless body 25

0.0

0.0

0.0

0.0

0.0

0.0

0.0

Z

9 → 30

Rz

26

chassis

1120.0

318.6

0.0

1879.1

0.0

1907.1

0.0

AA

10 → 11

Ry

27

steering rack

2.66

0.16

0.0

0.003

0.0

0.16

0.0

AB

11 → 31

Rx

28

left front control arm

3.26

0.06

0.0

0.02

0.0

0.08

0.0

AC

29 → 12

Rx

29

right front control arm

3.26

0.06

0.0

0.02

0.0

0.08

0.0

AD

12 → 13

Ry

30

left front shock upper

1.92

0.013

0.0

0.013

0.0002

0.0098

-0.00002

AE

13 → 33

Rz

31

right front shock upper

1.92

0.013

0.0

0.013

0.0002

0.0098

-0.00002

AF

32 → 42

Ry

32

left steering knuckle

4.98

0.14

-0.0005

0.13

-0.038

0.03

0.0017

AG

30 → 32

Tz

33

right steering knuckle

4.98

0.14

0.0005

0.13

0.038

0.03

0.0017

AH

31 → 33

Tz

34

left steering link

1.52

0.029

0.0

0.00015

0.0

0.029

0.0

AI

33 → 43

Ry

35

right steering link

1.52

0.029

0.0

0.00015

0.0

0.029

0.0

AJ

32 → 14

Rx

36

left trailing arm

8.8

0.099

0.08

0.16

0.02

0.23

0.03

AK

38 → 15

Rz

37

right trailing arm

8.8

0.099

-0.08

0.16

0.02

0.23

-0.03

AL

14 → 34

Rz

38

left rear shock upper

1.0

0.005

0.0

0.02

0.0

0.002

0.0

AM

33 → 16

Rx

39

right rear shock upper

1.0

0.005

0.0

0.02

0.0

0.002

0.0

AN

39 → 17

Rz

40

left rear shock lower

0.98

0.0061

0.000094

0.0061

0.00018

0.0029

0.00003

AO

16 → 35

Rz

41

right rear shock lower

0.98

0.0061

0.000094

0.0061

0.00018

0.0029

0.00003

AP

18 → 34

Rz

42

left front wheel

21.2

0.578

0.0

0.932

0.0

0.578

0.0

AQ

19 → 18

Ry

43

right front wheel

21.2

0.578

0.0

0.932

0.0

0.578

0.0

AR

27 → 19

Rx

44

left rear wheel

21.2

0.578

0.0

0.932

0.0

0.578

0.0

AS

20 → 35

Rz

45

right rear wheel

21.2

0.578

0.0

0.932

0.0

0.578

0.0

AT

21 → 20

Ry

AU

27 → 21

Rx

AV

37 → 22

Ry

AW

26 → 23

Ry

AX

36 → 24

Ry

AY

26 → 25

Ry

160

The directed graph for the vehicle, with the edges color–coded according to their substructure assignments, is shown in Figure 18. The three substructures, which were detected automatically using the kinematic substructuring algorithm are the left rear suspension (consisting of one independent chord loop), the right rear suspension (consisting of one independent chord loop) and the front steering/suspension assembly (consisting of four independent chord loops). The joints are grouped by their substructure assignments i-iii in Table 28. Table 28: Joint substructure assignments substructure

members

i

G, AW, AV, H, T, AN, U

ii

AY, O, N, AX, AK, Q, P

iii

I, J, L, K, M, AC, AA, Y, V, AD, AB, Z, W, AE, AH, AU, AR, AG, X, AM, AT, AQ, AJ, AO, AS, AP, AL

There are many techniques for modeling tire forces (refer to [91, 92] for an overview). In the present work, a transient version of the Fiala [93] tire model based on the MSC ADAMS [94] and CHRONO [95] implementations was programmed. The Fiala tire model was selected for its relative simplicity and because it is widely used in commercial vehicle dynamics software. In contrast to other models, the Fiala model requires only 9 parameters – compared to Pacjecka’s “Magic” Tire Model, which requires over 50 parameters. Because of its simplicity, there is a relatively low overhead in its programming and execution. The drawback to the Fiala model is that it offers lower fidelity than other models. Specifically, it does not accurately model combined slip situations, the effect of inclination (camber) angle on forces and moments is ignored, the variation of lateral stiffness and longitudinal stiffness with normal load and overturning moment is not considered, and the force and moment curves cannot be offset from the origin (i.e. zero slip always implies zero Fx , Fy and Fz ). Nonetheless it has been used successfully to model steady–state tire/road interaction,

161

Figure 18: Directed Graph of Vehicle Model

162

particularly for smooth road surfaces [91]. In the implementation of the Fiala tire model used in the present work, tire forces are computed at the tire contact patch (also called the W-frame). In this work, tire forces are expressed in the SAE coordinate system (with the Z-axis down, Y-axis to the right and X-axis forward). Another common choice is to express tire forces in the ISO frame (with the Z-axis up, Y-axis to the left, and X-axis forward), such as that used in the Chrono software. The Fiala tire model used in the present code is briefly described as follows. At the start of the computation, it is necessary to compute the penetration depth, zdepth , of the tire unloaded radius with the ground and the velocity of the tire expressed at the contact patch in the coordinates of the contact frame, (i.e. vw f w,w ). When the tire is in contact with the surface zdepth is positive, otherwise it’s value is immaterial and set to zero. In the ensuing discussion, let vxw , vyw , vzw , ωxw , ωyw and ωzw represent the various components of the velocity screw vw f w,w . The vertical force in the w–frame is computed as    −kz ∗ zdepth − bz ∗ vzw for zdepth > 0 w fz =   0 for zdepth ≤ 0

(236)

The model requires computing the longitudinal and lateral slip states of the tire patch relative to the terrain surface, which are noted as sx and sy respectively. The time derivative of the longitudinal slip is computed as s˙ x =

w |vxw |κ + vsx xRL

(237)

w where vsx is the longitudinal slip velocity of the wheel at the tire contact patch and κ is

defined as w vsx κ=− w |vx |

(238)

The time derivative of lateral slip is computed as vyw − |vxw |tan(sy ) s˙ y = yRL In the above equations, xRL and yRL are the x and y relaxation lengths. 163

(239)

The comprehensive slip ratio, ss is given as q ss = s2x + s2y

(240)

from which the current coefficient of friction is computed as µ = µmin + (µmax − µmin )ss

(241)

In Fiala’s model, the longitudinal force calculation is based on whether the tire is in an elastic deformation state or a state of pure sliding. The critical longitudinal slip ratio scritical , which is used to determine which state the tire is in, is computed as w µf scritical = z 2CS

(242)

where CS is the longitudinal stiffness. The longitudinal component of force is then computed as:    C S ∗ s x for sx < scritical w fx =  (µ∗fzw )2  sign(sx ) ∗ (µ ∗ |fzw | − 4∗s otherwise x ∗CS

(243)

where the first condition is for when the tire is in a state of elastic deformation and the second case is for sliding. The lateral slip state is computed in a fashion similar to the longitudinal slip state. Determining whether the tire is in an elastic deformation or pure sliding state requires computing the critical slip angle as  αcritical = atan

3 ∗ µ ∗ |fzw | Cα

 (244)

The current slip angle, α is found as  α = atan

vy |vxw |



Lateral forces are then computed as:    −µ|fzw | ∗ (1 − h3 ) ∗ sign(α) for |α| ≤ αcritical w fy =   −µ|fzw | ∗ sign(α) for |α| > αcritical 164

(245)

(246)

and the aligning moment, mw z , is found as    −µ|fzw | ∗ sign(α) for |α| ≤ αcritical w mz =   0 for |α| > αcritical

(247)

The variable w is the effective tire width, which may differ from the actual tire width and h is the intermediate quantity h=1−

Cα |tan(α)| 3µ|fzw |

(248)

Rolling resistance is computed as w mw y = −CR ∗ |fz |sign(ωwheel )

(249)

The Fiala model does not capture tire roll moment, so mw x can be permanently set to zero. The six components of forces and moments are expressed at the origin of the w-frame in the coordinate system of the w-frame. The forces and moments are assembled into coscrew coordinate form f ∗w f w,w and transformed to the inertial reference frame as

∗f w ∗w f ∗f f w,f = [Ad]f w f f w,w

(250)

∗ The quantity f ∗f f w,f is then added to the array of body and gravitational forces gf at the

index corresponding to the wheel body. The tire parameters used in the simulation are provided in Table 29. Table 29: Fiala tire model parameters description

value

Wheel Unloaded Radius (r0 )

0.314 m

Vertical Stiffness (kz )

310000 N/m

Vertical Damping (bz )

3100 N-s/m

Rolling Resistance (CR )

0.001 m

Width (w)

0.235 m

Longitudinal Stiffness (CS )

1000000 N

Lateral Stiffness (Cα )

45800 N

Peak coefficient between tire and road (µmax )

1.0

Sliding coefficient between tire and road (µmin )

0.9

X relaxation length (xRL )

0.05 m

Y relaxation length (yRL )

0.15 m

165

A simple steady–state validation of the tire was performed by creating a test rig which was propelled forward in the x–direction at a constant velocity of 20 m/s. The tire yaw angle about the vertical z–axis was set between 0 and 10 degrees at 2 degree increments about the z–axis to evaluate tire forces for varying slip angles. A 3000N and 4500N normal force in the downward z direction was applied to the test rig to place operational loads on the tire. The tire forces, expressed at the contact patch of the tire are shown in Fig. 19.

166

Figure 19: Fiala model validation

Further validation of the tire model would be required before using it in a production environment, but the plots in Fig. 19 do suggest that the tire forces generated are reasonable and consistent with the gross behavior of other tire models, such as described in [91].

167

To demonstrate the sparse matrix solution of the preconditioned equations of motion with substructures, two simulations of the vehicle were performed. In the first simulation, the vehicle chassis was positioned above the origin of the fixed inertial reference frame and given an initial 15 m/s forward velocity. PID (Proportional, Integral and Derivative) control was used to apply a torque in the front wheels to maintain steady–state forward velocity in the vehicle reference frame. PID control was also used in the steering rack to hold it at a fixed lateral displacement of 10 cm. In the second simulation, the car’s forward velocity was set to 25 m/s with the other parameters remaining the same. The results are summarized in Figures 20 and 21.

(a) 15 m/s

(b) 25 m/s

Figure 20: x and y position of vehicle chassis at a constant forward velocity with steering rack displaced 10cm for (a.) 15 m/s and (b.) 25 m/s forward velocity

168

(a) velocities for 15 m/s

(b) velocities for 25 m/s

(c) accelerations for 15 m/s

(d) accelerations for 25 m/s

Figure 21: Linear and translational velocities expressed in vehicle chassis frame for (a.) 15 m/s and (b.) 25 m/s forward velocity and linear and translational accelerations expressed in the vehicle chassis frame for (c.) 15 m/s and (d.) 25 m/s forward velocity

The simulations were run using a reporting interval of 0.03 seconds for 80 seconds of simulation time on a 4Ghz Intel i7 Skylake processor with 32GB of RAM running the Arch Linux operating system and the 4.8.13 Linux kernel. The simulations ran approximately 9 times faster than real–time (average of 1.11 seconds per 10 seconds of simulation time for 10 runs). Each reporting interval required an average of 86 solutions of the system of equations, which indicates that the simple tire model introduced artificially high frequency content, causing the Adams–Bashforth Adams–Moulton PECE numerical integrator to take small time steps to reach the desired error tolerance of 1e-4. Nevertheless, the 9–fold speedup over real–time for the vehicle model is encouraging and demonstrates that the preconditioned 169

equations of motion with substructures is suitable for real–time and hardware–in–the–loop applications for complex multiloop mechanisms. Based on the relative complexity and nearly 400 times faster than real time simulation of the excavator model, it is expected that a higher fidelity tire model will allow the simulation to run approximately 10–20 times faster. Investigating the effect of using a higher–fidelity tire model on the run–time will be the focus of future work.

7.5

Summary

This section has demonstrated the integrity of the sparse matrix solution of the Preconditioned Spatial Equations of Motion for four numerical examples. In each case, kinematic substructures, if they exist, were obtained automatically from an input file with bodies and joints arranged in a hierarchical directed–graph with possible cycles. In the second part of this thesis, additional tools are presented to help analysts set up, deploy and understand the mobility of their models.

170

8

Overview of Transmission Performance and Mobility

In the first half of this thesis, a robust, efficient, and general-purpose method was described to automatically set up and solve a system of dynamics equations. The second half of the thesis focuses on assessing the transmission performance of general mechanisms by analyzing constrained mobility and developing a number of applications. In this chapter, various metrics of transmission performance are introduced and the concept of mobility is summarized.

8.1

Transmission metrics

A designer laying out a new mechanism is typically interested in how well an input to the system achieves the desired motion of the mechanism. The motion may refer to the movement of the mechanism as a whole (i.e. a linear combination of movements of all joints that comprise the system) or the motion of a single output of the mechanism. To that end, many quantitative measures of transmission quality have been proposed in recent decades to aid engineers in the design and optimization of mechanisms. In this section, a brief review of various measures of transmission quality and a discussion of their advantages and disadvantages is provided. 8.1.1

Transmission Angle and Pressure Angle

One of the earliest developments, the transmission angle, came from Alt [96] and was later expanded in [97, 98, 99]. The pressure angle, a similar performance measure for cam mechanisms, was first introduced by Dyson [100], and further developed in [101, 102]. The pressure angle reduces to the complement of the transmission angle for planar mechanisms. Tsai and Lee [103] discussed how the transmission and pressure angles are inadequate to characterize transmission in certain configurations. For example, Fig. 22 from [103] shows a four–bar mechanism. In this configuration, the four–bar mechanisms’s transmission angle, µ, is 90 degress, which is considered to yield an optimum transmission of an input applied at A0 to achieve a desired output motion at B0 . However, in the configuration shown in Fig. 22, the

171

output is in an instantaneous stationary configuration.

Figure 22: A four bar mechanism in an instantaneous stationary position [103]

8.1.2

Screw-Based Methods

To address the shortcomings of the transmission angle as a measure of transmission performance, Yuan, Freudenstein and Woo [104] used Ball’s virtual coefficient, V C, [105] between the input link’s Transmission Wrench Screw (TWS) and the output link’s velocity screw as a measure of transmission quality. Note that using the terminology introduced in Chapter 3, the TWS is actually a coscrew quantity.

V C = |(ht + h0 )cosθ − rsinθ|

(251)

where ht is the pitch of the transmission wrench screw (TWS), h0 is the pitch of the output velocity screw, and r and θ are the distance and angle between the two screw axes. As the value of Freudenstein et al’s measure of transmission may vary between −∞ to +∞, the measure is not suitable for use as an objective function in the synthesis of mechanisms. Later, Sutherland and Roth [106] normalized the virtual coefficient by dividing V C by the maximum value of V C as VC |(ht + h0 )cosθ − rsinθ| = p T ISR = max(V C) (ht + h0 )2 + ρ2

(252)

where the quantity ρ is the distance from a characteristic point on the TWS to the output velocity screw. Holte et al. [107] noted that the external loading would affect the force transmission performance of a machine and introduced the joint force index (JFI) for planar linkages. 172

The joint force index is given as Fij JF I = max FL

(253)

where Fij represents the magnitude of the maximum joint force and FL represents an external load. As Holte’s method requires computation of forces and applying an external load to the mechanism, the JF I is not an intrinsic property of a mechanism. Tsai and Lee [103] noted that the Sutherland–Roth Transmission also has shortcomings, notably that the characteristic point is undefined when the pitch is infinite or the input link is parallel to the TWS. Therefore, Tsai and Lee modified the transmission index as

T IT L =

|w∗T W S · so | |fT W S · vo | + |τT W S · ωo |

(254)

where w∗T W S is the transmission wrench screw and so is the output velocity screw. Note that the notation of Tsai and Lee has been modified to follow the conventions described in Chapter 3. Furthermore they describe a “manipulability index”, M I, computed similarly to Eq. (254) as

M IT L =

|w∗T W S · so | |fT W S · vi | + |τT W S · ωi |

(255)

Tsai and Lee define the “Total Transmission Index” as the product between the T IT L and M IT L .

T T IT L = T IT L ∗ M IT L

(256)

The velocity ratio of the input screw to output screw provides a measure of the instantaneous mechanical advantage of the system. Tsai and Lee’s T T IT L provides a normalized mechanical advantage. Tsai and Lee’s observation that the total transmission quality should be a function of both the input and output screw’s alignment with the TWS is important. However, their

173

method to normalize T T IT L is dependent on the coordinate system in which it is computed, and is therefore also not suitable as a general descriptor of transmission quality. To address the limitations of Tsai and Lee’s method, Chen and Angeles [108] introduced the Generalized Transmission Index (GTI) as VC |(ht + h0 )cosθ − rsinθ| = p  GT I = max(V C) 2 2 max (h + h ) + ρ > 0 ht ,h0 ,ρ

t

(257)

0

and used a slightly different approach to obtain the distance ρ. Chen and Angeles’ method is also an extension of Ball’s virtual coeffient, but is normalized in a different way. Lin and Chang [109] define a Force Transmission Index suitable for planar parallel mechanisms based on an extension of the transmission angle. They extended their analysis for more general parallel mechanisms in [110]. While the many screw–based approaches to obtaining a metric of transmission performance offer a number of advantages over simple angle–based metrics, the primary disadvantage to the screw–based approaches described above is that they require detailed geometric analysis that is difficult to automate in a CAD environment. Furthermore, most of the methods are suitable only for single–loop mechanisms and for some mechanisms the transmission wrench screw may not be uniquely defined. 8.1.3

Transmission Quality from the Jacobian Matrix

The measures of transmission quality discussed thus far have been related to the geometric arrangement of adjacent links. Others have sought a more general numerical measure of transmission quality from the Jacobian matrix. For example, Denavit et al. [111] first used the determinant of the Jacobian matrix as a measure of transmission quality. A similar measure was also used by others including Wu et al. [112]. Methods based on the determinant of a mechanism’s Jacobian matrix are less useful when a mechanism encounters a singular configuration – i.e. the mechanism is redundantly actuated or has instantaneously lost rank due to alignment of joints or end–of– stroke conditions. 174

Ghosal and Roth proposed a “transmission ratio” as n Y vT v = ( λi )1/2 = (det(J T J))1/2 q˙T q˙ i=1

(258)

where λ1 – λn are the eigenvalues of J T J and describe the major and minor axes of a velocity ellipsoid. Lee, Duffy and Keller proposed a “quality index” [113] as

QIL.D. =

|detJ| |detJm |

(259)

where Jm is the Jacobian matrix computed at the mechanisms’s central symmetric configuration and J is the Jacobian matrix at its present configuration. Lee et al’s method was tailored to a symmetric planar parallel device. A limitation of applying Lee et al’s quality index to general mechanisms is that the Jacobian matrix must be square and nonsingular. Zhang [114] proposed an extension of Lee, et al’s method. Zhang’s quality index is given as s QIZhang =

det(JJ T ) T) det(Jm Jm

(260)

Zhang noted that by the Cauchy–Binet theorem, the product JJ T is well–defined and square. Zhang also suggested that the det(JJ T ) has a geometrical interpretation. The determinant of a matrix represents the volume distortion of a region after being transformed. Yoshikawa proposed a different manipulability index as the product of the nonzero singular values of the constraint Jacobian of a robotic end–effector as [115] M IY oshikawa =

Y

σi

(261)

i

Compared to the screw–based methods discussed in the previous section, metrics of transmission performance based on the Jacobian matrix are somewhat easier to implement into a general–purpose analysis environment. For example, it was already shown in Chapter 5 how the Jacobian matrix can be obtained automatically from the directed–graph representation of a mechanism’s topology. A primary disadvantage to many of the techniques described 175

in this section is that they provide only a single number for an entire mechanism, i.e. the metric provides no granular information to the designer of each joint’s contribution to the overall transmission. Furthermore, in many of the methods described above, the metrics are not normalized or bound (for example between 0 and 1).

8.2

Computing the TWS from the Jacobian Matrix

The Transmission Wrench Screw (TWS), a wrench that lies orthogonal to a system of wrenches between the input and output of a mechanism, plays an important role in most measures of transmission performance. Most methods in the literature compute the TWS using Pl¨ ucker line geometry or screw coordinates. Others have attempted to compute a measure of transmission quality from the Jacobian matrix. In this section, it will be shown how the TWS can be found from a mechanism’s Jacobian matrix, thereby demonstrating the connection between the two approaches. An example of the TWS for a representative 7H linkage is shown in the Fig. 23.

Figure 23: A 7H Spatial Linkage [103]

The definition of the TWS is given in [32, 106, 103] and is summarized as follows. For the TWS to be uniquely determined in a three-dimensional, single-loop spatial mechanism, the system of wrenches between the input and output must be a 5-system. As discussed in Section 3.2.4, each joint effectively constrains relative velocities in the five directions

176

orthogonal to the internal joint motion; relative motion between bodies is permitted only in the directions aligned with the joint (i.e. only one direction for each primitive joint). Forces are dual to velocities; their behavior is similar to but opposite that of velocities. If no internal working forces are applied within the joints, then no forces may be transmitted in the direction of the joint’s motion. Therefore, the only forces imparted from one joint to the next are those in directions orthogonal (i.e. reciprocal ) to the orientation of the joint, as shown in Eq. (35). A system of five non-redundant joints has one unique velocity screw in which forces may be transmitted. Let the input and output influence coefficient matrices be si and so respectively. And let the influence coefficient matrices associated with internal joints 1-5, be s1 –s5 . The Jacobian matrix for the entire single–loop mechanism in Fig. 23 becomes

h i J = si s1 s2 s3 s4 s5 so

(262)

The transmission performance is a measure of how well the output velocity screw, so , is aligned with the transmission wrench screw. To obtain the transmission wrench screw, let us look at J1−5 separately i h J1−5 = s1 s2 s3 s4 s5

(263)

In the notational convention used in this work, the Transmission Wrench Screw (TWS) is a coscrew – in older literature no distinction was made between the two spaces. Let the TWS coscrew be denoted as w∗T W S , where w∗T W S is a linear operator on the unique velocity screw not spanned by joints 1–5. Only the coscrew, w∗T W S , can do useful work on the system, applied through the velocity screw sT W S . Therefore, w∗T W S applied through every one of the influence coefficient matrices in J1−5 should generate zero power. Another interpretation of this statement is that the transpose of coscrew w∗T W S lies in the left null space of J1−5 . The left null space of J1−5 can be computed using LU decomposition. Recall that the LU decomposition algorithm permutes the Jacobian matrix into

177

  pd pi J J  PR J1−5 PCT =  sd si J J

(264)

If the 5 joints span a 5-system, they are not redundant and J pd will have full column rank. Equation (264) would then reduce to   J pd T   PR J1−5 PC = sd J

(265)

where J pd = LU , PR is an orthonormal row permutation matrix and PC is an orthonormal column permutation matrix. A convenient representation of the column–space of J is  

J pd J sd





 U −1 L−1

  I =  = J sd U −1 L−1 C I



(266)

which allows trivially computing the left null–space, X, of J as   h i I   X = 0, X = −C I C

(267)

h i The matrix c −C I is the cokernel of J and any value of c will satisfy the left nullity of J. As J1−5 is a 5-system, the matrix X is a row vector. The TWS is expressed as a normalized quantity, however care must be taken in normalizing screw and coscrew quantities as the [Ad] and [Ad]∗ transformation matrices are not orthonormal. w∗T W S is normalized by setting 1 where f is the linear force component of the coscrew w∗T W S (excluding torques) [16]. |f | The row permutation matrix, PR , is used to permute the columns of X back to original

c to

unpermuted order. XPRT w∗T = TWS f T W S

(268)

Ball’s virtual coefficient, V C, which forms the basis of most measures of transmission performance, is given as 178

1 V C = w∗T W S · so 2

(269)

Note that, by definition, so is a normalized quantity. Ball’s virtual coefficient (VC) is the original metric for transmission performance proposed by Yuan in [32]. The primary difference between this expression and other transmission indices lies in how VC is normalized. Although VC is not used as a metric of transmission performance in the present work, the method is described here to illustrate the connections between the screw–based methods, Jacobian–based methods and generalized coordinate partitioning.

8.3

Mobility

A fundamental property of a mechanism is its mobility (i.e. a mechanism’s degrees of freedom). Mobility is an important design criteria in the synthesis and analysis of mechanisms as it provides the number of independent degrees–of–freedom of a mechanism. 8.3.1

Chebyshev-Gr¨ ubler-Kutzbach criteria

A mechanism’s overall mobility, M , is typically computed using the Chebychev-Gr¨ ublerKutzbach mobility formula as

M = 6(n − 1 − j) +

j X

fi

(270)

i=1

Where fi is the degrees of freedom of joint i, n is the number of bodies in the system, including the fixed link, or ground, and j is the number of joint variables. For example, a serial 6DOF robotic manipulator with 6 floating bodies and one fixed body (therefore P n = 6 + 1) and 6 joints has an overall mobility of ji=1 fi = 6 using the Chebyshev-Gr¨ ublerKutzbach criteria. While Eq. (270) works for many classes of mechanisms, it fails for certain paradoxical and exceptional linkages. As a mechanism changes configuration, joints may line up, causing the mechanism to temporarily lose or gain mobility. For Bricard’s paradoxical

179

mechanism, which was presented in Chapter 7, Eq. (270) always yields the incorrect result (i.e. it indicates the mechanism has a mobility of zero – that it is immobile). Over the years many researchers have attempted to refine or modify the Chebyshev-Gr¨ ubler-Kutzbach criteria to create a simple equation to account for these exceptions [14]. 8.3.2

Mobility based on the rank of the Jacobian

In his seminal review paper, Gogu [14] states that perhaps the only reliable method of obtaining any arbitrary mechanism’s mobility is

MI = j − rank(J)

(271)

where j is the number of joints and rank(J) is the dimension of the largest nonsingular submatrix of J, which is noted as J pd throughout this development. Gogu claims that the primary difficulty in implementing Eq. (271) is that it is difficult to set up the Jacobian matrix for complex mechanisms. Fortunately, the algorithms presented in Chapter 5 for identifying substructures can be used to automate the process of setting up and simplifying a mechanism’s constraint equations and Jacobian matrix, allowing Eq. (271) to be computed automatically on the basis of each substructure. Equation (271) is also sometimes referred to as the instantaneous mobility of the system, as it may differ from the global mobility computed by the Chebyshev-Gr¨ ubler-Kutzbach criteria. 8.3.3

Numerical computation of mobility

Gogu states that when computed numerically, the rank of J in the proximity of a singular configuration requires some interpretation due to machine roundoff error [14]. Although symbolic computation of the rank could be applied to overcome the limitations of floating– point arithmetic, in general, numerical computation is sufficient for most practical applications (indeed, the algorithm presented below is capable of computing the correct mobility of Bricard’s mechanism using double precision arithmetic). Part of the process for obtaining 180

the correct matrix rank involves correct assembly of a mechanism. Therefore, it is critical to apply Newton–Raphson iteration to remove constraint violations, if present, before numerical computation of the matrix rank is performed. In this work, the Newton–Raphson method is applied automatically when a mechanism is loaded using a RMS error tolerance of 1e-12. The primary advantage to computing the rank numerically is in speed of execution, which has some advantages in the applications discussed in the next section. Recall from Section 3.4 that LU factorization with complete row and column pivoting can be used to reliably find the largest nonsingular submatrix of J, thereby revealing the matrix rank of J. LU factorization with complete row and column pivoting permutes and partitions the Jacobian matrix J into the form   pd pi J J  PR JPCT =  sd si J J

(272)

and factors the largest nonsingular submatrix of J, J pd , into L and U . A convenient representation of the row space of J is found by

h i h i U −1 L−1 J pd J pi = I B di

(273)

The right null space of J is the matrix N, such that

h

i I B di N = 0

In this form, the solution for N is trivial, and is found as   di −B  N = I

(274)

(275)

The “instantaneous mobility”, MI , of the mechanism is found as MI = rank(N ) which is equivalent to Eq. (271). 181

(276)

Equation (276) indicates that mobility is related to the null–space of the constraint Jacobian matrix. In other words, the row space of the Jacobian matrix is associated with the constraints, whereas the null–space indicates that linear combinations of joint velocities are possible without violating the constraints.

8.4

Transmission Performance, Manipulability and Mobility

In this chapter, various measures of transmission performance, manipulability and mobility were obtained from the Jacobian matrix. Note that the Jacobian matrix used for numerical positional control of an end–effector differs slightly from that used in setting up and solving dynamics equations. In the numerical positional control of an end–effector, an artificial constraint at the end–effector is imposed, forming an artificial chord loop. The row and column spaces of the end–effector’s Jacobian matrix in this context correspond to the ability of the joints to impart an incremental change of end–effector position (i.e. manipulability). In this context, the null–space of J is associated with a mechanism’s ability to reconfigure its pose without moving the end–effector (i.e. mobility). The transmission performance for a robotic end–effector is then associated with the row and column spaces of the constraint Jacobian matrix (i.e. the manipulability). These concepts will be discussed in more detail in Chapter 11. In contrast, when setting up dynamics equations, the constraint Jacobian matrix typically corresponds to real constraints associated with independent chord loops present in a mechanism. For example, consider the slider–crank mechanism from Chapter 7. If the slider is chosen as the mechanism’s end–effector, due to the constraints, the manipulability of the end–effector would be zero using Yoshikawa’s manipulability index Eq. (261). The mechanism has no manipulability in the this sense, but it would have an overall mobility of one (see the next section for more details). Therefore, in the design of constrained mechanisms, the transmission performance of interest is the joints’ ability to impart a change on the overall system configuration without violating the constraints, movements which lie in the null space of the constraint Jacobian matrix and correspond to the mobility of the mechanism. 182

Measures of transmission performance based on the row and column space of the Jacobian matrix (i.e. the manipulability) or the null space of the Jacobian matrix (i.e. the mobility) can therefore both be used as measures of transmission performance. The interpretation depends on the context and application. In the next chapter, it will be shown that further insight can be obtained from the null–space of the Jacobian matrix.

183

9

Mobility Numbers

In the previous section, an overview of various transmission metrics and mechanism mobility was provided. In this chapter it will be shown that each joint’s contribution to overall instantaneous mobility (i.e. its mobility number, can be found using a simple and numerically efficient algorithm. Furthermore, each joint’s contribution to overall instantaneous mobility can be used as a metric of transmission performance, thereby demonstrating the connection between transmission quality and mechanism mobility.

9.1

Computing Mobility Numbers

In Section 8.3.3, it was shown how to obtain the null–space basis N simply and efficiently from the constraint equations using LU decomposition   with complete row and column pivoting. −B di , has been obtained it can be factored into Once the right null-space matrix, N =  I     dm dm Z Z R R =  N = ZR =  im −1 Z R

(277)

using Gram-Schmidt orthonormalization, (i.e. QR decomposition) [18] where R is a transformation matrix and Z is an orthonormal basis. The R matrix effectively takes a linear combination of directions along the mechanism’s manifold and orthogonalizes and normalizes the columns of Z. The resulting Z matrix, is analogous to a unit orthogonal direction-cosine matrix on the mechanism’s manifold where each column corresponds to a direction along the manifold and each row to a unique joint variable. Each of the entries in Z corresponds to a joint’s projection onto one orthogonal direction tangent to the constraint manifold. In Eq. (277), Z is partitioned into Z dm , where each row corresponds to a unique dependent variable, and Z im , where each row corresponds to a unique independent variable. Since Z has been orthonormalized, each column has a unity magnitude. In addition, the number of columns in Z is equal to the mechanism’s instantaneous mobility, MI . Thus the

184

sum of squares of all entries in Z is MI =

n X m X

Zij2

(278)

j=1 i=1

where m is the row dimension and n is the column dimension of Z. In other words, ith row of the Z matrix indicates the ith joint variable’s degree of alignment with the manifold surface. If the norm of the ith row of Z is close to 1 the ith variable is almost parallel to the constraint manifold, and if the norm is close to 0 it is almost orthogonal to the manifold. In this work, the sum of squares in the ith row of Z is referred to as the joint’s mobility number, mi . mi =

n X

Zij2

(279)

j=1

From Eqs. (278) and (279) it follows that joint mobility numbers are closely related to the overall instantaneous mobility of the system. Furthermore, mobility numbers have the advantage that they are normalized and bound between 0 and 1. If the mobility numbers in a given row sum to zero, the joint is orthogonal to the constraint manifold and is unable to effect a change on the system configuration. A unit mobility number indicates that the joint is perfectly aligned with the constraint manifold. If multiple joints align with the constraint manifold, the instantaneous mobility will be shared between them. In the next section, it will be shown that mobility numbers can be used as a normalized metric of transmission performance of individual joints for constrained mechanisms.

9.2

Connection to Singular Value Decomposition

Singular Value Decomposition (SVD) can also be used to find the largest nonsingular submatrix of J [116]. SVD decomposes a matrix as

J = U ΣV T

   h i Σpd 0 V dT   = Up Us  iT 0 0 V

(280)

where Σ is a square diagonal matrix containing monotonically decreasing singular values and U and V are orthonormal transformation matrices. 185

Since U and V are orthonormal, the inverse is simply the transpose and     pT pd h i U Σ 0 J V d V i =   U T JV =  sT U 0 0

(281)

V d effectively takes linear combinations of columns to find the diagonal matrix Σ. U d takes linear combinations of rows. U sT and V i are the cokernel and kernel of J, respectively. From Eq. (281) is obtained U pT JV i = 0, ∴ JV i = 0

(282)

which shows that V i also represents the orthonormalized null space matrix Z presented above. Full SVD could also be used to find an orthonormalized null space basis of J, which would be suitable for computing mobility numbers. It also provides additional information (i.e. the matrices V d , U p and U s , which are not necessary for analyzing the mobility of the system) and requires many more operations than the LU/QR decomposition method described above. The LU/QR decomposition algorithm can therefore be considered a shortcut to finding an orthonormalized null–space basis compared to full SVD [117].

186

10

Mobility Numbers – Examples

This chapter provides a number of examples to demonstrate the computation of mobility numbers as described in Chapter 9. First, mobility numbers are computed for two single– loop mechanisms, including a parameterized slider–crank model and Bricard’s paradoxical linkage, which was first discussed in Chapter 7. To demonstrate how mobility numbers extend to multiple–loop parallel mechanisms, mobility numbers are computed for the single degree of freedom Sarrus linkage and Klann’s walking linkage. Finally, an interpretation of mobility numbers for complex mechanisms consisting of multiple substructures is provided.

10.1

Single Loop Mechanisms

10.1.1

Slider–crank

The first example used to illustrate the computation of mobility numbers is the slider–crank mechanism shown in Fig. 24. In this example, the length of the follower is held constant at 1 and the length of the crank, α, is allowed to vary from 0.001 to 10.

Figure 24: Parameterized slider–crank mechanism

When α is less than one, the crank can sweep a full 2π radians. Therefore the mobility numbers for α ≤ 1 are plotted as a function of crank arm angle, θ1 in Fig. 25a. When α > 1, the follower, θ3 , is capable of sweeping a full 2π radians, and the results are shown as a function of θ3 in Fig. 25b. For α ≤ 1, shown in Fig. 25a, the mobility number for the slider decreases to zero at θ1 = π and θ1 = 0, which corresponds to the crank’s dead center positions. At these configurations, the slider has no ability to impart a change in the mechanism’s configuration. 187

(a) α ≤ 1

(b) α > 1

Figure 25: Mobility numbers for a slider–crank mechanism

188

Similarly, for α ≥ 1 in Fig. 25b, the mobility number for the slider decreases to zero at θ3 = π and θ3 = 0, which also corresponds to the crank dead center positions. Figs. 25a and 25b also show that for α ≈ 1, the mobility of all variables varies rapidly throughout the slider–crank cycle. As described in Chapter 8, the mobility of a system is of interest to a design engineer as it provides an indication of how many inputs are required to control a mechanism. The slider–crank mechanism has an overall mobility of 1 and therefore a single input is required to drive the system. Note, however that care must be undertaken as to where the single input is applied. As the slider–crank moves through a cycle, each joint’s contribution to the overall mobility changes, which is indicated by the joint’s mobility number. Many of the joints exhibit a zero mobility at some point during the cyclic motion, and therefore are not capable of imparting a change in the overall system configuration at these positions. For α ≤ 1, only the crank θ1 has nonzero mobility throughout the entire range of motion. Therefore, if an input force could only be applied in one joint, the crank would be the only suitable joint to drive the system through an entire cycle. Similarly, for α ≥ 1, only joint θ3 has a nonzero mobility for all values of α and θ3 and is the only suitable choice. In most industrial applications, slider–crank mechanisms have a value of α ≤ 1 and are not driven solely by the crank, but by a combination of the slider (i.e. the piston) and the crank. It is well–known that the piston is only capable of imparting a change on the overall system configuration for a portion of the slider–crank cycle; this fact is clearly conveyed in Fig. 25a. If a joint is only capable of driving the system through a portion of the desired motion, then another joint (or joints) must drive the system through the remaining portion of the desired motion. In the slider–crank mechanism, the rotational inertia of the flywheel, which is attached to θ1 is used to carry the mechanism through the piston’s region of poor force transmission capability. Therefore, the crank must have sufficient mobility to drive the system in regions where the slider has poor mobility. In the slider–crank mechanism, the contribution of each joint to the overall mobility shifts as it moves to different configurations. Knowledge of each joint’s contribution to the global

189

mobility can provide insight to the design engineer as to how to distribute control efforts. 10.1.2

Bricard’s mechanism

Bricard’s mechanism is an example of an overconstrained, paradoxical mechanism. As discussed in Chapter 8, Bricard showed that the usual formulas for mobility do not apply for this mechanism and yield a mobility of 0, whereas it is well–known that the correct mobility for the system is 1.

Figure 26: Bricard mechanism

Bricard’s mechanism consists of six joints and six bodies forming a single kinematic loop. It has three–fold rotational symmetry, with all even–numbered joints having the same angle and all odd –numbered joints having the same angle. The constraint Jacobian for Bricard’s mechanism is a 6×6 matrix. If the mechanism is assembled with high accuracy so that the even–numbered joint axes intersect at a point and the odd –numbered joint axes intersect at a point, the rank of its null–space basis is 1, and the linkage has a mobility of 1. If any misalignment exists in the joints, the mechanism either will not be able to assemble or the joints will be unable to rotate. In the latter case, the system would have zero mobility (i.e. it would be a structure, not a mechanism). For this reason, it is important to carry out Newton–

190

Raphson iteration using a tight error tolerance to reduce constraint errors before applying LU factorization with complete row and column pivoting for the purpose of rank determination. In this thesis, Newton–Raphson iteration is carried out using an error tolerance of 1e-12 before LU decomposition with complete row and column pivoting is applied. After Newton–Raphson iteration was applied to remove constraint errors, the null–space basis and mobility numbers for Bricard’s mechanism were computed during the simulation presented in Chapter 7. The mobility numbers for all even–numbered joints are the same. Likewise, the mobility numbers for all odd–numbered joints are the same. The mobility numbers of the even and odd–numbered joints are shown in Fig. 27.

Figure 27: Mobility numbers for Bricard’s mechanism

Figure 27 shows that the overall instantaneous mobility of Bricard’s mechanism is shared between the even and odd joints. At approximately 45 seconds into the simulation, the mechanism has reached one of its triangular configurations with the even–numbered joints 191

located at the three corners of the mechanism. At this configuration, the even–numbered joints have no ability to impart a change in the overall system configuration – their mobility numbers are zero, whereas the odd–numbered joints have mobility numbers of 13 . As the sum of all mobility numbers is always equal to the overall instantaneous mobility of the system (i.e. one in this case), this indicates that the odd–numbered joints have the maximum ability to change the overall system configuration. As Bricard’s mechanism continues through its range of motion, it eventually reaches a pose where all joints have equal mobility numbers of 16 . This pose corresponds to the one of the mechanism’s square configurations. At approximately 100 seconds of simulation time, the mobility has transferred from the odd–numbered joints to the even–numbered joints. In this second triangular configuration, the odd joints have moved to the three corners and have no ability to impart a change on overall system configuration. One of the unique aspects of Bricard’s mechanism is that it is impossible to control the system using only one variable. If it is desired to drive the system, alternating inputs in an even and odd–numbered joint must be applied.

10.2

Multiloop Mechanisms

In the previous section, it was shown that mobility numbers can be computed for single–loop, single degree–of–freedom mechanisms. The method easily applies to multiloop mechanisms. Two examples of multiloop mechanisms are provided. 10.2.1

Sarrus linkage

The Sarrus linkage is a simple multiloop system with an instananeous mobility of 1. The top and bottom of the mechanism can move in a prismatic motion relative to each other.

192

Figure 28: Sarrus linkage

The mechanism has two independent chord loops as shown in Fig. 29 below.

Figure 29: Directed graph for the Sarrus linkage

After the cut–body technique described in Chapter 5 has been applied, a massless virtual body has been appended for each kinematic chord loop.

193

Figure 30: Directed graph for the Sarrus linkage with virtual bodies

The mobility numbers remain constant throughout the range of motion for this symmetric mechanism. Each joint located at the top and at the base has a mobility number of

1 . 18

Each

joint located half–way between the top and base has a mobility of 29 . Note that if the top and bottom of the Sarrus linkage were replaced with square platforms and two additional links were added, thereby adding a third independent kinematic chord loop to the system, the mobility numbers for the joints located at the top and base would be 1 . 24

Each joint located half–way between the top and base would have a mobility number of

1 . 6

As more kinematic loops are added to a system, the overall mobility is naturally shared

between all joints in each cluster of overlapping chord loops. The difference between the values of mobility numbers associated with the triangular and square configurations of the Sarrus linkage highlight that care must be taken in the interpretation of mobility numbers. If a force were applied in one of the links at the base of either the triangular or square version of the Sarrus linkage, provided that there is no resistance in the links, the same force would be required to lift the platform. And yet, the mobility numbers for the two configurations are different. Therefore, the mobility numbers associated with triangular and square Sarrus linkages should not be directly compared to one another. Rather, the joint’s mobility number provides a metric of its contribution to overall mobility. The mobility numbers in the square version of the Sarrus linkage are lower, but there are also more opportunities to distribute input forces over a greater number of 194

joints if it were practical to do so. 10.2.2

Klann’s linkage

Klann’s linkage is named for its designer, Joseph Klann. Klann’s linkage is a planar multiloop mechanism designed to simulate the gait of a legged animal. The walking motion is actuated by a single input crank. The crank rotation is transferred to a connecting arm, which causes the leg to move in an arcuate reciprocating motion along a restricted path [118].

(a) leg down configuration 1 (b) leg down configuration 2 (c) leg down configuration 3

(d) leg raised configuration 1 (e) leg raised configuration 2 (f) leg raised configuration 3

Figure 31: Klann mechanism (a.)–(c.) leg is in contact with the ground. (d.)–(f.) leg is lifted and moved into position for next “step”

The Klann linkage also has two independent chord loops as shown in Fig. 32.

195

Figure 32: Directed graph for the Klann linkage

After the cut–body technique described in Section 5 has been applied, a massless virtual body has been appended for each kinematic chord loop.

Figure 33: Directed graph for the Klann linkage with virtual bodies

The Klann linkage is assembled automatically from its body–joint–body representation and cycled through its range of motion. The mobility numbers of Klann’s mechanism are plotted as a function of the crank angle (joint A), which is shown on the x–axis of Figure 34.

196

Figure 34: Mobility numbers

Klann’s linkage is typically driven by a single input (joint A), which therefore must have a nonzero mobility for the entire range of motion. While joint A does maintain nonzero mobility throughout the walking cycle, it is interesting to note the extent to which it fluctuates. At

π 2

radians, joint D, i.e. the revolute joint on the opposite end of the crank from A, has the

highest overall mobility in the system. Shortly after

π 2

radians, the crank mobility decreases

to 0.1, with joints B, C, D and F all having higher ability to impart a change on system configuration than the input crank. If the designer had the choice to drive the mechanism from any one joint for the range of motion, only joints A and D would be suitable choices. All other joints reach a condition of zero mobility at some point during the walking cycle. Furthermore, joints A and D are better choices to drive the mechanism from as they have consistently higher contribution to overall mobility than all other joints in the mechanism. Note that the sum of the mobility numbers associated with joints A and D remains relatively uniform throughout the walking cycle. As joint A loses mobility, joint D gains mobility. Therefore, simultaneously distributing efforts to joints A and D may achieve a more uniform drive of the mechanism. 197

Table 30: Joint substructure assignments for the hydraulic excavator model

10.3

substructure

joint

i

C, H, I, Q, P, M, L

ii

D, J, R, N

iii

F, E, K, T, O, G, S

Mechanisms with Substructures

The previous examples applied mobility analysis to mechanisms consisting of either a single loop or a single cluster of kinematic chord loops. The method can be applied to more complex mechanisms, such as the excavator model, which was first presented in Chapter 5. After the kinematic substructuring algorithm has been applied, the mobility of each cluster of kinematic chord loops can be analyzed individually. Mobility of each joint is assigned according to the following rules: • If a joint j is in a kinematic substructure, the mobility number is found from the sum of the squares of row j of the orthnormalized null space basis, Z, of the constraint Jacobian for the substructure. • If a rigid joint is specified, it has no ability to impart a change on system configuration and has a mobility number of zero. • If a joint is not in a substructure, it has a mobility number of 1. Using the rules described above, mobility analysis was performed on the excavator, which was first presented in Chapter 5. For the purpose of this analysis, the undercarriage is held fixed to the inertial reference frame by the rigid joint A and therefore has a mobility number of 0. With the rigid joint in place, joint A has no ability to effect a change on the overall system configuration. Joint B is not in any substructure and has a mobility of 1. The remaining joints are members of substructures, and therefore mobility analysis can be carried out on the basis of the constraint equations for each substructure independently. The joint substructure assignments are summarized in Table 30. 198

In the excavator, the boom (substructure i), stick (substructure ii) and bucket (substructure iii) are driven by hydraulic actuators. Therefore, the mobility numbers are plotted as a function of the linear displacement of the hydraulic actuators.

(a) Substructure i (boom)

(b) Substructure ii (stick)

(c) Substructure iii (bucket)

Figure 35: Mobility numbers for excavator model

The boom is actuated by two parallel hydraulic cylinders M and L. Due to the symmetric nature of substructure i, the mobility numbers for the two hydraulic cylinders M and L are identical. Similarly, the mobility numbers for joints H and I, which correspond to the revolute joints at the proximal end of the hydraulic cylinder, are identical. Likewise, the mobility numbers for joints Q and P, the revolute joints corresponding to the distal ends of the hydraulic actuators, are identical. 199

In substructure i, joint C (at the base of the boom) and the two revolute joints H and I (at the proximal ends of the hydraulic cylinders) have a high overall mobility, indicating that if it were practical to drive the linkage using revolute joints, joints H and I would be suitable choices. The revolute joints at the distal ends of the hydraulic cylinders have a very low contribution to overall mobility throughout the operating range. Although substructure i is a parallel linkage consisting of two independent chord loops and substructure ii is a linkage consisting of only one independent chord loop, the overall layout of the linkages is similar. This aspect is reflected in the plots of their mobility numbers in Figs. 35a and 35b, which exhibit similar overall trends. On the other hand, substructure iii contains two serial independent kinematic chord loops. Its topology is much different than the other two substructures, which is reflected in the plot of the mobility numbers in Fig. 35c. All hydraulic cylinders (joints M, L, N and O) exhibit a parabolic trend in their mobility numbers through the range of motion. The cylinders’ mobility numbers peak approximately midway through the range of motion and drop to zero at the lower and upper extremes of the ranges. The upper and lower extremes of the ranges plotted correspond to toggle positions, where the cylinders have no ability to impart a change in overall system configuration. As the hydraulic cylinders are the only inputs used to control these linkages, care must be taken to avoid reaching these configurations. At the toggle positions, control inputs act orthogonal to the linkage motion, imparting no change in system configuration, but generating potentially high reaction forces in the joints — with the possibility to cause damage to the machine. Real excavators are designed with mechanical stops and pressure release valves to avoid reaching these toggle positions. The mobility numbers can serve as a useful aid to the design engineer to identify and account for the toggle positions. A software tool with a graphical user interface was developed to support this research (see Chapter 12 for more information). When a model is loaded into the software from the user’s input file, the topology is processed automatically to identify substructures according to Chapter 5 and generate their corresponding equations. If the user clicks on a joint that is a

200

member of a kinematic substructure, the other members of the substructure are highlighted, as shown in Fig. 36. The mobility numbers are computed according to the rules above and displayed in the model–overview pane at the left of the viewport window. Tools to reposition the mechanism are provided. As the user interactively repositions the mechanism, the joint values and mobility numbers in the overview pane are updated in real–time. Repositioning a multiloop mechanism requires computing the Jacobian matrix and performing Newton– Raphson iteration to remove constraint violations. The mobility numbers can therefore be computed efficiently as a byproduct of the Newton–Raphson process with minimal additional expense.

Figure 36: Excavator model in graphical user interface

Once the designer has established the geometry and topology of the mechanism, the process for identifying substructures, setting up constraint equations, and performing mobility analysis is completely automated, serving to minimize human operator error in performing mobility analysis and to increase design throughput.

201

10.4

Summary

This chapter has demonstrated the efficient and automatic computation of mobility numbers for a number of complex mechanisms. In the next chapter, several practical applications of mobility numbers will be presented.

202

11

Mobility Numbers – Applications

The previous chapter provided examples of the computation of mobility for many different mechanisms. Mobility numbers are a useful tool for a designer as they provide a simple, normalized metric of each joint’s contribution to the overall mobility of the system. In this section, a number of practical applications are presented.

11.1

Mechanical advantages

To demonstrate the application of mobility numbers for computing mechanical advantages, a bicycle caliper brake mechanism is analyzed. In a braking system for a road bicycle, a cyclist applies braking forces with their hands to brake levers located on the handlebar. Braking forces are then transmitted from each brake lever through a Bowden cable to either the front or rear brake caliper, which is mounted to the bicycle frame near each of the bicycle rims. When braking forces are applied, the caliper clamps the rim of the bicycle wheel, thereby decelerating the rider. To decrease weight and minimize the overall footprint of the brake mechanism, many brake caliper designs utilize a mechanical linkage to increase the mechanical advantage at the caliper. For example, the caliper brake shown in Fig. 37 uses a rocker link between the left and right arms. A Bowden cable housing attaches to the upper cable holder, shown at the upper left of Fig. 37a. The Bowden cable passes through the upper cable holder and is rigidly fixed behind the lower cable holder. When tension is applied to the Bowden cable, a prismatic joint (denoted as Joint I in Table 31) between upper and lower cable holders is formed. The prismatic joint, I, serves as the input to the mechanism. A second prismatic joint, L, is formed between the left and right arms attached at the left and right brake pads. The second prismatic joint serves as the output of the mechanism. With the prismatic joints I and L, the mechanism has three independent kinematic chord loops. The various bodies and joints of the caliper brake are identified and labeled in Table 31.

203

Figure 37: Caliper brake utilizing a rocker–link mechanism

(a) Front view

(b) Isometric view

(c) Rear view

204

Figure 38: Directed graph of caliper brake

Table 31: Identification of Bodies and Joints in the caliper brake ID

Body Name

ID

Joint Name

1

bridge

A

(0) fixed → (1) bridge

2

left arm

B

(1) bridge → (2) left arm

3

right arm

C

(1) bridge → (3) right arm

4

upper cable holder

D

(2) left arm → (4) upper cable holder

5

lower cable holder

E

(2) left arm → (7) massless body 2

6

massless body 1

F

(2) left arm → (6) massless body 1

7

massless body 2

G

(3) right arm → (6) massless body 1

8

massless body 3

H

(3) right arm → (5) lower cable holder

I

(4) upper cable holder → (5) lower cable holder

J

(3) right arm → (8) massless body 3

K

(7) massless body 1 → (8) massless body 3

After the cut–body technique has been applied to the brake according to Chapter 5, its root–oriented directed graph, shown in Fig. 38 is obtained automatically.

i



B

C

hB

−hC

  J = ii  hB  iii hB

D

−hC −hC

E

F

G

hF

−hG

H

I

J

 −hJ

hE −hH

hD

K

  hK  

(283)

hI

All three chord loops have joints B and C in common and therefore joints B–K belong to one substructure. The fixed joint A is not a member of any chord loop and therefore is 205

Figure 39: Mobility numbers for bicycle brake mechanism not part of the Jacobian matrix. The mobility numbers for the brake mechanism are plotted in Fig. 39 as a function of joint K’s displacement, the opening distance between the pads, for the operating range of the brake. Figure 39 shows that the input, I, and output, K, dominate the overall instantaneous mobility of the brake mechanism. To understand how the right null–space basis Z, which was introduced in Chapter 9, relates to mechanical advantages, consider that Z projects joint velocities onto the constraint subspace as q˙ = Z p˙

(284)

Each column of Z represents linear combinations of joint velocities and each row of p˙ represents orthogonal velocities along the constraint manifold. In general, if a cluster of overlapping chord loops has n joint variables and m degrees of freedom, the matrix Z has a dimension n × m and p˙ is an m × 1 column vector. The caliper brake has 10 joints and a single degree of freedom; therefore p˙ is a scalar quantity and Z is a 10 × 1 matrix. The input, I, and output, K, are the two joints of interest for computing mechanical advantages of the brake mechanism. The quantity p˙ is independent and therefore can be set 206

T to any value. Therefore, set p˙ to the transpose of the ith row of Z, (i.e. Z[i,:] ). Note that

for the brake mechanism, this amounts to setting the scalar quantity p˙ to the single entry in the ith row of Z. This allows computing velocity ratios,

q˙i , q˙k

in terms of known quantities to

obtain mechanical advantages. The effective velocity in joint i is q˙i = Zi Zi T = mi

(285)

where mi is the mobility number associated with joint I. Note that this quantity appears on the diagonal of the matrix ZZ T . Similarly the quantity q˙k , the effective velocity in joint K is computed as q˙k = Zk Zi T

(286)

where the quantity Zk Zi T corresponds to the off-diagonal element of ZZ T at row index of joint k and column index of joint i. The ratio of q˙i to q˙k yields the mechanical advantage of joint i with respect to joint k.

MA =

mi q˙i = q˙k Zk Zi T

(287)

The derivation above was provided to illustrate the connection of mobility numbers and mechanical advantages to the projection matrix ZZ T , which is further discussed in Section 11.2. However, for a mechanism with a single degree of freedom, there is a simpler method to obtain mechanical advantages, noting that r q˙i Zi Zi T Zi mi MA = = = =± T q˙k Zk mk Zk Zi

(288)

Therefore, the mechanical advantage can be found simply by taking the ratio of the scalar quantites in the ith and kth rows of the matrix Z, which in the case of a single degree of freedom system is equal to the square root of the ratio of the mobility numbers. The mechanical advantage of the brake over its range of motion is shown in Fig. 40. Consider that the orthonormalized null–space basis of a matrix is not unique. For a mechanism with a 1–dimensional manifold, there are two possible tangent bases that may be found using QR decomposition. If the first tangent basis points in a positive direction 207

Figure 40: Mechanical advantage for bicycle brake mechanism along the manifold, the second basis points in the opposite, or negative, direction along the manifold. An emphasis is placed on positive and negative here as the choice of sign applied to a given direction is completely arbitrary. For the purpose of computing velocity ratios from the terms of a 1–dimensional null–space basis, it does not matter which direction is used, as the sign ambiguity is resolved when taking ratios of terms in Z. Note that the above procedure would need to be modified to perform an analysis of mechanical advantages for a mechanism with a constraint–manifold having two or more dimensions. For a constraint manifold having two or more dimensions, the velocity input from one joint may be distributed across multiple directions along the manifold. It is therefore necessary to choose a value of p˙ that is meaningful for the analysis when analyzing mechanical advantages. The matter is somewhat complicated by the fact that there are an infinite number of possible tangent bases at a given point on the manifold for manifolds with two or more dimensions. For example, in a two dof system, the two orthogonal tangent vectors can be arbitrarily rotated and still represent a valid tangent basis at the point. Resolving the ambiguity for the purpose of computing joint velocity ratios requires knowledge of the

208

velocity ratios of two or more directions along the constraint manifold. This observation is used in Section 11.4.

11.2

Computing Static Force Balance

This section describes how the orthonormal null space matrix, Z, can be used to obtain a static force balance (i.e. to determine what working loads within joints are required to maintain a particular pose). Starting with Lagrange’s equations in joint coordinate space, the dynamic terms can be canceled as:      `  ` T      f q¨ M J    =  e ˙+ ˙t − J˙φ J 0 λ 

(289)

leaving the first block row:

−1 ((( ¨ ( M( C( (C q¨ − H φ)) J T λ = f ` = f + H T C −T (g∗f − (

( ((

(290)

Here, f represents joint working forces. Recall that the equation: f = H T k∗f

(291)

projects constraint reaction forces k∗f onto joint subspace, where H is a block diagonal matrix consisting of each joint’s influence coefficient matrix, h along its diagonal. When no unique solution for f ` , exists, some authors find the set of Lagrange multipliers, λ, that minimizes the objective function Ψ:

Ψ = f T f = (J T λ − H T C −T g∗f )T (J T λ − H T C −T g∗f )

(292)

An alternative approach makes use of the orthonormalized right null space matrix, Z for this purpose. Rearranging Eq. (290) yields

f = J T λ − H T C −T g∗f Recall that: 209

(293)

JZ = 0, ∴ Z T J T = 0

(294)

Allowing the elimination of Lagrange multipliers from Eq. (293).

Z TJT λ − Z T H T C −T g∗f ZT f =  

(295)

Z T effectively projects the working constraint forces, f , onto the mechanism’s manifold. Left-multiplying by Z again projects the forces back to joint-space.

f¯ = ZZ T f = −ZZ T H T C −T g∗f

(296)

A numerical example demonstrating the efficacy of this technique is beyond the scope of this thesis and will be the focus of future research. Nonetheless, Eq. 296 yields important insight as to the relationship between the orthonormalized null–space basis, Z, and joint working forces.

11.3

Mobility Numbers in a CAD Environment

Because most of the setup process for computing mobility numbers can be automated, the method is well–suited to incorporation into a CAD (Computer Aided Design) environment. As already demonstrated in Section 10.3, mobility numbers can be computed on the fly as a mechanism is moved to various configurations interactively. In this section, it will be shown that mobility numbers can also be used as an aid in positioning of a constrained model in a CAD environment. In the graphical user interface that was developed in support of this thesis, an overview of the mechanism topology is provided at the left–hand side of the viewer window. The model overview pane shows the bodies, joints and frames grouped according to their assembly assignment. Clicking on a joint reveals a slider, which allows the user to reposition the joint. When the user moves the slider away from the center position, mobility numbers are computed according to Chapter 9 with one minor modification. The joint that the 210

user clicked on is chosen as a specified independent variable. Before LU decomposition with complete row and column pivoting is initiated, the specified variable is permuted to the right column of the Jacobian matrix. This step is done to force the specified variable to be independent when carrying out GCP.

Figure 41: A screen capture of the front steering linkage shown in the graphical user interface. Moving the slider allows repositioning a joint and all joints and bodies in the kinematic substructure containing it.

LU Decomposition with complete pivoting defines the equation and variable permutation and partitioning as    pd pi  d  J J q˙   =0 J sd J si  q˙i 

(297)

The primary constraint equations in Eq. (297) yield J pd q˙d + J pi q˙i = 0

(298)

q˙d = −J pd−1 J pi q˙i = −B di q˙i

(299)

and therefore

211

It follows by multiplying Eq. (299) by a differential change in time dt that: δq d ≈ −B di δq i

(300)

When the slider is moved, an incremental change in the specified independent variable associated with the current selected slider, δq i , is set according to δq i = Ks ∗ s ∗ f (m)

(301)

Incremental changes in the remaining independent variables (if any) are set to zero and the corresponding incremental changes in dependent variables are computed and applied to the system. Then one to two Newton–Raphson iterations are performed to remove any constraint violations. In Eq. (301), s is the slider displacement, Ks is an adjustable slider gain factor and f (m) is a function of the mobility number. Two possibilities for f (m) are shown in Fig. 42.

Figure 42: Scaling movement by joint mobility

The two versions of f (m) shown in Fig. 42 were chosen so that as the joint mobility decreases, increasing effort from the user in the form of increased displacement of the slider is required to move the mechanism. As a joint’s mobility approaches zero, f (m) evaluates to zero, preventing the user from moving the mechanism into a singular configuration. The first version, shown in blue in Fig. 42 provides minimal user–feedback. The independent 212

variable selected by the user maintains its full ability to impart a change on the mechanism configuration until right before a singular configuration is reached, at which point the output from the slider drops off abruptly to zero. The second version in green provides a more dramatic difference in effort required to move the mechanism. In the latter case, the transition is less abrupt, which allows the user to observe dynamic changes in the selected joint’s contribution to overall mobility. The less abrupt scaling function also allows users time to adjust the slider position before the joint’s mobility number gets too close to zero or to select another joint with higher mobility to drive the mechanism. The second scaling function provides an intuitive way to interactively position constrained systems and provides useful feedback to the user. For the excavator presented in Chapter 10, if the user actuates the boom, stick or bucket linkage by one of the cylinders, the mechanism gradually slows down near the cylinders’ toggle positions and stops altogether just before reaching those positions. The method was also used for the purpose of animating Klann’s linkage in Chapter 10. The walking–cycle was animated by moving the slider corresponding to the crank with a constant displacement. As the linkage moves through the walking cycle, the animation speeds up in poses with high mobility and slows down in poses with low mobility. The technique provides users with important intuition as to how forces would be transmitted through constrained mechanisms without actually applying or computing forces in the model. To minimize the overhead of the method, the Kinematic Substructuring algorithm from Chapter 5 is applied to automatically identify clusters of overlapping chord loops. The technique is then only applied to the block subjacobian corresponding to the specified independent joint the user has selected. Furthermore, after the first call to LU decomposition with complete row and column pivoting is applied, subsequent evaluations are performed using LU decomposition with no pivoting (i.e. using the heuristic method described in Chapter 5), which further increases performance. When the user clicks on a different joint in the current kinematic substructure or in a different kinematic substructure displayed in the graphical user interface, a new specified independent variable is set. The column of the

213

Jacobian matrix associated with that specified variable is moved to the right. Then LU factorization with complete row and column pivoting is applied to the Jacobian matrix to obtain new LU factors and set up to incrementally move the system using Newton-Raphson iteration. The technique runs very efficiently and is capable of repositioning complex mechanisms, such as the steering linkage or bicycle caliper brake, in real–time in a CAD environment. The algorithm has no problem to achieve 60 Hz to allow animating the repositioning of the model at 60 frames per second.

11.4

Application of Mobility Numbers for Redundancy Resolution

Consider the 7R robotic manipulator as shown Fig. 43. Because of their increased dexterity and ability to avoid bifurcations and singular configurations, 7 and 8 DOF robotic manipulators are finding increased applications in industrial settings. See, for example [119, 120].

Figure 43: A 7R Robot Arm

In the numerical solution of robotic posture equations, it is common to introduce an artifical constraint at the end–effector and apply the Newton–Raphson method to iteratively solve for a nearby posture by taking a Taylor series expansion as:

f (q + δq) = f (q) + Jδq + R2 ≈ f (q) + Jδq = 0 214

(302)

where J is the Jacobian matrix. R2 represents the remainder terms associated with higher order terms q 2 –q ∞ . In the context of robotic manipulators, f (q) represents the constraint violation (i.e. the error between current configuration and desired configuration) of the end–effector. If the end-effector position is specified using the SE(3) matrix Φf e , which indicates an active transformation from the inertial reference frame, f , to the end-effector frame, e and a nearby configuration is specified as Φ0f e , then the error is expressed as

E = Φ0f e Φ−1 fe − I

(303)

The nonzero terms of E are put in screw coordinate form, e as described in Eq. (234) in Chapter 7. The quantity e is substituted for f (q) in Eq. (302), yielding

e = −Jδq

(304)

where δq represents an incremental change in joint angles that drives the quantity e toward zero. In the context of numerical solution for a nearby posture for a manipulator, the Jacobian matrix plays a different role than in the formulation of the dynamic equations. In numerical position control, the constraint introduced at the end–effector frame is an artificial construct and effectively converts a serial chain mechanism into a single–loop mechanism. Many of the challenges outlined in Section 3.4 associated with processing the Jacobian matrix also apply to the solution of Eq. (304). If J is square and invertible, then a unique solution can be found using basic linear algebra methods as:

δq = −J −1 e

(305)

However, the Jacobian matrix is rarely square and invertible. While the Jacobian matrix for 6 DOF manipulators may be invertible over a wide range of configurations, it may

215

instantaneously lose rank as joints instantaneously align or the manipulator reaches the edge of the workspace. The Jacobian matrix for the 7R manipulator, shown in Eq. (306) below, is rectangular, and a matrix inverse does not exist.

i h J = hA hB hC hD hE hF hG

(306)

The seven revolute joints are assigned letters A-G starting from the base and working towards the end–effector frame. The hi terms are the influence coefficient matrices associated with the respective joints. The Jacobian matrix for this manipulator is always rank deficient, so there is not a unique solution to the system of equations and Eq. (305) cannot be used. 11.4.1

Prior art for resolving redundancy

A common technique to resolve rank deficiency is to take the generalized inverse or Moore– Penrose pseudo inverse of the Jacobian as δq = −(J T J)−1 J T e = −J + e

(307)

which effectively finds a nonlinear least squares solution [121]. Equation (307) is only valid if the Jacobian matrix has full column rank. If the Jacobian matrix does not have full column rank, the more expensive SVD algorithm may be required to obtain the pseudo–inverse. The terms effectively minimized are proportionally quadratic in joint velocity, and are therefore approximately proportional to the kinetic energy of the system. A variation on the generalized inverse method is the weighted or damped least squares method δq = −(J T W J + W )−1 J T W e

(308)

where W is a diagonal matrix of weighting factors specified by the user [122, 123, 124]. Some authors select the weights based on the inertia of the links [125]. As discussed in Section 3.4 methods based on the generalized inverse were also originally used to solve a system of differential algebraic dynamic equations. The generalized inverse 216

method not only requires many additional FLOPS, Uicker et al. [28] noted that the method may sometimes fail to converge. Despite the disadvantages described above, variations of the generalized inverse method are still widely used in robotics literature for numerical positional control of redundant manipulators – see for example [126, 127, 128] and many others. In Section 3.4, LU decomposition with complete row and column pivoting was used to find the set of primary and secondary constraint equations and split of dependent and independent variables. LU decomposition with complete pivoting can also be used to find δq in Eq. (304). Recall that LU decomposition with complete pivoting yields       T    J pd J pi δq d  −ep  Pvd δq d    J   = =  i sd si  s i δq J J δq i  −es  Pe Pv 

Pep

(309)

where J pd is the largest square, nonsingular submatrix of J. The independent variables q i , may be set to any arbitrary value within practical limits, and a change in dependent variables can be found from the row–space as

δq d = −U −1 L−1 (ep + J pi δq i )

(310)

where L and U are the LU factors of J pd . The variables δq i are those variables that have approximately the greatest mechanical advantage and therefore exert the smallest overall effect on a change in instantaneous configuration. It follows that the independent variables will generally have higher mobility numbers. Uicker et al. [28] suggest a simple method to resolve the redundancy: lacking any additional information, simply set the incremental change in independent variables to zero, and solve the reduced system of equations

δq d = −U −1 L−1 ep

(311)

While Eq. (311) is appealing in that it requires far fewer FLOPS than Eqs. (307) and (308) the approach does not fully exploit the redundancy of the manipulator. In the 7R robot, one of the manipulator’s actuators will always be identified as independent. Therefore 217

if this technique were applied, only 6 out of 7 actuators would be moved at any time. By always setting an incremental change in one of the actuator’s variables to zero, the system effectively acts like a nonredundant 6 dof actuated system. In the next section, it will be shown how mobility numbers can be used to find a solution that is in the spirit of Eqs. (310) and (311) (i.e. it does not require taking the generalized inverse) and is capable of fully exploiting the redundancy of the system so that all 7 degrees of freedom are activated. 11.4.2

Redundancy resolution based on mobility

Mobility numbers, which are related to the instantaneous mobility of a mechanism, are computed numerically from a mechanism’s constraint equations as described in Chapter 9. The technique is applied to a 7R robot, whose shape matrices are provided in Table 32. The joints are labeled sequentially A–G from the base to the end–effector and consist of revolute joints aligned along the y, x, z, x, z, x and y axes respectively. For this numerical example, the joint angles are assigned the values 0.1, 0.2, -0.2, 3.3416, -0.2, 3.3416 and 1.7708 radians respectively. The topology of the 7R robot is traversed to update the absolute position and orientation of the various frames, and the Jacobian matrix for the robot in the configuration shown in Fig. 43 is evaluated as

 0.0000 −0.0134 −0.1356 −0.0256 0.1756 −0.0123   0.0000 0.0200 −0.0000 0.1821 0.0073 −0.3956   0.0000 −0.1343 0.0136 0.1766 −0.0106 −0.1770  J = 0.0000 0.9950 −0.0978 −0.9791 0.0604 0.9981    0.1986 −0.1947 −0.0038 −0.0040 1.0000 0.0000  0.0000 −0.0998 −0.9751 0.0585 0.9981 −0.0604

 −0.0936   −0.0074   0.0040    −0.0584    0.2024   −0.9775

(312)

With the artificial constraint in place at the end-effector, the Jacobian matrix has a maximum possible rank of 6 and an instantaneous mobility of 1. The implication of this is that the robot arm has one extra degree of motion available to it; the robot arm can move a 218

Table 32: Shape matrices for 7R robot body

1

2

3

4

5

6

7

shape matrix   1 0 0 0     0 1 0 0.045 + 1h   Φ =  0 0 1 0    0 0 0 1   1 0 0 0     0 1 0 0.09 + 2h  Φ =   0 0 1 0.02   0 0 0 1   1 0 0 0     0 1 0 0  + 3h  Φ =   0 0 1 −0.085   0 0 0 1   1 0 0 0       0 1 0 0 +  Φ4h =    0 0 1 −0.125   0 0 0 1   1 0 0 0       0 −1 0 0 +  Φ5h =    0 0 −1 −0.085   0 0 0 1   1 0 0 0     0 0 −1 −0.125 + 6h   Φ =  0 1 1 0    0 0 0 1   1 0 0 0     0 1 0 0.085 + 7h   Φ =  0 0 1 0    0 0 0 1

219

linear combination of joints while not violating the artificial constraint at the end–effector. Computing mobility numbers for the robot with the artificial constraint imposed is not particularly useful, as the numbers in this case indicate how well a joint is aligned with a direction orthogonal to a desired end–effector motion. In order to use mobility numbers for the analysis, it is necessary to make the desired trajectory part of the problem statement. Let the motion path of the end–effector be parameterized in terms of position, orientation, velocity and time. A number of approaches to parameterize the smooth motion path of the end-effector through the position, velocity and orientation of a number of control points exist in the literature, including interpolation of cubic Hermite splines [129], spherical interpolation of quaternions [130] (SLERP) or smooth invariant interpolation of rotations [131]. For the sake of this discussion, let us assume that one of these methods is used to parameterize the path and that the instantaneous screw motion of the end–effector, expressed in its local frame coordinates is available for time t. The velocity screw can be expressed as a unit twist by dividing by the magnitude of the translational velocity, kvk for translational motion or the magnitude of angular velocity kωk for angular motion. If the screw motion consists of both a prismatic and angular component, then it will contain a combination of translation and rotation related by a pitch or inverse pitch depending on which translational or rotational variable is selected as independent. In this case the screw motion would be obtained by dividing the twist through by a linear or angular velocity as appropriate. To find the influence coefficient matrix associated with the unit screw expressed in the end–effector frame, it is necessary to transform the screw to the fixed inertial frame. For the present example, assume that sepath in the end–effector frame is given as sepath

n oT = 0.1958 0.9789 0.0587 0.5873 0.0196 0.3916

(313)

where the superscript e indicates that the screw is expressed in the end-effector frame. Transforming the unit velocity, sepath , using the [Ad]ff ee transformation yields the influence coefficient matrix, hpath associated with the path. For further details on the [Ad] transformation or influence coefficient matrices, refer to Chapter 3. The influence coefficient matrix for the

220

path is appended to the 7DOF robot’s Jacobian matrix as J= Inserting numerical values  0.0000 −0.0134    0.0000 0.0200    0.0000 −0.1343  J =  0.0000 0.9950     1.0000 0.0000  0.0000 −0.0998

h

hA hB hC hD hE hF hG hpath

i

(314)

yields −0.1356 −0.0256 −0.0000

0.1821

0.0136

0.1766

−0.0978 −0.9791 0.1986 −0.9751

0.1756 0.0073

−0.0123 −0.0936 −0.3956 −0.0074

−0.0106 −0.1770 0.0604

0.9981

−0.1947 −0.0038 −0.0040 0.0585

0.9981



0.2853 0.6905

0.0040

−0.0611

−0.0584

0.2548

0.2024

−0.6369

             

−0.0604 −0.9775 −0.1671 (315)

An orthonormalized null–space basis for J is obtained as  0.1972 A  −0.2296  B   0.0255 −0.0278  C   −0.3567 −0.7338   0.2346 D  −0.5108  Z=  E   0.1345 −0.5346  0.2115 F   −0.5013  0.1783 G   0.5165  −0.1491 0.0677 path

                     

(316)

Note that the null–space basis in Eq. (316) is not unique. The null–space represents two orthogonal tangent vectors on the mechanism’s constraint manifold, as illustrated in Fig. 44b in red. At this stage, neither tangent vector is aligned with a particular direction of interest, such as the path of the end-effector, which is illustrated with the black dashed arrow in Fig. 44b. The necessary information to align the null space basis to the desired path is found in the last row of Eq. (316). When the tangent–space is aligned with the desired end–effector path, the entry in the first column of the row associated with the path will project fully onto the path. The entry in the second column will be zero. An orthonormal 221

rotation matrix, R can therefore be applied to rotate Z to align the tangent vectors so that one column projects fully onto the path and the other is fully orthogonal to the path. Note that this is not the same matrix R used for QR decomposition in Chapter 9. 



A  0.2906 −0.0847     B  −0.0348 0.0148      C  0.0217 0.8156         −0.9107 −0.4132 D  0.5621 −0.0026  0   = Z = ZR = Z     0.4132 −0.9107 E  −0.3434 0.4313      F  0.5439 0.0146      G  −0.3967 −0.3758     path 0.1638 0.

(317)

The matrix Z 0 now has the first column aligned with the target path. The first (path– control ) column represents a linear combination of the joint velocities A–G that fully project onto the instantaneous screw motion of the path. The second (pose–control ) column of Z 0 represents a tangent vector that is orthogonal to the target path. The linear combination of joint velocities indicated in the second column of Z 0 impart no motion along the target path and maintains the pose of the end–effector. In other words, the combination of quantities in the pose–control column of Z 0 reconfigure the robot while maintaining the artificial constraint at the end–effector. Note that the quantities in the pose–control column of Z 0 are precisely those that would have been obtained from the unaugmented Jacobian in Eq. (306). Once the first column of Z 0 has been aligned to the target path, the entries in the path– control column indicate how to proportion joint velocities to impart maximum transmission onto the path n oT δq path 0 0 0 0 0 0 0 δq A−G = Z[A,0] Z[B,0] Z[C,0] Z[D,0] Z[E,0] Z[F,0] Z[G,0] 0 Z[path,0]

(318)

With this information, a first–order update of the joint variables δqA−G can be applied using Eq. (318) with fewer operations than Eqs. (307) and (308). Furthermore, the heuristic method presented in Chapter 5 can be applied to further improve performance for obtaining 222

(a) A one dimensional mani(b) A two dimensional mani- (c) A two dimensional mani-

fold

fold with an arbitrary set of fold with one tangent vector tangent vectors

aligned with path direction

Figure 44: One and two dimensional manifold visualization the orthonormalized null space, Z 0 . A numerical example demonstrating the efficacy of this technique is beyond of the scope of this thesis. 11.4.3

Future work

In this section, a few observations are provided to inform future work on the application of mobility numbers for redundancy analysis. Equation (318) represents the greedy algorithm for obtaining an incremental update of joint variables δq A−G at time t0 . Equation 318 is greedy in the sense that it maximizes transmission of joint variables A–G onto the desired trajectory at time interval t0 with no consideration of the state of the mechanism at the next time interval(s). Note that it is possible the optimal projection onto the desired trajectory at one time interval may put the robot in a position where one or more of the joints have a low ability to impart transmission onto the desired trajectory at a later time interval. Optimal robot path tracking (i.e. following a specified path while minimizing an objective function) has been investigated by many researchers. Various objective functions that have been investigated in the past include minimizing jerk in the motors, minimizing control efforts or minimizing total energy expenditure. The problem has been investigated using dynamic programming [132, 133], numerical integration methods [134, 135] and convex optimization [136, 137]. Many authors have investigated 5 and 6 dof robots, but for the 7R robot, the size of the problem becomes enormous. It is expected that mobility analysis may serve as an important tool in optimal path tracking. 223

To understand how mobility analysis may play a role in optimal path tracking, consider the significance of the various terms in Z 0 . For most of the examples presented in Chapter 10, the mechanisms had a limited number of active joints. The remaining joints were passive, i.e no forces were actively applied in them. When analyzing the mobility numbers in the previous examples, the mobilities associated with only the actively actuated inputs, and sometimes a limited number of passive outputs, were of primary interest. In general, the mobility analysis was carried out to ensure that the limited number of active inputs maintained sufficiently high mobility throughout the operating range to achieve successful actuation of the linkages. In that sense, the 7R robot is different because every joint is actuated; it has no passive joints. To ensure that every joint is actively engaged in moving the end–effector along the trajectory, it is desired that all entries in the first column of Z 0 remain relatively uniform with adequate mobility to participate in the desired motion. Therefore one possible objective function for optimal path–tracking based on mobility would be to maximize the minimum entry in the first path–control column of Z 0 . Furthermore, consider the application of the simple rotation matrix used to align the null–space basis Z to the trajectory in Eq. (317). Simply by applying another rotation matrix, it is possible to explore variations from the greedy algorithm. With no rotation matrix applied to the entries of Z 0 , the quantities represent an optimal projection onto the path. Applying another rotation matrix for a small incremental rotation to Z 0 yields a linear combination of incremental joint displacements in the first column of Z 00 that vector the transmission of some joint displacments orthogonal to the desired trajectory. Therefore, by applying a rotation matrix to Z 0 it is possible to efficiently obtain any linear combination of joint variables that project fully onto the desired path, completely orthogonal to the path, or any linear combination thereof. Projecting incremental joint displacements orthogonal to the desired trajectory does no work to actively move the end–effector, but allows the robot to reconfigure itself to put it into a potentially better position for transmission at later time intervals. Therefore, application of a rotation matrix, such as in Eq. (317), may serve as a simple and powerful tool for exploiting redundancy in the 7R robot.

224

As mentioned above, the size of the optimization problem for the 7R robot is enormous. Due to the redundancy in the 7R robot, at each discrete time interval, there are an infinite number of possible variations of joint variables that track the desired trajectory. In that sense the problem can be considered an infinite graph with an infinite number of choices at each discrete time interval. One strategy that could be employed to make the optimization problem tractable is to sample the graph — for example, by evaluating the objective function at the next time interval for three candidates: i.e. the optimal projection (the entries in the first column of Z 0 ) and a ±10 degree deviation away from the optimal projection (which can be obtained by applying a ±10 degree rotation matrix to Z 0 ). The choice of which combination of incremental displacements to apply at time t0 could be based on the evaluation of the objective function for the three choices at the next time interval, t1 . Using this simple method, it is possible to choose the linear combination of joint variables that maximizes the objective function for the next time interval. It is of course possible to extend the concept to look ahead multiple time intervals. If each of the three variations at time interval t1 are evaluated, there are 9 candidates to evaluate for time t2 , 27 for time t3 and so on. The number of evaluations rapidly grows with the number of branches to investigate at each time interval and the number of look–ahead steps. If m is the number of look–ahead steps and n is the number of branches to sample P i 0 at a given time interval, then m+1 i=1 n evaluations of the constraint equations and Z are required for the first time interval t0 . The path along the sampled graph that maximizes the objective function for the next m steps is chosen at time interval t0 . Once a step is chosen, the path–tracking algorithm increments forward from time interval t0 to t1 . The branches of the sampled graph that were not selected at t0 are pruned from the graph and added as leaf nodes to the end of the graph. Then at time t1 it is only necessary to evaluate the leaf nodes of the sampled graph, requiring only nm+1 evaluations of Z 0 for subsequent time steps. Since the nodes of the sampled graph can be repurposed, the process requires a fixed P i memory footprint of m+1 i=1 n nodes. The topology of the sampled graph can be compactly represented using a binary tree and parent–child list, which was presented in Chapter 5. The

225

method described is similar to receding–horizon model–based predictive control strategies, such as presented in [138, 139, 140] and elsewhere. Note that the look–ahead method described requires a large number of evaluations of the constraint equations and the orthonormalized null–space bases, Z and Z 0 , for a given time interval, and therefore stands to benefit greatly from the heuristic application of generalized coordinate partitioning, which was presented in Chapter 5. Further performance gains may be obtained using parallelization, noting that the constraint equations, Z and Z 0 can be evaluated for each of the leaf nodes independently requiring no synchronization. The look– ahead method and objective functions based on maximizing mobility will be investigated in future research.

11.5

Summary

This chapter has demonstrated several computationally efficient and practical applications of mobility numbers applied to complex mechanisms. Additionally, the groundwork for future applications of mobility numbers applied to optimal path–tracking has been laid. In this and previous chapters, a new software tool with a command–line and graphical user interface was used to generate and visualize the results. The next chapter describes some unique aspects of the software and datastructures.

226

12

Software Design

A new general–purpose software tool was developed to demonstrate the numerical algorithms described in this thesis. This chapter is intended to provide an overview of some of the features of the software. Note that providing a comprehensive user’s manual for the software is beyond the scope of this thesis. All numerical algorithms discussed in the previous chapters were written in the C programming language and compiled as a static library. In this way, the numerical algorithms are available as an Application Programming Interface (API), and can easily be incorporated into other applications. At present, the C API has been incorporated into a simple command line interface written in the C programming language and a graphical user interface (GUI) using the C++ programming language, OpenGL and the Qt windowing framework. This chapter describes some unique attributes of the data structures and programming methods used in the code.

12.1

Design Criteria

The following design criteria for the software, graphical user interface and datastructures were identified: • The user interface should be separated from the core functionality, thereby allowing alternative front-ends to be programmed with no changes to the core code-base. • Newly developed external file formats should be compatible with modern web technologies. • Internal and external datastructures should allow representation of mechanisms and multibody systems as a hierarchy of assemblies and subassemblies, thereby facilitating rapid model development and design reuse. • Traversing the topology of a mechanism to update kinematics and dynamics should require minimal logic and computational overhead. • The user interface and file formats should be familiar to engineers. 227

• The software should be extensible, allowing user–defined functions to feed back forces into a running simulation. To keep the user interface separate from the core functionality, the bulk of the code is written in C and compiled as a static library as an Application Programming Interface (API). As discussed above, the command–line tool and graphical user interface use the C API to import models and deploy simulations. In this way, the core code can be readily adapted to different purposes. A new file format to describe the topology, geometry and inertia properties of mechanisms and multibody systems was developed in support of this research. The file format uses the popular Javascript Object Notation (JSON) [141] syntax, which is based on a subset of the Javascript programming language. Because of the close association with the Javascript programming language, the JSON syntax is well–suited to exchange of multibody system topological data over the web. Furthermore, parsers for the JSON syntax exist for many programming languages, enabling advanced applications for power users such as scripting of model creation and simulation deployment. Users of modern CAD software will be familiar with the concept of organizing their models as a hierarchy of assemblies and subassemblies. The file format and software support the ability to arrange bodies, joints, lights and force modules in an arbitrary user–defined hierarchy. To reduce the overhead and logic required to update the kinematics and dynamics of complex multibody systems, several techniques are implemented. One of the techniques used in the present work is referred to as equivalent graph representation. Other techniques that are used to reduce complex logic for updating the kinematics and dynamics of a multibody system include the use of memoization to record complex logic for later play–back, the application of a custom kinematics library that uses the minimum number of operations to carry out spatial transformations (see Chapter 3 for more details) and the automatic identification of Kinematic Substructures to reduce the complexity in processing a system’s constraint equations (see Chapter 5). 228

The graphical user–interface is inspired by modern CAD software, an essential tool in an engineer’s workflow. For example, the familiar model–tree pane at the left of the 3D visualization viewport provides an overview of the user’s assembly hierarchy. Additionally, many user–friendly and context–aware menus allow easily inspecting, editing or visualizing the parameters specified in the JSON input file. Finally, the software includes preliminary support for force modules which allow users to feed back forces into a running simulation between two frames or within a joint as a function of position, velocity or acceleration between frames or within joints. Many of these design criteria are discussed in more detail below.

12.2

File Format and Data Structures

In modern CAD software, an engineer creates complex mechanisms by specifying a number of bodies and connecting them together with joints. The joints may be assigned a sense or direction. Groups of bodies and joints may be arranged in assemblies and subassemblies, which facilitates rapid model development and design reuse. When engineers specify a model in this way, they have created a directed graph of their system, with bodies as graph nodes and joints as graph edges. If one or more closed kinematic loops exist in the system, the directed graph may contain one or more cycles. A hierarchical directed graph with possible cycles is a natural representation to a CAD user, but updating kinematic and dynamic quantities requires processing and eliminating the cycles. More specifically, updating the topology requires identifying the root–oriented directed graph or spanning tree of the system. The spanning tree consists of the subset of graph edges required to touch each node exactly once with the sense of all joints oriented away from the root of the spanning tree. This subset of graph edges is often referred to as the arcs of the directed graph. The remaining graph edges are referred to as the chords of the directed graph. Furthermore, information about which assembly a body or joint belongs to is not necessary to traverse the topology. The hierarchical, possibly cyclic directed graph representation of a system’s topology is a 229

natural representation for the engineer — and closely mirrors how physical mechanisms are assembled. Bodies and joints can be specified in any order and logically grouped to organize complex systems into more primitive subcomponents. Furthermore, joints may be oriented directions that makes sense to the user, without concern for alignment with the spanning tree. Therefore, the external file format used in the software is based on the hierarchical, possibly cyclic directed graph representation. 12.2.1

External File Format

Generally, CAD software allows exporting models as a mesh, i.e. a representation of part geometry as a collection of points connected by patched faces. Common mesh-based file formats include STL, OBJ, PLY and others. Other more sophisticated data formats, such as IGES, Parasolid and STEP, support representation of part geometry as complex curves and surfaces. Refer to [142] for an overview. These formats typically account only for part geometry and provide no information regarding overall system topology. While limited support is now becoming available for exporting/exchanging topological data between commercial CAD packages, most CAD programs do not support exporting topology information to open, universal formats, such as COLLADA [143] or X3D [144]. Importing and exporting topology information from CAD software is complicated by the fact that commercial CAD software uses proprietary and/or binary file formats. Another reason that export to COLLADA and X3D may not be widely supported is that these formats were originally created for general-purpose computer graphics applications, not necessarily for the specific requirements of CAD. For example, embedded in the COLLADA and X3D file formats is the notion of a scene graph or spanning tree, which is used to represent the relative configuration of bodies in a scene. While the spanning tree is necessary for processing the topology, as discussed above, it is not a natural representation for a CAD user, who is accustomed to adding bodies to a scene and connecting them together with arbitrarily directed joints. Another disadvantage to the X3D and COLLADA file specification for describing multi-

230

body systems is that they are general–purpose formats with many attributes not directly related to describing mechanism topology. Additionally, the file formats are based on the XML (eXtensible Markup Language) specification, which is more verbose and less readable than other file format specifications. For these reasons, the author has elected to design a new file format tailored to mechanisms and multibody systems. The file format is simpler and less verbose than many of the existing text–based file formats, and is based on the more natural directed graph with possible cycles. The new file format designed in support of this research is based on the ASCII text–based JSON (Javascript Object Notation) specification, a subset of the Javascript programming language [141]. JSON is a lightweight, programming language independent syntax that is faster to process than XML [145]. JSON is a common data exchange format for web services and is a native file format representation in NoSQL databases such as CouchDB and MongoDB, and is therefore well-suited for applications such as exchange of multibody system topological data over the web. The JSON language format has a number of advantages. JSON files are programming language independent – parsers exist in many programming languages. In this work, the Jansson C library for reading/writing JSON files [146] is used to facilitate loading JSON files. An advantage to the simple syntax used in the file is that it allows power users to manually generate their models using a text editor or to generate models in a scripted fashion using their programming language of choice. For example, all of the various tire validation models described in Chapter 7 were generated using Python. Using only a few lines of Python code, all variations in normal load and steering angle were generated, the tire simulations were initiated, and the data was collected and read in for plotting, which demonstrates the power and flexibility of the JSON format. In the file designation, model topology is specified as a hierarchical directed graph with possible cycles. The JSON file contains a list of bodies connected together by joints. The bodies and joints are organized into a hierarchy of assemblies and subassemblies. In this way, generating a JSON file to describe the mechanism topology does not burden the user with

231

finding the spanning tree of the mechanism, aligning the sense of joints with the spanning tree, or identifying chord loops in their mechanisms. The representation of a directed graph with possible cycles serves to increase simulation throughput and reduce the possibility of human operator error. In addition to topological data, the JSON file contains information about units, simulation parameters, and body mass and inertia properties. Furthermore, a number of coordinate frames can be defined for each body. Part geometry and appearance is specified in an external mesh–based file [142]. The Open Asset Import Library, which supports a number of mesh–based file formats [147], is used to load part geometry and appearance information for rendering purposes. A simple double–pendulum model, as illustrated in Fig. 45 is used to demonstrate the JSON syntax.

Figure 45: Double Pendulum Model

The accompanying JSON file is shown below: 1

3

{ " model_parameters " : { " version " : " 0.2 " , " units " : {

232

5

7

9

11

13

15

17

19

21

23

25

27

29

31

33

35

37

39

41

43

45

47

49

51

53

55

57

59

61

" angle " : " radians " , " length " : " meters " , " mass " : " kilograms " , " time " : " seconds " }, " gravity " : [ 0.0 , 0.0 , 9.81 ], " include_dir " : [ " models / pendulum / " , " examples / " ], " simulation " : { " integration_type " : " ode113b " , " time_span " : 100 , " time_step " : 0.03 , " relative_error " : 1e -9 , " absolute_error " : 1e -9 } }, " fixed " : { " frames " : [ { " name " : " overhead_1 " , " T " : [6.0 , -6.0 , -6.0] , " R " : [[1.0 , 0.0 , 0.0] , [0.0 , 1.0 , 0.0] , [0.0 , 0.0 , 1.0]] }, { " name " : " overhead_2 " , " T " : [ -6.0 , -6.0 , -6.0] , " R " : [[1.0 , 0.0 , 0.0] , [0.0 , 1.0 , 0.0] , [0.0 , 0.0 , 1.0]] } ], " lights " : [ { " name " : " overhead_1 " , " type " : " point " , " parent " : [ " fixed " , " overhead_1 " ] , " Ka " : [0.4 , 0.4 , 0.4] , " Kd " : [0.6 , 0.6 , 0.6] , " Ks " : [0.2 , 0.2 , 0.2] }, { " name " : " overhead_2 " , " type " : " point " , " parent " : [ " fixed " , " overhead_2 " ] , " Ka " : [0.6 , 0.6 , 0.6] , " Kd " : [0.6 , 0.6 , 0.6] , " Ks " : [0.2 , 0.2 , 0.2] } ] },

233

63

65

67

69

71

73

75

77

79

81

83

85

87

89

91

93

95

97

99

101

103

105

107

109

111

113

115

117

119

" assemblies " : [ { " name " : " pendulum " , " active " : true , " bodies " : [ { " name " : " link1 " , " definition " : [ " link . obj " , " link . mtl " ], " center_of_mass " : [ 0.0 , 0.0 , -0.5 ], " mass " : 1.0 , " moment _of_in ertia " : { " Izx " : 0.0 , " Ixx " : 1.0 , " Ixy " : 0.0 , " Izz " : 1.0 , " Iyy " : 1.0 , " Iyz " :0.0 }, " frames " : [ { " name " : " distal " , " T " : [0.0 , 0.0 , -1.0] } ] }, { " name " : " link2 " , " definition " : [ " link . obj " , " link . mtl " ], " center_of_mass " : [ 0.0 , 0.0 , -0.5 ], " mass " : 1.0 , " moment _of_in ertia " : { " Izx " : 0.0 , " Ixx " : 1.0 , " Ixy " : 0.0 , " Izz " : 1.0 , " Iyy " : 1.0 , " Iyz " :0.0 } } ], " joints " : [ { " name " : " A_fixed_link1 " , " body_frame_pairs " : [

234

[

121

" fixed " , " origin "

123

], [

125

" link1 " , " origin "

127

] ], " type " : " Rx " , " i n i t i a l _ d i s p l a c e m e n t " : 1.5707963267948966 , " initial_velocity " : 1000.0

129

131

}, {

133

" name " : " A_fixed_link1 " , " body_frame_pairs " : [ [ " link1 " , " distal " ], [ " link2 " , " origin " ] ], " type " : " Ry " , " initial_displacement ": 0, " initial_velocity " : 1000.0

135

137

139

141

143

145

147

}

149

] }

151

] 153

}

../../src/models/pendulum/double pendulum.json The JSON file contains a number of key–value pairs. In each key–value sequence, the key is indicated in quotations, followed by a colon and the value. The value may consist of a string in quotations, a real number, an array of real numbers or a number of other key–value pairs. For more information on the JSON syntax, refer to [141]. The various aspects of the JSON file are described in detail below: The "fixed" key at the root level of the JSON file represents the fixed inertial reference frame. The "fixed" key can be considered a special assembly that is not a member of any other assembly. It does not contain any bodies or joints – it only contains frames and lights. Frames are coordinate systems represented as 3x3 rotation matrices and translation vectors from the origin of the inertial reference frame, sometimes referred to as shape matrices [28]. If the "fixed" field is not present, a fixed inertial reference frame with only an "origin" 235

reference frame is created automatically by the program. If the user wishes to specify a number of spot or directional static "lights", they may be attached to a frame under the "fixed" key. Alternatively, dynamic lights can be specified by attaching the light to a frame on a moving body. If no lights are specified, a directional overhead light in the direction aligned with the gravity vector is automatically created for rendering purposes. Each entry under the "bodies" key contains a unique name, a number of shape matrices and mass and inertia properties. Body geometry is specified in an external mesh–based file format under the "definition" key. All mesh–based file formats supported by the Assimp Open Asset Import library are supported; refer to the ASSIMP documentation for more information [147]. Each entry under the "frames" key contains a unique name and a shape matrix specified as a 3x3 SO3 rotation matrix and a translation vector from the body’s origin. As with the "fixed" inertial reference frame, it is not necessary to specify an "origin" reference frame. By default every body contains the "origin" reference frame. Therefore, the name "origin" is a reserved keyword that cannot be used as a reference frame name at any level. The "joints" key contains connections between frames. Each joint has a unique name, a type, an initial position and velocity, and a body–frame pair. Currently the following joint types are supported: primitive rotations, primitive translations, rigid connections, 3DOF spherical and 6DOF float joints. Revolute joints are specified using the strings "Rx", "Ry", and "Rz". Primitive translations are specified using the strings "Tx", "Ty", and "Tz". Zero DOF rigid connections are specified with the "rigid" string. 3DOF spherical and 6DOF float joints are specified using the "spherical" and "float" strings respectively. Float joints would be used to initially position a free–floating body that is not explicitly attached to the inertial reference frame. Support for compound joints such as planar, universal or geared joints is planned for a later version of the file format. The "assemblies" key contains one or more groupings of key–value pairs containing any number of bodies, joints or force modules. In the double pendulum example, bodies "link1"

236

and "link2" are assigned to the "double pendulum" assembly. Although the pendulum consists of a single assembly, multiple assemblies can be inserted at the top level, and any number of subassemblies can be inserted into other assemblies by specifying the "assembly" key within an assembly. Assemblies and subassemblies can be nested arbitrarily deep. Joints in one assembly can connect bodies from different assemblies/subassemblies by specifying the full path to the body in the assembly hierarchy using an array of descriptive names.

For example, the array of strings ["factory", "workcell1", "robot2",

"body1", "frame1"], indicates the frame "frame1" on the body "body1" in the "robot2" subassembly, which is contained within the "workcell1" subassembly, which is in turn a member of the "factory" main assembly. If no path is specified (i.e. a single string, rather than an array of strings specified in brackets, is specified), the program searches for descriptive names only within the local assembly. Note that the sense of the joint need not align with the spanning tree or scene graph of the mechanism. To highlight this feature, in the pendulum example, the second joint is oriented opposite of the model’s root-oriented graph. Orienting the sense of joints opposite of the root–oriented graph would cause formulation problems for software based on the notion of a spanning tree, but the software developed in support of this research has no problem loading this representation of the system topology, and the original, user–defined sense is maintained for reporting purposes. Although not shown in the JSON file for the pendulum model, every assembly, body, joint, frame or force module in the JSON file supports an optional "active" key. If the "active" key is set to false, loading of the assembly, body, joint, frame or force module is suppressed. The ability to easily suppress parts of the model with minimal changes to the file is useful for debugging purposes and design sensitivity analysis. 12.2.2

Internal Data Structures

When the JSON file is loaded from disk into working memory at program run–time, it is imported into a linked-list datastructure. The descriptive names in the body-frame pairings are matched to set the appropriate memory pointers. The linked-list representation is used

237

primarily for programming convenience, as it allows allocating memory for model primitives on an as-needed basis. For example, the linked-list enables a user to interactively add or remove objects from a model. Note that the linked–list representation of a model is not used for computationally intensive/demanding tasks – aspects which are discussed later in this chapter. Models are defined using the following primitives: • assemblies linked–list – force modules linked–list – bodies linked–list ∗ frames linked–list – joints linked–list – lights linked–list – subassemblies linked–list where each assembly in the subassemblies linked–list contains 0 or more assemblies, each containing a number of force modules, bodies, joints, lights, etc. In addition to the data specified in the JSON file, a body datastructure includes a reference frame. The reference frame of each body is set automatically by the software and may not coincide with the origin or mass centroid of the body. The reference frame is chosen as the proximal frame that is closest to the root of the spanning tree automatically by the software. Furthermore, inertia properties and all other frames on the body are transformed so that they are expressed with respect to the new reference frame automatically by the software. This preprocessing step simplifies traversal of the graph of the multibody system, and the process is transparent to the user of the software. Additionally, each body contains geometric data loaded from the model geometry definition which is used for the purpose of rendering, such as the vertices, faces, colors and textures.

238

The frame datastructure contains SO3 matrices and translation vectors specifying relative orientation and translation from the body’s origin and reference frame. In addition to the terms specified in the JSON file, a joint datastructure contains properties necessary for processing the equations of motion, such as the influence coefficient matrices [16], and current joint positions, velocities and accelerations.

12.3

Force Modules

The JSON file contains a key for "force modules". Force modules allow feeding forces into the model as a function of an input, which may consist of a displacement, velocity or acceleration within a joint or of any reference frame in the system, expressed either in the fixed inertial reference frame or the local coordinate system. The output is a force applied in a joint or between two frames. Support for more elaborate force modules involving multiple inputs and outputs, such as detailed contact models between two parts, is planned for a future version of the software. Springs, dampers and friction are examples of a simple type of force module. An example of a spring is specified below. When the JSON file is parsed, the type of the force module is read in to determine what keys to access for the particular type of force module. The force module has a unique name, a type, knot points, coefficients, and an input and ouput, which may consist of either a joint name or a body–frame name. Linear springs and dampers may be specified as either associated with a translational joint variable or as a body-frame pair. In the latter case, an additional field for "distance type" may be specified. If no distance type is specified, the program will assume the Euclidian distance between parent and child frames should be used. Otherwise, it is possible to set the distance as aligned with the x, y or z axis of the parent frame as: 1

3

5

7

" force_modules " : [ { " name " : " example " , " type " : " stiffness " , " input " : { " type " : " frame " , " body_frame_pair " : [ [ " body1 " , " 1 h_plus " ] ,

239

[ " body2 " , " origin " ]

9

] } " distance_type " : " Tz " , " knot_points " : [0] , " coefficients " : [[0 , 0.5 , 2.0] , [0]]

11

13

}

15

]

A spring, damper or frictional restraint may be set as a piecewise, nonlinear function of the joint’s displacement or velocity by setting a number of knot points and arrays of coefficients. The number of coefficient arrays must be one greater than the number of knot points. In the example above, a single knot point is set with a value of 0. The coefficients in the first block are used to evaluate the function before the knot point. The coefficients in the second block are used to evaluate the function after the knot point. More complicated functions are possible by specifying more knot points and more arrays of coefficients. In the above example, the piecewise function evaluates to:    0 + 0.5q + 2.0q 2 f (q) =   0

q b od y F un c t io n P oi n t er ( body ) ; // update shape matrices 4

// process children for ( int i = 0; i < body - > childCount ; i ++) { body - > childJoints [ i ] - > j o i n t F u n c t i o n P o i n t e r ( body - > childJoint [ i ]) ; R e c u r s i v e T r e e T r a v e r s a l ( body - > childJoint [ i ] - > childBody ) ; }

6

8

10

}

12

// then call the recursive function as Body * fixed ; R e c u r s i v e T r e e T r a v e r s a l ( fixed ) ;

14

Note, however, that there are a number of disadvantages with this approach: • Any properties related to Chords D and E are not updated. For example, additional logic would be required to process influence coefficient matrices associated with joints D and E. • This approach involves many pointer dereferences that the compiler will not be able to optimize away; there is no way for the compiler to know the body-joint-body sequence before the model definition is loaded from the JSON file at run–time. • Many algorithms that use functional recursion are slower than nonrecursive implementations due to increased number of stack frame allocations. • Functional recursion requires greater memory useage on the stack than other methods for traversing a graph. 244

• Traversing a linked–list datastructure in fragmented memory is cache-inefficient. • By removing joints D and E from the graph, the user’s original directed graph with possible cycles is lost. 12.4.2

Equivalent Graph Representation

To address these disadvantages, a different approach to process and update the system’s topology is taken. The user’s original directed graph with possible cycles is maintained, and a second spanning tree is created. Each time the user adds or removes a body or joint, memory for the second graph is freed and a new graph is generated automatically. Instead of removing joints from the user’s original directed, possibly cyclic graph representation, additional massless virtual bodies (shown with octagons) are added and constraints (shown as dashed blue lines) are imposed as shown in Fig. 47.

Figure 47: Spanning–tree representation An important aspect in this process is that memory is allocated for quantities required to update the mechanism’s topology based on the total number of active bodies, joints and frames in the system. Data are then copied by value from their position in the directed graph representation to the spanning tree representation and the memory pointers in each of the bodies and joints are updated to point to the new locations. Quantities required to process the kinematics and dynamics of the system are now stored in contiguous, closely-packed memory in a struct of arrays (SOAs) datastructure, which improves data locality and serves to minimize cache misses [149]. A memory profiler was used to ensure that no memory leaks exist when the second directed graph representation is created or destroyed. 245

If the preprocessor detects that the sense of the joint should be flipped from the user’s directed graph representation to align with the spanning tree (as in the pendulum example), the inverse transformation is set in the appropriate position in the function pointer array, so as to maintain the user’s originally defined sense. This is particularly useful for force modules, as the direction of the force applied will always follow the convention of the sense of the joint defined by the user, requiring no logic to flip the sign of the computed forces according to the sense of the joint. The equivalent graph representation is completely transparent to the user. The user can access body and joint parameters through their orginal linked–list, directed graph representation within each assembly using the C-API or through the graphical user interface. The assemblies serve to organize the pointers to bodies, joints, frames, devices and other objects, but are not present in the second spanning–tree representation. Another advantage to the equivalent graph representation is that data exists in only one location. As the kinematics and dynamics of the mechanism are updated, there is no need to copy data by value to a second datastructure. As both the directed graph and spanning–tree representations are simultaneously present in computer memory, multiple output formats are possible. For example, a user could make changes to the directed graph with cycles in the graphical user interface and save the updated results back to disk in the JSON file format described above. Alternatively, the spanning– tree representation of the mechanism could be used to generate a file format that requires a scene–graph, such as a COLLADA or X3D – although writing an export module is beyond the scope of the present work.

12.5

Memoization

As noted above, functional recursion could be used to traverse the topology of a complex mechanism. However, as discussed above, due to the increased number of stack–frame allocations, many algorithms that use functional recursion have lower performance than algorithms that use iterative methods. Modern compilers are capable of converting recursive code to 246

iterative methods if the layout of data in computer memory is known at the time of compilation [150]. However, in general–purpose multibody dynamics software, the topology of the system is loaded from an external file at run–time, and the compiler has no knowledge of the layout of the topology at the time of compilation. Fortunately, the use of functional recursion for traversing the topology of a complex mechanism can be avoided altogether. In the software, an iterative method is used to traverse the topology. Traversing the topology using iteration requires sorting the bodies and joints into monotonically increasing level order within computer memory as described in Chapter 5. In a multiloop system, once the virtual bodies are added as described above, the number of bodies and joints are equal and traversing the topology is simplified. Using the function pointer method described above, the joint and body function pointers can be called as: 2

int n ; Body ** bodyPointers = malloc ( n * sizeof ( Body *) ) ; Joint ** joints = malloc ( n * sizeof ( Joint *) ) ;

4

6

8

10

for ( int i = 0; i < n ; i ++) { joints [ i ] - > functionPointer ( joints [ i ]) ; // // bodies [ i ] - > functionPointer ( body [ i ]) ; // // }

update across update across

relative properties parent joint relative properties body

The technique performs a breadth–first traversal of the spanning–tree and eliminates the need for functional recursion. The logic for processing the system topology can be even further simplified. For brevity, the exact details of each body function pointer are not shown, but note that the function requires updating one or more displacements across a body, which requires a number of pointer dereferences and secondary loop constructs to book-keep the operations. The approach taken in the present work is to generate a first function pointer array for updating both joints and bodies and a second data pointer array, which indicates memory addresses where each function pointer should read and write the data to and from: 2

int n ; void * functionPointer ( double ** dataPointer ) ; double *** dataPointer ;

247

This method significantly reduces the logic required to traverse the spanning tree, as illustrated in the following code: 1

3

for ( int i = 0; i < n ; i ++) { functionPointer [ i ]( dataPointer [ i ]) ; }

Once the operations required to update the graph are determined, the steps are memoized as arrays of function, source and destination pointers to minimize logic required to traverse the spanning tree of the mechanism. This process is also sometimes referred to as automata based programming [151]. Dynamically building a set of instructions from an external model definition is a concept used by other model–based programming languages such as the Unified Modeling Language (UML) [152], but to the author’s knowledge, the technique has not been applied to simplify graph traversal of a multibody system.

12.6

Graphical User Interface

An example mechanism is shown in a screen shot below. The assembly hierarchy, displayed at the left of the viewport in the model pane, is generated using functional recursion from the linked–list assembly hierarchy representation described above. The forward kinematics of the mechanism in the viewport is updated using memoized arrays of function, source and destination memory pointers. Many of the menus are context–aware. For example, clicking on a joint in the model overview pane at the left brings up tools to inspect or change the properties of the joint. In addition to the model pane at the left of the display, coordinate systems and joints can be visualized in the 3D graphics window to assist the user in the model setup process. The bodies that comprise the model are rendered in OpenGL using Phong GLSL shaders written by the author.

248

Figure 48: Screenshot of graphical user interface

12.7

Summary

This chapter has highlighted some of the unique attributes of the software developed in support of this research, including file formats, data structures, programming techniques and aspects of a fully–functional graphical user interface.

249

References [1] M.A. Chace. “DAMN-a prototype program for the Dynamic Analysis of Mechanical Networks”. In: Proceedings of the 7th Design Automation Workshop. ACM. 1970, pp. 158–172. [2] M.A. Chace and J.C. Angell. Interactive simulation of machinery with friction and impact using Dram. Tech. rep. SAE Technical Paper, 1977. [3] P.N. Sheth and J.J. Uicker. “IMP (Integrated Mechanisms Program), a computeraided design analysis system for mechanisms and linkage”. In: Journal of Manufacturing Science and Engineering 94.2 (1972), pp. 454–464. [4] N. Orlandea, M.A. Chace, and D.A. Calahan. “A sparsity-oriented approach to the dynamic analysis and design of mechanical systems, Part 1”. In: Journal of Engineering for Industry 99.3 (1977), pp. 773–779. [5] N. Orlandea, D.A. Calahan, and M.A. Chace. “A sparsity-oriented approach to the dynamic analysis and design of mechanical systems, Part 2”. In: Journal of Engineering for Industry 99.3 (1977), pp. 780–784. [6] R.C. Smith and E.J. Haug. “DADS - Dynamic analysis and design system”. In: Multibody systems handbook. Springer, 1990, pp. 161–179. [7] R. Featherstone. “A divide-and-conquer articulated-body algorithm for parallel O (log (n)) calculation of rigid-body dynamics. Part 2: Trees, loops, and accuracy”. In: The International Journal of Robotics Research 18.9 (1999), pp. 876–892. [8] R. Featherstone. “The calculation of robot dynamics using articulated-body inertias”. In: The International Journal of Robotics Research 2.1 (1983), pp. 13–30. [9] K.S. Anderson. “An order n formulation for the motion simulation of general multirigid-body constrained systems”. In: Computers & structures 43.3 (1992), pp. 565– 579.

250

[10] K.S. Anderson and J.H. Critchley. “Improved Order-N Performance Algorithm for the Simulation of Constrained Multi-Rigid-Body Dynamic Systems”. In: Multibody system dynamics 9.2 (2003), pp. 185–212. [11] R. Featherstone. Rigid Body Dynamics Algorithms. New York, NY, USA: SpringerVerlag New York, Inc., 2008. isbn: 9780387743141. [12] P.N. Sheth. “A digital computer based simulation procedure for multiple degree of freedom mechanical systems with geometric constraints”. PhD thesis. University of Wisconsin–Madison, 1972. [13] R.A. Wehage. “Generalized Coordinate Partitioning in Dynamic Analysis of Mechanical Systems”. PhD thesis. University of Iowa, 1981. [14] Grigore G. “Mobility of mechanisms: a critical review”. In: Mechanism and Machine Theory 40.9 (2005), pp. 1068–1097. issn: 0094-114X. doi: http://dx.doi.org/10. 1016/j.mechmachtheory.2004.12.014. [15] J. M. Selig. Geometric Fundamentals of Robotics. Monographs in Computer Science. Springer, 2005. isbn: 9780387208749. [16] R.M. Murray, Z. Li, and S.S. Sastry. A Mathematical Introduction to Robotic Manipulation. Taylor & Francis, 1994. isbn: 9780849379819. [17] R. Featherstone and D. Orin. “Robot dynamics: equations and algorithms”. In: ICRA. 2000, pp. 826–834. [18] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, 1996. isbn: 9780801854149. [19] C. Lomont. “Introduction to Intel advanced vector extensions”. In: Intel White Paper (2011). [20] S. Blackford et al. “An updated set of basic linear algebra subprograms (BLAS)”. In: ACM Transactions on Mathematical Software 28.2 (2002), pp. 135–151. [21] S. Taylor. Intel integrated performance primitives. Intel Press, 2004. 251

[22] J.L. Hennessy and D.A. Patterson. Computer architecture: a quantitative approach. Elsevier, 2011. [23] J. Mellor-Crummey, D. Whalley, and K. Kennedy. “Improving memory hierarchy performance for irregular applications using data and computation reorderings”. In: International Journal of Parallel Programming 29.3 (2001), pp. 217–247. [24] R.S. Wright and B. Lipchak. OpenGL superbible. Sams, 2004. [25] M. Chasles. “Note sur les propri´et´es g´en´erales du syst`eme de deux corps semblables entr’eux et plac´es d’une mani`ere quelconque dans l’espace; et sur le d´eplacement fini ou infiniment petit d’un corps solide libre”. In: Bulletin des Sciences Math´ematiques, F´erussac 14 (1830), pp. 321–26. [26] L. Poinsot. Sur la composition des momens et la composition des aires. 1806. [27] R. S. Ball. A Treatise on the Theory of Screws. reprinted in 1998. Cambridge university press, 1900. [28] J.J. Uicker, B. Ravani, and P.N. Sheth. Matrix Methods in the Design Analysis of Mechanisms and Multibody Systems. Cambridge University Press, 2013. [29] T. Hawkins. Emergence of the theory of Lie groups: an essay in the history of mathematics 1869–1926. Springer Science & Business Media, 2012. [30] F.C. Park, J.E. Bobrow, and S.R. Ploen. “A Lie group formulation of robot dynamics”. In: The International Journal of Robotics Research 14.6 (1995), pp. 609–618. [31] J.K. Davidson and K.H. Hunt. Robots and screw theory: applications of kinematics and statics to robotics. Oxford University Press, 2004. [32] M. Yuan and F. Freudenstein. “Kinematic Analysis of Spatial Mechanisms by Means of Screw Coordinates — Part 2: Analysis of Spatial Mechanisms”. In: vol. 93. ASME. 1970, pp. 67–73. [33] W. Clifford. Mathematical Papers. London: Macmillan, 1882. [34] A.A. Shabana. Computational Dynamics, 3rd Edition. Wiley, 2010. isbn: 9780470686157. 252

[35] L.W. Tsai. Robot analysis: the mechanics of serial and parallel manipulators. John Wiley & Sons, 1999. [36] F. Freudenstein. “Dynamic Analysis of Mechanisms Using Screw Coordinates”. In: Journal of Engineering for Industry (1971), p. 273. [37] I. Newton. Philosophiae Naturalis Principia Mathematica. 1687. [38] L. Euler. Nova methodus motum corporum rigidorum determinandi. 1775. [39] J. Lagrange. Mecanique analytique. 1788. [40] R.E. Roberson and J. Wittenburg. “A dynamical formalism for an arbitrary number of interconnected rigid bodies, with reference to the problem of satellite attitude control”. In: International Federation of Automatic Control, Congress, 3 Rd, London, England. 1966, p. 1966. [41] J. Wittenburg. Dynamics of multibody systems. Springer Science & Business Media, 2007. [42] J.G. de Jalon et al. “Dynamic analysis of three-dimensional mechanisms in natural coordinates”. In: Journal of Mechanical Design 109.4 (1987), pp. 460–465. [43] H. Hemami and F.C. Weimer. “Modeling of nonholonomic dynamic systems with applications”. In: Journal of Applied Mechanics 48.1 (1981), pp. 177–182. [44] C.W. Gear, B. Leimkuhler, and G.K. Gupta. “Automatic integration of Euler-Lagrange equations with constraints”. In: Journal of Computational and Applied Mathematics 12 (1985), pp. 77–90. [45] R. Serban et al. “A Topology-Based Approach for Exploiting Sparsity in Multibody Dynamics: Cartesian Formulation”. In: Journal of Structural Mechanics 25.3 (1997), pp. 379–396.

253

[46] P. E. Nikravesh. “Some Methods for Dynamic Analysis of Constrained Mechanical Systems: A Survey”. English. In: Computer Aided Analysis and Optimization of Mechanical System Dynamics. Ed. by E.J. Haug. Vol. 9. NATO ASI Series. Springer Berlin Heidelberg, 1984, pp. 351–368. isbn: 978-3-642-52467-7. [47] J. Unda et al. “A comparative study on some different formulations of the dynamic equations of constrained mechanical systems”. In: Journal of Mechanical Design 109.4 (1987), pp. 466–474. [48] F.E. Udwadia and R.E. Kalaba. “A new perspective on constrained motion”. In: Proceedings: Mathematical and Physical Sciences (1992), pp. 407–410. [49] J.J. Uicker, J. Denavit, and R.S. Hartenburg. “An iterative method for the displacement analysis of spatial mechanisms”. In: Journal of Applied Mechanics, ASME Transactions (1964), pp. 309–14. [50] T.R. Kane and C.F. Wang. “On the derivation of equations of motion”. In: Journal of the Society for Industrial & Applied Mathematics 13.2 (1965), pp. 487–492. [51] G.A. Maggi. “Principii della Teoria Matematica del Movimento Dei Corpi: Corso di Meccanica Razionale”. In: Ulrico Hoepli (1896). [52] G.A. Maggi. “Di alcune nuove forme delle equazioni della dinamica applicabili ai systemi analonomi.” In: Atti Accad. Naz. Lincei Rend. Cl. Fis. Mat. X (1901), pp. 287– 291. [53] J. Neimark and N. Fufaev. “Dynamics of Nonholonomic Systems (Translations of Mathematical Monographs Vol. 33 Providence: Am. Math. Soc.)” In: (1972). [54] M. Borri, C. Bottasso, and P. Mantegazza. “Equivalence of Kane’s and Maggi’s equations”. In: Meccanica 25.4 (1990), pp. 272–274. [55] R.A. Wehage and E.J. Haug. “Generalized Coordinate Partitioning for Dimension Reduction in Analysis of Constrained Dynamic Systems”. In: Journal of Mechanical Design 104 (1982), pp. 247–255.

254

[56] A.M. Turing. “Rounding–off errors in matrix processes”. In: Quart. J. of Mech. Appl. Math 1 (1948), pp. 287–308. [57] N. K. Mani, E. J. Haug, and K. E. Atkinson. “Application of singular value decomposition for analysis of mechanical system dynamics”. In: Journal of Mechanical Design 107.1 (1985), pp. 82–87. [58] S.S. Kim and M.J. Vanderploeg. “QR decomposition for state space representation of constrained mechanical dynamic systems”. In: Journal of Mechanical Design 108.2 (1986), pp. 183–188. [59] S. Seshu and M.B. Reed. Linear graphs and electrical networks. Addison-Wesley Reading, Mass., 1961. [60] A. M¨ uller and Z. Terze. “Geometric methods and formulations in computational multibody system dynamics”. In: Acta Mechanica 227.12 (2016), pp. 3327–3350. [61] L. A. Pars. A Treatise on Analytical Dynamics. Ox Bow Press, 1979. [62] K. Lilly and D. Orin. “Efficient O(n) Recursive Computation of the Operational Space Inertia Matrix”. In: IEEE Transactions on Systems, Man and Cybernetics 23.5 (1993). [63] G. Rodriguez, A. Jain, and K. Kreutz-Delgado. “A Spatial Operator Algebra for Manipulator Modeling and Control”. In: Int. J. Robotics Research 10.4 (1991), pp. 371– 381. [64] J.G. de Jal´on, A. Callejo, and A.F. Hidalgo. “Efficient solution of Maggifffdfffdfffds equations”. In: Journal of computational and nonlinear dynamics 7.2 (2012), p. 021003. [65] R.A. Wehage and W.Y. Loh. “Application of SVD to independent variable definition in constrained mechanical systems”. In: ASME Dyn Sys Control Div Publ DSC, ASME, New York, NY,(USA), 1993 52 (1993), pp. 71–79.

255

[66] T. A. Loduha and B. Ravani. “On First–Order Decoupling of Equations of Motion for Constrained Dynamical Systems”. In: Journal of Applied Mechanics 62 (1995), pp. 216–222. [67] S. Acer. “A Recursive Graph Bipartitioning Algorithm by Vertex Separators with Fixed Vertices for Permuting Sparse Matrices into Block Diagonal Form With Overlap”. PhD thesis. Bilkent University, 2011. [68] B. U¸car and C. Aykanat. “Partitioning sparse matrices for parallel preconditioned iterative methods”. In: SIAM Journal on Scientific Computing 29.4 (2007), pp. 1683– 1709. [69] D.V. Steward. “Partitioning and tearing systems of equations”. In: Journal of the Society for Industrial & Applied Mathematics, Series B: Numerical Analysis 2.2 (1965), pp. 345–365. ¨ V. C [70] C. Aykanat, A. Pinar, and U. ¸ ataly¨ urek. “Permuting sparse rectangular matrices into block-diagonal form”. In: SIAM Journal on Scientific Computing 25.6 (2004), pp. 1860–1879. [71] R.J. Cave and J.F. Stanton. “Block diagonalization of the equation-of-motion coupled cluster effective Hamiltonian: Treatment of diabatic potential constants and triple excitations”. In: The Journal of Chemical Physics 140.21 (2014), p. 214112. [72] J.A. Bondy and U.S.R. Murty. “Graph Theory. 2008”. In: Grad. Texts in Math (2008). [73] T.H. Cormen. Introduction to Algorithms, 3rd Edition: MIT Press, 2009. isbn: 9780262033848. [74] I.S. Duff and J.K. Reid. “The design of MA48: a code for the direct solution of sparse unsymmetric linear systems of equations”. In: ACM Transactions on Mathematical Software (TOMS) 22.2 (1996), pp. 187–226. [75] S. Toledo. “Improving the memory-system performance of sparse-matrix vector multiplication”. In: IBM Journal of research and development 41.6 (1997), pp. 711–725.

256

[76] M. Gonz´alez et al. “On the effect of linear algebra implementations in real-time multibody system dynamics”. In: Computational Mechanics 41.4 (2008), pp. 607–615. [77] A.F. Hidalgo, F.J. Garc´ıa de Jal´on, and S. Tapia Fernandez. “High Performance Algorithms and Implementations Using Sparse and Parallelization Techniques on MBS”. In: Proc. of ECCOMAS Thematic Conference, Brussels, Belgium. 2011. [78] C.R. Whaley and A. Petitet. “Minimizing development and maintenance costs in supporting persistently optimized BLAS”. In: Software: Practice and Experience 35.2 (Feb. 2005), pp. 101–121. [79] R. Mittra and C. Klein. “The use of pivot ratios as a guide to stability of matrix equations arising in the method of moments”. In: Antennas and Propagation, IEEE Transactions on 23.3 (May 1975), pp. 448–450. issn: 0018-926X. doi: 10.1109/TAP. 1975.1141067. [80] F. Zhang. The Schur complement and its applications. Vol. 4. Springer, 2006. [81] R. Featherstone. “Efficient factorization of the joint-space inertia matrix for branched kinematic trees”. In: The International Journal of Robotics Research 24.6 (2005), pp. 487–500. [82] G. Strang. Linear Algebra and Its Applications. 3rd ed. Brooks Cole, Feb. 1988. isbn: 0155510053. [83] L.F. Shampine and M. K. Gordon. Computer solution of ordinary differential equations: the initial value problem. WH Freeman San Francisco, 1975. [84] J. Baumgarte. “Stabilization of constraints and integrals of motion in dynamical systems”. In: Computer methods in applied mechanics and engineering 1.1 (1972), pp. 1–16. [85] G. Ostermeyer. “On Baumgarte stabilization for differential algebraic equations”. In: Real-Time Integration Methods for Mechanical System Simulation. Springer, 1990, pp. 193–207.

257

[86] J.C. Chiou, J Y. Yang, and Shuen-De Wu. “Stability analysis of baumgarte constraint stabilization technique in multibody dynamic systems”. In: Journal of guidance, control, and dynamics 22.1 (1999), pp. 160–162. [87] G.P. Ostermeyer. Die Stabilisierung und Ersten Integralen als Regelungsproblem und Ihre Konsequenzen. private communication. 1983. [88] C.O. Chang and P.E. Nikravesh. “An adaptive constraint violation stabilization method for dynamic analysis of mechanical systems”. In: Journal of Mechanisms, Transmissions, and Automation in Design 107.4 (1985), pp. 488–492. [89] P. Flores et al. “A parametric study on the Baumgarte stabilization method for forward dynamics of constrained multibody systems”. In: Journal of computational and nonlinear dynamics 6.1 (2011), p. 011019. [90] T. Arponen, S. Piipponen, and J. Tuomela. “Kinematic analysis of Bricardfffdfffdfffds mechanism”. In: Nonlinear Dynamics 56.1-2 (2009), pp. 85–99. [91] M. Blundell and D. Harty. The Multibody Systems Approach to Vehicle Dynamics. Elsevier Science, 2014. isbn: 9780080994284. [92] H. Pacejka. Tire and vehicle dynamics. Elsevier, 2005. [93] E. Fiala. Seitenkrafte am rollenden luftreifen’Z. VDI bd. 96, no. 29. 1954. [94] MSC.Software Corporation. ADAMS/Tire Manual. 2015. [95] M. Taylor. “Technical Report TR-2015-13”. In: (2015). ¨ [96] H. Alt. “Der Ubertr¨ agungswinkel und seine Bedeutung f¨ ur das Konstruieren periodischer Getriebe”. In: Werkstattstechnik 26.S. (1932), pp. 61–64. [97] J. Hirschhorn. Kinematics and Dynamics of Plane Mechanisms. McGraw–Hill, 1962. [98] R.S. Hartenberg and J. Denavit. Kinematic Synthesis of Linkages. McGraw–Hill, 1962. [99] K. Hain. Applied Kinematics. McGraw–Hill, 1967.

258

[100] F. Dyson. Principles of Mechanisms. Oxford University Press, 1928. [101] W.G. Green. Theory of Machines. Blackie and Son, 1955, p. 772. [102] D. Lent. Analysis and Design of Mechanisms. Prentice Hall, 1961, p. 51. [103] M.J. Tsai and H.W. Lee. “The transmissivity and manipulability of spatial mechanisms”. In: Journal of Mechanical Design 116.1 (1994), pp. 137–143. [104] F. Freudenstein and L.S. Woo. “Kinematic Analysis of Spatial Mechanisms by Means of Screw Coordinates. Part 2 - Analysis of Spatial Mechanisms”. In: Journal of Engineering for Industry (1971), p. 67. [105] R. S. Ball. A Treatise on the Theory of Screws. Cambridge University Press, 1900. [106] G. Sutherland and B. Roth. “A transmission index for spatial mechanisms”. In: Journal of Manufacturing Science and Engineering 95.2 (1973), pp. 589–597. [107] J.E. Holte and T.R. Chase. “A force transmission index for planar linkage mechanisms”. In: Proceedings of the ASME Mechanisms Conference. 1994, pp. 377–386. [108] C. Chen and J. Angeles. “Generalized transmission index and transmission quality for spatial linkages”. In: Mechanism and Machine Theory 42.9 (2007), pp. 1225–1237. [109] C.C. Lin and W.T. Chang. “The force transmissivity index of planar linkage mechanisms”. In: Mechanism and Machine Theory 37.12 (2002), pp. 1465–1485. issn: 0094-114X. doi: http://dx.doi.org/10.1016/S0094- 114X(02)00070- 8. url: http://www.sciencedirect.com/science/article/pii/S0094114X02000708. [110] W.T. Chang, C.C. Lin, and J.J. Lee. “Force transmissibility performance of parallel manipulators”. In: Journal of Robotic Systems 20.11 (2003), pp. 659–670. [111] J. Denavit et al. “Velocity, acceleration, and static-force analyses of spatial linkages”. In: Journal of Applied Mechanics 32.4 (1965), pp. 903–910. [112] F. Wu and H.M. Lankarani. “New parameter for transmission quality and output sensitivity analysis of mechanisms.” In: 22 nd Biennial Mechanisms Conference. 1992, pp. 103–109. 259

[113] J. Lee, J. Duffy, and M. Keler. “The optimum quality index for the stability of inparallel planar platform devices”. In: Journal of Mechanical Design 121.1 (1999), pp. 15–20. [114] Y. Zhang. “Quality index and kinematic analysis of spatial redundant in-parallel manipulators”. PhD thesis. University of Florida, 2000. [115] T. Yoshikawa. “Manipulability of robotic mechanisms”. In: The international journal of Robotics Research 4.2 (1985), pp. 3–9. [116] D.C. Yang, J. Xiong, and X.D. Yang. “A simple method to calculate mobility with Jacobian”. In: Mechanism and Machine Theory 43.9 (2008), pp. 1175–1185. [117] T.F. Coleman and D.C. Sorensen. “A note on the computation of an orthonormal basis for the null space of a matrix”. In: Mathematical Programming 29.2 (1984), pp. 234–242. [118] J.C. Klann. Walking device. US Patent 6,260,862. July 2001. url: https://www. google.com/patents/US6260862. [119] E. Guizzo and E. Ackerman. “How rethink robotics built its new baxter robot worker”. In: IEEE Spectrum (2012). [120] K. Kreutz-Delgado, M. Long, and H. Seraji. “Kinematic analysis of 7-DOF manipulators”. In: The International journal of robotics research 11.5 (1992), pp. 469–481. [121] D.E. Whitney. “Resolved motion rate control of manipulators and human prostheses.” In: IEEE Transactions on man-machine systems (1969). [122] Y. Nakamura and H. Hanafusa. “Inverse kinematic solutions with singularity robustness for robot manipulator control”. In: Journal of dynamic systems, measurement, and control 108.3 (1986), pp. 163–171. [123] C.W. Wampler and L.J. Leifer. “Applications of damped least-squares methods to resolved-rate and resolved-acceleration control of manipulators”. In: Journal of Dynamic Systems, Measurement, and Control 110.1 (1988), pp. 31–38. 260

[124] H. Seraji and R. Colbaugh. “Improved configuration control for redundant robots”. In: Journal of Robotic Systems 7.6 (1990), pp. 897–928. [125] J. Wang, Y. Li, and X. Zhao. “Inverse kinematics and control of a 7-dof redundant manipulator based on the closed-loop algorithm”. In: International Journal of Advanced Robotic Systems 7.4 (2010), pp. 1–9. [126] F. Branz et al. “Kinematics and control of redundant robotic arm based on dielectric elastomer actuators”. In: SPIE Smart Structures and Materials+ Nondestructive Evaluation and Health Monitoring. International Society for Optics and Photonics. 2015, pp. 943023–943023. [127] A. M¨ uller. “A robust inverse dynamics formulation for redundantly actuated PKM”. In: 13th World Congress in Mechanism and Machine Science, Guanajuato, Mexico. 2011, pp. 19–25. [128] J.M. Hollerbach and K.C. Suh. “Redundancy resolution of manipulators through torque optimization”. In: Robotics and Automation, IEEE Journal of 3.4 (1987), pp. 308–316. [129] X.F. Zha and H. Du. “Generation and simulation of robot trajectories in a virtual CAD-based off-line programming environment”. In: The International Journal of Advanced Manufacturing Technology 17.8 (2001), pp. 610–624. [130] K. Shoemake. “Animating rotation with quaternion curves”. In: ACM SIGGRAPH computer graphics. Vol. 19. 3. ACM. 1985, pp. 245–254. [131] F.C. Park and B. Ravani. “Smooth invariant interpolation of rotations”. In: ACM Transactions on Graphics (TOG) 16.3 (1997), pp. 277–295. [132] K. Shin and N. McKay. “Minimum-time control of robotic manipulators with geometric path constraints”. In: IEEE Transactions on Automatic Control 30.6 (1985), pp. 531–541.

261

[133] M. Oberherber, H. Gattringer, and A. Mueller. “Successive dynamic programming and subsequent spline optimization for smooth time optimal robot path tracking”. In: MECHANICAL SCIENCES 6.2 (2015), pp. 245–254. [134] J.E. Bobrow, S. Dubowsky, and J.S. Gibson. “Time-optimal control of robotic manipulators along specified paths”. In: The international journal of robotics research 4.3 (1985), pp. 3–17. [135] F. Pfeiffer and R. Johanni. “A concept for manipulator trajectory planning”. In: IEEE Journal on Robotics and Automation 3.2 (1987), pp. 115–123. [136] D. Verscheure et al. “Time-optimal path tracking for robots: A convex optimization approach”. In: IEEE Transactions on Automatic Control 54.10 (2009), pp. 2318–2327. [137] T. Ardeshiri et al. “Convex optimization approach for time-optimal path tracking of robots with speed dependent constraints”. In: IFAC Proceedings Volumes 44.1 (2011), pp. 14648–14653. [138] J. Bellingham, A. Richards, and J.P. How. “Receding horizon control of autonomous aerial vehicles”. In: Proceedings of the 2002 American Control Conference (IEEE Cat. No. CH37301). Vol. 5. IEEE. 2002, pp. 3741–3746. [139] J.A. Primbs, V. Nevistic, and J.C. Doyle. “A receding horizon generalization of pointwise min-norm controllers”. In: IEEE Transactions on Automatic Control 45.5 (2000), pp. 898–909. [140] W.B. Dunbar. “Distributed receding horizon control of dynamically coupled nonlinear systems”. In: IEEE Transactions on Automatic Control 52.7 (2007), pp. 1249–1263. [141] T. Bray. The JavaScript Object Notation (JSON) Data Interchange Format. 2014. [142] K. McHenry and P. Bajcsy. “An overview of 3d data content, file formats and viewers”. In: National Center for Supercomputing Applications 1205 (2008). [143] Khronos Group. Collada 3D Asset Exchange Schema. Sept. 19, 2015. url: https: //www.khronos.org/collada/. 262

[144] Web3D Consortium. What is X3D? Sept. 20, 2015. url: http://www.web3d.org/. [145] N. Nurseitov et al. “Comparison of JSON and XML Data Interchange Formats: A Case Study.” In: Caine 9 (2009), pp. 157–162. [146] P. Lehtinen. Jansson Documentation. 2.6. ReadTheDocs, 2014. [147] A. Gessler et al. Assimp Open Asset Import Library. Aug. 19, 2015. url: http : //assimp.sourceforge.net/. [148] K.T. Wehage, R.A. Wehage, and B. Ravani. “Generalized coordinate partitioning for complex mechanisms based on kinematic substructuring”. In: Mechanism and Machine Theory 92 (2015), pp. 464–483. [149] R.E. Bryant and D.R. O’Hallaron. Computer systems: a programmer’s perspective. Vol. 2. Prentice Hall Upper Saddle River, 2003. [150] M.W. Bailey and N.C. Weston. Performance benefits of tail recursion removal in procedural languages. Tech. rep. Tech. Rep. TR-2001-2, Hamilton College, Clinton, NY, 2001. [151] T.L. Booth. Sequential machines and automata theory. Vol. 3. Wiley New York, 1967. [152] J. Rumbaugh, I. Jacobson, and G. Booch. Unified Modeling Language Reference Manual, The. Pearson Higher Education, 2004.

263