FAST AND OPTIMAL SOLUTION FOR THE GENERALIZED ATTITUDE DETERMINATION PROBLEM

by

Arnab Ghosh 05 August 2013

A thesis submitted to the Faculty of the Graduate School of State University of New York at Buffalo in partial fulfillment of the requirements for the degree of

Master of Science

Department of Mechanical & Aerospace Engineering

To my advisor, colleagues and the other professors of my department: thank you all for the wonderful knowledge and valuable support without which this thesis would not have been possible.


Acknowledgment I would like to thank all those people who contributed to the success of this thesis work.

First and foremost a special thanks for my advisor Dr. John L. Crassidis for providing me a challenging topic to work on. His help and patience through both the productive and unproductive phases of this research is appreciated. Additionally I would also like to thank Dr. Yang Cheng and Dr. Manoranjan Majji for their contributions towards my thesis.


Contents

Preamble
  Dedication
  Acknowledgment
  List of Figures
  List of Tables
  Abstract

1 Introduction
  1.1 Attitude Determination
  1.2 The Basic Idea
  1.3 Underdetermined or Overdetermined?
  1.4 Attitude Measurements
  1.5 Coordinate Systems
  1.6 Rotation Matrix
  1.7 Attitude Representations
    1.7.1 Direction Cosine Matrix (DCM)
    1.7.2 The Euler Angles
    1.7.3 The Unit Quaternions
  1.8 Attitude Kinematics

2 Sensors and Measurement Models
  2.1 Star Sensor Fundamentals
  2.2 Collinearity Equations
  2.3 Unit Vector Form
  2.4 Rank-One Update Approach

3 Review of Attitude Determination Methods
  3.1 Constrained Least Squares Approach
  3.2 Single-Frame Quaternion Estimators
    3.2.1 QUEST
    3.2.2 Singular Value Decomposition (SVD)
    3.2.3 Estimator Of The Optimal Quaternion (ESOQ)
    3.2.4 Second Estimator Of The Optimal Quaternion (ESOQ2)
  3.3 Filtering Approaches

4 Numerical Solutions
  4.1 Review of Minimization
  4.2 Quaternion Based Solution
  4.3 Attitude Matrix Based Solution
  4.4 Analytical Solution
  4.5 Newton's Solution
  4.6 Solve for Initial State
    4.6.1 Solve for Initial Quaternion
    4.6.2 Solve for Initial DCM
    4.6.3 Basics of Quadratically Constrained Quadratic Programming (QCQP)
    4.6.4 Basics of Quadratic Matrix Programming (QMP)
    4.6.5 Basics of Semidefinite Relaxation
    4.6.6 Basics of Semidefinite Programming
    4.6.7 SDR Method for Calculating DCM

5 Simulation Results
  5.1 Results from Analytical Approach
  5.2 Results from Newton's Approach
    5.2.1 Cramér-Rao Inequality
    5.2.2 Monte Carlo Simulation for Newton's Approach
  5.3 Comparison of Computational Efforts

6 Conclusion

Bibliography

List of Figures

2.1 Model for the Effective Focal Length
5.1 Comparisons of Roll Error for the Newton's Quaternion Approach
5.2 Comparisons of Pitch Error for the Newton's Quaternion Approach
5.3 Comparisons of Yaw Error for the Newton's Quaternion Approach
5.4 Comparisons of Roll Error for the Newton's DCM Approach
5.5 Comparisons of Pitch Error for the Newton's DCM Approach
5.6 Comparisons of Yaw Error for the Newton's DCM Approach

List of Tables

5.1 List of All Quaternions Evaluated by HOM4PS 2.0

Abstract

This thesis provides a fast and optimal approach to solving Wahba's general weighted problem. Applications of Wahba's general weighted problem include attitude determination using wide field-of-view sensors or GPS sensor observations. The first approach developed applies a homotopy-continuation-based solver to find the global minimizer of the minimization problem: it first finds all the stationary points of the minimization problem by solving the polynomial equations the stationary points satisfy, and then chooses the global minimizer from among them. Next, a second approach is discussed, based on solving an unconstrained optimization problem using an iterative Newton solution. The parameter space includes the attitude and the Lagrange multipliers. The initial guess for Newton's method is calculated by approximating Wahba's general weighted cost and constraints with a convex form and applying semidefinite programming to obtain a globally optimal attitude. Monte Carlo simulation results indicate that the method converges quickly to the optimal solution.


Chapter 1

Introduction

Space, the final frontier, has always been a novel and rich proposition for people across the world. Since Sputnik, a 58 cm diameter polished metal sphere, orbited the Earth in low Earth orbit in 1957, many communities of intellectuals have come to believe that humans can reach beyond Earth's atmosphere. The space industry around the world has been booming for the past two decades. There have been tremendous technological breakthroughs in all fields, from communication protocols, solar power generation and imaging to attitude determination and many more. The reach of space technology, envisioned earlier only for government and military use, has broadened, and new technologies are tested by space agencies, private organizations and even universities. Due to the ever-increasing computational power within reach of all, and the miniaturization of electronic components and devices, several small-satellite projects are under way at universities across the globe. Most of them are used for research purposes and for studying new technologies. Educational institutions and universities are engaging more and more in space technology to provide hands-on experience to students and prepare them for forthcoming space challenges. The University at Buffalo's Glint Analyzing Data Observation Satellite (GLADOS) program aims at analyzing space debris. Among the various technologies involved, attitude determination is one of the important areas of study, because it is the system responsible for the orientation of the spacecraft in space. The main tasks in this area are the development of new sensors and onboard software.

Attitude determination is the technique by which the orientation of the spacecraft is determined.


The spacecraft has to be correctly pointed in space for various reasons: the solar panels must point toward the Sun, the antenna toward the Earth, and several other sensors in specified directions for proper functioning. This is one of the important tasks of a complete space mission. The attitude is determined from the direction toward which the spacecraft is currently pointing (a reference point) and the previously known coordinates of that reference point in the inertial frame, obtained from catalogs or other systems. Knowledge of the coordinates in both the body frame and the inertial frame gives a unique attitude solution for the spacecraft. Three-axis attitude determination is a computationally complex task, and more so when the spacecraft has no a priori attitude information. Several sensors are used to determine attitude, such as Sun sensors, star trackers, magnetometers, gyroscopes, etc. Of all these sensors, the star tracker produces the best attitude estimate, even though the complexity of its hardware and software is high.

1.1 Attitude Determination

Any modern or intelligent device that does its work with minimal or no human interaction requires two basic hardware components: sensors and actuators. This field of study is generically called control theory. Sensors are used to sense or measure the state of the system, and actuators are used to adjust the state of the system. For example, a steam engine used to move locomotives has a centrifugal governor (sensor) and a connection to a steam exhaust valve (actuator). The control system keeps the pressure inside the container at an optimum level by comparing a reference pressure to the measured pressure, and releases excess pressure when it rises above the optimum value. A spacecraft attitude determination and control system (ADCS) typically uses a variety of sensors and actuators. Because attitude is represented by at least three variables, the difference between the desired and measured states is slightly more complicated than for a governor, or even for the position of the spacecraft. Furthermore, the mathematical analysis of attitude determination is complicated by the fact that attitude determination is necessarily either underdetermined or overdetermined.

1.2 The Basic Idea

Attitude determination is done by using a combination of sensors, pre-installed catalogs (such as star catalogs or other means) and mathematical models to collect vector components in the body frame and the corresponding vector components in the inertial reference frame. These vector components are used in one of several different algorithms to determine the attitude, typically in the form of a quaternion, Euler angles, or a rotation matrix. It takes at least two non-collinear vectors to estimate the attitude. For example, an attitude determination system might use a Sun vector s̃ and a magnetic field vector m̃. A Sun sensor measures the components of s̃ in the body frame, s_b, while a mathematical model of the Sun's apparent motion relative to the spacecraft is used to determine the components in the inertial frame, s_i. Similarly, a magnetometer measures the components of m̃ in the body frame, m_b, while a mathematical model of the Earth's magnetic field relative to the spacecraft is used to determine the components in the inertial frame, m_i. An attitude determination algorithm is then used to find a rotation matrix A such that

s_b = A s_i   and   m_b = A m_i    (1.1)

The attitude determination analyst needs to understand how various sensors measure the body-frame components, how mathematical models are used to determine the inertial-frame components, and how standard attitude determination algorithms are used to estimate the matrix A.
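One of the standard algorithms alluded to above is the deterministic TRIAD construction, which builds A directly from one Sun-vector pair and one magnetic-field pair. A minimal numpy sketch (the function name and conventions are illustrative, not taken from this thesis):

```python
import numpy as np

def triad(s_b, m_b, s_i, m_i):
    """Deterministic two-vector attitude solution (TRIAD).

    Builds an orthonormal triad from each frame's vector pair and
    returns A such that b = A r; the first vector is trusted exactly.
    """
    t1b = s_b / np.linalg.norm(s_b)
    t2b = np.cross(s_b, m_b)
    t2b /= np.linalg.norm(t2b)
    t3b = np.cross(t1b, t2b)

    t1i = s_i / np.linalg.norm(s_i)
    t2i = np.cross(s_i, m_i)
    t2i /= np.linalg.norm(t2i)
    t3i = np.cross(t1i, t2i)

    # A maps inertial components to body components: [body triad][inertial triad]^T
    return np.column_stack((t1b, t2b, t3b)) @ np.column_stack((t1i, t2i, t3i)).T
```

With noise-free measurements this construction recovers A exactly; with noisy measurements it favors the first (Sun) vector, which is one motivation for the optimal estimators reviewed in Chapter 3.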

1.3 Underdetermined or Overdetermined?

In the previous section we claimed that at least two vectors are required to determine the attitude. We need three independent parameters to determine the attitude of the spacecraft, but a unit vector contains only two independent parameters because of the unit-norm constraint. A single vector measurement therefore provides only two of the three required scalars, while two vector measurements provide four, one more than needed. Attitude determination is thus unique in that one measurement is not enough, i.e., the problem is underdetermined, and two measurements are too many, i.e., the problem is overdetermined. The primary implication of this observation is that all attitude determination algorithms are really attitude estimation algorithms [1].

1.4 Attitude Measurements

There are two basic classes of attitude sensors: the first class makes absolute measurements, whereas the second class makes relative measurements.

Absolute measurement sensors are based on the fact that knowing the position of a spacecraft in its orbit makes it possible to compute the vector directions with respect to an inertial frame, of certain astronomical objects, and of the force lines of the Earth’s magnetic field. Absolute measurement sensors measure these directions with respect to a spacecraft or body-fixed reference frame. By comparing the measurements with the known reference directions in an inertial reference frame we are able to determine (at least approximately) the relative orientation of the body frame with respect to the inertial frame. Absolute measurements are used in the static attitude determination algorithms developed in this thesis.

Relative measurement sensors belong to the class of gyroscopic instruments, including the rate gyro and the integrating gyro. Classically, these instruments have been implemented as spinning disks mounted on gimbals; however, modern technology has brought such marvels as ring laser gyros, fiber optic gyros, and hemispherical resonator gyros. Relative measurement sensors are used in the dynamic attitude determination algorithms.

1.5 Coordinate Systems

We consider the relationships between data expressed in two different coordinate systems:

1. The world coordinate system is fixed in inertial space.
2. The body-fixed coordinate system is rigidly attached to the object whose attitude we would like to determine.

1.6 Rotation Matrix

A rotation matrix is a matrix whose multiplication with a vector rotates the vector while preserving its length. An orthogonal matrix A satisfies

\det A = \pm 1 \quad \text{and} \quad A^{-1} = A^T    (1.2)

Rotation matrices for which det A = 1 are called proper, and those for which det A = −1 are called improper. The proper 3 × 3 rotation matrices form the special orthogonal group, denoted SO(3); if A ∈ SO(3), then det A = +1. Improper rotations are also known as rotoinversions and consist of a rotation followed by an inversion operation. We restrict our analysis to proper rotations, as improper rotations are not rigid-body transformations.
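The two defining properties above, orthogonality and det A = +1, are easy to verify numerically. A small sketch (the helper name is illustrative):

```python
import numpy as np

def is_proper_rotation(A, tol=1e-10):
    """True if A is orthogonal (A^-1 = A^T) with det A = +1."""
    orthogonal = np.allclose(A @ A.T, np.eye(3), atol=tol)
    return orthogonal and np.isclose(np.linalg.det(A), 1.0, atol=tol)

# A reflection is orthogonal but improper: det = -1, not a rigid-body rotation
reflection = np.diag([1.0, 1.0, -1.0])
```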

We reference the elements of a rotation matrix as follows:

A = \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}    (1.3)

There are two possible conventions for defining the rotation matrix that encodes the attitude of a rigid body and both are in current use. Some authors prefer to write the matrix that maps from the body-fixed coordinates to the world coordinates; others prefer the matrix that maps from the world coordinates to the body-fixed coordinates. Though converting between the two conventions is as trivial as performing the transpose of a matrix, it is necessary to be sure that two different sources are using the same convention before using results from both sources together.

We define the rotation matrix that encodes the attitude of a rigid body to be the matrix that, when pre-multiplied by a vector expressed in the world coordinates, yields the same vector expressed in the body-fixed coordinates. That is, if z ∈ A³ is a vector in the world coordinates and z′ ∈ A³ is the same vector expressed in the body-fixed coordinates, then the following relations hold:

z′ = A z    (1.4a)
z = Aᵀ z′    (1.4b)

These expressions apply to vectors, relative quantities lacking a position in space. To transform a point from one coordinate system to the other we must subtract the offset to the origin of the target coordinate system before applying the rotation matrix. Thus, if x ∈ A³ is a point in the world coordinates and x′ ∈ A³ is the same point expressed in the body-fixed coordinates, then we may write

x′ = A(x − x_b) = A x + x′_w    (1.5a)
x = Aᵀ(x′ − x′_w) = Aᵀ x′ + x_b    (1.5b)

Substituting x = 0 into Eq. (1.5a) and x′ = 0 into Eq. (1.5b) yields

x′_w = −A x_b    (1.6a)
x_b = −Aᵀ x′_w    (1.6b)

1.7 Attitude Representations

In this section a short description of the available attitude representation methods, along with the basics of the rigid-body kinematics equations, is presented. The quaternion representation and the related kinematics equations are emphasized due to their extensive use in the modeling and simulation of the system. The attitude of a rigid body can be described in terms of rotating the body coordinate system to align with another one. The following methods can be applied to describe these kinds of rotations.

1.7.1 Direction Cosine Matrix (DCM)

A rotation matrix may also be referred to as a direction cosine matrix, because the elements of this matrix are the cosines of the unsigned angles between the body-fixed axes and the world axes. Denoting the world axes by (x, y, z) and the body-fixed axes by (x′, y′, z′), let θ_{x′,y} be, for example, the unsigned angle between the x′-axis and the y-axis. In terms of these angles, the rotation matrix may be written

A = \begin{bmatrix} \cos\theta_{x',x} & \cos\theta_{x',y} & \cos\theta_{x',z} \\ \cos\theta_{y',x} & \cos\theta_{y',y} & \cos\theta_{y',z} \\ \cos\theta_{z',x} & \cos\theta_{z',y} & \cos\theta_{z',z} \end{bmatrix}    (1.7)

1.7.2 The Euler Angles

The most common way to represent the attitude of a rigid body is a set of three Euler angles. These are popular because they are easy to understand and easy to use. Some sets of Euler angles are so widely used that they have names that have become part of the common parlance, such as the roll, pitch and yaw of an airplane.

Euler angles define a series of three successive rotations about the body coordinate axes in order to achieve the desired orientation. Since there are twelve possible ways to arrange the three axes sequentially, there are twelve different possible sets of Euler angles. In this text the orientation is described by roll ϕ (rotation about the x axis), pitch θ (rotation about the y axis) and yaw ψ (rotation about the z axis). In this case a rotation can be represented by the following set of rotation matrices, considering the 1-2-3 Euler angle sequence [2]:

C_1(\varphi) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\varphi & \sin\varphi \\ 0 & -\sin\varphi & \cos\varphi \end{bmatrix}    (1.8)

C_2(\theta) = \begin{bmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{bmatrix}    (1.9)

C_3(\psi) = \begin{bmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix}    (1.10)

The product of these three matrices is the rotation matrix between the initial and final orientations of the body and is a function of all three Euler angles.
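For the 1-2-3 sequence the composed attitude matrix is A = C3(ψ) C2(θ) C1(ϕ). A minimal sketch of Eqs. (1.8)–(1.10) and their product (function names are illustrative):

```python
import numpy as np

def C1(phi):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[1, 0, 0], [0, c, s], [0, -s, c]], dtype=float)

def C2(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0, -s], [0, 1, 0], [s, 0, c]], dtype=float)

def C3(psi):
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]], dtype=float)

def attitude_123(phi, theta, psi):
    """Attitude matrix for the 1-2-3 Euler sequence: roll, then pitch, then yaw."""
    return C3(psi) @ C2(theta) @ C1(phi)
```

Each factor is a proper rotation, so the product is as well, regardless of the angles.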

An interesting case for the attitude matrix occurs when the Euler angles are small, so that the cosine of each angle is approximately one and the sine is approximately the angle itself. In this case the attitude matrix is adequately approximated by

A \approx \begin{bmatrix} 1 & \psi & -\theta \\ -\psi & 1 & \phi \\ \theta & -\phi & 1 \end{bmatrix} = I_{3\times 3} - [\boldsymbol{\alpha} \times]    (1.11)

where α ≡ [ϕ θ ψ]ᵀ, I_{3×3} is a 3 × 3 identity matrix and [α×] is referred to as a cross-product matrix, with

[\boldsymbol{\alpha} \times] \equiv \begin{bmatrix} 0 & -\alpha_3 & \alpha_2 \\ \alpha_3 & 0 & -\alpha_1 \\ -\alpha_2 & \alpha_1 & 0 \end{bmatrix}    (1.12)
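The quality of the small-angle approximation in Eq. (1.11) can be probed numerically. A sketch comparing the exact 1-2-3 attitude matrix (rebuilt inline here to keep the sketch self-contained) with I − [α×]; the chosen angles are illustrative:

```python
import numpy as np

def cross_matrix(a):
    """The cross-product matrix [a x] of Eq. (1.12)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def attitude_123(phi, theta, psi):
    """Exact 1-2-3 Euler-sequence attitude matrix, A = C3 C2 C1."""
    cp, sp = np.cos(phi), np.sin(phi)
    ct, st = np.cos(theta), np.sin(theta)
    cs, ss = np.cos(psi), np.sin(psi)
    C1 = np.array([[1, 0, 0], [0, cp, sp], [0, -sp, cp]])
    C2 = np.array([[ct, 0, -st], [0, 1, 0], [st, 0, ct]])
    C3 = np.array([[cs, ss, 0], [-ss, cs, 0], [0, 0, 1]])
    return C3 @ C2 @ C1

alpha = np.array([0.01, -0.02, 0.015])       # small angles (rad)
A_exact = attitude_123(*alpha)
A_approx = np.eye(3) - cross_matrix(alpha)   # Eq. (1.11)
```

The discrepancy is second order in the angles, so for angles of a few hundredths of a radian it is on the order of 10⁻⁴.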

The main disadvantages of Euler angles are: (1) that certain important functions of Euler angles have singularities, and (2) that they are less accurate than unit quaternions when used to integrate incremental changes in attitude over time.

1.7.3 The Unit Quaternions

These deficiencies in the Euler angle representation have led researchers to use unit quaternions as a parameterization of the attitude of a rigid body. The relevant functions of unit quaternions have no singularities and the representation is well-suited to integrating the angular velocity of a body over time [2].

The quaternion q is a four-element vector that fully describes a rotation in three-dimensional space. The first three elements of the quaternion are vector components, while the remaining element is a scalar. The first three elements are proportional to the Euler axis e about which the rotation occurs, and the scalar element is a measure of the magnitude of rotation. With θ_e the angle of rotation, the quaternion components are defined as follows:

q = \begin{bmatrix} \boldsymbol{\varrho} \\ q_4 \end{bmatrix}    (1.13a)

ϱ ≡ [q₁ q₂ q₃]ᵀ = e sin(θ_e/2)    (1.13b)

q₄ = cos(θ_e/2)    (1.13c)


Since a four-dimensional vector is used to describe three dimensions, the quaternion components cannot be independent of each other. The quaternion satisfies a single constraint, given by qᵀq = 1. For small angles the vector part of the quaternion is approximately equal to half the corresponding angles, so that ϱ ≈ α/2 and q₄ ≈ 1.
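Equation (1.13) and the unit-norm constraint translate directly into code. A sketch (the function name is illustrative):

```python
import numpy as np

def quat_from_axis_angle(e, theta_e):
    """Quaternion of Eq. (1.13): vector part e*sin(theta_e/2),
    scalar part cos(theta_e/2)."""
    e = np.asarray(e, dtype=float)
    e = e / np.linalg.norm(e)   # the Euler axis must be a unit vector
    return np.concatenate((e * np.sin(theta_e / 2.0), [np.cos(theta_e / 2.0)]))

# rotation of 0.3 rad about the z-axis
q = quat_from_axis_angle([0.0, 0.0, 1.0], 0.3)
```

Because e is a unit vector, sin²(θ_e/2) + cos²(θ_e/2) = 1 guarantees qᵀq = 1 by construction, and for a small angle the vector part is close to half the rotation angle, as noted above.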

This mutual dependence of the four elements makes sense, since a four-dimensional vector is used to describe a rotation in three-dimensional space. The reason the quaternion representation is used so extensively in attitude work is that it introduces no singularities and requires less computational effort to propagate than representations such as rotation matrices. However, the mapping between quaternions and rotations is not one-to-one, since any pair of quaternions q and −q represents exactly the same rotation.

The main disadvantages of using unit quaternions are: (1) The four quaternion parameters do not have intuitive physical meanings, and (2) A quaternion must have unity norm to be a pure rotation. The unity norm constraint, which is quadratic in form, is particularly problematic if the attitude parameters are to be included in an optimization, as most standard optimization algorithms cannot encode such constraints.

1.8 Attitude Kinematics

The attitude kinematics equation can be derived by considering a state transition matrix Φ(t + ∆t, t) that maps the attitude from one time to the next [2]:

A(t + ∆t) = Φ(t + ∆t, t)A(t)

(1.14)

It is obvious that Φ(t + ∆t, t) is also an attitude matrix. Then, from the definition of the derivative we have

\lim_{\Delta t \to 0} \frac{A(t + \Delta t) - A(t)}{\Delta t} = -\lim_{\Delta t \to 0} \frac{1}{\Delta t} [\boldsymbol{\alpha}(t) \times] A(t)    (1.15)


where the higher-order terms vanish in the limit. Hence, the following kinematics equation can be derived:

\dot{A} = -[\boldsymbol{\omega} \times] A    (1.16)

where ω is the angular velocity vector of the body frame relative to the reference frame. Similarly, the quaternion kinematics equation is given by

\dot{q} = \frac{1}{2} \Omega(\boldsymbol{\omega}) q    (1.17)

where

\Omega(\boldsymbol{\omega}) = \begin{bmatrix} -[\boldsymbol{\omega} \times] & \boldsymbol{\omega} \\ -\boldsymbol{\omega}^T & 0 \end{bmatrix}    (1.18)
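Equation (1.17) can be propagated numerically. A minimal sketch using a fixed-step fourth-order Runge–Kutta integrator with an assumed constant angular velocity, renormalizing after each step to hold qᵀq = 1 against integration drift (this integrator is an illustration, not the propagation scheme of the thesis):

```python
import numpy as np

def omega_matrix(w):
    """The 4x4 matrix Omega(omega) of Eq. (1.18)."""
    wx, wy, wz = w
    return np.array([[0.0,  wz, -wy, wx],
                     [-wz, 0.0,  wx, wy],
                     [wy, -wx, 0.0,  wz],
                     [-wx, -wy, -wz, 0.0]])

def propagate_quat(q, w, dt, steps):
    """Integrate qdot = 0.5*Omega(w)*q, Eq. (1.17), with RK4 and constant w."""
    f = lambda q: 0.5 * omega_matrix(w) @ q
    for _ in range(steps):
        k1 = f(q)
        k2 = f(q + 0.5 * dt * k1)
        k3 = f(q + 0.5 * dt * k2)
        k4 = f(q + dt * k3)
        q = q + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        q = q / np.linalg.norm(q)   # enforce the unit-norm constraint
    return q
```

For a constant spin about a single axis the analytic answer is known from Eq. (1.13), which makes this sketch easy to validate.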

Chapter 2

Sensors and Measurement Models

Several static attitude sensors exist, including three-axis magnetometers, Sun sensors, Earth-horizon sensors, global positioning system (GPS) sensors and star trackers. For missions with tight attitude-knowledge requirements, the primary means to determine attitude is the star tracker. Star trackers fall into the category of line-of-sight (LOS) sensors, as they measure the direction of a celestial body. In particular, the angle of that body is measured from the sensor boresight in two mutually orthogonal planes.

2.1 Star Sensor Fundamentals

Star sensors are one of the most accurate means of attitude determination [3]. A star tracker is a star sensor that keeps track of bright stars in its field of view. Star trackers take pictures of the sky, so they are only effective when the satellite is stationary or moving at a slow rate. The star sensor consists of a lens that refracts the incoming star light rays onto a photon-sensitive charge-coupled device (CCD) surface. Figure 2.1 shows the effective focal length model of the optical system. In the figure, L is the effective focal length, θ is the angular distance between a given point on the focal plane and the satellite z-axis, and d is θ translated into a linear distance on the focal plane. The relationship between θ, L and d is given by Eq. (2.1). The small size of the CCD array yields a small angle θ, so the relationship between θ, L and d is approximately linear:


Figure 2.1: Model for the Effective Focal Length

\tan\theta = \frac{d}{L}    (2.1)

The accuracy of a star tracker is limited by the size of each CCD pixel. The resolution of the tracker is normally the size of 1 pixel; however, greater accuracy can be achieved by centroiding. Centroiding is done by defocusing the incoming star rays such that the distribution of photons from a single ray spreads over several pixels. The centroid of this pattern is used as the star measurement, the resolution of which can be significantly smaller than 1 pixel [4].

Once the position of the star on the CCD is read out, the following general steps are taken to determine the attitude [5]:

1. Information about the star, such as its brightness, the star pattern and wavelength information, is recorded.
2. This information is compared to an embedded onboard star catalog, and the star is identified.
3. Once the star is identified, its inertial coordinates are known from the star catalog, and its spacecraft body coordinates are known from the measurement.
4. At least two stars are necessary to fully determine the satellite attitude.

For LOS sensors the observation equations are given by the well-known collinearity equations, which relate image-plane coordinates to object-plane coordinates through an attitude rotation. For stellar applications the light sources can be treated as points at infinite distance, so that the only unknown, once a star is identified, is the attitude matrix. All attitude sensors, including star trackers, contain noise in their measurements; this noise includes both systematic errors and random errors.

Systematic errors are reduced through calibration procedures, which can even be performed on-orbit. Random errors are usually treated as zero-mean Gaussian white-noise processes with known covariance. A realistic covariance matrix takes into account an increase in the errors away from the boresight due to radial distortions, and contains correlated terms. A frequently used covariance model for the noise added to the collinearity truth equations is given by Eq. (2.4).

2.2 Collinearity Equations

Photogrammetry is the technique of measuring objects (2D or 3D) from photographic images or LOS measurements. Photogrammetry can generally be divided into two categories: far range photogrammetry with camera distance settings to infinity (commonly used in star cameras), and close range photogrammetry with camera distance settings to finite values. In general close range photogrammetry can be used to determine both the position and attitude of an object, while far range photogrammetry can only be used to determine attitude.

For star trackers [6] the photographic image plane coordinates of the jth star are determined by the stellar collinearity equations, given by

\alpha_j = -f \, \frac{A_{11} r_{x_j} + A_{12} r_{y_j} + A_{13} r_{z_j}}{A_{31} r_{x_j} + A_{32} r_{y_j} + A_{33} r_{z_j}}, \quad j = 1, 2, \ldots, N    (2.2a)

\beta_j = -f \, \frac{A_{21} r_{x_j} + A_{22} r_{y_j} + A_{23} r_{z_j}}{A_{31} r_{x_j} + A_{32} r_{y_j} + A_{33} r_{z_j}}, \quad j = 1, 2, \ldots, N    (2.2b)

where N is the total number of observations and (α_j, β_j) are the image space observations for the jth line-of-sight. Also, A_{ℓm} are the elements of the attitude matrix A, and the inertial components of the vector toward the jth star are r_{x_j}, r_{y_j} and r_{z_j}. The camera focal length f is known from a priori calibration.

The noise associated with the star tracker is characterized by adding zero-mean Gaussian noise to α_j and β_j. Denoting α_j and β_j by the 2 × 1 vector γ_j ≡ [α_j β_j]ᵀ, the measurement model follows as

\tilde{\gamma}_j = \gamma_j + w_j    (2.3)


where w_j is a zero-mean Gaussian noise process. A frequently used covariance for w_j with f = 1 is given by

R_j^{\text{FOCAL}} = \frac{\sigma^2}{1 + d\,(\alpha_j^2 + \beta_j^2)} \begin{bmatrix} (1 + d\,\alpha_j^2)^2 & (d\,\alpha_j \beta_j)^2 \\ (d\,\alpha_j \beta_j)^2 & (1 + d\,\beta_j^2)^2 \end{bmatrix}    (2.4)

where d is on the order of one and σ is assumed to be known. Note that as α_j or β_j increases, the individual components of R_j^FOCAL increase, which realistically reflects that the errors grow as the observation moves away from the boresight. Also, the covariance model [7] is a function of the true variables α_j and β_j, which are never available in practice. However, using the measurements themselves or estimated quantities in Eq. (2.4) leads to only second-order error effects in the attitude estimation process.

2.3 Unit Vector Form

Through a judicious change of variables, a linear form of Eq. (2.2) can be constructed. Choosing the z-axis of the image coordinate system to be directed outward along the boresight, the star observation can be reconstructed in unit vector form:

b_j = A r_j, \quad j = 1, 2, \ldots, N    (2.5)

where

b_j \equiv \frac{1}{\sqrt{f^2 + \alpha_j^2 + \beta_j^2}} \begin{bmatrix} -\alpha_j \\ -\beta_j \\ f \end{bmatrix}    (2.6a)

r_j \equiv \begin{bmatrix} r_{x_j} & r_{y_j} & r_{z_j} \end{bmatrix}^T    (2.6b)

and N is the total number of star observations.
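Equation (2.6a) is the corresponding change of variables in code. A sketch (the function name is illustrative):

```python
import numpy as np

def unit_vector_measurement(alpha, beta, f):
    """Reconstruct the unit vector b_j of Eq. (2.6a) from image-plane
    observations (alpha_j, beta_j) and focal length f."""
    b = np.array([-alpha, -beta, f])
    return b / np.sqrt(f**2 + alpha**2 + beta**2)
```

By construction the result has unit norm, with the third (boresight) component positive for positive f.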

It is proven by Shuster and Oh in [8] that for small field-of-view (FOV) cameras nearly all the probability of the errors is concentrated on a very small area about the direction of b_j, so the sphere containing that point can be approximated by a tangent plane, characterized by

\tilde{b}_j = A r_j + \upsilon_j, \quad \upsilon_j^T b_j = 0    (2.7)

where b̃_j denotes the jth measurement and the sensor error υ_j is approximately Gaussian, satisfying

E\{\upsilon_j\} = 0    (2.8a)

R_j^{\text{QUEST}} \equiv E\{\upsilon_j \upsilon_j^T\} = \sigma^2 \left(I_{3\times 3} - b_j b_j^T\right)    (2.8b)

where E{ } denotes expectation and I_{3×3} denotes a 3 × 3 identity matrix. Equation (2.8b) is known as the QUEST measurement model (QMM). Shuster has shown that for α_j² + β_j² ≪ 1 the QUEST measurement model agrees well with the inferred model for the real sensor given by Eq. (2.4). Note that Eq. (2.8b) is also a function of the unknown truth quantities. However, the advantage of using the QUEST measurement model is that the measurement covariance can effectively be replaced by a nonsingular matrix, given by σ² I_{3×3}.

It is clear that the QMM is only valid for a small FOV, in which a tangent plane closely approximates the surface of the unit sphere. To derive a covariance for the actual unit vector measurement, the true values of $\alpha_j$ and $\beta_j$ must be replaced with the measured ones in Eq. (2.6a). Performing this replacement does not explicitly yield the form given by Eq. (2.7) because the actual model cannot separate $A\mathbf{r}_j$ from the noise. Hence, the actual noise model contains nonlinear terms coupled with non-Gaussian components. In order to derive a covariance, the new measurement model shown in [9] is based on a first-order Taylor series expansion of the unit vector model in Eq. (2.7). Note that this approach does not make the small-FOV assumption; rather, it assumes that the measurement noise is small compared to the signal, which is valid for every star tracker. The Jacobian of Eq. (2.5), with respect to $\boldsymbol{\gamma}_j \equiv [\alpha_j \;\; \beta_j]^T$ and with the focal length normalized to unity, is given by

$$J_j \equiv \frac{\partial \mathbf{b}_j}{\partial \boldsymbol{\gamma}_j} = \frac{1}{\sqrt{1 + \alpha_j^2 + \beta_j^2}}\left( \begin{bmatrix} -1 & 0 \\ 0 & -1 \\ 0 & 0 \end{bmatrix} - \left(1 + \alpha_j^2 + \beta_j^2\right)^{-1/2}\,\mathbf{b}_j \begin{bmatrix} \alpha_j & \beta_j \end{bmatrix} \right) \tag{2.9}$$


The wide-FOV covariance is now given by

$$R_j^{\mathrm{WFOV}} = J_j\,R_j^{\mathrm{FOCAL}}\,J_j^T \tag{2.10}$$

If a small-FOV model is valid, then Eq. (2.10) can still be used, but it is well approximated by Eq. (2.8b). For both equations, the $3\times3$ covariance matrix is associated with a unit vector measurement with two independent parameters and therefore must be singular. It is shown in [9] that the eigenvector associated with the zero eigenvalue of $R_j^{\mathrm{WFOV}}$ is $\mathbf{b}_j$, which is exactly the same eigenvector associated with the zero eigenvalue of $R_j^{\mathrm{QUEST}}$. Since $R_j^{\mathrm{QUEST}}$ has two repeated eigenvalues, $\sigma^2$, the associated eigenvectors, which are always in the plane perpendicular to $\mathbf{b}_j$, are not unique. Therefore, without loss of generality it can be assumed that $R_j^{\mathrm{QUEST}}$ has the same eigenvectors as $R_j^{\mathrm{WFOV}}$. Thus, the only differences between these two covariances are their nonzero eigenvalues.
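The chain of Eqs. (2.6a), (2.9), and (2.10) can be sketched numerically. The focal-plane covariance used below is a hypothetical stand-in with the qualitative behavior described for Eq. (2.4) (errors growing off-boresight); the values of `sigma` and `d` are illustrative, and the focal length is normalized to unity as in Eq. (2.9).

```python
import numpy as np

def unit_vector_and_wfov_cov(alpha, beta, sigma=0.005, d=1.0):
    """Unit vector (Eq. (2.6a), f = 1), Jacobian (Eq. (2.9)), and
    wide-FOV covariance (Eq. (2.10)) from focal-plane coordinates."""
    n = np.sqrt(1.0 + alpha**2 + beta**2)
    b = np.array([-alpha, -beta, 1.0]) / n
    # Hypothetical focal-plane covariance: components grow off-boresight
    R_focal = sigma**2 * np.array([[1 + d*alpha**2, d*alpha*beta],
                                   [d*alpha*beta,   1 + d*beta**2]])
    E = np.array([[-1.0, 0.0], [0.0, -1.0], [0.0, 0.0]])
    J = (E - np.outer(b, [alpha, beta]) / n) / n      # Eq. (2.9)
    return b, J @ R_focal @ J.T                       # Eq. (2.10)
```

As stated in the text, the resulting $3\times3$ covariance is singular with $\mathbf{b}_j$ as the eigenvector of its zero eigenvalue.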

2.4  Rank-One Update Approach

The main idea of the approach, as shown in [9], is to add an extra term $c\,\mathbf{b}_j\mathbf{b}_j^T$ ($c > 0$) to the singular measurement covariance matrix to ensure that the modified measurement covariance matrix is nonsingular. This is a straightforward extension of Refs. [7, 10] to overcome the singularity of the QUEST measurement covariance matrix. The original QUEST measurement covariance matrix, $R_j^{\mathrm{QUEST}} = \sigma_j^2\left(I_{3\times3} - \mathbf{b}_j\mathbf{b}_j^T\right)$, which satisfies $R_j^{\mathrm{QUEST}}\mathbf{b}_j = 0$, is replaced by $\sigma_j^2 I_{3\times3}$; that is, the modification to the QUEST measurement covariance is given by

$$R_j^{\mathrm{QUEST}} \leftarrow R_j^{\mathrm{QUEST}} + \sigma_j^2\,\mathbf{b}_j\mathbf{b}_j^T \tag{2.11}$$

For the new measurement model, we propose to modify the singular measurement covariance matrix $R_j^{\mathrm{WFOV}}$ in a similar manner:

$$R_j^{\mathrm{WFOV}} \leftarrow R_j^{\mathrm{WFOV}} + c_j\,\mathbf{b}_j\mathbf{b}_j^T \tag{2.12}$$

Any value of $c_j > 0$ guarantees non-singularity of the measurement covariance matrix and the innovation matrix. Physically, $c_j$ is small because the error along the


true boresight of the effective unit vector measurement converted from the focal-plane measurement is much smaller than the errors along the other two directions (the first-order approximation of $c_j$ is zero). For numerical purposes, however, $c_j$ may be chosen to be

$$c_j = \tfrac{1}{2}\,\mathrm{tr}\!\left(R_j^{\mathrm{WFOV}}\right) \tag{2.13}$$

where "tr" denotes the trace of a matrix. That is, $c_j$ is the average of the nonzero eigenvalues of $R_j^{\mathrm{WFOV}}$.
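A minimal sketch of the update in Eqs. (2.12)–(2.13):

```python
import numpy as np

def rank_one_update(R, b):
    """Eqs. (2.12)-(2.13): add c_j * b_j b_j^T to a singular measurement
    covariance, with c_j equal to half the trace, i.e., the average of
    the two nonzero eigenvalues."""
    c = 0.5 * np.trace(R)
    return R + c * np.outer(b, b)
```

Applied to the singular QUEST covariance $\sigma^2(I - \mathbf{b}\mathbf{b}^T)$, whose trace is $2\sigma^2$, the update reproduces the nonsingular replacement $\sigma^2 I_{3\times3}$ of Eq. (2.11).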

Chapter 3

Review of Attitude Determination Methods

Optimal attitude determination (AD) algorithms have been developed following two main approaches: the classic constrained least squares approach, based on the so-called Wahba's problem, and the minimum-variance (Kalman filtering) approach.

3.1  Constrained Least Squares Approach

Determining the attitude of a spacecraft is equivalent to determining the rotation matrix describing the orientation of the spacecraft-fixed reference frame, $\mathcal{F}_b$, with respect to a known reference frame, say an inertial frame $\mathcal{F}_i$. That is, attitude determination is equivalent to determining the matrix $A$. Although there are nine numbers in this direction cosine matrix, only three numbers are needed to determine the matrix completely. Since each measured unit vector provides two pieces of information, it takes at least two different measurements to determine the attitude. In fact, this results in an overdetermined problem, since we have three unknowns and four known quantities.

If more than two observations are available, and we want to use all the information, we can use a statistical method. In fact, since we discard some information from the two observations in developing the TRIAD algorithm, the statistical method provides a (hopefully) better estimate of $A$.

Suppose we have a set of $N$ unit vectors $\mathbf{v}_k$, $k = 1, 2, \ldots, N$. For each vector, we have a sensor measurement in the body frame, $\hat{\mathbf{v}}_{k_b}$, and a mathematical model of the components in the inertial frame, $\mathbf{v}_{k_i}$. We want to find a rotation matrix $A$ such that

$$\mathbf{v}_{k_b} = A\,\mathbf{v}_{k_i} \tag{3.1}$$

for each of the N vectors. Obviously this set of equations is overdetermined if N ≥ 2, and therefore the equation cannot, in general, be satisfied for each k = 1, 2, . . . , N . Thus we want to find a solution for A that in some sense minimizes the overall error for the N vectors.

One way to state the problem is: find a matrix $A$ that minimizes the loss function shown in [11]:

$$J(A) = \frac{1}{2}\sum_{k=1}^{N} w_k\left|\mathbf{v}_{k_b} - A\,\mathbf{v}_{k_i}\right|^2 \tag{3.2}$$

where $w_k$ is the weight for each measurement. This loss function is a sum of the squared errors for each vector measurement. If the measurements and mathematical models are all perfect, then Eq. (3.1) will be satisfied for all $N$ vectors and $J = 0$. If there are any errors or noisy measurements, then $J > 0$. The smaller we can make $J$, the better the approximation of $A$.

This is Wahba's problem: find the orthogonal matrix $A$ with determinant $+1$ that minimizes the loss function. Note that in the present formulation Wahba's problem is a single-frame attitude determination problem; that is, it assumes that the vector measurements processed to estimate the attitude have been obtained for a constant attitude, normally at a single time point. One family of solutions of Wahba's problem is concerned with the determination of the optimal matrix $A$ itself, while another family is concerned with the determination of the corresponding optimal quaternion.

3.2  Single-Frame Quaternion Estimators

One elegant method of solving for the attitude that minimizes Wahba's loss function $J(A)$ is the q-method, devised by Davenport, for computing the optimal single-frame quaternion. We begin by expanding the loss function as follows:

$$J = \frac{1}{2}\sum_{k=1}^{N} w_k\left(\mathbf{v}_{k_b} - A\mathbf{v}_{k_i}\right)^T\left(\mathbf{v}_{k_b} - A\mathbf{v}_{k_i}\right) = \frac{1}{2}\sum_{k=1}^{N} w_k\left(\mathbf{v}_{k_b}^T\mathbf{v}_{k_b} + \mathbf{v}_{k_i}^T\mathbf{v}_{k_i} - 2\,\mathbf{v}_{k_b}^T A\,\mathbf{v}_{k_i}\right) \tag{3.3}$$

The vectors are assumed to be normalized to unity, so the first two terms satisfy

$$\mathbf{v}_{k_b}^T\mathbf{v}_{k_b} = \mathbf{v}_{k_i}^T\mathbf{v}_{k_i} = 1 \tag{3.4}$$

Therefore, the loss function becomes

$$J = \sum_{k=1}^{N} w_k\left(1 - \mathbf{v}_{k_b}^T A\,\mathbf{v}_{k_i}\right) \tag{3.5}$$

Without loss of generality we can assume that $\sum_{k=1}^{N} w_k = 1$. Davenport showed that the weighted Wahba cost function, Eq. (3.5), can be transformed into a quadratic function of the quaternion as follows [12]:

$$J(\mathbf{q}) = 1 - \mathbf{q}^T K\mathbf{q} \tag{3.6}$$

where $K$ is a $4\times4$ matrix given by

$$K = \begin{bmatrix} S - \sigma I & \mathbf{z} \\ \mathbf{z}^T & \sigma \end{bmatrix} \tag{3.7}$$


with

$$B = \sum_{k=1}^{N} w_k\,\mathbf{v}_{k_b}\mathbf{v}_{k_i}^T \tag{3.8a}$$

$$S = B + B^T \tag{3.8b}$$

$$\mathbf{z} = \sum_{k=1}^{N} w_k\left(\mathbf{v}_{k_b} \times \mathbf{v}_{k_i}\right) \tag{3.8c}$$

$$\sigma = \mathrm{tr}(B) \tag{3.8d}$$

Minimizing $J(\mathbf{q})$ is the same as minimizing $-\mathbf{q}^T K\mathbf{q}$, or maximizing the gain function

$$g(\mathbf{q}) = \mathbf{q}^T K\mathbf{q} \tag{3.9}$$

To maximize the gain function we take the derivative with respect to $\mathbf{q}$; but since the quaternion elements are not independent, the constraint $\mathbf{q}^T\mathbf{q} = 1$ must also be satisfied. Adjoining the constraint to the gain function with a Lagrange multiplier yields a new augmented gain function:

$$g'(\mathbf{q}) = \mathbf{q}^T K\mathbf{q} - \lambda\,\mathbf{q}^T\mathbf{q} \tag{3.10}$$

Differentiating this new gain function shows that $g'(\mathbf{q})$ has a stationary value when

$$K\mathbf{q} = \lambda\mathbf{q} \tag{3.11}$$

This equation is easily recognized as an eigenvalue problem. The optimal attitude is thus an eigenvector of the K matrix. However, there are four eigenvalues and they each have different eigenvectors. To see which eigenvalue corresponds to the optimal eigenvector that maximizes the gain function, recall

$$g(\mathbf{q}) = \mathbf{q}^T K\mathbf{q} = \mathbf{q}^T\lambda\mathbf{q} = \lambda\,\mathbf{q}^T\mathbf{q} = \lambda \tag{3.12}$$


The largest eigenvalue of K maximizes the gain function. The eigenvector corresponding to this largest eigenvalue is the least-squares optimal quaternion estimate of the attitude.

There are many methods for directly calculating the eigenvalues and eigenvectors of a matrix, or approximating them. The q-method involves solving the eigenvalue/vector problem directly, but as seen in the next section, QUEST approximates the largest eigenvalue and solves for the corresponding eigenvector.
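The q-method can be sketched directly with a symmetric eigensolver. The helper that maps the quaternion back to an attitude matrix uses the standard relation $A(\mathbf{q}) = (q_4^2 - \boldsymbol{\varrho}^T\boldsymbol{\varrho})I + 2\boldsymbol{\varrho}\boldsymbol{\varrho}^T - 2q_4[\boldsymbol{\varrho}\times]$ (scalar part last), consistent with Eq. (4.11) later in this document.

```python
import numpy as np

def davenport_q_method(vb, vi, w):
    """Build Davenport's K matrix, Eqs. (3.7)-(3.8), and return the unit
    eigenvector of K associated with its largest eigenvalue."""
    B = sum(wk * np.outer(b, r) for wk, b, r in zip(w, vb, vi))   # Eq. (3.8a)
    S = B + B.T                                                   # Eq. (3.8b)
    z = sum(wk * np.cross(b, r) for wk, b, r in zip(w, vb, vi))   # Eq. (3.8c)
    sig = np.trace(B)                                             # Eq. (3.8d)
    K = np.block([[S - sig * np.eye(3), z.reshape(3, 1)],
                  [z.reshape(1, 3),     np.array([[sig]])]])
    vals, vecs = np.linalg.eigh(K)          # K is symmetric
    return vecs[:, np.argmax(vals)]         # optimal quaternion (up to sign)

def quat_to_dcm(q):
    """Attitude matrix from a quaternion with vector part q[:3], scalar q[3]."""
    rho, q4 = q[:3], q[3]
    rx = np.array([[0.0, -rho[2], rho[1]],
                   [rho[2], 0.0, -rho[0]],
                   [-rho[1], rho[0], 0.0]])
    return (q4**2 - rho @ rho) * np.eye(3) + 2*np.outer(rho, rho) - 2*q4*rx
```

With noiseless vectors the recovered attitude matrix matches the true one exactly, since the sign ambiguity of the quaternion cancels in the DCM.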

3.2.1  QUEST

Onboard computing requirements are a concern for satellite designers, so a more efficient way of solving the eigenproblem is needed. The QUEST algorithm, as shown in [8], provides a "cheaper" way to estimate the solution to the eigenproblem. From Eq. (3.12), the gain function evaluated at the optimal quaternion equals the largest eigenvalue of $K$. Rearranging Eq. (3.5) provides a useful result:

$$\lambda_{\mathrm{opt}} = \sum_{k=1}^{N} w_k - J(A) \tag{3.13}$$

For the optimal eigenvalue, the loss function should be small. Thus a good approximation for the optimal eigenvalue is

$$\lambda_{\mathrm{opt}} \approx \sum_{k=1}^{N} w_k \tag{3.14}$$

For many applications this approximation may be accurate enough. A Newton–Raphson iteration, using the approximate eigenvalue as the initial guess, can be used for a better estimate. However, for typical star tracker sensor accuracies, the accuracy of a 64-bit word is exceeded with just a single Newton–Raphson iteration.

Once the optimal eigenvalue has been estimated, the corresponding eigenvector must be calculated. The eigenvector is the quaternion corresponding to the optimal attitude estimate. One way is to convert the quaternion in the eigenproblem to Rodrigues parameters, defined in terms of the vector part $\mathbf{q}$ and scalar part $q_4$ of the quaternion as

$$\mathbf{p} = \frac{\mathbf{q}}{q_4} \tag{3.15}$$


The eigenproblem is rearranged as

$$\mathbf{p} = \left[(\lambda_{\mathrm{opt}} + \sigma)I - S\right]^{-1}\mathbf{z} \tag{3.16}$$

which is derived from $K\mathbf{q} = \lambda\mathbf{q}$. Taking the inverse in this expression is a computationally intensive operation. A more efficient approach is to use Gaussian elimination or another linear-system method to solve the equation

$$\left[(\lambda_{\mathrm{opt}} + \sigma)I - S\right]\mathbf{p} = \mathbf{z} \tag{3.17}$$

Once the Rodrigues parameters are found, the quaternion is calculated by

$$\mathbf{q} = \frac{1}{\sqrt{1 + \mathbf{p}^T\mathbf{p}}}\begin{bmatrix}\mathbf{p} \\ 1\end{bmatrix} \tag{3.18}$$

One problem with this approach is that the Rodrigues parameters become singular when the rotation is $\pi$ radians. Shuster and Oh in [13] developed a method of sequential rotations that avoids this singularity.
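The QUEST steps above, with the approximation of Eq. (3.14) and the linear solve of Eq. (3.17), can be sketched as follows (no sequential rotations are performed, so rotations near $\pi$ radians are assumed away):

```python
import numpy as np

def quest(vb, vi, w):
    """QUEST sketch: approximate the largest eigenvalue by sum(w),
    solve Eq. (3.17) for the Rodrigues parameters, and normalize into a
    quaternion via Eq. (3.18) (vector part first, scalar last)."""
    B = sum(wk * np.outer(b, r) for wk, b, r in zip(w, vb, vi))
    S = B + B.T
    z = sum(wk * np.cross(b, r) for wk, b, r in zip(w, vb, vi))
    sig = np.trace(B)
    lam = sum(w)                                         # Eq. (3.14)
    p = np.linalg.solve((lam + sig) * np.eye(3) - S, z)  # Eq. (3.17)
    return np.append(p, 1.0) / np.sqrt(1.0 + p @ p)      # Eq. (3.18)
```

For noiseless unit-sum weights the approximation $\lambda_{\mathrm{opt}} \approx 1$ is exact and the routine recovers the true quaternion.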

3.2.2  Singular Value Decomposition (SVD)

One way to find the optimal attitude matrix is by singular value decomposition, as reported by Markley in [14]. Taking the SVD of the matrix $B$ in Eq. (3.8a), also known as the attitude profile matrix, gives

$$B = U S V^T \tag{3.19}$$

where $S = \mathrm{diag}(s_1,\, s_2,\, s_3)$ contains the singular values, and $U$ and $V$ are orthogonal matrices. The optimal attitude matrix is simply given by

$$A = U M V^T \tag{3.20}$$

where

$$M = \mathrm{diag}\left(1,\; 1,\; \det(U)\det(V)\right) \tag{3.21}$$
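Equations (3.19)–(3.21) translate almost line-for-line into code:

```python
import numpy as np

def svd_attitude(vb, vi, w):
    """Markley's SVD solution, Eqs. (3.19)-(3.21): decompose the attitude
    profile matrix B and enforce det(A) = +1 through M."""
    B = sum(wk * np.outer(b, r) for wk, b, r in zip(w, vb, vi))
    U, s, Vt = np.linalg.svd(B)
    M = np.diag([1.0, 1.0, np.linalg.det(U) * np.linalg.det(Vt)])
    return U @ M @ Vt                                   # Eq. (3.20)
```

Note that `np.linalg.svd` returns $V^T$ directly, and $\det(V^T) = \det(V)$, so the correction term in $M$ is unchanged.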

3.2.3  Estimator of the Optimal Quaternion (ESOQ)

From Eq. (3.11) it can be seen that the optimal quaternion is orthogonal to all the columns of the matrix $K - \lambda_{\max}I$. Because of the unit quaternion constraint, the four columns of $K - \lambda_{\max}I$ are not linearly independent, so the optimal quaternion spans the one-dimensional subspace orthogonal to the space spanned by any three columns of $K - \lambda_{\max}I$, and it can be computed from the four-dimensional cross product of any three columns of this matrix, as shown in [15].

The other method is to examine the classical adjoint of $K - \lambda_{\max}I$, as shown in [12]. Representing $K$ in terms of its eigenvalues and eigenvectors we have:

$$\mathrm{adj}(K - \lambda I) = \mathrm{adj}\left[\sum_{i=1}^{4}(\lambda_i - \lambda)\,\mathbf{q}_i\mathbf{q}_i^T\right] = \sum_{i=1}^{4}(\lambda_j - \lambda)(\lambda_k - \lambda)(\lambda_\ell - \lambda)\,\mathbf{q}_i\mathbf{q}_i^T \tag{3.22}$$

for any scalar $\lambda$, where $i, j, k, \ell$ is a permutation of $1, 2, 3, 4$. Setting $\lambda = \lambda_{\max} = \lambda_1$ causes all the terms in this sum to vanish except the first, with the result:

$$\mathrm{adj}(K - \lambda_{\max}I) = (\lambda_2 - \lambda_{\max})(\lambda_3 - \lambda_{\max})(\lambda_4 - \lambda_{\max})\,\mathbf{q}_{\mathrm{opt}}\mathbf{q}_{\mathrm{opt}}^T \tag{3.23}$$

Thus $\mathbf{q}_{\mathrm{opt}}$ can be computed by normalizing any non-zero column of $\mathrm{adj}(K - \lambda_{\max}I)$:

$$(\mathbf{q}_{\mathrm{opt}})_i = c\,(-1)^{i+1}\det\left[(K - \lambda_{\max}I)_{ki}\right], \qquad i = 1, \ldots, 4 \tag{3.24}$$

where $(K - \lambda_{\max}I)_{ki}$ is the $3\times3$ matrix obtained by deleting the $k$th row and $i$th column from $K - \lambda_{\max}I$, and $c$ is a multiplicative factor determined by normalizing the quaternion. It is desirable to choose the column with the maximum Euclidean norm.
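Equations (3.23)–(3.24) can be sketched by forming the classical adjoint explicitly, which is cheap for a $4\times4$ matrix; $\lambda_{\max}$ is taken as an input, e.g., from the approximation in Eq. (3.14).

```python
import numpy as np

def esoq(K, lam_max):
    """ESOQ sketch, Eqs. (3.23)-(3.24): the optimal quaternion is any
    nonzero column of adj(K - lam_max I), normalized; pick the column
    with the maximum Euclidean norm."""
    M = K - lam_max * np.eye(4)
    adj = np.zeros((4, 4))
    for i in range(4):
        for k in range(4):
            # adjugate entry (i, k) is the cofactor of M[k, i]
            minor = np.delete(np.delete(M, k, axis=0), i, axis=1)
            adj[i, k] = (-1) ** (i + k) * np.linalg.det(minor)
    col = adj[:, np.argmax(np.linalg.norm(adj, axis=0))]
    return col / np.linalg.norm(col)
```

The returned quaternion is determined only up to an overall sign, as Eq. (3.23) makes clear.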

3.2.4  Second Estimator of the Optimal Quaternion (ESOQ2)

This quaternion estimator, as shown in [16], is designed using the relation of the quaternion to the rotation axis $\mathbf{e}$ and rotation angle $\theta_e$:

$$\mathbf{q} = \begin{bmatrix}\mathbf{e}\sin(\theta_e/2) \\ \cos(\theta_e/2)\end{bmatrix} \tag{3.25}$$

The eigenvalue problem of Eq. (3.11) can be broken into two sets of equations:

$$\left[(\lambda_{\max} + \mathrm{tr}B)\,I - S\right]\mathbf{q} = q_4\,\mathbf{z} \tag{3.26a}$$

$$(\lambda_{\max} - \mathrm{tr}B)\,q_4 = \mathbf{q}^T\mathbf{z} \tag{3.26b}$$

Using Eq. (3.25) in Eq. (3.26) we have:

$$(\lambda_{\max} - \mathrm{tr}B)\cos(\theta_e/2) = \mathbf{z}^T\mathbf{e}\sin(\theta_e/2) \tag{3.27a}$$

$$\mathbf{z}\cos(\theta_e/2) = \left[(\lambda_{\max} + \mathrm{tr}B)\,I - S\right]\mathbf{e}\sin(\theta_e/2) \tag{3.27b}$$

Multiplying Eq. (3.27b) by $(\lambda_{\max} - \mathrm{tr}B)$ and substituting Eq. (3.27a) gives

$$M\,\mathbf{e}\sin(\theta_e/2) = 0 \tag{3.28}$$

where

$$M = (\lambda_{\max} - \mathrm{tr}B)\left[(\lambda_{\max} + \mathrm{tr}B)\,I - S\right] - \mathbf{z}\mathbf{z}^T \tag{3.29}$$

Equation (3.28) shows that the rotation axis $\mathbf{e}$ is a null vector of $M$. A calculation similar to that of [12], shown in Sec. 3.2.3, gives

$$\mathbf{q}_{\mathrm{opt}} = \frac{1}{\sqrt{\left|(\lambda_{\max} - \mathrm{tr}B)\,\mathbf{y}\right|^2 + (\mathbf{z}\cdot\mathbf{y})^2}}\begin{bmatrix}(\lambda_{\max} - \mathrm{tr}B)\,\mathbf{y} \\ \mathbf{z}\cdot\mathbf{y}\end{bmatrix} \tag{3.30}$$

where $\mathbf{y}$ is the column of $\mathrm{adj}(M)$ with maximum norm.

3.3  Filtering Approaches

The most straightforward way to attack a nonlinear estimation problem is to linearize about the current best estimate. This leads, of course, to the extended Kalman filter (EKF), which is the workhorse of satellite attitude determination. There are several different implementations of the attitude EKF, depending on both the attitude representation used in the state vector and the form in which observations are input, as shown in [17]. It is well known that all globally continuous and nonsingular representations of the rotations have at least one redundant component, so we are faced with the alternatives of using an attitude representation that is either singular or redundant. The various strategies to face or evade this dilemma can be divided into three general classes: the minimal-representation EKF, the multiplicative EKF (MEKF), and the additive EKF (AEKF). These minimum-variance (filtering) approaches deal with dynamic attitude estimation; since this thesis concentrates on static attitude determination, they are not investigated further.

Chapter 4

Numerical Solutions

For a numerical solution approach we can use a systematic iterative algorithm that converges to a rotation matrix giving a good estimate of the attitude. The algorithm requires an initial estimate of $A$ and iteratively improves it to minimize $J$ defined in Eq. (4.10). However, recall that the components of a rotation matrix cannot be changed independently. That is, even though there are nine numbers in the matrix, there are constraints that must be satisfied, and, as we know, only three numbers are required to specify the rotation matrix completely (e.g., Euler angles). Thus, there are actually only three variables that need to be determined. We use the quaternion and the DCM for attitude representation, but then we need to incorporate the constraint $\mathbf{q}^T\mathbf{q} = 1$ or $A^TA = I$ while minimizing $J$, the Wahba loss function.

Minimization of a function requires taking its derivative, setting the derivative equal to zero, and solving for the unknown variable(s). To minimize the loss function, we must recognize that the unknown variable is multi-dimensional in our case, so the derivative of $J$ with respect to the unknowns is an $n_p \times 1$ matrix of partial derivatives, where $n_p$ is the number of variables. For example, if we use the quaternion as the attitude variable, then the minimization requires

$$\frac{\partial J}{\partial q_1} = 0, \qquad \frac{\partial J}{\partial q_2} = 0, \qquad \frac{\partial J}{\partial q_3} = 0, \qquad \frac{\partial J}{\partial q_4} = 0$$

subject to the constraint that $\mathbf{q}^T\mathbf{q} = 1$. Incorporating the constraint into the minimization involves the addition of a Lagrange multiplier.

4.1  Review of Minimization

To find the minimum of a function of a single variable, say $\min f(x)$, we would solve $F(x) = f'(x) = 0$; that is, set the first derivative to zero and find the root. We can either solve the equations analytically or use Newton's method with an initial guess. The equations derived in this section are equally valid for the analytical or Newton approach, and the general solution shown here is valid for both the attitude quaternion and the attitude matrix parameterization.

The attitude is assumed to be parameterized by an $n \times 1$ vector $\mathbf{x}$; that is, $A = A(\mathbf{x})$. The goal is to find the optimal solution that minimizes the objective function, given by

$$J(\mathbf{x}) \equiv J(A(\mathbf{x})) \tag{4.1}$$

subject to $m = n - 3$ equality constraints, given by

$$\mathbf{c}(\mathbf{x}) = \begin{bmatrix} c_1(\mathbf{x}) \\ \vdots \\ c_m(\mathbf{x}) \end{bmatrix} = 0_{m\times1} \tag{4.2}$$


The $m \times 1$ vector of Lagrange multipliers used to solve the minimization problem is denoted by

$$\boldsymbol{\lambda} = \begin{bmatrix}\lambda_1 & \cdots & \lambda_m\end{bmatrix}^T \tag{4.3}$$

The $(2n-3)\times1$ augmented state and the augmented objective function are

$$\mathbf{x}_a = \begin{bmatrix}\boldsymbol{\lambda} \\ \mathbf{x}\end{bmatrix} \tag{4.4}$$

and

$$J^a(\mathbf{x}_a) = J^a(\mathbf{x}, \boldsymbol{\lambda}) = J(\mathbf{x}) + \boldsymbol{\lambda}^T\mathbf{c}(\mathbf{x}) = J(\mathbf{x}) + \sum_{j=1}^{m}\lambda_j\,c_j(\mathbf{x}) \tag{4.5}$$

respectively.

The gradient of $J^a$ with respect to $\mathbf{x}$ is

$$\mathbf{g}(\mathbf{x}, \boldsymbol{\lambda}) = \frac{\partial J^a(\mathbf{x}, \boldsymbol{\lambda})}{\partial \mathbf{x}} = \frac{\partial J(\mathbf{x})}{\partial \mathbf{x}} + G_c(\mathbf{x})\,\boldsymbol{\lambda} \tag{4.6}$$

where

$$G_c(\mathbf{x}) = \frac{\partial \mathbf{c}^T(\mathbf{x})}{\partial \mathbf{x}} = \begin{bmatrix}\dfrac{\partial c_1(\mathbf{x})}{\partial \mathbf{x}} & \cdots & \dfrac{\partial c_m(\mathbf{x})}{\partial \mathbf{x}}\end{bmatrix} \tag{4.7}$$

For $\mathbf{x}^*$ to be a local minimizer, the necessary conditions are

$$\mathbf{g}(\mathbf{x}^*, \boldsymbol{\lambda}^*) = 0_{n\times1} \tag{4.8}$$

$$\mathbf{c}(\mathbf{x}^*) = 0_{m\times1} \tag{4.9}$$

The solution to the above equations gives all the stationary points, but how do we tell whether they are (local) minima or maxima? For an unconstrained problem this is done by calculating the second derivative of the cost function, the Hessian matrix, at those stationary points and determining its definiteness: if it is positive definite, $J$ attains a local minimum at $\mathbf{x}^*$, and if it is negative definite, $J$ attains a local maximum at $\mathbf{x}^*$. But this condition is valid only for the unconstrained minimization problem. A bordered Hessian is used for the second-derivative test in the constrained optimization problem, keeping in mind that all the constraints are equality constraints. The bordered Hessian cannot be definite, so the standard unconstrained optimization condition that the Hessian matrix must be positive definite cannot be applied here. The second-derivative test instead imposes sign restrictions on the determinants of a certain set of $n - m$ submatrices of the bordered Hessian. Intuitively, one can think of the $m$ constraints as reducing the problem to one with $n - m$ free variables.

Specifically [18], sign conditions are imposed on the sequence of principal minors (determinants of upper-left-justified submatrices) of the bordered Hessian, the smallest minor consisting of the first $2m + 1$ rows and columns, the next consisting of the first $2m + 2$ rows and columns, and so on, with the last being the entire bordered Hessian. There are thus $n - m$ minors to consider. A sufficient condition for a local minimum is that all of these minors have the sign of $(-1)^m$. A sufficient condition for a local maximum is that these minors alternate in sign, with the smallest one having the sign of $(-1)^{m+1}$.
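The minor test described above can be sketched as follows, where `H` is the $(n+m)\times(n+m)$ bordered Hessian with the $m \times m$ zero block in the upper left (the ordering used throughout this chapter):

```python
import numpy as np

def bordered_hessian_minors(H, m):
    """Return the n - m leading principal minors used in the constrained
    second-derivative test: determinants of the upper-left submatrices of
    H of sizes 2m + 1 through n + m, where H is (n + m) x (n + m)."""
    n = H.shape[0] - m
    return [np.linalg.det(H[:k, :k]) for k in range(2*m + 1, n + m + 1)]
```

For a minimum every returned minor must have the sign of $(-1)^m$; with a single constraint ($m = 1$) that means all minors negative, which is the test applied to the quaternion problem later in this chapter.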

4.2  Quaternion Based Solution

This section derives the gradient and the bordered Hessian expressions for the constrained minimization problem:

$$J(\mathbf{q}) = \frac{1}{2}\sum_{j=1}^{N}\left(\tilde{\mathbf{b}}_j - A(\mathbf{q})\,\mathbf{r}_j\right)^T R_j^{-1}\left(\tilde{\mathbf{b}}_j - A(\mathbf{q})\,\mathbf{r}_j\right) \tag{4.10}$$

subject to $\mathbf{q}^T\mathbf{q} = 1$. Note that in this case $R_j$ is the full $3\times3$ matrix defined in Eq. (2.12), whereas in Wahba's original formulation $R_j$ is a scalar times the identity matrix. The attitude matrix is parameterized using the quaternion.

The attitude matrix is related to the quaternion by [19]

$$A(\mathbf{q}) = \Xi^T(\mathbf{q})\,\Psi(\mathbf{q}) \tag{4.11}$$

with

$$\Xi(\mathbf{q}) \equiv \begin{bmatrix} q_4 I_{3\times3} + [\boldsymbol{\varrho}\times] \\ -\boldsymbol{\varrho}^T \end{bmatrix} \tag{4.12a}$$

$$\Psi(\mathbf{q}) \equiv \begin{bmatrix} q_4 I_{3\times3} - [\boldsymbol{\varrho}\times] \\ -\boldsymbol{\varrho}^T \end{bmatrix} \tag{4.12b}$$


The matrix $\Xi(\mathbf{q})$ obeys the following helpful relations for $\mathbf{q}^T\mathbf{q} = 1$:

$$\Xi^T(\mathbf{q})\,\Xi(\mathbf{q}) = I_{3\times3} \tag{4.13a}$$

$$\Xi(\mathbf{q})\,\Xi^T(\mathbf{q}) = I_{4\times4} - \mathbf{q}\mathbf{q}^T \tag{4.13b}$$

$$\Xi^T(\mathbf{q})\,\mathbf{q} = 0_{3\times1} \tag{4.13c}$$

$$\Xi^T(\mathbf{q})\,\boldsymbol{\lambda} = -\Xi^T(\boldsymbol{\lambda})\,\mathbf{q} \quad \text{for any } \boldsymbol{\lambda}_{4\times1} \tag{4.13d}$$

Note that the matrix $\Psi(\mathbf{q})$ also obeys the same relations as the matrix $\Xi(\mathbf{q})$ in Eq. (4.13). Also, for any $3\times1$ vector $\boldsymbol{\omega}$, other useful identities are given by

$$\Xi(\mathbf{q})\,\boldsymbol{\omega} = \Omega(\boldsymbol{\omega})\,\mathbf{q} \tag{4.14a}$$

$$\Psi(\mathbf{q})\,\boldsymbol{\omega} = \Gamma(\boldsymbol{\omega})\,\mathbf{q} \tag{4.14b}$$

$$\Gamma^2(\boldsymbol{\omega}) = \Omega^2(\boldsymbol{\omega}) = -\left(\boldsymbol{\omega}^T\boldsymbol{\omega}\right)I_{4\times4} \tag{4.14c}$$

where

$$\Omega(\boldsymbol{\omega}) \equiv \begin{bmatrix} -[\boldsymbol{\omega}\times] & \boldsymbol{\omega} \\ -\boldsymbol{\omega}^T & 0 \end{bmatrix}, \qquad \Gamma(\boldsymbol{\omega}) \equiv \begin{bmatrix} [\boldsymbol{\omega}\times] & \boldsymbol{\omega} \\ -\boldsymbol{\omega}^T & 0 \end{bmatrix} \tag{4.15}$$

The matrices Ω(ω) and Γ(ω) are both skew symmetric matrices so that ΩT (ω) = −Ω(ω) and ΓT (ω) = −Γ(ω). They also commute.
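The structure of Eq. (4.15) and the properties just stated (skew symmetry, commutativity, and the identity of Eq. (4.14c)) are easy to verify numerically:

```python
import numpy as np

def cross_matrix(w):
    """[w x] such that cross_matrix(w) @ v equals np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def Omega(w):
    """Omega(omega) of Eq. (4.15)."""
    O = np.zeros((4, 4))
    O[:3, :3] = -cross_matrix(w)
    O[:3, 3] = w
    O[3, :3] = -w
    return O

def Gamma(w):
    """Gamma(omega) of Eq. (4.15)."""
    G = np.zeros((4, 4))
    G[:3, :3] = cross_matrix(w)
    G[:3, 3] = w
    G[3, :3] = -w
    return G
```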

Substituting Eq. (4.11) into Eq. (4.10), using the relations in Eq. (4.14), and ignoring terms independent of $\mathbf{q}$ gives

$$J(\mathbf{q}) = \mathbf{q}^T K\mathbf{q} - \frac{1}{2}\,\mathbf{q}^T L(\mathbf{q})\,\mathbf{q} \tag{4.16}$$

where

$$K \equiv \sum_{j=1}^{N}\left[\Omega\!\left(R_j^{-1}\tilde{\mathbf{b}}_j\right)\Gamma(\mathbf{r}_j)\right] \tag{4.17a}$$

$$L(\mathbf{q}) \equiv \sum_{j=1}^{N}\left[\Gamma(\mathbf{r}_j)\,\Xi(\mathbf{q})\,R_j^{-1}\,\Xi^T(\mathbf{q})\,\Gamma(\mathbf{r}_j)\right] \tag{4.17b}$$


The quaternion constraint can be handled by using the method of Lagrange multipliers, leading to the following appended loss function:

$$J'(\mathbf{q}) = \mathbf{q}^T K\mathbf{q} - \frac{1}{2}\,\mathbf{q}^T L(\mathbf{q})\,\mathbf{q} + \lambda\left(1 - \mathbf{q}^T\mathbf{q}\right) \tag{4.18}$$

where $\lambda$ is the Lagrange multiplier. Note that $J'(\mathbf{q})$ is quartic in the quaternion. The gradient with respect to $\lambda$ is simply $1 - \mathbf{q}^T\mathbf{q}$. The gradient with respect to $\mathbf{q}$ of the first term on the right-hand side (RHS) of Eq. (4.18) is simply $2K\mathbf{q}$. The gradient with respect to $\mathbf{q}$ of the second term on the RHS of Eq. (4.18) is more complicated due to the matrix $\Xi(\mathbf{q})$. The relation in Eq. (4.13d) provides a straightforward approach to compute the gradient of $\mathbf{q}^T L(\mathbf{q})\,\mathbf{q}$:

$$\frac{\partial\left[\mathbf{q}^T L(\mathbf{q})\,\mathbf{q}\right]}{\partial\mathbf{q}} = 2L(\mathbf{q})\,\mathbf{q} - \frac{\partial}{\partial\mathbf{q}}\left\{\mathbf{q}^T\left[\sum_{j=1}^{N}\Xi\!\left(\Gamma(\mathbf{r}_j)\,\bar{\mathbf{q}}\right)R_j^{-1}\,\Xi^T\!\left(\Gamma(\mathbf{r}_j)\,\bar{\mathbf{q}}\right)\right]\mathbf{q}\right\}\Bigg|_{\bar{\mathbf{q}}=\mathbf{q}} \tag{4.19}$$

Taking the partial in Eq. (4.19) and using the relation in Eq. (4.13d) again leads to

$$\frac{\partial\left[\mathbf{q}^T L(\mathbf{q})\,\mathbf{q}\right]}{\partial\mathbf{q}} = 4L(\mathbf{q})\,\mathbf{q} \tag{4.20}$$

Hence, the gradient of $J'(\mathbf{q})$ is given by

$$\mathbf{g}_q \equiv \frac{\partial J'(\mathbf{q})}{\partial\mathbf{q}} = 2\left[K - L(\mathbf{q}) - \lambda I_{4\times4}\right]\mathbf{q} \tag{4.21}$$

Now the bordered Hessian of $J'(\mathbf{q})$ with respect to $\mathbf{x}$ can be calculated. Using Eq. (4.21), the second partial $\partial^2 J'(\mathbf{q})/\partial\lambda\,\partial\mathbf{q}$ is given by $-2\mathbf{q}$. The partial $\partial[L(\mathbf{q})\,\mathbf{q}]/\partial\mathbf{q}^T$ can be computed using an approach similar to that of Eq. (4.19). Specifically, the partial of $\Xi^T(\mathbf{q})\,\Gamma(\mathbf{r}_j)\,\mathbf{q}$ is given by $2\,\Xi^T(\mathbf{q})\,\Gamma(\mathbf{r}_j)$, and the remaining partial can be found by using the relation in Eq. (4.14a), so that

$$\frac{\partial\left[L(\mathbf{q})\,\mathbf{q}\right]}{\partial\mathbf{q}^T} = 2L(\mathbf{q}) + \sum_{j=1}^{N}\left[\Gamma(\mathbf{r}_j)\,\Omega\!\left(R_j^{-1}\,\Xi^T(\mathbf{q})\,\Gamma(\mathbf{r}_j)\,\mathbf{q}\right)\right] = 2L(\mathbf{q}) + \sum_{j=1}^{N}\left[\Gamma(\mathbf{r}_j)\,\Omega\!\left(R_j^{-1}A(\mathbf{q})\,\mathbf{r}_j\right)\right] \tag{4.22}$$

where the relations in Eqs. (4.11) and (4.14b) have been used. The expression $\partial^2 J'(\mathbf{q})/\partial\mathbf{q}\,\partial\mathbf{q}^T$ is now given by

$$H_q \equiv \frac{\partial^2 J'(\mathbf{q})}{\partial\mathbf{q}\,\partial\mathbf{q}^T} = 2\left[K - 2L(\mathbf{q}) - M(\mathbf{q}) - \lambda I_{4\times4}\right] \tag{4.23}$$

where

$$M(\mathbf{q}) \equiv \sum_{j=1}^{N}\left[\Gamma(\mathbf{r}_j)\,\Omega\!\left(R_j^{-1}A(\mathbf{q})\,\mathbf{r}_j\right)\right] \tag{4.24}$$

The gradient and bordered Hessian of $J'(\mathbf{x})$ with respect to $\mathbf{x} = [\lambda \;\; \mathbf{q}^T]^T$ are now given by

$$\mathbf{g} \equiv \frac{\partial J'(\mathbf{x})}{\partial\mathbf{x}} = \begin{bmatrix} 1 - \mathbf{q}^T\mathbf{q} \\ \mathbf{g}_q \end{bmatrix} \tag{4.25a}$$

$$H \equiv \frac{\partial^2 J'(\mathbf{x})}{\partial\mathbf{x}\,\partial\mathbf{x}^T} = \begin{bmatrix} 0 & -2\mathbf{q}^T \\ -2\mathbf{q} & H_q \end{bmatrix} \tag{4.25b}$$

4.3  Attitude Matrix Based Solution

This section derives the gradient and the bordered Hessian expressions considering the attitude matrix itself. The loss function defined in Eq. (4.10) can be redefined as

$$J(\mathbf{a}) = \frac{1}{2}\sum_{j=1}^{N}\left(\tilde{\mathbf{b}}_j - H_j\mathbf{a}\right)^T R_j^{-1}\left(\tilde{\mathbf{b}}_j - H_j\mathbf{a}\right) \tag{4.26}$$


with $H_j \equiv \mathbf{r}_j^T \otimes I_{3\times3}$, where $\otimes$ denotes the Kronecker product. Partitioning the attitude matrix by its columns, $A = [\mathbf{a}_1 \;\; \mathbf{a}_2 \;\; \mathbf{a}_3]$, and also defining $\mathbf{a} \equiv [\mathbf{a}_1^T \;\; \mathbf{a}_2^T \;\; \mathbf{a}_3^T]^T$, Eq. (4.26) can be written as

$$J(\mathbf{a}) = -\mathbf{p}^T\mathbf{a} + \frac{1}{2}\,\mathbf{a}^T Z\,\mathbf{a} + \frac{1}{2}\,c \tag{4.27}$$

with

$$\mathbf{p} \equiv \begin{bmatrix}\mathbf{p}_1 \\ \mathbf{p}_2 \\ \mathbf{p}_3\end{bmatrix} = \sum_{j=1}^{N} H_j^T R_j^{-1}\tilde{\mathbf{b}}_j \tag{4.28a}$$

$$Z \equiv \sum_{j=1}^{N} H_j^T R_j^{-1} H_j = \begin{bmatrix} Z_{11} & Z_{12} & Z_{13} \\ Z_{12} & Z_{22} & Z_{23} \\ Z_{13} & Z_{23} & Z_{33} \end{bmatrix} \tag{4.28b}$$

$$c = \sum_{j=1}^{N}\tilde{\mathbf{b}}_j^T R_j^{-1}\tilde{\mathbf{b}}_j \tag{4.28c}$$

where each $\mathbf{p}_k$ is a $3\times1$ vector and each $Z_{\ell m}$ is a $3\times3$ matrix. Note that each $Z_{\ell m}$ sub-matrix is symmetric. Ignoring terms independent of $\mathbf{a}$, Eq. (4.27) can be written as

$$J(\mathbf{a}) = -\mathbf{p}^T\mathbf{a} + \frac{1}{2}\,\mathbf{a}^T Z\,\mathbf{a} \tag{4.29}$$

The attitude constraint $A^TA = AA^T = I_{3\times3}$ can be written as the following six conditions:

$$\mathbf{a}_1^T\mathbf{a}_1 = 1 \tag{4.30a}$$

$$\mathbf{a}_2^T\mathbf{a}_2 = 1 \tag{4.30b}$$

$$\mathbf{a}_3^T\mathbf{a}_3 = 1 \tag{4.30c}$$

$$\mathbf{a}_1^T\mathbf{a}_2 = 0 \tag{4.30d}$$

$$\mathbf{a}_1^T\mathbf{a}_3 = 0 \tag{4.30e}$$

$$\mathbf{a}_2^T\mathbf{a}_3 = 0 \tag{4.30f}$$


Multiplying out the terms in Eq. (4.29) and using the constraints in Eq. (4.30) leads to the following modified loss function to be minimized:

$$\begin{aligned} J'(\mathbf{a}) = {} & -\mathbf{p}_1^T\mathbf{a}_1 - \mathbf{p}_2^T\mathbf{a}_2 - \mathbf{p}_3^T\mathbf{a}_3 \\ & + \tfrac{1}{2}\left(\mathbf{a}_1^T Z_{11}\mathbf{a}_1 + \mathbf{a}_2^T Z_{22}\mathbf{a}_2 + \mathbf{a}_3^T Z_{33}\mathbf{a}_3\right) + \mathbf{a}_1^T Z_{12}\mathbf{a}_2 + \mathbf{a}_1^T Z_{13}\mathbf{a}_3 + \mathbf{a}_2^T Z_{23}\mathbf{a}_3 \\ & + \tfrac{1}{2}\left[\lambda_1\left(\mathbf{a}_1^T\mathbf{a}_1 - 1\right) + \lambda_2\left(\mathbf{a}_2^T\mathbf{a}_2 - 1\right) + \lambda_3\left(\mathbf{a}_3^T\mathbf{a}_3 - 1\right)\right] \\ & + \lambda_4\,\mathbf{a}_1^T\mathbf{a}_2 + \lambda_5\,\mathbf{a}_1^T\mathbf{a}_3 + \lambda_6\,\mathbf{a}_2^T\mathbf{a}_3 \end{aligned} \tag{4.31}$$

where each $\lambda_i$ is a Lagrange multiplier. The quadratic terms in the attitude parameters involving the Lagrange multipliers are multiplied by $1/2$ to simplify the algebra.

The gradient of Eq. (4.31) with respect to $\mathbf{a}$ is given by

$$\mathbf{g}_a \equiv \frac{\partial J'(\mathbf{a})}{\partial\mathbf{a}} = \left(Z + \Lambda \otimes I_{3\times3}\right)\mathbf{a} - \mathbf{p} \tag{4.32}$$

where $\mathbf{g}_a = 0_{9\times1}$ at the optimum point and

$$\Lambda \equiv \begin{bmatrix} \lambda_1 & \lambda_4 & \lambda_5 \\ \lambda_4 & \lambda_2 & \lambda_6 \\ \lambda_5 & \lambda_6 & \lambda_3 \end{bmatrix} \tag{4.33}$$

Note that $\left(\Lambda \otimes I_{3\times3}\right)\mathbf{a} = \mathrm{vec}(A\Lambda)$, where $\mathrm{vec}(G)$ is the vector made up of the stacked columns of a matrix $G$.

The design variable for the optimization process is given by the following vector:

$$\mathbf{x} = \begin{bmatrix}\boldsymbol{\lambda} \\ \mathbf{a}\end{bmatrix} \tag{4.34}$$

where $\boldsymbol{\lambda} = [\lambda_1 \;\; \lambda_2 \;\; \lambda_3 \;\; \lambda_4 \;\; \lambda_5 \;\; \lambda_6]^T$. Now the bordered Hessian can be calculated. The second partial $\partial^2 J'(\mathbf{a})/\partial\mathbf{a}\,\partial\mathbf{a}^T$ is given by

$$H_a \equiv \frac{\partial^2 J'(\mathbf{a})}{\partial\mathbf{a}\,\partial\mathbf{a}^T} = Z + \Lambda \otimes I_{3\times3} \tag{4.35}$$


The second partial $\partial^2 J'(\mathbf{a})/\partial\mathbf{a}\,\partial\boldsymbol{\lambda}^T$ can be computed by considering the following identity:

$$\left(\Lambda \otimes I_{3\times3}\right)\mathbf{a} = S\,\boldsymbol{\lambda} \tag{4.36}$$

where

$$S \equiv \begin{bmatrix} \mathbf{a}_1 & 0_{3\times1} & 0_{3\times1} & \mathbf{a}_2 & \mathbf{a}_3 & 0_{3\times1} \\ 0_{3\times1} & \mathbf{a}_2 & 0_{3\times1} & \mathbf{a}_1 & 0_{3\times1} & \mathbf{a}_3 \\ 0_{3\times1} & 0_{3\times1} & \mathbf{a}_3 & 0_{3\times1} & \mathbf{a}_1 & \mathbf{a}_2 \end{bmatrix} \tag{4.37}$$

Hence the second partial $\partial^2 J'(\mathbf{a})/\partial\mathbf{a}\,\partial\boldsymbol{\lambda}^T$ is given by $S$. The gradient and bordered Hessian of $J'(\mathbf{x})$ with respect to $\mathbf{x}$ are now given by

$$\mathbf{g} \equiv \frac{\partial J'(\mathbf{x})}{\partial\mathbf{x}} = \begin{bmatrix} \mathbf{a}_1^T\mathbf{a}_1 - 1 \\ \mathbf{a}_2^T\mathbf{a}_2 - 1 \\ \mathbf{a}_3^T\mathbf{a}_3 - 1 \\ \mathbf{a}_1^T\mathbf{a}_2 \\ \mathbf{a}_1^T\mathbf{a}_3 \\ \mathbf{a}_2^T\mathbf{a}_3 \\ \mathbf{g}_a \end{bmatrix} \tag{4.38a}$$

$$H \equiv \frac{\partial^2 J'(\mathbf{x})}{\partial\mathbf{x}\,\partial\mathbf{x}^T} = \begin{bmatrix} 0_{6\times6} & S^T \\ S & H_a \end{bmatrix} \tag{4.38b}$$
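Assembling $\mathbf{p}$, $Z$, and $S$ of Eqs. (4.28) and (4.37) is mechanical with the Kronecker product. As a consistency check, at the true attitude with noiseless measurements the unconstrained part of the gradient in Eq. (4.32) vanishes with all multipliers zero.

```python
import numpy as np

def build_p_Z(b_meas, r_ref, R_inv):
    """p and Z of Eq. (4.28), with H_j = r_j^T (Kronecker) I_3."""
    p, Z = np.zeros(9), np.zeros((9, 9))
    for b, r, Ri in zip(b_meas, r_ref, R_inv):
        Hj = np.kron(r.reshape(1, 3), np.eye(3))   # 3 x 9
        p += Hj.T @ Ri @ b
        Z += Hj.T @ Ri @ Hj
    return p, Z

def build_S(a):
    """S of Eq. (4.37) from the stacked columns a = [a1; a2; a3]."""
    a1, a2, a3 = a[:3], a[3:6], a[6:]
    z = np.zeros(3)
    return np.vstack([np.column_stack([a1, z, z, a2, a3, z]),
                      np.column_stack([z, a2, z, a1, z, a3]),
                      np.column_stack([z, z, a3, z, a1, a2])])
```

The identity of Eq. (4.36), $(\Lambda \otimes I_{3\times3})\mathbf{a} = S\boldsymbol{\lambda}$, holds for any symmetric $\Lambda$ built from the six multipliers.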

4.4  Analytical Solution

The method employed for the analytical solution is to 1) find all the stationary points of the minimization problem and 2) choose the global minimizer as the stationary point with the smallest objective function. For minimization problems with equality constraints only, the global minimizer must be one of the stationary points; thus the solution derived is globally optimal. Once the global optimum is found, the second-derivative test on the bordered Hessian matrix at that point can be applied to verify that it is indeed the global minimizer.

Here we choose to solve the system of polynomial equations derived from the facts that the gradient must be zero and that the equality constraints accompanying the chosen attitude parameterization must hold. The number of equations is $2n - 3$, where $n$ is the number of parameters of the attitude representation. For example, the polynomial system for the attitude quaternion has five equations, of which the highest degree is three; the polynomial system for the attitude matrix has 15 equations, of which the highest degree is two. We employ the homotopy continuation method, as shown in [20], to search for all the stationary points that solve the set of polynomial equations.

4.5  Newton's Solution

The method is based on the Taylor series of $F(x) = f'(x)$, where $f(x)$ is a function of a single variable, about the current estimate $x_n$, which we assume is close to the correct answer, denoted $x^*$, with the difference between $x^*$ and $x_n$ denoted $\Delta x$ [20]. That is,

$$F(x^*) = F(x_n + \Delta x) = F(x_n) + \frac{\partial F(x_n)}{\partial x}\,\Delta x + O(\Delta x^2) \tag{4.39}$$

Since $F(x^*) = 0$, and we have assumed we are close (implying $\Delta x \approx 0$) and discard the higher-order terms, we can solve this equation for $\Delta x$, giving

$$\Delta x = -\left[\frac{\partial F(x_n)}{\partial x}\right]^{-1} F(x_n) \tag{4.40}$$

Thus, a hopefully closer estimate is

$$x_{n+1} = x_n - \left[\frac{\partial F(x_n)}{\partial x}\right]^{-1} F(x_n) \tag{4.41}$$

We can continue applying these Newton steps until ∆x → 0, or until F → 0. Usually the stopping conditions used with Newton’s method use a combination of both these criteria.

Because our loss function $J$ depends on more than one variable, we use the multivariable version of Newton's method. In this case the Newton step is

$$\mathbf{x}_{n+1} = \mathbf{x}_n - \left[\frac{\partial \mathbf{F}(\mathbf{x}_n)}{\partial \mathbf{x}}\right]^{-1}\mathbf{F}(\mathbf{x}_n) \tag{4.42}$$

where the bold-face variables are column matrices, and the $-1$ superscript indicates a matrix inverse rather than "one over".
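A minimal sketch of the multivariable iteration of Eq. (4.42), solving a linear system per step instead of forming the inverse explicitly:

```python
import numpy as np

def newton(F, jac, x0, tol=1e-12, max_iter=50):
    """Newton's method, Eq. (4.42): solve jac(x) dx = -F(x) and update
    until either the step dx or the residual F(x) is small."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        dx = np.linalg.solve(jac(x), -F(x))
        x = x + dx
        if np.linalg.norm(dx) < tol or np.linalg.norm(F(x)) < tol:
            break
    return x
```

The combined stopping condition mirrors the text: iterate until $\Delta\mathbf{x} \to 0$ or $\mathbf{F} \to 0$.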

4.6  Solve for Initial State

Newton's iterative method needs an initial state very close to the actual optimal value, as stated in Sec. 4.5. For our case we need to find an initial quaternion for the quaternion based solution and an initial direction cosine matrix (DCM) for the attitude matrix based solution.

4.6.1  Solve for Initial Quaternion

The optimal quaternion is the solution of the polynomial Eq. (4.21) equated to zero. Consider the case where $R_j = \sigma^2 I_{3\times3}$. Using Eqs. (4.13b), (4.14b), and (4.14c), and the fact that $\Psi^T(\mathbf{q})\,\mathbf{q} = 0_{3\times1}$, yields $L(\mathbf{q})\,\mathbf{q} = c\,\mathbf{q}$, where

$$c \equiv -\frac{1}{\sigma^2}\sum_{j=1}^{N}\left(\mathbf{r}_j^T\mathbf{r}_j\right) \tag{4.43}$$

The optimal $\mathbf{q}$ is found when $\mathbf{g}_q = 0$. With $R_j = \sigma^2 I_{3\times3}$, this leads to

$$K\mathbf{q} = \bar{\lambda}\mathbf{q} \tag{4.44}$$

where $\bar{\lambda} = c + \lambda$. Equation (4.44) is exactly Davenport's form to determine the optimal quaternion, found by taking an eigenvalue/eigenvector decomposition of $K$. The initial quaternion is chosen by checking which eigenvector of $K$ from Eq. (4.17a) yields the smallest loss in Eq. (4.10). This initial quaternion may not necessarily be associated with the smallest eigenvalue of $K$. The initial $\lambda$ is found by assuming $\mathbf{g}_q = 0$ in Eq. (4.21), which is only true at the optimum point but is assumed here to be close to zero with the chosen initial quaternion. Taking the dot product $\mathbf{q}^T\mathbf{g}_q = 0$ and using $\mathbf{q}^T\mathbf{q} = 1$ gives

$$\mathbf{q}^T\left[K - L(\mathbf{q})\right]\mathbf{q} - \lambda\,\mathbf{q}^T\mathbf{q} = 0 \tag{4.45a}$$

$$\lambda\,\mathbf{q}^T\mathbf{q} = \mathbf{q}^T\left[K - L(\mathbf{q})\right]\mathbf{q} \tag{4.45b}$$

$$\lambda = \mathbf{q}^T\left[K - L(\mathbf{q})\right]\mathbf{q} \tag{4.45c}$$

The initial quaternion is substituted into Eq. (4.45) to determine the initial $\lambda$. Numerical errors may occur in the computation of the inverse of $H$. In order to provide a well-conditioned matrix, all of the $R_j$ are divided by some factor, which can be $\sigma^2$ for the star tracker case.

The bordered Hessian cannot be definite, so the standard unconstrained optimization condition that the Hessian matrix must be positive definite cannot be applied here. Fortunately, a simple test exists to determine whether or not a minimum is given. Sign conditions are imposed on the sequence of principal minors (determinants of upper-left-justified sub-matrices) of the bordered Hessian: the smallest minor consists of the first 3 rows and columns, the next of the first 4 rows and columns, and the last is the entire bordered Hessian. There are thus 3 minors to consider for the quaternion determination problem, since $H$ is a $5\times5$ matrix. With one constraint, a sufficient condition for a local minimum is that all of these minors have negative sign [21]. If the sign conditions are not met, then the algorithm is initialized using a different quaternion obtained from the eigenvectors of $K$. Note that these sign conditions must be met at the initial conditions; if not, the correction may be in the wrong direction. If none of the eigenvectors of $K$ meets the sign conditions, then a random initial quaternion is drawn and the process is repeated until the sign conditions are met.

4.6.2  Solve for Initial DCM

The initial attitude matrix for Newton's method in the attitude matrix based solution is calculated with a new approach, called semidefinite relaxation (SDR), applied to quadratically constrained quadratic programming (QCQP) problems.

4.6.3  Basics of Quadratically Constrained Quadratic Programming (QCQP)

We introduce and study a special class of nonconvex quadratic problems in which the objective and constraint functions have the form f(x) = tr(xT Ax) + 2tr(BT x) + c, x ∈ Rn. This formulation is termed quadratically constrained quadratic programming (QCQP) [22]. SDR is a powerful, computationally efficient approximation technique for a host of very difficult optimization problems. In particular, it can be applied to many nonconvex QCQP problems in an almost mechanical fashion.


These include the following problems:

min tr(xT A0 x) + 2tr(B0T x) + c0    (4.46a)

s.t. tr(xT Ai x) + 2tr(BiT x) + ci ≤ αi, i ∈ I    (4.46b)

tr(xT Aj x) + 2tr(BjT x) + cj = αj, j ∈ ε    (4.46c)

x ∈ Rn    (4.46d)

with Ai = AiT ∈ Rn×n, Bi ∈ Rn×1, αi, ci ∈ R, i ∈ {0} ∪ I ∪ ε.

Notation. Vectors are denoted by boldface lowercase letters, e.g., y, and matrices by boldface uppercase letters, e.g., A. For two matrices A and B, A > B (A ≥ B) means that A − B is positive definite (semidefinite). S+n = {A ∈ Rn×n : A = AT, A ≥ 0} is the set of all real n × n symmetric positive semidefinite matrices. 0n×m is the n × m matrix of zeros and Ir is the r × r identity matrix. For a matrix M, vec(M) denotes the vector obtained by stacking the columns of M. For two matrices A and B, A ⊗ B denotes the corresponding Kronecker product. Eijr is the r × r matrix with one at the ij-th component and zero elsewhere, and δij is the Kronecker delta, i.e., δii = 1 and δij = 0 for i ≠ j.

4.6.4  Basics of Quadratic Matrix Programming (QMP)

Quadratic matrix programming is closely related to QCQP problems [22]. A quadratic matrix function of order r is a function f : Rn×r → R of the form

f(X) = tr(XT AX) + 2tr(BT X) + c, X ∈ Rn×r    (4.47)

where A ∈ Sn, B ∈ Rn×r and c ∈ R. If B = 0n×r and c = 0 then f is called a homogeneous quadratic matrix function. We note that every quadratic vector function of the form in Eq. (4.46) is a quadratic matrix function of order one. The opposite statement is also true: every quadratic matrix function is in particular a quadratic vector function. Indeed, the function f from Eq. (4.47) can be written as

f(X) = f v(vec(X))    (4.48)


where f v : Rnr → R is defined by

f v(z) = zT (Ir ⊗ A)z + 2vec(B)T z + c    (4.49)

The function f v is called the vectorized function of f, with z = vec(X).
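The equivalence of Eqs. (4.48) and (4.49) is easy to verify numerically. The sketch below, using arbitrary random data, checks that the matrix form f(X) and its vectorization f v(vec(X)) agree, with vec taken as column stacking:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 3, 3
A = rng.standard_normal((n, n))
A = 0.5 * (A + A.T)                      # symmetric A, as in Eq. (4.47)
B = rng.standard_normal((n, r))
c = 1.7
X = rng.standard_normal((n, r))

# Matrix form f(X) = tr(X^T A X) + 2 tr(B^T X) + c
f = np.trace(X.T @ A @ X) + 2.0 * np.trace(B.T @ X) + c

# Vectorized form f^v(z) = z^T (I_r kron A) z + 2 vec(B)^T z + c
z = X.flatten(order="F")                 # vec() stacks columns
fv = z @ np.kron(np.eye(r), A) @ z + 2.0 * B.flatten(order="F") @ z + c
```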

4.6.5  Basics of Semidefinite Relaxation

Equation (4.46) is the general QCQP formulation. When all the Ai are positive semidefinite, the problem is convex and can be solved efficiently. Here we focus on the case where at least one of the Ai is not positive semidefinite. This class of QCQPs is nonconvex and NP-hard (very difficult to solve).

Equation (4.46) is a nonhomogeneous QCQP. The first step in solving a QCQP problem is to convert the nonhomogeneous QCQP into a homogeneous one. We can homogenize Eq. (4.49) as follows:

f v(z; t) = zT (Ir ⊗ A)z + 2vec(B)T z t + ct2    (4.50)

= [zT t] [(Ir ⊗ A)  vec(B); vec(B)T  c] [z; t]    (4.51)

s.t. t2 = 1    (4.52)

Problem (4.49) is equivalent to Problem (4.50) in the following sense: if (z∗, t∗) is an optimal solution to Eq. (4.50), then t∗ z∗ is an optimal solution to Eq. (4.49).

The real-valued homogeneous QCQP can be written as follows:

min_{x ∈ Rn} xT C0 x    (4.53a)

s.t. xT Ci x ≤ αi, i ∈ I    (4.53b)

xT Cj x = αj, j ∈ ε    (4.53c)


The crucial step in deriving an SDR [23] of Problem (4.53) is to observe that

xT C0 x = tr(xT C0 x) = tr(C0 xxT)    (4.54a)

xT Ci x = tr(xT Ci x) = tr(Ci xxT)    (4.54b)

In particular, both the objective function and constraints in Eq. (4.53) are linear in the matrix xxT . Thus, by introducing a new variable X = xxT and noting that X = xxT is equivalent to X being a rank one symmetric positive semidefinite (PSD) matrix, we obtain the following equivalent formulation of Eq. (4.53):

min_X tr(C0 X)    (4.55a)

s.t. tr(Ci X) ≤ αi, i ∈ I    (4.55b)

tr(Cj X) = αj, j ∈ ε    (4.55c)

X ∈ S+nr+1, rank(X) = 1 and Xnr+1,nr+1 = 1    (4.55d)
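The lifting step behind this formulation rests on the identity in Eq. (4.54): a quadratic form in x becomes linear in X = xxT, which is a rank-one PSD matrix. A quick numerical check with arbitrary data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
C = rng.standard_normal((n, n))
C = 0.5 * (C + C.T)                  # symmetric cost matrix

x = rng.standard_normal(n)

# Lift the quadratic form: X = x x^T is rank-one and PSD, and the
# quadratic form x^T C x equals tr(C X), which is linear in X
X = np.outer(x, x)
quad = x @ C @ x
lifted = np.trace(C @ X)
```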

Equation (4.55) is just as difficult to solve as Eq. (4.53). However, the formulation in Eq. (4.55) allows us to identify the fundamental difficulty in solving Eq. (4.53). Indeed, the only difficult constraint in Eq. (4.55) is the rank constraint rank(X) = 1, which is nonconvex (the objective function and all other constraints are convex in X). Thus, we may as well drop it to obtain the following relaxed version of Problem (4.53):

min_X tr(C0 X)    (4.56a)

s.t. tr(Ci X) ≤ αi, i ∈ I    (4.56b)

tr(Cj X) = αj, j ∈ ε    (4.56c)

X ∈ S+nr+1 and Xnr+1,nr+1 = 1    (4.56d)

Equation (4.56) is known as the SDR of Eq. (4.53), where the name stems from the fact that Eq. (4.56) is an instance of semidefinite programming (SDP).

4.6.6  Basics of Semidefinite Programming

A semidefinite program is a generalization of a linear program (LP), where the inequality constraints in the latter are replaced by generalized inequalities corresponding to the cone of positive semidefinite matrices [24]. Concretely, a semidefinite program (SDP) in the pure primal form is defined as the optimization problem in Eq. (4.56). The crucial feature of semidefinite programs is that the feasible set defined by the constraints above is always convex. This is therefore a convex optimization problem, since the objective function is linear. Simply speaking, the SDP can be solved with a worst-case complexity of O(max{m, n}4 n1/2 log(1/ε)) given a solution accuracy ε > 0. SDR is thus a computationally efficient approximation approach to QCQP, in the sense that its complexity is polynomial in the problem size n and the number of constraints m.

Of course, there is no free lunch in turning an NP-hard problem into a polynomial-time solvable one. Indeed, a fundamental issue that one must address when using SDR is how to convert a globally optimal solution X∗ of the relaxation into a feasible solution x̃ to Eq. (4.53). If X∗ is of rank one, then there is nothing to do, for we can write X∗ = x∗x∗T, and x∗ will be a feasible and in fact optimal solution to Eq. (4.53). On the other hand, if the rank of X∗ is larger than 1, then we must somehow extract from it, in an efficient manner, a vector x̃ that is feasible for Eq. (4.53). There are many ways to do this, and they generally follow some intuitively reasonable heuristics. However, we must emphasize that even though the extracted solution is feasible for Eq. (4.53), it is in general not an optimal solution (for otherwise we would have solved an NP-hard problem in polynomial time). The most reasonable way to extract x̃ from X∗ is to apply a rank-1 approximation [23]. Carrying out the eigendecomposition, we have

X∗ = Σ_{i=1}^{r} λi qi qiT    (4.57)

where r = rank(X∗), λ1 ≥ λ2 ≥ · · · ≥ λr > 0 are the eigenvalues and q1, . . . , qr ∈ Rn the respective eigenvectors. Since the best rank-one approximation X1∗ to X∗ (in the least 2-norm sense) is given by X1∗ = λ1 q1 q1T, we may define x̃ = √λ1 q1 as our candidate solution to Problem (4.53), provided that it is feasible. Otherwise, we can try to map x̃ to a “nearby” feasible solution x̂. In general, such a mapping is problem dependent, but it can be quite simple.
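The rank-one extraction of Eq. (4.57) can be sketched with NumPy as follows; X∗ here is a hypothetical rank-two matrix for illustration, not one obtained from the attitude problem:

```python
import numpy as np

# Hypothetical SDP solution X* of rank two (illustration only)
v1 = np.array([2.0, 1.0, -1.0])
v2 = np.array([0.1, -0.2, 0.1])
X_star = np.outer(v1, v1) + np.outer(v2, v2)

# Eigendecomposition X* = sum_i lam_i q_i q_i^T, Eq. (4.57)
lam, Q = np.linalg.eigh(X_star)       # eigenvalues in ascending order
lam1, q1 = lam[-1], Q[:, -1]          # dominant eigenpair

# Best rank-one approximation and candidate solution x~ = sqrt(lam1) q1
x_tilde = np.sqrt(lam1) * q1
X1_star = np.outer(x_tilde, x_tilde)
```

The spectral norm of the residual X∗ − X1∗ equals the largest discarded eigenvalue, which quantifies how close X∗ was to rank one.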


One may ask: since SDR is an approximation method, why not instead approximate a nonconvex QCQP with an available nonlinear programming method (NPM), such as sequential quadratic programming or fmincon in the MATLAB Optimization Toolbox, and which method is better? Interestingly, rather than competing, the two complement each other. The quality of an NPM solution depends on the initial guess. Here we consider a two-stage approach, in which SDR is used to provide a starting point for the NPM.

4.6.7  SDR Method for Calculating DCM

The generalized Wahba loss function that we need to solve using semidefinite relaxation is expressed in QMP form:

min_a J(a) = aT Z a − 2pT a + c    (4.58a)

s.t. tr(AT (Eijr + Ejir)A) = 2δij, 1 ≤ i, j ≤ r    (4.58b)

where A = [a1 a2 a3] and a ≡ [a1T a2T a3T]T, Eijr is the r × r matrix with one at the ij-th component and zero elsewhere, and δij is the Kronecker delta, i.e., δii = 1 and δij = 0 for i ≠ j. Equation (4.58) can be re-arranged in the form

min_X tr(C0 X)    (4.59a)

s.t. tr(Cij X) = 2δij, 1 ≤ i, j ≤ r    (4.59b)

X ∈ S+nr+1, rank(X) = 1 and Xnr+1,nr+1 = 1    (4.59c)

where X = aaT and

C0 = [Z  −p; −pT  c]    (4.60)

Cij = [Ir ⊗ (Eijr + Ejir)  0nr×1; 0nr×1T  0]    (4.61)


where n = 3, r = 3 and nr = 9. After the rank relaxation the SDR form is

min_X tr(C0 X)    (4.62a)

s.t. tr(Cij X) = 2δij, 1 ≤ i, j ≤ 3    (4.62b)

X ∈ S+10 and X10,10 = 1    (4.62c)

Now once X is calculated from the SDP, the rank-one approximation defined in Sec. 4.6.6 is used to derive an approximate ã. We use this ã to find a feasible â and the initial λ defined in Eq. (4.34) by equating Eq. (4.32) to zero at the optimum:

ga = (Z + Λ ⊗ I3×3) a − p = Za − p + Sλ = 0    (4.63)

Since S is not a square matrix, we calculate λ = (ST S)−1 ST (p − Za).
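The multiplier computation above is an ordinary linear least-squares problem. The sketch below uses random stand-ins for Z, p, S and a (only the assumed sizes are taken from the problem) and shows that a numerically stable least-squares solve agrees with the normal-equations formula:

```python
import numpy as np

rng = np.random.default_rng(2)
nr, m = 9, 6                         # assumed sizes: a in R^9, lambda in R^6
Z = rng.standard_normal((nr, nr))
Z = 0.5 * (Z + Z.T)                  # symmetric Z, as in the loss function
p = rng.standard_normal(nr)
S = rng.standard_normal((nr, m))     # hypothetical stand-in for S in Eq. (4.63)
a = rng.standard_normal(nr)

# lambda = (S^T S)^{-1} S^T (p - Za), computed two ways
lam_lstsq, *_ = np.linalg.lstsq(S, p - Z @ a, rcond=None)
lam_normal = np.linalg.solve(S.T @ S, S.T @ (p - Z @ a))
```

In practice the lstsq form is preferred, since explicitly forming ST S squares the condition number of S.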

Chapter 5

Simulation Results

The simulation results for both the analytical and the numerical approaches are provided here.

5.1  Results from Analytical Approach

For the analytical approach the homotopy-continuation-based solver HOM4PS 2.0 is used. This is an executable that loads the polynomial system by reading an input data file and saves the solution to an output data file. It should be noted that the homotopy continuation method finds all isolated complex roots of a polynomial system instead of real roots only. Since only real roots have physical meaning, post-processing is needed to eliminate the complex roots with non-zero imaginary part. We choose an attitude quaternion randomly as:

q = [0.637500352868806, 0.493382124324880, −0.240485730112637, 0.540679196105240]T    (5.1)


Two pairs of unit vector observations are:

b1 = [0.342058406974454, −0.000003188735171, 0.939678693069035]T
b2 = [−0.341981877972056, −0.000003188829692, 0.939706547347914]T
r1 = [0.349166114804979, −0.744574798193212, −0.568938831657266]T
r2 = [0.077272833574846, −0.996993389688454, 0.005752400267768]T

The vectors b1 and b2 are corrupted by zero-mean Gaussian noise with covariances

R1 = 1 × 10−8 [0.116115131090038  0  0.059348024255871; 0  0.065977502594721  0; 0.059348024255871  0  0.088339528391755]

R2 = 1 × 10−8 [0.116115131090038  0  −0.059348024255871; 0  0.065977502594721  0; −0.059348024255871  0  0.088339528391755]

respectively. The standard deviation of the noise is σ = 2.9 × 10−5 rad, which as a 3σ value corresponds to 0.005 degrees. We consider q and −q to be identical as they produce the same attitude matrix.

Table 5.1: List of All Quaternions Evaluated by HOM4PS 2.0

No.    q1                     q2                     q3                     q4                     J
1      0.576328195302476      −0.668587129608844     0.451225902164481      0.131271652067944      1.03113931096
2      −0.493249463212554     0.637443242953661      −0.540797700787657     −0.240642734932995     1.03857939181
3      0.540676634429397      −0.240480667186984     −0.493386181378772     −0.637501295462117     10.30485103735
4      0.396876314160757      −0.588790744326098     0.615016827152104      0.342883293518145      1.03113931100
5      0.240572194258626      0.540744079998105      0.637469177413595      −0.493309139530404     7.50940462864
6      −0.637494539022518     −0.493371523988791     0.240497872532947      −0.540690322958134     0.00000000543

The quaternion that results in the smallest value of J is the estimated attitude quaternion:

q̂ = [−0.637494539022518, −0.493371523988791, 0.240497872532947, −0.540690322958134]T    (5.2)


The second derivative test, analyzing the bordered Hessian matrix at that point, shows that the point is a minimum. The quaternion error between the original and the estimated quaternion is

δq = [0.000001696715693, 0.000020360115815, 0.000000000034545, 0.999999999791293]T    (5.3)

which in terms of angle error is 0.0023 degrees. Similar results are achieved for the DCM approach as well.
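The 0.0023-degree figure can be reproduced from Eq. (5.3): for a small rotation the quaternion vector part is approximately half the rotation angle times the axis, so the angle is about twice the norm of the vector part. A minimal check (vector part first, scalar part last):

```python
import numpy as np

# Quaternion error from Eq. (5.3), vector part first, scalar part last
dq = np.array([0.000001696715693,
               0.000020360115815,
               0.000000000034545,
               0.999999999791293])

# Small-angle approximation: angle ~ 2 * ||vector part||
angle_deg = np.degrees(2.0 * np.linalg.norm(dq[:3]))
```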

5.2  Results from Newton’s Approach

The initial attitude matrix (DCM) for Newton’s method is calculated using the software package CVX, a MATLAB-based modeling system for convex optimization (here, SDP). For the initial quaternion we use Davenport’s q-method, with MATLAB’s built-in eigendecomposition routine. To measure the accuracy and consistency of the results we use Monte Carlo simulation and the Cramér-Rao inequality.

5.2.1  Cramér-Rao Inequality

The Cramér-Rao inequality gives a lower bound on the expected errors between the estimated quantities and the true values, from the known statistical properties of the measurement errors. We first consider a conditional probability density function of the unknown parameters and the measurements, denoted by p(ỹ|x). The Cramér-Rao inequality for the unbiased estimate x̂ is given by

P ≡ E{(x̂ − x)(x̂ − x)T} ≥ F−1    (5.4)

where the Fisher information matrix F can be computed using the Hessian matrix, given by

F = −E{∂2 ln[p(ỹ|x)] / (∂x ∂xT)}    (5.5)


For the multivariate normal distribution the conditional density function is

p(ỹ|x) = (2π)−m/2 [det(R)]−1/2 exp{−(1/2)[ỹ − Hx]T R−1 [ỹ − Hx]}    (5.6)

The natural log of p(ỹ|x) is given by

ln p(ỹ|x) = −(1/2)[ỹ − Hx]T R−1 [ỹ − Hx] − (m/2) ln(2π) − (1/2) ln[det(R)]    (5.7)

We can ignore the last two terms since they are independent of x. The Fisher information matrix is calculated using Eq. (5.5) and is given by

F = HT R−1 H    (5.8)

Therefore the estimated error covariance lower bound is

P ≥ (HT R−1 H)−1    (5.9)

In our case the Fisher information for the attitude is expressed in terms of incremental error angles, δα, defined according to

Aest = A(δα) Atrue    (5.10)

Since δα represents small angles we have

Aest = [I − [δα×]] Atrue    (5.11)

where Atrue is the true attitude, Aest is the estimated attitude and the 3 × 3 matrix [δα×] is the cross-product matrix. For the calculation of the Fisher information matrix, Wahba’s general weighted loss function can be used in place of the negative log-likelihood of the multivariate normal distribution, as they have the same form.

The parameter vector is now given by x = δα, and the covariance is defined by P = E{xxT} − E{x}E{x}T. The optimal error covariance can be derived as:

P = ( Σ_{j=1}^{N} [Atrue rj×] Rj−1 [Atrue rj×]T )−1    (5.12)

The attitude Atrue is evaluated at its respective true value. In practice, though, Atrue rj is often ˜ j , which allows a calculation of the covariance without computing replaced with the measurement b an attitude. The Fisher information matrix is nonsingular only if at least two non-collinear observation vectors exist. This is due to the fact that one vector observation gives only two pieces of attitude information.
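A short sketch of evaluating the bound in Eq. (5.12), using two hypothetical orthogonal reference vectors, Atrue = I, and an assumed isotropic measurement covariance Rj = σ²I:

```python
import numpy as np

def cross_matrix(v):
    """Cross-product matrix [v x], so that cross_matrix(v) @ w = v x w."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# Hypothetical setup: A_true = I, two orthogonal reference vectors,
# isotropic covariance R_j = sigma^2 I (assumed values)
sigma = 2.9e-5
R_inv = np.eye(3) / sigma**2
refs = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]

# Fisher information and attitude covariance bound of Eq. (5.12)
F = sum(cross_matrix(r) @ R_inv @ cross_matrix(r).T for r in refs)
P = np.linalg.inv(F)
```

With a single reference vector the sum would be singular, reflecting the observation that one vector provides only two pieces of attitude information.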

5.2.2  Monte Carlo Simulation for Newton’s Approach

The quaternion estimates are found from 1000 trial runs using a different random number seed between runs. Statistical conclusions can be made if the generalized Wahba problem is solved many times using different measurement sets; this approach is known as Monte Carlo simulation. Plots of the actual errors for each estimate and the associated 3σ boundaries, found by taking the square root of the diagonal elements of the P lower bound from the Cramér-Rao inequality and multiplying the result by 3, are shown below for the roll, pitch and yaw errors. From probability theory, for a Gaussian distribution there is a 0.9973 probability that each estimate error component lies inside its 3σ boundary. The Monte Carlo simulation shows that the errors present at the initial state are significantly reduced within two iterations of Newton’s method.
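A minimal Monte Carlo consistency check of the kind described can be sketched for a single scalar error component (all numbers hypothetical except σ):

```python
import numpy as np

rng = np.random.default_rng(42)      # fixed seed for repeatability
sigma = 2.9e-5
n_runs = 1000

# Draw 1000 hypothetical estimation errors and count the fraction that
# falls inside the 3-sigma boundary (about 0.9973 for a Gaussian)
errors = sigma * rng.standard_normal(n_runs)
fraction_inside = np.mean(np.abs(errors) < 3.0 * sigma)
```

A well-tuned estimator should show roughly this fraction of samples inside the boundary; a much smaller fraction indicates an overconfident covariance.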

5.3  Comparison of Computational Efforts

For the analytical solution we use the software package HOM4PS 2.0 to apply the homotopy continuation method to the sets of polynomial equations derived for both the quaternion approach and the attitude matrix approach. Two MATLAB programs are written:

1. The first program prepares the input data file, which contains the set of polynomials, for the HOM4PS 2.0 executable.

2. The second program processes the output data file produced by the executable and identifies the


[Figure 5.1: Comparisons of Roll Error for the Newton’s Quaternion Approach — (a) Q-method Roll Error; (b) Final Estimate Roll Error and 3σ Boundaries. Axes: Roll Errors (urad) vs. Run Number.]

[Figure 5.2: Comparisons of Pitch Error for the Newton’s Quaternion Approach — (a) Q-method Pitch Error; (b) Final Estimate Pitch Error and 3σ Boundaries. Axes: Pitch Errors (urad) vs. Run Number.]

solution that produces the least cost for Wahba’s generalized problem given by Eq. (4.10). The software package HOM4PS 2.0 takes 0.26 seconds to evaluate all the stationary points for the quaternion approach, which consists of five polynomial equations of order three. The attitude matrix approach, which consists of fifteen polynomial equations of order two, takes 27.3 seconds to evaluate.

To calculate the initial state for the Newton’s method quaternion approach we use the q-method, which involves an eigendecomposition of the 4 × 4 K matrix given in Eq. (4.17). For this purpose we use


[Figure 5.3: Comparisons of Yaw Error for the Newton’s Quaternion Approach — (a) Q-method Yaw Error; (b) Final Estimate Yaw Error and 3σ Boundaries. Axes: Yaw Errors (urad) vs. Run Number.]

[Figure 5.4: Comparisons of Roll Error for the Newton’s DCM Approach — (a) SDR Roll Error; (b) Final Estimate Roll Error and 3σ Boundaries. Axes: Roll Errors (urad) vs. Run Number.]

the MATLAB eig function. Evaluating the initial state and applying Newton’s method takes around 1.6 × 10−6 seconds. The software package CVX is used to determine the initial state for the Newton’s method attitude matrix approach. The total time taken for this approach is around 0.4 seconds.

All the programs are executed on a Windows computer with a 2.4 GHz Intel i5 processor and 6.0 GB of RAM. It is seen that Newton’s method is faster than the homotopy method. The only disadvantage of Newton’s method is that the solution cannot be argued to be a global optimal


[Figure 5.5: Comparisons of Pitch Error for the Newton’s DCM Approach — (a) SDR Pitch Error; (b) Final Estimate Pitch Error and 3σ Boundaries. Axes: Pitch Errors (urad) vs. Run Number.]

[Figure 5.6: Comparisons of Yaw Error for the Newton’s DCM Approach — (a) SDR Yaw Error; (b) Final Estimate Yaw Error and 3σ Boundaries. Axes: Yaw Errors (urad) vs. Run Number.]

solution, whereas in the homotopy method the solution is the global optimal solution.

Chapter 6

Conclusion

Two novel numerical methods were presented for solving Wahba’s general weighted minimization problem in attitude determination. The first is based on the method of homotopy continuation, which finds all the stationary points of the minimization problem and is guaranteed to find the global optimal solution. The second is based on Newton’s method, for which the initial conditions are derived from the SDR form of the minimization problem, which is convex.

Semidefinite relaxation and semidefinite programming are relatively new in the field of attitude determination and space technology. To date, only one paper, by Ahmed et al. [26], addresses robust attitude determination using SDP. The method is widely used in other engineering domains, such as signal processing and operations research, and therefore has great potential, as it finds the global optimal solution of the relaxed problem. Representing the minimization problem in different forms, using different attitude parameterizations, can open new and efficient ways of solving Wahba’s general weighted problem.


Bibliography

[1] Hall, D. C., Spacecraft Attitude Dynamics and Control, chap. 4, Krieger Publishing Co., Malabar, FL, 1991.

[2] Crassidis, J. L. and Junkins, J. L., Optimal Estimation of Dynamic Systems, chap. 2, 2nd ed., CRC Press, Boca Raton, FL, 2012.

[3] Cemenska, J., “Sensor Modeling and Kalman Filtering Applied to Satellite Attitude Determination,” Master’s Thesis, 2003.

[4] Secroun, A., Lampton, M., and Levi, M., “A High Accuracy, Small Field of View Star Guider with Application to SNAP,” Experimental Astronomy, Vol. 11, June 2002.

[5] Nawaz, A., “The Sensor System for Fine Guiding the SNAP Satellite,” Master’s Thesis.

[6] Light, D. L., “Satellite Photogrammetry,” Manual of Photogrammetry, edited by C. C. Slama, chap. 17, 4th ed., American Society of Photogrammetry, Falls Church, VA, 1980.

[7] Shuster, M. D., “Kalman Filtering of Spacecraft Attitude and the QUEST Model,” The Journal of the Astronautical Sciences, Vol. 38, No. 3, July-Sept. 1990, pp. 377–393.

[8] Shuster, M. D. and Oh, S. D., “Three-Axis Attitude Determination from Vector Observations,” Journal of Guidance and Control, Vol. 4, No. 1, Jan.-Feb. 1981, pp. 70–77.

[9] Cheng, Y., Crassidis, J. L., and Markley, F. L., “Attitude Estimation for Large Field-of-View Sensors,” The Journal of the Astronautical Sciences, Vol. 54, No. 3/4, July-Dec. 2006, pp. 433–448.

[10] Shuster, M. D., “Constraint in Attitude Estimation Part I: Constrained Estimation,” The Journal of the Astronautical Sciences, Vol. 51, No. 1, Jan.-March 2003, pp. 51–74.

[11] Wahba, G., “A Least-Squares Estimate of Satellite Attitude,” SIAM Review, Vol. 7, No. 3, July 1965, p. 409.

[12] Markley, F. L. and Mortari, D., “How to Estimate Attitude from Vector Observations,” AAS/AIAA Astrodynamics Specialist Conference, August 1999, AAS Paper 99-427.

[13] Shuster, M. D., “Approximate Algorithms for Fast Optimal Attitude Computation,” AIAA Guidance and Control Conference, August 1978.

[14] Markley, F. L., “Attitude Determination Using Vector Observations and the Singular Value Decomposition,” The Journal of the Astronautical Sciences, Vol. 36, No. 3, July-Sept. 1988, pp. 245–258.

[15] Mortari, D., “ESOQ: A Closed-Form Solution to the Wahba Problem,” The Journal of the Astronautical Sciences, Vol. 45, No. 2, April-June 1997, pp. 195–204.

[16] Mortari, D., “ESOQ2 Single-Point Algorithm for Fast Optimal Attitude Determination,” AAS/AIAA Space Flight Mechanics Meeting, Huntsville, AL, February 1997, AAS Paper 97-167.

[17] Markley, F. L., Crassidis, J. L., and Cheng, Y., “Nonlinear Attitude Filtering Methods,” AIAA Guidance, Navigation, and Control Conference, August 2005.

[18] Chiang, A. C., Fundamental Methods of Mathematical Economics, 3rd ed., McGraw-Hill, 1984, p. 386.

[19] Shuster, M. D., “A Survey of Attitude Representations,” The Journal of the Astronautical Sciences, Vol. 41, No. 4, Oct.-Dec. 1993, pp. 439–517.

[20] Cheng, Y. and Crassidis, J. L., “Attitude Estimation Based on Solution of System of Polynomials via Homotopy Continuation,” AIAA/AAS Astrodynamics Specialist Conference, August 2012.

[21] Moskowitz, M. A. and Paliogiannis, F., Functions of Several Real Variables, chap. 3, World Scientific, Singapore, 2011.

[22] Beck, A., “Quadratic Matrix Programming,” SIAM Journal on Optimization, January 2007.

[23] Luo, Z.-Q., Ma, W.-K., So, A. M.-C., Ye, Y., and Zhang, S., “Semidefinite Relaxation of Quadratic Optimization Problems,” IEEE Signal Processing Magazine, Vol. 27, No. 3, May 2010, pp. 20–34.

[24] Parrilo, P. A. and Lall, S., “Semidefinite Programming Relaxations and Algebraic Optimization in Control,” European Journal of Control, Vol. 9, No. 2-3, 2003, pp. 307–321.

[25] Schönemann, P. H., “A Generalized Solution of the Orthogonal Procrustes Problem,” Psychometrika, Vol. 31, No. 1, March 1966, pp. 1–10.

[26] Ahmed, S., Kerrigan, E. C., and Jaimoukha, I. M., “Semidefinite Relaxation of a Robust Static Attitude Determination Problem,” IEEE Conference on Decision and Control and European Control Conference, December 2011.
