Development of an Enhanced Adaptive Control

0 downloads 0 Views 9MB Size Report
Apr 27, 2011 - where ¯w is a vector containing all values of the synaptic weights w plq ..... r2. ¤ ¯r. |¯r|. (3.3) where M and m are the attracting masses, ¯r is the ...
Studienarbeit: R1117S

Development of an Enhanced Adaptive Control Strategy and Visualization of the Spacecraft Clemens Rumpf (2898704) May 2011

Prof. Dr.-Ing. Peter Vörsmann Technische Universität Braunschweig Institut für Luft- und Raumfahrtsysteme Hermann-Blenk-Straße 23, 38108 Braunschweig

45

Figures

16

Tables

Contents 1. Introduction

16

2. Neural Networks 2.1. Qualities of Artificial Neural Networks 2.2. Historical Overview . . . . . . . . . . . 2.3. Propagation of Signals . . . . . . . . . . 2.4. Learning . . . . . . . . . . . . . . . . . . 2.4.1. Initiating Learning . . . . . . . . 2.4.2. Derivation of Backpropagation . 2.4.3. Jacobi Matrix . . . . . . . . . . . 2.4.4. Inverse Propagation . . . . . . . 2.5. Gradient Descent Method . . . . . . . . 2.6. Sliding Mode Control . . . . . . . . . . . 2.6.1. Derivation of SMC algorithm . .

. . . . . . . . . . .

19 19 20 22 24 24 25 28 30 31 33 33

3. Mathematical Attitude Representation 3.1. Motion of a Satellite . . . . . . . . . 3.1.1. Orbit Mechanics . . . . . . . 3.1.2. Attitude Dynamics . . . . . . 3.2. Direction Cosine Matrices . . . . . . 3.3. Quaternions . . . . . . . . . . . . . . 3.3.1. Quaternion Multiplication . . 3.3.2. Quaternion Inverse . . . . . . 3.3.3. Error Quaternion . . . . . . . 3.3.4. Why to use Quaternions . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

38 39 39 40 42 44 45 46 47 49

4. Neural Control 4.1. Satellite Control Architectures . . . . . . . . . . . . . . . . 4.1.1. Observer-Controller . . . . . . . . . . . . . . . . . 4.1.2. Reference Model augmented Observer-Controller 4.1.3. Augmented Observer-Controller after [1] . . . . . 4.2. Reference Model . . . . . . . . . . . . . . . . . . . . . . . 4.2.1. Simplified Physical Model . . . . . . . . . . . . . . 4.2.2. Control Law . . . . . . . . . . . . . . . . . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

51 51 51 53 55 56 57 58

2

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

Contents

Contents

4.2.3. Simulink Implementation . . . 4.2.4. Stability Analysis . . . . . . . . 4.3. Stable Error Signal after [1] . . . . . . 4.3.1. Proof of Stability . . . . . . . . 4.3.2. The Effect . . . . . . . . . . . . 4.3.3. Simulink Implementation . . . 4.4. Control Angle Error . . . . . . . . . . 4.4.1. Calculation of the Error Angle 4.5. Mission and Past Work . . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

58 61 62 63 65 67 68 68 70 73 73 75 78 82 83 83 84 84 86 88 89 89 90 91

5. Investigation 5.1. Reference Model and Parameter Variation . 5.1.1. Parameter Variation . . . . . . . . . 5.1.2. Reference Model Influence . . . . . 5.2. Network Size . . . . . . . . . . . . . . . . . 5.3. Stabilized Error Signal . . . . . . . . . . . . 5.3.1. Two Error Signal Versions . . . . . . 5.3.2. Controller Performance Outcome . 5.3.3. Influence of A and B . . . . . . . . . 5.4. Multiple Training Runs . . . . . . . . . . . . 5.5. Increased Residual Atmosphere Density . . 5.6. Inverse Output Signal . . . . . . . . . . . . 5.6.1. Physical Interpretation . . . . . . . . 5.6.2. Inverse Output Signal Quality . . . 5.6.3. Control Signal Relation . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

6. Visualization 6.1. The Interface . . . . . . . . . . . . . . . 6.1.1. Client-Server . . . . . . . . . . 6.1.2. Peer-to-Peer . . . . . . . . . . . 6.2. Implementation . . . . . . . . . . . . . 6.2.1. Floating Point Number "Float" 6.2.2. Data Formatting . . . . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

93 . 94 . 94 . 94 . 96 . 98 . 100

. . . . . .

. . . . . .

. . . . . .

7. Summary and Conclusions

102

A. Appendix A.1. Project Management . . . . . . A.2. Source Code . . . . . . . . . . . A.2.1. Subproject Visualization A.3. Simulink Models . . . . . . . .

105 105 117 124 126

. . . .

3

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

List of Figures 2.1. Depiction of interconnected brain cells. [2] . . . . . . . . . . . . . . . . 2.2. Simple perceptron devided into layers. . . . . . . . . . . . . . . . . . . 2.3. Inputs and outputs of example neuron # 3 in layer (l+1). Inside the neuron is the transfer function f pq which process the weighted sum of all inputs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.4. The two commonly used transfer functions: Hyperbolic tangent (left) and Linear (right) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.5. Illustration of a simple neural network with bias neurons. Some weights have been left out for reasons of clarity. . . . . . . . . . . . . . 2.6. The plot shows a hypothetical curve of network error E as a function of weights wij . A tangent on Epwq illustrates the slope B E{B wplq . The derivative’s magnitude determines the weight adjustment. . . . . . . . 2.7. Plot of e-function in equation 2.36. It illustrates the influence of λ’s sign on stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

21 22

3.1. Visualization of cosines between vectors. . . . . . . . . . . . . . . . . .

43

4.1. Simplified standard control theory block diagram [3]. All blocks inside the dashed frame are united in the "Plant"-block in this work. . . 4.2. Block diagram of observer-controller architecture. . . . . . . . . . . . . 4.3. Block diagram of indirect neural reference controller which implements a reference model to the observer-controller architecture. . . . . 4.4. Block diagram of augmented observer-controller architecture after [1]. 4.5. Screen shot of reference model as implemented as Simulink model . . 4.6. Block diagram of a reference model augmented observer-controller architecture. Signals are labled with corresponding time signatures. . 4.7. Block diagram of observer-controller architecture with error stabilization after [1]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.8. The plot shows two hypothetical curves of the plant’s and reference model’s state y (blue) and yre f (red), respectively. The augmented reference model output ym is illustrated as green dot. Errors e  yre f  y and em  ym  y are labled. Stabilized error signal em is calculated by equation 4.13. The augmented state is drawn closer to the plant’s state by Aepkq Bepk  1q and therefore produces weaker error signals. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

4

24 25 26

31 35

51 52 54 55 59 60 62

65

List of Figures

List of Figures

4.9. Simulink model of the stable error signal implementation using subtraction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.10. Simulink model of the stable error signal implementation using difference quaternion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.11. 3D-Illustration of two frames with error angles ∆Θ, ∆Φ and ∆Ψ between the corresponding axis. . . . . . . . . . . . . . . . . . . . . . . . . 4.12. Earth map with satellite’s time-marked ground path. Ground stations and their visibility are included [4]. . . . . . . . . . . . . . . . . . . . . 4.13. Satellite’s pointing accuracy using a PID controller. Deviation angle is plotted over mission time. The top plot provides a zoomed-in view while the bottom plot gives the zoomed-out illustration. . . . . . . . . 4.14. Plot of control accuracy over time using unmodified observer-controller architecture. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.1. Plot of control accuracy over time using reference model augmented observer-controller with baseline parameter setting. . . . . . . . . . . . 5.2. Plot of control accuracy over time using reference model augmented observer-controller with parameter variation of validation interval = 10. 5.3. Plot of control accuracy over time using reference model augmented observer-controller with parameter variation of λ  0.1. The result is unstable. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4. Two plots of control error and corresponding µ-values over mission time. Three neural networks control the attitude. Therefore three µseries are shown. The learning rate µ is small preventing adequate learning. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5. Plot of observer’s inverse output signal over mission time. The inverse output signal also determines weight adaptation magnitude. It is a small scale signal. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.6. Two plots of control accuracy and µ time history for reference model augmented observer-controller architecture. With a parameter choice of λ  103 and G  102 , the controller’s performance increased significantly. The higher learning rate µ is key for the success. . . . . . . . . 5.7. Plot of control accuracy over mission time for the control architecture implementing stabilized error signal after [1], subtraction version. The controller grows unstable at the end of the mission. The simulation corresponds to # 10 in table 5.10. . . . . . . . . . . . . . . . . . . . . . . 5.8. Two plots show the effect of factors A and B on controlling robustness. The top plot visualizes controlling performance for A  0.2 & B  0.1 while the bottom plot shows the results for A  0.5 & B  0.25. Both simulations use the same parameter settings as listed in table 5.10 #11 & #12. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

5

67 67 68 70

71 72 74 76

78

79

80

81

86

87

List of Figures

List of Figures

5.9. Two plots compare correlation between deviation signal (top plot) and inversely propagated signal (bottom plot). For reasons of clarity only the second element of the signals is shown. A correlation in shape is recognizable. Zero crossings show a strong correlation. Even though shape is similar, both signals have different signs, occasionally. The plots correspond to the mission with high performance parameter selection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.10. Two plots show correlation between observer approximation quality (top) and inversely propagated signal (bottom) over time. When approximation quality deteriorates, the inverse signal’s quality becomes noisy. For reasons of clarity, only the third of the three element signals is shown for the high performance mission. . . . . . . . . . . . . . . . . 5.11. Two plots compare inverse output (top plot) und NNC control signal (bottom plot). A correlation is visible only in degree of noise carried by the signals. The signals correspond to the high performance parameter selection. For reasons of clarity only the signal’s first element is shown. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

90

91

92

6.1. 6.2. 6.3. 6.4.

Screenshot of the space situation visualization software [5] . . . . . . . 93 Simulink Blocks for sending and receiving data via an UDP Port . . . 96 UDP send-block panel which allows to set additional parameters . . . 97 Subsystem which formats and sends the attitude information. The signal is augmented with a time stamp which is generated by the "Simulation time"-block. For peer-to-peer operation, the "Receive Stop Trigger"-block is included. . . . . . . . . . . . . . . . . . . . . . . . . . . 99 6.5. Bit assignment for a float value [6]. . . . . . . . . . . . . . . . . . . . . . 100 6.6. Graphical illustration of how a 32-bit float array is formatted in order to be send as four eight-bit integer via UDP. Simulink’s interface end is represented on top side. One DCM-entry is shown as 32-bit float value in binary coding. It is disintegrated into four 8-bit integers and sent via UDP to visualization’s interface end located at the bottom. Here, integers are reassembled to 32-bit float values. The example binary float array from figure 6.5 has been used. . . . . . . . . . . . . . 101 A.1. Top level of the Simulink attitude control simulator model. . . . . . . 126 A.2. Neural control block implementing observer-controller architecture. . 127 A.3. Neural control block implementing reference model augmented observercontroller architecture. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 A.4. Neural control block implementing augmented error signals. . . . . . 129 A.5. Simulink subsystem receiving trigger information and holding the simulation until trigger reception in peer-to-peer configuration. . . . . 130

6

List of Figures

List of Figures

A.6. Output of state signals for post simulation evaluation in SimulinkWorkspace. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

7

List of Tables 5.1. Listing of baseline parameter selection for observer-controller with reference model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2. Performance index values for baseline parameter selection with and without reference model. . . . . . . . . . . . . . . . . . . . . . . . . . . 5.3. Performance index values for the validation interval parameter variation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4. Performance index values for the training interval parameter variation. 5.5. The table shows the performance index values for λ-variations and if performance appears stable. . . . . . . . . . . . . . . . . . . . . . . . . . 5.6. Performance comparison of reference model with and without magnified error signal to traditional observer-controller architecture. . . . 5.7. The table shows the performance index values for observer-controller configuration with magnified error signal. . . . . . . . . . . . . . . . . 5.8. Neuron distribution for small, middle and big sized networks. The high performance reference model augmented configuration corresponds to the small network setting. The neuron distribution xx-yy-zz means that xx neurons are located in the input layer, yy in the hidden layer and zz in the output layer. Performance data is provided. . . . . 5.9. Control accuracy results for stabilized error signal enhanced control architecture. Results of the subtraction version are compared to results of the difference version. The corresponding network parameters and gain factors are presented. . . . . . . . . . . . . . . . . . . . . . . . . . . 5.10. Control accuracy results for simulations with and without stabilized error signal. Multiple parameter settings are examined. . . . . . . . . . 5.11. Table listing results of multiple training runs. Small networks have been employed with varying parameter sets. . . . . . . . . . . . . . . . 5.12. Various atmospheric densities and its effect on controller outcome for conventional and high performance reference model controller. . . . . 5.13. Control accuracy of neural controller for doubled atmospheric density. The performance increases with each successive simulation, as the network adapts itself. . . . . . . . . . . . . . . . . . . . . . . . . . .

8

74 75 76 77 78 80 82

83

84 85 87 88

89

List of Tables

List of Tables

6.1. Interactions of Simulink and Visualization in client-server configuration. The visualization works continuously and uses the time-stamp to synchronize data with Simulink. Simulink is the server and provides attitude data as well as time-stamps for synchronization. . . . . . . . . 95 6.2. Interactions of Simulink and Visualization in peer-to-peer configuration. 95 6.3. Value range of a 32 bit float. . . . . . . . . . . . . . . . . . . . . . . . . . 100

9

Nomenclature Latin A

Start frame. Primes correspond to intermediate frames.

A

First error signal augmentation factor.

a

Variable in SMC algorithm.



Unit vector.



Acceleration vector.

B

End frame.

B

Second error signal augmentation factor.

b

Variable in SMC algorithm.

bˆ 1

ˆ First element of unit vector b.

C AB

Rotation matrix or direction cosine matrix which represents rotation from start attitude A to end attitude B.

cij

Element ij of rotation matrix (direction cosine matrix) C.



Performance index.

D

Matrix.

E

Squared neural network Error.

e

Error value.



Unit rotation axis.



Force vector.

f pq

Transfer function of a neuron.

G

Inverse output gain factor.

10

List of Tables

List of Tables Inertia Matrix.

I i,

j

i,ˆ

j,ˆ

Neuron identification numbers. kˆ

Unit vectors building the imaginary frame.

Kd

Derivative gain factor.

Kp

Proportional gain factor.

k

Time step.

L

Number of layers in a neural network.



Angular momentum vector.



Lever arm.

l

Neural network layer.

m

Mass.

¯ N

Torque vector.

nil

Input to neuron i in layer pl q.

oil

Output of neuron i in layer pl q.

qo

Real part of q.

q pl q

Number of neurons in layer pl q.

q

Quaternion four-tuple.



Imaginary parts of q.



Position vector.

S

Sliding mode surface.

T

Total time.

T

Target neural network output.

t

Time.

u

Control signal.

V

Lyapunov candidate function.

11

List of Tables v¯

List of Tables Velocity-vector.

pl q

wij

Weight connecting neuron i in layer pl q with neuron j in layer pl

x

Neural network input.



Estimated neural network input.

y

Neural network output or, in section 4.3, abstracted system state.

1q.

Greek α

Angle.

β

Angle.

Γ

Universal gravitational constant.

γ

Lyapunov stability boundary.

∆Θ

Error angle of one-axis.

∆Φ

Error angle of two-axis.

∆Ψ

Error angle of three-axis.

pl q

δj

pl q δˆj

Alternate expression for network error sensitivity with respect to the input of neuron j in layer pl q, B E{B npj lq . Alternate expression for network output sensitivity with respect to the input of neuron j in layer pl q, B y{B npj lq .

δ˜

Estimated network output sensitivity with respect to the estimated network input, B y{B n.

e

Linear neural network error.

η¯

Eigenvector.

κ

Momentum term parameter for gradient descent algorithm.

λ

1) Sliding mode control parameter. 2) parameter of eλt -approach.

µ

Learning rate for weight adaptation algorithms.

µC

Earth’s gravitational constant.

12

List of Tables

List of Tables

ξ

Rotation angle.

ρ

Atmospheric density.

σ

Lyapunov stability boundary.

τ

Control torque.

¯ Υ

Angular attitude.

ω¯

Angular velocity vector.

Indices

pqa pqaug pqc pqd pqG pqm pqobs pqre f

Actual state. Augmented. Control / Controller. Desired state. Gravity. Augmented output enhanced state. Observed / Observer. Reference Model.

Acronyms ACS

Attitude Control System.

ANN

Artificial Neural Network.

DCM

Direction Cosine Matrix is another expression for rotation matrix.

DLR

German Aerospace Center.

ECEF

Earth Centered Earth Fixed coordinate frame.

ECI

Earth Centered Inertial coordinate frame.

EoM

Equation of Motion.

ILR

Institut für Luft- und Raumfatsysteme at the Braunschweig University of Technology. (Aerospace Institute in Engl.)

IP

Internet Protocol. Network Protocol.

13

List of Tables

List of Tables

ISS

International Space Station.

NNC

Neural Network Controller. A controller which implements a neural network.

NNI

Neural Network Identifier. Observer in an observer-controller architecture which implements an ANN.

NN

Neural Network.

NORAD

North American Aerospace Defense Command. Unit of the United States military.

PID

Proportional-Integral-Derivative. Controller Type in control theory.

SMC

Sliding Mode Control. Neural optimization strategy.

SNARC

Stochastic Neural Analog Reinforcement Calculator. First ANN to be built.

TCP

Transmission Control Protocol. Network Protocol.

TLE

Two Line Element. Catalogue and data format for space object orbit data.

UAV

Unmanned Aerial Vehicle.

UDP

User Datagram Protocol. Network Protocol.

14

Neural networks are successfully employed in a broad spectrum of applications. Research on neural control has gained momentum and bears much potential. At the Aerospace Institute (ILR) of the Braunschweig University of Technology, a high fidelity attitude control system Simulink model has been developed in cooperation with the German Aerospace Center (DLR). Investigations are based on this simulator. The work at hand aims at enhancing a neural control strategy for a three-axis controlled satellite. Two approaches are taken. The first implements a physical reference model to an observer-controller neural control architecture. Reference models yield desirable effects in terms of performance and stability. Development and characteristics of physical reference models are presented. Through parameter variation, it can be shown that the reference model increases control accuracy by  20% compared to previous work. The second approach uses the concept of stabilized error signals for stable learning of neural networks. Two versions of this method are implemented into the simulation. The method is scrutinized and its effects evaluated. It is shown that the method dampens the learning signal which does not necessarily improve stability. An attempt to proof Lyapunov stability is made but cannot be completed. Despite considerable effort and numerous simulations, a stabilizing effect cannot be found. At the ILR, a space object visualization program has been developed. A network interface which synchronizes attitude control simulator and visualization tool is developed. For this purpose, 32-bit floating point values have to be coded to 8-bit integer values. Number formatting and coding is explained. Synchronization is successful and the controlled satellite may be watched in real time as it follows its mission profile. In preparation of the main investigation, concepts of neural networks and their functional principles are presented. Gradient descent and sliding mode control (SMC) optimization algorithms are explained. Equations of motion for a satellite’s translational (orbit mechanics) and rotational motion (attitude dynamics) are derived. Mathematical approaches to describe attitude such as direction cosine matrices (DCM) and Quaternions are introduced. A difference quaternion, representing a rotation between desired and actual attitude, is derived. During the course of the main investigation, effects of parameter variations on SMC control outcome are investigated. It may be deduced that λ, learning rate µ and learning signal magnitude are essential for successful learning. The observer produces a training signal for the neural controller through inverse propagation of state deviation. Its inverse output signal is investigated and correlated to state deviation as well as control signal. The observer performs desirably but facilitates oscillations in the actuator activity. Neural networks possess the unique capability to adapt themselves over time to increase control performance. This ability is shown in extended simulations which also take place in increased atmospheric density.

1. Introduction The research project connects the two scientific fields of neural networks and satellite attitude control. Neural networks (NN) imitate basic functions of biological brains. The concept of neural networks has been known since humans began to understand the functional principles of the brain. Rather than one processing unit, NNs are made up of numerous neurons which process information in parallel. Neural networks are nonlinear approximators which enables them to learn and imitate complex processes. This predestines them to fulfil tasks where pattern recognition of complex systems is required. Under the paradigm of applying neural networks, research in various fields of potential applications has gained momentum in recent decades. Today, NNs perform tasks in a broad spectrum of applications. For example, they successfully recognize faces in camera systems, predict stock market trends and determine botanical classification of plants. One suitable task is neural control. Traditional controllers are designed to work at a specific operating point. This operating point represents a rigid framework of environmental and system parameters. The controller provides stable and adequate performance only in the operating point’s vicinity. This restriction is necessary because traditional controllers are based on linearisation of the plant as well as environmental conditions. However, in real life, processes are generally nonlinear. When the operating point moves too far away from the design point, linearisation loses its claim to adequately represent the process and performance deteriorates. On the other hand, neural networks are well suited to control nonlinear systems. Theoretically, they are able to perform adequate control for an unlimited number of operating points due to their ability to approximate nonlinear patterns. Approximation is achieved through learning. Neural controllers evaluate their performance continuously and may thus adapt to new operating conditions. This capability is especially effective for slow changes as the network has sufficient time to adapt itself. For example, a spacecraft changes its inertia over time as it burns propellant or releases payloads. Neural controllers will compensate the change as it happens. On the other hand, the aspect of continuous adaptation yields a significant risk for neural network operations. Just as misunderstandings between teacher and student are possible in real life, neural networks are prone to adapt themselves in an adverse way. To this day, an algorithm guaranteeing stable learning remains to be found. Consequently, systems other than neural networks, are applied to life sensitive tasks.

16

1. Introduction

1. Introduction

With the launch of Sputnik, in 1957, an new industry - the space industry - was born. Since then, the sector has been growing steadily. Today it is a multi-billion Euro industry with 74 launches in the year 2010 alone. Ever since its beginning the space sector has facilitated international cooperations all over the world. Even during the cold war, space enabled peaceful interactions among the USA and the, then called, USSR, when a Soyuz spacecraft docked with an Apollo capsule in 1975. With the dawning commercialisation of space, the sector is changing faster than ever. New concepts, such as neural controllers, will more easily be implemented as smaller companies bring new impulses to the industry. Attitude control is of major importance in space flight. Spacecraft are free to rotate in all three dimensions. Various disturbances affect the satellite in an undesirable fashion. Residual atmosphere exerts drag. Solar radiation, Earth’s magnetic field and gravity produce torques which act on the satellite. To counteract those disturbances, satellites often use a combination of actuators. Thrusters exert small impulses to turn satellites. Reaction control wheels are spinning masses which produce a moment by accelerating the wheels. Magnetic torquers interact with Earth’s magnetic field to create a moment. Attitude control systems ensure that the spacecraft is oriented in a desirable way. Attitude needs to be controlled continuously to meet several objectives. For communication, high gain antennae have to point at ground stations while the spacecraft swings as it passes overhead. Space observatories have to hold a specific attitude while they expose their sensors to the light of distant stars. In short, attitude control is essential to accomplish the satellite’s mission. Neural controllers are a speciality of the Aerospace Institute (ILR) at the Braunschweig University of Technology. Research focuses on the implementation of neural controllers as autopilots for unmanned areal vehicles (UAV). Since underlying principles of neural networks may be transferred from atmospheric flight to alternate applications, some research has been done on neural satellite attitude controllers. For this purpose, a high fidelity MatLab Simulink model has been developed in cooperation with the German Aerospace Center (DLR) [7]. The model simulates attitude control of satellites in Earth’s orbit under the influence of environmental disturbances. System characteristics such as sensor and Kalman filter dynamics are considered. The simulator enables investigation of different attitude control strategies. It has been designed with a conventional PID controller. Neural controllers have successfully been implemented in succeeding work [4, 8]. The neural controllers are able to control the satellite. However, their performance is several times inferior to a conventional control strategy. Only a hybrid system with a weak neural component was able to outperform the conventional controller. The work at hand aims to develop an enhanced neural control strategy to improve control performance. A focus is to implement a reference model to assist the neural controllers. Reference models promise desirable effects in terms of network learning and control accuracy. Furthermore, an approach introduced by article [1], which

17

1. Introduction

1. Introduction

promises stable learning of neural networks, shall be implemented and investigated. The approach augments the learning signal with network error’s time history. Space Debris is of increasing concern among the space industry community. Already today, the International Space Station (ISS) has to perform debris evasion manoeuvres and the debris situation will aggravate in the future. At the ILR, dedicated software tools are developed and employed to analyse risk imposed by space debris and to predict future development of the situation. Recently, a software has been developed which visualizes space debris orbiting about Earth. It may also display active satellites. An additional objective of this work is to build an interface which synchronizes the Simulink model with the visualization environment. The goal is to enable the user to watch the satellite in real time as it is being controlled. In chapter 2, neural networks are introduced and their functional principles are presented. Back and forward propagation of signals is explained. With gradient descent method and sliding mode control (SMC), two weight optimization algorithms are provided. Chapter 3 provides necessary mathematical concepts for attitude representation. In section 3.1, equations of motion for translational and rotational motion are derived. Attitude representation in terms of rotation matrices and quaternions is introduced in sections 3.2 and 3.3, respectively. Three neural control architectures are introduced in chapter 4. The observer-controller architecture is the fundamental neuro control configuration. After presenting theory and development, a physical reference model is added to this configuration. Section 4.3 investigates the concept of error stabilization after [1]. A control performance benchmark is introduced in section 4.4. To end the chapter and lead over to the investigation of results, the satellite’s mission and previous work is recapitulated in section 4.5. Chapter 5 starts with results of the reference model augmented control strategy. The attributes of different network sizes is investigated. Section 5.3 scrutinizes the stable error signal concept. Furthermore, the effects of extended learning time and increased atmospheric density are covered in this chapter. Characteristics of the inverse output signal are presented in the last section. The second task of visualizing the spacecraft is covered in chapter 6. Conclusions are presented in chapter 7.

18

2. Neural Networks A neural network is a network of interconnected neurons. The term describes every attempt to model a biological brain. It is not merely used in engineering alone but also in other scientific fields such as biology and psychology. In engineering, neural networks are also called artificial neural networks. An artificial neural network (ANN) is an model of a biological brain.

2.1. Qualities of Artificial Neural Networks ANNs are nonlinear approximators. Current applications include: facial recognition, control tasks, stock market analysis, to name a few. ANNs are applicable wherever a task requires pattern recognition in the broadest sense. Neural control is one suitable application. It has distinct advantages over conventional methods due to its nonlinear approximation qualities and ability to learn and adapt. While conventional controllers are linearised to operate in the vicinity of their design point, ANNs actually approximate the plant’s nonlinear behaviour. Therefore, they are theoretically able to control the plant under every acceptable condition. ANNs recognize poor performance and optimize themselves. Intriguingly, the reason for poor performance is irrelevant. Two obvious explanations come to mind when talking about the causes for poor controlling performance: 1. The controller is poorly tuned. 2. The internal (actuators) or external (disturbances) environment changes such that the system performs outside the operating point. In the first case, the solution is simple for both, conventional and ANN controller. The controller’s parameters need to be adjusted. For a conventional controller, this can only be done before the controller is implemented into the system. Whereas for an ANN controller, adjustment of parameters is possible while the system is operating. In the second case, only the ANN controller is able to adapt to a new environment outside the operating point. Eventually, the ANN controller will be able to control the plant satisfactory even if, initially, it was intended for another operating point. In the current application, the ANN is supposed to orient a satellite. For this purpose four reaction control wheels in tetrahedron formation generate the necessary control

19

2. Neural Networks

2.2. Historical Overview

moment. To illustrate network adaptation capabilities, the following scenario may be considered: The satellite orbits Earth in an altitude of 600km. Residual atmosphere disturbs attitude. Atmospheric density, and therefore disturbance, doubles if the satellite looses 100km altitude. For a conventional controller, performance suffers as the operating point moves away from the design point. A neural controller, on the other hand, adapts itself over time and compensates the new environmental condition. Those properties cause the attractiveness of neural networks for control tasks. However, the ANN’s property to continuously adapt itself also causes concerns from an engineering point of view. So far, it is impossible to guarantee stable performance for an ANN. The networks adaptation is unpredictable for multi-variable systems. Therefore, it might adapt itself in an unfavourable fashion. This is the reason why ANNs do not operate systems where human life is at stake. Research on ANNs is ongoing and might provide a proof of stability one day. Scientific study on neural networks began as soon as the functional principles of biological brains had been discovered.

2.2. Historical Overview In 1909, Cajal found that the brain consists of neurons, highly interconnected via synapses [9]. In the following years, researchers were able to achieve a good understanding of their functional principles. Figure 2.1 gives an illustration of interconnected braincells. Dendrites receive incoming signals and route them to neurons via axons. A neuron is the "processing unit" inside a brain. If excited sufficiently, it fires an impulse which can be excitatory or inhibitory. Synapses filter and transmit the signals using neurotransmitters1 . They alter the signal based on the properties of the neurotransmitter. Neural networks work such that many neurons input a single neuron as shown in figure 2.2. The more numerous and exciting incoming impulses are, the more likely is a strong impulse of the processing neuron. Inspired by the new findings, in 1949, Hebb described a simple way to imitate the learning process of neural networks [10]. In those days, popular conviction was that learning required change in the brain’s structure. Hebb argued that learning is rather accomplished by adapting the strength of connections between neurons. Building on Hebb’s work, Edmonds and Minsky built the first learning machine "SNARC2 " in 1951 [9]. In 1957, Rosenblatt invented the perceptron ANN architecture. The perceptron describes a feed-forward ANN capable of learning. In a feed-forward network, a neuron sends signal only to neurons in the next layer closer to the output. Other 1 chemicals 2 Stochastic

which transmit signals from a neuron across the synaptic gab to a target neuron Neural Analog Reinforcement Calculator

20

2. Neural Networks

2.2. Historical Overview

Figure 2.1.: Depiction of interconnected brain cells. [2] neural network architectures exist in which connections follow different rules. However, those types remain undiscussed in this work. Feed-forward networks resemble three categories of layers as shown in figure 2.2: • The first layer is called the retina or input layer. The neurons in this layer usually incorporate a liner transfer function3 . • The second category is the hidden layer. Neurons in this layer utilize non-liner transfer functions such as the hyperbolic tangent. Often, this area is made up of many successive neuron layers. • At the end of a perceptron is the output layer. Similarly to the input layer, a 3 transfer

functions are described later on

21

2. Neural Networks

2.3. Propagation of Signals

linear transfer function processes data in this layer.

Layers Input

Hidden

Output

Figure 2.2.: Simple perceptron devided into layers. The original perceptron only had one layer made up of one neuron in the hidden layer. However, it is common to have multiple layers with many neurons in the hidden layer, these days. These networks are called "Multi Layer Perceptrons" (MLP) [11]. MLP’s are the most frequently used networks today. They offer the possibility of backpropagation which will be described in further detail in section 2.4.2. The next section is about the propagation of signals through an ANN.

2.3. Propagation of Signals Inspired by the biological original, some analogies may be drawn to create a computational model of a neural network. According to Hebb, adapting the strength of connections between neurons results in learning. In biology, synapses take the role of strengthening connections depending on the type of neurotransmitter they employ. Therefore, an analogy for a synapse may be found Weights replace synapses in ANNs. They are scalar factors the output of a neuron is multiplied by. Learning of the ANN is accomplished by altering the values

22

2. Neural Networks

2.3. Propagation of Signals

pl q

of the connection weights. A weight is denoted by wij . The two indices i and j describe the neurons which the weight connects. The identifying number of the sending neuron is i. The receiving neuron is given by j. The layer, the sending neuron is located in, is denoted by pl q. Neurons receive incoming impulses and generate an excitatory or inhibitory signal based on it. The input to the neuron is given by the weighted sum

p l 1q  ¸ w p l q o p l q

nj

ij

i

(2.1)

i

pl 1q denotes the input n of neuron j in layer l 1. The output of neuron i in pl q layer pl q is o . where n j

i

Weighted Sum describes the function which processes the incoming signals and generates the input signal to a neuron. The weighted sum may be regarded as analogues to dendrites. Based on the input signal n, a transfer function generates the neuron’s output. Transfer Functions generate the impulse of a neuron. Therefore, transfer functions may be regarded as analogues to neurons. Usually, the transfer function is either linear or the hyperbolic tangent function. Their plots are given in figure 2.4. The non-linear hyperbolic tangent function enables an ANN to estimate non-linear functions. Figure 2.3 gives a graphical interpretation of these relations. Weighted sum and transfer function have been combined inside the neuron’s depiction. In a stepby-step description, the process within the neuron is given as follows: First, the pl 1q  °qplq wplq oplq , where qplq is the weighted sum calculates the neuron input n3 i i3 i number of neurons in layer p l q . Next, the transfer function calculates the neuron’s 

pl 1q

pl 1q

output o3  f n3 where f pq represents the transfer function. Bias neurons are to be mentioned briefly. Some of today’s network use them as an additional degree of freedom to adjust the network settings. The simulation, this work is based on, uses bias neurons. Bias neutrons offer strong parallelism to the purpose of synaptic weights. As opposed to conventional neurons, they lack an input. However, they output a constant value which is multiplied by a synaptic weight. Usually, the output value is one. Therefore, only the value of the synaptic weight is transmitted to the receiving neuron. Fundamentally, the addition of bias neutrons affects the network architecture such that an extra synaptic weight is present for every neuron in each layer. More synaptic weights offer more adjustment opportunities when the network is learning.

23

2. Neural Networks

2.4. Learning

o1l

wl

13

l w23

o2l .. .

 pl q q¸ pl 1q  wplq oplq f3 i3 i

p l 1q

o3

i

w qplq 3 l

oql plq Figure 2.3.: Inputs and outputs of example neuron # 3 in layer (l+1). Inside the neuron is the transfer function f pq which process the weighted sum of all inputs. Figure 2.5 offers an overview of a simple neural network with bias neurons and weights. The network input is denoted by x while the output is called y. In this example, the network only has one output. However, it should be clear that an artificial neural netword may have multiple in- and outputs.

2.4. Learning ANNs are able to approximate any given function. To accomplish this creed, an ANN needs to adjust its weights until the desired output is generated. The procedure of adjustment is called learning. Section 2.4.1 describes how learning is triggered while section 2.4.2 explains how learning is accomplished.

2.4.1. Initiating Learning In order to converge on the correct behaviour, the ANN needs to know when to learn. First, the network produces a forward propagation pattern. A pattern describes a state of the network where values of synaptic weights, inputs and outputs are stored. The network accomplishes a forward propagation when it propagates the input through the network as described in section 2.3. The propagation produces the network output yi . The index i identifies the output neuron. In order to know if training is required, the quality of the network output yi needs to be evaluated. The target network output Ti is introduced. Index i denotes the corresponding output neuron. A target network output Ti necessitates knowledge of a

24

2. Neural Networks

2.4. Learning

Hyperbolic Tangent

Linear 3

0.8 2

0.6 0.4

1

! f(x)=tanh(x)

0

f(x)

f(x)

0.2

! f(x)=x

0

−0.2 −1

−0.4 −0.6

−2

−0.8 −2

0

x

−3

2

−2

0

x

2

Figure 2.4.: The two commonly used transfer functions: Hyperbolic tangent (left) and Linear (right) desired network output. If Ti and yi are indifferent, training is obsolete. The criteria to determine the need for training is the squared error function E. E

1¸ p Ti  yi q2 2

(2.2)

i

Whenever E exceeds a predefined threshold value, training takes place.

2.4.2. Derivation of Backpropagation Backpropagation is a technigue able to determine the imapct of every single weight on the network error E. The result are the derivatives of the error E with respect to each synaptic weight

BE

B wijplq

@

i, j P weights,

P t1, .., Lu

where

25

L  maxpnumber of layersq  1. (2.3)

2. Neural Networks

2.4. Learning

pl1q

x1

pl1q

f1

pl q

f1

w 11

w pl q 1

w p l 1 12 q

1

pl q

w21

pl q

y1

p lq 1 w qp

1

pl1q

f q p l 1 q

b1

w pl q

x2

pl 1q

f1

lq

1q plqplq

w

f2

pl q

f qpl q Bias

Bias

Figure 2.5.: Illustration of a simple neural network with bias neurons. Some weights have been left out for reasons of clarity. Applying the chain rule on equation 2.3 allows to seperate B Eplq into two compoBw ij

nents:

pl 1q

B E  B E Bnj B wijplq B npj l 1q B wijplq

(2.4)

pl 1q is rewritten using relation 2.1. Equation 2.4 becomes

where n j

°

pl q pl q

B E  B E B i wij oi B wijplq B npj l 1q B wijplq

.

(2.5)

Actually applying the derivative to the rightmost term of equation 2.5 yields

B E  B E o pl q . B wijplq B npj l 1q i pl q

(2.6)

The term oi is known for all i and l since the last forward propagation pattern is recorded. The derivative of the network error with respect to the neuron inputs B E{Bnpj l 1q needs to be determined. A common alternate expression of B E{Bnpj lq is δpj lq For

26

2. Neural Networks

2.4. Learning

plq can only be determined recursively. Recursion starts at the last layer L. Applying the chain rule allows to disintegrate B Eplq into two terms for Bnj the entire network, δj the last layer:

p Lq δ  j

p Lq B E  B E Boj B npj Lq B opj Lq B npj Lq

(2.7)

where B E{B opj lq is the derivative of the network error with respect to the network output

pLq  y ). Differentiating equation 2.2 accordingly yields: j

(o j

B E    T  o p Lq   j j B opj Lq



Tj  yj .

(2.8)

p Lq In equation 2.7, B o j {B npj Lq is the derivative of the transfer function with respect to the input since: p Lq p Lq o j  f pn j q (2.9) therefore,

B opj Lq 1 pLq  f pn j q. B npj Lq

(2.10)

The derivative of the transfer function is common knowledge for both, the linear as well as the hyperbolic tangent function [12]. linear: hyperbolic tangent:

f p xq  x

ÝÑ f pxq  tanhpxq ÝÑ

f 1 p xq  1

f 1 p xq 

(2.11) 1

cosh2 pxq

.

(2.12)

For δ in the last layer it has been shown:

p Lq δ  j

p Lq B E B o j   T  opLq f 1pLqpnpLqq. j j j j B opj Lq B npj Lq

(2.13)

For the remaining layers, δj can be retrieved using a less obvious method:

pl q 

δj

BE  B npj lq



p l 1q pl q p l 1q p l q B E n1 po j q . . . nql 1 po j q B opj lq . B o pl q B n pl q j

27

j

(2.14)

2. Neural Networks

2.4. Learning

The index qpl 1q indicates the number of neurons in layer pl 1q. Equation 2.14 pl q suggests that E is indirectly dependent on o j . However, E is directly dependent on °

p l 1q

pl q

ni which in turn is dependent on o j . Applying this knowledge and building the derivative B E{B opj lq gives: i





° pl 1q pl q B E  B Ep i ni po j qq  ¸  B E B npi l 1q  . B opj lq B opj lq B npi l 1q B opj lq i

(2.15)

Substituting 2.15 into 2.14 gives

pl q 

δj

BE  B npj lq

¸ i

 p l 1q B o p l q B ni j .  BE p l 1q p lq p lq Bn Bo Bn 

i

j

(2.16)

j

The following identities have been used in the last equation:

B E  δ pl i p l 1q

1q

B ni B npi l 1q  wplq after 2.1 ij B opj lq B opj lq 1 plq  f pn j q after 2.10. B npj lq

(2.17)

(2.18)

(2.19)

Equation 2.16 can therefore be rewritten into ¸  p l 1q p l q  p lq p lq 1 δj  f pn j q δi wij .

(2.20)

i

p l 1q

Equation 2.20 requires knowledge of δi . Therefore, backpropagation can only be accomplished recursively - layer by layer. With equation 2.20 and 2.6 the backpropagation result B E{B wijplq may be retrieved for the whole network.

2.4.3. Jacobi Matrix The jacobi matrix describes the result of an altered backpropagation algorithm. The backpropagation’s result is the derivative B E{B wijplq . It describes how the squared network error changes with changing network weights. However, since E represents the squared error, the sign-information is lost in the process. This trait remains ir-

28

2. Neural Networks

2.4. Learning

relevant as long as optimizing network weights is the goal. Nevertheless, certain applications, such as inverse propagation (as in section 2.4.4), requires a more pristine signal. Minor changes of the conventional backpropagation algorithm address this demand. Instead of B E{B wijplq , the algorithm produces B y{B wijplq . This term describes how the network output changes with differing weights. The jacobi matrix shall be derived in analogy to the conventional backpropagation algorithm. Hence, equation 2.4 may be rewritten pl 1q By  By Bnj . (2.21) B wijplq B npj l 1q B wijplq

plq  B E{Bnplq , the notation

Similarly to the expression δj

j

pl q δˆj

 B yplq Bnj

(2.22)

is introduced. Index j identifies the neuron in layer pl q. With this identity, equation 2.21 may be simplyfied to B y  δˆpl 1q oplq. (2.23) B wijplq j i As is the case for conventional backpropagation, the jacobi matrix may only be derived iteratively . The process begins at the output layer and progresses backwards through the network, layer by layer. For the last layer, the following holds true: δˆp Lq

p Lq  B nBpyLq  BBno pLq  f 1pLq.

(2.24)

In order to retrieve δˆj for the hidden layers, equation 2.20 is adapted. One may write:

pl q δˆj

 f 1pnpj lqq

¸  p l 1q ˆ

δi

pl q  .

wij

(2.25)

i

Applying equations 2.25 and 2.23 recursively builds the jacobi matrix  

B y   B w¯ plq  

By

B wpl q

...

.. .

...

Bwpqlpqlq 1

...

11

By

29

By



By



Bw1qplqpl 1q   .. . .  Bwpqlpqlq qpl 1q

(2.26)

2. Neural Networks

2.4. Learning

This concludes the introduction to jacobi matrix. Section 2.3.2 in reference [8] provides a thorough derivation of the jacobi matix.

2.4.4. Inverse Propagation Inverse propagation estimates the network’s input signal based on the network’s output and current pattern. Since the network approximates a certain function, inverse propagation is equivalent to the inverse of that function. This feature is helpful in certain control tasks, as discussed in section 4.1.1. Backpropagation-based Inverse Propagation Section 2.4.3 introduces the jacobi matrix. Building the matrix necessitates introduction of equation 2.22 By pl q δˆj  pl q . Bnj

pl q

The section describes how δˆj may be obtained for the whole network. Equation 2.22 may be used to acquire the networks input based on its output. For this purpose, the estimated network input x˜i is introduced. The index i indicates the input neuron. p1q The network input xi  ni is different from the estimated input x˜i because inverse propagation is non-injective. Therefore, multiple inputs may have the same output. A corresponding input may only be estimated. xi

ÝInversePropagation ÝÝÝÝÝÝÝÝÝÝÝÑ

x˜i

.

(2.27)

Equation 2.22 may therefore be rewritten to

p1q δ˜i

 BBxy˜ .

(2.28)

i

Applying a first order Taylor series allows to solve equation 2.28 for the change in network output B y ∆x˜  δ˜p1q ∆x˜ . ∆y  (2.29) B x˜i i i i Isolating ∆ x˜i in equation 2.29 yields ∆ x˜i



∆y p1q δ˜

 δ˜ip1q

1

∆y.

(2.30)

i

Utilization of the Taylor series allows to solve for ∆ x˜i but it comes at the cost of linearisation [12] which implies validity of results only for small changes in ∆y.

30

2. Neural Networks

2.5. Gradient Descent Method

2.5. Gradient Descent Method The gradient descent method is an optimization algorithm. It is designed to adjust the synaptic weights based on the backpropagation result B E{B wijplq . This term describes the sensitivity of the network error E with respect to the synaptic weight pl q wij . The gradient descent algorithm is given by:

pl q  µ B E . B w pl q

∆wij

(2.31)

ij

E

BE

B w pl q

∆wij

wij

Figure 2.6.: The plot shows a hypothetical curve of network error E as a function of weights wij . A tangent on Epwq illustrates the slope B E{B wplq . The derivative’s magnitude determines the weight adjustment. Figure 2.6 provides a graphic interpretation of this method. The derivative B E{B wijplq contributes direction as well as appropriate magnitude for change of weights. Hence, E shifts towards the closest minimum. Learning rate µ commands the algorithm’s general behaviour. Small µ result in slow but stable convergence of E on its minimum. Big µ promise faster convergence speed but bear the risk of unstable learning. The size of µ needs to be determined empirically for each application. Obviously, equation 2.31 requires knowledge of B E{B wijplq . Consequently, applying the backpropagation algorithm beforehand is eminent to utilize gradient descent. The gradient descent method adjusts all weights in the network. Two issues associated with gradient descent shall be mentioned at this point:

pl q

pl q

1. On flat slopes of E over wij , the commanded change in weights ∆wij will

31

2. Neural Networks

2.5. Gradient Descent Method

be small. The algorithm may adjust weights inefficiently in such a situation. Convergence speed will be slow.

pl q

2. The change of network weights ∆wij will be large whenever steep sloped minima are encountered. The magnitude of B E{B wijplq is big in such a situation. Con-

pl q

pl q

sequently, ∆wij is big, causing the values of wij to oscillate between the slopes rather than descending to the bottom. The solution to these issues is addition of a so called momentum term to equation 2.31. The momentum term implements the last weight adaptation multiplied by a constant factor. The new weight adaptation algorithm is given by

pl q

∆wij,k

 µ B Epklq B wij,k

pl q

κ  ∆wij,k1

(2.32)

where the index appendix k or k  1 provides the corresponding time step. On flat plateaus, the momentum increases weight adjustment magnitude with every time step, since weight adaptation is oriented in equal direction. The algorithm leaves the plateau faster. Contrary, when descending into a valley, the algorithm may oscillate between side slopes, thus the weight adaptation direction changes. The momentum term ensures, that the weight adaptation magnitude decreases with every step in this case. The algorithm descends quicker.

32

2. Neural Networks

2.6. Sliding Mode Control

2.6. Sliding Mode Control Sliding Mode Control describes a control strategy. It treats the linear network error e as its state variable. The error e is defined as follows: ej

 Tj  yj @

j P t1...number of output neuronsu.

(2.33)

The control algorithm minimizes e j by means of adjusting the synaptic weights wijl . Sliding mode control (SMC) incorporates the gradient descent method. However, as opposed to conventional gradient descent, SMC adjusts the learning rate µ dynamically. This section focuses on the derivation of the corresponding algorithm. The control law adjusting the synaptic weights is given by: ∆w¯ 

B y¯pw,¯ x¯q  µ¯  signpS¯ q  |e¯| B w¯ ptq

(2.34)

pl q

where w¯ is a vector containing all values of the synaptic weights wij . The expression By¯{Bw¯ is the Jacobi Matrix as mentioned in 2.4.2. Similarly, x, ¯ y¯ and e¯ contain all network-inputs, -outputs and -errors, respectively. The learn rate vector µ¯ is required to optimize every neuron output. The system operates on a sliding mode surface S¯ . Stable behaviour is required for S¯  0. In this work, S is given by

Sj

 ej 9

λe j

@

j P t1...number of ouput neuronsu

(2.35)

where λ is a free parameter to be set initially. The SMC algorithm minimizes e j and ultimately drives S j to zero. An interesting feature of equation 2.35 is that S j can be zero even though e j  0. If e j changes rapidly towards zero, e9 j will cancel out the term λe j thus leaving S j  0. This characteristic prevents "over-adjustment" of weights and oscillations in e j . The term signpS j q represents this measure in equation 2.34. If the weights are such that the network error decreases fast, signpS j q will reverse the direction of weight adjustment. This causes the network to gently converge on the desired behaviour rather than "overshoot" it.

2.6.1. Derivation of SMC algorithm In equation 2.34, µ¯ is the learning rate. As mentioned before, the SMC-algorithm adjusts µ¯ dynamically. This work gives an introduction as of how the corresponding algorithm is derived. Reference [8] gives a comprehensive description fitted for applications found in this work. However, reference [13] provides a more fundamental documentation.

33

2. Neural Networks

2.6. Sliding Mode Control

The existence of a vector µ¯ implies an optimal learn rate for each network output y j . Consequently, equation 2.34 commands multiple weight corrections for every single synaptic weight, depending on the number of output neurons. Since almost all weights affect each network output, this situation bears conflict. Each weight should only be adjusted once in every learn phase. As a solution, an interval of permissible µ is derived for every network output. Comparing the µ-intervals derived for all network outputs, chances are to find an overlapping range in all µ-intervals. From the common range, a single learn rate µ applicable for the whole network is extracted. In many applications, another solution might be applicable to this problem; For each network output, an individual network can be initiated. In this case only one µ is found for each network. This section elucidates on this procedure. As a result, mathematical relations, presented here, are meant for one discrete network output. To this end, indices describing a specific network output are omitted. For example, a simple S actually stands for S j of network output j. It has to be shown for S deduced

 0 that e converges to zero. From equation 2.35, it can be e  e0  e  λ p t  t 0 q .

(2.36)

Two plots of equation 2.36 are shown in figure 2.7. If λ is positive, the differential equation 2.35 shows stable behaviour. In contrast, a negative λ results in unstable behaviour. Hence, for λ ¡ 0, e is asymptotically stable and S  0 represents a suitable sliding mode surface. Furthermore, the algorithm needs to ensure that S converges on zero. As shown before, S  0 guarantees stable learning of the network. Appropriate adjustment of the network weights induces constant decrease of S and therefore of the network error. The learn rate µ significantly influences the adjustment process. Hence, approach 2.37 is chosen to identify suitable µ,

| S k 1 |   | S k |.

(2.37)

Since the simulation operates in discrete time steps, k denotes the current time step. Equation 2.37 asks for a constant decrease of S in every time step k. The term e9 may be found when exerting e9 

ek  ek1 Ts

34

(2.38)

2. Neural Networks

2.6. Sliding Mode Control

7 # = 0.3 # = − 0.3

6 5

!(t)

4 3

" stable

2

" unstable

1 0 0

2

4

6

8

10

t

12

14

16

18

20

Figure 2.7.: Plot of e-function in equation 2.36. It illustrates the influence of λ’s sign on stability where Ts is the length of one discrete time step. Using 2.35 and 2.38, Sk and Sk can be rewritten:

Sk

1

Sk

 ek

 ek

1

9

9

λek λek

1

pλ pλ

1 1 q e k  e k 1 Ts Ts 1 1 q ek 1  ek . Ts Ts

1

(2.39) (2.40)

All values for the current time step k are known from the latest pattern. However, some assumptions have to be made to estimate ek 1 in equation 2.39. For ek 1 the following relation may be written:

 ek ∆ek where e describes the network error pT  yq. Therefore, ek 1  pTk 1  yk 1 q and ek  pTk  yk q. ek

1

(2.41)

(2.42)

Rearranging equation 2.41 and substituting equation 2.42 into 2.41 gives ∆ek

 ek 1  ek  pTk 1  yk 1q  pTk  yk q. 35

(2.43)

2. Neural Networks

2.6. Sliding Mode Control

Equation 2.43 can further be rewritten as

 pTk 1  Tk q  pyk 1  yk q  ∆Tk  ∆yk .

∆ek

(2.44)

¯ ∆y may be written as Assuming, y is a function only of w¯ and N,

 B yk pBww¯¯k , x¯k q ∆w¯ k B yk pBw¯x¯k , x¯k q ∆x¯k k k where B y{B w¯ may be found in a similar fashion as B E{B w¯ in section 2.4.2. ∆yk

(2.45)

Exerting the backpropagation algorithm, these elements may be determined. The last backpropagation gives B y{B x¯. The change in network input ∆ x¯ is difficult to anticipate. ¯ If impossible, Depending on the system at hand, it might be possible to deduce ∆ x. the literature regularly omits the second term of 2.45. In equation 2.45, ∆w¯ remains ¯ At this to be handled. Equation 2.34 provides the appropriate expression for ∆w. point, ∆yk of equation 2.45 is completely determined. The complete description of ∆e in 2.44 requires knowledge of ∆Tk . A performance index or reference model, as used in this work, may serve this purpose. The change in network error ∆ek may be rewritten as By By ∆ek  ∆Tk  k ∆w¯ k  k ∆ x¯ k . (2.46) B w¯ k B x¯k All tools are available to proceed with approach 2.37 to calculate µ. Substituting 2.41 into 2.39 yields 1 qpe Ts k

|Sk | ¡ |pλ

∆ek q 

1 e |  |pλ Ts k

1 q∆ek |. Ts

(2.47)

Further substitution of 2.46 into 2.47 gives

|Sk | ¡ |pλ

1 By By qp ∆Tk  k ∆w¯ k  k ∆ x¯ k q|. Ts B w¯ k B x¯k

(2.48)

Replacing ∆w¯ k in equation 2.48 with 2.34 gives 1 By By By qp ∆Tk  k p qT  µ  signpS q  |e|  k ∆ x¯ k q|. Ts B w¯ k B w¯ B x¯k

|Sk | ¡ |pλ

(2.49)

Isolating µ in 2.49 results in laws determining the optimal µ. Case distinctions are lengthy but necessary intermediate steps to acquire the final expression #

 Sb

k

Sk b

  µ   Sb S a b  µ  b a b

k

k

a b a b

for for

pSk ¡ 0 ^ b ¡ 0q _ pSk   0 ^ b   0q pSk ¡ 0 ^ b   0q _ pSk   0 ^ b ¡ 0q.

36

(2.50)

2. Neural Networks

2.6. Sliding Mode Control

In relation 2.50, a and b are introduced for reasons of clarity. They are given by a pλ b pλ

By 1 qp ∆Tk  p k ∆ x¯ k qq and Ts B x¯k 1 B yk B y T qp p q  signpS q  |e|q. Ts B w¯ k B w¯

(2.51) (2.52)

Reference [8] provides a comprehensive step by step derivation including the case distinctions mentioned above. As recognizable in 2.50, the optimal learn rate is given as an interval. In the simulation environment this work is based on, the actual learn rate is set as the mean value of the interval rµmin , µmax s µ

µmin

µmax 2

37

.

(2.53)

3. Mathematical Attitude Representation In Earth’s orbit, two control tasks need to be accomplished by a satellite control system. They are attitude control and orbit control. A satellite flies around Earth due to simple laws of physics; Gravitational pull counteracts the centrifugal acceleration forcing the satellite onto a circular path. Perturbations are continuous but weak. Therefore, manoeuvres to correct orbit errors are necessary but rare. Attitude, on the other hand, needs to be controlled constantly. Attitude Control is essential to achieve the satellite’s mission. Mission tasks might include pointing antennae at ground stations for telemetry, reorienting telescopes to picture their targets, stabilizing a satellite’s attitude against natural disturbances, and more. A structured approach is necessary to achieve these tasks. The concept of frames makes the problem of attitude control mathematically conceivable. Three frames (or coordinate systems) are usually considered for objects in Earth’s orbit: Spacecraft-Fixed system , or "body frame", is fixed to the spacecraft structure. Consequently, the frame rotates with the satellite. Usually, the origin is at the satellite’s center of mass. This frame is suitable when describing components of the satellite or its orientation. Earth Centered Inertial or "ECI" originates from the center of Earth but does not rotate with it. Instead, the one-axis points at vernal equinox1 . In addition, the one- and two- axis lie in the plane of Earth’s equator. Therefore, the 3axis remains parallel to Earth’s spin axis. ECI is well suited to describe orbit mechanics. Earth Centered Earth Fixed or "ECEF" has similar attributes as ECI. Its one- and two-axis lay in the plane of Earth’s equator and the three-axis is parallel to Earth’s spin axis. The origin is the same as well. However, ECEF is fixed within Earth’s body. It therefore rotates with Earth. Consequently, it is the appropriate choice when considering Earth fixed structures. 1 Earth’s

equator-plane and Earth’s orbit plane intersect creating a line which used (Earth is tumbling. Consequently, the line changes direction.) to point at the first point of Aries in constellation "The Ram". This direction is called vernal equinox.

38

3. Mathematical Attitude Representation

3.1. Motion of a Satellite

3.1. Motion of a Satellite The motion of a satellite may be separated into translational and rotational motion. "Orbit Mechanics" and "Attitude Dynamics" cover these subjects, respectively. In this work, only attitude dynamics are required. However, to provide a complete picture, orbit mechanics shall be mentioned briefly.

3.1.1. Orbit Mechanics Newton’s laws of motion describe the principles governing all motion. His second law is of particular use for orbit mechanics. It sates that the change in motion of a mass is proportional to the force acting on the mass and that the directions of the change and acting force are the same. F¯

d m  v¯  dt

(3.1)

where v¯ is the mass’ velocity vector, m is mass and F¯ the acting force vector. Often, time invariant systems are investigated. Therefore, mass is constant and equation 3.1 may be rewritten to F¯  m  v¯9  m  a¯ (3.2) where a¯ is the acceleration vector. Among others, gravity is the dominating force which determines the satellite’s lateral motion Gravitational Force Earth’s gravitation pulls the satellite towards itself. Newton’s inverse-square law of universal gravitation describes the force exerted by gravity. F¯G

r¯    ΓMm 2 |r¯| r

(3.3)

where M and m are the attracting masses, r¯ is the connecting position vector and Γ is the universal gravitational constant Γ  6.674  1011

N  m2 . kg2

(3.4)

In Earth’s orbit, Γ and Earth’s mass M are combined to µC

3

 ΓM  3.987  1014 ms2

39

(3.5)

3. Mathematical Attitude Representation

3.1. Motion of a Satellite

where µC is Earth’s gravitational constant. Equation 3.3 may therefore be rewritten to µC m r¯ F¯G   2  . (3.6) |r¯| r Substitution of equation 3.6 into 3.2 yields the fundamental equation determining motion of satellites around Earth a¯  

µC r2

 |rr¯¯| .

(3.7)

Disturbance or propulsion forces may be added by superposition. However, a dedicated essay on this subject is beyond the scope of this work.

3.1.2. Attitude Dynamics Attitude dynamics are concerned with rotational, rather than translational, motion of bodies. References [14, 15] are good sources on this subject. Rotational dynamics may be described by moments acting on a body. According to Newton, a moment is force times lever arm. ¯  l¯  F¯ N (3.8) ¯ is a moment and l¯ the corresponding lever arm. The assumption is being where N made that a satellite is a rigid body which is made up of infinitesimal small masses. With this assumption and in accordance with equation 3.2, equation 3.8 may be rewritten into

»  dv¯ ¯ ¯ dm (3.9) l N dt where l¯ is the lever arm between mass-particle dm and the origin. In attitude dynamics, the center of mass is usually assumed to be the origin as a rigid body rotates about its center of mass. Angular momentum of a point mass about some origin is defined as ¯ L¯  l¯  vm. (3.10) where L¯ denotes angular momentum. For a rigid body, angular momentum may be found with » ¯L  l¯  vdm. ¯ (3.11) Assuming constant mass for the rigid body, equation 3.11 may be differentiated using the product rule d L¯ dt



»

dl¯ ¯  vdm dt

»

dv¯ l¯  dm  dt

40

»

¯ v¯  vdm

»

dv¯ l¯  dm. dt

(3.12)

3. Mathematical Attitude Representation

3.1. Motion of a Satellite

The first term on the right hand side of equation 3.12 is zero. The second term is equal to equation 3.9. Therefore the moment acting on a rigid body is equal to the change in angular momentum of said rigid body d L¯ dt

¯  N.

(3.13)

The lateral velocity of a fragment of the rigid body may be disintegrated into two terms v¯  v¯0 ω¯  l¯ (3.14) where ω¯ is angular velocity of the body frame with respect to an inertial frame and v¯0 lateral velocity of the body frame’s origin in inertial coordinates. Reinvestigation of angular momentum with equation 3.14 yields L¯ 

»

»

l¯  v¯0 dm



l¯  ω¯  l¯ dm.

(3.15)

Because v¯0 is identical for all mass fragments, the first term of the right hand side is »

l¯  v¯0 dm 





¯ ldm

 v¯0  0.

(3.16)

This expression is equal to zero because the body frame’s origin is the center of mass. Thus, all lever arms times mass fragment add up to zero. Therefore, angular momentum may be expressed as L¯ 

»



l¯  ω¯  l¯ dm.

(3.17)

¯ N, ¯ l¯ and ω¯ have three coordinate components. For example, angular The vectors L, velocity is given by ω¯  ω1 ω2 ω3 . (3.18)

41

3. Mathematical Attitude Representation

3.2. Direction Cosine Matrices

Multiplying out equation 3.17 leads to L¯ 

 

»

»

»





ω2 l3  ω3 l2 l¯  ω3 l1  ω1 l3 dm ω1 l2  ω2 l1





l2 p ω1 l2  ω2 l1 q  l3 p ω3 l1  ω1 l3 q l3 pω2 l3  ω3 l2 q  l1 pω1 l2  ω2 l1 q dm l1 p ω3 l1  ω1 l3 q  l2 p ω2 l3  ω3 l2 q



ω1 l22  ω2 l1 l2  ω3 l1 l3  ω2 l 2  ω3 l2 l3  ω1 l2 l1 3 ω3 l12  ω1 l3 l1  ω2 l3 l2

(3.19)



ω1 l32 ω2 l12 dm. ω3 l22

Angular velocity is constant for every mass fragment which allows to draw angular velocity out of the integral and rewrite the equation in matrix notation ³



³

³





l22³ l32 dm ³  l1 l2 dm  ³ l1 l3dm ω1 

ω2 . L¯    ³ l2 l1 dm l12³ l32 dm ³  l2 l3 dm  ω3  l3 l1dm  l3 l2dm l12 l22 dm

(3.20)

The first matrix on the right hand side may be identified as the bodies inertia tensor I. Therefore angular momentum may be written as ¯ L¯  Iω.

(3.21)

Applying the rule of taking the time derivative of a vector on equation 3.21 gives ¯ N

¯

 ddtL  I ddtω¯

ω¯  pIω¯ q .

(3.22)

With equation 3.13, the connection to acting torque is apparent. Thus, the fundamental equation of motion for attitude dynamics may be derived. Iω¯9

 N¯  ω¯  Iω¯

(3.23)

¯ (due to disturbances or Change in angular velocity depends on applied torque N actuators), the bodies inertia I and angular velocity.

3.2. Direction Cosine Matrices Mathematically, it is easy to describe orientation of one frame with respect to another. One method is the use of direction cosine matrices (DCM). It resembles the cosines of the angles between all frame axis. Figure 3.1 gives an example.

42

3. Mathematical Attitude Representation

3.2. Direction Cosine Matrices

aˆ 1

cosα  bˆ 1

ˆb 1 α β cosβ  bˆ 1

aˆ 1

Figure 3.1.: Visualization of cosines between vectors. The red line represents the cosine of β, while the green line has the magnitude of cos α. All vectors are supposed to be unit-vectors. Therefore,

|greenline|

(3.24)

cos α  | greenline|

(3.25)

 cos βaˆ1

(3.26)

cos α 

|hypotenuse| .

Since the hypotenuse is |bˆ 1 |  1,

and

bˆ 1

cos α aˆ 2 .

In matrix notation, 3.26 becomes



bˆ 1



cos β 0 0 cos α

 

aˆ 1 . aˆ 2

(3.27)

The matrix containing the cosines in 3.27 is a simple DCM. In three-dimensional space, the DCM becomes a 3x3 matrix compiling all cosines between the frame axis  



 

bˆ 1 aˆ 1 c11 c12 c13 bˆ 2   c21 c22 c23   aˆ 2  . c31 c32 c33 aˆ 3 bˆ 3

(3.28)

As a result, all 3x3 DCMs are skew symmetric matrices. The geometric sum of each row or column is unity. An example is given here b

c211

c212

43

c213

 1.

(3.29)

3. Mathematical Attitude Representation

3.3. Quaternions

A DCM might also be called a rotation matrix. Rotation Matrices are convenient to work with when successive rotations need to be expressed. As shown in [16] DCMs relate rotations within a sequence of rotations by multiplication. The resulting Matrix is a DCM itself 1 1 2 2 C AB  C AA C A A C A B . (3.30) Start and target frame are denoted by A and B, respectively. All primed A’s show intermediate frames. The DCM which transforms coordinates of frame A into frame B is expressed by C AB . Euler proved that any rotation between two frames may be accomplished by a succession of 3 rotations about their principal axis. Applying this theorem to 3.30 shows that any relative orientation between two frames might be represented as a single rotation. Whenever something is rotated, a corresponding axis of rotation and angle of rotation exists. Rotation axis and angle are another way to represent a rotation. In fact, these information can be deduced from the corresponding DCM directly. If a vector η¯ exists which satisfies Dη¯  η¯ (3.31) then η¯ is called eigenvector of matrix D. It turns out that η¯ is also the axis of rotation for the change in orientation associated with a DCM D. In [16], the rotation angle ξ is given by TrpCq  1 (3.32) ξ  arccos 2 where C denotes the corresponding DCM and TrpCq denotes the trace of C. The rotation axis’ coordinate numbers are the same both in the frame the rotation starts from and the frame the rotation leads to. At this point it is only a matter of scaling to understand what quaternions represent.

3.3. Quaternions A Quaternion may represent a rotation and is a four-tuple. It has been shown that a rotation may be expressed in terms of an axis of rotation and a corresponding angle of rotation. Three elements of a quaternion are called imaginary because they are usually associated with the imaginary basis ti,ˆ j,ˆ kˆ u and can be understood as ¯ The fourth element q0 is a scalar and therefore called the real part. A a vector q. quaternion may be written as ¯ q  q0 q. (3.33) Sometimes, q0 is the last element in a quaternion. A convention norming this issue remains to be settled on. In this work, quaternions are expressed as in 3.33 with the real element first.

44

3. Mathematical Attitude Representation

3.3. Quaternions

Suppose the angle of rotation is denoted by ξ. The axis of rotation is given by eˆ  te1 , e2 , e3 u. Without proof, the quaternion is

 cos 2ξ

q0

(3.34)

ξ q¯  eˆ sin . 2

(3.35)

Therefore, vector q¯ represents the axis of rotation. The scalar q0 can be understood as angle of rotation. DCMs and quaternions are strongly related. Expressing successive rotations in terms of quaternions is possible as well. Successive rotations may be expressed as a quaternion multiplication. Quaternion multiplication is more complex than a simple vector multiplication due to special characteristics of quaternions. Hamilton found the following rule in 1843 and thereby invented quaternions iˆ2

 jˆ2  kˆ 2  iˆjˆkˆ  1.

(3.36)

This rule implies iˆjˆ  kˆ   jˆiˆ jˆkˆ  iˆ  kˆ jˆ

(3.37)

ˆ kˆ iˆ  jˆ  iˆk.

3.3.1. Quaternion Multiplication Relations 3.37 are required to accomplish a proper quaternion multiplication as shown in [16]. Quaternions p and q shall be multiplied where p  p0 q  q0

p1 iˆ p2 jˆ p3 kˆ ˆ q1 iˆ q2 jˆ q3 k.

(3.38) (3.39)

Writing the multiplication out yields pq  p p0

 p0 q0

p2 jˆ p3 kˆ qpq0 q1 iˆ p1 q0 iˆ p2 q0 jˆ p3 q0 kˆ

p1 iˆ

p0 q1 iˆ p0 q2 jˆ

p1 q1 iˆ2 p1 q2 jˆiˆ

p2 q1 iˆjˆ p2 q2 jˆ2

p0 q3 kˆ

p1 q3 kˆ iˆ

p2 q3 kˆ jˆ

45

q2 jˆ

p3 q1 iˆkˆ p3 q2 jˆkˆ p3 q3 kˆ 2 .

q3 kˆ q (3.40)

3. Mathematical Attitude Representation

3.3. Quaternions

Applying 3.37 on 3.40 gives pq  p0 q0

p1 q0 iˆ p2 q0 jˆ p3 q0 kˆ p0 q1 iˆ  p1 q1 p2 q1 kˆ  p3 q1 jˆ p0 q2 jˆ  p1 q2 kˆ  p2 q2 p3 q2 iˆ

(3.41)

p1 q3 jˆ  p2 q3 iˆ  p3 q3

p0 q3 kˆ which further may be regrouped to

pq  p0 q0  p p1 q1 p2 q2 p3 q3 q p0 pq1 iˆ q2 jˆ q3 kˆ q q0 p p1 iˆ p2 jˆ p3 kˆ q p p2 q3  p3 q2qiˆ p p3 q1  p1 q3q jˆ p p1 q2  p2 q1qk.ˆ

(3.42)

Equation 3.42 may be regrouped in the more compact term pq  p0 q0  p¯  q¯

p0 q¯

q0 p¯

¯ p¯  q.

(3.43)

Usually, equation 3.43 is referred to when the quaternion multiplication operator is cited. This section’s purpose is to further the understanding of its composition. In this work quaternion multiplication is needed to generate an "error quaternion" necessary to acquire a proper control signal.

3.3.2. Quaternion Inverse The inverse of a quaternion is given by [16] q 1

 |q* q|2

(3.44)

where q* is the conjugate complex of q with ¯ q*  q0  q.

(3.45)

The norm of quaternion |q| is

|q| 

b

q20

q21

q22

q23 .

(3.46)

The special case of the norm of q being unity is noteworthy. Here, the conjugate complex of q is equal to its inverse q 1

 q*.

46

(3.47)

3. Mathematical Attitude Representation

3.3. Quaternions

Since attitude quaternions are normalized, the inverse of q is simply given by 3.45.

3.3.3. Error Quaternion Earlier, it has been shown that the multiplication of two quaternions results in a single equivalent rotation. This fact helps generating the error quaternion ∆q. Quaternion qa denotes the actual attitude while qd is synonymous for desired attitude. Further it can be assumed that qd  qa ∆q (3.48) where ∆q shall be given by

1 ∆q  q a qd .

(3.49)

Equation 3.49 is essential for the attitude control problem. It permits calculation of the error signal since qa is known due to measurements and qd is given by the mission. This section proves the validity of 3.49. The proof’s strategy is to calculate ∆q using 3.49 and substituting it into 3.48. The result should be qd . Applying 3.43 on 3.49 gives 1 ∆q  q a qd

 qao qdo  qaoqdo

q¯a  q¯d q a1 qd1 

rqao q¯d s  rqdo q¯a s rq¯a  q¯d s



q a2 qd2



qd1 q ao q a1 qdo qd2 q ao   q a2 qdo  qd3 q ao q a3 qdo

q a3 qd3 



qa2 qd3 qa3 qd2  q a1 qd3  q a3 qd1  . qa1 qd2 qa2 qd1

(3.50)

The result of 3.50 are used to calculate 3.48. First, the real part qdo of qd is calculated applying the relevant terms of 3.43. qdo

 qao ∆qo  q¯a  ∆q¯  qao pqao qdo qa1 qd1 qa2 qd2 qa3 qd3q  qa1pqd1 qao  qa1 qdo  qa2 qd3 qa3 qd2q  qa2pqd2 qao  qa2 qdo qa1 qd3  qa3 qd1q  qa3pqd3 qao  qa3 qdo  qa1 qd2 qa2 qd1q.

(3.51)

Further expanding 3.51 yields qdo

 q2ao qdo qao qa1 qd1 qa2 qd2 qao qa3 qd3 qao  qa1 qd1 qao q2a1 qdo qa1 qa2 qd3  qa1 qa3 qd2  qa2 qd2 qao q2a2 qdo  qa2 qa1 qd3  qa2 qa3 qd1  qa3 qd3 qao q2a3 qdo qa3 qa1 qd2  qa3 qa2 qd1. 47

(3.52)

3. Mathematical Attitude Representation

3.3. Quaternions

In equation 3.52, it is a simple matter of crossing out terms negating each other to recover (3.53) qdo  qdo pq2ao q2a1 q2a2 q2a3 q.

The term pq2ao q2a1 q2a2 q2a3 q is equal to the squared norm of qa . Since the norm of a quaternion is one, 3.53 yields qdo

 qdo .

(3.54)

With this result, the first half of the proof is complete. The validity for the imaginary part remains to be shown. The imaginary term of 3.48 is q¯d

 pao ∆q¯

rq¯a  ∆q¯s  qd1 q2ao  q ao q a1 qdo  q a2 qd3 q ao q a3 qd2 q ao  qd2 q2ao  qao qa2 qdo qa1 qd3 qao  qa3 qd1 qao  qd3 q2ao  q ao q a3 qdo  q ao qd2 q a1 q a2 qd2 q ao 



∆qo q¯a

q a1 qd0 q ao q a2 qd0 q ao q a3 qd0 q ao 



q2a1 qd1 q a1 q a2 qd2 q a1 q a3 qd3 q a1 q a2 qd1 q2a2 qd2 q a2 q a3 qd3  q a1 q a3 qd1 q a2 q a3 qd2 q2a3 qd3

(3.55)



q a2 ∆q3  q a3 ∆q2 q a1 ∆q3 q a3 ∆q1  . q a1 ∆q2  q a2 ∆q1

Further substitution leads to 



qd1 q2ao  q ao q a1 qdo  q a2 qd3 q ao q a3 qd2 q ao q¯d  qd2 q2ao  q ao q a2 qdo q a1 qd3 q ao  q a3 qd1 q ao  qd3 q2ao  q ao q a3 qdo  q ao qd2 q a1 q a2 qd2 q ao 

q a1 qd0 q ao q a2 qd0 q ao q a3 qd0 q ao 



q2a1 qd1 q a1 q a2 qd2 q a1 q a3 qd3 q a1 q a2 qd1 q2a2 qd2 q a2 q a3 qd3  q a1 q a3 qd1 q a2 q a3 qd2 q2a3 qd3

q a2 pq ao qd3 q a3 qdo q a1 qd2 q a2 qd1 qq a3 pq ao qd2 q a2 qdo q a1 qd3 q a3 qd1 q

(3.56) 

qa1 pqao qd3 qa3 qdo qa1 qd2 qa2 qd1 q qa3 pqao qd1 qa1 qdo qa2 qd3 qa3 qd2 q . q a1 pq ao qd2 q a2 qdo q a1 qd3 q a3 qd1 qq a2 pq ao qd1 q a1 qdo q a2 qd3 q a3 qd2 q

48

3. Mathematical Attitude Representation

3.3. Quaternions

Expanding the last term gives 



qd1 q2ao  q ao q a1 qdo  q a2 qd3 q ao q a3 qd2 q ao q¯d  qd2 q2ao  q ao q a2 qdo q a1 qd3 q ao  q a3 qd1 q ao  qd3 q2ao  q ao q a3 qdo  q ao qd2 q a1 q a2 qd2 q ao 

q a1 qd0 q ao q a2 qd0 q ao q a3 qd0 q ao 



q2a1 qd1 q a1 q a2 qd2 q a1 q a3 qd3 q a1 q a2 qd1 q2a2 qd2 q a2 q a3 qd3  q a1 q a3 qd1 q a2 q a3 qd2 q2a3 qd3

(3.57)

q a2 qd3 q ao q a2 q a3 qdo q a2 q a1 qd2 q2a2 qd1 q a3 qd2 q ao q a3 q a2 qdo q a1 q a3 qd3 q2a3 qd1



qao qa1 qd3 qa1 qa3 qdo q2 qd2 qa1 qa2 qd1 qa3 qao qd1 qa1 qa3 qdo qa2 qa3 qd3 q2 qd2  . a3 a1 q ao q a1 qd2 q a1 q a2 qdo q2a1 qd3 q a1 q a3 qd1 q a2 qd1 q ao q a2 q a1 qdo q2a2 qd3 q a2 q a3 qd2

Crossing out negating terms yields 

qd1 pq2ao q¯d  qd2 pq2ao qd3 pq2ao As seen earlier, q2ao

q2a1

q2a2

q2a3

q2a1 q2a1 q2a1

q2a2 q2a2 q2a2



q2a3 q q2a3 q . q2a3 q

(3.58)

 1. Therefore, 



qd1  q¯d  qd2  . qd3

(3.59)

This fulfils the imaginary part of the proof. Combining real and imaginary part given by 3.54 and 3.59, gives in fact qd

 qdo

q¯d .

(3.60)



3.3.4. Why to use Quaternions Quaternions offer a compact form to represent orientation. Unfortunately, quaternions take some practice to readily visualize and are anti-intuitive. However, they posses advantages over the classical direction cosine matrix representation. 1. In numeric applications it is necessary to calculate back and forth between Euler angles, which describe orientation in terms of three angles, and the matrix format of DCMs. In order to convert Euler angles back into a DCM, fractions need to be solved which may have a zero in the denominator. Therefore, such calculations might abort due to singularity issues. This is the main reason why

49

3. Mathematical Attitude Representation

3.3. Quaternions

quaternions are favoured over DCM-representation for numerical attitude calculations. Quaternions are free of singularities. 2. Whenever a DCM needs to be transformed into Euler angles, several backup calculations need to be performed. This is due to the fact that cosine and sin functions are non-specific. For example: 0.5 is equal to cos p60 q as well as cos p300 q. Therefore, redundant calculations need to be performed to identify the proper quadrant. Quadrant checks are not required in quaternion-math. Quaternions are specific.

50

4. Neural Control The work at hand introduces new neural control techniques. They build on an existing method, the observer-controller architecture. The following sections recapitulate existing techniques and introduce the new approaches.

4.1. Satellite Control Architectures The satellite’s control system architecture impacts the control outcome. In figure 4.1, a simple feedback architecture is given as may be found in [3]. In this work, the blocks enclosed by the dashed box are collectively called "Plant". The feedback loop is omitted for reasons of clarity as well. On the other hand, other non-conventional blocks such as "Observer" are added. Neural control remains a sub-field of control theory and it is not granted that people are familiar with this area of research. This chapter presents common neuro control architectures.

Disturbance

+ Controller

Actuator

Plant

Sensor

-

Figure 4.1.: Simplified standard control theory block diagram [3]. All blocks inside the dashed frame are united in the "Plant"-block in this work.

4.1.1. Observer-Controller This section introduces a neural controller to the satellite’s control system architecture. The observer-controller architecture serves as fundamental neuro control

51

4. Neural Control

4.1. Satellite Control Architectures

approach. Various other control architectures are possible but remain undiscussed in this work. Generally, a neural network requires an appropriate error signal to tune itself to the desired behaviour. For the neural network controller (NNC), an optimal error signal is the difference between correct and commanded control signal. However, the correct control signal is generally unknown. The observer-controller architecture attempts to generate and make use of this signal. A so called observer or identifier is added to the architecture. Both, the controller as well as the observer, implement neural networks. The observer’s purpose is to copy the satellite’s behaviour. Section 2.4.4 introduces the concept of inverse propagation. This method allows to construct the network’s input based on its output. In the case of the observer, the satellite’s state deviation is inversely propagated through the network with the intent to obtain the corresponding control signal. It is assumed that the identified control signal is equal to the correct control signal if the observer accurately estimates the satellite’s behaviour. At this point, the actual and correct control signal may be compared. Thus, an adequate NNC-error signal may be deduced. Figure 4.2 illustrates the observer-controller architecture.

∆qd

τ¯c

NN Controller

Satellite

qa eobs

NN Observer ec

- τ¯obs

inverse propagation

qobs ∆qd

qd

Figure 4.2.: Block diagram of observer-controller architecture. Boxes framed in red signal incorporation of neural networks while a black frame symbolizes the hardware component (satellite). The NNC is fed with the deviation between desired and actual attitude ∆qd . The controller commands the control torque τ¯c which aim is to neutralize the deviation. The signal acts on the plant and results in the actual attitude qa . Simultaneously, the control signal is routed to the observer which generates the observed attitude qobs . The difference of actual and observed state produces the error (or training) signal eobs for the observer. Actual

52

4. Neural Control

4.1. Satellite Control Architectures

and desired attitude are compared and inversely propagated through the observer. The result is the observed, and thought of as correct, control signal τ¯obs . Subtraction of the commanded and observed control signal provides the controller’s training signal ec . The diagonal lines behind red boxes symbolize training of the neural networks. It should be noted that the observer also receives the actual satellite state. For reasons of illustrative clarity, the corresponding depiction is omitted. Within the simulation’s configuration files, one may specify the inverse output’s signal nature. Two options are available: 1. The signal may be an absolute value. In this case the signal has to be compared to the actual signal as illustrated in figure 4.2. 2. The inverse output signal may be a deviation signal. In this case the outputted signal is already equal to the controller’s training signal. A detailed description of configuration parameters may be found in [8].

4.1.2. Reference Model augmented Observer-Controller The reference model enhances the observer-controller architecture. The underlying theory of this architecture is an attempt to provide the NNC with a simplified behaviour to model. This approach promises faster convergence speed of the network. Consequently, increased control accuracy, especially for complex plants, may be expected. In the work at hand, a satellite’s operating point alternates between periods of fast dynamics and phases of smooth coasting. If convergence speed is slow, the operating point may shift before the network adequately models the satellite’s behaviour. Consequently, unsatisfactory control accuracy may be obtained. The reference model should offer an advantage in converging on these different behaviours. Figure 4.3 illustrates the incorporation of an reference model into an observer-controller architecture. A blue frame signals usage of a reference model. The reference model processes the desired attitude signal to generate a reference signal. Based on this reference signal, the controller is trained. Training of the NNI remains unaffected by the reference model. It only drives the NNC’s weight adjustment. The reference model’s output qre f is compared to the actual state qa . The reference attitude deviation ∆qre f is the result. Through inverse propagation by the observer, a controller applicable error signal ec is generated. Because ec is based on the reference model, the controller copies the reference model rather than the actual satellite. Mentioned advantages of a reference model come at a cost. Basically, the reference model filters the desired attitude in a, from the controller’s point of view, favourable

53

4. Neural Control

4.1. Satellite Control Architectures qre f

qd

Reference Model

qa

∆qd

τ¯c

NN Controller

Satellite

qa eobs

NN Observer ec

- τ¯obs

inverse propagation

qobs ∆qre f

Figure 4.3.: Block diagram of indirect neural reference controller which implements a reference model to the observer-controller architecture. way. However, filtering always means a loss of information. In this case, a slight delay is imprinted onto the signal owed to the dynamics of the reference model. Still, the potential advantages justify deployment.

54

4. Neural Control

4.1. Satellite Control Architectures

4.1.3. Augmented Observer-Controller after [1] In [1] a method is introduced which promises a stable error signal. Figure 4.4 illustrates its architecture. The "Augmented Output" block is the significant addition to a observer-controller scheme. This block is supposed to stabilize the error signal by considering the near error values history. In this configuration, the NNC is trained based on the augmented output generated by the corresponding block rather than the reference model. To analyse this architecture further, figure 4.4 is modified and investigated in section 4.3.

qre f

qd

Ref. Model

qa

∆qd

NNC

τ¯c

Augmented Output

- ∆qre f Satellite

qa

-

eobs ec NN Obs.

- τ¯obs

qaug

inv. prop.

qobs

∆qaug

Figure 4.4.: Block diagram of augmented observer-controller architecture after [1].

55

4. Neural Control

4.2. Reference Model

4.2. Reference Model Reference Models provide the control algorithm with a desirable dynamic to follow. For this purpose, they should be simple. Simplification requires knowledge of the system of concern. Therefore, the reference model designer needs to be familiar with the plant and environment it is operating in [17]. In many instances, a reference model can be modelled as an oscillator or a radically linearised version of the equations of motion of the plant [18, 19]. The concept of a reference model bears several advantages: • The stability of the reference dynamic may be guaranteed. A mathematical investigation can show that the eigenvalues of the characteristic equation are zero or smaller. Therefore, marginal or asymptotic stability is ensured. This is possible because the reference model should resemble simplified equations of motion which allow for investigation. • The reference model generates an error-value the neural controller is able to follow. It is possible that the mission asks for dynamics the plant is physically unfit to follow. In this case, the plant will trail the mission’s target attitude. Consequently, the neural controller would receive an error-signal suggesting poor performance. The result could be an adjustment of weights where no adjustment is necessary. This can lead to controller instability. To circumvent this problem, the reference model is tuned to output dynamics the real plant is able to follow. Generally, increasing the reference model’s inertia or limiting the actuators’ efficacy will have such an effect. A possibly decreased controlling accuracy is the downside of this method. • The neural controller is more likely to estimate the reference model’s dynamics. A reference model neglects disturbances and second order dynamics. Consequently, a simpler behaviour needs to be learned. The neural network is able to follow the simplified dynamics more readily and therefore stabilize itself. As mentioned before, neural networks are able to estimate any given function. To do so, they need to "learn" the characteristics of the function. In other words: they need to adjust their weight parameters. In order to learn, a neural network requires a performance index which indicates the network output’s quality. An appropriate algorithm adjusts the parameters whenever the index suggests poor performance. The model of an idealized plant can serve as such an index. By comparing the actual system state with the state dictated by the reference model, the neural controller is forced to approximate the reference model’s behaviour. Linearising the equations of motion yields an appropriate reference model.

56

4. Neural Control

4.2. Reference Model

4.2.1. Simplified Physical Model The concept of angular momentum is a common approach to describe dynamics of a rotating, rigid body such as satellites. The corresponding fundamental equation, as further described in section 3.1.2, is Iω¯9

 N¯  ω¯  Iω¯

(4.1)

¯ represents torques acting on where L¯ is the angular momentum vector. Vector N the body such as disturbances and actuators. Furthermore, ω¯ is the angular velocity of the body with respect to the reference frame. Writing equation 4.1 in component format and isolating ω¯9 gives

ω9 2

 NI 1 p I22 I I33 qω2 ω3 11 11 N2 I33  I11  p qω ω

ω9 3



ω9 1

I22 N3 I33

I22

3

1

(4.2)

p I11 I I22 qω1 ω2. 33

As mentioned earlier, the reference model should be as simple as possible. Therefore, ¯ only includes it neglects disturbances and second order dynamics. Consequently, N torques generated by the reaction control wheels. Angular velocity is assumed to be small for the model. Accordingly, products of angular velocity, which are second order dynamics, are negligible. The terms on the very right hand side of equation 4.2 are second order dynamics. Crossing them out yields: ω9 1

 NI 1

ω9 2



ω9 3



ω¯9

 NI .

or in vector format

11

N2 I22 N3 I33 ¯

(4.3)

(4.4)

Equation 4.4 is the simplified equation of motion used for the reference model in this work.

57

4. Neural Control

4.2. Reference Model

4.2.2. Control Law The control law governing the reference model is τ¯

¯ o  Kd ∆ω¯  K p ∆q∆q

(4.5)

where τ¯ is the commanded torque vector. The symbol ∆ denotes the difference between reference model state and mission / desired state. Consequently, ∆q¯ is the imaginary triple of the difference quaternion ∆q. Chapter 3 explains the method used to obtain ∆q. As mentioned in section 3.3, ∆q¯ represents the axis of rotation which, when rotated about, swings the misaligned satellite back into its desired state. Conveniently, the same axis describes a torque vector necessary to correct any misalignment. The multiplication of ∆q¯ with ∆qo scales the first term of the control law. The rotation angle is denoted by ∆qo . Therefore, ∆qo is big if the deviation of desired and reference attitude is big. In this case, the multiplication gives a big control torque in 4.5. Furthermore, the sign of ∆qo corrects the direction of the derived ¯ In retrospect, the first term of 4.5 represents a proportional control control torque τ. term. The term ∆ω is the difference angular velocity. Simple subtraction of the reference from the desired angular velocity yields the difference angular velocity ∆ω. Similarly to the quaternion term, the vector ∆ω is aligned with an appropriate correction torque vector. This part of equation 4.5 acts as a derivative control term as in a PID-controller. Terms ∆ω and ∆q¯ represent a PD-controller and posses according characteristics. Since an integrating part is absent, a constant, small deviation from the desired signal is apparent. Factors K p and Kd are constant valued gain matrices. They adjust the control signal to appropriate magnitudes. After equation 4.5 generates the control signal, a saturation block limits the maximum allowable control torque as shown in figure 4.5. This measure ensures that the reference model only generates reference signals the actual plant is fit to follow.

4.2.3. Simulink Implementation Figure 4.5 shows the Simulink block diagram which incorporates attitude dynamics and control law mentioned above. The desired state is fed to the reference model. Both, desired angular velocity and attitude are compared to the reference model’s state and two difference signals are generated respectively. In accordance to section 4.2.2, constant gain matrices K p and Kd scale the signals which are superpositioned subsequently. The saturation block guarantees a suitable control torque which is fed to the attitude dynamics. Equation 4.4 describes the control torgue’s transformation into angular acceleration. A successive integrator block calculates angular velocity from angular acceleration. The initial angular velocity is passed to this block as well to provide the reference model with the correct starting conditions.

58

1 desired state

z Unit Delay

1

q*

59

q1*q2

Quaternion Multiplication

q2

U( : )

Gain1

u*K

Reshape1

u/|u| Unit vector Matrix Multiply

Gain

u*K

q

w

Initial q

q

Embedded MATLAB Function

Matrix Multiply

1 s q_dot 2 q

xo

u

fcn y xo

1 s

Reshape

w_dot 2 w

Unit vector1

u/|u|

Singularity Check

Inertia Matrix1

w

Matrix Multiply1

omegq2qdot qdot

Saturation

LU Inverse

(LU)

General Inverse

Figure 4.5.: Screen shot of reference model as implemented as Simulink model

conj. complex quaternion

q

Reshape2

q1

Inertia Matrix

−C−

3 q_ref

2 omega_ref

1 omega_dot_ref

4. Neural Control 4.2. Reference Model

4. Neural Control

4.2. Reference Model

Fusion of angular velocity ω¯ and current attitude q produces the rate of attitude 9 change q. Equation 4.6 describes the corresponding attitude dynamics as may be found in [16]: q9 o q9 1 q9 2 q9 3

 0.5 pω1 q1 ω2 q2 ω3 q3q  0.5 pω1 qo  ω2 q3 ω3 q2q  0.5 pω1 q3 ω2 qo  ω3 q1q  0.5 pω1 q2 ω2 q1 ω3 qo q .

(4.6)

Similarly to angular acceleration, q9 is transformed into q using an integrator block. ¯9 angular velocity ω¯ This setup allows to output reference- angular acceleration ω, and attitude q. However, in this work, only reference attitude is required. A unit delay block is incorporated close to the input block in figure 4.5. The delay is required such that the signals correspond to the time-dependencies depicted in figure 4.6. The signals which are fed to reference model and observer need to be delayed in order for the simulation to run synchronized. The "Satellite"-block receives the correctly timed signal because the neural controller subsystem (NNC) acts as a delay. A delay block is obsolete in this case. For both, reference model and observer, the delay blocks lie within the subsystem blocks. If no delay blocks were to be incorporated, observer and reference model would receive signals from time step pk  1q while the satellite would work with pkq. It is clear that such a configuration would falsify the results. qd p k q

qre f pk

Reference Model

1q

-

q a pk q

∆qd pkq

NNC

τ¯c pkq

Satellite

q a pk

1q

eobs pk

q a pk q

NN Observer ec pk

1q

inverse propagation

qobs pk

1q 1q -

∆qre f pk

1q

Figure 4.6.: Block diagram of a reference model augmented observer-controller architecture. Signals are labled with corresponding time signatures.

60

4. Neural Control

4.2. Reference Model

4.2.4. Stability Analysis As mentioned in 4.2, a simple stability analysis is possible for the reference model. Reference [3] describes how to draw conclusions about stability from eigenvalues of the characteristic equation. For this purpose the eλt -approach is applied to the plant’s simplified equation of motion (EoM) 4.4. The angular velocity ω¯ is equiv¯9 which describe the attitude with three alent to the rate of Euler angle change Υ, perpendicular angles. The homogeneous EoM is needed:

Choosing the eλt -approach yields

: ¯  0. Υ

(4.7)

Υ¯  eλt .

(4.8)

Substituting equation 4.8 into 4.7 gives: λ2 eλt

 0.

(4.9)

Since eλt may never be zero, λ has to be zero for equation 4.9 to hold true. Because λ is squared, the root is twofold at the origin λ1,2

 0.

(4.10)

Roots at the origin mean semi-stable behaviour of the plant. The result makes sense because no perturbations were considered for the plant. In other words, no environmental disturbances such as atmospheric drag, solar pressure or magnetic- or gravity-gradient torques could dampen or excite the satellite’s motion. Conclusively, it has been shown that the plant possesses marginal stability.

61

4. Neural Control

4.3. Stable Error Signal after [1]

4.3. Stable Error Signal after [1] In section 4.1.3 the control method after [1] is introduced. In this section, the architecture is scrutinized. For this purpose the signal labels are generalized. Furthermore, the analysis requires consideration of time relations. Therefore, each signal carries its time stamp k. The modified block diagram is shown in figure 4.7.

Ref. Model

NNC

- epk 1q

u c pk q

Satellite eobs pk NN Obs.

e c pk

1q

1q

yobs pk

1q

ypk

Augmented Output

1q

∆ypkq

yre f pk

y m pk

y d pk q

1q

-

1q -

inv. prop.

em pk

1q

Figure 4.7.: Block diagram of observer-controller architecture with error stabilization after [1]. The "Augmented Output" block is the significant addition to the reference model enhanced observer-controller architecture. It implements the following algorithm: y m pk

1q  yre f pk

1q  A  pyre f pkq  ypkqq  B  pyre f pk  1q  ypk  1qq

(4.11)

where the difference yre f pkq  ypkq describes the unprocessed error-signal epkq  yre f pkq  ypkq.

(4.12)

Equation 4.11 may therefore be rewritten as y m pk

1q  yre f pk

1q  A  epkq  B  epk  1q.

(4.13)

The number within the parenthesis denotes the corresponding time step. Matrices A and B are Hurwitz matrices. If y and therefore e are scalar, A and B take negative real values. Generally, the values of A and B have to be determined by trial and error and lie within the unit interval. The author of [1] suggests to use

62

4. Neural Control

4.3. Stable Error Signal after [1]

bigger values for plants with fast dynamics while for plants with slow dynamic characteristics, smaller values are appropriate.

4.3.1. Proof of Stability In order to prove the error signal’s stability, an attempt to meet Lyapunov stability criteria is conducted. Lyapunov was a Russian mathematician who invented a method to prove stability for dynamic systems. Two cases of Lyapunov stability may be distinguished: 1. If, after an allowable disturbance from equilibrium point, the answer of a system remains inside predefined boundaries for all times, the system possesses marginal stability. In mathematical terms, an equilibrium state ye exists and at t  0 the system has state y0 . If

|y e  y0 |   γ

and

|yptq  y e |   σ

(4.14)

for predefined boundaries γ and σ, then the system producing yptq may be called marginally stable. 2. If, after an allowable disturbance from equilibrium point, the answer of a system remains bounded and eventually returns arbitrarily close to the equilibrium point, the system possesses asymptomatic stability. In mathematical terms: If |ye  yptq|   γ and lim |yptq  ye |  0 , (4.15) tÑ8

the system may be called asymptotically stable. Asymptotic stability implies marginal stability. Lyapunov suggested an indirect approach which proves stability of the concerned system: If for a dynamic system, a candidate function V pxq  f pyptqq may be found which is stable after Lyapunov, then the system itself is stable after Lyapunov. In order to qualify as Lyapunov function V9

 0

(4.16)

has to hold true. In article [1], the Lyapunov candidate function 4.17 is introduced V pepkqq  e T pkqepkq.

(4.17)

The error signal epkq has to decrease monotonously in order for V to qualify as Lyapunov function. Since the system at hand is a simulation with discrete time

63

4. Neural Control

4.3. Stable Error Signal after [1]

steps, the Lyapunov requirement 4.16 may be expressed as ∆V

 0

∆V

where

 V pk

1q  V pkq.

(4.18)

From figure 4.7, one may deduce that em pkq  ym pkq  ypkq.

(4.19)

Combining equations 4.19, 4.13 and 4.12 yields epk

1q  e m p k

1q

A  epk q

B  epk  1q.

(4.20)

To scrutinize the Lyapunov criteria, an expression for ∆V has to be found. Quite plausibly, one may write ∆V pepkqq  V pepk

1qq  V pepkqq  e2 pk

1q  e2 pkq.

(4.21)

Substituting 4.20 into 4.21 allows to expand the terms. One may find that ∆V pepkqq e T pkqpA2  Iqepkq

T 2em pk

2e T pkqAT Bepk  1q

1qAepkq

T 2em pk

e T pk  1qBT Bepk  1q

1qBepk  1q T em pk

1qem pk

1q

(4.22)

where I is the identity matrix. Equation 4.22 may be substituted into 4.18. In order for ∆V pepkqq   0 to hold true, inequality 4.23 has to hold T e T pkqppA  Iqqepkq ¡2em pk

T 2em pk

2e T pk

1qAepkq

1qBepk  1q

1qAT Bepk  1q

e2 pk  1qB2

2 em pk

(4.23)

1q.

Proper selection of A and B fosters inequality 4.23 to hold true. Of course, since dynamic systems are investigated, it is impossible to foresee the exact future plant behaviour. Peak disturbances may result in peak error values which cause inequality 4.23 to collapse. In this case, Lyapunov’s criterium would not be fulfilled and the whole approach loses its claim of guaranteed stability. However, it is worth to scrutinize inequality 4.23 for its likely outcome during standard operations. To this end, a close-up of the individual terms is appropriate. The first term, e T pkqppA2  Iqqepkq, will always be positive and relatively big. This is because A2 is small and positive. Therefore, ppA2  Iqq will result in a positive and big outcome for the first term. The augmented error em pkq is processed such that it should always be smaller than T pk T pk each individual plain error epkq. Both terms, 2em 1qAepkq and 2em 1qBepk  1q include em pkq and are additionally reduced by matrices A and B. Consequently,

64

4. Neural Control

4.3. Stable Error Signal after [1]

they may be regarded as small. The terms 2e T pk 1qAT Bepk  1q and e2 pk  1qB2 , both, implement reduction matrices A and B twice. This justifies their classification 2 pk as small terms. The last term, em 1q, carries the squared augmented error and should hence be minuscule. All in all, the right hand side of inequality 4.23 resembles an array of small terms while the left hand side appears to bear a big value. To this end, inequality 4.23 appears to hold true in normal operations but a strict proof remains to be offered.

4.3.2. The Effect The addition of the near error value time history, epkq and epk  1q, effectively dampens the reference model state yre f pk 1q. Multiplication of A and B weighs past error values epkq and epk  1q. Closer value epkq offers a stronger influence through bigger A than epk  1q with multiplication of B. The effect of error augmentation is depicted in figure 4.8. The reference model’s state yre f is drawn closer to the actual state y y yre f Bepk1q Aepkq

epkq

em p k

e p k  1q

k1

k

k

1

ym 1q

y

timestep

Figure 4.8.: The plot shows two hypothetical curves of the plant’s and reference model’s state y (blue) and yre f (red), respectively. The augmented reference model output ym is illustrated as green dot. Errors e  yre f  y and em  ym  y are labled. Stabilized error signal em is calculated by equation 4.13. The augmented state is drawn closer to the plant’s state by Aepkq Bepk  1q and therefore produces weaker error signals.

65

4. Neural Control

4.3. Stable Error Signal after [1]

by subtraction of weighted past error values epkq and epk  1q. Accordingly, the method produces weaker error signals because the new augmented state ym is used for controller error signal generation. Surely, weaker error signals promote moderate weight adjustment which in turn favours stable learning. On the other hand, smaller weight adjustments likely decrease convergence speed and therefore control accuracy. Whether the presented scheme actually adds stability to the network will be determined.

66

4. Neural Control

4.3. Stable Error Signal after [1]

4.3.3. Simulink Implementation Two implementation versions have been built. Figure 4.9 shows the first version. 1

e(k+1)

augmented out

1

1

−K−

z

q ref

e(k)

Gain A

Unit Delay

1

2

z

q est

Unit Delay1

1

−K−

z

e(k−1)

Gain B

Unit Delay2

1 z Unit Delay3

Figure 4.9.: Simulink model of the stable error signal implementation using subtraction. The method to stabilize the error signal is implemented as simple subtraction of time delayed error quaternions. The corresponding time labels may be found in the figure. This implementation is a direct realization of equation 4.20. The gain factors A and B are entered as negative values for the simulation. Figure 4.10 shows the second version. + 2)3%K

2+ 2+52F 2F D"!'%3&4*& E"0'4/04,!'4*&

6A)B)C >%?@!/%

"89"9

2+!L

6&4')7%,'*3+ +

!I!

< F 2)%?'

2

25

"89"9

2M)!)%ANO+C

+ !"#$%&'%()*"'

2M)!)%ANO+C 6&4')7%,'*3

=4#&

E!'341)E"0'4/0;

%ANC

G!4&)H

6&4'):%0!;

,*&-.),*$/0%1 2"!'%3&4*&

%AN!+ + < 6&4'):%0!;+

!I! )G!4&)J

Figure 4.10.: Simulink model of the stable error signal implementation using difference quaternion. Here, the stabilization of error signal is realized through building a difference quaternion. The stabilizing effect is achieved by weakening the real part of the difference quaternion according to equation 4.20. Similarly to the reference model implementation in section 4.2.3 , the sign of the quaternion’s real part is used to correct its imaginary part as described in section 4.2.2.

67

4. Neural Control

4.4. Control Angle Error

4.4. Control Angle Error The control angle error is used as main performance index. It represents the deviation of the actual from the desired satellite attitude. Chapter 3 explains that attitude is defined by a rotation of a principal body frame with respect to an inertial frame. The frame representing the actual orientation is fixed within the satellite body. The desired attitude frame is an image of the body frame. Optimally, both frames are congruent. In real life, a deviation will always be present between both frames. In this work, the deviation is defined as the angle between corresponding axis. Both, the desired- and the actual-attitude frame are made up of three principal axis. Accordingly, one error angle is defined between the one-axis of both frames. The angles between the respective two-axis give the second angle. The same scheme applies for the three-axis. Figure 4.11 depicts the relationship. 2d ∆Φ 2a ∆Θ 1a 1d 3d

3a

∆Ψ

Figure 4.11.: 3D-Illustration of two frames with error angles ∆Θ, ∆Φ and ∆Ψ between the corresponding axis. Hence, the angles are free to rotate about the desired axis depending on the orientation of the actual attitude frame. Consequently, the angles will always be positive. A negative angle is only possible to obtain if said angle is lies in a predefined plane.

4.4.1. Calculation of the Error Angle The direction cosine matrix defines the cosines between all axis between two frames as shown in equation 3.28. Quaternions may be transformed in a straight forward

68

4. Neural Control

4.4. Control Angle Error

manner into DCM format [16]. Equation 4.24 gives the relationship 

1  2pq22 q23 q DCM   2qo q3 2q1 q2 2qo q2 2q1 q3

2qo q3 1  2pq21 2qo q1

2q1 q2 q23 q 2q2 q3



2qo q2 2q1 q3 2qo q1 2q2 q3 . 1  2pq21 q22 q

(4.24)

Therefore, the cosines between the one-, two- and three-axis (c11 , c22 and c33 ) may be obtained applying

1  2pq22 c22 1  2pq21 c33 1  2pq21

c11

q23 q

q23 q. q22

(4.25)

q

Consequently, the angles between the one-, two- and three-axis are given by 

∆Θ  cos1 1  2pq22  ∆Φ  cos1 1  2pq2 

1

∆Ψ  cos1 1  2pq21 respectively.

69



q q23 q q22 q , q23

(4.26)

4. Neural Control

4.5. Mission and Past Work

4.5. Mission and Past Work This work is based on a Simulink model. The model has been designed to investigate an attitude control system (ACS) for a satellite-class developed by the "Deutsche Luft- und Raumfahrtinstitut" (DLR). The satellite class is derived from the BIRD1 satellite. This class describes a relatively small platform. The simulation’s aim is to adapt the ACS to new missions and to increase controller performance. The simulation has been developed by [7] in cooperation with DLR.

Latitude [°]

Satellite's Mission

Longitude [°] Figure 4.12.: Earth map with satellite’s time-marked ground path. Ground stations and their visibility are included [4]. Two missions were fed to the simulation in order to test its pointing accuracy. The first mission is to point the satellite along a vector in inertial space. In other words, the satellite holds its attitude while targeting. This mission is called "Inertial Target Pointing". The other mission requires the satellite to point at ground stations while it is passing overhead. Figure 4.12 shows the satellite’s ground track. The time-marks along the track show the satellite’s position at certain times in seconds. Four ground stations are illustrated. The shaded area around them is their field of visibility by the satellite. The axes mark Earth’s longitude and latitude in degrees. Because the satellite and the ground station are moving with respect to each other, the satellite needs to swing constantly in order to follow the ground station. This 1 Bispectral

Infra-Red Detection

70

4. Neural Control

4.5. Mission and Past Work

mission is called "Ground Station Tracking". The ANNs employed in the ACS’s observer and controller are pretrained. In other words, multiple simulation runs have been performed during which the networks trained themselves but were not actually controlling the satellite. Rather, a conventional controller performed the control task. One could say: the networks were watching and learning. This point is important because consequently, the networks were trained for a specific system configuration (e.g. observer-controller) and mission. Whenever the pretrained networks are employed in an altered environment, a performance decrease may be expected. Eventually, the networks shall adapt themselves to the new environment. Based on the two missions, optimal mission profiles have been calculated. They provide attitude time history for maximum transmission time with the stations. In between stations, the satellite performs a quick swing to target the next station. The controller’s accuracy is determined based on the deviation of the actual orientation with respect to the reference profile. Plot 4.13 shows the performance of the −4

Figure 4.13.: Satellite's pointing accuracy using a PID controller. The deviation angle is plotted over mission time. The top plot provides a zoomed-in view while the bottom plot gives the zoomed-out illustration.

conventional PID controller developed in the original work by [7]. The controller's maximum deviation is 0.06° at 4130 seconds. The other peaks are weaker. A common characteristic of all peaks is that they lie halfway between ground stations. The peaks indicate high manoeuvring activity at those times. Indeed, at those points, the


satellite swings to target the next ground station. The peaks are an expected result of those manoeuvres. Overall, plot 4.13 suggests robust controller performance.

The focus of the succeeding work [4] is the implementation of neural network controllers rather than increased pointing accuracy. The attempt was successful. However, the implemented neural controller performed worse than the conventional controller in terms of accuracy. Plot 4.14 visualizes the neural controller's performance. Its peak deviation is 3° at 4770 seconds. Compared to the PID controller, accuracy decreased by a factor of 50.

Figure 4.14.: Plot of control accuracy over time using the unmodified observer-controller architecture.

The work at hand implements enhanced neural control strategies to improve controller performance. Increased controller performance may be achieved if robustness is raised or pointing accuracy is improved. For this purpose, three different controller architectures and multiple parameter variations have been tested.


5. Investigation

Multiple simulation runs have been performed to scrutinize various aspects of the neural control strategy. The following sections elaborate on the individual aspects.

Mean Control Error c̄

As a quick performance index, the mean control error c̄ is introduced. It serves as a benchmark with which the control outcomes of various configurations may quickly be compared. The index gives the mean summed error angle over the entire mission: the sum of all three error angles (corresponding to the three spatial dimensions, as in section 4.4) is integrated over the mission and its mean value determined. Equation 5.1 provides the corresponding calculation

\[
\bar{c} = \frac{1}{T} \int_0^T \left( \Delta\Theta + \Delta\Phi + \Delta\Psi \right) \mathrm{d}t
\tag{5.1}
\]

where T is the mission time.
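As a minimal sketch, the index may be evaluated from logged error-angle histories; the variable names and the trapezoidal quadrature below are assumptions, not the simulation's actual implementation:

```matlab
% Evaluate equation (5.1) from logged time series (names illustrative).
% t [s]; dTheta, dPhi, dPsi [deg] sampled at the times in t.
T    = t(end) - t(1);                        % mission time
cBar = trapz(t, dTheta + dPhi + dPsi) / T;   % mean control error
```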

5.1. Reference Model and Parameter Variation

To start the investigation of the reference model's influence, a baseline parameter set has been selected. The parameter values have been chosen to be the same as for the simulation run without reference model; table 5.1 lists the corresponding values. The influence of each parameter is discussed in the following sections.

For comparison, the performance of the controller without reference model is portrayed in figure 4.14. It appears stable but shows inferior performance compared to the conventional controller (figure 4.13). It should be noted that the neural controllers are set to take over attitude control after 60 seconds. Thereafter, performance decreases at first; the controller then adjusts itself and shows robust control performance.

A baseline parameter setting with reference model yields the performance time history shown in figure 5.1. Similarly to figure 4.14, the controller with reference model appears stable. At first glance, a performance increase may be suspected, as the peak deviation decreases to about 2.5°. However, overall performance is inferior compared to figure 4.14. The performance index confirms this finding: table 5.2 provides the index values for the baseline parameter set with and without reference model. The performance


Figure 5.1.: Plot of control accuracy over time using the reference model augmented observer-controller with baseline parameter setting.

decrease may partially be explained by a constant deviation which comes with the introduction of the reference model. The reference model has its own dynamics and inertias; consequently, it has its own deviation from the desired attitude. This deviation is propagated through the system and is inherent in the controller training signal. Therefore, the reference model's deviation will always be present in the control outcome and cannot be reduced by the controller. However, it appears constant over the mission and quite small compared to the absolute control error.

Parameter           Controller   Observer
Train Method        SMC          SMC
Train Interval      3            3
Validation          0            0
Start Learn Rate    10⁻³         10⁻³
λ                   10⁻¹⁰        10⁻¹⁰

Table 5.1.: Listing of the baseline parameter selection for the observer-controller with reference model.


Reference Model    c̄ [°]
with               2.8687
without            2.5472

Table 5.2.: Performance index values for the baseline parameter selection with and without reference model.

The reference model itself produces a deviation of c̄_ref = 0.1721.

5.1.1. Parameter Variation

In this section, the parameters undergo variation. Variation is necessary because the system's environment has changed with the addition of the reference model. The employed networks are pre-trained without reference model. In order to adapt to the new system within merely the time of one mission, the parameters need to be tuned to enable optimal learning performance. The ensuing results of the variations are presented and discussed.

Validation Interval

Validation describes a method by which the effect of network training is checked. If a positive effect is verified, the training is retained. On the other hand, if the network's performance continues to decrease, the training is overwritten with the pre-training data. The validation interval determines the number of time steps over which performance data is recorded. Data recording occurs before and after training, and the mean values of the two data sets serve as the performance indicator for verification: if the post-training data indicates increased performance compared to the pre-training data, the training is kept and the verification process starts over.

It appears that validation in general does not have a notable influence on performance. Figure 5.2 shows the result of a test run with the parameter "validation interval" set to ten. There is no notable visual difference compared to the baseline result in figure 5.1; only minuscule variations appear with the aid of the performance index. Table 5.3 lists the performance indexes of four parameter variations. The variation values are chosen such that they cover a broad range of magnitudes. For the baseline scenario, no validation takes place (validation interval = 0). Compared to the baseline configuration, only negligible performance changes are visible in the performance index. It may be deduced that training validation in general has an insignificant effect.
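The verification logic described above can be sketched as follows; the helpers trainStep and runAndLog, as well as the weight snapshotting, are hypothetical and only illustrate the mechanism:

```matlab
% Sketch of training validation over a fixed interval (illustrative only).
N = 10;                                  % validation interval [time steps]
wBackup   = w;                           % snapshot of the current weights
errBefore = mean(errLog(end-N+1:end));   % mean error over N steps before training
w         = trainStep(w, trainData);     % hypothetical training update
errAfter  = mean(runAndLog(w, N));       % mean error over N steps after training
if errAfter > errBefore
    w = wBackup;                         % performance dropped: revert the training
end
% Otherwise the training is kept and the verification process starts over.
```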


Figure 5.2.: Plot of control accuracy over time using the reference model augmented observer-controller with the validation interval set to 10.

Validation Interval    c̄ [°]
1                      2.8677
10                     2.8691
100                    2.8696
1000                   2.8694
baseline               2.8687

Table 5.3.: Performance index values for the validation interval parameter variation.

Training Interval

The training interval determines the number of time steps over which network data is recorded for training. Training occurs based on the mean value of the recorded data. Table 5.4 summarizes the results for each parameter variation. For a small training interval (= 1), performance increases slightly. In contrast, for larger intervals, performance decreases on a similarly minuscule level; the results remain constant for values larger than ten. One may deduce that small training intervals facilitate learning: during operation, the network needs to adjust itself as quickly as possible for adequate control outcomes.
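A minimal sketch of this buffered training, with hypothetical helpers recordNetworkData and trainStep standing in for the simulation's recording and update steps:

```matlab
% Sketch of interval-based training on averaged data (helpers hypothetical).
Nt  = 3;                                 % training interval [time steps]
buf = zeros(Nt, nSignals);               % recorded network data, one row per step
for k = 1:Nt
    buf(k, :) = recordNetworkData();     % hypothetical per-step recording
end
w = trainStep(w, mean(buf, 1));          % train once on the interval mean
```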

Training Interval    c̄ [°]
1                    2.8670
10                   2.8699
100                  2.8693
1000                 2.8694
baseline             2.8687

Table 5.4.: Performance index values for the training interval parameter variation.

This is possible only if training occurs frequently. A larger training interval slows down the adaptation process, and slow adaptation results in poor controller performance. This result may be generalized beyond the scope of the application at hand: online learning should generally take place with higher dynamics than offline learning.

Start Learning Rate

Since training uses the SMC algorithm, the learning rate is variable; the start learning rate merely sets its initial value. Even though the start learning rate has been varied over a broad range of magnitudes, no change occurred in controller performance. As soon as the simulation starts, the SMC algorithm adjusts the learning rate to appropriate levels regardless of the initial value. This is possible because SMC operates independently of past µ-values, as shown in section 2.6. Consequently, the start learning rate has a negligible influence on controller performance.

SMC Parameter λ

The parameter λ controls the value range from which the learning rate µ is drawn; section 2.6 elaborates on this mechanism. Table 5.5 provides the variations and the corresponding outcomes. The listing suggests a serious impact of λ on controller performance. For λ = 10⁻¹ and λ = 1, the satellite attitude control fails. Figure 5.3 shows the result for λ = 10⁻¹: control is unstable, and after attitude control is handed to the neural controllers at 60 seconds, the satellite starts to tumble in an uncontrolled way. Evidently, λ strongly influences controller stability. Among the stable configurations, overall performance increases slightly with larger λ.


λ          c̄ [°]       stable?
10⁻¹⁰⁰     2.8694      yes
10⁻¹       260.3148    no
1          240.8742    no
10         2.8635      yes
10³        2.8521      yes
baseline   2.8687      yes

Table 5.5.: Performance index values for the λ-variations, and whether the control appears stable.

Figure 5.3.: Plot of control accuracy over time using the reference model augmented observer-controller with λ = 0.1. The result is unstable.

5.1.2. Reference Model Influence

Up to this point, the reference model performed worse than the original observer-controller architecture. Despite various parameter settings, performance remained unsatisfactory, although theory promises a performance increase from the incorporation of a reference model. One may suspect procedural problems preventing success. In fact, figure 5.4 displays an exceptionally small value of the learning rate µ.


Figure 5.4.: Two plots of control error and the corresponding µ-values over mission time. Three neural networks control the attitude, therefore three µ-series are shown. The learning rate µ is small, preventing adequate learning.

The figure portrays the mission with the baseline parameter set. As shown in section 2.5, µ determines the weight adaptation magnitude. If µ is too small, the learning steps are tiny, such that the network is unable to adapt itself. In the SMC algorithm, λ affects the learning rate µ; in order to boost the learning rate, λ needs to be increased. In addition to λ, the error signal also influences the weight adaptation magnitude. In this application, the observer provides the error signal for the controller via inverse propagation. As may be seen in figure 5.5, this signal is small as well. The small magnitudes of λ and of the learning signal explain the absence of increased control performance.

To facilitate adequate learning, the gain factor G is introduced. This factor magnifies the inverse propagation output signal, which serves as the error signal for the controller. Thus, if G is increased, the weight adaptation values should increase and provide sufficient network adaptation capability. A suitable parameter combination for λ and G has been found: the results show that the network architecture with reference model then outperforms the traditional observer-controller architecture. The implemented reference model, with λ = 10³ and the gain factor set to G = 10², provides a performance increase of about 20% compared to the observer-controller architecture. Table 5.6 summarizes the results.


Figure 5.5.: Plot of the observer's inverse output signal over mission time. The inverse output signal also determines the weight adaptation magnitude; it is a small-scale signal.

Architecture                   c̄ [°]     Performance Difference [%]
Ref. Model magnified           2.0630     +19.0
Observer-Controller            2.5472     –
Reference Model (Baseline)     2.8687     −12.6

Table 5.6.: Performance comparison of the reference model with and without magnified error signal against the traditional observer-controller architecture.

However, G appears to facilitate unstable behaviour of the controller: with G = 10³, the same simulation becomes prone to instability after 5000 seconds.

Figure A.5.: Simulink subsystem receiving trigger information and holding the simulation until trigger reception in peer-to-peer configuration.

Figure A.6.: Output of state signals for post-simulation evaluation in the Simulink workspace.


Bibliography

[1] Mehrabian, A. R.; Menhaj, M. B.: A real-time neuro-adaptive controller with guaranteed stability. In: Applied Soft Computing 8 (2008), Nr. 1, 530-542. DOI 10.1016/j.asoc.2007.03.005. ISSN 1568-4946

[2] NIA: Unraveling the Mystery. http://www.nia.nih.gov/Alzheimers/Publications/UnravelingtheMystery/. Version: February 2011

[3] Vörsmann, P.; Winkler, S.; Martin, T.: Regelungstechnik 1. Technische Universität Braunschweig, Institut für Luft- und Raumfahrtsysteme, October 2008

[4] Dumke, M.: Entwurf einer neuronalen Lageregelungsstrategie für Raumfahrzeuge. Technische Universität Braunschweig, Institut für Luft- und Raumfahrtsysteme, March 2010

[5] Möckel, M.; Wiedemann, C.; Flegel, S.; Gelhaus, J.; Vörsmann, P.; Klinkrad, H.; Krag, H.: Using parallel computing for the display and simulation of the space debris environment. In: Advances in Space Research (2011). DOI 10.1016/j.asr.2011.03.003

[6] Wikipedia: Float Code Example. http://en.wikipedia.org/wiki/File:Float_example.svg. Version: March 2011

[7] Heidecker, A.: Development of algorithms for attitude determination and control of the AsteroidFinder satellite. Technische Universität Braunschweig, Institut für Luft- und Raumfahrtsysteme, Diplomarbeit, March 2009

[8] Mößner, M.: Optimierung einer adaptiven neuronalen Reglerstruktur und Stabilitätsanalyse der verwendeten Lernverfahren. Technische Universität Braunschweig, Institut für Luft- und Raumfahrtsysteme, March 2009

[9] Omatu, S.; Khalid, M.; Yusof, R.: Neuro-Control And Its Application. Springer, London, 1996 (Advances in Industrial Control)

[10] Hebb, D. O.: The Organization of Behavior. Wiley, New York, 1949


[11] Kriesel, D.: A Brief Introduction to Neural Networks. http://www.dkriesel.com, 2007

[12] Merziger, G.; Mühlbach, G.; Wille, D.; Wirth, T.: Formeln + Hilfen zur Höheren Mathematik. 4. Auflage. Binomi Verlag, Springe, 2004

[13] Hebisch, H.: Grundlagen der Sliding-Mode-Regelung. Gerhard-Mercator-Universität GH Duisburg, Forschungsbericht 15/95, 1995

[14] Tewari, A.: Atmospheric and Space Flight Dynamics. Birkhäuser, Berlin, 2007 (Modeling and Simulation in Science, Engineering and Technology)

[15] Wertz, J. R.: Spacecraft Attitude Determination and Control. Kluwer Academic Publishers, Netherlands, 1991

[16] Kuipers, J.: Quaternions and Rotation Sequences: A Primer with Applications to Orbits, Aerospace, and Virtual Reality. Princeton University Press, Princeton, New Jersey, 1999

[17] Harvey, S. A.: Spacecraft Attitude Control Using Direct Model Reference Adaptive Control. University of Wyoming, Diss., May 2008

[18] Cheng, J.; Yi, J.; Zhao, D.: Neural Network Based Model Reference Adaptive Control for Ship Steering System. In: International Journal of Information Technology 11 (2005), Nr. 6

[19] Patino, H. D.; Liu, D.: Neural Network Based Model Reference Adaptive Control System. In: IEEE Transactions on Systems, Man, and Cybernetics 30 (2000), February, Nr. 1

[20] Wikipedia: Satellite Catalog Number. http://de.wikipedia.org/wiki/Satellite_Catalog_Number. Version: 2011

[21] Campa, G.: udpip Library. http://www.mathworks.com/matlabcentral/fileexchange/12021. Version: March 2011

[22] Haupt, M.: Informatik im Maschinenbau. Technische Universität Braunschweig, Institut für Flugzeugbau und Leichtbau, Scriptum, 2008
