Nat.Lab. Unclassified Report PR-TN-2005/00919 Date of issue: 30/09/2005

Model reduction for nonlinear differential algebraic equations M.Sc. Thesis

Thomas Voß (University of Wuppertal, Germany)

Revision history:

Version   Dates      Initials   Description
0.1       05.06.01   TV         Initial prototype
1.0       05.09.30   TV         Final Report

Unclassified Report © Koninklijke Philips Electronics N.V. 2005


Authors’ address data: Thomas Voß [Philips Research:] WAY03.091 [Private:] Karl-Theodor-Str. 25, D-42119 Wuppertal, Germany [email protected] [email protected]

© Koninklijke Philips Electronics N.V. 2005. All rights are reserved. Reproduction in whole or in part is prohibited without the written consent of the copyright owner.


Unclassified Report:

PR-TN-2005/00919

Title:

Model reduction for nonlinear differential algebraic equations M.Sc. Thesis

Author(s):

Thomas Voß (University of Wuppertal, Germany)

Part of project:

Internship at Philips Electronic Design and Tools/Analogue Simulation (ED&T/AS)

Customer:

Keywords:

DAE, Model reduction, Poor Man's TBR, PMTBR, Krylov, PRIMA, Balanced Truncation, TBR, Trajectory Piecewise Linear, TPWL, Empirical Balanced Truncation, EBT, Proper orthogonal decomposition, POD, Volterra Series, VS, Linearisation Tuple Controller

Abstract:

This thesis shows the results of available techniques to reduce the size of the nonlinear DAEs which are used to describe circuits. This reduction of order decreases the simulation time of the transient analysis. Because there are several options for reducing a nonlinear DAE, the following methods are investigated:

• Proper orthogonal decomposition (POD). This is a well-known technique which is frequently used in computational fluid dynamics. The idea behind this method is to simulate the system and then to find an optimal subspace by using snapshots of the simulation results.

• Empirical balanced truncation (EBT). The idea behind this technique is to find a similar representation of balanced truncation for nonlinear systems. This method requires a lot of simulations to get a good approximation of the Gramians. These approximations are called empirical Gramians. After calculating the empirical Gramians, the standard ideas of TBR for linear ODEs are used.

• Volterra Series (VS). This method builds a bilinear representation of the nonlinear system via a power series approximation; linear model reduction techniques are then used to reduce the bilinear system.

• Trajectory Piecewise Linear (TPWL). This is a relatively new method which uses several linearised systems, created along a typical trajectory of the original system, to compile a weighted overall linear system. Each of the linearised systems is reduced by a linear model reduction technique.

The result of this investigation is that the TPWL method seems to be the best option for circuit simulation, so this method was studied in more detail. The topics of this research are


• Linearisation tuple controller. In this part it is shown how the selection of the linearisation tuples influences the quality of the TPWL model, and several possibilities are presented to select 'optimal' linearisation tuples.

• Weighting. Here we study several options for combining the local linearised reduced systems to create a global TPWL model.

• Linear model reduction techniques. Finally, a comparison of several linear model reduction methods is presented, because each of the linearised systems has to be reduced to get a bigger speed-up, and some properties of these techniques have to be used to create a better linearisation tuple controller.

This project has been a co-operation between Philips Electronic Design & Tools/Analogue Simulation (ED&T/AS) at Philips Research Laboratories in Eindhoven, Netherlands and the Bergische Universität Wuppertal (BU), Germany. It has been executed under the supervision of Dr. E.J.W. ter Maten (ED&T/AS) and Univ.-Prof. Dr. rer. nat. Dipl.-Math. Univ. Michael Günther (BU). The author acknowledges being able to use work from Pieter Heres (PhD student, TU Eindhoven, PRIMA code), Joost Rommes (PhD student, University Utrecht, Hstar for converting Pstar files to Matlab files) and Dr. Tatjana Stykel (TU Berlin, code of TBR for DAEs).

Conclusions:

The TPWL method applied to nonlinear DAEs, which are used to describe analogue electronic circuits, is a promising technique to decrease the simulation time. It has several advantages compared to other methods. First of all we can get a speed-up in simulation time, because we are only solving small linear systems to approximate our system. Then we can use the well-developed linear model reduction techniques to increase the performance of our method. We could also create a linearisation tuple controller which can be used directly in a BDF method, which is a big advantage because we get a fast model extraction technique. And we can even improve the quality of the TPWL model if we construct a very good weighting procedure. The last thing we want to mention is that the TPWL method also has the very nice property that it is scalable. This means that by using a different linearisation tuple controller, linear model reduction technique and weighting method, we can change the method from a very fast but not so accurate method to a slower but much more accurate one. The user can thus decide what he desires: speed or accuracy. But we also have to give a warning. As with all model reduction techniques, one has to be careful when reducing systems, because if we reduce them too much we cannot trust the results anymore.


Contents

1 Introduction . . . 1
  1.1 Motivation of the project . . . 1
  1.2 Formulation of the problem . . . 1
  1.3 Guideline to this document . . . 2

2 Introduction to model reduction . . . 3
  2.1 Model reduction idea for nonlinear dynamical systems . . . 3
    2.1.1 Model reduction for differential-algebraic equations (DAE) for circuit simulation . . . 5
  2.2 Proper orthogonal decomposition (POD) . . . 6
    2.2.1 Singular value decomposition . . . 6
    2.2.2 Idea of POD . . . 7
    2.2.3 POD and SVD . . . 9
    2.2.4 Galerkin projection . . . 10
  2.3 Empirical balanced truncation (EBT) . . . 11
    2.3.1 Linear systems . . . 12
    2.3.2 Gramians and principal component analysis . . . 13
    2.3.3 Construction of EBT . . . 16
    2.3.4 Computing the reduced model by EBT . . . 17
  2.4 Trajectory Piecewise-Linear (TPWL) . . . 18
    2.4.1 Linearisation of the system . . . 18
    2.4.2 Quasi-piecewise-linear . . . 19
    2.4.3 Weighting procedure . . . 20
    2.4.4 Selecting linearisation points . . . 20
    2.4.5 Constructing the reduced TPWL-model . . . 21
  2.5 Volterra Series (VS) . . . 22
    2.5.1 Impulse response function and transfer function of a linear DAE . . . 22
    2.5.2 Projection method for linear systems . . . 23
    2.5.3 Bilinear systems for a nonlinear ODE . . . 24
    2.5.4 Bilinear model reduction for a nonlinear ODE . . . 26
    2.5.5 Bilinear systems for a nonlinear DAE . . . 27
  2.6 Complexity Analysis . . . 29
    2.6.1 Complexity of the original system . . . 30
    2.6.2 Complexity of POD . . . 30
    2.6.3 Empirical balanced truncation (EBT) . . . 32
    2.6.4 Trajectory Piecewise-Linear (TPWL) . . . 34
    2.6.5 Comparing the complexity for an example . . . 37
  2.7 Conclusion . . . 38

3 Optimal linearisation points and global subspace . . . 43
  3.1 Knowledge of the exact solution . . . 43
    3.1.1 Error estimation for a linear model reduction technique which has no error estimator . . . 44
    3.1.2 Error estimation for a linear model reduction technique which has an error estimator . . . 46
    3.1.3 Using the error estimation for selecting the optimal linearisation points . . . 47
    3.1.4 Solving the reduced system for selecting the LTs . . . 48
    3.1.5 Implementation aspects . . . 49
    3.1.6 Conclusion . . . 51
  3.2 Knowledge of the approximated solution . . . 51
    3.2.1 General idea . . . 51
    3.2.2 Using the error estimation for selecting optimal linearisation points . . . 53
  3.3 Creating the global subspace . . . 53
    3.3.1 Creating the global subspace by using the full local subspaces . . . 54
    3.3.2 Creating the global subspace by using parts of local subspaces . . . 55
  3.4 Constructing the global TPWL model . . . 56

4 Weighting procedure . . . 57
  4.1 How to apply weighting and the reason for weighting . . . 57
  4.2 Calculating weights . . . 59
    4.2.1 Distance dependent weights . . . 59
    4.2.2 Distance dependent weights under the knowledge of the Hessians . . . 60
  4.3 Using estimates to calculate the weights . . . 61

5 Model reduction techniques for linear DAEs . . . 63
  5.1 General Problem . . . 63
  5.2 Methods for model reduction . . . 64
  5.3 Transfer function . . . 64
    5.3.1 Stability . . . 65
    5.3.2 Passivity . . . 65
  5.4 Krylov techniques . . . 66
    5.4.1 General Krylov-subspace methods . . . 66
    5.4.2 Arnoldi algorithm . . . 67
    5.4.3 Importance of preserving passivity . . . 68
    5.4.4 Passive Reduced-order Interconnect Macro-modeling Algorithm (PRIMA) . . . 69
  5.5 Balanced truncation (TBR) for DAEs . . . 71
    5.5.1 Descriptor systems . . . 72
    5.5.2 Model reduction . . . 76
    5.5.3 Algorithms to calculate a TBR of a DAE . . . 80
  5.6 Poor man's TBR (PMTBR) . . . 84
    5.6.1 Model reduction background . . . 85
    5.6.2 PMTBR approach . . . 86
    5.6.3 Computational Complexity . . . 87
    5.6.4 Practical implementation . . . 88
  5.7 Conclusion . . . 90

6 Numerical results . . . 91
  6.1 Test circuit . . . 91
    6.1.1 Modeling of a MOSFET . . . 92
    6.1.2 Chaining . . . 94
  6.2 Linearisation tuple controller . . . 96
    6.2.1 State distance dependent LT controller . . . 96
    6.2.2 LT controller without an error estimator for the reduction part . . . 98
    6.2.3 LT control by simulating the local linearised reduced system . . . 98
    6.2.4 Final method . . . 100
  6.3 Weighting . . . 102
    6.3.1 Distance dependent weighting . . . 102
    6.3.2 Distance dependent weights under the knowledge of the Hessians . . . 102
  6.4 Results for different inputs . . . 104

7 Summary . . . 111
  7.1 Conclusion . . . 111
  7.2 Future work . . . 111

Notations

d/dt q(t, x) + j(t, x) + u(t) = 0, z = Dx, x(0) = x_0
    Differential-algebraic equation of dimension n with an input which is distributed directly to the nodes, page 5.
q    Represents the dynamic information of the circuit.
j    Represents the static information of the circuit.
u    Represents the input of the circuit.
C(t, x) := ∂q(t, x)/∂x    The Jacobian of q.
G(t, x) := ∂j(t, x)/∂x    The Jacobian of j.
x ∈ ℝ^n    The state of the system, page 5.
A = U Σ Vᵀ    The singular value decomposition of A, page 7.
ℝ^n    The original subspace of dimension n.
P ∈ ℝ^{n×r}    The projection matrix.
r    Size of the reduced system.
S_r    The reduced subspace of dimension r spanned by P.
Pᵀ d/dt q(t, Py) + Pᵀ j(t, Py) + Pᵀ u(t) = 0, z = D P y, y(0) = Pᵀ x_0
    Projected differential-algebraic equation of dimension r, page 11.
y    The solution of the reduced system.
X    The observability Gramian, page 13.
Y    The controllability Gramian, page 13.
X̂    The empirical observability Gramian, page 14.
Ŷ    The empirical controllability Gramian, page 13.
d/dt q(t, x) + j(t, x) + B u(t) = 0, z = Dx, x(0) = x_0
    Differential-algebraic equation with an input distribution matrix B of dimension n × k and input u of dimension k, page 18.
C̃_i ẋ + G_i x + B̃_i u(t) = 0, z = Dx, x(0) = x_0
    The linearised differential-algebraic equation of dimension n, page 19.
w_i    TPWL weight, page 20.
P_iᵀ C̃_i P_i ẏ + P_iᵀ G_i P_i y + P_iᵀ B̃_i u(t) = 0, z = D P_i y, x(0) = x_0
    The projected linearised differential-algebraic equation, page 20.
(Σ_{i=0}^{s−1} w_i(y, t) C̃_{ir}) ẏ + (Σ_{i=0}^{s−1} w_i(y, t) G_{ir}) y + (Σ_{i=0}^{s−1} w_i(y, t) B̃_{ir}) u(t) = 0
    TPWL model, page 21.
H(s)    Transfer function, page 22.
h(t)    Impulse response, page 22.
K(A, v)    Krylov subspace of dimension r, page 24.
δx := x − x̃    Distance between original and approximated solution, page 43.
x̃ = P y    The solution of the reduced system, page 43.


Chapter 1

Introduction

1.1 Motivation of the project

Nowadays many circuits used in a wide range of applications are not purely digital or analogue. These circuits are a mixture of analogue and digital and are called mixed-signal circuits. For developing these circuits there is a need for tools which can simulate them during the design phase as well as during the verification phases. This is needed in order to verify whether the circuits meet the functional requirements and whether the design is the optimal choice for the problem.

Digital circuits behave nonlinearly with respect to the source/input values. However, the time-varying behaviour may show several smooth periods, while the behaviour changes rapidly if there is an important change in the inputs. So in general we can say that the partial derivative with respect to time is almost zero.

The digital part in mixed-signal designs contains several sub-circuits that are reused several times. The only difference between these parts is that they have different inputs (voltages, but also parameters). So simplifying these circuits could give a speed-up for the transient analysis (assuming that the simplification process is not too expensive). In mixed-signal analysis the internal results of a digital sub-circuit are not of big interest, because the most important thing is the behaviour at the terminals. This is partly reflected in the wish to use different tolerances on the digital and on the analogue circuit description in a circuit simulator. So for the digital part it makes sense to look for a process to generate a different model of the circuit which is less complex but still has a similar behaviour at the terminals.

1.2 Formulation of the problem

Differential-algebraic equations (DAEs) describe the dynamic behaviour of electrical circuits. We get this description by applying Kirchhoff's laws to the circuit. This gives us for each node a differential or algebraic equation which describes the dynamics of that node and its dependence on the other nodes. These equations can be nonlinear, so numerical approaches, which are really time-consuming, have to be used to solve them. One example for solving them are the backward differentiation formula (BDF) methods. These methods are accurate but have a very high complexity. So there is a big need for faster simulation techniques.

One simple option for speeding up the simulation is to linearise our system around an equilibrium point. Then we only have to solve linear DAEs which are, compared to the nonlinear ones, cheap to solve. But this approach has the disadvantage that the linearisation is only accurate as long as the actual state stays close to the equilibrium point. And this is a restriction which is hard


to fulfil in circuit simulation. Another option for speeding up is model order reduction. But if we use a model reduction technique we introduce additional errors, and of course we want to control these errors so that we can be relatively sure that the reduced system is a good approximation of the original system. So we have to develop some theory to improve the approximation properties of the methods.
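To make concrete why solving the nonlinear DAE is expensive, the sketch below implements a single BDF1 (backward Euler) step with a Newton iteration for a system of the form d/dt q(t, x) + j(t, x) + u(t) = 0, with C and G the Jacobians of q and j as in the Notations. This is a toy illustration with hypothetical callables, not the simulator's implementation; higher-order BDF methods follow the same pattern.

```python
import numpy as np

def backward_euler_step(q, j, u, C, G, t, x_prev, h, newton_iters=20, tol=1e-10):
    """One BDF1 (backward Euler) step for d/dt q(t, x) + j(t, x) + u(t) = 0.

    q, j, u are callables; C and G return the Jacobians of q and j.
    Every step needs a full Newton iteration on an n-dimensional nonlinear
    system, which is what makes transient analysis expensive for large n.
    """
    t_new = t + h
    x = x_prev.copy()                       # initial Newton guess
    q_prev = q(t, x_prev)
    for _ in range(newton_iters):
        # residual of the implicit Euler discretisation
        res = (q(t_new, x) - q_prev) / h + j(t_new, x) + u(t_new)
        if np.linalg.norm(res) < tol:
            break
        J = C(t_new, x) / h + G(t_new, x)   # Newton matrix
        x = x - np.linalg.solve(J, res)
    return x

# Toy linear test problem q(t, x) = x, j(t, x) = 2x, u = 0, i.e. x' = -2x;
# backward Euler gives x1 = x0 / (1 + 2h).
x1 = backward_euler_step(
    q=lambda t, x: x, j=lambda t, x: 2.0 * x, u=lambda t: np.zeros(1),
    C=lambda t, x: np.eye(1), G=lambda t, x: 2.0 * np.eye(1),
    t=0.0, x_prev=np.array([1.0]), h=0.1,
)
```

On a linear problem the Newton iteration converges in one step; for a nonlinear circuit it would need several iterations per time step, each with a sparse linear solve of dimension n.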

1.3 Guideline to this document

In Chapter 2 we discuss several possibilities to use model reduction for nonlinear circuits. The discussion includes the reduction which can be gained through each method and also the complexity of the different methods. We also decide which model reduction technique seems to be the best for circuit simulation. The compared methods are proper orthogonal decomposition (POD), empirical balanced truncation (EBT), trajectory piecewise linear (TPWL) and Volterra series (VS). We reveal in advance that the method finally chosen is TPWL.

In Chapter 3 we discuss how we can select the optimal linearisation tuples for creating the TPWL model. This is a very important part, because selecting the linearisation tuples in an 'optimal' way has a big influence on the accuracy of the TPWL model.

Then in Chapter 4 we show the influence different weights have on a TPWL model. The weights are used to combine the local linearised systems to create a global linear system which approximates the system around a given trajectory.

The next point we discuss, in Chapter 5, is which linear model reduction technique we can use in a TPWL approach. We see that there are big differences between these methods and that they influence on the one hand the compression of the methods and on the other hand how we have to select the linearisation tuples.

In Chapter 6 we then show how a TPWL method performs in practice. These observations are done for several linearisation tuple controllers, weighting methods and linear model reduction techniques to see their influence on the TPWL model.


Chapter 2

Introduction to model reduction

In this chapter the theory of nonlinear model reduction techniques for dynamical systems is introduced. First we discuss the model reduction idea in a very abstract way (method and problem independent). After that we take a general look at four methods for nonlinear model reduction. We also point out the complexity of the four methods for a typical problem. In the last section we point out which method, in the author's opinion, is the best to use for nonlinear circuit simulation.

2.1 Model reduction idea for nonlinear dynamical systems

Dynamical systems are used in many fields, for example mechanical engineering, simulation of chemical processes and circuit simulation. These systems are described/modelled through ordinary differential equations (ODEs), partial differential equations (PDEs), differential-algebraic equations (DAEs) and other equations. In general these equations can be described through

    f(t, x, ẋ) = 0
    z = h(x)                                                      (2.1)
    x(0) = x_0

with x(t) : T → ℝ^n being the representation of the internal states, t ∈ T = [0, b], b ∈ ℝ_+, where T is the time interval we are interested in, x_0 ∈ ℝ^n the initial value for solving the equations, and z(t) : T → ℝ^q describing the output of the system. f(t, x, ẋ) : T × ℝ^n × ℝ^n → ℝ^n describes the dynamical behaviour of the modelled system, including the input. h(x) : ℝ^n → ℝ^q describes how and which internals are mapped to the output of the system.

One disadvantage of this model description is that many internals are needed to represent the behaviour of the modelled system accurately, so in general n is a large number. The fact that many internals are needed means that solving these equations is very time-consuming. So the idea of model reduction is to find a different model which is less complex but has nearly the same behaviour in the interval T. To describe the model reduction problem we first need some definitions.

Definition 2.1. Complexity of a model. Define a complexity function c : 𝓜 → ℝ_+, with 𝓜 the space of all models (ODE, PDE, DAE, ...). A model M ∈ 𝓜 is more complex than the model N ∈ 𝓜 if c(M) > c(N). An example of a complexity function for M ∈ 𝓜_ODE is the number of states in M.
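Definition 2.1 can be made concrete with a small sketch. The `OdeModel` container and `complexity` function below are hypothetical illustrations of a complexity function for ODE-type models (number of states), not part of the thesis code:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class OdeModel:
    """Hypothetical container for a model of type (2.1)."""
    f: Callable        # residual f(t, x, xdot)
    h: Callable        # output map h(x)
    n_states: int      # dimension n of the state vector x

def complexity(model: OdeModel) -> int:
    """c(M): for an ODE model, simply the number of states (Definition 2.1)."""
    return model.n_states

full = OdeModel(f=lambda t, x, xd: xd + x, h=lambda x: x, n_states=1000)
reduced = OdeModel(f=lambda t, x, xd: xd + x, h=lambda x: x, n_states=10)
assert complexity(full) > complexity(reduced)   # full is the more complex model
```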


Definition 2.2. Distance of two models. A function d : 𝓜 × 𝓜 → ℝ_+ with the properties

    d(M, M) = 0, for all M ∈ 𝓜,
    d(M, N) ≥ 0, for all M, N ∈ 𝓜,
    d(M, N) = d(N, M), for all M, N ∈ 𝓜,

is called a distance function.

Clearly the distance from a model to itself is 0. The distance between two different models is always greater than or equal to 0. With these definitions we can describe two model reduction problems.

Problem 2.3. Given a model M ∈ 𝓜 with complexity c_M = c(M) and a given maximum complexity c_max < c_M for the resulting reduced model, find a reduced model M_r ∈ 𝓜 with complexity c_{M_r} = c(M_r) ≤ c_max < c_M such that the distance d(M, M_r) between the two models is minimised.

If we solve this problem we can ensure that the reduced model does not exceed our maximum complexity, but we have no information about the accuracy of the reduced model. Solving this kind of problem can be used if we want to guarantee a speed-up, but the reduced model can be a really bad approximation of the original system. So in practice this is not a good way to reduce a given system.

Problem 2.4. Given a model M ∈ 𝓜 with complexity c_M = c(M) and a maximum distance d_0 ∈ ℝ_+ of the reduced model to M, find a reduced model M_r ∈ 𝓜 whose distance d(M, M_r) to the original model is smaller than or equal to d_0 and whose complexity c(M_r) is minimal and satisfies c(M_r) ≤ c_M.

By solving this problem we ensure that the accuracy of the reduced model is good enough. However, we have no idea about the complexity of the reduced model; at least we know that it is not bigger than that of the original model. This is an often used technique, because we guarantee that the quality of the reduced model is good enough for the given purpose. But the speed-up we get is maybe not as high as expected.

Two possibilities to reduce a model

For reducing a system we want to describe two possibilities.

1. Reducing the system with respect to the physics of the system.
The advantage of this approach is that we can obtain better results if we have knowledge of the system, and we can find a reduced system which to some extent describes the original system. The disadvantage is that this cannot be used as a 'black-box' algorithm, because we have to use a different technique for each system and it is hard to develop an algorithm to do this. So in practice this is not such a good technique if we want to reduce several systems.

2. Projecting the system onto a smaller subspace.

The advantage of this approach is that we obtain a technique which works problem-independently for reducing several systems. One disadvantage is that the reduced system is not as good as we would expect. This is because we project the system onto an optimal subspace of the states of the system. The optimality comes from the fact that the error which we get through the projection onto the subspace is minimal. The best way to do this is with the help of


all states of the system. But these are an infinite number of points, so in general one can only reduce the system with the help of a finite number of states, which makes the results not as good as in case 1. Also we are not taking into account the physical behaviour of the system. In general the internals of the reduced system do not have a physical meaning anymore, but at the external terminals we still have an approximation of our system.
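The projection idea of possibility 2 can be illustrated with a toy NumPy sketch: states of a hypothetical large system that (almost) lie in a low-dimensional subspace are captured almost exactly by an orthogonal projection onto that subspace. All data here is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# States of a hypothetical system of dimension n = 50 that in fact lie
# (up to small noise) in an r = 3 dimensional subspace.
n, r, m = 50, 3, 200
basis = np.linalg.qr(rng.standard_normal((n, r)))[0]    # orthonormal columns
states = basis @ rng.standard_normal((r, m)) + 1e-6 * rng.standard_normal((n, m))

# Orthogonal projection of every state onto the subspace spanned by `basis`;
# among all representations in that subspace this minimises the error.
projected = basis @ (basis.T @ states)

rel_err = np.linalg.norm(states - projected) / np.linalg.norm(states)
assert rel_err < 1e-4   # the finite set of states is captured almost exactly
```

In practice the subspace is of course not known in advance; finding a good one from a finite number of states is exactly what methods such as POD do.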

2.1.1 Model reduction for differential-algebraic equations (DAE) for circuit simulation

In circuit simulation (2.1) changes to

    d/dt q(t, x) + j(t, x) + u(t) = 0
    z = Dx                                                        (2.2)
    x(0) = x_0

where q : T × ℝ^n → ℝ^n represents the function of contributing dynamics, i.e. charge and flux. The function j : T × ℝ^n → ℝ^n stands for the static information. u : T → ℝ^n stands for the input. x ∈ ℝ^n represents the internal voltages and currents of the simulated circuit. D ∈ ℝ^{q×n} represents the mapping of the internal states to the output. z ∈ ℝ^q represents the output of the circuit. The functions q, j and u can be nonlinear. We also call this system model M. This is the type of system we are concerned with in this report. For this type of system we define the complexity function as follows.

Definition 2.5. The complexity of a model of type (2.2) is defined as n, where n is the dimension of the state x.

Of course we want to find the reduced model of (2.2), which is defined as

    M_r :=  d/dt q_r(t, y) + j_r(t, y) + u_r(t) = 0
            z = D_r y                                             (2.3)
            y(0) = y_0

with y ∈ ℝ^r, q_r, j_r : T × ℝ^r → ℝ^r, u_r : T → ℝ^r, D_r ∈ ℝ^{q×r}. We want the complexity c(M_r) = r of the reduced model to be much smaller than c(M) = n. In general we can say that if the original model M is nonlinear, the reduced model M_r will also behave nonlinearly. The only exceptions are methods which first linearise the model and then use linear model reduction techniques to reduce the linearised model.
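One way the reduced functions in (2.3) can be obtained from (2.2) is by projection with a matrix P with orthonormal columns, as in the projected system listed in the Notations. The sketch below is an illustration of that idea with hypothetical names, not the thesis implementation:

```python
import numpy as np

def project_dae(q, j, u, D, P):
    """Galerkin-type projection of the circuit DAE (2.2) onto range(P).

    P is an n-by-r matrix with orthonormal columns. The reduced model (2.3)
    uses q_r(t, y) = P^T q(t, P y), and analogously for j, u and D.
    A sketch of the projection idea only.
    """
    q_r = lambda t, y: P.T @ q(t, P @ y)
    j_r = lambda t, y: P.T @ j(t, P @ y)
    u_r = lambda t: P.T @ u(t)
    D_r = D @ P
    return q_r, j_r, u_r, D_r

# Toy example with n = 4, r = 2 and linear q and j:
P = np.eye(4)[:, :2]
q_r, j_r, u_r, D_r = project_dae(
    q=lambda t, x: 2.0 * x, j=lambda t, x: x,
    u=lambda t: np.ones(4), D=np.eye(4), P=P,
)
```

Note that evaluating q_r still requires evaluating the full nonlinear q, which is one reason why purely projection-based reduction of nonlinear DAEs gives a limited speed-up.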


2.2 Proper orthogonal decomposition (POD)

As mentioned in Section 2.1 we want to change the model (2.2) to the reduced model (2.3), where it holds that the complexity of the reduced model is smaller than that of the original model (c(M_r) < c(M)) and the distance d(M_r, M) between both models is minimised. One possibility to do this is a POD method. The idea of POD is to project the system onto a proper subspace S of ℝ^n that minimises the distance between the two models. For circuit simulation this technique can be applied to sub-circuits which have several occurrences within a larger circuit.

In order to obtain a basis of such a subspace, a set of data (snapshots) x_i = x(t_i), i = 0, ..., m ≥ r, which fulfil the equations at the timepoints, is needed. This set of snapshots then spans a subspace of ℝ^n. Concerning the number of snapshots m to choose, it does, at first sight, not seem to make sense to choose it greater than c(M) = n, because we can obtain at most n linearly independent vectors. But in order to reduce the system it is a good idea to take snapshots at the timepoints t_i, i = 1, ..., m wherever important changes in the dynamics of the system take place. Therefore m can be bigger than n. Another point we have to consider is that the snapshots are not necessarily linearly independent. Think of a digital circuit which is constant for a long time: it may occur that the dimension of the subspace which is spanned by the snapshots is smaller than the desired order. What we also have to take into account is that the snapshots depend on the initial value x_0 and the input u(t), see (2.2). Therefore the timegrid for taking the snapshots has to be chosen depending on the input as well, because different input functions u(t) can cause big differences in the state function x, which then spans a different subspace. Now we form an orthonormal basis by using a singular value decomposition (SVD) algorithm.
With this method we can compute an orthonormal basis for the dynamical system. Because the SVD orders the basis elements with respect to their importance (see the next subsection), it is easy to reduce the model: we just truncate the least important basis functions, and at the same time we minimise the error between the original and the reduced model. The last step is to project the original model with the help of the new basis (POD basis) onto the subspace spanned by the POD basis. This leads to a system with a complexity r < n which has nearly the same behaviour. Summing up, the following steps have to be taken to reduce a given model.

• Take m snapshots
• Compute an orthonormal basis of the m snapshots
• Sort the new basis elements according to their contribution in representing the main dynamics
• Use only the first r important basis elements for the projection onto the subspace
• Create the reduced order model of dimension r < n
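The snapshot/SVD/truncation steps above can be sketched in a few lines of NumPy. This is a toy illustration of the pipeline, not the thesis code; the snapshot matrix here is synthetic and constructed to have an essentially 2-dimensional span.

```python
import numpy as np

def pod_basis(snapshots, r):
    """Rank-r POD basis from an n-by-m matrix of snapshot columns.

    Orthonormalises the snapshots via an SVD (which orders the directions
    by importance) and keeps only the r leading ones.
    """
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :r], s           # columns of P span the reduced subspace S_r

# Snapshots with, by construction, an (essentially) 2-dimensional span:
rng = np.random.default_rng(1)
X = np.outer(rng.standard_normal(10), rng.standard_normal(30)) \
    + 0.5 * np.outer(rng.standard_normal(10), rng.standard_normal(30))
P, s = pod_basis(X, r=2)
assert np.allclose(P.T @ P, np.eye(2))   # orthonormal POD basis
assert s[2] < 1e-8 * s[0]                # modes beyond r = 2 are negligible
```

The decay of the singular values s indicates how much is lost by truncating to r basis vectors, which is exactly the error-versus-order trade-off of Problems 2.3 and 2.4.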

2.2.1 Singular value decomposition

The next sections describe the idea of the POD method. For more information see [9]; all proofs can be found there. For additional information about POD, especially in computational fluid dynamics, the reader is referred to [28].


Theorem 2.6. Let U and V be finite-dimensional Hilbert spaces with inner products <·,·>_U and <·,·>_V, dim V = m and dim U = n, where n > m. Furthermore let F : V → U be a linear operator. Then there exist real numbers σ_1 ≥ σ_2 ≥ ··· ≥ σ_m ≥ 0 and orthonormal bases {u_i}_{i=1}^{n} of U and {v_i}_{i=1}^{m} of V such that

    F(v_i) = σ_i u_i  and  F*(u_i) = σ_i v_i  for i = 1, ..., m,

where F* : U → V is the Hilbert space adjoint of F, defined by <F(v), u>_U = <v, F*(u)>_V for all u ∈ U and v ∈ V.

Note that in the case U = ℝ^n and V = ℝ^m the next theorem shows the same results if the operator F : V → U is given by a matrix A ∈ ℝ^{n×m}.

Theorem 2.7. Let A ∈ ℝ^{n×m} with rank(A) = s. Then there exist two orthogonal matrices U ∈ ℝ^{n×n} and V ∈ ℝ^{m×m} and a block diagonal matrix

    Σ = [ Σ_s  0 ]
        [ 0    0 ]  ∈ ℝ^{n×m},  with Σ_s = diag(σ_1, σ_2, ..., σ_s) ∈ ℝ^{s×s}  and  σ_1 ≥ σ_2 ≥ ··· ≥ σ_s > 0,

such that A has the decomposition

    A = U Σ Vᵀ.

This decomposition is called the singular value decomposition (SVD) of A. The positive numbers σ_i are called the singular values of A. If we write U = (u_1, u_2, ..., u_n) and V = (v_1, v_2, ..., v_m), then we call u_i ∈ ℝ^n the left singular vectors and v_i ∈ ℝ^m the right singular vectors.

Remark 2.8. For an SVD it holds that

• A v_i = σ_i u_i for i = 1, ..., s and A v_i = 0 for i = s+1, ..., m; likewise Aᵀ u_i = σ_i v_i for i = 1, ..., s and Aᵀ u_i = 0 for i = s+1, ..., n.

• {u1 , . . . , us } is an orthonormal basis of the image of A. {us+1 , . . . , un } is an orthonormal basis of the kernel of A . • {v1 , . . . , vs } is an orthonormal basis of the image of A .{vs+1 , . . . , vm } is an orthonormal basis of the kernel of A.

2.2.2 Idea of POD The POD method can be formulated as a constrained minimisation problem. We consider X to be a real Hilbert space with inner product <·,·>_X and induced norm ||·||_X. The set {x_1, x_2, ..., x_m} ⊂ X contains the snapshots x(t_i) = x_i (i = 1, ..., m) of the DAE, including the initial condition of the system (2.2). In our case X = ℝⁿ. We assume that dim X = n ≥ m and that the snapshots x_1, x_2, ..., x_m are linearly independent. If this is not satisfied, we take more snapshots so that the x_i span a subspace of the desired dimension. We define

S := span{x_1, x_2, ..., x_m} ⊂ X.   (2.4)

Note that in practice as many snapshots as necessary should be taken to capture the dynamical behaviour of the system properly. As we will see later on, the linear independence is not compulsory for the further calculations.


We can express every snapshot in an orthonormal basis {ϕ_i}_{i=1}^m of S:

x_j = Σ_{i=1}^m <x_j, ϕ_i>_X ϕ_i   (j = 1, ..., m).   (2.5)

If we define P_m := (ϕ_1, ϕ_2, ..., ϕ_m) we can write (2.5) as

x_j = P_m P_mᵀ x_j   (j = 1, ..., m).

The idea is to find a basis {ϕ_i}_{i=1}^r (r ≤ m) that gives a very close approximation x̂_j of x_j for j = 1, ..., m,

x̂_j = Σ_{i=1}^r <x_j, ϕ_i>_X ϕ_i,

where r is the order of the reduced system and r ≤ m ≤ n. Analogously we can write x̂_j = P Pᵀ x_j with P = (ϕ_1, ..., ϕ_r). The problem of finding an optimal basis can be expressed as the following minimisation problem, where the mean square error between the snapshots and their corresponding r-th partial sums has to be minimal:

(P_r)   min J(ϕ_1, ..., ϕ_r) = min Σ_{j=1}^m ||x_j − Σ_{i=1}^r <x_j, ϕ_i>_X ϕ_i||²_X   (2.6)
        subject to <ϕ_i, ϕ_j> = δ_ij for 1 ≤ i, j ≤ r.

The minimising basis {ϕ_i}_{i=1}^r of (P_r) is called the POD basis of rank r, and the corresponding space is Sr = span{ϕ_1, ..., ϕ_r}. By construction there is no subspace of S with dimension r that contains more information than Sr. In order to solve (P_r) we apply Theorem 2.7 with V = ℝᵐ. For that we need the linear mapping Y ∈ L(V, S) with Y(e_i) = x_i (i = 1, ..., m), where L(V, S) denotes the set of all bounded linear operators from V into S; L(V, S) is a normed linear space. The set {e_i}_{i=1}^m is the canonical basis of ℝᵐ. A vector v ∈ ℝᵐ mapped by Y results in

Y(v) = Y( Σ_{i=1}^m <v, e_i>_{ℝᵐ} e_i ) = Σ_{i=1}^m <v, e_i>_{ℝᵐ} x_i   for v ∈ ℝᵐ.

In our case the linear mapping Y is just multiplication by the matrix of snapshot columns,

Y(v) = Y v = (x_1 ··· x_m) v.

The definition of the adjoint Y* ∈ L(S, ℝᵐ) leads to

Y*(ϕ) = Σ_{i=1}^m <ϕ, x_i>_X e_i   for ϕ ∈ S.

Furthermore

Y*Y = K ∈ ℝ^{m×m}   where   K_ij = <x_i, x_j>_X.

The matrix K is called the correlation matrix.


2.2.3 POD and SVD

Solving the minimisation problem The basis minimising (P_r) can be found by the singular value decomposition of Y, as the following theorem proves. See also [9].

Theorem 2.9. Let σ_1 ≥ σ_2 ≥ ... ≥ σ_m ≥ 0 be the singular values of Y. The sets {v_i}_{i=1}^m and {ϕ_i}_{i=1}^n form orthonormal bases of ℝᵐ and ℝⁿ, respectively, and satisfy

Y(v_i) = σ_i ϕ_i   (i = 1, ..., m)

as required in Theorem 2.6. In other words Y = U Σ Vᵀ, where the matrices U ∈ ℝ^{n×n} and V ∈ ℝ^{m×m} consist of the orthonormal basis vectors of ℝⁿ and ℝᵐ respectively, and Σ = diag(σ_1, ..., σ_m) ∈ ℝ^{n×m}. If we define Y^r ∈ L(ℝᵐ, S) by

Y^r = Y on span{v_1, ..., v_r},   Y^r = 0 otherwise,

then problem (P_r) is solved by Y^r and the minimum is attained for ϕ_i, i = 1, ..., r.

Proof. We define a norm |||·||| on L(ℝᵐ, S) by

|||Y||| = ||Σ||_F = ( Σ_{i=1}^m σ_i² )^{1/2}   for Y ∈ L(ℝᵐ, S),

where Σ = diag(σ_1, ..., σ_m) ∈ ℝ^{n×m} and ||·||_F denotes the Frobenius norm. In particular

|||Y^r||| = ( Σ_{i=1}^r σ_i² )^{1/2}.

Using

x_j = Σ_{i=1}^m <x_j, ϕ_i>_X ϕ_i   (j = 1, ..., m)

we find

|||Y − Y^r|||² = Σ_{i=r+1}^m σ_i²
             = Σ_{i=r+1}^m <Y Y* ϕ_i, ϕ_i>_X
             = Σ_{j=1}^m || Σ_{i=r+1}^m <x_j, ϕ_i>_X ϕ_i ||²
             = J(ϕ_1, ..., ϕ_r).

What we have learned is that the POD basis {ϕ_i}_{i=1}^r of rank r can be computed by the singular value decomposition of Y. In the case m < n = dim X the POD basis can be determined as follows.


1. Solve the eigenvalue problem

K v_i = λ_i v_i   for i = 1, ..., m

to determine the positive eigenvalues {λ_i}_{i=1}^r and eigenvectors {v_i}_{i=1}^r.

2. Set ϕ_i = (1/√λ_i) Y(v_i) for i = 1, ..., r.

Note that the square roots of the eigenvalues are just the singular values of Y, √λ_i = σ_i.

Choice of the order r of the reduced model In [4] the concept of the so-called information content is used in order to determine the smallest possible r. The information content is defined as

I(r) = ( Σ_{i=1}^r σ_i² ) / ( Σ_{i=1}^m σ_i² ).

If the system is supposed to contain the percentage p of the total information, the reduced order r is determined by

r = min{ r : I(r) ≥ p/100 }.

This, however, only measures the information content of the snapshot data. Very little can be found in the literature about the optimality of the resulting solution x. What seems clear from experiments is that p must be bigger than 99 to obtain a good approximation of the original system.
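The two-step computation above, together with the information-content criterion, can be sketched as follows. A random snapshot matrix stands in for real simulation data, and the threshold p = 99.9 is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 100, 12                            # dim X = n >= m snapshots (illustrative)
X = rng.standard_normal((n, m))           # Y = (x_1 ... x_m) as a matrix

K = X.T @ X                               # correlation matrix, K_ij = <x_i, x_j>

# 1. Solve the eigenvalue problem K v_i = lambda_i v_i, sorted descending.
lam, V = np.linalg.eigh(K)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]
sigma = np.sqrt(np.maximum(lam, 0.0))     # singular values of Y

# Choose r from the information content I(r) with p = 99.9.
I = np.cumsum(sigma ** 2) / np.sum(sigma ** 2)
r = int(np.searchsorted(I, 99.9 / 100.0) + 1)

# 2. Set phi_i = Y(v_i) / sqrt(lambda_i) for the r kept modes.
Phi = X @ V[:, :r] / sigma[:r]

assert np.allclose(Phi.T @ Phi, np.eye(r), atol=1e-8)
```

Working with the m × m matrix K instead of the n × m snapshot matrix is exactly the point of the m < n case: the eigenvalue problem is solved in the small dimension.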

2.2.4 Galerkin projection

Projection of the system What we have reached so far is an orthonormal basis {ϕ_i}_{i=1}^r of a subspace Sr ⊂ S that minimises the error of the projection. The next step is the projection of the dynamical system

d/dt q(t, x) + j(t, x) + u(t) = 0
z = D x

onto this subspace by the Galerkin projection. Each snapshot x_j can be expressed as its projection x̂_j onto the space Sr plus a part r_j in the orthogonal complement Sr⊥ of Sr,

x_j = x̂_j + r_j,   x̂_j ∈ Sr, r_j ∈ Sr⊥   (j = 1, ..., m).

The POD basis is optimal in minimising the r_j. Therefore we now use only the projection of x on the subspace,

x̂(t) := Σ_{i=1}^r ϕ_i y_i(t) = P y(t),   P := (ϕ_1, ..., ϕ_r),

where y ∈ ℝʳ is the actual state in Sr. The reduced model of order r can be expressed by

d/dt Pᵀ q(t, P y) + Pᵀ j(t, P y) + Pᵀ u(t) = 0
z = D P y.   (2.7)

The difference between the original and the reduced system is that the states of the reduced system are now elements of a smaller subspace. Because we have constructed the subspace such that the approximation error is minimised, we can expect a good approximation of our original system. But there are some drawbacks.

• We construct our subspace from the solutions of a 'training' system, so if we use the reduced system to simulate our circuit, we must choose initial values and inputs similar to those of the 'training' system, to guarantee that the solution stays near the 'training' subspace.
• In each Newton iteration, which is needed in every timestep, we only have to solve a linear system of dimension r ≪ n. As long as solving the linear systems in the Newton process is the most expensive part, we can get a big speed up. But if evaluating q and j is the most expensive part, we will not get a big speed up, because we still have to evaluate both of them at each step.

Initial conditions The initial conditions of the reduced system are computed by applying the Galerkin projection a second time, in the same way as in 2.2.4. So our initial condition in the subspace is given as y(0) = Pᵀ x(0). For this section see also [5] and [7].

Remark 2.10. In [27] an error bound and a convergence analysis for the POD method for ODEs are proposed. The problem with this error bound is, first, that it only holds for ODEs (though it may be possible to adapt it to DAEs) and, second, that it is hard to compute, because we need Lipschitz constants and upper bounds of our functions. So in practice the bound is not useful.
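To illustrate how the reduced model (2.7) is evaluated inside an implicit time integrator, here is a minimal sketch. The functions q, j, u and the basis P below are toy stand-ins (not a real netlist), and a backward-Euler difference replaces the exact time derivative.

```python
import numpy as np

n, r = 8, 3
rng = np.random.default_rng(2)
P, _ = np.linalg.qr(rng.standard_normal((n, r)))   # stand-in POD basis, orthonormal columns

# Toy stand-ins for the circuit functions (a real simulator supplies these).
def q(t, x): return 0.5 * x ** 2
def j(t, x): return np.tanh(x)
def u(t):    return np.ones(n)

# Residual of the reduced model (2.7) with d/dt q replaced by a backward-Euler
# difference over a step h:  P^T [(q(t, Py) - q(t-h, Py_old))/h + j(t, Py) + u(t)] = 0
def reduced_residual(t, y, y_old, h):
    x, x_old = P @ y, P @ y_old
    return P.T @ ((q(t, x) - q(t - h, x_old)) / h + j(t, x) + u(t))

res = reduced_residual(0.1, np.zeros(r), np.zeros(r), 1e-3)
assert res.shape == (r,)
```

Note that q and j are still evaluated in the full dimension n before the projection, which is exactly the drawback mentioned in the second bullet above.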

2.3 Empirical balanced truncation (EBT) In this section the EBT method is described; for more information see [11]. Balanced truncation is a well known technique to reduce the order of linear input-output systems. We describe the method for nonlinear ordinary differential equations (ODEs), because the theory was first developed for this type of system. So instead of (2.2) we are now dealing with

ẋ = f(x, u(t))
z = h(x)   (2.8)

with x ∈ ℝⁿ the state of the system, u(t) ∈ ℝᵏ a given input and z ∈ ℝ^q the output. Our goal is now to find a reduced system

ẏ = f_r(y, u(t))
z = h_r(y)


where y ∈ ℝʳ, r < n, such that the input-output behaviour of the two systems is similar for a specific class of input signals. To do this we develop a technique which explicitly takes into account the input/output map of the system: we try to find out how a past input is mapped to a future output, i.e. which parts of the state are important for the input-output behaviour. Once we know this relation, we simply truncate the unimportant parts. This idea is similar to POD, but there we only try to find out which internal states are important. We first have to look at linear systems, because we need some definitions from the linear case in order to adapt them to the nonlinear case.

2.3.1 Linear systems In this section we introduce some definitions which we then adapt to nonlinear systems in 2.3.2. The definitions are needed to understand how balanced realisation and balanced truncation work. For a linear system, (2.8) changes to

ẋ = A x + B u(t)
z = D x   (2.9)
x(0) = x0

where A ∈ ℝ^{n×n}, B ∈ ℝ^{n×k} and D ∈ ℝ^{q×n}. We can calculate x as

x(t) = e^{At} x0 + ∫₀ᵗ e^{A(t−s)} B u(s) ds.   (2.10)

This linear system is called stable if for all eigenvalues λ_i of A it holds that Re(λ_i) < 0. Suppose that (2.9) is stable; then for u ∈ L2(−∞, 0] the initial value x0 is given by

x0 = ∫₀^∞ e^{As} B u(−s) ds.

This defines the controllability operator C : L2[0, ∞) → ℝⁿ by x0 = C u(−t). This leads us to the following lemma.

Lemma 2.11. Write Y := C C*. Then Y is the smallest positive semidefinite solution of the Lyapunov equation

A Y + Y Aᵀ + B Bᵀ = 0   (2.11)

and

xᵀ(0) Y⁻¹ x(0) = min_u ||u(t)||²_{L2(−∞,0]}   for ||x(0)||₂² = 1,

where the minimum is taken over all inputs u steering the system to x(0).

The system is called controllable if Im(C) = ℝⁿ, in which case Y is positive definite and (2.11) has a unique solution. The matrix Y is called the controllability gramian. We can give a similar definition for the output. Define the future output as z+ ∈ L2[0, ∞). Then we define the observability operator O : ℝⁿ → L2[0, ∞) by z+ = O x0, and hence O x0 = D e^{At} x0.

Lemma 2.12. Write X := O* O. Then X is the smallest positive semidefinite solution of the Lyapunov equation

Aᵀ X + X A + Dᵀ D = 0   (2.12)

and

xᵀ(0) X x(0) = ||z||²_{L2[0,∞)}   for ||x(0)||₂² = 1.

The system is called observable if ker O = {0}, in which case X is positive definite and (2.12) has a unique solution. The matrix X is called the observability gramian.


Both X and Y are n × n matrices, and are also given by the following integral formulae:

Y = ∫₀^∞ e^{At} B Bᵀ e^{Aᵀt} dt   (2.13)

X = ∫₀^∞ e^{Aᵀt} Dᵀ D e^{At} dt.   (2.14)
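For a concrete linear example, both gramians can be computed from the Lyapunov equations (2.11) and (2.12) and checked against the integral formula (2.13). The system matrices below are arbitrary illustrative values.

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

# Small stable test system; the matrix values are arbitrary illustrations.
A = np.array([[-1.0, 0.5],
              [ 0.0, -2.0]])
B = np.array([[1.0],
              [1.0]])
D = np.array([[1.0, 0.0]])

# (2.11): A Y + Y A^T + B B^T = 0   (solve_continuous_lyapunov solves A X + X A^H = Q)
Y = solve_continuous_lyapunov(A, -B @ B.T)
# (2.12): A^T X + X A + D^T D = 0
X = solve_continuous_lyapunov(A.T, -D.T @ D)

# Cross-check Y against the integral formula (2.13) with the trapezoidal rule.
ts, dt = np.linspace(0.0, 40.0, 4001, retstep=True)
f = np.array([expm(A * t) @ B @ B.T @ expm(A.T * t) for t in ts])
Y_quad = f.sum(axis=0) * dt - 0.5 * dt * (f[0] + f[-1])
assert np.allclose(Y, Y_quad, atol=1e-3)
```

The integral is truncated at t = 40, which is harmless here since e^{At} has decayed to machine-negligible size long before that.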

2.3.2 Gramians and principal component analysis The method of principal component analysis relies on the use of simulation/measurement data of the original system to construct the correlation matrix.¹ This data is collected from 'typical' system trajectories. In this analysis it is implicitly assumed that the trajectories from which the data are sampled depend on x0, the initial state of the system. Since we are now considering an input-output system, we can instead assume that the initial state of the system is zero and parameterise the trajectories with respect to the system input u(t). In theory we are not restricted to samples, so we can simply construct the correlation matrix using the integral

K = ∫₀^∞ (x(t) − x̄(t))(x(t) − x̄(t))ᵀ dt

where x(t) is the state of the system at time t and x̄ := lim_{T→∞} (1/T) ∫₀^T x(t) dt is the mean state. To formulate our method we need some definitions:

• 𝒯ⁿ is a set of l orthogonal n × n matrices, {T_1, ..., T_l}
• ℳ is a set of s positive constants, {c_1, ..., c_s}
• ℰⁿ is the set of all standard unit vectors of ℝⁿ, {e_1, ..., e_n}
• Given a function u ∈ L1, we define the mean ū in the same way as the mean of x.

We also make the assumption that x ∈ L1 and z ∈ L2. For the given conditions these assumptions are satisfied for exponentially stable systems.

Definition 2.13. For (2.8) we define the empirical controllability gramian Ŷ as

Ŷ := Σ_{i=1}^l Σ_{j=1}^s Σ_{k=1}^n 1/(l s c_j²) ∫₀^∞ Υ^{ijk}(t) dt   (2.15)

where Υ^{ijk}(t) ∈ ℝ^{n×n} is given by

Υ^{ijk}(t) := (x^{ijk}(t) − x̄^{ijk})(x^{ijk}(t) − x̄^{ijk})ᵀ

and x^{ijk}(t) is the state of (2.8) corresponding to the initial value x0 = 0 and the impulse input u(t) = c_j T_i e_k δ(t), with δ(t) = 1 for t = 0 and δ(t) = 0 for t ≠ 0. (Strictly, for a system with k inputs, the matrices T_i and unit vectors e_k used in these impulse inputs must have the input dimension.)

¹ In POD a similar idea is used, see 2.2.2.
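The following sketch evaluates (2.15) numerically for a single-input linear system, where the 'orthogonal matrices' reduce to ±1 and there is a single unit vector. Anticipating the lemma below, the result should reproduce the exact controllability gramian. Two simplifications are made: the mean x̄ is omitted (it vanishes for the decaying impulse response of a stable system), and the time integral is truncated; all values are illustrative.

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

# Single-input stable system (illustrative values); then T_i ∈ {+1, -1} and e_k = 1,
# so the impulse response to u(t) = c T δ(t), x0 = 0, is x(t) = c T e^{At} B.
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])

Ts = [1.0, -1.0]     # "orthogonal matrices" in input dimension 1
cs = [0.5, 2.0]      # positive input scalings
l, s = len(Ts), len(cs)

ts, dt = np.linspace(0.0, 40.0, 4001, retstep=True)
Yhat = np.zeros((2, 2))
for T in Ts:
    for c in cs:
        xs = np.array([c * T * (expm(A * t) @ B).ravel() for t in ts])
        Ups = np.einsum('ti,tj->tij', xs, xs)            # Υ(t) = x(t) x(t)^T
        integral = Ups.sum(axis=0) * dt - 0.5 * dt * (Ups[0] + Ups[-1])
        Yhat += integral / (l * s * c ** 2)

# For a linear system the empirical gramian equals the exact one.
Y = solve_continuous_lyapunov(A, -B @ B.T)
assert np.allclose(Yhat, Y, atol=1e-3)
```

In a real application the trajectories x^{ijk}(t) would come from transient analyses of the nonlinear circuit, and the integral would become a finite sum over the stored snapshots.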


Lemma 2.14. For any sets 𝒯ⁿ and ℳ, the empirical controllability gramian Ŷ of a stable linear system of type (2.9) is equal to the usual controllability gramian Y.

Proof. For the linear system, x(t) is given by (2.10). In our case u(t) = c_j T_i e_k δ(t), so x^{ijk}(t) = e^{At} B c_j T_i e_k. Hence

Υ^{ijk}(t) = c_j² (e^{At} B T_i e_k)(e^{At} B T_i e_k)ᵀ = c_j² e^{At} B T_i e_k e_kᵀ T_iᵀ Bᵀ e^{Aᵀt},

hence

Ŷ = Σ_{i=1}^l Σ_{j=1}^s Σ_{k=1}^n 1/(l s c_j²) ∫₀^∞ c_j² e^{At} B T_i e_k e_kᵀ T_iᵀ Bᵀ e^{Aᵀt} dt
  = (1/s) Σ_{j=1}^s ∫₀^∞ e^{At} B Bᵀ e^{Aᵀt} dt
  = ∫₀^∞ e^{At} B Bᵀ e^{Aᵀt} dt = Y,

where we used Σ_k T_i e_k e_kᵀ T_iᵀ = T_i T_iᵀ = I.

The empirical controllability gramian (2.15) is a computable generalisation of the controllability gramian (2.13) for linear systems. It has the property that the eigenvectors of Ŷ corresponding to the nonzero eigenvalues span a subspace S ⊂ ℝⁿ which contains the set of states that are reachable using the chosen impulse inputs. So the idea of EBT, similar to POD (Section 2.2), is to truncate those states corresponding to small eigenvalues of Ŷ. We do this through a Galerkin projection, see 2.2.4, onto the subspace spanned by the eigenvectors of the largest eigenvalues of Ŷ. But because we are dealing with input-output systems, it is not enough to study the input behaviour alone. The next definition is analogous to the previous one, but related to the output behaviour.

Definition 2.15. For (2.8) we define the empirical observability gramian X̂ as

X̂ := Σ_{i=1}^l Σ_{j=1}^s 1/(l s c_j²) ∫₀^∞ T_i Ψ^{ij}(t) T_iᵀ dt   (2.16)

where Ψ^{ij}(t) ∈ ℝ^{n×n} is given entrywise by

Ψ^{ij}_{km}(t) := (z^{ijk}(t) − z̄^{ijk})ᵀ (z^{ijm}(t) − z̄^{ijm})

and z^{ijk}(t) is the output of (2.8) corresponding to the initial value x0 = c_j T_i e_k and zero input u ≡ 0.

Lemma 2.16. For any sets 𝒯ⁿ and ℳ, the empirical observability gramian X̂ of a stable linear system of type (2.9) is equal to the usual observability gramian X.


Gramian   x0            u(t)
Ŷ         0             c_j T_i e_k δ(t)
X̂         c_j T_i e_k   0

Table 2.1: Computing empirical gramians

Proof. For the linear system x(t) is given as

x(t) = e^{At} x0 + ∫₀ᵗ e^{A(t−s)} B u(s) ds.

In our case u(t) = 0 for all t and x0 = c_j T_i e_k, so z^{ijk}(t) = D e^{At} c_j T_i e_k. Hence

Ψ^{ij}_{km}(t) = c_j² (D e^{At} T_i e_k)ᵀ (D e^{At} T_i e_m) = c_j² e_kᵀ T_iᵀ e^{Aᵀt} Dᵀ D e^{At} T_i e_m,

hence

Ψ^{ij}(t) = c_j² T_iᵀ e^{Aᵀt} Dᵀ D e^{At} T_i

and

X̂ = Σ_{i=1}^l Σ_{j=1}^s 1/(l s c_j²) ∫₀^∞ c_j² T_i T_iᵀ e^{Aᵀt} Dᵀ D e^{At} T_i T_iᵀ dt
  = (1/s) Σ_{j=1}^s ∫₀^∞ e^{Aᵀt} Dᵀ D e^{At} dt = X.

Now we have the tools we need for the empirical analysis of the input-output behaviour of nonlinear systems. Rather than searching for the exact controllability and observability submanifolds within the state space, our approach is to search for subspaces which approximate these manifolds. Once X̂ and Ŷ have been calculated, the remaining computations are just a linear eigenvalue problem. We now have two important subspaces of the state space, and their corresponding eigenvalues. We can proceed in the same way as for linear systems and use the idea of balanced truncation theory [10] to decide on which subspace to project. Therefore we return to a description of linear system theory.

Balanced realisation Corresponding to a linear system there is an associated Hankel operator H which maps past inputs to future outputs, see [29]. The singular values of H are the square roots of the eigenvalues of H*H, which are the same as those of the n × n matrix XY. The balanced realisation shows a way to find out which particular state corresponds to which Hankel singular value. We may change state coordinates via a nonsingular linear transformation T ∈ ℝ^{n×n} without affecting the input-output behaviour, in which case the Lyapunov equations (2.11) and (2.12) imply that the gramians transform according to

Y → T Y Tᵀ,   X → (T⁻¹)ᵀ X T⁻¹.


For linear systems the choice of T does not affect the Hankel singular values.

Definition 2.17. A realisation (A, B, D) is called balanced if the controllability gramian Y and the observability gramian X are equal and diagonal.²

The Hankel singular values correspond to states through which the input is transmitted to the output, and the balanced realisation gives a way to find out which particular state corresponds to which Hankel singular value. The Hankel singular values σ_i indicate the importance of the corresponding state in the balanced realisation: they reflect the transfer of energy from past inputs to future outputs. This leads to a method of model reduction known as balanced truncation, introduced by Moore [10] in the context of realisation theory. The procedure is to truncate those states of the balanced realisation corresponding to small Hankel singular values σ_i, because these states are hard to observe and hard to control; states with small Hankel singular values are 'unimportant'. If the states are ordered according to decreasing singular value, this is equivalent to applying a Galerkin projection with P = (I 0)ᵀ to the balanced realisation.
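As a small illustration of the quantities involved, the Hankel singular values of a toy linear system can be computed directly from the two gramians (the matrix values below are arbitrary).

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Toy system (arbitrary values): Hankel singular values = sqrt(eig(X Y)).
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
D = np.array([[1.0, 0.0]])

Y = solve_continuous_lyapunov(A, -B @ B.T)     # controllability gramian
X = solve_continuous_lyapunov(A.T, -D.T @ D)   # observability gramian

ev = np.linalg.eigvals(X @ Y).real             # eigenvalues of XY are real and >= 0
hankel_sv = np.sort(np.sqrt(np.maximum(ev, 0.0)))[::-1]
# states belonging to small Hankel singular values are hard to reach AND hard to observe
```

A truncation order can then be chosen by cutting the sorted list wherever the values drop sharply.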

2.3.3 Construction of EBT The empirical gramians give a quantitative method for deciding upon the importance of particular subspaces of the state space, with respect to typical inputs and outputs of the system. So the idea is to handle the nonlinear case in the same way as the linear one. This means that we have to find a linear change of coordinates such that the empirical gramians are balanced, and then to perform a Galerkin projection onto the subspace spanned by the eigenvectors of the largest eigenvalues of X̂ Ŷ. Since the empirical gramians of a linear system are the same as the usual gramians, this method is exactly balanced truncation when applied to linear systems. Let T be the change of coordinates such that the system is balanced,

T Ŷ Tᵀ = (Tᵀ)⁻¹ X̂ T⁻¹ = Σ,

and let P = (I 0)ᵀ be the n × r projection matrix. To balance the system we 'simply' have to include T in our system:

ẋ = T f(T⁻¹ x, u(t))
z = h(T⁻¹ x).   (2.17)

If we now apply the Galerkin projection to truncate the unimportant states of (2.17), we get

ẏ = Pᵀ T f(T⁻¹ P y, u(t))
z = h(T⁻¹ P y),

which is our final reduced system.

² One possibility how to balance a system is described in 2.3.4.


2.3.4 Computing the reduced model by EBT In this section we describe which computations have to be done to use this method. Because we want to use it in circuit simulation, we describe how to adapt it to

d/dt q(t, x) + j(t, x) + u(t) = 0
z = D x.

The first thing we need is the data to construct the gramians, for which we apply (2.15) and (2.16). Because we now have discrete snapshots, obtained for example from a numerical integration scheme, all integrals change to finite sums. We also have to define 𝒯ⁿ and ℳ, see 2.3.2. A reasonable choice is 𝒯ⁿ = {I, −I}, since this leads to both negative and positive inputs and initial values; of course we can choose bigger sets 𝒯ⁿ, but this increases the computational time. ℳ describes the specific sizes of the inputs and initial values we are interested in. The choice is motivated by the inputs and states which are expected or were seen in earlier experiments, with the goal that during the computation of the empirical gramians the dynamics stay in a region of the state space close to the one in which the closed loop system operates. An important point is that it is not necessary to use the same ℳ for the controllability and the observability experiments. After we have chosen the sets we have to run transient analyses to compute the data: one transient analysis of the original system with the inputs u(t) as described in (2.15) to calculate Ŷ, and another transient with the initial values x0 as in (2.16) to compute X̂. After that we have to compute T to balance our system. A simple algorithm is described in the following steps.

1. Apply a Cholesky factorisation to Ŷ such that Ŷ = Z Zᵀ, with Z a lower triangular matrix with nonnegative diagonal entries.
2. Compute a singular value decomposition Zᵀ X̂ Z = U Σ² Uᵀ, so that X̂ = (Z⁻¹)ᵀ U Σ² Uᵀ Z⁻¹.
3. Define T as Σ^{1/2} Uᵀ Z⁻¹. Then it follows that

T Ŷ Tᵀ = (Σ^{1/2} Uᵀ Z⁻¹)(Z Zᵀ)((Z⁻¹)ᵀ U Σ^{1/2}) = Σ,
(T⁻¹)ᵀ X̂ T⁻¹ = (Σ^{−1/2} Uᵀ Zᵀ)((Z⁻¹)ᵀ U Σ² Uᵀ Z⁻¹)(Z U Σ^{−1/2}) = Σ,

as desired. We can now change the state coordinates of the nonlinear system and truncate it using the Galerkin projection. Our reduced system is then

d/dt Pᵀ T q(t, T⁻¹ P y) + Pᵀ T j(t, T⁻¹ P y) + Pᵀ T u(t) = 0
z = D T⁻¹ P y.

As for POD, we now have a reduced system, but we have only reduced the state, not the functions which describe the dynamical behaviour of the system. So if the evaluation of q and j is the most expensive part of the transient analysis, we will not get the speed up we expect. Another point is that our reduced system is only as good as the 'training' transient analyses we did to compute the empirical gramians. If we have different inputs or initial values, we can encounter the problem that the real solution lies in a different subspace, so that our approximation cannot be used.
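Steps 1-3 can be sketched as follows; the two symmetric positive definite matrices stand in for the empirical gramians (random illustrative data).

```python
import numpy as np
from scipy.linalg import cholesky, svd

rng = np.random.default_rng(3)
n = 5
# Two symmetric positive definite stand-ins for the empirical gramians.
M = rng.standard_normal((n, n)); Yh = M @ M.T + n * np.eye(n)
M = rng.standard_normal((n, n)); Xh = M @ M.T + n * np.eye(n)

# 1. Cholesky factorisation Yh = Z Z^T, Z lower triangular.
Z = cholesky(Yh, lower=True)
# 2. SVD of Z^T Xh Z = U Sigma^2 U^T (symmetric positive definite, so U = V).
U, s2, _ = svd(Z.T @ Xh @ Z)
Sigma = np.diag(np.sqrt(s2))
# 3. T := Sigma^{1/2} U^T Z^{-1}.
T = np.diag(s2 ** 0.25) @ U.T @ np.linalg.inv(Z)

Tinv = np.linalg.inv(T)
# The transformed gramians are balanced: both equal Sigma.
assert np.allclose(T @ Yh @ T.T, Sigma, atol=1e-6)
assert np.allclose(Tinv.T @ Xh @ Tinv, Sigma, atol=1e-6)
```

The two assertions verify exactly the identities derived in step 3 above.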


2.4 Trajectory Piecewise-Linear (TPWL) In this section we discuss the TPWL method; for more information see [2]. In the TPWL method the nonlinear model (2.2) is transformed to a reduced linear model. We first linearise the original system at certain points and then use well-known linear model reduction techniques to reduce the linearised systems. Because the TPWL method is designed for input-output systems, we are not dealing with (2.2) but with

d/dt q(t, x) + j(t, x) + B u(t) = 0
z = D x   (2.18)
x(0) = x0

where q and j describe the internals of the circuit (this means both are independent of the inputs), B ∈ ℝ^{n×k} and u(t) : ℝ → ℝᵏ describe the input of the circuit, and z ∈ ℝ^q is the output of the system. The difference between (2.2) and (2.18) is that in (2.2) u also includes the distribution of the input to the nodes, while in (2.18) B describes the distribution of u to the nodes.

2.4.1 Linearisation of the system To linearise the whole system around a given tuple (x(t_i), t_i) we use a Taylor expansion of (2.18) around this tuple. The first order Taylor expansions of q and j are

q(t, x) ≈ q(t_i, x_i) + C(t_i, x_i)(x − x_i) + ∂q/∂t(t_i, x_i)(t − t_i)
j(t, x) ≈ j(t_i, x_i) + G(t_i, x_i)(x − x_i) + ∂j/∂t(t_i, x_i)(t − t_i)

where C(t, x) is the Jacobian of q(t, x) with respect to x and G(t, x) is the Jacobian of j(t, x) with respect to x. With the definitions C_i := C(t_i, x_i), G_i := G(t_i, x_i), j_i := j(t_i, x_i), q_i := q(t_i, x_i), j'_i := ∂j/∂t(t_i, x_i) and q'_i := ∂q/∂t(t_i, x_i), our linearised system with frozen coefficients is given as

d/dt (q_i + C_i(x − x_i) + q'_i(t − t_i)) + j_i + G_i(x − x_i) + j'_i(t − t_i) + B u(t) = 0
z = D x.

Now we can evaluate the derivative and get

C_i ẋ + G_i x + j'_i t + k_i + B u(t) = 0
z = D x

with

k_i := q'_i + j_i − G_i x_i − j'_i t_i.

We can also treat the terms k_i and j'_i t as additional inputs, so the input function changes to

B̃_i ũ(t) = [B | k_i | j'_i] (u(t), 1, t)ᵀ.



[Figure 2.1: Linearisation points and their accuracy regions]

Hence, the system we finally obtain and deal with in this section is

C_i ẋ + G_i x + B̃_i ũ(t) = 0
z = D x.   (2.19)

As long as ||x − x_i|| and |t − t_i| are small enough, the accuracy of the linearised model is good.
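A minimal sketch of building one linearised model (2.19): the functions q and j are toy stand-ins, and the Jacobians C_i and G_i are approximated by finite differences, whereas a real simulator would assemble them analytically from the device models.

```python
import numpy as np

# Toy stand-ins for q(t, x) and j(t, x); a real simulator assembles these from the netlist.
def q(t, x):
    return np.array([x[0] + 0.1 * x[0] ** 3, x[1]])

def j(t, x):
    return np.array([np.tanh(x[0] - x[1]), x[1] ** 2])

def jacobian(f, t, x, eps=1e-6):
    # forward-difference Jacobian of f(t, .) at x
    J = np.zeros((len(x), len(x)))
    f0 = f(t, x)
    for col in range(len(x)):
        xp = x.copy()
        xp[col] += eps
        J[:, col] = (f(t, xp) - f0) / eps
    return J

xi, ti = np.array([0.3, -0.2]), 0.0       # linearisation tuple (x_i, t_i)
Ci = jacobian(q, ti, xi)                  # C_i = dq/dx at (t_i, x_i)
Gi = jacobian(j, ti, xi)                  # G_i = dj/dx at (t_i, x_i)
# q and j here do not depend on t, so q'_i = j'_i = 0 and k_i reduces to:
ki = j(ti, xi) - Gi @ xi                  # k_i = q'_i + j_i - G_i x_i - j'_i t_i

assert np.allclose(Ci, np.diag([1 + 0.3 * 0.3 ** 2, 1.0]), atol=1e-4)
```

The matrices C_i, G_i together with k_i (and, in general, j'_i) are exactly the frozen coefficients of one local model of the form (2.19).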

2.4.2 Quasi-piecewise-linear In this section we discuss how we can approximate a nonlinear system by a combination of s linearised systems. We obtain the systems by linearising the original system at s linearisation tuples (x_{l0}, t_{l0}), ..., (x_{l,s−1}, t_{l,s−1}), see 2.4.1, where x_{l0} = x0 and t_{l0} = t0. The idea is to consider multiple linearisations around important states of the system, instead of a single linearisation, which would be inaccurate if ||x − x_{li}|| or |t − t_{li}| becomes too big. So assume we have generated the s linearised models of type (2.19) around the linearisation tuples (x_{li}, t_{li}), i = 0, ..., s − 1. Each of these models is a good approximation of the original as long as the solution and the timepoint stay in a region around the linearisation tuple (see Figure 2.1 on page 19). If we construct a common model which consists of all these models, we increase the region where the original system behaves similarly to the reduced model. One possible approach to construct such a system is to take a weighted combination of all models,

( Σ_{i=0}^{s−1} w̃_i(x, t) C_i ) ẋ + ( Σ_{i=0}^{s−1} w̃_i(x, t) G_i ) x + Σ_{i=0}^{s−1} w̃_i(x, t) B̃_i ũ(t) = 0,   (2.20)

where the w̃_i(x, t) ∈ ℝ are state dependent weights, with the condition that they are non negative and satisfy a normalisation condition, e.g. Σ_{i=0}^{s−1} w̃_i(x, t) = 1 for all x, t. So we construct a convex combination of our linear systems.

2.4.3 Weighting procedure We want to choose the weights w̃_i(x, t) such that, first, the complexity of evaluating the weights is small and, second, only the 'dominant' linearised models are selected, based on the information about ||x − x_{li}|| and |t − t_{li}|. In fact, for the reduced model we do not have the exact solution x available, because we only have the solution of the TPWL model, which lies in the reduced subspace. So we use ||Py − x_{li}|| instead, where y ∈ ℝʳ is the reduced solution and P ∈ ℝ^{n×r} spans the reduced subspace. If we assume that x_{li} ∈ span{P} for all i, we get

||Py − x_{li}||₂ = ||Py − Py_{li}||₂ = ((Py − Py_{li})ᵀ(Py − Py_{li}))^{1/2} = ((y − y_{li})ᵀ Pᵀ P (y − y_{li}))^{1/2} = ||y − y_{li}||₂.

Clearly there are many possible weighting procedures, but here we present an approach similar to [2].

1. For i = 0, ..., s − 1 compute d_i = α||y − y_{li}||₂ + β|t − t_{li}| with α, β > 0 and α + β = 1.
2. Take m = min_{i=0,...,s−1} d_i.
3. For i = 0, ..., s − 1 compute ŵ_i = e^{−γ d_i/m}, with γ > 0 constant for all y.
4. For i = 0, ..., s − 1 set w_i = ŵ_i/S, with S = Σ_{i=0}^{s−1} ŵ_i.

In this way we obtain a weighting procedure which ensures that only the linearised models whose linearisation tuples (x_{li}, t_{li}) are near the actual tuple (y, t) have an influence on the reduced system. If we choose γ large (in [2] one chooses γ = 25) and α||y − y_{li}||₂ + β|t − t_{li}| is small, then the weight will be nearly 1; but if the distance increases, the weight immediately becomes close to zero. By changing the values α, β we can choose whether the state space or the time has the larger influence.
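The four steps can be written compactly as follows. This is a minimal sketch; the guard against m = 0 (an exact hit on a linearisation tuple) is an added safety measure not present in the text, and all tuples are illustrative.

```python
import numpy as np

def tpwl_weights(y, t, y_lin, t_lin, alpha=0.5, beta=0.5, gamma=25.0):
    # 1. distance of (y, t) to every linearisation tuple (y_li, t_li)
    d = np.array([alpha * np.linalg.norm(y - yl) + beta * abs(t - tl)
                  for yl, tl in zip(y_lin, t_lin)])
    # 2. smallest distance (guarded against an exact hit, m = 0)
    m = max(d.min(), 1e-30)
    # 3. unnormalised weights
    w = np.exp(-gamma * d / m)
    # 4. normalise so the weights form a convex combination
    return w / w.sum()

y_lin = [np.zeros(2), np.array([1.0, 1.0]), np.array([5.0, 5.0])]
t_lin = [0.0, 1.0, 5.0]
w = tpwl_weights(np.array([0.1, 0.0]), 0.05, y_lin, t_lin)
assert abs(w.sum() - 1.0) < 1e-12 and w.argmax() == 0
```

With γ = 25 the closest tuple receives essentially all of the weight, which is the sharp selection behaviour described above.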

γ di m

, γ > 0 and constant for all y  ˆ , with S = s−1 ˆi 4. For i = 0, . . . , s − 1 set wi = wi i=0 w S In this way we obtain a weighting procedure, which ensures that only the linearised models whose linearisation tuple (xli , tli ) are near to the actual tuple (y, t) have an influence on the reduced system. Because if we choose γ large, in [2] one chooses γ = 25, and α||y − yli ||2 + β||t − tli || is small then the weight will be nearly 1, but if the distance increases the weight will become immediately close to zero. By changing the values α, β we can choose whether the state space or the time has the largest influence.

2.4.4 Selecting linearisation points

Still there is the question of how we should select the linearisation points. The first idea is to solve the original system, but this is expensive. Because we do not need exact solutions of our system to linearise it, we can instead run an approximate simulation to get the linearisation points. The reason we do not need exact solutions is that, as long as the simulation of the reduced model stays in the accuracy region, we can expect the reduced solution to be a good approximation of the real solution. In Figure 2.2 the approximated training trajectory (A) yields a good reduced linearised model as long as the solutions (B), (C), which are calculated with this linearised model, stay in the accuracy regions. How we can calculate an approximate solution is described by the following procedure:

1. Set i = 0, x_l0 = x_0 and t_l0 = t_0.
2. Do, while i < s:


Figure 2.2: Training trajectory and its stability regions (training states x_0, …, x_4 along trajectory A, with test trajectories B and C).

(a) Linearise the given system around x_li and t_li:

    C_i ẋ + G_i x + B̃_i ũ = 0.

(b) Reduce the linearised system: apply a linear model reduction technique which generates P_i, giving

    P_iᵀ C_i P_i ẏ + P_iᵀ G_i P_i y + P_iᵀ B̃_i ũ(t) = 0.

(c) Use the reduced linearised system to calculate the next linearisation point: compute y_j, j > l_i + 1, as long as the following condition holds:

    ||y_j − y_li|| / ||y_li|| < δ_y and |t_j − t_li| < δ_t.

    So we can say that δ_y controls the accuracy of the linearised model in the state space and δ_t in the time domain.

(d) The last (y_j, t_j) is our new linearisation point: x_l(i+1) = P_i y_j, t_l(i+1) = t_j and i = i + 1.

2.4.5 Constructing the reduced TPWL-model

Now we want to discuss how we can construct a reduced version of (2.20). Notice that (2.20) is still nonlinear, but because the nonlinearity is introduced only by the scalar functions w̃_i, an efficient reduced order representation can be derived. A way to reduce the model is to find a projection matrix P ∈ ℝ^{n×r} with the condition x = Py. We get P from linear model reduction methods like Arnoldi [3] or PRIMA [6, 8]. The simplest way to calculate P is just to reduce the system around (x_0, t_0) and use this projection matrix for all the other linearised systems. This approach has the advantage that the reduction procedure is cheap, but the accuracy can be bad. There are, however, other methods which take all linearised models into account to build up the projection matrix; these methods have a higher complexity but at the same time a better accuracy. With the help of P we can now project the linearised systems, so (2.20) changes to

(∑_{i=0}^{s−1} w̃_i(Py, t) PᵀC_i P) ẏ + (∑_{i=0}^{s−1} w̃_i(Py, t) PᵀG_i P) y + ∑_{i=0}^{s−1} w̃_i(Py, t) PᵀB̃_i ũ(t) = 0.

Note that evaluating this equation is cheap: if we compute all static matrices in advance, we only need matrix-vector multiplications and additions of size r. But, as one may note, the evaluation of w̃_i(Py, t) can still be expensive. So instead of using w̃_i as weighting function we use w_i(y, t), see 2.4.3, which depends only on the size of the reduced system. This reduces the cost of evaluating the weighting functions from O(sn) to O(sr). Like w̃_i, the w_i should be non-negative and satisfy a normalisation condition. So we get our resulting reduced order system

(∑_{i=0}^{s−1} w_i(y, t) C_ir) ẏ + (∑_{i=0}^{s−1} w_i(y, t) G_ir) y + ∑_{i=0}^{s−1} w_i(y, t) B_ir ũ(t) = 0

where C_ir := PᵀC_i P, G_ir := PᵀG_i P and B_ir := PᵀB̃_i.
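To illustrate how cheap the reduced system is to simulate, the following hedged sketch performs one Euler-Backward step of the reduced TPWL model using only r-sized operations (the names `tpwl_step`, `Cr`, `Gr`, `Br` and the weight callback are illustrative, not from the thesis):

```python
import numpy as np

def tpwl_step(y, t, h, Cr, Gr, Br, u_tilde, weights):
    """One Euler-Backward step of the reduced TPWL system
    (sum_i w_i C_ir) y' + (sum_i w_i G_ir) y + sum_i w_i B_ir u~(t) = 0.
    Only r-sized operations occur: O(s r^2) assembly plus an O(r^3) solve."""
    w = weights(y, t + h)
    C = sum(wi * Ci for wi, Ci in zip(w, Cr))     # weighted sum of the C_ir
    G = sum(wi * Gi for wi, Gi in zip(w, Gr))     # weighted sum of the G_ir
    f = sum(wi * Bi for wi, Bi in zip(w, Br)) @ u_tilde(t + h)
    # C*(y_new - y)/h + G*y_new + f = 0  =>  (C + h G) y_new = C y - h f
    return np.linalg.solve(C + h * G, C @ y - h * f)

# Single-model sanity check: C = G = 1 and no input give y_new = y / (1 + h).
y1 = tpwl_step(np.array([1.0]), 0.0, 0.1,
               [np.eye(1)], [np.eye(1)], [np.zeros((1, 1))],
               lambda t: np.zeros(1), lambda y, t: np.array([1.0]))
```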

Conclusion

With this approach we have, first, the possibility to extract the reduced model very quickly. Also the simulation time of the reduced model is very low compared to POD and EBT. But we also have to be careful when using this approach, because we are approximating a fully nonlinear system by a reduced piecewise-linear system and can therefore expect some approximation problems. These problems can be addressed by using more linearisation tuples. Another disadvantage is that instead of just storing one matrix which represents the subspace, we also have to store several linear systems.

2.5 Volterra Series (VS)

In this section we introduce the Volterra series technique to reduce a nonlinear system. The idea behind this method is to construct a bilinear system which approximates the first moments of the nonlinear system, and then to use linear model reduction techniques to create a reduced bilinear system which matches as many moments of the original system as possible. We first restrict the method to ODEs and then try to adapt it to DAEs. For more information on how VS can be used for ODEs, see [14, 15]. For more general information on the theory of VS, see [13, 16].

2.5.1 Impulse response function and transfer function of a linear DAE

We start by discussing the relation between the impulse response function and the transfer function of a system. We do this only for single-input single-output (SISO) systems, but of course everything also works for multiple-input multiple-output (MIMO) systems. We also restrict ourselves to a linear SISO system, so the system we are dealing with is

C ẋ = G x + b u(t)        (2.21)
z = dᵀ x

where C, G ∈ ℝ^{n×n}, x, b, d ∈ ℝ^n and u, z ∈ ℝ. For this system we can also calculate the output z as a convolution of the input u and the impulse response function h:

z(t) = ∫_{−∞}^{∞} h(σ) u(t − σ) dσ.

h is here also called the kernel of the system and satisfies h(t) = 0 for t < 0. This leads to the general system sketch in Figure 2.3.

Figure 2.3: SISO system (input u, impulse response h(t), output z).

Now we transform our system to the frequency domain. Therefore we apply a Laplace transformation to our SISO system (2.21). This leads to

s E X(s) = X(s) + c U(s)
Z(s) = dᵀ X(s)

where E := G^{−1}C and c := G^{−1}b. The resulting system is not a DAE system but an algebraic equation which depends on s. With some additional transformations we get

Z(s) = −dᵀ(I − sE)^{−1} c U(s).

From this we can define the transfer function H(s) := −dᵀ(I − sE)^{−1} c, in our case a scalar function, and obtain Z(s) = H(s)U(s). The transfer function H(s) is the Laplace transform of h(t): H(s) = L(h(t)), where L is the Laplace operator.
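The relation H(s) = L(h(t)) can be checked numerically on a scalar example (n = 1), where the impulse response is known in closed form; the numbers below are illustrative only:

```python
import numpy as np

# Scalar SISO example: C x' = G x + b u, z = d x, stable since G/C < 0.
Cm, G, b, d = 2.0, -3.0, 1.0, 4.0
E, c = Cm / G, b / G                      # E := G^{-1} C, c := G^{-1} b

def H(s):
    """Transfer function H(s) = -d (1 - s E)^{-1} c (scalar case)."""
    return -d * c / (1.0 - s * E)

# Impulse response: x jumps to b/C at t = 0 and decays like e^{(G/C) t},
# so h(t) = d (b/C) e^{(G/C) t} for t >= 0 and h(t) = 0 for t < 0.
t = np.linspace(0.0, 40.0, 400001)
h = d * (b / Cm) * np.exp((G / Cm) * t)

# Numerical Laplace transform at s0: integral_0^inf h(t) e^{-s0 t} dt.
s0 = 1.5
f = h * np.exp(-s0 * t)
H_num = (t[1] - t[0]) * (f[0] / 2 + f[1:-1].sum() + f[-1] / 2)   # trapezoid rule
```

For these values H(1.5) = −d b / (G − s C) = 4/6, and the quadrature reproduces it to within the discretisation error.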

2.5.2 Projection method for linear systems

Here we discuss projection techniques which are based on Krylov subspaces. The Krylov subspaces are optimal in the sense that if we project our original system onto the Krylov subspace we minimise the approximation error. The idea is to find a matrix P, a basis of the reduced subspace S_r, which projects our system onto this subspace, so that x ≈ Py, where x is the state of the original system and y ∈ S_r is the state of the reduced system. So our reduced linear system then is

PᵀC P ẏ = PᵀG P y + Pᵀb u(t)        (2.22)
z = dᵀP y.

But the question still is how to find an optimal subspace. Therefore we take a look at the Laplace transform of our original system, X(s) = −(I − sE)^{−1} c U(s). If we now replace the inverse by a Neumann series we get

X(s) = − ∑_{i=0}^{∞} s^i E^i c U(s).        (2.23)

Hence X ∈ span{c, Ec, E²c, …}, and we call E^i c, i ≥ 0, the i-th moment of the system. With the result of the next definition it is clear how we can get an optimal subspace.

Definition 2.18. The Krylov subspace K_r(A, p) of order r generated by a matrix A and a vector p is the space which is spanned by the set of vectors {p, Ap, A²p, …, A^{r−1}p}.

A basis for this subspace can be calculated by the Arnoldi method, see [12]. If we now choose P as a basis of K_r(E, c) we get our optimal subspace, and the reduced system matches the first r moments of the original system.
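The moment-matching property can be verified numerically: build an orthonormal basis of K_r(E, c), project the system as in (2.22), and compare the first r moments of the original and reduced systems. A small NumPy sketch (random test matrices, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 6, 3

# Linear test system C x' = G x + b u(t), z = d^T x (random, well conditioned).
G = rng.standard_normal((n, n)) + n * np.eye(n)
C = rng.standard_normal((n, n))
b = rng.standard_normal(n)
d = rng.standard_normal(n)

E = np.linalg.solve(G, C)          # E := G^{-1} C
c = np.linalg.solve(G, b)          # c := G^{-1} b

# Orthonormal basis of the Krylov subspace K_r(E, c) = span{c, Ec, E^2 c}.
K = np.column_stack([np.linalg.matrix_power(E, i) @ c for i in range(r)])
P, _ = np.linalg.qr(K)

# Congruence-projected reduced system as in (2.22).
Gr, Cr = P.T @ G @ P, P.T @ C @ P
br, dr = P.T @ b, P.T @ d
Er = np.linalg.solve(Gr, Cr)       # reduced E
cr = np.linalg.solve(Gr, br)       # reduced c

# The first r moments d^T E^i c of full and reduced systems should coincide.
moments_full = [d @ np.linalg.matrix_power(E, i) @ c for i in range(r)]
moments_red = [dr @ np.linalg.matrix_power(Er, i) @ cr for i in range(r)]
```

In exact arithmetic the first r moments agree because c, Ec, …, E^{r−1}c all lie in span{P}.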

2.5.3 Bilinear systems for a nonlinear ODE

In this and the next section we want to show how we can use the idea of VS for nonlinear ODEs of the type

d/dt x = f(x) + B u(t)        (2.24)
z = D x(t).

To use the theory of VS we first have to construct a so-called bilinear system which approximates the system by a power series representation. Therefore we first define x^(1) := x, x^(2) := x ⊗ x, x^(3) := x ⊗ x ⊗ x, and so on, where ⊗ is the Kronecker product. Then the power series expansion of f(x) is

f(x) = A_1 x^(1) + A_2 x^(2) + A_3 x^(3) + …

where A_k ∈ ℝ^{n×n^k}. Typically A_k = d^k/dx^k f(x). Now system (2.24) can be rewritten as

d/dt x = A_1 x^(1) + A_2 x^(2) + A_3 x^(3) + … + ∑_i b_i u_i(t)
z = D x(t)


where b_i is the i-th column of B and u_i the i-th element of u. Now we introduce a new state vector x⊗ to get our bilinear representation:

x⊗ = (x^(1), x^(2), x^(3), …)ᵀ        (2.25)

Of course we need the time derivative of x⊗. As an example we take x^(2) and calculate the derivative:

d/dt x^(2) = d/dt (x^(1) ⊗ x^(1))
= ẋ ⊗ x^(1) + x^(1) ⊗ ẋ
= (A_1 x^(1) + A_2 x^(2) + … + ∑_i b_i u_i(t)) ⊗ x^(1) + x^(1) ⊗ (A_1 x^(1) + A_2 x^(2) + … + ∑_i b_i u_i(t))
= (A_1 ⊗ I)(x^(1) ⊗ x^(1)) + (A_2 ⊗ I)(x^(2) ⊗ x^(1)) + … + ∑_i (b_i ⊗ I) u_i(t) x^(1)
  + (I ⊗ A_1)(x^(1) ⊗ x^(1)) + (I ⊗ A_2)(x^(1) ⊗ x^(2)) + … + ∑_i (I ⊗ b_i) u_i(t) x^(1)
= (A_1 ⊗ I + I ⊗ A_1) x^(2) + (A_2 ⊗ I + I ⊗ A_2) x^(3) + … + ∑_i (b_i ⊗ I + I ⊗ b_i) u_i x^(1)
= A_21 x^(2) + A_22 x^(3) + … + ∑_i B_2i u_i x^(1)

where A_21 := (A_1 ⊗ I + I ⊗ A_1), A_22 := (A_2 ⊗ I + I ⊗ A_2) and B_2i := (b_i ⊗ I + I ⊗ b_i). Continuing this process, we obtain a bilinear realisation of (2.24):

d/dt x⊗ = A⊗ x⊗ + ∑_i N_i⊗ u_i(t) x⊗ + B⊗ u(t)        (2.26)
z = D⊗ x⊗

where

A⊗ = [ A_11  A_12  A_13  ⋯ ]
     [  0    A_21  A_22  ⋯ ]
     [  0     0    A_31  ⋯ ]
     [              ⋱      ]

N_i⊗ = [  0               ]
       [ B_2i   0         ]
       [  0    B_3i   0   ]
       [        ⋱     ⋱  ]

B⊗ = (Bᵀ, 0, ⋯)ᵀ        D⊗ = (D, 0, ⋯)

with A_1k := A_k and

A_jk := A_k ⊗ I ⊗ ⋯ ⊗ I + I ⊗ A_k ⊗ I ⊗ ⋯ ⊗ I + ⋯ + I ⊗ ⋯ ⊗ I ⊗ A_k

with j − 1 Kronecker products in each term, and j terms. Similarly, B_ji is given by

B_ji := b_i ⊗ I ⊗ ⋯ ⊗ I + I ⊗ b_i ⊗ I ⊗ ⋯ ⊗ I + ⋯ + I ⊗ ⋯ ⊗ I ⊗ b_i.
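The block formulas A_jk can be generated mechanically with Kronecker products. The sketch below (illustrative helper names) builds A_jk and checks the scalar case n = 1, f(x) = a_1 x + a_2 x², where d/dt x² = 2a_1 x^(2) + 2a_2 x^(3) + 2b u x^(1):

```python
import numpy as np

def kron_chain(mats):
    """Kronecker product of a list of matrices, left to right."""
    out = mats[0]
    for m in mats[1:]:
        out = np.kron(out, m)
    return out

def A_jk(Ak, j, n):
    """A_jk = Ak(x)I(x)...(x)I + I(x)Ak(x)...(x)I + ... :
    j terms, each with j-1 Kronecker products of n x n identities."""
    total = None
    for pos in range(j):
        mats = [np.eye(n)] * j
        mats[pos] = Ak            # put Ak at position `pos`, identities elsewhere
        term = kron_chain(mats)
        total = term if total is None else total + term
    return total

# Scalar check (n = 1): A_21 = 2*a1, A_22 = 2*a2 and B_2 = 2*b.
a1, a2, b = np.array([[1.5]]), np.array([[-0.5]]), np.array([[2.0]])
A21, A22, B2 = A_jk(a1, 2, 1), A_jk(a2, 2, 1), A_jk(b, 2, 1)
```

For n > 1 the same helper reproduces the identity (A_1 ⊗ I + I ⊗ A_1)(x ⊗ x) = A_1x ⊗ x + x ⊗ A_1x used in the derivation above.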


2.5.4 Bilinear model reduction for a nonlinear ODE

As described in 2.5.3 we now handle a system of type (2.26), but for simplicity we restrict ourselves to SISO systems of the type

d/dt x = f(x) + b u(t)        (2.27)
z = dᵀ x(t)

and their bilinear representation

d/dt x⊗ = A⊗ x⊗ + N⊗ x⊗ u(t) + b⊗ u(t)        (2.28)
z = c⊗ᵀ x⊗.

From now on we leave out the superscript ⊗ in all equations, to simplify the notation. The Volterra series representation of a bilinear system of the type (2.28), with the kernels in regular form, is given by

z(t) = ∑_{i=0}^{∞} z_i(t)

where the i-th degree subsystem is given by

z_i(t) = ∫_0^t ⋯ ∫_0^t h_reg(t_1, …, t_i) u(t − t_1 − ⋯ − t_i) u(t − t_2 − ⋯ − t_i) ⋯ u(t − t_i) dt_1 ⋯ dt_i

with the associated i-th degree regular kernel

h_reg(t_1, …, t_i) = cᵀ e^{A t_i} N ⋯ N e^{A t_2} N e^{A t_1} b.        (2.29)

If we now apply the multi-dimensional Laplace transformation to (2.29) we get the i-th transfer function

H(s_1, …, s_i) = cᵀ (s_i I − A)^{−1} N ⋯ N (s_2 I − A)^{−1} N (s_1 I − A)^{−1} b.        (2.30)

From the Neumann series of (s_j I − A) it is natural, see (2.23), to define the corresponding multimoment as

m(l_1, …, l_i) = (−1)^i cᵀ A^{−l_i} N ⋯ N A^{−l_2} N A^{−l_1} b        (2.31)

where l_j ∈ ℕ, j = 0, …, i. The expressions for the transfer function (2.30) and the corresponding multimoments (2.31) suggest that, in order to match the moments of the i-th degree kernel, we can first generate the subspace P^(i) of nested Krylov subspaces of depth i, defined by

span{P^(i)} = K_m(A^{−1}, A^{−1} N K_m(⋯ K_m(A^{−1}, A^{−1} b) ⋯)),  i = 1, …, p

where m states how many multimoments of the i-th degree kernel are matched and p is the number of kernels which are approximated. Then take the union of the subspaces

span{P} = ⋃_{i=1}^{p} span{P^(i)}.

Once we have created our projection matrix P we can project our system onto the subspace:

d/dt y = PᵀA P y + PᵀN P u(t) y + Pᵀb u(t)
z = cᵀP y.

The result of this projection is that the reduced system matches the first m multimoments of the first p kernels of the original system. In [15] a different reduction technique for bilinear systems is presented, which has the advantage that it matches more moments with a smaller subspace.


2.5.5 Bilinear systems for a nonlinear DAE

Now we discuss the bilinear system representation of nonlinear DAEs. The idea is to transform our nonlinear system into a bilinear system, which is a better approximation than a linearisation. Therefore we create a power series representation of our system (2.2), i.e. a Taylor series. In our example we do this around x_0 = 0, t_0 = 0:

q(t, x) = Q_0 + Q_1 x^(1) + R_1 t
j(t, x) = J_0 + J_1 x^(1) + J_2 x^(2) + … + K_1 t + K_2 t² + …

where x^(1) := x, x^(2) := x ⊗ x and so on, Q_i, J_i ∈ ℝ^{n×n^i} for i ≥ 0 and R_1, K_j ∈ ℝ^n for j ≥ 1. In fact Q_1 is the Jacobian of q and is therefore not invertible. So our system is then

d/dt (Q_0 + Q_1 x^(1) + R_1 t) + J_0 + J_1 x^(1) + J_2 x^(2) + … + K_1 t + K_2 t² + … + B̃ ũ(t) = 0.

After calculating the derivative of the system we get

Q_1 d/dt x^(1) + J_1 x^(1) + J_2 x^(2) + … + B u(t) = 0

where

B := [B̃, R_1 + J_0, K_1, K_2, ⋯]
u(t) := [ũ(t)ᵀ, 1, t, t², ⋯]ᵀ.

The bilinear model is then obtained by defining the new state variable x⊗, see (2.25). Now we have to calculate the time derivative of x^(i). But this is a bit different from the ODE case, because there is no representation of our system in which ẋ stands alone; instead we have the term Q_1 ẋ, and remember that Q_1 is singular. So we have to multiply the derivative of x^(k) by Q_1 ⊗ ⋯ ⊗ Q_1, with k − 1 Kronecker products, i.e.

(Q_1 ⊗ Q_1) d/dt x^(2) = (Q_1 ⊗ Q_1)(ẋ^(1) ⊗ x^(1) + x^(1) ⊗ ẋ^(1))
= (Q_1 ⊗ Q_1)(ẋ^(1) ⊗ x^(1)) + (Q_1 ⊗ Q_1)(x^(1) ⊗ ẋ^(1))
= (Q_1 ẋ^(1) ⊗ Q_1 x^(1)) + (Q_1 x^(1) ⊗ Q_1 ẋ^(1))
= −(J_1 x^(1) + J_2 x^(2) + … + ∑_{i=1}^{k} b_i u_i) ⊗ Q_1 x^(1) − Q_1 x^(1) ⊗ (J_1 x^(1) + J_2 x^(2) + … + ∑_{i=1}^{k} b_i u_i)
= −(J_1 ⊗ Q_1) x^(2) − (J_2 ⊗ Q_1) x^(3) − … − ∑_{i=1}^{k} (b_i ⊗ Q_1)(u_i x^(1))
  − (Q_1 ⊗ J_1) x^(2) − (Q_1 ⊗ J_2) x^(3) − … − ∑_{i=1}^{k} (Q_1 ⊗ b_i)(u_i x^(1))
= −(J_1 ⊗ Q_1 + Q_1 ⊗ J_1) x^(2) − (J_2 ⊗ Q_1 + Q_1 ⊗ J_2) x^(3) − … − ∑_{i=1}^{k} (b_i ⊗ Q_1 + Q_1 ⊗ b_i)(u_i x^(1))
= −J_21 x^(2) − J_22 x^(3) − … − ∑_{i=1}^{k} B_2i u_i x^(1)


where b_i is the i-th column of B, u_i(t) is the i-th element of u(t), J_21 = (J_1 ⊗ Q_1 + Q_1 ⊗ J_1) and B_2i = (b_i ⊗ Q_1 + Q_1 ⊗ b_i). Continuing this process we obtain a bilinear realisation of the nonlinear system:

Q⊗ d/dt x⊗ + J⊗ x⊗ + ∑_{i=1}^{k} N_i⊗ x⊗ u_i(t) + B⊗ u(t) = 0        (2.32)
z = C⊗ x⊗

where

Q⊗ := diag(Q_1, (Q_1 ⊗ Q_1), …)

J⊗ := [ J_11  J_12  J_13  ⋯ ]
      [  0    J_21  J_22  ⋯ ]
      [  0     0    J_31  ⋯ ]
      [              ⋱      ]

N_i⊗ := [  0               ]
        [ B_2i   0         ]
        [  0    B_3i   0   ]
        [        ⋱     ⋱  ]

B⊗ := (Bᵀ, 0, ⋯)ᵀ        C⊗ := (C, 0, ⋯)

(the signs of the blocks J_jk and B_ji follow from moving the derivative expressions above to the left-hand side), with J_1k = J_k and

J_jk = J_k ⊗ Q_1 ⊗ ⋯ ⊗ Q_1 + Q_1 ⊗ J_k ⊗ Q_1 ⊗ ⋯ ⊗ Q_1 + ⋯ + Q_1 ⊗ ⋯ ⊗ Q_1 ⊗ J_k

where there are j − 1 Kronecker products in each term and j terms. Similarly B_ji is given by

B_ji = b_i ⊗ Q_1 ⊗ ⋯ ⊗ Q_1 + Q_1 ⊗ b_i ⊗ Q_1 ⊗ ⋯ ⊗ Q_1 + ⋯ + Q_1 ⊗ ⋯ ⊗ Q_1 ⊗ b_i.

Theorem 2.19. It is not possible to reduce equation (2.32) by a Krylov method, because Q⊗ and J⊗ are singular.

Proof. The determinant of Q⊗ is det(Q⊗) = det(diag(Q_1, (Q_1 ⊗ Q_1), …)). Because the determinant of a block diagonal matrix is the product of the determinants of the diagonal blocks,

det(Q⊗) = ∏_{i=1}^{k} det(Q_1 ⊗ ⋯ ⊗ Q_1)   (i factors in the i-th term).

We also know that det(Q_1) = 0, so it follows that Q⊗ is singular. Because J⊗ is a block triangular matrix, the determinant of J⊗ is the product of the determinants of the diagonal blocks:

det(J⊗) = ∏_{i=1}^{k} det(J_i1).

Now we take J_21 and calculate its determinant, det(J_21) = det(Q_1 ⊗ J_1 + J_1 ⊗ Q_1).


Because Q_1 is singular, we can assume without loss of generality that the last row of Q_1 is zero,

Q_1 = [ Q_11 ]
      [  0   ]

with Q_11 ∈ ℝ^{(n−1)×n}. Now we form the Kronecker products in

J_21 = Q_1 ⊗ J_1 + J_1 ⊗ Q_1.

In Q_1 ⊗ J_1 the last block row is the (zero) last row of Q_1 times J_1, and in J_1 ⊗ Q_1 the last row of every block J_{1,ij} Q_1 is zero. What we see is that the last row of J_21 is zero, and therefore det(J_21) = 0. Hence det(J⊗) = 0. So both Q⊗ and J⊗ are singular, and it follows that it is not possible to reduce the system by a Krylov method. One thing we also derive from the theorem is that the linear system (2.32) is not uniquely solvable, because (Q⊗, J⊗) is no longer a regular pencil.
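The singularity argument of the proof is easy to reproduce numerically: with the last row of Q_1 set to zero, the last row of J_21 = Q_1 ⊗ J_1 + J_1 ⊗ Q_1 vanishes (illustrative sketch with random matrices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3

# Q1 singular: without loss of generality its last row is zero (as in the proof).
Q1 = rng.standard_normal((n, n))
Q1[-1, :] = 0.0
J1 = rng.standard_normal((n, n))

J21 = np.kron(Q1, J1) + np.kron(J1, Q1)

# Last row of kron(Q1, J1) is (last row of Q1) (x) J1 = 0, and the last row of
# kron(J1, Q1) is J1[n-1, :] (x) (last row of Q1) = 0, so J21 is singular.
rank = np.linalg.matrix_rank(J21)
```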

2.6 Complexity Analysis

In this section the complexities of the introduced methods are compared. We will skip the VS approach because it cannot be used for DAEs. All methods are used to solve the system

d/dt q(t, x) + j(t, x) + u(t) = 0
x(0) = x_0.

To do this we first have to make some assumptions. Assume that:

• The dimension of x is n.
• Evaluation of j(t, x) has a complexity of O(n^α), with α ≥ 1.
• Evaluation of q(t, x) has a complexity of O(n^β), with β ≥ 1.
• Evaluation of u(t) has a complexity of O(n), and because of that we do not count the evaluation of u.
• For the evaluation of ∂j/∂x (t, x) = G(t, x) we consider three cases, with complexities 1. O(n^{α+1}), 2. O(n^α), 3. O(n).
• For the evaluation of ∂q/∂x (t, x) = C(t, x) we consider three cases, with complexities 1. O(n^{β+1}), 2. O(n^β), 3. O(n).
• Calculation of ∂j/∂t (t, x) and ∂q/∂t (t, x) is done numerically, so the complexities are O(n^α) and O(n^β), respectively.
• We use fixed timesteps for the transient analysis (with Euler Backward), with p timesteps.
• All matrices during the processes are full.

Because a saxpy operation (y = ax + y) is cheap, O(n), compared to the rest of the process, we do not count the complexity of these operations.


2.6.1 Complexity of the original system

For solving the original system we need to solve, for i = 1, …, p,

q(t_i, x_i) + h j(t_i, x_i) − q(t_{i−1}, x_{i−1}) + h u(t_i) = 0

where each evaluation of q costs O(n^β) and each evaluation of j costs O(n^α). This is done by a Newton iteration with the initial guess x_{i−1}. To do this we have to solve the following system in each step of the iteration:

(C(t_i, x_j) + h G(t_i, x_j)) Δx = −(q(t_i, x_j) + h j(t_i, x_j) + r)
x_{j+1} = x_j + Δx

where r = −q(t_{i−1}, x_{i−1}) + u(t_i) and the last x_j is then the solution x_i. Solving the resulting linear system has a complexity of O(n³), and we need an average of k Newton iterations. Summing up, we get the following complexities for calculating x_i out of x_{i−1}, for the three different Jacobian cases:

1. O(n^β + k(n^{β+1} + n^{α+1} + n^β + n^α + n³))
2. O(n^β + k(2n^β + 2n^α + n³))
3. O(n^β + k(2n + n^β + n^α + n³)).

We need to solve the whole system over the given time interval, i.e. p times the complexity of one timestep. Hence, the complexity for the whole system is:

1. O(pn^β + pk(n^{β+1} + n^{α+1} + n^β + n^α + n³))
2. O(pn^β + pk(2n^β + 2n^α + n³))
3. O(pn^β + pk(2n + n^β + n^α + n³)).
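The Euler-Backward/Newton procedure analysed above can be sketched as follows (hedged: the function and argument names are illustrative; the residual uses h·u(t_i) as in the timestep equation):

```python
import numpy as np

def transient(q, j, u, Cfun, Gfun, x0, t0, h, p, tol=1e-10):
    """Euler-Backward transient analysis of d/dt q(t,x) + j(t,x) + u(t) = 0:
    in each of the p timesteps, solve
        q(ti, xi) + h*j(ti, xi) - q(t_{i-1}, x_{i-1}) + h*u(ti) = 0
    for xi by Newton's method with initial guess x_{i-1}."""
    xs, x, t = [x0], x0, t0
    for _ in range(p):
        t_new, x_new = t + h, x.copy()
        r_const = -q(t, x) + h * u(t_new)        # constant part of the residual
        for _ in range(20):                      # Newton iteration (avg. k steps)
            res = q(t_new, x_new) + h * j(t_new, x_new) + r_const
            if np.linalg.norm(res) < tol:
                break
            J = Cfun(t_new, x_new) + h * Gfun(t_new, x_new)  # C + hG, O(n^3) solve
            x_new = x_new - np.linalg.solve(J, res)
        x, t = x_new, t_new
        xs.append(x)
    return np.array(xs)

# Linear sanity check: q(t,x) = x, j(t,x) = x, u = 0 gives x_i = x_{i-1}/(1+h).
xs = transient(lambda t, x: x, lambda t, x: x, lambda t: np.zeros(1),
               lambda t, x: np.eye(1), lambda t, x: np.eye(1),
               np.array([1.0]), 0.0, 0.1, 10)
```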

2.6.2 Complexity of POD

If we want to use the POD method to reduce a system of dimension n to a system of dimension r < n, we have to do three steps:

1. Create the projection matrix P to project the original system to the subspace S ⊂ ℝ^n.
2. Solve the reduced system.
3. Project the reduced solution back to the original space ℝ^n.

Calculating the projection matrix P

To calculate the projection matrix we have to compute m snapshots, r < m. These snapshots are generated by solving the original system at m different timepoints. But because of the behaviour of circuits we have to do a full transient analysis with p timesteps. The reason for that is simple: if we think of a circuit which is first very smooth and after a while starts to oscillate strongly, it does not make sense to build our subspace only with the information of the smooth part, because the oscillating states lie in a different subspace. Due to the fact that we need a whole transient analysis, the complexity of computing the snapshots is (see 2.6.1):

1. O(pn^β + pk(n^{β+1} + n^{α+1} + n^β + n^α + n³))
2. O(pn^β + pk(2n^β + 2n^α + n³))
3. O(pn^β + pk(2n + n^β + n^α + n³))

Then we have to compute the singular value decomposition of X := (x_0, x_1, …, x_p) = U Σ Vᵀ, which has a complexity of O(n³) if n > p, otherwise O(p³). Then P = (u_1, …, u_r) ∈ ℝ^{n×r}.

Solving the reduced system

The reduced system has the following form:

d/dt Pᵀq(t, Py) + Pᵀj(t, Py) + Pᵀu(t) = 0
y(0) = y_0

with y ∈ ℝ^r and y_0 = Pᵀx_0. To solve this system we use the Euler Backward method together with the Newton method. So in each timestep the following system has to be solved:

Pᵀq(t_i, Py_i) + h Pᵀj(t_i, Py_i) − Pᵀq(t_{i−1}, Py_{i−1}) + Pᵀu(t_i) = 0.

For each Newton iteration we need to solve

Pᵀ(C(t_i, Py_j) + h G(t_i, Py_j))P Δy = −(Pᵀ(q(t_i, Py_j) + h j(t_i, Py_j)) + r)
y_{j+1} = y_j + Δy

where r = −Pᵀq(t_{i−1}, Py_{i−1}) + Pᵀu(t_i) and the last y_j is then the solution y_i. Solving the resulting linear system has an effort of O(r³), and we need an average of k Newton iterations. Summing up, we get for calculating y_i out of y_{i−1}, for the three different Jacobian cases, the following complexities:

1. O(n^α + 2nr + k(n²r + n^{β+1} + n^{α+1} + 5nr + n^β + n^α + r³))
2. O(n^α + 2nr + k(n²r + 2n^β + 2n^α + 5nr + r³))
3. O(n^α + 2nr + k(n²r + 2n + n^β + n^α + 5nr + r³)).

We need to solve the whole reduced system over the given time interval, i.e. p times the complexity of one timestep. Hence, the complexity for the whole reduced system is:

1. O(pn^α + 2pnr + pk(n²r + n^{β+1} + n^{α+1} + 5nr + n^β + n^α + r³))
2. O(pn^α + 2pnr + pk(n²r + 2n^β + 2n^α + 5nr + r³))
3. O(pn^α + 2pnr + pk(n²r + 2n + n^β + n^α + 5nr + r³)).


Projecting the reduced solution y to ℝ^n

We just need to multiply every reduced solution at the timepoints i = 1, …, p with P. So the complexity of this step is O(pnr).
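The snapshot/SVD construction of P can be sketched in a few lines (synthetic snapshot data standing in for transient-analysis results, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, r = 50, 40, 3

# Synthetic snapshot matrix X = (x_0, ..., x_p) whose columns lie in an
# r-dimensional subspace (a stand-in for real transient-analysis snapshots).
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, p + 1))

# POD basis: P is formed from the r leading left singular vectors of X.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
P = U[:, :r]

# Relative projection error of the snapshots onto span{P}.
err = np.linalg.norm(X - P @ (P.T @ X)) / np.linalg.norm(X)
```

Since the synthetic snapshots have exact rank r, the projection error is at round-off level; for real circuit data one would choose r from the decay of the singular values S.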

2.6.3 Empirical balanced truncation (EBT)

Using the EBT method to reduce a system of dimension n to a system of dimension r < n, we have to do three steps. These steps are the same as for the POD approach, see 2.6.2.

Calculating the projection matrix P

First calculate the transient analysis of our original system (2.2) for different inputs and initial values. How often we have to do this depends on how many 'typical' inputs and initial values we want to prepare our reduced system for. If we do this as described in Section 2.3, we have to perform nls transient analyses, and by the results of 2.6.1 the complexity of doing this is nls times the complexity of the original system:

1. O(nls(pn^β + pk(n^{β+1} + n^{α+1} + n^β + n^α + n³)))
2. O(nls(pn^β + pk(2n^β + 2n^α + n³)))
3. O(nls(pn^β + pk(2n + n^β + n^α + n³))).

Then we have to calculate the empirical gramians Ŷ and X̂:

Ŷ := ∑_{i=1}^{l} ∑_{j=1}^{s} ∑_{k=1}^{n} (1/(l s c_j²)) Υ^{ijk}        (O(lsn³))

where

Υ^{ijk} := X^{ijk} (X^{ijk})ᵀ ∈ ℝ^{n×n}        (O(pn²))
X^{ijk} := [(x^{ijk}(t_0) − x̄^{ijk}), …, (x^{ijk}(t_{p−1}) − x̄^{ijk})] ∈ ℝ^{n×p}        (O(pn))
x̄^{ijk} := (1/p) ∑_{ν=0}^{p−1} x^{ijk}(t_ν).        (O(pn))

This has a complexity of O(lsn³ + lspn³ + 2lspn²).


X̂ := ∑_{i=1}^{l} ∑_{j=1}^{s} (1/(l s c_j²)) T_i Ψ^{ij} T_iᵀ        (O(lsn²) for the sum, O(n³) per term)

with

Ψ^{ij}_{km} := ∑_{ν=0}^{p−1} ⟨(z^{ijk}(t_ν) − z̄^{ijk}), (z^{ijm}(t_ν) − z̄^{ijm})⟩,  k, m = 1, …, n        (O(pq))
z̄^{ijk} := (1/p) ∑_{ν=0}^{p−1} z^{ijk}(t_ν).        (O(pq))

This has a complexity of O(lsn² + lsn³ + lspqn² + lsnpq).

The next step is then to balance Ŷ and X̂. To do this we have to go through the following procedure:

1. Apply a Cholesky factorisation to Ŷ, so that Ŷ = Z Zᵀ, with Z a lower triangular matrix with positive diagonal entries. (O(n³))
2. Compute a singular value decomposition of ZᵀX̂Z = U Σ² Uᵀ, so that X̂ = (Zᵀ)^{−1} U Σ² Uᵀ Z^{−1}. (O(n³))
3. Define T as Σ^{1/2} Uᵀ Z^{−1}. (O(n³))
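Steps 1-3 can be checked numerically: with T = Σ^{1/2}UᵀZ^{−1}, both transformed gramians become the diagonal matrix Σ, i.e. the system is balanced. The sketch below uses symmetric positive definite stand-ins for the empirical gramians (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5

# SPD stand-ins for the empirical gramians Y_hat and X_hat.
M1 = rng.standard_normal((n, n)); Y_hat = M1 @ M1.T + n * np.eye(n)
M2 = rng.standard_normal((n, n)); X_hat = M2 @ M2.T + n * np.eye(n)

# Step 1: Cholesky factorisation Y_hat = Z Z^T (numpy returns the lower factor).
Z = np.linalg.cholesky(Y_hat)

# Step 2: SVD of Z^T X_hat Z = U S^2 U^T (symmetric positive definite).
U, s2, _ = np.linalg.svd(Z.T @ X_hat @ Z)

# Step 3: balancing transformation T = S^{1/2} U^T Z^{-1}.
T = np.diag(s2 ** 0.25) @ U.T @ np.linalg.inv(Z)
```

One checks directly that T Ŷ Tᵀ = Σ and (T^{−1})ᵀ X̂ T^{−1} = Σ, so the two gramians are equal and diagonal after the transformation.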

With T we have to compute two projection matrices P_1 ∈ ℝ^{n×r} and P_2 ∈ ℝ^{n×r}:

P_1 := Tᵀ P        (O(n²r))

where P = [I, 0] ∈ ℝ^{n×r}; P_1 is used if we want to project an element of the original space to the subspace, and

P_2 := T^{−1} P        (O(n³ + n²r))

is used to project from the subspace back to the original space. Summing up, the complexity to calculate the reduced system is:

1. O(5n² + 2n²r + ls(2n² + n³ + p(n^{β+1} + n³ + 2n² + qn² + q + k(n^{β+2} + n^{α+2} + n^{β+1} + n^{α+1} + n⁴))))
2. O(5n² + 2n²r + ls(2n² + n³ + p(n^{β+1} + n³ + 2n² + qn² + q + k(2n^{β+1} + 2n^{α+1} + n⁴))))
3. O(5n² + 2n²r + ls(2n² + n³ + p(n^{β+1} + n³ + 2n² + qn² + q + k(2n² + n^{β+1} + n^{α+1} + n⁴)))).

33

PR-TN-2005/00919

Unclassified Report

Solving the reduced system Now we have to solve d  P q(t, P2 y) + P1 j(t, P2 y) + P1 u(t) = 0 dt 1 which is the same as for POD it follows that the complexity is 1.

( pnα + 2 pnr + pk(n2r + n β+1 + n α+1 + 5nr + nβ + n α + r 3 ))

2.

( pnα + 2 pnr + pk(n2r + 2nβ + 2nα + 5nr + r 3 ))

3.

( pnα + 2 pnr + pk(n2r + 2n + nβ + n α + 5nr + r 3 )).

Projecting the reduced solution y to

n

Therefore we just need to multiply every reduced solution at the timepoints i = 1, . . . , p with P2 . So the complexity of this step is ( pnr).

2.6.4 Trajectory Piecewise-Linear (TPWL)

Using the TPWL method to reduce the original system of dimension n to a system of dimension r < n requires four steps:

1. Calculate the linearisation of the system at s linearisation points: (a) simple approach, (b) extended approach.
2. Create the TPWL model.
3. Solve the reduced system.
4. Project the solution back to the original space.

Linearisation of the system

Simple approach: To linearise the system we need to compute s linearised systems at s different tuples (in general s < r). We first start with the tuple (x_0, t_0) and linearise the system around this tuple. For this linearised system we then compute the projection matrix P and project the system to the subspace. Then we use this system to calculate y_i, i = 1, …, l_1, with the property that x_i = P y_i. We do this as long as ||y_0 − y_i|| < δ_y and |t_i − t_0| < δ_t; the tolerances δ_y, δ_t define how accurate the reduced system is. If we leave the ball around (x_0, t_0), we create a new linearised model with linearisation tuple (x_l1, t_l1), which is reduced by the same P as for (x_0, t_0). This procedure is repeated until we have done p timesteps. As a result we have to do three things:

c Koninklijke Philips Electronics N.V. 2005 

Unclassified Report

PR-TN-2005/00919

1. Calculate the projection matrix P for the linearised model at timepoint t_0 and state x_0. Therefore we have to compute C_0 (O(n^β)), G_0 (O(n^α)) and B̃_0 = (B, k_0, j_0), with

   k_0 := (q_0 + j_0 + G_0 x_0 + j_0 t_0)

   whose terms cost O(n^β), O(n^α), O(n²) and O(n^α), respectively. Calculating P has, for most linear reduction methods, a complexity of O(n²r).

2. Calculate the s reduced linearised systems. Therefore we have to do the following computation:

   PᵀC_i P ẏ + PᵀG_i P y + PᵀB̃_i ũ(t) = 0

   where evaluating C_i and G_i costs O(n^β) and O(n^α), assembling B̃_i costs O(2n^α + n^β + n²), the projections PᵀC_iP and PᵀG_iP cost O(n²r) each, and PᵀB̃_i costs O(rn(k+2)). We define C_ir := PᵀC_iP, G_ir := PᵀG_iP and B_ir := PᵀB̃_i.

3. Solve these linearised systems at p timepoints. The complexity of solving a linear system of dimension r is O(r³).

Summing up all the complexities, taking the three different Jacobian efforts into account, we get:

1. O(n^{α+1} + n^{β+1} + n^β + 2n^α + n² + n²r + s(2n²r + n^{β+1} + n^{α+1} + 2n^α + n^β + n² + rn(k+2)) + r³p)
2. O(2n^β + 3n^α + n² + n²r + s(2n²r + 3n^α + 2n^β + n² + rn(k+2)) + r³p)
3. O(2n + n^β + 2n^α + n² + n²r + s(2n + 2n²r + 2n^α + n^β + n² + rn(k+2)) + r³p)

Extended approach: We start in the same way as in the simple approach. First we create a linearised reduced system around (x_0, t_0), see the simple approach above. Then we use this system to calculate y_i, i = 1, …, l_1, with the property that x_i = P_0 y_i. We do this as long as ||y_0 − y_i|| < δ_y and |t_i − t_0| < δ_t; δ_y, δ_t define how accurate the reduced system is. If we leave the ball around (x_0, t_0), we create and reduce (with P_1) a new local linearised model with the linearisation tuple (x_l1, t_l1). This procedure is repeated until we have done p timesteps. So it is clear that we have to do four steps:

1. Calculate s times the local projection matrix P_i for the linearised model at timepoint t_li and state x_li, for i = 1, …, s. Therefore we have to compute C_i (O(n^β)), G_i (O(n^α)) and B̃_i = (B, k_i, j_i), with

   k_i := (q_i + j_i + G_i x_li + j_i t_li)

   whose terms cost O(n^β), O(n^α), O(n²) and O(n^α), respectively. Calculating a reduced system has, for most linear reduction methods, a complexity of O(n²r).


2. Reduce s times the local linearised system. Therefore we have to do the following computation:

   P_i^T C_i P_i ẏ + P_i^T G_i P_i y + P_i^T B̃_i ũ(t) = 0,

   where the projections P_i^T C_i P_i and P_i^T G_i P_i cost O(n^2 r) each, evaluating B̃_i costs O(2n^α + n^β + n^2) and forming P_i^T B̃_i costs O(rn(k+2)).

3. Solve these linearised systems at p timepoints. The complexity of solving a linear system of dimension r is O(r^3).

4. Calculate the global subspace P from an SVD of P_SVD := [P_0, x_0, ..., P_{s−1}, x_{s−1}] ∈ R^{n×s(r+1)}. The complexity of the SVD is O(n^3). The reason that we also include the states of the linearisation points is that by doing this we can increase the accuracy of the TPWL model; the idea is related to the POD approach. Then we have to update the reduced models with the new basis. Therefore we have to compute C_ir := P^T C_i P, G_ir := P^T G_i P and B_ir := P^T B̃_i. This has a complexity of O(s(n^2 r + rn(k+2))). We have to recalculate the local systems because otherwise we are confronted with problems when the reduced solution leaves one ball and enters a new one.

So if we sum up all the complexities and take the three different Jacobian efforts into account we get

1. O(s(n^{β+1} + n^{α+1} + 2n^α + n^β + n^2 + 4n^2 r + 2rn(k+2)) + pr^3 + n^3)
2. O(s(3n^α + 2n^β + n^2 + 4n^2 r + 2rn(k+2)) + pr^3 + n^3)
3. O(s(2n + 2n^α + n^β + n^2 + 4n^2 r + 2rn(k+2)) + pr^3 + n^3)
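Step 4 above, merging the local bases and linearisation states into one global basis, can be sketched with numpy's SVD (all inputs are random stand-ins):

```python
import numpy as np

def global_subspace(local_bases, lin_states, r_glob):
    """Merge local TPWL bases P_i (n x r) and linearisation states x_i (length n)
    into a global orthonormal basis: SVD of [P_0, x_0, ..., P_{s-1}, x_{s-1}],
    keeping the r_glob dominant left singular vectors."""
    cols = []
    for Pi, xi in zip(local_bases, lin_states):
        cols.append(Pi)
        cols.append(xi.reshape(-1, 1))
    Psvd = np.hstack(cols)                       # n x s(r+1)
    U, _, _ = np.linalg.svd(Psvd, full_matrices=False)
    return U[:, :r_glob]

rng = np.random.default_rng(1)
n, r, s = 10, 2, 3
bases = [np.linalg.qr(rng.standard_normal((n, r)))[0] for _ in range(s)]
states = [rng.standard_normal(n) for _ in range(s)]
P = global_subspace(bases, states, r_glob=4)
print(P.shape)  # (10, 4)
```

Including the states x_i as extra columns is exactly the POD-like accuracy trick the text describes.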

Solving the reduced system. Now we need to solve the reduced system

( Σ_{i=0}^{s−1} w_i(y, t) C_ir ) ẏ + ( Σ_{i=0}^{s−1} w_i(y, t) G_ir ) y + ( Σ_{i=0}^{s−1} w_i(y, t) B_ir ) ũ(t) = 0,

where assembling the three weighted sums costs O(sr^2), O(sr^2) and O(sr(k+2)), respectively, and the complexity of evaluating w(y, t) is O(r). The complexity of solving a linear system of dimension r is O(r^3). Since we have to solve this system at p timepoints, it follows that solving the reduced system has a complexity of

O(p(2sr^2 + sr(k+2) + r^3)).

Projecting the reduced solution y to R^n. Therefore we just need to multiply the reduced solution at the timepoints i = 1, ..., p with P. So the complexity of this step is

O(pnr).
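One timestep of the weighted reduced system is just a small dense assembly plus an r-dimensional linear solve. A minimal backward Euler sketch, with the weights evaluated at the old state as a simplification and all names as stand-ins:

```python
import numpy as np

def tpwl_step(y, t, h, weights, Crs, Grs, Brs, u):
    """One backward Euler step for the weighted reduced TPWL system
    (sum_i w_i C_ir) y' + (sum_i w_i G_ir) y + (sum_i w_i B_ir) u~(t) = 0.
    weights(y, t) returns the s weights; Crs/Grs/Brs are lists of reduced matrices."""
    w = weights(y, t + h)                         # simplification: weights at old y
    C = sum(wi * Ci for wi, Ci in zip(w, Crs))    # O(s r^2)
    G = sum(wi * Gi for wi, Gi in zip(w, Grs))    # O(s r^2)
    B = sum(wi * Bi for wi, Bi in zip(w, Brs))    # O(s r (k+2))
    # backward Euler: C (y_next - y)/h + G y_next + B u(t+h) = 0
    return np.linalg.solve(C / h + G, C @ y / h - B @ u(t + h))  # O(r^3)

# toy check with a single model (s = 1, w = 1): C = I, G = I, B = 0, i.e. y' = -y
r = 3
y0 = np.ones(r)
y1 = tpwl_step(y0, 0.0, 0.1, lambda y, t: [1.0],
               [np.eye(r)], [np.eye(r)], [np.zeros((r, 1))],
               lambda t: np.zeros(1))
print(y1)  # one implicit Euler step of y' = -y: y0 / 1.1
```

The per-step cost in the comments matches the O(p(2sr^2 + sr(k+2) + r^3)) count above.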


2.6.5 Comparing the complexity for an example

Now we want to compare the complexity of the methods for a specific example. Therefore we assume:

• The reduced order r is equal to √n.
• The number of inputs k and outputs q is smaller than the reduced order r.
• The number of timesteps p which are needed for solving the system is a fixed number which does not depend on n.
• The number of linearisation points s which are taken for the TPWL is equal to r = √n (in general s < r).
• The number of iterations k which are needed in the Newton process is a fixed and small integer.
• l, s, which give the number of inputs and initial values for the EBT method, are small.

In Table 2.2 on page 41 we sum up the complexity of the different methods. We only show the dominant terms. Because we do not know how large α, β, k and p are, we always have to state the terms which depend on these values.

Summary. To make a short summary we first need to explain the difference between an online and an offline model reduction. A model reduction technique is called online if the model extraction step plus solving the reduced system is faster than the numerical integration scheme for the original system. We call a model reduction technique offline if the model extraction time is slower than the numerical integration scheme for the original system. In this case the reduced model is of course still useful, but only in a library setting to reduce the complexity of typical sub-circuits.

POD. The model extraction of a POD method has the same complexity as solving the original system. Because of that, POD only makes sense as offline model reduction. But then we still have the problem that for solving the reduced system we have to evaluate q and j, and if this is the most expensive part of our system we only get a small speed up. So POD is not the best choice.

EBT. Extracting the reduced model with an EBT method has a very high complexity. We can say this takes nls times the time of solving the original model, but the resulting model should behave better than a POD model. That is the reason why EBT only makes sense as offline model reduction for small systems.
But as for POD we only get a small speed up if evaluating q, j is expensive.

TPWL. The model extraction of a TPWL method is around pkn^{−1/2} times faster than solving the original system if evaluating q, j is expensive. So if the integration time interval is long and we need a lot of timesteps, extracting the model is cheap. Moreover, the solution we compute during the model extraction may already be a good approximation to the real solution. Solving the reduced system is also very cheap, because compared to the original system it is up to n^{3/2} times faster. This makes the method not only interesting for offline model reduction; it is also useful for online model reduction.

2.7 Conclusion

In this section we want to discuss which of the presented methods is the most useful for circuit simulation. Therefore we first discuss the advantages and disadvantages of each method.

POD. The POD method has the advantage that it works with the original system. So we expect that the results will be good if our initial value and the input are very similar to the ones which were used to create the subspace. But the fact that the POD method is only reliable if the given input and initial value are very similar to the ones used for training proves to be a disadvantage. This disadvantage gets even worse because there is no practical theorem from which we can estimate whether we can use the actual POD basis for our inputs and initial values. Another disadvantage is that the POD method has to evaluate the original functions. If the evaluation is the most expensive part, and this is the case in circuit simulation, the resulting speed up is small. Also the model extraction time is high because we have to solve the original system, so POD cannot be used for online model reduction.

Pros: good approximation; useful also for large circuits.
Cons: complex model extraction; no online model reduction; small speed up; strong dependency on input/initial value; no approximation estimators.

Table 2.3: POD pros and cons.

EBT. The EBT method has similar advantages and disadvantages to POD, because of the strong relation between both. But the approximation of EBT should be better than that of POD, because a large set of training inputs and initial values is used. This large training set also suggests that the method should behave better for different inputs and initial values than POD. Another fact which should result in better approximations is that EBT takes the input-output behaviour of the circuit into account. But because of the large number of training inputs and initial values, the model extraction process is by far the most expensive of all methods. Due to this the method is only useful for small circuits, and of course it is not useful for online model reduction. Also solving the reduced system is expensive, due to the fact that the original q, j have to be evaluated.


Pros: good approximation; better behaviour for different inputs/initial values.
Cons: most expensive model extraction; no online model reduction; small speed up; only for small circuits (n ≤ 200); no approximation estimators.

Table 2.4: EBT pros and cons.

TPWL. The TPWL method has several advantages. First of all, solving the reduced system is by far cheaper than with the other methods. The fact that we are just solving low dimensional linear systems results in a big speed up. Also the model extraction is cheap if we do it in the way described in 2.4.4, so in fact the method can also be used for online model reduction. Another fact is that there are several linear model reduction techniques (e.g. Krylov or balanced truncation) which have different complexities and different accuracies, so it is possible to use different methods. This means that we can decide between a faster or a more accurate one. It is also possible to exploit a possible sparsity of the system to make the linear model reduction method more effective. Due to the fact that we are just handling linear systems, we can also treat bigger systems than with the other methods. Another fact which makes TPWL attractive is that we can save the linearised reduced systems in a library to reuse the results often. This library is also very easy to extend, because we only have to save the new linearised model. But TPWL also has some disadvantages. The first one is that we are using only linear systems, and they will not be a good approximation if the original system shows highly nonlinear, rapidly changing behaviour. To avoid this we can take more linearisation points. Another issue is that the method consumes more memory, because we have to store all the linearised systems. But we only have to store small linear systems, which do not consume much storage, and nowadays memory is quite cheap. So this is only a minor disadvantage.

Pros: big speed up (up to n); low model extraction time; online model reduction possible; scalable results (speed ←→ quality); useful for big circuits; well suited for a library of circuits; takes sparsity into account.
Cons: more memory consuming; only solving linear systems; less accurate for highly nonlinear systems.

Table 2.5: TPWL pros and cons.

VS. As seen in Section 2.5, VS cannot be used to reduce nonlinear DAEs. This approach is restricted to ODEs.


Result. In the author's opinion the TPWL method is the most useful method, because it is the fastest of all and its disadvantages can be overcome. The method is also, compared to the other ones, easy to specialise, so that we can get even better results if we use the special properties and behaviour of circuits to adapt our method.


Table 2.2: Comparison of the complexity. Rows: Original, POD (steps 1–3), EBT (steps 1–3) and TPWL (steps 1a, 1b, 2, 3); columns: the dominant complexity terms for the three assumptions on the cost of evaluating the Jacobians, O(n^α)/O(n^β), O(n^{α+1})/O(n^{β+1}) and O(n). For the first column (Jacobian cost O(n^α), O(n^β)) the entries read: Original O(pk(2n^β + 2n^α + n^3)); POD 1 O(pk(2n^β + 2n^α + n^3)); POD 2 O(pk(2n^β + 2n^α + n^{5/2})); POD 3 O(pn^{3/2}); EBT 1 O(lspk(2n^{β+1} + 2n^{α+1} + n^4)); EBT 2 O(pk(2n^β + 2n^α + n^{5/2})); EBT 3 O(pn^{3/2}).


Chapter 3

Optimal linearisation points and global subspace

In the next chapters we want to discuss the mathematical and implementation aspects of the TPWL method. In this chapter we show how to select linearisation points in an 'optimal' way. In Chapter 4 we discuss how the weighting between the different models is done. Then we proceed in Chapter 5 with the question which linear model reduction techniques for DAEs can be used in the TPWL approach.

Now we want to discuss a technique to select our linearisation tuples (x_i, t_i) (LT) in an 'optimal' way. For information on LTs see 2.4.1. This means that we want to select as few LTs as possible so that the error ||x − x̃|| / ||x||, where x is the original solution and x̃ := Py is the approximated solution, is smaller than a given bound δ_x > 0. We discuss this approach for two different circumstances. The first approach assumes that we know the 'exact' solution of the system, for example solved by a BDF method. With this knowledge we try to estimate the error of the TPWL model while we are constructing it. The second approach assumes only the knowledge of an approximated solution. This idea is the same as in [2].

We must note that we cannot estimate the error of the global TPWL model, because when we select LT i we have no idea what the LTs j with j > i will look like, and therefore we do not know their influence on the global basis. So we can only estimate the error of the actual local linearised system, ||x − x̃|| / ||x|| with x̃ = P_i y. Because the global subspace is constructed as a union of all local subspaces, we assume that the local error is similar to the global error. This assumption holds in practice because the local subspaces are normally quite small compared to the global subspace, so the global subspace holds much information of the local subspaces.

3.1 Knowledge of the exact solution

For this approach we have knowledge of the 'exact' solution x of our system in a given time interval T = [0, b]. For example, it can be obtained as the result of a BDF/NDF method. What we want to construct is an estimator of the error of the local linearised reduced model. The reason why we only estimate the local and not the global error is that we do not know which LTs the controller will select in the future. So we want to estimate ||x − x̃|| / ||x||, where x̃ := P_i y is the approximated solution. Because we can calculate ||x||, it is enough to estimate

||δx|| := ||x − x̃|| = ||x − P_i y||,


where y ∈ R^r is the solution of the local linearised reduced system and P_i spans the related local subspace. This estimation can be done in two different ways. To describe both we take a look at Figure 3.1 on page 44.

nonlinear DAE of dimension n → (linearisation error) → linear DAE of dimension n → (reduction error) → linear reduced DAE of dimension r

Figure 3.1: The two basic TPWL steps.

In this picture the procedure for creating the linearised reduced model is described. As we can see, there are two errors: one caused by the linearisation process and the other by the reduction process. If we use a linear model reduction technique which has an error bound, we can estimate our error as the sum of the linearisation error and the reduction error. If we use a model reduction technique which has no error estimation, for example Krylov methods, we cannot estimate the error in the reduction step, so we have to cover the error of both steps in one estimator.

3.1.1 Error estimation for a linear model reduction technique which has no error estimator

The problem of using model reduction techniques which have no error bound is that we have to create an 'error estimator' which covers both the linearisation error and the reduction error. We do this by subtracting the original and the local linearised reduced model from each other. To do this we first have a look at the original and the i-th reduced linearised system. The original system is given as

d/dt q(t, x) + j(t, x) + Bu(t) = 0.    (3.1)

Under the assumption that ∂/∂t q(t, x) = 0 for all t ∈ R, x ∈ R^n we get the resulting system

C(t, x) ẋ + j(t, x) + Bu(t) = 0.    (3.2)

In the special case of circuit simulation our C(t, x) is time independent, so we denote it as C(x). The local i-th linearised reduced system is

P_i^T C_i P_i ẏ + P_i^T G_i P_i y + P_i^T B̃ ũ(t) = 0.    (3.3)

Our goal is to subtract both systems to get an estimator of the error, but because (3.3) is a system of dimension r, we have to project the system back to the original solution space by multiplying with P_i. So we get

P_i P_i^T C_i dx̃/dt + P_i P_i^T G_i x̃ + P_i P_i^T B̃ ũ(t) = 0.    (3.4)


Now we can subtract (3.4) from (3.2) and get

C(x) ẋ − P_i P_i^T C_i dx̃/dt + j(t, x) − P_i P_i^T G_i x̃ + Bu(t) − P_i P_i^T B̃ ũ(t) = 0,    (3.5)

where part (1) is C(x) ẋ − P_i P_i^T C_i dx̃/dt and part (2) is j(t, x) − P_i P_i^T G_i x̃. These two parts require a closer look.

1. In this part we have the problem that the reduced and original solution are multiplied with different singular matrices; remember we are handling DAEs. So we write this in a different way:

   C(x) ẋ − P_i P_i^T C_i dx̃/dt = C(x) ẋ − P_i P_i^T C_i dx̃/dt + P_i P_i^T C_i ẋ − P_i P_i^T C_i ẋ
     = P_i P_i^T C_i (ẋ − dx̃/dt) + (C(x) − P_i P_i^T C_i) ẋ
     = P_i P_i^T C_i d/dt δx + (C(x) − P_i P_i^T C_i) ẋ.

   Now we have a proper formulation of this part. Of course we do not know ẋ, but it can be approximated through a numerical differentiation of our known original solution, for example by interpolating the solution via a spline and then using a difference formula for calculating the approximate derivative.

2. The problem of this term is that x̃ is unknown, so we extend

   −P_i P_i^T G_i x̃ = −P_i P_i^T G_i x̃ + P_i P_i^T G_i x − P_i P_i^T G_i x = P_i P_i^T G_i δx − P_i P_i^T G_i x.

So finally we get

P_i P_i^T C_i d/dt δx + (C(x) − P_i P_i^T C_i) ẋ + j(t, x) + P_i P_i^T G_i δx − P_i P_i^T G_i x + Bu(t) − P_i P_i^T B̃ ũ(t) = 0.

Now we introduce k(t, x), which holds all known parts:

k(t, x) := (C(x) − P_i P_i^T C_i) ẋ + j(t, x) − P_i P_i^T G_i x + Bu(t) − P_i P_i^T B̃ ũ(t).

With this we get

P_i P_i^T C_i d/dt δx + P_i P_i^T G_i δx + k(t, x) = 0.    (3.6)

At this point the problem arises that we have a DAE with an irregular pencil, because the matrix P_i P_i^T is singular with rank(P_i P_i^T) = r. One possibility to overcome this problem is to multiply the system again with P_i^T, remember P_i^T P_i = I_r, and then use the backward Euler method to 'solve' the system, under the assumption that we know δx_{j−1}, the error at timepoint t_{j−1}. So we get

P_i^T C_i (1/h)(δx_j − δx_{j−1}) + P_i^T G_i δx_j + P_i^T k(t_j, x_j) = 0
((1/h) P_i^T C_i + P_i^T G_i) δx_j = (1/h) P_i^T C_i δx_{j−1} − P_i^T k(t_j, x_j).

This is a system with n unknowns but only r equations, so we have to solve it with a least squares method to get at least a feeling for how big the error is. So iteratively we can compute a rough estimate of the error.
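One such iteration step can be sketched directly with numpy's least-squares solver; all matrices below are random stand-ins for the projected circuit matrices:

```python
import numpy as np

def error_step(dx_prev, PtC, PtG, Ptk, h):
    """One backward Euler step of (3.6) after multiplying with P_i^T:
    (P^T C / h + P^T G) dx_j = (P^T C / h) dx_{j-1} - P^T k(t_j, x_j).
    r equations, n unknowns: np.linalg.lstsq returns the minimum-norm
    solution, i.e. the one minimising the residual."""
    A = PtC / h + PtG                 # r x n
    rhs = PtC @ dx_prev / h - Ptk     # length r
    dx, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return dx                         # rough error estimate in R^n

# stand-in data
rng = np.random.default_rng(2)
n, r, h = 6, 2, 1e-2
P = np.linalg.qr(rng.standard_normal((n, r)))[0]
C, G = rng.standard_normal((n, n)), rng.standard_normal((n, n))
k = rng.standard_normal(n)
dx = error_step(np.zeros(n), P.T @ C, P.T @ G, P.T @ k, h)
print(dx.shape)  # (6,)
```

As the text warns, this only pins down one element of a whole affine subspace of admissible errors, so the true error can be larger.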


Conclusion. This approach gives an error estimation which can be computed while solving our system for the 'exact' solution. In fact most parts are constant for the actual LT or are evaluated anyway by the integration method (BDF), so the complexity of updating the error is relatively small. But it must also be clear that the given error estimator can be quite bad, because there is a whole subspace which fulfils (3.6) and we only compute the solution which minimises the residual. So in fact the error can be bigger.

3.1.2 Error estimation for a linear model reduction technique which has an error estimator

In this section we describe how to construct an error bound for the TPWL system under the assumption that we have a bound for our reduction step. So we only have to construct an error bound for the linearisation error. This bound can be easily calculated through the properties of a Taylor expansion.

Linearisation error. To estimate the error of the linearisation step we have to look at the properties of an order 1 Taylor expansion. The Taylor expansion of f : R^n → R^n with f ∈ C^2 around a ∈ R^n is given by

f(x) = f(a) + ∂f/∂x(a)(x − a) + (1/2) ∂²f/∂x²(ξ)(x − a) ⊗ (x − a),

where ξ ∈ B(a, ε) as long as ||x − a|| ≤ ε. ∂f/∂x(a) is called the Jacobian of f and ∂²f/∂x²(ξ) is the Hessian; ⊗ is the Kronecker product. The linearised representation of f is given as

f̃(x) = f(a) + ∂f/∂x(a)(x − a).

With this we can create an error bound for the Taylor expansion:

||f(x) − f̃(x)|| ≤ (1/2) sup_{x ∈ B(a,ε)} ||∂²f/∂x²(x)|| · ||x − a||².

Now we adapt this to our system. The Taylor representation of j around a = x_0 then is

j(t, x) = j(t_0, x_0) + G(t_0, x_0)(x − x_0) + ∂j/∂t(t_0, x_0)(t − t_0) + (1/2) ∂²j/∂t²(ξ_t, ξ_x)(t − t_0)² + ∂²j/∂t∂x(ξ_t, ξ_x)(x − x_0)(t − t_0) + (1/2) ∂²j/∂x²(ξ_t, ξ_x)(x − x_0) ⊗ (x − x_0)

and the linearised representation is

j̃(t, x) = j(t_0, x_0) + G(t_0, x_0)(x − x_0) + ∂j/∂t(t_0, x_0)(t − t_0).

If we now assume that ∂²j/∂t²(t, x) and ∂²j/∂t∂x(t, x) are 0 for all x ∈ R^n, t ∈ R, then we get¹

||j(t, x) − j̃(t, x)|| ≤ (1/2) sup_{x ∈ B(x_0,ε_x), t ∈ [t_0, t_0+ε_t]} ||∂²j/∂x²(t, x)|| · ||x − x_0||².

¹ This assumption holds for most circuits, because we have no sources in j.


Now we take a look at d/dt q(t, x). We can write this as

d/dt q(t, x) = C(x) ẋ

under the assumption that ∂/∂t q(t, x) = 0 and q ∈ C². The Taylor representation of C is given as

C(x) = C(x_0) + ∂C/∂x(ξ_x)(x − x_0) = C(x_0) + ∂²q/∂x²(ξ_t, ξ_x)(x − x_0)

and the constant representation of C(x) is

C̃(t, x) = C(t_0, x_0),

with ξ_t ∈ [t_0, t_0 + ε_t] and ξ_x ∈ B(x, ε_x). With this we can see that the error between the constant representation and the original equation is

||C(x) − C̃(x)|| ≤ sup_{x ∈ B(x_0,ε_x), t ∈ [t_0, t_0+ε_t]} ||∂²q/∂x²(t, x)|| · ||x − x_0||.

With these results we have an upper bound on the linearisation error ε_l of our system:

ε_l := ||(C(x) − C̃(x)) ẋ + j(t, x) − j̃(t, x)||
    ≤ sup_{x ∈ B(x_0,ε_x), t ∈ [t_0, t_0+ε_t]} ||∂²q/∂x²(t, x)|| · ||x − x_0|| · ||ẋ||    (3.7)
    + (1/2) sup_{x ∈ B(x_0,ε_x), t ∈ [t_0, t_0+ε_t]} ||∂²j/∂x²(t, x)|| · ||x − x_0||².

Resulting error estimator. With the knowledge of the estimator for the linearisation error ε_l and an estimator for the reduction error ε_r, which is given by the used model reduction technique, we can calculate the needed error bound δx_m. This bound is given as

||δx_m|| := ε_l + ε_r

and describes the error of our local reduced linearised system with respect to time point m.

3.1.3 Using the error estimation for selecting the optimal linearisation points

In this section we describe an algorithm which selects optimal linearisation points for our TPWL model. This algorithm can easily be used during a BDF/NDF method which solves the original system, and because the error estimation reuses results from the BDF/NDF method, the complexity of calculating this estimator is quite moderate. Some implementation details are described in one of the following sections. Algorithm 1 describes the procedure.


Algorithm 1 Linearisation point controller
1. Set x_{l0} = x_0, t_{l0} = t_0, m = 0, i = 0 and δ_x > 0.
2. While t_m < b:
3. Create a linearised reduced model at LT (x_{li}, t_{li}), see Section 2.4. Set ||δx_m|| := ||x_m − P_i P_i^T x_m||.
4. Set m = m + 1. Calculate x_m at time point t_m via a BDF method.
5. Calculate the error estimation ||δx_m|| for the i-th linearised reduced model of the actual solution x_m.
   (a) If ||δx_m|| / ||x_m|| < δ_x, go to step (4).
   (b) If ||δx_m|| / ||x_m|| ≥ δ_x, set i = i + 1, x_{li} = x_{m−1}, t_{li} = t_{m−1} and go to step (3).

Algorithm 2 Selecting LTs depending on the distance between original and reduced system
1. Set x_{l0} = x_0, t_{l0} = t_0, m = 0, i = 0 and δ_x > 0.
2. While t_m < b:
3. Calculate P_i for the LT (x_{li}, t_{li}) and create the linearised reduced model, see Section 2.4. Set y_m = P_i^T x_m and ||δx_m|| := ||x_m − P_i P_i^T x_m||.
4. Set m = m + 1. Calculate x_m at time point t_m via a BDF method. Calculate with the same stepsize as the BDF the solution y_m of the linearised reduced model.
5. Calculate the error ||δx_m|| := ||x_m − P_i y_m|| for the i-th linearised reduced model of the actual solution x_m.
   (a) If ||δx_m|| / ||x_m|| < δ_x, go to step (4).
   (b) If ||δx_m|| / ||x_m|| ≥ δ_x, set i = i + 1, x_{li} = x_{m−1}, t_{li} = t_{m−1} and go to step (3).
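The control flow of Algorithm 2 can be sketched as a short loop; `basis_for` and `solve_reduced` below are stand-ins for the real model reduction and reduced-system integration:

```python
import numpy as np

def select_lts(xs, ts, basis_for, solve_reduced, delta_x):
    """Sketch of Algorithm 2: walk along the 'exact' BDF solution xs[m] at
    times ts[m]; keep the current LT while ||x_m - P_i y_m|| / ||x_m|| < delta_x,
    otherwise place a new LT at the previous timepoint.
    basis_for(x, t) -> local basis P_i (n x r);
    solve_reduced(P, y, t0, t1) -> reduced solution advanced from t0 to t1."""
    lts = [(xs[0], ts[0])]
    P = basis_for(xs[0], ts[0])
    y = P.T @ xs[0]
    for m in range(1, len(xs)):
        y = solve_reduced(P, y, ts[m - 1], ts[m])
        err = np.linalg.norm(xs[m] - P @ y) / np.linalg.norm(xs[m])
        if err >= delta_x:
            lts.append((xs[m - 1], ts[m - 1]))      # new LT at previous step
            P = basis_for(xs[m - 1], ts[m - 1])
            y = P.T @ xs[m - 1]
    return lts

# trivial check: a constant trajectory never triggers a new LT
xs = [np.ones(3) for _ in range(5)]
ts = list(range(5))
lts = select_lts(xs, ts,
                 basis_for=lambda x, t: np.eye(3),
                 solve_reduced=lambda P, y, t0, t1: y,
                 delta_x=0.1)
print(len(lts))  # 1
```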

3.1.4 Solving the reduced system for selecting the LTs

In this approach we compute the error between the reduced and original system directly. The idea is based on the fact that solving the local reduced system is relatively cheap, because it is of dimension r ≪ n. The idea can be seen in Algorithm 2. The advantage of this approach is that we calculate the 'exact' error between the original and reduced system, and therefore select fewer LTs than with the other approaches. One might expect this approach to be quite expensive compared to the other approaches, but in practice the complexity of both turned out to be similar.

Further improvements. In practice we have encountered two minor problems with this approach. The first problem is that the error ||x − x̃|| / ||x|| may result in too few LTs. The reason for this seems to be that the error bound is


not sensitive enough. If the states have a large constant part the error does not grow fast, because the approximation of this part is quite good, but the error in the fast changing part can still get big. One idea to overcome this problem is to construct a selective norm. This norm only takes into account the parts which are not constant in the exact solution. So in fact we scale each element with a factor λ_i ∈ {0, 1}, with λ_i = 0 if |x_{m,i} − x_{m−1,i}| ≤ ε, 0 < ε ≪ 1, else λ_i = 1. With this scaled norm we only take the varying parts into account and so overcome this problem. Of course we are not using only this norm; in fact we use it only as a backup to also guarantee a good approximation in the fast changing states.

The second problem is that sometimes the last LT is not an optimal LT, because the resulting local reduced system is so bad that the next point of the 'exact' solution also has to become a LT. This constructs a TPWL model which has too many LTs and can also be a bad approximation of the original system. Therefore we introduce a restarted method. If we detect that LT n at timepoint t_{m_n} is not a good choice, meaning that the distance between LT n and n + 1 is too small, we delete LT n and n + 1, set the n-th LT to m_n − 1 and start again. This procedure is repeated until we have a good LT, i.e. the distance between LT n and n + 1 is big enough, or until we reach LT n − 1. If the latter happens we have several choices: one option is to restart the BDF method at this point with a smaller stepsize, or we can delete LT n − 1 and go further back, or of course we can cancel the procedure and give an error message that we cannot construct a TPWL model with the given setting.

We want to describe this procedure on an example, see Figure 3.2 on page 50. In this picture we have 4 old LTs x_i, i = 0, ..., 3, and a new one x_4. But we also see that the distance between the last LT, x_3, and the new LT, x_4, is quite small. This means that the LT x_3 was not a good choice for an LT. So we erase x_3 and x_4 and select a new LT which lies between x_2 and x_3. In practice we choose the new LT as x_{n−1} if x_3 = x_n, where n is the time step in the BDF solution. If the new LT has better properties than the old one, so that the distance between x_{3,new} and x_{4,new} is large enough, we go further. If the new LT x_{3,new} again is not an optimal choice for a LT, we again step one step back. This back stepping is done until we have a good LT or we have reached x_2. If we reach x_2 we have several options: for example restart the BDF with a higher accuracy or smaller time step, or just create an error message and stop the model extraction. If we use both improvements we get more accurate TPWL models and can also reduce the number of LTs.
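The selective norm described above is a one-liner; a minimal sketch (the function name is our own, and the mask keeps only the components that vary between two timepoints, as the text intends):

```python
import numpy as np

def selective_norm(v, x_m, x_m1, eps):
    """Scaled norm: components of the state that are numerically constant
    between two timepoints, |x_m[i] - x_m1[i]| <= eps, get factor lambda_i = 0,
    varying components get lambda_i = 1; only the varying parts contribute."""
    lam = (np.abs(x_m - x_m1) > eps).astype(float)
    return np.linalg.norm(lam * v)

# component 0 is constant (masked out), component 1 varies
v = np.array([3.0, 4.0])
x_m = np.array([1.0, 5.0])
x_m1 = np.array([1.0, 0.0])
print(selective_norm(v, x_m, x_m1, eps=1e-3))  # 4.0
```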

3.1.5 Implementation aspects

In this section we discuss some implementation aspects of the different error estimators in our linearisation point controller. Our main goal is to reuse as many results of the BDF method as possible, to make the complexity of the linearisation point controller negligible with respect to the complexity of the BDF method.

Calculating the supremum of the norm of the Hessian. As we can see in (3.7) we need an estimator for the supremum of the norm of a Hessian. First we think about how we can estimate the norm of the Hessian at a specific time and state point. Because we do not have an analytic expression of the Hessians of our functions, we have to make a numerical approximation. For this we have two possibilities:


Figure 3.2: Back stepping example

||∂²f/∂x²(x)|| = ||f(x − Δx) − 2 f(x) + f(x + Δx)|| / ||Δx||² + O(||Δx||³)

||∂²f/∂x²(x)|| = ||∂f/∂x(x + Δx) − ∂f/∂x(x − Δx)|| / (2 ||Δx||) + O(||Δx||³)

where Δx ∈ R^n is a perturbation in the space domain. For a practical implementation a good choice for Δx would be

Δx = (1/c)(x_m − x_{m−1}),

where x_i is the i-th solution of our simulation and c ≥ 1. With this we have two different estimators for the norm of the Hessian. Because in our case we know the Jacobians of j and q, we can choose the estimator depending on which has the lowest complexity in the calculation, which normally depends on the sparsity of C and G. The second step is to think about how to calculate the supremum of the norm of the Hessians. Because we cannot compute the norm of the Hessian at all time and state points, one idea is to use an updated maximum along the actual transient. Algorithm 3 describes how to calculate an estimation of the supremum of the Hessian of j. The estimated suprema will be denoted as sup_j and sup_q for j and q, respectively. This algorithm should be a good and also fast estimation of the supremum. One idea to reduce the complexity of the estimation is to reuse the last Jacobian of the Newton process, which is done in the BDF method, for G(t_{i+m}, x_{i+m} − Δx_{i+m}). From this step we can also save the last result of the linear system in the Newton process and use it as Δx_{i+m}. Then we only have to calculate the norms and G(t_{i+m}, x_{i+m} + Δx_{i+m}).
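Both difference estimators are straightforward to code; a sketch with hypothetical callables `f` and `jac` for the function and its Jacobian:

```python
import numpy as np

def hess_norm_fd(f, x, dx):
    """Estimate ||d^2 f / dx^2 (x)|| from function values (second difference)."""
    return np.linalg.norm(f(x - dx) - 2 * f(x) + f(x + dx)) / np.linalg.norm(dx) ** 2

def hess_norm_jac(jac, x, dx):
    """Estimate ||d^2 f / dx^2 (x)|| from Jacobian values (first difference);
    for matrix-valued Jacobians np.linalg.norm gives the Frobenius norm."""
    return np.linalg.norm(jac(x + dx) - jac(x - dx)) / (2 * np.linalg.norm(dx))

# check on f(x) = x^2 (componentwise), whose second derivative has norm 2
f = lambda x: x ** 2
jac = lambda x: 2 * x
x, dx = np.array([1.0]), np.array([1e-3])
print(hess_norm_fd(f, x, dx), hess_norm_jac(jac, x, dx))  # both close to 2.0
```

Which of the two is cheaper depends, as noted above, on the sparsity of C and G and on whether the Jacobians are already available.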


Algorithm 3 Numerical estimation of sup_{x ∈ B(x_0,ε_x), t ∈ [t_0, t_0+ε_t]} ||∂²j(t, x)/∂x²||
1. Set sup_j = ||G(t_i, x_i + Δx_i) − G(t_i, x_i − Δx_i)|| / ||Δx_i||, m = 1 and c > 1.
2. Calculate the next tuple (x_{i+m}, t_{i+m}) and Δx_{i+m} = (1/c)(x_{i+m} − x_{i+m−1}).
3. Set sup_j = max{ sup_j, ||G(t_{i+m}, x_{i+m} + Δx_{i+m}) − G(t_{i+m}, x_{i+m} − Δx_{i+m})|| / ||Δx_{i+m}|| }.
4. If a new linearisation point is set, go to step (1); else go to step (2).
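The update in step 3 of Algorithm 3 can be sketched as a small helper; `G` is a stand-in callable for the Jacobian of j:

```python
import numpy as np

def update_sup_hessian(sup_j, G, t, x, x_prev, c=2.0):
    """One step of Algorithm 3: refresh the running estimate of
    sup ||d^2 j / dx^2|| with a difference quotient of the Jacobian G
    along the transient, using dx = (x - x_prev) / c."""
    dx = (x - x_prev) / c
    quot = np.linalg.norm(G(t, x + dx) - G(t, x - dx)) / np.linalg.norm(dx)
    return max(sup_j, quot)

# check on j(x) = x^2 (componentwise), G(t, x) = 2x: the quotient is 4 = 2 * 2
sup_j = update_sup_hessian(0.0, lambda t, x: 2 * x, 0.0,
                           np.array([1.0]), np.array([0.9]))
print(sup_j)  # 4.0
```

In the BDF setting, `G(t, x - dx)` and `dx` would be recycled from the last Newton iteration, as described in the text.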

3.1.6 Conclusion

In our tests we got the impression that the LT controllers which are based on an error estimator select too many LTs and therefore construct systems which are too big. And because we build up our global basis via an SVD, we can get problems such as the information of one local basis having no influence on the global basis because other bases are more dominant. This can lead to large errors. The approach in which we calculate the solution of the local reduced system has the advantage that the number of LTs is relatively small, so we do not have as many problems in merging them to construct a global basis. So in general this approach yields, with the same settings as the other methods, a smaller TPWL model which also has a smaller error.

3.2 Knowledge of the approximated solution

In this approach we do not have knowledge of the exact solution, but we still want to estimate the error of the resulting TPWL model compared to the exact solution. We only describe some ideas how this method can be used, but do not give examples how it works in practice, because for good quality models of a non-smooth and fast changing system this idea is not useful. So the responsibility of handling practical issues is left to the user.

3.2.1 General idea

We want to estimate the error of a linearised reduced model with respect to the full nonlinear system while only knowing the reduced solution. This error estimate can be used in an LT controller which solves only the local reduced systems. That gives a huge speed-up in the model extraction step, so that it becomes possible to use the TPWL method as an 'online' model reduction. Our original system is given as in (3.1), the corresponding i-th local linear reduced system as in (3.3). As in Section 3.1 we subtract both systems, after multiplying the reduced system by P_i. The difference between both is given as, see (3.5),

    d/dt( q(t,x) − P_i P_i^T C_i x̃ ) + j(t,x) − P_i P_i^T G_i x̃ + B u(t) − P_i P_i^T B̃ u(t) = 0,

where (1) denotes the differential term d/dt( q(t,x) − P_i P_i^T C_i x̃ ) and (2) the term j(t,x) − P_i P_i^T G_i x̃. We take a closer look at both parts.

1. Here we cannot evaluate d/dt q(t,x) = C(x)ẋ directly, so we have to


make some calculations:

    d/dt( q(t,x) − P_i P_i^T C_i x̃ )
        = d/dt( q(t,x) − q(t,x̃) + q(t,x̃) − P_i P_i^T C_i x̃ )
        = C(x)ẋ − C(x̃)(d/dt x̃) + ( C(x̃) − P_i P_i^T C_i )(d/dt x̃).

    We still cannot compute C(x), but because we only need its norm, which is very small in the difference term, C(x̃) is a good approximation. With this we get

    C(x)ẋ − C(x̃)(d/dt x̃) + ( C(x̃) − P_i P_i^T C_i )(d/dt x̃) ≈ C(x̃)(d/dt δx) + ( C(x̃) − P_i P_i^T C_i )(d/dt x̃).

2. Here we cannot evaluate j(t,x), so we again add a zero:

    j(t,x) − P_i P_i^T G_i x̃ = j(t,x) − j(t,x̃) + j(t,x̃) − P_i P_i^T G_i x̃.

    Additionally we have to assume that j(t,·) is Lipschitz continuous, so that

    ||j(t,x) − j(t,x̃)|| ≤ L_j ||x − x̃||   for all x, x̃ ∈ ℝ^n.    (3.8)

With all this we end up with

    C(x̃)(d/dt δx) + j(t,x) − j(t,x̃) + k(t,x̃) = 0    (3.9)

with

    k(t,x̃) := ( C(x̃) − P_i P_i^T C_i )(d/dt x̃) + j(t,x̃) − P_i P_i^T G_i x̃ + B u(t) − P_i P_i^T B̃ u(t).

Because we need the norm of δx, we apply a norm to (3.9) and then use the triangle inequality together with property (3.8). So we get

    || C(x̃)(d/dt δx) || ≤ L_j ||δx|| + ||k(t,x̃)||.    (3.10)

In this inequality the derivative of δx is still multiplied by C(x̃). We split the product into two separate norms, which makes the left-hand side larger, and assume that it still grows more slowly than the right-hand side, so that the inequality continues to hold. The inequality then changes to

    ||C(x̃)|| d/dt ||δx|| ≤ L_j ||δx|| + ||k(t,x̃)||.

Under the assumption that we know ||δx_i|| we can calculate ||δx_{i+1}|| using the backward Euler method. This leads to an error estimate which can be computed iteratively:

    ||C(x̃)|| (1/h_i)( ||δx_{i+1}|| − ||δx_i|| ) ≤ L_j ||δx_{i+1}|| + ||k(t,x̃)||

    ||δx_{i+1}|| ≤ ( h_i ||k(t,x̃)|| + ||C(x̃)|| ||δx_i|| ) / ( ||C(x̃)|| − h_i L_j ).

With this estimate we can compute an error bound while solving the local linearised system. Of course this is only valid if the assumption that the left-hand side of (3.10) does not grow faster than the right-hand side holds, so we have to be careful with this error estimation.

c Koninklijke Philips Electronics N.V. 2005 

Unclassified Report

PR-TN-2005/00919

3.2.2 Using the error estimation for selecting optimal linearisation points

Now we describe an algorithm which selects the linearisation points for our TPWL model using the given error estimator. This algorithm can be used while we are solving the local reduced linear systems. But of course we have to be really careful, because the solution of the local reduced systems can be bad if the original system is not smooth and/or fast varying.

Algorithm 4 Linearisation point controller with approximated solution

1. Set x_{l0} = x_0, t_{l0} = t_0, m = 0, i = 0 and a tolerance δ_x > 0
2. Repeat steps (3)-(5) while t_m < b
3. Create a linearised reduced model at LT (x_{li}, t_{li}), see Section 2.4. Set y_m = P_i^T x_m and ||δx_m|| := ||x_m − P_i P_i^T x_m||
4. Set m = m + 1. Calculate y_m at time point t_m
5. Calculate the error estimate ||δx_m|| of the actual solution y_m for the i-th linearised reduced model:
   (a) if ||δx_m|| / ||P y_m|| < δ_x, go to step (4)
   (b) if ||δx_m|| / ||P y_m|| ≥ δ_x, set i = i + 1, x_{li} = x_{m−1}, t_{li} = t_{m−1} and go to step (3)
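The control flow of Algorithm 4 can be sketched as follows. All three callbacks (`new_model`, `solve_step`, `err_est`) are hypothetical placeholders for the machinery of Sections 2.4 and 3.2, not part of any real API.

```python
import numpy as np

def lt_controller(x0, t_grid, P0, new_model, solve_step, err_est, tol):
    """Sketch of Algorithm 4: advance the local reduced model and spawn a new
    linearisation tuple (LT) whenever the relative error estimate exceeds tol.

    new_model(x, t)     -> projection basis P for a model linearised at (x, t)
    solve_step(P, y, t) -> reduced state at the next time point
    err_est(P, y, t)    -> estimate of ||dx|| for the current local model"""
    lts = [(x0, t_grid[0])]
    P = P0
    y = P.T @ x0
    ys = [y]
    for m in range(1, len(t_grid)):
        y_new = solve_step(P, y, t_grid[m])
        if err_est(P, y_new, t_grid[m]) / np.linalg.norm(P @ y_new) >= tol:
            # error too large: relinearise at the previous accepted point
            lts.append((P @ y, t_grid[m - 1]))
            P = new_model(P @ y, t_grid[m - 1])
            y = P.T @ (P @ y)
            y_new = solve_step(P, y, t_grid[m])
        y = y_new
        ys.append(y)
    return lts, ys

# trivial demo with dummy callbacks (no relinearisation triggered)
P0 = np.eye(3)[:, :2]
lts, ys = lt_controller(np.ones(3), [0.0, 0.1, 0.2], P0,
                        new_model=lambda x, t: P0,
                        solve_step=lambda P, y, t: 0.9 * y,
                        err_est=lambda P, y, t: 0.0,
                        tol=1e-2)
print(len(lts), len(ys))
```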

With Algorithm 4 as LT controller we can extract our models very fast, and therefore we can use this approach as an online model reduction process. The solution y of this algorithm can even be used as a solution of the original circuit, but we have to be really careful in doing so.

Remark 3.1. One idea to get a better feeling for the error of the LT controller is to solve the local linearised and the local reduced linearised system at the same time. The distance between both solutions gives an impression of the model reduction error, and if we can also compute an estimate of the linearisation error, we get a relatively good feeling for the local error of the approximated solution. But as already mentioned, computing the linearisation error is quite expensive, because we have to compute the suprema of the norms of the Hessians. If we cannot compute them cheaply or know them in advance, this approach is fairly useless because the resulting complexity gets too high.

3.3 Creating the global subspace

In this section we discuss how to create the global subspace which we need to reduce our local linearised systems to the desired dimension. We need a global subspace because we want to combine the local linearised systems. Without it, combining them could be problematic: if we used, for example, a Krylov technique to reduce the local linearised systems, the local subspaces could have different dimensions because of the so-called happy breakdown. A happy breakdown means that every new column which would be added to the subspace is linearly dependent on the old ones, i.e. the current subspace is already rich enough. With local subspaces of different sizes we cannot combine the local reduced systems via a weighting. We present two different approaches to create the global subspace.


Remark 3.2. If we use only the subspace of the first linearised system to reduce all other systems, we do not have to do any additional computations. But the result can be bad, because the first subspace may be totally different from the subspaces of the later linearised systems, so we have to be very careful with this approach. Another point to note is that in practice the dimension of the local subspaces is relatively small compared to the dimension of the global subspace: for example, if the local dimension is 10, then the global one is about 50.

3.3.1 Creating the global subspace by using the full local subspaces

In this approach we create the global subspace in a very simple way, but in practice it turns out that this can cause problems, which we also discuss. Through the LT controller we get several linear systems with several different subspaces: LT i has its own related subspace S_i which is spanned by P_i. To merge them we can use Algorithm 5.

Algorithm 5 Creating the global subspace with the full local subspaces

Given are s subspaces S_i spanned by P_i, i = 0, ..., s−1. The i-th subspace is related to the i-th linearised system around the LT (t_i, x_i).

1. Create P̃ := ( P_0 | x_0 | ... | P_{s−1} | x_{s−1} )
2. Calculate the singular value decomposition P̃ = U S V^T with U ∈ ℝ^{n×n}
3. Create the global subspace S spanned by P:
   (a) P := (u_1 ... u_r) with r the largest index such that S_{r,r} ≥ δ_r, where δ_r defines the accuracy of the subspace, or
   (b) P := (u_1 ... u_r) with r the desired dimension of the reduced system

As we can see, we first build up a matrix which holds all local projection matrices together with the related linearisation states. The reason why we also include x_i in this matrix is that it reflects the subspace in which the original solution lies, which gives better results; this idea is related to POD, see Section 2.2. Then we compute a singular value decomposition of the merged projection matrices. By the properties of the SVD, the columns of U span the image of P̃ and are ordered by their importance. We also have a measure of the importance of u_i, given by the singular value S_{i,i}: if S_{i,i} is large, the related column of U is important and we should not neglect it, but if S_{i,i} is very small we can neglect u_i and still have a good approximation. This idea is reflected in step 3(a), where we select all columns of U whose singular value S_{i,i} is larger than a given bound δ_r. The bound then controls how accurate our reduced system is and of course also how big it is. But because we do not know the singular values in advance, the reduction might not be as large as expected. With step 3(b) we try to overcome the problem that we do not know the size of the reduced system in advance, because we just take the first r columns of U. So we can guarantee that our reduced system has the desired size (it could even be smaller if the singular values are small enough). But we also have to be careful if the truncated singular values are still large, because then we can


truncate some columns which are important for our global matrix. This approach has a further problem. Because the local subspaces can have different sizes, we may have a few very small subspaces while the rest are, compared to them, really big. Then the SVD can put the information of the small subspaces into the last few columns, which for the algorithm means they are unimportant, although of course they are not. This problem especially occurs if the system has one period where it is very smooth and another where it changes really fast. In the smooth part we need only a few LTs with small subspaces to cover the behaviour, but in the fast changing part we need a lot of LTs with quite big subspaces, so the influence of this part dominates P̃ and therefore also the SVD.
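Algorithm 5 can be sketched as follows; the two local bases and states are random illustrative data, and `merge_subspaces` is a hypothetical helper name.

```python
import numpy as np

def merge_subspaces(Ps, xs, tol=None, r=None):
    """Sketch of Algorithm 5: stack the local bases P_i and linearisation
    states x_i, take an SVD, and keep the leading left singular vectors.
    Keep all directions with singular value >= tol (step 3a) or a fixed
    number r of columns (step 3b)."""
    P_tilde = np.hstack([np.column_stack([P, x]) for P, x in zip(Ps, xs)])
    U, S, _ = np.linalg.svd(P_tilde, full_matrices=False)
    if r is None:                      # step 3(a): accuracy-based truncation
        r = int(np.sum(S >= tol))
    return U[:, :r]

# two hypothetical 4-dimensional local bases of rank 2
rng = np.random.default_rng(0)
Ps = [np.linalg.qr(rng.standard_normal((4, 2)))[0] for _ in range(2)]
xs = [rng.standard_normal(4) for _ in range(2)]
P = merge_subspaces(Ps, xs, r=3)
print(P.shape)   # (4, 3)
```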

3.3.2 Creating the global subspace by using parts of local subspaces

Now we discuss a possible solution to the problem that the influence of the fast changing part on the global subspace is too big. To understand the approach we first have to recall what the columns of each local projection matrix represent. In the Krylov approach each column represents an approximation of a moment we want to match, and because we approximate the moments via a truncated Neumann series, see Chapter 5, the importance of the columns is related to their position in P_i: the n-th column p_{i,n} contains more information about the spanned subspace than p_{i,m} for all m > n. Hence if we neglect the last column we of course increase the reduction error, but this error is small compared to the error we would get by neglecting one of the first columns. With this knowledge we can formulate an algorithm which solves the problem of differently sized local subspaces when creating the global subspace, see Algorithm 6.

Algorithm 6 Creating the global subspace by using parts of local subspaces

Given are s subspaces S_i spanned by P_i, i = 0, ..., s−1. The i-th subspace is related to the i-th linearised system around the LT (t_i, x_i).

1. Set m = min_{i=0,...,s−1} dim(S_i)
2. Create P̃ := ( p_{0,1} ... p_{0,m} | x_0 | ... | p_{s−1,1} ... p_{s−1,m} | x_{s−1} )
3. Calculate the singular value decomposition P̃ = U S V^T with U ∈ ℝ^{n×n}
4. Create the global subspace S spanned by P:
   (a) P := (u_1 ... u_r) with r the largest index such that S_{r,r} ≥ δ_r, where δ_r defines the accuracy of the subspace, or
   (b) P := (u_1 ... u_r) with r the desired dimension of the reduced system

With this approach we overcome the problem of subspaces of very different sizes, because we construct the global subspace in such a way that each local subspace has the same influence on it. Since the moment matching error is small even for a very small subspace, we can expect fewer problems, unless m is very small.
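The only change with respect to Algorithm 5 is the truncation of every local basis to the smallest local dimension before stacking, which a sketch makes explicit. The bases of unequal rank below are random illustrative data.

```python
import numpy as np

def merge_truncated(Ps, xs, r):
    """Sketch of Algorithm 6: truncate every local basis to the smallest
    local dimension m before stacking, so each LT contributes equally."""
    m = min(P.shape[1] for P in Ps)
    cols = [np.column_stack([P[:, :m], x]) for P, x in zip(Ps, xs)]
    U, S, _ = np.linalg.svd(np.hstack(cols), full_matrices=False)
    return U[:, :r]

# hypothetical local bases of unequal rank (2 and 5) in dimension 6
rng = np.random.default_rng(1)
Ps = [np.linalg.qr(rng.standard_normal((6, k)))[0] for k in (2, 5)]
xs = [rng.standard_normal(6) for _ in range(2)]
P = merge_truncated(Ps, xs, r=4)
print(P.shape)   # (6, 4)
```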


3.4 Constructing the global TPWL model

With the knowledge of our LTs and the global projection matrix we can now construct the global TPWL model, see Algorithm 7.

Algorithm 7 Construction of the global TPWL model

Given is a nonlinear DAE of the type d/dt q(t,x) + j(t,x) + B u(t) = 0.

1. Select the LTs as described in Section 3.1 or Section 3.2, depending on whether we want a high or low accuracy model, respectively. From this we get the local linearised systems C_i ẋ + G_i x + B̃ ũ = 0, i = 0, ..., s−1
2. Construct the global subspace, spanned by P_r, as described in Section 3.3
3. Project the local linearised systems onto the global subspace:

   C_i^r = P_r^T C_i P_r,   G_i^r = P_r^T G_i P_r,   B̃^r = P_r^T B̃

The global TPWL model is then given as

   Σ_{i=0}^{s−1} w_i(t,y) C_i^r ẏ + Σ_{i=0}^{s−1} w_i(t,y) G_i^r y + B̃^r u(t) = 0,   y_0 = P_r^T x_0.

The question how to select the weights is discussed in Chapter 4.
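Step 3 of Algorithm 7 is a plain Galerkin projection of every local system, which can be sketched as follows; `project_locals` and the random matrices are illustrative only.

```python
import numpy as np

def project_locals(Cs, Gs, B, Pr):
    """Step 3 of Algorithm 7: project every local linearised system onto the
    global subspace spanned by the columns of Pr."""
    Cr = [Pr.T @ C @ Pr for C in Cs]
    Gr = [Pr.T @ G @ Pr for G in Gs]
    Br = Pr.T @ B
    return Cr, Gr, Br

# hypothetical data: two local systems of size n=5, reduced to r=2
rng = np.random.default_rng(2)
n, r, s = 5, 2, 2
Pr = np.linalg.qr(rng.standard_normal((n, r)))[0]
Cs = [rng.standard_normal((n, n)) for _ in range(s)]
Gs = [rng.standard_normal((n, n)) for _ in range(s)]
B = rng.standard_normal((n, 1))
Cr, Gr, Br = project_locals(Cs, Gs, B, Pr)
print(Cr[0].shape, Br.shape)   # (2, 2) (2, 1)
```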


Chapter 4

Weighting procedure

In this chapter we present some approaches which can be used to create a weighted TPWL model. We first show how weighting works in a very simple setting, and then try to improve our method by using more of the available information about the original nonlinear system. We start with the question of how to apply weighting and why we need it. Then we go further and discuss how to construct a fast and well-behaved weighting procedure. Of course one can find several other approaches for weighting; here we only present some general ideas and techniques which have worked well in our tests.

4.1 How to apply weighting and the reason for weighting

First we discuss how we can apply weights to our TPWL model to get a global linear system out of all the local linear systems. Remember that our linearisation tuple controller gives us s local linear systems, each coming from a linearisation around a given tuple (t_li, x_li), i = 0, ..., s−1, see Chapter 3. In the following sections we do not use x_li for calculating the weights; instead we use y_li = P_r^T x_li. The reason is that we want to make the complexity of the reduced model independent of the size of the original system, since the weights would otherwise be the only part which depends on the original size. The goal is a weighting procedure with complexity O(sr); then the complexity of the TPWL model depends only on r, and we get the full speed-up. We assume that each local system is a proper approximation of our original nonlinear system as long as the reduced solution stays near the linearisation tuple, y ∈ B((t_li, y_li), (ε_t, ε_y)). But of course we want a full transient simulation, so at some point the reduced solution y leaves the accuracy region around the actual linearisation tuple. At that point we have to choose another reduced system to continue our simulation and to assure the accuracy of the TPWL model. One idea to solve this problem is to create a linear combination of the local linear systems of our TPWL model. So we introduce a weight w_i which reflects the influence of the i-th local linear system on the global linear system with respect to the actual state y. These weights depend only on the actual state y and time point t. With this we can write our weighted TPWL model as





   Σ_{i=0}^{s−1} w_i(y,t) C_i^r ẏ + Σ_{i=0}^{s−1} w_i(y,t) G_i^r y + Σ_{i=0}^{s−1} w_i(y,t) B_i^r u(t) = 0.    (4.1)

The weights are in general constructed in such a way that they fulfil two constraints: 0 ≤ w_i ≤ 1 and Σ_{i=0}^{s−1} w_i = 1. It follows that the global linear system is a convex combination of all local linear systems. To understand how we select the weights we have a look at Figure 4.1.



Figure 4.1: TPWL model

In this figure we see a TPWL model consisting of three linear reduced models created around the LTs (t_0, y_0), (t_1, y_1) and (t_2, y_2), which lie on a trajectory of the original system. We also see three possible situations for the actual state y. The point y_1^t lies only in the accuracy region around tuple (t_1, y_1), so we should take only this system in the global TPWL model; this means w_1 ≈ 1 and w_0, w_2 ≈ 0. The point y_2^t lies in the accuracy regions of both (t_1, y_1) and (t_2, y_2), so we should take a linear combination of both linear systems as the global system: w_1 + w_2 ≈ 1 and w_0 ≈ 0. The point y_3^t is not in any accuracy region. If this happens we should issue a warning, or even better an error, that the solution is leaving the accuracy region of the TPWL model, so that we know we cannot trust the solution anymore. With this we can formulate a simple weighting procedure, see Algorithm 8.

Algorithm 8 Simple weighting

Given s LTs (t_li, y_li), i = 0, ..., s−1, and b = 0:

   for i = 0 to s−1
       if y ∈ B((t_li, y_li), (δ_t, δ_y))
           choose 0 ≪ w_i ≤ 1 and set b = 1
       else
           choose 0 ≤ w_i ≪ 1
       end
   end
   if b = 0, create a warning

such that Σ_{i=0}^{s−1} w_i = 1.

This simple algorithm can be used as a template for the weighting procedures we want to create. How the weights are actually calculated is the topic of the next sections.


4.2 Calculating weights

Now we discuss how we may calculate the weights that create our global TPWL model. We start with a weighting algorithm which depends only on the distance of the actual solution of the TPWL system to the linearisation points. Then we improve this weighting by using information about the linearisation error; all ideas are related to it. As shown in Chapter 3 we have

   ||q(t,x) − q̃(t,x)|| ≤ sup_{x∈B(x_0,ε_x), t∈[t_0,t_0+ε_t]} ||∂²/∂x² q(t,x)|| · ||x − x_0|| ||ẋ||    (4.2)

and

   ||j(t,x) − j̃(t,x)|| ≤ (1/2) sup_{x∈B(x_0,ε_x), t∈[t_0,t_0+ε_t]} ||∂²/∂x² j(t,x)|| · ||x − x_0||²    (4.3)

with j̃ and q̃ the linear approximations of j and q, respectively. This holds only under the assumption that all partial time derivatives are zero. We note that the error grows quadratically as we move away from the actual LT. Hence the weights we want to create should be distance dependent in such a way that the nearest local linear system is the most dominant one in the global system, because its error is the smallest of all.

4.2.1 Distance dependent weights

As already indicated, we want weights which select the nearest, dominant local linear systems and exclude local linear systems which are far away. This means that we want to construct 'sharp' weights which depend only on the distance of the actual solution to the LTs, because we have no information about the Hessians of q and j. Here sharp means that the nearest system gets a weight ≈ 1 while systems far away get very small weights ≈ 0. The method we chose in our tests is shown in Algorithm 9.

Algorithm 9 Distance dependent weights

Given the actual state y, the actual time t, s LTs (t_li, y_li) and α_y, α_t ≥ 0 with α_y + α_t = 1:

1. For i = 0, ..., s−1 compute d_i = α_y ||y − y_li|| + α_t |t − t_li|
2. For i = 0, ..., s−1 calculate w̃_i = e^{−β d_i / m} with m = min_{i=0,...,s−1} d_i and β > 0
3. Normalise the weights such that the given constraints hold: w_i = w̃_i / S with S = Σ_{i=0}^{s−1} w̃_i

Now we discuss this approach. First we calculate the 'distances' d_i between the actual state y and time t and all LTs. The 'distance' is a weighted sum of the state and time distances; the parameters α_y and α_t control the influence of the state and time distance on the weights, so a user can adjust the weighting procedure to his systems. In our tests we chose α_y = 1 and α_t = 0 to make the weights depend only on the state. The reason is that in circuit simulation, if we separate inputs and internal dynamics, ∂q/∂t and ∂j/∂t are in general zero, so the time distance has no influence on the error of the linearised system. Then we calculate the non-normalised weights w̃_i: we first divide the i-th distance by the minimum m of all distances, then multiply by the negative constant −β and


use this value in the exponential function. This assures that if the distance to the i-th LT is small, then w̃_i becomes bigger than a w̃_k with a large distance to the k-th LT. With the parameter β we control how fast w̃_i becomes small as the distance to the i-th LT grows: the bigger β, the bigger the influence of a local linear system when the actual state is close to the corresponding LT, and the more negligible its influence when the distance grows. In our tests we chose β = 25 to get sharp weights. The last step is to normalise the weights, dividing each of them by the sum of all; in this way the constraints on our weights are fulfilled. With this approach we have a weighting which is, firstly, fast, O(sr), and, secondly, works quite well in practice, because the global system consists only of the dominant local systems. In our experience this approach is easy to implement and the weighting gives good results for the TPWL model.
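Algorithm 9 can be sketched in a few lines; `tpwl_weights` and the two linearisation tuples are illustrative, and the zero-distance guard is an assumption of ours, not part of the algorithm as stated.

```python
import numpy as np

def tpwl_weights(y, t, lts, alpha_y=1.0, alpha_t=0.0, beta=25.0):
    """Sketch of Algorithm 9: sharp, distance-dependent convex weights.
    lts is a list of linearisation tuples (t_li, y_li)."""
    d = np.array([alpha_y * np.linalg.norm(y - yl) + alpha_t * abs(t - tl)
                  for tl, yl in lts])
    m = d.min()
    if m == 0.0:                      # exactly on an LT: give it full weight
        w = (d == 0.0).astype(float)
    else:
        w = np.exp(-beta * d / m)
    return w / w.sum()                # normalise: 0 <= w_i <= 1, sum = 1

lts = [(0.0, np.array([0.0, 0.0])), (1.0, np.array([1.0, 1.0]))]
w = tpwl_weights(np.array([0.1, 0.1]), 0.1, lts)
print(w)    # first weight close to 1, weights sum to 1
```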

4.2.2 Distance dependent weights under the knowledge of the Hessians

This approach is similar to the previous one, but here we additionally assume that we can calculate the norms of the Hessians of q and j. The reason for using these norms comes from the bound on the linearisation error, see (4.2) and (4.3): if we can estimate the norms of the Hessians, we can construct better weights in the sense that they are not only distance dependent but scaled to give a better estimate of the error. To estimate the norms of the Hessians we have several options. The first is a simple one: we estimate the norm of the Hessians only directly at the linearisation point. Of course this can be a really bad approximation, but it gives us at least an idea of the error. The second approach is similar to the one in Section 3.1.5, where we maintain an updated norm during the transient analysis. To formulate the final procedure we introduce

   H_{j,i} = sup_{x∈B(x_i,ε_x), t∈[t_i,t_i+ε_t]} ||∂²/∂x² j(t,x)||
   H_{q,i} = sup_{x∈B(x_i,ε_x), t∈[t_i,t_i+ε_t]} ||∂²/∂x² q(t,x)||.

The final method is described in Algorithm 10.

Algorithm 10 Distance and norm dependent weights

Given the actual state y, the actual time t and s LTs (t_li, y_li), i = 0, ..., s−1:

1. For i = 0, ..., s−1 compute

   d_i = (1/2) H_{j,i} ||P y − x_i||² + H_{q,i} ||P y − x_i|| ||P ẏ||,

   where ẏ is approximated numerically
2. For i = 0, ..., s−1 calculate w̃_i = e^{−β d_i / m} with m = min_{i=0,...,s−1} d_i
3. Normalise the weights such that the given constraints hold: w_i = w̃_i / S with S = Σ_{i=0}^{s−1} w̃_i

As we can see, the only difference between Algorithm 10 and Algorithm 9 is that here we use a scaled distance. As already described, the idea behind this scaling is


how the linearisation error is constructed. In practice we found that this approach leads to a better approximation, but at a higher cost in the model extraction step, because we have to calculate the norms of the Hessians. This method seems very useful if we want to construct highly accurate models and do not care about the model extraction time, for example when building a library.

4.3 Using estimates to calculate the weights

If we look at our weighted TPWL model (4.1), we see that the system is still nonlinear, because the weights are nonlinear and depend on the reduced state. So we would have to use an iterative method, e.g. the Newton method, to solve this equation for a given y. Of course this is not what we want: we try to construct the weights in such a way that we only have to solve a linear system, which a linear solver handles in one step. So we have to think about a way to avoid this problem. An obvious approach is to use estimates of the solution to calculate the weights. If we think about what the weights represent, or rather how we calculate them, we see that we use the norm of the distance of the actual state to the linearisation tuples to build the global linear system; the weights 'only' depend on this distance. If we now assume that the state does not change dramatically from time step n to time step n + 1, we can create the weights for calculating y_{n+1} by just using y_n instead, because if the state does not change dramatically, the weights will not change much either. Or, and this is the better idea, we can use an estimate for y_{n+1}. Such an estimate is available, for example, from the integration process we are using: in a BDF algorithm we can use the predicted solution which serves as the starting value for the Newton process. Algorithm 11 describes how we can use such a time stepping for our weights.

Algorithm 11 Using estimates for calculating weights

1. Create an estimate y^e_{n+1} for y_{n+1}. This estimate can be obtained from the integration process; in the simplest case we just use y_n as y^e_{n+1}, or we use a linear extrapolation,

   y^e_{n+1} = y_n + (y_n − y_{n−1}) (t_{n+1} − t_n) / (t_n − t_{n−1})

2. Calculate the weights w_i(y^e_{n+1}, t_{n+1}), i = 0, ..., s−1, with one of the proposed methods
3. Solve the global TPWL system (4.1)

This procedure has the advantage that the computational cost is quite small compared to a Newton procedure. In practice we found that this approach is sufficient for the weighting, because the results indeed do not differ much.

Remark 4.1. The difference y_{n+1} − y^e_{n+1} is used in Pstar, the Philips circuit simulator, to estimate the truncation error.
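The extrapolation predictor of Algorithm 11 is a one-liner; the function name and the sample values below are illustrative.

```python
import numpy as np

def predict_state(y_n, y_nm1, t_np1, t_n, t_nm1):
    """Linear extrapolation predictor from Algorithm 11:
    y^e_{n+1} = y_n + (y_n - y_{n-1}) * (t_{n+1} - t_n) / (t_n - t_{n-1})."""
    return y_n + (y_n - y_nm1) * (t_np1 - t_n) / (t_n - t_nm1)

# for a linear trajectory the predictor is exact
y = predict_state(np.array([2.0]), np.array([1.0]), 0.2, 0.1, 0.0)
print(y)   # [3.]
```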


Chapter 5

Model reduction techniques for linear DAEs

In this chapter we discuss some linear model reduction techniques which can be used in a TPWL process. First we discuss the most popular techniques, the so-called Krylov techniques, in particular PRIMA. Then we describe the balanced truncation, also called truncated balanced realisation (TBR), process for linear DAEs. Finally we describe the Poor Man's TBR algorithm.

5.1 General Problem

Since we want to reduce our linearised systems, we focus in this chapter on a linear DAE of the type

   C ẋ = −G x − B u
   z = D x.    (5.1)

In the case of a single-input or single-output system B ∈ ℝ^n or D^T ∈ ℝ^n, respectively. If both B and D^T are elements of ℝ^n, the system is called single-input single-output (SISO), otherwise multi-input multi-output (MIMO). In order to increase computational speed we want to find a system of lower order which approximates the input-output behaviour of the original system:

   C_r ẏ = −G_r y − B_r u
   z = D_r y.

In this system the vector y holds r internal variables (r ≪ n), and its projection back to the original space, P y, approximates the solution of the original system. The input vector u and the output vector z hold the same number of variables as before; note, however, that the output z of the reduced system is only an approximation of the output of the original system. Besides keeping the essential characteristics, an approximation may have to satisfy several further constraints, such as the existence of a global error bound, preservation of stability and passivity, or computational efficiency. As we will see, not all linear model reduction techniques fulfil these criteria.


5.2 Methods for model reduction

In recent years a variety of different approaches to this problem have been developed. Most of them make use of a Laplace transformation and then approximate the so-called transfer function in the frequency domain; the inverse transformation gives the reduced model back in the time domain, where the transfer function usually acts through convolutions. Some of the methods are restricted in their application: for instance, they only work for single-input single-output (SISO) systems or require certain properties of the matrices. One class uses projections onto Krylov subspaces in order to implicitly match moments of the transfer function. Such methods are Padé via Lanczos (PVL) and the Arnoldi algorithm, as can be read for instance in [22] or [21]; PRIMA (Passive Reduced-order Interconnect Macromodelling Algorithm) also falls into this category [6]. For a general overview of Krylov-based methods see also [23]. Another class takes a closer look at the Gramians of the system and balances them to find out which states are hard to observe or to control; this method is the truncated balanced realisation (TBR), also known as balanced truncation. There are also methods which combine both ideas, in order to obtain better approximations than Krylov-based methods with less computational effort than a TBR method. One of these methods is the 'Poor Man's' TBR (PMTBR) approach [20].

5.3 Transfer function

We start from the representation of our system as a linear set of equations, see (5.1). We assume in the following that the matrix C is positive semi-definite and G is nonsingular. Therefore we can restrict our attention to the system

   E ẋ = x + F u
   z = D x,    (5.2)

where E := −G^{−1} C and F := G^{−1} B. To see the behaviour in the frequency domain we apply Laplace transformation techniques. Taking x(0) = 0 as a well-known initial value, we obtain

   s E X(s) = X(s) + F U(s)
   Z(s) = D X(s).

Instead of a differential algebraic equation (DAE), we now have to deal with a linear algebraic equation, which only depends on the frequency s. With some additional calculations we get a direct relation between input and output,

   Z(s) = −D (I − s E)^{−1} F U(s).

If we define

   H(s) := −D (I − s E)^{−1} F

we obtain Z(s) = H(s) U(s); H is called the transfer function. The so-called impulse response h in the time domain is the inverse Laplace transform of H and satisfies

   ∫_0^∞ e^{−st} h(t) dt = H(s).


In order to apply the inverse Laplace transform, we write H in a pole-residue expansion,

   H(s) = Σ_{i=1}^n R_i / (s − p_i).

If we need the transfer function in the time domain, we have to calculate the eigenvalues of the matrix E; the eigenvalues of E and the poles are very closely related, because each pole is just the inverse of an eigenvalue. The impulse response in this case is

   h(t) = Σ_{i=1}^n R_i e^{p_i t}.

Now we can get started with approximating our transfer function. Note that the eigenvalue problem of the matrix G^{−1} C is in fact a generalised eigenvalue problem of the matrices C and G,

   C v_i = λ_i G v_i,   i = 1, ..., n,    (5.3)

where λ_i are the eigenvalues and v_i the corresponding eigenvectors. For more information see also [24].

5.3.1 Stability

Definition 5.1. A linear system is called stable if, for a bounded input and a finite initial value, the state x converges, for t → ∞, to a bounded state.

Theorem 5.2. A continuous-time system is stable if and only if all its poles are located in the left half of the complex plane,

   Re(p_i) < 0,

where p_i are the poles of the system.

The stability of a system can thus be checked easily by solving the generalised eigenvalue problem (5.3): with the sign convention of (5.1) the poles are p_i = −1/λ_i, so the system is stable if and only if all λ_i have positive real part.
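The check can be sketched as follows. Note the sign convention, which is our reading of (5.1) and (5.3): with C ẋ = −G x − B u and C v = λ G v, the poles are p_i = −1/λ_i, so stability corresponds to Re(λ_i) > 0. The function name and examples are illustrative (singular C, which gives λ = 0, is not handled).

```python
import numpy as np

def is_stable(C, G):
    """Stability check via the generalised eigenvalue problem (5.3),
    C v = lambda G v, solved here as the standard eigenproblem of G^{-1} C
    (G assumed nonsingular). With C x' = -G x - B u the poles are
    p_i = -1/lambda_i, so the system is stable iff Re(lambda_i) > 0."""
    lam = np.linalg.eigvals(np.linalg.solve(G, C))
    return bool(np.all(lam.real > 0.0))

# scalar sanity check: C = 1, G = 1 gives x' = -x (stable, pole -1);
# C = 1, G = -1 gives x' = x (unstable, pole +1).
print(is_stable(np.eye(1), np.eye(1)))    # True
print(is_stable(np.eye(1), -np.eye(1)))   # False
```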

5.3.2 Passivity

First of all we have to consider so-called passive systems. Passive systems are systems that cannot produce energy from nothing. In practice almost all systems are non-ideal and therefore contain some loss (resistors convert part of the electrical energy into heat); systems that internally consume energy are said to be strictly passive. Passive systems are in general also stable. We therefore have to make sure that our reduced model is passive as well: if it is not, it can generate energy from nothing, the simulation can blow up, and the result is no longer a good approximation of the original system. The transfer function $H$ is a matrix-valued rational function. Furthermore, the transfer function of a passive system is positive-real [25]. This means

$$ H(\bar{s}) = \overline{H(s)}, \tag{5.4} $$

$$ H(s) + H(s)^H \ge 0 \quad \text{for } \operatorname{Re}(s) > 0, \tag{5.5} $$

$$ H(s) \text{ is analytic in } s \text{ for } \operatorname{Re}(s) > 0, \tag{5.6} $$

where $H(s)^H$ denotes the complex conjugate transpose of $H(s)$ and (5.5) is meant in the positive semidefinite sense. Positive realness can be tested, for instance, by an eigenvalue-based algorithm, as can be read in [18].


5.4 Krylov techniques

We now describe how Krylov-based techniques work and how we can use them in practice. The idea of Krylov techniques is to find a suitable subspace onto which we can project our system; the subspace is chosen such that the projection error is minimised. Firstly we reduce the number of state variables by projecting the vector of state variables $x$ onto a subspace $S_r$ of dimension $r \ll n$. We can therefore express $x$ as $x = W_r y + e$, where $W_r y \in S_r$, $W_r \in \mathbb{R}^{n \times r}$ and $e \in S_r^\perp$, the orthogonal complement of $S_r$. Now we can use $W_r y$ instead of $x$ in our system, so in the frequency domain we get

$$ s E W_r Y(s) = W_r Y(s) + F U(s), \qquad Z(s) = D W_r Y(s). $$

With this we have reduced the number of state variables, but the system still consists of $n$ equations. If we now multiply the whole system of equations by $V_r^\top$ with $V_r \in \mathbb{R}^{n \times r}$, we reduce the number of equations from $n$ to $r$. If we also assume that $V_r$ and $W_r$ are biorthogonal, $V_r^\top W_r = I$, we get a reduced model of dimension $r$,

$$ s E_r Y(s) = Y(s) + F_r U(s), \qquad Z(s) = D_r Y(s), $$

with $E_r = V_r^\top E W_r$, $F_r = V_r^\top F$ and $D_r = D W_r$.

5.4.1 General Krylov-subspace methods

The next step is to find a proper choice for $W_r$ and $V_r$. There are several possibilities to think about. Maybe it is possible to use eigenvectors in some way. Another approach could be to compute either time-series or frequency-domain data and then use the singular value decomposition (SVD) to choose the $r$ most important vectors, which then span a subspace in which the solution is contained (the POD approach). In this section we restrict ourselves to a Krylov-subspace approach. From the frequency-domain representation $sEX(s) = X(s) + FU(s)$ we obtain $X(s) = -(I - sE)^{-1}FU(s)$. By means of the Neumann series we have

$$ X(s) = -\sum_{k=0}^{\infty} s^k E^k F U(s). $$

Hence $X \in \operatorname{span}\{F, EF, E^2F, \dots\}$. As a result we learn that changing the basis and using only the first $r$ vectors of the Neumann series is equivalent to matching the first $r$ derivatives around an expansion point. In other words, we implicitly match moments:

$$ X = \left[ F \mid EF \mid E^2F \mid \cdots \mid E^{r-1}F \right] Y = W_r Y. $$

Later on we need the following theorem by E. Grimme [22].


Theorem 5.3. If

$$ \operatorname{span}\{w_1, \dots, w_r\} \supseteq \mathcal{K}_{k_F}(E, F) $$

and

$$ \operatorname{span}\{v_1, \dots, v_r\} \supseteq \mathcal{K}_{k_D}(E^\top, D^\top), $$

then

$$ \frac{d^i H(0)}{ds^i} = \frac{d^i H_r(0)}{ds^i} \quad \text{for } i = 0, \dots, k_F + k_D. $$

Proof. See [22].

Note that the above given choice of spanning vectors is only valid for a SISO system. In the case of multiple inputs the $E^k F$ are matrices, and the columns of these matrices are chosen to span the space. Therefore we have to use a slightly different approach, which can be found in [3] and [6].

5.4.2 Arnoldi algorithm

The Arnoldi algorithm was exploited for model reduction by Silveira, Kamon, White, Elfadel and Ling in 1990 [3]. In this special case we choose $s = 0$ as expansion point of the Taylor series and $W_r = V_r$ with $V_r$ orthonormal ($V_r^\top V_r = I$). If further

$$ \operatorname{span}\{v_1, v_2, \dots, v_r\} = \mathcal{K}_r(E, F) = \operatorname{span}\{F, EF, E^2F, \dots, E^{r-1}F\}, \tag{5.7} $$

then it is provable that the first $r$ moments, and therefore derivatives, of the reduced system match:

$$ D_r E_r^k F_r = D E^k F, \qquad \frac{d^k H_r}{ds^k}\Big|_{s=0} = \frac{d^k H}{ds^k}\Big|_{s=0}, \qquad k = 0, \dots, r-1. $$

The following lemma is used to prove the matching.

Lemma 5.4. Assume $V_r$, $E$ and $F$ have the same properties as in (5.7) and let $k = 0, \dots, r-1$. Then

$$ V_r V_r^\top E^k F = E^k F. $$

Proof.

$$ E^k F \in \operatorname{span}\{v_1, \dots, v_r\} \iff \exists g \in \mathbb{R}^r : E^k F = V_r g. $$

Inserting this into $V_r V_r^\top E^k F$ yields

$$ V_r V_r^\top E^k F = V_r \underbrace{V_r^\top V_r}_{=I} g = V_r g = E^k F, $$

because $V_r$ is orthonormal. $\square$

Let us now apply this lemma to the calculation of the moments. The transfer function is given as

$$ H(s) = -D(I - sE)^{-1}F = -\sum_{k=0}^{\infty} \left( D E^k F \right) s^k, $$


so we see that the $k$-th moment of the original system is given as $m_k := D E^k F$ and that of the reduced one as $m_k^r := D_r E_r^k F_r$. We want to show that the first $r$ moments of both systems are equal, so we transform the moments of the reduced transfer function:

$$ m_k^r = D_r E_r^k F_r = D V_r (V_r^\top E V_r)^k V_r^\top F = D V_r \underbrace{V_r^\top E V_r \cdots V_r^\top E V_r}_{k \text{ times}} V_r^\top F = D \underbrace{E \cdots E}_{k \text{ times}} F = D E^k F = m_k, $$

where Lemma 5.4 is applied repeatedly. This is true for $k = 0, \dots, r-1$. In order to prevent almost linearly dependent columns in the matrix $V_r$, we need to find an orthonormal basis spanning the Krylov subspace. How this can be done is shown in Algorithm 12.

Algorithm 12 Arnoldi Algorithm
  v_1 = F / ||F||
  For i = 1, ..., r - 1
    w = E v_i
    For j = 1, ..., i
      w = w - (w^T v_j) v_j
    End
    v_{i+1} = w / ||w||
  End

This algorithm is just Gram-Schmidt orthogonalisation applied to the Krylov subspace [23]. The classical variant has the disadvantage of being numerically unstable, so for a practical implementation the modified Gram-Schmidt approach behaves much better, see [12]. In the case of multiple inputs and multiple outputs the algorithm is extended to Block Arnoldi [3], [6].
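A minimal sketch of the (modified Gram-Schmidt) Arnoldi procedure above, for the SISO case, might look as follows; all matrices are randomly generated for illustration, and the moment-matching property can then be verified numerically:

```python
import numpy as np

def arnoldi(E, F, r):
    """Orthonormal basis of K_r(E, F) via modified Gram-Schmidt (SISO sketch)."""
    n = F.shape[0]
    V = np.zeros((n, r))
    V[:, 0] = F / np.linalg.norm(F)
    for i in range(r - 1):
        w = E @ V[:, i]
        for j in range(i + 1):                   # orthogonalise against v_1..v_{i+1}
            w -= (w @ V[:, j]) * V[:, j]
        V[:, i + 1] = w / np.linalg.norm(w)
    return V

# illustrative random data
rng = np.random.default_rng(0)
E = 0.1 * rng.standard_normal((6, 6))
F = rng.standard_normal(6)
V = arnoldi(E, F, 3)                             # columns span {F, EF, E^2 F}
```

With $E_r = V^\top E V$, $F_r = V^\top F$ and $D_r = D V$ one can check that the moments $m_k = D E^k F$ and $m_k^r = D_r E_r^k F_r$ agree for $k = 0, 1, 2$.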

5.4.3 Importance of preserving passivity

As we have already seen, the model has to be passive in order to avoid energy production or other non-physical behaviour during the simulation process. Especially in the case of interconnection of models, passivity is a major issue: the interconnection of stable models is not necessarily stable, but the interconnection of passive models is again a passive model. We now mention some system properties that lead to a passive system. Note that these conditions are sufficient but not necessary. The following system

$$ sCX(s) = -GX(s) + BU(s), \qquad Z(s) = DX(s), $$

and its transfer function

$$ H(s) = D(G + sC)^{-1}B $$

are considered. The following three properties together are sufficient for passivity:

1. $D^\top = B$,
2. $C + C^H$ is positive semi-definite, thus $x^H(C + C^H)x \ge 0$ for all $x$,
3. $G + G^H$ is positive semi-definite, thus $x^H(G + G^H)x \ge 0$ for all $x$.

It can easily be calculated that a congruence transformation preserves positive semi-definiteness. A congruence transformation applied to $E$ is defined as $E_r = V_r^\top E V_r$. Congruence transformations were first used in model reduction in [26]; applied to positive semi-definite system matrices they preserve the passivity of the system. Although the Arnoldi method contains a congruence transformation, it applies it to $E = -G^{-1}C$ rather than to the positive semi-definite matrices themselves, so it does not preserve passivity. The following method, PRIMA, is passivity preserving, because there the congruence transformation is applied directly to the positive semi-definite system matrices.

5.4.4 Passive Reduced-order Interconnect Macro-modeling Algorithm (PRIMA)

The PRIMA method is an improved Arnoldi algorithm. It was introduced by Odabasioglu, Celik and Pileggi in 1997 [6], [8]. We again choose $V_r = W_r$ and $V_r^\top V_r = I_r \in \mathbb{R}^{r \times r}$, but the difference is that this method is applied directly to the linear DAE (5.1): unlike Arnoldi, we do not multiply the system by $G^{-1}$ and then reduce it. The Krylov subspace used here is

$$ \operatorname{span}\{r_1, \dots, r_r\} = \mathcal{K}_r(-G^{-1}C,\, G^{-1}B) = \operatorname{span}\{G^{-1}B,\, -G^{-1}CG^{-1}B,\, \dots,\, (-G^{-1}C)^{r-1}G^{-1}B\}. $$

In order to achieve $V_r^\top V_r = I_r$ we use the modified Gram-Schmidt Arnoldi algorithm, as it is very stable. To prove moment matching for PRIMA, we need the following lemma.

Lemma 5.5. If $V_r$ is orthonormal ($V_r^\top V_r = I_r$) and $B \in \mathbb{R}^n$ is such that

$$ (G^{-1}C)^k G^{-1}B \in \operatorname{colspan}\{V_r\} \quad \text{for } k = 0, \dots, r-1, $$

then

$$ (V_r^\top G V_r)^{-1} V_r^\top B = V_r^\top G^{-1} B \tag{5.8} $$

and

$$ (V_r^\top G V_r)^{-1} V_r^\top C (G^{-1}C)^{k-1} G^{-1} B = V_r^\top (G^{-1}C)^k G^{-1} B, \qquad k = 1, \dots, r-1. \tag{5.9} $$

Proof.

$$ (G^{-1}C)^k G^{-1}B \in \operatorname{colspan}(V_r) \iff \exists g \in \mathbb{R}^r : (G^{-1}C)^k G^{-1}B = V_r g. $$

We first prove (5.8). Inserting the above (with $k = 0$) yields

$$ (V_r^\top G V_r)^{-1} V_r^\top B = (V_r^\top G V_r)^{-1} V_r^\top G G^{-1} B = (V_r^\top G V_r)^{-1} (V_r^\top G V_r) g = g = V_r^\top V_r g = V_r^\top G^{-1} B. $$


Now we prove (5.9):

$$ (V_r^\top G V_r)^{-1} V_r^\top C (G^{-1}C)^{k-1} G^{-1} B = (V_r^\top G V_r)^{-1} V_r^\top G (G^{-1}C)^k G^{-1} B = (V_r^\top G V_r)^{-1} (V_r^\top G V_r) g = g = V_r^\top V_r g = V_r^\top (G^{-1}C)^k G^{-1} B. \qquad \square $$

The transfer function in this case is

$$ H(s) = -D(-G - sC)^{-1}B = D\left(I - s(-G)^{-1}C\right)^{-1} G^{-1}B = \sum_{k=0}^{\infty} \left( D\left((-G)^{-1}C\right)^k G^{-1}B \right) s^k, $$

and the moments of the reduced system are

$$ D_r\left(-G_r^{-1}C_r\right)^k G_r^{-1}B_r = D V_r \underbrace{\left(-(V_r^\top G V_r)^{-1} V_r^\top C V_r\right) \cdots \left(-(V_r^\top G V_r)^{-1} V_r^\top C V_r\right)}_{k \text{ times}} \underbrace{(V_r^\top G V_r)^{-1} V_r^\top B}_{= V_r^\top G^{-1}B \text{ by Lemma 5.5}}. $$

Lemma 5.4 and Lemma 5.5 are applied over and over again, and it turns out that

$$ D_r\left(-G_r^{-1}C_r\right)^k G_r^{-1}B_r = D\left(-G^{-1}C\right)^k G^{-1}B \qquad \text{for } k = 0, \dots, r-1. $$

The main difference between Arnoldi and PRIMA is that Arnoldi applies the projection framework to (5.2), while PRIMA applies it directly to (5.1). The matrices $C$ and $G$ are typically positive semi-definite, but the product $G^{-1}C$ might not be. The reduced system matrices with PRIMA are obtained as

$$ C_r = V_r^\top C V_r \qquad \text{and} \qquad G_r = V_r^\top G V_r. $$

Thus, the system matrices are directly reduced in such a way that passivity can be preserved during reduction: since $V_r$ defines a congruence transformation, the matrices $C_r$ and $G_r$ are again positive semi-definite and the system keeps its passivity. But PRIMA does not only preserve passivity, it is at the same time numerically stable. This is proved in [6].

Remark 5.6. The Krylov techniques have the advantage that the model extraction is fast. The reason for this is that we only need matrix-vector products and saxpy operations to construct the subspace, so the method is also useful for large systems. But the Krylov approaches also have some drawbacks which should be mentioned. We will discuss four of them.


• We only approximate the transfer function around one frequency, so in general we cannot expect the reduced system to be accurate over a whole frequency range if the input has a different dominant frequency.
• We have no error bound for the approximated system, which causes problems in the linearisation tuple controller; therefore we have no indication of how well the TPWL model approximates the original nonlinear system.
• Krylov approaches tend to construct too 'rich' subspaces. This means that the approximation is good, but we still need a relatively large subspace to get this result. There are several other methods which create smaller subspaces, but these approaches all have a higher complexity.
• For a MIMO system we have to use a block Krylov approach. This means that the dimension of the resulting subspace is $r = c \cdot k$, $c \in \mathbb{N}$, where $k$ is the number of inputs. So if we have many inputs we cannot reduce our system to an order smaller than $k$, and at that minimum size we are still only matching the first moment of our system, which may give an insufficient approximation.
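As a concrete illustration of the PRIMA projection of Section 5.4.4, the following sketch builds an orthonormal basis of $\mathcal{K}_r(-G^{-1}C, G^{-1}B)$ with modified Gram-Schmidt and reduces the system matrices by congruence. The symmetric positive definite $C$ and $G$ are illustrative; a real implementation would use sparse factorisations instead of dense solves.

```python
import numpy as np

def prima(C, G, B, r):
    """PRIMA sketch (single input): orthonormal basis of K_r(-G^{-1}C, G^{-1}B),
    then congruence reduction of the system matrices.  Illustrative only."""
    n = len(B)
    A = -np.linalg.solve(G, C)                   # -G^{-1} C (dense; use sparse LU in practice)
    v = np.linalg.solve(G, B)                    # G^{-1} B
    V = np.zeros((n, r))
    V[:, 0] = v / np.linalg.norm(v)
    for i in range(r - 1):
        w = A @ V[:, i]
        for j in range(i + 1):                   # modified Gram-Schmidt
            w -= (w @ V[:, j]) * V[:, j]
        V[:, i + 1] = w / np.linalg.norm(w)
    # congruence transformations preserve positive semi-definiteness, hence passivity
    return V.T @ C @ V, V.T @ G @ V, V.T @ B, V

# illustrative symmetric positive definite C and G (RC-circuit style)
rng = np.random.default_rng(1)
M = rng.standard_normal((8, 8))
C = M @ M.T + 8 * np.eye(8)
M = rng.standard_normal((8, 8))
G = M @ M.T + 8 * np.eye(8)
B = rng.standard_normal(8)
Cr, Gr, Br, V = prima(C, G, B, 3)                # reduced order-3 model
```

One can check numerically that $C_r$ and $G_r$ stay symmetric positive definite and that the first $r$ moments $D(-G^{-1}C)^k G^{-1}B$ are matched.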

5.5 Balanced truncation (TBR) for DAEs

Now we focus on the TBR method for DAEs and show how this method works and how it can be implemented. The idea behind a TBR method is to identify the states which are hard to observe and hard to control and then to neglect them in order to reduce the system. To this end we transform the system to a balanced form, in which the states are ordered by their importance. The results we present here are from [19]; all proofs can be found there. To keep the same notation as in [19] we consider a linear time-invariant system of the type

$$ E\dot{x} = Ax + Bu, \qquad z = Cx, \qquad x(0) = x_0, \tag{5.10} $$

where the dimensions of all matrices are the same as in (5.1). We assume that the pencil

$$ \lambda E - A \tag{5.11} $$

is regular, i.e. $\det(\lambda E - A)$ vanishes only for a finite set of $\lambda$'s. This is in general true for all circuits, because the matrix $A$ is regular. If the pencil is regular we can transform it to Weierstrass canonical form. This means there exist matrices $W$ and $T$ such that both $E$ and $A$ can be written in a simpler form under the same transformation,

$$ E = W \begin{pmatrix} I_{n_f} & 0 \\ 0 & N \end{pmatrix} T, \qquad A = W \begin{pmatrix} J & 0 \\ 0 & I_{n_\infty} \end{pmatrix} T, \tag{5.12} $$

where $I_{n_f}$ is the identity matrix of dimension $n_f$ and $J$ is the Jordan block corresponding to the $n_f$ finite eigenvalues of (5.11). $N$ is nilpotent and corresponds to the infinite eigenvalues; its nilpotency index $\nu$ ($N^\nu = 0$) equals the index of the DAE. Representation (5.12) defines a decomposition of $\mathbb{R}^n$ into complementary deflating subspaces of dimensions $n_f$ and $n_\infty$


corresponding to the finite and infinite eigenvalues of (5.11), respectively. The matrices

$$ P_r = T^{-1} \begin{pmatrix} I_{n_f} & 0 \\ 0 & 0 \end{pmatrix} T, \qquad P_l = W \begin{pmatrix} I_{n_f} & 0 \\ 0 & 0 \end{pmatrix} W^{-1} \tag{5.13} $$

are the spectral projections onto the right and left deflating subspaces of (5.11) corresponding to the finite eigenvalues. The pencil (5.11) is called c-stable if it is regular and all finite eigenvalues of (5.11) lie in the open left half-plane $\mathbb{C}^-$.

5.5.1 Descriptor systems

Consider the descriptor system (5.10). It is well known that if (5.11) is regular, $u(t)$ is $\nu$ times continuously differentiable and $x_0$ is consistent, then the descriptor system (5.10) has a unique solution $x$, which is given by

$$ x(t) = \mathcal{F}(t)Ex_0 + \int_0^t \mathcal{F}(t-\tau)Bu(\tau)\,d\tau + \sum_{k=0}^{\nu-1} F_{k+1}Bu^{(k)}(t). $$

Here

$$ \mathcal{F}(t) = T^{-1} \begin{pmatrix} e^{Jt} & 0 \\ 0 & 0 \end{pmatrix} W^{-1} $$

is a fundamental solution matrix of the descriptor system (5.10), and the matrices $F_k$ have the form

$$ F_k = T^{-1} \begin{pmatrix} 0 & 0 \\ 0 & -N^{k-1} \end{pmatrix} W^{-1}. $$

The transfer function of (5.10) is given as $G(s) := C(sE - A)^{-1}B$. A quadruple of matrices $[E, A, B, C]$ is called a realisation of $G(s)$. Often a realisation of $G(s)$ is denoted as

$$ \begin{pmatrix} sE - A & B \\ C & 0 \end{pmatrix}. $$

Two realisations $[E, A, B, C]$ and $[\hat{E}, \hat{A}, \hat{B}, \hat{C}]$ are called restricted system equivalent if there exist two nonsingular matrices $\hat{W}$ and $\hat{T}$ such that

$$ \begin{pmatrix} \hat{W} & \\ & I \end{pmatrix} \begin{pmatrix} sE - A & B \\ C & 0 \end{pmatrix} \begin{pmatrix} \hat{T} & \\ & I \end{pmatrix} = \begin{pmatrix} s\hat{W}E\hat{T} - \hat{W}A\hat{T} & \hat{W}B \\ C\hat{T} & 0 \end{pmatrix} = \begin{pmatrix} s\hat{E} - \hat{A} & \hat{B} \\ \hat{C} & 0 \end{pmatrix}. \tag{5.14} $$

The pair $(\hat{W}, \hat{T})$ is then called a system equivalence transformation. Finally, the transfer function $G(s)$ is called proper if $\lim_{s\to\infty} G(s) < \infty$, and strictly proper if $\lim_{s\to\infty} G(s) = 0$.


Controllability and observability

The next two definitions describe different concepts of controllability and observability for DAEs.

Definition 5.7. System (5.10) and the triplet $(E, A, B)$ are called controllable on the reachable set (R-controllable) if

$$ \operatorname{rank}(\lambda E - A,\; B) = n \quad \text{for all finite } \lambda \in \mathbb{C}. \tag{5.15} $$

They are called impulse controllable (I-controllable) if

$$ \operatorname{rank}(E,\; A K_E,\; B) = n, \tag{5.16} $$

where the columns of $K_E$ span $\ker E$. They are called completely controllable (C-controllable) if (5.15) holds and

$$ \operatorname{rank}(E,\; B) = n. \tag{5.17} $$

Because observability is the dual property of controllability, we get the following.

Definition 5.8. System (5.10) and the triplet $(E, A, C)$ are called observable on the reachable set (R-observable) if

$$ \operatorname{rank}\begin{pmatrix} \lambda E - A \\ C \end{pmatrix} = n \quad \text{for all finite } \lambda \in \mathbb{C}. \tag{5.18} $$

They are called impulse observable (I-observable) if

$$ \operatorname{rank}\begin{pmatrix} E \\ K_{E^\top}^\top A \\ C \end{pmatrix} = n, \tag{5.19} $$

where the columns of $K_{E^\top}$ span $\ker E^\top$. They are called completely observable (C-observable) if (5.18) holds and

$$ \operatorname{rank}\begin{pmatrix} E \\ C \end{pmatrix} = n. \tag{5.20} $$
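The rank conditions (5.15)-(5.17) are straightforward to evaluate numerically for small examples. The following sketch, with illustrative matrices for an index-1 pencil, checks R-, I- and C-controllability:

```python
import numpy as np
from scipy.linalg import eig

# Rank tests (5.15)-(5.17) for a small index-1 descriptor system.
# The matrices are illustrative, chosen so that all conditions hold.
E = np.diag([1.0, 1.0, 0.0])                    # singular E
A = np.array([[-1.0,  0.0, 1.0],
              [ 0.0, -2.0, 0.0],
              [ 0.0,  0.0, 1.0]])
B = np.array([[0.0], [1.0], [1.0]])
n = E.shape[0]

finite = [l for l in eig(A, E, right=False) if np.isfinite(l)]
r_controllable = all(                            # condition (5.15)
    np.linalg.matrix_rank(np.hstack([l * E - A, B])) == n for l in finite
)
K_E = np.array([[0.0], [0.0], [1.0]])            # columns span ker E
i_controllable = np.linalg.matrix_rank(np.hstack([E, A @ K_E, B])) == n   # (5.16)
c_controllable = r_controllable and np.linalg.matrix_rank(np.hstack([E, B])) == n  # (5.17)
```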

Controllability and observability Gramians

Assume that (5.11) is c-stable. Then the integrals

$$ \mathcal{G}_{pc} = \int_0^\infty \mathcal{F}(t) B B^\top \mathcal{F}(t)^\top \, dt $$

and

$$ \mathcal{G}_{po} = \int_0^\infty \mathcal{F}(t)^\top C^\top C \mathcal{F}(t) \, dt $$

exist. The matrix $\mathcal{G}_{pc}$ is called the proper controllability Gramian and the matrix $\mathcal{G}_{po}$ is called the proper observability Gramian of (5.10). The improper controllability Gramian is defined as

$$ \mathcal{G}_{ic} = \sum_{k=1}^{\nu} F_k B B^\top F_k^\top $$

and the improper observability Gramian as

$$ \mathcal{G}_{io} = \sum_{k=1}^{\nu} F_k^\top C^\top C F_k. $$


If $E = I$ (the ODE case), then $\mathcal{G}_{pc}$ and $\mathcal{G}_{po}$ are the usual controllability and observability Gramians, and the improper Gramians vanish. The proper Gramians are the unique symmetric, positive semidefinite solutions of the projected generalised continuous-time Lyapunov equations

$$ E \mathcal{G}_{pc} A^\top + A \mathcal{G}_{pc} E^\top = -P_l B B^\top P_l^\top, \qquad \mathcal{G}_{pc} = P_r \mathcal{G}_{pc} P_r^\top \tag{5.21} $$

and

$$ E^\top \mathcal{G}_{po} A + A^\top \mathcal{G}_{po} E = -P_r^\top C^\top C P_r, \qquad \mathcal{G}_{po} = P_l^\top \mathcal{G}_{po} P_l, \tag{5.22} $$

respectively, where $P_l$ and $P_r$ are the same as in (5.13). For more details see [30]. If our system is in Weierstrass canonical form and if the matrices

$$ W^{-1}B = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix}, \qquad CT^{-1} = (C_1, C_2) $$

are partitioned in the same way as $E$ and $A$, then the proper Gramians are given as

$$ \mathcal{G}_{pc} = T^{-1} \begin{pmatrix} G_{1c} & 0 \\ 0 & 0 \end{pmatrix} T^{-\top}, \qquad \mathcal{G}_{po} = W^{-\top} \begin{pmatrix} G_{1o} & 0 \\ 0 & 0 \end{pmatrix} W^{-1}, $$

where $G_{1c}$ and $G_{1o}$ satisfy the standard continuous-time Lyapunov equations

$$ J G_{1c} + G_{1c} J^\top = -B_1 B_1^\top, \qquad J^\top G_{1o} + G_{1o} J = -C_1^\top C_1. $$

For more details see [30]. The improper Gramians are the unique symmetric, positive semidefinite solutions of the projected generalised discrete-time Lyapunov equations

$$ A \mathcal{G}_{ic} A^\top - E \mathcal{G}_{ic} E^\top = (I - P_l) B B^\top (I - P_l)^\top, \qquad P_r \mathcal{G}_{ic} = 0 \tag{5.23} $$

and

$$ A^\top \mathcal{G}_{io} A - E^\top \mathcal{G}_{io} E = (I - P_r)^\top C^\top C (I - P_r), \qquad \mathcal{G}_{io} P_l = 0, \tag{5.24} $$

where $P_l$ and $P_r$ are the same as in (5.13). They can be represented as

$$ \mathcal{G}_{ic} = T^{-1} \begin{pmatrix} 0 & 0 \\ 0 & G_{2c} \end{pmatrix} T^{-\top}, \qquad \mathcal{G}_{io} = W^{-\top} \begin{pmatrix} 0 & 0 \\ 0 & G_{2o} \end{pmatrix} W^{-1}, $$


where $G_{2c}$ and $G_{2o}$ satisfy the standard discrete-time Lyapunov equations

$$ G_{2c} - N G_{2c} N^\top = B_2 B_2^\top, \qquad G_{2o} - N^\top G_{2o} N = C_2^\top C_2. $$

The Gramians can be used to characterise the controllability and observability properties of our system.

Corollary 5.9. Consider (5.10) with a c-stable pencil. Then:

1. The system is R-controllable and R-observable if and only if

$$ \operatorname{rank}(\mathcal{G}_{pc}) = \operatorname{rank}(\mathcal{G}_{po}) = \operatorname{rank}(\mathcal{G}_{pc} E^\top \mathcal{G}_{po} E) = n_f. \tag{5.25} $$

2. The system is I-controllable and I-observable if and only if

$$ \operatorname{rank}(\mathcal{G}_{ic}) = \operatorname{rank}(\mathcal{G}_{io}) = \operatorname{rank}(\mathcal{G}_{ic} A^\top \mathcal{G}_{io} A) = n_\infty. \tag{5.26} $$

3. The system is C-controllable and C-observable if and only if (5.25) and (5.26) hold.

Proof. See [19].

Hankel singular values

The proper and improper Gramians $\mathcal{G}_{pc}$, $\mathcal{G}_{po}$, $\mathcal{G}_{ic}$ and $\mathcal{G}_{io}$ are not system invariant. Indeed, under a system equivalence transformation $(\hat{W}, \hat{T})$ the proper and improper controllability Gramians are transformed to $\hat{\mathcal{G}}_{pc} = \hat{T}^{-1}\mathcal{G}_{pc}\hat{T}^{-\top}$ and $\hat{\mathcal{G}}_{ic} = \hat{T}^{-1}\mathcal{G}_{ic}\hat{T}^{-\top}$, respectively, whereas the proper and improper observability Gramians are transformed to $\hat{\mathcal{G}}_{po} = \hat{W}^{-\top}\mathcal{G}_{po}\hat{W}^{-1}$ and $\hat{\mathcal{G}}_{io} = \hat{W}^{-\top}\mathcal{G}_{io}\hat{W}^{-1}$. However, it follows from

$$ \hat{\mathcal{G}}_{pc}\hat{E}^\top\hat{\mathcal{G}}_{po}\hat{E} = \hat{T}^{-1}\,\mathcal{G}_{pc}E^\top\mathcal{G}_{po}E\,\hat{T}, \qquad \hat{\mathcal{G}}_{ic}\hat{A}^\top\hat{\mathcal{G}}_{io}\hat{A} = \hat{T}^{-1}\,\mathcal{G}_{ic}A^\top\mathcal{G}_{io}A\,\hat{T} $$

that the spectra of the matrices $\mathcal{G}_{pc}E^\top\mathcal{G}_{po}E$ and $\mathcal{G}_{ic}A^\top\mathcal{G}_{io}A$ are system invariant. These matrices play the same role for descriptor systems as the product of the controllability and observability Gramians for standard state-space systems. With this property we get the next result.

Theorem 5.10. Let $\lambda E - A$ be c-stable. Then the eigenvalues of the matrices $\mathcal{G}_{pc}E^\top\mathcal{G}_{po}E$ and $\mathcal{G}_{ic}A^\top\mathcal{G}_{io}A$ are real and non-negative. For the proof see [19].

Definition 5.11. Let $n_f$ and $n_\infty$ be the dimensions of the deflating subspaces of the c-stable pencil $\lambda E - A$ corresponding to the finite and infinite eigenvalues, respectively. The square roots of the $n_f$ largest eigenvalues of $\mathcal{G}_{pc}E^\top\mathcal{G}_{po}E$, denoted by $\sigma_j$, are called the proper Hankel singular values of (5.10). The square roots of the $n_\infty$ largest eigenvalues of the matrix $\mathcal{G}_{ic}A^\top\mathcal{G}_{io}A$, denoted by $\theta_j$, are called the improper Hankel singular values.


We assume that the proper and improper Hankel singular values are ordered decreasingly,

$$ \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_{n_f} \ge 0, \qquad \theta_1 \ge \theta_2 \ge \dots \ge \theta_{n_\infty} \ge 0. $$

In the ODE case, $E = I$, the proper Hankel singular values are the classical Hankel singular values of the standard state-space system. Since the proper and improper Gramians are symmetric and positive semidefinite, there exist Cholesky factorisations

$$ \mathcal{G}_{pc} = R_p R_p^\top, \qquad \mathcal{G}_{po} = L_p^\top L_p, \qquad \mathcal{G}_{ic} = R_i R_i^\top, \qquad \mathcal{G}_{io} = L_i^\top L_i, \tag{5.27} $$

where the matrices $R_p, L_p, R_i, L_i \in \mathbb{R}^{n \times n}$ are upper triangular. The next lemma gives a connection between the proper and improper Hankel singular values of (5.10) and the standard singular values of the matrices $L_p E R_p$ and $L_i A R_i$.

Lemma 5.12. Assume that the pencil $\lambda E - A$ is c-stable. Consider the Cholesky factorisations (5.27) of the proper and improper Gramians. Then the proper Hankel singular values of (5.10) are the $n_f$ largest singular values of the matrix $L_p E R_p$, and the improper Hankel singular values are the $n_\infty$ largest singular values of the matrix $L_i A R_i$. For the proof see [19].

As a result of this lemma we can now calculate the Hankel singular values of our system in an efficient way and can also use them for the final reduction step.
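In the ODE special case $E = I$ this computation reduces to the classical one. The following sketch (with illustrative matrices) computes the Hankel singular values from the Cholesky factors of the Gramians as in Lemma 5.12:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svdvals

# ODE special case (E = I): Hankel singular values from Cholesky factors,
# as in Lemma 5.12.  The matrices are illustrative.
A = np.array([[-1.0, 0.5],
              [ 0.0, -2.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])

P = solve_continuous_lyapunov(A, -B @ B.T)      # controllability Gramian
Q = solve_continuous_lyapunov(A.T, -C.T @ C)    # observability Gramian
Rp = cholesky(P, lower=True)                    # P = Rp Rp^T
Lp = cholesky(Q, lower=True).T                  # Q = Lp^T Lp
hsv = svdvals(Lp @ Rp)                          # singular values of Lp E Rp with E = I
```

They agree with the classical definition as the square roots of the eigenvalues of the Gramian product $PQ$.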

5.5.2 Model reduction

Here we focus on the question of how to reduce the order of system (5.10) in an efficient way using knowledge of the Gramians. This leads us to a balanced truncation method for linear DAEs.

Balanced realisation

For a given transfer function $G(s)$ there are, as already seen, many different realisations. Here we are interested only in particular realisations that are useful in applications.

Definition 5.13. A realisation $[E, A, B, C]$ of the transfer function $G(s)$ is called minimal if the triplet $(E, A, B)$ is C-controllable and the triplet $(E, A, C)$ is C-observable.

Definition 5.14. A realisation $[E, A, B, C]$ of the transfer function $G(s)$ is called balanced if

$$ \mathcal{G}_{pc} = \mathcal{G}_{po} = \begin{pmatrix} \Sigma & 0 \\ 0 & 0 \end{pmatrix} \qquad \text{and} \qquad \mathcal{G}_{ic} = \mathcal{G}_{io} = \begin{pmatrix} 0 & 0 \\ 0 & \Theta \end{pmatrix} $$

with $\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_{n_f})$ and $\Theta = \operatorname{diag}(\theta_1, \dots, \theta_{n_\infty})$.


Now we show that for a minimal realisation $[E, A, B, C]$ with a c-stable pencil there exists a system equivalence transformation $(W_b, T_b)$ such that the transformed realisation

$$ [W_b^\top E T_b,\; W_b^\top A T_b,\; W_b^\top B,\; C T_b] \tag{5.28} $$

is balanced. Consider (5.27). Without loss of generality we may assume that $R_p$, $L_p$, $R_i$ and $L_i$ have full row rank. If our system is C-controllable and C-observable, then it follows from Corollary 5.9 and Lemma 5.12 that $\sigma_j > 0$, $j = 1, \dots, n_f$, and $\theta_j > 0$, $j = 1, \dots, n_\infty$. It follows that $L_p E R_p \in \mathbb{R}^{n_f \times n_f}$ and $L_i A R_i \in \mathbb{R}^{n_\infty \times n_\infty}$ are regular. Now calculate singular value decompositions

$$ L_p E R_p = U_p \Sigma V_p^\top, \qquad L_i A R_i = U_i \Theta V_i^\top, \tag{5.29} $$

where $U_p$, $V_p$, $U_i$ and $V_i$ are orthogonal, and $\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_{n_f})$ and $\Theta = \operatorname{diag}(\theta_1, \dots, \theta_{n_\infty})$ are regular, with $\sigma_i \ge \sigma_{i+1}$ and $\theta_i \ge \theta_{i+1}$. Now construct

$$ W_b = \left[ L_p^\top U_p \Sigma^{-1/2},\; L_i^\top U_i \Theta^{-1/2} \right], \qquad W_b' = \left[ E R_p V_p \Sigma^{-1/2},\; A R_i V_i \Theta^{-1/2} \right] \tag{5.30} $$

and

$$ T_b = \left[ R_p V_p \Sigma^{-1/2},\; R_i V_i \Theta^{-1/2} \right], \qquad T_b' = \left[ E^\top L_p^\top U_p \Sigma^{-1/2},\; A^\top L_i^\top U_i \Theta^{-1/2} \right]. \tag{5.31} $$

Since (see also (5.13))

$$ P_l E = E P_r, \qquad P_l A = A P_r, $$

we get that

$$ L_p E R_i = 0, \qquad L_i A R_p = 0. \tag{5.32} $$

We want to show that (5.32) holds. Assume that it does not, i.e. $L_p E R_i \ne 0$. Then

$$ L_p E R_i \ne 0 \;\Leftrightarrow\; L_p^\top L_p E R_i R_i^\top \ne 0 \;\Leftrightarrow\; \mathcal{G}_{po} E \mathcal{G}_{ic} \ne 0 \;\Leftrightarrow\; \mathcal{G}_{po} P_l E \mathcal{G}_{ic} \ne 0 \;\Leftrightarrow\; \underbrace{\mathcal{G}_{po} E P_r \mathcal{G}_{ic}}_{=0 \text{ by } (5.23)} \ne 0. $$

This is a contradiction, so our assumption was wrong and $L_p E R_i = 0$. In the same way we can show that $L_i A R_p = 0$. Continuing with (5.31),

$$ (T_b')^\top T_b = \begin{pmatrix} \Sigma^{-1/2} U_p^\top L_p E R_p V_p \Sigma^{-1/2} & \Sigma^{-1/2} U_p^\top L_p E R_i V_i \Theta^{-1/2} \\ \Theta^{-1/2} U_i^\top L_i A R_p V_p \Sigma^{-1/2} & \Theta^{-1/2} U_i^\top L_i A R_i V_i \Theta^{-1/2} \end{pmatrix} = I_n, $$


hence the matrices $T_b$ and $T_b'$ are nonsingular and $(T_b')^\top = T_b^{-1}$. In the same way we can show that the matrices $W_b$ and $W_b'$ are nonsingular and $(W_b')^\top = W_b^{-1}$. Using (5.27) and (5.29)-(5.32), we get that the proper and improper Gramians of the transformed system (5.28) have the form

$$ T_b^{-1} \mathcal{G}_{pc} T_b^{-\top} = \begin{pmatrix} \Sigma & 0 \\ 0 & 0 \end{pmatrix} = W_b^{-1} \mathcal{G}_{po} W_b^{-\top}, \qquad T_b^{-1} \mathcal{G}_{ic} T_b^{-\top} = \begin{pmatrix} 0 & 0 \\ 0 & \Theta \end{pmatrix} = W_b^{-1} \mathcal{G}_{io} W_b^{-\top}. $$

As we see, if we choose $W_b$ and $T_b$ as described, we get a balanced transformation of our system. Note that, as for ODEs, the balancing transformation is not unique; the reason is that the factors $U$ and $V$ of an SVD are not unique. From (5.29)-(5.31) we get

$$ E_b = W_b^\top E T_b = \begin{pmatrix} I_{n_f} & 0 \\ 0 & E_2 \end{pmatrix}, \qquad A_b = W_b^\top A T_b = \begin{pmatrix} A_1 & 0 \\ 0 & I_{n_\infty} \end{pmatrix}, $$

where $A_1 = \Sigma^{-1/2} U_p^\top L_p A R_p V_p \Sigma^{-1/2}$ and $E_2 = \Theta^{-1/2} U_i^\top L_i E R_i V_i \Theta^{-1/2}$. We see that the pencil $\lambda E_b - A_b$ is in Weierstrass canonical form; clearly it is regular, c-stable and has the same index as $\lambda E - A$.

Balanced truncation

In the previous subsection we have considered a transformation of a minimal realisation to a balanced realisation. However, computing the balanced realisation can be ill-conditioned as soon as $\Sigma$ or $\Theta$ in (5.29) has small singular values. In addition, if the realisation is not minimal, then the matrices $\Sigma$ or $\Theta$ will be singular. In the analogous situation for standard state-space systems one performs model reduction by truncating the state components corresponding to zero and small Hankel singular values, without significant changes of the system properties. This procedure is known as balanced truncation or truncated balanced realisation (TBR). We show that this idea can also be applied to a descriptor system (5.10). The proper Gramians can be used to describe the future output energy

$$ E_y = \int_0^\infty y^\top(t) y(t) \, dt $$

and the past input energy

$$ E_u = \int_{-\infty}^0 u^\top(t) u(t) \, dt $$

that is needed to reach from $x(-\infty) = 0$ the state $x(0) = x_0 \in \operatorname{im}(P_r)$ when no input is applied for $t \ge 0$.

Theorem 5.15. Consider system (5.10). Assume that (5.11) is c-stable and that the system is R-controllable. Let $\mathcal{G}_{pc}$ and $\mathcal{G}_{po}$ be the proper controllability and observability Gramians. If $x_0 \in \operatorname{im}(P_r)$ and $u(t) = 0$ for $t \ge 0$, then

$$ E_{x_0 \to y} = x_0^\top E^\top \mathcal{G}_{po} E x_0. $$


Moreover, for $u_{opt}(t) = B^\top \mathcal{F}(-t)^\top \mathcal{G}_{pc}^- x_0$ we have

$$ E_{u \to x_0} = \min_{u \in \mathcal{L}_2^k(\mathbb{R}^-)} E_u = x_0^\top \mathcal{G}_{pc}^- x_0, $$

where $\mathcal{L}_2^k(\mathbb{R}^-)$ is the Hilbert space of all $k$-dimensional vector functions that are square integrable on $\mathbb{R}^- = (-\infty, 0)$, and the matrix $\mathcal{G}_{pc}^-$ satisfies

$$ \mathcal{G}_{pc} \mathcal{G}_{pc}^- \mathcal{G}_{pc} = \mathcal{G}_{pc}, \qquad \mathcal{G}_{pc}^- \mathcal{G}_{pc} \mathcal{G}_{pc}^- = \mathcal{G}_{pc}^-, \qquad (\mathcal{G}_{pc}^-)^\top = \mathcal{G}_{pc}^-. \tag{5.33} $$

For the proof see [19].

Remark 5.16. Equations (5.33) imply that $\mathcal{G}_{pc}^-$ is a pseudo-inverse of $\mathcal{G}_{pc}$. It is, in general, not unique, but $u_{opt}(t)$ and $E_{u_{opt}}$ are uniquely defined. Unfortunately, we were unable to find a similar energy interpretation for the improper controllability Gramian.

If (5.10) is not minimal, then it has states that are uncontrollable and/or unobservable. These states correspond to the zero proper and improper Hankel singular values and can be truncated without changing the input-output behaviour of our system. Note that the number of non-zero improper Hankel singular values is equal to $\operatorname{rank}(\mathcal{G}_{ic} A^\top \mathcal{G}_{io} A)$, which can be estimated as

$$ \operatorname{rank}(\mathcal{G}_{ic} A^\top \mathcal{G}_{io} A) \le \min(\nu k, \nu q, n_\infty). $$

This estimate shows that, when the index $\nu$ of our DAE times the number of inputs $k$, or times the number of outputs $q$, is much smaller than the dimension $n_\infty$ of the corresponding deflating subspace, the original system can be reduced significantly. Furthermore, Theorem 5.15 implies that a large input energy $E_u$ is required to reach from $x(-\infty) = 0$ a state $x(0) = P_r x_0$ which lies in an invariant subspace of the proper controllability Gramian $\mathcal{G}_{pc}$ corresponding to its small non-zero eigenvalues. Moreover, if $x_0$ is contained in an invariant subspace of $E^\top \mathcal{G}_{po} E$ corresponding to its small non-zero eigenvalues, then the initial value $x(0) = x_0$ has only a small effect on the output energy $E_y$. For the balanced system, $\mathcal{G}_{pc}$ and $E^\top \mathcal{G}_{po} E$ are equal and hence have the same invariant subspaces. In this case the truncation of the states related to the small proper Hankel singular values does not change the system properties much. Unfortunately, the same does not hold for the improper Hankel singular values: if we truncate states that correspond to small non-zero improper Hankel singular values, then the pencil of the reduced system could acquire finite eigenvalues in the right half-plane. In this case the approximation is inaccurate, because the resulting system is not stable anymore.
Let $[E, A, B, C]$ be a realisation (not necessarily minimal) of $G(s)$. Assume that the pencil (5.11) is c-stable and that we have the Cholesky factorisations (5.27) of the Gramians. Let

$$ L_p E R_p = (U_1, U_2) \begin{pmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{pmatrix} (V_1, V_2)^\top \tag{5.34} $$

and

$$ L_i A R_i = U_3 \Theta_3 V_3^\top \tag{5.35} $$

be 'thin' singular value decompositions of $L_p E R_p$ and $L_i A R_i$, where $(U_1, U_2)$, $(V_1, V_2)$, $U_3$ and $V_3$ have orthonormal columns, $\Sigma_1 = \operatorname{diag}(\sigma_1, \dots, \sigma_{r_f})$ and $\Sigma_2 = \operatorname{diag}(\sigma_{r_f+1}, \dots, \sigma_{l_f})$ with


$\sigma_1 \ge \dots \ge \sigma_{r_f} > \sigma_{r_f+1} \ge \dots \ge \sigma_{l_f} > 0$ and $r_f = \operatorname{rank}(L_p E R_p) \le n_f$, and $\Theta_3 = \operatorname{diag}(\theta_1, \dots, \theta_{r_\infty})$ with $\theta_1 \ge \dots \ge \theta_{r_\infty} > 0$ and $r_\infty = \operatorname{rank}(L_i A R_i)$. Then the reduced system can be computed as

$$ \begin{pmatrix} s\tilde{E} - \tilde{A} & \tilde{B} \\ \tilde{C} & 0 \end{pmatrix} = \begin{pmatrix} W_r^\top (sE - A) T_r & W_r^\top B \\ C T_r & 0 \end{pmatrix}, \tag{5.36} $$

where

$$ W_r = \left[ L_p^\top U_1 \Sigma_1^{-1/2},\; L_i^\top U_3 \Theta_3^{-1/2} \right] \in \mathbb{R}^{n \times r}, \qquad T_r = \left[ R_p V_1 \Sigma_1^{-1/2},\; R_i V_3 \Theta_3^{-1/2} \right] \in \mathbb{R}^{n \times r} \tag{5.37} $$

and $r = r_f + r_\infty$. Note that the calculation of the reduced-order system can be interpreted as applying a system equivalence transformation $(\hat{W}, \hat{T})$ such that

$$ \begin{pmatrix} \hat{W}(sE - A)\hat{T} & \hat{W}B \\ C\hat{T} & 0 \end{pmatrix} = \begin{pmatrix} sE_f - A_f & & B_f \\ & sE_\infty - A_\infty & B_\infty \\ C_f & C_\infty & 0 \end{pmatrix}, $$

where the pencil $\lambda E_f - A_f$ has only finite eigenvalues and the eigenvalues of $\lambda E_\infty - A_\infty$ are infinite, and then reducing the order of the subsystems $(E_f, A_f, B_f, C_f)$ and $(E_\infty, A_\infty, B_\infty, C_\infty)$ using classical balanced truncation methods, i.e. multiplying each subsystem with a matrix $(I_{r_f}, 0)$ and $(I_{r_\infty}, 0)$, respectively. Clearly the reduced-order system (5.36) is minimal and the pencil $\lambda\tilde{E} - \tilde{A}$ is stable. The described decoupling of the system matrices is equivalent to the decomposition of the transfer function as $G(s) = G_f(s) + G_\infty(s)$, where $G_f(s) = C_f(sE_f - A_f)^{-1}B_f$ is the proper part and $G_\infty(s) = C_\infty(sE_\infty - A_\infty)^{-1}B_\infty$ the improper part. The transfer function of the reduced system has the form $\tilde{G}(s) = \tilde{G}_f(s) + \tilde{G}_\infty(s)$ with the reduced subsystems $\tilde{G}_f = \tilde{C}_f(s\tilde{E}_f - \tilde{A}_f)^{-1}\tilde{B}_f$ and $\tilde{G}_\infty = \tilde{C}_\infty(s\tilde{E}_\infty - \tilde{A}_\infty)^{-1}\tilde{B}_\infty$. Note that $G_\infty(s) = \tilde{G}_\infty(s)$ by construction, because in the improper part we only truncate states with zero Hankel singular values. Hence $G(s) - \tilde{G}(s) = G_f(s) - \tilde{G}_f(s)$. Thus we have the following upper bound for the $\mathcal{H}_\infty$-norm of the error between the transfer functions:

$$ \|G(s) - \tilde{G}(s)\|_{\mathcal{H}_\infty} := \sup_{\omega \in \mathbb{R}} \|G(i\omega) - \tilde{G}(i\omega)\| \le 2 \sum_{i=r_f+1}^{n_f} \sigma_i, \tag{5.38} $$

which can be derived in the same way as in the ODE case.
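For the ODE special case $E = I$ (only a proper part), the square-root form of this truncation and the error bound (5.38) can be sketched as follows; the system matrices are illustrative:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

# Square-root balanced truncation in the ODE special case (E = I), i.e. only a
# proper part, mirroring (5.36)-(5.37).  The system matrices are illustrative.
rng = np.random.default_rng(2)
n, r = 6, 2
A = np.triu(rng.standard_normal((n, n)), 1) - np.diag(np.arange(1.0, n + 1))
# A is upper triangular, so its eigenvalues are exactly -1,...,-6 (stable)
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))

P = solve_continuous_lyapunov(A, -B @ B.T)       # controllability Gramian
Q = solve_continuous_lyapunov(A.T, -C.T @ C)     # observability Gramian
Rp = cholesky(P, lower=True)                     # P = Rp Rp^T
Lp = cholesky(Q, lower=True).T                   # Q = Lp^T Lp
U, s, Vt = svd(Lp @ Rp)                          # s: Hankel singular values
W = Lp.T @ U[:, :r] / np.sqrt(s[:r])             # projection matrices, cf. (5.37)
T = Rp @ Vt[:r].T / np.sqrt(s[:r])
Ar, Br, Cr = W.T @ A @ T, W.T @ B, C @ T         # reduced order-r model

def tf(Am, Bm, Cm, w):
    return (Cm @ np.linalg.solve(1j * w * np.eye(len(Am)) - Am, Bm))[0, 0]

bound = 2 * s[r:].sum()                          # error bound (5.38)
err = max(abs(tf(A, B, C, w) - tf(Ar, Br, Cr, w)) for w in np.logspace(-2, 2, 50))
```

The sampled frequency-response error `err` underestimates the true $\mathcal{H}_\infty$ error, which in turn is bounded by twice the sum of the truncated Hankel singular values.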

5.5.3 Algorithms to calculate a TBR of a DAE

To reduce the order of a descriptor system we have to compute Cholesky factors of the proper and improper Gramians that satisfy the projected generalised Lyapunov equations (5.21)-(5.24). These factors can be computed using the generalised Schur-Hammarling method. Let the pencil (5.11) be in generalised Schur form,

$$ E = V \begin{pmatrix} E_f & E_u \\ 0 & E_\infty \end{pmatrix} U^\top, \qquad A = V \begin{pmatrix} A_f & A_u \\ 0 & A_\infty \end{pmatrix} U^\top, \tag{5.39} $$


where U and V are orthogonal, E_f and A_∞ are upper triangular nonsingular, E_∞ is upper triangular nilpotent (of index ν) and A_f is upper triangular. Now we compute the following matrices, whose blocks are partitioned conformally with E and A:

    V^T B = [ B_u ],    C U = ( C_f , C_u ).                 (5.40)
            [ B_∞ ]

Now the Cholesky factors of the Gramians of (5.10) have the form

    R_p = U [ R_f ],    R_i = U [ Y R_∞ ],                   (5.41)
            [  0  ]             [  R_∞  ]

    L_p = ( L_f , −L_f Z ) V^T,    L_i = ( 0 , −L_∞ ) V^T,

where Y and Z are the solutions of the generalised Sylvester equation

    E_f Y − Z E_∞ = −E_u,
    A_f Y − Z A_∞ = −A_u.                                    (5.42)

The matrices R_f and L_f are the Cholesky factors of the solutions X_pc = R_f R_f^T and X_po = L_f^T L_f of the generalised continuous-time Lyapunov equations

    E_f X_pc A_f^T + A_f X_pc E_f^T = −(B_u − Z B_∞)(B_u − Z B_∞)^T,    (5.43)
    E_f^T X_po A_f + A_f^T X_po E_f = −C_f^T C_f,                        (5.44)

while R_∞ and L_∞ are the Cholesky factors of X_ic = R_∞ R_∞^T and X_io = L_∞^T L_∞, which are the solutions of the generalised discrete-time Lyapunov equations

    A_∞ X_ic A_∞^T − E_∞ X_ic E_∞^T = B_∞ B_∞^T,                         (5.45)
    A_∞^T X_io A_∞ − E_∞^T X_io E_∞ = (C_f Y − C_u)^T (C_f Y − C_u).     (5.46)

From (5.39) and (5.41) we get that L_p E R_p = L_f E_f R_f and L_i A R_i = L_∞ A_∞ R_∞. Thus, the proper and improper Hankel singular values of (5.10) can be computed from the singular value decompositions of the matrices L_f E_f R_f and L_∞ A_∞ R_∞. Furthermore, it follows from (5.37) and (5.41) that the projection matrices W_r and T_r have the form

    W_r = V [   W_f      0  ],    T_r = U [ T_f  Y T_∞ ]        (5.47)
            [ −Z^T W_f  W_∞ ]             [  0    T_∞  ]

with W_f = L_f^T U_1 Σ_1^{−1/2}, W_∞ = L_∞^T U_3 Σ_3^{−1/2}, T_f = R_f V_1 Σ_1^{−1/2} and T_∞ = R_∞ V_3 Σ_3^{−1/2}. In this case


the matrices of the reduced system (5.36) are given by

    Ẽ = [ I_{r_f}       0        ],    Ã = [ W_f^T A_f T_f     0     ],
        [   0     W_∞^T E_∞ T_∞  ]         [      0        I_{r_∞}  ]
                                                                        (5.48)
    B̃ = [ W_f^T (B_u − Z B_∞) ],    C̃ = ( C_f T_f , (C_f Y + C_u) T_∞ ).
        [     W_∞^T B_∞       ]

Algorithm 13 is a generalisation of the 'square root' balanced truncation method for systems of type (5.10).

Algorithm 13 Generalised Square Root (GSR) method
Given a system of the type (5.10) with a c-stable pencil:
1. Compute the generalised Schur form (5.39).
2. Compute the matrices (5.40).
3. Solve the generalised Sylvester equation (5.42).
4. Compute the Cholesky factors R_f and L_f of X_pc = R_f R_f^T and X_po = L_f^T L_f, the solutions of equations (5.43) and (5.44), respectively.
5. Compute the Cholesky factors R_∞ and L_∞ of X_ic = R_∞ R_∞^T and X_io = L_∞^T L_∞, the solutions of equations (5.45) and (5.46), respectively.
6. Compute the 'thin' singular value decomposition L_f E_f R_f = (U_1, U_2) diag(Σ_1, Σ_2) (V_1, V_2)^T, see (5.34).
7. Compute the 'thin' singular value decomposition L_∞ A_∞ R_∞ = U_3 Σ_3 V_3^T, see (5.35).
8. Compute W_f = L_f^T U_1 Σ_1^{−1/2}, W_∞ = L_∞^T U_3 Σ_3^{−1/2}, T_f = R_f V_1 Σ_1^{−1/2} and T_∞ = R_∞ V_3 Σ_3^{−1/2}.
9. Compute the reduced order system [Ẽ, Ã, B̃, C̃] as in (5.48).

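In the ODE special case (E = I, so the improper part is absent) the GSR procedure reduces to the classical 'square root' balanced truncation. A minimal Python/SciPy sketch of that special case (the test system and the reduced order r = 3 are invented for illustration); the final assertion checks the error bound (5.38) at one frequency point:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

def square_root_bt(A, B, C, r):
    """Square-root balanced truncation for a stable ODE system
    dx/dt = A x + B u, y = C x (the E = I special case of the GSR method)."""
    # Controllability Gramian: A X + X A^T + B B^T = 0
    X = solve_continuous_lyapunov(A, -B @ B.T)
    # Observability Gramian:  A^T Y + Y A + C^T C = 0
    Y = solve_continuous_lyapunov(A.T, -C.T @ C)
    Rc = cholesky((X + X.T) / 2, lower=True)      # X = Rc Rc^T
    Lo = cholesky((Y + Y.T) / 2, lower=True)      # Y = Lo Lo^T
    U, s, Vt = svd(Lo.T @ Rc)                     # Hankel singular values in s
    W = Lo @ U[:, :r] / np.sqrt(s[:r])            # left projection
    T = Rc @ Vt[:r, :].T / np.sqrt(s[:r])         # right projection, W^T T = I
    return W.T @ A @ T, W.T @ B, C @ T, s

# small illustrative system
rng = np.random.default_rng(0)
A = np.diag([-1.0, -2.0, -3.0, -5.0, -8.0, -13.0])
B = rng.standard_normal((6, 1))
C = rng.standard_normal((1, 6))
Ar, Br, Cr, hsv = square_root_bt(A, B, C, r=3)

# the truncated Hankel singular values give the H-infinity error bound (cf. (5.38)),
# which in particular holds pointwise on the imaginary axis
s = 1j * 2.0
G = (C @ np.linalg.solve(s * np.eye(6) - A, B))[0, 0]
Gr = (Cr @ np.linalg.solve(s * np.eye(3) - Ar, Br))[0, 0]
assert abs(G - Gr) <= 2 * hsv[3:].sum() + 1e-10
```

Steps 4-8 of Algorithm 13 correspond to the Cholesky factorisations, the SVD and the scaled projection matrices above; the descriptor-specific steps (Schur form, Sylvester equation, improper Gramians) have no counterpart in this simplified setting.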
If the original system (5.10) is highly unbalanced, or if the deflating subspaces of the pencil (5.11) corresponding to the finite and infinite eigenvalues are close, then the projection matrices W_r and T_r as in (5.47) will be ill-conditioned. To avoid accuracy loss in the reduced system, a square root balancing free method for ODEs has been proposed. This method can be adapted to descriptor systems as described in Algorithm 14 on the next page. The GSR and GSRBF methods are mathematically equivalent in the sense that in exact arithmetic the transfer functions of the reduced systems are the same. It should be noted that the reduced realisation as in (5.49) is, in general, not balanced, and the corresponding pencil λẼ − Ã is not in Weierstrass-like canonical form.


Algorithm 14 Generalised square root balancing free (GSRBF) method
Given a system of the type (5.10) with a c-stable pencil:
1. Compute the generalised Schur form (5.39).
2. Compute the matrices (5.40).
3. Solve the generalised Sylvester equation (5.42).
4. Compute the Cholesky factors R_f and L_f of X_pc = R_f R_f^T and X_po = L_f^T L_f, the solutions of equations (5.43) and (5.44), respectively.
5. Compute the Cholesky factors R_∞ and L_∞ of X_ic = R_∞ R_∞^T and X_io = L_∞^T L_∞, the solutions of equations (5.45) and (5.46), respectively.
6. Compute the 'thin' singular value decomposition L_f E_f R_f = (U_1, U_2) diag(Σ_1, Σ_2) (V_1, V_2)^T, see (5.34).
7. Compute the 'thin' singular value decomposition L_∞ A_∞ R_∞ = U_3 Σ_3 V_3^T, see (5.35).
8. Compute the 'economy size' QR decompositions

    [ R_f V_1   Y R_∞ V_3 ] = Q_R R_R,    [   L_f^T U_1        0      ] = Q_L R_L.
    [    0        R_∞ V_3 ]               [ −Z^T L_f^T U_1  L_∞^T U_3 ]

9. Compute the reduced order system [Ẽ, Ã, B̃, C̃] with

    Ẽ = Q_L^T [ E_f  E_u ] Q_R,    Ã = Q_L^T [ A_f  A_u ] Q_R,
              [  0   E_∞ ]                   [  0   A_∞ ]
                                                                 (5.49)
    B̃ = Q_L^T [ B_u ],    C̃ = ( C_f , C_u ) Q_R.
              [ B_∞ ]
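Both algorithms require the solution of the coupled generalised Sylvester equation (5.42) in step 3. For small dense blocks it can also be solved naively by Kronecker vectorisation instead of the generalised Schur method; a sketch with invented block data (E_f, A_f with finite spectrum, E_∞ nilpotent, so the pencils have disjoint eigenvalues and the solution is unique):

```python
import numpy as np

def solve_coupled_sylvester(Ef, Af, Einf, Ainf, Eu, Au):
    """Solve E_f Y - Z E_inf = -E_u and A_f Y - Z A_inf = -A_u (cf. (5.42))
    via vec(E_f Y) = (I (x) E_f) vec(Y) and vec(Z E_inf) = (E_inf^T (x) I) vec(Z)."""
    p, m = Eu.shape
    Ip, Im = np.eye(p), np.eye(m)
    M = np.block([
        [np.kron(Im, Ef), -np.kron(Einf.T, Ip)],
        [np.kron(Im, Af), -np.kron(Ainf.T, Ip)],
    ])
    rhs = -np.concatenate([Eu.flatten('F'), Au.flatten('F')])
    yz = np.linalg.solve(M, rhs)
    Y = yz[:p * m].reshape((p, m), order='F')
    Z = yz[p * m:].reshape((p, m), order='F')
    return Y, Z

# illustrative blocks in the structure of (5.39)
Ef, Af = np.eye(3), np.diag([-1.0, -2.0, -4.0])        # finite eigenvalues
Einf = np.array([[0.0, 1.0], [0.0, 0.0]])              # nilpotent
Ainf = np.eye(2)                                       # infinite eigenvalues only
rng = np.random.default_rng(1)
Eu, Au = rng.standard_normal((3, 2)), rng.standard_normal((3, 2))
Y, Z = solve_coupled_sylvester(Ef, Af, Einf, Ainf, Eu, Au)
assert np.allclose(Ef @ Y - Z @ Einf, -Eu)
assert np.allclose(Af @ Y - Z @ Ainf, -Au)
```

This costs O((pm)³) and is only a stand-in for the specialised solvers; it is useful mainly for testing an implementation of the GSR/GSRBF pipeline on toy problems.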

Numerical aspects of the GSR and GSRBF methods

Now we want to discuss some numerical and implementation-related aspects of Algorithms 13 and 14. To compute the generalised Schur form (5.39) we can use the QZ algorithm or the GUPTRI algorithm. To solve the generalised Sylvester equation (5.42) we can use the generalised Schur method or its recursive blocked modification, which is more suitable for large problems. The upper triangular Cholesky factors R_f, L_f, R_∞ and L_∞ of the solutions of the generalised Lyapunov equations (5.43)-(5.46) can be determined without computing the solutions themselves by using the generalised Hammarling method. In this case the generalised Schur and Hammarling methods are


based on the preliminary reduction of the corresponding matrix pencil to generalised Schur form, the solution of a reduced system and a back transformation. Note that the pencils λE_f − A_f and λE_∞ − A_∞ in equations (5.42)-(5.46) are already in generalised Schur form. Thus, we only need to solve the upper (quasi-)triangular matrix equations. Finally, the singular value decompositions of the matrices L_f E_f R_f and L_∞ A_∞ R_∞, where all three factors are upper triangular, can be computed without forming these products explicitly. Since the GSR and the GSRBF methods are based on computing the generalised Schur form, the cost of each is O(n³) flops; in addition they have a memory complexity of O(n²). Thus, these methods can be used for problems of small and medium size only. Moreover, they do not take into account the sparsity or any structure of the system and are not attractive for parallelisation.

Remark 5.17. The TBR approach has the big advantage of an error bound (5.38) which is cheap to compute. This makes the method attractive for a TPWL approach, because the knowledge of an error bound can be used in the linearisation tuple controller to increase the accuracy of the model or to reduce the order of the system. But in practice this method has three disadvantages which speak against using it in a TPWL approach.

• The complexity of this method is high, because we have to solve generalised Lyapunov equations. So we can use this method only for small to medium size problems. But of course TPWL should also be useful for large problems.

• We have encountered some problems with the reduced systems. Before the reduction we have well-conditioned systems which are easy to solve, but the reduced system is ill-conditioned. One reason for this can be that the systems themselves were strongly unbalanced, which causes problems. But in the author's opinion this drawback can be overcome by using more stable algorithms to compute the reduction of the system.

• We cannot easily merge our local bases into a global basis by using an SVD. The reason for this is that the reduction matrices W_r and T_r do not represent a subspace, as the reduction matrices in a Krylov approach do; they are used for balancing the system. This means they 'regroup' the system in such a way that if we truncate the last part of the system, we get an optimal approximation, because we only truncate parts which are hard or impossible to observe and to control. Because each linearised system can have different important nodes, we cannot just merge these matrices via an SVD. The SVD, of course, takes the most dominant columns of all local reduction matrices. But this also means that some local systems are, after the global reduction, not balanced anymore. Thereby we may truncate nodes which are related to infinite eigenvalues with non-zero Hankel singular values. And as mentioned before, it may then happen that an infinite eigenvalue of the system moves to the right half plane, so that the system is not stable anymore and therefore not a good approximation of the original one.

5.6 Poor man's TBR (PMTBR)

In this section we discuss the PMTBR method; all results shown here come from [20]. We show a connection between multipoint rational approximation techniques and the TBR method. This connection motivates the PMTBR algorithm. The PMTBR method has some of the advantages of both methods: the straightforward implementation of projection-based methods, e.g. PRIMA, and reduction properties similar to TBR. PMTBR has promising properties with respect to order control and error


estimation, which, while not as powerful as the error control of the TBR method, appears to be an advantage over multipoint projection. We demonstrate in an example that the known methods, projection based and also TBR, have some disadvantages for special classes of systems. The example addresses one problematic feature of the TBR approach: while TBR provides strong guarantees based on error control, the method is global in the frequency domain, with no control over the allocation of the modeling effort to different frequency bands. Thus, the near-optimal approximation properties of TBR are only near-optimal for classes of problems rarely encountered in practical circuit analysis. Various approaches to frequency weighting have been proposed that generally involve construction and reduction of a composite system, obtained by pre- and/or post-multiplying the original system by auxiliary weighting systems. For narrow band applications such as RF circuits, construction and merging of such auxiliary systems is not desirable. Projection methods, on the other hand, can easily be tuned to generate accurate approximations in any frequency bands of interest, but have no error bound. For simplicity we first restrict the theory to ODEs of the type

    d/dt x = Ax + Bu
    y = Cx                                                       (5.50)

with x ∈ R^n, A ∈ R^{n×n}, B ∈ R^{n×k}, u ∈ R^k, C ∈ R^{q×n} and y ∈ R^q. The goal is to find a reduced system of dimension r,

    d/dt x_r = A_r x_r + B_r u
    y = C_r x_r

with x_r ∈ R^r, A_r ∈ R^{r×r}, B_r ∈ R^{r×k} and C_r ∈ R^{q×r}, which approximates our original system. In the last subsection we then discuss how we can adapt the theory to DAEs.

5.6.1 Model reduction background

In this section we discuss some linear model reduction background which we need to develop the PMTBR method. For the required knowledge about Krylov techniques and the TBR method the reader is referred to Section 5.4 and Section 5.5. An evolution of Krylov-subspace schemes is a method that constructs the projection matrix P from a rational, or multipoint, Krylov subspace. So instead of matching the moments at one frequency, we match moments at several frequencies, which are chosen in advance. Compared to single-point Krylov-subspace projectors, the multipoint approximants are in general more accurate, because we are approximating a broader frequency range; the construction of the subspace, however, is more expensive than for a single-point Krylov approach. Given m complex frequency points s_j, j = 1, ..., m, a projection matrix P can be constructed such that the j-th column is

    p_j = (s_j E − A)^{−1} B.                                    (5.51)

This leads to multipoint rational approximation. Multipoint projection is known to be an efficient reduction algorithm in the sense that the number of columns, which determines the final model size, is usually smaller for a given allowable approximation error than for a single-point Krylov space approach. Of course, there are many practical questions: how many frequency points s_j should be used, and how should the s_j be chosen? How is the error determined? How can we guarantee the linear independence of the columns of P?


An obvious strategy to guarantee linear independence is to perform an SVD on the projection matrix P. A main conclusion of this section is that constructing projection matrices by multipoint frequency sampling, as in (5.51), followed by an SVD, converges to the TBR algorithm. The singular values obtained by this procedure approximate the Hankel singular values of the original system, and can be used for order and error control.
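The multipoint-sampling-plus-SVD idea can be sketched as follows (illustrative ODE system, E = I): the sampled columns (5.51) are stacked, an SVD orthonormalises them, and the resulting one-sided projection then reproduces the transfer function exactly at the sampled frequencies.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
A = -np.diag(np.arange(1.0, n + 1)) + 0.1 * rng.standard_normal((n, n))
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))

# multipoint rational sampling, cf. (5.51): p_j = (s_j I - A)^{-1} B
freqs = [1e-1, 1e0, 1e1]
cols = [np.linalg.solve(1j * w * np.eye(n) - A, B) for w in freqs]
Zr = np.hstack([np.hstack([c.real, c.imag]) for c in cols])  # real span of the samples

# an SVD guarantees linearly independent (orthonormal) columns
P, sig, _ = np.linalg.svd(Zr, full_matrices=False)
P = P[:, sig > 1e-12 * sig[0]]

Ar, Br, Cr = P.T @ A @ P, P.T @ B, C @ P
G = lambda s: (C @ np.linalg.solve(s * np.eye(n) - A, B))[0, 0]
Gr = lambda s: (Cr @ np.linalg.solve(s * np.eye(P.shape[1]) - Ar, Br))[0, 0]
for w in freqs:                       # interpolation at the sample points
    assert abs(G(1j * w) - Gr(1j * w)) < 1e-8 * abs(G(1j * w))
```

The interpolation property holds because each sampled column lies in the span of P and P has orthonormal columns; after truncating columns with small singular values it holds only approximately.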

5.6.2 PMTBR approach

Analysis in Frequency Domain

For simplicity consider the case A = A^T, C = B^T, and further assume that A is stable. It is easy to see that in this case the controllability and observability Gramians are equal. We can compute them by solving the Lyapunov equation AX + XA^T + BB^T = 0. The Gramian can also be computed in the time domain as

    X = ∫_0^∞ e^{At} B B^T e^{A^T t} dt.

However, using the fact that the Laplace transform of e^{At} is (sI − A)^{−1}, it follows from Parseval's theorem that the Gramian can also be computed from

    X = ∫_0^∞ (iωI − A)^{−1} B B^T (iωI − A)^{−H} dω            (5.52)

where the superscript H denotes the Hermitian transpose. Consider evaluating X by applying numerical quadrature to (5.52). We use a quadrature scheme with nodes ω_j and weights w_j, and define z_j = (iω_j I − A)^{−1} B. So we can compute an approximation of X as

    X̂ = Σ_j w_j z_j z_j^H.                                      (5.53)

Now construct Z = [z_1, z_2, ...] and W = diag(√w_1, √w_2, ...), so that (5.53) can be written as X̂ = Z W² Z^H.

Model Construction via SVD

To derive a model reduction procedure, consider the eigendecomposition of the controllability Gramian X = V_L Σ V_L^T, where V_L^T V_L = I since X is real symmetric. An obvious candidate for reduction is to pick a projection matrix formed from the columns of V_L corresponding to the dominant eigenvalues of X. This is the same idea as in the POD approach. If the quadrature rule is accurate, X̂ will converge to X. This implies that the dominant eigenspace of X̂ converges to the dominant eigenspace of X. Now, consider the singular value decomposition

    Z W = V_Z S_Z U_Z^T
with S_Z real diagonal and V_Z and U_Z unitary. Hence X̂ = V_Z S_Z² V_Z^T. So the dominant vectors in V_Z, which are related to the dominant singular values in S_Z, give the eigenvectors of X̂. Therefore V_Z converges to the eigenspace of X, and the Hankel singular values are obtained from S_Z. V_Z can then be used as the projection matrix in a model order reduction scheme. It seems likely that the singular values of the matrix Z have something to do with the approximation error. The above shows that this is in fact precise: the SVD of Z reveals the same information as TBR (modulo the weights W). One question still remains: how fast does the scheme converge? In particular, how fast do the dominant singular vectors of ZW converge to the dominant eigenvectors of X? In practice it turns out that good models can be obtained with a small number of sample points. This is in agreement with the experience with multipoint approximation. That is the reason why the authors of [20] call their method "Poor Man's" TBR (PMTBR), since the quantities thus computed are cheap approximations of a TBR. They also remark that, surprisingly, in many practical applications PMTBR seems to perform better than TBR, in the sense of giving more accurate models for a given model size or amount of effort. This unexpected bonus demonstrates the virtues and rewards of frugality. The PMTBR algorithm is shown in Algorithm 15. We formulate the approach to allow the sample points s_i to be arbitrary points in the right half-plane.

Algorithm 15 Poor Man's TBR (PMTBR)
1. Do until satisfied:
2. Select a frequency point s_i.
3. Compute z_i = (s_i I − A)^{−1} B.
4. Build up Z_i: if s_i ∈ R, then Z_i = (z_1, ..., z_{i−1}, z_i), else Z_i = (z_1, z̄_1, ..., z_i, z̄_i), where z̄_i is the complex conjugate of z_i.
5. Calculate an SVD of Z_i; if the error is satisfactory, set Z = Z_i and go to step 6, else go to step 2.
6. Construct the projection matrix P from the columns of V_Z, dropping the columns which have small singular values.
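A sketch of the quadrature view behind Algorithm 15 for the symmetric case A = A^T, C = B^T (the system, the sample grid and the 1/π normalisation — which (5.52) leaves implicit — are choices made here for illustration): the weighted sample matrix reproduces the Gramian, and its singular values approach the Hankel singular values.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

n = 6
A = -np.diag([1.0, 2.0, 4.0, 8.0, 16.0, 32.0])      # symmetric, stable
rng = np.random.default_rng(3)
B = rng.standard_normal((n, 1))                      # C = B^T, so X = Y

# frequency samples z_j = (i w_j I - A)^{-1} B with trapezoid-like weights
omega = np.logspace(-3, 4, 2000)
Zc = np.hstack([np.linalg.solve(1j * w * np.eye(n) - A, B) for w in omega])
dw = np.gradient(omega)
# X = (1/pi) int_0^inf Re[z(w) z(w)^H] dw  (real-axis form of (5.52))
Xhat = ((Zc * dw) @ Zc.conj().T).real / np.pi

X = solve_continuous_lyapunov(A, -B @ B.T)           # exact Gramian for comparison
assert np.linalg.norm(Xhat - X) < 0.01 * np.linalg.norm(X)

# PMTBR projection basis: SVD of the weighted sample matrix in real form
Zw = Zc * np.sqrt(dw)
Zr = np.hstack([Zw.real, Zw.imag])
VZ, SZ, _ = np.linalg.svd(Zr, full_matrices=False)
P = VZ[:, :3]                                        # keep the 3 dominant directions
hsv = np.sort(np.linalg.eigvalsh(X))[::-1]           # here X = Y, so eig(X) are the HSVs
```

With a sufficiently fine sampling, SZ[i]² / π approaches the i-th Hankel singular value, which is exactly the order/error information the text above attributes to the SVD of ZW.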

5.6.3 Computational Complexity

To compare the cost of computing an r-th order model for a system with n states, using any of the proposed methods, we show the complexity of the basic operations involved. For a Krylov-subspace algorithm and PMTBR, these are either SVD or QR operations, at a cost of O(nr²), matrix solves, at a cost of O(n^α) (typically 1 ≤ α ≤ 1.2 for circuits), and matrix factorisations, at a cost of O(n^β) (typically 1.1 ≤ β ≤ 1.5 for circuits). As we saw, all of these algorithms are used to project the original system onto the lower dimensional subspace. Since the cost of this projection is the same for all methods, we neglect it here.

The TBR algorithm needs the computation of the Gramians X and Y, and we have to calculate the eigenvalues of their product, at a cost of O(n³). Since the TBR method has cubic complexity, its use is limited to small- and medium-sized problems. A Krylov subspace method requires one O(nr²) QR factorisation, one matrix factorisation and r solves, so the total cost is O(nr² + rn^α + n^β). Assuming that r frequency points are chosen in the quadrature scheme of the PMTBR method, Algorithm 15 involves one SVD, r solves and r factorisations, so the complexity of the method is O(nr² + rn^α + rn^β). Because multipoint rational approximation is closely related to the PMTBR approach, its cost is the same. So for a given model size, a Krylov-subspace method is the most efficient procedure for order reduction, due to the smaller number of factorisations. But PMTBR and multipoint projection may not need as high an order to reach the same approximation quality, so the choice depends on the relative cost of the factorisation; PMTBR should be preferred if α is close to β. In addition, when we take r frequency points in the PMTBR method, experience indicates that the SVD step may lead to an r̂-th order model with r̂ < r (see step 6 of Algorithm 15). If the simulation cost dominates the model computation cost, then the good reduction properties of PMTBR make this algorithm a very efficient alternative.

5.6.4 Practical implementation

Now we want to discuss some implementation aspects of the PMTBR algorithm. The first point is how the PMTBR algorithm can be used to reduce a DAE; after this we take a look at error estimation for the PMTBR approach.

Descriptor Systems

Usually, in circuit analysis we are handling a DAE instead of an ODE, so we have to adapt our method to this kind of problem. In this case the system is given by

    E d/dt x = Ax + Bu,

and the controllability Gramian can be obtained from A X E^T + E X A^T + B B^T = 0. Not surprisingly, the frequency domain equation is

    X = ∫ (iωE − A)^{−1} B B^T (iωE − A)^{−H} dω

and the above procedure can be used in the same way if we change the columns of Z to z_j = (s_j E − A)^{−1} B. Note that the complications in applying standard TBR to problems with a singular matrix E vanish in PMTBR.


Error Estimation

The above arguments can be extended to a generalised process of error estimation. The singular values obtained from the weighted Gramians can be interpreted as gains between filtered inputs and weighted outputs. Singular values from truncated modes can be interpreted as errors on the filtered system, i.e., finite-bandwidth or weighted errors. The singular value information can be used in three ways to guide model order control. First, if enough samples are taken that good estimates of the true Gramians are obtained, then the singular values obviously provide error bounds, through the connection to TBR. Second, the singular values can guide the process of point selection: with reasonably spaced sampling points, as projection vectors are added to the ZW-matrix, convergence of the singular values indicates convergence of the error, which tells us when to stop adding vectors to ZW. Third, we have found that, again assuming a sampling density consistent with the weighting w(ω), the singular values usually give a fairly good guide to model order well before convergence is achieved. Experiments indicate that when, for a number of samples in excess of the model order, the singular value distribution exhibits a small "tail" (that is, for a "small" ε there is a k with Σ_{i=k+1}^∞ σ_i < ε), then sufficient order and point placement have been achieved. Again, this is, as one would expect, strikingly similar to the usual TBR concepts.

Additional improvements

In [20] two additional PMTBR approaches are presented. We do not describe them in detail, because we have only used Algorithm 15 on page 87, but give a short introduction to the idea of both methods. The first approach is based on the idea that the input of our system is in general frequency banded and the designer of the circuit has an idea of this domain.
But the TBR algorithm does not take this into account, because it treats each frequency with the same importance and constructs a reduced system which behaves well for all frequencies, including unimportant ones. So for the PMTBR method we introduce a frequency domain of interest and choose the frequency points s_j only from this domain to construct the final reduced system. This approach is called frequency selective PMTBR. The second approach additionally takes possible correlations in the input into account. In [20] it is shown that if we have the correlation matrix of the inputs, we can use this additional information to create a reduced model which yields a better approximation. The authors of the paper point out that in some tests these approaches behave better than the basic approach, so it seems worthwhile to try both methods: if we can use them in a TPWL approach, we get better approximations with the same effort. The PMTBR approach has several advantages compared to the other approaches if we use it in a TPWL method.

• Compared to Krylov it creates smaller systems with similar approximation quality; this speeds up the simulation of the reduced system and also saves memory for storing the TPWL model. It also behaves better over a larger frequency domain, so that we can trust the reduced model more. The disadvantage is of course that the PMTBR approach is more expensive than a Krylov approach.

• Compared to TBR, PMTBR has the advantage that the model extraction step is cheap, but also the disadvantage that we do not have a guaranteed error bound. But as long as we stay in the frequency domain which we have used to create the PMTBR model we


can expect a similar behaviour. The next advantage is that the implementation is not that difficult: for a PMTBR method we need a good linear system solver and an SVD method, and for both, effective and numerically stable algorithms exist. For the TBR approach there are still problems with numerical stability.

• A PMTBR approach is still a projection-based method, so we can construct our subspace in an easy way.

5.7 Conclusion

Now we want to discuss which method is the most advantageous to be used in a TPWL method.

Krylov based methods

As we have already seen, Krylov techniques are a good choice if we want fast model extraction, i.e. online model reduction. These techniques are good in the sense that building up the global basis is just an SVD of all local bases. But this approach has the disadvantage that we do not have a good error bound; of course, near the chosen frequency point we can expect a good behaviour. Also, the Krylov approach tends to create bases which are 'too large'.

TBR for DAEs

The TBR method for DAEs sounds promising at first sight, because it has a global error bound in the frequency domain and also really good reduction properties. But as already discussed, the complexity of this method is the highest of all methods, O(n³). We also have the disadvantage that building the global basis is not as easy as in the other methods, because we cannot simply use an SVD of the local bases to create the global basis. This makes the approach for the moment not useful in a TPWL setting.

PMTBR method

This approach has several advantages. First, it constructs small reduced models which approximate our original system really well in the chosen frequency domain. We also have an error estimate for the local systems, but be aware that this is not a bound. The subspaces are also quite cheap to compute, compared to TBR, and the implementation is really straightforward. The disadvantage is that it is not as fast as the Krylov methods and that it does not have a guaranteed error bound.

So, as we have seen, at this point we have to choose between Krylov methods, especially PRIMA, and the PMTBR approach. The decision should be made with respect to what we want: fast model extraction, with possibly bigger approximation errors and bigger local subspaces, or a slower model reduction which can give smaller local models and at least an error estimate. In the author's opinion PMTBR has the best properties for a TPWL approach.


Chapter 6

Numerical results

In this chapter we show how the proposed TPWL method performs in practice. We discuss the results of our linearisation tuple controller and also of our weighting algorithm. As already mentioned, both steps have a big influence on the accuracy of the TPWL model. The chapter is divided into four sections. In the first section we describe the test circuit, a chain of inverters, and explain why this circuit is a good test for a model reduction method. The topic of the second section is how a good linearisation tuple controller can improve the approximation results without increasing the number of linearisation tuples or the order. In the third section we show how the different weighting procedures perform. In the last section we present the results of the TPWL method when the input signal used for simulating the circuit differs from the training input. These results show that the proposed method is useful in circuit simulation in the sense that we get relatively robust and fast models which approximate the original system well. All tests were made on a P4 workstation with a 2 GHz CPU. The tests have been done in MATLAB Version 7.0.1 (R14), Windows version.

6.1 Test circuit

All tests are done on a scalable circuit which describes a chain of inverters; for more information see [31, 32]. The circuit behaves nonlinearly, so it is a good test for the TPWL method. Also, we have dependencies between all nodes, which is likewise not an optimal situation for a model reduction process. An inverter is a subcircuit which transforms a logical value into its inverse. Let u_in denote a time-varying input signal. High, or status 1, is represented by a constant operating voltage called U_op, usually U_op = 5 V, up to a threshold voltage U_thres: if the input u_in ≥ U_op − U_thres, the input is already regarded as high. On the other hand, low, or status 0, is represented by the ground voltage, i.e. the input signal has status low if u_in ≤ U_thres. (The remaining case U_thres < u_in < U_op − U_thres is an intermediate state to which no logical value is assigned.) This negation functionality can be obtained technically by using a MOSFET (metal oxide semiconductor field effect transistor); a rough model is given below. Moreover, a chain of an even number of inverters merely delays the input signal. So it can be seen as a suitable test for a TPWL method, because on the one hand we have a digital behaviour of the circuit, and on the other hand each node changes its state in such a way that we have an active and an inactive part of the circuit.


6.1.1 Modeling of a MOSFET

The MOSFET is an element with four terminals: gate (g), source (s), drain (d) and bulk (b); see Figure 6.1 on page 92. The potential difference between gate and source controls the current i_ds (from drain to source). This is achieved by doping a piece of semiconducting material (e.g. Si) negatively at the drain and source regions, but positively at the bulk. Furthermore, a thin layer of SiO2 isolates the drain-source channel from the gate, such that there are almost no currents through the gate. Assuming our MOSFET to be of n-channel type, we observe that the larger the potential difference u_gs, the larger the drain-source current i_ds. Hence MOSFETs are basically described by a voltage-current relation, and the easiest model for a MOSFET is thus a voltage controlled current source:

    i_ds = k · f(u_g, u_d, u_s),  where
    f(u_g, u_d, u_s) = max(u_g − u_s − U_thres, 0)² − max(u_g − u_d − U_thres, 0)²        (6.1)

and all leakage currents are assigned to be zero (as for an ideal transistor): i_gs = i_gb = i_gd = 0.
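The rough model (6.1) in code form; U_thres = 1 V is an assumed value, not fixed in this excerpt (it is, however, roughly consistent with the low-state equilibrium ≈ 0.595 used as initial condition in Section 6.1.2):

```python
def f(ug, ud, us, u_thres=1.0):
    """Voltage-controlled current source of the MOSFET model (6.1).
    u_thres is an assumed threshold value, not given in this excerpt."""
    return max(ug - us - u_thres, 0.0) ** 2 - max(ug - ud - u_thres, 0.0) ** 2

# gate high (5 V), drain high, source grounded: the transistor conducts
assert f(5.0, 5.0, 0.0) == 16.0
# gate low: both terms clip to zero, no current
assert f(0.0, 5.0, 0.0) == 0.0
# drain pulled down to the source potential: the two terms cancel
assert f(5.0, 0.0, 0.0) == 0.0
```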

Figure 6.1: A MOSFET (terminals: gate, drain, source and bulk)

An inverter

A single inverter, see Figure 6.2 on page 93, consists of one voltage source, one resistor, one capacitor and of course one MOSFET, where the voltage source serves as input device; the remaining, additional voltage source in Figure 6.2 supplies the operating voltage. Using the above model for the MOSFET (6.1) together with the following list of parameters

    R = 3 kΩ
    C = 0.2 pF
    k = 2 · 10^{-4} A/V²

and a scaling in time, and therefore of the capacitance, by 10^9, we get the following equation for node 1 (in a dimension-free notation):

    u̇_1 = U_op − u_1 − f(u_in, u_1, 0).                       (6.2)


Figure 6.2: Inverter

The input signal

As input for the later tests, the following pulse was chosen:

    u_in(t) = t − 5            for 5 ≤ t ≤ 10,
              5                for 10 ≤ t ≤ 15,
              (5/2)(17 − t)    for 15 ≤ t ≤ 17,
              0                otherwise.

Figure 6.3: Input signal (0-5 V pulse over time [ns])

This pulse has discontinuities in the first derivative. Therefore it is advisable to select LTs at the times t = 5, 10, 15 and 17, which is done in the later TPWL tests. Thus, the initial value u_1(t = 0) = 5 attached to (6.2) is consistent with the above input signal and completes the mathematical description of the inverter.


Solution

A transient analysis for a single inverter yields for u_1 the solution given in Figure 6.4.

Figure 6.4: Inverted signal (solution u_1 for a single inverter)

We note some aspects: firstly, the input signal is inverted; secondly, it is smoothed; and thirdly, we observe a small delay, due to the memory effects of the capacitors.

6.1.2 Chaining

The chain of inverters is a concatenation of several inverters, where the output of one inverter serves as input for the succeeding one. The circuit description is given in Figure 6.5 on page 95. Thus a chain of an even number of inverters, n say, mainly delays the input signal (and of course smoothes it). Our mathematical model is now given by the set of n equations obtained from (6.2):

    u̇_1 = U_op − u_1 − f(u_in, u_1, 0)                          (6.3)
    u̇_k = U_op − u_k − f(u_{k−1}, u_k, 0)   for k = 2, ..., n

plus the initial conditions

    u_i = 0.59478 if i mod 2 = 0,
    u_i = 5       if i mod 2 = 1.

It is a very suitable test example for various reasons, as mentioned above: firstly, the solution is almost given by the input; secondly, we obviously have a digital behaviour; and thirdly, it is scalable. Feeding the above input signal (Figure 6.3 on page 93) to a chain of n = 100 inverters, we get at the nodes 2, 10, 20, 30, 40 and 50 the voltage characteristics of Figure 6.6 on page 95. Inspecting Figure 6.6 we observe that, when node 2 is active, the nodes beyond 26 (approximately) do not change at all; these are latent at the time the signal passes node 2. The


Figure 6.5: Inverter chain

The consequence of this is that we have an active and a latent part in our circuit, which the LT controller should take into account. For the tests we chose n = 100, so the final circuit has 104 nodes, which is an acceptable size for a model reduction technique.
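The chain model (6.3) is easy to sketch in code. The following Python fragment is an illustration only: the quadratic MOSFET current `f_mosfet` (with a threshold voltage `u_t`) is a hypothetical stand-in for the actual MOSFET model (6.1); in the dimension-free form (6.2)/(6.3) the resistor, capacitor and time scaling are already absorbed.

```python
import numpy as np

def f_mosfet(u_g, u_d, u_s, k=2e-4, u_t=1.0):
    """Hypothetical quadratic MOSFET drain current, standing in for model (6.1).
    k and the threshold voltage u_t are illustrative parameters only."""
    u_gs, u_ds = u_g - u_s, u_d - u_s
    if u_gs <= u_t:
        return 0.0                                       # cut-off
    if u_ds < u_gs - u_t:
        return k * (u_gs - u_t - u_ds / 2.0) * u_ds      # triode region
    return k / 2.0 * (u_gs - u_t) ** 2                   # saturation

def chain_rhs(u, u_in, u_op=5.0):
    """Right-hand side of (6.3): du_k/dt = Uop - u_k - f(u_{k-1}, u_k, 0),
    with node 1 driven by the external input u_in."""
    du = np.empty_like(u)
    du[0] = u_op - u[0] - f_mosfet(u_in, u[0], 0.0)
    for j in range(1, len(u)):
        du[j] = u_op - u[j] - f_mosfet(u[j - 1], u[j], 0.0)
    return du
```

Because each node only couples to its predecessor, a quiet input leaves the tail of the chain essentially unchanged — exactly the latency that the LT controller should exploit.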

Figure 6.6: Solution of the chain of inverters (voltage at the nodes 2, 10, 20, 30, 40 and 50 over time)


6.2 Linearisation tuple controller

As we have already discussed, the linearisation tuple controller has a very big influence on the quality of a TPWL model. We now compare several different approaches we have developed for this report. For all methods we use the simple weighting procedure described in 4.2.1, to make the results comparable; as estimator for yn, needed to calculate the weights, we simply use yn−1. We report the number of LTs each LT controller selects and how long the model extraction takes. This time does not include the simulation time of the BDF method: the BDF solution is computed in advance with high accuracy, using a BDF2 method with a simulation time of 800 s. We also report the relative error ||x − x̃||/||x|| of each method, compared to a BDF2 solution with fixed stepsize equal to the one we use for the TPWL method. And we compare the solutions of the different approaches at node 50 with the BDF solution, to see whether the TPWL method introduces additional latencies. The solution of the created TPWL model is computed with a backward Euler method with fixed stepsize (h = 10^{-11}). The global basis of the TPWL model is constructed by the approach discussed in Section 3.3.2, i.e. only parts of the local bases are used to create the global basis. The angular frequency point s at which we create the Krylov models is 10^6 for all tests. For the PMTBR approach we have chosen the frequency domain [10^3, 10^9]; in this region we chose 104 frequency sampling points (the number of unknowns in the original system), equidistant on a logarithmic scale. The bound for dropping columns in the PMTBR approach is 10^{-30}, and the maximum size of the local basis is r/2. For the PRIMA approach we used a special implementation by Pieter Heres, TU Eindhoven, which is more stable than the original PRIMA algorithm.
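The fixed-stepsize backward Euler solve of a TPWL model, with the weights frozen at the estimator yn−1 as described above, can be sketched as follows. The concrete form of the local linearised reduced systems (A_i, b_i, B_i) and of the weighting function are assumptions of this illustration, not the exact form used in the thesis.

```python
import numpy as np

def tpwl_backward_euler(models, weights_of, u, y0, h, n_steps):
    """Fixed-stepsize backward Euler solve of a TPWL model
        y' = sum_i w_i(y) * (A_i y + b_i + B_i u(t)),
    with the weights frozen at the previous state y_{n-1} (the estimator).
    models: list of (A_i, b_i, B_i); u: scalar input signal u(t)."""
    y = np.asarray(y0, dtype=float)
    traj = [y.copy()]
    identity = np.eye(y.size)
    for n in range(1, n_steps + 1):
        w = weights_of(y)                                  # estimator y_{n-1}
        A = sum(wi * Ai for wi, (Ai, _, _) in zip(w, models))
        c = sum(wi * (bi + Bi * u(n * h)) for wi, (_, bi, Bi) in zip(w, models))
        # backward Euler step: (I - h*A) y_n = y_{n-1} + h*c
        y = np.linalg.solve(identity - h * A, y + h * c)
        traj.append(y.copy())
    return np.array(traj)
```

Freezing the weights at yn−1 keeps each step a single linear solve of reduced size r, which is the source of the speed up over a fully nonlinear solver.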

6.2.1 State distance dependent LT controller

We start with a simple state distance dependent approach: we select a new LT whenever the relative distance ||x − x_li||/||x|| of the original solution to the last LT exceeds a given bound δx. Note that we have not chosen an additional time distance control; the state distance approach already selects so many LTs that there is no need to include one. Because we use only a state distance controlled LT controller, the number of LTs is the same for the Krylov and the PMTBR approach. For the test we chose δx = 0.025. The results can be seen in Table 6.1 and Figure 6.7 on page 97; all errors are relative errors ||x − x̃||/||x||. We conclude from the results that the Krylov/PRIMA-TPWL approach has a fast model extraction (88 s) but rather poor approximation properties (mean error 0.018), while the PMTBR-TPWL approach has exactly the opposite properties: the model extraction is slower (121 s) but the accuracy is much better (mean error 0.01).

          # LTs   Model extraction time   Simulation time   max. error   mean error
Krylov     239            88s                  53s           0.072667     0.017914
PMTBR      239           121s                  51s           0.039801     0.010231

Table 6.1: Results distance dependent LT controller
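The state distance dependent selection itself can be sketched in a few lines; here `snapshots` stands for the precomputed BDF trajectory, and the fragment is an illustrative simplification of the controller used in the tests.

```python
import numpy as np

def select_lts(snapshots, delta_x=0.025):
    """State distance dependent LT selection: walk the precomputed (BDF)
    trajectory and add a linearisation tuple whenever the relative distance
    ||x - x_li|| / ||x|| to the last LT exceeds delta_x."""
    lts = [0]                       # always linearise at the initial state
    x_li = snapshots[0]
    for n in range(1, len(snapshots)):
        x = snapshots[n]
        if np.linalg.norm(x - x_li) / np.linalg.norm(x) > delta_x:
            lts.append(n)
            x_li = x
    return lts
```

With δx = 0.025 this criterion fires very often on the inverter chain, which explains the 239 LTs in Table 6.1 and why no extra time distance control is needed.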


Figure 6.7: Results distance dependent LT controller. Top: relative error ||x − x̃||/||x|| over time for the PRIMA and PMTBR approaches (r = 50), together with the given bound. Middle: voltage at node 50 compared to the BDF solution. Bottom: close-ups of two flanks.


6.2.2 LT controller without an error estimator for the reduction part

Here we show the results of the LT controller discussed in Section 3.1.1, i.e. we estimate the error of the local linearised reduced systems by solving a least squares error system. We compare the results of a Krylov dependent and a PMTBR dependent TPWL method. For the PMTBR approach we chose δx = 0.03 and for the Krylov technique δx = 0.09. We chose different accuracy parameters because the local approximation of the Krylov approach is not as accurate as that of the PMTBR method; so we either have to increase the order of the Krylov-TPWL model or its error bound. To make both methods comparable we chose the same order for both. The results can be seen in Table 6.2 and Figure 6.8 on page 99. We found that the Krylov approach selects fewer LTs (50) than the PMTBR approach (70), which is due to the different δx's. The model extraction of the Krylov approach (125 s) is faster than that of PMTBR (151 s), for two reasons: the Krylov approach selects fewer LTs, and the PMTBR approach is more complex. The mean error of both methods does not exceed the given bound (Krylov 0.05, PMTBR 0.02), so our LT controller can control the error. There remains the problem, however, that in both cases the approximated solution exceeds the given error bound at some times: the maximum error is 0.1 for the Krylov approach and 0.05 for the PMTBR approach. The reason is that the bound is still an estimate and not a true bound. In both approaches the model extraction time is quite high, because solving the least squares system is expensive.

So in practice this approach can be useful, but in the next section we see that another approach gives better results. Looking at the approximation at node 50, the PMTBR approach captures the flanks well, while the Krylov approach has some problems with them.

          # LTs   Model extraction time   Simulation time   max. error   mean error
Krylov      50           125s                  40s           0.102584     0.051891
PMTBR       70           151s                  41s           0.053478     0.021165

Table 6.2: LT controller without an error estimator for the reduction part

6.2.3 LT control by simulating the local linearised reduced system

Here we show the results of the LT controller discussed in 3.1.4. We compare the results of a Krylov dependent and a PMTBR dependent TPWL method. For the PMTBR approach we chose δx = 0.025 and for the Krylov technique δx = 0.045; the reason for the different accuracy parameters is again that the local approximation of the Krylov approach is poor compared to that of the PMTBR approach. The results can be seen in Table 6.3 and Figure 6.9 on page 101. The Krylov approach selects fewer LTs (74) than the PMTBR approach (84 for r=50 and 85 for r=40), due to the different δx's. The mean error of both methods does not exceed the given bound: 0.012 for the Krylov approach, 0.009 for PMTBR with r=50 and 0.019 for PMTBR with r=40. So our LT controller is able to control the error. But again the approximated solution does at some times exceed the error bound: the maximum error is 0.09 for the Krylov approach, 0.06 for the PMTBR approach with r=50, and for the PMTBR approach with r=40 the


Figure 6.8: LT controller without an error estimator for the reduction part. Top: relative error over time for the PRIMA approach (r = 50, bound 0.09) and the PMTBR approach (r = 50, bound 0.03). Middle: voltage at node 50 compared to the BDF solution. Bottom: close-ups of two flanks.


maximum error is 0.07. The reason is that the bound is still an estimate and not a true bound, because we can only control the local approximation error. Again the Krylov method has the fastest model extraction (26 s) but the worst approximation properties (mean error 0.012). In contrast, the PMTBR approach requires a larger extraction time (40 s for r=50, 47 s for r=40) but gives a better approximation (mean error 0.009 for r=50, 0.019 for r=40). We also note that with the PMTBR approach we are able to reduce the system to dimension 40, which is not possible for the Krylov approach. Looking at the comparison at node 50, the PMTBR approach with reduction to dimension 50 approximates the flanks better than the Krylov approach. So the PMTBR approach is the better of the two, because of its better compression as well as approximation properties.

           r   # LTs   Model extraction time   Simulation time   max. error   mean error
Krylov    50      74            26s                  43s           0.087920     0.012763
PMTBR     50      84            40s                  47s           0.061861     0.009279
PMTBR     40      85            47s                  37s           0.073462     0.018975

Table 6.3: LT control by simulating the local linearised reduced system

6.2.4 Final method

In this section we show the results of the final LT controller, which is combined with a BDF method. Simulating the circuit with the same settings as for the LT controller takes 223 s with the BDF2 algorithm. The ideas used in this approach are the same as in 3.1.4. As linear model reduction we use the PMTBR approach, because of its better approximation compared to the PRIMA approach. But here we also apply back stepping if the actual LT is not a good point; the method restarts the BDF method with the default stepsize if the back stepping reaches the last LT. Moreover, we not only use a standard norm to measure the distance between the BDF and the reduced system, but also include the weighted norm introduced in 3.1.4. As already discussed, the back stepping and the additional weighted norm increase the accuracy of the LT controller. Because the LT controller is used directly inside a BDF method, we cannot compare the model extraction time of this approach with the previous ones; but because of the strong relation between this method and 6.2.3, the model extraction itself, without the BDF, is the same. To get a good accuracy we set δx = 0.025. The results can be seen in Table 6.4 and Figure 6.10 on page 103. In all three cases the LT controller selects 62 LTs; the local subspaces can be relatively small, so the local dimension never exceeds 20. The model extraction time including the BDF method is 240 s for r=50, 236 s for r=40 and 233 s for r=35, so the model extraction itself only takes 17 s for r=50, 13 s for r=40 and 10 s for r=35. This shows that our approach does not increase the simulation time by much, because we can reuse many results of the BDF method. We also see that the error bound holds in the r=50 and r=40 cases (mean error 0.012 for r=50, 0.017 for r=40).

In the r=35 case the mean error (0.026) exceeds the given error bound of 0.025. Again the maximum error exceeds the bound for all three dimensions; the reason is that we can only estimate the local approximation error and not the global one.


Figure 6.9: LT control by simulating the local linearised reduced system. Top: relative error over time for the PRIMA approach (r = 50, bound 0.045) and the PMTBR approach (r = 50 and r = 40, bound 0.025). Middle: voltage at node 50 compared to the BDF solution. Bottom: close-ups of two flanks.


Finally, the results show that we obtain quite reasonable speed ups. For r=50 the speed up factor compared to the BDF simulation time is 5.4, for r=40 we get a speed up of 7.2, and for r=35 the simulation is 8.3 times faster than the BDF method.

           r   # LTs   Extr. time + BDF   Simul. time   max. error   mean error   Speed up
PMTBR     50      62         240s             41s        0.037484     0.012051      5.4
PMTBR     40      62         236s             31s        0.033233     0.016672      7.2
PMTBR     35      62         233s             27s        0.057046     0.026128      8.3

Table 6.4: Final LT controller

6.3 Weighting

We now compare the weighting methods developed in Chapter 4. As TPWL model we use the dimension 50 model from 6.2.4. As in Section 6.2, we use a backward Euler method with fixed stepsize h = 10^{-11} to solve the system, and as estimator for yn we use yn−1. We compare the different weighting possibilities and also show how much effort is involved in using each of them.

6.3.1 Distance dependent weighting

We now show the results of the distance dependent weighting approach described in 4.2.1, i.e. the weights depend only on the sum of the state and time distance. We show the results for three different values of β, to see how this parameter influences the quality of the results. Table 6.5 on page 102 lists the mean and maximum error for the different β's, and Figure 6.11 on page 106 shows the error together with a closer look at one specific node. The results show that the larger β is, the smaller the error: we get the smallest mean error (0.01) with β = 125. The reason is that β controls the 'sharpness' of the weights: large β's give sudden changes of a weight between 0 and 1, while small β's result in smooth weights. So if we want to increase the accuracy of our TPWL model, we can make the weights sharper, i.e. choose β larger.

  β    max error   mean error
  5    0.057062    0.019907
 25    0.037484    0.012051
125    0.032805    0.010792

Table 6.5: Distance dependent weighting
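A distance dependent weighting of this kind can be sketched as below. The exponential form w_i ∝ exp(−β d_i / min_j d_j) is a common TPWL choice and is assumed here only for illustration; the exact formula is the one of Section 4.2.1.

```python
import numpy as np

def tpwl_weights(d, beta=25.0):
    """Distance dependent weights (a common TPWL choice, assumed here):
        w_i ~ exp(-beta * d_i / min_j d_j), normalised to sum to 1.
    d: combined state+time distances of the current state to all LTs."""
    d = np.asarray(d, dtype=float)
    m = d.min()
    if m == 0.0:                     # exactly on an LT: put all weight there
        w = (d == 0.0).astype(float)
    else:
        w = np.exp(-beta * d / m)
    return w / w.sum()
```

Increasing β makes the weights sharper: for β = 125 almost all weight falls on the nearest LT, which matches the behaviour observed in Table 6.5.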

6.3.2 Distance dependent weights under the knowledge of the Hessians

Here we describe the results of a weighting method that uses information on the norms of the Hessians. The ideas of this method are described in 4.2.2: we calculate the weights in such a way that they reflect the linearisation error of the nonlinear system. With this


Figure 6.10: Final LT controller. Top: relative error over time for the PMTBR approach with r = 50, 40 and 35, together with the error bound 0.025. Middle: voltage at node 50 compared to the BDF solution. Bottom: close-ups of two flanks.


approach we get better results by merely scaling the distance: we scale the distance using the estimate of the supremum of the Hessians, which we obtain from the model extraction step. Table 6.6 on page 104 lists the mean and maximum error for the different β's, and Figure 6.12 on page 107 shows the error together with a closer look at one specific node. We see again that large β's result in a sharp weighting. But we also see that the error itself is smaller than with the approach of 6.3.1: for example, with β = 5 the simple approach gives a mean error of 0.019, while this extended approach gives a mean error of 0.014. The reason is that these weights respect the behaviour of the distance better than the simple approach, because of their strong relation to the linearisation error.

  β    max error   mean error
  5    0.043657    0.014254
 25    0.033948    0.011097
125    0.032528    0.010735

Table 6.6: Distance dependent weighting with knowledge of the norms of the Hessians
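The extended weighting can be sketched as follows: each distance is scaled by the estimated supremum of the Hessian norm at the corresponding LT before the weights are formed, so regions where the linearisation error grows fast lose weight faster. The concrete scaling below is an illustrative assumption; the exact formula is given in Section 4.2.2.

```python
import numpy as np

def hessian_scaled_weights(d, hess_norms, beta=25.0):
    """Distance dependent weights with knowledge of the Hessians (sketch):
    scale each LT distance by the estimated supremum of the Hessian norm at
    that LT, then apply an exponential weighting and normalise."""
    d = np.asarray(d, dtype=float) * np.asarray(hess_norms, dtype=float)
    m = d.min()
    if m == 0.0:
        w = (d == 0.0).astype(float)
    else:
        w = np.exp(-beta * d / m)
    return w / w.sum()
```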

6.4 Results for different inputs

We now present the results of the PMTBR-TPWL method when the input differs from the training input. The different inputs are given as:

1. Shifted input. The shape of the input is the same as that of the training input, but shifted to the right:

u1(t) = t − 7            for 7 ≤ t ≤ 12,
        5                for 12 ≤ t ≤ 17,
        (19 − t) · 5/2   for 17 ≤ t ≤ 19,
        0                otherwise.

2. Added sinus. This input signal has an additional sinus:

u2(t) = t − 5 + (1/2) sin(2πt)            for 5 ≤ t ≤ 10,
        5 + (1/2) sin(2πt)                for 10 ≤ t ≤ 15,
        (17 − t) · 5/2 + (1/2) sin(2πt)   for 15 ≤ t ≤ 17,
        0                                 otherwise.

3. Added sinus and shifted input. Here the input is shifted and a sinus is added:

u3(t) = t − 7 + (1/2) sin(2πt)            for 7 ≤ t ≤ 12,
        5 + (1/2) sin(2πt)                for 12 ≤ t ≤ 17,
        (19 − t) · 5/2 + (1/2) sin(2πt)   for 17 ≤ t ≤ 19,
        0                                 otherwise.

Again we calculated the results with the PMTBR-TPWL model obtained in 6.2.4, of dimension 50. The TPWL system is solved by a fixed stepsize backward Euler method with h = 10^{-11}. For the weighting we used the distance dependent weighting with β = 25, see 4.2.1. The results of the test can be seen in Figure 6.13 on page 108, Figure 6.14 on page 109


and Table 6.7 on page 105. The results show that we can change the input and still obtain a good approximation. This can be seen by comparing the errors of the TPWL tests with the training input to those with the different inputs: the mean error of the TPWL model with the training input is 0.012, while the mean errors for the different inputs are about 0.01. So we can also use different inputs to simulate our system, but of course not a totally different input. The reason is simple: with a totally different input we would leave the accuracy region of our TPWL model, and then we can no longer guarantee the accuracy of the solution.

 Input      max error   mean error
 original   0.037484    0.012051
 u1         0.033004    0.010591
 u2         0.029530    0.010370
 u3         0.029550    0.010270

Table 6.7: Results for different inputs
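The three alternative inputs defined above transcribe directly into Python; this sketch is only for illustration (times in ns, as before).

```python
import math

def u1(t):
    """Shifted training pulse."""
    if 7 <= t <= 12:
        return t - 7.0
    if 12 <= t <= 17:
        return 5.0
    if 17 <= t <= 19:
        return (19.0 - t) * 5.0 / 2.0
    return 0.0

def u2(t):
    """Training pulse with an additional sinus (only while the pulse is active)."""
    s = 0.5 * math.sin(2.0 * math.pi * t)
    if 5 <= t <= 10:
        return t - 5.0 + s
    if 10 <= t <= 15:
        return 5.0 + s
    if 15 <= t <= 17:
        return (17.0 - t) * 5.0 / 2.0 + s
    return 0.0

def u3(t):
    """Shifted pulse with an additional sinus."""
    s = 0.5 * math.sin(2.0 * math.pi * t)
    if 7 <= t <= 12:
        return t - 7.0 + s
    if 12 <= t <= 17:
        return 5.0 + s
    if 17 <= t <= 19:
        return (19.0 - t) * 5.0 / 2.0 + s
    return 0.0
```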


Figure 6.11: Distance dependent weighting. Top: relative error over time for β = 5, 25, 125 of a PMTBR-TPWL model (r = 50), together with the error bound 0.025. Middle: voltage at node 30 compared to the BDF solution. Bottom: close-ups of two flanks.


Figure 6.12: Distance dependent weighting with knowledge of the norms of the Hessians. Top: relative error over time for β = 5, 25, 125 of a PMTBR-TPWL model (r = 50), together with the error bound 0.025. Middle: voltage at node 30 compared to the BDF solution. Bottom: close-ups of two flanks.


Figure 6.13: Results for different inputs (1/2). Top: relative error over time for the shifted input, the input with an additional sinus, and the shifted input with an additional sinus, together with the error bound 0.025 (PMTBR-TPWL model, r = 50). Bottom: voltage at node 50 for the shifted input (problem 1) compared to the BDF solution, with close-ups of two flanks.


Figure 6.14: Results for different inputs (2/2). Voltage at node 3 of the PMTBR-TPWL model (r = 50) for the input with an additional sinus (problem 2) and for the shifted input with an additional sinus (problem 3), compared to the BDF solution, with close-ups.


Chapter 7

Summary

We now summarise the results presented in this thesis and give some advice for further research.

7.1 Conclusion

The TPWL method applied to nonlinear DAEs describing circuits is a promising technique to reduce the simulation time. It has several advantages compared to other methods. First of all, we can obtain a big speed up in simulation time (a factor 8.3 in our tests), because we only solve small linear systems to approximate the original system. Next, we can use the well-developed linear model reduction techniques to increase the performance of the method. We can also create a linearisation tuple controller that can be used directly in a BDF method, which is a big advantage because it yields a fast model extraction. And we can even improve the properties of the TPWL model if we construct a good weighting procedure. Finally, the TPWL method has the nice property that it is scalable: by using different linearisation tuple controllers, linear model reduction techniques and weighting methods, we can tune the method from fast but not very accurate to slower but much more accurate. This means that the user can decide what is more important: speed or accuracy.

But we also have to give a warning. As with all model reduction techniques, we have to be careful when reducing systems: if we reduce them too much, the results can no longer be trusted.

In our tests we have also found that the combination of the simulation dependent linearisation tuple controller, Poor Man's TBR as linear model reduction and a weighting with knowledge of the norms of the Hessians gives the best results. With this combination we obtain a speed up factor of 8.2 compared to a BDF method, and we can increase the speed up even more by using a linear DAE solver with a stepsize controller.

7.2 Future work

During this project we have studied a lot of possibilities and ideas which could improve the performance and/or the accuracy of the TPWL method. Of course we could not investigate all of them in detail, so we want to give some advice for further research.

• As already shown, the TBR method for the linear model reduction part of the TPWL process seems to be quite promising. There are several reasons why we think that additional research


in this part can improve the TPWL method. First, the TBR approach has a given error bound which can also be computed cheaply. Using it, we could improve the linearisation tuple controller so that it chooses better and fewer linearisation tuples; this would decrease the storage needed for a TPWL model, and with fewer linearisation tuples the simulation of the TPWL model would be faster. Compared to the other methods, the TBR approach also has better compression properties at the same accuracy, so the local models can be smaller and the simulation time of the TPWL model may decrease further. But the TBR method has the disadvantage that we cannot use a simple SVD to create the global basis, so we have to find another way to combine the local subspaces into a global subspace.

• We also have to think about stability and passivity of the TPWL model. If the original model is stable and passive, we want to obtain a reduced model which preserves these properties. If we use a linear model reduction technique which preserves stability and passivity, e.g. PRIMA, we can be sure that the local models are stable and passive; but we still take a linear combination of the local systems, and therefore we have to construct weights which preserve these properties.

• The accuracy of the TPWL model remains good as long as we stay in the accuracy region, but once we leave this region we have no idea what happens to the solution. So it is a good idea to enlarge the accuracy region, for instance by combining several TPWL models, all generated with different inputs, into a new TPWL model with a much larger accuracy region. Of course, enlarging the accuracy region also increases the storage costs, because we have to save more local systems.

• We also want to point out that it is still possible to use the TPWL method as an online model reduction. This means we could use it instead of a BDF solver, applying linearisation and model reduction directly to the original equations. Of course we have to be very careful, but we think that with additional research this may lead to a technique which, for some special systems, reduces the simulation time dramatically.

• In our tests we have seen that the TPWL model is able to capture the flanks of the digital signal, which is very important for a circuit designer. But the TPWL method also introduces some additional 'oscillations'. An idea to overcome this problem is to filter the solution after we have solved our TPWL system, to remove these oscillations. It may also be that with a smaller local subspace the oscillations become smaller or even disappear.

• Finally we want to mention the idea of trajectory piecewise polynomial (TPWP) approximation. This idea is strongly related to TPWL; the only difference is that we use a polynomial instead of a linear approximation of the system. Of course this would increase the model extraction time, because we have to calculate higher order derivatives of q and j, but we know from approximation theory that a polynomial approximation in general behaves much better than a linear one.


Bibliography

[1] A. Frohberger, Model reduction techniques for linear and nonlinear dynamical systems, Report, Philips ED&T/Analogue Simulation, 2003.

[2] M.J. Rewienski, A trajectory piecewise-linear approach to model order reduction of nonlinear dynamical systems, PhD Thesis, Massachusetts Institute of Technology, 2003.

[3] I.M. Elfadel, D.D. Ling, A block rational Arnoldi algorithm for passive model-order reduction of multiport RLC networks, Proc. ICCAD, 1997.

[4] M. Hinze, A. Kauffmann, Reduced order modeling and suboptimal control of a solid fuel ignition model, Preprint No. 636/99, Technische Universität Berlin, 1999.

[5] A.J. Newman, Model reduction via the Karhunen-Loeve expansion part I: An exposition, Technical Research Report T.R. 96-32, Inst. Systems Research, April 1996.

[6] A. Odabasioglu, M. Celik, L.T. Pileggi, PRIMA: passive reduced-order interconnect macromodeling algorithm, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 17, No. 8, pp. 645-654, Aug. 1998.

[7] C.W. Rowley, T. Colonius, R.M. Murray, POD based models of self-sustained oscillations in the flow past an open cavity, AIAA paper 2000-1969, 6th AIAA/CEAS Aeroacoustics Conference, June 2000.

[8] Q. Su, V. Balakrishnan, C.-K. Koh, A factorization-based framework for passivity-preserving model reduction of RLC systems, DAC 2002, June 10-14, 2002.

[9] S. Volkwein, Proper orthogonal decomposition and singular value decomposition, SFB-Preprint No. 153, 1999.

[10] B.C. Moore, Principal component analysis in linear systems: controllability, observability, and model reduction, IEEE Transactions on Automatic Control, 1981.

[11] S. Lall, J.E. Marsden, S. Glavaski, A subspace approach to balanced truncation for model reduction of nonlinear systems, International Journal of Robust and Nonlinear Control, pp. 519-535, 2002.

[12] Y. Saad, Iterative methods for sparse linear systems, Second Edition, SIAM, 2003.

[13] W.J. Rugh, Nonlinear System Theory, The Johns Hopkins University Press, 1981.

[14] J.R. Phillips, Projection frameworks for model reduction of weakly nonlinear systems, DAC Conference, Los Angeles, 2000.


[15] Z. Bai, D. Skoogh, Krylov subspace techniques for reduced-order modeling of nonlinear dynamical systems, University of California, 2000-2001.

[16] M. Schetzen, The Volterra and Wiener theories of nonlinear systems, John Wiley, New York, 1980.

[17] L. Daniel, J. Phillips, Model order reduction for strictly passive and causal distributed systems, DAC 2002, June 10-14, 2002, New Orleans, Louisiana, USA.

[18] R. Winkler, Eigenvalue-based algorithm for testing positive realness of SISO systems, Thesis, Matematiska Institutionen, Stockholms Universitet, May 2001.

[19] T. Stykel, Gramian based model reduction for descriptor systems, Mathematics of Control, Signals and Systems 16, pp. 297-319, 2004.

[20] J. Phillips, L.M. Silveira, Poor Man's TBR: A simple model reduction scheme, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 24, No. 1, January 2005.

[21] B.N. Datta, Krylov subspace methods for large-scale matrix problems in control, Special Issue on Structural Dynamical Systems in Linear Algebra: Computational Issues of the journal Future Generation Computer Systems, October 2002.

[22] E.J. Grimme, Krylov projection methods for model reduction, PhD Thesis, Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, 1997.

[23] A.C. Antoulas, D.C. Sorensen, Approximation of large-scale systems: An overview, Technical Report, Rice University, Houston, Texas, February 2001.

[24] J. Rommes, Jacobi-Davidson methods and preconditioning with applications in pole-zero analysis, MSc Thesis, Nat.Lab. Unclassified Report 2002/817, Philips Electronics N.V., 2002.

[25] L. Daniel, J. Phillips, Model order reduction for strictly passive and causal distributed systems, DAC 2002, June 10-14, 2002, New Orleans, Louisiana, USA.

[26] K.J. Kerns, I.L. Wemple, A.T. Yang, Stable and efficient reduction of substrate model networks using congruence transforms, IEEE/ACM Proc. ICCAD, pp. 207-214, Nov. 1995.

[27] M. Rathinam, L.R. Petzold, A new look at proper orthogonal decomposition, SIAM Journal on Numerical Analysis, Vol. 41, No. 5, pp. 1893-1925, 2003.

[28] P. Astrid, Reduction of process simulation models: a proper orthogonal decomposition approach, PhD Thesis, Department of Electrical Engineering, Eindhoven University of Technology, 2004.

[29] Y. Zhou, Numerical methods for large scale matrix equations with applications in LTI system model reduction, PhD Thesis, University of Houston, Texas, 2002.

[30] T. Stykel, Analysis and numerical solution of generalized Lyapunov equations, PhD Thesis, Institut für Mathematik, Technische Universität Berlin, 2002.

[31] A. Bartel, M. Günther, A multirate W-method for electrical networks in state space formulation, J. Comp. Appl. Math. 147 (2002) 2, pp. 411-425.


[32] M. Günther, P. Rentrop, Multirate ROW methods and latency of electrical circuits, Appl. Numer. Math. 13 (1993), pp. 83-102.
