Least Square Problems
Lecture 11
Vu Xuan Quynh, K53 Advanced Maths, Ha Noi University of Sciences
September 26, 2011
Outline

1. Least Square Problems
   - The Problem
   - Example: Polynomial Data-Fitting
   - Orthogonal Projection and the Normal Equations
   - Pseudoinverse
   - Normal Equations
   - QR Factorization
   - SVD
   - Comparison of Algorithms
The Problem
We wish to find a vector x ∈ C^n that satisfies Ax = b, where A ∈ C^{m×n} and b ∈ C^m. In general, such a problem has no solution: this is the case whenever b ∉ range(A). A rectangular system of equations with m > n is said to be overdetermined. The vector

    r = b − Ax ∈ C^m    (11.1)

is known as the residual. The problem takes the following form:

    Given A ∈ C^{m×n}, m ≥ n, and b ∈ C^m, find x ∈ C^n such that ‖b − Ax‖₂ is minimized.    (11.2)
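As a concrete illustration (not part of the original lecture), the following NumPy sketch sets up a hypothetical overdetermined system and checks that the least squares solution has the smallest residual norm among candidate vectors:

```python
import numpy as np

# Hypothetical overdetermined system: m = 4 equations, n = 2 unknowns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.1, 0.9, 2.1, 2.9])

# np.linalg.lstsq minimizes ||b - Ax||_2 over all x, as in (11.2).
x, *_ = np.linalg.lstsq(A, b, rcond=None)
r = b - A @ x  # the residual (11.1)

# Any other candidate gives a residual at least as large.
x_other = x + np.array([0.05, -0.05])
assert np.linalg.norm(r) <= np.linalg.norm(b - A @ x_other)
```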
Example 11.1. Polynomial Interpolation
Suppose we are given m distinct points x₁, x₂, …, x_m ∈ C and data y₁, y₂, …, y_m ∈ C. Then there exists a unique polynomial interpolant to these data of the form

    p(x) = c₀ + c₁x + ⋯ + c_{m−1}x^{m−1}    (11.3)

with the property p(x_i) = y_i. The coefficients satisfy the square Vandermonde system

\[
\begin{bmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^{m-1} \\
1 & x_2 & x_2^2 & \cdots & x_2^{m-1} \\
1 & x_3 & x_3^2 & \cdots & x_3^{m-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_m & x_m^2 & \cdots & x_m^{m-1}
\end{bmatrix}
\begin{bmatrix} c_0 \\ c_1 \\ c_2 \\ \vdots \\ c_{m-1} \end{bmatrix}
=
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_m \end{bmatrix}.
\]
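The interpolation step can be sketched in NumPy (the data values here are hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical data: m = 4 distinct points and values.
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([1.0, 2.0, 0.0, 5.0])
m = len(xs)

# Square Vandermonde matrix with columns 1, x, x^2, ..., x^(m-1).
V = np.vander(xs, N=m, increasing=True)

# Solving the square system gives the unique degree-(m-1) interpolant.
c = np.linalg.solve(V, ys)

def p(x):
    """Evaluate the interpolant p(x) = c_0 + c_1 x + ... + c_{m-1} x^{m-1}."""
    return sum(c[k] * x**k for k in range(m))

# The interpolant passes through every data point.
assert np.allclose([p(x) for x in xs], ys)
```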
Figure: Degree 10 polynomial interpolant to eleven data points. The axis scales are not given, as these have no effect on the picture.
Example 11.2. Polynomial Least Squares Fitting
Without changing the data points, we can do better by reducing the degree of the polynomial:

    p(x) = c₀ + c₁x + c₂x² + ⋯ + c_{n−1}x^{n−1},  for some n < m.    (11.5)

Such a polynomial is a least squares fit to the data if it minimizes the sum of the squares of the deviations from the data:

    ∑_{i=1}^{m} |p(x_i) − y_i|².    (11.6)
This sum of squares is equal to the square of the norm of the residual, ‖r‖₂², for the rectangular Vandermonde system

\[
\begin{bmatrix}
1 & x_1 & \cdots & x_1^{n-1} \\
1 & x_2 & \cdots & x_2^{n-1} \\
1 & x_3 & \cdots & x_3^{n-1} \\
\vdots & \vdots & & \vdots \\
1 & x_m & \cdots & x_m^{n-1}
\end{bmatrix}
\begin{bmatrix} c_0 \\ c_1 \\ \vdots \\ c_{n-1} \end{bmatrix}
\approx
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_m \end{bmatrix}.
\]
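A minimal NumPy sketch of this fit, on hypothetical roughly-quadratic data (not from the original lecture):

```python
import numpy as np

# Hypothetical data: m = 8 points; fit a polynomial of degree n - 1 = 2.
xs = np.linspace(0.0, 1.0, 8)
ys = xs**2 + 0.01 * np.sin(50 * xs)  # roughly quadratic data with noise
n = 3

# Rectangular m-by-n Vandermonde matrix with columns 1, x, x^2.
V = np.vander(xs, N=n, increasing=True)

# lstsq minimizes ||ys - V c||_2, i.e. the sum of squared deviations (11.6).
c, *_ = np.linalg.lstsq(V, ys, rcond=None)

# The fit cannot interpolate all 8 points, but the residual stays small.
assert np.linalg.norm(ys - V @ c) < 0.1
```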
Figure: Degree 7 polynomial least squares fit to the same eleven data points.
Theorem 11.1
Let A ∈ C^{m×n} (m ≥ n) and b ∈ C^m be given. A vector x ∈ C^n minimizes the residual norm ‖r‖₂ = ‖b − Ax‖₂, thereby solving the least squares problem (11.2), if and only if r ⊥ range(A), that is,

    A*r = 0,    (11.8)

or equivalently,

    A*Ax = A*b,    (11.9)

or again equivalently,

    Pb = Ax,    (11.10)

where P ∈ C^{m×m} is the orthogonal projector onto range(A). The n × n system of equations (11.9), known as the normal equations, is nonsingular if and only if A has full rank. Consequently the solution x is unique if and only if A has full rank.
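The orthogonality condition (11.8) is easy to verify numerically. The sketch below (not part of the original lecture) uses a random real full-rank matrix, so A* reduces to Aᵀ:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))  # hypothetical full-rank 6x3 matrix
b = rng.standard_normal(6)

# Solve the normal equations (11.9): A*A x = A*b.
x = np.linalg.solve(A.T @ A, A.T @ b)
r = b - A @ x

# (11.8): the residual is orthogonal to range(A).
assert np.allclose(A.T @ r, 0)
```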
Orthogonal Projection and the Normal Equations
Figure: Formulation of the least squares problem (11.2) in terms of the orthogonal projection.
Proof
(11.8) and (11.9) are equivalent because r = b − Ax. The equivalence of (11.8) and (11.10) follows from the properties of orthogonal projectors. To see that y = Pb is the unique closest point to b in range(A), suppose z ≠ y is another point in range(A). Since z − y is orthogonal to b − y, the Pythagorean theorem gives

    ‖b − z‖₂² = ‖b − y‖₂² + ‖y − z‖₂² > ‖b − y‖₂²,

as required. Finally, if A*A is singular, then A*Ax = 0 for some nonzero x, hence x*A*Ax = ‖Ax‖₂² = 0. Thus Ax = 0, so A is rank-deficient. Conversely, if A is rank-deficient, then Ax = 0 for some nonzero x, hence A*Ax = 0, so A*A is singular. By (11.9), nonsingularity of A*A implies the uniqueness of x. ∎
Pseudoinverse
If A has full rank, then the solution x to the least squares problem (11.2) is unique and is given by x = (A*A)^{−1}A*b. The matrix (A*A)^{−1}A* is known as the pseudoinverse of A, denoted by A⁺:

    A⁺ = (A*A)^{−1}A* ∈ C^{n×m}.

To summarize: the full-rank least squares problem is to compute one or both of the vectors

    x = A⁺b,    y = Pb,

where A⁺ is the pseudoinverse of A and P is the orthogonal projector onto range(A).
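In NumPy the explicit formula can be checked against the built-in pseudoinverse (a sketch with a hypothetical real matrix; forming (A*A)^{−1} explicitly is for illustration only, not how one would compute x in practice):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 2))  # hypothetical full-rank 5x2 matrix
b = rng.standard_normal(5)

# Explicit pseudoinverse (real case, so A* = A.T).
A_pinv = np.linalg.inv(A.T @ A) @ A.T

# NumPy's pinv (computed via the SVD) agrees for full-rank A.
assert np.allclose(A_pinv, np.linalg.pinv(A))

# The least squares solution x = A^+ b.
x = A_pinv @ b
```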
Normal Equations
The classical way to solve least squares problems is to solve the normal equations (11.9). If A has full rank, this is a square, hermitian positive definite system of equations of dimension n, so the standard method of solving it is by Cholesky factorization.
Algorithm 11.1. Least Squares via Normal Equations
Algorithm 11.1
1. Form the matrix A*A and the vector A*b.
2. Compute the Cholesky factorization A*A = R*R.
3. Solve the lower-triangular system R*w = A*b for w.
4. Solve the upper-triangular system Rx = w for x.

Exploiting symmetry, forming A*A requires only ~mn² flops, and the Cholesky factorization requires ~n³/3 flops, so the total operation count is ~ mn² + n³/3 flops.
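The four steps can be sketched in NumPy (not part of the original lecture; real matrices, so A* = Aᵀ, and np.linalg.cholesky returns the lower-triangular factor L with A*A = LLᵀ, i.e. R = Lᵀ):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((8, 3))  # hypothetical full-rank 8x3 matrix
b = rng.standard_normal(8)

# Step 1: form A*A and A*b.
G = A.T @ A
c = A.T @ b

# Step 2: Cholesky factorization G = L L^T (so R = L^T).
L = np.linalg.cholesky(G)

# Steps 3-4: two triangular solves. np.linalg.solve is used for brevity;
# a dedicated triangular solver would exploit the structure.
w = np.linalg.solve(L, c)
x = np.linalg.solve(L.T, w)

# Agrees with the reference least squares solution.
assert np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0])
```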
QR Factorization
The "modern classical" method for solving least squares problems is based upon reduced QR factorization.
Algorithm 11.2. Least Squares via QR Factorization
1. Compute the reduced QR factorization A = Q̂R̂.
2. Compute the vector Q̂*b.
3. Solve the upper-triangular system R̂x = Q̂*b for x.

Work for Algorithm 11.2: ~ 2mn² − (2/3)n³ flops.
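A NumPy sketch of these steps (not part of the original lecture; mode='reduced' gives the m × n factor Q̂, and A* = Aᵀ for real matrices):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((8, 3))  # hypothetical full-rank 8x3 matrix
b = rng.standard_normal(8)

# Step 1: reduced QR factorization, Q of shape m-by-n, R of shape n-by-n.
Q, R = np.linalg.qr(A, mode='reduced')

# Steps 2-3: form Q*b and solve the upper-triangular system R x = Q*b.
x = np.linalg.solve(R, Q.T @ b)

# Agrees with the reference least squares solution.
assert np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0])
```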
Algorithm 11.3. Least Squares via SVD
1. Compute the reduced SVD A = ÛΣ̂V*.
2. Compute the vector Û*b.
3. Solve the diagonal system Σ̂w = Û*b for w.
4. Set x = Vw.

Work for Algorithm 11.3: ~ 2mn² + 11n³ flops.
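The same steps in NumPy (not part of the original lecture; full_matrices=False gives the reduced factors, and the diagonal solve becomes elementwise division by the singular values):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((8, 3))  # hypothetical full-rank 8x3 matrix
b = rng.standard_normal(8)

# Step 1: reduced SVD, U of shape m-by-n, s the singular values, Vh = V*.
U, s, Vh = np.linalg.svd(A, full_matrices=False)

# Steps 2-3: form U*b and solve the diagonal system by division.
w = (U.T @ b) / s

# Step 4: x = V w.
x = Vh.T @ w

# Agrees with the reference least squares solution.
assert np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0])
```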
Comparison of Algorithms
When speed is the only consideration, Algorithm 11.1 may be best. However, solving the normal equations is not always stable in the presence of rounding errors, so Algorithm 11.2 is instead the standard method for least squares problems. If A is close to rank-deficient, Algorithm 11.2 has less than ideal stability properties, and Algorithm 11.3, based on the SVD, is used instead.
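The root cause of the normal equations' instability can be observed numerically: forming A*A squares the condition number of A, so accuracy is lost twice as fast. A small demonstration (not part of the original lecture, using a hypothetical Vandermonde matrix):

```python
import numpy as np

# A mildly ill-conditioned Vandermonde matrix: 12 equispaced points on
# [0, 1], polynomial degree 5.
xs = np.linspace(0.0, 1.0, 12)
A = np.vander(xs, N=6, increasing=True)

kappa_A = np.linalg.cond(A)
kappa_AtA = np.linalg.cond(A.T @ A)

# cond(A*A) = cond(A)^2 in the 2-norm (up to rounding), so the normal
# equations work with a far worse-conditioned system than A itself.
assert abs(np.log10(kappa_AtA) - 2 * np.log10(kappa_A)) < 0.5
```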
References
[1] Lloyd N. Trefethen and David Bau, Numerical Linear Algebra, SIAM, 1997.
[2] Gene H. Golub and Charles F. Van Loan, Matrix Computations, 3rd edition, Johns Hopkins University Press, 1996.
[3] James W. Demmel, Applied Numerical Linear Algebra, SIAM, 1997.
THANKS FOR YOUR ATTENTION!