Chapter 162 — Matrix Factorization

Prerequisites: Gaussian elimination (ch161), matrix multiplication (ch154)

You will learn:

  • Why factorization is the key paradigm of numerical linear algebra

  • LU, QR, Cholesky as the three main factorizations

  • When to use each and what they reveal about the matrix

  • The unifying idea: split a matrix into structured, easier factors

Environment: Python 3.x, numpy, scipy

# --- Matrix Factorization: Overview of main factorizations ---
import numpy as np

np.random.seed(42)
A = np.array([[4.,3.,2.],[6.,3.,1.],[2.,5.,3.]])

print("=== Three main factorizations ===\n")

# 1. LU: A = PLU (general square matrices, for solving systems)
import scipy.linalg
P, L, U = scipy.linalg.lu(A)  # SciPy's convention: A = P @ L @ U
print("LU Factorization: A = PLU")
print(f"  L (lower triangular):\n{np.round(L,4)}")
print(f"  U (upper triangular):\n{np.round(U,4)}")
print(f"  A = PLU: {np.allclose(A, P @ L @ U)}\n")

# 2. QR: A = QR (orthogonal Q, upper triangular R; for least squares, eigenvalues)
Q, R = np.linalg.qr(A)
print("QR Factorization: A = QR")
print(f"  Q (orthogonal, QᵀQ=I):\n{np.round(Q,4)}")
print(f"  R (upper triangular):\n{np.round(R,4)}")
print(f"  QᵀQ = I: {np.allclose(Q.T @ Q, np.eye(3))}")
print(f"  QR = A:  {np.allclose(Q @ R, A)}\n")

# 3. Cholesky: A = LLᵀ (symmetric positive definite only; fastest, most stable)
S = A.T @ A   # symmetric positive definite since A is invertible
L_chol = np.linalg.cholesky(S)
print("Cholesky: S = LLᵀ  (requires S symmetric positive definite)")
print(f"  L (lower triangular):\n{np.round(L_chol,4)}")
print(f"  LLᵀ = S: {np.allclose(L_chol @ L_chol.T, S)}\n")

# When to use which:
print("=== Usage guide ===")
print("LU:        Solving Ax=b for general square A.  O(n³) once, O(n²) per right-hand side.")
print("QR:        Least squares (overdetermined). Eigenvalue algorithms.  More stable than LU for rank-deficient.")
print("Cholesky:  SPD matrices (covariance, Gram, Laplacian). 2x faster than LU.")
print("SVD:       General rectangular. Rank, pseudoinverse, PCA.  See ch173.")
=== Three main factorizations ===

LU Factorization: A = PLU
  L (lower triangular):
[[1.     0.     0.    ]
 [0.3333 1.     0.    ]
 [0.6667 0.25   1.    ]]
  U (upper triangular):
[[6.     3.     1.    ]
 [0.     4.     2.6667]
 [0.     0.     0.6667]]
  A = PLU: True

QR Factorization: A = QR
  Q (orthogonal, QᵀQ=I):
[[-0.5345  0.0376 -0.8443]
 [-0.8018 -0.3385  0.4925]
 [-0.2673  0.9402  0.2111]]
  R (upper triangular):
[[-7.4833 -5.3452 -2.6726]
 [ 0.      3.7985  2.5574]
 [ 0.      0.     -0.5629]]
  QᵀQ = I: True
  QR = A:  True

Cholesky: S = LLᵀ  (requires S symmetric positive definite)
  L (lower triangular):
[[7.4833 0.     0.    ]
 [5.3452 3.7985 0.    ]
 [2.6726 2.5574 0.5629]]
  LLᵀ = S: True

=== Usage guide ===
LU:        Solving Ax=b for general square A.  O(n³) once, O(n²) per right-hand side.
QR:        Least squares (overdetermined). Eigenvalue algorithms.  More stable than LU for rank-deficient problems.
Cholesky:  SPD matrices (covariance, Gram, Laplacian). 2x faster than LU.
SVD:       General rectangular. Rank, pseudoinverse, PCA.  See ch173.

4. Mathematical Formulation

LU:       PA = LU   — P: permutation, L: unit lower triangular, U: upper triangular
          Use: solve Ax=b as Ly=Pb (forward), Ux=y (back)
          Cost: O(n³) to factor, O(n²) per solve
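The factor-once, solve-cheaply pattern above can be sketched with SciPy's `lu_factor`/`lu_solve`, which perform exactly the Ly=Pb forward sweep and Ux=y back sweep internally. The right-hand side `b` here is an illustrative value, not from the chapter:

```python
import numpy as np
import scipy.linalg

A = np.array([[4., 3., 2.], [6., 3., 1.], [2., 5., 3.]])
b = np.array([1., 2., 3.])          # illustrative right-hand side

# Factor once: O(n^3). lu_factor packs L and U into one array plus pivot indices.
lu, piv = scipy.linalg.lu_factor(A)

# Each solve is two triangular sweeps (forward, then back): O(n^2).
x = scipy.linalg.lu_solve((lu, piv), b)
print(np.allclose(A @ x, b))  # → True
```

Reusing `(lu, piv)` for many right-hand sides is what makes LU the right tool when the same A appears repeatedly.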

QR:       A = QR    — Q: orthogonal (Qᵀ=Q⁻¹), R: upper triangular
          Use: least squares min||Ax-b|| via Rx = Qᵀb
          Cost: O(mn²) for m×n matrix
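The Rx = Qᵀb recipe can be sketched as follows; the random 10×3 test system is an assumption for illustration, and the result is checked against NumPy's own least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 3))    # overdetermined: 10 equations, 3 unknowns
b = rng.standard_normal(10)

# Reduced QR: Q has orthonormal columns (10x3), R is upper triangular (3x3)
Q, R = np.linalg.qr(A)
x = np.linalg.solve(R, Q.T @ b)     # Rx = Q^T b, a small triangular system

# Should agree with NumPy's least-squares solver
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x, x_ref))  # → True
```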

Cholesky: A = LLᵀ  — L: lower triangular (A must be SPD)
          Use: Gaussian covariance, PDEs, Kalman filter
          Cost: O(n³/3) — 2x cheaper than LU, no pivoting needed
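A minimal sketch of a Cholesky-based solve, reusing the SPD Gram matrix S = AᵀA from the chapter's code (the right-hand side is an illustrative value):

```python
import numpy as np
import scipy.linalg

A = np.array([[4., 3., 2.], [6., 3., 1.], [2., 5., 3.]])
S = A.T @ A                         # SPD because A is invertible
b = np.array([1., 0., 1.])          # illustrative right-hand side

# cho_factor computes the LL^T factorization; no pivoting is needed for SPD S
c, low = scipy.linalg.cho_factor(S)
x = scipy.linalg.cho_solve((c, low), b)   # two triangular solves, O(n^2)
print(np.allclose(S @ x, b))  # → True
```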

7. Exercises

Easy 1. Which factorization would you use to solve 500 systems Ax=bᵢ with the same A? Why?

Easy 2. Can you Cholesky-decompose [[1,2],[2,1]]? What about [[4,2],[2,3]]? Why the difference?

Medium 1. Use LU factorization to compute the inverse of a 5×5 matrix. Compare timing to np.linalg.inv.

Medium 2. Implement solve_with_lu(A, B) that factors A once and solves for multiple right-hand sides B (columns of B).

Hard. Implement QR factorization via Gram-Schmidt orthogonalization. Verify Q is orthogonal and QR=A. Then use it to solve a 10×5 overdetermined least-squares system.


9. Chapter Summary & Connections

  • Factorization splits A into structured pieces: triangular (L, U, R) or orthogonal (Q).

  • LU for general square systems, QR for least squares, Cholesky for symmetric positive definite.

  • All cost O(n³) for an n×n matrix; the constant factor matters in practice.

Forward connections:

  • In ch163 (LU Decomposition), we implement LU from scratch.

  • In ch173 (SVD), the fourth and most powerful factorization handles any matrix.