Prerequisites: Gaussian elimination (ch161), matrix factorization (ch162)

You will learn:
How Gaussian elimination produces LU as a byproduct
Implementing LU from scratch
Using LU to solve multiple right-hand sides efficiently
Condition number and numerical error bounds
Environment: Python 3.x, numpy
# --- LU Decomposition from scratch ---
import numpy as np
def lu_decompose(A):
    """
    Compute LU decomposition with partial pivoting: PA = LU.

    Args:
        A: square 2D numpy array (n, n)

    Returns:
        P: permutation matrix (n, n)
        L: unit lower triangular (n, n), L[i,i]=1
        U: upper triangular (n, n)
    """
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    P = np.eye(n)
    for k in range(n):
        # Partial pivoting: move the largest-|entry| in column k to the pivot row
        max_row = k + np.argmax(np.abs(U[k:, k]))
        U[[k, max_row]] = U[[max_row, k]]
        P[[k, max_row]] = P[[max_row, k]]
        L[[k, max_row], :k] = L[[max_row, k], :k]  # fix already-computed L entries
        if abs(U[k, k]) < 1e-14:
            continue  # (near-)zero pivot even after pivoting: skip elimination
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]     # multiplier stored in L
            U[i, k:] -= L[i, k] * U[k, k:]  # eliminate below the pivot
    return P, L, U
def lu_solve(P, L, U, b):
    """
    Solve Ax = b given PA = LU.

    Steps:
        1. Ly = Pb (forward substitution)
        2. Ux = y  (back substitution)
    """
    n = len(b)
    pb = P @ b
    # Forward substitution: Ly = pb (L is unit lower triangular, so no division)
    y = np.zeros(n)
    for i in range(n):
        y[i] = pb[i] - np.dot(L[i, :i], y[:i])
    # Back substitution: Ux = y
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - np.dot(U[i, i+1:], x[i+1:])) / U[i, i]
    return x
# Test
A = np.array([[2.,1.,1.],[4.,3.,3.],[8.,7.,9.]])
b = np.array([1., 1., 1.])
P, L, U = lu_decompose(A)
x = lu_solve(P, L, U, b)
x_ref = np.linalg.solve(A, b)
print(f"PA = LU: {np.allclose(P @ A, L @ U)}")
print(f"Our solution: {x}")
print(f"NumPy solve: {x_ref}")
print(f"Match: {np.allclose(x, x_ref)}")
print()
# Efficiency: factor once, solve many times
print("--- Solving 100 systems with same A ---")
import time
B = np.random.randn(3, 100) # 100 right-hand sides
t0 = time.perf_counter()
for j in range(100):
    _ = lu_solve(P, L, U, B[:, j])
t_lu_reuse = time.perf_counter() - t0
t0 = time.perf_counter()
for j in range(100):
    _ = np.linalg.solve(A, B[:, j])
t_fresh = time.perf_counter() - t0
print(f"LU reuse (factor once): {t_lu_reuse*1000:.2f}ms")
print(f"np.solve (factor each): {t_fresh*1000:.2f}ms")

Output:
PA = LU: True
Our solution: [ 1. -1. -0.]
NumPy solve: [ 1. -1. -0.]
Match: True
--- Solving 100 systems with same A ---
LU reuse (factor once): 5.53ms
np.solve (factor each): 3.95ms

(For a 3×3 system, LAPACK's full solve beats our pure-Python substitution loops, so np.linalg.solve wins here; the payoff from reusing the factorization shows up at larger n.)
4. Mathematical Formulation
PA = LU where:
P: permutation matrix (reorders rows for numerical stability)
L: unit lower triangular (L[i,i]=1; stores the elimination multipliers)
U: upper triangular (result of Gaussian elimination)
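Why the permutation matters numerically can be seen in a classic tiny-pivot sketch (hypothetical 2×2 data):

```python
import numpy as np

# Without row swaps, a tiny pivot makes the multiplier 1/eps huge,
# and elimination destroys the answer in floating point.
eps = 1e-20
A = np.array([[eps, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0])        # true solution is approximately [1, 1]

# Gaussian elimination WITHOUT pivoting: multiplier m = 1e20
m = A[1, 0] / A[0, 0]
u11 = A[1, 1] - m * A[0, 1]     # 1 - 1e20 rounds to -1e20: A[1,1] is wiped out
x1 = (b[1] - m * b[0]) / u11
x0 = (b[0] - x1) / eps
print("no pivoting:", np.array([x0, x1]))     # badly wrong

# With pivoting (what np.linalg.solve does internally): accurate
print("pivoted:    ", np.linalg.solve(A, b))  # ~[1, 1]
```

Swapping the rows first makes both multipliers at most 1 in magnitude, which is exactly what partial pivoting guarantees.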
Solving Ax=b in two stages:
Stage 1: Ly = Pb — forward substitution, O(n²)
Stage 2: Ux = y — back substitution, O(n²)
Factorization cost: O(n³) — amortized over multiple solves
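The amortization is easiest to see at larger n, where the O(n³) factorization dominates each O(n²) solve (for the 3×3 demo above, pure-Python loop overhead can hide it). A sketch using SciPy's LAPACK-backed `lu_factor`/`lu_solve`, assuming SciPy is available:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n, m = 500, 200                  # one n x n matrix, m right-hand sides
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))

lu, piv = lu_factor(A)           # O(n^3) factorization, paid once
X = np.column_stack([lu_solve((lu, piv), B[:, j]) for j in range(m)])  # O(n^2) each

print(np.allclose(A @ X, B))     # expect True: all m systems solved from one factorization
```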
Condition number: κ(A) = ||A|| · ||A⁻¹||
High κ → small perturbations in b → large changes in x
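This sensitivity can be checked numerically. A sketch using the Hilbert matrix, a standard ill-conditioned example:

```python
import numpy as np

n = 8
# Hilbert matrix: H[i, j] = 1 / (i + j + 1), notoriously ill-conditioned
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
kappa = np.linalg.cond(H)

x_true = np.ones(n)
b = H @ x_true                   # construct b so the exact solution is all ones
x = np.linalg.solve(H, b)
rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)

print(f"kappa(H)       ~ {kappa:.1e}")    # around 1e10 for n = 8
print(f"relative error ~ {rel_err:.1e}")  # roughly eps * kappa, far above machine eps
```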
Rule of thumb: you lose about log₁₀(κ) digits of accuracy.

7. Exercises
Easy 1. For a 4×4 matrix, how many multiplications does LU factorization use? How many for back substitution?
Easy 2. Why is LU with partial pivoting preferred over LU without pivoting?
Medium 1. Use lu_decompose + lu_solve to compute A⁻¹ by solving AX=I (solving n systems with identity columns as right-hand sides).
Medium 2. Compute the condition number of 5 random matrices and 5 near-singular matrices. Predict the expected accuracy of the solution and verify.
Hard. Implement cholesky(A) for symmetric positive definite A. The formula is: L[i,i] = sqrt(A[i,i] - sum(L[i,k]² for k<i)), L[j,i] = (A[j,i] - sum(L[j,k]*L[i,k] for k<i)) / L[i,i]. Verify against np.linalg.cholesky.
9. Chapter Summary & Connections
LU decomposition factors A = LU (with permutation). Costs O(n³), amortizes to O(n²) per additional solve.
Gaussian elimination IS LU decomposition — the multipliers form L, the transformed system is U.
Condition number κ(A) quantifies how ill-conditioned the system is (connects to ch038 — Floating Point).
Forward connections:
In ch168 (Projection Matrices), we solve AᵀAx = Aᵀb via Cholesky (since AᵀA is SPD).
In ch173 (SVD), we see the fourth and most powerful factorization.
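As a preview of the ch168 connection, the normal equations can be solved via NumPy's built-in Cholesky (a sketch with hypothetical random data; np.linalg.solve stands in for a dedicated triangular solver):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 3))   # tall matrix: overdetermined least-squares problem
b = rng.standard_normal(50)

# Normal equations: (A^T A) x = A^T b; A^T A is SPD, so Cholesky applies
G = A.T @ A
C = np.linalg.cholesky(G)          # G = C @ C.T, C lower triangular
y = np.linalg.solve(C, A.T @ b)    # forward solve  C y = A^T b
x = np.linalg.solve(C.T, y)        # back solve     C^T x = y

print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # expect True
```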