Prerequisites: Gaussian elimination (ch161), matrix factorization (ch162)

You will learn:
How Gaussian elimination produces LU as a byproduct
Implementing LU from scratch
Using LU to solve multiple right-hand sides efficiently
Condition number and numerical error bounds
Environment: Python 3.x, numpy
# --- LU Decomposition from scratch ---
import numpy as np
def lu_decompose(A):
    """
    Compute LU decomposition with partial pivoting: PA = LU.

    Args:
        A: square 2D numpy array (n, n)

    Returns:
        P: permutation matrix (n, n)
        L: unit lower triangular (n, n), L[i,i]=1
        U: upper triangular (n, n)
    """
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    P = np.eye(n)
    for k in range(n):
        # Partial pivoting: move the largest-|entry| in column k to the pivot row
        max_row = k + np.argmax(np.abs(U[k:, k]))
        U[[k, max_row]] = U[[max_row, k]]
        P[[k, max_row]] = P[[max_row, k]]
        L[[k, max_row], :k] = L[[max_row, k], :k]  # fix already-computed L entries
        if abs(U[k, k]) < 1e-14:
            continue  # (near-)zero pivot even after pivoting: skip elimination
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]     # multiplier stored in L
            U[i, k:] -= L[i, k] * U[k, k:]  # eliminate below the pivot
    return P, L, U
def lu_solve(P, L, U, b):
    """
    Solve Ax = b given PA = LU.

    Steps:
        1. Ly = Pb (forward substitution)
        2. Ux = y  (back substitution)
    """
    n = len(b)
    pb = P @ b
    # Forward substitution: Ly = pb (L is unit lower triangular, so no division)
    y = np.zeros(n)
    for i in range(n):
        y[i] = pb[i] - np.dot(L[i, :i], y[:i])
    # Back substitution: Ux = y
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - np.dot(U[i, i+1:], x[i+1:])) / U[i, i]
    return x
# Test
A = np.array([[2.,1.,1.],[4.,3.,3.],[8.,7.,9.]])
b = np.array([1., 1., 1.])
P, L, U = lu_decompose(A)
x = lu_solve(P, L, U, b)
x_ref = np.linalg.solve(A, b)
print(f"PA = LU: {np.allclose(P @ A, L @ U)}")
print(f"Our solution: {x}")
print(f"NumPy solve: {x_ref}")
print(f"Match: {np.allclose(x, x_ref)}")
print()
# Efficiency: factor once, solve many times
print("--- Solving 100 systems with same A ---")
import time
B = np.random.randn(3, 100) # 100 right-hand sides
t0 = time.perf_counter()
for j in range(100):
    _ = lu_solve(P, L, U, B[:, j])
t_lu_reuse = time.perf_counter() - t0
t0 = time.perf_counter()
for j in range(100):
    _ = np.linalg.solve(A, B[:, j])
t_fresh = time.perf_counter() - t0
print(f"LU reuse (factor once): {t_lu_reuse*1000:.2f}ms")
print(f"np.solve (factor each): {t_fresh*1000:.2f}ms")

Output:
PA = LU: True
Our solution: [ 1. -1. -0.]
NumPy solve: [ 1. -1. -0.]
Match: True
--- Solving 100 systems with same A ---
LU reuse (factor once): 5.53ms
np.solve (factor each): 3.95ms

(For a 3×3 system, LAPACK's full solve beats our pure-Python substitution loops, so np.linalg.solve wins here; the payoff from reusing the factorization shows up at larger n.)
4. Mathematical Formulation
PA = LU where:
P: permutation matrix (reorders rows for numerical stability)
L: unit lower triangular (L[i,i]=1; stores the elimination multipliers)
U: upper triangular (result of Gaussian elimination)
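Why the permutation matters numerically can be seen in a classic tiny-pivot sketch (hypothetical 2×2 data):

```python
import numpy as np

# Without row swaps, a tiny pivot makes the multiplier 1/eps huge,
# and elimination destroys the answer in floating point.
eps = 1e-20
A = np.array([[eps, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0])        # true solution is approximately [1, 1]

# Gaussian elimination WITHOUT pivoting: multiplier m = 1e20
m = A[1, 0] / A[0, 0]
u11 = A[1, 1] - m * A[0, 1]     # 1 - 1e20 rounds to -1e20: A[1,1] is wiped out
x1 = (b[1] - m * b[0]) / u11
x0 = (b[0] - x1) / eps
print("no pivoting:", np.array([x0, x1]))     # badly wrong

# With pivoting (what np.linalg.solve does internally): accurate
print("pivoted:    ", np.linalg.solve(A, b))  # ~[1, 1]
```

Swapping the rows first makes both multipliers at most 1 in magnitude, which is exactly what partial pivoting guarantees.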
Solving Ax=b in two stages:
Stage 1: Ly = Pb — forward substitution, O(n²)
Stage 2: Ux = y — back substitution, O(n²)
Factorization cost: O(n³) — amortized over multiple solves
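The amortization is easiest to see at larger n, where the O(n³) factorization dominates each O(n²) solve (for the 3×3 demo above, pure-Python loop overhead can hide it). A sketch using SciPy's LAPACK-backed `lu_factor`/`lu_solve`, assuming SciPy is available:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n, m = 500, 200                  # one n x n matrix, m right-hand sides
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))

lu, piv = lu_factor(A)           # O(n^3) factorization, paid once
X = np.column_stack([lu_solve((lu, piv), B[:, j]) for j in range(m)])  # O(n^2) each

print(np.allclose(A @ X, B))     # expect True: all m systems solved from one factorization
```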
Condition number: κ(A) = ||A|| · ||A⁻¹||
High κ → small perturbations in b → large changes in x
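This sensitivity can be checked numerically. A sketch using the Hilbert matrix, a standard ill-conditioned example:

```python
import numpy as np

n = 8
# Hilbert matrix: H[i, j] = 1 / (i + j + 1), notoriously ill-conditioned
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
kappa = np.linalg.cond(H)

x_true = np.ones(n)
b = H @ x_true                   # construct b so the exact solution is all ones
x = np.linalg.solve(H, b)
rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)

print(f"kappa(H)       ~ {kappa:.1e}")    # around 1e10 for n = 8
print(f"relative error ~ {rel_err:.1e}")  # roughly eps * kappa, far above machine eps
```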
Rule of thumb: you lose about log₁₀(κ) digits of accuracy.

7. Exercises
Easy 1. For a 4×4 matrix, how many multiplications does LU factorization use? How many for back substitution?
Easy 2. Why is LU with partial pivoting preferred over LU without pivoting?
Medium 1. Use lu_decompose + lu_solve to compute A⁻¹ by solving AX=I (solving n systems with identity columns as right-hand sides).
Medium 2. Compute the condition number of 5 random matrices and 5 near-singular matrices. Predict the expected accuracy of the solution and verify.
Hard. Implement cholesky(A) for symmetric positive definite A. The formula is: L[i,i] = sqrt(A[i,i] - sum(L[i,k]² for k<i)), L[j,i] = (A[j,i] - sum(L[j,k]*L[i,k] for k<i)) / L[i,i]. Verify against np.linalg.cholesky.
9. Chapter Summary & Connections
LU decomposition factors A = LU (with permutation). Costs O(n³), amortizes to O(n²) per additional solve.
Gaussian elimination IS LU decomposition — the multipliers form L, the transformed system is U.
Condition number κ(A) quantifies how ill-conditioned the system is (connects to ch038 — Floating Point).
Forward connections:
In ch168 (Projection Matrices), we solve AᵀAx = Aᵀb via Cholesky (since AᵀA is SPD).
In ch173 (SVD), we see the fourth and most powerful factorization.
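As a preview of the ch168 connection, the normal equations can be solved via NumPy's built-in Cholesky (a sketch with hypothetical random data; np.linalg.solve stands in for a dedicated triangular solver):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 3))   # tall matrix: overdetermined least-squares problem
b = rng.standard_normal(50)

# Normal equations: (A^T A) x = A^T b; A^T A is SPD, so Cholesky applies
G = A.T @ A
C = np.linalg.cholesky(G)          # G = C @ C.T, C lower triangular
y = np.linalg.solve(C, A.T @ b)    # forward solve  C y = A^T b
x = np.linalg.solve(C.T, y)        # back solve     C^T x = y

print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # expect True
```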