Part VIII: Probability | Computational Mathematics for Programmers
1. The Sample Space Ω
Before you can assign probabilities, you must define the universe of possible outcomes.
The sample space Ω (omega) is the set of all possible outcomes of an experiment. An outcome (also called a sample point or elementary event) is a single element of Ω.
(Sets and set notation were introduced in ch011 — Sets and Basic Set Operations.)
Examples:
| Experiment | Sample Space Ω |
|---|---|
| Flip one coin | {H, T} |
| Roll one d6 | {1, 2, 3, 4, 5, 6} |
| Roll two d6 | {(1,1), (1,2), ..., (6,6)} — 36 outcomes |
| Pick a real number in [0,1] | [0, 1] — uncountably infinite |
| Record tomorrow’s temperature | ℝ (or some bounded interval) |
The structure of Ω determines what mathematics applies: finite discrete, countably infinite, or continuous.
from itertools import product
# Enumerate Ω for two dice
faces = [1, 2, 3, 4, 5, 6]
omega_two_dice = list(product(faces, faces))
print(f"|Ω| for two dice: {len(omega_two_dice)}")
print("First 6 outcomes:", omega_two_dice[:6])
print("Last 6 outcomes:", omega_two_dice[-6:])
# Ω for three coin flips
omega_three_coins = list(product(['H', 'T'], repeat=3))
print(f"\n|Ω| for three coin flips: {len(omega_three_coins)}")
print("All outcomes:", omega_three_coins)
|Ω| for two dice: 36
First 6 outcomes: [(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6)]
Last 6 outcomes: [(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)]
|Ω| for three coin flips: 8
All outcomes: [('H', 'H', 'H'), ('H', 'H', 'T'), ('H', 'T', 'H'), ('H', 'T', 'T'), ('T', 'H', 'H'), ('T', 'H', 'T'), ('T', 'T', 'H'), ('T', 'T', 'T')]
2. Finite vs Infinite Sample Spaces
Finite Ω: Outcomes are countable and finite. Probability is a function from outcomes to non-negative numbers summing to 1.
Countably infinite Ω: For example, “how many coin flips until the first head?” The outcome could be 1, 2, 3, ... without bound. Ω = {1, 2, 3, ...}.
Uncountably infinite Ω: Picking a real number from [0, 1]. No individual outcome has positive probability — only intervals do. This requires measure theory and is handled through probability density functions (see ch248).
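The "no individual outcome has positive probability" claim can be seen numerically. A minimal sketch using NumPy's uniform sampler: draw a million points from [0, 1], then compare the empirical frequency of an interval (which matches its length) against the frequency of any single exact value (which is essentially zero).

```python
import numpy as np

# Continuous Ω = [0, 1]: only intervals carry probability.
# For the uniform distribution, P([a, b]) = b - a.
rng = np.random.default_rng(seed=0)
samples = rng.uniform(0.0, 1.0, size=1_000_000)

a, b = 0.2, 0.5
frac_in_interval = np.mean((samples >= a) & (samples <= b))
frac_exact_point = np.mean(samples == 0.3)  # hitting one exact float: ~never

print(f"Empirical P([{a}, {b}]) = {frac_in_interval:.4f}  (theory: {b - a})")
print(f"Empirical P(X == 0.3)  = {frac_exact_point}")
```

The interval frequency lands near 0.3, while the point frequency is 0 — a hint of why continuous spaces need densities rather than per-outcome probabilities.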
import numpy as np
# Countably infinite example: geometric distribution
# Ω = {1, 2, 3, ...} — number of flips until first heads (p=0.5)
p = 0.5
k = np.arange(1, 20)
prob_k = (1 - p)**(k - 1) * p # P(X = k) = (1-p)^(k-1) * p
print("P(first heads on flip k):")
for ki, pi in zip(k[:8], prob_k[:8]):
bar = '█' * int(pi * 60)
print(f" k={ki:2d}: {pi:.4f} {bar}")
print(f"\nSum of first 19 terms: {prob_k.sum():.6f}")
print("Sum converges to 1.0 as k → ∞")
P(first heads on flip k):
k= 1: 0.5000 ██████████████████████████████
k= 2: 0.2500 ███████████████
k= 3: 0.1250 ███████
k= 4: 0.0625 ███
k= 5: 0.0312 █
k= 6: 0.0156
k= 7: 0.0078
k= 8: 0.0039
Sum of first 19 terms: 0.999998
Sum converges to 1.0 as k → ∞
3. Uniform vs Non-Uniform Sample Spaces
A uniform sample space assigns equal probability to every outcome: P(ω) = 1/|Ω| for each ω ∈ Ω.
A non-uniform space assigns different probabilities to different outcomes. The constraint is always the same: each P(ω) ≥ 0 and the probabilities over Ω sum to 1.
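The uniform case takes only a couple of lines to verify — a sketch for a fair d6:

```python
import numpy as np

# Uniform sample space: a fair d6 assigns P(ω) = 1/|Ω| = 1/6 to every face.
faces = np.arange(1, 7)
probs = np.full(6, 1 / 6)

for f, p in zip(faces, probs):
    print(f" P(face={f}) = {p:.4f}")
print(f"Sum = {probs.sum():.4f}")
```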
# Weighted (non-uniform) die — face 6 is twice as likely as others
def make_weighted_die():
"""Returns (faces, probabilities) for a die where 6 has double weight."""
weights = np.array([1, 1, 1, 1, 1, 2], dtype=float)
probs = weights / weights.sum() # normalize to sum to 1
return np.arange(1, 7), probs
faces, probs = make_weighted_die()
print("Weighted die probabilities:")
for f, p in zip(faces, probs):
print(f" P(face={f}) = {p:.4f}")
print(f"Sum = {probs.sum():.4f}")
# Simulate 100k rolls and verify
rng = np.random.default_rng(seed=42)
rolls = rng.choice(faces, size=100_000, p=probs)
values, counts = np.unique(rolls, return_counts=True)
print("\nEmpirical vs theoretical:")
for v, c, p in zip(values, counts, probs):
print(f" Face {v}: empirical={c/100000:.4f} theoretical={p:.4f}")
Weighted die probabilities:
P(face=1) = 0.1429
P(face=2) = 0.1429
P(face=3) = 0.1429
P(face=4) = 0.1429
P(face=5) = 0.1429
P(face=6) = 0.2857
Sum = 1.0000
Empirical vs theoretical:
Face 1: empirical=0.1420 theoretical=0.1429
Face 2: empirical=0.1436 theoretical=0.1429
Face 3: empirical=0.1413 theoretical=0.1429
Face 4: empirical=0.1434 theoretical=0.1429
Face 5: empirical=0.1439 theoretical=0.1429
Face 6: empirical=0.2858 theoretical=0.2857
4. Representing Sample Spaces in Code
In practice, we rarely enumerate Ω explicitly. We represent it through:
- A distribution object (for parametric families)
- A list of outcomes with associated probabilities
- A simulation function that draws from Ω
class FiniteSampleSpace:
"""Explicit finite sample space with assigned probabilities."""
def __init__(self, outcomes, probabilities=None):
self.outcomes = list(outcomes)
n = len(self.outcomes)
if probabilities is None:
self.probs = np.full(n, 1.0 / n) # uniform by default
else:
self.probs = np.array(probabilities, dtype=float)
assert abs(self.probs.sum() - 1.0) < 1e-9, "Probabilities must sum to 1"
def sample(self, n=1, seed=None):
rng = np.random.default_rng(seed)
idx = rng.choice(len(self.outcomes), size=n, p=self.probs)
return [self.outcomes[i] for i in idx]
def __repr__(self):
return f"SampleSpace(|Ω|={len(self.outcomes)})"
# Fair coin
coin = FiniteSampleSpace(['H', 'T'])
print(coin)
print("10 flips:", coin.sample(10, seed=0))
# Weighted die
die = FiniteSampleSpace([1, 2, 3, 4, 5, 6], [1/7, 1/7, 1/7, 1/7, 1/7, 2/7])
print("\n10 rolls of weighted die:", die.sample(10, seed=0))
SampleSpace(|Ω|=2)
10 flips: ['T', 'H', 'H', 'H', 'T', 'T', 'T', 'T', 'T', 'T']
10 rolls of weighted die: [5, 2, 1, 1, 6, 6, 5, 6, 4, 6]
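The FiniteSampleSpace class above is the "list of outcomes with probabilities" representation. The other two can be sketched with NumPy alone: a parametric family via the generator's built-in distributions, and a simulation function that draws from Ω without ever enumerating it. (`draw_two_dice` is a hypothetical helper for illustration.)

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def draw_two_dice(rng):
    """Simulation function: draws one outcome from Ω = {1..6}² directly."""
    return tuple(rng.integers(1, 7, size=2))  # high bound is exclusive

# Distribution object: Binomial(n=10, p=0.5) implicitly defines
# Ω = {0, 1, ..., 10} without listing it.
print("One draw from Ω:", draw_two_dice(rng))
print("Binomial(10, 0.5) draw:", rng.binomial(10, 0.5))
```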
5. Summary
- The sample space Ω is the universe of all possible outcomes — it must be defined before probabilities can be assigned.
- Finite, countably infinite, and continuous sample spaces each require different mathematical treatment.
- Probabilities over Ω must be non-negative and sum (or integrate) to 1.
- In code, sample spaces are represented as distributions or enumerated probability tables.
6. Forward References
Subsets of Ω — called events — are the objects to which we actually assign probabilities in practice. This is ch243 (Events). The rules governing how those probabilities combine are ch244 (Probability Rules). Continuous sample spaces require integration, formalized through probability density functions in ch248 (Probability Distributions).