Law of Large Numbers & Central Limit Theorem

Published: March 19, 2026 • Tags: Statistics, Probability, Fundamentals

In classical probability and statistics, two foundational theorems bridge the gap between theoretical probabilities and observable empirical data: the Law of Large Numbers (LLN) and the Central Limit Theorem (CLT). While both rely on the concept of sample size, they address two distinct statistical questions.

The Law of Large Numbers (LLN)

The Law of Large Numbers states that as the size of a sample drawn from a population increases, the sample mean converges toward the expected value (the theoretical population mean). Intuitively, the empirical average approaches the true average because the influence of any individual outlier is diluted as the number of observations grows.

A Real-World Example

Imagine flipping a fair coin. Theoretically, the probability of landing on "Heads" is $0.5$ (or 50%). If you flip the coin $10$ times, you might get $7$ heads and $3$ tails—a proportion of $0.7$, which deviates heavily from the expected $0.5$. However, if you flip the same coin $10,000$ times, the Law of Large Numbers dictates that the final ratio will fall very close to $0.5$. The large number of trials stabilizes the outcome around the true population mean.
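The coin-flip example above is easy to simulate. The sketch below uses only the Python standard library; the seed and trial counts are arbitrary choices for illustration:

```python
import random

random.seed(42)  # arbitrary seed, chosen for reproducibility

def heads_ratio(n_flips: int) -> float:
    """Simulate n_flips fair coin flips and return the fraction of heads."""
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# The empirical ratio drifts toward the true mean of 0.5 as n grows.
for n in (10, 100, 10_000):
    print(f"{n:>6} flips -> heads ratio {heads_ratio(n):.3f}")
```

Running this, the ratio at $10$ flips can wander far from $0.5$, while at $10,000$ flips it lands within a percent or two of the true mean—exactly the stabilization the LLN describes.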

The Central Limit Theorem (CLT)

While LLN tells us where the mean is headed ($ \mu $), the Central Limit Theorem concerns itself with the distribution of those sample means.

The CLT states that if you take sufficiently large random samples from any population—regardless of the population's underlying distribution shape (whether skewed, uniform, or bimodal)—the distribution of those sample means will approximate a normal distribution (a bell curve). As a general rule of thumb, a sample size of $n \ge 30$ is commonly considered "large enough" for the theorem to hold.
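The claim can be checked empirically. The sketch below draws repeated samples of size $n = 30$ from a heavily skewed exponential population (mean $1$, standard deviation $1$) and shows that the sample means cluster around the population mean with spread close to $\sigma / \sqrt{n}$; the population choice and sample counts are illustrative assumptions, not anything the theorem requires:

```python
import random
import statistics

random.seed(0)  # arbitrary seed for reproducibility

def sample_means(n_samples: int, sample_size: int) -> list[float]:
    """Draw n_samples samples of size sample_size from a skewed
    exponential population (mean 1.0) and return each sample's mean."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]

means = sample_means(5_000, 30)

# CLT prediction: the sample means are approximately normal, centered on
# the population mean (1.0), with spread sigma / sqrt(n) = 1 / sqrt(30) ≈ 0.183.
print(f"mean of sample means:  {statistics.fmean(means):.3f}")
print(f"stdev of sample means: {statistics.stdev(means):.3f}")
```

Plotting a histogram of `means` would show a bell curve even though each individual observation comes from a strongly right-skewed distribution.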

Why it matters in Data Science

The Central Limit Theorem is arguably the most critical theorem in inferential statistics. It empowers data scientists to use normally-distributed hypothesis tests (like t-tests or z-tests) to make predictions about populations whose true distribution shapes are completely unknown.
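As a concrete illustration of that workflow, here is a minimal one-sample z-test sketched with the standard library. It assumes a skewed exponential population (a stand-in for an "unknown" distribution) and tests the hypothetical null hypothesis $H_0: \mu = 1.0$; the CLT is what justifies treating the sample mean as approximately normal:

```python
import random
import statistics

random.seed(7)  # arbitrary seed for reproducibility

# Hypothetical data: a sample of n = 50 from a skewed population
# whose true mean happens to be 1.0.
sample = [random.expovariate(1.0) for _ in range(50)]

n = len(sample)
x_bar = statistics.fmean(sample)
se = statistics.stdev(sample) / n ** 0.5  # standard error of the mean

# Test statistic under H0: mu = 1.0; approximately standard normal by the CLT.
z = (x_bar - 1.0) / se

# Two-sided p-value from the standard normal CDF.
p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}, p-value = {p_value:.3f}")
```

In practice a library routine (e.g. a t-test, which additionally accounts for estimating the standard deviation from the sample) would be used, but the structure is the same: the CLT lets us compare the standardized sample mean against the normal distribution even though the raw data are far from normal.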