Merge Sort and the Limits of Comparison

May 27, 2026

Every programmer knows merge sort runs in $O(n \log n)$ time. But fewer know why this is optimal for comparison-based sorting.

The Information-Theoretic Argument

There are $n!$ possible permutations of an input array. Each comparison gives us one bit of information, splitting the remaining possibilities in half. To distinguish between all $n!$ orderings, we need at least:

\log_2(n!) = \Theta(n \log n)

comparisons. By Stirling’s approximation:

\log_2(n!) \approx n \log_2 n - n \log_2 e

The Implementation

Here’s the classic divide-and-conquer approach:

fn merge_sort(arr: &mut [i32]) {
    let len = arr.len();
    if len <= 1 { return; }

    let mid = len / 2;
    merge_sort(&mut arr[..mid]);
    merge_sort(&mut arr[mid..]);

    let merged = merge(&arr[..mid], &arr[mid..]);
    arr.copy_from_slice(&merged);
}

fn merge(left: &[i32], right: &[i32]) -> Vec<i32> {
    let mut result = Vec::with_capacity(left.len() + right.len());
    let (mut i, mut j) = (0, 0);
    while i < left.len() && j < right.len() {
        if left[i] <= right[j] {
            result.push(left[i]);
            i += 1;
        } else {
            result.push(right[j]);
            j += 1;
        }
    }
    result.extend_from_slice(&left[i..]);
    result.extend_from_slice(&right[j..]);
    result
}

Beyond Comparisons

Non-comparison sorts like radix sort can beat this bound by exploiting the structure of keys — running in $O(nk)$ where $k$ is the key length. But they trade generality for speed.