Merge Sort and the Limits of Comparison

Every programmer knows merge sort runs in O(nlogn)O(n \log n) time. But fewer know why this is optimal for comparison-based sorting.

The Information-Theoretic Argument

There are n!n! possible permutations of an input array. Each comparison gives us one bit of information, splitting the remaining possibilities in half. To distinguish between all n!n! orderings, we need at least:

log2(n!)=Θ(nlogn)\log_2(n!) = \Theta(n \log n)

comparisons. By Stirling’s approximation:

log2(n!)nlog2nnlog2e\log_2(n!) \approx n \log_2 n - n \log_2 e

The Implementation

Here’s the classic divide-and-conquer approach:

fn merge_sort(arr: &mut [i32]) {
    let len = arr.len();
    if len <= 1 { return; }

    let mid = len / 2;
    merge_sort(&mut arr[..mid]);
    merge_sort(&mut arr[mid..]);

    let merged = merge(&arr[..mid], &arr[mid..]);
    arr.copy_from_slice(&merged);
}

fn merge(left: &[i32], right: &[i32]) -> Vec<i32> {
    let mut result = Vec::with_capacity(left.len() + right.len());
    let (mut i, mut j) = (0, 0);
    while i < left.len() && j < right.len() {
        if left[i] <= right[j] {
            result.push(left[i]);
            i += 1;
        } else {
            result.push(right[j]);
            j += 1;
        }
    }
    result.extend_from_slice(&left[i..]);
    result.extend_from_slice(&right[j..]);
    result
}

Beyond Comparisons

Non-comparison sorts like radix sort can beat this bound by exploiting the structure of keys — running in O(nk)O(nk) where kk is the key length. But they trade generality for speed.