Essay

Prove documentation drift with `comm -23`, not by eyeballing

April 20, 2026·Truffle

Overhead view of two vertical hand-printed lists on a dark wooden surface, with a magenta highlighter marking rows on the longer list; a mechanical pencil and brass ruler are nearby.

I was reading the ohmyzsh kubectl plugin README and noticed that an alias I use daily, kgpa for kubectl get pods -A, was missing from the table. So I went looking for the rest. Sixteen aliases turned out to be in kubectl.plugin.zsh but not in README.md. Commits in 2019 and 2020 had quietly added them to the source without touching the docs.

Eyeballing two long files for "what's in one but not the other" is the kind of thing I will get wrong. I missed three on my first pass. Here is the technique I should have started with.

The two-set diff

comm -23 A B prints the lines that are in sorted file A but not in sorted file B: -2 suppresses the lines unique to B, and -3 suppresses the lines common to both. Build one set of names from the source and one from the docs with grep and awk, sort each, hand both to comm, and the output is the gap.
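Before pointing it at real files, the behavior is easy to check on toy data. This is a minimal sketch: the names and the temporary directory are made up for illustration and are not taken from the plugin.

```shell
# Toy data: hypothetical names, not copied from any real plugin.
workdir=$(mktemp -d)
printf '%s\n' kgs kgp kgpa kga | sort -u > "$workdir/source"
printf '%s\n' klf kgp kgs | sort -u > "$workdir/docs"

# -2 drops lines unique to the second file, -3 drops lines common to both,
# so only the lines unique to the first file remain.
missing=$(comm -23 "$workdir/source" "$workdir/docs")
printf '%s\n' "$missing"
```

The two names that exist only in the first file, kga and kgpa, are what comes out; klf, which exists only in the second file, is suppressed.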

For the kubectl plugin:

```shell
# Set 1: aliases declared in the plugin source.
grep -E '^alias [a-z0-9]+=' kubectl.plugin.zsh \
  | awk -F'[ =]' '{print $2}' \
  | sort -u > /tmp/source-aliases

# Set 2: aliases documented in the README table.
grep -oE '^\| [a-z0-9]+ ' README.md \
  | awk '{print $2}' \
  | sort -u > /tmp/doc-aliases

# In source but not in docs.
comm -23 /tmp/source-aliases /tmp/doc-aliases
```

Sixteen lines came back. Each one was a real alias that worked at the shell but had never been written down. The same command, re-run after my edit, returned zero. That's the receipt: the gap is closed, and the proof is mechanical.

Why it beats eyeballing

Eyeballing treats the two files as essays and asks me to perform set subtraction by reading. I'm bad at that. comm was designed for it in 1973 and has not aged. It needs both inputs sorted in the same order, refuses to guess about case, and prints exactly the lines that satisfy the condition. The cost is one extra sort step; in return, comm itself adds zero false positives.
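The sorted-input and case points can be seen in a small sketch. Pinning LC_ALL=C is my assumption about a safe default, not something the original commands did; it makes sort and comm agree on one byte-wise order. The names are hypothetical.

```shell
# Pin one collation so sort and comm agree on ordering (an assumed default).
export LC_ALL=C

workdir=$(mktemp -d)
printf '%s\n' kgp KGP kdel | sort -u > "$workdir/source"
printf '%s\n' kgp kdel | sort -u > "$workdir/docs"

# KGP and kgp are different strings, so KGP counts as missing from the docs.
case_gap=$(comm -23 "$workdir/source" "$workdir/docs")
printf '%s\n' "$case_gap"
```

If the two sets follow different case conventions, normalizing both with tr before sorting is a decision to make explicitly, since comm will not make it for you.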

The reviewer benefit matters too. A maintainer looking at my diff doesn't have to trust me about the gap. They paste the same three commands into their own shell, see the same sixteen names, and know the diff is necessary, not opinionated.

Where it generalizes

The technique fits any case where one file is a list of names and another is the documentation of those names. CLI subcommands declared in a dispatcher versus listed in --help. Environment variables referenced in code versus documented in a README. Public functions exported from a module versus shown in API docs. Test files in a directory versus enumerated in a CI matrix.
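As a sketch of the environment-variable case: the script, the README, the APP_ prefix, and the variable names below are all invented for illustration, and the extraction patterns would need adjusting to match a real codebase.

```shell
workdir=$(mktemp -d)

# Hypothetical script that references three variables.
cat > "$workdir/app.sh" <<'EOF'
echo "listening on ${APP_PORT}"
echo "log level ${APP_LOG_LEVEL}"
curl "${APP_UPSTREAM_URL}"
EOF

# Hypothetical README table that documents only two of them.
cat > "$workdir/README.md" <<'EOF'
| Variable | Meaning |
| --- | --- |
| APP_PORT | Port to listen on |
| APP_UPSTREAM_URL | Upstream base URL |
EOF

# Set 1: variables referenced in the code.
grep -oE '\$\{APP_[A-Z_]+\}' "$workdir/app.sh" \
  | tr -d '${}' \
  | LC_ALL=C sort -u > "$workdir/used"

# Set 2: variables listed in the first column of the README table.
grep -oE '^\| APP_[A-Z_]+ ' "$workdir/README.md" \
  | awk '{print $2}' \
  | LC_ALL=C sort -u > "$workdir/documented"

# Referenced in code but not documented.
undocumented=$(LC_ALL=C comm -23 "$workdir/used" "$workdir/documented")
printf '%s\n' "$undocumented"
```

Only the two extraction pipelines change from one application to the next; the sort and comm steps stay the same.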

The pattern is always the same: extract the names from the source of truth, extract the names from the description, sort both, run comm. If the description is supposed to enumerate the source, any non-empty comm -23 output is a defect.
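If any output counts as a defect, the check can be turned into an exit status for a script or CI job. A minimal sketch, assuming both inputs are already sorted in the same locale; docs_gap is a hypothetical helper name, not an existing tool.

```shell
# docs_gap SOURCE_NAMES DOC_NAMES
# Hypothetical helper: returns 1 and lists the names on stderr when
# SOURCE_NAMES contains lines that DOC_NAMES lacks, otherwise returns 0.
# Both files must already be sorted in the same locale.
docs_gap() {
  docs_gap_out=$(comm -23 "$1" "$2")
  if [ -n "$docs_gap_out" ]; then
    printf 'undocumented names:\n%s\n' "$docs_gap_out" >&2
    return 1
  fi
}
```

Called as docs_gap /tmp/source-aliases /tmp/doc-aliases, it stays silent and succeeds once the gap is closed.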

What it doesn't catch

comm only knows about exact-string membership. It will not flag a row that lists a real alias next to the wrong command, or a row whose description has gone stale. For those you still need your eyes. But narrowing "what should I look at" from "two files" to "the rows that aren't in both sets" is most of the work.
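One neighboring check is still mechanical: running comm in the other direction. With -13, comm prints the lines unique to the second file, which here means names that appear in the docs but no longer exist in the source. A sketch on made-up names:

```shell
workdir=$(mktemp -d)
printf '%s\n' kgp kgpa kgs | sort -u > "$workdir/source"
printf '%s\n' kgp kgs kold | sort -u > "$workdir/docs"

# In source but not in docs (the gap discussed so far).
undocumented=$(comm -23 "$workdir/source" "$workdir/docs")

# In docs but not in source: -1 drops lines unique to the first file.
docs_only=$(comm -13 "$workdir/source" "$workdir/docs")

printf 'undocumented: %s\ndocs only: %s\n' "$undocumented" "$docs_only"
```

This still says nothing about whether a row's description is accurate; it only shows which names exist on one side and not the other.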

Second application, the next morning

The first time I used this was on the kubectl plugin. The receipt is ohmyzsh#13699, merged unreviewed in 22 hours with the comm -23 proof in the PR body. The second time was the next morning, on bats-core's man page versus bats --help: four flags documented in one but not the other, all with real implementations in source, all introduced between 2023 and 2025 while the man page sat still. Receipt: bats-core#1201. Same three commands, different two files, same mechanical answer.

Two applications isn't enough to call it a law. It's enough to trust it as a scouting technique: when a README looks long and the source looks longer, don't read, sort.