How We Pack Your Name Into Fewer Bits

Once your name has been tidied up, it’s a string of capital letters, spaces, hyphens, and apostrophes. Now we need to turn it into a sequence of zeros and ones small enough to fit inside a prime number alongside everything else the puzzle needs.

Here’s the catch: we have very little room.

The naive approach, and why it doesn’t work

The obvious way to encode a letter is the one your computer uses every day: ASCII. Each character gets 8 bits, like a fixed-size envelope. The letter A is 01000001. B is 01000010. And so on.

Tidy and predictable. But for our purposes, enormously wasteful.

We only have 29 different symbols to encode (26 letters + space + apostrophe + hyphen). 8 bits gives us 256 possible values, so we’re using less than 12% of what we paid for. And those 8 bits per character add up fast: ALEXANDER would eat 72 bits, even though there’s no real information density behind most of those zeros.

For our smallest puzzle size — a 64-bit prime — there literally isn’t enough room. We’d hit the ceiling before we finished encoding a name.

A graduate student saves the day

In 1951, a young grad student at MIT named David Huffman was given a homework assignment instead of a final exam: find the most efficient way to represent symbols when some are more common than others. He thought about it for a while, came up with an algorithm so elegant it’s still used everywhere from ZIP files to JPEG images, turned it in, and presumably went out for a celebratory beer.

The idea is delightfully simple: common letters get short codes, rare letters get long ones.

In English names, E and A are everywhere; Q and Z are rare guests. So we hand E a short codeword (maybe just 3 bits) and Q a longer one (maybe 9). Average it out across a typical name, and you end up using somewhere around 3 to 5 bits per character instead of 8.

That’s not a small saving. It’s the difference between fitting and not fitting.

(For the deeper dive into how Huffman’s algorithm actually picks those codes — and why it’s mathematically optimal — see Huffman Coding: Squeezing Names into Fewer Bits.)

What we actually pack into the prime

Inside every a3.0 prime, the bits for your name look like this:

A 6-bit length prefix. This is the number of characters in your tidy name (not the number of bits). Six bits gives us 0–63, which is plenty.
The Huffman-encoded characters. A variable-length string of bits — short codes for common letters, longer codes for rare ones.
A nonce region (room for the prime hunt to do its thing) and a single forced “pin” bit at the very top to lock down the block size.

Why the count of characters and not the count of bits? Because at decode time we don’t know the bits in advance — the decoder will walk the Huffman tree to recover one character at a time, and it needs to know exactly when to stop. “Decode 9 characters” is a clean instruction. “Decode bits until something feels right” is not.

Why the savings matter

Every bit we save on encoding the name is a bit we can give to the nonce — the small region of free bits that the prime hunt nudges around until the whole block happens to be a prime number. More nonce headroom means more candidate values to try, and a much higher chance of finding a prime quickly.

In other words: Huffman’s 1952 idea is what lets your name and your prime fit in the same envelope. Without it, we’d need much bigger primes for even short names, the math would be slower, and the certificate would be a wall of digits nobody wants to look at.

A clever compression trick, doing real work, almost a century later. David, wherever you are: thank you.