Skip to main content

๐Ÿ”ข Shrinking Rooms

Understanding Encodingโ€‹


๐ŸŽฏ The Simple Storyโ€‹

Imagine you need to send someone a recipe:

"3 cups flour, 2 teaspoons salt, 1 egg, bake at 350ยฐF for 45 minutes"

Problem:

  • Recipe is 100 words long โ†’ Too big!
  • Phone line is slow โ†’ Takes forever to send!

Solution: Write it in a secret code!

Replace each word with a short number:

  • "flour" โ†’ "3"
  • "salt" โ†’ "5"
  • "egg" โ†’ "7"
  • "bake" โ†’ "9"

Now the recipe is: "33579" (5 digits instead of 100 words!)

That's encoding! Converting stuff into a format computers can send efficiently.


๐Ÿง  Mental Modelโ€‹

Hold this picture in your head:

A big room (huge polynomial):
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ โ€ข โ€ข โ€ข โ€ข โ€ข โ”‚
โ”‚ โ€ข โ€ข โ€ข โ€ข โ€ข โ”‚ 256 coordinates
โ”‚ โ€ข โ€ข โ€ข โ€ข โ€ข โ”‚ Huge! Difficult!
โ”‚ โ€ข โ€ข โ€ข โ€ข โ€ข โ”‚
โ”‚ โ€ข โ€ข โ€ข โ€ข โ€ข โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

โ†“ Encoding โ†“

A tiny box (compressed bytes):
โ”Œโ”€โ”€โ”
โ”‚384โ”‚ โ† Just 384 bytes! โ”‚ โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Much smaller, but Bob can re-expand to the original room!

Think of it like:

๐Ÿ“ž Phone call (encode big message โ†’ signal โ†’ decode โ†’ original message)

๐Ÿ—บ๏ธ Map (big territory โ†’ small map โ†’ you navigate anyway)

๐Ÿ”‘ Key (store 1 key, not 10,000 door locks)


๐Ÿ“Š See It Happenโ€‹

Let's see encoding in action:


๐ŸŽฎ Try It Yourselfโ€‹

Question 1: You have coefficients [3, 5, 2]. How many bytes to store them (12 bits each)?

Show Answer

12 bits per coefficient, 3 coefficients:

Total bits: 12 ร— 3 = 36 bits

Convert to bytes: 36 รท 8 = 4.5 bytes

Since bytes don't split: 5 bytes (round up)

Answer: 5 bytes (42 bits used)


Question 2: If you pack [01010101, 10101010, 11110000] into bytes, what does it look like?

Show Answer

Each 8-bit pattern is already a byte:

  • 01010101 โ†’ 85 (decimal)
  • 10101010 โ†’ 170 (decimal)
  • 11110000 โ†’ 240 (decimal)

Packed: [85, 170, 240] (3 bytes exactly!)


Question 3: ML-KEM matrices need 2,304 coefficients (12 bits each). How many bytes raw? How with compression trick?

Show Answer

Without compression:

2,304 coefficients ร— 12 bits = 27,648 bits 27,648 รท 8 = 3,456 bytes

With compression (seed trick):

Store just a 32-byte seed Expand deterministically to 3,456 bytes

Compression: 3,456 โ†’ 32 bytes (98% smaller!)

Answer: Raw: 3,456 bytes. Compressed: 32 bytes.


๐Ÿ”ข The Mathโ€‹

What is Encoding?โ€‹

Encoding = Converting something to a different format

Encoding polynomial: p(x)=a0+a1x+a2x2+โ‹ฏ+a255x255p(x) = a_0 + a_1x + a_2x^2 + \dots + a_{255}x^{255}

Encode coefficients: Encode(p)=a0โˆฅa1โˆฅa2โˆฅโ€ฆโˆฅa255\text{Encode}(p) = a_0 \| a_1 \| a_2 \| \dots \| a_{255}

Where โˆฅ\| = concatenation (join together)

Coefficient Bit Encodingโ€‹

ML-KEM uses 12-bit encoding for coefficients:

Coefficient: 3  โ†’  000000000011 (12 bits)
Coefficient: 5 โ†’ 00000000101 (12 bits)
Coefficient: 2 โ†’ 000000000010 (12 bits)

Join them: 000000000011 000000001010 000000000010

Why 12 bits?

  • Coefficients: 0 to 3,328
  • 12 bits = 0 to 4,095 (enough for 3,328)

Bit Packingโ€‹

Efficient storage means packing 12 bits into 8-bit bytes:

Coefficient 1: 010101010101 (12 bits)
Coefficient 2: 001100110011 (12 bits)

Pack into bytes:
[01010101] [0100 001]|100110011]

Note: Boundaries cross bytes! Must pack carefully.

Bit packing = Join bit strings, then split into 8-bit groups

Compression: Seed Trickโ€‹

Instead of storing 3,456 bytes (matrix A):

  1. Generate 32-byte random seed
  2. Store just the seed (32 bytes)
  3. When needed: Expand seed โ†’ full matrix (3,456 bytes)

Compression ratio: 3,456 / 32 โ‰ˆ 99% smaller!


๐Ÿ’ก Why We Careโ€‹

ML-KEM Needs Efficient Storageโ€‹

Matrix A: 3ร—3 = 9 polynomials = 2,304 coefficients

Without encoding:

  • Raw: 12 bits ร— 2,304 = 27,648 bits = 3,456 bytes

With encoding:

  • Compress: 3,456 โ†’ 864 bytes (public key component)
  • OR seed representation: 3,456 โ†’ 32 bytes

Result: Public key: 864 bytes + 320 bytes (vector t) = 1,184 bytes

Transmission Speedโ€‹

Imagine sending across slow network:

Without encoding:

  • 3,456 bytes ร— 8 seconds/KB โ‰ˆ 27 seconds

With encoding:

  • 864 bytes ร— 8 seconds/KB โ‰ˆ 7 seconds

3ร— faster! (And with seed: 32 bytes = instant!)


โœ… Quick Checkโ€‹

Can you explain encoding to a 5-year-old?

Try saying this out loud:

"Encoding is like writing a long recipe in a secret code. instead of writing 'flour, sugar, egg' (lots of words), you replace each word with a number like '3, 5, 7'. It's a small code but still means the same thing!"

Can you compute encoding sizes?

Try this examples:

Coefficient [4, 7, 2]. Encode with 12 bits each.

  • 4 = 000000000100 (12 bits)
  • 7 = 000000000111 (12 bits)
  • 2 = 000000000010 (12 bits)

Join: 000000000100 000000000111 000000000010 (36 bits)

To bytes: 36 รท 8 = 4.5 โ†’ 5 bytes

Answer: 5 bytes (42 bits used)


๐ŸŽ“ Key Takeawaysโ€‹

โœ… Encoding = Convert to computer-storable format โœ… 12 bits/coeff = Needed for 0-3,328 range โœ… Bit packing = Join bits into 8-bit bytes โœ… Compression = 3,456 bytes โ†’ 864 bytes (matrix A) โœ… Seed trick = 32 bytes โ†’ 3,456 bytes (98% smaller) โœ… Public key = 1,184 bytes (864 A + 320 t) โœ… Encoding/decoding = Reversible process (Bob can recover)


๐ŸŽ‰ What You'll Learn Nextโ€‹

Now you understand encoding! Next, we'll learn about key generation:

๐Ÿ”‘ Making the Key โ†’ Key Generation

How ML-KEM็”Ÿๆˆ (generates) keys safely!