👆 Digital Fingerprints

Hash Functions Explained

In 5 minutes: Understand how hash functions work and why they're useful
Prerequisite: None

🎯 The Simple Story

Bob wants to verify that a message from Alice arrived unchanged.

Problem: Eve could alter the message on its way to Bob!

Alice's idea: Digital fingerprints!

Alice's message: "Hello"
Alice computes: Hash("Hello") = "7f1a..."
Alice sends: "Hello" + "7f1a..."
Bob receives the message and computes its hash himself
Bob compares: Does his hash match the one Alice sent?
If Eve changed "Hello" to "Help", the hash would be completely different!

The hash is like a digital fingerprint!
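
Here is a minimal sketch of that exchange in Python. It assumes SHA-256 from the standard library's hashlib as the hash function (the story does not name one), and the helper name fingerprint is made up for illustration.

```python
import hashlib

def fingerprint(message: str) -> str:
    """Return the SHA-256 digest of a message as a hex string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()

# Sender side: compute the fingerprint and send it along with the message.
sent_message = "Hello"
sent_hash = fingerprint(sent_message)

# Receiver side: recompute the fingerprint and compare it with the one sent.
print(fingerprint("Hello") == sent_hash)  # message arrived unchanged
print(fingerprint("Help") == sent_hash)   # message was altered on the way
```

The receiver never needs the sender's help to check the message: anyone can recompute the fingerprint from the bytes that arrived.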

🧠 Mental Model

Hold this picture in your head:

Hash Function (Digital Fingerprint):

    Input: "Hello"
      ↓
  Hash Function (H)
      ↓
    Output: "7f1a23b5..."

    Properties:
    1. One-way: Can't go from "7f1a..." to "Hello"
    2. Fixed length: Always same output size (256 bits for SHA-256)
    3. Avalanche: One bit change → about half of the output bits flip
    4. Collision resistant: Hard to find two inputs with same output

Think of it like:

👆 Fingerprint (Unique identifier)

🔢 Digital digest (Small representation of big data)

🔥 Burn after computing (Can compute, can't reverse)

📊 See It Happen

Let's watch a hash function in action:

🎭 The Story: Fingerprinting Messages

Alice sends an important message to Bob.

The message: "Transfer $100 to Bob. -Alice"

Eve wants to change it to "Transfer $100 to Eve. -Alice"

Without a hash: Eve changes the message, and Bob can't tell that she modified it!

With hash:

Alice computes: Hash("Transfer $100 to Bob. -Alice") = "f8a2..."
Alice sends: "Transfer $100 to Bob. -Alice" + hash: "f8a2..."
Eve intercepts, changes to: "Transfer $100 to Eve. -Alice"
Eve would also need to replace the hash with Hash("Transfer $100 to Eve. -Alice"). Assume she can't, for example because the hash reaches Bob over a channel she cannot modify.
Bob receives: "Transfer $100 to Eve. -Alice" + old hash: "f8a2..."
Bob computes: Hash("Transfer $100 to Eve. -Alice") = "b3c9..."
Bob compares: "f8a2..." vs "b3c9..." ✗ Doesn't match!
Bob knows Eve tampered with the message!
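
Bob's check can be sketched in a few lines of Python. This assumes SHA-256 as the hash and, importantly, that the hash Alice computed reaches Bob unmodified. The names sha256_hex and is_untampered are hypothetical helpers, not part of any library.

```python
import hashlib
import hmac

def sha256_hex(message: str) -> str:
    """Hash a text message with SHA-256 and return the hex digest."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()

def is_untampered(received_message: str, trusted_hash: str) -> bool:
    """Recompute the hash of what arrived and compare it to the trusted hash."""
    return hmac.compare_digest(sha256_hex(received_message), trusted_hash)

original = "Transfer $100 to Bob. -Alice"
tampered = "Transfer $100 to Eve. -Alice"

# Assumption: this value reaches Bob without Eve being able to replace it.
trusted_hash = sha256_hex(original)

print(is_untampered(original, trusted_hash))  # the message Alice wrote
print(is_untampered(tampered, trusted_hash))  # the message Eve modified
```

If Eve could swap the hash as well, this check would pass for her message too. Closing that gap takes a key or a signature.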

🎮 Try It Yourself

Question 1: Alice's message: "Hello". Hash("Hello") = "a1b2...". Eve changes "Hello" to "Help". Does Hash("Help") equal "a1b2..."?

Show Answer

No! Not even close!

Remember: Hash functions have the avalanche property. One character change (or even one bit change) results in a completely different hash.

Hash("Hello") might be: "a1b2c3d4..."
Hash("Help") might be: "9f8e7d6c..."

The two hashes look unrelated: on average, about half of the output bits differ.

Answer: No! Completely different (avalanche effect)

Question 2: Why can't Eve figure out the message from the hash?

Show Answer

Because hash functions are one-way!

Given hash("Hello") = "a1b2c3d4...", Eve can't reverse it to get "Hello". The hash function destroys the information in a way that can't be undone.

Try reversing:

"a1b2..." → ???
No known efficient computation turns "a1b2..." back into "Hello". All Eve can do is guess inputs and hash each guess.

It's like blending orange juice:

Can blend oranges → orange juice
Can't unblend orange juice → oranges

Answer: Hash functions are one-way (can't reverse)
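
One-way does not mean unguessable. Eve can't run the function backwards, but she can hash candidate inputs and compare the results. A small sketch, assuming SHA-256 and a made-up candidate list (guess_input is a hypothetical helper):

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Hash a text string with SHA-256 and return the hex digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def guess_input(target_hash: str, candidates: list) -> "str | None":
    """Hash each candidate and return the first one that matches, if any."""
    for candidate in candidates:
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None

target = sha256_hex("Hello")

# The hash gives no hint about the input; only a correct guess reveals it.
print(guess_input(target, ["Hi", "Hey", "Hello", "Howdy"]))
print(guess_input(target, ["Hi", "Hey", "Howdy"]))
```

This is why short or predictable inputs are only as secret as they are hard to guess, while long random inputs stay hidden.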

Question 3: What happens if two different messages have the same hash?

Show Answer

This is called a collision!

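For good hash functions (like SHA-256), collisions are extremely unlikely. Collisions must exist, because there are more possible inputs than outputs, but the chance that two given messages share a SHA-256 hash is 1 in 2^256, about 1 in 10^77.

That's like flipping a coin 256 times and getting heads every time. Basically impossible!

Modern hash functions (SHA-256, SHA-3, BLAKE2) are designed so that nobody can find a collision in practice. If someone could, Eve could swap one message for the other without changing the fingerprint.

Answer: The fingerprint could no longer tell the two messages apart, but finding such a pair is practically impossible with good hash functions like SHA-256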
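
Collisions become easy to find once the output is short. The sketch below keeps only the first 2 bytes (16 bits) of SHA-256, a deliberately weakened stand-in, and searches for two inputs that collide. The function names are made up for illustration. For the full 256-bit output, the same search would need on the order of 2^128 attempts.

```python
import hashlib

def truncated_hash(data: bytes, num_bytes: int = 2) -> bytes:
    """Keep only the first num_bytes of SHA-256 (a deliberately weak hash)."""
    return hashlib.sha256(data).digest()[:num_bytes]

def find_collision(num_bytes: int = 2):
    """Hash the inputs b"0", b"1", b"2", ... until two share a truncated hash."""
    seen = {}  # truncated hash -> first input that produced it
    counter = 0
    while True:
        data = str(counter).encode("ascii")
        digest = truncated_hash(data, num_bytes)
        if digest in seen:
            return seen[digest], data
        seen[digest] = data
        counter += 1

first, second = find_collision()
print(first, second, truncated_hash(first).hex())
```

With only 65,536 possible outputs the loop is guaranteed to stop, and it typically stops after a few hundred inputs.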

🔢 The Math

Hash Function Properties

A cryptographic hash function H must have:

1. Pre-image resistance (One-way):

Given y = H(x), finding x is hard.

Example: Given hash "a1b2c3d4...", find the input message.

2. Second pre-image resistance:

Given x and y = H(x), finding x' ≠ x with H(x') = y is hard.

Example: Given "Hello" and its hash "a1b2c3d4...", find another message with same hash.

3. Collision resistance:

Finding any x ≠ x' with H(x) = H(x') is hard.

Example: SHA-256

H("Hello") = 185f8db32271fe25f561a6fc938b2e264306ec304eda518007d176482638...
256 bits = 32 bytes = 64 hex characters

For ANY input (1 byte or 1 GB), the output is always 256 bits!
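
A quick way to check the fixed output size yourself, using Python's standard hashlib. The three sample inputs are arbitrary choices for illustration.

```python
import hashlib

# Inputs of very different sizes: empty, 5 bytes, and one million bytes.
inputs = [b"", b"Hello", b"x" * 1_000_000]

for data in inputs:
    digest = hashlib.sha256(data).digest()
    print(len(data), "bytes in ->", len(digest) * 8, "bits out")

# The hex form of the same digest is 64 characters long.
hello_hex = hashlib.sha256(b"Hello").hexdigest()
print(hello_hex)
```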

Avalanche Effect

Input:  "Hello World"      → Hash: a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e
Input:  "hello world"      → Hash: b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9

Just changed 'H' to 'h' and 'W' to 'w' (2 bits of difference), and the hashes are completely different!
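
To put a number on "completely different", the sketch below flips a single input bit ('H' and 'h' differ in exactly one bit) and counts how many of the 256 output bits change. It assumes SHA-256, and differing_bits is a hypothetical helper. For a good hash the count should land near 128.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions where two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

message_a = b"Hello World"
message_b = b"hello World"  # only the first letter changed

digest_a = hashlib.sha256(message_a).digest()
digest_b = hashlib.sha256(message_b).digest()

print("input bits changed: ", differing_bits(message_a, message_b))
print("output bits changed:", differing_bits(digest_a, digest_b), "of 256")
```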

💡 Why We Care

Real-World Uses

Use Case	How It Works	Example
Message integrity	Check if message tampered	Bob verifies hash matches
Password storage	Store a salted, deliberately slow hash of the password, not the password itself	PBKDF2, bcrypt, or Argon2 instead of a plain Hash("mypassword")
Key derivation	Derive keys from secrets	KDF(secret) = encryption key
Data deduplication	Same content = same hash	Block-level duplicate detection
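
As an illustration of the password storage row, here is a sketch using PBKDF2 from Python's standard hashlib. PBKDF2 is picked here only because it ships with Python. The iteration count, the salt size, and the helper names are assumptions for this example, not recommendations from this guide.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your own hardware

def hash_password(password: str, salt: bytes = None):
    """Return (salt, digest); a fresh random salt is made if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

# Store the salt and digest, never the password itself.
stored_salt, stored_digest = hash_password("mypassword")
print(verify_password("mypassword", stored_salt, stored_digest))
print(verify_password("wrong guess", stored_salt, stored_digest))
```

The salt makes equal passwords produce different stored values, and the iteration count makes every guess expensive for an attacker.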

Signal Protocol Uses

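The Signal Protocol uses hash functions, mostly wrapped in keyed constructions such as HMAC and HKDF, for the following (shown in simplified form):

Message key derivation:

message_key = HMAC(chain_key, constant_1)

Chain key derivation:

next_chain_key = HMAC(current_chain_key, constant_2)

Ciphertext verification:

Check an HMAC tag over the ciphertext against the tag sent with the message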
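
A sketch of one chain-key step in Python. It assumes the construction the Double Ratchet specification recommends, HMAC-SHA256 keyed with the chain key over the constant bytes 0x01 and 0x02, rather than a bare hash. The function name kdf_chain_step and the all-zero starting key are placeholders, and this is not the API of any Signal library.

```python
import hashlib
import hmac

def kdf_chain_step(chain_key: bytes):
    """Derive (next_chain_key, message_key) from the current chain key."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key

# Placeholder only: a real chain key comes out of the key agreement.
chain_key = bytes(32)

message_keys = []
for message_number in range(3):
    chain_key, message_key = kdf_chain_step(chain_key)
    message_keys.append(message_key)
    print(message_number, message_key.hex()[:16] + "...")
```

Because each step is one-way, someone who learns a later chain key cannot work backwards to earlier message keys.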

✅ Quick Check

Can you explain hash functions to a 5-year-old?

Try saying this out loud:

"A hash function is like a magic fingerprint machine. You put a picture in, it prints out a special code. If someone changes the picture even a tiny bit, the code changes completely. And you can't use the code to get the picture back - it's a one-way street!"

What's the avalanche effect?

Example:

Input: "Hello" → Hash: "a1b2c3d4..."
Input: "Hellp" → Hash: "9f8e7d6c..."

One letter change = completely different hash output!

This is why Eve can't quietly tamper with messages: as long as she can't also replace the hash, it would betray her.

📋 Key Takeaways

✅ Hash function = One-way digital fingerprint
✅ Fixed size output = Any input → 256 bits (SHA-256)
✅ Avalanche effect = One bit change → completely different hash
✅ Collision resistant = Infeasible in practice to find two inputs with same hash
✅ Can't reverse = Hash → ??? (can't find input)
✅ Integrity check = Verify messages haven't changed
✅ Signal Protocol use = Key derivation and message verification

🎉 What You'll Learn Next

Now you understand hash functions! These are used throughout the Signal Protocol for:

Deriving keys
Verifying message integrity
Chain key computation

Next, we'll learn about cryptographic signatures - how to verify who sent a message!

✍️ Continue: Wax Seals

We'll learn how Alice can prove she really sent a message, not Eve!

Now you know hash functions! Next: Cryptographic signatures!
