Performance Tuning: Making It Fast

Mental Model: The Shipping Container

Imagine you're shipping crates across the ocean:

  • More crates = bigger cargo ship = slower
  • More locks on each crate = more time to seal = slower
  • More stops = more sorting centers = slower

CascadingCipher with 5 layers is like a cargo ship with 5 locks on every crate. It's secure but slow!

This chapter shows how to optimize without compromising security.

Understanding Time Costs

Baseline Performance

Let's measure basic encryption speeds:

// AES only (~50ms average)
async function baselineAES() {
  const layer = new AESCipherLayer({ password: 'secret' })
  const data = new Uint8Array(1_000_000) // 1MB file

  const start = Date.now()
  const encrypted = await layer.encrypt(data)
  const end = Date.now()

  console.log(`AES 1MB: ${end - start}ms`) // ~50ms
}

// AES + DH + Signal + ML-KEM (2+ seconds)
async function fullStack() {
  const manager = new CascadingCipherManager()
  await manager.addLayer(new AESCipherLayer({ password: 'secret' }))
  await manager.addLayer(new DHCipherLayer({ ... }))
  await manager.addLayer(new SignalCipherLayer({ ... }))
  await manager.addLayer(new MLKEMCipherLayer({ ... }))

  const data = new Uint8Array(1_000_000) // 1MB file

  const start = Date.now()
  const encrypted = await manager.encrypt(data)
  const end = Date.now()

  console.log(`Full stack 1MB: ${end - start}ms`) // ~2000-4000ms
}

Time Breakdown by Layer

| Layer | Encryption Speed | Decryption Speed | Notes |
|---|---|---|---|
| AES | ~50ms/MB | ~50ms/MB | Fast, hardware-accelerated |
| DH | ~100ms (key derivation) | ~100ms | ECDH key derivation, same for any size |
| Signal | ~5ms/message | ~5ms | Ratcheting, very fast |
| ML-KEM | ~500ms (keyGen) | ~500ms | Lattice-based, slow! |
| Total | ~650ms base + ~50ms/MB | ~650ms base + ~50ms/MB | |

For small messages (1KB): ~650ms, dominated by key derivation. For large files (10MB): ~650ms + 10 × ~50ms/MB ≈ 1.2s, dominated by the bulk encryption.
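The totals above can be folded into a rough back-of-the-envelope cost model (a sketch; the constants are the approximate figures from the table, not measured values):

```typescript
// Rough cost model from the table above: a fixed per-message base cost
// (key derivation across layers) plus a per-MB symmetric encryption cost.
const BASE_MS = 650   // DH + Signal + ML-KEM key work, approximate
const PER_MB_MS = 50  // AES-GCM throughput, approximate

function estimateFullStackMs(sizeBytes: number): number {
  const sizeMB = sizeBytes / 1_000_000
  return BASE_MS + sizeMB * PER_MB_MS
}

console.log(estimateFullStackMs(1_000))      // 1KB  → ~650ms
console.log(estimateFullStackMs(10_000_000)) // 10MB → ~1150ms
```

The base cost dominates for small payloads, which is why the caching strategies below target key derivation first.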


Optimization Strategy 1: Reduce Layer Count

Problem

Do you need all 5 layers? Maybe not!

Solution

Evaluate your security requirements:

| Security Goal | Required Layers | Why |
|---|---|---|
| Basic encryption | AES + DH | DH creates key, AES encrypts |
| Forward secrecy | + Signal | Ratchets keys per message |
| Group messaging | + MLS | Handles joins/leaves |
| Quantum resistance | + ML-KEM | Post-quantum protection |
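One way to encode that table is a small helper that maps stated goals to the minimal layer set (a sketch; the `SecurityGoals` shape and returning layer names rather than instances are illustrative choices, not part of the library's API):

```typescript
// Map security goals to the minimal layer set from the table above.
// Layers are returned as names; construction is left to the caller.
interface SecurityGoals {
  forwardSecrecy?: boolean
  groupMessaging?: boolean
  quantumResistance?: boolean
}

function requiredLayers(goals: SecurityGoals): string[] {
  const layers = ['DHCipherLayer', 'AESCipherLayer'] // baseline: key + encryption
  if (goals.forwardSecrecy) layers.push('SignalCipherLayer')
  if (goals.groupMessaging) layers.push('MLSLayer')
  if (goals.quantumResistance) layers.push('MLKEMCipherLayer')
  return layers
}

console.log(requiredLayers({ forwardSecrecy: true }))
// → ['DHCipherLayer', 'AESCipherLayer', 'SignalCipherLayer']
```

Starting from the baseline and adding layers only for goals you actually have keeps the stack as small as the table allows.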

Implementation

// Optimal for most use cases
const manager = new CascadingCipherManager()
await manager.addLayer(new DHCipherLayer({ ... })) // DH for keys
await manager.addLayer(new SignalCipherLayer({ ... })) // Forward secrecy
await manager.addLayer(new AESCipherLayer({ ... })) // AES encryption

// Time: ~150ms (vs 650ms with ML-KEM) → 4.3x faster!

When to Use All Layers

// High-security, quantum-resistant, future-proof
const manager = new CascadingCipherManager()
await manager.addLayer(new AESCipherLayer({ ... }))
await manager.addLayer(new DHCipherLayer({ ... }))
await manager.addLayer(new SignalCipherLayer({ ... }))
await manager.addLayer(new MLSLayer({ ... }))
await manager.addLayer(new MLKEMCipherLayer({ ... }))

// Time: ~2-3s (slow but extremely secure)

Optimization Strategy 2: Cache Keys

Problem

Every message runs DH key derivation, Signal ratcheting, ML-KEM encapsulation—all expensive!

Solution

Cache derived keys and reuse them:

// Bad: re-derive keys every time
async function encryptBad(message: string) {
  // Re-runs DH, Signal, ML-KEM on each call (slow!)
  return await manager.encrypt(message)
}

// Good: cache keys and reuse
class CachedEncryptor {
  private sharedSecret: Uint8Array | null = null

  async encrypt(message: string) {
    // First time: derive shared secret (slow)
    if (!this.sharedSecret) {
      this.sharedSecret = await this.deriveDHKey()
    }

    // Subsequent times: reuse the shared secret (fast!)
    const encrypted = await this.manager.encrypt(message)
    return encrypted
  }

  async deriveDHKey() {
    // ECDH key derivation (one time); theirPublicKey and myPrivateKey
    // are CryptoKey objects obtained elsewhere
    const sharedSecret = await crypto.subtle.deriveBits(
      { name: 'ECDH', public: theirPublicKey },
      myPrivateKey,
      256
    )

    return new Uint8Array(sharedSecret)
  }
}

Implementation

class OptimizedManager {
  private manager: CascadingCipherManager
  private cachedKeys: Map<string, Uint8Array>

  constructor(manager: CascadingCipherManager) {
    this.manager = manager
    this.cachedKeys = new Map()
  }

  async encrypt(message: string, recipient: string) {
    // Check whether we already have cached keys for this recipient
    let sharedSecret = this.cachedKeys.get(recipient)

    if (!sharedSecret) {
      // First time: derive and cache (~100ms)
      sharedSecret = await this.deriveSharedSecret(recipient)
      this.cachedKeys.set(recipient, sharedSecret)
    }

    // Now encrypt with the fast cached keys
    const encrypted = await this.manager.encrypt(message)
    return encrypted
  }

  async deriveSharedSecret(recipient: string) {
    // ECDH (slow, one time)
    return await crypto.subtle.deriveBits(...)
  }
}

Performance: First message ~150ms, subsequent ~50ms (3x faster!)
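That amortization is easy to quantify (a sketch using the approximate figures above, not measurements):

```typescript
// Amortized per-message cost with key caching: one ~100ms derivation
// up front, then ~50ms of encryption per message.
const DERIVE_MS = 100
const ENCRYPT_MS = 50

function avgMsPerMessage(n: number): number {
  return (DERIVE_MS + ENCRYPT_MS * n) / n
}

console.log(avgMsPerMessage(1))   // 150 — the first message pays for derivation
console.log(avgMsPerMessage(100)) // 51  — cost approaches ~50ms/message
```

The more messages share one derived key, the closer the average gets to the raw encryption cost.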


Optimization Strategy 3: Use ChaCha20 Instead of AES

Problem

AES-GCM is fast (~50ms/MB), but you can go faster!

Solution

ChaCha20-Poly1305 can be several times faster on CPUs without AES hardware instructions (AES-NI):

// AES-GCM: ~50ms/MB
await manager.addLayer(new AESCipherLayer({ password: 'secret' }))

// ChaCha20: ~20ms/MB (2.5x faster!)
await manager.addLayer(new ChaChaLayer({ key }))

Implementation

import { ChaChaLayer } from '@cascading-cipher/chacha-layer'

const manager = new CascadingCipherManager()

// Use ChaCha for speed
await manager.addLayer(new ChaChaLayer({ key }))
await manager.addLayer(new DHCipherLayer({ ... }))
await manager.addLayer(new SignalCipherLayer({ ... }))

// Encrypt 10MB: ~200ms key setup + 10MB × ~20ms/MB ≈ 400ms (vs ~700ms with AES)

When to Use Which

| Platform | AES-GCM Speed | ChaCha20 Speed | Recommendation |
|---|---|---|---|
| Intel/AMD (with AES-NI) | ~50ms/MB | ~50ms/MB | Use AES |
| ARM (no AES-NI) | ~500ms/MB | ~50ms/MB | Use ChaCha20 |
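JavaScript can't query CPU feature flags directly, so one practical way to pick a cipher is a quick micro-benchmark at startup. This sketch uses Node's built-in `crypto` module (which ships both AEADs) rather than the layer classes above; the timings and threshold logic are illustrative:

```typescript
import { createCipheriv, randomBytes } from 'crypto'

// Time one authenticated encryption of `data` with the given Node cipher
function timeCipherMs(algorithm: string, data: Buffer): number {
  const key = randomBytes(32)
  const iv = randomBytes(12)
  const start = process.hrtime.bigint()
  const cipher = createCipheriv(algorithm, key, iv, { authTagLength: 16 } as any)
  cipher.update(data)
  cipher.final()
  return Number(process.hrtime.bigint() - start) / 1e6
}

const sample = randomBytes(1_000_000) // 1MB of random data

const aesMs = timeCipherMs('aes-256-gcm', sample)
const chachaMs = timeCipherMs('chacha20-poly1305', sample)

// Pick whichever AEAD is faster on this machine
const preferred = aesMs <= chachaMs ? 'aes-256-gcm' : 'chacha20-poly1305'
console.log({ aesMs, chachaMs, preferred })
```

Run this once at startup and cache the result; both ciphers are secure, so the choice is purely about throughput on the current hardware.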

Optimization Strategy 4: Batch Operations

Problem

Encrypting 10 messages individually = running all layers 10 times!

Solution

Batch encrypt multiple messages:

// Bad: encrypt one by one (re-runs every layer each time)
const encrypted: Uint8Array[] = []
for (const msg of messages) {
  encrypted.push(await manager.encrypt(msg))
  // ~150ms × 10 = 1500ms
}

// Good: derive keys once, then reuse
const sharedSecret = await deriveSharedSecret()
for (const msg of messages) {
  encrypted.push(await manager.encryptWithKey(msg, sharedSecret))
  // ~10ms × 10 = 100ms + ~100ms key derivation ≈ 200ms
}

Implementation

class BatchEncryptor {
  private manager: CascadingCipherManager
  private sharedSecret: Uint8Array | null = null

  async encrypt(messages: string[]) {
    const encrypted: Uint8Array[] = []

    // Derive the shared key once
    if (!this.sharedSecret) {
      this.sharedSecret = await this.deriveSecret()
    }

    // Encrypt all messages with the same key
    for (const msg of messages) {
      encrypted.push(
        await this.manager.encryptWithKey(msg, this.sharedSecret)
      )
    }

    return encrypted
  }

  async deriveSecret(): Promise<Uint8Array> {
    // ECDH key derivation (slow, one time)
    return await deriveSharedSecret()
  }
}

Optimization Strategy 5: Parallelize Independent Layers

Problem

Layers are sequential (Layer 1 → Layer 2 → ...), but some operations within layers can be parallelized.

Solution

Use Worker threads for CPU-bound operations:

import { Worker } from 'worker_threads'

// Wrap a worker task in a Promise that resolves with its first message
function runWorker(script: string, data: Uint8Array): Promise<Uint8Array> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(script, { workerData: data })
    worker.once('message', resolve)
    worker.once('error', reject)
  })
}

async function parallelKeySetup(seed: Uint8Array) {
  // Each layer's key setup is independent of the others, so it can run
  // in parallel even though encryption itself stays sequential
  const [aesKey, dhSecret, signalState] = await Promise.all([
    runWorker('./aes-worker.js', seed),
    runWorker('./dh-worker.js', seed),
    runWorker('./signal-worker.js', seed)
  ])

  return { aesKey, dhSecret, signalState }
}

Performance: ~200ms (parallel) vs 650ms (sequential) → 3.25x faster!


Complete Optimization Example

async function optimizedEncryption() {
  // 1. Cache keys
  const cache = new Map<string, Uint8Array>()

  // 2. Use the optimized manager
  const manager = new CascadingCipherManager()

  // 3. Use fast algorithms (ChaCha20 + DH + Signal)
  await manager.addLayer(new ChaChaLayer({ key }))
  await manager.addLayer(new DHCipherLayer({ ... }))
  await manager.addLayer(new SignalCipherLayer({ ... }))

  // 4. Batch encryption
  const messages = Array(10).fill('Test message')
  const encrypted: Uint8Array[] = []

  // Derive the shared key once
  let sharedSecret = cache.get('recipient')
  if (!sharedSecret) {
    sharedSecret = await deriveSharedSecret()
    cache.set('recipient', sharedSecret)
  }

  // Encrypt everything with the cached key
  for (const msg of messages) {
    encrypted.push(await manager.encryptWithKey(msg, sharedSecret))
  }

  console.log('Encrypted 10 messages with a single key derivation!')
}

Performance Benchmarks

| Configuration | 1KB | 1MB | 10MB |
|---|---|---|---|
| AES only | ~50ms | ~50ms | ~500ms |
| DH + AES | ~150ms | ~150ms | ~600ms |
| DH + Signal + AES | ~150ms | ~150ms | ~600ms |
| Full stack (5 layers) | ~650ms | ~700ms | ~1150ms |
| Full stack + optimizations | ~50ms | ~200ms | ~400ms |

Optimizations reduce time by up to ~13x for small messages (and ~2-3x for large files)!


Security vs Speed Trade-offs

| Optimization | Security Impact | Speed Gain |
|---|---|---|
| Reduce layers | Lower security (drops possibly unnecessary layers) | 2-5x |
| Cache keys | None (if key derivation is correct) | 3x |
| ChaCha20 vs AES | None (both secure) | 2.5x (on ARM) |
| Batch operations | None (keys reused correctly) | 5x (for 10+ messages) |
| Parallelize | None (if no key-reuse issues) | 3x |

Recommendation: Cache keys + optimize to DH+Signal+AES → ~50ms for 1KB, ~200ms for 10MB!


Quiz Time!

🧠 What's the biggest performance bottleneck in CascadingCipher?

Key derivation operations (DH key generation, ML-KEM key encapsulation). Once you derive the shared secret once and cache it, encryption/decryption become fast (~50ms/MB). The initial key derivation takes ~100-500ms, but subsequent messages are fast.

🧠 Why is caching keys secure?

Because you only cache shared secrets derived from authenticated key exchanges (DH) or legitimate public keys (ML-KEM). As long as you don't reuse keys for different recipients or expose the cache, it's secure. The cache is just avoiding redundant computation, not reusing keys inappropriately.

🧠 When should you use all 5 layers?

Only for extreme security requirements (government classified data, long-term storage where future quantum attacks matter, or where you need every possible security property). For most messaging, DH + Signal + AES is sufficient and much faster.

🧠 Can you optimize latency (time to first message) vs throughput (messages per second)?

Yes! Latency (first message) is dominated by key derivation (~100-500ms). Throughput (subsequent messages) is dominated by encryption speed (~50ms/MB). Cache keys to reduce latency, use fast algorithms (ChaCha20) to improve throughput.


Can You Explain to a 5-Year-Old?

Imagine sending 10 letters to a friend:

  1. You could write each letter separately, put it in an envelope, seal it, and mail it (slow—repeat 10 times!)
  2. Or you could get all your envelopes together, seal them all at once (faster), then mail them in a batch
  3. Even better: pre-seal all envelopes ahead of time, then just pop letters in (fastest!)

Caching keys and batching messages is like the last option—prepare everything beforehand, then just send quickly!

Key Takeaways

Cache keys: Derive once, reuse for many messages (3x faster)

Optimize layers: DH + Signal + AES is usually enough (2-5x faster)

ChaCha20: 2.5x faster on CPUs without AES-NI

Batch operations: Encrypt many messages with one key (5x for 10+ messages)

Parallelize: Use workers for CPU-bound operations (3x faster)

Security vs speed: Most optimizations don't reduce security

Benchmark first: Measure before optimizing!