People are right to fear LLM-generated code. The models hallucinate APIs, miss edge cases, and produce code that looks correct but fails under pressure. For distributed systems, where bugs hide behind race conditions and network partitions, the stakes are even higher. A subtle leader election bug can cause split-brain, data corruption, or cascading failures across your cluster.
But what if you could give an LLM the right feedback loop? Not just "write me leader election" but "write me leader election, and here's how I'll prove it correct." That feedback loop is simulation. I decided to test this idea by building a leader election recipe for foundationdb-rs, with invariants generated by Claude Code.
🔗The Algorithm
The recipe uses a ballot-based approach, similar to Raft's term concept but simpler. The algorithm is based on "Leader Election Using NewSQL Database Systems" (Ismail et al., DAIS 2015), which uses a database as distributed shared memory. Our implementation differs by leveraging FDB's strictly serializable transactions and versionstamps instead of timestamps. Leaders hold time-bounded leases. Ballots are monotonically increasing fencing tokens that prevent stale leaders from corrupting state. FDB's serializable transactions handle mutual exclusion without explicit locking.
The key insight is storing the leader state at a single key, making every operation O(1). Most leader election implementations scan all candidates to determine who's in charge: with N candidates, that's N reads per query. Our approach stores the current leader explicitly, so checking who's leader is a single read, and claiming leadership is a single write with a conflict check. No quorum is needed because FDB already provides the coordination guarantees we need through its serializable transactions.
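To make the single-key idea concrete, here's a minimal sketch of what such a leader record could hold. The field names and layout are my illustration, not the recipe's actual format:

```rust
/// Illustrative only: field names and layout are assumptions, not the
/// recipe's actual on-disk format.
struct LeaderState {
    /// Fencing token: strictly increases with every successful claim.
    ballot: u64,
    /// Versionstamp identity of the current leader, assigned by FDB.
    leader_versionstamp: [u8; 10],
    /// Lease deadline in nanoseconds; once it passes, the key is claimable.
    lease_expiry_nanos: u64,
}

impl LeaderState {
    /// Checking who's leader (or whether the lease expired) is one read of
    /// this record, regardless of how many candidates are registered.
    fn lease_expired(&self, now_nanos: u64) -> bool {
        now_nanos >= self.lease_expiry_nanos
    }
}
```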
```mermaid
sequenceDiagram
    participant A as Process A
    participant FDB as FoundationDB
    participant B as Process B
    A->>FDB: read leader key
    FDB-->>A: ballot=5, lease expired
    B->>FDB: read leader key
    FDB-->>B: ballot=5, lease expired
    A->>FDB: write ballot=6
    B->>FDB: write ballot=6
    FDB-->>A: COMMIT OK
    FDB-->>B: CONFLICT!
    Note over A: Becomes Leader
    Note over B: Retries, sees ballot=6, backs off
```
The diagram shows what happens when two processes try to claim leadership simultaneously. Both read the same expired leader state. Both try to write ballot=6. FDB's serializable transactions guarantee only one succeeds. The loser gets a conflict error, retries, sees ballot=6 with an active lease, and backs off. No split-brain possible. This is the fundamental safety property that the entire algorithm rests on.
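Here's a rough sketch of that claim path as one transaction. The key name, value layout, and function are illustrative assumptions (the real code lives in the recipes module), and the commit plus retry loop is left to the caller:

```rust
use foundationdb::{FdbResult, Transaction};

// Illustrative only: key name and value layout (ballot, then lease expiry,
// both big-endian u64) are assumptions, not the recipe's actual format.
const LEADER_KEY: &[u8] = b"/election/leader";

/// Try to claim leadership inside one transaction. Returns the new ballot if
/// a claim was written, or None if an active lease means we should back off.
/// If two processes race, the read of LEADER_KEY sits in both transactions'
/// conflict ranges, so FDB commits one and fails the other with a conflict.
async fn try_claim(trx: &Transaction, now_nanos: u64, lease_nanos: u64) -> FdbResult<Option<u64>> {
    let current = trx.get(LEADER_KEY, false).await?;
    let (ballot, lease_expiry) = match current.as_deref() {
        Some(v) if v.len() >= 16 => (
            u64::from_be_bytes(v[0..8].try_into().unwrap()),
            u64::from_be_bytes(v[8..16].try_into().unwrap()),
        ),
        _ => (0, 0),
    };
    if lease_expiry > now_nanos {
        return Ok(None); // someone else holds an unexpired lease
    }
    // Claim with the next fencing token; the caller commits and retries on conflict.
    let mut value = Vec::with_capacity(16);
    value.extend_from_slice(&(ballot + 1).to_be_bytes());
    value.extend_from_slice(&(now_nanos + lease_nanos).to_be_bytes());
    trx.set(LEADER_KEY, &value);
    Ok(Some(ballot + 1))
}
```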
When a process wants to become leader, it first registers as a candidate. FDB assigns a versionstamp at commit time using the SetVersionstampedValue atomic operation. This versionstamp is assigned by FDB itself when the transaction commits, providing a globally-ordered identity that no other process can have. The versionstamp never changes throughout the process's lifetime, even as the process sends heartbeats and refreshes its registration. This immutability is important: it means the process's identity is stable and can be used reliably in log replay during verification.
Then the process enters a loop: send heartbeats to prove liveness, try to claim leadership if the current lease expired, and optionally resign if graceful handoff is needed. Each leadership claim increments the ballot number. The new leader stores its versionstamp in the leader state, so followers can verify they're talking to the actual leader. The full implementation lives in the foundationdb-rs recipes module.
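Sketched with a hypothetical interface (the method names are placeholders, not the recipe's actual API), one iteration of that loop looks like this:

```rust
// Hypothetical interface, for illustration only; the recipe's real API differs.
trait Candidate {
    async fn heartbeat(&mut self);                 // refresh registration, keep the versionstamp identity
    async fn try_become_leader(&mut self) -> bool; // claim (ballot + 1) if the current lease expired
    async fn resign(&mut self);                    // graceful handoff
    fn is_leader(&self) -> bool;
}

/// One iteration of the candidate loop: prove liveness, then either try to
/// claim leadership or optionally step down.
async fn tick(candidate: &mut impl Candidate, wants_to_resign: bool) {
    candidate.heartbeat().await;
    if !candidate.is_leader() {
        let _ = candidate.try_become_leader().await;
    } else if wants_to_resign {
        candidate.resign().await;
    }
}
```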
🔗The Simulation Workload
Simulation on top of FoundationDB may feel redundant. FDB itself has been validated through a trillion CPU-hours of simulation. But FDB's resiliency doesn't mean your code is resilient. Your layer sits on top of FDB. Your transaction logic, your conflict handling, your retry behavior. How does your code respond when FDB is unhealthy?
Simulation introduces chaos to answer that question. Network partitions, process crashes, clock skew up to ±1 second. Same seed, same execution, same bugs. Deterministic replay.
| Phase | Who | What |
|---|---|---|
| Setup | Client 0 | Initialize election, register all candidates |
| Chaos | All clients | Loop: heartbeat → try claim → maybe resign (10%) |
| Check | Client 0 | Read logs, snapshot state, run 13 invariants |
FDB's simulator runs built-in chaos workloads alongside ours: RandomClogging injects network partitions, Attrition kills and reboots processes. Our workload adds clock skew simulation up to ±1 second.
Each operation logs its intent atomically in the same transaction, using the pattern from FDB's own AtomicOps workload. The SetVersionstampedKey atomic operation writes both the leader election mutation and its log entry in a single transaction. If the transaction commits, both succeed. If it aborts, neither does. This gives us a proper write-ahead log without the complexity of two-phase commit. The log key uses a versionstamp prefix for true FDB commit ordering. No clock skew ambiguity.
🔗How Atomic Logging Works
Each operation writes a log entry in the same transaction as the operation itself. The log key uses a versionstamp prefix, so entries sort by true FDB commit order, not by wall clock (which can drift ±1 second under simulation).
Log Key Structure:
```
┌─────────────────────┬───────────┬────────┐
│ versionstamp (10B)  │ client_id │ op_num │
└─────────────────────┴───────────┴────────┘
          ↑
          Assigned by FDB at commit time
```
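Here's a hedged sketch of writing such a key with the SetVersionstampedKey atomic operation. The prefix and field encoding are assumptions, and the recipe may well use the tuple layer's incomplete-versionstamp helpers instead; the 4-byte little-endian suffix is what the raw atomic op expects at API version 520 and above:

```rust
use foundationdb::options::MutationType;
use foundationdb::Transaction;

/// Illustrative only: prefix and field encoding are assumptions. Writes one
/// log entry whose key starts with a 10-byte placeholder that FDB replaces
/// with the commit versionstamp, followed by client_id and op_num.
fn append_log_entry(trx: &Transaction, client_id: u32, op_num: u32, entry: &[u8]) {
    let mut key = Vec::new();
    key.extend_from_slice(b"/election/log/");
    let versionstamp_offset = key.len() as u32; // where FDB will write the versionstamp
    key.extend_from_slice(&[0u8; 10]);          // placeholder, filled in at commit time
    key.extend_from_slice(&client_id.to_be_bytes());
    key.extend_from_slice(&op_num.to_be_bytes());
    // The raw atomic op expects the placeholder's position appended as
    // 4 little-endian bytes at the end of the key.
    key.extend_from_slice(&versionstamp_offset.to_le_bytes());
    trx.atomic_op(&key, entry, MutationType::SetVersionstampedKey);
    // The leader election mutation it describes goes in the same transaction,
    // so the log entry and the state change commit or abort together.
}
```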
Here's what a typical log might look like after chaos:
| Versionstamp | Client | Op | Result |
|---|---|---|---|
| 0x0001... | 0 | Register | ✓ |
| 0x0002... | 1 | Register | ✓ |
| 0x0003... | 2 | Register | ✓ |
| 0x0004... | 0 | TryBecomeLeader | ✓ became leader, ballot=1 |
| 0x0005... | 1 | TryBecomeLeader | ✗ conflict |
| 0x0006... | 0 | Heartbeat | ✓ |
| 0x0007... | 2 | TryBecomeLeader | ✓ became leader, ballot=2 |
| 0x0008... | 2 | Resign | ✓ |
Client 0 became leader with ballot 1. Client 1's claim failed (conflict). Client 2 later claimed ballot 2 when client 0's lease expired, then resigned. The versionstamp ordering is ground truth: no matter how skewed each client's clock was, the log shows exactly what committed and in what order.
During check, client 0 reads all logs and database state in a single snapshot, then runs the invariants. Same seed, same execution, same bugs. Deterministic replay.
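The replay half of that check fits in a few lines. The LogEntry shape below is illustrative rather than the workload's actual type; the point is that the expected leader is derived purely from the versionstamp-ordered log:

```rust
/// Illustrative types only; the workload's real log entries carry more detail.
struct LogEntry {
    versionstamp: [u8; 10], // FDB commit order: the ground truth
    client_id: u32,
    op: Op,
}

enum Op {
    Register,
    ClaimedLeadership { ballot: u64 },
    Heartbeat,
    Resigned,
}

/// Replay the log in versionstamp order to compute the expected
/// (leader, ballot) pair, which is then compared with the FDB snapshot.
fn expected_leader(mut log: Vec<LogEntry>) -> Option<(u32, u64)> {
    log.sort_by(|a, b| a.versionstamp.cmp(&b.versionstamp));
    let mut leader: Option<(u32, u64)> = None;
    for entry in &log {
        match entry.op {
            Op::ClaimedLeadership { ballot } => leader = Some((entry.client_id, ballot)),
            Op::Resigned if leader.map(|(c, _)| c) == Some(entry.client_id) => leader = None,
            _ => {}
        }
    }
    leader
}
```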
🔗The Invariants
Here's where Claude Code enters the story. I asked it to generate invariants for leader election validation. I didn't give it a detailed specification. I pointed it at my previous post about designing workloads that find bugs and said "apply these patterns to leader election."
It generated 13 invariants. Here are the seven most important ones:
| Invariant | What | Why | How |
|---|---|---|---|
| DualPathValidation | Expected state from log replay must match actual FDB snapshot | The keystone check. If logs say process A is leader with ballot 7, but FDB shows process B with ballot 6, something corrupted state | Replay all logged operations in versionstamp order to compute expected leader/ballot, then compare with actual database state |
| FencingTokenMonotonicity | Each successful claim must have a strictly higher ballot than the previous | When a network partition heals, an old leader might try to write with a stale ballot. This invariant catches that | For each claim in the log, verify ballot > previous_ballot from the same transaction |
| OneValuePerBallot | Each ballot number maps to exactly one client | Two clients claiming the same ballot means either conflict detection failed or ballot increment is broken. Classic split-brain symptom | Scan all claims, verify no two different clients ever claimed the same ballot number |
| LeaderIsCandidate | Current leader must exist in the candidates registry | Edge cases like crash-during-registration or eviction-while-claiming can leave orphaned leaders | Read leader state, verify a matching candidate entry exists |
| NoOverlappingLeadership | Every leadership claim has a globally unique versionstamp | FDB serializes commits globally. Duplicate versionstamps mean either a logging bug or actual split-brain | Collect all claim versionstamps, verify no duplicates |
| GlobalBallotSuccession | Each new leader must have ballot > previous leader's ballot | Catches state regression after partition heals. An old leader can't "go back" to a stale ballot | Track previous_ballot in log entries, verify new_ballot > previous_ballot for every transition |
| LeaseExpiryAfterClaim | Lease must expire after the claim timestamp | Clock skew can cause a leader to claim with an already-expired lease. This catches incorrect lease calculation or extreme clock drift | For each claim, verify lease_expiry_nanos > claim_timestamp_nanos |
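To show what these look like in code, here's an illustrative sketch of two of the simpler checks, run over the successful claims extracted from the log in versionstamp order (the data shape and function name are mine, not the workload's):

```rust
use std::collections::HashMap;

/// Illustrative sketch: `claims` holds (ballot, client_id) pairs for every
/// successful leadership claim, already sorted by commit versionstamp.
fn check_ballots(claims: &[(u64, u32)]) -> Result<(), String> {
    // FencingTokenMonotonicity: every claim must carry a strictly higher
    // ballot than the claim that committed before it.
    for pair in claims.windows(2) {
        let (prev, next) = (pair[0].0, pair[1].0);
        if next <= prev {
            return Err(format!("ballot did not increase: {prev} -> {next}"));
        }
    }
    // OneValuePerBallot: no two different clients may ever claim the same ballot.
    let mut owner: HashMap<u64, u32> = HashMap::new();
    for &(ballot, client) in claims {
        if let Some(&other) = owner.get(&ballot) {
            if other != client {
                return Err(format!(
                    "ballot {ballot} claimed by clients {other} and {client}"
                ));
            }
        }
        owner.insert(ballot, client);
    }
    Ok(())
}
```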
🔗What's Next
The leader election recipe has just been merged into foundationdb-rs as experimental. I haven't run it in production yet. The entire development was simulation-driven: write code, run simulation, fix what breaks, repeat. The simulation runs in CI on every PR.
Simulation gives us evidence that the code behaves correctly under the rules we encoded, even when FDB is tortured. Network partitions, clock skew, process crashes. The invariants verify that safety properties hold through all of it. Not a proof of correctness, but confidence earned through chaos.
The invariants that Claude Code generated encode patterns from prior simulation work: dual-path validation from AtomicOps, fencing tokens from the literature, lease checks tied to clock skew. What the LLM provided was speed: weeks of invariant development compressed into hours of review. The LLM proposes, simulation disposes.
If you have ideas for new tortures or invariants, the simulation code is open. Merge requests welcome.
Feel free to reach out with questions or to share your experiences with leader election and simulation testing. You can find me on Twitter or Bluesky, or through my website.