Merkle DAG
Every commit produces a cryptographic fingerprint of the entire branch state, stored in a Merkle DAG (Directed Acyclic Graph).
Merkle Tree Structure
A Merkle tree is a binary tree of hashes:
- Leaves are hashes of the actual data files
- Internal nodes are hashes of their two children
- Root is a single 32-byte value representing every file in the branch
Leaf Hashing
Each leaf hash incorporates both the file’s relative path and its content:
- Two files with identical content but different paths produce different leaf hashes
- Renaming a file changes the tree root even without content changes
- Paths are normalized to forward slashes for cross-platform consistency
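The leaf rules above can be sketched roughly as follows. This is a minimal illustration, not Graft's actual code: BLAKE3 is not in Python's standard library, so the sketch substitutes `hashlib.blake2b` with a 32-byte digest, and the way path and content are combined (a NUL separator) is an assumption. The names `normalize_path` and `leaf_hash` are hypothetical.

```python
import hashlib


def normalize_path(rel_path: str) -> str:
    """Use forward slashes so every platform hashes the same path string."""
    return rel_path.replace("\\", "/")


def leaf_hash(rel_path: str, content: bytes) -> bytes:
    """Hash the normalized relative path together with the file content."""
    # Stand-in for BLAKE3 (assumption): blake2b truncated to 32 bytes.
    h = hashlib.blake2b(digest_size=32)
    h.update(normalize_path(rel_path).encode("utf-8"))
    h.update(b"\x00")  # hypothetical separator between path and content
    h.update(content)
    return h.digest()
```

Because the path is part of the hashed input, renaming a file changes its leaf hash, and therefore the root, even when the bytes on disk are identical.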
Tree Construction
The tree is built in a deterministic order:
- Walk the branch directory, skipping excluded files (`*.sock`, `*.pid`, `postmaster.pid`)
- Sort leaves by relative path (alphabetical)
- Pair leaves and hash them together
- If an odd leaf remains at any level, promote it to the next level
- Continue pairing until one hash remains, which is the root
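A minimal sketch of these construction steps, under several assumptions: files are passed in as an in-memory mapping instead of being read from a directory walk, `hashlib.blake2b` stands in for BLAKE3, "alphabetical" is taken to mean plain string ordering, and the leaf encoding (path, NUL byte, content) is illustrative. The function names are hypothetical.

```python
import fnmatch
import hashlib

# Exclusion patterns taken from the list above.
EXCLUDED = ("*.sock", "*.pid", "postmaster.pid")


def _h(data: bytes) -> bytes:
    # Stand-in for BLAKE3 (assumption): 32-byte blake2b digest.
    return hashlib.blake2b(data, digest_size=32).digest()


def is_excluded(rel_path: str) -> bool:
    """Match the file name (last path component) against the exclusion patterns."""
    name = rel_path.rsplit("/", 1)[-1]
    return any(fnmatch.fnmatchcase(name, pat) for pat in EXCLUDED)


def merkle_root(files: dict[str, bytes]) -> bytes:
    """Compute the root for a mapping of relative path -> file content."""
    paths = sorted(p for p in files if not is_excluded(p))
    if not paths:
        raise ValueError("no files to hash")
    # Leaf encoding is an assumption: path, NUL separator, content.
    level = [_h(p.encode("utf-8") + b"\x00" + files[p]) for p in paths]
    while len(level) > 1:
        # Hash adjacent pairs together.
        nxt = [_h(level[i] + level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # promote the odd node unchanged
        level = nxt
    return level[0]
```

Sorting before pairing is what makes the result independent of the order in which the filesystem happens to return entries.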
The DAG
Commits form a directed acyclic graph through parent references. Each commit records:
- `hash`: 7-character BLAKE3 digest of tree_root + parent_hash + timestamp
- `branch`: Which branch this commit belongs to
- `parent_hash`: Previous commit hash (NULL for first commit)
- `tree_root`: Merkle tree root (64 hex chars)
- `verified`: Whether the tree has been verified against the stored root
- `message`: User-provided commit message
- `created_at`: ISO 8601 timestamp
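One way the commit hash could be derived from these fields is sketched below. The exact byte encoding of the concatenation, the use of hex, and the truncation to the first 7 characters are assumptions. `hashlib.blake2b` again stands in for BLAKE3, and `commit_hash` is a hypothetical name.

```python
import hashlib
from typing import Optional


def commit_hash(tree_root: str, parent_hash: Optional[str], timestamp: str) -> str:
    """Return a 7-character identifier derived from root, parent, and timestamp."""
    # The first commit has no parent; an empty string is used here (assumption).
    payload = tree_root + (parent_hash or "") + timestamp
    # Stand-in for BLAKE3 (assumption): 32-byte blake2b digest, hex encoded.
    digest = hashlib.blake2b(payload.encode("utf-8"), digest_size=32).hexdigest()
    return digest[:7]


# Usage: chain two commits over the same tree root.
root = "ab" * 32  # placeholder 64-hex-char tree root
first = commit_hash(root, None, "2024-01-01T00:00:00Z")
second = commit_hash(root, first, "2024-01-01T00:05:00Z")
```

Since each hash covers the parent's hash, altering an earlier commit would change the identifier of every commit that descends from it.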
SQLite Schema
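A minimal sketch of what such a table could look like, assuming a table named `commits` whose columns mirror the commit fields. The table name, column types, and constraints are all assumptions, not Graft's actual schema, and `open_store` is a hypothetical helper.

```python
import sqlite3

# Hypothetical schema: one row per commit, columns named after the commit fields.
SCHEMA = """
CREATE TABLE IF NOT EXISTS commits (
    hash        TEXT PRIMARY KEY,
    branch      TEXT NOT NULL,
    parent_hash TEXT REFERENCES commits(hash),  -- NULL for the first commit
    tree_root   TEXT NOT NULL CHECK (length(tree_root) = 64),
    verified    INTEGER NOT NULL DEFAULT 0,     -- SQLite has no boolean type
    message     TEXT,
    created_at  TEXT NOT NULL                   -- ISO 8601 timestamp
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the commit store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```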
Why BLAKE3?
| | BLAKE3 | SHA-256 |
|---|---|---|
| Single-core throughput | ~1 GB/s | ~300 MB/s |
| Multi-core scaling | ~6 GB/s on 8 cores | ~300 MB/s (serial) |
| Output size | 256 bits (32 bytes) | 256 bits (32 bytes) |
File-Level Hashing
Graft hashes files, not blocks. A 2 GB Postgres database has ~130 files (not 525K 4 KB pages). This means:
- Small Merkle tree (~6 KB in memory)
- Fast to compute (~2-3 seconds)
- Actionable errors: `base/16384/12547 is corrupted` tells you exactly what’s wrong
- Acceptable trade-off: corruption is detected at file granularity, not block granularity
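To illustrate how file-level hashing yields errors that name a specific file, here is a rough sketch that compares stored leaf hashes against freshly computed ones. It assumes per-file leaf hashes are kept alongside the root, which is not confirmed here. `hashlib.blake2b` stands in for BLAKE3, the leaf encoding is illustrative, and `find_corrupted` and the "is missing" message are hypothetical.

```python
import hashlib


def leaf_hash(rel_path: str, content: bytes) -> bytes:
    # Stand-in for BLAKE3 (assumption); path and content joined by a NUL byte.
    data = rel_path.encode("utf-8") + b"\x00" + content
    return hashlib.blake2b(data, digest_size=32).digest()


def find_corrupted(stored: dict[str, bytes], files: dict[str, bytes]) -> list[str]:
    """Compare stored leaf hashes with current file contents, one message per bad file."""
    errors = []
    for path in sorted(stored):
        if path not in files:
            errors.append(f"{path} is missing")  # hypothetical message
        elif leaf_hash(path, files[path]) != stored[path]:
            errors.append(f"{path} is corrupted")
    return errors


# Usage: flip one byte in one file and see which path is reported.
good = {"base/16384/12547": b"page data", "base/16384/12548": b"other"}
stored = {p: leaf_hash(p, c) for p, c in good.items()}
bad = dict(good)
bad["base/16384/12547"] = b"page dat4"
report = find_corrupted(stored, bad)
```

The report identifies the damaged file but not the offset within it, which is the file-versus-block granularity trade-off described above.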