Host Storage

Branch data lives on your host filesystem under ~/.graft/. This section explains the layout and how the object pool works.

Directory Layout

~/.graft/
  └── my-postgres/                   ← Project (named after target container)
      ├── config.json                ← State file (JSON, schema v2)
      ├── graft.db                   ← SQLite DAG (WAL mode, single connection)
      ├── objects/                   ← BLAKE3 object pool
      │   ├── a1/                    ← First 2 hex chars of hash
      │   │   ├── b2c3d4e5f6…        ← Remaining 62 hex chars
      │   │   └── 7890abcdef…
      │   └── f0/
      │       └── e1d2c3b4a5…
      └── branches/
          ├── main/                  ← Materialized branch directory
          │   ├── PG_VERSION
          │   ├── base/
          │   │   ├── 1/
          │   │   │   ├── 1259
          │   │   │   └── 1260
          │   │   └── 16384/
          │   ├── global/
          │   ├── pg_wal/
          │   └── pg_hba.conf
          └── experiment/
              └── ...                ← Hardlinked/deduplicated files

Object Pool

The object pool at objects/ stores file content keyed by BLAKE3 hash.

Storage Format

objects/<hash[:2]>/<hash[2:]>
Example: a file with BLAKE3 hash a1b2c3d4e5f67890abcdef1234567890abcdef1234567890abcdef1234567890 (64 hex characters) is stored at:
objects/a1/b2c3d4e5f67890abcdef1234567890abcdef1234567890abcdef1234567890
The 2-character hex prefix shards the pool into at most 256 subdirectories (one per possible prefix), which keeps any single directory from accumulating too many entries.

Operations

Operation           Description
Put(hash, reader)   Store a blob; idempotent (no-op if already present)
Get(hash)           Open a blob for reading
Has(hash)           Check if a blob exists
PutFile(path)       Hash a file on disk and store it

Deduplication

Blobs are immutable and content-addressed. If two files have the same content (e.g., an unchanged Postgres heap file across commits), they map to the same hash and are stored exactly once. The Put operation checks Has first and returns immediately if the blob exists.

Object Materialization

When Graft needs to restore a branch (checkout, rollback, reset), it materializes files from the object pool into the branch directory.

Diff-Based Materialization

Rather than copying every file, Graft:
  1. Walks the current branch directory and computes BLAKE3 hashes
  2. Compares the result against the target tree entries
  3. Skips files that already match (same hash)
  4. Only copies, removes, or replaces files that differ
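The steps above boil down to comparing two path-to-hash maps. The sketch below assumes the directory walk and hashing have already produced those maps; the Plan type and planMaterialize function are hypothetical names, not part of Graft's documented API.

```go
package main

import "sort"

// Plan lists the work needed to turn the current branch directory into
// the target tree. Hypothetical type for illustration.
type Plan struct {
	Write  []string // paths to copy from the object pool (new or changed)
	Remove []string // paths on disk that are absent from the target tree
}

// planMaterialize compares two path->hash maps. Files whose hash already
// matches the target appear in neither list and are left untouched.
func planMaterialize(current, target map[string]string) Plan {
	var p Plan
	for path, want := range target {
		if have, ok := current[path]; !ok || have != want {
			p.Write = append(p.Write, path)
		}
	}
	for path := range current {
		if _, ok := target[path]; !ok {
			p.Remove = append(p.Remove, path)
		}
	}
	// Map iteration order is random; sort for deterministic output.
	sort.Strings(p.Write)
	sort.Strings(p.Remove)
	return p
}
```

When most files are unchanged between commits, both lists stay short, so the cost of a checkout scales with the size of the difference rather than the size of the data directory (apart from the initial hashing pass).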
For each file that needs to be materialized, Graft attempts a reflink (copy-on-write) using cp -c:
cmd := exec.Command("cp", "-c", "-f", src, dst)
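On filesystems that support copy-on-write clones, such as APFS (macOS) and btrfs (Linux), a reflink completes almost instantly regardless of file size: the new file shares the source's disk blocks, and a block is copied only when one side later modifies it. This makes materialization nearly free. Note that -c is the clone flag of the macOS cp; GNU cp on Linux spells the equivalent option --reflink. If cp -c fails (unsupported filesystem, missing binary), Graft falls back to a full byte-for-byte copy: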
io.Copy(out, in)

Excluded Files

Certain runtime files are never stored in the object pool or copied between branches:
Pattern          Reason
*.sock           Database Unix sockets (created on every start)
*.pid            PID files (process-specific, invalid after restart)
postmaster.pid   Postgres-specific PID file
These files are created fresh when the database starts and should never be part of a data snapshot.
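A filter for these patterns could be written with Go's standard glob matcher, as sketched below. The names excludedPatterns and isExcluded are hypothetical, and matching on the file's base name only is an assumption about how the patterns are applied.

```go
package main

import "path/filepath"

// excludedPatterns mirrors the table above. Hypothetical variable name.
var excludedPatterns = []string{"*.sock", "*.pid", "postmaster.pid"}

// isExcluded reports whether the file's base name matches any pattern.
func isExcluded(path string) bool {
	base := filepath.Base(path)
	for _, pat := range excludedPatterns {
		// Match only returns an error for a malformed pattern;
		// the patterns here are fixed and well-formed.
		if ok, _ := filepath.Match(pat, base); ok {
			return true
		}
	}
	return false
}
```

A walker would call isExcluded on each entry before hashing or copying it, so excluded files never reach the object pool.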

Host vs Docker Volume Storage

Aspect        Docker Volume              Host Filesystem
Location      Inside Docker’s Linux VM   ~/.graft/ on your Mac
Access        Container only             Finder, ls, any tool
Hardlinks     Impossible (cross-VM)      Native (APFS)
Backup        docker run ... tar         cp -r
Performance   VM overhead                Native I/O
The host filesystem approach removes the Docker VM as a bottleneck for all data operations.