Snapshot + Delta

The core storage model

KiteDB separates storage into two parts: a snapshot (immutable, on disk) and a delta (mutable, in memory). This separation is the foundation of how KiteDB achieves fast reads and writes.

The Model

Database State = Snapshot + Delta

Snapshot (disk) – Immutable, CSR format, zero-copy
Delta (memory) – Pending changes, fast writes, merged on read
WAL (durability) – Recovery log, crash safety, write-ahead

Snapshot

The snapshot is a point-in-time image of the entire database. It's stored in CSR format and memory-mapped directly from disk.

Key properties:

Immutable – Once written, never modified. Safe for concurrent reads.
Zero-copy – Memory-mapped via mmap(). The OS handles caching.
Compressed – zstd compression reduces disk usage by ~60%.
Complete – Contains all nodes, edges, properties, and indexes.
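
To make the CSR idea concrete, here is a minimal TypeScript sketch of an out-edge lookup over two flat arrays. The names `CsrGraph`, `offsets`, `targets`, and `outNeighbors` are illustrative assumptions and do not describe KiteDB's actual on-disk layout; the CSR Format page covers the real structure.

```typescript
// Hypothetical CSR (compressed sparse row) adjacency; not KiteDB's real layout.
interface CsrGraph {
  // targets[offsets[i] .. offsets[i + 1]) holds the out-edges of node i.
  offsets: Uint32Array;
  // Destination node IDs, grouped by source node.
  targets: Uint32Array;
}

// Returns a view (no copy) of the out-neighbors of `nodeId`.
function outNeighbors(graph: CsrGraph, nodeId: number): Uint32Array {
  if (nodeId < 0 || nodeId + 1 >= graph.offsets.length) {
    return new Uint32Array(0); // unknown node: no edges
  }
  return graph.targets.subarray(graph.offsets[nodeId], graph.offsets[nodeId + 1]);
}

// Three nodes: 0 -> {1, 2}, 1 -> {2}, 2 -> {}
const csrExample: CsrGraph = {
  offsets: Uint32Array.from([0, 2, 3, 3]),
  targets: Uint32Array.from([1, 2, 2]),
};
console.log(Array.from(outNeighbors(csrExample, 0))); // [ 1, 2 ]
```

Because a lookup is just two array reads and a slice, the arrays never need to be rewritten, which is what makes an immutable, memory-mapped file a natural fit.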

Delta

The delta holds all changes since the last snapshot. It's a collection of in-memory data structures optimized for both reads and writes.

Delta State

Field	Type	Purpose
createdNodes	Map<NodeID, NodeData>	New nodes
deletedNodes	Set<NodeID>	Tombstones
modifiedNodes	Map<NodeID, PropChanges>	Property updates
outAdd/outDel	Map<NodeID, EdgePatch[]>	Edge changes
inAdd/inDel	Map<NodeID, EdgePatch[]>	Reverse index
keyIndex	Map<string, NodeID>	Key lookups
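
The delta fields listed above can be written down as a TypeScript type. This is a sketch under assumptions: the field names and container types come from the listing, but the shapes of `NodeData`, `PropChanges`, and `EdgePatch`, and the helpers `emptyDelta` and `addEdge`, are hypothetical.

```typescript
// Field names follow the Delta State listing; the element shapes are assumptions.
type NodeID = number;
type PropValue = string | number | boolean | null;

interface NodeData { key?: string; props: Record<string, PropValue>; }
type PropChanges = Record<string, PropValue>;
interface EdgePatch { edgeType: string; other: NodeID; }

interface DeltaState {
  createdNodes: Map<NodeID, NodeData>;      // new nodes
  deletedNodes: Set<NodeID>;                // tombstones
  modifiedNodes: Map<NodeID, PropChanges>;  // property updates
  outAdd: Map<NodeID, EdgePatch[]>;         // edge changes, by source node
  outDel: Map<NodeID, EdgePatch[]>;
  inAdd: Map<NodeID, EdgePatch[]>;          // reverse index, by destination node
  inDel: Map<NodeID, EdgePatch[]>;
  keyIndex: Map<string, NodeID>;            // key lookups
}

function emptyDelta(): DeltaState {
  return {
    createdNodes: new Map(), deletedNodes: new Set(), modifiedNodes: new Map(),
    outAdd: new Map(), outDel: new Map(), inAdd: new Map(), inDel: new Map(),
    keyIndex: new Map(),
  };
}

function pushPatch(index: Map<NodeID, EdgePatch[]>, id: NodeID, patch: EdgePatch): void {
  const list = index.get(id);
  if (list) list.push(patch);
  else index.set(id, [patch]);
}

// Assumption: a new edge is recorded in both directions so that
// reverse traversals can consult the delta without scanning outAdd.
function addEdge(delta: DeltaState, src: NodeID, edgeType: string, dst: NodeID): void {
  pushPatch(delta.outAdd, src, { edgeType, other: dst });
  pushPatch(delta.inAdd, dst, { edgeType, other: src });
}
```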

How Reads Work

Every read operation merges snapshot and delta:

1. Is nodeId in delta.deletedNodes? → Yes: return null (deleted)
2. Is nodeId in delta.createdNodes? → Yes: return delta data (new node)
3. Does snapshot have this node? → No: return null (never existed)
4. Merge snapshot + delta.modifiedNodes → Return combined result
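
The four read steps above translate almost line for line into code. The following TypeScript sketch assumes hypothetical `SnapshotView` and `NodeDelta` shapes and a `readNode` helper; only the order of the checks is taken from the text.

```typescript
type NodeID = number;
type Props = Record<string, unknown>;

// Minimal stand-ins for the two layers (hypothetical shapes).
interface SnapshotView {
  getNode(id: NodeID): Props | undefined;
}
interface NodeDelta {
  deletedNodes: Set<NodeID>;
  createdNodes: Map<NodeID, Props>;
  modifiedNodes: Map<NodeID, Props>;
}

function readNode(snapshot: SnapshotView, delta: NodeDelta, id: NodeID): Props | null {
  // 1. A tombstone in the delta hides the node, whatever the snapshot says.
  if (delta.deletedNodes.has(id)) return null;
  // 2. A node created since the last snapshot lives only in the delta.
  const created = delta.createdNodes.get(id);
  if (created !== undefined) return created;
  // 3. Otherwise the node must come from the snapshot.
  const base = snapshot.getNode(id);
  if (base === undefined) return null;
  // 4. Overlay pending property changes; delta values win over snapshot values.
  return { ...base, ...(delta.modifiedNodes.get(id) ?? {}) };
}

const snap: SnapshotView = {
  getNode: (id) => (id === 1 ? { name: "ada", age: 36 } : undefined),
};
const delta: NodeDelta = {
  deletedNodes: new Set<NodeID>(),
  createdNodes: new Map<NodeID, Props>([[2, { name: "bob" }]]),
  modifiedNodes: new Map<NodeID, Props>([[1, { age: 37 }]]),
};
console.log(readNode(snap, delta, 1)); // { name: 'ada', age: 37 }
```

Checking the tombstone set first means a deleted node never costs a snapshot lookup.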

Edge traversals work the same way: scan the snapshot's edges, skip any that the delta marks as deleted, then add new edges from the delta.
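
An edge traversal for a single node can be sketched the same way. In this TypeScript example, `mergeOutEdges`, `edgeKey`, and the `EdgePatch` shape are assumptions made for illustration; the inputs correspond to the snapshot's edge list and the node's `outDel` and `outAdd` entries.

```typescript
type NodeID = number;
interface EdgePatch { edgeType: string; other: NodeID; }

// Identify an edge by its type and far endpoint (the near endpoint is implied).
function edgeKey(e: EdgePatch): string {
  return `${e.edgeType}:${e.other}`;
}

// Merge one node's out-edges: snapshot edges minus deletions, plus additions.
// Assumption: an edge added and then deleted within the same delta has
// already been cancelled at write time, so outAdd needs no filtering here.
function mergeOutEdges(
  snapshotEdges: EdgePatch[],
  outDel: EdgePatch[],
  outAdd: EdgePatch[],
): EdgePatch[] {
  const deleted = new Set(outDel.map(edgeKey));
  const kept = snapshotEdges.filter((e) => !deleted.has(edgeKey(e)));
  return kept.concat(outAdd);
}

const merged = mergeOutEdges(
  [{ edgeType: "follows", other: 2 }, { edgeType: "follows", other: 3 }],
  [{ edgeType: "follows", other: 2 }],
  [{ edgeType: "likes", other: 4 }],
);
console.log(merged.map(edgeKey)); // [ 'follows:3', 'likes:4' ]
```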

How Writes Work

A committed write touches three components, in this order:

Transaction Commit

1. WAL → Append records (ensures durability)
2. Delta → Update in-memory state (visible to reads)
3. Cache → Invalidate affected entries

The snapshot is NOT touched during normal writes.
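
The commit order above can be sketched as a small in-memory model. Everything here is hypothetical (`MiniStore`, `WalRecord`, the JSON record encoding, and the `readCache` map); it only illustrates the ordering, and the array standing in for the WAL would be an append-only file flushed to disk in a real system.

```typescript
type NodeID = number;
type Props = Record<string, unknown>;
interface WalRecord { op: "setProps"; nodeId: NodeID; props: Props; }

class MiniStore {
  wal: string[] = [];                        // stand-in for an append-only log file
  modifiedNodes = new Map<NodeID, Props>();  // the delta (property updates only)
  readCache = new Map<NodeID, Props>();      // cached merged reads

  commit(records: WalRecord[]): void {
    // 1. WAL first: once the records are logged, the change can be replayed.
    for (const r of records) this.wal.push(JSON.stringify(r));
    // 2. Delta next: the change becomes visible to subsequent reads.
    for (const r of records) {
      const pending = this.modifiedNodes.get(r.nodeId) ?? {};
      this.modifiedNodes.set(r.nodeId, { ...pending, ...r.props });
    }
    // 3. Cache last: drop entries that would now return stale data.
    for (const r of records) this.readCache.delete(r.nodeId);
    // Note that nothing above reads or writes the snapshot.
  }
}
```

Logging before applying is what "write-ahead" means: if the process dies between steps 1 and 2, the log still contains the change.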

Checkpoint: Merging Delta into Snapshot

Periodically, KiteDB creates a new snapshot that incorporates all delta changes. This is called a checkpoint.

Checkpoint Process

1. Read current snapshot
2. Apply all delta changes
3. Write new snapshot (CSR, compressed)
4. Update header to point to new snapshot
5. Clear delta and WAL

Auto: when WAL reaches threshold

Manual: db.optimize()

During checkpoint, reads continue against the old snapshot + delta. The switch to the new snapshot is atomic.
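
A checkpoint can be modeled in memory as "build a new map, then swap one reference". The TypeScript sketch below is an illustration under assumptions: `StoreState` and `checkpoint` are hypothetical names, and the single reference assignment stands in for the header update that publishes the new snapshot.

```typescript
type NodeID = number;
type Props = Record<string, unknown>;

interface StoreState {
  snapshot: ReadonlyMap<NodeID, Props>; // never mutated once published
  createdNodes: Map<NodeID, Props>;
  modifiedNodes: Map<NodeID, Props>;
  deletedNodes: Set<NodeID>;
  wal: string[];
}

function checkpoint(state: StoreState): void {
  // Steps 1-2: read the current snapshot and apply the delta to a fresh copy.
  const next = new Map<NodeID, Props>();
  state.snapshot.forEach((props, id) => {
    if (state.deletedNodes.has(id)) return; // tombstoned nodes are dropped
    next.set(id, { ...props, ...(state.modifiedNodes.get(id) ?? {}) });
  });
  state.createdNodes.forEach((props, id) => {
    if (!state.deletedNodes.has(id)) next.set(id, props);
  });
  // Steps 3-4: publish the new snapshot with one reference assignment.
  // Readers still holding the old map keep a consistent view.
  state.snapshot = next;
  // Step 5: the delta and WAL are now redundant.
  state.createdNodes = new Map();
  state.modifiedNodes = new Map();
  state.deletedNodes = new Set();
  state.wal = [];
}
```

The old snapshot is never modified in place, which is why reads can continue against it until the swap.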

Why This Works Well

Property	How It's Achieved
Fast reads	Snapshot is mmap'd. OS caches hot pages. Delta is small.
Fast writes	WAL append + memory update. No disk seeks.
Crash safety	WAL survives crashes. Replay rebuilds delta.
Concurrent reads	Snapshot is immutable. MVCC handles delta visibility.
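
The crash-safety row relies on replaying the WAL to rebuild the delta. A minimal TypeScript sketch of that idea follows; the record format (`WalRecord` as JSON lines), the `RecoveredDelta` shape, and `replayWal` are assumptions for illustration, not KiteDB's actual log format.

```typescript
type NodeID = number;
type Props = Record<string, unknown>;

// Hypothetical log records, one JSON document per line.
type WalRecord =
  | { op: "createNode"; nodeId: NodeID; props: Props }
  | { op: "deleteNode"; nodeId: NodeID };

interface RecoveredDelta {
  createdNodes: Map<NodeID, Props>;
  deletedNodes: Set<NodeID>;
}

// Rebuild the in-memory delta by applying logged records in their original order.
function replayWal(lines: string[]): RecoveredDelta {
  const delta: RecoveredDelta = { createdNodes: new Map(), deletedNodes: new Set() };
  for (const line of lines) {
    const rec = JSON.parse(line) as WalRecord;
    if (rec.op === "createNode") {
      delta.createdNodes.set(rec.nodeId, rec.props);
    } else if (!delta.createdNodes.delete(rec.nodeId)) {
      // Assumption: a node created and deleted within the same delta simply
      // vanishes; only nodes that exist in the snapshot need a tombstone.
      delta.deletedNodes.add(rec.nodeId);
    }
  }
  return delta;
}
```

Since the snapshot on disk is unchanged by normal writes, snapshot plus replayed delta reproduces the state as of the last logged commit.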

Next Steps

CSR Format – How the snapshot stores edges
WAL & Durability – How the write-ahead log works
MVCC & Transactions – How concurrent access is handled
