# Performance
KiteDB is designed for speed. This page explains why it's fast and how to get the best performance from it.
## Why KiteDB is Fast
### 1. No Network Overhead

KiteDB runs in-process, so a query is a function call rather than a network round-trip. Eliminating the network alone yields a 10-1000x speedup over client-server databases.
### 2. Zero-Copy Memory Mapping

The data file is memory-mapped, so reads go straight through the OS page cache with no copy into a separate buffer pool. Hot data stays in RAM automatically; cold data is paged in on demand.
### 3. Cache-Friendly Data Layout

Adjacency lists are stored contiguously (CSR format), so a traversal walks sequential memory instead of chasing pointers.

#### Traversing 10 Neighbors

With neighbor IDs packed side by side, visiting a node's 10 neighbors touches a couple of adjacent cache lines rather than 10 scattered ones.
### 4. Lazy MVCC

#### Version Chains: Only When Needed

Most workloads are largely serial, so MVCC overhead is paid only when concurrent transactions actually overlap.
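A sketch of what "lazy" means in practice; the entire transaction API shown here (`open()`, `begin()`, `createNode()`, `commit()`) is a hypothetical stand-in, not KiteDB's confirmed surface:

```ts
import { open } from "kitedb"; // hypothetical import

const db = open("graph.kite");

// One transaction at a time: no version chains are created, so
// reads and writes pay zero MVCC overhead.
const solo = db.begin();
solo.createNode({ key: "alice" });
solo.commit();

// Overlapping transactions: the writer creates a version chain so
// the reader keeps seeing its snapshot; once the reader finishes,
// GC reclaims the chain.
const reader = db.begin();
const writer = db.begin();
writer.createNode({ key: "bob" });
writer.commit();
reader.commit(); // the version chain is now garbage
```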
## Benchmark Results
Latest snapshot (single-file raw, Rust core, 10k nodes / 50k edges, edge types=3, edge props=10, syncMode=Normal, groupCommitEnabled=false, February 4, 2026):
### Node Ops
| Operation | p50 |
|---|---|
| Key lookup (random existing) | 125ns |
| Batch write (100 nodes) | 34.08us |
### Edge Ops
| Operation | p50 |
|---|---|
| 1-hop traversal (out) | 208ns |
| Edge exists (random) | 83ns |
| Batch write (100 edges) | 40.25us |
| Batch write (100 edges + props) | 172.33us |
Full logs and run commands are in `docs/benchmarks/results/`.
## Write Durability vs Throughput

- Defaults stay safe: `syncMode=Full`, `groupCommitEnabled=false`.
- Single-writer, low latency: `syncMode=Normal` + `groupCommitEnabled=false`.
- Multi-writer throughput: `syncMode=Normal` + `groupCommitEnabled=true` (1-2ms window). Scaling saturates quickly; prefer prep-parallel + a single writer for max ingest. See the benchmarks notes.
- Highest speed, weakest durability: `syncMode=Off` (testing/throwaway data only).
Group commit adds intentional latency to coalesce commits; it improves throughput under concurrency, but can slow single-threaded benchmarks.
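As a concrete sketch, the settings below select the multi-writer profile. `syncMode` and `groupCommitEnabled` are the knobs described above, but the `open()` call, import path, and option shape are assumptions, not KiteDB's confirmed API:

```ts
import { open } from "kitedb"; // hypothetical import

// Multi-writer throughput profile.
const db = open("graph.kite", {
  syncMode: "Normal",       // relaxed fsync; WAL still written
  groupCommitEnabled: true, // coalesce commits in a 1-2ms window
});

// Durable default (fsync per commit):
//   { syncMode: "Full", groupCommitEnabled: false }
```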
### Decision Table
| Workload | syncMode | groupCommitEnabled | Why |
|---|---|---|---|
| Production, high durability | Full | false | fsync per commit |
| Single-writer ingest | Normal | false | Lowest latency per commit |
| Multi-writer throughput | Normal | true (1-2ms window) | Coalesces commits |
| Testing/throwaway data | Off | false | Max speed, weakest durability |
## Performance Playbook

- Fastest ingest (single writer): `beginBulk()` + `createNodesBatch()` + `addEdgesBatch()`/`addEdgesWithPropsBatch()`, `syncMode=Normal`, `groupCommitEnabled=false`, WAL ≥ 256MB, auto-checkpoint off during ingest, then checkpoint afterward.
- Multi-writer throughput: `syncMode=Normal` + `groupCommitEnabled=true` (1-2ms window), batched ops per transaction.
- Read-heavy, mixed workload: keep batches small, checkpoint when the WAL is ≥ 80% full, avoid deep traversals.
- Max speed, lowest durability: `syncMode=Off` for testing only.
Bulk-load mode requires MVCC to be disabled. Use it for one-shot ingest or ETL jobs.
### Bulk Ingest Example (Low-Level)
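A minimal sketch built around the batch APIs named in the playbook (`beginBulk()`, `createNodesBatch()`, `addEdgesBatch()`, `addEdgesWithPropsBatch()`). The `open()`, `endBulk()`, and `checkpoint()` calls and the option names are assumptions:

```ts
import { open } from "kitedb"; // hypothetical import

// Single-writer ingest profile from the playbook.
const db = open("graph.kite", {
  syncMode: "Normal",
  groupCommitEnabled: false,
  walSizeMB: 256,        // hypothetical option: WAL >= 256MB
  autoCheckpoint: false, // hypothetical option: checkpoint manually
});

db.beginBulk(); // bulk-load mode; requires MVCC disabled

// Create nodes in batches (100 per batch matches the benchmark size).
const nodes = Array.from({ length: 100 }, (_, i) => ({ key: `node:${i}` }));
const ids: number[] = db.createNodesBatch(nodes);

// Plain edges in one batch...
db.addEdgesBatch(
  ids.map((src, i) => ({ src, dst: ids[(i + 1) % ids.length], type: "knows" }))
);

// ...and edges with properties in another.
db.addEdgesWithPropsBatch([
  { src: ids[0], dst: ids[1], type: "rated", props: { score: 5 } },
]);

db.endBulk();    // hypothetical: leave bulk-load mode
db.checkpoint(); // fold the WAL into the main file
```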
## Best Practices

### Batch Writes
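Prefer one batched write over many single-item commits; batching amortizes commit overhead (a 100-node batch lands at ~34µs in the snapshot above, well under 1µs per node). A sketch, where the single-item `createNode()` call is a hypothetical stand-in:

```ts
// Slow: 100 commits (hypothetical single-item API).
for (let i = 0; i < 100; i++) {
  db.createNode({ key: `user:${i}` });
}

// Fast: one batched write, one commit.
db.createNodesBatch(
  Array.from({ length: 100 }, (_, i) => ({ key: `user:${i}` }))
);
```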
### Limit Traversal Depth
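Each additional hop multiplies the number of edges visited, so bound traversal depth explicitly. A sketch of a depth-capped BFS, where `neighbors()` is a hypothetical stand-in for a 1-hop traversal call:

```ts
// Depth-capped BFS; `db.neighbors(id, "out")` is a hypothetical 1-hop call.
function boundedBfs(db: any, start: number, maxDepth: number): Set<number> {
  const seen = new Set<number>([start]);
  let frontier = [start];
  for (let depth = 0; depth < maxDepth; depth++) {
    const next: number[] = [];
    for (const id of frontier) {
      for (const n of db.neighbors(id, "out")) {
        if (!seen.has(n)) {
          seen.add(n);
          next.push(n);
        }
      }
    }
    frontier = next;
  }
  return seen;
}
```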
### Use Keys for Lookups
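Key lookups are the fastest way to reach a node (125ns p50 in the snapshot above), so resolve keys directly instead of scanning properties. Both calls below are hypothetical names:

```ts
// Fast: direct key lookup (~125ns p50 in the snapshot above).
const alice = db.getNodeByKey("user:alice"); // hypothetical API

// Slow: a full scan that inspects every node's properties.
// const alice = db.findNodes((n) => n.props.name === "Alice");
```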
### Checkpoint Timing
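Checkpointing folds the WAL back into the main file; per the playbook, trigger it when the WAL is roughly 80% full or right after a bulk ingest, not on every commit. `walUtilization()` and `checkpoint()` are hypothetical names:

```ts
// Checkpoint once the WAL is ~80% full (hypothetical stats API).
if (db.walUtilization() >= 0.8) {
  db.checkpoint();
}
```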
## Memory Usage

### Memory Breakdown

- Memory-mapped file: not counted against process memory; the OS manages the page cache, keeping hot pages in RAM and cold pages on disk.
- Caches: property cache (LRU, default 10K entries) and traversal cache (LRU, invalidated on writes).
- Version chains: allocated only when concurrent transactions exist; cleaned up by GC.
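If the defaults don't fit the workload, the cache sizes are the natural knobs to tune; the option name below is hypothetical, assuming the 10K-entry default noted above:

```ts
const db = open("graph.kite", {
  propertyCacheSize: 50_000, // hypothetical option; default is 10K entries
});
```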
Typical 100K node graph:
## Profiling Tips
## Next Steps
- CSR Format – Why traversals are fast
- Snapshot + Delta – How reads stay fast during writes
- Benchmarks – Detailed performance measurements