# Performance
KiteDB is designed for speed. This page explains why it's fast and how to get the best performance from it.
## Why KiteDB is Fast
### 1. No Network Overhead

KiteDB runs in-process, so a query is a function call rather than a network round-trip. Eliminating the network alone yields a 10-1000x speedup over client-server databases.
### 2. Zero-Copy Memory Mapping

The data file is memory-mapped, so reads go straight through the OS page cache with no copy into a separate buffer pool. Hot data stays in RAM automatically; cold data is paged in on demand.
### 3. Cache-Friendly Data Layout

Adjacency lists are stored contiguously (CSR format), so a traversal walks sequential memory instead of chasing pointers.

#### Traversing 10 Neighbors

With neighbor IDs packed side by side, visiting a node's 10 neighbors touches a couple of adjacent cache lines rather than 10 scattered ones.
### 4. Lazy MVCC

#### Version Chains: Only When Needed

Most workloads are largely serial, so MVCC overhead is paid only when concurrent transactions actually overlap.
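A sketch of what "lazy" means in practice; the entire transaction API shown here (`open()`, `begin()`, `createNode()`, `commit()`) is a hypothetical stand-in, not KiteDB's confirmed surface:

```ts
import { open } from "kitedb"; // hypothetical import

const db = open("graph.kite");

// One transaction at a time: no version chains are created, so
// reads and writes pay zero MVCC overhead.
const solo = db.begin();
solo.createNode({ key: "alice" });
solo.commit();

// Overlapping transactions: the writer creates a version chain so
// the reader keeps seeing its snapshot; once the reader finishes,
// GC reclaims the chain.
const reader = db.begin();
const writer = db.begin();
writer.createNode({ key: "bob" });
writer.commit();
reader.commit(); // the version chain is now garbage
```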
## Benchmark Results
Latest snapshot (single-file raw, Rust core, 10k nodes / 50k edges, edge types=3, edge props=10, syncMode=Normal, groupCommitEnabled=false, February 4, 2026):
### Node Ops
| Operation | p50 |
|---|---|
| Key lookup (random existing) | 125ns |
| Batch write (100 nodes) | 34.08us |
### Edge Ops
| Operation | p50 |
|---|---|
| 1-hop traversal (out) | 208ns |
| Edge exists (random) | 83ns |
| Batch write (100 edges) | 40.25us |
| Batch write (100 edges + props) | 172.33us |
Full logs and run commands are in `docs/benchmarks/results/`.
## Write Durability vs Throughput

- Defaults stay safe: `syncMode=Full`, `groupCommitEnabled=false`.
- Single-writer, low latency: `syncMode=Normal` + `groupCommitEnabled=false`.
- Multi-writer throughput: `syncMode=Normal` + `groupCommitEnabled=true` (1-2ms window). Scaling saturates quickly; prefer prep-parallel + a single writer for max ingest. See the benchmarks notes.
- Highest speed, weakest durability: `syncMode=Off` (testing/throwaway data only).
Group commit adds intentional latency to coalesce commits; it improves throughput under concurrency, but can slow single-threaded benchmarks.
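As a concrete sketch, the settings below select the multi-writer profile. `syncMode` and `groupCommitEnabled` are the knobs described above, but the `open()` call, import path, and option shape are assumptions, not KiteDB's confirmed API:

```ts
import { open } from "kitedb"; // hypothetical import

// Multi-writer throughput profile.
const db = open("graph.kite", {
  syncMode: "Normal",       // relaxed fsync; WAL still written
  groupCommitEnabled: true, // coalesce commits in a 1-2ms window
});

// Durable default (fsync per commit):
//   { syncMode: "Full", groupCommitEnabled: false }
```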
### Decision Table
| Workload | syncMode | groupCommitEnabled | Why |
|---|---|---|---|
| Production, high durability | Full | false | fsync per commit |
| Single-writer ingest | Normal | false | Lowest latency per commit |
| Multi-writer throughput | Normal | true (1-2ms window) | Coalesces commits |
| Testing/throwaway data | Off | false | Max speed, weakest durability |
## Performance Playbook

- Fastest ingest (single writer): `beginBulk()` + `createNodesBatch()` + `addEdgesBatch()`/`addEdgesWithPropsBatch()`, `syncMode=Normal`, `groupCommitEnabled=false`, WAL ≥ 256MB, auto-checkpoint off during ingest, then checkpoint afterward.
- Multi-writer throughput: `syncMode=Normal` + `groupCommitEnabled=true` (1-2ms window), batched ops per transaction.
- Read-heavy, mixed workload: keep batches small, checkpoint when the WAL is ≥ 80% full, avoid deep traversals.
- Max speed, lowest durability: `syncMode=Off` for testing only.
Bulk-load mode requires MVCC to be disabled. Use it for one-shot ingest or ETL jobs.
### Bulk Ingest Example (Low-Level)
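A minimal sketch built around the batch APIs named in the playbook (`beginBulk()`, `createNodesBatch()`, `addEdgesBatch()`, `addEdgesWithPropsBatch()`). The `open()`, `endBulk()`, and `checkpoint()` calls and the option names are assumptions:

```ts
import { open } from "kitedb"; // hypothetical import

// Single-writer ingest profile from the playbook.
const db = open("graph.kite", {
  syncMode: "Normal",
  groupCommitEnabled: false,
  walSizeMB: 256,        // hypothetical option: WAL >= 256MB
  autoCheckpoint: false, // hypothetical option: checkpoint manually
});

db.beginBulk(); // bulk-load mode; requires MVCC disabled

// Create nodes in batches (100 per batch matches the benchmark size).
const nodes = Array.from({ length: 100 }, (_, i) => ({ key: `node:${i}` }));
const ids: number[] = db.createNodesBatch(nodes);

// Plain edges in one batch...
db.addEdgesBatch(
  ids.map((src, i) => ({ src, dst: ids[(i + 1) % ids.length], type: "knows" }))
);

// ...and edges with properties in another.
db.addEdgesWithPropsBatch([
  { src: ids[0], dst: ids[1], type: "rated", props: { score: 5 } },
]);

db.endBulk();    // hypothetical: leave bulk-load mode
db.checkpoint(); // fold the WAL into the main file
```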
## Best Practices

### Batch Writes
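Prefer one batched write over many single-item commits; batching amortizes commit overhead (a 100-node batch lands at ~34µs in the snapshot above, well under 1µs per node). A sketch, where the single-item `createNode()` call is a hypothetical stand-in:

```ts
// Slow: 100 commits (hypothetical single-item API).
for (let i = 0; i < 100; i++) {
  db.createNode({ key: `user:${i}` });
}

// Fast: one batched write, one commit.
db.createNodesBatch(
  Array.from({ length: 100 }, (_, i) => ({ key: `user:${i}` }))
);
```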
### Limit Traversal Depth
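Each additional hop multiplies the number of edges visited, so bound traversal depth explicitly. A sketch of a depth-capped BFS, where `neighbors()` is a hypothetical stand-in for a 1-hop traversal call:

```ts
// Depth-capped BFS; `db.neighbors(id, "out")` is a hypothetical 1-hop call.
function boundedBfs(db: any, start: number, maxDepth: number): Set<number> {
  const seen = new Set<number>([start]);
  let frontier = [start];
  for (let depth = 0; depth < maxDepth; depth++) {
    const next: number[] = [];
    for (const id of frontier) {
      for (const n of db.neighbors(id, "out")) {
        if (!seen.has(n)) {
          seen.add(n);
          next.push(n);
        }
      }
    }
    frontier = next;
  }
  return seen;
}
```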
### Use Keys for Lookups
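Key lookups are the fastest way to reach a node (125ns p50 in the snapshot above), so resolve keys directly instead of scanning properties. Both calls below are hypothetical names:

```ts
// Fast: direct key lookup (~125ns p50 in the snapshot above).
const alice = db.getNodeByKey("user:alice"); // hypothetical API

// Slow: a full scan that inspects every node's properties.
// const alice = db.findNodes((n) => n.props.name === "Alice");
```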
### Checkpoint Timing
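Checkpointing folds the WAL back into the main file; per the playbook, trigger it when the WAL is roughly 80% full or right after a bulk ingest, not on every commit. `walUtilization()` and `checkpoint()` are hypothetical names:

```ts
// Checkpoint once the WAL is ~80% full (hypothetical stats API).
if (db.walUtilization() >= 0.8) {
  db.checkpoint();
}
```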
## Memory Usage

### Memory Breakdown

- Memory-mapped file: not counted against process memory; the OS manages the page cache, keeping hot pages in RAM and cold pages on disk.
- Caches: property cache (LRU, default 10K entries) and traversal cache (LRU, invalidated on writes).
- Version chains: allocated only when concurrent transactions exist; cleaned up by GC.
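If the defaults don't fit the workload, the cache sizes are the natural knobs to tune; the option name below is hypothetical, assuming the 10K-entry default noted above:

```ts
const db = open("graph.kite", {
  propertyCacheSize: 50_000, // hypothetical option; default is 10K entries
});
```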
Typical 100K node graph:
## Profiling Tips
## Next Steps
- CSR Format – Why traversals are fast
- Snapshot + Delta – How reads stay fast during writes
- Benchmarks – Detailed performance measurements