Mohamed KEITA

CortexDB v1

Build a minimal persistent key-value storage engine with a basic LSM-tree architecture

1. Overall Vision of V1

V1 establishes the foundations of CortexDB: a persistent, crash-safe key-value storage engine built around a minimal LSM-tree architecture.

Main objectives:

  • Persistent key to value storage on disk
  • Minimal API: put(key, value), get(key), delete(key)
  • Single process, single file (or directory) on disk
  • Tolerate simple crashes thanks to the WAL
  • Minimal LSM-tree architecture

2. V1 Non-Goals

  • No compaction
  • No Bloom filters
  • No vector search
  • No range queries
  • No replication
  • No SQL
  • No advanced performance optimization
  • No compression
  • No multi-threading
  • No external clients (Python/Node), postponed to V3

3. V1 Architecture

3.1. Main Components

┌───────────┐
│   Client  │
└─────┬─────┘
      │  put(key, value)
      ▼
┌───────────────────┐
│       WAL         │
│  (append-only log)│
└───────┬───────────┘
        │ replay
        ▼
┌───────────────────┐
│     Memtable      │
│ (HashMap<Vec<u8>, │
│      Vec<u8>>)    │
└───────────────────┘

3.2. Modules

  • src/db.rs: DB struct and public API
  • src/memtable.rs: in-memory structure (HashMap) for recent writes
  • src/wal.rs: append-only log file to replay the memtable state after a crash
  • src/sstable.rs (optional in V1): can be postponed to V2 for simplicity

4. WAL (Write-Ahead Log)

4.1. Simple Format

Binary format of an entry:

+----------+-----------+------------+-----------+-----------+
| op (u8)  | key_len   | val_len    | key bytes | val bytes |
|          | (u32)     | (u32)      |           |           |
+----------+-----------+------------+-----------+-----------+
  • op = 1 → PUT
  • op = 2 → DELETE

4.2. V1 Rules

  • The WAL must be flushed after each append()
  • The WAL must be replayed strictly in order at startup
  • DELETE only stores the key (val_len = 0)
  • Empty values are allowed
  • If the WAL is corrupted, use a simple strategy (stop or partial skip)

5. Memtable

In-memory structure (HashMap) for recent writes.

  • Stores key and value pairs in RAM
  • A None value represents a tombstone (deletion)
  • Unbounded size in V1, no automatic flush

6. External API V1

In pseudo-code:

db = DB.open(path="./data")
 
db.put("user:1", b"Alice")
value = db.get("user:1")
db.delete("user:1")

7. V1 Invariants

  • Every write must go through the WAL before reaching the Memtable
  • The Memtable is always fully reconstructed from the WAL at startup
  • A newer key always overrides an older one
  • A tombstone overrides all previous versions of a key
  • The engine must be able to restart after a crash without losing consistency

8. Engine State After Crash

  1. Full WAL replay
  2. Memtable reconstruction
  3. The engine is ready to serve reads

9. Local Server

Note: The local server and Python or Node clients are postponed to V3 in order to stabilize the API first. V1 focuses exclusively on the core engine.


10. Required Tests for V1

WAL:

  • Simple replay
  • Replay after reopen
  • Empty keys or empty values
  • Large values
  • Delete
  • Crash recovery

Memtable:

  • Put, get, delete
  • Tombstones
  • Reconstruction from WAL

Integration:

  • put → get → delete → get
  • crash → reopen → get (must retrieve the data)

11. V1 Objective

At the end of V1, CortexDB is:

  • Functional: basic API operational
  • Persistent: data stored on disk
  • Crash-safe: recovery after crash via WAL
  • Simple: minimal and understandable architecture

This is the foundation on which all subsequent versions will be built.