Mohamed KEITA

CortexDB

A constraints-first direction for data systems, focused on infrastructure resilience, sovereignty, and execution clarity.

CortexDB is a CPU-first, local-first database engine designed for constrained environments.

It is built from scratch as a versioned infrastructure effort, evolving from a minimal LSM-based storage core into a distributed, AI-ready data platform.

CortexDB does not start from hyperscaler assumptions.
It starts from architectural discipline.


The Problem

Most modern data systems are designed with implicit assumptions:

  • Abundant RAM
  • Always-on cloud connectivity
  • GPU availability for vector workloads
  • Global distributed consensus
  • Hyperscaler-scale infrastructure

These assumptions are not universal.

In many real-world environments, such as edge deployments, offline-first applications, cost-sensitive infrastructures, and emerging data ecosystems, these assumptions introduce fragility, unnecessary complexity, or structural misalignment.

CortexDB explores a different path:

A storage engine that prioritizes:

  • CPU efficiency over GPU dependency
  • Local autonomy over permanent cloud reliance
  • Explicit trade-offs over hidden complexity
  • Progressive evolution over premature distribution

Design Philosophy

CortexDB evolves through explicit, versioned architectural steps.

It follows a layered progression:

  • V1–V2: Minimal, crash-safe LSM-based storage core (see the sketch after this list)
  • V3: Query primitives and developer ecosystem
  • V4: CPU-first vector search layer
  • V5: Replication and offline synchronization
  • V6: Advanced CortexQL, secondary indexes, observability
  • V7: Pragmatic horizontal sharding without global consensus
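
To make the V1–V2 layer concrete, here is a minimal sketch of what a crash-safe KV core can look like: an in-memory memtable backed by an append-only write-ahead log that is replayed on startup. The names and the tab-separated log format are illustrative assumptions, not CortexDB's actual API; a real LSM core would also flush the memtable to SSTables and compact them.

```rust
// Illustrative sketch only: `CortexKv` and this log format are assumptions,
// not CortexDB's real API. A full LSM core would add SSTable flushes,
// compaction, and binary record framing (tabs/newlines in data would break
// this toy format).
use std::collections::BTreeMap;
use std::fs::{File, OpenOptions};
use std::io::{self, BufRead, BufReader, Write};
use std::path::Path;

struct CortexKv {
    memtable: BTreeMap<String, String>, // sorted in-memory table
    wal: File,                          // mutations are appended here before being applied
}

impl CortexKv {
    fn open(path: &Path) -> io::Result<Self> {
        let mut memtable = BTreeMap::new();
        // Replay the log so a crash never loses an acknowledged write.
        if path.exists() {
            for line in BufReader::new(File::open(path)?).lines() {
                let line = line?;
                match line.split_once('\t') {
                    Some(("put", rest)) => {
                        if let Some((k, v)) = rest.split_once('\t') {
                            memtable.insert(k.to_string(), v.to_string());
                        }
                    }
                    Some(("del", k)) => {
                        memtable.remove(k);
                    }
                    _ => {} // ignore a torn final record
                }
            }
        }
        let wal = OpenOptions::new().create(true).append(true).open(path)?;
        Ok(Self { memtable, wal })
    }

    fn put(&mut self, key: &str, value: &str) -> io::Result<()> {
        writeln!(self.wal, "put\t{key}\t{value}")?;
        self.wal.sync_data()?; // durable before the write is acknowledged
        self.memtable.insert(key.to_string(), value.to_string());
        Ok(())
    }

    fn delete(&mut self, key: &str) -> io::Result<()> {
        writeln!(self.wal, "del\t{key}")?;
        self.wal.sync_data()?;
        self.memtable.remove(key);
        Ok(())
    }

    fn get(&self, key: &str) -> Option<&str> {
        self.memtable.get(key).map(String::as_str)
    }
}

fn main() -> io::Result<()> {
    let mut kv = CortexKv::open(Path::new("cortex.wal"))?;
    kv.put("doc:readme", "CortexDB")?;
    assert_eq!(kv.get("doc:readme"), Some("CortexDB"));
    kv.delete("doc:readme")?;
    Ok(())
}
```

The design choice the sketch tries to capture is that durability comes from the log, not from keeping everything resident in RAM.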

Each version defines clear non-objectives.
Complexity is introduced deliberately — never speculatively.

CortexDB does not attempt to become a relational SQL engine.
It does not implement global 2PC or distributed consensus prematurely.
It does not assume infinite hardware.

Instead, it maintains a strict hierarchy:

KV core → JSON documents (optional) → Vector collections → Distributed shards.

Every layer builds on explicit invariants.
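
As an illustration of that layering, the sketch below builds a vector collection on top of a generic KV interface: embeddings live under a "vec/<collection>/<id>" key prefix, and queries run as a brute-force, CPU-only cosine scan. The type and trait names are assumptions for the example, not CortexDB's actual types, and a CPU-friendly ANN index could later replace the linear scan without changing the layering.

```rust
// Illustrative sketch only: `KvStore` and `VectorCollection` are assumed names,
// not CortexDB's real types. The point is the dependency direction: the vector
// layer relies only on the KV invariant (ordered keys, prefix scans).
use std::collections::BTreeMap;

trait KvStore {
    fn put(&mut self, key: &str, value: Vec<u8>);
    fn scan_prefix(&self, prefix: &str) -> Vec<(String, Vec<u8>)>;
}

// Stand-in for the real LSM core; any KvStore implementation would do.
struct MemKv(BTreeMap<String, Vec<u8>>);

impl KvStore for MemKv {
    fn put(&mut self, key: &str, value: Vec<u8>) {
        self.0.insert(key.to_string(), value);
    }
    fn scan_prefix(&self, prefix: &str) -> Vec<(String, Vec<u8>)> {
        self.0
            .range(prefix.to_string()..)
            .take_while(|(k, _)| k.starts_with(prefix))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }
}

// The vector layer only knows about keys shaped like "vec/<collection>/<id>".
struct VectorCollection<'a, S: KvStore> {
    kv: &'a mut S,
    name: String,
}

impl<'a, S: KvStore> VectorCollection<'a, S> {
    fn insert(&mut self, id: &str, embedding: &[f32]) {
        let bytes: Vec<u8> = embedding.iter().flat_map(|x| x.to_le_bytes()).collect();
        self.kv.put(&format!("vec/{}/{}", self.name, id), bytes);
    }

    // Brute-force cosine scan: no GPU, no index, just a predictable CPU pass.
    fn nearest(&self, query: &[f32], k: usize) -> Vec<(String, f32)> {
        let mut scored: Vec<(String, f32)> = self
            .kv
            .scan_prefix(&format!("vec/{}/", self.name))
            .into_iter()
            .map(|(key, bytes)| {
                let v: Vec<f32> = bytes
                    .chunks_exact(4)
                    .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
                    .collect();
                (key, cosine(query, &v))
            })
            .collect();
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));
        scored.truncate(k);
        scored
    }
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

fn main() {
    let mut kv = MemKv(BTreeMap::new());
    let mut docs = VectorCollection { kv: &mut kv, name: "docs".into() };
    docs.insert("a", &[1.0, 0.0]);
    docs.insert("b", &[0.0, 1.0]);
    println!("{:?}", docs.nearest(&[0.9, 0.1], 1)); // closest to "a"
}
```

The vector layer never touches files, pages, or shards directly; swap MemKv for the real core and the layer above is unchanged. That is the invariant the hierarchy is meant to protect.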


Why It Matters

Data infrastructure is becoming strategic.

As AI systems, edge computing, and regional data centers expand, the need for:

  • Resource-aware systems
  • Predictable performance models
  • Controlled operational complexity
  • Sovereign infrastructure capabilities

becomes structural.

CortexDB is an exploration of what a modern data engine can look like when:

  • CPU is the baseline
  • Offline capability is first-class
  • Vector search is native
  • Distribution is pragmatic, not ideological

It is not a product announcement.
It is a long-term infrastructure effort.

Explore CortexDB

Architecture, language, versions