Engineering · Dec 09, 2025

Why Sempleo’s context layer runs on Postgres.

A new category does not need new infrastructure. It needs boring, predictable, well-understood infrastructure so the novel thing can fail safely. Postgres is the right answer.

The most common question I get from engineers, after the thesis lands, is some version of: what database is underneath it?

The answer is boring on purpose. Sempleo’s context layer runs on Postgres — specifically Postgres with pgvector for embeddings, row-level security for tenancy, and logical replication for audit streaming. No custom KV store. No bespoke vector database. No graph engine. Postgres.

I made that call in week one of building and I have not come close to reversing it. Here is the reasoning, because I think the reasoning is the more interesting part.

A new category needs old infrastructure. Sempleo is, as a product, a new kind of thing. Teams have never had a structured, multi-tenant, authority-aware context layer sitting between their people and their agents. That novelty is the bet. And because the bet is novel, everything underneath it should be as well understood, as debuggable, and as boring as I can possibly make it. You only get to spend your innovation budget once. Spending it on the data layer is a mistake most founders in this category will regret inside eighteen months.

Relational shape matches the product. The five-layer model — company, team, client, project, user — is a relational model. Layers own entries. Entries have fields. Fields have owners, freshness stamps, quality dots, and version history. Entries belong to scopes; scopes belong to tenants. Relationships between layers are foreign keys. Any non-relational store would spend its first year re-inventing joins, transactions, and constraints. Postgres hands those to you on day one, correctly, for free.
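A minimal sketch of what "relationships are foreign keys" can look like. Every table and column name here is a hypothetical illustration, not Sempleo's actual schema, and the sketch uses SQLite from the Python standard library only so that it runs without a server; the same constraints carry over to Postgres DDL with minor type changes.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not Sempleo's real tables.
SCHEMA = """
CREATE TABLE tenants (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE scopes (
    id        INTEGER PRIMARY KEY,
    tenant_id INTEGER NOT NULL REFERENCES tenants(id),
    layer     TEXT NOT NULL
        CHECK (layer IN ('company', 'team', 'client', 'project', 'user')),
    name      TEXT NOT NULL
);
CREATE TABLE entries (
    id       INTEGER PRIMARY KEY,
    scope_id INTEGER NOT NULL REFERENCES scopes(id),
    title    TEXT NOT NULL
);
CREATE TABLE fields (
    id         INTEGER PRIMARY KEY,
    entry_id   INTEGER NOT NULL REFERENCES entries(id),
    name       TEXT NOT NULL,
    value      TEXT,
    owner      TEXT NOT NULL,
    updated_at TEXT NOT NULL,
    version    INTEGER NOT NULL DEFAULT 1
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this is switched on per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def orphan_entry_is_rejected(conn: sqlite3.Connection) -> bool:
    """Return True if the database refuses an entry whose scope does not exist."""
    try:
        conn.execute("INSERT INTO entries (scope_id, title) VALUES (999, 'orphan')")
    except sqlite3.IntegrityError:
        return True
    return False
```

The point of the sketch is that the database, not application code, refuses an entry without a scope or a scope with an unknown layer.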

Vectors are a column, not a system. We do use embeddings (every entry is embedded at write time for semantic retrieval), and pgvector is good enough for the scale any sensible early-stage team should be designing for. There is a long debate about whether dedicated vector databases pay for themselves at small scale. For the shape of query Sempleo runs (tenant-scoped, layer-scoped, authority-filtered, often fewer than ten thousand candidate rows), the answer is clearly no. Keep the vectors next to the rows they describe. Query them with the same SQL.
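To make "filter first, then rank by distance" concrete, here is a sketch under stated assumptions. The table and column names (`entries`, `tenant_id`, `layer`, `embedding`) and the parameter style are hypothetical; `<=>` is pgvector's cosine-distance operator. The Python function is a reference model of the same logic over in-memory rows, not the production path, and an authority filter would be one more predicate in the same `WHERE` clause.

```python
import math

# Assumed query shape; names are hypothetical. `<=>` is pgvector cosine distance.
NEAREST_ENTRIES_SQL = """
SELECT id, title
FROM entries
WHERE tenant_id = %(tenant_id)s
  AND layer = %(layer)s
ORDER BY embedding <=> %(query_embedding)s
LIMIT %(k)s
"""


def cosine_distance(a, b):
    """1 - cosine similarity, the quantity the `<=>` operator orders by."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def nearest_entries(rows, tenant_id, layer, query_embedding, k):
    """Reference model of the SQL above: scope the candidates, then rank them."""
    candidates = [
        r for r in rows
        if r["tenant_id"] == tenant_id and r["layer"] == layer
    ]
    candidates.sort(key=lambda r: cosine_distance(r["embedding"], query_embedding))
    return [r["id"] for r in candidates[:k]]
```

Because the scoping predicates cut the candidate set before any distance is computed, the ranking step only ever sees one tenant's rows in one layer.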

Row-level security is the tenancy story. Every request arrives with a tenant id and a user id, and every query runs inside a Postgres session with those values bound. RLS policies enforce who can read what. We have unit tests that assert a user in tenant A cannot, under any code path, read a row in tenant B — and those tests run against a real Postgres with real policies, not a mocked one. I would rather trust a decade of Postgres security hardening than a tenancy abstraction I wrote myself last quarter.
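A sketch of how that binding might look, with loud assumptions: the setting names `app.tenant_id` and `app.user_id`, the `entries` table, a text-typed `tenant_id` column, and the `%s` placeholder style are all hypothetical. `set_config(name, value, true)` scopes the value to the current transaction, which is one way to bind it; a session-wide binding would pass `false`. The small Python function models what the policy's `USING` clause lets a session see.

```python
# One-time setup, run as the table owner. Names are hypothetical.
RLS_SETUP_SQL = """
ALTER TABLE entries ENABLE ROW LEVEL SECURITY;
ALTER TABLE entries FORCE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON entries
    USING (tenant_id = current_setting('app.tenant_id'));
"""


def bind_session_statements(tenant_id: str, user_id: str):
    """Return (sql, params) pairs to run at the start of a request's transaction.

    Values travel as bound parameters, never interpolated into the SQL text.
    """
    return [
        ("SELECT set_config('app.tenant_id', %s, true)", (tenant_id,)),
        ("SELECT set_config('app.user_id', %s, true)", (user_id,)),
    ]


def visible_rows(rows, session_tenant_id: str):
    """Pure-Python model of the policy: only rows of the bound tenant survive."""
    return [r for r in rows if r["tenant_id"] == session_tenant_id]
```

With the policy in place, a query that forgets its own `WHERE tenant_id = ...` still returns only the bound tenant's rows, which is exactly what a cross-tenant test should assert against a real database.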

Audit is logical replication. Every mutation in Sempleo is audited: who changed what, from which agent run, on whose review approval. Postgres’s logical replication stream gives us that audit for free: a second consumer reads the row-level changes decoded from the WAL and writes an append-only audit log to object storage. No trigger gymnastics. No application-layer audit code that can be forgotten on a new endpoint. Either a write happened or it did not.
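A sketch of the consumer side, under assumptions. The shape of the `change` dict below is a simplified, hypothetical stand-in for whatever the decoding plugin emits, and the column names (`updated_by`, `agent_run_id`, `approved_by`) are illustrative. It also assumes the attribution lives in columns on the row itself, since a logical change stream carries row data rather than application context. A local file stands in for object storage.

```python
import json


def audit_line(change: dict) -> str:
    """Turn one decoded row change into a single JSON audit record."""
    row = change["row"]
    record = {
        "table": change["table"],
        "op": change["op"],              # "insert", "update", or "delete"
        "row_id": row["id"],
        "changed_by": row.get("updated_by"),
        "agent_run_id": row.get("agent_run_id"),
        "approved_by": row.get("approved_by"),
        "lsn": change["lsn"],            # position in the WAL, kept for ordering
    }
    return json.dumps(record, sort_keys=True)


def append_audit(path: str, changes) -> int:
    """Append one line per change; existing lines are never rewritten."""
    written = 0
    with open(path, "a", encoding="utf-8") as f:
        for change in changes:
            f.write(audit_line(change) + "\n")
            written += 1
    return written
```

Because the consumer only appends, replaying the stream from a known position rebuilds the log without touching application code.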

The parts that are not Postgres. Embedding generation is delegated to external model providers; we are not in the embedding-model business. Agent runs are scheduled from a Redis queue because fan-out and retries are not what Postgres is good at. Long-lived session state for MCP connections lives in its own process. The separation is deliberate: Postgres holds the truth, the rest of the system is stateless and disposable.

What I would do differently at scale. If we cross into genuine large-tenant territory — tens of thousands of active entries per tenant, sustained write volume from dozens of concurrent agent runs — we will split the hottest tenants onto dedicated instances and move the audit stream onto Kafka. Neither of those requires us to rewrite the data model. The boring choice at the start is also the one that gives us the cleanest upgrade path at scale.

The larger point: in an AI-infrastructure category, the most interesting thing you can do at the data layer is nothing. Save your novelty for the ontology, the governance, and the agent design — the places where novelty actually becomes product value. Let Postgres be Postgres.

The platform page has more on where this sits in the stack. Founding-customer applications are open.
