CortexDB for the Enterprise

A practitioner's guide to memory at company scale

Status: Draft v1
Audience: CTOs, VPs of Engineering, platform architects, AI-product leads, and the engineers who will actually wire CortexDB into a 5,000-person company.
Companion documents: API_DESIGN_V1.md (one-user narrative), API_REFERENCE_V1.md (exact contract), enterprise_runbooks.md (ops).


1. Why this paper exists

The first walkthrough of CortexDB (API_DESIGN_V1.md) follows one sales rep named Alice through one deal. That's the right way to learn the primitives. It's the wrong way to plan a rollout.

A real enterprise has 12 departments, 200 teams, 5,000 humans, an unknown number of AI agents, three regulators looking over its shoulder, and one CFO asking "what does memory cost me per seat per month?" The same five primitives — Events, Episodes, Facts, Beliefs, Understanding — still apply, but the questions you ask them change shape entirely. An individual asks "what did I say in last Tuesday's meeting?" A department head asks "what are my top three risks across 40 reps?" The CEO asks "is our knowledge of the customer compounding or evaporating?"

This document shows how to model an enterprise on CortexDB so that all three of those people get useful answers from the same database, without anyone seeing what they shouldn't.

We will work through one company, Initech (a fictional B2B SaaS company with ~3,000 employees), and walk it from rollout day through a year of operation, ending with one concrete application, engineering root-cause analysis (RCA), traced end-to-end through the API.

The goal is for a reader to finish this document and be able to sketch their own enterprise's scope tree, name the agents that will live in it, and predict what the CEO dashboard will look like.


2. The mental model in one page

CortexDB is an event-sourced memory database for AI agents. The rest of this paper rests on three principles:

  1. Events are immutable. Every experience an agent has lands as a single, signed, timestamped entry in a write-ahead log. That log is the source of truth. Everything else — facts, beliefs, summaries — is a derived view that can be rebuilt by replay.

  2. Memory is layered. Five layers, each addressable:

    • Events — raw lossless capture ("Alice said X at 9:14am")
    • Episodes — bounded chains of related events ("the Acme Q3 deal pursuit")
    • Facts — triple-shaped assertions with validity windows ("Acme.deal_stage = signed, valid from May 13")
    • Beliefs — probabilistic conclusions with confidence intervals ("Acme will close in Q3, 0.94 supported")
    • Understanding — synthesized concepts spanning many beliefs ("B2B sales cycles in this org tend to stall at procurement")
  3. Every claim has a trail. When the system says something, it can tell you exactly why — which facts, which episodes, which raw events. Memory without an audit trail is not memory; it is gossip.
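
To make principle 1 concrete, here is a minimal sketch of rebuilding a derived fact view by replaying an append-only log. The Event fields, the sample timestamps, and the rebuild_facts helper are illustrative assumptions, not CortexDB's actual schema or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One immutable log entry (hypothetical shape, for illustration only)."""
    seq: int           # position in the write-ahead log
    recorded_at: str   # ISO-8601 UTC timestamp of when the entry was logged
    subject: str       # e.g. "acme"
    predicate: str     # e.g. "deal_stage"
    value: str         # e.g. "signed"


def rebuild_facts(log: list[Event], as_of: Optional[str] = None) -> dict[tuple[str, str], str]:
    """Replay the log in order; a later event supersedes an earlier value.

    Passing as_of replays only events recorded at or before that timestamp,
    which is the idea a point-in-time rebuild relies on.
    """
    facts: dict[tuple[str, str], str] = {}
    for event in sorted(log, key=lambda e: e.seq):
        if as_of is not None and event.recorded_at > as_of:
            continue  # same-format ISO strings compare chronologically
        facts[(event.subject, event.predicate)] = event.value
    return facts


# Sample data (assumed): two events about the same fact.
log = [
    Event(1, "2026-05-01T09:00:00Z", "acme", "deal_stage", "negotiation"),
    Event(2, "2026-05-13T15:30:00Z", "acme", "deal_stage", "signed"),
]

current = rebuild_facts(log)
earlier = rebuild_facts(log, as_of="2026-05-02T00:00:00Z")
```

The log is never mutated; dropping the facts dictionary and calling rebuild_facts again yields the same view, which is what "derived view" means here.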

The four verbs you write code against are:

Everything else — Events list, Episodes timeline, Beliefs/why, Lifecycle stream, Audit log — is read-only introspection on top of these four.


3. Initech: the cast

For the rest of this paper, one company:

Who / what | Description
---------- | -----------
Initech | A B2B SaaS company. ~3,000 employees. EU + US presence. Subject to GDPR.
Departments | Sales, Engineering, Customer Success, Marketing, Finance, HR, Legal, Security
Maya | A new account executive in Sales. (Our "individual" persona.)
Sarah | Maya's manager. Sales director for enterprise accounts.
Diego | VP of Engineering. Owns reliability for the platform.
Priya | Initech's CTO. The "top management" persona.
Lana | Initech's DPO (Data Protection Officer). Owns GDPR and audit.
maya_bot | Maya's personal sales agent.
oncall_bot | The engineering on-call agent.
csm_bot | The customer success agent.
exec_brief_bot | The agent that produces Priya's morning briefing.

Throughout this paper, humans don't talk to CortexDB directly. They talk to their agents. Their agents talk to CortexDB. Every API call carries the agent's signed token; every call also declares who the experience is about (the subject) and who had the experience (the observed_actor).

This separation — caller vs. observed_actor vs. subject — is the single most important enterprise concept and the one most teams get wrong on the first try.
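
One way to keep the three identities straight is to model them as separate fields from the start. The sketch below is an illustration only: the Identities class and is_self_report helper are hypothetical, and the caller shown for the alert (agent:oncall_bot) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identities:
    """The three identities carried by one call (hypothetical container)."""
    caller: str          # whose signed token made the request
    observed_actor: str  # who had the experience
    subject: str         # who or what the experience is about


def is_self_report(ids: Identities) -> bool:
    """True when the observed actor is describing themselves."""
    return ids.observed_actor == ids.subject


# Maya tells her bot about her own call: the bot is the caller,
# Maya is both the observed actor and the subject.
maya_note = Identities(
    caller="agent:maya_bot",
    observed_actor="user:maya",
    subject="user:maya",
)

# A monitoring alert about the payments system: three different identities.
# The caller here is assumed for illustration.
alert = Identities(
    caller="agent:oncall_bot",
    observed_actor="service:pagerduty",
    subject="system:payments",
)
```

Collapsing these into a single "user" field is the usual first-try mistake: it makes an alert about a system look like a memory about whoever happened to be on call.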


4. Modeling the enterprise: the scope tree

A scope in CortexDB is a slash-delimited path of type:id segments. It is both the namespace for memories and the unit of authorization. Workspaces are scopes too — there is one primitive, not two.

Here is Initech's scope tree as we'll build it out over the rest of the paper:

org:initech                                  ← the root tenant
├── dept:sales
│   ├── team:enterprise
│   │   ├── user:maya
│   │   ├── user:another_rep
│   │   └── manager:sarah
│   ├── team:smb
│   └── team:partnerships
├── dept:engineering
│   ├── team:platform
│   │   ├── user:diego
│   │   └── ...
│   ├── team:product
│   ├── team:sre
│   └── team:security_eng
├── dept:customer_success
├── dept:marketing
├── dept:finance
├── dept:hr
├── dept:legal
├── dept:security
│
├── ws:initech-acme-account              ← shared workspace: customer account
├── ws:initech-q3-launch                 ← shared workspace: cross-functional project
├── ws:initech-incident-2026-04-12       ← shared workspace: an incident war room
├── ws:initech-product-roadmap-2026
│
├── project:gdpr-readiness-q2
├── project:soc2-renewal-2026
│
└── system:initech-policies              ← internal config (operators only)

A few rules to internalize before going further:

Rule 1: Personal scopes are the default home. Every human and every agent has a personal scope path (org:initech/.../user:maya). The default destination for any experience involving Maya is her personal scope. Nothing else can read it unless explicitly granted access.

Rule 2: Shared work goes in workspaces, not in users' personal scopes. When Maya works the Acme account jointly with a CSM, she does not write to user:maya and hope the CSM can read it. She writes to ws:initech-acme-account, a scope they both have membership in. This is critical — workspaces are how a company's collective knowledge accumulates separately from any one person's notebook.

Rule 3: Reads can walk the tree; writes go in. Sarah (Maya's manager) can ask a question scoped at dept:sales/team:enterprise and the recall pipeline will walk down into the team's member scopes (with view=descend); a view=holistic recall walks up instead. But Sarah does not write to all of them at once: she writes to her own scope, or to a workspace, or back into a team-level scope she owns.

Rule 4: A scope path exists as soon as anything is written to it. No "create database" step. First write auto-registers the scope with members=[caller] and role=owner. Explicit registration (via POST /v1/scopes) is what you do when you want to add more members, set custom consolidation policies, or attach inheritance rules.

Rule 5: Each scope can be GDPR-erased independently, but cross-scope refcounting protects shared work. If Maya leaves Initech and exercises her right to be forgotten, events that lived only in her personal scope are truly deleted. Events of hers that were used to derive workspace-level facts are redacted (content removed, ID preserved) — so the workspace's view of Acme doesn't break.

This tree is not imposed by CortexDB. CortexDB enforces only the grammar (slash-delimited type:id) and a set of reserved type names (org, dept, team, user, agent, ws, project, service, system). The shape of your hierarchy is your design.
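
The grammar is small enough to sketch. The parser below is an assumption-laden illustration of "slash-delimited type:id", not CortexDB's validator; parse_scope and ancestors are hypothetical helper names, and the sketch does not enforce the reserved type names.

```python
def parse_scope(path: str) -> list[tuple[str, str]]:
    """Split a scope path into (type, id) segments, rejecting malformed ones."""
    segments = []
    for raw in path.split("/"):
        type_, sep, id_ = raw.partition(":")
        if not sep or not type_ or not id_:
            raise ValueError(f"segment {raw!r} is not of the form type:id")
        segments.append((type_, id_))
    return segments


def ancestors(path: str) -> list[str]:
    """Return the enclosing scopes of a path, nearest first."""
    parts = path.split("/")
    return ["/".join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]


maya = "org:initech/dept:sales/team:enterprise/user:maya"
segments = parse_scope(maya)
parents = ancestors(maya)
```

Here parents runs from the team scope up to org:initech, which is the chain a read that walks upward would follow.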


5. The four-tier authorization stack

Before any user story, one more primitive — the policy stack that makes the scope tree safe at enterprise scale.

Every API call is evaluated through four tiers of policy:

                     ┌──────────────────────┐
  most narrow ──────▶│  4. Actor token      │  ← what *this token* is allowed
                     │  3. Scope ACL        │  ← who is on this scope's member list
                     │  2. Tenant config    │  ← what Initech itself permits
  most broad  ──────▶│  1. Deployment       │  ← what the CortexDB deployment allows
                     └──────────────────────┘

Each tier can only narrow what the broader tier beneath it allows. None can re-expand. Concretely:

When a request is denied, the response says which tier denied it and which capability would have allowed it. No opaque 403s. That single design choice is what makes enterprise debugging tractable.

Every response also carries an X-Cortex-Policy header showing the decision trail. When something works, you can see why; when it fails, you can see what to ask Lana to grant.
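
The narrowing rule amounts to an intersection evaluated from the broadest tier to the narrowest. The sketch below is a minimal model under stated assumptions: the tier keys, the evaluate function, the result fields, and the experience.write capability name are all hypothetical (scope.read.descend is the one capability name this paper uses), and the real decision trail format may differ.

```python
# Tiers ordered from most broad to most narrow.
TIERS = ["deployment", "tenant", "scope_acl", "actor_token"]


def evaluate(capability: str, grants: dict[str, set[str]]) -> dict:
    """Allow only if every tier grants the capability; report the first denial."""
    trail = []
    for tier in TIERS:
        allowed = capability in grants.get(tier, set())
        trail.append(f"{tier}={'allow' if allowed else 'deny'}")
        if not allowed:
            return {
                "allowed": False,
                "denied_by": tier,
                "missing_capability": capability,
                "trail": trail,
            }
    return {"allowed": True, "denied_by": None, "missing_capability": None, "trail": trail}


grants = {
    "deployment": {"scope.read.descend", "experience.write"},
    "tenant": {"scope.read.descend", "experience.write"},
    "scope_acl": {"scope.read.descend"},
    # The token lists experience.write, but it cannot re-expand
    # what the scope ACL has already narrowed away.
    "actor_token": {"scope.read.descend", "experience.write"},
}

ok = evaluate("scope.read.descend", grants)
denied = evaluate("experience.write", grants)
```

The denied result names both the tier and the missing capability, which is the property that keeps a refusal from being an opaque 403.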


6. User journeys: how each layer of the org uses CortexDB

Now the meat. Five operators, five different ways of using the same database.

6.1 The individual contributor — Maya, account executive

It's Maya's first week. Initech rolled out CortexDB three months ago. Maya doesn't know any of this; she just talks to maya_bot in Slack.

Monday 9:14am. Maya pings her bot:

"Just got off intro call with Priya at Acme. POC starts next week. She mentioned procurement runs slow over there."

maya_bot doesn't dump that into a chat log. It packages it as an experience envelope:

POST /v1/experience
Authorization: Bearer <maya_bot's token>
X-Cortex-Actor: agent:maya_bot

{
  "scope": "org:initech/dept:sales/team:enterprise/user:maya",
  "observed_actor": {"id": "user:maya", "type": "user"},
  "subject":        {"id": "user:maya", "type": "user"},
  "modality": "conversation",
  "content": {
    "kind": "message", "role": "user",
    "text": "Just got off intro call with Priya at Acme. POC starts next week. She mentioned procurement runs slow over there."
  },
  "context": {
    "observed_at": "2026-09-07T13:14:00Z",
    "intent": "deal_intake",
    "labels": ["acme", "intro_call", "poc"]
  },
  "directives": {"extract": ["facts", "entities", "beliefs"]}
}

The API returns 202 Accepted in about 5ms. Behind the scenes the async pipeline runs:

Maya never sees any of this. What she sees is her bot replying "Got it — I'll keep track."
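
For engineers wiring an agent, the call above might be assembled as follows. This is a sketch using only the Python standard library; the host name, the Content-Type header, and the build_experience_request helper are assumptions, and the request is built but deliberately not sent.

```python
import json
import urllib.request

BASE_URL = "https://cortexdb.example.internal"  # hypothetical host


def build_experience_request(token: str, actor: str, envelope: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/experience request."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/experience",
        data=json.dumps(envelope).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "X-Cortex-Actor": actor,
            "Content-Type": "application/json",  # assumed
        },
        method="POST",
    )


envelope = {
    "scope": "org:initech/dept:sales/team:enterprise/user:maya",
    "observed_actor": {"id": "user:maya", "type": "user"},
    "subject": {"id": "user:maya", "type": "user"},
    "modality": "conversation",
    "content": {"kind": "message", "role": "user", "text": "Just got off intro call..."},
    "context": {"observed_at": "2026-09-07T13:14:00Z", "intent": "deal_intake"},
    "directives": {"extract": ["facts", "entities", "beliefs"]},
}

request = build_experience_request("<maya_bot's token>", "agent:maya_bot", envelope)
# Sending would be urllib.request.urlopen(request); per the walkthrough the
# server replies 202 Accepted and does the extraction asynchronously.
```

Because the response is an acknowledgement rather than a result, the bot can reply to Maya immediately and never needs to block on extraction.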

Wednesday. Maya messages: "Acme procurement just sent over their security questionnaire — sending it to Security." That single sentence produces:

Friday afternoon. Maya asks: "What's my open pipeline for Q4?" Her bot calls:

POST /v1/recall
{
  "scope": "org:initech/dept:sales/team:enterprise/user:maya",
  "view": "local",
  "query": "Open Q4 deals and current state of each",
  "include": ["beliefs", "facts", "episodes"],
  "temporal": {"natural": "last 90 days"},
  "budgets": {"max_tokens": 3000}
}

The response is a stratified pack — facts, beliefs, episodes side by side, plus a synthesized prose context block. Maya sees: "You have 5 open Q4 deals. Acme is at security review (likely close: 0.62). NetCore is stalled at procurement (0.31). ..."

Notice what didn't happen: nobody dumped 90 days of chat history at the LLM. The pack is the layered answer — five facts, three beliefs, two episodes — with full citations back to the events that produced them. The LLM has to invent nothing.

Maya's daily questions, all on view=local:

For Maya, CortexDB is invisible. It's just the thing that makes her bot useful even when she comes back from a two-week vacation.

6.2 The middle manager — Sarah, sales director

Sarah manages eight reps including Maya. She does not read Maya's Slack. She does not need to. What she needs is:

She uses one agent, sarah_bot, with a token bound to org:initech/dept:sales/team:enterprise. Her capabilities include scope.read.descend for her team's child scopes.

Monday standup prep. Sarah asks: "What deals slipped this week?"

POST /v1/recall
{
  "scope": "org:initech/dept:sales/team:enterprise",
  "view": "descend",
  "query": "Deals where stage moved backward or stalled in the last 7 days",
  "include": ["facts", "beliefs", "episodes"],
  "temporal": {"natural": "last 7 days"}
}

view=descend is the critical word. The recall pipeline walks down into the team's scopes — user:maya, user:another_rep, etc. — applies the policy filter at each (Sarah is a manager of the team, so she's permitted), aggregates the facts, and returns one stratified pack.

The pack tells Sarah: "Three deals slipped. NetCore (Maya's): stalled at procurement for 9 days (belief: will_close_q4 dropped from 0.61 → 0.31). Vandelay (Tom's): contact stopped responding. Globex (Jenna's): stage reverted from signed to legal_review because of a redline." Each item carries citations, so Sarah can click through and see the underlying events.

The privacy contract. Sarah's recall walked into Maya's scope. Maya knows this is possible — it was explained when she onboarded. The scope policy at team:enterprise says managers can read aggregated team data. Sarah does not, however, get raw event content from Maya's scope unless the per-event policy allows it. What she sees is facts (acme.deal_stage = security_review), beliefs (will_close_q4 = 0.62), and episode summaries — not Maya's actual Slack messages.

If Maya writes something to her scope and tags it private:true, the recall pipeline filters it out before Sarah ever sees it. The default is "manager can read derived; not raw," and an individual can lock down further.
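
The privacy contract can be pictured as a filter applied while the recall descends. The sketch below is illustrative only: the Record shape, the layer names as strings, and descend_for_manager are hypothetical, and real policy evaluation happens per scope and per event inside CortexDB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    scope: str
    layer: str            # "event", "episode", "fact", or "belief"
    text: str
    private: bool = False


# Default contract: a manager reads derived layers, not raw events.
DERIVED_LAYERS = {"fact", "belief", "episode"}


def descend_for_manager(team_scope: str, records: list[Record]) -> list[Record]:
    """Keep derived, non-private records at or beneath the team scope."""
    prefix = team_scope + "/"
    return [
        r for r in records
        if (r.scope == team_scope or r.scope.startswith(prefix))
        and r.layer in DERIVED_LAYERS
        and not r.private
    ]


maya_scope = "org:initech/dept:sales/team:enterprise/user:maya"
records = [
    Record(maya_scope, "fact", "acme.deal_stage = security_review"),
    Record(maya_scope, "event", "raw Slack message"),
    Record(maya_scope, "belief", "note tagged private", private=True),
    Record("org:initech/dept:sales/team:smb/user:someone", "fact", "other team's fact"),
]

visible = descend_for_manager("org:initech/dept:sales/team:enterprise", records)
```

Only the Acme fact survives: the raw event is excluded by layer, the private belief by its tag, and the SMB record by scope.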

Coaching signal. Once a quarter Sarah asks: "What kinds of objections are my reps hearing most often?" The recall traverses her team's scopes and the Understanding layer returns synthesized concepts — concept: procurement_velocity_objection, concept: pricing_compression_objection, concept: security_questionnaire_friction. Each concept lists how many episodes across the team support it, and the trend over time.

Understanding is the layer that makes a manager's job different from an individual's. A rep cares about facts and beliefs about one deal. A manager cares about concepts that span many deals.

Sarah's recurring questions:

6.3 The functional VP — Diego, VP of Engineering

Diego cares about reliability, not deals. His agent (eng_brief_bot) has read access to:

The kinds of questions he asks are different in shape from Sarah's. He doesn't track individual humans — he tracks systems.

Friday afternoon. Diego asks: "What's our top operational risk right now?"

The recall pipeline runs against dept:engineering with view=descend, and CortexDB returns:

This is what Understanding + Beliefs do for an executive: they surface what's changing without anyone having to filter for it. Sarah looked at her team's deals. Diego looks at his org's epistemic state — what's the system getting more or less sure about, and why.

Diego's recurring questions:

6.4 The C-suite — Priya, CTO

Priya does not write queries. She reads a morning briefing produced by exec_brief_bot, which runs a set of saved recalls at scheduled intervals and assembles them into a one-page document.

Her briefing pulls from:

The briefing is structured around the five layers:

  1. Events (volume only): "47,000 experiences captured yesterday. 12% above 30-day rolling avg." — operational pulse.
  2. Episodes (sealed): "Three episodes closed yesterday — deals signed, an incident resolved, a customer renewal." — the day's milestones, with citations.
  3. Facts (deltas): "Notable supersessions: customer:bigco.deal_stage advanced from negotiation to signed; system:payments.error_rate revised from 0.01% to 0.04%." — what changed materially.
  4. Beliefs (high-impact revisions): "belief: q4_revenue_target_met revised from 0.71 to 0.58 (interval widening). belief: payment_subsystem_will_have_p1 rose from 0.42 to 0.71." — the things the company is changing its mind about.
  5. Understanding (concept evolution): "Two new concepts crossed the partial→full threshold this week: concept: enterprise_procurement_friction and concept: payment_retry_storm." — emergent patterns the org has now recognized.

This is the executive use case. Priya is not querying anything; she's reading a layered, cited summary of what her 3,000-person company changed its mind about today. Click any belief → walk back to the facts → walk back to the events → see the raw conversation that triggered the cascade.

When the board asks "why did our Q4 forecast drop two points?" Priya can pull the answer in seconds: a /v1/beliefs/why call walks straight from the revised q4_revenue_target_met belief through the supporting facts, through the supporting episodes, all the way back to the source events — and renders a narrative explanation. That capability does not exist in any incumbent system; it's why CortexDB exists.
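
Conceptually, a "why" walk is a traversal of a provenance graph from a belief down to raw events. The sketch below shows that traversal under stated assumptions: the supported_by mapping, the node identifiers, and the why function are hypothetical and do not reflect the /v1/beliefs/why response format.

```python
def why(node_id: str, supported_by: dict[str, list[str]]) -> list[str]:
    """Depth-first walk from a claim to everything that supports it."""
    trail: list[str] = []
    stack = [node_id]
    seen: set[str] = set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # a shared support is reported once
        seen.add(current)
        trail.append(current)
        # Reverse so supports are visited in their listed order.
        stack.extend(reversed(supported_by.get(current, [])))
    return trail


# Hypothetical provenance edges: belief -> facts -> episodes -> events.
supported_by = {
    "belief:q4_revenue_target_met": ["fact:bigco.deal_stage", "fact:netcore.deal_stage"],
    "fact:bigco.deal_stage": ["episode:bigco_q4"],
    "episode:bigco_q4": ["event:evt_001"],
    "fact:netcore.deal_stage": ["episode:netcore_q4"],
    "episode:netcore_q4": ["event:evt_002", "event:evt_003"],
}

trail = why("belief:q4_revenue_target_met", supported_by)
source_events = [n for n in trail if n.startswith("event:")]
```

The trail starts at the belief and ends at source events, which is the shape of answer the board question needs.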

Priya's standing questions (run daily, surfaced in briefing):

6.5 The compliance officer — Lana, DPO

Lana doesn't write to CortexDB. She audits it. Her token is heavy on audit.* capabilities and light on everything else — she can read the audit log, run /v1/erasures/preview, and inspect any scope's policy, but she cannot read scope content unless the data subject has authorized it.

Day-to-day:

Maya leaves Initech. Lana receives a DSR. She runs the preview:

POST /v1/erasures/preview
{
  "scope": "org:initech/dept:sales/team:enterprise/user:maya",
  "audit_note": "DSR-2027-014: Maya Garcia exercising right to erasure"
}

The preview comes back with a manifest:

Lana sends the manifest to Legal. Legal signs off. Lana executes:

POST /v1/erasures
{
  "scope": "org:initech/dept:sales/team:enterprise/user:maya",
  "from_preview_id": "ervw_01HX_dsr_014",
  "audit_note": "DSR-2027-014: approved by Legal_Counsel_Rivera 2027-01-08"
}

The job runs. The lifecycle stream shows phase-by-phase progress (enumerate → refcount → delete → redact → demote → audit). When it completes, the audit row is permanent — Lana can prove to a regulator exactly what was deleted, when, by whom, with what authorization, even though the data itself is gone.

This is the asymmetry CortexDB makes possible: the content can vanish; the fact-of-vanishing never does.
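
The delete-versus-redact decision from Rule 5 in Section 4 can be sketched as a refcount check. This is a simplified illustration: plan_erasure, the event IDs, and the external_refs mapping are hypothetical, and the real job also runs the demote and audit phases.

```python
def plan_erasure(event_ids_in_scope: list[str], external_refs: dict[str, int]) -> dict[str, list[str]]:
    """Split a scope's events into those to delete and those to redact.

    external_refs counts how many derived records outside the erased scope
    (for example workspace-level facts) still cite each event.
    """
    plan: dict[str, list[str]] = {"delete": [], "redact": []}
    for event_id in event_ids_in_scope:
        if external_refs.get(event_id, 0) > 0:
            # Content is removed but the ID survives, so shared views stay intact.
            plan["redact"].append(event_id)
        else:
            plan["delete"].append(event_id)
    return plan


# Hypothetical inventory for one personal scope.
events = ["evt_a", "evt_b", "evt_c"]
refs = {"evt_b": 2}  # evt_b supports two workspace facts

plan = plan_erasure(events, refs)
```

A preview would report a plan like this before anything is touched, which is what Legal signs off on.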


7. The application: engineering RCA, end-to-end

Now we walk one concrete application through CortexDB at every API call, to show how the abstractions add up in a real system.

The application: RCA-Copilot, an internal engineering tool. When an incident fires, RCA-Copilot helps the on-call engineer:

  1. Construct a timeline of what happened
  2. Surface what the system "already knew" that might be relevant
  3. After resolution, write a structured postmortem
  4. Update beliefs and understanding so the next incident is faster to resolve

Without CortexDB, this is what most companies do: dump Slack logs into a Confluence page, write a postmortem nobody re-reads, repeat. With CortexDB the postmortem is a first-class memory record and the next incident retrieves it automatically.

7.1 The setup

Initech registers a workspace for each incident:

POST /v1/scopes
{
  "path": "ws:initech-incident-2027-01-15-payments",
  "members": [
    {"actor": "user:diego",         "role": "owner"},
    {"actor": "user:on_call_alice", "role": "writer"},
    {"actor": "user:sre_bob",       "role": "writer"},
    {"actor": "agent:oncall_bot",   "role": "writer"},
    {"actor": "agent:rca_copilot",  "role": "writer"}
  ],
  "policies": {
    "consolidation":       "merge_compatible_beliefs",
    "conflict_resolution": "latest_wins_within_confidence_band",
    "inherit_from":        ["dept:engineering/team:sre"],
    "audit":               "full"
  }
}

The inherit_from clause is doing real work — it tells CortexDB that recall queries against this workspace can also pull from the SRE team's accumulated knowledge. This is how Initech makes a fresh war room not start from zero.

7.2 The incident fires

PagerDuty fires. Alice (on-call) joins the war room. Her bot streams the alert payload into the workspace as the first event:

POST /v1/experience
{
  "scope": "ws:initech-incident-2027-01-15-payments",
  "observed_actor": {"id": "service:pagerduty", "type": "service"},
  "subject":        {"id": "system:payments",   "type": "entity"},
  "modality":       "alert",
  "content": {
    "kind": "alert",
    "source": "datadog",
    "text":   "payments-api p99 latency > 800ms for 3m on prod cluster",
    "metadata": {
      "service": "payments-api",
      "metric":  "p99_latency_ms",
      "value":   847,
      "threshold": 800
    }
  },
  "context": {
    "observed_at": "2027-01-15T14:32:00Z",
    "labels":      ["incident", "payments", "latency"]
  }
}

Note that subject is system:payments, not Alice: the alert is about the payments system, not about her. That single line of metadata is what later makes this incident's record discoverable when another payments incident fires.

7.3 RCA-Copilot surfaces what's already known

The moment the alert lands, rca_copilot is triggered. It calls:

POST /v1/recall
{
  "scope":       "ws:initech-incident-2027-01-15-payments",
  "view":        "holistic",
  "query":       "payments-api latency, recent deploys, related incidents, payment_retry_storm concept",
  "include":     ["facts", "beliefs", "episodes", "understanding"],
  "temporal":    {"natural": "last 30 days"},
  "budgets":     {"max_tokens": 4000}
}

view=holistic means: search this workspace plus walk up the inheritance chain (dept:engineering/team:sre and its ancestors). The recall returns:

That entire pack — concept, episodes, facts, beliefs, plus a synthesized prose summary — is what RCA-Copilot dumps into the Slack war room as its first message. Alice sees: "Heads up — this looks structurally similar to the Dec 3 and Dec 19 incidents, both linked to the payment_retry_storm pattern. There was a deploy 45 minutes ago. The system has been ~50% confident the payments service handles burst traffic well; this incident is evidence against."

That's the first concrete value. The on-call engineer is reading the company's institutional memory of payments incidents in the first 30 seconds of the war room, not 30 minutes in.
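
The set of scopes a view=holistic recall consults can be sketched from the workspace's inherit_from policy. The function below is an assumption about ordering (workspace first, then each inherited scope, then that scope's ancestors); holistic_scopes is a hypothetical helper, not part of the API.

```python
def holistic_scopes(workspace: str, inherit_from: list[str]) -> list[str]:
    """List the scopes to search: the workspace, then the inheritance chain."""
    order = [workspace]
    for inherited in inherit_from:
        parts = inherited.split("/")
        # Visit the inherited scope itself first, then walk up its ancestors.
        for i in range(len(parts), 0, -1):
            candidate = "/".join(parts[:i])
            if candidate not in order:
                order.append(candidate)
    return order


scopes = holistic_scopes(
    "ws:initech-incident-2027-01-15-payments",
    ["dept:engineering/team:sre"],
)
```

For the payments war room this yields the workspace, then dept:engineering/team:sre, then dept:engineering, which is how a fresh workspace reaches knowledge it never wrote.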

7.4 During the incident — continuous capture

Alice, Bob, and the bots all keep writing experiences into the workspace as the incident unfolds:

The lifecycle stream is live. Diego watches it on a dashboard. Every event triggers the async pipeline — facts get extracted, the episode chain grows, the belief that "this is caused by the deploy" climbs from 0.4 toward 0.9.

7.5 After resolution — structured postmortem as memory

Once the incident is closed, the team does a postmortem. Traditionally this is a Confluence page that gets read once and never again. Here, it's another set of experiences submitted to the same workspace, with structured intent:

POST /v1/experience
{
  "scope": "ws:initech-incident-2027-01-15-payments",
  "observed_actor": {"id": "user:diego", "type": "user"},
  "subject":        {"id": "system:payments", "type": "entity"},
  "modality":       "postmortem",
  "content": {
    "kind": "structured",
    "fields": {
      "root_cause":        "Retry config max=5 caused thundering herd on Stripe timeouts after deploy 2027-01-15-13:47 changed the retry interval from exponential to fixed",
      "trigger":           "deploy 2027-01-15-13:47",
      "blast_radius":      "27 minutes, all payments customers, ~14k failed transactions",
      "remediation":       "reverted deploy; opened ticket to refactor retry strategy",
      "permanent_fix":     "ENG-4471 — switch payments-api to exponential backoff with jitter",
      "preventive_belief": "exponential backoff with jitter prevents thundering herd on transient upstream failures"
    }
  },
  "context": {
    "observed_at": "2027-01-15T18:00:00Z",
    "labels":      ["postmortem", "incident", "payment_retry_storm"]
  },
  "directives": {
    "extract":           ["facts", "entities", "beliefs"],
    "link_to_episode":   "ep_01HX_incident_2027_01_15_payments",
    "update_concept":    "concept: payment_retry_storm"
  }
}

What CortexDB now does, asynchronously:

  1. Facts get extracted from the structured fields. (system:payments, root_cause_2027_01_15, "retry_config"), etc.
  2. The belief payments.handles_burst_traffic is revised. The postmortem identified the failure mode, so the confidence interval shifts upward now that the understood failure is patched (or scheduled to be).
  3. The episode ep_01HX_incident_2027_01_15_payments is sealed — its ended_at set, its causal chain finalized.
  4. The concept payment_retry_storm gets a new version. It now references three episodes instead of two; its coverage_score climbs from 0.30 to 0.55. A new belief is opened: belief: exponential_backoff_prevents_retry_storms = 0.62 uncertain — the org has a working hypothesis it can now test.
  5. Recommendations queue: because the concept's coverage_score rose past a threshold, RCA-Copilot can now proactively flag any other service with the same retry-config shape as a risk. This is institutional learning expressed as a database mutation.
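
Steps 4 and 5 can be pictured as appending a new concept version and checking whether a threshold was crossed. The sketch below is illustrative: ConceptVersion, the two December episode IDs, the starting version number, and the 0.5 threshold are assumptions, and the new coverage_score is supplied as an input because this paper does not define how it is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptVersion:
    name: str
    version: int
    supported_by: tuple[str, ...]
    coverage_score: float


def add_supporting_episode(concept: ConceptVersion, episode_id: str,
                           new_coverage_score: float) -> ConceptVersion:
    """Return a new version; the previous version is left untouched."""
    return ConceptVersion(
        name=concept.name,
        version=concept.version + 1,
        supported_by=concept.supported_by + (episode_id,),
        coverage_score=new_coverage_score,
    )


def crossed_threshold(previous: ConceptVersion, current: ConceptVersion,
                      threshold: float) -> bool:
    """True only on the update that moves the score across the threshold."""
    return previous.coverage_score < threshold <= current.coverage_score


FLAG_THRESHOLD = 0.5  # assumed value, for illustration

before = ConceptVersion("payment_retry_storm", 1, ("ep_dec_03", "ep_dec_19"), 0.30)
after = add_supporting_episode(before, "ep_01HX_incident_2027_01_15_payments", 0.55)
should_flag = crossed_threshold(before, after, FLAG_THRESHOLD)
```

Keeping versions immutable means the concept's history stays auditable: the two-episode version still exists after the three-episode version is written.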

7.6 Two months later — the test

It's March. A different team, a different service. notifications-api starts misbehaving. The on-call engineer is a new hire, three weeks into the job. They have never seen the payments incident.

RCA-Copilot fires. Its very first recall against the new war-room workspace pulls in the payment_retry_storm concept. The concept's supported_by array now includes the January postmortem. The synthesized prose at the top of the recall says: "This may relate to the payment_retry_storm pattern. The org has seen three prior instances; the working hypothesis is that exponential backoff with jitter prevents it. Check the retry configuration of notifications-api first."

The new engineer skips ten minutes of confused debugging and goes straight to the retry config. The institution's memory just helped someone who didn't know it existed.

This is the value proposition. Not "we store memories." Not "we have a vector database." It's: every postmortem is queryable, citable, and reachable by every future incident — automatically, with provenance, with confidence intervals, with cross-incident concept synthesis. No tool in the incumbent landscape ships this. Confluence has none of it. Mem0 has facts but no concepts. Zep has temporal facts but no Understanding layer. CortexDB's five-layer model is what makes "the company learned something" a database-native operation.

7.7 The cost trail

For completeness, what does this one incident cost in API terms?

Phase | Calls | Approx. cost
----- | ----- | ------------
Alert ingestion + 100 metric pings | /v1/experience/bulk ×3 | trivial: 100 events ≈ a few hundred KB
War-room conversation (300 messages) | /v1/experience ×300 | most expensive: each triggers async extraction (~1¢ each at GPT-4o-mini rates)
Initial RCA-Copilot recall + 20 follow-up recalls | /v1/recall ×21, /v1/answer ×6 | ~5¢ in LLM costs total
Postmortem submission | /v1/experience ×1 (structured) | a few cents (heavier extraction)
Episode sealing + concept update | background, no client call | server-side compute only
Audit rows | automatic | a few KB

Total: low single dollars per incident (roughly $3, dominated by the 300 war-room messages at ~1¢ each), with everything indexed, queryable, and feeding future incidents. If the next incident is resolved 15 minutes faster because of what the system learned, that spend is repaid many times over.
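
The arithmetic behind that total is short enough to write down. In the sketch below the per-message and recall figures come from the table; treating "a few cents" as 3¢ and the trivial rows as zero are assumptions made only so the sum is concrete.

```python
# All amounts in whole cents to avoid floating-point rounding.
costs_cents = {
    "war_room_messages": 300 * 1,  # 300 messages at ~1 cent each
    "recalls_and_answers": 5,      # ~5 cents total
    "postmortem": 3,               # assumption: "a few cents" taken as 3
    "alert_ingestion": 0,          # assumption: "trivial" taken as 0
}

total_cents = sum(costs_cents.values())
print(f"Estimated LLM cost for the incident: ${total_cents / 100:.2f}")
# prints: Estimated LLM cost for the incident: $3.08
```

The war-room conversation accounts for nearly all of it, so batching or sampling chat messages is the first lever if per-incident cost ever matters.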


8. Other applications worth sketching

The same pattern — define a workspace, give it members and inheritance, attach a domain-specific agent — applies to many other enterprise problems. Brief sketches:

Application | Workspace | Primary writer agents | The "after" story
----------- | --------- | --------------------- | -----------------
Customer 360 | ws:initech-customer-{customer_id} | csm_bot, sales_bot, support_bot | Every team that touches a customer sees one accumulating record. New CSM onboarding to an account takes hours not weeks.
Product roadmap memory | ws:initech-product-roadmap-{year} | PM bots, design bots | "Why did we deprioritize feature X in Q1?" Walk the belief trail to the user-research events that revised it.
Hiring decisions | ws:initech-hire-{req_id} | recruiter bots, panelist bots | Every interview signal is an event. Hire/no-hire is a belief with cited supports. Post-hire, the belief is updated with first-90-days outcome data, closing the feedback loop on calibration.
Vendor risk register | ws:initech-vendor-{vendor} | security bots, procurement bots | Every security questionnaire, audit, breach notice is an event. The belief "vendor X is acceptable risk" is continuously revised, with citations. SOC2 audits become a query, not a fire drill.
Sales objection library | dept:sales/concept-store (synthesized) | all rep bots write up the tree | Sarah's coaching example. The Understanding layer surfaces objection patterns no individual rep sees alone.
Internal "ask the company" search | org:initech (descend with strict ACL) | company_search_bot | "Has anyone here worked with vendor Y?" Recall with view=descend over the whole org returns episodes/facts filtered by what the asker is allowed to see. The intranet finally becomes useful.

The pattern in every case is the same:

  1. Pick a workspace scope. Give it the right members.
  2. Identify which agents will write (and what their observed_actor/subject should be).
  3. Wire those agents to call /v1/experience on every meaningful event.
  4. Build the read-side UI around /v1/recall and /v1/answer, surfacing citations.
  5. Let Understanding accumulate over months. The longer the workspace runs, the more valuable it becomes — institutional memory compounds.

9. Rollout playbook

For a real enterprise actually doing this, the order matters. Here is the recommended sequence:

Phase 0 — Operator setup (week 0-1)

Phase 1 — Two pilot teams (weeks 2-6)

Phase 2 — Cross-functional workspaces (weeks 6-12)

Phase 3 — Org-wide rollout (months 4-12)

Phase 4 — Continuous (year 2+)

The pilot phase is where companies most often go wrong. The mistake is trying to design the perfect scope tree before any team has actually used the system. Start with two teams and one workspace; let the tree grow organically; refactor once at month 3. CortexDB makes scope renames safe (the WAL replays cleanly under a renamed scope), so getting the initial shape only mostly right is fine.


10. What changes about how the company thinks

This is the section that's hardest to write but most worth saying out loud.

Three behaviors change at companies that go all-in on a memory layer:

1. Postmortems become valuable. Today, postmortems are written, filed, and forgotten: read by maybe three people, lost in Confluence within six months. When the postmortem is a database mutation that the next incident will surface automatically, writing a good one is finally worth the engineer's time. The expectation is that postmortem quality improves measurably within two quarters of rollout.

2. Beliefs become a unit of management. Today, an executive learns about a confidence shift through a Slack message or a board prep doc — informal, undated, uncited. When belief: q4_revenue_target_met changes from 0.71 to 0.58 with full citation back to three specific deal events, the executive can act earlier and with more justification. The leading indicator becomes machine-readable. Forecasting accuracy climbs.

3. Forgetting becomes routine, not catastrophic. Today, GDPR deletion is a manual scramble. When it's a documented /v1/erasures job with a preview, a manifest, a legal sign-off step, and a permanent audit row, it becomes a 30-minute task instead of a 30-day project. Compliance load drops; willingness to give employees genuine erasure rights goes up.

The bet behind CortexDB is that an enterprise that can track its own knowledge, beliefs, and forgetting — with citations and audit — will eventually outperform one that can't, in the same way that companies with proper double-entry accounting eventually outperformed those that kept ad-hoc ledgers. The accounting analogy is real. CortexDB is the general ledger for what the company knows.


11. Frequently asked enterprise questions

Does it run on-prem? Yes — on_prem_enterprise deployment profile. RocksDB + Tantivy + HNSW are all local; no data leaves your network unless you connect an external LLM provider, which itself can be a local model.

Can we self-host the LLMs? Yes. The embedding, extraction, reranker, and answer LLMs are all reached over HTTP and can be pointed at any OpenAI-compatible local endpoint (vLLM, llama.cpp, etc.).

How does it scale? Cluster mode runs on openraft for consensus, gossip-over-UDP for membership, and gRPC for RPC. WAL is sharded by token ring with hinted handoff. Tested deployment profiles include a 5-node cluster handling ~10k experiences/sec sustained.

What's the cost model? Per-event compute is dominated by embedding + extraction LLM calls (~$0.001-0.01 per event depending on length and model choice). Storage is RocksDB-cheap. A 1,000-person company doing ~10k experiences/day across the org runs in the low hundreds of dollars/month for infrastructure, not counting LLM API costs (roughly $10-100/day at the per-event rates above), which can be reduced substantially with local models.

How does it integrate with our existing stack? SDKs in Python and TypeScript. An MCP server (cortexdb-mcp) exposes the memory operations as MCP tools so any MCP-capable agent (Claude Desktop, custom agents) can use CortexDB without code. Connectors exist for Slack, GitHub, Jira, PagerDuty, Confluence to auto-ingest experiences.

What about backup/restore? Daily full backups + WAL archival = RPO of 5 minutes. PITR (point-in-time restore) is built in: pick a timestamp, the restore replays the WAL forward to that exact moment and rebuilds all derived views. Full procedure in enterprise_runbooks.md.

Vendor lock-in? The export endpoint produces JSONL plus mappings to Mem0, Zep, and Letta formats. A database you cannot leave is a database that owns you. CortexDB is designed to be left.


12. Closing

Five layers. Four authorization tiers. One scope tree, shaped by your org. Three identities per call (caller, observed_actor, subject). Two time axes per record (valid and recorded). One rule above all: every claim has a trail.

Get those right, and you can build everything in this paper. The hard part is not the API — the API is small enough to teach in a day. The hard part is the cultural one: convincing a 3,000-person company that its institutional memory deserves the same care it gives its accounting ledger. The companies that do this in the next five years will own categories. The ones that don't will keep forgetting what they already knew, every time someone leaves the room.

That's the bet. The rest is implementation.


This document is a draft. Feedback and corrections welcome — particularly from operators who have rolled out enterprise memory systems before and know where the rakes are. Address comments to pmalik.