The agent-native knowledge base
Constellation's knowledge base (KB) is the knowledge-base wiki space: the
team's compounding, queryable memory — the decisions, conventions, and
lessons distilled from prior work. The mental-model shift in one sentence:
The wiki stops being a place you put documents and becomes a memory that accretes from work happening elsewhere — coordinator consults, agent sessions, eventually Slack — keeps itself honest (lint, retraction), and is the first thing agents read instead of re-deriving answers.
This page explains what the KB is and how it operates. For the read path
— the query_knowledge_base MCP tool / pt query-kb CLI an agent calls to
ground itself before implementing — see KB retrieval. For
standing up a KB for a new team or tenant, see the
new-tenant setup runbook.
The operating loop
The dashed "qualifying synthesis filed back" edge in the loop is the compounding mechanism:
answers that pass the file-back gates (below) become pages, and the next
question reads those pages instead of re-deriving them. Ask "what is blocked on
initiative X and why" on Monday, and Thursday's related question does not
re-derive the reasoning from raw Project Tracker data — it reads Monday's
synthesis and builds on it. Decision history stops being per-session and becomes
queryable organisational memory.
Who reads it
| Consumer | How it reads |
|---|---|
| Coordinator brain | Every consult_coordinator call reads the tenant's KB index-first, embeds the relevant topic spokes + synthesis pages into the prompt, and records a citations array of which pages informed the answer. |
| Dev agent sessions | Through the query_knowledge_base MCP tool / pt query-kb CLI and the consult-the-kb skill — ground before implementing, cite what you used. |
| Humans | The wiki UI shows provenance and the agent/human authorship distinction, plus "⚠ needs review" annotations from open lint findings. |
| CI PR reviewer | The multi-LLM reviewer can ground a review against the KB from a CI script via the same retrieval seam. |
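The coordinator row above can be sketched roughly as follows. This is a minimal illustration, not the actual reader: the hub, spoke, and page dictionaries, the Grounding type, and read_index_first are hypothetical names, the sample pages are invented, and topic selection is reduced to a keyword match purely to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Grounding:
    context: str                                         # text embedded into the prompt
    citations: list[str] = field(default_factory=list)   # slugs of the pages that were read


def read_index_first(question: str,
                     hub: dict[str, str],
                     spokes: dict[str, list[str]],
                     pages: dict[str, str]) -> Grounding:
    """Read the topics-only hub, open only matching spokes, record what was read.

    hub:    topic name -> spoke id   (hypothetical shape)
    spokes: spoke id   -> page slugs
    pages:  page slug  -> page body
    """
    words = set(question.lower().replace("?", "").split())
    chunks, cited = [], []
    for topic, spoke_id in hub.items():
        if topic.lower() not in words:      # illustrative match, not the real selector
            continue
        for slug in spokes[spoke_id]:       # a spoke is resolved only when its topic matched
            chunks.append(pages[slug])
            cited.append(slug)
    return Grounding(context="\n\n".join(chunks), citations=cited)


# Invented sample data, for illustration only.
hub = {"billing": "spoke-billing", "auth": "spoke-auth"}
spokes = {"spoke-billing": ["billing/retry-policy"], "spoke-auth": ["auth/session-ttl"]}
pages = {"billing/retry-policy": "Retries are capped at three.",
         "auth/session-ttl": "Sessions expire after 12 hours."}

grounding = read_index_first("Why is billing blocked?", hub, spokes, pages)
print(grounding.citations)
```

The point of the shape is that the citations list is a by-product of the read itself, so an answer can only cite pages that were actually loaded into its context.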
The trajectory (PLT-128 / ADR-015)
is that the KB becomes the canonical compounding home for what the team has
learned and decided, read at runtime, while the flat bootstrap files
(.ai/constitution.md, AGENTS.md, .claude/skills, the authored
.ai/decisions / .ai/specs) stay canonical for cold-start and CI-gated
content. See the two knowledge layers
in the agents overview.
Where content comes from — and the file-back gates
Ingestion is deliberately phased. Two sources are live today:
1. Consult file-back (automated, gated)
A coordinator consult on an UNCLASSIFIED initiative may be filed back as a
synthesis page with provenance — but file-back is gated, not
unconditional. Two independent gates apply:
- Durability. A filed-back page is written as either durable or time_bound. Time-bound pages (e.g. "what should X work on next", true only at the moment asked) carry a TTL that a reaper expires, so transient state does not accumulate.
- Quality / dedup. A consult is skipped (no page written) when it has zero citations, merely echoes an existing page, or is a near-duplicate of recent synthesis.
So "every successful consult becomes a page" is not true: only consults that add cited, durable, non-redundant knowledge file back. File-back is also restricted to UNCLASSIFIED initiatives — classified synthesis never enters the compounding loop.
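The gates described above could be expressed as a single decision function. The sketch below is an assumption-laden illustration: the Consult type, file_back_decision, the reason strings, and the 0.9 threshold are hypothetical, and difflib string similarity stands in for whatever echo and near-duplicate check the real pipeline uses.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Consult:
    classification: str      # e.g. "UNCLASSIFIED"
    answer: str
    citations: list[str]
    durability: str          # "durable" or "time_bound"


def file_back_decision(consult: Consult, recent_pages: list[str],
                       near_dup_threshold: float = 0.9) -> str:
    """Return "skip:<reason>" or "file:<durability>" for a finished consult."""
    if consult.classification != "UNCLASSIFIED":
        return "skip:classified"            # classified synthesis never files back
    if not consult.citations:
        return "skip:zero_citations"        # quality gate: nothing grounded the answer
    for page in recent_pages:
        # Assumed similarity measure; the real dedup check is not specified here.
        if SequenceMatcher(None, consult.answer, page).ratio() >= near_dup_threshold:
            return "skip:echo_or_near_duplicate"
    return f"file:{consult.durability}"     # time_bound pages would also get a TTL


existing = ["Use feature flags for every staged rollout."]
fresh = Consult("UNCLASSIFIED", "Postmortems are filed within two days.",
                ["ops/postmortems"], "durable")
echo = Consult("UNCLASSIFIED", existing[0], ["ops/rollouts"], "durable")

print(file_back_decision(fresh, existing))
print(file_back_decision(echo, existing))
```

Ordering the checks from cheapest to most expensive means a classified or uncited consult is rejected before any similarity comparison runs.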
2. Manual / session ingest (human-approved)
An agent or human pastes a source through ingest_source: meeting notes, a
decision thread, a postmortem, an external document. One call produces the raw
source page, a source_summary page with a derived_from provenance edge
pinned to the source revision, fan-out updates to related pages, an index
rebuild, and a log entry — all as one atomically revertible ingest_batch
(revert_ingest_batch undoes the whole thing). approvedBy is mandatory.
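To make the "one revertible batch" idea concrete, here is a toy in-memory model. It is not the real ingest_source or revert_ingest_batch: ToyWiki and the toy_ functions are hypothetical, the summary text and revision label are placeholders, the index rebuild is omitted, and a production system would rely on a database transaction rather than a Python dictionary.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyWiki:
    pages: dict[str, str] = field(default_factory=dict)
    log: list[str] = field(default_factory=list)
    # batch id -> {page key: body before the batch, or None if the page was new}
    batches: dict[str, dict[str, Optional[str]]] = field(default_factory=dict)


def toy_ingest_source(wiki: ToyWiki, slug: str, text: str,
                      approved_by: str, related: tuple[str, ...] = ()) -> str:
    if not approved_by:
        raise ValueError("approvedBy is mandatory")
    before: dict[str, Optional[str]] = {}

    def write(key: str, body: str) -> None:
        before.setdefault(key, wiki.pages.get(key))  # remember the pre-batch state once
        wiki.pages[key] = body

    write(f"source/{slug}", text)                                   # raw source page
    write(f"summary/{slug}", f"derived_from source/{slug}@rev1: {text[:40]}")
    for page in related:                                            # fan-out updates
        write(page, wiki.pages.get(page, "") + f"\nSee summary/{slug}")
    batch_id = str(uuid.uuid4())
    wiki.batches[batch_id] = before
    wiki.log.append(f"ingest_batch {batch_id} approved by {approved_by}")
    return batch_id


def toy_revert_ingest_batch(wiki: ToyWiki, batch_id: str) -> None:
    for key, old_body in wiki.batches.pop(batch_id).items():
        if old_body is None:
            wiki.pages.pop(key, None)   # page did not exist before the batch
        else:
            wiki.pages[key] = old_body  # restore the pre-batch body
    wiki.log.append(f"revert {batch_id}")


wiki = ToyWiki(pages={"topics/billing": "Billing notes"})
batch = toy_ingest_source(wiki, "postmortem-1", "Outage caused by a retry storm.",
                          approved_by="alice", related=("topics/billing",))
pages_after_ingest = len(wiki.pages)
toy_revert_ingest_batch(wiki, batch)
print(pages_after_ingest, wiki.pages)
```

Recording the pre-batch state of every touched page under one batch id is what lets the revert undo the source page, the summary, and the fan-out edits together.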
Connectors (planned). Slack channels and session transcripts will feed the same pipeline via connectors — into a human-approved queue first, with a durable budget counter capping spend. Full per-space autonomy is gated on an infrastructure precondition: the production non-bypass database role (so RLS is actually enforced).
See Adding content to the knowledge base and the new-tenant runbook for the operator's view of ingestion.
What happens to existing wiki pages
Nothing happens to them — the system is opt-in by space and constitutionally non-destructive toward human content:
- The opt-in is creating a space with the knowledge-base slug. The compounding machinery operates only there; the Default space, specs, release notes, and meeting pages are untouched.
- Existing pages are human-owned by default; agents never rewrite human pages. Autonomous lint can only flag, mark stale, and list orphans. Deletes and synthesis-rewrites stay human-gated even in autonomous spaces.
- Existing pages can become inputs: anything already in the wiki can be ingested as a source, producing summary pages with provenance pinned to the source page's revision. That is the migration template for an existing corpus — it is compounded into the KB with provenance, never moved or deleted.
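The rules in this list amount to a small permission check. The following is a hedged sketch only: agent_may, the action names, and the owner labels are hypothetical, and two details are assumptions rather than documented behaviour (flagging is allowed on human-owned pages, and agents may not delete human pages even with approval).

```python
KB_SPACE_SLUG = "knowledge-base"
FLAG_ONLY_ACTIONS = {"flag", "mark_stale", "list_orphans"}
HUMAN_GATED_ACTIONS = {"delete", "rewrite_synthesis"}


def agent_may(action: str, space_slug: str, page_owner: str,
              human_approved: bool = False) -> bool:
    """Decide whether an agent may perform `action` on a page."""
    if space_slug != KB_SPACE_SLUG:
        return False              # the machinery only runs in the opted-in space
    if action in FLAG_ONLY_ACTIONS:
        return True               # non-destructive, allowed autonomously
    if page_owner == "human":
        return False              # agents never rewrite human pages
    if action in HUMAN_GATED_ACTIONS:
        return human_approved     # destructive work needs a human gate
    return False                  # unknown actions are denied by default


print(agent_may("flag", "knowledge-base", "human"))
print(agent_may("delete", "knowledge-base", "agent"))
print(agent_may("delete", "knowledge-base", "agent", human_approved=True))
```

Denying by default keeps any action that is not explicitly listed out of reach of an autonomous run.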
Keeping it honest — the curation cadence
The honesty loop runs on a cost-split cadence:
- Daily: mechanical lint (automated). A Vercel cron runs the cheap, DB-only checks (orphan_page / broken_crossref / provenance_drift) over every agent-owned space and appends a lint space-log entry per space. No human or agent invocation needed.
- Per-release + ad-hoc: LLM judgement pass (agent/human-run). The wiki-curator subagent runs the expensive checks (contradictions, including cross-source; stale claims; missing concept pages; data gaps) and the .ai/lessons.md → KB promotion. This is the token-bearing work and stays a deliberate, gated invocation.
Findings are structured rows in wiki.lint_findings, not prose, so they are
queryable. The curator is strictly non-destructive — flag / mark-stale /
list-orphans only. See Wiki lint & curation.
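As an illustration of why the mechanical checks are cheap, the sketch below computes all three from link and revision data alone and emits structured findings rather than prose. The Finding type, mechanical_lint, the input shapes, and the exact definition of each check are assumptions inferred from the check names, not the real wiki.lint_findings schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    check: str     # "orphan_page" | "broken_crossref" | "provenance_drift"
    page: str
    detail: str


def mechanical_lint(links: dict[str, list[str]],
                    derived_from: dict[str, tuple[str, int]],
                    current_rev: dict[str, int],
                    roots: tuple[str, ...] = ("index",)) -> list[Finding]:
    """links: page -> pages it links to; derived_from: page -> (source, pinned revision)."""
    findings = []
    inbound = {target for targets in links.values() for target in targets}
    for page, targets in links.items():
        if page not in inbound and page not in roots:
            findings.append(Finding("orphan_page", page, "no inbound links"))
        for target in targets:
            if target not in links:
                findings.append(Finding("broken_crossref", page, f"missing target {target}"))
    for page, (source, pinned) in derived_from.items():
        if current_rev.get(source) != pinned:   # the source changed after it was summarised
            findings.append(Finding("provenance_drift", page, f"{source} is no longer at rev {pinned}"))
    return findings


links = {"index": ["a"], "a": ["b", "ghost"], "b": [], "c": []}
findings = mechanical_lint(links, derived_from={"b": ("a", 1)}, current_rev={"a": 2})
for finding in findings:
    print(finding.check, finding.page)
```

Every finding here only flags a page; nothing in the function edits or deletes content, which matches the non-destructive rule above.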
Known costs and accepted trade-offs
- Index-first retrieval costs more tokens than keyword search — measured at ~3.65× after the single-pass navigation fold — but it won the retrieval eval decisively (0.96 answer correctness vs 0.30 for keyword search; citation rate 1.0). The quality lift was judged worth the token cost.
- Embeddings were evaluated and closed NO-GO — no token saving over index-first at the measured scale; the scaffolding is retained as a future-scale re-test harness.
- The hub-and-spoke index (a topics-only root hub + per-topic spokes the reader resolves on demand) is the structural response to keeping the read bounded as the space grows — see KB retrieval.
- File-back is gated (UNCLASSIFIED + durability + quality/dedup) so low-value, echo, and near-duplicate answers stay out of the compounding loop by design.
See also
- KB retrieval: the read-path mechanics (query_knowledge_base, the WikiKbSource seam).
- New-tenant KB setup runbook: stand up an isolated KB for a new team.
- Wiki module and wiki lint & curation: the underlying data model and honesty loop.
- Working with AI agents: the two-knowledge-layer model (flat bootstrap vs. the KB).