Wiki module

Self-hosted specifications and knowledge with an AI-traversable dependency graph. Lets defence and corporate tenants author, link, and search documentation entirely on tenant-owned infrastructure — no third-party SaaS.

  • Source: apps/wiki/
  • Schema: wiki
  • Project Tracker prefix: PLT-* (delivered under the Shared Platform project)
  • Hosting: sub-zone in the multi-zone topology — served at /wiki, rewritten by the Directory shell via WIKI_ZONE_URL. Preview/local deployments serve standalone at /.

Capabilities

  • Spaces — top-level organisational containers. Every page belongs to exactly one space; child pages must be in the same space as their parent. A per-tenant Default space is created lazily on first page creation. See Spaces for full details.
  • Markdown pages with YAML-style frontmatter, organised in a parent/slug hierarchy within a space.
  • Typed links between pages (depends_on, supersedes, references, implements, derived_from) forming a dependency graph.
  • Full-text search, tenant- and classification-scoped via row-level security; filterable by space.
  • Attachments with short-lived signed download URLs (storage provider never disclosed).
  • Per-page classification (UNCLASSIFIED through SECRET) enforced in RLS, not by frontmatter convention.
  • Every mutation writes a universal-audit-log row and publishes a wiki.* domain event (payloads never carry the markdown body, preserving tenant/classification isolation).

Page fields

Beyond the title, slug, body, frontmatter, and classification, a page carries two machine-facing fields used by the agent walking skeleton (PLT-194):

  • summary (string | null) — a machine-readable 1-3 sentence tl;dr triage signal. It lets a consumer (MCP list output, a per-space index page) decide whether a page is relevant without pulling the full body. NULL is valid — human-authored pages typically leave it unset. Accepted on both create and update.
  • is_agent_owned (boolean, default false) — distinguishes agent-owned pages from human-authored ones, so a later automated pass knows which pages are safe to overwrite. Accepted on create only and surfaced read-only; its write-gate enforcement is deferred. Human-authored pages keep the false default.

Recognized page types

page_type is a free-form TEXT column — it is not constrained to an enum. The platform recognizes a documented vocabulary by convention (KNOWN_PAGE_TYPES in apps/wiki/src/lib/schemas/page.schema.ts), but any non-empty string up to 64 characters is accepted:

  • Existing values: spec, page, doc, runbook.
  • Walking-skeleton values: source_summary (an agent's digest of an external source), index (the per-space root navigation hub), index_spoke (a per-topic catalog spoke, PLT-281 — many per space), and log (records ingest events chronologically).
  • Ingest values: source (the raw, pasted external source text — the provenance anchor for a source_summary's derived_from link), maintenance_schema (an optional per-space contract page read at ingest time to load any space-level instructions; absence is advisory-only, not fatal).
  • Fan-out values (PLT-203): entity (a named domain concept or actor), concept (a higher-level abstraction, pattern, or principle), synthesis (a cross-cutting page that compiles relationships or comparisons across entities/concepts). These are the only pageType values the ingest fan-out create op accepts — the FanoutPageSchema narrows to exactly these three while the generic CreatePageSchema.pageType remains free-form.
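The acceptance rules above can be sketched in plain TypeScript. This is only an illustration: the real schemas live in page.schema.ts and may be written differently, the vocabulary below is assembled from the list above rather than copied from KNOWN_PAGE_TYPES, and the helper names (isValidPageType, isKnownPageType, isFanoutPageType) are hypothetical.

```typescript
// Vocabulary assembled from the documented list; the real KNOWN_PAGE_TYPES
// constant may differ in order or contents.
const KNOWN_PAGE_TYPES: readonly string[] = [
  "spec", "page", "doc", "runbook",
  "source_summary", "index", "index_spoke", "log",
  "source", "maintenance_schema",
  "entity", "concept", "synthesis",
];

// The only values the ingest fan-out create op accepts.
const FANOUT_PAGE_TYPES: readonly string[] = ["entity", "concept", "synthesis"];

// Generic rule: any non-empty string of at most 64 characters is accepted.
function isValidPageType(value: string): boolean {
  return value.length > 0 && value.length <= 64;
}

// Convention only: an unknown type is still valid, just not documented.
function isKnownPageType(value: string): boolean {
  return KNOWN_PAGE_TYPES.indexOf(value) !== -1;
}

// Narrow rule applied to fan-out create ops.
function isFanoutPageType(value: string): boolean {
  return FANOUT_PAGE_TYPES.indexOf(value) !== -1;
}
```

The split mirrors the text: the generic check stays free-form, while the fan-out check is a strict three-value allow-list.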

Pages can be linked with typed edges stored in wiki.page_links. Five link types are supported:

| link_type | Meaning |
| --- | --- |
| depends_on | The source page depends on the target page. |
| supersedes | The source page replaces / obsoletes the target page. |
| references | The source page cites the target page. |
| implements | The source page implements the spec in the target page. |
| derived_from | The source page is derived (summarised, distilled) from a specific revision of the target page. |

The derived_from link type is the PLT-197 provenance edge. It differs from the others in two ways:

  1. It requires a source_revision_num — a positive integer that records which revision of the target page the summary was distilled from. This lets a reader detect when the source has since been updated.
  2. It is validated at the service layer: LinkService.create asserts that the referenced revision exists before inserting the link.

The source_revision_num column is enforced by a DB-level CHECK constraint: it must be non-null exactly when link_type = 'derived_from', and null otherwise.
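As a rough illustration of that rule, the sketch below mirrors the CHECK constraint and the positive-integer requirement in application code. The LinkInput shape and the validateLink helper are hypothetical, and the service-layer check that the referenced revision actually exists is not modelled here.

```typescript
type LinkType =
  | "depends_on"
  | "supersedes"
  | "references"
  | "implements"
  | "derived_from";

// Hypothetical input shape; field names are assumptions for this sketch.
interface LinkInput {
  linkType: LinkType;
  sourceRevisionNum: number | null;
}

// Returns an error message, or null when the link satisfies the rule:
// source_revision_num is non-null exactly when link_type = 'derived_from'.
function validateLink(link: LinkInput): string | null {
  const rev = link.sourceRevisionNum;
  const isDerived = link.linkType === "derived_from";
  if (isDerived && rev === null) {
    return "derived_from requires source_revision_num";
  }
  if (!isDerived && rev !== null) {
    return "source_revision_num must be null for " + link.linkType;
  }
  if (rev !== null && (rev < 1 || Math.floor(rev) !== rev)) {
    return "source_revision_num must be a positive integer";
  }
  return null;
}
```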

R2 guard — source_summary pages must carry provenance

A source_summary page must carry an outgoing derived_from link at all times. This invariant is enforced in PageService:

  • Create: a POST /api/pages (or create_page MCP call) with pageType: 'source_summary' and no derivedFrom field returns 400.
  • Update: a PATCH /api/pages/:id that would result in a source_summary page (via pageType conversion) but the page has no existing derived_from link also returns 400.

The check runs in PageService.create/update — not just in the ingest tool — so the constraint holds for every write path.
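A minimal sketch of the R2 invariant, assuming the guard only needs the page type that results from the write and the number of outgoing derived_from links the page will have afterwards. The ResultingPage shape and checkR2Guard name are hypothetical and not taken from PageService.

```typescript
// State of the page as it would exist after the write (hypothetical shape).
interface ResultingPage {
  pageType: string;
  // On create: 1 if the request carries derivedFrom, else 0.
  // On update: the number of derived_from links the page already has.
  outgoingDerivedFromCount: number;
}

// Returns 400 when the write would leave a source_summary page without
// provenance, or null when the write may proceed.
function checkR2Guard(page: ResultingPage): 400 | null {
  if (page.pageType !== "source_summary") {
    return null; // the invariant only applies to source_summary pages
  }
  return page.outgoingDerivedFromCount > 0 ? null : 400;
}
```

Evaluating the resulting state, rather than the request alone, is what lets one check cover both the create path and a pageType conversion on update.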

Index & log pages

Each space has a reserved, agent-maintained index hub + log page (PLT-196), identified by a deterministic slug convention and capped at one of each per space by a unique partial index on wiki.pages (tenant_id, space_id), plus a set of per-topic index spoke pages:

  • index page (slug <space-slug>-index, page_type = 'index') — since PLT-281 a topics-only hub: a ## Topics table with one compact row per topic (its label, the slug of its spoke page, and a page count). The hub's rendered size grows with the topic count, not the page count — adding pages to an existing topic grows that topic's spoke, never the hub — so the hub stays small enough to inject into a coordinator consult regardless of how large the KB grows. An agent reads this first to learn what topics a space holds, then resolves the relevant spoke(s) by slug. (Before PLT-281 the index page was a single flat catalog of every page; that scaled with the page count and is the problem the hub-and-spoke split fixes. Cells are still clamped — Title ≤ 80, Summary ≤ 72 at render time, PLT-279.)
  • index_spoke pages (slug <space-slug>-spoke-<digest>, page_type = 'index_spoke', PLT-281) — one per topic, holding that topic's detailed ## Read first + ## Provenance catalog rows (the same rows that used to live in the flat index, now partitioned by topic). Many per space (not a single reserved slot). Reconciled on every rebuild: new topics get a spoke, changed spokes are rewritten (an unchanged spoke is a no-op), and a spoke whose topic no longer exists is emptied in place — kept alive at its deterministic slug rather than soft-deleted, because the page-slug UNIQUE constraint is not partial on deleted_at, so a tombstone would reserve the slug and a later rebuild for a returning topic would hit a unique violation (a hard delete is not available — the append-only revision trigger rejects DELETE). Like the hub they are is_agent_owned, UNCLASSIFIED, and excluded from the orphan-page lint.
  • log page (slug <space-slug>-log, page_type = 'log') — an append-only chronology of ingest / maintenance events. Each entry is prefixed ## [YYYY-MM-DD] <action> | <title> (UTC date) so the log stays grep-able.
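The slug conventions and the log entry prefix described above can be sketched as small pure functions. The function names are hypothetical, and the spoke digest is taken as an input because its derivation is not described here.

```typescript
// Reserved slugs follow a deterministic convention based on the space slug.
function indexSlug(spaceSlug: string): string {
  return spaceSlug + "-index";
}

function logSlug(spaceSlug: string): string {
  return spaceSlug + "-log";
}

// The digest is supplied by the caller; how it is computed from the topic
// is an implementation detail this sketch does not assume.
function spokeSlug(spaceSlug: string, digest: string): string {
  return spaceSlug + "-spoke-" + digest;
}

// Builds the "## [YYYY-MM-DD] <action> | <title>" prefix using the UTC date.
function logEntryHeading(when: Date, action: string, title: string): string {
  const utcDate = when.toISOString().slice(0, 10); // ISO strings are always UTC
  return "## [" + utcDate + "] " + action + " | " + title;
}
```

For example, `logEntryHeading(new Date(Date.UTC(2025, 0, 5)), "ingest", "RFC 9110")` yields `## [2025-01-05] ingest | RFC 9110`.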

The rebuild collects content pages in paginated batches, up to a 2,000-page backstop that logs loudly if exceeded. The reserved index/log/index_spoke navigation pages are excluded in SQL, so the freshly written spokes (which sort first by updated_at) never crowd out real content. The coordinator KB reader (queryKnowledgeBase in @constellation-platform/coordinator) reads the hub and fetches only the question-relevant spokes via the pages list API's slugs filter (GET /api/pages?...&slugs=a,b,c, comma-separated, max 50). This is an O(selected) read, so per-consult token cost stops scaling with the size of the KB.
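A caller that needs more than 50 spokes has to split the request. The sketch below builds only the slugs query parameter, leaving the other query parameters of the list API untouched; splitting into several requests is an assumption of this example, not documented behaviour of queryKnowledgeBase, and buildSlugsParams is a hypothetical name.

```typescript
const MAX_SLUGS_PER_REQUEST = 50; // documented cap on the slugs filter

// Returns one "slugs=a,b,c" parameter per request, each with at most 50 slugs.
function buildSlugsParams(slugs: string[]): string[] {
  const params: string[] = [];
  for (let i = 0; i < slugs.length; i += MAX_SLUGS_PER_REQUEST) {
    const batch = slugs.slice(i, i + MAX_SLUGS_PER_REQUEST);
    const encoded: string[] = [];
    for (const slug of batch) {
      // Encode each slug separately so the comma separators stay literal.
      encoded.push(encodeURIComponent(slug));
    }
    params.push("slugs=" + encoded.join(","));
  }
  return params;
}
```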

The index hub and the log page are both created with is_agent_owned = true, are UNCLASSIFIED, and are kept up to date by two write-gated endpoints (and the matching MCP tools below). A rebuild is a full, idempotent regeneration from the space's current pages; a log append is read-concat-write. Soft-deleting either page frees the per-space slot, so a later rebuild can recreate it.

  • POST /api/spaces/:spaceId/index/rebuild — regenerate the space's index page. Returns { data: { indexPageId } }.
  • POST /api/spaces/:spaceId/log/append — append one { action, title, detail? } entry to the space's log page. Returns { data: { logPageId } }.

Both routes require the wiki:page:write permission and are tenant-scoped: a caller can only rebuild / append within a space their tenant owns.

Ingest loop (PLT-197 + PLT-203 fan-out)

The ingest loop is the mechanism that turns an external source into compounding knowledge inside the wiki. It is human-approved only (R7): there is no autonomous trigger and approvedBy is required and non-empty.

What an ingest does

A single ingest call (one source, one summary, optional fan-out):

  1. Acquires a per-space serialization lock so concurrent ingests into the same space run strictly one-at-a-time (transaction-scoped advisory lock, released on commit/rollback — the lock makes the entire fan-out + index/log rebuild the critical section).
  2. Reads the space's maintenance_schema contract page if one exists (advisory — absence is not fatal; the consulted revision number is recorded in the log entry).
  3. Creates a source page (is_agent_owned: true, body_md = the pasted source text). This is the provenance anchor.
  4. Creates a source_summary page (is_agent_owned: true, derivedFrom: { sourcePageId, sourceRevisionNum: 1 }). The R2 guard in PageService.create validates the edge.
  5. (PLT-203 fan-out) For each entry in derivedPages[]:
    • op: 'create' — creates a new entity/concept/synthesis page with a derived_from provenance edge to the source at revision 1. Both the revision and the link are batch-stamped.
    • op: 'update' — writes a batch-stamped revision to the target page (updated body + optional summary) and adds a batch-stamped derived_from edge from that page to the source. This lets a pre-existing synthesis page accumulate provenance from multiple ingests.
  6. Rebuilds the space's index page (includes the new derived pages).
  7. Appends an ingest entry to the space's log page (detail records the fan-out count).

All writes happen inside a single withTenantContext transaction and are stamped with a shared ingest_batch_id. This makes the ingest fully reversible (see below).

Per-space ingest policy

Every space carries an ingest_policy JSONB column (default: {"autonomy":"human_approved","maxPagesPerIngest":15}) that governs fan-out behaviour:

| Field | Type | Default | Meaning |
| --- | --- | --- | --- |
| autonomy | 'human_approved' or 'autonomous' | 'human_approved' | 'autonomous' is deferred until PLT-214 (non-bypass soft-delete). Attempting autonomous ingest is rejected 400. |
| maxPagesPerIngest | integer 1–50 | 15 | Maximum number of fan-out pages (derivedPages) in one ingest. Over-budget requests are rejected 400 before any write. |

The policy is configurable via the PATCH /api/spaces/:spaceId endpoint (ingestPolicy field). The source_summary page does not count toward the budget — only the entries in derivedPages[] do.
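The policy checks can be sketched as a pre-write validation step. This is an illustration only: the IngestRequest shape is reduced to the two fields the rules mention, and precheckIngest and isValidPolicy are hypothetical names.

```typescript
type Autonomy = "human_approved" | "autonomous";

interface IngestPolicy {
  autonomy: Autonomy;
  maxPagesPerIngest: number;
}

// Documented column default.
const DEFAULT_INGEST_POLICY: IngestPolicy = {
  autonomy: "human_approved",
  maxPagesPerIngest: 15,
};

// Reduced request shape for this sketch.
interface IngestRequest {
  approvedBy: string;
  derivedPages: unknown[]; // the source_summary page is not part of this list
}

// maxPagesPerIngest must be an integer between 1 and 50.
function isValidPolicy(policy: IngestPolicy): boolean {
  const n = policy.maxPagesPerIngest;
  return Math.floor(n) === n && n >= 1 && n <= 50;
}

// Returns the reason for a 400, or null if the ingest may start writing.
function precheckIngest(policy: IngestPolicy, req: IngestRequest): string | null {
  if (policy.autonomy !== "human_approved") {
    return "autonomous ingest is deferred";
  }
  if (req.approvedBy.length === 0) {
    return "approvedBy is required";
  }
  if (req.derivedPages.length > policy.maxPagesPerIngest) {
    return "fan-out exceeds maxPagesPerIngest";
  }
  return null;
}
```

Running every check before the first write matches the documented behaviour that over-budget requests are rejected before anything is persisted.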

Classification gate

Fan-out pages are created at UNCLASSIFIED (the default). The tool layer pins the RLS context to UNCLASSIFIED when ingesting, so a fan-out cannot write a classified page and cannot read classified pages into synthesis bodies. Writes above UNCLASSIFIED by the ingest path are blocked until a separate review gate is implemented (locked decision (d) from PLT-194).

ingest_batches table

wiki.ingest_batches records each human-approved ingest:

| Column | Type | Notes |
| --- | --- | --- |
| id | UUID PK | The batch id stamped on every revision written by this ingest. |
| tenant_id | UUID | FK to identity.tenants; RLS-enforced. |
| space_id | UUID | Composite FK to wiki.spaces(tenant_id, id). |
| source_ref | TEXT | Human-readable label for the source (e.g. "RFC 9110 §4"). |
| agent_principal | TEXT | The sub claim of the caller who ran the ingest. |
| approved_by | TEXT | The human who approved the ingest (R7 gate). |
| status | TEXT | 'committed' or 'reverted'. |
| created_at | TIMESTAMPTZ | When the batch was committed. |
| reverted_at | TIMESTAMPTZ | When the batch was reverted (null if committed). |

The table has FORCE ROW LEVEL SECURITY and a combined USING/WITH CHECK policy on tenant_id.

Reverting an ingest

POST /api/ingest-batches/:batchId/revert reverses a committed ingest atomically:

  • Link removal (PLT-203): Before processing any page, LinkRepository.deleteByBatch removes every page_links row stamped with the batch id. This covers both (a) provenance edges from batch-created pages and (b) edges merged onto pre-existing synthesis pages during the fan-out — removing them all in one idempotent pass.
  • For each page the batch created (earliest batch-stamped revision is revision_num = 1): frees its slug, then soft-deletes the page.
  • For each page the batch modified (earliest batch-stamped revision is revision_num > 1): writes a compensating revision restoring the pre-ingest body.
  • The index and log pages written by the ingest are also batch-stamped, so they are restored by the same mechanism.
  • The batch row is flipped to status = 'reverted'.
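The created-versus-modified decision above depends only on the earliest batch-stamped revision of each page. A minimal sketch, assuming the batch's revisions are available as (pageId, revisionNum) pairs; the type and function names are hypothetical.

```typescript
// One revision row stamped with the batch id (reduced shape).
interface BatchRevision {
  pageId: string;
  revisionNum: number;
}

type RevertAction = "free_slug_and_soft_delete" | "write_compensating_revision";

// Decides, per page, how the revert should undo the batch's effect.
function planRevert(batchRevisions: BatchRevision[]): Record<string, RevertAction> {
  // Earliest batch-stamped revision number seen for each page.
  const earliest: Record<string, number> = {};
  for (const rev of batchRevisions) {
    const seen = earliest[rev.pageId];
    if (seen === undefined || rev.revisionNum < seen) {
      earliest[rev.pageId] = rev.revisionNum;
    }
  }

  const plan: Record<string, RevertAction> = {};
  for (const pageId of Object.keys(earliest)) {
    // Revision 1 means the batch created the page; anything later means
    // the page pre-existed and was only modified.
    plan[pageId] = earliest[pageId] === 1
      ? "free_slug_and_soft_delete"
      : "write_compensating_revision";
  }
  return plan;
}
```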

Slug reclaim (PLT-300). The slug unique index pages_tenant_space_parent_slug_uk has no WHERE deleted_at IS NULL predicate (intentionally, since it backs the disclosure-safe re-home check), so an ordinary soft-deleted page keeps its slug reserved. Reverting a batch therefore renames each batch-created page's slug to a unique reverted-<uuid> tombstone (built from a fresh random UUID) while the page is still live, then soft-deletes the page; the rename is what frees the original slug. The result: the curation pattern "revert the old ingest batch, then ingest_source fresh at the same slug" works in a single step, with no manual purge and no need to choose new slugs. This freeing is scoped to the discarded batch and does not change ordinary trash soft-delete, which keeps a deleted page's slug reserved on purpose. Re-ingesting at a slug still held by an unrelated soft-deleted page (e.g. one trashed via the page delete route) returns a clean 409 Conflict, never a raw 500.

Unlike ingest (pinned to UNCLASSIFIED for index-rebuild safety), the revert runs at the caller's clearance — it does not rebuild the index, so there is no classified-content leak, and a sufficiently-cleared operator can free + soft-delete a batch page that a later PATCH reclassified above UNCLASSIFIED. If a batch-created page cannot be removed at the caller's clearance, the revert fails with a 409 (and the batch stays committed) rather than reporting success while the page and its reserved slug silently survive.

Both the ingest and the revert emit auditCritical entries (ingest.batch.committed / ingest.batch.reverted) inside the transaction so the security-relevant lifecycle is durably recorded via the outbox.

Retracting a source (PLT-204)

revert_ingest_batch undoes a recent ingest; source retraction is the compliance path for a source whose batch must otherwise stand but whose content must go away — a GDPR erasure request, a redacted transcript, a reverted PR, a released legal hold. POST /api/pages/:id/retract (body: { reason, approvedBy } — a named human approver is required, mirroring the ingest R7 gate) runs in one transaction:

  • Destroys the source content at rest — irreversibly, unlike soft-delete: the page row's body/title/summary/frontmatter and every historical page_revisions row are overwritten with a fixed redaction sentinel (the generated search_tsv recomputes, emptying the search index), retracted_at is stamped, and the page is soft-deleted. Retracted pages are excluded from the trash listing and can never be restored. The scrub goes through a narrow, GUC-gated exception in the append-only revisions trigger; DELETE remains unconditionally rejected.
  • Walks the full transitive derived_from closure (SECURITY DEFINER, so derivers above the caller's clearance cannot escape the walk) and flags every live derived page with an open lost_source finding in the unified wiki.lint_findings store (PLT-255) — structured, queryable needs-review state (detail_json carries the retracted page id, the reason, and the traversal depth; the finding is anchored to the affected page's own space). A page "needs review" iff it has at least one open finding — one query across lost-source and lint findings alike; findings are resolved (via the lint surface), never erased.
  • Strips every link touching the source (all link types, both directions) strictly before the soft-delete, so no RLS-invisible orphan rows can exist.
  • Re-types sole-provenance source_summary pages to synthesis so the R2 guard cannot make them permanently un-editable after their only source disappears.
  • Surfaces the degraded confidence: the rebuilt space index prefixes flagged pages' Summary cells with ⚠ needs review (<check types>) — covering lost-source and open lint findings, which the coordinator read path (PLT-199) renders into consult prompts with no consumer-side change — and GET /api/pages/:id responses include openFindings (the unified lint finding shape, every open finding type for the page). Because the derived_from closure can cross spaces, every other visible space holding affected pages gets its index rebuilt too (findings are anchored to the affected page's own space). A retract entry is appended to the space log, and an auditCritical entry (source.retracted) records the reason, approver, affected/re-typed page ids, and scrub counts.

Retraction is idempotent at the source level: retracting an already-retracted source returns 409; a non-source page returns 400. A source reclassified above UNCLASSIFIED remains retractable by a sufficiently cleared caller — the retraction functions gate on the caller's own clearance, not the UNCLASSIFIED session the index/log writes run under — and the log entry then redacts the classified title to (classified source <id>) (the real title is kept only in the privileged audit context); an under-cleared caller gets 404 with no existence oracle. A source concurrently moved to another space mid-retraction aborts with 409 (the transaction rolls back; retry). Derived pages are not rewritten (v1 is deterministic and LLM-free) — they may still quote the source, which is exactly what the needs-review queue puts in front of a human. The retract_source MCP tool exposes the same capability over the REST API.
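The closure walk can be pictured as a breadth-first traversal over derived_from edges, followed in reverse from the retracted source to its derivers, recording the depth at which each page is reached. The sketch below runs in memory over a hypothetical edge list; the real walk runs in the database under SECURITY DEFINER, and the names here are assumptions.

```typescript
// A derived_from link: deriverId is derived from sourceId.
interface DerivedFromEdge {
  deriverId: string;
  sourceId: string;
}

// Maps every transitively derived page to its traversal depth
// (1 = derived directly from the retracted source).
function derivedClosure(
  edges: DerivedFromEdge[],
  retractedId: string
): Record<string, number> {
  const depthByPage: Record<string, number> = {};
  let frontier: string[] = [retractedId];
  let depth = 0;
  while (frontier.length > 0) {
    depth += 1;
    const next: string[] = [];
    for (const current of frontier) {
      for (const edge of edges) {
        if (edge.sourceId !== current) continue;
        const id = edge.deriverId;
        // Skip the source itself and pages already reached at a smaller depth.
        if (id === retractedId || depthByPage[id] !== undefined) continue;
        depthByPage[id] = depth;
        next.push(id);
      }
    }
    frontier = next;
  }
  return depthByPage;
}
```

Each page in the result would receive one lost_source finding, with the recorded depth going into detail_json.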

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /api/spaces/:spaceId/ingest | Execute a human-approved ingest (body: IngestSourceSchema). Returns 201 with { batchId, sourcePage, summaryPage, derivedPageIds, indexPageId, logPageId }. Over-budget fan-outs return 400. |
| POST | /api/ingest-batches/:batchId/revert | Atomically revert a committed ingest. Returns 200 with { batch } (status = 'reverted') or 409 if already reverted. |
| POST | /api/pages/:id/retract | Retract a source page (body: { reason, approvedBy }). Destroys its content at rest and flags all derived pages needs_review. 200 with the blast radius; 400 non-source; 409 already retracted. |
| PATCH | /api/spaces/:spaceId | Update space including ingestPolicy ({ autonomy, maxPagesPerIngest }). Returns 200 with the updated space. |

The ingest, revert, and retract endpoints require the wiki:page:write permission; PATCH /api/spaces/:spaceId is a space-administration endpoint and requires wiki:spaces:admin (via assertCanAdminSpaces). All are wrapped with authedRouteWithParams.

Revisions & history

Every save appends an immutable row to wiki.page_revisions (UPDATE/DELETE are rejected by a trigger). The reading view exposes a History panel that lists revisions (author, timestamp, edit summary) and renders a server-computed side-by-side diff between any two — the diff is built in the API route, never in the browser, so an under-cleared caller can never diff a body they cannot otherwise read.

  • GET /api/pages/:id/revisions — revision metadata for a page.
  • GET /api/pages/:id/revisions/:n — a single revision (full body + frontmatter).
  • GET /api/pages/:id/revisions/diff?from=&to= — structured side-by-side line diff between two revisions.

Restore does not mutate history: it writes the chosen revision's content back as a new revision (with an auto edit summary), preserving the append-only invariant. An optional edit summary field on save is persisted to wiki.page_revisions.edit_summary.
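Restore-as-append can be illustrated with an in-memory revision list. This is a sketch under assumptions: the Revision shape is reduced to three fields, and the wording of the automatic edit summary is invented for the example.

```typescript
interface Revision {
  revisionNum: number;
  body: string;
  editSummary: string | null;
}

// Returns a new history with the target revision's content appended as the
// newest revision; existing entries are never modified or removed.
function restoreRevision(history: Revision[], targetNum: number): Revision[] {
  let target: Revision | null = null;
  for (const rev of history) {
    if (rev.revisionNum === targetNum) target = rev;
  }
  if (target === null) {
    throw new Error("revision " + targetNum + " not found");
  }
  const latest = history[history.length - 1];
  const restored: Revision = {
    revisionNum: latest.revisionNum + 1,
    body: target.body,
    // Hypothetical wording for the auto edit summary.
    editSummary: "Restore revision " + targetNum,
  };
  return history.concat([restored]);
}
```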

Page hierarchy & editing

Pages form a parent/child tree (pages.parent_id). The hierarchy can be reorganised two ways:

  • From the editor — a searchable parent picker re-parents the page (or detaches it to the top level).
  • From the left-rail tree — drag a page onto another to re-parent it, or onto the "move to top level" zone to detach it.

Re-parenting is validated server-side: a page cannot become its own parent, and a move that would create a cycle (a page under its own descendant) is rejected via the wiki.is_page_in_ancestor_chain SECURITY DEFINER check — including cycles through an ancestor the caller cannot see under RLS. The UI surfaces the rejection inline; it never produces a 500.
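The cycle rule amounts to walking the ancestor chain of the proposed parent and rejecting the move if the page being moved appears in it. The sketch below does this over an in-memory parent map; it is not the wiki.is_page_in_ancestor_chain implementation, which runs in the database, and the names are hypothetical.

```typescript
// Maps each page id to its parent id, or null for a top-level page.
type ParentMap = Record<string, string | null>;

// True if making newParentId the parent of pageId would put the page under
// itself or under one of its own descendants.
function wouldCreateCycle(
  parents: ParentMap,
  pageId: string,
  newParentId: string | null
): boolean {
  let cursor: string | null = newParentId;
  let steps = 0;
  // Bound the walk so corrupt data containing a cycle cannot loop forever.
  const maxSteps = Object.keys(parents).length + 1;
  while (cursor !== null && steps <= maxSteps) {
    if (cursor === pageId) return true;
    const parent = parents[cursor];
    cursor = parent === undefined ? null : parent;
    steps += 1;
  }
  return false;
}
```

Detaching a page to the top level (newParentId = null) can never create a cycle, so the walk returns false immediately.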

Agent access (MCP)

The wiki is a first-class participant in the consolidated constellation MCP server. When WIKI_BASE_URL is configured, 24 wiki tools are registered:

  • Spaces: list_spaces, get_space, create_space, update_space, delete_space, decommission_space
  • Pages: search_pages, get_page, list_child_pages, create_page, update_page
  • Links & dependencies: list_page_links, link_pages, traverse_dependencies
  • Attachments: get_attachment
  • Maintenance: rebuild_space_index, append_space_log
  • Lint (honesty loop): lint_space, list_lint_findings, record_lint_finding, resolve_lint_finding
  • Source ingestion: ingest_source, retract_source, revert_ingest_batch

traverse_dependencies walks the depends_on graph breadth-first with a depth cap, so an agent can answer "what does X transitively depend on?" in a single call. rebuild_space_index and append_space_log maintain the per-space index & log pages. decommission_space is the human-confirmed bulk teardown of a populated space — see Spaces; the lint tools drive the honesty loop. Space tools accept spaceId (UUID payload field) and spaceSlug (slug-path resolution context) — see Spaces for the distinction.
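A depth-capped breadth-first walk of the kind traverse_dependencies performs might look like the sketch below. It operates on an in-memory adjacency map rather than the REST API, and the graph shape, function name, and return format are assumptions made for illustration.

```typescript
// Maps a page id to the ids of the pages it depends on (depends_on edges).
type DependsOnGraph = Record<string, string[]>;

// Returns the transitive dependencies of startId in breadth-first order,
// following at most maxDepth hops.
function traverseDependencies(
  graph: DependsOnGraph,
  startId: string,
  maxDepth: number
): string[] {
  const visited: Record<string, boolean> = {};
  visited[startId] = true;
  const order: string[] = [];
  let frontier: string[] = [startId];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      const deps = graph[id] || [];
      for (const dep of deps) {
        if (visited[dep] === true) continue; // also protects against cycles
        visited[dep] = true;
        order.push(dep);
        next.push(dep);
      }
    }
    frontier = next;
  }
  return order;
}
```

The depth cap bounds the work per call even on a large or cyclic graph, at the cost of possibly truncating very deep dependency chains.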

Module-boundary rule. The MCP server has no database access — wiki tools reach the module exclusively over its REST API using the caller's Directory-issued token (valid across zones). A CI gate (scripts/check-wiki-mcp-boundary.ts, in the Quality Gates job) fails the build if a wiki tool imports a Prisma client or any app source.

See also

  • MCP server setup — wire the constellation server (incl. WIKI_BASE_URL) into your agent runtime.
  • Universal audit log — the audit guarantees every wiki mutation satisfies.