Skip to main content

Core architecture, canonical model, index strategy

Part of the Universal Audit Log Specification. How the audit schema is laid out, the canonical audit-entry model every module conforms to, and the index strategy that makes SELECT … WHERE correlation_id = $1 fast at billions-of-rows scale.

6. Core Architecture

6.1 Capture Points

The primary capture point SHALL be the tool layer.

Rules:

  • route handlers and server actions stay thin
  • services apply domain rules and persistence orchestration
  • repositories remain persistence-only
  • tools open the transaction, call the service, compute audit input, write audit, and publish domain events if needed

Audit MUST NOT rely primarily on:

  • Prisma middleware
  • Prisma query extensions
  • repository-side implicit hooks
  • fire-and-forget background writes

Those mechanisms may exist for development diagnostics, but they are not authoritative enough for universal audit.

Additional Capture Points

The tool layer is the primary capture point, but it does not cover all auditable operations. The following table defines the complete set of capture points:

Capture PointActor TypeAudit MethodExample
Tool function (interactive)USERcreateAuditor(ctx) / withAuditedMutation()User updates a task
Background job (cron/queue)SYSTEMDirect auditAction() with actor_type: 'SYSTEM'Nightly status recalculation
Event-driven handlerSYSTEMDirect auditAction() with correlation from source eventCascade update from upstream event
Bulk operationUSER or SYSTEMauditBatch(tx, entries[])CSV import of 500 catalogue items
Migration scriptSYSTEMMigration audit record with script identifierData migration changing resource states

auditBatch(tx, entries[]) SHALL be provided in the platform API for bulk operations. It MUST write all entries within the supplied transaction and enforce the same validation as single-entry writes.

6.2 Transaction Pattern

All auditable writes SHALL follow this pattern:

  1. open withTenantContext(...)
  2. load current state when a before/after diff is required
  3. execute domain mutation
  4. compute normalised changes
  5. call auditAction(tx, ...) or auditCritical(tx, ...)
  6. publish domain event if required
  7. commit transaction

If any step fails, neither the state change nor the audit row is committed.

6.3 Storage Pattern

Audit data remains in the shared audit schema, not in per-module schemas.

This matches the approved architecture: cross-cutting concerns such as tenancy, audit, and events live in platform schemas.

6.4 Partitioning Strategy

The audit.audit_entries table SHALL be range-partitioned on created_at with monthly partitions.

CREATE TABLE audit.audit_entries (
-- columns defined in Section 7
) PARTITION BY RANGE (created_at);

-- Example: create partition for March 2026
CREATE TABLE audit.audit_entries_2026_03
PARTITION OF audit.audit_entries
FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');

Partition management: automated creation via pg_partman or cron, at least 3 months ahead. Old partitions MAY be detached and archived (Section 6.5). A default partition MUST exist to prevent write failures. At high scale (>100M rows/month), composite partitioning (range + hash by tenant_id) SHOULD be evaluated.

6.5 Data Retention Policy

The approved Constellation architecture (Constellation_Architecture_Spec_v1.md §Retention) specifies:

Audit entries are retained indefinitely in the SaaS tier. Dedicated Cloud and On-Prem tiers allow configurable retention policies per tenant. Entries are never deleted — they are archived to cold storage after the retention window.

This spec conforms to that policy:

TierOnline RetentionArchivalNotes
SaaS (default)IndefiniteN/A — entries remain onlineMatches approved architecture
Dedicated CloudConfigurable per tenantCold storage after retention windowTenant-configured
On-PremConfigurable per tenantTenant-managedTenant-configured

Archival (Dedicated/On-Prem only): detach partition older than the tenant-configured online threshold, export to Parquet in cold storage (S3/GCS), record metadata in audit.archive_manifest, drop the detached partition after verification. Archived data MUST remain queryable on request within 48 hours.

Partitioning (Section 6.4) is still a Phase 1 requirement — it is an operational necessity for table management (VACUUM, backup, index performance) regardless of retention policy. Archival automation is Phase 6 (Dedicated/On-Prem tiers only).

Note: If a future ADR changes SaaS retention to a fixed window, the partition-based archival mechanism is ready. No schema changes would be needed.


7. Canonical Audit Model

7.1 Existing Fields to Keep

The existing audit.audit_entries contract remains the base:

  • id
  • tenant_id
  • actor_id
  • actor_type
  • action
  • resource_type
  • resource_id
  • module
  • changes
  • classification
  • ip_address
  • correlation_id
  • created_at

7.2 Required Additions

The table SHALL be extended with:

  • organisation_id UUID NULL
    • for organisation-scoped filtering inside a tenant
  • parent_resource_type TEXT NULL
    • for related history views
  • parent_resource_id TEXT NULL
    • allows non-UUID resource ids where needed
  • context_json JSONB NULL
    • stores bounded metadata such as reason, status transition metadata, workflow instance ids, approval notes, import source, or system trigger info
    • MUST NOT contain PII (see Section 9.5)
  • entry_hash TEXT NULL
    • reserved for tamper-evident chain
  • previous_hash TEXT NULL
    • reserved for tamper-evident chain
  • session_id TEXT NULL
    • for session correlation (NIST 800-171 requirement)
  • user_agent TEXT NULL
    • for forensic analysis
  • outcome TEXT NOT NULL DEFAULT 'SUCCESS'
    • values: 'SUCCESS' | 'FAILURE' | 'DENIED'
    • critical for security audit — failed and denied attempts MUST be captured
  • duration_ms INTEGER NULL
    • operation duration for performance audit
  • changed_fields TEXT[] NULL
    • array of top-level field names present in changes
    • enables efficient "which entries changed field X" queries via GIN index

7.3 Semantics

  • resource_type
    • canonical domain resource type, e.g. directory.organisation, projects.task, catalog.entry
  • resource_id
    • identifier of the direct subject of the change
  • parent_resource_type / parent_resource_id
    • optional owning or related aggregate for unified history views
  • changes
    • normalised before/after field diff, not arbitrary full payload dumps
  • context_json
    • bounded operational context; never a replacement for changes; never contains PII
  • outcome
    • reflects whether the operation succeeded, failed due to an error, or was denied by authorisation

7.4 Immutability

The append-only rule remains unchanged:

  • no UPDATE
  • no DELETE
  • triggers reject mutation attempts
  • application roles never receive mutation rights on existing audit rows

Immutability Hardening

PostgreSQL row-level triggers can be bypassed by superusers via DISABLE TRIGGER ALL. Mitigations:

  1. Revoke pg_trigger from all application database roles. Only the migration role may alter triggers.
  2. DDL event trigger to block TRUNCATE on audit tables (row-level triggers do not fire on TRUNCATE):
CREATE EVENT TRIGGER no_audit_truncate ON ddl_command_start
WHEN TAG IN ('TRUNCATE') EXECUTE FUNCTION audit.prevent_truncate();
  1. pg_audit extension (Phase 6) to log superuser actions including trigger state changes.
  2. Monitor immutability trigger rejection count (Section 14). Any non-zero count indicates a bug or attack.

8. Index Strategy

Indexes SHALL be created to support the defined query patterns without over-indexing.

8.1 Required Indexes

#ColumnsPurpose
1(tenant_id, created_at DESC)Tenant time-range queries
2(tenant_id, resource_type, resource_id, created_at DESC)Per-resource history
3(tenant_id, actor_id, created_at DESC)Actor activity queries
4(tenant_id, correlation_id)Correlation lookup (tenant-scoped to prevent cross-tenant correlation leakage)
5(tenant_id, organisation_id, created_at DESC)Organisation-scoped queries
6(tenant_id, parent_resource_type, parent_resource_id)Related resource history
7(tenant_id, module, created_at DESC)Module filtering
8GIN on changed_fields"Which entries changed field X" queries

8.2 Index Guidance

  • Do NOT add GIN indexes on changes or context_json JSONB columns. These columns are write-heavy and rarely queried by internal structure.
  • All indexes are tenant-scoped (leading tenant_id) to align with RLS and prevent cross-tenant data leakage through index scans.
  • Indexes are defined on the partitioned parent table; PostgreSQL propagates them to partitions automatically.