Privacy & redaction framework

Part of the Universal Audit Log Specification. GDPR and PII handling rules for audit entries (§9), and the redaction framework that lets compliance officers strike fields after the fact without breaking the tamper-evident chain (§18).

9. GDPR / Privacy — PII in Audit Entries

9.1 Problem

actor_id and ip_address are personally identifiable information (PII) subject to the GDPR right to erasure (Article 17). Audit entries, however, are immutable and must be retained for compliance and investigation purposes. These two requirements are in tension.

9.2 Current Position: Actor Identity is Retained

The approved architecture and requirements mandate that audit entries capture "actor (who)" and support tenant/platform investigations (requirements.md §3, Constellation_Architecture_Spec_v1.md §Retention). Audit rows SHALL continue to store actor_id directly — the "who did what" guarantee is non-negotiable for security investigations, compliance evidence, and SOC 2 Type II audit trails.

GDPR exemption basis: GDPR Article 17(3)(b) provides an exemption from erasure for "compliance with a legal obligation" and Article 17(3)(e) for "establishment, exercise or defence of legal claims." Constellation relies on these exemptions for audit data.

9.3 Future: Pseudonymisation Requires Its Own ADR

If legal or privacy counsel determines that the Article 17(3) exemptions are insufficient for certain jurisdictions or data categories, a pseudonymisation strategy (e.g., HMAC-based actor references with a deletable reverse-mapping table) MAY be adopted. However:

  1. This would be a policy change affecting investigation capability, not a spec refinement.
  2. It MUST be documented in a dedicated Architecture Decision Record (ADR) covering: regulatory analysis, jurisdiction scope, impact on investigation workflows, data controller obligations, and DPA (Data Processing Agreement) implications.
  3. It MUST NOT be implemented without explicit sign-off from legal/privacy and the platform architect.

The schema reserves no columns for pseudonymisation at this time. The partition-based architecture makes a future migration feasible without rewriting existing rows.
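Purely to illustrate what "HMAC-based actor references with a deletable reverse-mapping table" could mean, and not as a proposed design, the sketch below derives a keyed reference from actor_id and keeps the reverse lookup in a separate, erasable map. The names actorRef and ReverseMap, the use of HMAC-SHA-256, and the in-memory Map standing in for a table are all assumptions made for this example; it relies on Node.js's built-in crypto module.

```typescript
import { createHmac } from "node:crypto";

// Derive a stable, keyed reference for an actor (hypothetical helper).
// The same actorId and secret always yield the same 64-character hex string.
function actorRef(actorId: string, secret: string): string {
  return createHmac("sha256", secret).update(actorId, "utf8").digest("hex");
}

// Stand-in for the deletable reverse-mapping table (hypothetical).
class ReverseMap {
  private readonly byRef = new Map<string, string>();

  // Record the mapping and return the reference an audit row would store.
  remember(actorId: string, secret: string): string {
    const ref = actorRef(actorId, secret);
    this.byRef.set(ref, actorId);
    return ref;
  }

  // Investigation path: resolve a reference back to the actor, if still mapped.
  resolve(ref: string): string | undefined {
    return this.byRef.get(ref);
  }

  // Erasure path: drop the mapping; audit rows holding the reference are untouched.
  erase(actorId: string, secret: string): boolean {
    return this.byRef.delete(actorRef(actorId, secret));
  }
}
```

Deleting the mapping only removes the lookup by reference. Anyone holding the key can still test a candidate actor_id against a stored reference, which is one of the questions the ADR would have to address.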

9.4 IP Address Handling

IP addresses SHOULD be truncated at write time to reduce PII surface: IPv4 to /24, IPv6 to /48. If forensic precision is required for a specific tenant tier (e.g., defence), full IP MAY be retained with documented justification.
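As a minimal sketch of write-time truncation, the functions below zero the last octet of an IPv4 address (/24) and keep only the first three 16-bit groups of an IPv6 address (/48). The function names are hypothetical, and the sketch assumes plain textual addresses: IPv6 forms with an embedded IPv4 tail or a zone ID are rejected rather than handled.

```typescript
// Truncate an IPv4 address to its /24 network by zeroing the last octet.
function truncateIPv4(ip: string): string {
  const octets = ip.split(".");
  const valid =
    octets.length === 4 &&
    octets.every((o) => /^\d{1,3}$/.test(o) && Number(o) <= 255);
  if (!valid) throw new Error(`not an IPv4 address: ${ip}`);
  return octets.slice(0, 3).map(Number).join(".") + ".0";
}

// Expand an IPv6 address into its eight 16-bit groups, resolving "::".
function expandIPv6(ip: string): number[] {
  const halves = ip.split("::");
  if (halves.length > 2) throw new Error(`not an IPv6 address: ${ip}`);
  const head = halves[0] ? halves[0].split(":") : [];
  const tail = halves[1] ? halves[1].split(":") : [];
  const total = head.length + tail.length;
  // With "::" at least one group is elided; without it all eight must be present.
  const countOk = halves.length === 2 ? total <= 7 : total === 8;
  if (!countOk) throw new Error(`not an IPv6 address: ${ip}`);
  const zeros = Array<string>(8 - total).fill("0");
  return [...head, ...zeros, ...tail].map((group) => {
    if (!/^[0-9a-fA-F]{1,4}$/.test(group)) {
      throw new Error(`not an IPv6 address: ${ip}`);
    }
    return parseInt(group, 16);
  });
}

// Truncate an IPv6 address to its /48 prefix: keep the first three groups.
function truncateIPv6(ip: string): string {
  const prefix = expandIPv6(ip).slice(0, 3).map((g) => g.toString(16));
  return prefix.join(":") + "::";
}

// Dispatch on address family; a colon can only appear in IPv6 text.
function truncateIp(ip: string): string {
  return ip.includes(":") ? truncateIPv6(ip) : truncateIPv4(ip);
}
```

For example, truncateIp("203.0.113.77") yields "203.0.113.0" and truncateIp("2001:0db8:85a3:0000:0000:8a2e:0370:7334") yields "2001:db8:85a3::".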

IP extraction MUST use a trusted proxy chain: maintain a list of trusted reverse proxy IPs, extract the client IP from X-Forwarded-For by walking right to left and stopping at the first untrusted hop, and store only the (optionally truncated) result.
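The right-to-left walk can be sketched as follows. This example assumes the trusted list is a set of exact IP strings and that the socket peer address is treated as the right-most hop, so a request arriving from an untrusted peer never has its X-Forwarded-For header believed. The name extractClientIp and the fallback used when every hop is trusted are choices made for this sketch, not part of the spec.

```typescript
// Pick the client IP from a proxy chain by walking from the nearest hop
// back toward the client and stopping at the first address we do not trust.
function extractClientIp(
  xForwardedFor: string | undefined,
  remoteAddr: string,
  trustedProxies: ReadonlySet<string>,
): string {
  // Left-most entry is the claimed client; each proxy appends on the right.
  const hops = (xForwardedFor ?? "")
    .split(",")
    .map((h) => h.trim())
    .filter((h) => h.length > 0);
  // The socket peer is the only hop we observed ourselves, so it goes last.
  hops.push(remoteAddr);

  for (let i = hops.length - 1; i >= 0; i--) {
    const hop = hops[i];
    if (hop !== undefined && !trustedProxies.has(hop)) {
      return hop; // first untrusted hop: everything to its left is unverified
    }
  }
  // Assumption: if every hop is a trusted proxy, fall back to the left-most one.
  return hops[0] ?? remoteAddr;
}
```

Because the walk stops at the first untrusted hop, entries a client prepends to the header are never reached. The returned value would then be truncated, if truncation applies, before being stored.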

9.5 context_json PII Prohibition

context_json MUST NOT contain PII. CI linting SHOULD flag known PII field names in context payloads during tests.
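One way such a lint could work in tests is to walk each context payload and report any key that appears on a deny-list. The spec does not enumerate the "known PII field names", so the list below is a placeholder assumption, as is the function name findPiiKeys.

```typescript
// Placeholder deny-list (assumption): the spec does not define the real one.
const PII_FIELD_NAMES = new Set([
  "email",
  "phone",
  "first_name",
  "last_name",
  "address",
  "date_of_birth",
]);

// Recursively collect the paths of keys whose name is on the deny-list.
// Key names are compared case-insensitively; values are never inspected.
function findPiiKeys(value: unknown, path = ""): string[] {
  const hits: string[] = [];
  if (Array.isArray(value)) {
    value.forEach((item, i) => {
      hits.push(...findPiiKeys(item, `${path}[${i}]`));
    });
  } else if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    for (const key of Object.keys(record)) {
      const childPath = path ? `${path}.${key}` : key;
      if (PII_FIELD_NAMES.has(key.toLowerCase())) hits.push(childPath);
      hits.push(...findPiiKeys(record[key], childPath));
    }
  }
  return hits;
}
```

A test helper could then fail whenever findPiiKeys(context) returns a non-empty list. A name-based check cannot detect PII stored under an innocuous key, so it complements the prohibition rather than enforcing it.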


18. Redaction Framework

18.1 RedactionPolicy Type

interface RedactionPolicy {
  paths: string[];
  strategy: 'omit' | 'hash' | 'mask';
}

  • omit: remove the field entirely from the diff
  • hash: replace the value with SHA256(value); presence is recorded, content is not
  • mask: replace with '***REDACTED***'

18.2 Default Sensitive Patterns

The platform SHALL ship with a default redaction policy that matches any field whose name contains one of the following substrings:

password, secret, token, key, credential, ssn, authorization

Modules MAY extend this list but MUST NOT reduce it.
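The extend-only rule can be made structural by exposing a merge function that only ever takes a union with the defaults. The sketch below assumes matching is a case-insensitive substring test on the field name, which the spec does not state explicitly; sensitivePatterns and isSensitiveField are hypothetical names.

```typescript
// The default list from this section.
const DEFAULT_SENSITIVE_PATTERNS: readonly string[] = [
  "password",
  "secret",
  "token",
  "key",
  "credential",
  "ssn",
  "authorization",
];

// Merge module-supplied patterns into the defaults. The result is a union,
// so a module can add patterns but has no way to remove a default one.
function sensitivePatterns(moduleExtras: readonly string[] = []): string[] {
  const extras = moduleExtras
    .map((p) => p.trim().toLowerCase())
    .filter((p) => p.length > 0);
  return Array.from(new Set([...DEFAULT_SENSITIVE_PATTERNS, ...extras]));
}

// Assumption: a field is sensitive if its lower-cased name contains a pattern.
function isSensitiveField(
  fieldName: string,
  patterns: readonly string[] = DEFAULT_SENSITIVE_PATTERNS,
): boolean {
  const name = fieldName.toLowerCase();
  return patterns.some((p) => name.includes(p));
}
```

Under this reading, "apiKey" and "Authorization" both match. Substring matching is deliberately broad, so a harmless name such as "keyboard_layout" would also be redacted.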

18.3 Application in buildAuditDiff

buildAuditDiff() accepts an optional redact parameter. When provided, matching field paths are redacted according to the policy before the diff is returned.
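The spec does not give the signature or diff format of buildAuditDiff(), so the following is only a sketch of the behaviour described. It assumes a flat diff keyed by dot-separated path, exact path equality for matching, a single policy object, and SHA-256 over the raw string (or the JSON encoding of non-string values) for the hash strategy. It uses Node.js's built-in crypto module.

```typescript
import { createHash } from "node:crypto";

interface RedactionPolicy {
  paths: string[];
  strategy: "omit" | "hash" | "mask";
}

type FieldChange = { before: unknown; after: unknown };

// Flatten nested plain objects into dot-separated paths ("profile.token").
function flatten(
  obj: Record<string, unknown>,
  prefix = "",
  out: Record<string, unknown> = {},
): Record<string, unknown> {
  for (const key of Object.keys(obj)) {
    const value = obj[key];
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      flatten(value as Record<string, unknown>, path, out);
    } else {
      out[path] = value;
    }
  }
  return out;
}

// Redact one side of a change. An absent side stays absent.
function redactValue(
  value: unknown,
  strategy: RedactionPolicy["strategy"],
): unknown {
  if (value === undefined || strategy === "omit") return undefined;
  if (strategy === "mask") return "***REDACTED***";
  const text = typeof value === "string" ? value : JSON.stringify(value);
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Hypothetical shape of buildAuditDiff: changed paths only, redacted on the way out.
function buildAuditDiff(
  before: Record<string, unknown>,
  after: Record<string, unknown>,
  redact?: RedactionPolicy,
): Record<string, FieldChange> {
  const a = flatten(before);
  const b = flatten(after);
  const redacted = new Set(redact ? redact.paths : []);
  const diff: Record<string, FieldChange> = {};
  for (const path of Object.keys({ ...a, ...b })) {
    if (JSON.stringify(a[path]) === JSON.stringify(b[path])) continue; // unchanged
    if (redact && redacted.has(path)) {
      if (redact.strategy === "omit") continue; // drop the field from the diff
      diff[path] = {
        before: redactValue(a[path], redact.strategy),
        after: redactValue(b[path], redact.strategy),
      };
    } else {
      diff[path] = { before: a[path], after: b[path] };
    }
  }
  return diff;
}
```

With the mask strategy, both sides of a changed field carry the same placeholder, so the diff records that the field changed without revealing either value. With hash, the two sides differ, which additionally reveals whether a later change restored an earlier value.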

18.4 Double Exposure Prevention

auditCritical() MUST apply redaction before publishing the entry to the outbox. The outbox event is consumed by downstream processors (SIEM, notifications) that may have weaker access controls than the audit table itself.
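The ordering requirement can be sketched as below: the entry is redacted once, up front, and only the redacted copy is handed to any sink. The AuditEntry shape, the EntrySink interface, and the decision to write the same redacted entry to both the audit table and the outbox are assumptions made for this example; the real auditCritical() signature is not shown in the spec.

```typescript
// Hypothetical entry shape, reduced to the fields this example needs.
interface AuditEntry {
  action: string;
  actor_id: string;
  diff: Record<string, unknown>;
}

// Hypothetical write target standing in for the audit table or the outbox.
interface EntrySink {
  write(entry: AuditEntry): void;
}

// A redactor takes a raw diff and returns a copy that is safe to publish.
type Redactor = (diff: Record<string, unknown>) => Record<string, unknown>;

function auditCritical(
  entry: AuditEntry,
  redact: Redactor,
  auditTable: EntrySink,
  outbox: EntrySink,
): void {
  // Redact before any write so that no sink ever receives the raw diff.
  const safeEntry: AuditEntry = { ...entry, diff: redact(entry.diff) };
  auditTable.write(safeEntry);
  outbox.write(safeEntry);
}

// In-memory sink, useful for checking what a downstream consumer would see.
class MemorySink implements EntrySink {
  readonly entries: AuditEntry[] = [];
  write(entry: AuditEntry): void {
    this.entries.push(entry);
  }
}
```

Building the redacted copy before the first write means that a later change adding another sink cannot accidentally publish the unredacted entry.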