Architecture v3

Status

This document is a merged architecture that combines the conceptual model from Architecture v2 with the technology stack and platform alignment from Stella Catalog Spec v1.

  • It is intended for a greenfield implementation.
  • It assumes no production code exists yet.
  • It preserves Constellation shared platform alignment.
  • It adopts the three-layer architecture (Domain Core, Tool Layer, Agent Plane) from v2.
  • It keeps the TypeScript/NestJS stack from v1 for full-stack coherence and deployment simplicity.

1. Goals

The system should be:

  1. AI-first at runtime
    • Agents are first-class users of the platform.
    • The platform supports planning, tool use, memory, evaluation, and human approval flows.
    • MCP tools are business capabilities, not CRUD wrappers.
  2. AI-first in development
    • AI coding agents should be able to add modules and features with low context overhead.
    • Module boundaries, contracts, and tests must be explicit and machine-readable.
    • Every module follows the same template with predictable file names and clear responsibilities.
  3. Deterministic where it matters
    • Product data, pricing, permissions, tenant isolation, and audit remain domain-controlled.
    • Agents never directly own business truth.
    • The domain core is testable without an LLM.
  4. Modular without premature distribution
    • The system starts as a modular monolith.
    • Modules can be extracted later if scale or team topology requires it.
  5. Safe for enterprise and multi-tenant use
    • Strong tenant isolation, auditability, idempotency, policy checks, and approval gates are built in from day one.
  6. Platform-aligned
    • Shares technology foundations with Constellation via @constellation-platform/* packages.
    • One language (TypeScript) across API, admin, contracts, SDKs, and agent plane.

2. Core Position

This architecture rejects both extremes:

  • Not a classic centralized CRUD/API platform with thin MCP wrappers bolted on.
  • Not a swarm of autonomous agents directly mutating shared state.

Instead, the platform is split into three layers:

  1. Domain Core — Deterministic business logic and source-of-truth data.
  2. Tool Layer — High-level business capabilities exposed to agents and humans.
  3. Agent Plane — Planning, orchestration, delegation, memory, approvals, and workflow execution.

The domain core is authoritative. The agent plane is adaptive. The tool layer is the contract between them.

Why three layers matter

Without an explicit agent plane, agent behavior gets scattered across API controllers, service methods, and ad-hoc scripts. Without a tool layer distinct from CRUD endpoints, agents must understand implementation details instead of working at the intent level. The three-layer split ensures each concern has a clear owner.

3. Technology Stack

Primary stack

| Component | Technology |
| --- | --- |
| Backend runtime | Node.js 20+, TypeScript (strict) |
| API framework | NestJS |
| Admin UI | Next.js (App Router) |
| Validation & contracts | Zod v4 + JSON Schema + OpenAPI |
| ORM | Prisma 6 |
| Database | PostgreSQL 16+ |
| Extensions | pgvector, ltree, pg_trgm, pgcrypto, uuid-ossp |
| Job queue (default) | Postgres outbox + LISTEN/NOTIFY (via @constellation-platform/jobs) |
| Job queue (high-throughput) | BullMQ + Redis (opt-in per module) |
| Object storage | S3-compatible (MinIO for dev; Supabase Storage on cloud) |
| Auth | OAuth2/OIDC delegation; JWT with sub, tenant_id, roles |
| Observability | OpenTelemetry (Jaeger dev, NewRelic/Datadog prod) |
| Testing | Vitest + fast-check (property-based) + integration tests |
| Agent tooling | MCP SDK (TypeScript), custom orchestration layer |
| Packaging | Monorepo (Turborepo) |

Why TypeScript-first

TypeScript is the better default for this product because:

  • Constellation shares a common platform in TypeScript; switching languages breaks alignment.
  • One language across API, admin UI, contracts, SDKs, and agent plane reduces context-switching for both humans and AI coding agents.
  • The MCP SDK is TypeScript-native.
  • Prisma, Zod, and the NestJS ecosystem are mature and well-understood by AI coding tools.
  • Node.js provides a real long-running process model needed by agents, workers, and durable workflows.

Python remains available for:

  • Embedding model integrations where Python libraries have no TypeScript equivalent.
  • Data science and evaluation scripts.
  • Specialized agent evaluations using Python eval frameworks.

These are called as external processes or microservices, not as the primary runtime.

Constellation platform alignment

Shared packages from @constellation-platform/* (published from the external platform-packages repo via GitHub Packages private registry; consumed as versioned npm dependencies in Stella's package.json):

| Package | Purpose |
| --- | --- |
| auth-core | JWT validation, tenant context extraction, provider abstraction |
| db | Prisma client setup, RLS helpers, migration utilities |
| events | Domain event contracts, outbox pattern, LISTEN/NOTIFY |
| jobs | Postgres outbox + LISTEN/NOTIFY by default; BullMQ adapter for high-throughput |
| errors | Typed error classes, API error envelope |
| testing | Test fixtures, property-based test helpers, integration test utilities |

4. Repository Topology

apps/
├── api/ # NestJS backend: REST API + tool API + auth
├── agents/ # Agent runtime: orchestrator, specialists, memory, approvals
└── admin/ # Next.js admin UI

packages/
├── contracts/ # Zod schemas, OpenAPI specs, generated SDKs
├── agent-sdk/ # Thin primitives: tool registry types, memory interfaces, eval helpers
└── module-template/ # Scaffolding for new modules

# External: @constellation-platform/* packages (separate `platform-packages` repo)
# Published to GitHub Packages private registry, consumed as versioned npm dependencies.
# See Stella_Constellation_AI_Shared_Architecture_Plan_v1.md for package contracts.
# Packages: auth-core, auth-nest, db, events, jobs, errors, testing

modules/
├── products/
├── taxonomy/
├── search/
├── ingest/
├── pricing/
├── supplier-offers/
├── canonicals/
├── shares/
└── reference-catalogs/

Key differences from v1

  • apps/agents/ is a first-class application, not hidden inside the API.
  • packages/agent-sdk/ provides thin primitives (tool registry types, memory interfaces, eval helpers) — not an orchestration framework. Orchestration logic lives in apps/agents/.
  • packages/module-template/ provides scaffolding for AI coding agents.
  • Each module includes tools.ts, workflows.ts, policies.ts, and evals/.

Key differences from v2

  • TypeScript throughout, not Python.
  • Prisma instead of raw repositories.
  • Shared platform packages instead of standalone runtime libraries.
  • Supabase database and auth path preserved; Vercel used for UI hosting.

5. Architectural Layers

5.1 Domain Core

The domain core owns:

  • Product and catalog state
  • Taxonomy and classifications
  • Supplier offers and canonical products
  • Pricing and quote rules
  • Ingestion and conflict resolution
  • Tenant configuration
  • Permissions and policy checks
  • Audit records

The domain core must be:

  • Deterministic
  • Idempotent
  • Transaction-safe
  • Tenant-scoped (RLS enforced)
  • Independently testable without an LLM

Implementation lives in each module's service.ts, repository.ts, models.ts, and events.ts.

5.2 Tool Layer

Tools are not CRUD wrappers. Tools are business actions.

Good tool examples:

  • search_catalog — semantic search with natural language
  • compare_products — structured attribute comparison across products
  • build_quote — assemble a priced quote from product selections
  • find_substitutes — locate alternative products meeting criteria
  • validate_channel_readiness — check if products meet syndication requirements
  • enrich_from_reference_catalog — pull and apply reference data
  • explain_price_decision — trace how a price was calculated
  • review_match_suggestion — present a supplier-offer match for human review

Bad tool examples (avoid):

  • create_product — too low-level, no business context
  • update_rule — exposes implementation details
  • list_records — generic, no agent value

CRUD endpoints still exist for the human-facing REST API, but agent-facing tools must be higher-level, bounded, and policy-aware.

Each tool must define:

  • Clear purpose and description
  • Zod-validated inputs and outputs
  • Permission requirements
  • Tenant scoping rules
  • Failure modes and error contracts
  • Idempotency behavior
  • Usage examples
  • Eval cases

Implementation lives in each module's tools.ts.

Tool taxonomy

  1. Read tools — search, compare, explain, inspect. Safe to call freely.
  2. Write tools — bounded state changes with approval and policy checks. Rare and strongly governed.
  3. Composite tools — high-level actions coordinating multiple domain services. May trigger workflows.

5.3 Agent Plane

The agent plane owns:

  • Task planning and decomposition
  • Tool sequencing and selection
  • Multi-step workflow orchestration
  • Bounded delegation to specialist agents
  • Short-term working memory (session-scoped)
  • Structured working memory (facts gathered during a workflow)
  • Approval request creation and resolution
  • Retries and recovery
  • Evaluation and trace capture

The agent plane does not directly write business data. All mutations go through tools, which enforce policies and tenant scoping.

Implementation lives in apps/agents/ (orchestration, workflows, specialist agents) with thin shared types from packages/agent-sdk/ (tool registry interfaces, memory store contracts, eval runner helpers).

6. Runtime Model

6.1 API surfaces

The system exposes three distinct interfaces:

  1. Human API — REST/JSON for admin UI and external integrations. Served by apps/api/.
  2. Tool API — MCP protocol and internal tool registry exposing business capabilities. Tools registered from each module's tools.ts.
  3. Workflow API — Start, resume, cancel, and query long-running agent tasks. Served by apps/agents/.

6.2 Execution modes

  1. Request/response — Synchronous validation, search, comparison, quoting. Used by both human API and tools.
  2. Durable workflow — Ingest pipelines, enrichment, large quote builds, catalog validation batches. Managed by the agent plane with explicit typed state machines whose state is persisted to PostgreSQL.
  3. Background jobs — Embeddings, reindexing, document parsing, sync tasks. Managed by the job queue.

6.3 Durable execution model (canonical)

The system uses one default execution backbone. This resolves the ambiguity between BullMQ, Inngest, Trigger.dev, and custom FSMs.

| Concern | Default | When to upgrade |
| --- | --- | --- |
| Job dispatch | Postgres outbox + LISTEN/NOTIFY via @constellation-platform/jobs | Never — this is the canonical primitive |
| Workflow state | DB-serialized typed FSM — workflow state lives in a workflow_runs table with Zod-validated JSON state | Never — all workflows use this |
| High-throughput queues | BullMQ + Redis (opt-in) | Only when a module proves Postgres throughput is insufficient (e.g., bulk embeddings at 10k+ items) |
| Cron / scheduled | pg_cron or application-level scheduler | Default; no external dependency |

Why not Inngest/Trigger.dev? They couple to Vercel's execution model and add an external dependency. Starting with Postgres-native primitives keeps the system self-contained and portable. If operational evidence later shows Postgres LISTEN/NOTIFY cannot keep up, BullMQ is already available as an adapter — no architectural change needed.

Why not a custom FSM library? The FSM is not a library. It is a pattern: each workflow defines a Zod-validated state type, a transition function, and a workflow_runs table row. The pattern is codified in packages/agent-sdk/ as types and helpers, not as a runtime engine.
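The pattern can be sketched in a few lines. All names here (QuoteState, WorkflowRun, transition) are illustrative rather than the actual agent-sdk API; in practice the state payload would be Zod-validated and persisted to a workflow_runs row.

```typescript
// Minimal sketch of the workflow FSM pattern. Each workflow declares its
// states, its legal transitions, and a serializable state payload.
type QuoteState = "DRAFT" | "PRICING" | "AWAITING_APPROVAL" | "COMPLETE" | "FAILED";

const transitions: Record<QuoteState, QuoteState[]> = {
  DRAFT: ["PRICING", "FAILED"],
  PRICING: ["AWAITING_APPROVAL", "COMPLETE", "FAILED"],
  AWAITING_APPROVAL: ["COMPLETE", "FAILED"],
  COMPLETE: [],
  FAILED: [],
};

// Shape of a workflow_runs row; `state` would be Zod-validated JSON in practice.
interface WorkflowRun {
  id: string;
  tenantId: string;
  current: QuoteState;
  state: Record<string, unknown>;
  history: { from: QuoteState; to: QuoteState; at: string }[];
}

// The transition function is the only way state advances; illegal moves throw.
function transition(run: WorkflowRun, to: QuoteState): WorkflowRun {
  if (!transitions[run.current].includes(to)) {
    throw new Error(`illegal transition ${run.current} -> ${to}`);
  }
  return {
    ...run,
    current: to,
    history: [...run.history, { from: run.current, to, at: new Date().toISOString() }],
  };
}
```

Because the row is plain serializable data and the transition table is static, the run is inspectable, resumable after restart, and auditable by construction.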

6.4 Multi-agent strategy

Default to single orchestrator agent + specialist helpers, not full peer swarms.

Recommended initial specialist agents:

  • Search agent — handles catalog search, comparison, and similarity queries
  • Pricing/quote agent — builds quotes, explains pricing, applies rules
  • Enrichment agent — matches reference catalogs, applies enrichment data
  • Validation agent — checks channel readiness, data quality, schema compliance

Add more specialists only when evals show a clear gain. Avoid premature agent proliferation.

7. Data Architecture

7.1 Primary database

PostgreSQL remains the system of record.

Use it for:

  • Transactional data (Prisma-managed)
  • JSONB flexible attributes
  • ltree taxonomy paths
  • pgvector embeddings
  • Outbox/events (LISTEN/NOTIFY + polling)
  • Workflow state metadata
  • Audit logs

7.2 Tenancy

Single tenant key: tenant_id.

Rules:

  • Every tenant-scoped table has tenant_id
  • PostgreSQL RLS is mandatory on all tenant-scoped tables
  • Services still scope by tenant_id as defense-in-depth
  • Tools inherit tenant context from auth/session
  • Cross-tenant reads and writes fail by design

Avoid dual-scope organization_id + tenant_id unless a concrete business case demands both.
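The defense-in-depth rule can be illustrated with a repository whose every method takes tenant context. This is a sketch only: in-memory rows stand in for a Prisma-backed table, and RLS at the database layer remains the primary enforcement.

```typescript
// Illustrative defense-in-depth sketch: even with RLS enforced in Postgres,
// the repository scopes every query by tenant_id. In-memory rows stand in
// for a Prisma-backed table here.
interface ProductRow { id: string; tenantId: string; name: string }

const rows: ProductRow[] = [
  { id: "p1", tenantId: "t1", name: "Widget" },
  { id: "p2", tenantId: "t2", name: "Gadget" },
];

// Every repository method takes tenant context; there is no unscoped variant.
function findProduct(tenantId: string, id: string): ProductRow | undefined {
  return rows.find((r) => r.tenantId === tenantId && r.id === id);
}
```

A cross-tenant lookup simply finds nothing, which matches the "cross-tenant reads and writes fail by design" rule above.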

7.3 Eventing

Use domain events and an outbox table.

Initial design:

  • Postgres outbox for durability
  • Typed event schemas (Zod)
  • Idempotent consumers
  • Event versioning
  • LISTEN/NOTIFY for low-latency in-process delivery

Do not introduce Kafka or distributed buses at the start.
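A minimal in-memory sketch of the outbox plus idempotent-consumer pattern follows. Event names and shapes are illustrative; in the real system the outbox is a Postgres table written in the same transaction as the domain change, and consumers are woken by LISTEN/NOTIFY.

```typescript
// Illustrative in-memory sketch of the outbox + idempotent-consumer pattern.
interface OutboxEvent {
  id: string;                      // unique event id, used for idempotency
  type: string;                    // e.g. "product.updated" (illustrative name)
  version: number;                 // event schema version
  payload: Record<string, unknown>;
}

const outbox: OutboxEvent[] = [];

// Producer: the domain write and the outbox append commit atomically.
function recordProductUpdate(productId: string): void {
  // ... domain write happens here, in the same transaction ...
  outbox.push({
    id: `evt-${outbox.length + 1}`,
    type: "product.updated",
    version: 1,
    payload: { productId },
  });
}

// Consumer: tracks processed event ids so redelivery is a no-op.
const processed = new Set<string>();
let handled = 0;

function consume(event: OutboxEvent): void {
  if (processed.has(event.id)) return; // idempotent: duplicates are ignored
  processed.add(event.id);
  handled += 1; // ... real side effect (reindex, notify, etc.) goes here ...
}
```

The idempotency check is what makes at-least-once delivery safe: a redelivered event changes nothing.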

8. Module Standard

Every module must follow the same structure.

modules/<module-name>/
├── README.md # Module purpose, domain concepts, API surface
├── AGENTS.md # Instructions for AI coding agents working on this module
├── schemas.ts # Zod input/output models
├── models.ts # Prisma-facing entities and typed domain models
├── service.ts # Deterministic business logic
├── tools.ts # Agent-facing tool definitions
├── workflows.ts # Explicit state machine workflows
├── policies.ts # Permission and approval logic
├── repository.ts # DB access via Prisma
├── events.ts # Emitted/consumed domain events
├── tests/ # Unit and integration tests
└── evals/ # Agent/tool evaluations and regression suites

File responsibilities

| File | Owns | Does not own |
| --- | --- | --- |
| schemas.ts | Zod input/output schemas for API and tools | Prisma types, DB concerns |
| models.ts | Prisma model types, domain value objects | Business logic |
| service.ts | Deterministic business logic, validation | DB access, HTTP concerns |
| tools.ts | Agent-facing tool definitions, MCP registration | Business logic (delegates to service) |
| workflows.ts | Multi-step orchestrated procedures, state machines | Direct DB access |
| policies.ts | Permission checks, approval gate logic | Authentication (handled by platform) |
| repository.ts | Prisma queries, tenant-scoped data access | Business logic |
| events.ts | Domain event definitions, emission, consumption | Side effects outside the module |
| tests/ | Unit tests, integration tests, property-based tests | Agent evals |
| evals/ | Tool selection tests, workflow completion tests, regression suites | Deterministic unit tests |

Module rules

  • No module imports another module's repository directly.
  • Cross-module access goes through services or tool contracts.
  • No hidden magic registration — all wiring is explicit.
  • No large base classes or deep inheritance.
  • No decorator-heavy abstractions that obscure control flow.
  • File count per module should remain small enough for AI tools to reason over (aim for < 15 files).
  • Every module has a short README.md and AGENTS.md.

9. AI-Friendly Development Rules

The system must be intentionally easy for AI coding agents to extend.

Required practices

  • Explicit Zod schemas everywhere — no implicit types or any.
  • One clear module template — every module looks the same.
  • Predictable file names — an AI agent can find tools.ts in any module.
  • Minimal framework magic — avoid custom decorators, interceptor chains, and DI tricks.
  • Generated SDKs from one contract source (OpenAPI from Zod schemas).
  • Usage examples for every tool.
  • Evals for every non-trivial workflow.
  • Local fixtures for every module.
  • AGENTS.md in every module describing domain concepts, boundaries, and gotchas.

NestJS guardrails

NestJS is the API framework for apps/api/ because of Constellation platform alignment and broad AI training-data coverage. However, NestJS's decorator-heavy DI system is a known friction point for AI coding agents. The following rules constrain NestJS usage to the predictable subset:

Banned patterns:

  • ❌ Custom decorators — all cross-cutting concerns use standard NestJS decorators or middleware.
  • ❌ Request-scoped providers — all services are singletons. Tenant context uses AsyncLocalStorage (Appendix A.5), not request-scoped injection.
  • ❌ forwardRef() — indicates circular dependencies; refactor instead.
  • ❌ Custom interceptor chains — use at most one global interceptor (for correlation ID / logging).
  • ❌ Dynamic modules with runtime configuration — keep module registration static and declarative.
  • ❌ Deep NestJS DI for platform primitives — AppContext is a plain object, not a NestJS provider tree (Appendix A.4).

Required patterns:

  • ✅ Each module registers exactly one NestJS module with one controller and one service provider.
  • ✅ Controllers are thin: validate input (Zod pipe) → delegate to AppContext-wired service → return result.
  • ✅ All business logic lives in modules/*/service.ts, never in NestJS controllers or providers.
  • ✅ Module services receive AppContext through the composition root, not through @Inject() tokens.
  • ✅ The NestJS module.ts file is boilerplate — AI agents copy it from the module template without modification.

Escape hatch: If AI coding agents consistently fail on NestJS DI wiring during Phase 0 implementation, the composition root architecture allows apps/api/ to be replaced with Fastify + tRPC without changing any module code. Module contracts are framework-independent by design.
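One way to picture the composition root is the sketch below. The AppContext shape, service names, and controller are all illustrative assumptions, not the actual module contracts; the point is that wiring is a plain function call, not a DI token graph.

```typescript
// Sketch of the composition-root idea: AppContext is a plain object assembled
// once at startup, not a NestJS provider tree. All names are illustrative.
interface AppContext {
  products: { rename: (id: string, name: string) => string };
  audit: { log: (entry: string) => void };
}

// The composition root wires concrete implementations explicitly, with no DI tokens.
function buildAppContext(): AppContext {
  const entries: string[] = [];
  const audit = { log: (entry: string) => { entries.push(entry); } };
  const products = {
    rename: (id: string, name: string) => {
      audit.log(`rename ${id}`); // cross-cutting concern wired by hand
      return `${id}:${name}`;
    },
  };
  return { products, audit };
}

// A controller stays thin: validate input, delegate to the AppContext-wired service.
function renameController(ctx: AppContext, id: string, name: string): string {
  if (!id || !name) throw new Error("invalid input");
  return ctx.products.rename(id, name);
}
```

Because nothing here depends on NestJS, swapping the HTTP framework (the escape hatch above) only changes the controller shell.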

Eval scaffolding

AI coding agents struggle to write agentic evals from scratch because the eval harness (mock LLM responses, tool call assertions, trace validation) requires substantial boilerplate.

The CLI must provide:

stella eval:scaffold <tool-name>

This command generates:

  • Mock LLM response fixtures for the target tool
  • Expected tool call assertion templates
  • Eval harness boilerplate with correct imports and AppContext test setup
  • Example passing and failing test cases
  • Annotation mapping to the correctness property the eval validates

This scaffolding is a Phase 0 requirement. No AI coding agent should be asked to build downstream module evals until the scaffold command produces a working baseline.

Required CI gates

  • Type checking (tsc strict)
  • Linting (ESLint)
  • Unit tests (Vitest)
  • Integration tests (against real Postgres)
  • RLS tests (verify tenant isolation)
  • Contract compatibility tests (Zod schema backward compat)
  • Tool schema validation (all tools have valid Zod I/O)
  • Eval regression suite (no tool selection regressions)
  • Module boundary enforcement (no cross-module repository imports)

File and context rules

  • Keep files focused and small (< 300 lines preferred, < 500 max).
  • Prefer pure functions in services where possible.
  • Avoid deep inheritance — prefer composition.
  • Keep side effects explicit and at module boundaries.
  • Each module's full source should fit in an AI agent's context window.

10. Tool Design Standard

Tools are a first-class product surface — the primary way agents interact with the system.

Tool definition contract

Every tool must specify:

interface ToolDefinition {
  name: string;                  // e.g. "search_catalog"
  description: string;           // clear purpose for agent tool selection
  inputSchema: ZodSchema;        // validated inputs
  outputSchema: ZodSchema;       // validated outputs
  permissions: string[];         // required roles/permissions
  tenantScoping: 'required' | 'system' | 'none';
  idempotent: boolean;
  failureModes: FailureMode[];   // documented error cases
  examples: ToolExample[];       // input/output pairs for agent context
  evalCases: EvalCase[];         // regression test cases
}

Tool taxonomy

  1. Read tools — search, compare, explain, inspect. No side effects. Safe for agents to call freely.
  2. Write tools — bounded state changes. Must check policies. May require approval. Should be rare.
  3. Composite tools — high-level actions coordinating multiple services. May trigger durable workflows.

Write tools should be rare and strongly governed. When in doubt, make it a read tool that returns a proposed action, and let approval flow handle the mutation.
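The proposed-action pattern might look like the following sketch. The action kind, field names, and tool are hypothetical; the essential property is that the tool returns a typed proposal and mutates nothing.

```typescript
// Sketch of a "read tool that returns a proposed action". The proposal is
// handed to the approval flow, which owns the actual mutation.
interface ProposedAction {
  kind: "update_price_rule";          // illustrative action kind
  targetId: string;
  change: Record<string, unknown>;
  requiresApproval: true;             // the proposal itself never mutates anything
  rationale: string;                  // human- and agent-readable justification
}

function proposePriceRuleChange(
  targetId: string,
  change: Record<string, unknown>,
  rationale: string,
): ProposedAction {
  return { kind: "update_price_rule", targetId, change, requiresApproval: true, rationale };
}
```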

11. Workflow Design Standard

Workflows are explicit state machines, not hidden prompt behavior.

Each workflow defines:

  • Start conditions and triggers
  • Input schema (Zod)
  • Named states and transitions
  • Tool calls at each state
  • Retry policy per step
  • Timeout policy per step and overall
  • Approval checkpoints (which steps need human sign-off)
  • Completion criteria
  • Failure and compensation logic
  • Evaluation criteria

Example workflow classes

  • Quote generation workflow
  • Catalog enrichment workflow
  • Supplier match review workflow
  • Syndication readiness workflow
  • Legacy PIM ingest workflow
  • Bulk import workflow

Implementation approach

Start with a simple typed finite state machine. Do not introduce Temporal, XState, or a heavy workflow engine unless operational evidence justifies it. The state machine should be:

  • Serializable to/from the database (for durability)
  • Inspectable (current state, history of transitions)
  • Resumable after process restart
  • Traceable (every transition logged)

FSM and agent execution model

The FSM owns the skeleton — states, valid transitions, timeouts, and checkpoints. The agent operates within the FSM, not outside it.

There are two kinds of workflow states:

  1. Deterministic states — execute a fixed operation (service call, API request, data transformation). No LLM involved. The FSM advances automatically on success or failure.
  2. Agent-delegated states — invoke the agent with a scoped prompt and bounded tool set. The agent reasons, calls tools, and returns a transition choice. The FSM validates that the chosen transition is legal for the current state. If it is not, the transition is rejected and the agent is re-prompted.

This means:

  • The agent never invents states or transitions — only chooses among the ones the FSM declares.
  • The FSM guarantees that every execution path is auditable and bounded.
  • Agent autonomy is scoped: the agent decides which valid transition to take, not what transitions exist.

Example:

State: RESOLVE_DISCREPANCY (agent-delegated)
→ Agent invokes compare_products tool
→ Agent reviews result and picks a transition:
→ ACCEPT_MATCH (if confidence ≥ threshold)
→ REJECT_MATCH (if confidence < threshold)
→ ESCALATE (if comparison is ambiguous)
→ FSM validates chosen transition is in the declared set
→ FSM advances to next state

This pattern prevents unbounded agent loops while preserving the agent's ability to reason about non-deterministic decisions.
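The agent-delegated state loop can be sketched as follows. The Agent type, prompts, and the retry bound are assumptions for illustration; the invariant shown is the one described above, that only declared transitions are accepted and the loop is bounded.

```typescript
// Sketch of an agent-delegated state: the FSM declares the legal transitions
// and the agent only picks among them. Illustrative names throughout.
type Choice = "ACCEPT_MATCH" | "REJECT_MATCH" | "ESCALATE";

const declared: Choice[] = ["ACCEPT_MATCH", "REJECT_MATCH", "ESCALATE"];

// The agent is a black box that returns a transition choice as a string.
type Agent = (prompt: string) => string;

// Re-prompt until the agent returns a declared transition, up to a retry bound.
function resolveDiscrepancy(agent: Agent, maxAttempts = 3): Choice {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const picked = agent(
      attempt === 1
        ? "Pick one transition: ACCEPT_MATCH, REJECT_MATCH, ESCALATE"
        : "Invalid choice. Pick one of: ACCEPT_MATCH, REJECT_MATCH, ESCALATE",
    );
    if ((declared as string[]).includes(picked)) return picked as Choice;
  }
  return "ESCALATE"; // bounded: fall back to human escalation, never loop forever
}
```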

12. Memory and Context

Use layered memory rather than one large chat transcript.

Memory types

| Type | Scope | Storage | Purpose |
| --- | --- | --- | --- |
| Session memory | Single agent conversation | In-memory / Redis | Short-lived task context |
| Working memory | Single workflow execution | Database (JSONB) | Structured facts gathered during a workflow |
| Domain memory | Permanent | PostgreSQL (Prisma) | Product, tenant, and catalog data |
| Retrieval memory | Per-query | Transient | Search results, embeddings, documents |
| Evaluation memory | Permanent | Database | Historical traces and outcomes for regression |

Rules

  • Business truth belongs in domain data, not chat memory.
  • Memory writes must be intentional and typed (Zod schemas).
  • Stale context must be discardable — memory has TTL or explicit invalidation.
  • Agent prompts should remain small and structured — inject only relevant context.
  • No "memory" should circumvent tenant isolation.
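A TTL-bounded store that satisfies the staleness rule might look like this sketch. The WorkingMemory class is illustrative, not the agent-sdk memory interface; an injectable clock keeps it testable.

```typescript
// Sketch of a TTL-bounded working-memory store. Writes are timestamped;
// reads discard entries older than the TTL, so stale context cannot leak
// back into prompts.
interface MemoryEntry<T> { value: T; writtenAt: number }

class WorkingMemory<T> {
  private entries = new Map<string, MemoryEntry<T>>();
  constructor(private ttlMs: number, private now: () => number = Date.now) {}

  set(key: string, value: T): void {
    this.entries.set(key, { value, writtenAt: this.now() });
  }

  // Expired entries read as absent and are evicted on access.
  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.writtenAt > this.ttlMs) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}
```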

13. Security and Governance

AI-first does not weaken safety. It requires stronger controls because agents can act faster and at scale.

Required controls

  • External auth provider with JWT/OIDC (Supabase Auth, Keycloak, or equivalent)
  • tenant_id in auth context on every request
  • RLS in PostgreSQL on every tenant-scoped table
  • Role and feature checks at service and tool level
  • Approval gates for sensitive writes (tools declare when approval is needed)
  • Immutable audit logs (append-only, tenant-scoped)
  • Prompt and tool trace logging (what the agent asked, what tools it called, what it received)
  • Secrets isolation (no secrets in agent context or tool outputs)
  • Rate limiting for external, tool, and agent interfaces

High-risk actions requiring approval

  • Price rule changes
  • Cross-system syndication
  • Large bulk imports (above configurable threshold)
  • Destructive merges or splits of canonical products
  • Reference catalog license changes
  • Data export
  • Any write tool the module's policies.ts flags as approval-required
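A policies.ts approval gate might look like the following sketch. The tool names, role string, and threshold constant are illustrative; in practice the threshold would be tenant-configurable and the decision would feed the approval flow.

```typescript
// Sketch of a policies.ts approval gate. Write tools ask the policy layer
// whether a mutation may proceed, must go to human review, or is denied.
type PolicyDecision = "allow" | "require_approval" | "deny";

interface WriteRequest {
  tool: string;       // e.g. "bulk_import" (illustrative)
  itemCount: number;  // size of the requested change
  roles: string[];    // roles from the auth context
}

const BULK_IMPORT_THRESHOLD = 1_000; // configurable per tenant in practice

function checkWritePolicy(req: WriteRequest): PolicyDecision {
  if (!req.roles.includes("catalog:write")) return "deny";
  if (req.tool === "bulk_import" && req.itemCount > BULK_IMPORT_THRESHOLD) {
    return "require_approval"; // large bulk imports need human sign-off
  }
  if (req.tool === "update_price_rule") return "require_approval";
  return "allow";
}
```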

14. Testing and Evals

Testing splits into two categories that are equally important.

Deterministic tests

  • Unit tests for services (pure business logic)
  • Integration tests for DB behavior (real Postgres)
  • RLS and tenancy tests (verify isolation)
  • Contract tests for APIs and tools (Zod schema compat)
  • Property-based tests with fast-check (minimum 100 iterations per property)

Agent evals

  • Tool selection correctness — given a task description, does the agent pick the right tool?
  • Workflow completion rate — does the agent complete multi-step tasks?
  • Approval compliance — does the agent stop and request approval when required?
  • Hallucination resistance — does the agent avoid inventing data not present in tool outputs?
  • Retry and recovery behavior — does the agent handle tool failures gracefully?
  • Cost and latency budgets — does the agent stay within token and time limits?
  • Tenant isolation — does the agent ever access cross-tenant data?

Eval infrastructure (Phase 0 deliverable)

The eval harness must be built before the first agent is deployed. It must support:

  • Reproducible test cases with fixed inputs and expected tool sequences
  • Scoring rubrics for partial credit (not just pass/fail)
  • Regression detection (alert when a previously passing eval fails)
  • Cost tracking per eval run
  • Trace capture for debugging failed evals

No module is complete without both deterministic tests and agent evals.
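Partial-credit scoring can be as simple as an in-order match over the tool-call trace. This is one possible rubric, not a prescribed one: the score is the fraction of expected tool calls that appear, in order, in the actual trace.

```typescript
// Sketch of partial-credit scoring for a tool-sequence eval (illustrative).
// Returns a score in [0, 1] instead of a bare pass/fail.
function scoreToolSequence(expected: string[], actual: string[]): number {
  let matched = 0;
  let cursor = 0;
  for (const tool of expected) {
    const idx = actual.indexOf(tool, cursor);
    if (idx !== -1) {
      matched += 1;
      cursor = idx + 1; // enforce ordering: later calls must come after earlier ones
    }
  }
  return expected.length === 0 ? 1 : matched / expected.length;
}
```

A regression detector then only needs to compare scores across runs and alert when a previously passing eval drops.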

15. Deployment Strategy

Architectural principle

The agent runtime and worker/orchestrator are standalone long-running Node.js processes. This is not negotiable — serverless cold starts, execution time limits, and lack of persistent connections make Vercel Functions unsuitable as the agent runtime host.

Vercel is used for what it excels at: hosting the Next.js admin UI and, optionally, light REST API endpoints (health checks, webhooks, lightweight reads). The heavy lifting — agent orchestration, durable workflows, background jobs, MCP tool serving — runs in standalone processes.

Standalone (primary path)

| Concern | Solution |
| --- | --- |
| API | NestJS standalone process (Docker / any container host) |
| Agent runtime | Standalone Node.js process (apps/agents/) — long-running, persistent connections |
| Worker / orchestrator | Same process as agent runtime (single binary) or separate process at scale |
| Admin UI | Next.js on Vercel or any static/SSR host |
| Database | PostgreSQL 16+ (Supabase, RDS, self-hosted) with pgvector, ltree |
| Auth | OAuth2/OIDC delegation via @constellation-platform/auth-core (Supabase Auth, Keycloak, Auth0) |
| Storage | S3-compatible (Supabase Storage, MinIO, AWS S3) |
| Jobs | Postgres outbox (default); BullMQ + Redis (opt-in for high-throughput modules) |
| MCP server | Runs inside the agent runtime process, shares the same tool registry |

Vercel + Supabase (UI hosting path)

| Concern | Solution |
| --- | --- |
| Admin UI | Next.js on Vercel (SSR + static) |
| Light API routes | Vercel Functions for webhooks, health, lightweight reads (optional) |
| Database | Supabase PostgreSQL (pgvector, ltree enabled) |
| Auth | Supabase Auth (JWT with tenant_id in app_metadata) |
| Storage | Supabase Storage (tenant-namespaced) |

The API server, agent runtime, and worker processes still run as standalone containers even when Supabase hosts the database and Vercel hosts the UI.

Why not "Vercel for everything"?

An agent-first product needs:

  • Long-running processes for multi-step workflows (minutes, not seconds)
  • Persistent WebSocket/SSE connections for real-time agent status
  • In-process tool registry without cold-start latency
  • Reliable job processing without execution time limits

Vercel serverless cannot provide these. Keeping the UI on Vercel preserves developer experience and CDN benefits without constraining the core runtime.

Both paths use the same codebase. Auth is provider-agnostic via @constellation-platform/auth-core. The job queue interface abstracts Postgres vs BullMQ.

16. Delivery Phases

Phase 0: Foundation (weeks 1-3)

  • Create repo topology and build system (Turborepo)
  • Implement module template with all file slots
  • Set up shared platform packages (auth, db, events, jobs, errors, testing)
  • Build eval harness and trace capture
  • Define tool design standard with Zod contracts
  • Scaffold CLI (stella-cli)
  • CI pipeline with all required gates

Phase 1: Core catalog modules (weeks 4-8)

  • Products module (CRUD + tools + evals)
  • Taxonomy module (ltree + tools)
  • Search module (hybrid search: fulltext + pgvector + tools)
  • Ingest module (webhook + conflict resolution + tools)
  • Pricing module (rules engine + tools)

Phase 2: Agent plane (weeks 6-10, overlaps Phase 1)

  • Orchestrator agent with tool registry
  • Specialist agents (search, pricing, validation)
  • Workflow runtime (typed state machines)
  • Approval system (create/resolve approval requests)
  • Memory stores (session, working, evaluation)
  • Trace capture and eval runner

Phase 3: Advanced modules (weeks 9-14)

  • Supplier offers module
  • Canonicals module (matching, merge/split)
  • Shares module (catalog sharing)
  • Reference catalogs module (enrichment, licensing)
  • Enrichment workflows
  • Channel syndication validation

Phase 4: Hardening (weeks 13-16)

  • Performance optimization against targets
  • Security audit and penetration testing
  • Eval regression suite fully populated
  • Documentation and SDK generation
  • Deployment automation (Docker Compose for standalone; Vercel for UI; Supabase for managed DB)

17. Migration From Existing Documents

From Stella Catalog Spec v1 — keep

  • All 31 requirements and acceptance criteria
  • Multi-tenant model (tenant_id + RLS)
  • PostgreSQL + pgvector + ltree
  • Hybrid search pipeline
  • Ingest/webhook conflict resolution
  • Supplier offer and canonical product concepts
  • Catalog sharing model
  • Reference catalog enrichment
  • CPQ and pricing rules
  • Performance targets
  • CLI commands
  • Supabase database and auth deployment path
  • Constellation platform alignment

From Architecture v2 — adopt

  • Three-layer architecture (Domain Core / Tool Layer / Agent Plane)
  • Tool-first design (business actions, not CRUD wrappers)
  • Explicit module template with tools.ts, workflows.ts, policies.ts, evals/
  • Agent eval requirements alongside deterministic tests
  • Memory model (session, working, domain, retrieval, evaluation)
  • Workflow design standard (explicit state machines)
  • AI-friendly development rules (small files, no magic, predictable names)
  • Module rules (no cross-module repo imports, no deep inheritance)

From Architecture v2 — do not adopt

  • Python/FastAPI backend (breaks Constellation alignment, splits the stack)
  • PydanticAI (use MCP SDK + custom orchestration in TypeScript)
  • Temporal (premature; start with simple typed FSM)
  • Raw SQL repositories (use Prisma for consistency with Constellation)

From Stella Catalog Spec v1 — replace

  • "MCP tools as thin adapters" becomes "MCP tools as first-class business capabilities"
  • Add apps/agents/ as a first-class application
  • Add evals/ at module and system level
  • Add policies.ts and workflows.ts to module template

18. Risks and Mitigations

| Risk | Impact | Mitigation |
| --- | --- | --- |
| NestJS decorator magic obscures control flow for AI agents | Medium | Keep modules thin; avoid custom decorators; lint for complexity |
| TypeScript agent ecosystem less mature than Python | Medium | Use MCP SDK (TS-native); build thin orchestration layer; call Python for embeddings if needed |
| Tool layer drifts back to CRUD wrappers | High | CI gate: every tool must have evals; review tool names for business-action language |
| Constellation shared platform becomes a drag | Medium | Keep shared packages small and stable; don't block product delivery on platform work |
| Eval infrastructure gets deprioritized | High | Make evals a Phase 0 deliverable; no module ships without evals |
| Workflow complexity escalates | Medium | Start with simple typed FSM; introduce Temporal only with operational evidence |
| Agent costs spiral | Medium | Token budgets per task; model selection per specialist; cost tracking in eval harness |
| Postgres job queue hits throughput ceiling | Low | BullMQ adapter already exists in @constellation-platform/jobs; swap per-module without architecture change |

19. Open Questions

Resolved

  1. Agent hosting model — Resolved: the agent runtime is a standalone long-running Node.js process (apps/agents/). It does not run on Vercel serverless. See Section 15.
  2. Workflow persistence — Resolved: DB-serialized typed FSM. Workflow state lives in a workflow_runs table with Zod-validated JSON state. See Section 6.3.
  3. MCP server hosting — Resolved: the MCP tool server runs inside the agent runtime process, sharing the same tool registry and in-process access to domain services.

Still open — resolve before implementation begins:

  1. Eval tooling: Build custom eval harness or adopt an existing framework (e.g., Braintrust, Promptfoo)?

Resolved (post-v3):

  1. Constellation package publishing — Resolved: GitHub Packages private registry. @constellation-platform/* packages live in a separate platform-packages repo, versioned with SemVer + Changesets. See Stella_Constellation_AI_Shared_Architecture_Plan_v1.md.

20. Next Documents

If this architecture is accepted, the next documents to create are:

  1. docs/adr/ADR-001-merged-architecture-v3.md — records the decision to merge v1 and v2
  2. docs/module_template_v1.md — detailed module template with code examples
  3. docs/tool_design_standard_v1.md — tool definition contract, taxonomy, and examples
  4. docs/workflow_design_standard_v1.md — state machine patterns, approval checkpoints
  5. docs/eval_standard_v1.md — eval harness design, scoring rubrics, regression detection
  6. docs/agent_plane_design_v1.md — orchestrator, specialists, memory, trace capture
  7. Updated .kiro/specs/stella-catalog/design.md — aligned with v3 architecture
  8. Updated .kiro/specs/stella-catalog/tasks.md — re-sequenced for v3 phases

21. Summary

Architecture v3 takes the best from both predecessors:

  • From v2: The three-layer model (Domain Core, Tool Layer, Agent Plane), tool-first design, explicit module template with tools/workflows/policies/evals, eval-first mindset, structured memory model, and AI-friendly development rules.
  • From v1: TypeScript/NestJS stack, Prisma ORM, Zod validation, Constellation shared platform alignment, Supabase database/auth path, all 31 domain requirements, and the existing task breakdown.

Key decisions tightened after expert review:

  • One durable execution model: Postgres outbox + DB-backed workflow state as the canonical default. BullMQ optional for throughput-heavy modules. No Inngest/Trigger.dev dependency.
  • Agent runtime is a standalone process: Long-running Node.js process with persistent connections, in-process tool registry, and MCP server. Not serverless.
  • Vercel scoped to UI hosting: Next.js admin on Vercel for CDN and developer experience. API, agents, and workers run as standalone containers.
  • packages/agent-sdk/ stays thin: Types, interfaces, and helpers — not an orchestration framework. All orchestration logic lives in apps/agents/.

The result is a system that is genuinely AI-native at runtime (agents are first-class users with proper orchestration, memory, and evaluation) and AI-native in development (every module is predictable, explicit, and fits in a context window) — without sacrificing platform alignment, operational control, or full-stack coherence.


Appendix A: Runtime Primitives

This appendix defines the runtime contracts that Section 6.3 refers to. These are implementation-ready specifications — not guidelines. Platform packages (@constellation-platform/jobs, @constellation-platform/db) must conform to them.

A.1 Job Claiming and Retry Semantics

All background work flows through a single job_queue table in the application's PostgreSQL database.

Table schema

CREATE TABLE job_queue (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id uuid NOT NULL,
  actor_id text,               -- user/system that enqueued the job; nullable for system-initiated jobs
  correlation_id text,         -- propagated from request context; nullable for cron/sweep-initiated jobs
  queue text NOT NULL,         -- e.g. 'embeddings', 'ingest', 'enrichment'
  payload jsonb NOT NULL,
  status text NOT NULL DEFAULT 'pending'
    CHECK (status IN ('pending','claimed','completed','failed','dead')),
  run_at timestamptz NOT NULL DEFAULT now(),
  claimed_at timestamptz,
  claimed_by text,             -- worker instance id
  completed_at timestamptz,
  attempt int NOT NULL DEFAULT 0,
  max_attempts int NOT NULL DEFAULT 5,
  last_error text,
  idempotency_key text UNIQUE, -- optional; callers may set for dedup
  created_at timestamptz NOT NULL DEFAULT now(),
  updated_at timestamptz NOT NULL DEFAULT now()
);

CREATE INDEX idx_job_queue_poll ON job_queue (queue, status, run_at)
  WHERE status = 'pending';

Claim protocol

Workers claim jobs with a single atomic statement. No advisory locks, no two-phase claim.

UPDATE job_queue
SET status = 'claimed',
    claimed_at = now(),
    claimed_by = $1,          -- worker instance id
    attempt = attempt + 1
WHERE id = (
  SELECT id FROM job_queue
  WHERE queue = $2
    AND status = 'pending'
    AND run_at <= now()
  ORDER BY run_at
  FOR UPDATE SKIP LOCKED
  LIMIT 1
)
RETURNING *;

FOR UPDATE SKIP LOCKED ensures multiple workers never claim the same row. This is the only job-claiming mechanism in the system.

Retry semantics

| Behavior | Rule |
| --- | --- |
| Backoff | Exponential: run_at = now() + (2^attempt * base_interval). Default base_interval = 5 seconds. |
| Max attempts | Per-job max_attempts, default 5. |
| Dead-letter | After max_attempts is exhausted, status moves to dead. Dead jobs are never auto-retried. |
| Stale claim recovery | A periodic sweep (every 60s) resets jobs stuck in claimed for longer than claim_timeout (default 5 minutes) back to pending. |
| Idempotency | If idempotency_key is set, a second INSERT with the same key is a no-op (ON CONFLICT DO NOTHING). |
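The retry arithmetic above is small enough to sketch directly. A minimal TypeScript sketch (helper names are illustrative, not part of the platform API):

```typescript
// Illustrative helpers for the retry rules above (names are not platform API).
const BASE_INTERVAL_MS = 5_000; // default base_interval = 5 seconds

/** Next run_at for a failed job: now + 2^attempt * base_interval. */
export function nextRunAt(attempt: number, now: Date = new Date()): Date {
  return new Date(now.getTime() + 2 ** attempt * BASE_INTERVAL_MS);
}

/** After max_attempts is exhausted, the job is dead-lettered, never auto-retried. */
export function isDeadLettered(attempt: number, maxAttempts = 5): boolean {
  return attempt >= maxAttempts;
}
```

A job failing on its first claimed attempt (attempt = 1) is retried after 10 seconds; by attempt 4 the delay has grown to 80 seconds.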

Notification

After inserting a job, the writer issues:

NOTIFY job_queue, '<queue_name>';

Workers LISTEN job_queue and wake immediately. The poll loop (interval: 1s) is the fallback if a notification is missed.
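The wake/claim/drain cycle can be sketched independently of the database. In this hedged sketch, claim stands in for the atomic UPDATE ... FOR UPDATE SKIP LOCKED statement; the same drain function would be invoked on a NOTIFY wake and on each 1-second poll tick:

```typescript
// Sketch of the worker drain loop; `claim` stands in for the atomic
// UPDATE ... FOR UPDATE SKIP LOCKED statement (resolves to null when the queue is empty).
export interface ClaimedJob { id: string; queue: string; payload: unknown }

export async function drainQueue(
  queue: string,
  claim: (queue: string) => Promise<ClaimedJob | null>,
  handle: (job: ClaimedJob) => Promise<void>,
): Promise<number> {
  let processed = 0;
  // Keep claiming until the claim statement returns no row.
  for (let job = await claim(queue); job !== null; job = await claim(queue)) {
    await handle(job);
    processed += 1;
  }
  return processed;
}
```

Because claiming is atomic at the database, running this loop concurrently in many workers is safe by construction.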

BullMQ upgrade path

When a module opts into BullMQ (approved per-module, documented in the module's README.md), the @constellation-platform/jobs adapter routes that queue to Redis instead of Postgres. The JobQueue interface is identical — callers do not change. The job_queue table is not used for BullMQ-backed queues.

A.2 Workflow Row Schema and State Transitions

All durable workflows (agent tasks, ingest pipelines, enrichment runs, bulk operations) share a single workflow_runs table. Each workflow type defines its own Zod-validated state shape.

Table schema

CREATE TABLE workflow_runs (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id uuid NOT NULL,
  actor_id text,               -- user/agent that started the workflow; nullable for system-triggered workflows
  correlation_id text,         -- propagated from originating request; nullable for scheduled workflows
  workflow_type text NOT NULL, -- e.g. 'ingest_pipeline', 'enrichment', 'agent_task'
  status text NOT NULL DEFAULT 'pending'
    CHECK (status IN ('pending','running','waiting_approval','completed','failed','cancelled')),
  state jsonb NOT NULL DEFAULT '{}', -- Zod-validated per workflow_type
  input jsonb NOT NULL,        -- immutable; the original request
  output jsonb,                -- set on completion
  error text,                  -- set on failure
  started_at timestamptz,
  completed_at timestamptz,
  updated_at timestamptz NOT NULL DEFAULT now(),
  created_at timestamptz NOT NULL DEFAULT now(),
  parent_id uuid REFERENCES workflow_runs(id), -- for sub-workflows
  trace_id text                -- OpenTelemetry trace correlation
);

CREATE INDEX idx_workflow_runs_active ON workflow_runs (tenant_id, status)
  WHERE status IN ('pending','running','waiting_approval');

State machine contract

Each workflow type must provide:

interface WorkflowDefinition<
  TState extends z.ZodType,
  TInput extends z.ZodType,
  TOutput extends z.ZodType,
> {
  type: string;        // matches workflow_type column
  stateSchema: TState; // Zod schema for the state column
  inputSchema: TInput;
  outputSchema: TOutput;
  initialState: (input: z.infer<TInput>) => z.infer<TState>;
  transitions: WorkflowTransition<TState>[]; // ordered list of named steps
}

interface WorkflowTransition<TState extends z.ZodType> {
  name: string;
  from: string[]; // allowed status values to enter this transition
  execute: (state: z.infer<TState>, ctx: WorkflowContext) => Promise<TransitionResult<TState>>;
}

type TransitionResult<TState extends z.ZodType> =
  | { action: 'continue'; state: z.infer<TState> }
  | { action: 'wait_approval'; state: z.infer<TState>; approvalRequest: ApprovalRequest }
  | { action: 'complete'; output: unknown }
  | { action: 'fail'; error: string };

Status lifecycle

pending → running → completed
                  → failed
                  → waiting_approval → running    (after approval granted)
                                     → cancelled  (after approval denied or timeout)
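The lifecycle can be encoded as a small pure guard. A sketch whose transition map mirrors the diagram above (terminal statuses have no outgoing edges):

```typescript
// Allowed status transitions, mirroring the lifecycle diagram above.
// completed / failed / cancelled are terminal: no outgoing edges.
const ALLOWED_TRANSITIONS: Record<string, readonly string[]> = {
  pending: ['running'],
  running: ['completed', 'failed', 'waiting_approval'],
  waiting_approval: ['running', 'cancelled'],
  completed: [],
  failed: [],
  cancelled: [],
};

export function canTransition(from: string, to: string): boolean {
  return ALLOWED_TRANSITIONS[from]?.includes(to) ?? false;
}
```

The workflow runner would consult this guard before attempting the optimistic UPDATE, so illegal transitions fail fast in-process rather than at the database.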

Transitions are always single-row UPDATEs with an optimistic concurrency check:

UPDATE workflow_runs
SET status = $1,
    state = $2,
    updated_at = now()
WHERE id = $3
  AND status = $4 -- expected current status
RETURNING *;

If the UPDATE returns zero rows, the transition is rejected (concurrent modification). The caller retries from a fresh read.
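The retry-from-fresh-read rule can be sketched with the UPDATE injected as a function, so the concurrency logic is visible without a database (function names are illustrative):

```typescript
// Sketch of "retry from a fresh read" on optimistic-concurrency conflict.
// `tryUpdate` stands in for the single-row UPDATE ... WHERE status = $expected
// statement and resolves to the number of rows affected (0 or 1).
export async function transitionWithRetry(
  readStatus: () => Promise<string>,
  tryUpdate: (expectedStatus: string) => Promise<number>,
  maxRetries = 3,
): Promise<boolean> {
  for (let i = 0; i < maxRetries; i++) {
    const expected = await readStatus(); // fresh read on every attempt
    if ((await tryUpdate(expected)) === 1) return true; // transition applied
    // 0 rows affected: concurrent modification; loop back and re-read
  }
  return false; // persistent contention; caller surfaces an error
}
```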

A.3 Raw SQL Policy

Prisma is the default data access layer. Raw SQL is allowed only for platform primitives where Prisma cannot express the operation correctly or efficiently.

Permitted raw SQL

| Use case | Reason | Owner |
| --- | --- | --- |
| Session-level RLS context (SET LOCAL app.tenant_id = $1) | Prisma has no session-variable API; must be raw parameterized SQL inside a transaction | @constellation-platform/db |
| Job claim (FOR UPDATE SKIP LOCKED) | Prisma does not support SKIP LOCKED | @constellation-platform/jobs |
| Workflow transition (optimistic UPDATE ... WHERE status = $expected) | Must be a single atomic statement, not read-then-write | @constellation-platform/jobs |
| RLS policy setup (ALTER TABLE ... ENABLE ROW LEVEL SECURITY, CREATE POLICY) | DDL, not data access — migration-time only (see rule 4) | @constellation-platform/db |
| Extension setup (CREATE EXTENSION, pg_cron schedule management) | Extension DDL — migration-time only (see rule 4) | @constellation-platform/db, @constellation-platform/jobs |
| LISTEN / NOTIFY | Prisma does not expose Postgres channels | @constellation-platform/events |
| Recursive ltree queries (@>, <@, lquery) | Prisma does not support ltree operators natively | modules/taxonomy/repository.ts |
| Hybrid search ranking (ts_rank + pgvector distance in one query) | Prisma cannot compose full-text and vector scoring | modules/search/repository.ts |

Rules

  1. Raw SQL lives in the repository or platform infrastructure layer only (modules/*/repository.ts or @constellation-platform/*) — never in services, tools, or workflows.
  2. Every raw SQL call must be wrapped in a typed function with Zod-validated inputs and outputs.
  3. Raw SQL must be annotated with a comment referencing this appendix section: // Raw SQL: see Architecture v3, Appendix A.3.
  4. Migration files use prisma migrate for schema changes. Exception: RLS policy DDL (ENABLE ROW LEVEL SECURITY, CREATE POLICY), extension DDL (CREATE EXTENSION), and pg_cron schedules cannot be expressed through Prisma's schema — these are the only raw SQL permitted in migration files, and each must reference this appendix.
  5. If Prisma adds support for a currently-raw operation, the raw SQL must be replaced in the next cleanup cycle.
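A sketch of rule 2 for the ltree case. Hand-rolled row validation stands in for Zod so the sketch has no dependencies, and execRaw is a stand-in for the $queryRaw tagged-template call that would live in modules/taxonomy/repository.ts:

```typescript
// Raw SQL: see Architecture v3, Appendix A.3 (recursive ltree descendant query).
// Sketch only: `execRaw` stands in for Prisma's $queryRaw tagged-template call.
export interface TaxonomyNode { id: string; path: string }

export async function findDescendants(
  execRaw: (parentPath: string) => Promise<unknown[]>,
  parentPath: string,
): Promise<TaxonomyNode[]> {
  const rows = await execRaw(parentPath);
  // Validate the untyped raw result before it leaves the repository layer
  // (the real implementation would use a Zod schema here, per rule 2).
  return rows.map((raw) => {
    const row = raw as Record<string, unknown>;
    if (typeof row.id !== 'string' || typeof row.path !== 'string') {
      throw new Error('findDescendants: unexpected row shape from raw SQL');
    }
    return { id: row.id, path: row.path };
  });
}
```

The point of the wrapper is the boundary: callers see a typed, validated function; the raw statement never escapes the repository.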

A.4 Composition Root

Both apps/api/ and apps/agents/ share a single composition root that wires all dependencies. This avoids duplicate initialization, inconsistent config, and drift between the two processes.

Structure

// packages/platform/runtime/composition-root.ts

export interface AppContext {
  // Config
  config: AppConfig;                    // validated with Zod at startup

  // Database
  prisma: PrismaClient;                 // single instance, RLS-aware

  // Platform services
  jobQueue: JobQueue;                   // Postgres-backed (or BullMQ per-queue override)
  eventBus: EventBus;                   // domain events → outbox → LISTEN/NOTIFY

  // Module registries
  toolRegistry: ToolRegistry;           // all module tools, keyed by name
  workflowRegistry: WorkflowRegistry;   // all workflow definitions, keyed by type

  // Cross-cutting
  tenantContext: TenantContextProvider; // extracts tenant_id from JWT / request
  logger: Logger;                       // structured, OpenTelemetry-correlated
  tracer: Tracer;                       // OpenTelemetry tracer
}

export function createAppContext(overrides?: Partial<AppContext>): Promise<AppContext>;

Wiring rules

  1. createAppContext() is called exactly once per process — at the top of apps/api/main.ts and apps/agents/main.ts.
  2. Module registration is declarative. Each module exports a register(ctx: AppContext) function that registers its tools, workflows, and event handlers. No module reaches into another module's internals.
  3. apps/api/ calls createAppContext() then boots the NestJS HTTP server. It registers all module tools (for the REST-facing tool endpoints) but does not start the workflow runner or job workers.
  4. apps/agents/ calls createAppContext() then starts the workflow runner, job workers, and MCP server. It registers the same tools (for agent use) and additionally starts the orchestrator and specialist agents.
  5. Overrides for testing. createAppContext({ prisma: testPrismaClient, jobQueue: inMemoryQueue }) replaces real dependencies with test doubles. Every integration test uses this — no mocking of internal imports.
  6. No NestJS module injection for platform primitives. AppContext is a plain object, not a NestJS provider tree. NestJS controllers receive AppContext via a single provider binding. This keeps platform code framework-independent and usable by apps/agents/ (which is not a NestJS app).
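A sketch of rule 2, with the registry shapes reduced to plain Maps (the real ToolRegistry and WorkflowRegistry types live in packages/platform/runtime/; the tool name and handler here are illustrative):

```typescript
// Minimal stand-ins for the real registries (illustrative shapes only).
export interface MiniContext {
  toolRegistry: Map<string, (input: unknown) => Promise<unknown>>;
  workflowRegistry: Map<string, { type: string }>;
}

// Each module exports exactly one declarative register() function.
export function registerCatalogModule(ctx: MiniContext): void {
  ctx.toolRegistry.set('catalog.publish_product', async (input) => {
    // ...delegates to this module's own service layer, never another module's internals
    return { accepted: true, input };
  });
  ctx.workflowRegistry.set('enrichment', { type: 'enrichment' });
}
```

Both apps/api/ and apps/agents/ would call the same register() functions against the shared AppContext, which is what keeps the two processes from drifting.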

A.5 Tenant Context Propagation

tenant_id must survive across all async boundaries — HTTP requests, job execution, workflow transitions, and event handlers — without requiring callers to pass it manually through every function signature.

Mechanism

The platform uses Node.js AsyncLocalStorage as the canonical tenant context carrier.

// packages/platform/runtime/tenant-context.ts

import { AsyncLocalStorage } from 'node:async_hooks';

interface TenantContext {
  tenantId: string;
  actorId: string;
  correlationId: string;
}

export const tenantStore = new AsyncLocalStorage<TenantContext>();

export function getCurrentTenant(): TenantContext {
  const ctx = tenantStore.getStore();
  if (!ctx) {
    throw new Error('Tenant context not set — are you outside a request/job/workflow scope?');
  }
  return ctx;
}

Context restoration rules

When restoring context from a persisted row, actorId and correlationId may be null (e.g. for cron-triggered jobs or system-initiated workflows). The restoration code must handle this:

  • tenantId — always present; read from the row's tenant_id column. Required.
  • actorId — read from the row's actor_id column if present; falls back to 'system' if null.
  • correlationId — read from the row's correlation_id column if present; a new ID is generated if null.

| Entry point | How tenant context is set |
| --- | --- |
| HTTP request (apps/api) | NestJS middleware extracts tenant_id, sub (actor), and the correlation header from the JWT/request and enters tenantStore.run() before the controller executes. All three fields are available. |
| Job claim (apps/agents) | After FOR UPDATE SKIP LOCKED returns a job row, the worker reads tenant_id, actor_id, and correlation_id from the row and enters tenantStore.run(). actor_id and correlation_id may be null (see fallback rules above). |
| Workflow transition (apps/agents) | After loading the workflow_runs row, the runner reads tenant_id, actor_id, and correlation_id from the row and enters tenantStore.run(). Same fallback rules apply. |
| Event handler | The outbox consumer reads tenant_id, actor_id, and correlation_id from the event payload and enters tenantStore.run(). |
| MCP tool invocation | The MCP server reads tenant_id from the session/request metadata and enters tenantStore.run(). Actor is the agent identity; correlation ID comes from the session trace. |
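The restoration rules reduce to one small function. A sketch (randomUUID stands in for whatever correlation-ID generator the platform standardizes on):

```typescript
import { randomUUID } from 'node:crypto';

// Persisted rows (job_queue, workflow_runs, outbox events) carry context as columns.
export interface PersistedContextRow {
  tenant_id: string;
  actor_id: string | null;
  correlation_id: string | null;
}

export interface RestoredContext { tenantId: string; actorId: string; correlationId: string }

export function restoreContext(row: PersistedContextRow): RestoredContext {
  return {
    tenantId: row.tenant_id,                           // required; never null
    actorId: row.actor_id ?? 'system',                 // fallback for system-initiated work
    correlationId: row.correlation_id ?? randomUUID(), // generate a fresh ID when absent
  };
}
```

Every non-HTTP entry point in the table above would call this before entering tenantStore.run(), so the fallback rules live in exactly one place.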

Prisma integration

The Prisma client auto-attaches tenant_id to the database session for RLS enforcement. The implementation must satisfy two invariants:

  1. Parameterized SQL only — never interpolate tenantId into a string. Use $executeRaw with tagged template literals (see A.3 whitelist).
  2. Same session guarantee — the SET LOCAL and the subsequent query must execute within the same database transaction/session, otherwise the RLS variable is not visible to the query.

Note: The snippet below is conceptual pseudocode illustrating the required architecture constraints. The exact Prisma $extends / client-extension API may differ at implementation time. What matters is that the two invariants above are satisfied. The concrete implementation belongs in packages/platform/db/ and must be validated against the Prisma version in use.

// CONCEPTUAL — packages/platform/db/prisma-tenant.ts
// Validates against: Prisma client extensions API (verify exact signatures at implementation time)

export function createTenantAwarePrisma(basePrisma: PrismaClient): PrismaClient {
  return basePrisma.$extends({
    query: {
      async $allOperations({ args, query, model, operation }) {
        const { tenantId } = getCurrentTenant();
        // Invariant 2: $transaction ensures SET LOCAL and query share the same PG session.
        // SET LOCAL scopes the variable to the current transaction only —
        // it is automatically reset when the transaction commits or rolls back.
        return basePrisma.$transaction(async (tx) => {
          // Invariant 1: tagged template — parameterized, not interpolated.
          // Raw SQL: see Architecture v3, Appendix A.3 (session-level RLS context)
          await tx.$executeRaw`SET LOCAL app.tenant_id = ${tenantId}::text`;
          // Route the original query through tx, NOT through basePrisma,
          // to guarantee it sees the SET LOCAL variable.
          // (Exact dispatch mechanism depends on Prisma extension API version.)
          return /* dispatch original operation through tx */;
        });
      },
    },
  });
}

Key constraints the implementation must honour (regardless of exact Prisma API shape):

  • $executeRaw tagged template, not $executeRawUnsafe: Prisma's tagged template $executeRaw parameterizes values automatically. $executeRawUnsafe accepts a raw string and is subject to SQL injection if callers ever interpolate user input. The implementation must use the tagged-template form.
  • Query routed through transactional client: The query following SET LOCAL must execute on the same transactional connection (tx), not through the base Prisma client. If Prisma's $extends callback provides a query() function that dispatches through the base client, it must not be used — find the equivalent that routes through tx.

Invariant: runtime context vs serialized state

AsyncLocalStorage is the canonical in-process runtime carrier for tenant context. However, tenant_id also appears as persisted data in rows that cross process boundaries. These are distinct concerns:

Runtime context (in-process):

Within a running request, job handler, workflow transition, or tool execution, services read tenant context from AsyncLocalStorage — never from function parameters. This prevents:

  • accidental tenant ID mismatch between caller and callee
  • RLS bypass when a worker forgets to set the session variable
  • proliferation of tenantId through every function signature

Serialized state (persisted):

Rows in job_queue, workflow_runs, outbox_events, and domain event envelopes must carry tenant_id (and optionally actor_id, correlation_id) as explicit columns. This is data, not runtime context — it is how context is reconstructed after serialization, process restart, or delayed execution. Boundary DTOs that cross process or network boundaries (e.g. webhook payloads, MCP session metadata) also carry tenant_id as data.

The rule: No in-process business logic (service method, tool handler, policy check, workflow transition function) should accept tenantId as a function parameter. These read from AsyncLocalStorage. Persistence layers, serialization boundaries, and entry-point bootstrapping code are the only places that read and write tenant_id as explicit data.
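The split can be seen end to end in a few lines. In this sketch, buildJobRow is in-process business logic (no tenantId parameter; it reads AsyncLocalStorage), while the row it returns is the serialized form with explicit columns; the entry-point values are illustrative:

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

interface TenantContext { tenantId: string; actorId: string; correlationId: string }
const tenantStore = new AsyncLocalStorage<TenantContext>();

// In-process business logic: reads context from AsyncLocalStorage, never a parameter.
function buildJobRow(queue: string, payload: object) {
  const ctx = tenantStore.getStore();
  if (!ctx) throw new Error('Tenant context not set');
  // Serialization boundary: context is written out as explicit columns.
  return {
    queue,
    payload,
    tenant_id: ctx.tenantId,
    actor_id: ctx.actorId,
    correlation_id: ctx.correlationId,
  };
}

// Entry-point bootstrapping is the only place that establishes context.
export function enqueueWithinRequest() {
  return tenantStore.run(
    { tenantId: 't-1', actorId: 'user-9', correlationId: 'corr-42' },
    () => buildJobRow('embeddings', { productId: 'p-1' }),
  );
}
```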