Agent Workflow Guide
How to build and evolve a Dazzle application using an AI agent: spec change → agent edits DSL → validate → tests → human review → deploy.
1. Why a DSL Is Agent-Friendly
The central premise of Dazzle (formalised in ADR-0004) is that AI agents are the primary authors of DSL, and human developers are primarily reviewers. This shapes every design decision in the grammar.
The spec stays small. A complete production application — entities, access
control, surfaces, workflows, events, multi-tenant scoping — lives in a handful
of .dsl files. An agent holds the full specification in context. It never has
to infer intent from a sprawling implicit codebase, because there is no implicit
codebase.
The grammar is constrained by design. Dazzle's DSL is deliberately
anti-Turing: no control flow, no function definitions, no procedural
shortcuts. The --anti-turing flag on dazzle lint enforces this mechanically.
A constrained grammar means fewer plausible-but-wrong edits. When an agent
cannot express an idea in the DSL, that is useful signal — not a failing of the
framework, but a prompt to consider whether the idea belongs in the spec at all
or whether a service block is the right vehicle.
Validation is fast and structured. dazzle validate runs in under a second
and returns structured errors keyed to the specific construct that failed — not
a stack trace from a runtime that had to boot first. An agent can iterate on DSL
edits in a tight loop: edit → validate → read error → edit again. No server
restarts, no migrations, no test fixtures required at this stage.
Scope rules are statically verified. Row-level access control (scope:) is
compiled to a formal predicate algebra and validated against the FK graph at
dazzle validate time. An agent that writes an invalid scope rule (for example,
a field path that doesn't exist in the FK chain) gets a precise error before
anything runs. This makes access-control errors a class of problem the loop
catches early rather than discovering in production.
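The FK-path check described above can be pictured with a small sketch. This is not Dazzle's implementation: the graph shape, the function name, and the entity and field names are all hypothetical, chosen only to show what "a field path that doesn't exist in the FK chain" means in practice.

```python
from typing import Dict, List

# Hypothetical FK graph: entity name -> {FK field name -> target entity name}.
FkGraph = Dict[str, Dict[str, str]]


class ScopePathError(ValueError):
    """Raised when a scope path steps outside the FK graph."""


def resolve_scope_path(graph: FkGraph, start: str, path: List[str]) -> str:
    """Walk `path` hop by hop from `start` and return the entity reached."""
    entity = start
    for hop in path:
        fields = graph.get(entity)
        if fields is None:
            raise ScopePathError(f"unknown entity '{entity}'")
        if hop not in fields:
            raise ScopePathError(
                f"entity '{entity}' has no FK field '{hop}' "
                f"(known: {sorted(fields)})"
            )
        entity = fields[hop]
    return entity


# Illustrative graph only; these names are not taken from a real spec.
graph: FkGraph = {
    "LineItem": {"invoice": "Invoice"},
    "Invoice": {"supplier": "Supplier"},
    "Supplier": {},
}

# A valid two-hop path resolves to the entity at the end of the chain.
reached = resolve_scope_path(graph, "LineItem", ["invoice", "supplier"])
```

A path such as `["supplier"]` starting at `LineItem` raises `ScopePathError`, which is the kind of precise, pre-runtime failure the paragraph describes.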
The ROADMAP describes the growth model that emerges from this design: agents build applications within the existing DSL vocabulary, encounter friction at the grammar boundary when they need something the DSL cannot yet express, and produce structured friction reports. That feedback loop is how the framework evolves without becoming a general-purpose language.
2. The Loop
The core workflow is a tight cycle between the developer's intent, the agent's DSL edits, and a deterministic validation + test gate before a human reviews and deploys.
┌─────────────────────────────────────────────────────────────────┐
│ │
│ Requirement Agent edits dazzle validate │
│ change ───▶ DSL files ───▶ dazzle lint ──┐ │
│ │ │
│ ◀──────────────────────────────────────────── ┘ │
│ (fix and retry on failure) │
│ │
│ │ passes │
│ ▼ │
│ dazzle rbac matrix Tests pass? dazzle test dsl-run ──┐ │
│ dazzle rbac verify ──▶ suite ──▶ dazzle e2e run │ │
│ │ │
│ ◀──────────────────────────────────────────── ┘ │
│ (fix and retry on failure) │
│ │
│ │ passes │
│ ▼ │
│ Human review │
│ (RBAC diff, migrations, │
│ friction findings) │
│ │
│ │ approved │
│ ▼ │
│ dazzle db upgrade → dazzle serve │
│ │
└─────────────────────────────────────────────────────────────────┘
Requirement change
The developer states what changed — a new entity, a new access rule, a workflow step, a migration — as a natural-language requirement to the agent. The agent does not improvise; it works from what the developer says. If the requirement is ambiguous, the loop works best when the agent clarifies before editing rather than after.
Agent edits the DSL
The agent edits .dsl files directly: entities, surfaces, personas, workflow
definitions, event models, scope rules. All spec-level intent lives here. The
framework generates the runtime implementation from the DSL; the agent does not
write backend code, migration scripts (in the common case), or frontend
templates.
The files the agent typically touches:
| What changed | File |
|---|---|
| New entity / field change | entities.dsl or equivalent module |
| Access rule or persona | personas.dsl / policies.dsl |
| New surface or mode change | surfaces.dsl |
| Workflow or process step | workflow.dsl |
| Event model or projection | events.dsl |
Validate
After each edit round, the agent runs dazzle validate.
This parses all DSL modules, resolves cross-module dependencies, and validates
the merged AppSpec — including FK-graph checking of scope predicates. It
operates in the project directory (where dazzle.toml lives) and produces
human-readable errors by default.
A real error looks like:
ERROR: Entity 'Invoice' field 'supplier_id' — FK target 'Supplier' not found.
Did you mean 'SupplierContact'? (module: entities.dsl, line 14)
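An agent harness may want the error as fields rather than text. The sketch below parses the one message shape shown above; it assumes other errors follow the same layout, which this guide does not confirm, and the function name is hypothetical.

```python
import re
from typing import Dict, Optional

# Patterns are based only on the single example error shown in this guide.
_CONSTRUCT = re.compile(r"Entity '(?P<entity>[^']+)' field '(?P<field>[^']+)'")
_LOCATION = re.compile(r"\(module: (?P<module>[^,]+), line (?P<line>\d+)\)")
_SUGGESTION = re.compile(r"Did you mean '(?P<suggestion>[^']+)'\?")


def parse_validate_error(text: str) -> Optional[Dict[str, object]]:
    """Extract construct, location and suggestion; None if the shape differs."""
    construct = _CONSTRUCT.search(text)
    location = _LOCATION.search(text)
    if construct is None or location is None:
        return None
    suggestion = _SUGGESTION.search(text)
    return {
        "entity": construct["entity"],
        "field": construct["field"],
        "module": location["module"],
        "line": int(location["line"]),
        "suggestion": suggestion["suggestion"] if suggestion else None,
    }


# The example error from this guide (\u2014 is the dash in the message).
sample = (
    "ERROR: Entity 'Invoice' field 'supplier_id' \u2014 "
    "FK target 'Supplier' not found.\n"
    "Did you mean 'SupplierContact'? (module: entities.dsl, line 14)"
)
parsed = parse_validate_error(sample)
```

Returning `None` for unrecognised text keeps the helper from guessing when the message shape differs.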
The MCP dsl tool wraps the same validation for in-context use without leaving
the agent loop, for example dsl { "operation": "validate" }.
Verified operations on the dsl MCP tool: validate, list_modules,
inspect_entity, inspect_surface, analyze, lint, get_spec, fidelity,
list_fragments, export_frontend_spec. See section 3
for how to point an MCP client at the server.
For a more thorough pass (scope-warning completeness, anti-Turing compliance, coverage checks), run dazzle lint.
The agent iterates on validate + lint until both pass cleanly before moving to tests.
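The validate-then-lint gate can be scripted in a few lines. This is a minimal sketch, assuming both commands signal failure with a non-zero exit code (not stated in this guide); the function names are hypothetical, and the runner is injectable so the gate logic can be exercised without Dazzle installed.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

Runner = Callable[[Sequence[str]], int]

# The two commands named in this section, in the order they are run.
GATE: List[List[str]] = [["dazzle", "validate"], ["dazzle", "lint"]]


def run_command(argv: Sequence[str]) -> int:
    """Run one command in the current directory and return its exit code."""
    return subprocess.run(list(argv)).returncode


def run_gate(
    commands: Sequence[Sequence[str]] = GATE,
    runner: Runner = run_command,
) -> Tuple[bool, List[str]]:
    """Run commands in order, stopping at the first non-zero exit code.

    Returns (passed, commands that were run). Assumes a non-zero exit
    code means the check failed.
    """
    ran: List[str] = []
    for argv in commands:
        ran.append(" ".join(argv))
        if runner(argv) != 0:
            return False, ran
    return True, ran


# Usage from the project root (where dazzle.toml lives):
#   passed, ran = run_gate()
```

Stopping at the first failure keeps the loop tight: lint output is noise until the spec parses.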
Tests
Once validation is clean, the agent runs the test layers in order of cost:
Tier 1 — RBAC matrix + API tests (fast, no browser):
dazzle rbac matrix # generate static access matrix from DSL
dazzle rbac verify # run dynamic verification against in-process app
dazzle test dsl-run # API-based tests derived from stories
dazzle rbac matrix is entirely static — no server required. It derives the
access matrix from the DSL and writes it to a file the human reviewer will diff.
dazzle rbac verify (Layer 2) boots an in-process instance and exercises the
matrix against live HTTP responses.
What is auto-derived from the DSL: the RBAC matrix, Tier 1 API test flows from
story and test-design definitions, schema tests, scope predicate tests.
What must still be hand-authored: adversarial cross-tenant isolation tests,
business-logic edge cases that depend on runtime state, integration tests for
external services. The examples/invoice_ops test suite (see section 4)
includes hand-authored adversarial tests alongside the auto-derived ones — the
combination is what caught the cross-tenant leak.
Tier 2 — scripted UI tests:
dazzle test dsl-run # Tier 1 API — no browser
dazzle test run # Tier 2 Playwright scripted UI
dazzle test run-all # all tiers
Tier 3 — E2E with UX coverage tracking:
Tier 3 tests require a running app instance and are slower; they are run selectively (after larger structural changes) rather than on every DSL edit.
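One way to encode "run the tiers in order of cost" is a small planning function. The selection policy below is a hypothetical illustration, not a rule from this guide, and the Tier 3 command is taken from the loop diagram rather than from a documented invocation, so treat it as an assumption.

```python
from typing import List

# Commands as they appear in this section.
TIER_1 = ["dazzle rbac matrix", "dazzle rbac verify", "dazzle test dsl-run"]
TIER_2 = ["dazzle test run"]
# Assumption: the Tier 3 command is the one shown in the loop diagram.
TIER_3 = ["dazzle e2e run"]


def plan_test_commands(ui_changed: bool, structural_change: bool) -> List[str]:
    """Return the commands to run, cheapest tier first.

    Hypothetical policy: Tier 1 always runs; Tier 2 runs when surfaces
    changed or the change is structural; Tier 3 runs only for larger
    structural changes.
    """
    commands = list(TIER_1)
    if ui_changed or structural_change:
        commands += TIER_2
    if structural_change:
        commands += TIER_3
    return commands


small_edit = plan_test_commands(ui_changed=False, structural_change=False)
big_edit = plan_test_commands(ui_changed=False, structural_change=True)
```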
Human review
The review gate is the backstop for anything the mechanical checks cannot assess. A reviewer inspects:
- RBAC matrix diff: the dazzle rbac matrix output between the old and new spec. New ALLOW entries on sensitive entities need explicit sign-off.
- Migration review: the generated Alembic migration file under .dazzle/migrations/versions/ for any schema change. Destructive migrations (column drops, renames, type changes) must be hand-edited in that file before they are applied; see migrations guide.
- Friction findings: anything the agent logged as uncertain, or that dazzle lint warnings flagged but didn't block on.
- Business-logic correctness: the loop verifies structural consistency, not domain correctness. A reviewer who understands the domain is the check.
The existence of this gate is not an apology for the loop. It is the design. See section 6 for what the loop does and does not guarantee.
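The RBAC matrix diff can be reduced to "which cells became ALLOW". This guide does not specify the on-disk format of the matrix, so the sketch below assumes it has already been loaded into a dictionary keyed by (persona, entity, operation); the persona, entity, and operation names are illustrative.

```python
from typing import Dict, List, Tuple

Cell = Tuple[str, str, str]   # (persona, entity, operation), assumed shape
Matrix = Dict[Cell, str]      # value is "ALLOW" or "DENY"


def new_allows(old: Matrix, new: Matrix) -> List[Cell]:
    """Cells that are ALLOW in `new` but were not ALLOW (or absent) in `old`."""
    return sorted(
        cell
        for cell, decision in new.items()
        if decision == "ALLOW" and old.get(cell) != "ALLOW"
    )


# Illustrative matrices only.
old_matrix: Matrix = {
    ("clerk", "Invoice", "read"): "ALLOW",
    ("clerk", "Invoice", "approve"): "DENY",
}
new_matrix: Matrix = {
    ("clerk", "Invoice", "read"): "ALLOW",
    ("clerk", "Invoice", "approve"): "ALLOW",
    ("auditor", "Invoice", "read"): "ALLOW",
}

needs_sign_off = new_allows(old_matrix, new_matrix)
```

Cells that are absent from the old matrix count as new, so a freshly added persona shows up in the sign-off list.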
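Deploy
Once the human approves, apply pending migrations with dazzle db upgrade and start the app with dazzle serve, as shown in the loop diagram.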
See deployment reference and the Heroku guide for environment-specific deployment instructions. This guide does not re-document deployment.
3. MCP + Claude Code Setup
The Dazzle MCP server makes the dsl tool and the full knowledge-graph surface
available to any MCP-compatible AI agent without context-window tricks. The
agent calls tools directly rather than shelling out.
Starting the MCP server
The server is stateless per request. It must be started from (or pointed at)
the project root where dazzle.toml lives.
Registering with Claude Code
This registers the server in your Claude Code MCP configuration so the Dazzle
tools are available in all sessions automatically. Pass --force to overwrite
an existing registration.
For other MCP-compatible agents, add a server entry pointing to:
using whichever client configuration format your agent host expects.
The dsl tool in the agent loop
The dsl MCP tool is the agent's primary introspection surface during the
edit-validate cycle. Verified operations (from dazzle inspect api mcp-tools):
| Operation | What it does |
|---|---|
| validate | Parse and validate the full DSL (same logic as dazzle validate) |
| lint | Extended checks; pass "extended": true for all warning classes |
| inspect_entity | Full field/scope/permit details for one entity by name |
| inspect_surface | Surface definition and fragment inventory |
| analyze | Cross-cutting analysis of the AppSpec (entity relationships, coverage gaps) |
| get_spec | Full or filtered AppSpec summary; filter by entity or surface names |
| list_modules | List all DSL modules the parser resolved |
| fidelity | Per-surface fidelity score; "gaps_only": true to filter |
| list_fragments | List fragments available for a surface |
| export_frontend_spec | Export spec as TypeScript interfaces, route map, component inventory, etc. |
Typical agent loop for a DSL edit cycle in Claude Code:
- dsl { "operation": "validate" }: confirm the edit is valid before testing.
- dsl { "operation": "inspect_entity", "name": "Invoice" }: verify field names and types before referencing them in scope rules.
- dsl { "operation": "fidelity", "gaps_only": true }: check whether new surfaces have coverage gaps.
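For hosts other than Claude Code, it can help to see what such a call looks like on the wire. The sketch below wraps the arguments shown above in a standard MCP tools/call JSON-RPC request. It assumes the tool is registered under the name dsl and takes its arguments flat, as the examples above suggest; in practice an MCP client library builds this envelope, and the helper name is hypothetical.

```python
import json
from typing import Any


def dsl_call(request_id: int, operation: str, **arguments: Any) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request for the `dsl` tool."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            # Assumption: the tool name is "dsl", as used in this guide.
            "name": "dsl",
            "arguments": {"operation": operation, **arguments},
        },
    }
    return json.dumps(payload)


# The three steps of the edit cycle listed above.
requests = [
    dsl_call(1, "validate"),
    dsl_call(2, "inspect_entity", name="Invoice"),
    dsl_call(3, "fidelity", gaps_only=True),
]
```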
The MCP server also exposes graph, knowledge, policy, sentinel, and
other tools for deeper introspection — run dazzle inspect api mcp-tools for
the full list.
4. Worked Example: How invoice_ops Was Built and Evolved
examples/invoice_ops is a production-grade accounts-payable system — invoices,
line items, supplier bank accounts, multi-step maker-checker approval, HLESS
event model, shared-schema tenancy, and a full RBAC matrix with four personas.
It was built entirely by this agent loop (SP1, v0.71.103) and then evolved
through six successive schema and DSL changes (SP2, v0.71.104) to exercise the
migration workflow.
SP1: Building the app
The keystone build started from a blank scaffold and layered the spec in
discrete commits, each passing dazzle validate and the test suite before the
next:
| SHA | What the agent shipped |
|---|---|
| c9223960 | Initial data + access model: entities, fields, scopes, personas |
| 4f9eaa14 | Shared-schema tenancy declaration |
| 2949db2a | Maker-checker approval gates |
| aa9fbc2d | HLESS event model + status projection |
| 7d922509 | Surfaces including audit-export view |
| 6d841cb6 | Edit/create surfaces to make the app fully operable |
The RBAC isolation suite was hand-authored (not auto-derived from the DSL) as an adversarial check. It caught a real problem:
| SHA | What the suite found |
|---|---|
| f34c5a5b | admin_personas included tenant_admin, a cross-tenant-visibility leak; removed |
This is the canonical example of why adversarial tests belong in the loop alongside auto-derived ones. The DSL validated cleanly before this fix; the isolation suite found what structural validation could not.
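The shape of such an adversarial check can be shown with a toy model. Everything here is a simplification made for illustration: the sketch assumes personas listed in admin_personas bypass tenant scoping, and platform_admin, the row data, and the function names are hypothetical. It is not the invoice_ops isolation suite.

```python
from typing import Dict, List, Set

Row = Dict[str, str]


def visible_rows(
    rows: List[Row], persona: str, tenant_id: str, admin_personas: Set[str]
) -> List[Row]:
    """Toy scope filter: admin personas see every row, others only their tenant."""
    if persona in admin_personas:
        return list(rows)
    return [row for row in rows if row["tenant_id"] == tenant_id]


def leaks_across_tenants(
    rows: List[Row], persona: str, tenant_id: str, admin_personas: Set[str]
) -> bool:
    """Adversarial check: can this persona see a row from another tenant?"""
    seen = visible_rows(rows, persona, tenant_id, admin_personas)
    return any(row["tenant_id"] != tenant_id for row in seen)


ROWS: List[Row] = [
    {"id": "inv-1", "tenant_id": "tenant-a"},
    {"id": "inv-2", "tenant_id": "tenant-b"},
]

# Before the fix: tenant_admin is (wrongly) treated as an admin persona.
before = leaks_across_tenants(
    ROWS, "tenant_admin", "tenant-a", {"platform_admin", "tenant_admin"}
)
# After the fix: tenant_admin is removed from admin_personas.
after = leaks_across_tenants(ROWS, "tenant_admin", "tenant-a", {"platform_admin"})
```

Both configurations are structurally valid, which is why only a test that asks the adversarial question distinguishes them.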
SP2: Evolving the spec through migrations
Six successive changes exercised every migration class the loop must handle:
| SHA | Change | Migration class |
|---|---|---|
| 2672926d | Add Invoice.po_number | Additive (auto-generated) |
| add48eea | Rename bank_reference field | Rename (hand-edited) |
| 956568a6 | Add partially_paid status | Enum evolution |
| 42af45a7 | Split SupplierBankAccount entity | Entity split + backfill script |
| c560d6a1 | Event-schema retention + new field | Event-model change |
| a79c1067 | Add finance_admin persona | DSL-only, no migration |
The companion artefact the loop produced is docs/reference/migrations.md —
the schema-evolution guide, written from the friction the agent encountered
running through these six changes. It documents which migration classes are safe
to auto-apply and which require hand-editing.
5. Failure Handling
Validate fails
Read the error. dazzle validate names the construct, the module, and (where
possible) the line. Fix the DSL and re-run. Common patterns:
- Unknown entity reference: a scope rule or surface references an entity that does not exist in the merged spec. Check spelling; check that the module containing the entity is listed in dazzle.toml.
- Invalid FK path: a scope predicate traverses a relationship that doesn't exist in the FK graph. Use dsl { "operation": "inspect_entity" } to verify the field chain before writing the scope rule.
- Duplicate surface name: two surfaces share an identifier. Rename one; the error gives both locations.
- Anti-Turing violation: a construct contains control flow (if, for, etc.). Move the logic to a service block or remove it.
Iterate: edit → dazzle validate → read error → edit. Do not move to tests
until validate passes cleanly.
A test fails
First, decide whether the test found a real problem or has a test issue.
A correctly failing adversarial test has found a real bug. The
admin_personas cross-tenant leak (SP1, f34c5a5b) was found this way. The
test was not wrong; the DSL was. Fix the DSL, re-run validate, re-run the test.
A test that fails due to stale fixtures or seed data is a test infrastructure
issue. The framework derives Tier 1 test flows from stories; if story definitions
diverge from the DSL, dazzle test dsl-run will error before running. Fix the
story or test-design definition, then retry.
A migration-related test failure (schema mismatch between what the app boots
with and what the test expects) means the migration sequence is incomplete. Run
dazzle db current to check the revision, then dazzle db upgrade to apply
pending migrations before re-running tests.
The agent makes a wrong call
The human review gate is the backstop. When a reviewer inspects the RBAC matrix
diff and sees an unexpected ALLOW entry — or inspects a migration preview and
sees a destructive operation that should not be there — they stop the loop and
send the requirement back to the agent with a correction. The correction is a
new requirement change, and the loop restarts from the top.
The agent's wrong call is not a failure of the loop design. The loop is designed to surface wrong calls cheaply, at the review gate, before they reach production.
A migration needs hand-editing
Some schema changes cannot be expressed as auto-generated Alembic migrations: column renames, type changes, entity splits with backfill logic. When the agent encounters one of these:
- Generate the migration with dazzle db revision -m "description".
- Open the generated file under .dazzle/migrations/versions/ and add the hand-written SQL or SQLAlchemy operations.
- Review the generated migration file and hand-edit it (rename, split, and type-change migrations need hand-written SQL; see migrations.md), then apply with dazzle db upgrade.
- Run dazzle db verify afterwards to confirm FK integrity.
See migrations.md for the full taxonomy of migration classes and the patterns the SP2 exercise produced.
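An agent can triage a schema change before generating a migration. The sketch below encodes only what this guide states: additive changes are auto-generated, renames, type changes, and entity splits need hand-editing, and DSL-only changes need no migration. The class labels and function name are hypothetical, and anything else is deliberately left unclassified rather than guessed.

```python
# Hypothetical labels for the change classes named in this guide.
HAND_EDIT = {"rename", "type_change", "entity_split"}
AUTO_GENERATED = {"additive"}
NO_MIGRATION = {"dsl_only"}


def review_action(change_class: str) -> str:
    """Map a change class to the handling this guide describes for it."""
    if change_class in NO_MIGRATION:
        return "no migration needed"
    if change_class in HAND_EDIT:
        return "hand-edit the generated migration before dazzle db upgrade"
    if change_class in AUTO_GENERATED:
        return "auto-generated; review, then dazzle db upgrade"
    # Enum and event-model changes are not classified in this guide.
    return "unclassified; consult migrations.md"


actions = {
    name: review_action(name)
    for name in ["additive", "rename", "dsl_only", "enum_evolution"]
}
```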
6. The Verifiability Boundary
The loop provides strong mechanical guarantees. It also has explicit limits. This section is honest about both.
What the loop checks mechanically
| Check | Tool | Guarantee |
|---|---|---|
| DSL is syntactically valid | dazzle validate | Every construct parses; no unknown keywords; cross-module references resolve |
| FK graph is consistent | dazzle validate | Every scope predicate's field path exists in the entity graph |
| Access matrix is correct by construction | dazzle rbac matrix | The static matrix matches the DSL's permit: / scope: / as: declarations |
| Access matrix is enforced at runtime | dazzle rbac verify | HTTP responses match the matrix for every persona / surface pair tested |
| Scope filters restrict data | dazzle rbac verify-scope | Row-level filters fire and are not bypassable via the tested routes |
| Schema migrations apply cleanly | dazzle db upgrade + dazzle db verify | Pending migrations apply without error; FK integrity holds afterwards |
| Anti-Turing compliance | dazzle lint --anti-turing | No control-flow constructs in DSL files |
What still needs human judgment
- Domain correctness. The loop verifies structural consistency, not whether the entities and rules model the right domain concepts. A technically valid DSL can still model the wrong business logic.
- Adversarial test design. The loop auto-derives Tier 1 tests from stories. It does not auto-generate adversarial cross-tenant, privilege-escalation, or state-machine abuse tests. Those must be hand-authored.
- Security claims. The RBAC matrix is a necessary condition for the claims in SECURITY_CLAIMS.md, not a sufficient one. Evaluating those claims requires the full exercise described in EVALUATION.md.
- Destructive migration review. The generated migration file under .dazzle/migrations/versions/ shows what will run. Whether it is correct for the domain (whether a column rename is safe, whether a backfill is complete) requires a reviewer who understands the data.
- External integrations. Service blocks declare contracts with external systems. The loop validates the DSL side of that contract; it cannot validate the external system.
The loop reduces risk. It does not remove the need for human review.
The review gate in the loop is not a formality or a concession to process. It is the point at which domain judgment, adversarial thinking, and accountability enter. An agent that produces a clean validation pass, a passing test suite, and a correct RBAC matrix has done its job well. A human who reviews the result and approves it is doing a different job — one the loop cannot do on their behalf.
Related: deployment reference · Heroku guide · migrations guide · SECURITY_CLAIMS.md · EVALUATION.md · ADR-0004 · ROADMAP · AGENTS.md