33 production workflows across 7 operational domains — designed by AI, audited to a 60-point standard, debugged with production failure patterns
Building n8n workflows by hand doesn't scale. At 5 workflows, you remember the node patterns. At 33, you're guessing at property names, copying from broken examples, and discovering failure modes in production. The standard answer — read the docs, trial and error — produces workflows that are inconsistent in structure and unpredictable in reliability.
The second problem: governance. Workflows that start clean drift. Credentials get hardcoded. Error handling gets skipped. A workflow that ran fine last month silently starts failing because a full API update stripped its credentials — no error, no alert. You find it three sessions later through downstream failures.
The solution was to bring Claude into the build process from the start, and build the audit framework before scaling the workflow count.
n8n-MCP is the core tool. It gives Claude direct access to all 541 n8n node definitions — properties, operations, required fields, configuration options — plus 2,646 real-world workflow configurations from production templates. Without it, building workflows with Claude is a guessing game about node structure. With it, Claude designs the full workflow JSON, validates node connections, and flags configuration errors before anything touches production.
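The "flags configuration errors before anything touches production" step can be sketched as a check of a configured node against its spec. The spec shape below is illustrative, not the actual n8n-MCP response format:

```python
# Sketch of the pre-build validation n8n-MCP enables: compare a configured
# node against the node definition's required properties. The spec dict
# shape here is an assumption for illustration.

def missing_required_fields(node_spec: dict, configured_node: dict) -> list[str]:
    """Return required property names the configured node leaves unset."""
    required = [p["name"] for p in node_spec["properties"] if p.get("required")]
    params = configured_node.get("parameters", {})
    return [name for name in required if name not in params]

# Hypothetical spec fragment for an HTTP-request-style node
spec = {"properties": [{"name": "url", "required": True},
                       {"name": "method", "required": False}]}
node = {"parameters": {"method": "GET"}}  # URL never set
print(missing_required_fields(spec, node))  # → ['url']
```

Without the spec, Claude has nothing to diff against — which is exactly the "guessing game" the paragraph describes.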
┌─────────────────────────────────────────────────────────┐
│ WORKFLOW BUILD PROCESS                                  │
├─────────────────────────────────────────────────────────┤
│                                                         │
│ DESIGN                                                  │
│   Claude + n8n-MCP → full workflow JSON                 │
│   541 nodes available, 2,646 template patterns          │
│   Validate connections + required fields before build   │
│                                                         │
│ BUILD                                                   │
│   n8n API → create workflow directly (no JSON import)   │
│   Incremental testing after each major node             │
│   Watch for known bugs (IF node conditions, etc.)       │
│                                                         │
│ AUDIT                                                   │
│   60-point two-phase framework                          │
│   Phase 1: JSON structure (33 pts, no execution needed) │
│   Phase 2: Runtime verification (27 pts)                │
│                                                         │
│ DEPLOY                                                  │
│   Pass threshold ≥48/60 required                        │
│   Error handling confirmed, docs present, indexed       │
│                                                         │
└─────────────────────────────────────────────────────────┘
Agentic build protocol:
- Check naming — Agent (autonomous/scheduled) or Utility (on-demand)?
- Assign maturity level — L1 at creation, L2+ after testing
- Review similar workflows via n8n-MCP template patterns
- Build directly via API — never fall back to JSON export on first error
- Test incrementally — validate after each major node
- Audit before deploy
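The "build directly via API" step above amounts to creating the workflow over HTTP rather than importing JSON by hand. A minimal sketch, assuming the n8n public API's create-workflow endpoint and `X-N8N-API-KEY` header; `N8N_URL` and `N8N_API_KEY` are placeholder environment variables:

```python
import json
import os
import urllib.request

def workflow_payload(name: str, nodes: list, connections: dict) -> dict:
    """Minimal body for a workflow-create request (fields per n8n's workflow JSON)."""
    return {"name": name, "nodes": nodes, "connections": connections,
            "settings": {}}

def create_workflow(payload: dict) -> dict:
    """POST the workflow to the n8n instance instead of importing JSON by hand."""
    req = urllib.request.Request(
        f"{os.environ['N8N_URL']}/api/v1/workflows",
        data=json.dumps(payload).encode(),
        headers={"X-N8N-API-KEY": os.environ["N8N_API_KEY"],
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Keeping payload construction separate from the HTTP call makes the incremental-testing step cheap: each new node can be appended to `nodes` and re-validated before the next POST.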
Every workflow passes a 60-point two-phase audit before going live.
Phase 1: JSON Analysis (33 points) — runs on raw workflow structure, no execution needed
| Cat | Points | What gets checked |
|---|---|---|
| A: Triggers | 6 | Single trigger, correct type, valid schedule/path |
| D: Strategy | 3 | Purpose aligns with domain and layer |
| G: Metrics | 4 | Logging nodes present, writes to correct tables |
| I: Tech Stack | 3 | Node types, versions, credentials referenced (not hardcoded) |
| J: Cadence | 5 | Schedule matches stated purpose |
| L: Parameters | 5 | No hardcoded secrets in Set or Code nodes |
| M: Error Handling | 7 | Error Trigger → haios_health_checks, Slack only for critical |
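Because Phase 1 runs on raw workflow JSON, its checks are plain structural scans. A sketch of two of them — Category A's single-trigger rule and Category L's hardcoded-secret scan — with illustrative node type strings and a simplified secret heuristic:

```python
# Two Phase 1 checks in miniature, run on workflow JSON with no execution.
import re

# Simplified pattern for API-key-shaped strings (illustrative, not exhaustive)
SECRET_RE = re.compile(r"(sk-[A-Za-z0-9]{20,}|xoxb-[A-Za-z0-9-]+)")

def trigger_count(workflow: dict) -> int:
    """Category A: a workflow should have exactly one trigger node."""
    return sum(1 for n in workflow["nodes"] if "trigger" in n["type"].lower())

def hardcoded_secrets(workflow: dict) -> list[str]:
    """Category L: names of Set/Code nodes whose parameters embed a secret-shaped string."""
    flagged = []
    for node in workflow["nodes"]:
        if node["type"].endswith((".set", ".code")):
            if SECRET_RE.search(str(node.get("parameters", {}))):
                flagged.append(node["name"])
    return flagged

wf = {"nodes": [
    {"name": "Cron", "type": "n8n-nodes-base.scheduleTrigger", "parameters": {}},
    {"name": "Prep", "type": "n8n-nodes-base.set",
     "parameters": {"values": {"key": "sk-abcdefghijklmnopqrstuv"}}},
]}
print(trigger_count(wf))      # → 1
print(hardcoded_secrets(wf))  # → ['Prep']
```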
Phase 2: Runtime Verification (27 points) — external verification after Phase 1 passes
| Cat | Points | What gets checked |
|---|---|---|
| B: Outputs | 5 | Last 10 executions, success rate |
| C: Database | 4 | Target tables exist, recent records present |
| E: Testing | 4 | Safe test execution passes |
| F: Docs | 4 | README exists in workflow folder |
| H: COO Access | 5 | Webhook callable, outputs queryable by agent |
| K: Indexed | 5 | Doc findable via semantic search |
Thresholds: PASS ≥48/60 (80%) | CONDITIONAL 36–47 | FAIL <36
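The thresholds reduce to a three-way cut, with the numbers taken straight from the line above:

```python
def audit_verdict(score: int) -> str:
    """Map a 60-point audit score to its verdict."""
    if score >= 48:        # 80% of 60
        return "PASS"
    if score >= 36:        # 60% of 60
        return "CONDITIONAL"
    return "FAIL"

print(audit_verdict(57))  # → PASS
print(audit_verdict(40))  # → CONDITIONAL
print(audit_verdict(30))  # → FAIL
```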
The workflow-audit-skill v2.1 packages this as a versioned capability Claude executes on demand. v2.1 fixed a math error (65 pts → 60 pts) and tightened the error handling pattern. The framework audits itself.
Organized across 3 layers and 7 domains:
| Domain | Examples | Audit scores |
|---|---|---|
| Compliance | Content enforcement, FTC/ToS audit, prohibited keyword scanning | 57/60 |
| Email | Gmail classification, invoice extraction, expense routing | 54–58/60 |
| Finance | Tax category mapping, 50/30/20 budget allocation, threshold alerts | 54–60/60 |
| Slack | Inbound command parsing, outbound alerts, loop detection | 56/60 |
| Search | Semantic index updates, query webhook, incremental embedding | 58/60 |
| Learning | Weekly intelligence synthesis, cross-business pattern sharing | 55/60 |
| Health | Error logging, Daily Digest, circuit breakers | 60/60 |
Error handling pattern (enforced on all 33 workflows):
Any workflow error
↓
Error Trigger → haios_health_checks (state='error')
↓
Daily Digest surfaces errors each morning
↓
Slack #haios-alerts ONLY for critical/time-sensitive failures
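The routing rule above can be sketched as a single function: every error lands in `haios_health_checks`, and Slack fires only when the failure is critical. The `severity` field and the injected `db_insert`/`slack_post` helpers are assumptions, not the production schema:

```python
from datetime import datetime, timezone

CRITICAL = {"critical", "time-sensitive"}

def route_error(error: dict, db_insert, slack_post) -> None:
    """Log every error to haios_health_checks; escalate only critical ones to Slack."""
    db_insert("haios_health_checks", {
        "workflow": error["workflow"],
        "state": "error",                  # surfaced by the Daily Digest next morning
        "message": error["message"],
        "at": datetime.now(timezone.utc).isoformat(),
    })
    if error.get("severity") in CRITICAL:  # everything else waits for the digest
        slack_post("#haios-alerts", f"{error['workflow']}: {error['message']}")
```

The asymmetry is the point: the database path is unconditional, the Slack path is the exception.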
Four failures that permanently changed how this system works:
n8n credential stripping. Full API workflow updates silently remove credential assignments from all nodes. No error. No warning. Discovered through downstream authentication failures three sessions later. Fix: mandatory warning before any full workflow update. Partial update tooling built to avoid full replacements. Now a standing anti-pattern in the audit skill.
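One way to implement that mandatory warning is a pre-update diff: compare credential assignments in the live workflow against the payload about to be sent, and refuse if any would vanish. Field names follow n8n's workflow JSON; the guard itself is a sketch:

```python
def stripped_credentials(live: dict, update: dict) -> list[str]:
    """Node names whose credentials exist in the live workflow but not in the update payload."""
    update_creds = {n["name"]: n.get("credentials") for n in update["nodes"]}
    return [n["name"] for n in live["nodes"]
            if n.get("credentials") and not update_creds.get(n["name"])]

live = {"nodes": [{"name": "Gmail",
                   "credentials": {"gmailOAuth2": {"id": "1"}}}]}
update = {"nodes": [{"name": "Gmail"}]}   # credentials silently absent
print(stripped_credentials(live, update))  # → ['Gmail']
```

Run before any full `PUT`-style update; a non-empty result blocks the call.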
Dual-path payload mismatch. Same workflow, two call paths (MCP and webhook), two different payload structures. Intermittent failures traced to field nesting differences invisible at the application layer. Fix: defensive type-checking at the input boundary of all utility workflows.
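The defensive boundary looks roughly like this: one path nests fields under an envelope key (here assumed to be `"body"`, as webhook payloads typically are), the other sends them flat, and normalization happens before any node logic runs:

```python
def normalize_payload(raw: dict) -> dict:
    """Unwrap the webhook envelope if present; reject non-object payloads."""
    data = raw.get("body", raw)
    if not isinstance(data, dict):
        raise TypeError(f"expected object payload, got {type(data).__name__}")
    return data

print(normalize_payload({"body": {"query": "q1"}}))  # → {'query': 'q1'}
print(normalize_payload({"query": "q1"}))            # → {'query': 'q1'}
```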
Silent success, zero output. Green execution status, nothing written to the database. Upstream node returned an unexpected shape — no error thrown, just an empty pass-through. Caught by Phase 2 Category C, which checks for recent records in target tables, not just execution status.
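The Category C check in miniature: a green execution is not enough; the target table must actually have fresh rows. The 24-hour window is an illustrative choice:

```python
from datetime import datetime, timedelta, timezone

def has_recent_records(timestamps: list, window_hours: int = 24) -> bool:
    """True if any record timestamp falls inside the recency window."""
    cutoff = datetime.now(timezone.utc) - timedelta(hours=window_hours)
    return any(ts >= cutoff for ts in timestamps)

# A "successful" run that wrote nothing fails this check:
print(has_recent_records([]))  # → False
```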
Stale semantic index after rename. Renamed a workflow folder; old path stayed in the embedding index. Semantic search returned stale docs pointing to a non-existent location. Fix: batch index update required after any rename.
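Detecting that staleness is a set difference: paths recorded in the embedding index minus paths that still exist on disk. The index representation here is a stand-in for the real store:

```python
def stale_index_entries(indexed_paths: set, live_paths: set) -> set:
    """Indexed paths whose folder no longer exists (e.g. after a rename)."""
    return indexed_paths - live_paths

indexed = {"workflows/old-name/README.md"}
live = {"workflows/new-name/README.md"}
print(stale_index_entries(indexed, live))  # → {'workflows/old-name/README.md'}
```

Anything the function returns needs a reindex; running it after every rename is the batch-update rule from the fix.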
Claude designing workflows with n8n-MCP isn't the same as Claude guessing at workflow structure.
Without MCP, asking Claude to build an n8n workflow produces plausible-looking JSON with wrong property names, missing required fields, and node configurations that fail silently. With MCP, Claude has the actual node specs — it knows which fields are required, what the valid operation types are, and how to connect nodes correctly. The difference between "it looks right" and "it is right."
The audit framework compounds this: every workflow built with MCP still runs the same 60-point check. The build tool reduces errors going in. The audit catches what gets through.
Workflow governance is the same problem as release management, change control, and operational runbooks — systematic standards applied consistently, not remembered selectively. Most teams add governance after scale reveals the problem. The audit framework here came first.
The build methodology applies anywhere humans and AI are constructing systems together: give the AI the actual specs (not approximations), build incrementally, verify at each step, audit before deploy. The credential-stripping failure and the silent-success failure both exist because the system lacked a step that checked "did this actually do what it was supposed to do?" Phase 2 is that step.
| Metric | Value |
|---|---|
| Production workflows | 33 |
| Operational domains | 7 |
| Audit framework | 60-point, 2-phase, 13 categories |
| n8n nodes accessible via MCP | 541 |
| Template patterns available | 2,646 |
| Pass threshold | ≥48/60 (80%) |
| Top score | 60/60 (Revenue Import, Health workflows) |
| Documented failure patterns | 4 (all fixed, all recorded as anti-patterns) |
| Error handling coverage | 100% |
- n8n — workflow automation platform
- n8n-MCP — Claude's access to 541 node definitions + 2,646 template configs
- Claude (Anthropic) — workflow design, debugging, audit execution
- Supabase / PostgreSQL — logging, state, health checks
- workflow-audit-skill v2.1 — versioned audit capability
This framework is part of a larger Human-AI Operating System (HAIOS) — production infrastructure for human-AI collaboration, running since October 2024.
Other components:
- compliance-enforcement-framework — compliance workflows
- email-intelligence-framework — email routing workflows
- financial-operations-framework — financial automation workflows
- slack-integration-framework — Slack bi-directional workflows
Jordan Waxman — AI Systems & Operations. 14 years of operations leadership; building human-AI infrastructure since 2025.
33 workflows. Every number verified from production systems.