
feat: Standalone Docker container architecture with parallel agents#4

Merged
paraddox merged 169 commits into main from docker-standalone on Jan 24, 2026

Conversation

@paraddox
Owner

Summary

  • Migrates from volume-mounted containers to fully standalone Docker containers that clone repos at runtime via SSH keys baked in at build time
  • Adds parallel agent support with atomic feature claiming, overseer verification at 10% milestones, and reviewer agents for completed features
  • Implements comprehensive beads-based feature tracking with host-side sync, background polling, and WebSocket real-time updates
  • Adds full test suite (1600+ unit tests), CI workflow, pre-commit hooks, and enterprise-grade error recovery
  • Fixes template sync to skip unchanged files and pull before push, preventing spurious commits

Test plan

  • All 1605 unit tests pass
  • Docker image builds successfully with correct SSH key
  • Container clones repo and starts agent (verified with nexus)
  • Template sync no longer creates redundant commits
  • Pre-commit hooks validate on every commit

🤖 Generated with Claude Code

paraddox and others added 30 commits January 15, 2026 09:30
Major architectural change to enable parallel agent execution in separate Docker containers:

## Backend Changes
- registry.py: Changed from path to git_url storage, added Container model
- container_manager.py: Updated to use git_url instead of volume mounts
- agent.py: Fixed to use git_url with container manager
- projects.py: Added container control and edit mode endpoints
- New services: BeadsSyncManager, LocalProjectManager for host-side operations

## Template Changes
- initializer_prompt: Added `bd init --branch beads-sync` for parallel workflow
- coding_prompt: Added distributed lock feature claiming and random verification

## UI Changes
- New ContainerControl component: Slider (1-10 agents) + control buttons
- New ContainerList component: Shows running containers with status
- App.tsx: Integrated new components with hooks for container management
- Removed FolderBrowser (no longer needed with git URL approach)

## Removed Files
- server/routers/filesystem.py: Filesystem browsing no longer needed
- ui/src/components/FolderBrowser.tsx: Replaced by git URL input

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix TypeScript build errors:
  - Change project.path to project.local_path in DeleteProjectModal
  - Simplify ExistingRepoModal to git URL only flow (remove FolderBrowser)
  - Add missing paused/completed status configs in AgentControl

- Add thread safety:
  - Add threading.Lock() to beads_sync_manager.py global registry
  - Add threading.Lock() to local_project_manager.py global registry

- Convert blocking I/O to async:
  - Wrap all subprocess.run() calls with asyncio.to_thread()
  - Affects beads_sync_manager.py and local_project_manager.py

- Fix container stop loop bug:
  - Remove unnecessary loop in stop_all_containers endpoint
  - Call container manager once instead of per-container

- Fix git merge branch reference:
  - Store branch name in variable before checkout to main
  - Prevents merge from referencing wrong branch

- Add logging to silent failures:
  - Log JSON parse errors in issues.jsonl reading
  - Log agent config read failures
  - Change return value on read error to True (safer default)

- Fix type mismatches:
  - Add 'paused' and 'completed' to AgentStatus type
  - Update WizardStep literal from "name"/"folder" to "mode"/"details"

- Add task_id validation:
  - Add validate_task_id() function with regex check
  - Validate task_id in update_task and delete_task endpoints

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
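
The blocking-I/O conversion described above can be sketched roughly as follows. This is a minimal illustration, not the code from beads_sync_manager.py or local_project_manager.py: `run_command` and `main` are hypothetical names, and the command being run is a placeholder for the real git calls.

```python
import asyncio
import subprocess
import sys
from typing import Optional


async def run_command(args: list, cwd: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run a blocking subprocess.run() call in a worker thread.

    The event loop stays free to serve other requests while the
    command executes.
    """
    return await asyncio.to_thread(
        subprocess.run, args, cwd=cwd, capture_output=True, text=True
    )


async def main() -> str:
    # Placeholder command (assumption); the real callers run git commands.
    result = await run_command([sys.executable, "-c", "print('ok')"])
    return result.stdout.strip()
```

`asyncio.to_thread()` requires Python 3.9 or newer.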
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Critical fixes:
- Fix distributed lock bash bugs in coding_prompt.template.md:
  - Capture exit code immediately after command (claim_status=$?)
  - Send error messages to stderr (>&2) not stdout
  - Remove duplicate feature claiming in Steps 2.5 and 3
  - Add merge conflict detection

- Fix false success returns in git operations:
  - beads_sync_manager.pull_latest() now returns False on failure
  - local_project_manager.pull_latest() checks checkout result
  - local_project_manager.push_changes() checks add/commit results

- Implement multi-container registry pattern:
  - ContainerManager now accepts container_number parameter
  - Container naming: zerocoder-{project}-{number}
  - Registry: dict[str, dict[int, ContainerManager]]
  - Added get_all_container_managers() helper
  - Updated all callers in agent.py, websocket.py, features.py

- Add beads-sync branch management methods:
  - ensure_beads_sync_branch(): create if not exists
  - pre_agent_sync(): sync before agent starts
  - post_agent_cleanup(): cleanup feature branches
  - recover_stuck_features(): reset in_progress → open

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
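
One way the multi-container registry could look, as a sketch under assumptions: the `ContainerManager` below is a stand-in with only the fields needed here, `get_container_manager` is a hypothetical accessor, and the signature of `get_all_container_managers` is guessed from its name. Only the naming scheme and the nested dict shape come from the commit message.

```python
import threading


class ContainerManager:
    """Stand-in for the real class (assumption: only naming matters here)."""

    def __init__(self, project_name: str, container_number: int) -> None:
        self.project_name = project_name
        self.container_number = container_number
        # Naming scheme from the commit message: zerocoder-{project}-{number}
        self.container_name = f"zerocoder-{project_name}-{container_number}"


# project name -> container number -> manager
_registry: dict = {}
_registry_lock = threading.Lock()


def get_container_manager(project_name: str, container_number: int) -> ContainerManager:
    """Return the manager for one container, creating it on first use."""
    with _registry_lock:
        managers = _registry.setdefault(project_name, {})
        if container_number not in managers:
            managers[container_number] = ContainerManager(project_name, container_number)
        return managers[container_number]


def get_all_container_managers(project_name: str) -> list:
    """Return every manager registered for a project (empty list if none)."""
    with _registry_lock:
        return list(_registry.get(project_name, {}).values())
```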
Fixes remaining issues from comprehensive code review:

P2 Issues Fixed:
- fix-o4c: Per-container log filtering in AgentLogViewer with container
  tabs, badges, and proper WebSocket callback cleanup
- fix-61b: Edit mode conflict resolution with git stash/pop flow
- fix-dxl: Container registry restoration on server restart

P3 Issues Fixed:
- fix-4up: UI accessibility (ARIA attributes on slider, aria-pressed on
  buttons, error handling for async callbacks, removed unused hook)
- fix-85d: SQLAlchemy CheckConstraints for type enforcement at DB level
  (container_type, status, target_container_count, feature status)

Key changes:
- WebSocket now properly tracks callbacks per container for cleanup
- AgentLogViewer shows container filter tabs when multiple containers
- LocalProjectManager handles stash/pop for uncommitted changes
- registry.py enforces data constraints at database level

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
_get_registry_functions() returns 11 values but several call sites
only unpacked 10. Updated all call sites to correctly unpack all 11
values (added missing list_project_containers to unpacking).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Phase 1/2 container startup:
- New /start-all endpoint runs init container FIRST
- Init container waits for repo clone before pre_agent_sync
- Coding containers spawned AFTER init completes

Key changes:
- agent.py: Add start_all_containers endpoint with 2-phase startup
- agent.py: Update stop/graceful-stop to affect ALL containers
- container_manager.py: Add _is_init_container flag and special handling
- container_manager.py: Wait for repo clone before git operations
- feature_poller.py: Fix iteration over nested dict structure
- api.ts: Route startAgent to /start-all endpoint

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
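
The ordering guarantee of the two-phase startup can be illustrated with a small sketch. `start_all`, `start_init`, and `start_coder` are hypothetical names; the real endpoint presumably does more (registry updates, sync, error handling).

```python
import asyncio
from typing import Awaitable, Callable


async def start_all(
    start_init: Callable[[], Awaitable[None]],
    start_coder: Callable[[int], Awaitable[None]],
    coder_count: int,
) -> None:
    # Phase 1: the init container runs first and must complete.
    await start_init()
    # Phase 2: coding containers are spawned only after init has finished.
    for number in range(1, coder_count + 1):
        await start_coder(number)
```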
The existing project recovery path in start_all_containers() was calling
pre_agent_sync() and recover_stuck_features() without waiting for the
container to clone the repo first. This caused git operations to fail.

Added wait-for-clone check (test -d /project/.git) before running
recovery operations in the else branch.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
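
A wait-for-clone check of this kind is essentially a poll with a deadline. In the sketch below, `wait_for_clone` and `repo_ready` are hypothetical names and the timeout values are assumptions; in practice `repo_ready` would wrap a check such as running `test -d /project/.git` inside the container.

```python
import time
from typing import Callable


def wait_for_clone(
    repo_ready: Callable[[], bool],
    timeout_s: float = 120.0,   # assumed default, not from the source
    interval_s: float = 2.0,    # assumed default, not from the source
) -> bool:
    """Poll until the repo exists or the timeout expires.

    Returns True once repo_ready() succeeds, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if repo_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Recovery operations such as pre_agent_sync() would then run only when this returns True.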
- Fix /project directory permissions for coder user in Dockerfile
- Set up SSH key for both root and coder users in entrypoint
- Run git clone as coder user who has SSH key configured
- Fix tilde expansion in start-app.sh for env var loading
- Pull latest changes to local clone before checking project state
- Add SSH key mount logging

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Container Selector Fix:
- Send registered containers list via WebSocket on connect
- UI shows all container tabs immediately instead of deriving from logs
- Added ContainerInfo type and containers state to useWebSocket hook

Beads Sync Integration:
- Replace feature_poller with beads_sync_manager for reading feature state
- Initialize beads-sync clones for all registered projects on startup
- Poll beads-sync every 15s only for projects with active containers
- Add get_cached_stats() and get_cached_features() compatibility functions
- Fix container naming for multi-container architecture (init + coding)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This file has been fully replaced by beads_sync_manager.py which reads
task state directly from local beads-sync branch clone instead of
querying containers via docker exec.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Beads issues store content in 'description' field, not 'body'.
This fixes task descriptions not loading in the UI.

Also fixes import of list_valid_projects (was list_all_projects).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add staggered 10s delay between container starts to prevent race condition
  where all containers claim the same feature
- Make start() non-blocking by spawning agent in background task
- Add registry calls for container lifecycle tracking (create, start, stop)
- Add dynamic polling interval (5s active, 15s idle) for faster UI updates

Fixes issues found during parallel container testing:
1. Race condition in feature claiming (staggered startup)
2. False failure reports (non-blocking start)
3. Containers not tracked in registry
4. Slow stats updates when containers running

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
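
A rough sketch of the staggered startup and the dynamic polling interval, assuming hypothetical helper names (`start_staggered`, `polling_interval`). The 10s, 5s, and 15s values are the ones stated in the commit message.

```python
import asyncio
from typing import Awaitable, Callable

STAGGER_SECONDS = 10.0  # delay between container starts


async def start_staggered(
    start_one: Callable[[int], Awaitable[None]],
    count: int,
    delay_s: float = STAGGER_SECONDS,
) -> None:
    """Start containers one at a time, pausing between starts."""
    for number in range(1, count + 1):
        if number > 1:
            # Give the previous container time to claim a feature first.
            await asyncio.sleep(delay_s)
        await start_one(number)


def polling_interval(has_active_containers: bool) -> float:
    """Poll faster while containers are running, slower when idle."""
    return 5.0 if has_active_containers else 15.0
```

Staggering narrows the window for duplicate claims but does not remove it, so it complements the claiming logic in the prompt rather than replacing it.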
- Fix registry tracking: use self.project_name, self.container_number,
  self.container_type instead of underscore-prefixed versions
- Fix View Logs button: pass container.container_number instead of
  container.id to filter logs correctly

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
WebSocket was only registering log callbacks at connection time. If
containers were created after the UI loaded, their logs wouldn't be
sent to the UI.

Add background task that periodically checks for new container managers
and registers callbacks dynamically. This ensures logs appear for
containers created while the WebSocket is connected.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
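
The dynamic registration could be structured along these lines. Everything here is illustrative: `FakeManager`, `add_log_callback`, `register_new_managers`, and `watch_managers` are hypothetical names, and the 2-second interval is an assumption.

```python
import asyncio
from typing import Callable


class FakeManager:
    """Stand-in for a container manager that accepts log callbacks."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.callbacks: list = []

    def add_log_callback(self, callback: Callable[[str], None]) -> None:
        self.callbacks.append(callback)


def register_new_managers(managers: list, seen: set, on_log: Callable[[str], None]) -> list:
    """Attach on_log to managers not seen before; return their names."""
    added = []
    for manager in managers:
        if manager.name not in seen:
            manager.add_log_callback(on_log)
            seen.add(manager.name)
            added.append(manager.name)
    return added


async def watch_managers(list_managers, on_log, interval_s: float = 2.0) -> None:
    """Background task: periodically pick up managers created after connect."""
    seen: set = set()
    while True:
        register_new_managers(list_managers(), seen, on_log)
        await asyncio.sleep(interval_s)
```

Tracking the `seen` set per connection also gives a natural place to unregister callbacks when the WebSocket closes.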
Issue: fix-6os

1. coding_prompt.template.md:
   - Add safe_bd_json() helper for clean JSON extraction with stderr suppression
   - Add safe_bd_sync() helper for quiet sync operations
   - Update all bd command usages to use safe helpers
   - Add KEY RULE #9 about suppressing bd stderr

2. beads_commands.py:
   - Update parse_json_output() to return (data, error) tuple
   - Include stderr in error messages for debugging
   - Update callers to handle the error tuple

3. prompts.py:
   - Add CLAUDE.md refresh to refresh_project_prompts()
   - Smart merge: preserves user content above BEADS WORKFLOW section
   - Replaces or appends beads workflow section from template

4. opencode_config/agent/*.md:
   - Add CLAUDE.md reference to coder, overseer, and hound agents
   - OpenCode agents now get beads workflow instructions like Claude agents

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
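
The (data, error) return shape for parse_json_output() might look like the sketch below. The parameter list is an assumption; only the tuple shape and the inclusion of stderr in the error text are taken from the commit message.

```python
import json
from typing import Any, Optional, Tuple


def parse_json_output(stdout: str, stderr: str = "") -> Tuple[Any, Optional[str]]:
    """Return (data, None) on success or (None, error message) on failure."""
    try:
        return json.loads(stdout), None
    except json.JSONDecodeError as exc:
        detail = f"invalid JSON: {exc}"
        if stderr.strip():
            # Surface stderr so the caller can see why the command misbehaved.
            detail += f"; stderr: {stderr.strip()}"
        return None, detail
```

Callers then branch on the second element instead of catching exceptions or guessing at a sentinel value.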
The verbose output from bd sync goes to stdout, not stderr.
Updated safe_bd_sync() to suppress stdout instead of stderr.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Updated Step 2 to ensure dependencies are installed before
running servers. Detects package manager (pnpm, yarn, npm).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
After refresh_project_prompts() copies templates to the project dir,
commit and push them to git so containers get the latest when they
clone/pull the repo.

This ensures the safe_bd_json and safe_bd_sync fixes reach agents.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add git safe.directory config to Dockerfile for both root and coder users
- Add bd sync after clone/pull in entrypoint
- Filter bd ready to only open status (not in_progress) to prevent race conditions
- Use shuf to randomize feature selection across parallel agents

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The _stream_logs() method already broadcasts container output via docker logs -f.
The send_instruction() method was also broadcasting, causing duplicate messages in UI.

Removed the broadcast call from send_instruction() while keeping stdout consumption
for activity updates.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Agents were pushing feature branches but not cleaning them up.
Added step 5 to delete local and remote feature branches after merge.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
bd show --json returns an array [{...}], not an object.
Fixed all jq commands to access .[0].title instead of .title
to properly extract feature titles for branch names.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
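
For host-side code the same array shape applies. A small Python illustration, assuming hypothetical helpers `feature_title` and `branch_name` and an assumed slug rule (lowercase, non-alphanumerics collapsed to hyphens):

```python
import json
import re


def feature_title(bd_show_stdout: str) -> str:
    """Extract the title from `bd show --json` output, which is an array."""
    issues = json.loads(bd_show_stdout)
    return issues[0]["title"]  # equivalent of jq '.[0].title'


def branch_name(feature_id: str, title: str) -> str:
    """Build a branch name; the slug rule here is an assumption."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"feature/{feature_id}-{slug}"
```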
- Initializer creates scripts/safe_bd_json.sh and scripts/safe_bd_sync.sh
- Coding prompt now uses these scripts instead of inline function definitions
- Scripts are committed by initializer if they don't exist
- Coding prompt validates scripts exist before proceeding

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Previously all containers shared a single .agent_started marker file,
causing race conditions where one container completing would remove
the marker for all containers, breaking health monitor auto-recovery.

Now each container gets its own marker: .agent_started.{container_number}

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
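
A sketch of the per-container marker scheme. The helper names are hypothetical; the file name pattern is the one given above.

```python
from pathlib import Path


def marker_path(project_dir: Path, container_number: int) -> Path:
    """Each container gets its own marker: .agent_started.{container_number}"""
    return project_dir / f".agent_started.{container_number}"


def set_marker(project_dir: Path, container_number: int) -> None:
    marker_path(project_dir, container_number).touch()


def clear_marker(project_dir: Path, container_number: int) -> None:
    # Removing one container's marker leaves the others untouched.
    marker_path(project_dir, container_number).unlink(missing_ok=True)
```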
Server restarts no longer kill active agent work. User-started containers
with open features are preserved and will be restored when the server
restarts. The health monitor will restart their agents automatically.

This fixes the issue where restarting the server (e.g., to apply code
changes) would kill all running agents.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The _user_started flag was only set at manager creation time, causing
the health monitor to ignore containers whose marker files were created
after the manager was initialized. Now _sync_status() refreshes the flag
from the marker file, allowing recovery to work even when markers are
created externally (e.g., during server restart recovery).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added is_agent_stuck() method and health monitor check for agents that
are running but haven't produced any output for AGENT_STUCK_TIMEOUT_MINUTES
(default 10 minutes).

This catches scenarios like:
- OpenCode API hung/not responding
- Network timeouts
- Agent blocked waiting on external service

The health monitor will now restart these stuck agents automatically.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
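
The stuck check reduces to comparing the time since the last output against a threshold. The sketch below writes it as a free function with an assumed signature (the source describes a method), and treating "no output recorded yet" as not stuck is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional

AGENT_STUCK_TIMEOUT_MINUTES = 10  # default from the commit message


def is_agent_stuck(
    running: bool,
    last_output_at: Optional[datetime],
    now: datetime,
    timeout_minutes: int = AGENT_STUCK_TIMEOUT_MINUTES,
) -> bool:
    """True if a running agent has produced no output for too long."""
    if not running or last_output_at is None:
        # Assumption: an agent with no recorded output is not flagged here.
        return False
    return now - last_output_at > timedelta(minutes=timeout_minutes)
```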
lswSoso and others added 29 commits January 20, 2026 17:29
- Move feature ID into the colored category badge
- Show category as plain text beside the badge
- Change priority display from #N to PN format

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…s_cache

The column was defined in the model but missing from the migration
function, causing errors when querying the table.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Without this, all containers defaulted to container 1, causing all
feature claims to update the same DB record. Now each container
correctly identifies itself for proper feature tracking.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The ContainerStatus and AgentStatus schemas were missing 'reviewer'
in their agent_type Literal, causing validation errors when the
containers endpoint returned reviewer agent data.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…_feature

Use container's current_feature field from registry as primary indicator
for in-progress status instead of relying solely on beads status. This is
more reliable since current_feature is set on claim and cleared on close.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The stop() method was resetting _user_started=False on every call,
including during restart_agent(), restart_with_reviewer(), and
restart_with_overseer(). This caused containers to stop auto-restarting
after the first restart cycle (e.g., reviewer → coder transition).

Added preserve_user_started parameter to stop() method. Restart methods
now pass preserve_user_started=True to maintain auto-restart capability.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
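
The effect of the new parameter can be shown with a stripped-down stand-in. `ManagerSketch` is a hypothetical class that models only the flag and status; the real stop() also tears down the container.

```python
class ManagerSketch:
    """Minimal model of the _user_started handling (not the real class)."""

    def __init__(self) -> None:
        self._user_started = False
        self.status = "not_created"

    def start(self) -> None:
        self._user_started = True
        self.status = "running"

    def stop(self, preserve_user_started: bool = False) -> None:
        self.status = "stopped"
        if not preserve_user_started:
            # A user-initiated stop disables auto-restart.
            self._user_started = False

    def restart_agent(self) -> None:
        # Internal restarts keep the flag so the health monitor still recovers.
        self.stop(preserve_user_started=True)
        self.status = "running"
```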
Replace direct issues.jsonl file reads with BeadsManager API calls
for consistency. Host-side code now uses get_cached_stats() and
get_cached_features() which run live bd commands.

Changes:
- progress.py: Remove JSONL/cache fallbacks from 4 functions
- features.py: Remove read_local_beads_features() and fallbacks
- container_manager.py: Remove _has_open_features_direct() method

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add apiFailureLogged flag to only log first API failure
- Check graceful stop API every 10 seconds instead of every 1 second
- Prevents "[WARN] Failed to check graceful stop via API" spam when API unreachable

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Create SVG favicon with stylized Z and circuit patterns
- Generate PNG versions (16x16, 32x32) for browser compatibility
- Add apple-touch-icon (180x180) for iOS devices
- Remove default Vite favicon

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add banner image to welcome screen when no project selected
- Add logo icon to header next to title
- Optimize banner image for web (5.5MB → 228KB)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add requests module to Dockerfile.project pip install (was causing
  agent_app.py to fail with ModuleNotFoundError)
- Fix session endpoint logic: init containers should only run when
  NO features exist yet (new project), not based on open features
- Remove fixed [FEATURE_COUNT] placeholder from initializer prompt,
  replace with flexible guidance to cover the full spec
- Add memory limits (4GB) to containers to prevent OOM crashes
- Add git lock cleanup on container restart
- Add timeout settings to opencode MCP config

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add minimax-m2-1 to UI model selector with OpenCode badge
- Add backend validation for minimax-m2-1 model
- Update container_manager to detect MiniMax as OpenCode model
- Pass MINIMAX_API_KEY to containers
- Configure minimax-coding-plan provider in OpenCode config
- Add MiniMax MCP server configuration (uvx minimax-coding-plan-mcp)
- Add model routing in opencode_agent_app.ts to dynamically switch
  between GLM-4.7 and MiniMax-M2.1 based on .agent_config.json
- Install uv in Dockerfile for uvx command support
- Set thinking budget_tokens: 32000 for MiniMax model

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Logs were being truncated too aggressively in multiple places:
- Thinking blocks: 150 → 500 chars
- Text output: 200 → 500 chars
- Tool results: 500 → 2000 chars
- UI log buffer: 500 → 1000 entries

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add Context7 MCP server to both Claude and OpenCode agents for
up-to-date library documentation access. Also document GIT_SSH_KEY_PATH
environment variable for custom SSH key paths in Docker builds.

- Add Context7 to opencode_config/config.json for GLM/MiniMax agents
- Add `claude mcp add context7` to Dockerfile.project for Claude agents
- Add usage instructions to coding_prompt.template.md
- Document GIT_SSH_KEY_PATH in CLAUDE.md and .env.example

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The bd CLI expects `comments add <id> "msg"` not `comments <id> --add "msg"`.
Updated all references across server, client script, templates, and tests.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Tests were using stale function signatures and invalid status values:
- register_project() no longer takes a path parameter
- update_container_status/get_container need container_type argument
- "completed" is not a valid container status (use "stopped")
- _notify_status_change dispatches async tasks that need event loop time
- count_passing_tests requires DB cache, not file-based reading
- Agent router mock targets updated to actual function names

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Increase Docker container memory limit from 4GB to 64GB (host has 128GB)
- Enable MCP servers for all agent types (coder, reviewer, overseer)
- Pass agentType to updateOpencodeConfig for future per-type configuration
- Fix WebSocket stale state when switching projects (reset all state, ignore
  stale messages via currentProjectRef)
- Add container_type to update_container_status calls for proper tracking
- Remove unused MiniMax thinking options from config.json

The OOM crashes were caused by insufficient container memory (4GB) when
MCP servers spawn alongside the agent. Rather than disabling MCP, we
allocate 64GB per container since the host has 128GB available.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ispatch

Two race conditions caused containers to restart unexpectedly after
graceful stop:

1. _handle_agent_exit() could fire after stop() already ran, seeing
   _graceful_stop_requested=False (cleared by stop) and exit code 137
   (from SIGKILL), then restarting. Fixed by returning early if container
   status is already "stopped" or "not_created".

2. A second start() call during SDK readiness wait would dispatch a
   duplicate agent (status already "running"). Fixed with an
   _agent_dispatched flag that prevents double-dispatch and resets on stop.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
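
Both guards can be modeled in a few lines. `DispatchGuard` and its method names are hypothetical, and the restart rule for the remaining cases (restart on non-zero exit) is an assumption made only so the example is complete.

```python
class DispatchGuard:
    """Minimal model of the two race-condition guards (not the real class)."""

    def __init__(self) -> None:
        self.status = "not_created"
        self._agent_dispatched = False

    def start(self) -> bool:
        """Return True if an agent was dispatched, False if one already was."""
        if self._agent_dispatched:
            return False  # second start() during the readiness wait: ignore
        self._agent_dispatched = True
        self.status = "running"
        return True

    def stop(self) -> None:
        self.status = "stopped"
        self._agent_dispatched = False  # reset so a later start() works

    def should_restart_after_exit(self, exit_code: int) -> bool:
        # Guard 1: an exit that arrives after stop() must never restart.
        if self.status in ("stopped", "not_created"):
            return False
        return exit_code != 0  # assumed rule for the non-stopped case
```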
…ceful stop

The mutation hooks in useContainers.ts used 'agentStatus' as the query
key but the actual query uses 'agent-status', so cache invalidation
after start/stop never triggered an immediate refetch. Also adds
optimistic update to useGracefulStopAgent for instant UI feedback.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Prevents stale button state when switching between projects by resetting
agent-status and containers queries, and forcing ContainerControl remount
to clear local isStarting/isStopping state.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add list_issues, update_issue, close_issue, reopen_issue, delete_issue,
and add_dependency tools to the issue MCP server. Rename server from
"issue-creator" to "issue-manager" to reflect broader capabilities.
Update assistant prompt template with new workflow and tool descriptions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pull latest changes (git pull --ff-only) when starting an assistant
session so beads issues are up-to-date, and sync issues back to remote
(bd sync + commit + push) when the session closes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When bd list fails with "out of sync" error (e.g., after git pull brings
new commits), run bd sync --import-only to reconcile the database before
retrying. Fixes issues not showing for projects with stale SQLite state.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
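
The retry flow can be sketched with an injected command runner. `list_issues_with_resync` and the `run` callable (returning exit code, stdout, stderr) are hypothetical; the `bd list` and `bd sync --import-only` commands and the "out of sync" error text are the ones named above.

```python
from typing import Callable, Tuple

Runner = Callable[[list], Tuple[int, str, str]]  # returns (code, stdout, stderr)


def list_issues_with_resync(run: Runner) -> str:
    """Run `bd list`, reconciling the database once if it is out of sync."""
    code, out, err = run(["bd", "list"])
    if code != 0 and "out of sync" in err.lower():
        # Import the state from git into the local database, then retry once.
        run(["bd", "sync", "--import-only"])
        code, out, err = run(["bd", "list"])
    if code != 0:
        raise RuntimeError(err.strip() or "bd list failed")
    return out
```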
Previously cleanup deleted all non-protected branches, which was too
aggressive. Now only branches with the feature/ prefix are targeted,
matching the naming convention agents use (feature/{id}-{title-slug}).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
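
The narrowed cleanup rule amounts to a prefix filter. `branches_to_delete` is a hypothetical helper name used for illustration.

```python
def branches_to_delete(branches: list, prefix: str = "feature/") -> list:
    """Select only agent-created branches (feature/{id}-{title-slug})."""
    return [name for name in branches if name.startswith(prefix)]
```

Branches such as `main`, `beads-sync`, or anything a user created by hand fall outside the prefix and are left alone.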
The agent model config should be per-instance, not shared via git.
Pass model to containers via AGENT_MODEL env var instead of relying
on the config file being in the repo.

- Remove git commit/push from settings endpoint
- Add prompts/.gitignore template to exclude config files
- Copy gitignore in refresh_project_prompts()
- Untrack config in _push_template_updates() for existing projects
- Pass AGENT_MODEL env var in all docker exec paths
- Add AGENT_MODEL env var support to opencode_agent_app.ts
- Fix DEFAULT_AGENT_MODEL to claude-sonnet-4-5-20250514

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
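
A minimal sketch of the env-var approach, assuming hypothetical helpers `resolve_agent_model` and `docker_exec_args`. The `-e` flag of `docker exec` sets an environment variable for the executed command; the default model string is the one listed above.

```python
import os
from typing import Mapping, Optional

DEFAULT_AGENT_MODEL = "claude-sonnet-4-5-20250514"  # value from the commit message


def resolve_agent_model(env: Optional[Mapping[str, str]] = None) -> str:
    """Inside the container: prefer AGENT_MODEL, else fall back to the default."""
    env = os.environ if env is None else env
    return env.get("AGENT_MODEL") or DEFAULT_AGENT_MODEL


def docker_exec_args(container_name: str, model: str, command: list) -> list:
    """On the host: pass the model per invocation instead of via a repo file."""
    return ["docker", "exec", "-e", f"AGENT_MODEL={model}", container_name, *command]
```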
…d format

Replace the multi-section markdown template (Summary/Context/Implementation
Notes/Acceptance Criteria) with the initializer's simpler format: brief
description + numbered Steps that serve as both implementation guide and
verification criteria. The coding agent is optimized for this format.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Template refresh now compares content before copying, eliminating
spurious commits on every container start. Push logic fetches and
rebases from remote first, with claude-based conflict resolution
as fallback for diverged repos.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
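
The compare-before-copy step might look like this. `sync_template` is a hypothetical helper; the real refresh_project_prompts() handles several files and the CLAUDE.md merge as well.

```python
import shutil
from pathlib import Path


def sync_template(src: Path, dst: Path) -> bool:
    """Copy src to dst only when the content differs.

    Returns True if the file was written, so the caller can skip the
    commit and push entirely when nothing changed.
    """
    if dst.exists() and dst.read_bytes() == src.read_bytes():
        return False
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
    return True
```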
The claude-code npm package install path changed. Switch to the
official curl installer which puts the binary in ~/.local/bin/,
and update PATH accordingly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Prevents stale modals/panels from showing wrong project context
and cross-project sound triggers when switching between projects.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@paraddox paraddox merged commit 1543d88 into main Jan 24, 2026
4 of 10 checks passed

2 participants