feat: multi-agent foundation with recovery, quality gates, and migration compatibility by NextDoorLaoHuang-HF · Pull Request #415 · RightNow-AI/openfang

NextDoorLaoHuang-HF · 2026-03-07T16:19:58Z

Summary

This PR delivers the multi-agent foundation requested for long-running, resumable Codex workflows and OpenClaw migration compatibility.

Main scope:

Durable workflow/task state + recovery snapshot/resume primitives.
Declarative workflow routing + fan-out/fan-in execution.
Review reject-and-return control loop + retry/block escalation.
Step-level quality gate enforcement + gate execution logs.
Session isolation guardrails across kernel/runtime/API (including process scope isolation).
Approval enforcement + workflow audit/trace + observability metrics.
Shadow-run comparison + rollout/rollback controls.
OpenClaw migration compatibility hardening (identity/provider alias/bindings route variants).
Docs + e2e/focused tests for the above.
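To make the review reject-and-return loop with retry/block escalation concrete, here is a minimal sketch. The names `Stage`, `Verdict`, `ReviewLoop`, and `max_rejections` are hypothetical and are not taken from the PR's code; the sketch only assumes that a rejected review sends work back to planning and that too many rejections escalate to a blocked state.

```rust
/// Hypothetical workflow stages (illustrative only, not the PR's types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Stage {
    Planning,
    Review,
    Done,
    Blocked,
}

/// Hypothetical reviewer verdict.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Approve,
    Reject,
}

/// Tracks how often review has sent work back to planning.
#[derive(Debug)]
struct ReviewLoop {
    stage: Stage,
    rejections: u32,
    max_rejections: u32,
}

impl ReviewLoop {
    fn new(max_rejections: u32) -> Self {
        Self { stage: Stage::Planning, rejections: 0, max_rejections }
    }

    /// Planning hands its output to review; other stages are left unchanged.
    fn submit_for_review(&mut self) {
        if self.stage == Stage::Planning {
            self.stage = Stage::Review;
        }
    }

    /// Apply a verdict while in review: approval finishes the task, a
    /// rejection returns it to planning, and exceeding the rejection
    /// budget escalates to Blocked.
    fn apply(&mut self, verdict: Verdict) -> Stage {
        if self.stage != Stage::Review {
            return self.stage;
        }
        self.stage = match verdict {
            Verdict::Approve => Stage::Done,
            Verdict::Reject => {
                self.rejections += 1;
                if self.rejections > self.max_rejections {
                    Stage::Blocked
                } else {
                    Stage::Planning
                }
            }
        };
        self.stage
    }
}
```

Keeping the rejection counter inside the loop state means the escalation decision survives a snapshot/resume cycle as long as the struct itself is persisted.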

Why

OpenFang needs stronger multi-step orchestration and reliability foundations to support 24h+ autonomous runs with deterministic recovery, strict quality control, and safer migration from existing OpenClaw layouts.

Validation (comprehensive pre-PR review gate)

All commands below were re-run successfully before opening this PR:

cargo fmt --all -- --check
cargo test -p openfang-kernel workflow::tests::test_route_workflow_by_channel_task_type_and_risk -- --exact
cargo test -p openfang-kernel workflow::tests::test_review_reject_and_return_to_planning -- --exact
cargo test -p openfang-kernel --test session_resume_integration_test test_multi_session_e2e_session_summaries_stay_scoped -- --exact
cargo test -p openfang-api --test api_integration_test test_workflow_shadow_run_compares_against_production_output -- --exact
cargo test -p openfang-api --test api_integration_test test_workflow_rollout_controls_promote_and_rollback_with_checklist -- --exact
cargo test -p openfang-runtime tool_runner::tests::test_approval_required_approved -- --exact
cargo test -p openfang-migrate provider_alias_compatibility
cargo check -p openfang-api

Risks

Large cross-crate surface area (kernel/runtime/api/migrate) with behavior changes in workflow control-flow and session semantics.
Rollout should remain staged with shadow mode and rollback checklist.

Rollback

Revert by planned slice order (state/recovery -> routing -> review/retry -> session/approval/audit -> shadow/rollout -> migrate compatibility).
Operational fallback: restore stable production path and disable shadow promotion while keeping migration isolated.
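The operational fallback above can be pictured as a small state machine. This is a sketch under stated assumptions: `PrimaryPath`, `Rollout`, `Checklist`, and their fields are hypothetical names, and the PR's real rollout state may be shaped differently.

```rust
/// Which execution path serves production traffic (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PrimaryPath {
    Stable,
    Candidate,
}

/// Hypothetical promotion checklist; the real checklist items are not
/// listed in this PR description.
struct Checklist {
    shadow_outputs_match: bool,
    rollback_plan_reviewed: bool,
}

#[derive(Debug)]
struct Rollout {
    primary: PrimaryPath,
    shadow_promotion_enabled: bool,
}

impl Rollout {
    fn new() -> Self {
        Self { primary: PrimaryPath::Stable, shadow_promotion_enabled: true }
    }

    /// Promote the candidate path only when promotion is enabled and
    /// every checklist item is satisfied.
    fn promote(&mut self, checklist: &Checklist) -> Result<(), &'static str> {
        if !self.shadow_promotion_enabled {
            return Err("shadow promotion is disabled");
        }
        if !checklist.shadow_outputs_match {
            return Err("shadow output diverges from production");
        }
        if !checklist.rollback_plan_reviewed {
            return Err("rollback checklist incomplete");
        }
        self.primary = PrimaryPath::Candidate;
        Ok(())
    }

    /// Operational fallback: restore the stable path and disable
    /// further shadow promotion.
    fn rollback(&mut self) {
        self.primary = PrimaryPath::Stable;
        self.shadow_promotion_enabled = false;
    }
}
```

After `rollback()`, a later `promote()` fails until promotion is deliberately re-enabled, which mirrors the "disable shadow promotion" part of the fallback.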

NextDoorLaoHuang-HF · 2026-03-07T16:21:43Z

Pre-PR comprehensive review gate has been completed, and this PR is intentionally kept in Draft because unresolved high-severity findings were identified.

Blocking findings summary:

1. Session ownership/isolation risk in attachment injection path (routes.rs): caller-supplied session_id is not fully ownership-guarded and missing sessions can be auto-created.
2. Streaming multi-session compaction risk (kernel.rs): compaction path may target default session instead of requested session under concurrent sessions.
3. Provider normalization compatibility risk (openclaw.rs): global - to _ normalization can rewrite providers into unsupported runtime names.
4. Rollout/rollback control-plane vs execution-path drift risk (workflow.rs): rollout state management needs stronger enforcement in routing/execution path.

I will address these blockers before marking this PR ready for review.

NextDoorLaoHuang-HF · 2026-03-07T16:21:58Z

Pre-PR comprehensive review gate has been completed, and this PR is intentionally kept in Draft because unresolved high-severity findings were identified.

Blocking findings summary:

Session ownership/isolation risk in attachment injection path (routes.rs) — caller-supplied session_id is not fully ownership-guarded and missing sessions can be auto-created.
Streaming multi-session compaction risk (kernel.rs) — compaction path may target default session instead of requested session under concurrent sessions.
Provider normalization compatibility risk (openclaw.rs) — global - to _ normalization can rewrite providers into unsupported runtime names.
Rollout/rollback control-plane vs execution-path drift risk (workflow.rs) — rollout state management needs stronger enforcement in routing/execution path.

Next action: fix blockers, rerun comprehensive pre-PR review, then move PR out of Draft.
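The provider normalization finding is easy to reproduce in isolation. The sketch below is illustrative only: `SUPPORTED`, `normalize_blanket`, and `normalize_explicit` are hypothetical names, and which spelling the runtime actually accepts (`claude-code` here) is an assumption, not something this thread confirms.

```rust
/// Hypothetical set of provider names the runtime accepts.
const SUPPORTED: &[&str] = &["claude-code", "openai"];

fn is_supported(name: &str) -> bool {
    SUPPORTED.iter().any(|s| *s == name)
}

/// The risky pattern from the finding: rewrite every '-' to '_',
/// regardless of whether the result is a known provider.
fn normalize_blanket(provider: &str) -> String {
    provider.replace('-', "_")
}

/// A narrower alternative: map only known aliases and pass every
/// other name through unchanged.
fn normalize_explicit(provider: &str) -> String {
    match provider {
        // Assumed alias direction, for illustration only.
        "claude_code" => "claude-code".to_string(),
        other => other.to_string(),
    }
}
```

With these assumptions, `normalize_blanket("claude-code")` yields `claude_code`, which the hypothetical runtime does not recognize, while `normalize_explicit` leaves unknown hyphenated names untouched.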

NextDoorLaoHuang-HF · 2026-03-08T04:45:48Z

High-severity blocker fixes are now landed on feat/multiagent-foundation-v1.

Fixed items:

routes/ws attachment session safety
- Enforced explicit session ownership checks before attachment injection.
- Rejected unknown explicit sessions (no implicit arbitrary session creation).
- WebSocket invalid session_id now returns explicit error (no silent fallback).
kernel multi-session compaction targeting
- Added compact_agent_session_in_session.
- Streaming pre/post compaction now targets resolved_session_id instead of default session.
migrate provider normalization
- Removed global unknown-provider - -> _ rewrite.
- Added explicit claude_code/claude-code handling and compatibility assertions.
workflow/kernel rollout execution enforcement
- Added route_workflow_for_primary_path and enforced Openfang primary path in routed execution.
- Made route score arithmetic overflow-safe (saturating_add).
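For readers unfamiliar with `saturating_add`, here is a minimal sketch of overflow-safe score accumulation. The function names and the three weight parameters are hypothetical; only the use of `saturating_add` comes from the fix list above.

```rust
/// Sum hypothetical per-dimension weights into a route score.
/// `saturating_add` clamps at `u32::MAX` instead of panicking in debug
/// builds or wrapping around in release builds.
fn route_score(channel_weight: u32, task_type_weight: u32, risk_weight: u32) -> u32 {
    channel_weight
        .saturating_add(task_type_weight)
        .saturating_add(risk_weight)
}

/// Alternative that reports overflow to the caller instead of clamping.
fn route_score_checked(channel_weight: u32, task_type_weight: u32, risk_weight: u32) -> Option<u32> {
    channel_weight
        .checked_add(task_type_weight)?
        .checked_add(risk_weight)
}
```

Saturation keeps route ordering monotonic for extreme weights, whereas a wrapped sum could rank an overflowing route below a weaker one.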

Validation:

Full comprehensive pre-PR gate rerun passed (run_pre_pr_review.py, 11/11 commands).
Latest gate log:
- .codex-tasks/openfang-multiagent-foundation/logs/pre-pr-review-20260308-124253.log
Additional targeted regressions passed for:
- attachment session ownership checks
- cross-agent compaction rejection
- rollout-primary routing enforcement
- provider mapping compatibility
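The attachment session ownership regression listed above guards behavior along these lines. This is a sketch under assumptions: `SessionStore`, `SessionError`, and `check_attachment_target` are hypothetical names, and the real check in routes.rs may use different types.

```rust
use std::collections::HashMap;

/// Hypothetical error type for explicit session targeting.
#[derive(Debug, PartialEq, Eq)]
enum SessionError {
    UnknownSession,
    NotOwner,
}

/// Hypothetical store mapping session_id to the owning agent_id.
#[derive(Default)]
struct SessionStore {
    owners: HashMap<String, String>,
}

impl SessionStore {
    /// Validate an explicit session_id before injecting attachments.
    /// Unknown sessions are rejected rather than created, and sessions
    /// owned by another agent are refused.
    fn check_attachment_target(&self, agent_id: &str, session_id: &str) -> Result<(), SessionError> {
        match self.owners.get(session_id) {
            None => Err(SessionError::UnknownSession),
            Some(owner) if owner.as_str() == agent_id => Ok(()),
            Some(_) => Err(SessionError::NotOwner),
        }
    }
}
```

Because the check takes `&self`, it cannot insert a missing session as a side effect, which is the property the "no implicit arbitrary session creation" fix is after.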

T028 (comprehensive pre-PR review gate) is back to DONE.

NextDoorLaoHuang-HF · 2026-03-08T04:47:38Z

Post-fix update for blocking review items (corrected formatting):

Fixed all High findings from the comprehensive pre-PR review in commit b64964b.
Added/updated regressions for session ownership, compaction session targeting, provider alias normalization, and rollout primary-path enforcement.
Also fixed a Medium issue: an invalid WebSocket session_id now returns an explicit error instead of silently falling back.
Re-ran comprehensive pre-PR gate with GO decision (log: .codex-tasks/openfang-multiagent-foundation/logs/pre-pr-review-20260308-124253.log).

This PR is now Ready for review.

NextDoorLaoHuang-HF · 2026-03-08T05:39:57Z

Process hardening update pushed in 99d7d3b:

Added pre-pr-review-gate CI workflow to enforce required PR review sections/checklist on pull_request -> main.
Added PR template with mandatory sections for validation evidence, findings, risks, and rollback.
Added branch-protection automation helper scripts/ci/configure_branch_protection.sh and verified it on fork NextDoorLaoHuang-HF/openfang:main.
Added docs/pr-quality-gates.md and wired it into CONTRIBUTING.md / docs/README.md.
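The idea behind a required-sections gate can be sketched as a small check over the PR body. This is not the actual CI workflow, whose implementation is not shown in this thread; `REQUIRED_SECTIONS` and `missing_sections` are hypothetical, and the section names are assumed from the template description above.

```rust
/// Hypothetical list of mandatory PR sections.
const REQUIRED_SECTIONS: &[&str] = &["Validation", "Findings", "Risks", "Rollback"];

/// Return the required sections that have no matching heading line.
/// A heading is any line that, after stripping surrounding whitespace
/// and leading '#' characters, starts with the section name.
fn missing_sections(pr_body: &str) -> Vec<&'static str> {
    REQUIRED_SECTIONS
        .iter()
        .copied()
        .filter(|section| {
            !pr_body.lines().any(|line| {
                line.trim()
                    .trim_start_matches('#')
                    .trim_start()
                    .starts_with(*section)
            })
        })
        .collect()
}
```

A gate built this way would fail the check whenever `missing_sections` returns a non-empty list and report the names so the author knows what to add.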

Note: per local policy preference, codex longrun runtime scripts/logs remain local-only in ignored paths (.codex-tasks/, .longrun/) and are not committed.

root added 2 commits March 7, 2026 22:46

feat(workflow): implement multi-agent foundation baseline

0e41b83

chore(tests): rustfmt session resume integration test

c5853a8

NextDoorLaoHuang-HF marked this pull request as draft March 7, 2026 16:20

fix(multiagent): resolve pre-PR high-risk routing/session/migrate issues

b64964b

NextDoorLaoHuang-HF marked this pull request as ready for review March 8, 2026 04:47

chore(governance): enforce pre-pr review gate workflow

99d7d3b

NextDoorLaoHuang-HF wants to merge 4 commits into RightNow-AI:main from NextDoorLaoHuang-HF:feat/multiagent-foundation-v1
