Skip to content

Enterprise hardening baseline: security, reliability, CI gates, and runbooks#32

Open
ndycode wants to merge 5 commits intomainfrom
feat/enterprise-hardening
Open

Enterprise hardening baseline: security, reliability, CI gates, and runbooks#32
ndycode wants to merge 5 commits intomainfrom
feat/enterprise-hardening

Conversation

@ndycode
Copy link
Owner

@ndycode ndycode commented Mar 3, 2026

Summary

This PR implements a comprehensive enterprise hardening baseline across runtime behavior, storage safety, CLI contracts, CI policy gates, and operations documentation.

What’s Included

  • Runtime/data hardening:
    • Cross-process file locking for settings/quota writes
    • At-rest secret encryption + key rotation support
    • Idempotency key support for codex auth rotate-secrets
    • RBAC/ABAC-style authorization gates for CLI actions
    • JSON output redaction mode for sensitive fields
    • Startup data retention enforcement
    • Background retry + dead-letter queue for failed async persistence jobs
  • CLI/API contract maturity:
    • schemaVersion in JSON outputs
    • JSON list pagination standard (--page-size, --cursor)
  • Enterprise CI/security controls:
    • Secret scanning workflow
    • Supply-chain workflow (dependency review, SCA/license gate, SBOM generation)
    • Expanded CI checks and cross-platform smoke coverage
    • Required checks policy-as-code (.github/settings.yml)
  • Ops maturity:
    • Operations runbook
    • Incident response playbook
  • Docs updated for commands/settings/privacy/API/testing references.

Validation

  • npm run typecheck
  • npm run lint
  • npm run build && npm test
  • npm run coverage
  • npm run audit:ci
  • npm run license:check
  • npm run clean:repo:check

Notes

  • Work performed in isolated worktree on branch feat/enterprise-hardening.
  • No changes were made on main during implementation.

note: greptile review for oc-chatgpt-multi-auth. cite files like lib/foo.ts:123. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary

comprehensive enterprise hardening baseline implementing cross-process file locking for all settings/account writes (defends windows antivirus/concurrent access races), at-rest token encryption with scrypt kdf and key rotation, rbac/abac authorization gates with audited break-glass, background job retry + dead-letter queue, idempotency key support for rotate-secrets, json output redaction, and data retention enforcement.

key improvements addressing windows filesystem safety:

  • all account/settings writes now guarded by cross-process file locks using exclusive write mode (wx)
  • stale lock cleanup after 2min timeout handles crashed processes
  • retry logic with exponential backoff for EBUSY/EPERM errors
  • temp-file-rename pattern for atomic writes
  • sync operations use Atomics.wait instead of blocking event loop
  • test/file-lock.test.ts:83-174 spawns 6 concurrent processes verifying mutual exclusion under contention

security/reliability enhancements:

  • refresh/access tokens encrypted at rest with aes-256-gcm + scrypt (v2), backward-compatible v1 decryption
  • authorization module now audits break-glass bypass (addresses previous comment)
  • background job failures written to dlq with redacted context
  • debounced account saves use retry mechanism preventing lost writes

ci/supply-chain controls:

  • cross-platform smoke tests on windows/macos runners
  • gitleaks secret scanning weekly + on pr/push
  • dependency review blocking high severity runtime deps
  • license gate denying gpl/agpl
  • sbom generation

test coverage:

  • multi-process concurrent write race test reproduces windows contention bugs
  • authorization tests verify break-glass audit trail
  • secrets-crypto tests cover v1/v2 compatibility and key rotation

all validation commands pass per pr description. previous review comments addressed: scrypt kdf replaces sha256, break-glass now audited, concurrent write race test added.

Confidence Score: 5/5

  • safe to merge - thorough hardening with comprehensive test coverage and all previous security concerns addressed
  • all three previous review comments have been resolved (scrypt kdf, break-glass audit, concurrent race test). file locking defends windows filesystem races, encryption protects tokens at rest, authorization gates cli actions, comprehensive test suite including multi-process concurrency verification. ci expanded with cross-platform smoke tests and supply-chain gates. no logical errors or security vulnerabilities found.
  • no files require special attention - implementation follows security best practices throughout

Important Files Changed

Filename Overview
lib/file-lock.ts new cross-process file locking with exponential backoff, stale lock cleanup, and windows filesystem concurrency handling via exclusive write mode and atomics-based sync sleep
lib/authorization.ts rbac/abac authorization gates for cli actions with break-glass bypass that now correctly audits emergency access
lib/secrets-crypto.ts at-rest encryption using aes-256-gcm with scrypt kdf (v2) and legacy sha256 (v1) support, key rotation via primary/previous keys
lib/storage.ts guards all account writes with file locks, encrypts refresh/access tokens at rest, supports key rotation, decrypts on read
lib/unified-settings.ts settings persistence with file locks on all writes, in-process queue for async writes, temp-file-rename pattern, windows ebusy/eperm retry logic
lib/background-jobs.ts retry mechanism with exponential backoff and dead-letter queue for failed async persistence, uses file locks for dlq writes, redacts context
lib/idempotency.ts idempotency key storage with file locks, 24h ttl, supports rotate-secrets command replay protection
test/file-lock.test.ts comprehensive test suite including multi-process concurrent write race test with 6 workers verifying mutual exclusion under contention
.github/workflows/ci.yml expanded ci with node 20/22 matrix, coverage gate, cross-platform smoke tests on windows/macos, security audit, lockfile floor guard
.github/workflows/supply-chain.yml dependency review, license gate denying gpl/agpl, sca audit, sbom generation with cyclonedx

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[CLI Command] --> B{Authorization Check}
    B -->|Break-Glass Set| C[Audit Break-Glass]
    B -->|ABAC Policy| D{Policy Allows?}
    D -->|Denied| E[Return Error]
    D -->|Allowed| F{RBAC Role Check}
    F -->|Denied| E
    F -->|Allowed| G[Execute Action]
    C --> G
    
    G --> H{Mutates Storage?}
    H -->|No| I[Return Success]
    H -->|Yes| J[Acquire File Lock]
    J --> K{Lock Acquired?}
    K -->|Timeout/Fail| L[Background Job Retry]
    K -->|Success| M[Encrypt Secrets]
    M --> N[Write Temp File]
    N --> O[Atomic Rename]
    O --> P[Release Lock]
    P --> I
    
    L --> Q{Max Retries?}
    Q -->|Yes| R[Write to DLQ]
    Q -->|No| S[Exponential Backoff]
    S --> J
    R --> E
Loading

Last reviewed commit: d1c603e

Add enterprise-grade hardening across runtime, CLI, storage, CI, and docs.\n\n- Add cross-process file locking for settings/quota persistence\n- Add at-rest secret encryption with rotation command and idempotency support\n- Add RBAC/ABAC-style command authorization, JSON redaction, and retention policies\n- Add background retry + dead-letter queue for async persistence failures\n- Add list JSON pagination standard and schemaVersion contract updates\n- Add CI security gates: secret scan, supply-chain/SCA/license checks, SBOM, required checks policy\n- Add operations and incident response runbooks\n- Add/extend tests for new security/reliability primitives and CLI behaviors\n\nValidated with:\n- npm run typecheck\n- npm run lint\n- npm run build && npm test\n- npm run coverage\n- npm run audit:ci\n- npm run license:check\n- npm run clean:repo:check

Co-authored-by: Codex <noreply@openai.com>
@coderabbitai
Copy link

coderabbitai bot commented Mar 3, 2026

Warning

Rate limit exceeded

@ndycode has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 1 minutes and 1 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 851a6a48-b1ed-4ef9-ad8e-0a5e29848015

📥 Commits

Reviewing files that changed from the base of the PR and between d36b04f and d1c603e.

📒 Files selected for processing (46)
  • .github/settings.yml
  • .github/workflows/ci.yml
  • .github/workflows/secret-scan.yml
  • .github/workflows/supply-chain.yml
  • README.md
  • docs/README.md
  • docs/configuration.md
  • docs/development/CONFIG_FIELDS.md
  • docs/development/TESTING.md
  • docs/index.md
  • docs/privacy.md
  • docs/reference/commands.md
  • docs/reference/error-contracts.md
  • docs/reference/public-api.md
  • docs/reference/settings.md
  • docs/reference/storage-paths.md
  • docs/runbooks/README.md
  • docs/runbooks/incident-response.md
  • docs/runbooks/operations.md
  • index.ts
  • lib/accounts.ts
  • lib/audit.ts
  • lib/authorization.ts
  • lib/background-jobs.ts
  • lib/codex-manager.ts
  • lib/data-redaction.ts
  • lib/data-retention.ts
  • lib/file-lock.ts
  • lib/idempotency.ts
  • lib/index.ts
  • lib/quota-cache.ts
  • lib/secrets-crypto.ts
  • lib/storage.ts
  • lib/unified-settings.ts
  • package.json
  • scripts/license-policy-check.js
  • test/audit.test.ts
  • test/authorization.test.ts
  • test/background-jobs.test.ts
  • test/codex-manager-cli.test.ts
  • test/data-redaction.test.ts
  • test/data-retention.test.ts
  • test/file-lock.test.ts
  • test/idempotency.test.ts
  • test/quota-cache.test.ts
  • test/secrets-crypto.test.ts
📝 Walkthrough

Walkthrough

adds wide-ranging security, auditing, and reliability features: authorization/abac (lib/authorization.ts), secrets envelope crypto (lib/secrets-crypto.ts), background job retries + dlq (lib/background-jobs.ts), file locks (lib/file-lock.ts), idempotency store (lib/idempotency.ts), data retention (lib/data-retention.ts), redaction (lib/data-redaction.ts), storage encryption + rotation (lib/storage.ts), audit integration (lib/audit.ts, index.ts), and ci/supply-chain workflows (.github/...). see lib/... files for implementations.

Changes

Cohort / File(s) Summary
github settings & workflows
.github/settings.yml, .github/workflows/ci.yml, .github/workflows/secret-scan.yml, .github/workflows/supply-chain.yml
adds branch protection, concurrency and matrix changes, secret-scan, supply-chain jobs, coverage gating, and cross-platform smoke tests.
docs and runbooks
README.md, docs/..., docs/runbooks/...
adds docs about encryption/rotation, retention, new CLI flags, and runbooks including incident-response and operations.
core auth & auditing
lib/authorization.ts, lib/audit.ts, index.ts
introduces role types, authorizeAction(), break-glass audit action, and re-exports for new modules.
secrets and storage
lib/secrets-crypto.ts, lib/storage.ts
adds envelope AES-256-GCM secret encryption, key rotation API (rotateStoredSecretEncryption), per-field encryption during persist, and storage error helpers.
concurrency primitives
lib/file-lock.ts, lib/unified-settings.ts, lib/quota-cache.ts
adds robust file locks (sync/async), wraps writes with locks, and retries/atomic rename protections.
background jobs & idempotency
lib/background-jobs.ts, lib/idempotency.ts
implements retry-with-backoff, DLQ write with redaction, and disk-backed idempotency store with TTL and file-locking.
redaction & retention
lib/data-redaction.ts, lib/data-retention.ts
adds recursive redaction utility and enforceDataRetention() that prunes logs/cache/state files from multi-auth dir.
cli manager changes
lib/codex-manager.ts, index.ts
integrates authorization checks, audit events, idempotency handling, rotate-secrets command, JSON schema/versioning, pagination and redaction hooks.
tests added/updated
test/*.test.ts (authorization.test.ts, background-jobs.test.ts, file-lock.test.ts, idempotency.test.ts, secrets-crypto.test.ts, data-retention.test.ts, data-redaction.test.ts, codex-manager-cli.test.ts, quota-cache.test.ts, audit.test.ts)
adds extensive test coverage for new modules and integrations; some tests exercise concurrency and DLQ behavior.
package & scripts
package.json, scripts/license-policy-check.js
adds license:check and sbom scripts and a license-policy-check node script used in supply-chain workflow.

Sequence Diagram(s)

sequenceDiagram
participant cli as CLI
participant auth as authorization (`lib/authorization.ts`)
participant idemp as idempotency (`lib/idempotency.ts`)
participant storage as storage (`lib/storage.ts`)
participant audit as audit (`lib/audit.ts`)
participant out as stdout

cli->>auth: ensureAuthorized(secrets:rotate)
auth-->>cli: {allowed}
cli->>idemp: checkAndRecordIdempotencyKey(scope="rotate", key=K)
alt replayed
  idemp-->>cli: {replayed:true}
  cli->>audit: emit(EVENT=rotate.replay)
  cli->>out: "replay: no-op"
else not replayed
  idemp-->>cli: {replayed:false}
  cli->>audit: emit(EVENT=rotate.start)
  cli->>storage: rotateStoredSecretEncryption(idempotencyKey=K)
  storage-->>audit: emit(EVENT=rotate.complete)
  storage-->>cli: {accounts:n, flaggedAccounts:m}
  cli->>out: JSON/PLAINTEXT result
end
Loading

review notes

  • missing regression tests: i did not find a dedicated test for rotateStoredSecretEncryption error paths or for cross-key rotation edge cases in test/... (see test/codex-manager-cli.test.ts). add explicit rotation failure/regression tests.
  • windows edge cases: file-lock and atomic rename behavior may behave differently on windows. ensure windows-specific tests or CI gating for lib/file-lock.ts and lib/unified-settings.ts. see test/file-lock.test.ts.
  • concurrency risks: many new cross-process locks and DLQ writes. review lock acquisition semantics and timeouts for deadlock and long-held locks under high contention (lib/file-lock.ts, lib/background-jobs.ts, lib/quota-cache.ts).

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Title check ⚠️ Warning Title exceeds 72-character limit at 76 characters and uses title case instead of lowercase imperative. Does not follow conventional commits format (type(scope): summary). Rewrite as: feat(cli,ops): enterprise hardening baseline or similar, ensuring lowercase imperative under 72 chars per conventional commits spec.
Docstring Coverage ⚠️ Warning Docstring coverage is 7.48% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
Description check ✅ Passed Description is comprehensive and well-structured with summary, detailed what's included, validation steps, and notes. Covers most required checklist items with command evidence.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/enterprise-hardening

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

44 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

- migrate new envelopes to salted scrypt-derived AES keys\n- retain legacy enc:v1 decryption compatibility\n- add coverage for v2 and legacy v1 crypto paths\n- clarify high-entropy 32-byte key requirements in docs

Co-authored-by: Codex <noreply@openai.com>
ndycode and others added 2 commits March 4, 2026 04:35
- audit break-glass authorization bypass events\n- add multi-process contention regression for file locking\n- cover new auth.break_glass audit action in tests

Co-authored-by: Codex <noreply@openai.com>
Introduce optional environment-driven ABAC constraints layered ahead of RBAC for auth actions and commands.

- add ABAC policy evaluation in authorization module
- pass command/interactive/idempotency context from CLI command routing
- enforce read authorization on read command paths consistently
- add unit and CLI coverage for ABAC deny and idempotency requirements
- document ABAC controls in config and operations runbooks

Co-authored-by: Codex <noreply@openai.com>
@greptile-apps
Copy link

greptile-apps bot commented Mar 3, 2026

Additional Comments (1)

lib/storage.ts, line 113
accounts storage uses in-memory mutex but not file locks. settings (lib/unified-settings.ts:130), quota cache (lib/quota-cache.ts:230), dlq, and idempotency store all use acquireFileLock for cross-process safety. if cli and plugin can write accounts.json concurrently, this mutex won't prevent windows filesystem corruption. consider adding file lock wrapper like other critical stores.

Co-authored-by: Codex <noreply@openai.com>
@ndycode
Copy link
Owner Author

ndycode commented Mar 5, 2026

@coderabbitai review

Re-triggering review to clear the transient rate-limit status check failure.

@coderabbitai
Copy link

coderabbitai bot commented Mar 5, 2026

@ndycode: re-triggering the review now to clear that transient status check failure.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@ndycode
Copy link
Owner Author

ndycode commented Mar 5, 2026

Re-requesting CodeRabbit review: the current failure is CodeRabbit: Review rate limit exceeded while other checks are green.

@ndycode
Copy link
Owner Author

ndycode commented Mar 5, 2026

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Mar 5, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@ndycode
Copy link
Owner Author

ndycode commented Mar 5, 2026

Follow-up tracking: external blocker recorded in #44.

Current failing context remains:

  • CodeRabbit: Review rate limit exceeded

Actions already taken:

  • re-triggered @coderabbitai review
  • verified review threads are resolved

Next required step is external quota reset/increase, then rerun CodeRabbit.

@coderabbitai
Copy link

coderabbitai bot commented Mar 5, 2026

@ndycode: acknowledged — the rate limit blocker is noted and tracked in #44. i'll attempt to trigger the review now; if the quota has reset since the last attempt, this should proceed. if it hits the limit again, the review will need to wait for the external quota reset before retrying.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 30

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
docs/reference/settings.md (1)

131-157: ⚠️ Potential issue | 🟡 Minor

upgrade.md needs to consolidate new security config in rollout checklist.

env vars (CODEX_AUTH_ENCRYPTION_KEY, CODEX_AUTH_PREVIOUS_ENCRYPTION_KEY, CODEX_AUTH_ROLE, CODEX_AUTH_BREAK_GLASS, CODEX_AUTH_ABAC_*, CODEX_AUTH_REDACT_JSON_OUTPUT) are documented in configuration.md and reference/settings.md, and rotate-secrets workflow is in runbooks/operations.md and reference/commands.md, but upgrade.md itself skips the encryption/ABAC setup steps and npm hardening scripts (audit:ci, license:check) that should be part of the migration process. add a new section under "Migration Checklist" that calls out:

  1. configure encryption keys if enabling at-rest secret storage
  2. set ABAC policy (CODEX_AUTH_ROLE baseline, denial rules) for your security posture
  3. run hardening checks: npm run audit:ci && npm run license:check && npm run clean:repo:check
  4. validate key rotation with codex auth rotate-secrets --json before promoting to production

windows concurrency edge cases for atomic file writes and lock retries are already covered in test/unified-settings.test.ts and test/storage.test.ts, so persistence safety is good.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/settings.md` around lines 131 - 157, Add a new "Encryption &
ABAC" subsection under Migration Checklist in upgrade.md that consolidates the
security config and rollout steps: instruct operators to configure
CODEX_AUTH_ENCRYPTION_KEY and CODEX_AUTH_PREVIOUS_ENCRYPTION_KEY when enabling
at-rest secret storage, set baseline CODEX_AUTH_ROLE and ABAC denial rules
(CODEX_AUTH_ABAC_READ_ONLY, CODEX_AUTH_ABAC_DENY_ACTIONS,
CODEX_AUTH_ABAC_DENY_COMMANDS, CODEX_AUTH_ABAC_REQUIRE_INTERACTIVE,
CODEX_AUTH_ABAC_REQUIRE_IDEMPOTENCY_KEY), enable CODEX_AUTH_BREAK_GLASS and
CODEX_AUTH_REDACT_JSON_OUTPUT as needed, run hardening checks with npm run
audit:ci && npm run license:check && npm run clean:repo:check, and validate
rotation with codex auth rotate-secrets --json before promoting to production;
reference the existing docs (reference/settings.md, configuration.md), runbooks
(runbooks/operations.md), and tests (test/unified-settings.test.ts,
test/storage.test.ts) as follow-up verification.
lib/unified-settings.ts (2)

294-299: ⚠️ Potential issue | 🟠 Major

cross-process lost-update race still exists in read-modify-write flows.

lib/unified-settings.ts:294-299 and lib/unified-settings.ts:338-342 read the settings record before the cross-process write lock is taken, so two processes can read the same base document and overwrite each other’s section updates. the lock needs to wrap the full read-modify-write transaction, not only rename/write.

as per coding guidelines "lib/**: focus on ... concurrency."

Also applies to: 338-342

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/unified-settings.ts` around lines 294 - 299, The read-modify-write race
occurs because readSettingsRecordAsync() is called before acquiring the
cross-process lock in saveUnifiedPluginConfig (and the similar block at lines
338-342); move the read inside the enqueueSettingsWrite(async () => { ... })
callback so the lock acquired by enqueueSettingsWrite wraps the entire
read-modify-write sequence, i.e., call readSettingsRecordAsync() only after
entering the enqueueSettingsWrite callback, perform the record.pluginConfig
update, then call writeSettingsRecordAsync(); apply the same change to the other
function/block referenced (lines 338-342) and keep using enqueueSettingsWrite,
readSettingsRecordAsync, and writeSettingsRecordAsync as the unique identifiers.

130-224: ⚠️ Potential issue | 🟠 Major

add vitest regressions for the new lock path on windows-style contention.

lib/unified-settings.ts:130-224 introduces retry/backoff and cross-process lock behavior, but the provided tests do not show direct regressions for unified settings under contention (including ebusy/eperm rename pressure). add targeted vitest coverage for this path.

as per coding guidelines "lib/**: verify every change cites affected tests (vitest)andtest/**`: demand regression cases that reproduce concurrency bugs ... and windows filesystem behavior."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/unified-settings.ts` around lines 130 - 224, Add vitest regression tests
that simulate Windows-style rename contention for the unified settings code
paths: write tests targeting writeSettingsRecordAsync (and the sync counterpart
that uses acquireFileLockSync) in lib/unified-settings.ts by mocking
fs.rename/renameSync to throw transient EBUSY/EPERM errors for the first N
attempts and succeed thereafter, assert that the function eventually succeeds,
that temp files are removed on failure (unlink/unlinkSync called), and that the
lock (acquireFileLock / acquireFileLockSync via UNIFIED_SETTINGS_LOCK_PATH) is
always released; also include a test where errors are non-retryable
(isRetryableFsError returns false) to ensure the functions propagate the error.
Ensure tests use vitest mocks/stubs and cover backoff/retry behavior and
cleanup.
docs/privacy.md (1)

83-111: ⚠️ Potential issue | 🟠 Major

cleanup commands miss the dlq artifact.

docs/privacy.md:83-111 does not remove background-job-dlq.jsonl, even though it is listed as canonical local data and managed in retention logic (lib/data-retention.ts:112-115). add it to both bash and powershell cleanup snippets.

proposed docs patch
 # bash
 rm -f ~/.codex/multi-auth/openai-codex-flagged-accounts.json
 rm -f ~/.codex/multi-auth/quota-cache.json
+rm -f ~/.codex/multi-auth/background-job-dlq.jsonl

 # powershell
 Remove-Item "$HOME\.codex\multi-auth\openai-codex-flagged-accounts.json" -Force -ErrorAction SilentlyContinue
 Remove-Item "$HOME\.codex\multi-auth\quota-cache.json" -Force -ErrorAction SilentlyContinue
+Remove-Item "$HOME\.codex\multi-auth\background-job-dlq.jsonl" -Force -ErrorAction SilentlyContinue

as per coding guidelines "docs/**: keep README, SECURITY, and docs consistent with actual CLI flags and workflows."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/privacy.md` around lines 83 - 111, Add removal of the
background-job-dlq.jsonl artifact to the cleanup snippets in docs/privacy.md so
docs match the actual retention logic in lib/data-retention.ts (lines
referencing background-job-dlq.jsonl). Specifically, update the Bash snippet to
rm -f ~/.codex/multi-auth/background-job-dlq.jsonl (and any override-root
conditional lines that mirror other removals) and update the PowerShell snippet
to Remove-Item "$HOME\.codex\multi-auth\background-job-dlq.jsonl" -Force
-ErrorAction SilentlyContinue (and the corresponding $env:CODEX_MULTI_AUTH_DIR
override handling).
lib/storage.ts (1)

1446-1468: ⚠️ Potential issue | 🟠 Major

flagged-account writes are not protected by cross-process file locking.

lib/storage.ts:1446-1468 uses withStorageLock only (in-process). concurrent cli processes can still race on saveFlaggedAccounts, causing lost updates and rename conflicts on windows-heavy workflows.

use a file lock (same pattern as accounts) for flagged-account read/modify/write paths, and add a contention vitest regression.

As per coding guidelines lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/storage.ts` around lines 1446 - 1468, saveFlaggedAccounts currently only
uses the in-process withStorageLock so concurrent CLI processes can race; update
saveFlaggedAccounts to acquire a cross-process file lock (same pattern used by
the accounts read/modify/write helpers) around the getFlaggedAccountsPath
operations, perform the read/normalize/serialize/write to a temp file then
atomic-rename under that file lock, and ensure cleanup on failure; additionally
add a vitest regression that spawns concurrent saveFlaggedAccounts calls to
reproduce and prevent lost updates (simulate EBUSY/rename contention and
retry/backoff for EBUSY/429), reuse the same lock acquisition/retry logic from
the accounts implementation, and review log.error calls in saveFlaggedAccounts
to avoid leaking emails/tokens when logging errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/ci.yml:
- Around line 146-147: The CI step "Run smoke tests" currently omits key Windows
concurrency/regression tests; update .github/workflows/ci.yml so that the
Windows matrix (windows-latest) runs the lock and background persistence smoke
tests by including test/file-lock.test.ts and test/background-jobs.test.ts in
the "npm run test" invocation (or add a separate run step for those files) for
the windows job; ensure the job that invokes "Run smoke tests" or the matrix
entry that targets windows runs these specific tests so fs/concurrency and
background-job regressions are caught on Windows.

In `@docs/configuration.md`:
- Line 73: The table cell for CODEX_AUTH_ROLE contains raw pipe characters which
break Markdown table columns; replace the inner pipes with escaped HTML entities
(e.g., change admin|operator|viewer to admin&#124;operator&#124;viewer) or wrap
the entire cell in an HTML <code> element to preserve the literal pipes, and
then verify the values match the actual CLI/constant in lib/secrets-crypto.ts
(the CODEX_AUTH_ROLE enumeration/strings) so docs/configuration.md stays
consistent with runtime behavior.

In `@docs/reference/commands.md`:
- Line 58: Update the recovery workflow example for the rotate-secrets CLI to
include the documented --idempotency-key flag so the example matches the flag
table; change the non-idempotent rotate-secrets invocation in
docs/reference/commands.md (the recovery example and the nearby example at lines
~115-116) to include a stable idempotency token (e.g. --idempotency-key
"$CI_JOB_ID" or a generated UUID) and ensure the example text mentions
re-running safely; verify this aligns with the rotate-secrets handling in the
background job logic (see the rotate-secrets usage in lib/index.ts and the
idempotency handling in the background job code around the rotate job) so docs
and implementation are consistent.

In `@docs/runbooks/incident-response.md`:
- Around line 28-32: Update the "rotate-secrets" containment step in the
incident runbook to be idempotent by appending the CLI flag "--idempotency-key
<incident-id>" to the "codex auth rotate-secrets" invocation so retries won't
cause duplicate side effects; mirror this flag usage in documentation and
mention the change in upgrade notes and npm scripts as appropriate, and
cross-reference the implementation locations (lib/background-jobs.ts function
handling background job retries and lib/index.ts CLI command registration) to
ensure the flag name and behavior match the actual code.

In `@index.ts`:
- Around line 270-283: The wrapper exchangeAuthorizationCodeWithRateLimit
currently lets exceptions from checkAuthRateLimit and exchangeAuthorizationCode
bubble up instead of returning a TokenResult failure; update
exchangeAuthorizationCodeWithRateLimit to catch errors thrown by
checkAuthRateLimit, recordAuthAttempt, and exchangeAuthorizationCode and convert
them into the appropriate TokenResult failure object (preserving error
message/details), while still calling resetAuthRateLimit on success; apply the
same pattern to the other wrappers referenced (the oauth paths at the locations
noted) and add Vitest regressions covering (1) blocked-by-rate-limit and (2)
network/error thrown from token exchange to ensure the functions always return a
TokenResult rather than throwing.
- Around line 228-232: The startup call to enforceDataRetention() runs
concurrently with storage/account startup and must be serialized using the same
storage lock/transaction used for account writes; modify the startup flow to
acquire the storage lock/transaction wrapper (the same mechanism used for
account writes) before invoking enforceDataRetention(), and update
lib/data-retention.ts to expose a lock-aware entry (e.g.,
enforceDataRetentionUnderLock or accept a transaction/lock token) so the cleanup
runs inside the storage lock; add a vitest regression (e.g.,
data-retention.spec.ts) that simulates Windows-style file-handle contention by
holding the storage lock/handle during startup and asserting that
enforceDataRetention either waits for the lock or fails gracefully without
eperm/ebusy errors to verify serialization.

In `@lib/accounts.ts`:
- Around line 771-787: The current flow awaits this.pendingSave and if it
rejects the exception prevents scheduling the new debounced save; change the
logic in the block around pendingSave/runBackgroundJobWithRetry so a failed
prior save does not block the next one — e.g., replace awaiting the prior
promise with safe awaiting that swallows errors (use await
this.pendingSave.catch(() => {}) or wrap in try { await this.pendingSave } catch
{ /* ignore */ }) before assigning this.pendingSave =
runBackgroundJobWithRetry({...}). Keep the existing finally that nulls
this.pendingSave and ensure saveToDisk() is still invoked via
runBackgroundJobWithRetry; add a regression test in test/accounts*.test.ts that
simulates a rejecting prior pendingSave and asserts the subsequent save runs.
- Around line 774-784: Add a regression test in test/accounts.test.ts that
verifies runBackgroundJobWithRetry wrapped saves survive transient fs errors:
stub/mock the underlying saveAccounts (or AccountsStorage.saveAccounts) called
by AccountsManager.saveToDisk so it throws an EBUSY error on the first two calls
and returns successfully on the third, then invoke the debounced save path that
sets this.pendingSave (i.e., call the method that triggers saveToDisk via
runBackgroundJobWithRetry or directly call the code path that enqueues the job),
wait for the background job to finish (await pendingSave or flush
timers/promises), and assert that saveAccounts was called at least three times
and that the final attempt succeeded; ensure you reference
runBackgroundJobWithRetry, saveToDisk, saveAccounts and the pendingSave behavior
in the test setup.

In `@lib/authorization.ts`:
- Around line 30-36: getRoleFromEnv currently defaults unknown CODEX_AUTH_ROLE
values to "admin", which is fail-open; change its behavior to fail-closed by
returning "viewer" (or an explicit deny role if you prefer) for any
unrecognized/invalid role after trimming/lowercasing inside getRoleFromEnv, and
update the logic in that function to only return "operator"|"viewer"|"admin" for
valid inputs. Add a vitest regression in test/authorization.test.ts that sets
CODEX_AUTH_ROLE to an invalid string and asserts getRoleFromEnv() returns
"viewer" (or the chosen deny role) to prevent silent privilege escalation;
ensure the test resets env state afterward and does not log secrets.

In `@lib/background-jobs.ts`:
- Around line 43-47: The default retry policy in isRetryableByDefault currently
only checks errno-style codes; update isRetryableByDefault to also treat HTTP
429 (rate-limit) as retryable by accepting errors with a numeric
status/statusCode === 429 or an object shape indicating HTTP response code, and
ensure it still returns true for existing errno values (EBUSY, EPERM, EAGAIN,
ETIMEDOUT); add a vitest regression in test/background-jobs.test.ts that asserts
jobs failing with a 429 error are retried (and subsequently hit backoff) and
that EBUSY scenarios still retry, and when adding tests ensure any logs produced
by the queue/job code do not include tokens or emails.

In `@lib/codex-manager.ts`:
- Around line 4089-4102: The error path for the "rotate-secrets" JSON log prints
raw error payload and bypasses the redaction gate; wrap the error object passed
to JSON.stringify with maybeRedactJsonOutput(...) (same approach used in the
success path) so the error field is redacted before logging (preserve
JSON_OUTPUT_SCHEMA_VERSION and include idempotencyKey as before). Also add a
vitest regression that triggers the failure branch of rotate-secrets and asserts
that sensitive markers (e.g., tokens/emails) in the error message are redacted
in the logged JSON to prevent leaks.

In `@lib/data-redaction.ts`:
- Around line 31-49: The recursive walker in redactForExternalOutput (visit)
lacks a visited-object guard and will infinite-loop on cyclic objects; update
visit to track seen objects using a WeakSet (or similar) keyed by object
references, check the set at the start of visiting an object/array and return a
stable placeholder or the already-constructed result for cycles, and add logic
to record objects before recursing into their properties/array items (ensure
arrays and plain objects are handled). Update references to shouldRedactKey and
preserve redaction behavior when encountering previously-seen nodes. Add a
Vitest regression in test/data-redaction.test.ts that creates a self-referential
object (e.g., obj.self = obj) and asserts redactForExternalOutput returns
without throwing and that redacted keys remain masked.

In `@lib/data-retention.ts`:
- Around line 80-92: pruneSingleFile currently bails on non-ENOENT errors;
update the function to retry unlink/stat on transient Windows filesystem error
codes (e.g., "EBUSY", "EPERM" and optionally "EACCES") with a bounded
exponential backoff before rethrowing. Implement a small retry loop inside
pruneSingleFile around the fs.stat/fs.unlink calls (referencing the function
name pruneSingleFile), perform up to a fixed number of attempts (e.g., 4–6),
start with a short delay (e.g., 50–100ms) and double the delay each retry, await
a Promise-based sleep between attempts, and only return false on ENOENT; if
retries exhausted, rethrow the last error. Ensure the retry applies to both the
stat and unlink operations and preserve existing return semantics.
- Around line 70-75: The catch block in lib/data-retention.ts currently silences
all errors by only continuing on ENOENT and dropping every other error; update
the error handling in that catch (where you read const code = (error as
NodeJS.ErrnoException).code) so that if code === "ENOENT" you continue, but for
any other error you either log the error (with context about the path/operation)
and rethrow or propagate it so permission/IO failures are not silently ignored;
ensure the non-ENOENT branch uses the existing logger or throws the original
error to surface failures to callers.

In `@lib/file-lock.ts`:
- Around line 68-74: The lock file write path currently opens a file descriptor
into `handle`, writes metadata, and closes it but if `writeFile` (or any
intermediate step) throws the fd is leaked; wrap the open→write→close sequence
in a try/finally so `handle.close()` is always invoked (only if `handle` was
successfully assigned), and handle/ignore close errors safely to avoid masking
the original error; apply the same try/finally pattern to the equivalent block
around lines 137-144 (same `handle` usage), update or add Vitest cases that
simulate write failures (disk full/permission) to prove no fd leak, and ensure
any added logging does not emit tokens/emails while also validating behavior
under EBUSY/429 retry scenarios.

In `@lib/idempotency.ts`:
- Around line 86-121: checkAndRecordIdempotencyKey currently writes a key as
soon as a request starts, causing legitimate retries to be blocked if downstream
work fails; change the idempotency scheme to a two‑phase state or rollback
model: have checkAndRecordIdempotencyKey record a pending entry (e.g., { scope,
key, status: "pending", createdAtMs }) under the existing lock
(IDEMPOTENCY_LOCK_PATH) using loadFile/pruneExpired/saveFile, and expose
companion functions markIdempotencySucceeded(scope,key) and
clearIdempotencyOnFailure(scope,key) that update status to "succeeded" or remove
the pending entry inside the same lock; add retry/backoff in saveFile/loadFile
to handle EBUSY and transient I/O errors (429-style transient failures), ensure
callers in lib/codex-manager.ts use the new two‑phase flow (mark success only
after downstream rotation completes), and add a vitest regression test in
test/idempotency.test.ts that simulates a failure then retry to confirm the
retry is allowed; preserve existing pruning logic (pruneExpired) and avoid
logging sensitive tokens/emails.
- Around line 100-106: The idempotency lock is being acquired before the storage
directory exists which causes ENOENT on cold start; before calling
acquireFileLock with IDEMPOTENCY_LOCK_PATH (from lib/idempotency.ts) ensure the
parent directory for IDEMPOTENCY_PATH (or IDEMPOTENCY_LOCK_PATH) is created
(mkdir -p behavior) and handle transient ENOENT by retrying once; update acquire
sequence in the function that calls acquireFileLock to create
dirname(IDEMPOTENCY_PATH) (or dirname(IDEMPOTENCY_LOCK_PATH)) with safe
permissions and ensure any thrown ENOENT triggers a retry/create path rather
than bubbling up, then add/update vitest tests (e.g., idempotency.spec.ts)
covering cold-start directory-missing behavior and concurrency/error cases
(EBUSY/429) and verify logs do not leak sensitive tokens/emails.

In `@lib/index.ts`:
- Around line 32-38: Add a vitest public-api regression test that imports the
barrel (the exports from lib/index.ts) and snapshots the exported symbol list so
future refactors can't silently drop exports; specifically import the barrel
that re-exports file-lock.js, secrets-crypto.js, data-retention.js,
data-redaction.js, authorization.js, background-jobs.js, and idempotency.js
(refer to those module names) and assert a snapshot of
Object.keys(theImportedModule) or the sorted export names in
test/public-api*.test.ts, then commit the snapshot; this pins the Tier-B API
surface and will catch removals in future changes.

In `@lib/quota-cache.ts`:
- Around line 230-265: Add an integration test that spawns 4–6 child processes
which each call saveQuotaCache repeatedly (similar to
test/file-lock.test.ts:83-174) to exercise acquireFileLock/QUOTA_CACHE_LOCK_PATH
and the rename retry logic under Windows EBUSY/EPERM, and then add a brief
comment in lib/quota-cache.ts (near saveQuotaCache / the acquireFileLock usage)
pointing to the new multi-process test so future reviewers know this concurrency
behavior is validated; ensure the test asserts no lost/corrupted writes and that
rename retry exhaustion is handled.

In `@lib/secrets-crypto.ts`:
- Around line 163-172: The env-key loader currently accepts any non-empty
string; change getSecretEncryptionKeysFromEnv to validate each key returned by
getTrimmedEnv and reject keys with <32 bytes of entropy by converting the string
to a Buffer (e.g., Buffer.from(value, 'utf8') or the expected encoding) and
checking buffer.length >= 32, throwing a clear Error when a key is present but
too short; apply the same validation to the "previous" key and keep null when
absent. Also add vitest coverage asserting that getSecretEncryptionKeysFromEnv
throws for short keys and accepts keys >=32 bytes (and that absent keys remain
null).
- Around line 107-110: encryptSecret currently skips encryption based solely on
the "enc:v1:/enc:v2:" prefix which lets malformed envelopes bypass encryption;
update encryptSecret to only skip when isEncryptedSecret confirms a valid
envelope (or add a stricter validator like
isEncryptedEnvelope/isEncryptedSecretStrict and use that in
encryptSecret/parseEncryptedEnvelope), so malformed "enc:v2:..." strings get
treated as plaintext and encrypted normally; also add a vitest regression that
passes a string starting with "enc:v2:" but not a valid envelope and assert the
output is an encrypted envelope (not the original input).

In `@lib/storage.ts`:
- Around line 1313-1318: The decrypt failure in the loop that calls
decryptStorageSecret(refreshTokenRaw, "flagged refresh token") currently
swallows errors and continues, causing flagged accounts to be dropped from
normalized state and potentially lost on save; modify the handler in
lib/storage.ts to either preserve the original encrypted entry (e.g., keep
refreshTokenRaw in the normalized record) or rethrow/return a fail-fast error so
the entry is not removed, update the normalization code path that uses
refreshToken/refreshTokenRaw to prefer the encrypted raw value when decryption
fails, add a vitest regression named "cannot decrypt flagged token" that asserts
the encrypted entry is preserved after normalization and save, and ensure any
new logging around decrypt failures uses non-sensitive messages (no
tokens/emails), and that related tests cover filesystem/concurrency retry
behavior for EBUSY/429 when writing the preserved entries or queueing work.

In `@README.md`:
- Line 131: Add documentation for the --idempotency-key CLI flag alongside the
existing `codex auth rotate-secrets --json` entry: describe that
`--idempotency-key <value>` ensures safe retryable rotations by making the
request idempotent, note expected format (string/UUID) and that it’s used by
automated workflows, and reference its test usage in
`test/codex-manager-cli.test.ts:2101` for implementers; place the short flag
description in the same table row or an adjacent troubleshooting/automation note
per docs/** style guidelines.

In `@scripts/license-policy-check.js`:
- Around line 36-38: The current deny-check loop uses substring matching
(normalized.includes(denied)) which falsely flags licenses like "LGPL-3.0" when
"GPL-3.0" is denied; update the check in the loop over denyList to perform an
exact SPDX token/expression match instead of substring matching — e.g., parse
the normalized license expression into SPDX identifiers or split/tokenseparate
on operators and whitespace and compare exact identifiers to denied; keep
pushing violations with the same violations.push(`${name}@${version}
(${rawLicense})`) when a true exact match is found.

In `@test/background-jobs.test.ts`:
- Around line 26-49: Add a new Vitest case mirroring the existing "retries and
succeeds before exhausting attempts" test that specifically simulates rate-limit
(429) failures: import runBackgroundJobWithRetry and getBackgroundJobDlqPath,
create a task that throws an error representing an HTTP 429 (e.g., an Error with
status or statusCode = 429) for the first N-1 attempts and returns "ok" on the
Nth attempt, call runBackgroundJobWithRetry with small baseDelayMs/maxDelayMs
and appropriate maxAttempts, assert the result is "ok" and that the task ran the
expected number of attempts, and add a separate test that exhausts maxAttempts
on repeated 429 errors and then asserts fs.stat(getBackgroundJobDlqPath())
rejects with { code: "ENOENT" } or the appropriate DLQ presence check per
existing tests; reference runBackgroundJobWithRetry and getBackgroundJobDlqPath
to locate where to add/modify tests.

In `@test/data-redaction.test.ts`:
- Around line 1-34: Add regression tests to test/data-redaction.test.ts that
exercise edge cases for the redactForExternalOutput function: include a test
that passes null and undefined fields (and nested nulls) to ensure they are
preserved rather than causing errors, a test with arrays containing mixed
primitives, objects and nulls to ensure only sensitive keys in objects are
redacted, a test for empty objects/arrays to ensure they remain unchanged, and a
deep-nesting test (e.g., >50 levels) to guard against recursion/stack issues;
reference the redactForExternalOutput symbol when adding these cases and assert
exact expected outputs (null stays null, primitives unchanged, sensitive fields
replaced with "***REDACTED***").

In `@test/file-lock.test.ts`:
- Around line 49-57: The test currently uses a broad expectation
(rejects.toBeTruthy()) for acquireFileLock; replace it with a deterministic
assertion that checks the specific failure mode—e.g., assert the promise rejects
with the LockAcquisitionError (or the concrete error class your lock
implementation throws) or matches a known error code/message (for example use
expect(acquireFileLock(...)).rejects.toThrow(/acquire.*lock/i) or
expect(...).rejects.toMatchObject({ code: 'LOCK_BUSY' })) so the test only
passes for the intended lock contention error; import the specific error type
from the file-lock implementation if available and use that in the assertion.

In `@test/idempotency.test.ts`:
- Around line 26-50: The test suite lacks a concurrency regression for
idempotency key races: add a new test that concurrently calls
checkAndRecordIdempotencyKey for the same scope/key (e.g.,
"codex.auth.rotate-secrets", "key-concurrent") and asserts that exactly one
invocation returns { replayed: false } while all others return { replayed: true
}; use Promise.all to invoke many parallel calls (10+), await them, then count
results to ensure a single winner, and reuse getIdempotencyStorePath to validate
the store if desired—this proves the file-locking behavior around
checkAndRecordIdempotencyKey under concurrent access.

In `@test/secrets-crypto.test.ts`:
- Around line 62-66: Add a deterministic regression test in
test/secrets-crypto.test.ts that covers malformed prefixed plaintext inputs for
the isEncryptedSecret function: specifically add a case where the input begins
with the "enc:v2:" prefix but lacks the full expected segments (e.g.,
truncated/malformed payload) and assert that isEncryptedSecret(...) still
returns true; this ensures lib/secrets-crypto.ts::isEncryptedSecret handles
prefixed-but-malformed plaintext correctly and prevents regressions around
prefix recognition.

---

Outside diff comments:
In `@docs/privacy.md`:
- Around line 83-111: Add removal of the background-job-dlq.jsonl artifact to
the cleanup snippets in docs/privacy.md so docs match the actual retention logic
in lib/data-retention.ts (lines referencing background-job-dlq.jsonl).
Specifically, update the Bash snippet to rm -f
~/.codex/multi-auth/background-job-dlq.jsonl (and any override-root conditional
lines that mirror other removals) and update the PowerShell snippet to
Remove-Item "$HOME\.codex\multi-auth\background-job-dlq.jsonl" -Force
-ErrorAction SilentlyContinue (and the corresponding $env:CODEX_MULTI_AUTH_DIR
override handling).

In `@docs/reference/settings.md`:
- Around line 131-157: Add a new "Encryption & ABAC" subsection under Migration
Checklist in upgrade.md that consolidates the security config and rollout steps:
instruct operators to configure CODEX_AUTH_ENCRYPTION_KEY and
CODEX_AUTH_PREVIOUS_ENCRYPTION_KEY when enabling at-rest secret storage, set
baseline CODEX_AUTH_ROLE and ABAC denial rules (CODEX_AUTH_ABAC_READ_ONLY,
CODEX_AUTH_ABAC_DENY_ACTIONS, CODEX_AUTH_ABAC_DENY_COMMANDS,
CODEX_AUTH_ABAC_REQUIRE_INTERACTIVE, CODEX_AUTH_ABAC_REQUIRE_IDEMPOTENCY_KEY),
enable CODEX_AUTH_BREAK_GLASS and CODEX_AUTH_REDACT_JSON_OUTPUT as needed, run
hardening checks with npm run audit:ci && npm run license:check && npm run
clean:repo:check, and validate rotation with codex auth rotate-secrets --json
before promoting to production; reference the existing docs
(reference/settings.md, configuration.md), runbooks (runbooks/operations.md),
and tests (test/unified-settings.test.ts, test/storage.test.ts) as follow-up
verification.

In `@lib/storage.ts`:
- Around line 1446-1468: saveFlaggedAccounts currently only uses the in-process
withStorageLock so concurrent CLI processes can race; update saveFlaggedAccounts
to acquire a cross-process file lock (same pattern used by the accounts
read/modify/write helpers) around the getFlaggedAccountsPath operations, perform
the read/normalize/serialize/write to a temp file then atomic-rename under that
file lock, and ensure cleanup on failure; additionally add a vitest regression
that spawns concurrent saveFlaggedAccounts calls to reproduce and prevent lost
updates (simulate EBUSY/rename contention and retry/backoff for EBUSY/429),
reuse the same lock acquisition/retry logic from the accounts implementation,
and review log.error calls in saveFlaggedAccounts to avoid leaking emails/tokens
when logging errors.

In `@lib/unified-settings.ts`:
- Around line 294-299: The read-modify-write race occurs because
readSettingsRecordAsync() is called before acquiring the cross-process lock in
saveUnifiedPluginConfig (and the similar block at lines 338-342); move the read
inside the enqueueSettingsWrite(async () => { ... }) callback so the lock
acquired by enqueueSettingsWrite wraps the entire read-modify-write sequence,
i.e., call readSettingsRecordAsync() only after entering the
enqueueSettingsWrite callback, perform the record.pluginConfig update, then call
writeSettingsRecordAsync(); apply the same change to the other function/block
referenced (lines 338-342) and keep using enqueueSettingsWrite,
readSettingsRecordAsync, and writeSettingsRecordAsync as the unique identifiers.
- Around line 130-224: Add vitest regression tests that simulate Windows-style
rename contention for the unified settings code paths: write tests targeting
writeSettingsRecordAsync (and the sync counterpart that uses
acquireFileLockSync) in lib/unified-settings.ts by mocking fs.rename/renameSync
to throw transient EBUSY/EPERM errors for the first N attempts and succeed
thereafter, assert that the function eventually succeeds, that temp files are
removed on failure (unlink/unlinkSync called), and that the lock
(acquireFileLock / acquireFileLockSync via UNIFIED_SETTINGS_LOCK_PATH) is always
released; also include a test where errors are non-retryable (isRetryableFsError
returns false) to ensure the functions propagate the error. Ensure tests use
vitest mocks/stubs and cover backoff/retry behavior and cleanup.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d3985bea-67ef-4b2d-83ce-467a08b8216b

📥 Commits

Reviewing files that changed from the base of the PR and between d36b04f and d1c603e.

📒 Files selected for processing (46)
  • .github/settings.yml
  • .github/workflows/ci.yml
  • .github/workflows/secret-scan.yml
  • .github/workflows/supply-chain.yml
  • README.md
  • docs/README.md
  • docs/configuration.md
  • docs/development/CONFIG_FIELDS.md
  • docs/development/TESTING.md
  • docs/index.md
  • docs/privacy.md
  • docs/reference/commands.md
  • docs/reference/error-contracts.md
  • docs/reference/public-api.md
  • docs/reference/settings.md
  • docs/reference/storage-paths.md
  • docs/runbooks/README.md
  • docs/runbooks/incident-response.md
  • docs/runbooks/operations.md
  • index.ts
  • lib/accounts.ts
  • lib/audit.ts
  • lib/authorization.ts
  • lib/background-jobs.ts
  • lib/codex-manager.ts
  • lib/data-redaction.ts
  • lib/data-retention.ts
  • lib/file-lock.ts
  • lib/idempotency.ts
  • lib/index.ts
  • lib/quota-cache.ts
  • lib/secrets-crypto.ts
  • lib/storage.ts
  • lib/unified-settings.ts
  • package.json
  • scripts/license-policy-check.js
  • test/audit.test.ts
  • test/authorization.test.ts
  • test/background-jobs.test.ts
  • test/codex-manager-cli.test.ts
  • test/data-redaction.test.ts
  • test/data-retention.test.ts
  • test/file-lock.test.ts
  • test/idempotency.test.ts
  • test/quota-cache.test.ts
  • test/secrets-crypto.test.ts
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
docs/**

⚙️ CodeRabbit configuration file

keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts.

Files:

  • docs/runbooks/incident-response.md
  • docs/runbooks/README.md
  • docs/README.md
  • docs/configuration.md
  • docs/development/CONFIG_FIELDS.md
  • docs/reference/error-contracts.md
  • docs/runbooks/operations.md
  • docs/reference/commands.md
  • docs/reference/storage-paths.md
  • docs/index.md
  • docs/privacy.md
  • docs/development/TESTING.md
  • docs/reference/settings.md
  • docs/reference/public-api.md
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/quota-cache.test.ts
  • test/audit.test.ts
  • test/codex-manager-cli.test.ts
  • test/idempotency.test.ts
  • test/file-lock.test.ts
  • test/background-jobs.test.ts
  • test/authorization.test.ts
  • test/data-redaction.test.ts
  • test/secrets-crypto.test.ts
  • test/data-retention.test.ts
lib/**

⚙️ CodeRabbit configuration file

focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

Files:

  • lib/index.ts
  • lib/audit.ts
  • lib/data-retention.ts
  • lib/accounts.ts
  • lib/data-redaction.ts
  • lib/unified-settings.ts
  • lib/storage.ts
  • lib/authorization.ts
  • lib/background-jobs.ts
  • lib/secrets-crypto.ts
  • lib/quota-cache.ts
  • lib/idempotency.ts
  • lib/file-lock.ts
  • lib/codex-manager.ts
🧬 Code graph analysis (14)
lib/data-retention.ts (1)
lib/runtime-paths.ts (3)
  • getCodexLogDir (226-228)
  • getCodexCacheDir (212-214)
  • getCodexMultiAuthDir (166-199)
lib/accounts.ts (1)
lib/background-jobs.ts (1)
  • runBackgroundJobWithRetry (71-115)
test/codex-manager-cli.test.ts (2)
lib/codex-manager.ts (1)
  • runCodexMultiAuthCli (4528-4609)
scripts/codex.js (1)
  • runCodexMultiAuthCli (501-501)
test/idempotency.test.ts (1)
lib/idempotency.ts (2)
  • checkAndRecordIdempotencyKey (86-125)
  • getIdempotencyStorePath (82-84)
test/file-lock.test.ts (1)
lib/file-lock.ts (1)
  • acquireFileLock (57-107)
lib/authorization.ts (1)
lib/audit.ts (1)
  • auditLog (123-153)
lib/quota-cache.ts (2)
lib/file-lock.ts (1)
  • acquireFileLock (57-107)
lib/utils.ts (1)
  • sleep (65-67)
test/background-jobs.test.ts (1)
lib/background-jobs.ts (2)
  • runBackgroundJobWithRetry (71-115)
  • getBackgroundJobDlqPath (67-69)
test/authorization.test.ts (2)
lib/authorization.ts (2)
  • getAuthorizationRole (38-40)
  • authorizeAction (107-145)
lib/audit.ts (2)
  • getAuditConfig (63-65)
  • configureAudit (59-61)
test/data-redaction.test.ts (1)
lib/data-redaction.ts (1)
  • redactForExternalOutput (31-50)
lib/idempotency.ts (2)
lib/runtime-paths.ts (1)
  • getCodexMultiAuthDir (166-199)
lib/file-lock.ts (1)
  • acquireFileLock (57-107)
index.ts (5)
lib/data-retention.ts (1)
  • enforceDataRetention (95-122)
lib/logger.ts (2)
  • error (389-393)
  • logWarn (341-346)
lib/auth-rate-limit.ts (1)
  • checkAuthRateLimit (119-127)
lib/auth/auth.ts (2)
  • exchangeAuthorizationCode (114-158)
  • REDIRECT_URI (12-12)
lib/audit.ts (1)
  • auditLog (123-153)
test/secrets-crypto.test.ts (1)
lib/secrets-crypto.ts (4)
  • encryptSecret (107-121)
  • isEncryptedSecret (102-105)
  • decryptSecret (123-161)
  • getSecretEncryptionKeysFromEnv (168-173)
test/data-retention.test.ts (1)
lib/data-retention.ts (2)
  • RetentionPolicy (5-11)
  • enforceDataRetention (95-122)
🪛 LanguageTool
docs/runbooks/operations.md

[uncategorized] ~64-~64: The official name of this software platform is spelled with a capital “H”.
Context: ...ion required checks remain aligned with .github/settings.yml. --- ## Failure Triage ...

(GITHUB)

🪛 markdownlint-cli2 (0.21.0)
docs/configuration.md

[warning] 73-73: Table column count
Expected: 2; Actual: 4; Too many cells, extra data will be missing

(MD056, table-column-count)

🔇 Additional comments (18)
lib/audit.ts (1)

17-17: lgtm for enum extension.

this action value is consistent with the new audit surface and is covered by the enum assertion in test/audit.test.ts:254 (source line: lib/audit.ts:17).

test/audit.test.ts (1)

254-254: remove this comment - the requested regression test already exists.

test/authorization.test.ts:90 ("allows all actions when break-glass is enabled and audits the bypass") does exactly what you're asking for: it exercises the break-glass authorization flow, reads the actual audit log file, and explicitly asserts entry.action === AuditAction.AUTH_BREAK_GLASS with additional checks on resource and metadata. the test is deterministic, uses vitest, and doesn't mock secrets.

line 254 in test/audit.test.ts remains a useful enum-literal check. the behavior-level regression is already covered elsewhere.

			> Likely an incorrect or invalid review comment.
index.ts (1)

1690-1699: no issue here. emails are already sanitized via sanitizeActor before writing to audit logs.

auditLog at lib/audit.ts:140 applies sanitizeActor(actor) which calls maskEmail from lib/logger.ts:64-72. maskEmail irreversibly masks emails to format like us***@***.com. test/audit.test.ts:131-143 and test/audit.test.ts:147-160 verify this works for actor and metadata fields respectively. nested metadata is also recursively sanitized at lib/audit.ts:114.

all three call sites at index.ts:1690-1699, index.ts:1726-1737, and index.ts:2399-2410 pass account emails through this sanitization pipeline before storage, not unsanitized.

			> Likely an incorrect or invalid review comment.
docs/README.md (1)

62-62: lgtm.

new runbook entry in maintainer docs table correctly references docs/runbooks/README.md and aligns with the operations documentation added in this pr.

docs/index.md (1)

49-50: lgtm.

runbooks reference added correctly to the landing page navigation. ordering makes sense for operators looking for quick access to operational docs.

test/quota-cache.test.ts (1)

193-196: good refinement to make temp cleanup assertion more precise.

filtering to .tmp file unlinks avoids false positives from lock file operations. the test at line 198 still verifies no lingering temp files remain.

one note: this test mocks EPERM for the "keeps failing" scenario. there's an existing parameterized test at test/quota-cache.test.ts:127 covering both EBUSY and EPERM for retry success, but consider adding a windows-specific EBUSY variant for the "keeps failing" path if you want full windows filesystem coverage per coding guidelines (test/**).

docs/runbooks/README.md (1)

1-12: lgtm.

clean runbook index with clear scope definition. links correctly point to operations.md and incident-response.md in the same directory.

docs/runbooks/operations.md (1)

1-89: solid operations runbook covering the new enterprise hardening features.

key rotation workflow at lines 24-28 correctly references codex auth rotate-secrets --json. abac env vars at lines 44-50 match the implementation exercised in test/codex-manager-cli.test.ts:2134-2136. dlq monitoring at line 18 aligns with the background-job infrastructure.

note: static analysis flagged .github capitalization at line 64, but this is a file path reference (.github/settings.yml), not the company name—ignore that hint.

test/codex-manager-cli.test.ts (4)

21-22: lgtm on mock setup for rotation and idempotency.

new mocks at lines 21-22 are properly wired in vi.mock("../lib/storage.js") and vi.mock("../lib/idempotency.js"). reset logic at lines 208-209 ensures clean state between tests.

Also applies to: 85-90


2015-2083: solid pagination test covering cursor-based flow.

test at test/codex-manager-cli.test.ts:2015-2083 correctly exercises --page-size and --cursor flags. the two-call pattern verifies:

  1. first page returns hasMore: true with a cursor
  2. second page returns hasMore: false with nextCursor: null

using mockResolvedValue (not mockResolvedValueOnce) ensures both calls see consistent account data, which is appropriate here.


2085-2130: good idempotency replay test, but consider adding concurrency regression case.

test at test/codex-manager-cli.test.ts:2085-2130 correctly verifies:

  • first call with key "rotation-001" performs rotation (replayed: false)
  • second call with same key returns cached result (replayed: true)
  • rotateStoredSecretEncryptionMock called only once

nice detail testing both --idempotency-key rotation-001 (space) and --idempotency-key=rotation-001 (equals) formats at lines 2101-2102 and 2117.

per coding guidelines (test/**), consider adding a regression test for concurrent calls with the same idempotency key to verify no race conditions in lib/idempotency.ts. the current test is sequential.


2132-2180: good abac enforcement test with proper env cleanup.

test at test/codex-manager-cli.test.ts:2132-2180 correctly verifies:

  • command denied without --idempotency-key when CODEX_AUTH_ABAC_REQUIRE_IDEMPOTENCY_KEY=secrets:rotate
  • command allowed with --idempotency-key flag present
  • error message mentions "idempotency key" (line 2153)

env cleanup at lines 2169-2178 properly handles the undefined-vs-defined distinction. the pattern of capturing original values at lines 2133-2134 before modification is correct.

docs/reference/error-contracts.md (1)

34-51: json contract additions look solid.

this keeps the machine contract explicit for rotate-secrets, schemaVersion, redaction mode, pagination envelope, and idempotency guidance. refs: lib/index.ts:33-38.

docs/reference/public-api.md (1)

66-81: api standards baseline section is a strong addition.

this gives clear defaults for schema versioning, idempotency, and pagination and matches the expanded hardening surface. refs: lib/index.ts:32-38.

test/file-lock.test.ts (1)

83-174: good multi-process contention regression coverage.

test/file-lock.test.ts:83-174 reproduces cross-process lock contention and verifies both expected write count and uniqueness. this is strong coverage for concurrency correctness.

as per coding guidelines "test/**: demand regression cases that reproduce concurrency bugs ... and windows filesystem behavior."

test/data-retention.test.ts (1)

27-78: retention coverage looks solid.

test/data-retention.test.ts:27-78 checks stale deletion and fresh-file preservation with explicit assertions. this gives good confidence in pruning behavior.

as per coding guidelines "test/**: tests must stay deterministic and use vitest."

test/background-jobs.test.ts (1)

51-88: redaction assertions are strong here.

test/background-jobs.test.ts:51-88 verifies sensitive fields are redacted and non-sensitive fields stay visible in dlq entries. good coverage for privacy guarantees.

test/authorization.test.ts (1)

90-227: authorization regression matrix is comprehensive.

test/authorization.test.ts:90-227 covers break-glass auditing and core abac gates with clear assertions. this is good protection for policy regressions.

as per coding guidelines "test/**: tests must stay deterministic and use vitest."

Comment on lines +146 to +147
- name: Run smoke tests
run: npm run test -- test/runtime-paths.test.ts test/codex-bin-wrapper.test.ts
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

expand windows smoke coverage for fs/concurrency regressions.

line 147 currently skips the tests most tied to this pr’s windows and contention hardening. add at least lock/background persistence smoke so regressions are caught on windows-latest (refs: test/file-lock.test.ts:1, test/background-jobs.test.ts:1).

proposed ci update
-      - name: Run smoke tests
-        run: npm run test -- test/runtime-paths.test.ts test/codex-bin-wrapper.test.ts
+      - name: Run smoke tests
+        run: npm run test -- test/runtime-paths.test.ts test/codex-bin-wrapper.test.ts test/file-lock.test.ts test/background-jobs.test.ts

As per coding guidelines test/**: demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- name: Run smoke tests
run: npm run test -- test/runtime-paths.test.ts test/codex-bin-wrapper.test.ts
- name: Run smoke tests
run: npm run test -- test/runtime-paths.test.ts test/codex-bin-wrapper.test.ts test/file-lock.test.ts test/background-jobs.test.ts
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/ci.yml around lines 146 - 147, The CI step "Run smoke
tests" currently omits key Windows concurrency/regression tests; update
.github/workflows/ci.yml so that the Windows matrix (windows-latest) runs the
lock and background persistence smoke tests by including test/file-lock.test.ts
and test/background-jobs.test.ts in the "npm run test" invocation (or add a
separate run step for those files) for the windows job; ensure the job that
invokes "Run smoke tests" or the matrix entry that targets windows runs these
specific tests so fs/concurrency and background-job regressions are caught on
Windows.

| `CODEX_AUTH_STREAM_STALL_TIMEOUT_MS=<ms>` | Stream stall timeout override |
| `CODEX_AUTH_ENCRYPTION_KEY=<32-byte-random-key>` | Enable at-rest encryption for stored account secrets (high-entropy key material only) |
| `CODEX_AUTH_PREVIOUS_ENCRYPTION_KEY=<32-byte-random-key>` | Previous high-entropy key used during staged secret rotation |
| `CODEX_AUTH_ROLE=admin|operator|viewer` | CLI authorization role baseline |
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

fix table rendering for codex_auth_role values.

line 73 uses raw | inside a table cell, so markdown treats it as extra columns and drops content in rendered docs.

ref: docs/configuration.md:73, lib/secrets-crypto.ts:168-172.

proposed fix
-| `CODEX_AUTH_ROLE=admin|operator|viewer` | CLI authorization role baseline |
+| `CODEX_AUTH_ROLE=admin\|operator\|viewer` | CLI authorization role baseline |

as per coding guidelines, "docs/**: keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
| `CODEX_AUTH_ROLE=admin|operator|viewer` | CLI authorization role baseline |
| `CODEX_AUTH_ROLE=admin\|operator\|viewer` | CLI authorization role baseline |
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 73-73: Table column count
Expected: 2; Actual: 4; Too many cells, extra data will be missing

(MD056, table-column-count)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/configuration.md` at line 73, The table cell for CODEX_AUTH_ROLE
contains raw pipe characters which break Markdown table columns; replace the
inner pipes with escaped HTML entities (e.g., change admin|operator|viewer to
admin&#124;operator&#124;viewer) or wrap the entire cell in an HTML <code>
element to preserve the literal pipes, and then verify the values match the
actual CLI/constant in lib/secrets-crypto.ts (the CODEX_AUTH_ROLE
enumeration/strings) so docs/configuration.md stays consistent with runtime
behavior.

| `--no-restore` | verify-flagged | Verify only; do not restore healthy flagged accounts |
| `--page-size <n>` | list, status (`--json`) | Page size for JSON list output (1-200) |
| `--cursor <cursor>` | list, status (`--json`) | Cursor token for JSON list pagination |
| `--idempotency-key <key>` | rotate-secrets | Safe retry key for automation |
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

make the recovery example use the idempotency flag it documents.

the flag table adds --idempotency-key, but the recovery workflow still shows a non-idempotent rotate command. update the example to reinforce safe retries. refs: lib/index.ts:38, lib/background-jobs.ts:70-114.

proposed doc fix
-codex auth rotate-secrets --json
+codex auth rotate-secrets --json --idempotency-key <run-key>

As per coding guidelines, docs/**: keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts.

Also applies to: 115-116

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/commands.md` at line 58, Update the recovery workflow example
for the rotate-secrets CLI to include the documented --idempotency-key flag so
the example matches the flag table; change the non-idempotent rotate-secrets
invocation in docs/reference/commands.md (the recovery example and the nearby
example at lines ~115-116) to include a stable idempotency token (e.g.
--idempotency-key "$CI_JOB_ID" or a generated UUID) and ensure the example text
mentions re-running safely; verify this aligns with the rotate-secrets handling
in the background job logic (see the rotate-secrets usage in lib/index.ts and
the idempotency handling in the background job code around the rotate job) so
docs and implementation are consistent.

Comment on lines +28 to +32
1. For credential exposure:
- rotate affected OAuth/session credentials
- set new `CODEX_AUTH_ENCRYPTION_KEY`
- run `codex auth rotate-secrets`
2. For unauthorized command execution:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

add idempotency to the rotate-secrets containment step.

for incident automation, this step is commonly retried. include --idempotency-key <incident-id> to avoid duplicate side effects. refs: lib/background-jobs.ts:70-114, lib/index.ts:38.

proposed doc fix
-   - run `codex auth rotate-secrets`
+   - run `codex auth rotate-secrets --idempotency-key <incident-id>`

As per coding guidelines, docs/**: keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/runbooks/incident-response.md` around lines 28 - 32, Update the
"rotate-secrets" containment step in the incident runbook to be idempotent by
appending the CLI flag "--idempotency-key <incident-id>" to the "codex auth
rotate-secrets" invocation so retries won't cause duplicate side effects; mirror
this flag usage in documentation and mention the change in upgrade notes and npm
scripts as appropriate, and cross-reference the implementation locations
(lib/background-jobs.ts function handling background job retries and
lib/index.ts CLI command registration) to ensure the flag name and behavior
match the actual code.

Comment on lines +228 to +232
void enforceDataRetention().catch((error) => {
logWarn("Data retention cleanup failed", {
error: error instanceof Error ? error.message : String(error),
});
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

serialize retention cleanup with storage io on startup.

this runs in parallel with startup account/storage flows. on windows, file prune vs active handles can race and produce repeated eperm/ebusy cleanup failures.

ref: index.ts:228-232, lib/data-retention.ts:94-121.

please gate cleanup behind the same storage lock/transaction strategy used for account writes, and add a vitest regression that simulates windows-style lock contention during startup. as per coding guidelines, "lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) ..."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@index.ts` around lines 228 - 232, The startup call to enforceDataRetention()
runs concurrently with storage/account startup and must be serialized using the
same storage lock/transaction used for account writes; modify the startup flow
to acquire the storage lock/transaction wrapper (the same mechanism used for
account writes) before invoking enforceDataRetention(), and update
lib/data-retention.ts to expose a lock-aware entry (e.g.,
enforceDataRetentionUnderLock or accept a transaction/lock token) so the cleanup
runs inside the storage lock; add a vitest regression (e.g.,
data-retention.spec.ts) that simulates Windows-style file-handle contention by
holding the storage lock/handle during startup and asserting that
enforceDataRetention either waits for the lock or fails gracefully without
eperm/ebusy errors to verify serialization.

Comment on lines +26 to +49
it("retries and succeeds before exhausting attempts", async () => {
const { runBackgroundJobWithRetry, getBackgroundJobDlqPath } =
await import("../lib/background-jobs.js");
let attempts = 0;
const result = await runBackgroundJobWithRetry({
name: "test.retry-success",
task: async () => {
attempts += 1;
if (attempts < 3) {
const error = new Error("busy") as NodeJS.ErrnoException;
error.code = "EBUSY";
throw error;
}
return "ok";
},
maxAttempts: 4,
baseDelayMs: 1,
maxDelayMs: 2,
});

expect(result).toBe("ok");
expect(attempts).toBe(3);
await expect(fs.stat(getBackgroundJobDlqPath())).rejects.toMatchObject({ code: "ENOENT" });
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

add a retry regression for 429 scenarios.

test/background-jobs.test.ts:26-49 validates ebusy retries but misses the 429 path. add a case that simulates rate-limit failure and verifies retry/backoff behavior before success or dlq write.

as per coding guidelines "lib/**: ... verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/background-jobs.test.ts` around lines 26 - 49, Add a new Vitest case
mirroring the existing "retries and succeeds before exhausting attempts" test
that specifically simulates rate-limit (429) failures: import
runBackgroundJobWithRetry and getBackgroundJobDlqPath, create a task that throws
an error representing an HTTP 429 (e.g., an Error with status or statusCode =
429) for the first N-1 attempts and returns "ok" on the Nth attempt, call
runBackgroundJobWithRetry with small baseDelayMs/maxDelayMs and appropriate
maxAttempts, assert the result is "ok" and that the task ran the expected number
of attempts, and add a separate test that exhausts maxAttempts on repeated 429
errors and then asserts fs.stat(getBackgroundJobDlqPath()) rejects with { code:
"ENOENT" } or the appropriate DLQ presence check per existing tests; reference
runBackgroundJobWithRetry and getBackgroundJobDlqPath to locate where to
add/modify tests.

Comment on lines +1 to +34
import { describe, expect, it } from "vitest";
import { redactForExternalOutput } from "../lib/data-redaction.js";

describe("data redaction", () => {
it("redacts sensitive keys recursively", () => {
const value = {
email: "person@example.com",
profile: {
accessToken: "abc",
refresh_token: "def",
nested: [{ accountId: "acct_123" }],
},
};

const redacted = redactForExternalOutput(value);
expect(redacted).toEqual({
email: "***REDACTED***",
profile: {
accessToken: "***REDACTED***",
refresh_token: "***REDACTED***",
nested: [{ accountId: "***REDACTED***" }],
},
});
});

it("keeps non-sensitive fields unchanged", () => {
const value = {
command: "report",
schemaVersion: 1,
summary: { ok: 1, warn: 0, error: 0 },
};
expect(redactForExternalOutput(value)).toEqual(value);
});
}); No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

good baseline coverage for redaction, but consider adding edge case tests.

tests are deterministic and use vitest correctly. the recursive redaction test at lines 5-24 covers nested objects and arrays well.

per coding guidelines (test/**), consider adding regression cases for:

  • null/undefined values in the input structure
  • arrays containing non-object primitives mixed with objects
  • empty objects and arrays
  • very deep nesting to ensure no stack overflow on pathological inputs

the implementation in lib/data-redaction.ts:31-36 shows it handles arrays and non-objects, but explicit tests for these paths would improve confidence.

📝 suggested additional test cases
it("handles null and undefined values without crashing", () => {
	const value = {
		email: "test@example.com",
		nullField: null,
		undefinedField: undefined,
		nested: { accessToken: "secret", empty: null },
	};
	const redacted = redactForExternalOutput(value);
	expect(redacted.email).toBe("***REDACTED***");
	expect(redacted.nullField).toBeNull();
	expect(redacted.nested.accessToken).toBe("***REDACTED***");
});

it("handles arrays with mixed primitives and objects", () => {
	const value = {
		items: [1, "string", { accountId: "acct_1" }, null],
	};
	const redacted = redactForExternalOutput(value);
	expect(redacted.items).toEqual([1, "string", { accountId: "***REDACTED***" }, null]);
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/data-redaction.test.ts` around lines 1 - 34, Add regression tests to
test/data-redaction.test.ts that exercise edge cases for the
redactForExternalOutput function: include a test that passes null and undefined
fields (and nested nulls) to ensure they are preserved rather than causing
errors, a test with arrays containing mixed primitives, objects and nulls to
ensure only sensitive keys in objects are redacted, a test for empty
objects/arrays to ensure they remain unchanged, and a deep-nesting test (e.g.,
>50 levels) to guard against recursion/stack issues; reference the
redactForExternalOutput symbol when adding these cases and assert exact expected
outputs (null stays null, primitives unchanged, sensitive fields replaced with
"***REDACTED***").

Comment on lines +49 to +57
await expect(
acquireFileLock(lockPath, {
maxAttempts: 2,
baseDelayMs: 1,
maxDelayMs: 2,
staleAfterMs: 10_000,
}),
).rejects.toBeTruthy();

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

assert the lock failure mode explicitly.

rejects.toBeTruthy() at test/file-lock.test.ts:49-57 is too broad and can pass on unrelated exceptions. assert a specific error code/message so this regression stays targeted.

as per coding guidelines "test/**: tests must stay deterministic and use vitest... reject changes that ... skip assertions."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/file-lock.test.ts` around lines 49 - 57, The test currently uses a broad
expectation (rejects.toBeTruthy()) for acquireFileLock; replace it with a
deterministic assertion that checks the specific failure mode—e.g., assert the
promise rejects with the LockAcquisitionError (or the concrete error class your
lock implementation throws) or matches a known error code/message (for example
use expect(acquireFileLock(...)).rejects.toThrow(/acquire.*lock/i) or
expect(...).rejects.toMatchObject({ code: 'LOCK_BUSY' })) so the test only
passes for the intended lock contention error; import the specific error type
from the file-lock implementation if available and use that in the assertion.

Comment on lines +26 to +50
it("records first key use and replays duplicates", async () => {
const { checkAndRecordIdempotencyKey, getIdempotencyStorePath } =
await import("../lib/idempotency.js");

expect(
await checkAndRecordIdempotencyKey("codex.auth.rotate-secrets", "key-1"),
).toEqual({ replayed: false });
expect(
await checkAndRecordIdempotencyKey("codex.auth.rotate-secrets", "key-1"),
).toEqual({ replayed: true });

const content = await fs.readFile(getIdempotencyStorePath(), "utf8");
expect(content).toContain("codex.auth.rotate-secrets");
});

it("expires entries based on ttl", async () => {
const { checkAndRecordIdempotencyKey } = await import("../lib/idempotency.js");
expect(
await checkAndRecordIdempotencyKey("codex.auth.rotate-secrets", "key-2", 5),
).toEqual({ replayed: false });
await new Promise((resolve) => setTimeout(resolve, 10));
expect(
await checkAndRecordIdempotencyKey("codex.auth.rotate-secrets", "key-2", 5),
).toEqual({ replayed: false });
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

missing concurrency regression for duplicate-key race.

test/idempotency.test.ts:26-50 only validates sequential calls. add a regression where concurrent calls race on the same scope/key and prove exactly one wins as non-replay. this is the critical behavior the file lock is meant to protect.

as per coding guidelines "test/**: demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/idempotency.test.ts` around lines 26 - 50, The test suite lacks a
concurrency regression for idempotency key races: add a new test that
concurrently calls checkAndRecordIdempotencyKey for the same scope/key (e.g.,
"codex.auth.rotate-secrets", "key-concurrent") and asserts that exactly one
invocation returns { replayed: false } while all others return { replayed: true
}; use Promise.all to invoke many parallel calls (10+), await them, then count
results to ensure a single winner, and reuse getIdempotencyStorePath to validate
the store if desired—this proves the file-locking behavior around
checkAndRecordIdempotencyKey under concurrent access.

Comment on lines +62 to +66
it("recognizes legacy and current encrypted prefixes", () => {
expect(isEncryptedSecret("enc:v1:abc:def:ghi")).toBe(true);
expect(isEncryptedSecret("enc:v2:abc:def:ghi:jkl")).toBe(true);
expect(isEncryptedSecret("enc:v3:abc:def:ghi:jkl")).toBe(false);
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

add regression coverage for malformed prefixed plaintext inputs.

the suite checks prefix recognition, but it misses the case where plaintext begins with enc:v2: and should still be encrypted.

ref: test/secrets-crypto.test.ts:62-66, lib/secrets-crypto.ts:107-110.

as per coding guidelines, "test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/secrets-crypto.test.ts` around lines 62 - 66, Add a deterministic
regression test in test/secrets-crypto.test.ts that covers malformed prefixed
plaintext inputs for the isEncryptedSecret function: specifically add a case
where the input begins with the "enc:v2:" prefix but lacks the full expected
segments (e.g., truncated/malformed payload) and assert that
isEncryptedSecret(...) still returns true; this ensures
lib/secrets-crypto.ts::isEncryptedSecret handles prefixed-but-malformed
plaintext correctly and prevents regressions around prefix recognition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant