feat: implement production hardening roadmap phases #42
Conversation
Add timeout and abort boundary for authorization code exchange. Wire fetch timeout config into oauth exchange callsites. Add tests for network and timeout failure behavior. Co-authored-by: Codex <noreply@openai.com>
Add enterprise-grade hardening across runtime, CLI, storage, CI, and docs.

- Add cross-process file locking for settings/quota persistence
- Add at-rest secret encryption with rotation command and idempotency support
- Add RBAC/ABAC-style command authorization, JSON redaction, and retention policies
- Add background retry + dead-letter queue for async persistence failures
- Add list JSON pagination standard and schemaVersion contract updates
- Add CI security gates: secret scan, supply-chain/SCA/license checks, SBOM, required checks policy
- Add operations and incident response runbooks
- Add/extend tests for new security/reliability primitives and CLI behaviors

Validated with:
- npm run typecheck
- npm run lint
- npm run build && npm test
- npm run coverage
- npm run audit:ci
- npm run license:check
- npm run clean:repo:check

Co-authored-by: Codex <noreply@openai.com>
- migrate new envelopes to salted scrypt-derived AES keys
- retain legacy enc:v1 decryption compatibility
- add coverage for v2 and legacy v1 crypto paths
- clarify high-entropy 32-byte key requirements in docs

Co-authored-by: Codex <noreply@openai.com>
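The envelope scheme this commit describes can be sketched roughly as follows. The function names, the `enc:v2:` prefix, and the salt | iv | tag | ciphertext layout are illustrative assumptions, not the plugin's actual implementation; the commit only states that v2 envelopes use a salted scrypt-derived AES key while `enc:v1` payloads stay decryptable.

```typescript
// Hypothetical sketch: per-envelope random salt, scrypt-derived AES-256-GCM key,
// and a versioned prefix so legacy "enc:v1:" envelopes can still be routed to
// the old decryption path.
import { scryptSync, randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

export function encryptSecretV2(plaintext: string, masterKey: string): string {
  const salt = randomBytes(16);
  const key = scryptSync(masterKey, salt, 32); // salted, memory-hard derivation
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  // envelope layout (assumed): salt | iv | auth tag | ciphertext, base64-encoded
  return `enc:v2:${Buffer.concat([salt, iv, tag, ciphertext]).toString("base64")}`;
}

export function decryptSecretV2(envelope: string, masterKey: string): string {
  const raw = Buffer.from(envelope.slice("enc:v2:".length), "base64");
  const salt = raw.subarray(0, 16);
  const iv = raw.subarray(16, 28);
  const tag = raw.subarray(28, 44);
  const ciphertext = raw.subarray(44);
  const key = scryptSync(masterKey, salt, 32); // same salt recovers the same key
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

Because the salt is random per envelope, re-encrypting the same secret yields a different envelope each time, which is what makes a rotation command meaningful.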
- dedupe OAuth timeout resolution in plugin flows
- close upstream abort-listener race in auth exchange
- add transient config-read retry handling for EBUSY/EPERM
- add timeout propagation and retry regression tests

Co-authored-by: Codex <noreply@openai.com>
- sanitize OAuth token error logging and payloads
- add deterministic upstream-abort and cleanup regressions
- harden config-save merge path against transient EBUSY/EPERM reads
- strengthen timeout wiring and EPERM regression tests

Co-authored-by: Codex <noreply@openai.com>
- audit break-glass authorization bypass events
- add multi-process contention regression for file locking
- cover new auth.break_glass audit action in tests

Co-authored-by: Codex <noreply@openai.com>
Prevent sensitive OAuth payload leakage during refresh HTTP failures by logging status and body length only, and return a generic safe error message when response text exists. Adds regression tests for auth behavior and sanitized logging. Co-authored-by: Codex <noreply@openai.com>
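The sanitization rule this commit describes can be sketched as a small pure helper. The function name and return shape are assumptions for illustration; only the policy itself (log status and body length, never the body, and return a generic message when response text exists) comes from the commit.

```typescript
// Hypothetical sketch: on a failed token refresh, emit only non-sensitive
// metadata to logs and a generic message to the caller.
export function describeRefreshFailure(
  status: number,
  bodyText: string | null,
): { logLine: string; userMessage: string } {
  const bodyLength = bodyText?.length ?? 0;
  return {
    // safe to ship to logs: no token material, no error payload contents
    logLine: `token refresh failed: status=${status} bodyLength=${bodyLength}`,
    // generic message whenever a body exists, so upstream error payloads never leak
    userMessage:
      bodyLength > 0
        ? "Token refresh failed; please retry or re-authenticate."
        : `Token refresh failed with status ${status}.`,
  };
}
```

The regression tests mentioned above would then assert that the log line never contains the raw response body.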
Introduce optional environment-driven ABAC constraints layered ahead of RBAC for auth actions and commands.

- add ABAC policy evaluation in authorization module
- pass command/interactive/idempotency context from CLI command routing
- enforce read authorization on read command paths consistently
- add unit and CLI coverage for ABAC deny and idempotency requirements
- document ABAC controls in config and operations runbooks

Co-authored-by: Codex <noreply@openai.com>
Add schema-aware config sanitization, introduce a resilient fetch primitive with timeout/retry instrumentation, apply it to remote prompt/update fetches, and add shutdown timeout boundaries with concurrent cleanup deduplication. Includes new regression tests for network retries, config sanitization, and shutdown timeout behavior. Co-authored-by: Codex <noreply@openai.com>
Introduce a shared account-view helper module and reuse it across the plugin entrypoint and CLI manager to remove duplicated formatting logic while preserving behavior. Add unit/regression tests to lock formatting and active-index fallback behavior. Co-authored-by: Codex <noreply@openai.com>
Address unresolved hardening review items across auth abort classification, config sanitization, network retry cancellation, update-check backoff, prompt refresh dedupe coverage, and shutdown timer cleanup. Adds deterministic regression tests for each scenario and aligns config-save expectations with schema-bound validation. Co-authored-by: Codex <noreply@openai.com>
Add shared helpers for per-family active-index map creation and bulk updates, then reuse them across plugin, CLI manager, and codex-cli sync flows to reduce duplicated state-mutation logic. Extend tests to lock switch and selection behavior for all model-family indexes. Co-authored-by: Codex <noreply@openai.com>
Extract active-index normalization into a shared helper and reuse it in plugin account-flow clamping, codex-cli sync, and doctor normalization paths. Add focused tests covering bounded normalization and empty-account map handling to preserve behavior across both clear and fill modes. Co-authored-by: Codex <noreply@openai.com>
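A hedged sketch of what such a normalization helper might look like. The name `normalizeActiveIndex`, its signature, and the exact clear/fill semantics are assumptions for illustration; the commit only states that the helper bounds indexes and handles empty account maps in both modes.

```typescript
// Hypothetical sketch: clamp a family's active index into the valid account
// range; with no accounts the index is cleared, and a missing index is either
// left unset ("clear" mode) or defaulted to the first account ("fill" mode).
export function normalizeActiveIndex(
  index: number | undefined,
  accountCount: number,
  mode: "clear" | "fill",
): number | undefined {
  if (accountCount <= 0) return undefined; // empty account map: nothing to point at
  if (index === undefined) return mode === "fill" ? 0 : undefined;
  // bound out-of-range or fractional indexes into [0, accountCount - 1]
  return Math.min(Math.max(0, Math.trunc(index)), accountCount - 1);
}
```

Centralizing this in one helper is what lets the plugin clamping, codex-cli sync, and doctor paths stay behaviorally identical.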
Extract model-family rate-limit status rendering into account-view helpers and replace duplicate status mapping logic in the plugin status output paths. Add unit coverage for mixed limited/ok family labels to protect output behavior. Co-authored-by: Codex <noreply@openai.com>
Move active-index-by-family label rendering into shared account-view helpers and reuse it in both v2 and legacy codex status output paths. Add helper tests to lock 1-based labels and placeholder output for missing families. Co-authored-by: Codex <noreply@openai.com>
Move codex-remove index/family reindexing into shared active-index helpers to centralize the mutation policy and reduce inline branching in the plugin entrypoint. Add helper tests for in-range and out-of-range removal behavior. Co-authored-by: Codex <noreply@openai.com>
- cleanup stale/dead process lock artifacts before acquiring account lock
- ensure lock release always attempts fallback cleanup
- keep clearAccounts/saveTransactions serialized across file and memory locks

Co-authored-by: Codex <noreply@openai.com>
Introduce shared account-storage view helpers for creating empty v3 storage and cloning mutable working copies, then reuse them across plugin login/check flow, codex-cli sync, and manager reset paths. Add unit tests to lock clone isolation and family-index initialization behavior. Co-authored-by: Codex <noreply@openai.com>
- add storage identity fallback test for malformed .git metadata
- add OAuth poll abort test for deterministic close behavior
- add prompt refresh dedupe test for concurrent stale cache reads

Co-authored-by: Codex <noreply@openai.com>
Add dev doctor and setup commands with fail-fast checks, document the contributor workflow, and wire a CI sanity check for doctor:dev. Co-authored-by: Codex <noreply@openai.com>
- replace expensive tool-schema clone path with fast JSON-safe clone and lower-allocation cleanup loops
- eliminate repeated index map rebuilds and JSON.stringify comparisons in codex-cli account sync
- parse error bodies once in fetch helpers and thread parsed payload through rate-limit mapping
- add detailed non-stream success handling to avoid duplicate JSON parse in empty-response retry checks
- extend runtime benchmark harness with sync/error/non-stream cases and update tests/mocks for new helper exports

Co-authored-by: Codex <noreply@openai.com>
Add a canonical verify workflow shared by local and CI, integrate scoped Biome formatting checks, and document contributor/release runbooks to reduce setup and release friction. Co-authored-by: Codex <noreply@openai.com>
Bound expensive fan-out in account refresh/email hydration paths, cache index remaps in storage, and add runtime benchmark regression gating in CI.

- throttle stale rotating backup cleanup scans per storage path
- add bounded concurrency for proactive refresh and email hydration
- reduce refresh-guardian lookup churn with precomputed token->index mapping
- extend runtime benchmark coverage and add CI gate budgets

Co-authored-by: Codex <noreply@openai.com>
Instrument hydration and proactive refresh hot paths with duration and throughput counters, and surface guardian tick timing in runtime metrics output.

- track email hydration runs/candidates/refreshed/updated/failures/duration
- log proactive refresh batch timing with effective concurrency
- add guardian tick duration stats (last/max/avg)

Co-authored-by: Codex <noreply@openai.com>
Tighten metrics correctness and improve test coverage for the new performance instrumentation.

- keep guardian avg tick duration synchronized with cumulative/runs
- add deterministic guardian duration aggregate test
- assert codex-metrics output includes hydration/guardian timing lines

Co-authored-by: Codex <noreply@openai.com>
Add regression tests for doctor/setup scripts, make setup command execution deterministic under event-order races, add bounded Windows npm ci retries, and document contributor upgrade + Windows lock remediation guidance. Co-authored-by: Codex <noreply@openai.com>
Add explicit loadAccounts tests for rotating-backup cleanup concurrency controls:

- verify 30s cleanup interval throttling per storage path
- verify in-flight cleanup promise dedup across concurrent loadAccounts calls

This addresses the missing coverage gap called out by Greptile.

Co-authored-by: Codex <noreply@openai.com>
Ensure account-storage mutations keep deterministic ordering while preserving the historical file-lock before in-process mutex acquisition sequence. Co-authored-by: Codex <noreply@openai.com>
- preserve manual accountId on snapshot merge
- clarify refresh guardian last-tick metric semantics
- add CRLF/fallback/concurrency/metrics regression tests

Co-authored-by: Codex <noreply@openai.com>
Adds a pure retry governor for all-rate-limited flows, introduces an absolute wait ceiling setting with env override, and wires decision-based retry gating into the request loop. Also exposes the retry ceiling in Settings Hub (Rotation & Quota), and adds structured codex-metrics counters for retry governor stop reasons.

Validation:
- npm run typecheck
- npm run lint
- npm run build
- npm test
- npm run clean:repo:check
- npm run audit:ci

Co-authored-by: Codex <noreply@openai.com>
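A "pure" retry governor like the one this commit describes is just a side-effect-free decision function that the request loop consults before each wait. The shape below is a sketch under assumptions: the option names, the `RetryDecision` type, and the specific stop reasons are illustrative, not the actual `lib/request/retry-governor.ts` API.

```typescript
// Hypothetical sketch: decide whether an all-rate-limited request loop may
// wait again, bounded by an attempt cap and an absolute wait ceiling
// (the ceiling would come from settings or an env override).
export type RetryDecision =
  | { retry: true; waitMs: number }
  | { retry: false; stopReason: "ceiling_exceeded" | "attempts_exhausted" };

export function decideRetry(opts: {
  attempt: number;        // attempts already made
  maxAttempts: number;
  elapsedWaitMs: number;  // total time already spent waiting
  proposedWaitMs: number; // wait suggested by rate-limit headers/backoff
  ceilingMs: number;      // absolute wait ceiling
}): RetryDecision {
  if (opts.attempt >= opts.maxAttempts) {
    return { retry: false, stopReason: "attempts_exhausted" };
  }
  if (opts.elapsedWaitMs + opts.proposedWaitMs > opts.ceilingMs) {
    return { retry: false, stopReason: "ceiling_exceeded" };
  }
  return { retry: true, waitMs: opts.proposedWaitMs };
}
```

Because the function is pure, the stop-reason counters mentioned above can be incremented from its return value, and the whole policy is trivially unit-testable without timers.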
- add shared spawnSync stub for npm/git version probes
- use stub in missing-script, old-node, and missing-verify tests
- keep doctor tests deterministic across CI host environments

Co-authored-by: Codex <noreply@openai.com>
Replace sync rotation fs calls with async operations wrapped in bounded retry
for EBUSY/EPERM/ENOTEMPTY to handle Windows lock contention safely.
Also adds a telemetry test that simulates a transient rename lock and
verifies rotation still succeeds.

Co-authored-by: Codex <noreply@openai.com>
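The bounded-retry wrapper this commit describes can be sketched like this. The helper name and defaults are assumptions; the retried error codes (EBUSY/EPERM/ENOTEMPTY) are the ones the commit names.

```typescript
// Hypothetical sketch: retry transient Windows lock errors with exponential
// backoff, rethrow everything else immediately, and give up after a bounded
// number of attempts so a wedged file can't stall the process forever.
const TRANSIENT_CODES = new Set(["EBUSY", "EPERM", "ENOTEMPTY"]);

export async function withFsRetry<T>(
  op: () => Promise<T>,
  { attempts = 5, baseDelayMs = 20 }: { attempts?: number; baseDelayMs?: number } = {},
): Promise<T> {
  for (let i = 0; ; i++) {
    try {
      return await op();
    } catch (err) {
      const code = (err as { code?: string }).code ?? "";
      // non-transient errors and the final attempt both propagate
      if (!TRANSIENT_CODES.has(code) || i >= attempts - 1) throw err;
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
    }
  }
}
```

A rotation call would then be wrapped as `withFsRetry(() => fs.rename(tmp, path))`, which is exactly the scenario the telemetry test above simulates with a transient rename lock.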
Implement targeted mitigations for high-risk production landmines:

- CAS conflict detection for account, unified settings, and config persistence
- account save conflict merge-retry flow
- refresh lease timeout fail-closed policy
- network timeouts across OAuth and prompt fetch paths
- idempotency key propagation and safer SSE fallback handling
- recovery storage corruption guards
- quota cache save result signaling

Validated with full test, lint, and typecheck passes.

Co-authored-by: Codex <noreply@openai.com>
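CAS (compare-and-swap) conflict detection, the first item above, can be illustrated with a minimal in-memory store. The class and method names are hypothetical; the real persistence layer works against files, but the version-check-then-bump idea is the same.

```typescript
// Hypothetical sketch: every save carries the version the caller loaded at.
// A concurrent writer bumps the version, so a stale save surfaces a conflict
// instead of silently clobbering the other write; the caller can then
// re-load, merge, and retry (the "conflict merge-retry flow" above).
interface Versioned<T> {
  version: number;
  data: T;
}

export class CasStore<T> {
  private current: Versioned<T>;

  constructor(initial: T) {
    this.current = { version: 0, data: initial };
  }

  load(): Versioned<T> {
    return { ...this.current };
  }

  save(expectedVersion: number, data: T): { ok: true } | { ok: false; conflict: true } {
    if (expectedVersion !== this.current.version) {
      return { ok: false, conflict: true }; // caller re-loads, merges, retries
    }
    this.current = { version: expectedVersion + 1, data };
    return { ok: true };
  }
}
```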
📝 Walkthrough

this pr adds many new modules and CI/docs changes: authorization/abac, secrets encryption and rotation, file-locking and optimistic writes, idempotency, telemetry, background-jobs, resilient network/retry-governor, data-retention and shutdown, plus broad CLI/manager integrations and many tests.
sequence diagram(s)

```mermaid
sequenceDiagram
    participant cli as user/cli
    participant auth as lib/authorization.ts
    participant audit as lib/audit.ts
    participant env as environment
    cli->>auth: authorizeAction(action, context)
    activate auth
    env->>auth: read CODEX_AUTH_BREAK_GLASS / CODEX_AUTH_ABAC_*
    alt break-glass enabled
        auth->>audit: emitAudit(AUTH_BREAK_GLASS)
        audit-->>auth: ack
        auth-->>cli: { allowed: true, role }
    else evaluate abac + role
        auth->>auth: evaluate abac rules
        auth-->>cli: { allowed: boolean, reason? }
    end
    deactivate auth
```
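The flow in the diagram above can be sketched as a single function. Everything beyond what the diagram names (`authorizeAction`, `CODEX_AUTH_BREAK_GLASS`, the `auth.break_glass` audit action) is an assumption for illustration: the ABAC env variable, the rule set, and the admin-role policy are invented here, not taken from `lib/authorization.ts`.

```typescript
// Minimal sketch of the diagrammed flow: break-glass bypass (always audited),
// then ABAC constraints, then the RBAC role check.
type AuthResult = { allowed: boolean; role?: string; reason?: string };

export function authorizeAction(
  action: string,
  context: { role: string; interactive: boolean },
  env: Record<string, string | undefined>,
  emitAudit: (event: string) => void,
): AuthResult {
  if (env.CODEX_AUTH_BREAK_GLASS === "1") {
    emitAudit("auth.break_glass"); // bypass is allowed but always leaves an audit trail
    return { allowed: true, role: context.role };
  }
  // ABAC constraints are layered ahead of the role check
  // (env variable name and rule are hypothetical)
  if (env.CODEX_AUTH_ABAC_REQUIRE_INTERACTIVE === "1" && !context.interactive) {
    return { allowed: false, reason: "abac: interactive session required" };
  }
  // hypothetical RBAC policy: write-class actions require the admin role
  const writeActions = new Set(["accounts.write", "secrets.rotate"]);
  if (writeActions.has(action) && context.role !== "admin") {
    return { allowed: false, reason: "rbac: admin role required" };
  }
  return { allowed: true, role: context.role };
}
```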
```mermaid
sequenceDiagram
    participant manager as lib/codex-manager.ts
    participant idemp as lib/idempotency.ts
    participant crypto as lib/secrets-crypto.ts
    participant lock as lib/file-lock.ts
    participant storage as lib/storage.ts
    manager->>idemp: checkAndRecordIdempotencyKey(scope, key)
    alt replay
        idemp-->>manager: { replayed: true }
        manager-->>manager: return cached result
    else new
        idemp-->>manager: { replayed: false }
        manager->>crypto: encryptSecret(secret)
        manager->>lock: acquireFileLock(path)
        activate lock
        manager->>storage: rotateStoredSecretEncryption()
        lock->>manager: release
        deactivate lock
        manager->>manager: emit audit/telemetry
    end
```
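The check-and-record step at the top of the diagram above can be reduced to a small in-memory sketch. The function name, return shape, and in-memory map are assumptions for illustration; the real `lib/idempotency.ts` presumably persists keys, and TTL/eviction are omitted for brevity.

```typescript
// Minimal sketch: the first call with a given (scope, key) pair runs the
// operation; later calls with the same pair replay the cached result instead
// of re-executing, which is what makes rotation commands safely re-runnable.
const seen = new Map<string, unknown>();

export async function withIdempotencyKey<T>(
  scope: string,
  key: string,
  run: () => Promise<T>,
): Promise<{ replayed: boolean; result: T }> {
  const id = `${scope}:${key}`;
  if (seen.has(id)) {
    return { replayed: true, result: seen.get(id) as T };
  }
  const result = await run();
  seen.set(id, result); // record only after the operation succeeds
  return { replayed: false, result };
}
```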
estimated code review effort: 🎯 5 (critical) | ⏱️ ~120 minutes

suggested labels
notes and review flags
Actionable comments posted: 90
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (12)
test/server.unit.test.ts (1)
328-350: 🛠️ Refactor suggestion | 🟠 Major

missing try/finally for timer cleanup in timeout test.

if any assertion in this block throws, `vi.useRealTimers()` at line 349 is skipped, leaving fake timers active for subsequent tests. the new test at 303-326 correctly uses try/finally; this one should match.

suggested fix
```diff
   it('should return null after 5 minute timeout', async () => {
     vi.useFakeTimers();
-
-    (mockServer.listen as ReturnType<typeof vi.fn>).mockImplementation(
-      (_port: number, _host: string, callback: () => void) => {
-        callback();
-        return mockServer;
-      }
-    );
-    (mockServer.on as ReturnType<typeof vi.fn>).mockReturnValue(mockServer);
-
-    const result = await startLocalOAuthServer({ state: 'test-state' });
-
-    const codePromise = result.waitForCode('test-state');
-
-    await vi.advanceTimersByTimeAsync(5 * 60 * 1000 + 100);
-
-    const code = await codePromise;
-    expect(code).toBeNull();
-    expect(logWarn).toHaveBeenCalledWith('OAuth poll timeout after 5 minutes');
-
-    vi.useRealTimers();
+    try {
+      (mockServer.listen as ReturnType<typeof vi.fn>).mockImplementation(
+        (_port: number, _host: string, callback: () => void) => {
+          callback();
+          return mockServer;
+        }
+      );
+      (mockServer.on as ReturnType<typeof vi.fn>).mockReturnValue(mockServer);
+
+      const result = await startLocalOAuthServer({ state: 'test-state' });
+
+      const codePromise = result.waitForCode();
+
+      await vi.advanceTimersByTimeAsync(5 * 60 * 1000 + 100);
+
+      const code = await codePromise;
+      expect(code).toBeNull();
+      expect(logWarn).toHaveBeenCalledWith('OAuth poll timeout after 5 minutes');
+    } finally {
+      vi.useRealTimers();
+    }
   });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/server.unit.test.ts` around lines 328 - 350, The test "should return null after 5 minute timeout" uses vi.useFakeTimers() but does not guarantee vi.useRealTimers() runs if an assertion throws; wrap the fake-timer usage in a try/finally so timers are always restored. Concretely, update this test around the startLocalOAuthServer call / codePromise handling to call vi.useFakeTimers() before setting up mocks, run the test actions in a try block, and call vi.useRealTimers() in the finally block (referencing vi.useFakeTimers, vi.useRealTimers, startLocalOAuthServer, and result.waitForCode) to ensure timer cleanup even on failures.

docs/reference/public-api.md (1)
80-106: ⚠️ Potential issue | 🟠 Major

migration rules need explicit checklist for security.md sync and npm script documentation.
the api standards baseline section (lines 66-80) declares active contracts for schemaVersion, idempotency, and pagination. per coding guideline, "whenever behavior changes, require updated upgrade notes and mention new npm scripts." the migration rules (lines 95-106) require readme.md and docs/upgrade.md updates but don't explicitly mandate security.md consistency review or documenting new/changed npm scripts used by callers/operators.
this is a docs-only change with no regression tests, windows edge cases, or concurrency risks. the omission creates drift risk: future contract breaks may update docs but skip these required steps.
proposed addition to migration rules
```diff
 3. Update:
    - `README.md`
+   - `SECURITY.md`
    - `docs/upgrade.md`
    - affected `docs/reference/*`
    - release notes and changelog
+   - document any new/changed npm scripts and workflow commands used by callers/operators
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/reference/public-api.md` around lines 80 - 106, Update the "Migration Rules" section to add explicit checklist items requiring (1) a security.md consistency review to ensure any contract change doesn't introduce security regressions and (2) documentation of any new or changed npm scripts used by callers/operators; reference the existing migration checklist items (README.md, docs/upgrade.md, docs/reference/*, release notes/changelog) and add short guidance to include before/after examples and test notes for schemaVersion, idempotency, and pagination where relevant so maintainers know to mention these contracts when documenting migrations.

lib/request/helpers/tool-utils.ts (2)
138-141: ⚠️ Potential issue | 🔴 Critical

guard `anyOf` entries before const-flattening to prevent runtime throws.

`schema.anyOf` is cast to records and then checked with `"const" in opt`. if any entry is non-object, this can throw and abort cleanup. see lib/request/helpers/tool-utils.ts:140 (Line 140).

proposed fix
```diff
   if (Array.isArray(schema.anyOf)) {
-    const anyOf = schema.anyOf as Record<string, unknown>[];
-    const allConst = anyOf.every((opt) => "const" in opt);
+    const rawAnyOf = schema.anyOf as unknown[];
+    const anyOf = rawAnyOf.filter(
+      (opt): opt is Record<string, unknown> => isRecord(opt),
+    );
+    const allConst =
+      anyOf.length > 0 &&
+      anyOf.length === rawAnyOf.length &&
+      anyOf.every((opt) => Object.prototype.hasOwnProperty.call(opt, "const"));
     if (allConst && anyOf.length > 0) {
       const enumValues = anyOf.map((opt) => opt.const);
       schema.enum = enumValues;
       delete schema.anyOf;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/request/helpers/tool-utils.ts` around lines 138 - 141, The anyOf entries are assumed to be objects and the every check uses `"const" in opt`, which throws for non-objects; update the guard around schema.anyOf (the anyOf variable and the allConst computation) to first ensure each opt is a non-null object (e.g., typeof opt === "object" && opt !== null) before testing for the "const" property, or filter/coerce non-object entries out so allConst only runs on safe object entries; modify the block that defines anyOf and computes allConst to perform this safe type-checking to avoid runtime throws.
104-108: ⚠️ Potential issue | 🟠 Major

clone failure fallback returns unsanitized schemas.

on clone failure, the function returns the original tool unchanged. this bypasses cleanup and can re-introduce invalid/circular schema into downstream requests. see lib/request/helpers/tool-utils.ts:106 (Line 106).

proposed fix
```diff
   } catch {
-    return tool;
+    return {
+      ...tool,
+      function: {
+        ...functionDef,
+        parameters: {
+          type: "object",
+          properties: {
+            _placeholder: {
+              type: "boolean",
+              description: "this property is a placeholder and should be ignored.",
+            },
+          },
+        },
+      },
+    };
   }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/request/helpers/tool-utils.ts` around lines 104 - 108, The catch block for cloneRecord currently returns the original tool (variable tool), which leaves unsanitized/circular schemas in place; instead, on clone failure assign cleanedParameters = {} (or create and return a shallow copy of tool with parameters set to an empty object) so downstream uses a sanitized, safe parameters object; update the catch to set cleanedParameters = {} (or return {...tool, parameters: {}}) and continue processing rather than returning the original tool.

lib/shutdown.ts (1)
77-79: ⚠️ Potential issue | 🟡 Minor

add SIGBREAK handler for windows ctrl+break graceful shutdown support.

lib/shutdown.ts:77-79 registers SIGINT and SIGTERM but omits SIGBREAK. on windows, ctrl+break emits SIGBREAK (not SIGINT), so without it, graceful shutdown fails for users relying on ctrl+break in some console spawn scenarios. add `process.once("SIGBREAK", handleSignal);` alongside SIGINT/SIGTERM.

also add a regression test to test/shutdown.test.ts:127+ (within the "process signal integration" suite) that verifies the SIGBREAK handler is registered and runs cleanup/exit, following the same pattern as the existing SIGINT/SIGTERM tests at lines 128-184.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/shutdown.ts` around lines 77 - 79, Add a SIGBREAK registration alongside the existing signal handlers so Windows ctrl+break triggers graceful shutdown: in lib/shutdown.ts where process.once("SIGINT", handleSignal) and process.once("SIGTERM", handleSignal) are registered, add process.once("SIGBREAK", handleSignal). Then add a regression test in the "process signal integration" suite in test/shutdown.test.ts (next to the existing SIGINT/SIGTERM tests) that verifies the SIGBREAK handler is registered and that firing SIGBREAK runs the same cleanup/exit logic used by the SIGINT/SIGTERM tests.

lib/auto-update-checker.ts (1)
36-41: 🧹 Nitpick | 🔵 Trivial

sleepSync using SharedArrayBuffer/Atomics is blocking: intentional?

`Atomics.wait` blocks the thread synchronously. this is used in the retry loop for cache writes. while it works, it blocks the event loop during retries. for a background update check this is probably acceptable, but worth documenting the intentional blocking behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/auto-update-checker.ts` around lines 36 - 41, sleepSync currently uses Atomics.wait which blocks the thread and thus the event loop; update the code to explicitly document that behavior in the sleepSync JSDoc/comment (function name: sleepSync) stating that it intentionally performs a synchronous/blocking sleep and why it is acceptable for the background update check; additionally, if non-blocking behavior is preferred, add an alternative asyncSleep implementation (e.g., async function asyncSleep(ms: number): Promise<void>) and update the retry loop(s) that call sleepSync to use asyncSleep instead of sleepSync.

lib/storage.ts (1)
1586-1598: ⚠️ Potential issue | 🟠 Major

flagged-account writes are missing file-lock hardening and windows retry behavior.

Line 1587 (lib/storage.ts:1587) uses a process-local mutex only, and Line 1597 (lib/storage.ts:1597) does a single rename. this leaves `openai-codex-flagged-accounts.json` exposed to cross-process races and windows `EPERM`/`EBUSY` failures.

proposed fix
```diff
 export async function saveFlaggedAccounts(storage: FlaggedAccountStorageV1): Promise<void> {
-  return withStorageLock(async () => {
-    const path = getFlaggedAccountsPath();
+  const path = getFlaggedAccountsPath();
+  return withStorageSerializedFileLock(path, async () => {
     const uniqueSuffix = `${Date.now()}.${Math.random().toString(36).slice(2, 8)}`;
     const tempPath = `${path}.${uniqueSuffix}.tmp`;
@@
-    await fs.rename(tempPath, path);
+    await renameFileWithRetry(tempPath, path);
   } catch (error) {
```

As per coding guidelines:
lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/storage.ts` around lines 1586 - 1598, saveFlaggedAccounts currently only uses a process-local mutex (withStorageLock) and performs a single fs.rename, leaving cross-process races and Windows EPERM/EBUSY failures possible; update saveFlaggedAccounts (and related helpers like getFlaggedAccountsPath and tempPath usage) to use a cross-process file lock (e.g., advisory lock or lockfile on the target path) around the write/rename, write the temp file with exclusive create semantics, then attempt fs.rename in a retry loop with exponential backoff that handles EPERM/EBUSY/EMFILE and maximum retry count; ensure the lock is always released in finally, preserve file mode 0o600, and add/update vitest cases that simulate concurrent writers and Windows-style EBUSY/EPERM to verify retries and correctness.

index.ts (1)
882-885: ⚠️ Potential issue | 🟠 Major

hydrate persistence can lose concurrent updates.

line 884 writes with `saveAccounts(storage)` after parallel refresh work. this path can clobber concurrent account mutations during the same window. use transactional persistence here as done in other write paths, and add a concurrency regression similar in spirit to test/codex-cli-sync.test.ts:503-583.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@index.ts` around lines 882 - 885, The current write path that sets storage.accounts = accountsCopy and calls await saveAccounts(storage) can clobber concurrent updates; change it to use the same transactional persistence approach used elsewhere: load the latest storage inside a transaction/compare-and-swap, merge the updated accounts into that fresh storage (instead of overwriting), and commit via the transactional save API (e.g., the existing transactional save function used in other write paths) so concurrent mutations are preserved; update the block around changed/accountsCopy/saveAccounts to perform read-merge-compare-and-commit and add a concurrency regression test modeled after test/codex-cli-sync.test.ts:503-583 that simulates parallel refresh updates to ensure no lost updates.

docs/configuration.md (1)
71-99: ⚠️ Potential issue | 🟠 Major

add explicit upgrade-note and npm-script pointers for these behavior changes.
this section introduces new operational/security behavior toggles, but the change does not explicitly point operators to upgrade notes and the new script workflow entrypoints in the docs trail.
As per coding guidelines,
docs/**: keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts.

test/schemas.test.ts (1)
33-45: ⚠️ Potential issue | 🟡 Minor

add a negative type test for `telemetryEnabled`.

test/schemas.test.ts:44 only validates the true-path for `telemetryEnabled`. add a regression assertion that string/number input is rejected, so this new field stays schema-guarded.

proposed test addition
it("rejects wrong types", () => { const result = PluginConfigSchema.safeParse({ codexMode: "yes" }); expect(result.success).toBe(false); }); + +it("rejects non-boolean telemetryEnabled", () => { + expect(PluginConfigSchema.safeParse({ telemetryEnabled: "true" }).success).toBe(false); + expect(PluginConfigSchema.safeParse({ telemetryEnabled: 1 }).success).toBe(false); +});🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/schemas.test.ts` around lines 33 - 45, Add a negative regression test in the same schemas.test.ts suite that verifies the telemetryEnabled schema rejects non-boolean inputs: call the same configuration validator used elsewhere in the file (the schema parse/validate function used in existing assertions) with telemetryEnabled set to a string (e.g., "true") and a number (e.g., 1) and assert that validation fails (throws or returns errors) for each case; place the assertions alongside the existing positive test for telemetryEnabled so the schema continues to guard this field.

lib/refresh-lease.ts (1)
255-284: 🧹 Nitpick | 🔵 Trivial

duplicate follower handle construction; consider extraction.

lib/refresh-lease.ts:255-264 and lib/refresh-lease.ts:275-284 repeat the same follower handle pattern. extract a helper like `createFollowerHandle(result: TokenResult)` to reduce duplication and ensure consistent behavior if the follower semantics evolve.

♻️ proposed extraction
```diff
+  private createFollowerHandle(result: TokenResult): RefreshLeaseHandle {
+    return {
+      role: "follower",
+      result,
+      release: async () => {
+        // Follower does not own lock.
+      },
+    };
+  }
```

then replace the inline objects with
`return this.createFollowerHandle(finalResult);`

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/refresh-lease.ts` around lines 255 - 284, Two identical follower handle objects are built in refresh-lease.ts after readFreshResult; extract a helper method (e.g. createFollowerHandle(result: TokenResult)) on the same class that returns the { role: "follower", result, release: async () => { /* follower does not own lock */ } } object and replace both inline constructions (the blocks after readFreshResult at the two locations) with return this.createFollowerHandle(finalResult); so behavior stays consistent and duplication is removed.

lib/auth/auth.ts (1)
327-334: ⚠️ Potential issue | 🟡 Minor

loose timeout detection via regex could match unrelated errors.

line 329 uses `/timeout/i.test(err.message)`, which could match any error message containing "timeout" (e.g., "connection timeout", "socket timeout", or even "timeout value invalid"). this might incorrectly suppress logging for non-timeout errors.

consider relying solely on `isAbortError(err)`, which checks `name === "AbortError"` or `code === "ABORT_ERR"` per lib/utils.ts:19-23.

🔧 suggested fix
```diff
-    if (isAbortError(err) || /timeout/i.test(err.message)) {
+    if (isAbortError(err)) {
       return { type: "failed", reason: "unknown", message: err?.message ?? "Request aborted" };
     }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/auth/auth.ts` around lines 327 - 334, The catch block in the token refresh flow currently treats any error whose message matches /timeout/i as an abort and skips logging; update the catch handling in the try/catch inside lib/auth/auth.ts so it no longer relies on the regex check and instead only treats errors as aborted when isAbortError(err) returns true (leave the err normalization via error instanceof Error intact), preserve the existing returned shapes (e.g., return { type: "failed", reason: "unknown", message: ... } for aborts) and ensure non-abort errors always call logError("Token refresh error", err) and return the network_error result; reference the catch block and functions isAbortError and logError to locate and change the logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4777ad81-b342-4db1-8d7f-a5f369bbf6d3
⛔ Files ignored due to path filters (1)
`package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (116)

.github/pull_request_template.md, .github/settings.yml, .github/workflows/ci.yml, .github/workflows/secret-scan.yml, .github/workflows/supply-chain.yml, CONTRIBUTING.md, README.md, biome.jsonc, docs/DOCUMENTATION.md, docs/README.md, docs/configuration.md, docs/development/CONFIG_FIELDS.md, docs/development/LOCAL_DEV.md, docs/development/RELEASE_RUNBOOK.md, docs/development/TESTING.md, docs/index.md, docs/privacy.md, docs/reference/commands.md, docs/reference/error-contracts.md, docs/reference/public-api.md, docs/reference/settings.md, docs/reference/storage-paths.md, docs/runbooks/README.md, docs/runbooks/incident-response.md, docs/runbooks/operations.md, docs/upgrade.md, index.ts, lib/accounts.ts, lib/accounts/account-view.ts, lib/accounts/active-index.ts, lib/accounts/storage-view.ts, lib/audit.ts, lib/auth/auth.ts, lib/authorization.ts, lib/auto-update-checker.ts, lib/background-jobs.ts, lib/codex-cli/sync.ts, lib/codex-manager.ts, lib/codex-manager/settings-hub.ts, lib/config.ts, lib/data-redaction.ts, lib/data-retention.ts, lib/file-lock.ts, lib/idempotency.ts, lib/index.ts, lib/network.ts, lib/proactive-refresh.ts, lib/prompts/codex.ts, lib/prompts/host-codex-prompt.ts, lib/quota-cache.ts, lib/recovery/storage.ts, lib/refresh-guardian.ts, lib/refresh-lease.ts, lib/refresh-queue.ts, lib/request/fetch-helpers.ts, lib/request/helpers/tool-utils.ts, lib/request/response-handler.ts, lib/request/retry-governor.ts, lib/schemas.ts, lib/secrets-crypto.ts, lib/shutdown.ts, lib/storage.ts, lib/telemetry.ts, lib/unified-settings.ts, lib/utils.ts, package.json, scripts/benchmark-runtime-path.mjs, scripts/check-runtime-budgets.mjs, scripts/doctor-dev.js, scripts/license-policy-check.js, scripts/runtime-bench-budgets.json, scripts/setup-dev.js, test/account-storage-view.test.ts, test/account-view.test.ts, test/accounts-edge.test.ts, test/active-index.test.ts, test/audit.test.ts, test/auth-logging.test.ts, test/auth.test.ts, test/authorization.test.ts, test/auto-update-checker.test.ts, test/background-jobs.test.ts, test/codex-cli-sync.test.ts, test/codex-manager-cli.test.ts, test/codex-prompts.test.ts, test/config-save.test.ts, test/data-redaction.test.ts, test/data-retention.test.ts, test/fetch-helpers.test.ts, test/file-lock.test.ts, test/host-codex-prompt.test.ts, test/idempotency.test.ts, test/index-retry.test.ts, test/index.test.ts, test/network.test.ts, test/paths.test.ts, test/plugin-config.test.ts, test/proactive-refresh.test.ts, test/quota-cache.test.ts, test/recovery-storage.test.ts, test/refresh-guardian.test.ts, test/refresh-lease.test.ts, test/refresh-queue.test.ts, test/response-handler-logging.test.ts, test/response-handler.test.ts, test/retry-governor.test.ts, test/schemas.test.ts, test/scripts/doctor-dev.test.ts, test/scripts/setup-dev.test.ts, test/secrets-crypto.test.ts, test/server.unit.test.ts, test/settings-hub-utils.test.ts, test/shutdown.test.ts, test/storage.test.ts, test/telemetry.test.ts, test/unified-settings.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (3)
test/**
⚙️ CodeRabbit configuration file
tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.
lib/**
⚙️ CodeRabbit configuration file
focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.
docs/**
⚙️ CodeRabbit configuration file
keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts.
🪛 ast-grep (0.41.0)
test/index.test.ts
[warning] 897-900: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(
^${escapedLabel}: ${numberPattern}${suffixPattern}$,
"m",
)
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🪛 LanguageTool
docs/runbooks/operations.md
[uncategorized] ~64-~64: The official name of this software platform is spelled with a capital “H”.
Context: ...ion required checks remain aligned with .github/settings.yml. --- ## Failure Triage ...
(GITHUB)
🪛 markdownlint-cli2 (0.21.0)
docs/configuration.md
[warning] 73-73: Table column count
Expected: 2; Actual: 4; Too many cells, extra data will be missing
(MD056, table-column-count)
- harden auth and codex-manager review paths
- add deterministic regression coverage for auth/manager/prompts/index
- tighten CLI/list behavior, telemetry redaction, and account-index reconciliation

Co-authored-by: Codex <noreply@openai.com>
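The telemetry-redaction work mentioned above revolves around scrubbing secrets before anything leaves the process. A minimal sketch of key-based redaction, assuming a hypothetical key list and mask (not the plugin's actual `redactForExternalOutput`):

```typescript
// Hypothetical sketch of key-based JSON redaction for external output.
// The sensitive-key list and "[REDACTED]" mask are assumptions, not the
// plugin's actual implementation.
const SENSITIVE_KEYS = new Set(["access_token", "refresh_token", "email", "id_token"]);

function redactForOutput(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(redactForOutput);
  }
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([key, v]) =>
        SENSITIVE_KEYS.has(key) ? [key, "[REDACTED]"] : [key, redactForOutput(v)],
      ),
    );
  }
  return value;
}

const redacted = redactForOutput({
  email: "user@example.com",
  accounts: [{ access_token: "sk-secret", plan: "pro" }],
});
console.log(JSON.stringify(redacted));
// {"email":"[REDACTED]","accounts":[{"access_token":"[REDACTED]","plan":"pro"}]}
```

The recursion walks arrays and nested objects so tokens buried in account lists are masked while non-sensitive fields pass through untouched.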
Additional Comments (1)
Prompt To Fix With AI: this is a comment left during a code review.
Path: lib/request/response-handler.ts
Line: 95-116
Comment:
**SSE parse-failure returns upstream status code (often 200)**
when `parseSseStream` finds no `response.done` event (line 95), the returned `Response` inherits `response.status` from the upstream SSE response. for a normal codex request that stream was opened with a 200, so `response.ok` evaluates to `true` on the error path. any caller that gates on `response.ok` (or checks status < 400) will silently swallow this and treat the structured `stream_parse_error` body as a success payload.
the tests at `test/response-handler.test.ts` verify the error body is correct but don't assert the status code — a caller checking `result.response.ok` would incorrectly treat the parse error as success.
fix: use an explicit 5xx or 502 status to signal a gateway/parse failure, so downstream error handling gates on status code:
```suggestion
return {
response: new Response(
JSON.stringify({
error: {
message: "No response.done event found in SSE stream",
type: "stream_parse_error",
},
}),
{
status: 502,
statusText: "Bad Gateway",
headers: jsonHeaders,
},
),
};
```
How can I resolve this? If you propose a fix, please make it concise.
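The `response.ok` trap described in this comment is easy to demonstrate with the standard `Response` class (global in Node 18+); this is an illustration of the gating behavior, not the plugin's handler:

```typescript
// Why a parse-error Response must carry a 5xx status: `ok` is derived
// purely from the status code, so an error body behind a 200 passes
// any `response.ok` gate.
const errorBody = JSON.stringify({
  error: {
    message: "No response.done event found in SSE stream",
    type: "stream_parse_error",
  },
});

// Inheriting the upstream SSE status (200) makes the error look like success.
const inherited = new Response(errorBody, { status: 200 });
// An explicit 502 lets status-based error handling catch it.
const explicit = new Response(errorBody, { status: 502, statusText: "Bad Gateway" });

console.log(inherited.ok); // true: error silently treated as success
console.log(explicit.ok);  // false: a caller's ok/status check now fires
```

This is why the suggestion swaps the inherited status for an explicit 502 rather than only changing the body.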
- add deterministic regressions for settings-hub metadata/unlimited preview, jitter ceiling accounting, quota-cache win32 writes, runtime budget validation, and sync benchmark guard
- treat EBUSY account saves as recoverable merge conflicts and verify merged retry payloads
- share removeWithRetry test helper to avoid duplicate Windows cleanup logic
- normalize caller abort reasons in fetchWithTimeout catch paths to preserve AbortError contract

Co-authored-by: Codex <noreply@openai.com>
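Treating `EBUSY`/`EPERM` saves as recoverable, as the change list above describes, usually means a bounded retry loop around the filesystem operation. A minimal sketch, with the attempt count and backoff chosen arbitrarily for illustration:

```typescript
// Minimal sketch: retry an operation when it fails with a transient
// Windows-style EBUSY/EPERM code. Retry count and delay are assumptions,
// not the plugin's tuned values.
const TRANSIENT_CODES = new Set(["EBUSY", "EPERM"]);

async function withTransientRetry<T>(
  op: () => Promise<T>,
  attempts = 3,
  delayMs = 50,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await op();
    } catch (error) {
      const code = (error as NodeJS.ErrnoException).code;
      if (attempt >= attempts || !code || !TRANSIENT_CODES.has(code)) {
        throw error; // non-transient or out of budget: rethrow
      }
      await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
    }
  }
}

// Usage: an operation that fails twice with EBUSY, then succeeds.
let calls = 0;
const result = await withTransientRetry(async () => {
  calls++;
  if (calls < 3) {
    const err = new Error("busy") as NodeJS.ErrnoException;
    err.code = "EBUSY";
    throw err;
  }
  return "saved";
});
console.log(result, calls); // saved 3
```

Non-transient errors are rethrown immediately, so genuine failures are not masked by the retry budget.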
- default CODEX_AUTH_ROLE to viewer to preserve least-privilege when unset
- add double ownership verification before lock unlink and document TOCTOU limitation
- add regressions for replacement-race lock release behavior and viewer default authorization

Co-authored-by: Codex <noreply@openai.com>
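The double ownership verification before lock unlink can be sketched as read-verify-unlink. The token format here is hypothetical, and, as the commit itself documents, a replacement race between the read and the unlink remains possible (the TOCTOU limitation):

```typescript
import { promises as fs } from "node:fs";
import os from "node:os";
import path from "node:path";

// Sketch: only unlink the lock file if it still contains our owner token.
// The token layout is an assumption; a replacement race between readFile
// and unlink remains possible (the documented TOCTOU limitation).
async function releaseLockIfOwned(lockPath: string, ownerToken: string): Promise<boolean> {
  let contents: string;
  try {
    contents = await fs.readFile(lockPath, "utf8");
  } catch {
    return false; // lock already gone
  }
  if (contents.trim() !== ownerToken) {
    return false; // someone else re-acquired the lock; leave it alone
  }
  await fs.unlink(lockPath);
  return true;
}

// Usage against a temp file.
const lockPath = path.join(await fs.mkdtemp(path.join(os.tmpdir(), "lock-")), "x.lock");
await fs.writeFile(lockPath, "pid:1234:abc");
console.log(await releaseLockIfOwned(lockPath, "pid:9999:zzz")); // false: not ours
console.log(await releaseLockIfOwned(lockPath, "pid:1234:abc")); // true: ours, removed
```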
Actionable comments posted: 29
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
test/server.unit.test.ts (1)
328-350: 🧹 Nitpick | 🔵 Trivial

**wrap timer cleanup in try/finally for determinism.**

if an assertion fails before line 349, fake timers stay active and can bleed into subsequent tests. the new close-abort test at `test/server.unit.test.ts:303-326` uses try/finally correctly—this one should too.

suggested fix

```diff
 it('should return null after 5 minute timeout', async () => {
   vi.useFakeTimers();
-
-  (mockServer.listen as ReturnType<typeof vi.fn>).mockImplementation(
-    (_port: number, _host: string, callback: () => void) => {
-      callback();
-      return mockServer;
-    }
-  );
-  (mockServer.on as ReturnType<typeof vi.fn>).mockReturnValue(mockServer);
-
-  const result = await startLocalOAuthServer({ state: 'test-state' });
-
-  const codePromise = result.waitForCode();
-
-  await vi.advanceTimersByTimeAsync(5 * 60 * 1000 + 100);
-
-  const code = await codePromise;
-  expect(code).toBeNull();
-  expect(logWarn).toHaveBeenCalledWith('OAuth poll timeout after 5 minutes');
-
-  vi.useRealTimers();
+  try {
+    (mockServer.listen as ReturnType<typeof vi.fn>).mockImplementation(
+      (_port: number, _host: string, callback: () => void) => {
+        callback();
+        return mockServer;
+      }
+    );
+    (mockServer.on as ReturnType<typeof vi.fn>).mockReturnValue(mockServer);
+
+    const result = await startLocalOAuthServer({ state: 'test-state' });
+
+    const codePromise = result.waitForCode();
+
+    await vi.advanceTimersByTimeAsync(5 * 60 * 1000 + 100);
+
+    const code = await codePromise;
+    expect(code).toBeNull();
+    expect(logWarn).toHaveBeenCalledWith('OAuth poll timeout after 5 minutes');
+  } finally {
+    vi.useRealTimers();
+  }
 });
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@test/server.unit.test.ts` around lines 328 - 350, The test "should return null after 5 minute timeout" uses vi.useFakeTimers() but doesn't guarantee restoring real timers on failure; wrap the fake-timer setup and all test logic in a try/finally so vi.useRealTimers() is always called. Specifically, in the test block containing startLocalOAuthServer, result.waitForCode(), vi.advanceTimersByTimeAsync, and assertions, move vi.useRealTimers() into a finally clause to ensure cleanup even if an assertion throws; reference the test name, vi.useFakeTimers/vi.useRealTimers, startLocalOAuthServer, and result.waitForCode to locate the code to change.

index.ts (1)
797-857: ⚠️ Potential issue | 🟠 Major

**add deterministic vitest coverage for the concurrent hydration worker path.**

`index.ts:797-857` introduces concurrent workers with shared progress/counter state (nextHydrateIndex, refreshed/failed/updated counters), but no corresponding deterministic regression is included in the changed `test/**` files. add a test that forces mixed refresh outcomes and asserts each account is processed once with stable aggregate counts.

As per coding guidelines, `test/**`: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Also applies to: 859-880

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@index.ts` around lines 797 - 857, The concurrent worker loop using shared state (workerCount, nextHydrateIndex, queuedRefresh, accountsToHydrate, refreshedCount, failedCount, updatedAccountCount) needs a deterministic vitest regression: add a test in test/** that sets HYDRATE_EMAILS_MAX_CONCURRENCY to a known value, provides a fixed accountsToHydrate array, and mocks queuedRefresh to return a controlled sequence of mixed outcomes (success with varied access/refresh/expires, failure) while ensuring each mock call is deterministic (use a seeded sequence or a controlled promise queue rather than timing/flaky delays). The test must assert each account was processed exactly once and that refreshedCount/failedCount/updatedAccountCount match expected values, and must avoid mocking real secrets or skipping assertions.

docs/reference/settings.md (1)
184-192: ⚠️ Potential issue | 🟠 Major

**validation commands are stale relative to current verification gates.**

`docs/reference/settings.md:184-192` still shows only runtime cli checks. this section should include the current verification gates and npm scripts used by the hardening rollout, plus an upgrade-note pointer.

proposed docs update

```diff
 ## Validation

 After changes:

-codex auth status
-codex auth check
-codex auth forecast --live
+npm run verify
+npm run verify:ci
+npm test -- test/documentation.test.ts
+# optional local checks
+npm run doctor:dev
+npm run setup:dev
```

As per coding guidelines `docs/**`: keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.
In `@docs/reference/settings.md` around lines 184 - 192, Update the Validation
section to replace the stale CLI-only checks with the current verification gates
and npm scripts: list "npm run verify", "npm run verify:ci", and "npm test --
test/documentation.test.ts" as the primary gates, and add optional local checks
"npm run doctor:dev" and "npm run setup:dev"; also add a brief upgrade-note
pointer to the release/upgrade documentation indicating these new scripts and
any changed CLI flags (ensure the section references the exact script names and
the documentation test file to match current workflows).</details> </blockquote></details> <details> <summary>lib/auth/auth.ts (1)</summary><blockquote> `234-253`: _⚠️ Potential issue_ | _🟠 Major_ **normalize refresh timeout input before calling `fetchwithtimeout`** `lib/auth/auth.ts:253` (line 253) uses `options.timeoutMs ?? OAUTH_REFRESH_TIMEOUT_MS`, so `0`, negative values, and `NaN` can bypass sane bounds. this is inconsistent with `lib/auth/auth.ts:121-126` (line 121-126), where exchange timeout is clamped and validated. <details> <summary>proposed fix</summary> ```diff type RefreshAccessTokenOptions = { signal?: AbortSignal; timeoutMs?: number; }; +function resolveRefreshTimeoutMs(timeoutMs: number | undefined): number { + if (typeof timeoutMs !== "number" || !Number.isFinite(timeoutMs)) { + return OAUTH_REFRESH_TIMEOUT_MS; + } + return Math.max(1_000, Math.floor(timeoutMs)); +} + export async function refreshAccessToken( refreshToken: string, options: RefreshAccessTokenOptions = {}, ): Promise<TokenResult> { try { const response = await fetchWithTimeout(TOKEN_URL, { @@ - }, options.timeoutMs ?? OAUTH_REFRESH_TIMEOUT_MS); + }, resolveRefreshTimeoutMs(options.timeoutMs));please add vitest coverage for
`timeoutMs` values `0`, `-1`, and `NaN` in the refresh path. As per coding guidelines `lib/**`: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@lib/auth/auth.ts` around lines 234 - 253, The refreshAccessToken function passes options.timeoutMs directly to fetchWithTimeout allowing 0, negative, or NaN to bypass the default; normalize and clamp timeoutMs before the fetch: read RefreshAccessTokenOptions.timeoutMs, coerce to a finite number, and if it's NaN or <= 0 fall back to OAUTH_REFRESH_TIMEOUT_MS (or apply the same min/max clamp logic used in the exchange path), then pass that normalized value to fetchWithTimeout; update refreshAccessToken and related references (refreshAccessToken, RefreshAccessTokenOptions, fetchWithTimeout, OAUTH_REFRESH_TIMEOUT_MS) accordingly and add Vitest cases asserting behavior for timeoutMs values 0, -1, and NaN in the refresh path, ensuring tests avoid logging tokens/emails and reference any changed tests in the commit message.

lib/storage.ts (1)
1118-1174: ⚠️ Potential issue | 🟠 Major

**known revision is overwritten with a stale checksum after migration**

`lib/storage.ts:1123-1129` can persist migrated content, but `lib/storage.ts:1173` still stores `rawChecksum` from the pre-migration file. then `saveAccountsUnlocked` conflict checks at `lib/storage.ts:1258-1273` can raise false `ECONFLICT` even with no concurrent writer.

proposed fix

```diff
-  const { normalized, storedVersion, schemaErrors, rawChecksum } = await loadAccountsFromPath(path);
+  const { normalized, storedVersion, schemaErrors, rawChecksum } = await loadAccountsFromPath(path);
+  let revisionToRemember = rawChecksum;
@@
   if (persistMigration) {
     try {
       await persistMigration(normalized);
+      revisionToRemember = (await readStorageRevision(path)) ?? rawChecksum;
     } catch (saveError) {
       log.warn("Failed to persist migrated storage", { error: String(saveError) });
     }
   }
@@
-  rememberKnownStorageRevision(path, rawChecksum);
+  rememberKnownStorageRevision(path, revisionToRemember);
```

please add a vitest regression for `v1 -> v3` auto-migration followed by immediate save to confirm no false conflict is raised. As per coding guidelines `lib/**`: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/storage.ts` around lines 1118 - 1174, The migration path currently persists migrated data (persistMigration) but still calls rememberKnownStorageRevision(path, rawChecksum) using the pre-migration rawChecksum, causing false ECONFLICTs in saveAccountsUnlocked; fix by updating the stored checksum after a successful persistence: after persistMigration(normalized) either compute the actual on-disk checksum for the migrated content or reload the written file (e.g., via loadAccountsFromPath(path)) to obtain the new rawChecksum and pass that value into rememberKnownStorageRevision instead of the original rawChecksum; update code paths around persistMigration, the promotion branch, and the final rememberKnownStorageRevision call accordingly (functions/methods: persistMigration, loadAccountsFromPath, rememberKnownStorageRevision, saveAccountsUnlocked), and add a vitest regression that performs a v1->v3 auto-migration then immediately calls saveAccountsUnlocked to assert no ECONFLICT is thrown.
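The stale-checksum issue above is an optimistic-concurrency problem: the remembered revision must describe the bytes actually on disk after migration. A self-contained sketch of checksum-based conflict detection (hypothetical helper names, not the plugin's `rememberKnownStorageRevision` internals):

```typescript
import { createHash } from "node:crypto";

// Sketch of checksum-based conflict detection: a save is rejected when the
// on-disk content no longer matches the revision we last observed.
const known = new Map<string, string>();

function checksum(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

function rememberRevision(file: string, content: string): void {
  known.set(file, checksum(content));
}

function assertNoConflict(file: string, onDisk: string): void {
  const expected = known.get(file);
  if (expected !== undefined && expected !== checksum(onDisk)) {
    throw Object.assign(new Error("concurrent modification"), { code: "ECONFLICT" });
  }
}

// If migration rewrites the file but we remember the PRE-migration checksum,
// the very next save sees a "conflict" with no concurrent writer: the bug
// described in the comment above.
const original = '{"version":1}';
const migrated = '{"version":3}';
rememberRevision("accounts.json", original); // stale: the file now holds `migrated`
let conflict = false;
try {
  assertNoConflict("accounts.json", migrated);
} catch {
  conflict = true;
}
console.log(conflict); // true: false positive until the post-migration checksum is remembered
rememberRevision("accounts.json", migrated);
assertNoConflict("accounts.json", migrated); // passes
```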
♻️ Duplicate comments (8)
test/utils.test.ts (1)
194-236: ⚠️ Potential issue | 🟠 Major

**add timeout-path regression coverage for `fetchWithTimeout`.**

Line 199 and Line 216 validate caller-abort behavior, but there is still no test for the hard-timeout abort path implemented in `lib/utils.ts:84-89` and `lib/utils.ts:123-125`. this leaves a concurrency-sensitive abort classification path unguarded.

As per coding guidelines `test/**`: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

proposed test addition

```diff
 describe('fetchWithTimeout', () => {
   afterEach(() => {
     vi.unstubAllGlobals();
   });
+
+  it('normalizes hard timeout to AbortError shape', async () => {
+    vi.useFakeTimers();
+    try {
+      vi.stubGlobal(
+        'fetch',
+        vi.fn((_input: unknown, init?: RequestInit) =>
+          new Promise((_, reject) => {
+            init?.signal?.addEventListener(
+              'abort',
+              () => reject(init.signal?.reason),
+              { once: true },
+            );
+          }),
+        ),
+      );
+      const request = fetchWithTimeout('https://example.com', {}, 1_000);
+      await vi.advanceTimersByTimeAsync(1_000);
+      await expect(request).rejects.toMatchObject({
+        name: 'AbortError',
+        code: 'ABORT_ERR',
+        message: 'Fetch timeout after 1000ms',
+      });
+    } finally {
+      vi.useRealTimers();
+    }
+  });
```
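The hard-timeout path under test here normalizes an internal timeout into the same `AbortError` shape callers already handle. A standalone sketch of that normalization, assuming the error fields shown in the proposed test (this is not the plugin's actual `fetchWithTimeout`):

```typescript
// Sketch: normalize a hard timeout into a stable AbortError shape so callers
// can classify it the same way as caller-initiated aborts. The field values
// mirror the shape asserted in the proposed test; names are assumptions.
function makeTimeoutAbortError(timeoutMs: number): Error & { code: string } {
  const error = new Error(`Fetch timeout after ${timeoutMs}ms`) as Error & { code: string };
  error.name = "AbortError";
  error.code = "ABORT_ERR";
  return error;
}

async function withTimeout<T>(
  work: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(makeTimeoutAbortError(timeoutMs)), timeoutMs);
  try {
    return await work(controller.signal);
  } catch (err) {
    // Surface the normalized abort reason rather than whatever the runtime threw.
    if (controller.signal.aborted) throw controller.signal.reason;
    throw err;
  } finally {
    clearTimeout(timer);
  }
}

// Usage: a never-resolving task is classified as a timeout AbortError.
const hang = (signal: AbortSignal) =>
  new Promise<never>((_, reject) =>
    signal.addEventListener("abort", () => reject(signal.reason), { once: true }),
  );
try {
  await withTimeout(hang, 20);
} catch (err) {
  const e = err as Error & { code?: string };
  console.log(e.name, e.code); // AbortError ABORT_ERR
}
```

Passing the error object as the `abort()` reason (Node 17.3+) is what keeps the rejection shape deterministic for assertions like the `toMatchObject` above.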
98-100: ⚠️ Potential issue | 🟠 Major

**invalid ttl fallback to `1ms` weakens replay protection under misconfiguration.**

Line 98 in `lib/idempotency.ts:98` maps non-finite/invalid ttl to `1`. that can effectively disable idempotency after ~1ms if caller input is bad. default to `IDEMPOTENCY_TTL_MS` for fail-safe behavior.

proposed fix

```diff
 function normalizeTtlMs(ttlMs: number): number {
-  return Number.isFinite(ttlMs) && ttlMs > 0 ? Math.floor(ttlMs) : 1;
+  if (Number.isFinite(ttlMs) && ttlMs > 0) {
+    return Math.max(1, Math.floor(ttlMs));
+  }
+  return IDEMPOTENCY_TTL_MS;
 }
```

As per coding guidelines `lib/**`: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@lib/idempotency.ts` around lines 98 - 100, normalizeTtlMs currently maps invalid/non-finite TTLs to 1ms which weakens replay protection; change normalizeTtlMs to return the configured fail-safe constant IDEMPOTENCY_TTL_MS when ttlMs is not a positive finite number (keep Math.floor for valid values), update the function reference in lib/idempotency.ts to import/use IDEMPOTENCY_TTL_MS, and adjust/add vitest unit tests that assert fallback behavior (ensure tests cover non-finite, negative, and NaN inputs) so the default is IDEMPOTENCY_TTL_MS rather than 1.

test/network.test.ts (1)
89-106: ⚠️ Potential issue | 🟡 Minor

**strengthen the 429 retry regression with callback payload assertions.**

Line 89 in `test/network.test.ts:89` covers 429 retry count, but it does not assert `onRetry` metadata. this can miss regressions in retry reason/status reporting while still passing.

proposed test tightening

```diff
 it("retries on HTTP 429 when configured", async () => {
+  const onRetry = vi.fn();
   fetchMock
     .mockResolvedValueOnce(new Response("rate-limited", { status: 429 }))
     .mockResolvedValueOnce(new Response("ok", { status: 200 }));
   const resultPromise = fetchWithTimeoutAndRetry("https://example.com", undefined, {
     timeoutMs: 2_000,
     retries: 1,
     baseDelayMs: 25,
     jitterMs: 0,
     retryOnStatuses: [429],
+    onRetry,
   });
   await vi.runAllTimersAsync();
   const result = await resultPromise;
   expect(result.attempts).toBe(2);
   expect(fetchMock).toHaveBeenCalledTimes(2);
+  expect(onRetry).toHaveBeenCalledWith(
+    expect.objectContaining({
+      reason: "status",
+      status: 429,
+      attempt: 1,
+      maxAttempts: 2,
+    }),
+  );
 });
```

As per coding guidelines `test/**`: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@test/network.test.ts` around lines 89 - 106, The test currently only asserts retry counts but not the onRetry callback payload; update the test for fetchWithTimeoutAndRetry to pass a mocked onRetry handler (e.g., vi.fn()) and assert it was called once with the expected metadata (attempt number 1, the retry reason/status indicating 429 or a response object with status 429, and the request URL and delay values), in addition to the existing attempts/fetch call count assertions so the retry reason/status reporting is validated deterministically.

lib/authorization.ts (1)
30-36:⚠️ Potential issue | 🔴 Criticaldefault role still "admin" when env absent - pr objectives flag this as critical.
lib/authorization.ts:31defaults to"admin"whenCODEX_AUTH_ROLEis unset. pr objectives explicitly state: "default role fallback ... currently defaults to 'admin' when env absent; should be 'viewer'". the past review addressed invalid values falling back to "viewer" (line 35), but the absent-env case still grants admin privileges.🔧 proposed fix
```diff
 function getRoleFromEnv(): AuthRole {
-  const raw = (process.env.CODEX_AUTH_ROLE ?? "admin").trim().toLowerCase();
+  const raw = (process.env.CODEX_AUTH_ROLE ?? "").trim().toLowerCase();
   if (raw === "operator" || raw === "viewer" || raw === "admin") {
     return raw;
   }
   return "viewer";
 }
```

This change means both absent env and invalid env fall back to "viewer", which is least-privilege. Update `test/authorization.test.ts:39-52` to expect `"viewer"` instead of `"admin"` when env is deleted.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/authorization.ts` around lines 30 - 36: the getRoleFromEnv function currently defaults to "admin" when CODEX_AUTH_ROLE is unset; change it so the default fallback is "viewer" (least-privilege) and preserve the existing validation that invalid values map to "viewer"; update the logic in getRoleFromEnv to initialize raw to "viewer" (or check for undefined and return "viewer") instead of "admin", and update the test expectations in test/authorization.test.ts (lines checking behavior when env is deleted) to expect "viewer" rather than "admin".

test/data-retention.test.ts (1)
80-114: ⚠️ Potential issue | 🟡 Minor

Add an `EBUSY` case to complete Windows prune regression coverage.

Line 80 currently validates only `EPERM`. Please include an `EBUSY` variant (or parameterize both) so Windows transient fs behavior is fully pinned in `test/data-retention.test.ts:80` against `lib/data-retention.ts:95`.

Proposed parameterized variant
```diff
-it("rethrows non-ENOENT errors during recursive prune", async () => {
+it.each(["EPERM", "EBUSY"] as const)(
+  "rethrows non-ENOENT errors during recursive prune (%s)",
+  async (code) => {
   const { enforceDataRetention } = await import("../lib/data-retention.js");
   // ...
   unlinkSpy.mockImplementation(async (...args) => {
     const path = args[0];
     if (typeof path === "string" && path.endsWith("locked.log")) {
       const error = new Error("locked") as NodeJS.ErrnoException;
-      error.code = "EPERM";
+      error.code = code;
       throw error;
     }
     return originalUnlink(...(args as Parameters<typeof fs.unlink>));
   });
   // ...
-  ).rejects.toMatchObject({ code: "EPERM" });
+  ).rejects.toMatchObject({ code });
   // ...
-});
+});
```

As per coding guidelines, "`test/**`: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/data-retention.test.ts` around lines 80 - 114: the test only asserts that enforceDataRetention throws an EPERM error for the locked file; add coverage for the Windows transient case by including an EBUSY variant (or parameterize to test both "EPERM" and "EBUSY") in the mocked fs.unlink behavior inside the "rethrows non-ENOENT errors during recursive prune" test, so the mockImplementation throws a NodeJS.ErrnoException with code set to "EBUSY" for the locked.log path in addition to (or as a separate subtest from) "EPERM", and update the expect(...).rejects.toMatchObject to assert the corresponding error.code for each variant while restoring the spy in the finally block.

lib/file-lock.ts (2)
184-187: ⚠️ Potential issue | 🟠 Major

`released` is marked true before cleanup succeeds, blocking retry on transient failures.

If `releaseOwnedLock` or `releaseOwnedLockSync` throws, callers cannot retry release because the handle is already marked released. This can strand lock files until stale eviction.

Proposed fix
```diff
 release: async () => {
   if (released) return;
-  released = true;
-  await releaseOwnedLock(path, metadata.ownerId);
+  await releaseOwnedLock(path, metadata.ownerId);
+  released = true;
 },
```

```diff
 release: () => {
   if (released) return;
-  released = true;
-  releaseOwnedLockSync(path, metadata.ownerId);
+  releaseOwnedLockSync(path, metadata.ownerId);
+  released = true;
 },
```

As per coding guidelines
"`lib/**`: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails."

Also applies to: 265-268
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/file-lock.ts` around lines 184 - 187: the current release flow marks the local flag released = true before calling releaseOwnedLock/releaseOwnedLockSync, which prevents retries if those functions throw; change the logic so released is only set to true after the release call completes successfully (i.e., call await releaseOwnedLock(path, metadata.ownerId) inside a try/catch and set released = true in the try after success, rethrow or handle errors without flipping the flag), and apply the same pattern for the sync variant; update/add vitest cases that simulate transient failures (EBUSY/429) to assert retries and eventual cleanup, and ensure any added logging around failures does not leak tokens or emails.
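The ordering the prompt asks for can be sketched as a small factory; `releaseOwnedLock` here is a stand-in for the real helper, and the names are illustrative rather than the plugin's actual API. The key point is that the `released` flag only flips after cleanup succeeds, so a failed release stays retryable:

```typescript
// Sketch only: `releaseOwnedLock` is an injected stand-in for the real lock helper.
export function makeReleaseHandle(releaseOwnedLock: () => Promise<void>) {
  let released = false;
  return {
    release: async () => {
      if (released) return;
      // May throw on transient failures (e.g. EBUSY); the flag stays false
      // so the caller can retry instead of stranding the lock file.
      await releaseOwnedLock();
      released = true;
    },
  };
}
```

After a successful release, subsequent calls are no-ops, which preserves the original idempotency guarantee.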
94-117: ⚠️ Potential issue | 🔴 Critical

Owner verification is still exposed to a read-then-unlink race.
The current flow reads the owner id, then unlinks by path. Under stale-lock eviction plus rapid reacquire, a late releaser can still remove a newer lock file between those steps. This needs an atomic ownership release model and a regression in `test/file-lock.test.ts` that interleaves stale eviction/reacquire with delayed release.

As per coding guidelines
"`lib/**`: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails."

Also applies to: 119-142
test/plugin-config.test.ts (1)
1052-1073: ⚠️ Potential issue | 🟡 Minor

Absolute ceiling clamp coverage still misses pure config-value bounds.
These tests validate env clamp behavior, but they still do not assert negative/oversized values coming from `pluginConfig` when env is unset (`lib/config.ts:731-738` handles both paths). Add two config-path cases to lock the getter behavior.

Proposed tests
```diff
+  it('should clamp config retry-all absolute ceiling to non-negative values', () => {
+    delete process.env.CODEX_AUTH_RETRY_ALL_ABSOLUTE_CEILING_MS;
+    expect(
+      getRetryAllAccountsAbsoluteCeilingMs({ retryAllAccountsAbsoluteCeilingMs: -10 }),
+    ).toBe(0);
+  });
+
+  it('should clamp config retry-all absolute ceiling to 24h hard maximum', () => {
+    delete process.env.CODEX_AUTH_RETRY_ALL_ABSOLUTE_CEILING_MS;
+    expect(
+      getRetryAllAccountsAbsoluteCeilingMs({
+        retryAllAccountsAbsoluteCeilingMs: 999_999_999,
+      }),
+    ).toBe(24 * 60 * 60_000);
+  });
```

As per coding guidelines
"`test/**`: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/plugin-config.test.ts` around lines 1052 - 1073, Add two unit tests that cover the config-path clamping for getRetryAllAccountsAbsoluteCeilingMs when the environment var is unset: one that sets PluginConfig.retryAllAccountsAbsoluteCeilingMs to a negative value and asserts the getter returns 0, and another that sets it to a very large value and asserts the getter returns the 24-hour hard maximum; ensure both tests clear CODEX_AUTH_RETRY_ALL_ABSOLUTE_CEILING_MS (or leave it unset) so they exercise the config branch in the getter.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/supply-chain.yml:
- Line 22: Replace all floating action tags like actions/checkout@v4 in the
workflow with their corresponding immutable commit SHAs; specifically locate
occurrences of actions/checkout@v4 (and any other actions referenced with `@v4` in
the same file) and update each tag to the full commit SHA (e.g.,
actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5) so every use in the
file is pinned to an exact commit; ensure you update all instances mentioned
(the entries at the same positions as actions/checkout@v4 and the other `@v4`
refs) and run a quick YAML validation after editing.
In `@docs/development/CONFIG_FIELDS.md`:
- Around line 197-214: The docs table of env overrides and the plugin field
inventory are missing the telemetry entries; add the
`CODEX_AUTH_TELEMETRY_ENABLED` row to the env variable table in CONFIG_FIELDS
and include `telemetryEnabled` in the plugin/field inventory so documentation
matches runtime/tests (the tests reference telemetry usage). Update the docs
entry text to briefly describe the toggle (boolean enabling/disabling telemetry)
and ensure naming matches the code symbols `CODEX_AUTH_TELEMETRY_ENABLED` and
`telemetryEnabled`.
In `@docs/runbooks/operations.md`:
- Around line 57-66: The pre-release checklist in docs/runbooks/operations.md is
missing two validation gates; update the checklist to include explicit entries
for the npm scripts `npm run verify` and `npm test --
test/documentation.test.ts` so operators run the full gate set, i.e., insert
both commands into the numbered list alongside `npm run verify:ci`/other checks
and ensure wording matches the rest of the list for consistency with documented
CLI/workflows.
In `@index.ts`:
- Around line 1784-1794: The auditLog calls (e.g., the invocation using
AuditAction.REQUEST_START and the actor argument currently set to account.email)
are persisting raw emails; change these audit actor values to a non-identifying
key instead (for example a hashed or formatted identifier derived from
account.index or account.id) and remove account.email from the persistent
metadata. Update all occurrences that pass account.email to auditLog (including
the calls using account.index + 1 and the metadata object with
accountIndex/model/modelFamily) to pass a stable non-PII actor string (e.g.,
`account-${account.index + 1}` or a hash of account.id) and ensure any email is
only used transiently (not stored) or omitted; preserve other metadata fields
(model, accountIndex, modelFamily) as before. Ensure tests referencing audit
records (vitest) are updated to expect the non-PII actor key.
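One way to build the non-PII actor string the prompt describes is to combine the 1-based account position with a short hash of an internal id. This is a hypothetical helper, not the plugin's real API; `accountId` and `auditActorKey` are illustrative names:

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: derive a stable, non-identifying audit actor key
// from internal account fields so no email reaches persisted audit records.
export function auditActorKey(accountId: string, accountIndex: number): string {
  const digest = createHash("sha256").update(accountId).digest("hex").slice(0, 12);
  return `account-${accountIndex + 1}-${digest}`;
}
```

The hash keeps the key stable across reorderings of the account list while remaining unlinkable to the email without the internal id.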
In `@lib/accounts.ts`:
- Around line 846-856: The merge-on-conflict path currently uses
clampNonNegativeInt on latest.activeIndexByFamily values so a stale too-large
index can survive; update the merge logic in lib/accounts.ts where
mergedActiveIndexByFamily and activeIndex/activeIndexByFamily are computed to
additionally clamp each family index against the mergedAccounts.length-1 (i.e.,
min(currentIndex, mergedAccounts.length-1) after ensuring non-negative) so
indices never exceed the merged account bounds; then add a vitest regression
that simulates an EBUSY/ECONFLICT recovery merge where latest has fewer accounts
than stored family indices and assert for mergedActiveIndexByFamily and
activeIndex that every persisted index ∈ [0, accounts.length-1], and reference
the changed symbols clampNonNegativeInt, mergedActiveIndexByFamily, activeIndex,
and activeIndexByFamily in the test description.
In `@lib/accounts/account-view.ts`:
- Around line 23-24: The clamp currently returns fractional indexes; change
normalization in lib/accounts/account-view.ts to convert rawCandidate to an
integer (e.g., use Math.floor or Math.trunc on rawCandidate) before clamping so
the returned active index is an integer, and update the logic that computes the
active index (look for the variable rawCandidate / raw and the function
producing activeIndexByFamily.codex) to use the integer-normalized value; then
add a vitest regression in test/account-view.test.ts that sets
activeIndexByFamily.codex to a fractional value like 1.8 and asserts the
resolved active index equals 1 to cover this case.
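The normalization being requested can be sketched as a pure function, assuming a truncate-then-clamp order (names are illustrative, not the exact symbols in `lib/accounts/account-view.ts`):

```typescript
// Sketch: truncate to an integer first, then clamp into [0, accountCount - 1].
export function normalizeActiveIndex(rawCandidate: unknown, accountCount: number): number {
  const truncated =
    typeof rawCandidate === "number" && Number.isFinite(rawCandidate)
      ? Math.trunc(rawCandidate)
      : 0;
  if (accountCount <= 0) return 0;
  return Math.min(Math.max(truncated, 0), accountCount - 1);
}
```

Under this sketch the fractional case from the review (`1.8` with three accounts) resolves to `1`, matching the regression the prompt asks for.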
In `@lib/codex-manager.ts`:
- Around line 4255-4268: The failure branch that prints the rotate-secrets JSON
currently logs raw error and idempotencyKey; wrap the output object with
maybeRedactJsonOutput (or run its fields through the redaction helper) before
JSON.stringify in the rotate-secrets error path so error and idempotencyKey are
redacted; update the code around the console.log that emits the { command:
"rotate-secrets", ..., error: message, ...(idempotencyKey ? { idempotencyKey } :
{}) } object to call maybeRedactJsonOutput(...) and then JSON.stringify that
result. Also add a vitest assertion in test/codex-manager-cli.test.ts that
running rotate-secrets --json on the failure path produces redacted output
(e.g., contains the redact marker or does not contain the original
error/idempotencyKey values).
- Around line 1340-1343: checkAuthRateLimit can throw inside the oauth login
path causing runOAuthFlow to propagate exceptions; wrap the pre-checks
(checkAuthRateLimit and recordAuthAttempt) in runOAuthFlow with a try/catch that
converts rate-limit/EBUSY errors (e.g. 429 or filesystem EBUSY) into a valid
failed TokenResult instead of throwing, leaving loadPluginConfig unchanged; then
add a vitest regression in test/codex-manager-cli.test.ts that stubs
checkAuthRateLimit to throw a 429-like error and asserts the CLI path (which
calls runOAuthFlow) returns the expected failed auth TokenResult (and document
the new test referencing EBUSY/429 handling).
In `@lib/config.ts`:
- Around line 306-319: Add vitest tests for readFileSyncWithRetry that mock
isRetryableFsError to return true and stub/spy on sleepSync to assert it is
called with CONFIG_READ_RETRY_BASE_DELAY_MS * 2**attempt for each retry attempt,
and assert retry stops and the original error is thrown after
CONFIG_READ_MAX_ATTEMPTS; include a separate test verifying cumulative blocking
time equals the sum of backoff delays (document the computed max cumulative
block time) and mark it acceptable for startup paths. Reference the
functions/symbols sleepSync, readFileSyncWithRetry, isRetryableFsError,
CONFIG_READ_MAX_ATTEMPTS, and CONFIG_READ_RETRY_BASE_DELAY_MS when locating code
to test; keep tests in the vitest suite (e.g., test/config.test.ts) and
restore/unmock original implementations after each test.
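The delay schedule the prompt asks the tests to assert follows directly from the `base * 2**attempt` formula. A sketch under assumed constants (the real values live in `lib/config.ts`):

```typescript
// Illustrative constants; the real values are defined in lib/config.ts.
const CONFIG_READ_MAX_ATTEMPTS = 4;
const CONFIG_READ_RETRY_BASE_DELAY_MS = 25;

// Delays applied before retry attempts 1..maxAttempts-1.
export function backoffSchedule(maxAttempts: number, baseDelayMs: number): number[] {
  return Array.from(
    { length: Math.max(maxAttempts - 1, 0) },
    (_, attempt) => baseDelayMs * 2 ** attempt,
  );
}

// Worst-case cumulative blocking time, the figure the test should document.
export const maxCumulativeBlockMs = backoffSchedule(
  CONFIG_READ_MAX_ATTEMPTS,
  CONFIG_READ_RETRY_BASE_DELAY_MS,
).reduce((sum, delay) => sum + delay, 0);
```

With these assumed constants the schedule is 25, 50, 100 ms, so the worst case blocks for 175 ms total, which is the kind of bound the test can assert and document.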
In `@lib/data-retention.ts`:
- Around line 57-75: The pruneDirectoryByAge flow currently does a
check-then-delete on directories (pruneDirectoryByAge + fs.rmdir) and rethrows
non-ENOENT errors, which causes flaky failures when rmdir races with concurrent
writers and raises ENOTEMPTY/EBUSY; update the deletion logic to catch ENOTEMPTY
and EBUSY, retry rmdir a few times with short backoff (e.g., exponential/backoff
up to N attempts) before giving up, ensuring removed count and recursion still
behave correctly; update the catch block that currently only treats ENOENT
specially to also swallow ENOTEMPTY/EBUSY until retries are exhausted and only
rethrow other errors; add a vitest regression that mocks/simulates concurrent
repopulation and a windows-style EBUSY from fs.rmdir (e.g., first call throws
EBUSY, subsequent call succeeds, and another test where a concurrent writer
re-adds a file so rmdir ultimately fails) to assert pruneDirectoryByAge handles
retries and doesn't abort retention, and ensure tests verify no logs emit
sensitive tokens/emails.
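The retry-on-transient-code pattern the prompt describes can be sketched generically; `retryOnCodes` is a hypothetical helper (the real change would live inside pruneDirectoryByAge), and `sleep` is injected so vitest fake timers can drive it deterministically:

```typescript
type ErrnoLike = { code?: string };

// Sketch: retry an fs operation while it fails with a retryable errno
// (e.g. ENOTEMPTY/EBUSY from rmdir), backing off exponentially, and
// rethrow once attempts are exhausted or the code is not retryable.
export async function retryOnCodes<T>(
  op: () => Promise<T>,
  retryableCodes: ReadonlySet<string>,
  maxAttempts: number,
  sleep: (ms: number) => Promise<void>,
  baseDelayMs = 10,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (error) {
      const code = (error as ErrnoLike).code;
      if (attempt + 1 >= maxAttempts || code === undefined || !retryableCodes.has(code)) {
        throw error;
      }
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

Wrapping only the `rmdir` call this way lets the prune keep its removed-count and recursion behavior while absorbing races with concurrent writers.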
In `@lib/request/fetch-helpers.ts`:
- Around line 667-673: handleSuccessResponseDetailed returns converted.response
from convertSseToJsonWithDetails, but convertSseToJsonWithDetails (in
response-handler.ts) currently maps SSE parse failures to the upstream
response.status which can be 200; update convertSseToJsonWithDetails so that
when it constructs a parse-failure response it sets status to 502 and a suitable
error body instead of using response.status, then ensure
handleSuccessResponseDetailed (in fetch-helpers.ts) continues to use
converted.response; add a vitest regression in test/fetch-helpers.test.ts that
sends a malformed SSE stream with an upstream 200 and asserts the final response
status is 502; mention the changed test in the commit message and, per lib/**
guidelines, ensure any new queue logic added/modified around this flow properly
handles EBUSY and 429 retry/backoff scenarios.
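The parse-failure mapping being requested can be sketched with the global `Response` (available in Node 18+); the error-body field names here are illustrative, not the module's actual contract:

```typescript
// Sketch: a malformed SSE body behind an upstream 200 becomes a 502 with a
// JSON error payload, instead of inheriting response.status from upstream.
export function sseParseFailureResponse(detail: string): Response {
  const body = JSON.stringify({
    error: { type: "upstream_parse_error", message: `SSE parse failure: ${detail}` },
  });
  return new Response(body, {
    status: 502,
    headers: { "content-type": "application/json" },
  });
}
```

502 is the conventional status for "upstream returned something unusable", which is why the regression asserts it even when the upstream status was 200.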
In `@lib/secrets-crypto.ts`:
- Around line 26-28: The deriveAesKey function currently calls scryptSync(input,
salt, AES_KEY_LENGTH) which blocks the event loop; update the code or docs:
either (a) replace scryptSync with the async scrypt and make deriveAesKey return
a Promise (and update its callers such as decryptSecret to await it), or (b) add
a clear comment above deriveAesKey documenting that scryptSync is blocking
(~50-100ms) and recommend using the async variant or caching derived keys for
hot paths; reference deriveAesKey, scryptSync, AES_KEY_LENGTH, and decryptSecret
in your change so reviewers can verify caller updates or the added
documentation.
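Option (a) can be sketched by promisifying the same scrypt call; `AES_KEY_LENGTH` is assumed to be 32 (AES-256) based on the review context, and callers like `decryptSecret` would need to await the result:

```typescript
import { scrypt } from "node:crypto";

const AES_KEY_LENGTH = 32; // assumed: 32-byte key for AES-256

// Non-blocking variant of the key derivation: same scrypt parameters,
// promisified so the ~50-100ms KDF does not stall the event loop.
export function deriveAesKeyAsync(input: string, salt: Buffer): Promise<Buffer> {
  return new Promise((resolve, reject) => {
    scrypt(input, salt, AES_KEY_LENGTH, (err, key) => (err ? reject(err) : resolve(key)));
  });
}
```

If hot paths derive the same key repeatedly, caching the derived key by `(input, salt)` is the complementary mitigation the comment in option (b) would recommend.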
In `@lib/shutdown.ts`:
- Around line 31-33: The current runCleanup() directly awaits the shared promise
cleanupInFlight which lets later callers bypass their own timeout if the first
caller timed out; change the in-flight branch to wait with the same bounded
timeout logic used for fresh cleanups (i.e., race cleanupInFlight against the
existing timeout promise instead of awaiting it unconditionally), ensure
cleanupInFlight is left intact if it still runs, and update/add a vitest
regression in test/shutdown.test.ts that simulates a hanging cleanup where call
`#1` times out and verifies call `#2` also returns after its timeout; reference the
runCleanup function and the cleanupInFlight symbol in the changes and tests.
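The bounded-wait-on-shared-promise shape can be sketched as a standalone helper (names are illustrative; the real fix would live inside runCleanup around `cleanupInFlight`): every caller races the shared in-flight promise against its own timeout, so a second caller is not stuck behind a hung first cleanup.

```typescript
// Sketch: race shared in-flight work against a per-caller timeout, leaving
// the underlying promise intact if it is still running.
export async function awaitBounded<T>(
  work: Promise<T>,
  timeoutMs: number,
  onTimeout: T,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(onTimeout), timeoutMs);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer); // don't hold the event loop open after the race settles
  }
}
```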
In `@lib/storage.ts`:
- Around line 345-373: The current fallback releaseStorageLockFallback
unconditionally force-removes the lock file (called from withAccountFileLock
when lock.release() throws), which can delete a lock owned by another process;
change the fallback so it does NOT blindly unlink: instead, on release failure
read/inspect the lock metadata (e.g., the lock file contents or lock token
returned by acquireFileLock) and only remove the file if you can prove ownership
(or better, avoid removing at all and only log the failure), update
withAccountFileLock to pass any lock identity/token from acquireFileLock into
the fallback path so ownership can be verified, and ensure
releaseStorageLockFallback no longer calls fs.rm(..., {force:true}) without
ownership proof. Also add a vitest regression that mocks lock.release() throwing
and asserts fs.rm is not called (or that removal only occurs when ownership is
verified) and update tests to cover Windows/EBUSY-like behavior and logging
checks to avoid leaking sensitive info.
In `@lib/telemetry.ts`:
- Around line 82-85: Add Windows transient errno coverage by including "EACCES"
and "EAGAIN" in the RETRYABLE_FILE_OP_CODES Set (alongside existing "EBUSY",
"EPERM", "ENOTEMPTY") in lib/telemetry.ts so the existing retry plumbing that
uses FILE_OP_MAX_ATTEMPTS and FILE_OP_BASE_DELAY_MS will retry on common Windows
lock errors; update test/telemetry.test.ts to add cases that simulate EACCES and
EAGAIN during telemetry file write/rotation to assert retries occur and no data
is dropped, and ensure added tests do not log secrets (tokens/emails).
In `@lib/unified-settings.ts`:
- Around line 217-228: Add a brief inline comment above the retry loop that uses
Atomics.wait to explain that the synchronous sleep is intentionally blocking the
event loop because this code path uses fs.renameSync (renameSync) and callers
expect synchronous behavior; mention that this is deliberate for the sync path,
reference UNIFIED_SETTINGS_PATH/renameSync and the isRetryableFsError retry
logic, and note that developers should avoid converting this to async without
adjusting callers or behavior.
In `@scripts/benchmark-runtime-path.mjs`:
- Around line 36-50: The removeWithRetry function calls an undefined sleep,
causing ReferenceError on retry; fix by adding a proper async backoff helper
(e.g., implement or import a sleep/msDelay function that returns a Promise
resolved after given ms) and use that from removeWithRetry, ensuring the symbol
name used matches the call (sleep) or update the call to the actual helper name;
also add a unit/regression test that mocks rm to throw a retryable code
(EBUSY/EPERM/ENOTEMPTY) then succeeds and assert that removeWithRetry retries
(no exception) and that the delay function is invoked (or time advanced via fake
timers) to prevent future regressions.
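The missing helper is a one-liner; a minimal sketch of the promise-based `sleep` the retry path expects:

```typescript
// Promise-based delay used between remove retries in the benchmark script.
export function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
```

With fake timers in the regression test, `vi.advanceTimersByTimeAsync` can drive this without real waiting.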
In `@test/authorization.test.ts`:
- Around line 39-52: Update the test to match the new default role: when
deleting process.env.CODEX_AUTH_ROLE assert getAuthorizationRole() returns
"viewer" (not "admin") and assert authorizeAction("secrets:rotate").allowed is
false; keep the same setup/teardown around process.env but change the expected
role string and the expected authorization boolean to reflect the new default in
getAuthorizationRole and authorizeAction.
In `@test/background-jobs.test.ts`:
- Around line 26-49: Add a concurrent DLQ write regression test that ensures the
file-locking in runBackgroundJobWithRetry correctly serializes writes: import
runBackgroundJobWithRetry and getBackgroundJobDlqPath from
lib/background-jobs.js, spawn several parallel runBackgroundJobWithRetry calls
that each fail on first attempt (maxAttempts: 1) so they write to the same DLQ,
await Promise.all to finish, then read the DLQ with fs.readFile and assert the
number of lines equals the number of jobs; this will catch regressions in the
DLQ write serialization implemented in runBackgroundJobWithRetry.
In `@test/config-save.test.ts`:
- Around line 76-140: The test never asserts that the injected transient read
errors actually executed, so add an assertion that the helper variable
transientReadFailures hit the expected failure count (e.g., === 2) after calling
savePluginConfig to guarantee the readFileSync mock path ran; reference the
transientReadFailures variable and the mocked readFileSync (used when invoking
savePluginConfig) and place the expectation just after the try/finally block
(before reading/parsing the config) so the test fails if the contention branch
in savePluginConfig never executed.
In `@test/fetch-helpers.test.ts`:
- Around line 530-539: The test for SSE parse failures in detailed mode (the it
block using handleSuccessResponseDetailed) checks the error body but doesn't
assert the response status; update this test to also assert that
result.response.status (or the returned Response.status) equals 502 to ensure
parse failures map to HTTP 502 in the non-streaming path exercised by
handleSuccessResponseDetailed and the logic around the SSE parsing in the
fetch-helpers code (referenced at the parse/fallback handling near the code path
invoked by handleSuccessResponseDetailed). Ensure the assertion is added
alongside the existing body shape checks so the contract is locked.
In `@test/idempotency.test.ts`:
- Around line 97-111: The test for checkAndRecordIdempotencyKey currently only
verifies immediate replay and can pass with an unsafe 1ms fallback; update the
"normalizes non-finite ttl values to a safe minimum" test to advance fake time a
few milliseconds after the first successful record and then assert replay still
returns true. Concretely, after the first expect(...).resolves.toEqual({
replayed: false }) call, call vi.advanceTimersByTime(5) (or vi.setSystemTime(new
Date(Date.now() + 5))) and then call
checkAndRecordIdempotencyKey("codex.auth.rotate-secrets", "key-non-finite",
Number.NaN) again and assert .resolves.toEqual({ replayed: true }); keep
vi.useFakeTimers()/vi.useRealTimers() as-is.
In `@test/index-retry.test.ts`:
- Around line 334-347: The test currently simulates single-flight behavior
inside the mock (refreshAndUpdateTokenMock) by using refreshInFlight, which
hides whether the plugin under test actually dedups concurrent refreshes; remove
refreshInFlight from the mock so the mock simply calls refreshEndpoint() and
returns its value, then implement the single-flight expectation by invoking the
real integration seam (the plugin method that triggers refresh) concurrently in
the test and asserting refreshAndUpdateTokenMock was called only once (check
call counts and timing), and adjust the related assertions around the calls at
the integration boundaries referenced near refreshAndUpdateTokenMock,
refreshEndpoint and the concurrent-refresh test blocks (lines ~318 and ~370-380)
so the test proves plugin single-flight deterministically under Vitest.
In `@test/index.test.ts`:
- Around line 530-587: The tests set persistent mock implementations for
configModule.loadPluginConfig and configModule.getFetchTimeoutMs using
vi.mocked(...).mockReturnValue(...) which can leak between tests because the
suite only calls vi.clearAllMocks(); replace that with resetting mock
implementations between tests (e.g., call vi.resetAllMocks() or
vi.restoreAllMocks() in the shared beforeEach/afterEach) or switch those
specific mocks to mockImplementationOnce so each test provides its own return
value; ensure changes target the setup that currently calls vi.clearAllMocks()
and update tests referencing loadPluginConfig and getFetchTimeoutMs to rely on
per-test mock setups.
In `@test/refresh-guardian.test.ts`:
- Around line 75-79: The timing assertions in tests using
missingManagerGuardian.getStats() (and the other test locations) are too
permissive (>= 0) and should assert deterministic branch outcomes: update the
assertions for missingManagerGuardian and similar instances to expect exact runs
counts (e.g., 0 or the known countedRun value), exact maxTickDurationMs and
avgTickDurationMs (0 when no run), and assert cumulativeTickDurationMs equals 0
for the early-return manager-missing branch; for the path exercising the
finally-based timing update (the countedRun gating in lib/refresh-guardian.ts
around lines handling countedRun and the finally block), assert non-zero
positive durations and that avgTickDurationMs equals cumulativeTickDurationMs /
runs, and adjust the tests at the other referenced spots (lines noted in the
review) to mirror these precise checks so tests deterministically cover the two
branches (early return vs finally timing update).
In `@test/retry-governor.test.ts`:
- Around line 95-108: Add a test that passes all-negative values to
decideRetryAllAccountsRateLimited to ensure clampNonNegative is applied to every
numeric input before evaluation; specifically call
decideRetryAllAccountsRateLimited with negative accountCount, waitMs, maxWaitMs,
currentRetryCount, accumulatedWaitMs, and absoluteCeilingMs (keep maxRetries
positive) and assert the outcome matches the clamped result (e.g., accountCount
clamps to 0 → reason "no-accounts"). This verifies the clamp-then-evaluate path
and prevents regressions around negative-field handling in
decideRetryAllAccountsRateLimited and the clampNonNegative logic.
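The clamp-then-evaluate contract the test pins can be stated as a tiny pure function; this is a sketch of the assumed `clampNonNegative` semantics, not the library's verified source:

```typescript
// Assumed semantics: negative and non-finite inputs collapse to 0 before
// the rate-limit decision logic evaluates them.
export function clampNonNegative(value: number): number {
  return Number.isFinite(value) && value > 0 ? value : 0;
}
```

Under this contract, a negative `accountCount` clamps to 0, which is why the expected outcome in the proposed test is the "no-accounts" reason.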
In `@test/settings-hub-utils.test.ts`:
- Around line 308-316: The test currently asserts the object returned by
persistBackendConfigSelection but that helper can return the input fallback on
failure (see persistBackendConfigSelection in settings-hub.ts), so change the
test to verify on-disk persistence: after calling
persistBackendConfigSelection(selected, "backend"), reload the stored backend
config from the real persistence layer (e.g., via the test API's read/load
method or by reading the config file used by the settings hub) and assert that
retryAllAccountsAbsoluteCeilingMs is 0 in the reloaded data; this ensures
failures in persistBackendConfigSelection don't make the test falsely pass.
In `@test/telemetry.test.ts`:
- Around line 220-242: The test "serializes concurrent telemetry writes without
dropping events" can miss events due to log rotation; update the test to read
events via queryTelemetryEvents (the helper in lib/telemetry.ts that reads
rotated files) instead of directly reading getTelemetryLogPath(), or
alternatively temporarily set maxFileSizeBytes to a larger value for this test
so rotation won't occur; modify the test to call queryTelemetryEvents() (or
adjust the telemetry config used in this test) and assert the returned array
length equals total and that every entry.details.token === "***MASKED***".
In `@test/tool-utils.test.ts`:
- Around line 623-642: Update the test to include Number.NEGATIVE_INFINITY in
the input anyOf and assert post-flattening behavior from cleanupToolDefinitions:
add Number.NEGATIVE_INFINITY to the anyOf list for the coercedValues parameter,
change the expected enum on props.coercedValues to include three nulls ([null,
null, null]), and add an assertion that the original anyOf was removed (e.g.,
props.coercedValues.anyOf is undefined or not present) to prove flattening
removed the anyOf wrapper; reference the cleanupToolDefinitions call and the
coercedValues property to locate the assertion changes.
---
Outside diff comments:
In `@docs/reference/settings.md`:
- Around line 184-192: Update the Validation section to replace the stale
CLI-only checks with the current verification gates and npm scripts: list "npm
run verify", "npm run verify:ci", and "npm test -- test/documentation.test.ts"
as the primary gates, and add optional local checks "npm run doctor:dev" and
"npm run setup:dev"; also add a brief upgrade-note pointer to the
release/upgrade documentation indicating these new scripts and any changed CLI
flags (ensure the section references the exact script names and the
documentation test file to match current workflows).
In `@index.ts`:
- Around line 797-857: The concurrent worker loop using shared state
(workerCount, nextHydrateIndex, queuedRefresh, accountsToHydrate,
refreshedCount, failedCount, updatedAccountCount) needs a deterministic vitest
regression: add a test in test/** that sets HYDRATE_EMAILS_MAX_CONCURRENCY to a
known value, provides a fixed accountsToHydrate array, and mocks queuedRefresh
to return a controlled sequence of mixed outcomes (success with varied
access/refresh/expires, failure) while ensuring each mock call is deterministic
(use a seeded sequence or a controlled promise queue rather than timing/flaky
delays). The test must assert each account was processed exactly once and that
refreshedCount/failedCount/updatedAccountCount match expected values, and must
avoid mocking real secrets or skipping assertions.
In `@lib/auth/auth.ts`:
- Around line 234-253: The refreshAccessToken function passes options.timeoutMs
directly to fetchWithTimeout allowing 0, negative, or NaN to bypass the default;
normalize and clamp timeoutMs before the fetch: read
RefreshAccessTokenOptions.timeoutMs, coerce to a finite number, and if it's NaN
or <= 0 fall back to OAUTH_REFRESH_TIMEOUT_MS (or apply the same min/max clamp
logic used in the exchange path), then pass that normalized value to
fetchWithTimeout; update refreshAccessToken and related references
(refreshAccessToken, RefreshAccessTokenOptions, fetchWithTimeout,
OAUTH_REFRESH_TIMEOUT_MS) accordingly and add Vitest cases asserting behavior
for timeoutMs values 0, -1, and NaN in the refresh path, ensuring tests avoid
logging tokens/emails and reference any changed tests in the commit message.
In `@lib/storage.ts`:
- Around line 1118-1174: The migration path currently persists migrated data
(persistMigration) but still calls rememberKnownStorageRevision(path,
rawChecksum) using the pre-migration rawChecksum, causing false ECONFLICTs in
saveAccountsUnlocked; fix by updating the stored checksum after a successful
persistence: after persistMigration(normalized) either compute the actual
on-disk checksum for the migrated content or reload the written file (e.g., via
loadAccountsFromPath(path)) to obtain the new rawChecksum and pass that value
into rememberKnownStorageRevision instead of the original rawChecksum; update
code paths around persistMigration, the promotion branch, and the final
rememberKnownStorageRevision call accordingly (functions/methods:
persistMigration, loadAccountsFromPath, rememberKnownStorageRevision,
saveAccountsUnlocked), and add a vitest regression that performs a v1->v3
auto-migration then immediately calls saveAccountsUnlocked to assert no
ECONFLICT is thrown.
In `@test/server.unit.test.ts`:
- Around line 328-350: The test "should return null after 5 minute timeout" uses
vi.useFakeTimers() but doesn't guarantee restoring real timers on failure; wrap
the fake-timer setup and all test logic in a try/finally so vi.useRealTimers()
is always called. Specifically, in the test block containing
startLocalOAuthServer, result.waitForCode(), vi.advanceTimersByTimeAsync, and
assertions, move vi.useRealTimers() into a finally clause to ensure cleanup even
if an assertion throws; reference the test name,
vi.useFakeTimers/vi.useRealTimers, startLocalOAuthServer, and result.waitForCode
to locate the code to change.
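The try/finally shape this comment asks for can be factored into a small helper; `withFakeTimers` and the `timers` shim below are illustrative names, not part of vitest:

```typescript
// Hypothetical helper wrapping the pattern: install fake timers, run the
// test body, and always restore real timers, even when an assertion throws.
async function withFakeTimers<T>(
  timers: { useFakeTimers(): void; useRealTimers(): void },
  body: () => Promise<T>,
): Promise<T> {
  timers.useFakeTimers();
  try {
    return await body();
  } finally {
    // Runs on both success and failure, so a throwing assertion can no
    // longer leak fake timers into later tests.
    timers.useRealTimers();
  }
}
```

In the actual test, `vi` would be passed as the `timers` object and the existing `startLocalOAuthServer` / `waitForCode` logic would move into `body`.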
---
Duplicate comments:
In `@lib/authorization.ts`:
- Around line 30-36: The getRoleFromEnv function currently defaults to "admin"
when CODEX_AUTH_ROLE is unset; change it so the default fallback is "viewer"
(least-privilege) and preserve the existing validation that invalid values map
to "viewer"; update the logic in getRoleFromEnv to initialize raw to "viewer"
(or check for undefined and return "viewer") instead of "admin", and update the
test expectations in test/authorization.test.ts (lines checking behavior when
env is deleted) to expect "viewer" rather than "admin".
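A sketch of the least-privilege fallback, with an assumed three-role union (the actual role set and validation live in `lib/authorization.ts`):

```typescript
type Role = "admin" | "operator" | "viewer"; // assumed role union
const VALID_ROLES: ReadonlySet<string> = new Set(["admin", "operator", "viewer"]);

function getRoleFromEnv(env: Record<string, string | undefined>): Role {
  // Unset falls back to "viewer" (least privilege); invalid values keep
  // the existing behavior of also mapping to "viewer".
  const raw = (env.CODEX_AUTH_ROLE ?? "viewer").trim().toLowerCase();
  return VALID_ROLES.has(raw) ? (raw as Role) : "viewer";
}
```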
In `@lib/file-lock.ts`:
- Around line 184-187: The current release flow marks the local flag released =
true before calling releaseOwnedLock/releaseOwnedLockSync, which prevents
retries if those functions throw; change the logic so released is only set to
true after the release call completes successfully (i.e., call await
releaseOwnedLock(path, metadata.ownerId) inside a try/catch and set released =
true in the try after success, rethrow or handle errors without flipping the
flag), and apply the same pattern for the sync variant; update/ add vitest cases
that simulate transient failures (EBUSY/429) to assert retries and eventual
cleanup, and ensure any added logging around failures does not leak tokens or
emails.
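The ordering fix in isolation (names are illustrative; the real handle in `lib/file-lock.ts` also carries the lock path and owner metadata):

```typescript
function makeLockHandle(releaseOwnedLock: () => Promise<void>) {
  let released = false;
  return {
    async release(): Promise<void> {
      if (released) return;
      // If this throws (e.g. transient EBUSY/EPERM on Windows), the flag
      // stays false and the caller can retry the release.
      await releaseOwnedLock();
      released = true; // only reached on success
    },
  };
}
```

The sync variant follows the same shape: call `releaseOwnedLockSync` first, flip the flag after it returns.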
In `@lib/idempotency.ts`:
- Around line 98-100: normalizeTtlMs currently maps invalid/non-finite TTLs to
1ms which weakens replay protection; change normalizeTtlMs to return the
configured fail-safe constant IDEMPOTENCY_TTL_MS when ttlMs is not a positive
finite number (keep Math.floor for valid values), update the function reference
in lib/idempotency.ts to import/use IDEMPOTENCY_TTL_MS, and adjust/add vitest
unit tests that assert fallback behavior (ensure tests cover non-finite,
negative, and NaN inputs) so the default is IDEMPOTENCY_TTL_MS rather than 1.
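A sketch of the fallback, with an assumed 24-hour value for `IDEMPOTENCY_TTL_MS` (the real constant in `lib/idempotency.ts` may differ):

```typescript
const IDEMPOTENCY_TTL_MS = 24 * 60 * 60 * 1000; // assumed fail-safe default

function normalizeTtlMs(ttlMs: unknown): number {
  const value = Number(ttlMs);
  // Non-finite, NaN, zero, and negative TTLs fall back to the fail-safe
  // constant instead of the old 1 ms floor that weakened replay protection.
  if (!Number.isFinite(value) || value <= 0) return IDEMPOTENCY_TTL_MS;
  return Math.floor(value); // keep the floor for valid values
}
```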
In `@test/data-retention.test.ts`:
- Around line 80-114: The test only asserts that enforceDataRetention throws an
EPERM error for the locked file; add coverage for the Windows transient case by
including an EBUSY variant (or parameterize to test both "EPERM" and "EBUSY") in
the mocked fs.unlink behavior inside the "rethrows non-ENOENT errors during
recursive prune" test, so the mockImplementation throws a NodeJS.ErrnoException
with code set to "EBUSY" for the locked.log path in addition to (or as a
separate subtest from) "EPERM", and update the expect(...).rejects.toMatchObject
to assert the corresponding error.code for each variant while restoring the spy
in the finally block.
In `@test/network.test.ts`:
- Around line 89-106: The test currently only asserts retry counts but not the
onRetry callback payload; update the test for fetchWithTimeoutAndRetry to pass a
mocked onRetry handler (e.g., vi.fn()) and assert it was called once with the
expected metadata (attempt number 1, the retry reason/status indicating 429 or a
response object with status 429, and the request URL and delay values), in
addition to the existing attempts/fetch call count assertions so the retry
reason/status reporting is validated deterministically.
In `@test/plugin-config.test.ts`:
- Around line 1052-1073: Add two unit tests that cover the config-path clamping
for getRetryAllAccountsAbsoluteCeilingMs when the environment var is unset: one
that sets PluginConfig.retryAllAccountsAbsoluteCeilingMs to a negative value and
asserts the getter returns 0, and another that sets it to a very large value and
asserts the getter returns the 24-hour hard maximum; ensure both tests clear
CODEX_AUTH_RETRY_ALL_ABSOLUTE_CEILING_MS (or leave it unset) so they exercise
the config branch in the getter.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 85817824-b9e5-472d-b088-bd378d68fd01
📒 Files selected for processing (76)
.github/workflows/ci.yml, .github/workflows/secret-scan.yml, .github/workflows/supply-chain.yml, docs/configuration.md, docs/development/CONFIG_FIELDS.md, docs/development/LOCAL_DEV.md, docs/development/TESTING.md, docs/reference/settings.md, docs/runbooks/operations.md, index.ts, lib/accounts.ts, lib/accounts/account-view.ts, lib/accounts/active-index.ts, lib/auth/auth.ts, lib/authorization.ts, lib/background-jobs.ts, lib/codex-manager.ts, lib/codex-manager/settings-hub.ts, lib/config.ts, lib/data-redaction.ts, lib/data-retention.ts, lib/file-lock.ts, lib/idempotency.ts, lib/network.ts, lib/prompts/codex.ts, lib/prompts/host-codex-prompt.ts, lib/quota-cache.ts, lib/refresh-queue.ts, lib/request/fetch-helpers.ts, lib/request/helpers/tool-utils.ts, lib/request/retry-governor.ts, lib/secrets-crypto.ts, lib/shutdown.ts, lib/storage.ts, lib/telemetry.ts, lib/unified-settings.ts, lib/utils.ts, scripts/benchmark-runtime-path-helpers.mjs, scripts/benchmark-runtime-path.mjs, scripts/check-runtime-budgets-helpers.mjs, scripts/check-runtime-budgets.mjs, scripts/license-policy-check.js, test/account-view.test.ts, test/accounts-edge.test.ts, test/active-index.test.ts, test/auth.test.ts, test/authorization.test.ts, test/background-jobs.test.ts, test/benchmark-runtime-path-helpers.test.ts, test/check-runtime-budgets-script.test.ts, test/codex-manager-cli.test.ts, test/codex-prompts.test.ts, test/config-save.test.ts, test/data-redaction.test.ts, test/data-retention.test.ts, test/fetch-helpers.test.ts, test/file-lock.test.ts, test/helpers/fs-retry.ts, test/idempotency.test.ts, test/index-retry.test.ts, test/index.test.ts, test/network.test.ts, test/plugin-config.test.ts, test/quota-cache.test.ts, test/refresh-guardian.test.ts, test/refresh-lease.test.ts, test/retry-governor.test.ts, test/scripts/doctor-dev.test.ts, test/secrets-crypto.test.ts, test/server.unit.test.ts, test/settings-hub-utils.test.ts, test/shutdown.test.ts, test/storage.test.ts, test/telemetry.test.ts, test/tool-utils.test.ts, test/utils.test.ts
Additional Comments (1)
Prompt To Fix With AI: This is a comment left during a code review.
Path: lib/quota-cache.ts
Line: 173-175
Comment:
**`existsSync` blocks the event loop and introduces a TOCTOU race**
`existsSync` is a synchronous syscall. on windows, antivirus can hold a filesystem handle on the quota-cache directory, causing `existsSync` to stall and block the Node event loop for hundreds of milliseconds. more importantly, this is a classic TOCTOU: another process could delete the file between the `existsSync` check and the `readCacheFileWithRetry` call. the outer `catch` (line 192) already handles all errors including `ENOENT`, so the guard is redundant.
removing it and letting the async path handle the absent-file case is both simpler and correct:
```suggestion
  try {
    const content = await readCacheFileWithRetry(QUOTA_CACHE_PATH);
```
the `readCacheFileWithRetry` loop only retries `EBUSY`/`EPERM`; `ENOENT` will fall through to the outer `catch` and return empty maps — identical observable behavior. no vitest change needed beyond verifying the existing "missing file" test still passes.
How can I resolve this? If you propose a fix, please make it concise.
Address outstanding review feedback for telemetry reliability and coverage:
- enforce telemetry source/outcome enum validation while parsing logs
- sort telemetry report events deterministically before summary/recent output
- await CLI telemetry lifecycle writes before command return/throw
- expand CLI/index/telemetry regressions (windows paths, deterministic time, exception paths, concurrent appends)
- document telemetry rollout/automation pointer in command reference
Co-authored-by: Codex <noreply@openai.com>
- address security/reliability review findings across auth, storage, telemetry, request, and shutdown paths
- pin supply-chain workflow actions and update config/release docs
- harden redaction, timeout/index clamping, cross-process telemetry writes, and retention cleanup retries
- add/adjust regression tests for CLI authz flows, retry behavior, locking, and edge-case normalization
Co-authored-by: Codex <noreply@openai.com>
Actionable comments posted: 23
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
lib/accounts.ts (1)

900-917: ⚠️ Potential issue | 🟠 Major: do not skip the fresh debounced save when the previous pending save rejects.

lib/accounts.ts:900-903 currently awaits `this.pendingSave` inside the main try block. If that prior promise rejects, execution jumps to catch and the new save at lib/accounts.ts:903-913 is never scheduled. This can drop a later state update.

Proposed fix:

```diff
 const doSave = async () => {
   try {
     if (this.pendingSave) {
-      await this.pendingSave;
+      try {
+        await this.pendingSave;
+      } catch (error) {
+        log.warn("Previous debounced save failed; continuing with fresh save", {
+          error: error instanceof Error ? error.message : String(error),
+        });
+      }
     }
     this.pendingSave = runBackgroundJobWithRetry({
       name: "accounts.saveToDiskDebounced",
       task: () => this.saveToDisk(),
```

Also add a vitest regression that simulates: first pending save fails, second debounced trigger still performs a new save attempt (`test/accounts-edge.test.ts` is a good fit).

As per coding guidelines:
lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/accounts.ts` around lines 900-917: the current code awaits this.pendingSave directly inside the try, so a rejection aborts scheduling the new debounced save; change the await to swallow the prior rejection (e.g., await this.pendingSave.catch(() => undefined)) before creating this.pendingSave so a rejected prior save doesn't jump to catch and prevent scheduling the new runBackgroundJobWithRetry; update the code paths around pendingSave, runBackgroundJobWithRetry and saveToDisk accordingly and add a vitest regression test (test/accounts-edge.test.ts) that simulates a failing first pending save and verifies the second debounced trigger still runs a new save attempt.

lib/storage.ts (2)
1578-1590: ⚠️ Potential issue | 🟠 Major: flagged storage writes still lack cross-process lock protection.

Accounts are serialized with a file lock in lib/storage.ts:1389-1393, but flagged writes at lib/storage.ts:1579 only use the in-process mutex. rotateStoredSecretEncryption then rewrites flagged storage at lib/storage.ts:1761, so concurrent processes can still do last-write-wins and lose updates. The rename retry helps Windows EBUSY, but it does not prevent a concurrent clobber.

Proposed fix:

```diff
 export async function saveFlaggedAccounts(storage: FlaggedAccountStorageV1): Promise<void> {
-  return withStorageLock(async () => {
-    const path = getFlaggedAccountsPath();
+  const path = getFlaggedAccountsPath();
+  return withAccountFileLock(path, () => withStorageLock(async () => {
     const uniqueSuffix = `${Date.now()}.${Math.random().toString(36).slice(2, 8)}`;
     const tempPath = `${path}.${uniqueSuffix}.tmp`;
@@
-  });
+  }));
 }
```

Please add a vitest regression for concurrent flagged writers across processes (test/storage.test.ts:1) and keep a Windows EBUSY case in the same area.

As per coding guidelines:
lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios.

Also applies to: 1760-1762
1110-1116: ⚠️ Potential issue | 🟠 Major: do not overwrite the known revision with the pre-migration checksum.

Line 1114 (lib/storage.ts:1114) can persist migrated content, but line 1161 (lib/storage.ts:1161) then stores rawChecksum from the pre-migration file. This can trigger a false ECONFLICT on the next persist in the same transaction path.

Proposed fix:

```diff
 async function loadAccountsInternal(
   persistMigration: ((storage: AccountStorageV3) => Promise<void>) | null,
 ): Promise<AccountStorageV3 | null> {
   const path = getStoragePath();
+  let persistedDuringLoad = false;
   await cleanupStaleRotatingBackupArtifacts(path);
   const migratedLegacyStorage = persistMigration
     ? await migrateLegacyProjectStorageIfNeeded(persistMigration)
     : null;
   try {
     const { normalized, storedVersion, schemaErrors, rawChecksum } = await loadAccountsFromPath(path);
@@
     if (normalized && storedVersion !== normalized.version) {
       log.info("Migrating account storage to v3", { from: storedVersion, to: normalized.version });
       if (persistMigration) {
         try {
           await persistMigration(normalized);
+          persistedDuringLoad = true;
         } catch (saveError) {
           log.warn("Failed to persist migrated storage", { error: String(saveError) });
         }
       }
     }
@@
-    rememberKnownStorageRevision(path, rawChecksum);
+    if (!persistedDuringLoad) {
+      rememberKnownStorageRevision(path, rawChecksum);
+    }
     return normalized;
```

Please add a vitest regression around migration + transaction persist (e.g. test/storage.test.ts:1) to assert no false conflict after the migration write.

As per coding guidelines:
lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest).

Also applies to: 1161-1162
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/storage.ts` around lines 1110 - 1116, When migrating account storage in the block that checks `if (normalized && storedVersion !== normalized.version)` avoid overwriting the known revision/checksum with the pre-migration `rawChecksum`: after a successful `persistMigration(normalized)` ensure the in-memory or to-be-persisted checksum used later (the `rawChecksum` value written around the code that sets `rawChecksum` at/near the later store step) is replaced with the post-migration checksum (`normalized.checksum`) or cleared so it doesn’t reintroduce the old checksum and cause a false ECONFLICT; update the code that writes `rawChecksum` (the same symbol referenced at lines ~1161-1162) to respect the migrated checksum state and add a vitest regression (e.g., test/storage.test.ts) that performs migration + transaction persist to assert no ECONFLICT on the subsequent persist and reference this test in the change.
♻️ Duplicate comments (4)
test/refresh-guardian.test.ts (1)
74-80: ⚠️ Potential issue | 🟠 Major: separate manager-missing vs no-enabled timing expectations more strictly.

"manager missing" and "no enabled accounts" hit different paths: lib/refresh-guardian.ts:83-86 exits before timing setup, while lib/refresh-guardian.ts:189-201 should still write lastTickDurationMs in finally. In test/refresh-guardian.test.ts:91-97, asserting lastTickDurationMs === 0 does not prove that the finally path still updates timing for countedRun=false.

Proposed deterministic tightening:

```diff
@@
   const disabledGuardian = new RefreshGuardian(() => disabledManager, {
     intervalMs: 5_000,
     bufferMs: 60_000,
   });
+  const nowSpy = vi.spyOn(performance, "now");
+  nowSpy.mockReturnValueOnce(100).mockReturnValueOnce(112);
   await expect(disabledGuardian.tick()).resolves.toBeUndefined();
   expect(refreshExpiringAccountsMock).not.toHaveBeenCalled();
   const disabledStats = disabledGuardian.getStats();
   expect(disabledStats.runs).toBe(0);
-  expect(disabledStats.lastTickDurationMs).toBe(0);
+  expect(disabledStats.lastTickDurationMs).toBe(12);
   expect(disabledStats.maxTickDurationMs).toBe(0);
   expect(disabledStats.cumulativeTickDurationMs).toBe(0);
   expect(disabledStats.avgTickDurationMs).toBe(0);
```

As per coding guidelines: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.
Also applies to: 91-97
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/refresh-guardian.test.ts` around lines 74-80: the test currently conflates the "manager missing" path (which exits before timing setup) with the "no enabled accounts" path (which still executes the finally block and should update timing). Split the assertions into two deterministic checks: for the missing-manager case (use missingManagerGuardian.tick() and missingManagerGuardian.getStats()) assert runs==0 and that timing fields remain at their initialized zero/unchanged values; for the no-enabled-accounts case invoke the guardian path that reaches the finally block (the code path in lib/refresh-guardian.ts that sets timing when countedRun=false) and assert that lastTickDurationMs and related timing fields are updated (e.g., >0 or non-zero) to prove the finally timing logic ran. Ensure you reference and update the specific test invocations of missingManagerGuardian.tick() and getStats(), and add a separate test or adjust the existing one to explicitly exercise the countedRun=false branch so timing is deterministically checked.

test/file-lock.test.ts (1)
10-29: 🛠️ Refactor suggestion | 🟠 Major: reuse the shared fs retry helper instead of redefining it here.

This local removeWithRetry duplicates logic already centralized in test/helpers/fs-retry.ts, which reintroduces drift risk (test/file-lock.test.ts:10-29, test/helpers/fs-retry.ts:1).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/file-lock.test.ts` around lines 10-29: the test defines a local removeWithRetry that duplicates the centralized helper; delete this local function and import and use the shared helper instead (replace the local removeWithRetry with the exported helper from the test helpers module), updating any calls in file-lock.test.ts to call the imported helper; ensure the import name matches the helper's export (e.g., removeWithRetry or the actual exported identifier) and remove the RETRYABLE_REMOVE_CODES constant which is now handled by the shared implementation.

lib/file-lock.ts (1)
215-219: ⚠️ Potential issue | 🟠 Major: release state is set too early.

In lib/file-lock.ts:215-219 and lib/file-lock.ts:296-300, released flips before releaseOwnedLock* succeeds. If release throws a transient EBUSY/EPERM on Windows, this handle can't retry and the lock can linger until stale cleanup.

Proposed fix:

```diff
-  let released = false;
+  let released = false;
+  let releasing: Promise<void> | null = null;
   return {
     path,
     release: async () => {
       if (released) return;
-      released = true;
-      await releaseOwnedLock(path, metadata.ownerId);
+      if (releasing) return releasing;
+      releasing = (async () => {
+        await releaseOwnedLock(path, metadata.ownerId);
+        released = true;
+      })();
+      try {
+        await releasing;
+      } finally {
+        if (!released) releasing = null;
+      }
     },
   };
@@
-  let released = false;
+  let released = false;
   return {
     path,
     release: () => {
       if (released) return;
-      released = true;
       releaseOwnedLockSync(path, metadata.ownerId);
+      released = true;
     },
   };
```

Verification script:

```bash
#!/bin/bash
rg -n "let released|released = true|releaseOwnedLock|releaseOwnedLockSync" lib/file-lock.ts -C3
rg -n "replacement-race|release.*ebusy|release.*eperm|stale" test/file-lock.test.ts -i -C2
```

As per coding guidelines:
lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

Also applies to: 296-300
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/file-lock.ts` around lines 215-219: the flag released is being set before the actual release call completes, which prevents retries on transient errors (EBUSY/EPERM); update the async release closure in the returned locker (the release function that calls releaseOwnedLock) so that released = true is assigned only after releaseOwnedLock(path, metadata.ownerId) resolves successfully (and similarly move the assignment in the synchronous variant that calls releaseOwnedLockSync), propagate or rethrow errors so callers can retry on transient failures, and add/adjust vitest cases (tests referencing replacement-race, release.*ebusy, release.*eperm, stale in file-lock.test.ts) to assert retry behavior and that logging does not leak tokens/emails when failures are logged.

lib/codex-manager.ts (1)
152-165: ⚠️ Potential issue | 🟠 Major: guard recordAuthAttempt in the same pre-check block to avoid batch aborts.

lib/codex-manager.ts:164 can still throw after the current checkAuthRateLimit catch, which means one account can still abort multi-account flows. Keep both pre-check calls in one guarded block and convert 429/EBUSY-style failures into a TokenResult failure so callers continue.

Proposed fix:

```diff
 async function queuedRefreshWithAuthRateLimit(
   refreshToken: string,
   context: { accountId?: string; email?: string },
 ): Promise<TokenResult> {
   const rateLimitKey = buildRefreshRateLimitKey(refreshToken, context.accountId, context.email);
   try {
     checkAuthRateLimit(rateLimitKey);
+    recordAuthAttempt(rateLimitKey);
   } catch (error) {
     if (error instanceof AuthRateLimitError) {
       return {
         type: "failed",
         reason: "unknown",
         message: error.message,
       };
     }
     throw error;
   }
-  recordAuthAttempt(rateLimitKey);
   const result = await queuedRefresh(refreshToken);
```

Please add a vitest regression in test/codex-manager-cli.test.ts:1 that stubs a pre-check recording failure (including Windows-style EBUSY) for one account and asserts other accounts still process.

As per coding guidelines:
lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/codex-manager.ts` around lines 152 - 165, Move recordAuthAttempt(rateLimitKey) into the same guarded pre-check block with checkAuthRateLimit(rateLimitKey) inside the try/catch in codex-manager (so both pre-checks are atomic) and, inside the catch that currently handles AuthRateLimitError, also catch errors thrown by recordAuthAttempt (including filesystem EBUSY and HTTP 429-style errors) and return a TokenResult-style failure object ({ type: "failed", reason: "unknown", message: error.message }) instead of letting the error bubble and abort multi-account flows; update the code paths around queuedRefresh(refreshToken) to only run when the pre-check block succeeds. Add a vitest regression in test/codex-manager-cli.test.ts that stubs recordAuthAttempt to throw an AuthRateLimitError and an EBUSY-style error for one account and asserts other accounts still proceed (i.e., their queuedRefresh runs), covering Windows EBUSY and 429-like failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/supply-chain.yml:
- Around line 1-11: Add a workflow-level concurrency block to supply-chain.yml
to prevent overlapping runs: at the top-level (before the jobs key) add a
concurrency key with a unique group name (e.g., "supply-chain-${{ github.ref
}}") and set cancel-in-progress: true so a new run cancels any in-flight run for
the same branch/ref; ensure the concurrency block is placed alongside name and
on, not inside any job.
- Around line 3-10: The sca-and-license and sbom jobs are currently running for
pull_request and execute npm/npx on untrusted PR code; update the workflow so
those two jobs (sca-and-license and sbom) are gated to run only on push and
schedule (not pull_request) while leaving dependency-review running on
pull_request; implement this by adding a job-level `if` condition to those jobs
(e.g., if: github.event_name != 'pull_request') or by splitting triggers so the
global on: keeps pull_request for
dependency-review but sca-and-license and sbom are scoped to push and schedule
only.
- Around line 72-73: The workflow step "Generate CycloneDX SBOM" currently
invokes an unpinned CLI via the command string `npx --yes
@cyclonedx/cyclonedx-npm --output-file sbom.cdx.json --omit dev`; pin that
package to a specific released version (e.g., change to `@4.2.0`) in the same
command to ensure reproducible SBOMs, and update the "Generate CycloneDX SBOM"
step metadata in .github/workflows/supply-chain.yml accordingly; additionally
add a minimal CI job or test that runs the SBOM generation and asserts the
produced sbom.cdx.json contains required top-level fields (e.g.,
bomFormat/version) to catch future version-induced breaks.
In `@docs/development/CONFIG_FIELDS.md`:
- Line 65: The docs table lists retryAllAccountsMaxRetries as defaulting to
Infinity but runtime sets it to 12; update the documentation entry for
retryAllAccountsMaxRetries in CONFIG_FIELDS.md to reflect the actual runtime
default of 12 and ensure any related fields (e.g.,
retryAllAccountsAbsoluteCeilingMs) remain accurate; locate the doc row
referencing retryAllAccountsMaxRetries and change its default value to 12 so it
matches the value set in lib/config.ts (where retryAllAccountsMaxRetries is
initialized).
In `@docs/runbooks/operations.md`:
- Around line 45-46: The runbook list of env vars omits that CODEX_AUTH_ROLE
defaults to "viewer" when unset; update the operations.md entry near the
CODEX_AUTH_ROLE / CODEX_AUTH_BREAK_GLASS bullet list to explicitly state "If
CODEX_AUTH_ROLE is not set, it defaults to 'viewer' (operator-facing behavior
change)" and mention CODEX_AUTH_BREAK_GLASS is for emergency windows only; also
add a short upgrade note and mirror this default in README and SECURITY docs and
any npm scripts or CLI docs that reference auth role behavior so all docs stay
consistent.
- Around line 77-79: Add a Windows-specific subsection to the filesystem lock
contention triage that explains how to identify and resolve
EBUSY/replacement-race scenarios (e.g., how to detect handles holding *.lock
files, using Sysinternals Handle/Process Explorer to find/kill locking
processes, recommended safe sequence to remove stale locks and restart services,
and the “hardened recovery” steps to use instead of simply deleting files), and
update the surrounding steps that mention stale `*.lock` files and DLQ
inspection to reference this Windows note; also ensure docs/** (README,
SECURITY, operations.md) mention any changed CLI flags/workflows and include
upgrade notes plus the new npm scripts required for the hardened recovery path
so on-call has consistent, platform-specific guidance.
In `@index.ts`:
- Around line 1311-1341: The current dedupe key builder getRefreshDedupeKey uses
the last 16 chars of refreshToken which can collide; replace that branch to
compute a collision-resistant hash (e.g., SHA-256 hex) of the full trimmed
refreshToken and use `refresh:<hash>` as the key (use an existing hash utility
or add one nearby and import it), keeping the accountId/email/index branches
unchanged; update refreshAccountAuth and refreshInFlightByAccount usage
unchanged since only the key changes; add a deterministic regression test that
constructs two RefreshDedupeAccount objects with distinct full refreshToken
values that share the same trailing 16 chars and assert that
refreshInFlightByAccount does not return the same promise for both (i.e., both
tokens trigger separate refreshAndUpdateToken calls).
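A sketch of the hashed key; the `refresh:` prefix mirrors the suggestion above, and the hash utility is assumed to be Node's built-in crypto:

```typescript
import { createHash } from "node:crypto";

function getRefreshDedupeKey(refreshToken: string): string {
  // SHA-256 over the full trimmed token: two tokens that share a 16-char
  // suffix can no longer collapse into one in-flight refresh entry.
  const hash = createHash("sha256").update(refreshToken.trim()).digest("hex");
  return `refresh:${hash}`;
}
```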
In `@lib/accounts/account-view.ts`:
- Around line 76-84: formatActiveIndexByFamilyLabels currently renders idx + 1
directly which allows fractional persisted values to show; update
formatActiveIndexByFamilyLabels to normalize indexes (use Math.floor or the same
normalization logic as resolveActiveIndex) before adding 1 and rendering the
label so labels are always integers; add a deterministic vitest regression in
test/account-view.test.ts that passes a fractional activeIndexByFamily (e.g.,
codex: 1.8) and asserts the output includes "codex: 2" to prevent regressions.
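The normalization in isolation; the label format is assumed from the test expectation above, and the function name is illustrative:

```typescript
function formatFamilyLabel(family: string, idx: number): string {
  // Floor before the +1 so a fractional persisted index like 1.8 renders
  // as an integer label ("codex: 2"), matching resolveActiveIndex.
  const normalized = Number.isFinite(idx) ? Math.floor(idx) : 0;
  return `${family}: ${normalized + 1}`;
}
```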
In `@lib/codex-manager.ts`:
- Around line 4726-4748: The emitTelemetry call is currently awaited and can
throw, so make telemetry emission best-effort by changing emitTelemetry to
swallow errors and not block command execution: wrap the recordTelemetryEvent
call in try/catch (inside emitTelemetry) or fire-and-forget without awaiting,
and apply the same protection for the finish/exception emits invoked around the
main try/catch (the other emitTelemetry calls near the end of the command flow).
Add a Vitest regression in test/codex-manager-cli.test.ts that stubs/mock
recordTelemetryEvent to throw (or rejects) and asserts the CLI command (the main
entry that calls emitTelemetry and printUsage paths) still returns its normal
exit code; reference the emitTelemetry helper and recordTelemetryEvent in the
test so failures in telemetry do not affect business logic.
In `@lib/data-retention.ts`:
- Around line 89-99: The file-unlink path in lib/data-retention.ts currently
does a single fs.unlink and rethrows non-ENOENT errors, which causes transient
Windows EBUSY/EPERM to abort the entire run; update the unlink logic to mirror
removeEmptyDirectoryWithRetry’s retry/backoff behavior (reuse the same retry
constants and exponential backoff loop used by removeEmptyDirectoryWithRetry) so
unlink will retry on EBUSY/EPERM and only fail after retries, and add a vitest
in test/data-retention.test.ts that mocks/fs-stubs or spies to simulate an
initial EBUSY (or EPERM) from fs.unlink followed by a successful unlink to
assert the retry path succeeds (similar to the existing directory test).
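A sketch of the retry loop, with the unlink and sleep functions injected so it stays testable; the retry count and base delay are assumptions meant to mirror removeEmptyDirectoryWithRetry:

```typescript
const UNLINK_MAX_RETRIES = 5;     // assumed to match the directory helper
const UNLINK_BASE_DELAY_MS = 50;  // assumed backoff base

async function unlinkWithRetry(
  path: string,
  unlinkFn: (p: string) => Promise<void>,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
): Promise<void> {
  for (let attempt = 0; ; attempt++) {
    try {
      await unlinkFn(path);
      return;
    } catch (error) {
      const code = (error as { code?: string }).code;
      if (code === "ENOENT") return; // already gone: benign
      const transient = code === "EBUSY" || code === "EPERM";
      // Only transient Windows-style errors are retried, with exponential
      // backoff; everything else still fails the run immediately.
      if (!transient || attempt >= UNLINK_MAX_RETRIES) throw error;
      await sleep(UNLINK_BASE_DELAY_MS * 2 ** attempt);
    }
  }
}
```

In production the real `fs.promises.unlink` would be passed (or bound as the default); tests can pass a stub that throws EBUSY a few times before succeeding.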
In `@lib/secrets-crypto.ts`:
- Around line 104-112: The current isEncryptedSecret() treats any string
starting with `${ENCRYPTED_PREFIX}${LEGACY_ENCRYPTED_VERSION}:` or
`${ENCRYPTED_PREFIX}${ENCRYPTED_VERSION}:` as already encrypted which lets
malformed plaintext like "enc:v2:not-a-valid-envelope" bypass encryptSecret;
update isEncryptedSecret to more strictly validate the envelope format (e.g.,
check that the remainder matches the expected envelope pattern/structure or
base64/JSON format your encryption produces) and leave encryptSecret unchanged
except it should rely on this stricter check (functions to look at:
isEncryptedSecret and encryptSecret); add a deterministic vitest regression in
test/secrets-crypto.test.ts that passes a malformed prefixed value (e.g.,
"enc:v2:not-a-valid-envelope") to encryptSecret and asserts the returned value
is actually encrypted (not equal to the input, matches the expected encrypted
envelope pattern), and ensure the test names reference the changed behavior.
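One possible stricter check, assuming a colon-separated envelope whose segments after the version are base64 (the real v1/v2 layout in lib/secrets-crypto.ts may differ):

```typescript
const ENCRYPTED_PREFIX = "enc:";
const BASE64_SEGMENT = /^[A-Za-z0-9+/]+={0,2}$/;

function isEncryptedSecret(value: string): boolean {
  if (!value.startsWith(ENCRYPTED_PREFIX)) return false;
  const [, version, ...segments] = value.split(":");
  if (version !== "v1" && version !== "v2") return false;
  // Require at least two non-empty, plausibly-base64 segments, so a
  // malformed value like "enc:v2:not-a-valid-envelope" gets re-encrypted
  // instead of silently passing through encryptSecret.
  return segments.length >= 2 && segments.every((s) => s.length > 0 && BASE64_SEGMENT.test(s));
}
```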
In `@lib/storage.ts`:
- Around line 1739-1758: The TOCTOU race happens because code calls
fileExists(flaggedPath) then reads the file, and if the file is removed between
those calls an ENOENT is treated as a hard failure; update the validation in the
flagged-rotation block (references: flaggedPath, fileExists,
loadFlaggedAccounts, flaggedCount) to treat an ENOENT during the read/parse as a
benign “file disappeared” case (do not convert it into the hard `Failed to
validate...` error) — either by re-checking fileExists and returning success
when missing or by explicitly ignoring errors with error.code === 'ENOENT' in
the catch and only rethrowing other parsing/IO errors; add a vitest regression
(e.g., tests/storage.test.ts) that simulates the concurrent delete between the
initial existence check and read (use a temp file or mock fs.readFile to throw
ENOENT) to assert validation does not fail, and ensure the test also covers
Windows-like behavior (handle transient EBUSY/EPERM/429 scenarios as non-fatal
where appropriate).
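The benign-ENOENT handling in isolation, with the read function injected so the disappearing-file case is easy to simulate (the function and field names are illustrative, not the real lib/storage.ts symbols):

```typescript
async function countFlaggedAccounts(
  path: string,
  readFileFn: (p: string) => Promise<string>,
): Promise<number | null> {
  try {
    const raw = await readFileFn(path);
    return (JSON.parse(raw).accounts ?? []).length;
  } catch (error) {
    // The file vanishing between the existence check and this read is a
    // benign race, not a validation failure.
    if ((error as { code?: string }).code === "ENOENT") return null;
    throw error; // real IO/parse errors still surface
  }
}
```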
In `@lib/telemetry.ts`:
- Around line 290-292: The configureTelemetry function merges runtime settings
but does not clamp bounds for rotation parameters, so update configureTelemetry
to validate and clamp telemetryConfig.maxFiles (minimum 2) and
telemetryConfig.maxFileSizeBytes (minimum >0, cap to a sensible upper bound)
before assigning; reference the configureTelemetry function and telemetryConfig
object to locate the change and ensure behavior ties into the existing rotation
logic in the rotation code paths (e.g., where maxFiles is read). Also add a
regression test in test/telemetry.test.ts that sets maxFiles=1 and asserts
rotation still occurs (or that the config was clamped to >=2) to prevent future
regressions. Ensure the implementation handles concurrent updates safely if
configureTelemetry may be called at runtime.
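The clamping rule can be sketched as a pure function applied before assignment; the 512 MiB ceiling is an assumed "sensible upper bound", not a value taken from the codebase.

```typescript
const MIN_MAX_FILES = 2;
const MAX_FILE_SIZE_CAP = 512 * 1024 * 1024; // assumed upper bound

interface TelemetryConfigPatch {
  maxFiles?: number;
  maxFileSizeBytes?: number;
}

function clampTelemetryConfig(patch: TelemetryConfigPatch): TelemetryConfigPatch {
  const out = { ...patch };
  if (out.maxFiles !== undefined) {
    // Rotation needs at least the active file plus one rotated file.
    out.maxFiles = Math.max(MIN_MAX_FILES, Math.floor(out.maxFiles));
  }
  if (out.maxFileSizeBytes !== undefined) {
    // Must be positive; cap to avoid unbounded single-file growth.
    out.maxFileSizeBytes = Math.min(
      MAX_FILE_SIZE_CAP,
      Math.max(1, Math.floor(out.maxFileSizeBytes)),
    );
  }
  return out;
}
```

Applying the clamp inside `configureTelemetry` (rather than at the rotation call sites) keeps the stored config and the rotation logic consistent under concurrent runtime updates.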
In `@scripts/benchmark-runtime-path-helpers.mjs`:
- Around line 31-45: The assertion in assertSyncBenchmarkMergeResult assumes
reconciled.storage.accounts is an array and accesses .length / [0].refreshToken
directly; change the guard to explicitly verify reconciled?.storage and
Array.isArray(reconciled.storage.accounts) and that accounts.length >=
minimumAccounts before proceeding, and when checking expectedRefreshToken ensure
reconciled.storage.accounts[0] exists (or use the same array check) so malformed
storage triggers the same deterministic Error(`${caseName} failed`) rather than
a TypeError; update references in that function (minimumAccounts,
expectedRefreshToken, caseName) accordingly.
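A sketch of the explicit guard, written here in TypeScript for clarity even though the real helper is an `.mjs` module; the `Reconciled` shape is inferred from the review text rather than the actual benchmark types.

```typescript
interface Reconciled {
  storage?: { accounts?: Array<{ refreshToken?: string }> };
}

function assertMergeResult(
  reconciled: Reconciled | undefined,
  minimumAccounts: number,
  expectedRefreshToken: string,
  caseName: string,
): void {
  const accounts = reconciled?.storage?.accounts;
  // Explicit structural checks: malformed storage raises the same
  // deterministic error instead of a TypeError on .length or [0].
  if (
    !Array.isArray(accounts) ||
    accounts.length < minimumAccounts ||
    accounts[0]?.refreshToken !== expectedRefreshToken
  ) {
    throw new Error(`${caseName} failed`);
  }
}
```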
In `@test/authorization.test.ts`:
- Around line 39-86: Replace the manual single-key env save/restore in the three
tests that call getAuthorizationRole() and authorizeAction() with the existing
auth-env helpers so tests are deterministic: at the start of each test call
captureAuthEnv() (or clear all keys in AUTH_ENV_KEYS) to snapshot/clear all
authorization-related env vars, and in the finally block call restoreAuthEnv()
to restore them; update the tests around getAuthorizationRole and
authorizeAction to use captureAuthEnv/restoreAuthEnv (or explicitly clear
AUTH_ENV_KEYS) instead of only manipulating CODEX_AUTH_ROLE so
CODEX_AUTH_BREAK_GLASS and other ABAC keys can't leak and flip assertions.
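A minimal sketch of the capture/restore pattern the helpers implement; the `AUTH_ENV_KEYS` list here is illustrative and shorter than the project's real set.

```typescript
// Assumed subset of the real AUTH_ENV_KEYS list.
const AUTH_ENV_KEYS = ["CODEX_AUTH_ROLE", "CODEX_AUTH_BREAK_GLASS"] as const;

type AuthEnvSnapshot = Record<string, string | undefined>;

function captureAuthEnv(): AuthEnvSnapshot {
  const snapshot: AuthEnvSnapshot = {};
  for (const key of AUTH_ENV_KEYS) {
    snapshot[key] = process.env[key];
    delete process.env[key]; // clear so no auth key can leak into assertions
  }
  return snapshot;
}

function restoreAuthEnv(snapshot: AuthEnvSnapshot): void {
  for (const key of AUTH_ENV_KEYS) {
    if (snapshot[key] === undefined) delete process.env[key];
    else process.env[key] = snapshot[key];
  }
}
```

Calling `captureAuthEnv()` at the top of each test and `restoreAuthEnv()` in the `finally` block snapshots and clears every authorization key at once, so a stray `CODEX_AUTH_BREAK_GLASS` from another test cannot flip assertions.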
In `@test/benchmark-runtime-path-helpers.test.ts`:
- Around line 62-91: Add deterministic vitest cases that mirror the existing
EPERM retry test to also cover the other retryable Windows fs error codes EBUSY
and ENOTEMPTY for removeWithRetry: create two new it(...) tests that mock remove
to reject once with Object.assign(new Error("busy"), { code: "EBUSY" }) and once
with Object.assign(new Error("not empty"), { code: "ENOTEMPTY" }) respectively,
then resolve on the second call; inject a mocked sleep and assert remove was
called twice and sleep was called with 25; keep the existing non-retryable test
unchanged. Reference the removeWithRetry helper and the mocked remove and sleep
symbols used in the current tests when adding these cases.
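The retry contract those new cases should pin down can be sketched as below. This is a simplified single-retry model with injectable `remove` and `sleep`; the real helper in `scripts/benchmark-runtime-path-helpers.mjs` may retry more times or back off differently.

```typescript
// Windows fs error codes treated as retryable.
const RETRYABLE = new Set(["EPERM", "EBUSY", "ENOTEMPTY"]);

async function removeWithRetry(
  target: string,
  remove: (path: string) => Promise<void>,
  sleep: (ms: number) => Promise<void>,
): Promise<void> {
  try {
    await remove(target);
  } catch (error) {
    const code = (error as { code?: string }).code;
    if (code === undefined || !RETRYABLE.has(code)) throw error; // non-retryable: rethrow
    await sleep(25); // short bounded backoff before the single retry
    await remove(target);
  }
}
```

The mirrored test cases then reject once with `{ code: "EBUSY" }` or `{ code: "ENOTEMPTY" }`, resolve on the second call, and assert `remove` ran twice with `sleep(25)` in between.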
In `@test/config-save.test.ts`:
- Line 149: The current flaky assertion
expect(readAttempts).toBeGreaterThanOrEqual(3) should be tightened to assert an
exact retry count to detect regressions; replace it with
expect(readAttempts).toBe(3) (or the precise number your retry policy
guarantees) so the test for the retry loop (the readAttempts variable) is
deterministic and fails fast on accidental extra retries.
In `@test/fetch-helpers.test.ts`:
- Around line 542-547: The test at handleSuccessResponseDetailed should be
hardened to use an SSE-framed payload (e.g., event-stream style "data: ...\n\n"
chunks and Content-Type "text/event-stream") instead of plain text so the
streaming branch in lib/request/fetch-helpers.ts:671 is exercised; update the
test in test/fetch-helpers.test.ts (the case "'returns undefined parsedBody for
streaming success responses in detailed mode'") to construct a Response that
mimics SSE framing and content-type, call
handleSuccessResponseDetailed(response, true), assert result.parsedBody is
undefined, and assert await result.response.text() equals the original
SSE-framed string.
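Constructing the SSE-framed `Response` for that test is straightforward with the standard `Response` constructor (available globally in Node 18+); the payload content is illustrative.

```typescript
// Illustrative SSE framing: "data: ..." lines terminated by blank lines.
const sseBody = 'data: {"delta":"hello"}\n\ndata: [DONE]\n\n';

function makeSseResponse(): Response {
  return new Response(sseBody, {
    status: 200,
    headers: { "Content-Type": "text/event-stream" },
  });
}
```

The test then passes this response to `handleSuccessResponseDetailed(response, true)`, asserts `parsedBody` is `undefined`, and asserts `await result.response.text()` round-trips the original framed string.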
In `@test/index.test.ts`:
- Around line 2242-2262: The getAuth test fixture is using real-looking secret
literals ("sk-live-sensitive-access-token-1234567890" and
"refresh-sensitive-token-1234567890"); replace those with clearly synthetic
tokens (e.g., "sk-test-0000000000" and "refresh-test-0000000000") inside the
getAuth async function so secret scanners won't be triggered, and keep the rest
of the test logic (plugin.auth.loader, sdk.fetch call, and telemetry assertions
using recordTelemetryEventMock) unchanged to preserve behavior and assertions
that telemetry does not contain the synthetic tokens.
In `@test/shutdown.test.ts`:
- Around line 163-200: The test misses the sub-minimum clamp regression for
positive values below 1000ms; update the test in shutdown.test.ts to include a
deterministic case for process.env.CODEX_AUTH_SHUTDOWN_TIMEOUT_MS = "1" (which
should clamp to 1000ms) by adding await runCase("1") alongside the existing
runCase("0"), runCase("-1"), and runCase("not-a-number") calls; this verifies
the behavior in lib/shutdown.ts that clamps positive values under 1000 to 1000,
while still using registerCleanup and runCleanup and the same fake timers
pattern.
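The clamp being verified can be sketched as a small resolver; the 5000ms default is an assumption, while the 1000ms floor comes from the review text.

```typescript
const DEFAULT_SHUTDOWN_TIMEOUT_MS = 5000; // assumed default
const MIN_SHUTDOWN_TIMEOUT_MS = 1000;

function resolveShutdownTimeout(raw: string | undefined): number {
  const parsed = Number(raw);
  // Non-numeric, zero, or negative input falls back to the default;
  // positive values below the floor clamp up to 1000ms.
  if (!Number.isFinite(parsed) || parsed <= 0) return DEFAULT_SHUTDOWN_TIMEOUT_MS;
  return Math.max(MIN_SHUTDOWN_TIMEOUT_MS, parsed);
}
```

The added `runCase("1")` is the case that distinguishes "clamp up" from "fall back": `"1"` must produce 1000, not the default.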
In `@test/telemetry-lock.test.ts`:
- Around line 44-49: The test uses a secret-looking fixture value in the call to
telemetry.recordTelemetryEvent (see telemetry.recordTelemetryEvent invocation) —
replace the token "sk-live-lock-test-token" with a neutral, non-secret
placeholder (e.g. "test-placeholder-token" or "placeholder-token") in the
details.token field so test data hygiene is preserved; ensure no other test
fixtures use real-looking secret patterns and keep the existing assertions and
deterministic vitest setup intact (do not mock out assertions or remove the
telemetry call).
In `@test/telemetry.test.ts`:
- Around line 16-33: The local removeWithRetry function duplicates shared logic;
delete this local definition in test/telemetry.test.ts and import the shared
helper from test/helpers/fs-retry.ts instead (use the exported removeWithRetry
helper from that module), update any usages to call the imported
removeWithRetry, and remove any now-unused local symbols like
RETRYABLE_REMOVE_CODES or direct fs.rm retry logic so the test reuses the
centralized retry behavior.
- Around line 305-330: The test uses secret-like fixture values
("sk-concurrent-token-...") which can trigger secret-scanning; update the
telemetry test to use clearly synthetic, non-secret token strings (e.g.
"concurrent-token-0000" or "token-<index>") when calling recordTelemetryEvent in
the concurrent write spec, keeping the same deterministic generation pattern so
configureTelemetry, recordTelemetryEvent, and getTelemetryLogPath behavior and
the final masking assertion remain unchanged; ensure the token string change is
applied only in this test fixture and that parsed.every(...) still checks for
masked tokens.
---
Outside diff comments:
In `@lib/accounts.ts`:
- Around line 900-917: The current code awaits this.pendingSave directly inside
the try so a rejection aborts scheduling the new debounced save; change the
await to swallow the prior rejection (e.g., await this.pendingSave.catch(() =>
undefined)) before creating this.pendingSave so a rejected prior save doesn't
jump to catch and prevent scheduling the new runBackgroundJobWithRetry; update
the code paths around pendingSave, runBackgroundJobWithRetry and saveToDisk
accordingly and add a vitest regression test (test/accounts-edge.test.ts) that
simulates a failing first pending save and verifies the second debounced trigger
still runs a new save attempt.
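The fix can be sketched with a stripped-down debouncer; `Saver`, `scheduleSave`, and `attempts` are illustrative names, not the real `lib/accounts.ts` API.

```typescript
class Saver {
  pendingSave: Promise<void> | undefined = undefined;
  attempts = 0;

  scheduleSave(run: () => Promise<void>): Promise<void> {
    const prior = this.pendingSave ?? Promise.resolve();
    // .catch(() => undefined) swallows a rejected prior save so the chain
    // always reaches the new save attempt instead of aborting scheduling.
    this.pendingSave = prior
      .catch(() => undefined)
      .then(() => {
        this.attempts += 1;
        return run();
      });
    return this.pendingSave;
  }
}
```

The regression test then makes the first save reject and asserts the second trigger still executes a fresh save attempt.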
In `@lib/storage.ts`:
- Around line 1110-1116: When migrating account storage in the block that checks
`if (normalized && storedVersion !== normalized.version)` avoid overwriting the
known revision/checksum with the pre-migration `rawChecksum`: after a successful
`persistMigration(normalized)` ensure the in-memory or to-be-persisted checksum
used later (the `rawChecksum` value written around the code that sets
`rawChecksum` at/near the later store step) is replaced with the post-migration
checksum (`normalized.checksum`) or cleared so it doesn’t reintroduce the old
checksum and cause a false ECONFLICT; update the code that writes `rawChecksum`
(the same symbol referenced at lines ~1161-1162) to respect the migrated
checksum state and add a vitest regression (e.g., test/storage.test.ts) that
performs migration + transaction persist to assert no ECONFLICT on the
subsequent persist and reference this test in the change.
---
Duplicate comments:
In `@lib/codex-manager.ts`:
- Around line 152-165: Move recordAuthAttempt(rateLimitKey) into the same
guarded pre-check block with checkAuthRateLimit(rateLimitKey) inside the
try/catch in codex-manager (so both pre-checks are atomic) and, inside the catch
that currently handles AuthRateLimitError, also catch errors thrown by
recordAuthAttempt (including filesystem EBUSY and HTTP 429-style errors) and
return a TokenResult-style failure object ({ type: "failed", reason: "unknown",
message: error.message }) instead of letting the error bubble and abort
multi-account flows; update the code paths around queuedRefresh(refreshToken) to
only run when the pre-check block succeeds. Add a vitest regression in
test/codex-manager-cli.test.ts that stubs recordAuthAttempt to throw an
AuthRateLimitError and an EBUSY-style error for one account and asserts other
accounts still proceed (i.e., their queuedRefresh runs), covering Windows EBUSY
and 429-like failures.
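The atomic pre-check shape can be sketched as below; the injected functions stand in for the real `checkAuthRateLimit`, `recordAuthAttempt`, and `queuedRefresh`, and the `TokenResult` variants are reduced to what the review describes.

```typescript
type TokenResult =
  | { type: "success"; token: string }
  | { type: "failed"; reason: string; message: string };

async function refreshWithGuards(
  rateLimitKey: string,
  checkAuthRateLimit: (key: string) => void,
  recordAuthAttempt: (key: string) => void,
  queuedRefresh: () => Promise<string>,
): Promise<TokenResult> {
  try {
    // Both pre-checks live in one guarded block so they fail atomically.
    checkAuthRateLimit(rateLimitKey);
    recordAuthAttempt(rateLimitKey); // may throw EBUSY/429-style errors
  } catch (error) {
    // Degrade to a failure result so one account's bookkeeping error
    // never aborts the surrounding multi-account flow.
    return { type: "failed", reason: "unknown", message: (error as Error).message };
  }
  return { type: "success", token: await queuedRefresh() };
}
```

`queuedRefresh` only runs when the pre-check block succeeds, which is the property the multi-account regression test locks in.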
In `@lib/file-lock.ts`:
- Around line 215-219: The flag released is being set before the actual release
call completes, which prevents retries on transient errors (EBUSY/EPERM); update
the async release closure in the returned locker (the release function that
calls releaseOwnedLock) so that released = true is assigned only after
releaseOwnedLock(path, metadata.ownerId) resolves successfully (and similarly
move the assignment in the synchronous variant that calls releaseOwnedLockSync),
propagate or rethrow errors so callers can retry on transient failures, and
add/adjust vitest cases (tests referencing replacement-race, release.*ebusy,
release.*eperm, stale in file-lock.test.ts) to assert retry behavior and that
logging does not leak tokens/emails when failures are logged.
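The ordering fix can be sketched with a reduced locker; `makeLocker` and its injected `releaseOwnedLock` are illustrative stand-ins for the real closure in `lib/file-lock.ts`.

```typescript
function makeLocker(releaseOwnedLock: () => Promise<void>) {
  let released = false;
  return {
    get released() {
      return released;
    },
    async release(): Promise<void> {
      if (released) return;
      // Errors propagate to the caller and `released` stays false, so a
      // transient EBUSY/EPERM failure can be retried on the next call.
      await releaseOwnedLock();
      released = true; // flip the flag only after a successful release
    },
  };
}
```

Setting the flag before the call (the current bug) would make every later retry a silent no-op even though the lock file was never removed.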
In `@test/file-lock.test.ts`:
- Around line 10-29: The test defines a local removeWithRetry that duplicates
the centralized helper; delete this local function and import and use the shared
helper instead (replace the local removeWithRetry with the exported helper from
the test helpers module), updating any calls in file-lock.test.ts to call the
imported helper; ensure the import name matches the helper's export (e.g.,
removeWithRetry or the actual exported identifier) and remove the
RETRYABLE_REMOVE_CODES constant which is now handled by the shared
implementation.
In `@test/refresh-guardian.test.ts`:
- Around line 74-80: The test currently conflates the "manager missing" path
(which exits before timing setup) with the "no enabled accounts" path (which
still executes the finally block and should update timing). Split the assertions
into two deterministic checks: for the missing-manager case (use
missingManagerGuardian.tick() and missingManagerGuardian.getStats()) assert
runs==0 and that timing fields remain at their initialized zero/unchanged
values; for the no-enabled-accounts case invoke the guardian path that reaches
the finally block (the code path in lib/refresh-guardian.ts that sets timing
when countedRun=false) and assert that lastTickDurationMs and related timing
fields are updated (e.g., >0 or non-zero) to prove the finally timing logic ran.
Ensure you reference and update the specific test invocations of
missingManagerGuardian.tick() and getStats(), and add a separate test or adjust
the existing one to explicitly exercise the countedRun=false branch so timing is
deterministically checked.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9cdd410c-afbd-4fc7-9568-e3f53e2786e9
📒 Files selected for processing (45)
.github/workflows/supply-chain.yml, docs/development/CONFIG_FIELDS.md, docs/reference/commands.md, docs/runbooks/operations.md, index.ts, lib/accounts.ts, lib/accounts/account-view.ts, lib/auth/auth.ts, lib/authorization.ts, lib/codex-manager.ts, lib/config.ts, lib/data-redaction.ts, lib/data-retention.ts, lib/file-lock.ts, lib/idempotency.ts, lib/request/response-handler.ts, lib/secrets-crypto.ts, lib/shutdown.ts, lib/storage.ts, lib/telemetry.ts, lib/unified-settings.ts, scripts/benchmark-runtime-path-helpers.mjs, scripts/benchmark-runtime-path.mjs, test/account-view.test.ts, test/accounts-edge.test.ts, test/auth.test.ts, test/authorization.test.ts, test/background-jobs.test.ts, test/benchmark-runtime-path-helpers.test.ts, test/codex-manager-cli.test.ts, test/config-save.test.ts, test/data-retention.test.ts, test/fetch-helpers.test.ts, test/file-lock.test.ts, test/idempotency.test.ts, test/index-retry.test.ts, test/index.test.ts, test/refresh-guardian.test.ts, test/retry-governor.test.ts, test/settings-hub-utils.test.ts, test/shutdown.test.ts, test/storage.test.ts, test/telemetry-lock.test.ts, test/telemetry.test.ts, test/tool-utils.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (3)
test/**
⚙️ CodeRabbit configuration file
tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.
Files:
test/retry-governor.test.ts, test/account-view.test.ts, test/accounts-edge.test.ts, test/background-jobs.test.ts, test/data-retention.test.ts, test/telemetry-lock.test.ts, test/shutdown.test.ts, test/settings-hub-utils.test.ts, test/telemetry.test.ts, test/file-lock.test.ts, test/tool-utils.test.ts, test/auth.test.ts, test/index-retry.test.ts, test/refresh-guardian.test.ts, test/authorization.test.ts, test/codex-manager-cli.test.ts, test/index.test.ts, test/config-save.test.ts, test/benchmark-runtime-path-helpers.test.ts, test/fetch-helpers.test.ts, test/idempotency.test.ts, test/storage.test.ts
lib/**
⚙️ CodeRabbit configuration file
focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.
Files:
lib/data-redaction.ts, lib/request/response-handler.ts, lib/accounts.ts, lib/data-retention.ts, lib/idempotency.ts, lib/file-lock.ts, lib/auth/auth.ts, lib/accounts/account-view.ts, lib/shutdown.ts, lib/secrets-crypto.ts, lib/authorization.ts, lib/config.ts, lib/unified-settings.ts, lib/storage.ts, lib/codex-manager.ts, lib/telemetry.ts
docs/**
⚙️ CodeRabbit configuration file
keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts.
Files:
docs/reference/commands.md, docs/runbooks/operations.md, docs/development/CONFIG_FIELDS.md
🧬 Code graph analysis (28)
- test/retry-governor.test.ts (1)
  - lib/request/retry-governor.ts (1): decideRetryAllAccountsRateLimited (43-73)
- test/account-view.test.ts (2)
  - lib/accounts/account-view.ts (5): resolveActiveIndex (16-26), getRateLimitResetTimeForFamily (28-48), formatRateLimitEntry (50-60), formatRateLimitStatusByFamily (62-74), formatActiveIndexByFamilyLabels (76-85)
  - lib/prompts/codex.ts (1): MODEL_FAMILIES (65-71)
- test/background-jobs.test.ts (1)
  - lib/background-jobs.ts (2): runBackgroundJobWithRetry (71-117), getBackgroundJobDlqPath (67-69)
- test/data-retention.test.ts (2)
  - scripts/benchmark-runtime-path-helpers.mjs (1): removeWithRetry (9-29)
  - lib/data-retention.ts (2): RetentionPolicy (5-11), enforceDataRetention (119-146)
- test/telemetry-lock.test.ts (2)
  - scripts/benchmark-runtime-path.mjs (1): tempDir (228-228)
  - scripts/benchmark-runtime-path-helpers.mjs (1): removeWithRetry (9-29)
- test/telemetry.test.ts (1)
  - lib/telemetry.ts (6): getTelemetryConfig (294-296), configureTelemetry (290-292), recordTelemetryEvent (302-334), queryTelemetryEvents (336-356), getTelemetryLogPath (298-300), summarizeTelemetryEvents (358-387)
- test/file-lock.test.ts (1)
  - lib/file-lock.ts (1): acquireFileLock (176-238)
- lib/accounts.ts (3)
  - lib/storage.ts (3): AccountStorageV3 (34-34), saveAccounts (1389-1394), loadAccounts (1039-1041)
  - lib/utils.ts (1): sleep (65-67)
  - lib/background-jobs.ts (1): runBackgroundJobWithRetry (71-117)
- lib/data-retention.ts (1)
  - lib/runtime-paths.ts (3): getCodexLogDir (226-228), getCodexCacheDir (212-214), getCodexMultiAuthDir (166-199)
- test/tool-utils.test.ts (1)
  - lib/request/helpers/tool-utils.ts (1): cleanupToolDefinitions (89-122)
- lib/file-lock.ts (1)
  - lib/utils.ts (1): sleep (65-67)
- lib/auth/auth.ts (2)
  - lib/utils.ts (2): fetchWithTimeout (76-134), isAbortError (20-24)
  - lib/schemas.ts (1): safeParseOAuthTokenResponse (312-318)
- lib/accounts/account-view.ts (1)
  - lib/prompts/codex.ts (2): ModelFamily (54-59), MODEL_FAMILIES (65-71)
- test/auth.test.ts (1)
  - lib/auth/auth.ts (3): exchangeAuthorizationCode (131-208), REDIRECT_URI (12-12), refreshAccessToken (242-300)
- test/index-retry.test.ts (1)
  - index.ts (1): OpenAIAuthPlugin (4415-4415)
- test/refresh-guardian.test.ts (1)
  - lib/refresh-guardian.ts (1): RefreshGuardian (32-206)
- test/authorization.test.ts (2)
  - lib/authorization.ts (2): getAuthorizationRole (38-40), authorizeAction (107-145)
  - lib/audit.ts (2): getAuditConfig (63-65), configureAudit (59-61)
- scripts/benchmark-runtime-path.mjs (5)
  - lib/codex-cli/state.ts (1): clearCodexCliStateCache (545-549)
  - scripts/benchmark-runtime-path-helpers.mjs (2): removeWithRetry (9-29), assertSyncBenchmarkMergeResult (31-46)
  - lib/storage.ts (1): normalizeAccountStorage (944-1032)
  - lib/codex-cli/sync.ts (1): syncAccountStorageFromCodexCli (328-421)
  - lib/request/fetch-helpers.ts (2): handleErrorResponse (595-632), handleSuccessResponseDetailed (651-684)
- test/codex-manager-cli.test.ts (1)
  - lib/codex-manager.ts (1): runCodexMultiAuthCli (4703-4810)
- test/index.test.ts (1)
  - lib/prompts/codex.ts (1): MODEL_FAMILIES (65-71)
- test/benchmark-runtime-path-helpers.test.ts (3)
  - scripts/benchmark-runtime-path-helpers.mjs (4): assertSyncBenchmarkMergeResult (31-46), remove (14-14), sleep (5-7), removeWithRetry (9-29)
  - lib/utils.ts (1): sleep (65-67)
  - test/helpers/fs-retry.ts (1): removeWithRetry (5-22)
- test/fetch-helpers.test.ts (1)
  - lib/request/fetch-helpers.ts (2): handleSuccessResponseDetailed (651-684), handleErrorResponse (595-632)
- lib/authorization.ts (1)
  - lib/audit.ts (1): auditLog (123-153)
- lib/config.ts (3)
  - lib/schemas.ts (1): PluginConfigSchema (13-62)
  - lib/utils.ts (2): isRecord (11-13), sleep (65-67)
  - lib/file-lock.ts (1): acquireFileLock (176-238)
- test/idempotency.test.ts (1)
  - lib/idempotency.ts (2): checkAndRecordIdempotencyKey (110-149), getIdempotencyStorePath (106-108)
- lib/unified-settings.ts (3)
  - lib/runtime-paths.ts (1): getCodexMultiAuthDir (166-199)
  - lib/file-lock.ts (2): acquireFileLockSync (257-323), acquireFileLock (176-238)
  - lib/utils.ts (1): sleep (65-67)
- test/storage.test.ts (1)
  - lib/storage.ts (7): loadAccounts (1039-1041), AccountStorageV3 (34-34), saveAccounts (1389-1394), saveFlaggedAccounts (1578-1600), loadFlaggedAccounts (1529-1576), getStoragePath (626-631), rotateStoredSecretEncryption (1705-1768)
- lib/telemetry.ts (3)
  - lib/runtime-paths.ts (1): getCodexLogDir (226-228)
  - lib/logger.ts (1): getCorrelationId (137-139)
  - lib/file-lock.ts (1): acquireFileLock (176-238)
🪛 ast-grep (0.41.0)
test/index.test.ts
[warning] 903-906: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context:
```ts
new RegExp(
  `^${escapedLabel}: ${numberPattern}${suffixPattern}$`,
  "m",
)
```
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🪛 LanguageTool
docs/runbooks/operations.md
[uncategorized] ~68-~68: The official name of this software platform is spelled with a capital “H”.
Context: ...ion required checks remain aligned with .github/settings.yml. --- ## Failure Triage ...
(GITHUB)
🔇 Additional comments (35)
test/tool-utils.test.ts (1)
623-648: good regression coverage for non-finite normalization. this change closes the earlier gap by covering `NaN`, `+Infinity`, and `-Infinity`, and it verifies both flattening (`anyOf` removed) and json-safe coercion (`[null, null, null]`). citations: `test/tool-utils.test.ts:623`, `test/tool-utils.test.ts:635`, `test/tool-utils.test.ts:646`, `test/tool-utils.test.ts:647`, `lib/request/helpers/tool-utils.ts:88`. As per coding guidelines for `test/**`: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.
test/refresh-guardian.test.ts (1)
570-596: good deterministic regression coverage for duration aggregates. `test/refresh-guardian.test.ts:570-596` cleanly validates `last`/`max`/`cumulative`/`avg` math against controlled timing inputs and maps well to `lib/refresh-guardian.ts:189-201`. this is strong protection against timing-metric regressions.
test/fetch-helpers.test.ts (2)
509-540: good coverage for non-streaming detailed-mode contracts. nice lock-in of parse behavior and 502 mapping in `test/fetch-helpers.test.ts:509` and `test/fetch-helpers.test.ts:530` for the conversion path in `lib/request/fetch-helpers.ts:650`.
584-592: good regression for non-json error parse debug path. the assertion in `test/fetch-helpers.test.ts:590` gives useful coverage for the fallback branch exercised by `lib/request/fetch-helpers.ts:594`.
test/config-save.test.ts (3)
76-150: good concurrency regression coverage for transient env-path locks. `test/config-save.test.ts:76-150` now proves the contention path ran and preserves existing keys after retries. this is the right kind of windows/concurrency regression guard.
152-230: solid deterministic coverage for sync lock retries and exhaustion fallback. `test/config-save.test.ts:152-230` explicitly validates exponential backoff timing and exhausted-retry fallback behavior under retryable lock errors. this materially reduces concurrency regression risk.
271-277: default concurrency expectation update is consistent. `test/config-save.test.ts:271-277` correctly locks the new default path at `parallelProbingMaxConcurrency = 5` when using unified settings.
scripts/benchmark-runtime-path-helpers.mjs (1)
9-29: retry helper looks good for transient fs cleanup paths. the bounded retries and exponential backoff are appropriate for windows-style lock contention and are exercised by tests (`test/benchmark-runtime-path-helpers.test.ts:62`, `test/benchmark-runtime-path-helpers.test.ts:79`).
scripts/benchmark-runtime-path.mjs (3)
38-50: good fix for process-wide mutation concurrency risk. serializing `process.env` and cache mutation through `withProcessStateMutationLock` is the right containment pattern for this benchmark path (`lib/codex-cli/state.ts:544`).
226-269: cleanup and teardown path is robust now. the setup/restore block is tight, and using retryable temp-dir cleanup reduces windows flakiness while preserving failure semantics (`test/benchmark-runtime-path-helpers.test.ts:62`).
313-326: sync benchmark guard now validates real merge behavior. seeding a stale token and asserting `changed` plus the repaired token closes the no-op false-positive hole against reconcile semantics (`lib/codex-cli/sync.ts:327`).
test/benchmark-runtime-path-helpers.test.ts (1)
8-60: token-repair regression coverage is strong and deterministic. these cases correctly gate unchanged reconcile outputs and stale-token non-repair paths, matching reconcile behavior expectations (`lib/codex-cli/sync.ts:327`).
.github/workflows/supply-chain.yml (2)
22-26: good hardening updates landed. immutable action pinning and the sca job timeout are in place and look correct. see `.github/workflows/supply-chain.yml:22` and `.github/workflows/supply-chain.yml:34`. Also applies to: 34-34, 38-42, 61-64, 76-76
33-67: windows regression tests already exist in ci.yml. supply-chain.yml runs ubuntu-only, which is correct for its purpose (vulnerability and license policy gates). windows regression testing for the pr scope is already covered in `.github/workflows/ci.yml:129` with windows-latest runners executing win32-specific tests at line 151: `test/runtime-paths.test.ts`, `test/file-lock.test.ts`, `test/quota-cache.test.ts`. no action needed.
> Likely an incorrect or invalid review comment.
lib/shutdown.ts (2)
30-48: bounded timeout handling for in-flight callers looks good. `lib/shutdown.ts:30-48` correctly applies the same timeout wrapper to in-flight cleanup waits, which prevents later callers from waiting unboundedly.
54-66: add regression test for synchronous re-entry from cleanup callbacks. `lib/shutdown.ts:54-66` already guards against re-entrancy correctly: `cleanupInFlight` is assigned before any cleanup function executes (the IIFE body is microtask-scheduled). but test coverage is missing: add a deterministic vitest case in `test/shutdown.test.ts` where a cleanup function calls `runCleanup()` synchronously and assert cleanup still executes as a single pass with the in-flight guard working.
test/shutdown.test.ts (1)
83-143: good deterministic timeout and in-flight concurrency coverage. `test/shutdown.test.ts:83-143` uses fake timers, controlled deferred cleanup resolution, and explicit restoration paths. this is solid regression coverage for timeout and concurrent-caller behavior.
test/retry-governor.test.ts (1)
4-157: good deterministic coverage for retry-governor decision branches. `test/retry-governor.test.ts:4-157` covers the core allow/deny reasons, clamp normalization, and absolute-ceiling behavior with stable assertions. this is solid regression coverage.
lib/data-redaction.ts (1)
48-82: cycle handling and plain-object traversal look correct. `lib/data-redaction.ts:48-82` uses a `WeakSet` for circular references and avoids recursing into non-plain objects via `isObjectRecord` from `lib/data-redaction.ts:22-28`. this is the right hardening direction for redaction safety.
test/background-jobs.test.ts (1)
114-136: good concurrency regression for dlq file-lock serialization. `test/background-jobs.test.ts:114-136` directly exercises parallel failing jobs and verifies serialized dlq persistence. this is exactly the kind of regression that catches windows/concurrency write races.
lib/accounts.ts (1)
834-862: good bounds clamping in conflict merge output. `lib/accounts.ts:834-862` now clamps `activeIndexByFamily` and `activeIndex` against `mergedAccounts` bounds, which hardens the ebusy/econflict merge path.
test/index-retry.test.ts (1)
318-374: good token-refresh race regression with deterministic call-count assertions. `test/index-retry.test.ts:318-374` verifies single-flight behavior at the plugin integration seam by asserting both `refreshAndUpdateTokenMock` and upstream refresh endpoint call counts under concurrent `sdk.fetch` calls.
test/idempotency.test.ts (1)
42-113: good deterministic coverage for ttl and duplicate-key race paths. the fake-timer ttl tests and the concurrent duplicate-key assertion are solid and directly exercise the critical lock-protected path (`test/idempotency.test.ts:42-113`, `lib/idempotency.ts:109-148`).
test/data-retention.test.ts (1)
81-161: nice windows-oriented retention regressions. the transient `EBUSY` retry case plus explicit `EPERM`/`EBUSY` rejection checks give strong, deterministic filesystem coverage (`test/data-retention.test.ts:81-161`).
test/settings-hub-utils.test.ts (1)
289-319: good regression coverage for retry-all absolute ceiling semantics. preview, metadata, and persisted reload checks are all covered, and lines 314-318 now verify actual on-disk state instead of only returned values (`test/settings-hub-utils.test.ts:289-319`).
lib/file-lock.ts (1)
94-133: ownership re-check before unlink is solid race hardening. the two-phase owner verification in `lib/file-lock.ts:94-133` and `lib/file-lock.ts:135-174` (lines 109-112 and 150-153) is a good mitigation for stale-handle delete races. Also applies to: 135-174
test/codex-manager-cli.test.ts (2)
396-500: good deterministic telemetry coverage. `test/codex-manager-cli.test.ts:396-500` and `test/codex-manager-cli.test.ts:512-569` (lines 401-403 and 537-559) add stable time control, windows path assertions, redaction assertions, and explicit telemetry await semantics. this is strong regression coverage. Also applies to: 512-569
1193-1339: timeout, pagination, rate-limit, and idempotency regressions are well scoped. `test/codex-manager-cli.test.ts:1193-1339` and `test/codex-manager-cli.test.ts:2515-2868` (lines 1254-1259, 2620-2669, and 2733-2818) cover min-timeout clamp, env override precedence, pagination cursor behavior, rate-limit continuation, and rotate-secrets idempotency/redaction paths. Also applies to: 2515-2868
test/storage.test.ts (2)
705-821: good concurrency regression depth for load/save paths. `test/storage.test.ts:705-821` and `test/storage.test.ts:853-889` (lines 758-821 and 883-889) validate cleanup scan dedup under concurrent `loadAccounts` and guard against stale overwrite conflicts. Also applies to: 853-889
1703-1800: windows and lock/rotation resilience coverage is strong. `test/storage.test.ts:1703-1800` and `test/storage.test.ts:2114-2184` (lines 1703-1730, 1761-1800, and 2132-2184) add targeted regressions for `EBUSY`/`EPERM` lock-release behavior, lock serialization, flagged rename retry, and rotation failure signaling. Also applies to: 2114-2184
lib/idempotency.ts (1)
51-62: idempotency store hardening looks good. `lib/idempotency.ts:51-62`, `lib/idempotency.ts:73-92`, and `lib/idempotency.ts:99-149` (lines 58-60, 73-83, and 133-145) properly handle malformed json recovery, windows rename retries, ttl normalization, and lock-serialized replay detection. Also applies to: 73-92, 99-104, 110-149
lib/data-retention.ts (1)
45-66: directory cleanup retry strategy is a good windows-safe addition. `lib/data-retention.ts:45-66` (lines 53-63) correctly adds bounded retry/backoff for `ENOTEMPTY`/`EBUSY`/`EPERM` around `rmdir`, which reduces flaky retention failures under concurrent writers.
lib/unified-settings.ts (2)
200-243: good fix for revision-check toctou. `lib/unified-settings.ts:200-243` and `lib/unified-settings.ts:273-313` (lines 207-212 and 280-285) now validate `expectedRevision` while holding the file lock. this closes the prior check-then-lock race. Also applies to: 273-313
367-414: conflict-retry flow is well implemented. `lib/unified-settings.ts:367-414` and `lib/unified-settings.ts:453-470` (lines 369-383 and 455-469) add bounded `ECONFLICT` retries for both plugin and dashboard saves, which is the right tradeoff for optimistic concurrency under contention. Also applies to: 453-470
lib/config.ts (1)
394-435: good hardening on config writes. the lock acquisition + revision check + retryable rename flow in `lib/config.ts:394-435` is a solid improvement for cross-process safety and windows transient fs contention handling.
```yaml
name: Supply Chain

on:
  pull_request:
    branches: [main]
  push:
    branches: [main]
  schedule:
    - cron: "0 4 * * 1"

jobs:
```
🧹 Nitpick | 🔵 Trivial
add workflow concurrency to avoid overlapping supply-chain runs.
there is no concurrency guard, so push/schedule runs can overlap and duplicate expensive gates. add a workflow-level concurrency group. see `.github/workflows/supply-chain.yml:1`.
proposed patch
```diff
 name: Supply Chain
+
+concurrency:
+  group: supply-chain-${{ github.ref }}
+  cancel-in-progress: true
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
name: Supply Chain

concurrency:
  group: supply-chain-${{ github.ref }}
  cancel-in-progress: true

on:
  pull_request:
    branches: [main]
  push:
    branches: [main]
  schedule:
    - cron: "0 4 * * 1"

jobs:
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/supply-chain.yml around lines 1 - 11, Add a workflow-level
concurrency block to supply-chain.yml to prevent overlapping runs: at the
top-level (before the jobs key) add a concurrency key with a unique group name
(e.g., "supply-chain-${{ github.ref }}") and set cancel-in-progress: true so a
new run cancels any in-flight run for the same branch/ref; ensure the
concurrency block is placed alongside name and on, not inside any job.
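For reference, the top of the workflow with the suggested concurrency block in place might look like the following (a sketch of the suggestion only — verify against the actual file before committing):

```yaml
name: Supply Chain

concurrency:
  group: supply-chain-${{ github.ref }}
  cancel-in-progress: true

on:
  pull_request:
    branches: [main]
  push:
    branches: [main]
  schedule:
    - cron: "0 4 * * 1"
```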
```yaml
on:
  pull_request:
    branches: [main]
  push:
    branches: [main]
  schedule:
    - cron: "0 4 * * 1"
```
🧩 Analysis chain
🏁 Script executed:

```shell
cat -n .github/workflows/supply-chain.yml
```

Repository: ndycode/codex-multi-auth
Length of output: 2618
scope sca/sbom away from pull_request execution.
sca-and-license and sbom both run npm ci and npx on untrusted pr code. gate these two jobs to push/schedule only, keeping dependency-review on pr where it belongs. see .github/workflows/supply-chain.yml:3-9 for triggers and .github/workflows/supply-chain.yml:31-79 for the unguarded jobs.
proposed patch

```diff
   sca-and-license:
     name: SCA and License Gate
+    if: github.event_name != 'pull_request'
     runs-on: ubuntu-latest
     timeout-minutes: 15
@@
   sbom:
     name: Generate SBOM
+    if: github.event_name != 'pull_request'
     runs-on: ubuntu-latest
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/supply-chain.yml around lines 3 - 10, The sca-and-license
and sbom jobs are currently running for pull_request and execute npm/npx on
untrusted PR code; update the workflow so those two jobs (sca-and-license and
sbom) are gated to run only on push and schedule (not pull_request) while
leaving dependency-review running on pull_request; implement this by restricting
their job-level runs-on condition (e.g., add an if: github.event_name !=
'pull_request' or equivalent job-level trigger) or by splitting triggers so the
global on: keeps pull_request for dependency-review but sca-and-license and sbom
are scoped to push and schedule only.
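A sketch of the gated job layout under this suggestion (job names follow the review; the placeholder steps are not the real workflow):

```yaml
jobs:
  dependency-review:
    # stays on pull_request, where reviewing incoming dependency changes belongs
    runs-on: ubuntu-latest
    steps:
      - run: echo "dependency review steps elided"

  sca-and-license:
    # skip untrusted PR code; run only for push/schedule events
    if: github.event_name != 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - run: echo "SCA and license steps elided"

  sbom:
    if: github.event_name != 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - run: echo "SBOM steps elided"
```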
```yaml
      - name: Generate CycloneDX SBOM
        run: npx --yes @cyclonedx/cyclonedx-npm --output-file sbom.cdx.json --omit dev
```
🧩 Analysis chain
🏁 Script executed:
# Check the supply-chain workflow file
cat -n .github/workflows/supply-chain.yml | sed -n '65,80p'Repository: ndycode/codex-multi-auth
Length of output: 608
🏁 Script executed:
# Search for any other version pins in the workflow
rg "@cyclonedx" .github/workflows/ -A 2 -B 2Repository: ndycode/codex-multi-auth
Length of output: 404
🏁 Script executed:
# Check if there are any tests for sbom generation
fd -e "test\.*" -e "*.test.*" | xargs rg -l "sbom\|cyclone" 2>/dev/null || echo "no sbom tests found"Repository: ndycode/codex-multi-auth
Length of output: 87
🏁 Script executed:
# Check for any existing version constraints or lockfiles that might affect this
fd -e "package-lock.json" -e "package.json" -e "yarn.lock" | head -5Repository: ndycode/codex-multi-auth
Length of output: 50
🏁 Script executed:
# Search for any documentation or comments about sbom generation versioning
rg "cyclone|sbom" --type-list | head -5
rg -i "cyclone" . -A 3 -B 3 --type md --type yml 2>/dev/null | head -50Repository: ndycode/codex-multi-auth
Length of output: 457
🏁 Script executed:
# Check the full workflow structure and job configuration
head -n 90 .github/workflows/supply-chain.yml | cat -nRepository: ndycode/codex-multi-auth
Length of output: 2618
🏁 Script executed:
# Check if there are any concurrency or matrix configurations
rg -i "concurrency|matrix|strategy|os\:" .github/workflows/supply-chain.yml -B 2 -A 2Repository: ndycode/codex-multi-auth
Length of output: 196
🏁 Script executed:
# Check what versions of cyclonedx-npm are available and commonly used
web_search "cyclonedx-npm latest version pinning best practice supply chain"Repository: ndycode/codex-multi-auth
Length of output: 118
🏁 Script executed:
# Check if there are any other npm package pins in this workflow
rg "npx.*@.*@" .github/workflows/supply-chain.ymlRepository: ndycode/codex-multi-auth
Length of output: 50
🏁 Script executed:
# Look at the broader supply chain workflow context
head -n 30 .github/workflows/supply-chain.yml | cat -nRepository: ndycode/codex-multi-auth
Length of output: 967
🌐 Web query:
cyclonedx-npm current version best practices pinning npm package supply chain security
💡 Result:
Current version (cyclonedx-npm)
The latest tagged release of @cyclonedx/cyclonedx-npm is 4.2.0 (released Mar 3, 2026). [1]
Pin it explicitly, e.g.
- CI one-off: `npx --yes --package @cyclonedx/cyclonedx-npm@4.2.0 cyclonedx-npm --help` (or your args)
- Repo-local: `npm i -D @cyclonedx/cyclonedx-npm@4.2.0` (then run via `npx cyclonedx-npm ...`) [1]
Best practices: pinning + npm supply-chain security (practical checklist)
1) Prefer lockfile-based, reproducible installs
- Commit your lockfile and use `npm ci` in CI to install exactly what's in it (instead of re-resolving). [2]
- Keep version ranges in `package.json` tight where feasible; from a security perspective "narrower is better," and the lockfile should be committed. [2]
2) Reduce install-time attack surface
- Consider installing with lifecycle scripts disabled in hardened CI paths (then re-enable only where needed), e.g. `npm ci --ignore-scripts`. [3]
3) Enforce “known-good” dependency resolution
- Use policy tooling to ensure dependencies are pinned/come from allowed registries (lockfile linting / allowlists), and review diffs to `package-lock.json` like code. [3]
4) SBOM generation hygiene with cyclonedx-npm
- For maximum determinism, you can generate SBOMs from the lockfile (less dependent on what's present in `node_modules`) using `--package-lock-only`. [1]
- Enable SBOM validation (`--validate`) as part of CI to catch malformed output early. [1]
5) Continuous monitoring
- Run `npm audit` (or your SCA of choice) continuously and treat new advisories as events that trigger controlled updates (PRs that change both `package.json` + lockfile). [3]
Sources
[1] CycloneDX/cyclonedx-node-npm GitHub repository (releases + usage/options) (github.com)
[2] npm Blog: “AppSec POV on Dependency Management” (lockfiles; narrower pinning) (blog.npmjs.org)
[3] “npm-security-best-practices” (pinning; npm ci; --ignore-scripts; provenance/trusted publishing concepts) (github.com)
Citations:
- 1: https://github.com/CycloneDX/cyclonedx-node-npm
- 2: https://blog.npmjs.org/post/187496869845/appsec-pov-on-dependency-management.html?utm_source=openai
- 3: https://github.com/bodadotsh/npm-security-best-practices?utm_source=openai
pin @cyclonedx/cyclonedx-npm to explicit version for reproducible sbom generation.
the unpinned `npx --yes @cyclonedx/cyclonedx-npm` at .github/workflows/supply-chain.yml:73 can pull different versions across runs, breaking sbom determinism and supply-chain repeatability. pin to a known version like `@4.2.0` to guarantee consistent output.
note: no regression tests exist for sbom generation; consider adding a basic test that validates the generated sbom artifact format and required fields to catch version-induced breaks early.
windows edge cases and concurrency risks don't apply here (job runs on ubuntu-latest only, no matrix or concurrency config).
proposed patch

```diff
-      - name: Generate CycloneDX SBOM
-        run: npx --yes @cyclonedx/cyclonedx-npm --output-file sbom.cdx.json --omit dev
+      - name: Generate CycloneDX SBOM
+        run: npx --yes @cyclonedx/cyclonedx-npm@4.2.0 --output-file sbom.cdx.json --omit dev
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/supply-chain.yml around lines 72 - 73, The workflow step
"Generate CycloneDX SBOM" currently invokes an unpinned CLI via the command
`npx --yes @cyclonedx/cyclonedx-npm --output-file sbom.cdx.json --omit dev`;
pin that package to a specific released version (e.g., change to `@4.2.0`)
in the same command to ensure reproducible SBOMs, and update the "Generate
CycloneDX SBOM" step metadata in .github/workflows/supply-chain.yml accordingly;
additionally add a minimal CI job or test that runs the SBOM generation and
asserts the produced sbom.cdx.json contains required top-level fields (e.g.,
bomFormat/version) to catch future version-induced breaks.
| `retryAllAccountsRateLimited` | `true` |
| `retryAllAccountsMaxWaitMs` | `0` |
| `retryAllAccountsMaxRetries` | `Infinity` |
| `retryAllAccountsAbsoluteCeilingMs` | `0` |
update retry max-retries default to match runtime
the table around docs/development/CONFIG_FIELDS.md:64-65 still implies retryAllAccountsMaxRetries defaults to Infinity, but runtime default is 12 (lib/config.ts:44, lib/config.ts:137). please align the docs to prevent operator misconfiguration.
As per coding guidelines docs/**: keep README, SECURITY, and docs consistent with actual CLI flags and workflows.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/development/CONFIG_FIELDS.md` at line 65, The docs table lists
retryAllAccountsMaxRetries as defaulting to Infinity but runtime sets it to 12;
update the documentation entry for retryAllAccountsMaxRetries in
CONFIG_FIELDS.md to reflect the actual runtime default of 12 and ensure any
related fields (e.g., retryAllAccountsAbsoluteCeilingMs) remain accurate; locate
the doc row referencing retryAllAccountsMaxRetries and change its default value
to 12 so it matches the value set in lib/config.ts (where
retryAllAccountsMaxRetries is initialized).
- `CODEX_AUTH_ROLE=admin|operator|viewer`
- `CODEX_AUTH_BREAK_GLASS=1` only for emergency windows
document the new unset-role default explicitly.
line 45 lists allowed values, but it does not state that codex_auth_role now defaults to viewer when unset. this is an operator-facing behavior change and should be explicit in the runbook to avoid accidental privilege assumptions.
proposed docs patch

```diff
 2. Validate RBAC defaults for deployment environments:
-   - `CODEX_AUTH_ROLE=admin|operator|viewer`
+   - `CODEX_AUTH_ROLE=admin|operator|viewer` (defaults to `viewer` when unset)
    - `CODEX_AUTH_BREAK_GLASS=1` only for emergency windows
```

As per coding guidelines, docs/**: keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/runbooks/operations.md` around lines 45 - 46, The runbook list of env
vars omits that CODEX_AUTH_ROLE defaults to "viewer" when unset; update the
operations.md entry near the CODEX_AUTH_ROLE / CODEX_AUTH_BREAK_GLASS bullet
list to explicitly state "If CODEX_AUTH_ROLE is not set, it defaults to 'viewer'
(operator-facing behavior change)" and mention CODEX_AUTH_BREAK_GLASS is for
emergency windows only; also add a short upgrade note and mirror this default in
README and SECURITY docs and any npm scripts or CLI docs that reference auth
role behavior so all docs stay consistent.
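The viewer-by-default behavior this comment documents can be sketched roughly as below. `resolveRole` is a hypothetical name — the real lookup in the repository's `lib/` authorization module may differ:

```typescript
type AuthRole = "admin" | "operator" | "viewer";

// Hypothetical sketch: unset or unrecognized CODEX_AUTH_ROLE falls back
// to the least-privilege role instead of admin.
function resolveRole(raw: string | undefined): AuthRole {
  const normalized = raw?.trim().toLowerCase();
  if (normalized === "admin" || normalized === "operator" || normalized === "viewer") {
    return normalized;
  }
  return "viewer"; // least-privilege default when unset or invalid
}
```

The design point is that misconfiguration degrades to fewer privileges, never more.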
```ts
const getAuth = async () => ({
  type: "oauth" as const,
  access: "sk-live-sensitive-access-token-1234567890",
  refresh: "refresh-sensitive-token-1234567890",
  expires: Date.now() + 60_000,
  multiAccount: true,
});

try {
  const sdk = await plugin.auth.loader(getAuth, { options: {}, models: {} });
  const response = await sdk.fetch!("https://api.openai.com/v1/chat", {
    method: "POST",
    body: JSON.stringify({ model: "gpt-5.1" }),
  });
  expect(response.status).toBe(503);

  expect(recordTelemetryEventMock).toHaveBeenCalled();
  const serializedCalls = JSON.stringify(recordTelemetryEventMock.mock.calls);
  expect(serializedCalls).not.toContain("sk-live-sensitive-access-token-1234567890");
  expect(serializedCalls).not.toContain("refresh-sensitive-token-1234567890");
} finally {
```
replace secret-like fixture literals with clearly synthetic tokens.
test/index.test.ts:2244-2246 uses sk-live-...-style values. this can trip secret scanners and violates test hygiene constraints for mocked secrets.
proposed fix

```diff
 const getAuth = async () => ({
   type: "oauth" as const,
-  access: "sk-live-sensitive-access-token-1234567890",
-  refresh: "refresh-sensitive-token-1234567890",
+  access: "telemetry_access_token_fixture_1234567890",
+  refresh: "telemetry_refresh_token_fixture_1234567890",
   expires: Date.now() + 60_000,
   multiAccount: true,
 });
@@
-  expect(serializedCalls).not.toContain("sk-live-sensitive-access-token-1234567890");
-  expect(serializedCalls).not.toContain("refresh-sensitive-token-1234567890");
+  expect(serializedCalls).not.toContain("telemetry_access_token_fixture_1234567890");
+  expect(serializedCalls).not.toContain("telemetry_refresh_token_fixture_1234567890");
```

As per coding guidelines test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/index.test.ts` around lines 2242 - 2262, The getAuth test fixture is
using real-looking secret literals ("sk-live-sensitive-access-token-1234567890"
and "refresh-sensitive-token-1234567890"); replace those with clearly synthetic
tokens (e.g., "sk-test-0000000000" and "refresh-test-0000000000") inside the
getAuth async function so secret scanners won't be triggered, and keep the rest
of the test logic (plugin.auth.loader, sdk.fetch call, and telemetry assertions
using recordTelemetryEventMock) unchanged to preserve behavior and assertions
that telemetry does not contain the synthetic tokens.
```ts
it("defaults to 8s timeout for non-positive and non-numeric values", async () => {
  const originalTimeout = process.env.CODEX_AUTH_SHUTDOWN_TIMEOUT_MS;
  vi.useFakeTimers();
  try {
    const runCase = async (value: string) => {
      process.env.CODEX_AUTH_SHUTDOWN_TIMEOUT_MS = value;
      let resolved = false;
      let releaseCleanup: (() => void) | undefined;
      registerCleanup(
        () =>
          new Promise<void>((resolve) => {
            releaseCleanup = resolve;
          }),
      );
      const cleanupPromise = runCleanup().then(() => {
        resolved = true;
      });
      await vi.advanceTimersByTimeAsync(7_999);
      expect(resolved).toBe(false);
      await vi.advanceTimersByTimeAsync(1);
      await cleanupPromise;
      expect(resolved).toBe(true);
      releaseCleanup?.();
      await runCleanup();
    };

    await runCase("0");
    await runCase("-1");
    await runCase("not-a-number");
  } finally {
    if (originalTimeout === undefined) {
      delete process.env.CODEX_AUTH_SHUTDOWN_TIMEOUT_MS;
    } else {
      process.env.CODEX_AUTH_SHUTDOWN_TIMEOUT_MS = originalTimeout;
    }
    vi.useRealTimers();
  }
});
```
add the missing sub-minimum clamp regression ("1" -> 1000ms).
test/shutdown.test.ts:163-200 (Line 163-200) checks default fallback cases but misses lib/shutdown.ts:15 (Line 15), where positive values below 1000 are clamped up to 1000. add a deterministic case for "1".
proposed test update

```diff
-    const runCase = async (value: string) => {
+    const runCase = async (value: string, expectedMs: number) => {
       process.env.CODEX_AUTH_SHUTDOWN_TIMEOUT_MS = value;
       let resolved = false;
       let releaseCleanup: (() => void) | undefined;
@@
-      await vi.advanceTimersByTimeAsync(7_999);
+      await vi.advanceTimersByTimeAsync(expectedMs - 1);
       expect(resolved).toBe(false);
       await vi.advanceTimersByTimeAsync(1);
       await cleanupPromise;
       expect(resolved).toBe(true);
@@
-    await runCase("0");
-    await runCase("-1");
-    await runCase("not-a-number");
+    await runCase("0", 8_000);
+    await runCase("-1", 8_000);
+    await runCase("not-a-number", 8_000);
+    await runCase("1", 1_000);
```

as per coding guidelines, "test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/shutdown.test.ts` around lines 163 - 200, The test misses the
sub-minimum clamp regression for positive values below 1000ms; update the test
in shutdown.test.ts to include a deterministic case for
process.env.CODEX_AUTH_SHUTDOWN_TIMEOUT_MS = "1" (which should clamp to 1000ms)
by adding await runCase("1") alongside the existing runCase("0"), runCase("-1"),
and runCase("not-a-number") calls; this verifies the behavior in lib/shutdown.ts
that clamps positive values under 1000 to 1000, while still using
registerCleanup and runCleanup and the same fake timers pattern.
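The parsing behavior the test exercises (fall back to 8000ms for invalid or non-positive values, clamp small positive values up to a 1000ms floor) can be sketched as below. `resolveShutdownTimeoutMs` is an assumed name for the logic in `lib/shutdown.ts`, not a confirmed export:

```typescript
const DEFAULT_SHUTDOWN_TIMEOUT_MS = 8_000;
const MIN_SHUTDOWN_TIMEOUT_MS = 1_000;

// Assumed shape of the resolver: invalid or non-positive env values fall back
// to the default; small positive values are clamped up to the minimum floor.
function resolveShutdownTimeoutMs(raw: string | undefined): number {
  const parsed = Number(raw);
  if (!Number.isFinite(parsed) || parsed <= 0) return DEFAULT_SHUTDOWN_TIMEOUT_MS;
  return Math.max(parsed, MIN_SHUTDOWN_TIMEOUT_MS);
}
```

Under this shape, "1" resolves to 1000ms rather than 8000ms, which is exactly why the review asks for a dedicated clamp case alongside the default-fallback cases.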
```ts
await telemetry.recordTelemetryEvent({
  source: "plugin",
  event: "request.locked_write",
  outcome: "info",
  details: { token: "sk-live-lock-test-token" },
});
```
replace secret-looking token fixtures with neutral placeholders.
line 48 uses `sk-live-lock-test-token`, which looks like a real key pattern (test/telemetry-lock.test.ts:48). replace it with a neutral non-secret-like value to keep test data hygiene strict.
As per coding guidelines, test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/telemetry-lock.test.ts` around lines 44 - 49, The test uses a
secret-looking fixture value in the call to telemetry.recordTelemetryEvent (see
telemetry.recordTelemetryEvent invocation) — replace the token
"sk-live-lock-test-token" with a neutral, non-secret placeholder (e.g.
"test-placeholder-token" or "placeholder-token") in the details.token field so
test data hygiene is preserved; ensure no other test fixtures use real-looking
secret patterns and keep the existing assertions and deterministic vitest setup
intact (do not mock out assertions or remove the telemetry call).
```ts
async function removeWithRetry(
  targetPath: string,
  options: { recursive?: boolean; force?: boolean },
): Promise<void> {
  for (let attempt = 0; attempt < 6; attempt += 1) {
    try {
      await fs.rm(targetPath, options);
      return;
    } catch (error) {
      const code = (error as NodeJS.ErrnoException).code;
      if (code === "ENOENT") return;
      if (!code || !RETRYABLE_REMOVE_CODES.has(code) || attempt === 5) {
        throw error;
      }
      await new Promise((resolve) => setTimeout(resolve, 25 * 2 ** attempt));
    }
  }
}
```
🧹 Nitpick | 🔵 Trivial
prefer reusing the shared fs retry helper instead of local duplication.
test/telemetry.test.ts:16-33 redefines removeWithRetry. import the shared helper from test/helpers/fs-retry.ts to keep retry behavior consistent across suites.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/telemetry.test.ts` around lines 16 - 33, The local removeWithRetry
function duplicates shared logic; delete this local definition in
test/telemetry.test.ts and import the shared helper from
test/helpers/fs-retry.ts instead (use the exported removeWithRetry helper from
that module), update any usages to call the imported removeWithRetry, and remove
any now-unused local symbols like RETRYABLE_REMOVE_CODES or direct fs.rm retry
logic so the test reuses the centralized retry behavior.
```ts
it("serializes concurrent telemetry writes without dropping events", async () => {
  configureTelemetry({ maxFileSizeBytes: 1_000_000, maxFiles: 12 });
  const total = 30;
  await Promise.all(
    Array.from({ length: total }, (_, index) =>
      recordTelemetryEvent({
        source: "plugin",
        event: "request.concurrent",
        outcome: "info",
        details: {
          index,
          token: `sk-concurrent-token-${String(index).padStart(4, "0")}`,
        },
      }),
    ),
  );

  const raw = await fs.readFile(getTelemetryLogPath(), "utf8");
  const lines = raw.trim().split("\n");
  expect(lines).toHaveLength(total);

  const parsed = lines.map(
    (line) => JSON.parse(line) as { details?: Record<string, unknown> },
  );
  expect(parsed.every((entry) => entry.details?.token === "***MASKED***")).toBe(true);
});
```
avoid secret-like token literals in telemetry fixtures.
test/telemetry.test.ts:316-317 uses sk-concurrent-token-... fixture values. use clearly synthetic non-secret strings to avoid false positives in secret scanning.
proposed fix

```diff
       details: {
         index,
-        token: `sk-concurrent-token-${String(index).padStart(4, "0")}`,
+        token: `telemetry_token_fixture_${String(index).padStart(4, "0")}`,
       },
```

As per coding guidelines test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/telemetry.test.ts` around lines 305 - 330, The test uses secret-like
fixture values ("sk-concurrent-token-...") which can trigger secret-scanning;
update the telemetry test to use clearly synthetic, non-secret token strings
(e.g. "concurrent-token-0000" or "token-<index>") when calling
recordTelemetryEvent in the concurrent write spec, keeping the same
deterministic generation pattern so configureTelemetry, recordTelemetryEvent,
and getTelemetryLogPath behavior and the final masking assertion remain
unchanged; ensure the token string change is applied only in this test fixture
and that parsed.every(...) still checks for masked tokens.
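The `***MASKED***` assertion above implies key-based redaction of details before events reach disk. A minimal sketch of that idea follows — the real redaction logic in the repository is certainly more thorough, and `redactDetails`/`SENSITIVE_KEYS` are illustrative names:

```typescript
const SENSITIVE_KEYS = new Set(["token", "access", "refresh", "secret"]);

// Illustrative recursive masking of sensitive keys in telemetry details.
function redactDetails(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redactDetails);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([key, child]) =>
        SENSITIVE_KEYS.has(key) ? [key, "***MASKED***"] : [key, redactDetails(child)],
      ),
    );
  }
  return value;
}
```

Because masking is keyed on field names rather than value patterns, the test fixture's token string can be anything — which is precisely why clearly synthetic values are safe to use.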
Summary
authorizationleast-privilege default,file-lockrelease race hardening).What Changed
test/index-retry.test.ts.retryAllAccountsAbsoluteCeilingMsunlimited preview + category/label wiring.lib/quota-cache.tsnow omits unixmodeonwin32; added regression.lib/accounts.tsconflict recovery now retries onEBUSY(plus deterministic merge regression).lib/file-lock.tsnow re-checks ownership immediately before delete (async + sync paths), with explicit TOCTOU limitation note and replacement-race regression test.benchmark-runtime-pathsync-merge guard (stale precondition + changed/token repair assertions) and added helper regression tests.maxAvgMsrejection.removeWithRetryhelper totest/helpers/fs-retry.tsand reused in affected tests.CODEX_AUTH_ROLEfallback fromadmintoviewer(least privilege) and updated tests.Behavior Change
CODEX_AUTH_ROLEis unset, authorization now defaults toviewerinstead ofadmin.How To Test
npm run typechecknpm run lint:tsnpm run lint:scriptsnpm test -- test/settings-hub-utils.test.ts test/accounts-edge.test.ts test/index-retry.test.ts test/quota-cache.test.ts test/check-runtime-budgets-script.test.ts test/benchmark-runtime-path-helpers.test.ts test/authorization.test.ts test/config-save.test.ts test/file-lock.test.ts test/utils.test.ts test/network.test.tsRisk / Rollout Notes
CODEX_AUTH_ROLE, setCODEX_AUTH_ROLE=adminexplicitly.Update 2026-03-05 (commit
6277c85)What changed
`test/telemetry-lock.test.ts`, `test/config-save.test.ts`, `test/fetch-helpers.test.ts`, `test/auth.test.ts`, `test/idempotency.test.ts`, `test/data-retention.test.ts`, `test/codex-manager-cli.test.ts`.

How to test
- `npm run typecheck`
- `npm run lint:ts`
- `npm run lint:scripts`
- `npm test -- test/telemetry-lock.test.ts test/telemetry.test.ts test/data-retention.test.ts test/shutdown.test.ts test/storage.test.ts test/auth.test.ts test/account-view.test.ts test/accounts-edge.test.ts test/codex-manager-cli.test.ts test/benchmark-runtime-path-helpers.test.ts test/index.test.ts test/index-retry.test.ts test/fetch-helpers.test.ts test/retry-governor.test.ts test/config-save.test.ts test/tool-utils.test.ts test/background-jobs.test.ts test/idempotency.test.ts test/settings-hub-utils.test.ts test/data-redaction.test.ts`

Risk / rollout notes
note: greptile review for oc-chatgpt-multi-auth. cite files like
`lib/foo.ts:123`. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary
This PR lands a focused production-hardening batch: least-privilege auth default (`viewer`), file-lock TOCTOU double-read hardening, `refreshAccessToken` timeout floor guard, `EBUSY` conflict recovery in account persistence, win32 `mode` omission in quota-cache writes, and extraction of a shared `removeWithRetry` test helper, all with targeted regression tests.

- `CODEX_AUTH_ROLE` unset now resolves to `"viewer"` instead of `"admin"`, closing a silent privilege escalation on unconfigured installs. Tests updated correctly.
- `refreshAccessToken` now uses the shared `resolveExchangeTimeoutMs` guard, preventing 0 / negative / non-finite callers from silently killing token rotation.
- `mode: 0o600` is correctly skipped on win32 in `quota-cache.ts`; account saves retry on `EBUSY`/`ECONFLICT` with a disk-merge strategy to survive antivirus-induced lock contention.
- `test/helpers/fs-retry.ts` was extracted, but `test/file-lock.test.ts` was not updated to import from it; it still carries a local duplicate, defeating the stated refactor goal.
- `mergeIntoLatestStorage` in `lib/accounts.ts` (lines 834–862) is indented one extra level, making nesting visually ambiguous despite being brace-correct; this may be caught by the biome formatter on CI.

Confidence Score: 4/5
The only open issues are the duplicate `removeWithRetry` in `test/file-lock.test.ts` and the indentation drift in `lib/accounts.ts`. Re-check `test/file-lock.test.ts` (duplicate helper) and `lib/accounts.ts` lines 834–862 (indentation drift in `mergeIntoLatestStorage`).

Important Files Changed
- Default changed from `admin` to `viewer` on line 31; the fallback for invalid values already returned `viewer`, so both paths are now consistently least-privilege. Tests updated accordingly.
- `refreshAccessToken` now routes its timeout through `resolveExchangeTimeoutMs`, clamping 0 / negative / non-finite values to a 1 000 ms floor; this matches the guard already present on `exchangeAuthorizationCode`.
- The `mergeIntoLatestStorage` and `persistStorageWithConflictRecovery` methods add `EBUSY`/`ECONFLICT` retry with disk-merge. The logic is correct, but the second half of `mergeIntoLatestStorage` (lines 834–862) has inconsistent extra indentation that makes nesting visually ambiguous and may be caught by the biome formatter.
- `mode: 0o600` is correctly omitted on win32 (Windows does not support POSIX permission bits and setting them can cause `EINVAL`). The temp-rename write path and file-lock acquisition look solid.
- Still uses a local `removeWithRetry` copy instead of importing from the newly extracted `test/helpers/fs-retry.ts`, defeating the stated refactor goal.
- Adds `RETRYABLE_REMOVE_CODES` and `removeWithRetry` with correct exponential backoff; the only issue is that `test/file-lock.test.ts` was not updated to consume it.
- Uses `"viewer"` as the default role; correctly exercises the least-privilege default, invalid-role fallback, break-glass bypass, ABAC read-only mode, deny-actions/commands, and interactive/idempotency-key guards.

Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[saveToDisk] --> B[buildStorageSnapshot]
    B --> C[persistStorageWithConflictRecovery]
    C --> D{saveAccounts}
    D -->|success| E[return]
    D -->|ECONFLICT / EBUSY| F{attempt < maxAttempts?}
    F -->|no| G[throw error]
    F -->|yes| H[loadAccounts from disk]
    H --> I[mergeIntoLatestStorage\nlatest + local]
    I --> J[sleep 20ms * 2^attempt]
    J --> D
    subgraph mergeIntoLatestStorage
        K[start with latest.accounts] --> L[claimIndex by token / accountId / email]
        L -->|found| M[mergeStoredAccount\nboth spread + rate-limit union]
        L -->|not found| N[append new account]
        M --> O[resolve activeIndexByFamily\nby local token lookup\nfall back to latest index]
        N --> O
    end
```

Last reviewed commit: `6277c85`
Context used:

- dashboard: "Every code change must explain how it defends against Windows filesystem concurrency bugs and ..." (source)