Conversation
Co-authored-by: Codex <noreply@openai.com>
Add timeout and abort boundary for authorization code exchange. Wire fetch timeout config into oauth exchange callsites. Add tests for network and timeout failure behavior. Co-authored-by: Codex <noreply@openai.com>
Add enterprise-grade hardening across runtime, CLI, storage, CI, and docs.

- Add cross-process file locking for settings/quota persistence
- Add at-rest secret encryption with rotation command and idempotency support
- Add RBAC/ABAC-style command authorization, JSON redaction, and retention policies
- Add background retry + dead-letter queue for async persistence failures
- Add list JSON pagination standard and schemaVersion contract updates
- Add CI security gates: secret scan, supply-chain/SCA/license checks, SBOM, required checks policy
- Add operations and incident response runbooks
- Add/extend tests for new security/reliability primitives and CLI behaviors

Validated with:
- npm run typecheck
- npm run lint
- npm run build && npm test
- npm run coverage
- npm run audit:ci
- npm run license:check
- npm run clean:repo:check

Co-authored-by: Codex <noreply@openai.com>
- migrate new envelopes to salted scrypt-derived AES keys
- retain legacy enc:v1 decryption compatibility
- add coverage for v2 and legacy v1 crypto paths
- clarify high-entropy 32-byte key requirements in docs

Co-authored-by: Codex <noreply@openai.com>
- dedupe OAuth timeout resolution in plugin flows
- close upstream abort-listener race in auth exchange
- add transient config-read retry handling for EBUSY/EPERM
- add timeout propagation and retry regression tests

Co-authored-by: Codex <noreply@openai.com>
- sanitize OAuth token error logging and payloads
- add deterministic upstream-abort and cleanup regressions
- harden config-save merge path against transient EBUSY/EPERM reads
- strengthen timeout wiring and EPERM regression tests

Co-authored-by: Codex <noreply@openai.com>
- audit break-glass authorization bypass events
- add multi-process contention regression for file locking
- cover new auth.break_glass audit action in tests

Co-authored-by: Codex <noreply@openai.com>
Prevent sensitive OAuth payload leakage during refresh HTTP failures by logging status and body length only, and return a generic safe error message when response text exists. Adds regression tests for auth behavior and sanitized logging. Co-authored-by: Codex <noreply@openai.com>
Introduce optional environment-driven ABAC constraints layered ahead of RBAC for auth actions and commands.

- add ABAC policy evaluation in authorization module
- pass command/interactive/idempotency context from CLI command routing
- enforce read authorization on read command paths consistently
- add unit and CLI coverage for ABAC deny and idempotency requirements
- document ABAC controls in config and operations runbooks

Co-authored-by: Codex <noreply@openai.com>
Add schema-aware config sanitization, introduce a resilient fetch primitive with timeout/retry instrumentation, apply it to remote prompt/update fetches, and add shutdown timeout boundaries with concurrent cleanup deduplication. Includes new regression tests for network retries, config sanitization, and shutdown timeout behavior. Co-authored-by: Codex <noreply@openai.com>
Introduce a shared account-view helper module and reuse it across the plugin entrypoint and CLI manager to remove duplicated formatting logic while preserving behavior. Add unit/regression tests to lock formatting and active-index fallback behavior. Co-authored-by: Codex <noreply@openai.com>
Address unresolved hardening review items across auth abort classification, config sanitization, network retry cancellation, update-check backoff, prompt refresh dedupe coverage, and shutdown timer cleanup. Adds deterministic regression tests for each scenario and aligns config-save expectations with schema-bound validation. Co-authored-by: Codex <noreply@openai.com>
Add shared helpers for per-family active-index map creation and bulk updates, then reuse them across plugin, CLI manager, and codex-cli sync flows to reduce duplicated state-mutation logic. Extend tests to lock switch and selection behavior for all model-family indexes. Co-authored-by: Codex <noreply@openai.com>
Extract active-index normalization into a shared helper and reuse it in plugin account-flow clamping, codex-cli sync, and doctor normalization paths. Add focused tests covering bounded normalization and empty-account map handling to preserve behavior across both clear and fill modes. Co-authored-by: Codex <noreply@openai.com>
Extract model-family rate-limit status rendering into account-view helpers and replace duplicate status mapping logic in the plugin status output paths. Add unit coverage for mixed limited/ok family labels to protect output behavior. Co-authored-by: Codex <noreply@openai.com>
Move active-index-by-family label rendering into shared account-view helpers and reuse it in both v2 and legacy codex status output paths. Add helper tests to lock 1-based labels and placeholder output for missing families. Co-authored-by: Codex <noreply@openai.com>
Move codex-remove index/family reindexing into shared active-index helpers to centralize the mutation policy and reduce inline branching in the plugin entrypoint. Add helper tests for in-range and out-of-range removal behavior. Co-authored-by: Codex <noreply@openai.com>
- cleanup stale/dead process lock artifacts before acquiring account lock
- ensure lock release always attempts fallback cleanup
- keep clearAccounts/saveTransactions serialized across file and memory locks

Co-authored-by: Codex <noreply@openai.com>
Introduce shared account-storage view helpers for creating empty v3 storage and cloning mutable working copies, then reuse them across plugin login/check flow, codex-cli sync, and manager reset paths. Add unit tests to lock clone isolation and family-index initialization behavior. Co-authored-by: Codex <noreply@openai.com>
- add storage identity fallback test for malformed .git metadata
- add OAuth poll abort test for deterministic close behavior
- add prompt refresh dedupe test for concurrent stale cache reads

Co-authored-by: Codex <noreply@openai.com>
Align local and CI validation through npm run verify commands, wire CI to verify:ci, and update contributor-facing validation docs/templates. Co-authored-by: Codex <noreply@openai.com>
- replace instantaneous fetch-count check with stable post-refresh assertion window
- verify fetch count remains fixed after refresh persistence completes

Co-authored-by: Codex <noreply@openai.com>
Add dev doctor and setup commands with fail-fast checks, document the contributor workflow, and wire a CI sanity check for doctor:dev. Co-authored-by: Codex <noreply@openai.com>
- replace microtask spin-loops with vi.waitFor polling
- keep final no-extra-fetch stability assertion after refresh

Co-authored-by: Codex <noreply@openai.com>
Add a canonical verify workflow shared by local and CI, integrate scoped Biome formatting checks, and document contributor/release runbooks to reduce setup and release friction. Co-authored-by: Codex <noreply@openai.com>
Add regression tests for doctor/setup scripts, make setup command execution deterministic under event-order races, add bounded Windows npm ci retries, and document contributor upgrade + Windows lock remediation guidance. Co-authored-by: Codex <noreply@openai.com>
Ensure account-storage mutations keep deterministic ordering while preserving the historical file-lock before in-process mutex acquisition sequence. Co-authored-by: Codex <noreply@openai.com>
Drop pre-acquire dead-process lock cleanup and only run fallback lock-file deletion when lock.release() fails. Also align lock-order comment and add debug observability for fallback cleanup failures. Co-authored-by: Codex <noreply@openai.com>
- dedupe safe audit resource sanitization in plugin fetch flow
- surface retention retry exhaustion and cover persistent EBUSY regression
- share test removeWithRetry helper and add license policy parsing regressions

Co-authored-by: Codex <noreply@openai.com>
- add recovery drill timeout and failure notifier hook
- clarify upgrade runbook npm script execution context
- harden secret-scan allowlist regression fixtures and test isolation
- reuse shared removeWithRetry helper in audit forwarder tests

Co-authored-by: Codex <noreply@openai.com>
Rotate immediately on 429 responses and bootstrap runtime selection from persisted quota cache (accountId-first, normalized email fallback) with a 30m lazy TTL and env escape hatch. Co-authored-by: Codex <noreply@openai.com>
- block exhausted accounts until the furthest reset window
- make quota bootstrap loading best-effort on cache failures
- add regressions for email fallback, concurrency, refresh races, and EPERM/EBUSY

Co-authored-by: Codex <noreply@openai.com>
- avoid cross-model bootstrap blocking within the same family
- align scheduler keys with the model that produced cached quota state
- make concurrency/race tests deterministic with explicit barriers

Co-authored-by: Codex <noreply@openai.com>
- simplify bootstrap apply path to use the loader's best-effort fallback
- keep error handling centralized in loadQuotaCacheForBootstrap

Co-authored-by: Codex <noreply@openai.com>
Add an explicit ceiling-exhausted guard and integration coverage to ensure exhausted absolute ceiling stops immediately instead of spinning into retry-limit handling. Co-authored-by: Codex <noreply@openai.com>
- avoid over-blocking when quota reset timestamps are known but elapsed
- add failure cooldown with retry interval to prevent load hammering while allowing recovery
- normalize model ids for bootstrap exact matching using provider-prefix stripping + lowercase
- add regression coverage for model-id variants, elapsed resets, and cooldown recovery behavior

Co-authored-by: Codex <noreply@openai.com>
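The model-id normalization described in the commit above (provider-prefix stripping plus lowercasing) can be sketched roughly as follows; the helper name and the exact set of prefix shapes handled are assumptions, not the repository's real implementation:

```typescript
// Hypothetical sketch of the normalization named in the commit message.
// The real helper may handle additional provider-prefix variants.
function normalizeModelId(id: string): string {
  // strip a single leading "provider/" segment, then lowercase for exact matching
  return id.replace(/^[a-z0-9_-]+\//i, "").toLowerCase();
}
```

With this, `"openai/GPT-5"` and `"gpt-5"` would compare equal for bootstrap exact matching.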
- use shared quota schedule key builder in bootstrap, runtime deferral, and stream fallback updates
- add regression in index-retry for provider-prefixed model-id deferral matching
- tighten recovery assertion to verify rotation target account deterministically

Co-authored-by: Codex <noreply@openai.com>
- harden recovery drill workflow concurrency and webhook payload handling
- tighten forwarder harness determinism for lock/timeout edge cases
- dedupe test cleanup helper and make slo report module import test-compatible

Co-authored-by: Codex <noreply@openai.com>
…s-to-dev Co-authored-by: Codex <noreply@openai.com>
…ved-prs-to-dev Co-authored-by: Codex <noreply@openai.com>
…-prs-to-dev Conflict resolution: kept the verify-first DX workflow while preserving PR #32 enterprise hardening scripts, docs, workflows, and runtime changes. Co-authored-by: Codex <noreply@openai.com>
Conflict resolution: applied the stacked hardening follow-up on top of the merged PR #32 runtime and test surfaces while preserving the earlier DX workflow choices. Co-authored-by: Codex <noreply@openai.com>
…ion/all-approved-prs-to-dev
Co-authored-by: Codex <noreply@openai.com>
Important: Review skipped. Too many files! This PR contains 157 files, which is 7 over the limit of 150.

⚙️ Run configuration: Organization UI; Review profile: ASSERTIVE; Plan: Pro; Run ID:

⛔ Files ignored due to path filters (1)
📒 Files selected for processing (157)
```ts
await queueAppend(async () => {
  await ensureLogDir();
  rotateLogsIfNeeded();
  const line = `${JSON.stringify(entry)}\n`;
  await fs.appendFile(getTelemetryPath(), line, "utf8");
});
```
Synchronous I/O inside async queue blocks the event loop
rotateLogsIfNeeded() (called at line 256) uses statSync, existsSync, unlinkSync, and renameSync. all four are synchronous and are called directly inside the queueAppend async task. on windows, antivirus/filesystem watchers routinely hold file locks on *.jsonl files for dozens of milliseconds; a blocking statSync or renameSync inside node's microtask queue will freeze the entire event loop for that duration, stalling all pending i/o — including in-flight token refresh or account save operations.
rotateLogsIfNeeded should be converted to an async function using fs.stat, fs.rename, and fs.unlink (already imported as promises as fs). this also creates a testable regression point — there is currently no vitest coverage for concurrent log rotation while a save is in flight.
```suggestion
await queueAppend(async () => {
  await ensureLogDir();
  await rotateLogsIfNeeded();
  const line = `${JSON.stringify(entry)}\n`;
  await fs.appendFile(getTelemetryPath(), line, "utf8");
});
```
(rotateLogsIfNeeded itself should be converted to async using fs.stat, fs.rename, and fs.unlink.)
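A sketch of the async conversion the comment asks for; `MAX_LOG_BYTES`, the `.1` rotated-file naming, and taking the log path as a parameter are assumptions for illustration, not the module's real API:

```typescript
import { promises as fs } from "node:fs";

// Assumed rotation threshold; the real module's limit may differ.
const MAX_LOG_BYTES = 1024 * 1024;

// Async replacement for the sync rotation: fs.stat/fs.unlink/fs.rename instead
// of statSync/unlinkSync/renameSync, so a slow Windows filesystem call no
// longer blocks the event loop while it waits.
async function rotateLogsIfNeeded(logPath: string): Promise<void> {
  let size: number;
  try {
    size = (await fs.stat(logPath)).size; // missing file: nothing to rotate
  } catch {
    return;
  }
  if (size <= MAX_LOG_BYTES) return;
  const rotated = `${logPath}.1`;
  await fs.unlink(rotated).catch(() => {}); // drop the previous rotated copy
  await fs.rename(logPath, rotated); // move the live log aside
}
```

Because every call site already runs inside the awaited `queueAppend` task, swapping in `await rotateLogsIfNeeded(...)` preserves ordering while removing the blocking window.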
Prompt To Fix With AI
This is a comment left during a code review.
Path: lib/telemetry.ts
Line: 254-259
How can I resolve this? If you propose a fix, please make it concise.

```ts
const flagged = await loadFlaggedAccounts();
const flaggedCount = flagged.accounts.length;
if (flaggedCount > 0) {
  await saveFlaggedAccounts(flagged);
}
```
rotateStoredSecretEncryption has an unguarded TOCTOU race on flagged accounts
the main accounts block is correctly serialized via withAccountStorageTransaction (which holds withStorageSerializedFileLock), but the flagged-accounts rotate (loadFlaggedAccounts → saveFlaggedAccounts) runs entirely outside any lock. two concurrent callers or a concurrent saveFlaggedAccounts from elsewhere will interleave: the second writer's load snapshot is already stale before it calls save, silently overwriting the first writer's rotation result. this is a real risk on windows where cli restarts can overlap.
the saveFlaggedAccounts path already supports atomic temp-file-rename writes, but without serialization the load→save window is unguarded. there is no regression test for concurrent rotateStoredSecretEncryption invocations in test/storage.test.ts or test/storage-flagged.test.ts.
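The real fix would wrap the `loadFlaggedAccounts` → `saveFlaggedAccounts` window in the same `withStorageSerializedFileLock` the accounts path already uses. The in-memory stand-in below only sketches the serialization pattern (a promise-chain mutex); the store shape and helper names here are illustrative, not the real module:

```typescript
// In-memory stand-in for the flagged-accounts store; illustrative only.
type Flagged = { accounts: string[] };

let store: Flagged = { accounts: ["seed"] };
let chain: Promise<void> = Promise.resolve();

// Promise-chain mutex: each caller waits for the previous load→save to finish.
function withFlaggedLock<T>(fn: () => Promise<T>): Promise<T> {
  const run = chain.then(fn);
  chain = run.then(
    () => undefined,
    () => undefined, // keep the chain alive even if a caller throws
  );
  return run;
}

async function rotateFlagged(tag: string): Promise<void> {
  await withFlaggedLock(async () => {
    const snapshot = { accounts: [...store.accounts] }; // load
    await new Promise((r) => setTimeout(r, 5)); // simulated re-encryption work
    snapshot.accounts.push(tag); // rotate/mutate
    store = snapshot; // save: no concurrent writer can interleave here
  });
}
```

Without the mutex, two concurrent `rotateFlagged` calls would both snapshot the original store and the second save would silently drop the first rotation — exactly the TOCTOU window described above.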
Prompt To Fix With AI
This is a comment left during a code review.
Path: lib/storage.ts
Line: 2295-2298
How can I resolve this? If you propose a fix, please make it concise.

```ts
if (DISK_PREFERRED_MERGE_KEYS.has(rawKey) && currentValue !== undefined) {
  continue;
}
```
DISK_PREFERRED_MERGE_KEYS can discard a freshly-refreshed in-memory token during conflict recovery
DISK_PREFERRED_MERGE_KEYS contains "refreshToken" and "accessToken". in mergeStoredAccountRecords the guard currentValue !== undefined means the disk copy always wins. the problem: mergeIntoLatestStorage is called during conflict recovery (after a successful refreshAccessToken that updated the in-memory account but before the first save succeeded). if the disk has the old token and the in-memory state has the new one, the merge discards the fresh token and writes the stale one back. the next request will then fail auth with a 401 using the now-invalid token.
the intent is to prefer stable long-lived disk state, but refreshToken rotates on every refresh call; disk-preferred semantics are wrong for this field in the conflict-recovery merge path. consider comparing token expiry (expiresAt) and preferring the newer token, or at minimum not including "refreshToken" in DISK_PREFERRED_MERGE_KEYS.
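An expiry-aware merge along those lines could look like the sketch below; the record shape and the treatment of missing `expiresAt` values are assumptions, not the real `mergeStoredAccountRecords` code:

```typescript
// Illustrative token-field shape; the real account record carries more fields.
interface TokenRecord {
  accessToken?: string;
  refreshToken?: string;
  expiresAt?: number; // epoch milliseconds
}

// Prefer whichever side holds the later-expiring token instead of letting the
// disk copy win unconditionally, so a freshly-refreshed in-memory token
// survives a conflict-recovery merge.
function mergeTokenFields(disk: TokenRecord, memory: TokenRecord): TokenRecord {
  const diskExpiry = disk.expiresAt ?? 0;
  const memExpiry = memory.expiresAt ?? 0;
  const newer = memExpiry > diskExpiry ? memory : disk;
  return {
    accessToken: newer.accessToken ?? disk.accessToken ?? memory.accessToken,
    refreshToken: newer.refreshToken ?? disk.refreshToken ?? memory.refreshToken,
    expiresAt: Math.max(diskExpiry, memExpiry) || undefined,
  };
}
```

The minimal alternative the comment mentions — dropping `"refreshToken"` from `DISK_PREFERRED_MERGE_KEYS` — avoids the stale-token write-back without needing expiry comparison at all.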
Prompt To Fix With AI
This is a comment left during a code review.
Path: lib/accounts.ts
Line: 863-864
How can I resolve this? If you propose a fix, please make it concise.
Merged PRs
Summary
Merges `integration/all-approved-prs-to-dev` into `dev`. A `package.json` formatting normalization was required to clear the existing Biome formatting gate.

Validation
- npm run typecheck
- npm test
- npm run build
- npm run lint
- npm run clean:repo:check

Conflict Resolutions
note: greptile review for oc-chatgpt-multi-auth. cite files like `lib/foo.ts:123`. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary
this integration PR lands ~12 approved branches onto `dev`: storage at-rest encryption (AES-256-GCM with scrypt KDF), a dual-lock hierarchy (in-process mutex + cross-process file lock + OCC revision check) for account saves, oauth timeout hardening, a local telemetry ledger, retry-governor controls, quota-based account rotation, and several CI/DX additions. the breadth of changes is large but the individual pieces are well-scoped and the conflict resolutions documented in the PR description are accurate.

three issues require attention before merging to production:

1. blocking sync I/O in telemetry log rotation — `rotateLogsIfNeeded()` calls `statSync`/`renameSync`/`unlinkSync` inside the async `queueAppend` chain (`lib/telemetry.ts:256`), blocking the node event loop. on windows, AV-held file locks can make these stall for seconds, freezing token refresh and account save operations. needs to be converted to async `fs.stat`/`fs.rename`/`fs.unlink`. no vitest coverage exists for concurrent rotation during a save.
2. unguarded TOCTOU race in `rotateStoredSecretEncryption` — the main accounts rotation is correctly serialized via `withAccountStorageTransaction`, but the flagged-accounts load→save block (`lib/storage.ts:2295-2298`) runs with no lock. two concurrent invocations or a concurrent `saveFlaggedAccounts` call can cause a silent overwrite of the just-rotated ciphertext.
3. `DISK_PREFERRED_MERGE_KEYS` discards freshly-refreshed tokens during conflict recovery — including `"refreshToken"` in the disk-wins set (`lib/accounts.ts:863`) means that when a concurrent save conflict triggers the merge path, a token refreshed in-memory since the last successful write gets replaced by the older disk copy, causing the next request to auth-fail with a 401.

additionally, `isPidAlive` via `process.kill(pid, 0)` is a no-op on windows (always `EPERM` for cross-process calls), which defeats the pid-based stale lock reclaim path in `lib/file-lock.ts`; the age-only fallback is the only working path on windows and this should be documented and tested.

Confidence Score: 2/5
Important Files Changed
Sequence Diagram
```mermaid
sequenceDiagram
    participant Caller
    participant AccountManager
    participant withStorageSerializedFileLock
    participant withAccountFileMutex
    participant withAccountFileLock
    participant withStorageLock
    participant saveAccountsUnlocked
    participant withStorageSaveFileLock
    participant disk as Disk (accounts.json)
    Caller->>AccountManager: saveToDisk()
    AccountManager->>AccountManager: buildStorageSnapshot()
    AccountManager->>AccountManager: persistStorageWithConflictRecovery()
    AccountManager->>withStorageSerializedFileLock: path, fn
    withStorageSerializedFileLock->>withAccountFileMutex: fn
    withAccountFileMutex->>withAccountFileLock: path (.queue.lock), fn
    withAccountFileLock->>withStorageLock: fn
    withStorageLock->>saveAccountsUnlocked: storage
    saveAccountsUnlocked->>withStorageSaveFileLock: path (.lock), task
    withStorageSaveFileLock->>disk: OCC revision check (SHA-256)
    alt revision matches
        withStorageSaveFileLock->>disk: write WAL, write temp, rename
        withStorageSaveFileLock-->>saveAccountsUnlocked: ok
    else ECONFLICT
        withStorageSaveFileLock-->>saveAccountsUnlocked: throws ECONFLICT
        saveAccountsUnlocked-->>withStorageLock: throws
        withStorageLock-->>AccountManager: throws
        AccountManager->>disk: loadAccounts() (reload)
        AccountManager->>AccountManager: mergeIntoLatestStorage(latest, local)
        AccountManager->>AccountManager: retry persistStorageWithConflictRecovery
    end
```

Last reviewed commit: afda35a
Context used:

- dashboard
  - What: Every code change must explain how it defends against Windows filesystem concurrency bugs and ... (source)