Enterprise hardening baseline: security, reliability, CI gates, and runbooks #32
Conversation
Add enterprise-grade hardening across runtime, CLI, storage, CI, and docs.

- Add cross-process file locking for settings/quota persistence
- Add at-rest secret encryption with rotation command and idempotency support
- Add RBAC/ABAC-style command authorization, JSON redaction, and retention policies
- Add background retry + dead-letter queue for async persistence failures
- Add list JSON pagination standard and schemaVersion contract updates
- Add CI security gates: secret scan, supply-chain/SCA/license checks, SBOM, required checks policy
- Add operations and incident response runbooks
- Add/extend tests for new security/reliability primitives and CLI behaviors

Validated with:
- npm run typecheck
- npm run lint
- npm run build && npm test
- npm run coverage
- npm run audit:ci
- npm run license:check
- npm run clean:repo:check

Co-authored-by: Codex <noreply@openai.com>
Warning: Rate limit exceeded

⌛ How to resolve this issue? After the wait time has elapsed, a review can be triggered again. We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work? CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.

⚙️ Run configuration: Organization UI · Review profile: ASSERTIVE · Plan: Pro

📒 Files selected for processing (46)
📝 Walkthrough

Adds wide-ranging security, auditing, and reliability features: authorization/ABAC (…)

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant cli as CLI
    participant auth as authorization (lib/authorization.ts)
    participant idemp as idempotency (lib/idempotency.ts)
    participant storage as storage (lib/storage.ts)
    participant audit as audit (lib/audit.ts)
    participant out as stdout

    cli->>auth: ensureAuthorized(secrets:rotate)
    auth-->>cli: {allowed}
    cli->>idemp: checkAndRecordIdempotencyKey(scope="rotate", key=K)
    alt replayed
        idemp-->>cli: {replayed:true}
        cli->>audit: emit(EVENT=rotate.replay)
        cli->>out: "replay: no-op"
    else not replayed
        idemp-->>cli: {replayed:false}
        cli->>audit: emit(EVENT=rotate.start)
        cli->>storage: rotateStoredSecretEncryption(idempotencyKey=K)
        storage-->>audit: emit(EVENT=rotate.complete)
        storage-->>cli: {accounts:n, flaggedAccounts:m}
        cli->>out: JSON/PLAINTEXT result
    end
```
Review notes
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks: ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
- migrate new envelopes to salted scrypt-derived AES keys
- retain legacy enc:v1 decryption compatibility
- add coverage for v2 and legacy v1 crypto paths
- clarify high-entropy 32-byte key requirements in docs

Co-authored-by: Codex <noreply@openai.com>
- audit break-glass authorization bypass events
- add multi-process contention regression for file locking
- cover new auth.break_glass audit action in tests

Co-authored-by: Codex <noreply@openai.com>
Introduce optional environment-driven ABAC constraints layered ahead of RBAC for auth actions and commands.

- add ABAC policy evaluation in authorization module
- pass command/interactive/idempotency context from CLI command routing
- enforce read authorization on read command paths consistently
- add unit and CLI coverage for ABAC deny and idempotency requirements
- document ABAC controls in config and operations runbooks

Co-authored-by: Codex <noreply@openai.com>
Additional Comments (1)
Co-authored-by: Codex <noreply@openai.com>
@coderabbitai review

Re-triggering review to clear the transient rate-limit status check failure.
✅ Actions performed: Review triggered.
Re-requesting CodeRabbit review: the current failure is "CodeRabbit: Review rate limit exceeded" while other checks are green.
@coderabbitai review
✅ Actions performed: Review triggered.
Follow-up tracking: external blocker recorded in #44. Current failing context remains:
Actions already taken:
Next required step is external quota reset/increase, then rerun CodeRabbit.
✅ Actions performed: Review triggered.
Actionable comments posted: 30
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
docs/reference/settings.md (1)
131-157: ⚠️ Potential issue | 🟡 Minor

upgrade.md needs to consolidate new security config in the rollout checklist.
The env vars (CODEX_AUTH_ENCRYPTION_KEY, CODEX_AUTH_PREVIOUS_ENCRYPTION_KEY, CODEX_AUTH_ROLE, CODEX_AUTH_BREAK_GLASS, CODEX_AUTH_ABAC_*, CODEX_AUTH_REDACT_JSON_OUTPUT) are documented in configuration.md and reference/settings.md, and the rotate-secrets workflow is in runbooks/operations.md and reference/commands.md, but upgrade.md itself skips the encryption/ABAC setup steps and npm hardening scripts (audit:ci, license:check) that should be part of the migration process. Add a new section under "Migration Checklist" that calls out:

- configure encryption keys if enabling at-rest secret storage
- set ABAC policy (CODEX_AUTH_ROLE baseline, denial rules) for your security posture
- run hardening checks: `npm run audit:ci && npm run license:check && npm run clean:repo:check`
- validate key rotation with `codex auth rotate-secrets --json` before promoting to production

Windows concurrency edge cases for atomic file writes and lock retries are already covered in test/unified-settings.test.ts and test/storage.test.ts, so persistence safety is good.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/reference/settings.md` around lines 131 - 157, Add a new "Encryption & ABAC" subsection under Migration Checklist in upgrade.md that consolidates the security config and rollout steps: instruct operators to configure CODEX_AUTH_ENCRYPTION_KEY and CODEX_AUTH_PREVIOUS_ENCRYPTION_KEY when enabling at-rest secret storage, set baseline CODEX_AUTH_ROLE and ABAC denial rules (CODEX_AUTH_ABAC_READ_ONLY, CODEX_AUTH_ABAC_DENY_ACTIONS, CODEX_AUTH_ABAC_DENY_COMMANDS, CODEX_AUTH_ABAC_REQUIRE_INTERACTIVE, CODEX_AUTH_ABAC_REQUIRE_IDEMPOTENCY_KEY), enable CODEX_AUTH_BREAK_GLASS and CODEX_AUTH_REDACT_JSON_OUTPUT as needed, run hardening checks with npm run audit:ci && npm run license:check && npm run clean:repo:check, and validate rotation with codex auth rotate-secrets --json before promoting to production; reference the existing docs (reference/settings.md, configuration.md), runbooks (runbooks/operations.md), and tests (test/unified-settings.test.ts, test/storage.test.ts) as follow-up verification.

lib/unified-settings.ts (2)
294-299: ⚠️ Potential issue | 🟠 Major

Cross-process lost-update race still exists in read-modify-write flows.

lib/unified-settings.ts:294-299 and lib/unified-settings.ts:338-342 read the settings record before the cross-process write lock is taken, so two processes can read the same base document and overwrite each other's section updates. The lock needs to wrap the full read-modify-write transaction, not only rename/write. As per coding guidelines: "lib/**: focus on ... concurrency."

Also applies to: 338-342
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/unified-settings.ts` around lines 294 - 299, The read-modify-write race occurs because readSettingsRecordAsync() is called before acquiring the cross-process lock in saveUnifiedPluginConfig (and the similar block at lines 338-342); move the read inside the enqueueSettingsWrite(async () => { ... }) callback so the lock acquired by enqueueSettingsWrite wraps the entire read-modify-write sequence, i.e., call readSettingsRecordAsync() only after entering the enqueueSettingsWrite callback, perform the record.pluginConfig update, then call writeSettingsRecordAsync(); apply the same change to the other function/block referenced (lines 338-342) and keep using enqueueSettingsWrite, readSettingsRecordAsync, and writeSettingsRecordAsync as the unique identifiers.
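The shape of that fix can be sketched as follows. This is an illustrative stand-in, not the repo's code: `enqueueSettingsWrite` here is a minimal in-process serializer, and the `read`/`write` callbacks stand in for `readSettingsRecordAsync`/`writeSettingsRecordAsync`. The point is that the read happens only after entering the critical section.

```typescript
type SettingsRecord = { pluginConfig: Record<string, unknown> };

// Minimal write serializer: tasks run one at a time, in order.
let queue: Promise<unknown> = Promise.resolve();
function enqueueSettingsWrite<T>(task: () => Promise<T>): Promise<T> {
  const next = queue.then(task, task);
  queue = next.catch(() => {}); // a failed task must not poison the queue
  return next;
}

// Read-modify-write performed entirely inside the critical section, so two
// concurrent updates cannot both read the same base document.
async function saveUnifiedPluginConfig(
  read: () => Promise<SettingsRecord>,
  write: (r: SettingsRecord) => Promise<void>,
  update: Record<string, unknown>,
): Promise<void> {
  await enqueueSettingsWrite(async () => {
    const record = await read(); // read AFTER acquiring the critical section
    record.pluginConfig = { ...record.pluginConfig, ...update };
    await write(record);
  });
}
```

In the real module the same pattern would additionally hold the cross-process file lock for the whole callback, not just the final rename.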
130-224: ⚠️ Potential issue | 🟠 Major

Add vitest regressions for the new lock path on Windows-style contention.

lib/unified-settings.ts:130-224 introduces retry/backoff and cross-process lock behavior, but the provided tests do not show direct regressions for unified settings under contention (including EBUSY/EPERM rename pressure). Add targeted vitest coverage for this path. As per coding guidelines: "lib/**: verify every change cites affected tests (vitest)" and "test/**: demand regression cases that reproduce concurrency bugs ... and windows filesystem behavior."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/unified-settings.ts` around lines 130 - 224, Add vitest regression tests that simulate Windows-style rename contention for the unified settings code paths: write tests targeting writeSettingsRecordAsync (and the sync counterpart that uses acquireFileLockSync) in lib/unified-settings.ts by mocking fs.rename/renameSync to throw transient EBUSY/EPERM errors for the first N attempts and succeed thereafter, assert that the function eventually succeeds, that temp files are removed on failure (unlink/unlinkSync called), and that the lock (acquireFileLock / acquireFileLockSync via UNIFIED_SETTINGS_LOCK_PATH) is always released; also include a test where errors are non-retryable (isRetryableFsError returns false) to ensure the functions propagate the error. Ensure tests use vitest mocks/stubs and cover backoff/retry behavior and cleanup.

docs/privacy.md (1)
83-111: ⚠️ Potential issue | 🟠 Major

Cleanup commands miss the DLQ artifact.

docs/privacy.md:83-111 does not remove background-job-dlq.jsonl, even though it is listed as canonical local data and managed in retention logic (lib/data-retention.ts:112-115). Add it to both bash and powershell cleanup snippets.

Proposed docs patch:

```diff
 # bash
 rm -f ~/.codex/multi-auth/openai-codex-flagged-accounts.json
 rm -f ~/.codex/multi-auth/quota-cache.json
+rm -f ~/.codex/multi-auth/background-job-dlq.jsonl

 # powershell
 Remove-Item "$HOME\.codex\multi-auth\openai-codex-flagged-accounts.json" -Force -ErrorAction SilentlyContinue
 Remove-Item "$HOME\.codex\multi-auth\quota-cache.json" -Force -ErrorAction SilentlyContinue
+Remove-Item "$HOME\.codex\multi-auth\background-job-dlq.jsonl" -Force -ErrorAction SilentlyContinue
```

As per coding guidelines: "docs/**: keep README, SECURITY, and docs consistent with actual CLI flags and workflows."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/privacy.md` around lines 83 - 111, Add removal of the background-job-dlq.jsonl artifact to the cleanup snippets in docs/privacy.md so docs match the actual retention logic in lib/data-retention.ts (lines referencing background-job-dlq.jsonl). Specifically, update the Bash snippet to rm -f ~/.codex/multi-auth/background-job-dlq.jsonl (and any override-root conditional lines that mirror other removals) and update the PowerShell snippet to Remove-Item "$HOME\.codex\multi-auth\background-job-dlq.jsonl" -Force -ErrorAction SilentlyContinue (and the corresponding $env:CODEX_MULTI_AUTH_DIR override handling).

lib/storage.ts (1)
1446-1468: ⚠️ Potential issue | 🟠 Major

Flagged-account writes are not protected by cross-process file locking.

lib/storage.ts:1446-1468 uses withStorageLock only (in-process). Concurrent CLI processes can still race on saveFlaggedAccounts, causing lost updates and rename conflicts on Windows-heavy workflows. Use a file lock (same pattern as accounts) for flagged-account read/modify/write paths, and add a contention vitest regression.

As per coding guidelines: "lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/storage.ts` around lines 1446 - 1468, saveFlaggedAccounts currently only uses the in-process withStorageLock so concurrent CLI processes can race; update saveFlaggedAccounts to acquire a cross-process file lock (same pattern used by the accounts read/modify/write helpers) around the getFlaggedAccountsPath operations, perform the read/normalize/serialize/write to a temp file then atomic-rename under that file lock, and ensure cleanup on failure; additionally add a vitest regression that spawns concurrent saveFlaggedAccounts calls to reproduce and prevent lost updates (simulate EBUSY/rename contention and retry/backoff for EBUSY/429), reuse the same lock acquisition/retry logic from the accounts implementation, and review log.error calls in saveFlaggedAccounts to avoid leaking emails/tokens when logging errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/ci.yml:
- Around line 146-147: The CI step "Run smoke tests" currently omits key Windows
concurrency/regression tests; update .github/workflows/ci.yml so that the
Windows matrix (windows-latest) runs the lock and background persistence smoke
tests by including test/file-lock.test.ts and test/background-jobs.test.ts in
the "npm run test" invocation (or add a separate run step for those files) for
the windows job; ensure the job that invokes "Run smoke tests" or the matrix
entry that targets windows runs these specific tests so fs/concurrency and
background-job regressions are caught on Windows.
In `@docs/configuration.md`:
- Line 73: The table cell for CODEX_AUTH_ROLE contains raw pipe characters which
break Markdown table columns; replace the inner pipes with escaped HTML entities
(e.g., change admin|operator|viewer to admin&#124;operator&#124;viewer) or wrap
the entire cell in an HTML <code> element to preserve the literal pipes, and
then verify the values match the actual CLI/constant in lib/secrets-crypto.ts
(the CODEX_AUTH_ROLE enumeration/strings) so docs/configuration.md stays
consistent with runtime behavior.
In `@docs/reference/commands.md`:
- Line 58: Update the recovery workflow example for the rotate-secrets CLI to
include the documented --idempotency-key flag so the example matches the flag
table; change the non-idempotent rotate-secrets invocation in
docs/reference/commands.md (the recovery example and the nearby example at lines
~115-116) to include a stable idempotency token (e.g. --idempotency-key
"$CI_JOB_ID" or a generated UUID) and ensure the example text mentions
re-running safely; verify this aligns with the rotate-secrets handling in the
background job logic (see the rotate-secrets usage in lib/index.ts and the
idempotency handling in the background job code around the rotate job) so docs
and implementation are consistent.
In `@docs/runbooks/incident-response.md`:
- Around line 28-32: Update the "rotate-secrets" containment step in the
incident runbook to be idempotent by appending the CLI flag "--idempotency-key
<incident-id>" to the "codex auth rotate-secrets" invocation so retries won't
cause duplicate side effects; mirror this flag usage in documentation and
mention the change in upgrade notes and npm scripts as appropriate, and
cross-reference the implementation locations (lib/background-jobs.ts function
handling background job retries and lib/index.ts CLI command registration) to
ensure the flag name and behavior match the actual code.
In `@index.ts`:
- Around line 270-283: The wrapper exchangeAuthorizationCodeWithRateLimit
currently lets exceptions from checkAuthRateLimit and exchangeAuthorizationCode
bubble up instead of returning a TokenResult failure; update
exchangeAuthorizationCodeWithRateLimit to catch errors thrown by
checkAuthRateLimit, recordAuthAttempt, and exchangeAuthorizationCode and convert
them into the appropriate TokenResult failure object (preserving error
message/details), while still calling resetAuthRateLimit on success; apply the
same pattern to the other wrappers referenced (the oauth paths at the locations
noted) and add Vitest regressions covering (1) blocked-by-rate-limit and (2)
network/error thrown from token exchange to ensure the functions always return a
TokenResult rather than throwing.
- Around line 228-232: The startup call to enforceDataRetention() runs
concurrently with storage/account startup and must be serialized using the same
storage lock/transaction used for account writes; modify the startup flow to
acquire the storage lock/transaction wrapper (the same mechanism used for
account writes) before invoking enforceDataRetention(), and update
lib/data-retention.ts to expose a lock-aware entry (e.g.,
enforceDataRetentionUnderLock or accept a transaction/lock token) so the cleanup
runs inside the storage lock; add a vitest regression (e.g.,
data-retention.spec.ts) that simulates Windows-style file-handle contention by
holding the storage lock/handle during startup and asserting that
enforceDataRetention either waits for the lock or fails gracefully without
eperm/ebusy errors to verify serialization.
In `@lib/accounts.ts`:
- Around line 771-787: The current flow awaits this.pendingSave and if it
rejects the exception prevents scheduling the new debounced save; change the
logic in the block around pendingSave/runBackgroundJobWithRetry so a failed
prior save does not block the next one — e.g., replace awaiting the prior
promise with safe awaiting that swallows errors (use await
this.pendingSave.catch(() => {}) or wrap in try { await this.pendingSave } catch
{ /* ignore */ }) before assigning this.pendingSave =
runBackgroundJobWithRetry({...}). Keep the existing finally that nulls
this.pendingSave and ensure saveToDisk() is still invoked via
runBackgroundJobWithRetry; add a regression test in test/accounts*.test.ts that
simulates a rejecting prior pendingSave and asserts the subsequent save runs.
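The "failed prior save must not block the next one" pattern can be sketched with a small scheduler class. This is an illustration of the promise-chaining fix, not the repo's `AccountsManager`; `schedule` stands in for the debounced save path and `saveToDisk` is passed in for testability.

```typescript
// Sketch: each new save chains behind the previous one, but a rejection from
// the prior save is swallowed so it cannot prevent the next save from running.
class SaveScheduler {
  private pendingSave: Promise<void> | null = null;

  schedule(saveToDisk: () => Promise<void>): Promise<void> {
    const prior = this.pendingSave ?? Promise.resolve();
    const next = prior
      .catch(() => {}) // ignore a prior failure; it was already reported
      .then(saveToDisk)
      .finally(() => {
        // Clear only if no newer save has replaced us in the meantime.
        if (this.pendingSave === next) this.pendingSave = null;
      });
    this.pendingSave = next;
    return next;
  }
}
```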
- Around line 774-784: Add a regression test in test/accounts.test.ts that
verifies runBackgroundJobWithRetry wrapped saves survive transient fs errors:
stub/mock the underlying saveAccounts (or AccountsStorage.saveAccounts) called
by AccountsManager.saveToDisk so it throws an EBUSY error on the first two calls
and returns successfully on the third, then invoke the debounced save path that
sets this.pendingSave (i.e., call the method that triggers saveToDisk via
runBackgroundJobWithRetry or directly call the code path that enqueues the job),
wait for the background job to finish (await pendingSave or flush
timers/promises), and assert that saveAccounts was called at least three times
and that the final attempt succeeded; ensure you reference
runBackgroundJobWithRetry, saveToDisk, saveAccounts and the pendingSave behavior
in the test setup.
In `@lib/authorization.ts`:
- Around line 30-36: getRoleFromEnv currently defaults unknown CODEX_AUTH_ROLE
values to "admin", which is fail-open; change its behavior to fail-closed by
returning "viewer" (or an explicit deny role if you prefer) for any
unrecognized/invalid role after trimming/lowercasing inside getRoleFromEnv, and
update the logic in that function to only return "operator"|"viewer"|"admin" for
valid inputs. Add a vitest regression in test/authorization.test.ts that sets
CODEX_AUTH_ROLE to an invalid string and asserts getRoleFromEnv() returns
"viewer" (or the chosen deny role) to prevent silent privilege escalation;
ensure the test resets env state afterward and does not log secrets.
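The fail-closed parsing described above can be sketched as follows; the role names match the review discussion, but the helper body is an assumed illustration rather than the repo's exact implementation.

```typescript
type AuthRole = "admin" | "operator" | "viewer";

// Sketch: parse CODEX_AUTH_ROLE, trimming and lowercasing, and fail CLOSED:
// any unrecognized or missing value yields the least-privileged role.
function getRoleFromEnv(
  env: Record<string, string | undefined> = process.env,
): AuthRole {
  const raw = (env.CODEX_AUTH_ROLE ?? "").trim().toLowerCase();
  if (raw === "admin" || raw === "operator" || raw === "viewer") {
    return raw;
  }
  // Unknown values deny by default instead of silently granting admin.
  return "viewer";
}
```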
In `@lib/background-jobs.ts`:
- Around line 43-47: The default retry policy in isRetryableByDefault currently
only checks errno-style codes; update isRetryableByDefault to also treat HTTP
429 (rate-limit) as retryable by accepting errors with a numeric
status/statusCode === 429 or an object shape indicating HTTP response code, and
ensure it still returns true for existing errno values (EBUSY, EPERM, EAGAIN,
ETIMEDOUT); add a vitest regression in test/background-jobs.test.ts that asserts
jobs failing with a 429 error are retried (and subsequently hit backoff) and
that EBUSY scenarios still retry, and when adding tests ensure any logs produced
by the queue/job code do not include tokens or emails.
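The extended predicate can be sketched like this; the error shapes (`code`, `status`, `statusCode`) follow the review comment, and the function body is illustrative rather than the repo's actual `isRetryableByDefault`.

```typescript
const RETRYABLE_CODES = new Set(["EBUSY", "EPERM", "EAGAIN", "ETIMEDOUT"]);

// Sketch: treat errno-style transient failures AND HTTP 429 rate limits as
// retryable; anything else (including non-object errors) is terminal.
function isRetryableByDefault(error: unknown): boolean {
  if (typeof error !== "object" || error === null) return false;
  const e = error as { code?: string; status?: number; statusCode?: number };
  if (e.code !== undefined && RETRYABLE_CODES.has(e.code)) return true;
  return e.status === 429 || e.statusCode === 429;
}
```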
In `@lib/codex-manager.ts`:
- Around line 4089-4102: The error path for the "rotate-secrets" JSON log prints
raw error payload and bypasses the redaction gate; wrap the error object passed
to JSON.stringify with maybeRedactJsonOutput(...) (same approach used in the
success path) so the error field is redacted before logging (preserve
JSON_OUTPUT_SCHEMA_VERSION and include idempotencyKey as before). Also add a
vitest regression that triggers the failure branch of rotate-secrets and asserts
that sensitive markers (e.g., tokens/emails) in the error message are redacted
in the logged JSON to prevent leaks.
In `@lib/data-redaction.ts`:
- Around line 31-49: The recursive walker in redactForExternalOutput (visit)
lacks a visited-object guard and will infinite-loop on cyclic objects; update
visit to track seen objects using a WeakSet (or similar) keyed by object
references, check the set at the start of visiting an object/array and return a
stable placeholder or the already-constructed result for cycles, and add logic
to record objects before recursing into their properties/array items (ensure
arrays and plain objects are handled). Update references to shouldRedactKey and
preserve redaction behavior when encountering previously-seen nodes. Add a
Vitest regression in test/data-redaction.test.ts that creates a self-referential
object (e.g., obj.self = obj) and asserts redactForExternalOutput returns
without throwing and that redacted keys remain masked.
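A cycle-safe walker of that shape might look like the sketch below. The key pattern and the `"***REDACTED***"` marker are assumptions mirroring the review text, not the repo's exact rules; note this sketch conservatively reports any repeated object reference as `"[Circular]"`, not only true cycles.

```typescript
// Assumed sensitive-key pattern for illustration.
const SENSITIVE_KEY = /token|secret|password|refresh/i;

// Sketch: WeakSet-guarded recursive redaction that cannot loop on cycles.
function redactForExternalOutput(
  value: unknown,
  seen: WeakSet<object> = new WeakSet(),
): unknown {
  if (typeof value !== "object" || value === null) return value;
  if (seen.has(value)) return "[Circular]"; // stable placeholder for cycles
  seen.add(value); // record BEFORE recursing into children
  if (Array.isArray(value)) {
    return value.map((item) => redactForExternalOutput(item, seen));
  }
  const out: Record<string, unknown> = {};
  for (const [key, v] of Object.entries(value)) {
    out[key] = SENSITIVE_KEY.test(key)
      ? "***REDACTED***"
      : redactForExternalOutput(v, seen);
  }
  return out;
}
```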
In `@lib/data-retention.ts`:
- Around line 80-92: pruneSingleFile currently bails on non-ENOENT errors;
update the function to retry unlink/stat on transient Windows filesystem error
codes (e.g., "EBUSY", "EPERM" and optionally "EACCES") with a bounded
exponential backoff before rethrowing. Implement a small retry loop inside
pruneSingleFile around the fs.stat/fs.unlink calls (referencing the function
name pruneSingleFile), perform up to a fixed number of attempts (e.g., 4–6),
start with a short delay (e.g., 50–100ms) and double the delay each retry, await
a Promise-based sleep between attempts, and only return false on ENOENT; if
retries exhausted, rethrow the last error. Ensure the retry applies to both the
stat and unlink operations and preserve existing return semantics.
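The bounded-backoff retry loop described above can be sketched as a generic helper that wraps both the stat and unlink calls; attempt counts and delays are illustrative tuning, not the repo's values.

```typescript
// Transient Windows filesystem error codes worth retrying.
const TRANSIENT_FS_CODES = new Set(["EBUSY", "EPERM"]);

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Sketch: retry an async fs operation with bounded exponential backoff.
// Non-transient errors (e.g. ENOENT) and exhausted retries rethrow the
// last error so callers keep their existing semantics.
async function withFsRetry<T>(
  op: () => Promise<T>,
  attempts = 5,
  baseDelayMs = 50,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (error) {
      const code = (error as { code?: string }).code;
      const retryable = code !== undefined && TRANSIENT_FS_CODES.has(code);
      if (!retryable || attempt >= attempts - 1) throw error;
      await sleep(baseDelayMs * 2 ** attempt); // 50ms, 100ms, 200ms, ...
    }
  }
}
```

Inside `pruneSingleFile`, both `fs.stat` and `fs.unlink` would be invoked through this wrapper, with ENOENT still mapped to `return false` by the caller.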
- Around line 70-75: The catch block in lib/data-retention.ts currently silences
all errors by only continuing on ENOENT and dropping every other error; update
the error handling in that catch (where you read const code = (error as
NodeJS.ErrnoException).code) so that if code === "ENOENT" you continue, but for
any other error you either log the error (with context about the path/operation)
and rethrow or propagate it so permission/IO failures are not silently ignored;
ensure the non-ENOENT branch uses the existing logger or throws the original
error to surface failures to callers.
In `@lib/file-lock.ts`:
- Around line 68-74: The lock file write path currently opens a file descriptor
into `handle`, writes metadata, and closes it but if `writeFile` (or any
intermediate step) throws the fd is leaked; wrap the open→write→close sequence
in a try/finally so `handle.close()` is always invoked (only if `handle` was
successfully assigned), and handle/ignore close errors safely to avoid masking
the original error; apply the same try/finally pattern to the equivalent block
around lines 137-144 (same `handle` usage), update or add Vitest cases that
simulate write failures (disk full/permission) to prove no fd leak, and ensure
any added logging does not emit tokens/emails while also validating behavior
under EBUSY/429 retry scenarios.
In `@lib/idempotency.ts`:
- Around line 86-121: checkAndRecordIdempotencyKey currently writes a key as
soon as a request starts, causing legitimate retries to be blocked if downstream
work fails; change the idempotency scheme to a two‑phase state or rollback
model: have checkAndRecordIdempotencyKey record a pending entry (e.g., { scope,
key, status: "pending", createdAtMs }) under the existing lock
(IDEMPOTENCY_LOCK_PATH) using loadFile/pruneExpired/saveFile, and expose
companion functions markIdempotencySucceeded(scope,key) and
clearIdempotencyOnFailure(scope,key) that update status to "succeeded" or remove
the pending entry inside the same lock; add retry/backoff in saveFile/loadFile
to handle EBUSY and transient I/O errors (429-style transient failures), ensure
callers in lib/codex-manager.ts use the new two‑phase flow (mark success only
after downstream rotation completes), and add a vitest regression test in
test/idempotency.test.ts that simulates a failure then retry to confirm the
retry is allowed; preserve existing pruning logic (pruneExpired) and avoid
logging sensitive tokens/emails.
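The two-phase model can be sketched with an in-memory store; the real module would persist these entries under the file lock and prune expired ones, so treat the `Map` and the function bodies below as illustrative only.

```typescript
type IdemStatus = "pending" | "succeeded";
type IdemEntry = { status: IdemStatus; createdAtMs: number };

// In-memory stand-in for the locked, persisted idempotency store.
const store = new Map<string, IdemEntry>();

// Phase 1: record a pending entry. Only a *succeeded* entry counts as a
// replay, so a failed earlier attempt does not block a legitimate retry.
function checkAndRecord(scope: string, key: string): { replayed: boolean } {
  const id = `${scope}:${key}`;
  if (store.get(id)?.status === "succeeded") return { replayed: true };
  store.set(id, { status: "pending", createdAtMs: Date.now() });
  return { replayed: false };
}

// Phase 2a: mark success once downstream work (e.g. rotation) completes.
function markIdempotencySucceeded(scope: string, key: string): void {
  store.set(`${scope}:${key}`, { status: "succeeded", createdAtMs: Date.now() });
}

// Phase 2b: roll back the pending entry so the caller may retry.
function clearIdempotencyOnFailure(scope: string, key: string): void {
  store.delete(`${scope}:${key}`);
}
```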
- Around line 100-106: The idempotency lock is being acquired before the storage
directory exists which causes ENOENT on cold start; before calling
acquireFileLock with IDEMPOTENCY_LOCK_PATH (from lib/idempotency.ts) ensure the
parent directory for IDEMPOTENCY_PATH (or IDEMPOTENCY_LOCK_PATH) is created
(mkdir -p behavior) and handle transient ENOENT by retrying once; update acquire
sequence in the function that calls acquireFileLock to create
dirname(IDEMPOTENCY_PATH) (or dirname(IDEMPOTENCY_LOCK_PATH)) with safe
permissions and ensure any thrown ENOENT triggers a retry/create path rather
than bubbling up, then add/update vitest tests (e.g., idempotency.spec.ts)
covering cold-start directory-missing behavior and concurrency/error cases
(EBUSY/429) and verify logs do not leak sensitive tokens/emails.
In `@lib/index.ts`:
- Around line 32-38: Add a vitest public-api regression test that imports the
barrel (the exports from lib/index.ts) and snapshots the exported symbol list so
future refactors can't silently drop exports; specifically import the barrel
that re-exports file-lock.js, secrets-crypto.js, data-retention.js,
data-redaction.js, authorization.js, background-jobs.js, and idempotency.js
(refer to those module names) and assert a snapshot of
Object.keys(theImportedModule) or the sorted export names in
test/public-api*.test.ts, then commit the snapshot; this pins the Tier-B API
surface and will catch removals in future changes.
In `@lib/quota-cache.ts`:
- Around line 230-265: Add an integration test that spawns 4–6 child processes
which each call saveQuotaCache repeatedly (similar to
test/file-lock.test.ts:83-174) to exercise acquireFileLock/QUOTA_CACHE_LOCK_PATH
and the rename retry logic under Windows EBUSY/EPERM, and then add a brief
comment in lib/quota-cache.ts (near saveQuotaCache / the acquireFileLock usage)
pointing to the new multi-process test so future reviewers know this concurrency
behavior is validated; ensure the test asserts no lost/corrupted writes and that
rename retry exhaustion is handled.
In `@lib/secrets-crypto.ts`:
- Around line 163-172: The env-key loader currently accepts any non-empty
string; change getSecretEncryptionKeysFromEnv to validate each key returned by
getTrimmedEnv and reject keys with <32 bytes of entropy by converting the string
to a Buffer (e.g., Buffer.from(value, 'utf8') or the expected encoding) and
checking buffer.length >= 32, throwing a clear Error when a key is present but
too short; apply the same validation to the "previous" key and keep null when
absent. Also add vitest coverage asserting that getSecretEncryptionKeysFromEnv
throws for short keys and accepts keys >=32 bytes (and that absent keys remain
null).
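The length check can be sketched as a small validator; the env-var name in the error message mirrors the discussion, while the helper itself is an assumed illustration.

```typescript
// Sketch: reject encryption keys shorter than 32 bytes, keep null for absent
// keys, and name the offending variable in the error (never its value).
function validateEncryptionKey(value: string | null, name: string): Buffer | null {
  if (value === null) return null;
  const buf = Buffer.from(value, "utf8");
  if (buf.length < 32) {
    throw new Error(`${name} must be at least 32 bytes; got ${buf.length}`);
  }
  return buf;
}
```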
- Around line 107-110: encryptSecret currently skips encryption based solely on
the "enc:v1:/enc:v2:" prefix which lets malformed envelopes bypass encryption;
update encryptSecret to only skip when isEncryptedSecret confirms a valid
envelope (or add a stricter validator like
isEncryptedEnvelope/isEncryptedSecretStrict and use that in
encryptSecret/parseEncryptedEnvelope), so malformed "enc:v2:..." strings get
treated as plaintext and encrypted normally; also add a vitest regression that
passes a string starting with "enc:v2:" but not a valid envelope and assert the
output is an encrypted envelope (not the original input).
In `@lib/storage.ts`:
- Around line 1313-1318: The decrypt failure in the loop that calls
decryptStorageSecret(refreshTokenRaw, "flagged refresh token") currently
swallows errors and continues, causing flagged accounts to be dropped from
normalized state and potentially lost on save; modify the handler in
lib/storage.ts to either preserve the original encrypted entry (e.g., keep
refreshTokenRaw in the normalized record) or rethrow/return a fail-fast error so
the entry is not removed, update the normalization code path that uses
refreshToken/refreshTokenRaw to prefer the encrypted raw value when decryption
fails, add a vitest regression named "cannot decrypt flagged token" that asserts
the encrypted entry is preserved after normalization and save, and ensure any
new logging around decrypt failures uses non-sensitive messages (no
tokens/emails), and that related tests cover filesystem/concurrency retry
behavior for EBUSY/429 when writing the preserved entries or queueing work.
In `@README.md`:
- Line 131: Add documentation for the --idempotency-key CLI flag alongside the
existing `codex auth rotate-secrets --json` entry: describe that
`--idempotency-key <value>` ensures safe retryable rotations by making the
request idempotent, note expected format (string/UUID) and that it’s used by
automated workflows, and reference its test usage in
`test/codex-manager-cli.test.ts:2101` for implementers; place the short flag
description in the same table row or an adjacent troubleshooting/automation note
per docs/** style guidelines.
In `@scripts/license-policy-check.js`:
- Around line 36-38: The current deny-check loop uses substring matching
(normalized.includes(denied)) which falsely flags licenses like "LGPL-3.0" when
"GPL-3.0" is denied; update the check in the loop over denyList to perform an
exact SPDX token/expression match instead of substring matching — e.g., parse
the normalized license expression into SPDX identifiers, or split/tokenize it on
operators and whitespace and compare exact identifiers against the deny list; keep
pushing violations with the same violations.push(`${name}@${version}
(${rawLicense})`) when a true exact match is found.
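An exact-token comparison of that kind can be sketched as below. Splitting on whitespace, parentheses, and the AND/OR/WITH operators is a simplification of full SPDX expression parsing, but it is enough to stop "GPL-3.0" from matching inside "LGPL-3.0".

```typescript
const SPDX_OPERATORS = new Set(["AND", "OR", "WITH"]);

// Sketch: tokenize a license expression into bare SPDX identifiers and test
// each one for an exact (case-insensitive) match against the deny list.
function licenseMatchesDenyList(expression: string, denyList: string[]): boolean {
  const tokens = expression
    .split(/\s+|[()]/)
    .filter((t) => t !== "" && !SPDX_OPERATORS.has(t.toUpperCase()));
  const denied = new Set(denyList.map((d) => d.toLowerCase()));
  return tokens.some((t) => denied.has(t.toLowerCase()));
}
```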
In `@test/background-jobs.test.ts`:
- Around line 26-49: Add a new Vitest case mirroring the existing "retries and
succeeds before exhausting attempts" test that specifically simulates rate-limit
(429) failures: import runBackgroundJobWithRetry and getBackgroundJobDlqPath,
create a task that throws an error representing an HTTP 429 (e.g., an Error with
status or statusCode = 429) for the first N-1 attempts and returns "ok" on the
Nth attempt, call runBackgroundJobWithRetry with small baseDelayMs/maxDelayMs
and appropriate maxAttempts, assert the result is "ok" and that the task ran the
expected number of attempts, and add a separate test that exhausts maxAttempts
on repeated 429 errors and then asserts fs.stat(getBackgroundJobDlqPath())
rejects with { code: "ENOENT" } or the appropriate DLQ presence check per
existing tests; reference runBackgroundJobWithRetry and getBackgroundJobDlqPath
to locate where to add/modify tests.
In `@test/data-redaction.test.ts`:
- Around line 1-34: Add regression tests to test/data-redaction.test.ts that
exercise edge cases for the redactForExternalOutput function: include a test
that passes null and undefined fields (and nested nulls) to ensure they are
preserved rather than causing errors, a test with arrays containing mixed
primitives, objects and nulls to ensure only sensitive keys in objects are
redacted, a test for empty objects/arrays to ensure they remain unchanged, and a
deep-nesting test (e.g., >50 levels) to guard against recursion/stack issues;
reference the redactForExternalOutput symbol when adding these cases and assert
exact expected outputs (null stays null, primitives unchanged, sensitive fields
replaced with "***REDACTED***").
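The edge-case semantics those tests should pin down can be illustrated with a self-contained stand-in redactor (hypothetical; the real `redactForExternalOutput` in lib/data-redaction.ts remains the source of truth):

```typescript
// Hypothetical stand-in showing the expected edge-case behavior:
// null/undefined and primitives pass through untouched, arrays are
// recursed element by element, and only sensitive keys are replaced.
const SENSITIVE_KEYS = new Set(["email", "accesstoken", "refresh_token", "accountid"]);

function redactSketch(value: unknown): unknown {
  if (value === null || typeof value !== "object") {
    return value; // null, undefined, and primitives are preserved as-is
  }
  if (Array.isArray(value)) {
    return value.map(redactSketch); // recurse into each element
  }
  const out: Record<string, unknown> = {};
  for (const [key, child] of Object.entries(value)) {
    out[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? "***REDACTED***" : redactSketch(child);
  }
  return out;
}
```

The deep-nesting regression would wrap a sensitive leaf in 50+ object layers and assert the call returns rather than overflowing the stack.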
In `@test/file-lock.test.ts`:
- Around line 49-57: The test currently uses a broad expectation
(rejects.toBeTruthy()) for acquireFileLock; replace it with a deterministic
assertion that checks the specific failure mode—e.g., assert the promise rejects
with the LockAcquisitionError (or the concrete error class your lock
implementation throws) or matches a known error code/message (for example use
expect(acquireFileLock(...)).rejects.toThrow(/acquire.*lock/i) or
expect(...).rejects.toMatchObject({ code: 'LOCK_BUSY' })) so the test only
passes for the intended lock contention error; import the specific error type
from the file-lock implementation if available and use that in the assertion.
In `@test/idempotency.test.ts`:
- Around line 26-50: The test suite lacks a concurrency regression for
idempotency key races: add a new test that concurrently calls
checkAndRecordIdempotencyKey for the same scope/key (e.g.,
"codex.auth.rotate-secrets", "key-concurrent") and asserts that exactly one
invocation returns { replayed: false } while all others return { replayed: true
}; use Promise.all to invoke many parallel calls (10+), await them, then count
results to ensure a single winner, and reuse getIdempotencyStorePath to validate
the store if desired—this proves the file-locking behavior around
checkAndRecordIdempotencyKey under concurrent access.
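The single-winner contract the test should assert can be modeled with an in-memory stand-in (the real `checkAndRecordIdempotencyKey` persists under a cross-process file lock; this sketch only illustrates the expected result counting):

```typescript
// In-memory stand-in for the single-winner idempotency contract.
// Hypothetical: the real implementation writes through a file lock; this
// only models the semantics the Promise.all test should observe.
const seen = new Map<string, true>();

async function checkAndRecordSketch(
  scope: string,
  key: string,
): Promise<{ replayed: boolean }> {
  const storeKey = `${scope}:${key}`;
  // no await between has() and set(), so one JS turn decides the winner
  if (seen.has(storeKey)) {
    return { replayed: true };
  }
  seen.set(storeKey, true);
  return { replayed: false };
}

// Fire many parallel calls and count how many won the first-writer slot.
async function countWinners(parallel: number): Promise<number> {
  const results = await Promise.all(
    Array.from({ length: parallel }, () =>
      checkAndRecordSketch("codex.auth.rotate-secrets", "key-concurrent"),
    ),
  );
  return results.filter((result) => !result.replayed).length;
}
```

The real test would make the same count-the-winners assertion against the actual function, where the file lock (rather than single-turn Map access) must guarantee exactly one `replayed: false`.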
In `@test/secrets-crypto.test.ts`:
- Around line 62-66: Add a deterministic regression test in
test/secrets-crypto.test.ts that covers malformed prefixed plaintext inputs for
the isEncryptedSecret function: specifically add a case where the input begins
with the "enc:v2:" prefix but lacks the full expected segments (e.g.,
truncated/malformed payload) and assert that isEncryptedSecret(...) still
returns true; this ensures lib/secrets-crypto.ts::isEncryptedSecret handles
prefixed-but-malformed plaintext correctly and prevents regressions around
prefix recognition.
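A minimal model of the prefix-recognition behavior under test, assuming the `enc:v2:` prefix is the discriminator (hypothetical stand-in for `isEncryptedSecret`):

```typescript
// Hypothetical stand-in: classification keys off the prefix alone, so a
// truncated or malformed payload after "enc:v2:" must still be treated as
// encrypted (and never echoed back as plaintext).
const ENCRYPTED_PREFIX = "enc:v2:";

function isEncryptedSecretSketch(value: string): boolean {
  return value.startsWith(ENCRYPTED_PREFIX);
}
```

The regression test asserts this classification for a prefixed-but-truncated input, so a future refactor that validates segment counts inside the classifier cannot silently start treating malformed ciphertext as plaintext.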
---
Outside diff comments:
In `@docs/privacy.md`:
- Around line 83-111: Add removal of the background-job-dlq.jsonl artifact to
the cleanup snippets in docs/privacy.md so docs match the actual retention logic
in lib/data-retention.ts (lines referencing background-job-dlq.jsonl).
Specifically, update the Bash snippet to rm -f
~/.codex/multi-auth/background-job-dlq.jsonl (and any override-root conditional
lines that mirror other removals) and update the PowerShell snippet to
Remove-Item "$HOME\.codex\multi-auth\background-job-dlq.jsonl" -Force
-ErrorAction SilentlyContinue (and the corresponding $env:CODEX_MULTI_AUTH_DIR
override handling).
In `@docs/reference/settings.md`:
- Around line 131-157: Add a new "Encryption & ABAC" subsection under Migration
Checklist in upgrade.md that consolidates the security config and rollout steps:
instruct operators to configure CODEX_AUTH_ENCRYPTION_KEY and
CODEX_AUTH_PREVIOUS_ENCRYPTION_KEY when enabling at-rest secret storage, set
baseline CODEX_AUTH_ROLE and ABAC denial rules (CODEX_AUTH_ABAC_READ_ONLY,
CODEX_AUTH_ABAC_DENY_ACTIONS, CODEX_AUTH_ABAC_DENY_COMMANDS,
CODEX_AUTH_ABAC_REQUIRE_INTERACTIVE, CODEX_AUTH_ABAC_REQUIRE_IDEMPOTENCY_KEY),
enable CODEX_AUTH_BREAK_GLASS and CODEX_AUTH_REDACT_JSON_OUTPUT as needed, run
hardening checks with npm run audit:ci && npm run license:check && npm run
clean:repo:check, and validate rotation with codex auth rotate-secrets --json
before promoting to production; reference the existing docs
(reference/settings.md, configuration.md), runbooks (runbooks/operations.md),
and tests (test/unified-settings.test.ts, test/storage.test.ts) as follow-up
verification.
In `@lib/storage.ts`:
- Around line 1446-1468: saveFlaggedAccounts currently only uses the in-process
withStorageLock so concurrent CLI processes can race; update saveFlaggedAccounts
to acquire a cross-process file lock (same pattern used by the accounts
read/modify/write helpers) around the getFlaggedAccountsPath operations, perform
the read/normalize/serialize/write to a temp file then atomic-rename under that
file lock, and ensure cleanup on failure; additionally add a vitest regression
that spawns concurrent saveFlaggedAccounts calls to reproduce and prevent lost
updates (simulate EBUSY/rename contention and retry/backoff for EBUSY/429),
reuse the same lock acquisition/retry logic from the accounts implementation,
and review log.error calls in saveFlaggedAccounts to avoid leaking emails/tokens
when logging errors.
In `@lib/unified-settings.ts`:
- Around line 294-299: The read-modify-write race occurs because
readSettingsRecordAsync() is called before acquiring the cross-process lock in
saveUnifiedPluginConfig (and the similar block at lines 338-342); move the read
inside the enqueueSettingsWrite(async () => { ... }) callback so the lock
acquired by enqueueSettingsWrite wraps the entire read-modify-write sequence,
i.e., call readSettingsRecordAsync() only after entering the
enqueueSettingsWrite callback, perform the record.pluginConfig update, then call
writeSettingsRecordAsync(); apply the same change to the other function/block
referenced (lines 338-342) and keep using enqueueSettingsWrite,
readSettingsRecordAsync, and writeSettingsRecordAsync as the unique identifiers.
- Around line 130-224: Add vitest regression tests that simulate Windows-style
rename contention for the unified settings code paths: write tests targeting
writeSettingsRecordAsync (and the sync counterpart that uses
acquireFileLockSync) in lib/unified-settings.ts by mocking fs.rename/renameSync
to throw transient EBUSY/EPERM errors for the first N attempts and succeed
thereafter, assert that the function eventually succeeds, that temp files are
removed on failure (unlink/unlinkSync called), and that the lock
(acquireFileLock / acquireFileLockSync via UNIFIED_SETTINGS_LOCK_PATH) is always
released; also include a test where errors are non-retryable (isRetryableFsError
returns false) to ensure the functions propagate the error. Ensure tests use
vitest mocks/stubs and cover backoff/retry behavior and cleanup.
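The retry behavior those tests should exercise can be sketched with an injected rename function; names here are hypothetical, not the lib/unified-settings.ts API:

```typescript
// Standalone sketch of the EBUSY/EPERM rename retry the tests should mock.
// The rename function is injected so a test can simulate Windows-style
// transient contention without touching the real filesystem.
interface ErrnoLike extends Error {
  code?: string;
}

type RenameFn = (from: string, to: string) => Promise<void>;

const RETRYABLE = new Set(["EBUSY", "EPERM"]);

async function renameWithRetry(
  rename: RenameFn,
  from: string,
  to: string,
  maxAttempts = 5,
  baseDelayMs = 1,
): Promise<void> {
  for (let attempt = 1; ; attempt += 1) {
    try {
      await rename(from, to);
      return;
    } catch (error) {
      const code = (error as ErrnoLike).code ?? "";
      // non-retryable codes and exhausted budgets propagate to the caller
      if (!RETRYABLE.has(code) || attempt >= maxAttempts) throw error;
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * attempt));
    }
  }
}

// Fake rename that fails with EBUSY for the first two attempts.
let renameCalls = 0;
const flakyRename: RenameFn = async () => {
  renameCalls += 1;
  if (renameCalls <= 2) {
    const error = new Error("busy") as ErrnoLike;
    error.code = "EBUSY";
    throw error;
  }
};
```

The vitest versions would additionally assert temp-file cleanup on the failure path and that the settings lock is released in a `finally` block.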
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: d3985bea-67ef-4b2d-83ce-467a08b8216b
📒 Files selected for processing (46)
- .github/settings.yml
- .github/workflows/ci.yml
- .github/workflows/secret-scan.yml
- .github/workflows/supply-chain.yml
- README.md
- docs/README.md
- docs/configuration.md
- docs/development/CONFIG_FIELDS.md
- docs/development/TESTING.md
- docs/index.md
- docs/privacy.md
- docs/reference/commands.md
- docs/reference/error-contracts.md
- docs/reference/public-api.md
- docs/reference/settings.md
- docs/reference/storage-paths.md
- docs/runbooks/README.md
- docs/runbooks/incident-response.md
- docs/runbooks/operations.md
- index.ts
- lib/accounts.ts
- lib/audit.ts
- lib/authorization.ts
- lib/background-jobs.ts
- lib/codex-manager.ts
- lib/data-redaction.ts
- lib/data-retention.ts
- lib/file-lock.ts
- lib/idempotency.ts
- lib/index.ts
- lib/quota-cache.ts
- lib/secrets-crypto.ts
- lib/storage.ts
- lib/unified-settings.ts
- package.json
- scripts/license-policy-check.js
- test/audit.test.ts
- test/authorization.test.ts
- test/background-jobs.test.ts
- test/codex-manager-cli.test.ts
- test/data-redaction.test.ts
- test/data-retention.test.ts
- test/file-lock.test.ts
- test/idempotency.test.ts
- test/quota-cache.test.ts
- test/secrets-crypto.test.ts
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
docs/**
⚙️ CodeRabbit configuration file
keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts.
Files:
docs/runbooks/incident-response.md, docs/runbooks/README.md, docs/README.md, docs/configuration.md, docs/development/CONFIG_FIELDS.md, docs/reference/error-contracts.md, docs/runbooks/operations.md, docs/reference/commands.md, docs/reference/storage-paths.md, docs/index.md, docs/privacy.md, docs/development/TESTING.md, docs/reference/settings.md, docs/reference/public-api.md
test/**
⚙️ CodeRabbit configuration file
tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.
Files:
test/quota-cache.test.ts, test/audit.test.ts, test/codex-manager-cli.test.ts, test/idempotency.test.ts, test/file-lock.test.ts, test/background-jobs.test.ts, test/authorization.test.ts, test/data-redaction.test.ts, test/secrets-crypto.test.ts, test/data-retention.test.ts
lib/**
⚙️ CodeRabbit configuration file
focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.
Files:
lib/index.ts, lib/audit.ts, lib/data-retention.ts, lib/accounts.ts, lib/data-redaction.ts, lib/unified-settings.ts, lib/storage.ts, lib/authorization.ts, lib/background-jobs.ts, lib/secrets-crypto.ts, lib/quota-cache.ts, lib/idempotency.ts, lib/file-lock.ts, lib/codex-manager.ts
🧬 Code graph analysis (14)
lib/data-retention.ts (1)
- lib/runtime-paths.ts (3): `getCodexLogDir` (226-228), `getCodexCacheDir` (212-214), `getCodexMultiAuthDir` (166-199)

lib/accounts.ts (1)
- lib/background-jobs.ts (1): `runBackgroundJobWithRetry` (71-115)

test/codex-manager-cli.test.ts (2)
- lib/codex-manager.ts (1): `runCodexMultiAuthCli` (4528-4609)
- scripts/codex.js (1): `runCodexMultiAuthCli` (501-501)

test/idempotency.test.ts (1)
- lib/idempotency.ts (2): `checkAndRecordIdempotencyKey` (86-125), `getIdempotencyStorePath` (82-84)

test/file-lock.test.ts (1)
- lib/file-lock.ts (1): `acquireFileLock` (57-107)

lib/authorization.ts (1)
- lib/audit.ts (1): `auditLog` (123-153)

lib/quota-cache.ts (2)
- lib/file-lock.ts (1): `acquireFileLock` (57-107)
- lib/utils.ts (1): `sleep` (65-67)

test/background-jobs.test.ts (1)
- lib/background-jobs.ts (2): `runBackgroundJobWithRetry` (71-115), `getBackgroundJobDlqPath` (67-69)

test/authorization.test.ts (2)
- lib/authorization.ts (2): `getAuthorizationRole` (38-40), `authorizeAction` (107-145)
- lib/audit.ts (2): `getAuditConfig` (63-65), `configureAudit` (59-61)

test/data-redaction.test.ts (1)
- lib/data-redaction.ts (1): `redactForExternalOutput` (31-50)

lib/idempotency.ts (2)
- lib/runtime-paths.ts (1): `getCodexMultiAuthDir` (166-199)
- lib/file-lock.ts (1): `acquireFileLock` (57-107)

index.ts (5)
- lib/data-retention.ts (1): `enforceDataRetention` (95-122)
- lib/logger.ts (2): `error` (389-393), `logWarn` (341-346)
- lib/auth-rate-limit.ts (1): `checkAuthRateLimit` (119-127)
- lib/auth/auth.ts (2): `exchangeAuthorizationCode` (114-158), `REDIRECT_URI` (12-12)
- lib/audit.ts (1): `auditLog` (123-153)

test/secrets-crypto.test.ts (1)
- lib/secrets-crypto.ts (4): `encryptSecret` (107-121), `isEncryptedSecret` (102-105), `decryptSecret` (123-161), `getSecretEncryptionKeysFromEnv` (168-173)

test/data-retention.test.ts (1)
- lib/data-retention.ts (2): `RetentionPolicy` (5-11), `enforceDataRetention` (95-122)
🪛 LanguageTool
docs/runbooks/operations.md
[uncategorized] ~64-~64: The official name of this software platform is spelled with a capital “H”.
Context: ...ion required checks remain aligned with .github/settings.yml. --- ## Failure Triage ...
(GITHUB)
🪛 markdownlint-cli2 (0.21.0)
docs/configuration.md
[warning] 73-73: Table column count
Expected: 2; Actual: 4; Too many cells, extra data will be missing
(MD056, table-column-count)
🔇 Additional comments (18)
lib/audit.ts (1)

- 17-17: lgtm for enum extension. this action value is consistent with the new audit surface and is covered by the enum assertion in `test/audit.test.ts:254` (source line: `lib/audit.ts:17`).

test/audit.test.ts (1)

- 254-254: remove this comment - the requested regression test already exists. `test/authorization.test.ts:90` ("allows all actions when break-glass is enabled and audits the bypass") does exactly what you're asking for: it exercises the break-glass authorization flow, reads the actual audit log file, and explicitly asserts `entry.action === AuditAction.AUTH_BREAK_GLASS` with additional checks on resource and metadata. the test is deterministic, uses vitest, and doesn't mock secrets. line 254 in `test/audit.test.ts` remains a useful enum-literal check. the behavior-level regression is already covered elsewhere.

  > Likely an incorrect or invalid review comment.

index.ts (1)

- 1690-1699: no issue here. emails are already sanitized via `sanitizeActor` before writing to audit logs. `auditLog` at `lib/audit.ts:140` applies `sanitizeActor(actor)`, which calls `maskEmail` from `lib/logger.ts:64-72`. `maskEmail` irreversibly masks emails to a format like `us***@***.com`. `test/audit.test.ts:131-143` and `test/audit.test.ts:147-160` verify this works for actor and metadata fields respectively. nested metadata is also recursively sanitized at `lib/audit.ts:114`. all three call sites at `index.ts:1690-1699`, `index.ts:1726-1737`, and `index.ts:2399-2410` pass account emails through this sanitization pipeline before storage, not unsanitized.

  > Likely an incorrect or invalid review comment.

docs/README.md (1)

- 62-62: lgtm. the new runbook entry in the maintainer docs table correctly references `docs/runbooks/README.md` and aligns with the operations documentation added in this pr.

docs/index.md (1)

- 49-50: lgtm. runbooks reference added correctly to the landing page navigation. ordering makes sense for operators looking for quick access to operational docs.

test/quota-cache.test.ts (1)

- 193-196: good refinement to make the temp cleanup assertion more precise. filtering to `.tmp` file unlinks avoids false positives from lock file operations. the test at line 198 still verifies no lingering temp files remain. one note: this test mocks `EPERM` for the "keeps failing" scenario. there's an existing parameterized test at `test/quota-cache.test.ts:127` covering both `EBUSY` and `EPERM` for retry success, but consider adding a windows-specific `EBUSY` variant for the "keeps failing" path if you want full windows filesystem coverage per coding guidelines (`test/**`).

docs/runbooks/README.md (1)

- 1-12: lgtm. clean runbook index with clear scope definition. links correctly point to `operations.md` and `incident-response.md` in the same directory.

docs/runbooks/operations.md (1)

- 1-89: solid operations runbook covering the new enterprise hardening features. the key rotation workflow at lines 24-28 correctly references `codex auth rotate-secrets --json`. abac env vars at lines 44-50 match the implementation exercised in `test/codex-manager-cli.test.ts:2134-2136`. dlq monitoring at line 18 aligns with the background-job infrastructure. note: static analysis flagged `.github` capitalization at line 64, but this is a file path reference (`.github/settings.yml`), not the company name, so ignore that hint.

test/codex-manager-cli.test.ts (4)

- 21-22: lgtm on mock setup for rotation and idempotency. the new mocks at lines 21-22 are properly wired in `vi.mock("../lib/storage.js")` and `vi.mock("../lib/idempotency.js")`. reset logic at lines 208-209 ensures clean state between tests. also applies to: 85-90.
- 2015-2083: solid pagination test covering cursor-based flow. the test at `test/codex-manager-cli.test.ts:2015-2083` correctly exercises `--page-size` and `--cursor` flags. the two-call pattern verifies:
  - first page returns `hasMore: true` with a cursor
  - second page returns `hasMore: false` with `nextCursor: null`

  using `mockResolvedValue` (not `mockResolvedValueOnce`) ensures both calls see consistent account data, which is appropriate here.
- 2085-2130: good idempotency replay test, but consider adding a concurrency regression case. the test at `test/codex-manager-cli.test.ts:2085-2130` correctly verifies:
  - first call with key "rotation-001" performs rotation (`replayed: false`)
  - second call with same key returns cached result (`replayed: true`)
  - `rotateStoredSecretEncryptionMock` called only once

  nice detail testing both `--idempotency-key rotation-001` (space) and `--idempotency-key=rotation-001` (equals) formats at lines 2101-2102 and 2117. per coding guidelines (`test/**`), consider adding a regression test for concurrent calls with the same idempotency key to verify no race conditions in `lib/idempotency.ts`. the current test is sequential.
- 2132-2180: good abac enforcement test with proper env cleanup. the test at `test/codex-manager-cli.test.ts:2132-2180` correctly verifies:
  - command denied without `--idempotency-key` when `CODEX_AUTH_ABAC_REQUIRE_IDEMPOTENCY_KEY=secrets:rotate`
  - command allowed with `--idempotency-key` flag present
  - error message mentions "idempotency key" (line 2153)

  env cleanup at lines 2169-2178 properly handles the undefined-vs-defined distinction. the pattern of capturing original values at lines 2133-2134 before modification is correct.

docs/reference/error-contracts.md (1)

- 34-51: json contract additions look solid. this keeps the machine contract explicit for rotate-secrets, schemaVersion, redaction mode, pagination envelope, and idempotency guidance. refs: `lib/index.ts:33-38`.

docs/reference/public-api.md (1)

- 66-81: api standards baseline section is a strong addition. this gives clear defaults for schema versioning, idempotency, and pagination and matches the expanded hardening surface. refs: `lib/index.ts:32-38`.

test/file-lock.test.ts (1)

- 83-174: good multi-process contention regression coverage. `test/file-lock.test.ts:83-174` reproduces cross-process lock contention and verifies both expected write count and uniqueness. this is strong coverage for concurrency correctness. as per coding guidelines: "`test/**`: demand regression cases that reproduce concurrency bugs ... and windows filesystem behavior."

test/data-retention.test.ts (1)

- 27-78: retention coverage looks solid. `test/data-retention.test.ts:27-78` checks stale deletion and fresh-file preservation with explicit assertions. this gives good confidence in pruning behavior. as per coding guidelines: "`test/**`: tests must stay deterministic and use vitest."

test/background-jobs.test.ts (1)

- 51-88: redaction assertions are strong here. `test/background-jobs.test.ts:51-88` verifies sensitive fields are redacted and non-sensitive fields stay visible in dlq entries. good coverage for privacy guarantees.

test/authorization.test.ts (1)

- 90-227: authorization regression matrix is comprehensive. `test/authorization.test.ts:90-227` covers break-glass auditing and core abac gates with clear assertions. this is good protection for policy regressions. as per coding guidelines: "`test/**`: tests must stay deterministic and use vitest."
```yaml
- name: Run smoke tests
  run: npm run test -- test/runtime-paths.test.ts test/codex-bin-wrapper.test.ts
```
🧹 Nitpick | 🔵 Trivial
expand windows smoke coverage for fs/concurrency regressions.
line 147 currently skips the tests most tied to this pr’s windows and contention hardening. add at least lock/background persistence smoke so regressions are caught on windows-latest (refs: test/file-lock.test.ts:1, test/background-jobs.test.ts:1).
proposed ci update
```diff
- - name: Run smoke tests
-   run: npm run test -- test/runtime-paths.test.ts test/codex-bin-wrapper.test.ts
+ - name: Run smoke tests
+   run: npm run test -- test/runtime-paths.test.ts test/codex-bin-wrapper.test.ts test/file-lock.test.ts test/background-jobs.test.ts
```

As per coding guidelines `test/**`: demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
- name: Run smoke tests
  run: npm run test -- test/runtime-paths.test.ts test/codex-bin-wrapper.test.ts test/file-lock.test.ts test/background-jobs.test.ts
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/ci.yml around lines 146 - 147, The CI step "Run smoke
tests" currently omits key Windows concurrency/regression tests; update
.github/workflows/ci.yml so that the Windows matrix (windows-latest) runs the
lock and background persistence smoke tests by including test/file-lock.test.ts
and test/background-jobs.test.ts in the "npm run test" invocation (or add a
separate run step for those files) for the windows job; ensure the job that
invokes "Run smoke tests" or the matrix entry that targets windows runs these
specific tests so fs/concurrency and background-job regressions are caught on
Windows.
```markdown
| `CODEX_AUTH_STREAM_STALL_TIMEOUT_MS=<ms>` | Stream stall timeout override |
| `CODEX_AUTH_ENCRYPTION_KEY=<32-byte-random-key>` | Enable at-rest encryption for stored account secrets (high-entropy key material only) |
| `CODEX_AUTH_PREVIOUS_ENCRYPTION_KEY=<32-byte-random-key>` | Previous high-entropy key used during staged secret rotation |
| `CODEX_AUTH_ROLE=admin|operator|viewer` | CLI authorization role baseline |
```
fix table rendering for codex_auth_role values.
line 73 uses raw | inside a table cell, so markdown treats it as extra columns and drops content in rendered docs.
ref: docs/configuration.md:73, lib/secrets-crypto.ts:168-172.
proposed fix
```diff
-| `CODEX_AUTH_ROLE=admin|operator|viewer` | CLI authorization role baseline |
+| `CODEX_AUTH_ROLE=admin\|operator\|viewer` | CLI authorization role baseline |
```

as per coding guidelines, "docs/**: keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```markdown
| `CODEX_AUTH_ROLE=admin\|operator\|viewer` | CLI authorization role baseline |
```
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 73-73: Table column count
Expected: 2; Actual: 4; Too many cells, extra data will be missing
(MD056, table-column-count)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/configuration.md` at line 73, The table cell for CODEX_AUTH_ROLE
contains raw pipe characters which break Markdown table columns; replace the
inner pipes with backslash escapes or HTML entities (e.g., change `admin|operator|viewer` to
`admin\|operator\|viewer`) or wrap the entire cell in an HTML <code>
element to preserve the literal pipes, and then verify the values match the
actual CLI/constant in lib/secrets-crypto.ts (the CODEX_AUTH_ROLE
enumeration/strings) so docs/configuration.md stays consistent with runtime
behavior.
```markdown
| `--no-restore` | verify-flagged | Verify only; do not restore healthy flagged accounts |
| `--page-size <n>` | list, status (`--json`) | Page size for JSON list output (1-200) |
| `--cursor <cursor>` | list, status (`--json`) | Cursor token for JSON list pagination |
| `--idempotency-key <key>` | rotate-secrets | Safe retry key for automation |
```
make the recovery example use the idempotency flag it documents.
the flag table adds --idempotency-key, but the recovery workflow still shows a non-idempotent rotate command. update the example to reinforce safe retries. refs: lib/index.ts:38, lib/background-jobs.ts:70-114.
proposed doc fix
```diff
-codex auth rotate-secrets --json
+codex auth rotate-secrets --json --idempotency-key <run-key>
```

As per coding guidelines, docs/**: keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts.
Also applies to: 115-116
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/reference/commands.md` at line 58, Update the recovery workflow example
for the rotate-secrets CLI to include the documented --idempotency-key flag so
the example matches the flag table; change the non-idempotent rotate-secrets
invocation in docs/reference/commands.md (the recovery example and the nearby
example at lines ~115-116) to include a stable idempotency token (e.g.
--idempotency-key "$CI_JOB_ID" or a generated UUID) and ensure the example text
mentions re-running safely; verify this aligns with the rotate-secrets handling
in the background job logic (see the rotate-secrets usage in lib/index.ts and
the idempotency handling in the background job code around the rotate job) so
docs and implementation are consistent.
```markdown
1. For credential exposure:
   - rotate affected OAuth/session credentials
   - set new `CODEX_AUTH_ENCRYPTION_KEY`
   - run `codex auth rotate-secrets`
2. For unauthorized command execution:
```
add idempotency to the rotate-secrets containment step.
for incident automation, this step is commonly retried. include --idempotency-key <incident-id> to avoid duplicate side effects. refs: lib/background-jobs.ts:70-114, lib/index.ts:38.
proposed doc fix
```diff
-   - run `codex auth rotate-secrets`
+   - run `codex auth rotate-secrets --idempotency-key <incident-id>`
```

As per coding guidelines, docs/**: keep README, SECURITY, and docs consistent with actual CLI flags and workflows. whenever behavior changes, require updated upgrade notes and mention new npm scripts.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/runbooks/incident-response.md` around lines 28 - 32, Update the
"rotate-secrets" containment step in the incident runbook to be idempotent by
appending the CLI flag "--idempotency-key <incident-id>" to the "codex auth
rotate-secrets" invocation so retries won't cause duplicate side effects; mirror
this flag usage in documentation and mention the change in upgrade notes and npm
scripts as appropriate, and cross-reference the implementation locations
(lib/background-jobs.ts function handling background job retries and
lib/index.ts CLI command registration) to ensure the flag name and behavior
match the actual code.
```typescript
void enforceDataRetention().catch((error) => {
  logWarn("Data retention cleanup failed", {
    error: error instanceof Error ? error.message : String(error),
  });
});
```
serialize retention cleanup with storage io on startup.
this runs in parallel with startup account/storage flows. on windows, file prune vs active handles can race and produce repeated eperm/ebusy cleanup failures.
ref: index.ts:228-232, lib/data-retention.ts:94-121.
please gate cleanup behind the same storage lock/transaction strategy used for account writes, and add a vitest regression that simulates windows-style lock contention during startup. as per coding guidelines, "lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) ..."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@index.ts` around lines 228 - 232, The startup call to enforceDataRetention()
runs concurrently with storage/account startup and must be serialized using the
same storage lock/transaction used for account writes; modify the startup flow
to acquire the storage lock/transaction wrapper (the same mechanism used for
account writes) before invoking enforceDataRetention(), and update
lib/data-retention.ts to expose a lock-aware entry (e.g.,
enforceDataRetentionUnderLock or accept a transaction/lock token) so the cleanup
runs inside the storage lock; add a vitest regression (e.g.,
data-retention.spec.ts) that simulates Windows-style file-handle contention by
holding the storage lock/handle during startup and asserting that
enforceDataRetention either waits for the lock or fails gracefully without
eperm/ebusy errors to verify serialization.
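The serialization being requested can be sketched with a minimal promise-chain mutex; this is a stand-in for the real storage lock, assuming cleanup and account writes share one exclusivity domain:

```typescript
// Sketch of gating retention cleanup behind the same lock as account
// writes. Hypothetical wiring: the real code should reuse the existing
// storage lock from lib/storage.ts rather than this stand-in mutex.
class AsyncMutex {
  private tail: Promise<void> = Promise.resolve();

  runExclusive<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(task);
    // keep the chain alive even if the task rejects
    this.tail = result.then(
      () => undefined,
      () => undefined,
    );
    return result;
  }
}

const storageLock = new AsyncMutex();
const order: string[] = [];

async function writeAccounts(): Promise<void> {
  await storageLock.runExclusive(async () => {
    order.push("write:start");
    await new Promise((resolve) => setTimeout(resolve, 5)); // simulate IO holding the file
    order.push("write:end");
  });
}

async function enforceRetentionUnderLock(): Promise<void> {
  await storageLock.runExclusive(async () => {
    order.push("retention"); // prune only runs after the write releases the lock
  });
}
```

Started concurrently, the retention task cannot interleave with the in-flight account write, which is the property the startup path currently lacks.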
```typescript
it("retries and succeeds before exhausting attempts", async () => {
  const { runBackgroundJobWithRetry, getBackgroundJobDlqPath } =
    await import("../lib/background-jobs.js");
  let attempts = 0;
  const result = await runBackgroundJobWithRetry({
    name: "test.retry-success",
    task: async () => {
      attempts += 1;
      if (attempts < 3) {
        const error = new Error("busy") as NodeJS.ErrnoException;
        error.code = "EBUSY";
        throw error;
      }
      return "ok";
    },
    maxAttempts: 4,
    baseDelayMs: 1,
    maxDelayMs: 2,
  });

  expect(result).toBe("ok");
  expect(attempts).toBe(3);
  await expect(fs.stat(getBackgroundJobDlqPath())).rejects.toMatchObject({ code: "ENOENT" });
});
```
add a retry regression for 429 scenarios.
test/background-jobs.test.ts:26-49 validates ebusy retries but misses the 429 path. add a case that simulates rate-limit failure and verifies retry/backoff behavior before success or dlq write.
as per coding guidelines "lib/**: ... verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/background-jobs.test.ts` around lines 26 - 49, Add a new Vitest case
mirroring the existing "retries and succeeds before exhausting attempts" test
that specifically simulates rate-limit (429) failures: import
runBackgroundJobWithRetry and getBackgroundJobDlqPath, create a task that throws
an error representing an HTTP 429 (e.g., an Error with status or statusCode =
429) for the first N-1 attempts and returns "ok" on the Nth attempt, call
runBackgroundJobWithRetry with small baseDelayMs/maxDelayMs and appropriate
maxAttempts, assert the result is "ok" and that the task ran the expected number
of attempts, and add a separate test that exhausts maxAttempts on repeated 429
errors and then asserts fs.stat(getBackgroundJobDlqPath()) rejects with { code:
"ENOENT" } or the appropriate DLQ presence check per existing tests; reference
runBackgroundJobWithRetry and getBackgroundJobDlqPath to locate where to
add/modify tests.
```typescript
import { describe, expect, it } from "vitest";
import { redactForExternalOutput } from "../lib/data-redaction.js";

describe("data redaction", () => {
  it("redacts sensitive keys recursively", () => {
    const value = {
      email: "person@example.com",
      profile: {
        accessToken: "abc",
        refresh_token: "def",
        nested: [{ accountId: "acct_123" }],
      },
    };

    const redacted = redactForExternalOutput(value);
    expect(redacted).toEqual({
      email: "***REDACTED***",
      profile: {
        accessToken: "***REDACTED***",
        refresh_token: "***REDACTED***",
        nested: [{ accountId: "***REDACTED***" }],
      },
    });
  });

  it("keeps non-sensitive fields unchanged", () => {
    const value = {
      command: "report",
      schemaVersion: 1,
      summary: { ok: 1, warn: 0, error: 0 },
    };
    expect(redactForExternalOutput(value)).toEqual(value);
  });
});
```
🧹 Nitpick | 🔵 Trivial
good baseline coverage for redaction, but consider adding edge case tests.
tests are deterministic and use vitest correctly. the recursive redaction test at lines 5-24 covers nested objects and arrays well.
per coding guidelines (test/**), consider adding regression cases for:
- null/undefined values in the input structure
- arrays containing non-object primitives mixed with objects
- empty objects and arrays
- very deep nesting to ensure no stack overflow on pathological inputs
the implementation in lib/data-redaction.ts:31-36 shows it handles arrays and non-objects, but explicit tests for these paths would improve confidence.
📝 suggested additional test cases

```typescript
it("handles null and undefined values without crashing", () => {
  const value = {
    email: "test@example.com",
    nullField: null,
    undefinedField: undefined,
    nested: { accessToken: "secret", empty: null },
  };
  const redacted = redactForExternalOutput(value);
  expect(redacted.email).toBe("***REDACTED***");
  expect(redacted.nullField).toBeNull();
  expect(redacted.nested.accessToken).toBe("***REDACTED***");
});

it("handles arrays with mixed primitives and objects", () => {
  const value = {
    items: [1, "string", { accountId: "acct_1" }, null],
  };
  const redacted = redactForExternalOutput(value);
  expect(redacted.items).toEqual([1, "string", { accountId: "***REDACTED***" }, null]);
});
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/data-redaction.test.ts` around lines 1 - 34, Add regression tests to
test/data-redaction.test.ts that exercise edge cases for the
redactForExternalOutput function: include a test that passes null and undefined
fields (and nested nulls) to ensure they are preserved rather than causing
errors, a test with arrays containing mixed primitives, objects and nulls to
ensure only sensitive keys in objects are redacted, a test for empty
objects/arrays to ensure they remain unchanged, and a deep-nesting test (e.g.,
>50 levels) to guard against recursion/stack issues; reference the
redactForExternalOutput symbol when adding these cases and assert exact expected
outputs (null stays null, primitives unchanged, sensitive fields replaced with
"***REDACTED***").
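The deep-nesting regression can be sketched self-contained. `redact` and its key set are assumptions standing in for the real `redactForExternalOutput` in `lib/data-redaction.ts`:

```typescript
// Stand-in for redactForExternalOutput; the sensitive-key set and exact
// behavior are assumptions for illustration only.
const SENSITIVE = new Set(["email", "accessToken", "refresh_token", "accountId"]);

function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([k, v]) =>
        SENSITIVE.has(k) ? [k, "***REDACTED***"] : [k, redact(v)],
      ),
    );
  }
  return value; // primitives, null, undefined pass through unchanged
}

// build 200 levels of nesting with a secret at the bottom
let deep: Record<string, unknown> = { accessToken: "secret" };
for (let i = 0; i < 200; i++) deep = { nested: deep };

// walk back down the redacted copy to the secret
let cursor: any = redact(deep);
for (let i = 0; i < 200; i++) cursor = cursor.nested;
console.log(cursor.accessToken); // ***REDACTED***
```

A pathological depth like this is what guards against stack overflow in a naively recursive implementation.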
```typescript
// test/file-lock.test.ts
await expect(
  acquireFileLock(lockPath, {
    maxAttempts: 2,
    baseDelayMs: 1,
    maxDelayMs: 2,
    staleAfterMs: 10_000,
  }),
).rejects.toBeTruthy();
```
assert the lock failure mode explicitly.
rejects.toBeTruthy() at test/file-lock.test.ts:49-57 is too broad and can pass on unrelated exceptions. assert a specific error code/message so this regression stays targeted.
as per coding guidelines "test/**: tests must stay deterministic and use vitest... reject changes that ... skip assertions."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/file-lock.test.ts` around lines 49 - 57, The test currently uses a broad
expectation (rejects.toBeTruthy()) for acquireFileLock; replace it with a
deterministic assertion that checks the specific failure mode—e.g., assert the
promise rejects with the LockAcquisitionError (or the concrete error class your
lock implementation throws) or matches a known error code/message (for example
use expect(acquireFileLock(...)).rejects.toThrow(/acquire.*lock/i) or
expect(...).rejects.toMatchObject({ code: 'LOCK_BUSY' })) so the test only
passes for the intended lock contention error; import the specific error type
from the file-lock implementation if available and use that in the assertion.
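The shape of a targeted assertion can be sketched self-contained. `LockBusyError`, `acquireLock`, and the `LOCK_BUSY` code are hypothetical; substitute whatever error type `lib/file-lock.ts` actually throws:

```typescript
// Hypothetical coded error standing in for the real lock-contention failure.
class LockBusyError extends Error {
  code = "LOCK_BUSY";
  constructor() {
    super("failed to acquire lock: busy");
  }
}

// Hypothetical stand-in for acquireFileLock under contention.
async function acquireLock(alreadyHeld: boolean): Promise<void> {
  if (alreadyHeld) throw new LockBusyError();
}

(async () => {
  try {
    await acquireLock(true);
  } catch (err) {
    // assert the concrete failure mode, not just "any rejection"
    console.log(err instanceof LockBusyError, (err as LockBusyError).code); // true LOCK_BUSY
  }
})();
```

In vitest this collapses to `await expect(acquireFileLock(...)).rejects.toMatchObject({ code: "LOCK_BUSY" })`, assuming such a code exists on the thrown error.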
```typescript
// test/idempotency.test.ts
it("records first key use and replays duplicates", async () => {
  const { checkAndRecordIdempotencyKey, getIdempotencyStorePath } =
    await import("../lib/idempotency.js");

  expect(
    await checkAndRecordIdempotencyKey("codex.auth.rotate-secrets", "key-1"),
  ).toEqual({ replayed: false });
  expect(
    await checkAndRecordIdempotencyKey("codex.auth.rotate-secrets", "key-1"),
  ).toEqual({ replayed: true });

  const content = await fs.readFile(getIdempotencyStorePath(), "utf8");
  expect(content).toContain("codex.auth.rotate-secrets");
});

it("expires entries based on ttl", async () => {
  const { checkAndRecordIdempotencyKey } = await import("../lib/idempotency.js");
  expect(
    await checkAndRecordIdempotencyKey("codex.auth.rotate-secrets", "key-2", 5),
  ).toEqual({ replayed: false });
  await new Promise((resolve) => setTimeout(resolve, 10));
  expect(
    await checkAndRecordIdempotencyKey("codex.auth.rotate-secrets", "key-2", 5),
  ).toEqual({ replayed: false });
});
```
missing concurrency regression for duplicate-key race.
test/idempotency.test.ts:26-50 only validates sequential calls. add a regression where concurrent calls race on the same scope/key and prove exactly one wins as non-replay. this is the critical behavior the file lock is meant to protect.
as per coding guidelines "test/**: demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/idempotency.test.ts` around lines 26 - 50, The test suite lacks a
concurrency regression for idempotency key races: add a new test that
concurrently calls checkAndRecordIdempotencyKey for the same scope/key (e.g.,
"codex.auth.rotate-secrets", "key-concurrent") and asserts that exactly one
invocation returns { replayed: false } while all others return { replayed: true
}; use Promise.all to invoke many parallel calls (10+), await them, then count
results to ensure a single winner, and reuse getIdempotencyStorePath to validate
the store if desired—this proves the file-locking behavior around
checkAndRecordIdempotencyKey under concurrent access.
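The winner-counting pattern can be sketched self-contained. `checkAndRecord` and its promise-chain mutex are stand-ins for `checkAndRecordIdempotencyKey` and the file lock that serializes check-then-record on disk:

```typescript
// In-memory stand-in for checkAndRecordIdempotencyKey: a promise-chain
// mutex serializes the check-then-record critical section so concurrent
// duplicates cannot both "win", mirroring the on-disk file lock.
const seen = new Set<string>();
let chain: Promise<void> = Promise.resolve();

function checkAndRecord(scope: string, key: string): Promise<{ replayed: boolean }> {
  const run = chain.then(() => {
    const id = `${scope}:${key}`;
    if (seen.has(id)) return { replayed: true };
    seen.add(id);
    return { replayed: false };
  });
  chain = run.then(() => undefined);
  return run;
}

(async () => {
  const results = await Promise.all(
    Array.from({ length: 10 }, () =>
      checkAndRecord("codex.auth.rotate-secrets", "key-concurrent"),
    ),
  );
  const winners = results.filter((r) => !r.replayed).length;
  console.log(winners); // 1
})();
```

The real regression would `Promise.all` over `checkAndRecordIdempotencyKey` calls the same way and assert exactly one result has `replayed: false`.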
```typescript
// test/secrets-crypto.test.ts
it("recognizes legacy and current encrypted prefixes", () => {
  expect(isEncryptedSecret("enc:v1:abc:def:ghi")).toBe(true);
  expect(isEncryptedSecret("enc:v2:abc:def:ghi:jkl")).toBe(true);
  expect(isEncryptedSecret("enc:v3:abc:def:ghi:jkl")).toBe(false);
});
```
add regression coverage for malformed prefixed plaintext inputs.
the suite checks prefix recognition, but it misses the case where a value begins with enc:v2: yet carries a truncated or malformed payload — that should still be recognized as encrypted.
ref: test/secrets-crypto.test.ts:62-66, lib/secrets-crypto.ts:107-110.
as per coding guidelines, "test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/secrets-crypto.test.ts` around lines 62 - 66, Add a deterministic
regression test in test/secrets-crypto.test.ts that covers malformed prefixed
plaintext inputs for the isEncryptedSecret function: specifically add a case
where the input begins with the "enc:v2:" prefix but lacks the full expected
segments (e.g., truncated/malformed payload) and assert that
isEncryptedSecret(...) still returns true; this ensures
lib/secrets-crypto.ts::isEncryptedSecret handles prefixed-but-malformed
plaintext correctly and prevents regressions around prefix recognition.
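Assuming recognition is purely prefix-based (an assumption — verify against lib/secrets-crypto.ts:107-110), the behavior under test looks like:

```typescript
// Assumed prefix-based stand-in for isEncryptedSecret; the real function
// in lib/secrets-crypto.ts may check more than the prefix.
function isEncrypted(value: string): boolean {
  return value.startsWith("enc:v1:") || value.startsWith("enc:v2:");
}

console.log(isEncrypted("enc:v2:truncated"));       // true  (malformed payload, known prefix)
console.log(isEncrypted("enc:v3:abc:def:ghi:jkl")); // false (unknown version)
console.log(isEncrypted("plain-secret"));           // false
```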
Summary
This PR implements a comprehensive enterprise hardening baseline across runtime behavior, storage safety, CLI contracts, CI policy gates, and operations documentation.
What’s Included
- Secret rotation command (`codex auth rotate-secrets`)
- `schemaVersion` contract in JSON outputs
- List JSON pagination (`--page-size`, `--cursor`)
- Required checks policy (`.github/settings.yml`)

Validation
- `npm run typecheck`
- `npm run lint`
- `npm run build && npm test`
- `npm run coverage`
- `npm run audit:ci`
- `npm run license:check`
- `npm run clean:repo:check`

Notes
- Branch `feat/enterprise-hardening`, based on `main` during implementation.

note: greptile review for oc-chatgpt-multi-auth. cite files like `lib/foo.ts:123`. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary
comprehensive enterprise hardening baseline implementing cross-process file locking for all settings/account writes (defends windows antivirus/concurrent access races), at-rest token encryption with scrypt kdf and key rotation, rbac/abac authorization gates with audited break-glass, background job retry + dead-letter queue, idempotency key support for
`rotate-secrets`, json output redaction, and data retention enforcement.

key improvements addressing windows filesystem safety:
- exclusive-create lock files (`wx`)
- handling of `EBUSY`/`EPERM` errors
- `Atomics.wait` instead of blocking the event loop
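The lock-then-atomic-write pattern summarized above can be sketched as a small helper. Names here are illustrative, not the repo's actual API:

```typescript
import { promises as fs } from "node:fs";
import { randomBytes } from "node:crypto";
import * as os from "node:os";
import * as path from "node:path";

// Illustrative sketch: the lock file is created with the exclusive "wx"
// flag (fails if another process holds it), data is written to a temp
// file, and an atomic rename publishes it — so readers never observe a
// partially written settings file.
async function writeAtomically(target: string, data: string): Promise<void> {
  const lockPath = `${target}.lock`;
  await fs.writeFile(lockPath, String(process.pid), { flag: "wx" }); // exclusive create
  try {
    const tmp = `${target}.${randomBytes(4).toString("hex")}.tmp`;
    await fs.writeFile(tmp, data, "utf8");
    await fs.rename(tmp, target); // atomic on the same filesystem
  } finally {
    await fs.rm(lockPath, { force: true }); // always release the lock
  }
}

(async () => {
  const target = path.join(os.tmpdir(), "settings-demo.json");
  await writeAtomically(target, '{"ok":true}');
  console.log(await fs.readFile(target, "utf8")); // {"ok":true}
  await fs.rm(target, { force: true });
})();
```

A contender that hits the existing `.lock` file gets an `EEXIST` rejection and would retry with backoff, which is where the `EBUSY`/`EPERM` handling and DLQ fallback come in.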
ci/supply-chain controls:
test coverage:
all validation commands pass per pr description. previous review comments addressed: scrypt kdf replaces sha256, break-glass now audited, concurrent write race test added.
Confidence Score: 5/5
Important Files Changed
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[CLI Command] --> B{Authorization Check}
    B -->|Break-Glass Set| C[Audit Break-Glass]
    B -->|ABAC Policy| D{Policy Allows?}
    D -->|Denied| E[Return Error]
    D -->|Allowed| F{RBAC Role Check}
    F -->|Denied| E
    F -->|Allowed| G[Execute Action]
    C --> G
    G --> H{Mutates Storage?}
    H -->|No| I[Return Success]
    H -->|Yes| J[Acquire File Lock]
    J --> K{Lock Acquired?}
    K -->|Timeout/Fail| L[Background Job Retry]
    K -->|Success| M[Encrypt Secrets]
    M --> N[Write Temp File]
    N --> O[Atomic Rename]
    O --> P[Release Lock]
    P --> I
    L --> Q{Max Retries?}
    Q -->|Yes| R[Write to DLQ]
    Q -->|No| S[Exponential Backoff]
    S --> J
    R --> E
```

Last reviewed commit: d1c603e