feat(reliability): add retry governor controls and telemetry by ndycode · Pull Request #40 · ndycode/codex-multi-auth

ndycode · 2026-03-04T08:42:01Z

Summary

add a pure retry governor decision module for all-rate-limited retry behavior
add `retryAllAccountsAbsoluteCeilingMs` + `CODEX_AUTH_RETRY_ALL_ABSOLUTE_CEILING_MS` and wire it into the request loop
expose retry ceiling in Settings Hub (Rotation & Quota)
add structured `codex-metrics` counters for retry governor stop reasons
update docs and tests for config/schema/settings parity

Validation

npm run typecheck
npm run lint
npm run build
npm test
npm run clean:repo:check
npm run audit:ci

note: greptile review for oc-chatgpt-multi-auth. cite files like `lib/foo.ts:123`. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary

adds pure retry governor module to centralize all-rate-limited retry decisions with new absoluteCeilingMs config and env var. clean separation of decision logic from request loop enables isolated unit testing. three new telemetry counters track governor stop reasons (wait>max, retry limit, absolute ceiling).

key changes:

lib/request/retry-governor.ts - pure decision function with comprehensive edge case handling (NaN, Infinity, negative values)
index.ts - integrated governor, tracks accumulated wait across retries, logs stop reasons
lib/config.ts + lib/schemas.ts - new retryAllAccountsAbsoluteCeilingMs field with env override CODEX_AUTH_RETRY_ALL_ABSOLUTE_CEILING_MS
settings hub exposes ceiling control in rotation & quota category (0 to 24h range)
comprehensive test coverage: unit tests for governor logic + integration test verifying ceiling enforcement + telemetry
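A minimal sketch of what a pure governor decision function along these lines could look like. This is not the code in lib/request/retry-governor.ts: the type names, field names, and stop-reason strings are hypothetical, and treating a limit of 0 (or NaN, Infinity, or a negative value) as "disabled" is an assumption made only for illustration.

```typescript
// Hypothetical shape; the PR's real module may differ in names and semantics.
type StopReason = "wait-exceeds-max" | "retry-limit" | "absolute-ceiling";

interface GovernorInput {
  waitMs: number;            // base wait proposed for the next retry
  maxWaitMs: number;         // per-retry wait cap
  retryCount: number;        // retries already performed
  maxRetries: number;        // retry cap
  accumulatedWaitMs: number; // total base wait across earlier retries
  absoluteCeilingMs: number; // cap on total wait
}

type GovernorDecision =
  | { retry: true }
  | { retry: false; reason: StopReason };

// Assumption: a limit that is 0, negative, NaN, or Infinity means "disabled".
function asLimit(value: number): number {
  return Number.isFinite(value) && value > 0 ? value : Number.POSITIVE_INFINITY;
}

// NaN and negative amounts collapse to 0; Infinity is kept so it can be rejected.
function asAmount(value: number): number {
  return Number.isNaN(value) || value < 0 ? 0 : value;
}

// Pure: no clock, no sleeping, no logging. The caller owns all side effects.
function decideRetry(input: GovernorInput): GovernorDecision {
  const waitMs = asAmount(input.waitMs);
  const accumulatedMs = asAmount(input.accumulatedWaitMs);

  // An infinite wait can never be honored, even when the cap is disabled.
  if (!Number.isFinite(waitMs) || waitMs > asLimit(input.maxWaitMs)) {
    return { retry: false, reason: "wait-exceeds-max" };
  }
  if (asAmount(input.retryCount) >= asLimit(input.maxRetries)) {
    return { retry: false, reason: "retry-limit" };
  }
  if (accumulatedMs + waitMs > asLimit(input.absoluteCeilingMs)) {
    return { retry: false, reason: "absolute-ceiling" };
  }
  return { retry: true };
}
```

Because the function takes plain numbers and returns a plain value, each stop reason can be unit-tested without mocking timers or the request loop.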

notes:

accumulated wait tracking uses base waitMs but actual sleeps include ±20% jitter, so real wait can exceed ceiling by ~20% (commented)
no windows filesystem or token safety concerns - pure timing logic
no concurrency issues - accumulatedAllRateLimitedWaitMs is local to request loop scope
all tests passing, docs updated, config/schema/settings parity maintained
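To illustrate the jitter note above, here is a small sketch comparing the two accounting choices. `applyJitter` and `simulateWaits` are hypothetical helpers, not functions from index.ts, and the injected `random` source exists only so the worst case can be reproduced deterministically.

```typescript
// Scales a base wait by a random factor in [1 - ratio, 1 + ratio].
function applyJitter(
  baseMs: number,
  ratio: number = 0.2,
  random: () => number = Math.random,
): number {
  const factor = 1 + (random() * 2 - 1) * ratio;
  return Math.max(0, Math.round(baseMs * factor));
}

interface WaitTotals {
  baseMs: number;  // what a base-only accumulator would record
  sleptMs: number; // what the process would actually sleep
}

// Sums both the base waits and the jittered sleeps for a series of retries.
function simulateWaits(baseWaits: number[], random: () => number): WaitTotals {
  let baseMs = 0;
  let sleptMs = 0;
  for (const waitMs of baseWaits) {
    baseMs += waitMs;
    sleptMs += applyJitter(waitMs, 0.2, random);
  }
  return { baseMs, sleptMs };
}
```

With three 1000 ms base waits and jitter pinned at its upper bound, the base-only total is 3000 ms while the slept total is 3600 ms, so a 3000 ms ceiling checked against base values would be overshot by 20%. Accumulating the jittered value instead would make the ceiling a hard bound, at the cost of a less predictable retry count.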

Confidence Score: 4/5

safe to merge - well-tested retry logic with minor jitter tracking discrepancy
pure decision logic with comprehensive unit + integration tests, clean separation of concerns, and thorough edge case handling. minor style issue: accumulated wait tracks base values while actual sleeps include ±20% jitter, allowing real wait to slightly exceed ceiling. all validation passing, docs complete, no concurrency or safety risks.
index.ts around line 2417 - consider whether jitter should be included in accumulated wait tracking

Important Files Changed

| Filename | Overview |
| --- | --- |
| lib/request/retry-governor.ts | new pure decision module for retry-all-rate-limited logic with comprehensive edge case handling |
| index.ts | integrated retry governor, added telemetry counters and accumulated wait tracking - minor jitter accounting issue |
| lib/config.ts | added getRetryAllAccountsAbsoluteCeilingMs with env override and min constraint |
| lib/codex-manager/settings-hub.ts | exposed retry ceiling in rotation & quota category with proper bounds and formatting |
| test/retry-governor.test.ts | comprehensive unit tests covering all stop reasons and edge cases |
| test/index-retry.test.ts | added integration test verifying absolute ceiling enforcement and telemetry |
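For readers unfamiliar with the "env override and min constraint" pattern mentioned for lib/config.ts, a rough sketch follows. Only the environment variable name comes from this PR; `resolveCeilingMs`, its signature, the default of 0, and the fallback order are assumptions, not the behavior of getRetryAllAccountsAbsoluteCeilingMs.

```typescript
const CEILING_ENV_VAR = "CODEX_AUTH_RETRY_ALL_ABSOLUTE_CEILING_MS";

// Hypothetical resolver: a parsable env value wins, then the config value,
// then the default. The result is floored and clamped to a minimum of 0.
function resolveCeilingMs(
  configValue: number | undefined,
  env: Record<string, string | undefined>,
  defaultMs: number = 0, // assumed default, not confirmed by the PR
): number {
  const raw = env[CEILING_ENV_VAR];
  const parsed = raw === undefined || raw.trim() === "" ? NaN : Number(raw);

  // An unparsable env value falls through to the config value.
  const candidate = Number.isFinite(parsed) ? parsed : configValue ?? defaultMs;
  if (!Number.isFinite(candidate)) return defaultMs;

  // Min constraint: negative values clamp to 0.
  return Math.max(0, Math.floor(candidate));
}
```

Passing the environment in as a parameter, rather than reading a global, keeps the resolver pure and easy to cover in tests such as those in test/plugin-config.test.ts.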

Last reviewed commit: d7a8a4d

Adds a pure retry governor for all-rate-limited flows, introduces an absolute wait ceiling setting with env override, and wires decision-based retry gating into the request loop. Also exposes retry ceiling in Settings Hub (Rotation & Quota), and adds structured codex-metrics counters for retry governor stop reasons.

Validation:

- npm run typecheck
- npm run lint
- npm run build
- npm test
- npm run clean:repo:check
- npm run audit:ci

Co-authored-by: Codex <noreply@openai.com>

coderabbitai · 2026-03-04T08:42:12Z

Warning

Rate limit exceeded

@ndycode has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 9 minutes and 15 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a7fa01f5-4bd0-443b-bbcb-6e4e1127a2fd

📥 Commits

Reviewing files that changed from the base of the PR and between d36b04f and d7a8a4d.

📒 Files selected for processing (14)

docs/development/CONFIG_FIELDS.md
docs/reference/settings.md
index.ts
lib/codex-manager/settings-hub.ts
lib/config.ts
lib/request/retry-governor.ts
lib/schemas.ts
test/codex-manager-cli.test.ts
test/index-retry.test.ts
test/index.test.ts
test/plugin-config.test.ts
test/retry-governor.test.ts
test/schemas.test.ts
test/settings-hub-utils.test.ts

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch transform/stage-01-reliability

Comment @coderabbitai help to get the list of available commands and usage tips.

ndycode · 2026-03-05T02:15:25Z

@coderabbitai review

Re-triggering review to clear the transient rate-limit status check failure.

coderabbitai · 2026-03-05T02:15:38Z

@ndycode: re-triggering the review now.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

ndycode · 2026-03-05T02:37:07Z

Re-requesting CodeRabbit review: the current failure is CodeRabbit: Review rate limit exceeded while other checks are green.

ndycode · 2026-03-05T05:00:58Z

@coderabbitai review

coderabbitai · 2026-03-05T05:01:07Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

ndycode · 2026-03-05T05:02:46Z

Follow-up tracking: external blocker recorded in #44.

Current failing context remains:

CodeRabbit: Review rate limit exceeded

Actions already taken:

re-triggered @coderabbitai review
verified review threads are resolved

Next required step is external quota reset/increase, then rerun CodeRabbit.

coderabbitai · 2026-03-05T05:03:05Z

@ndycode: acknowledged — the rate limit block is external and nothing on the pr side is holding things up. once the quota resets (tracked in #44), re-trigger with @coderabbitai review and the incremental review will resume from d7a8a4d.

no action needed from you in the meantime — all threads are clean and the pr is in good shape.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

ndycode mentioned this pull request Mar 5, 2026

External blocker: CodeRabbit rate limit failing checks on PRs #32/#36/#38/#40/#43 #44

Open

ndycode wants to merge 1 commit into main from transform/stage-01-reliability
