This fork is meant to make the Codex command-line agent easier to understand as a working system. The starting point is that Codex can look opaque from the outside because model selection, backend routing, tool execution, connectors, and turn handling are spread across several layers. The purpose of this README is to make those layers easier to inspect in plain English, so a builder can understand how a session works, how a turn works, and where the main decisions and integrations live in the code.
The previous repository README has been moved to README_LEGACY.md.
This section gives a top-level mental model for the architecture used by this CLI.
This section introduces the three basic units that make the rest of the document easier to read. The starting point is that the system is not just a single prompt going to a model. It is a longer-lived interaction structure with a user, an ongoing session, and many turns inside that session. The practical value of naming these units early is that later sections can describe model requests, tool execution, reconnects, and backend routing without mixing together short-lived and long-lived parts of the system.
- User - the human interacting with Codex through the CLI or a host UI such as an editor integration
- Session - the longer-lived conversation or thread that carries the conversation identity over time
- Turn - one unit of work started by user input and completed only after the model and any follow-up tool activity for that input settle
A session can contain many turns. A turn is not just one outbound request. It can include model output, tool calls, tool results, retries, reconnects, and follow-up model requests before the turn is done. In the code, this idea shows up as turn-scoped state such as `TurnContext`, `TurnState`, `ActiveTurn`, and `ModelClientSession`.
This section names the main parts that show up when one turn runs inside a session. The starting point is that a turn is not a single isolated API call. It is a unit of work that passes through several parts of the system. The practical question is how to describe those parts clearly without pretending that the code uses exactly the same labels. In this README, the labels below are explanatory labels that help describe the turn structure in plain English, even when the code uses more specific internal names.
- User input - starts or steers the turn
- Session runtime - the local runtime that builds the request, selects the available tools, and manages the turn; in the code, this is closer to the `Session` plus turn-scoped state than to a separate formal layer named `CLI`
- Model request path - the model-facing network path, including requests such as `/responses`, `/models`, `/responses/compact`, and realtime transport
- Local tools path - the built-in local tools such as `list_dir`, `shell`, `apply_patch`, `grep_files`, and `read_file`
- Connectors path - the app and integration side of the system, where connector-backed tools and app metadata are surfaced through the ChatGPT connectors ecosystem
This is the core execution logic of a turn. The starting point is that the session already has a model choice, a provider, and working context. The complication is that the model may need to do more than produce text: it may need to inspect files, run commands, or use connector-backed capabilities. The practical question is who decides which tool to use and where that tool runs. In this codebase, the answer is that the CLI sends the model request together with the available tool schema, the model decides whether to call a tool, and the local runtime executes local tools and returns the result into the same turn until the work is complete.
- Model selection - starts from session configuration and collaboration-mode settings, then resolves against the available model catalog and provider configuration.
- Tool availability - for each turn, the CLI builds a prompt that includes the current tool menu.
- Model decision - the normal model request sends that tool schema to the model-facing backend, and the model can answer with plain content or with tool calls.
- Local execution - if the model calls a local tool, the CLI dispatches that tool locally and records the output back into the turn.
- Turn completion - the turn ends only after the model and any tool work for that user request have finished.
A "reconnecting" message in a UI fits this model. It usually means the streaming transport for the current turn is retrying or reconnecting, not that the whole session has started over.
This repository uses a real model-list endpoint and also ships with an offline model catalog.
- Live model discovery comes from `GET /v1/models`.
- The local bundled fallback catalog lives at `codex-rs/core/models.json`.
- The model manager starts from the bundled catalog and can later refresh from the network.
- You can also force a custom offline catalog with `model_catalog_json` in config.
- The system is centered on the OpenAI `Responses` API rather than the older `chat` wire format.
- When the user is authenticated through ChatGPT instead of an API key, model traffic can be routed through the ChatGPT Codex backend at `https://chatgpt.com/backend-api/codex`.
Current model slugs present in the bundled local catalog:
- `gpt-5.3-codex`
- `gpt-5.4`
- `gpt-5.2-codex`
- `gpt-5.1-codex-max`
- `gpt-5.1-codex`
- `gpt-5.2`
- `gpt-5.1`
- `gpt-5-codex`
- `gpt-5`
- `gpt-oss-120b`
- `gpt-oss-20b`
- `gpt-5.1-codex-mini`
- `gpt-5-codex-mini`
Additional notes about the model layer:
- Main inference path: `POST /v1/responses`
- Model catalog path: `GET /v1/models`
- Conversation compaction path: `POST /v1/responses/compact`
- Responses WebSocket transport is supported
- Realtime voice/session transport is supported through `/v1/realtime`
- Audio transcription is handled separately through `/v1/audio/transcriptions`
- A local model catalog can be supplied through `model_catalog_json`
- The code explicitly rejects `wire_api = "chat"` and expects `wire_api = "responses"`
This section is meant to make the overall request choreography easier to reason about.
- Model: the actual GPT/Codex model, such as `gpt-5.4` or `gpt-5-codex`.
- Backend: the HTTP service that the CLI talks to when it sends model-facing requests.
- Direct API mode: the backend is `https://api.openai.com/v1`.
- ChatGPT Codex backend mode: the backend is `https://chatgpt.com/backend-api/codex`.
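The two backend modes can be summarized as a base-URL choice. This is a minimal sketch; the function name and auth-mode labels are illustrative, not names from the codebase.

```python
# Hypothetical helper showing the backend split described above.
def backend_base_url(auth_mode: str) -> str:
    if auth_mode == "api_key":
        # Direct API mode: talk straight to the public OpenAI API.
        return "https://api.openai.com/v1"
    if auth_mode == "chatgpt":
        # ChatGPT-authenticated mode: route through the ChatGPT Codex backend.
        return "https://chatgpt.com/backend-api/codex"
    raise ValueError(f"unknown auth mode: {auth_mode}")
```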
This subsection narrows the terms so they are easier to use in a mutually exclusive and collectively exhaustive way within this README.
- OpenAI public API: the public API surface rooted at `https://api.openai.com/v1`
- ChatGPT Codex backend: the narrow Codex model-serving family rooted at `https://chatgpt.com/backend-api/codex`
- ChatGPT backend ecosystem used by Codex: the broader ChatGPT-side backend surface touched by this codebase
Working partition used in this README:
- Public API bucket: `https://api.openai.com/v1/...`
- Narrow ChatGPT Codex bucket: `https://chatgpt.com/backend-api/codex/...`
- Broader ChatGPT backend ecosystem bucket: ChatGPT-side routes outside the narrow `codex` family, including:
  - `https://chatgpt.com/backend-api/wham/...`
  - `https://chatgpt.com/backend-api/connectors/directory/...`
  - `https://chatgpt.com/backend-api/plugins/...`
  - `https://chatgpt.com/backend-api/transcribe`
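The three-bucket partition above can be expressed as a small classifier. The bucket labels are this README's own, and the prefix rules are a simplification for illustration.

```python
from urllib.parse import urlparse

# Illustrative classifier for the working partition used in this README.
def classify(url: str) -> str:
    p = urlparse(url)
    if p.netloc == "api.openai.com" and p.path.startswith("/v1/"):
        return "public-api"                 # public API bucket
    if p.netloc == "chatgpt.com" and p.path.startswith("/backend-api/codex"):
        return "chatgpt-codex"              # narrow ChatGPT Codex bucket
    if p.netloc == "chatgpt.com" and p.path.startswith("/backend-api/"):
        return "chatgpt-ecosystem"          # broader ChatGPT backend ecosystem
    return "other"
```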
Working rule for this README:
- When we say `ChatGPT Codex backend`, we mean only the narrow `.../backend-api/codex` family.
- When we say `ChatGPT backend ecosystem used by Codex`, we mean the full ChatGPT-side surface used by this codebase, including the narrow `codex` family plus the adjacent families listed above.
- Section 4 is therefore intentionally broader than the term `ChatGPT Codex backend`.
- The CLI does not only "hold a chat locally" and then somehow reveal it all at once.
- The conversation is expressed over time through a model request stream for the conversation.
- That stream can include user-turn submission, model streaming output, tool-related events, model catalog lookups, compaction requests, and realtime session traffic.
- In ChatGPT-authenticated mode, those conversation-related model-facing requests can be routed through the ChatGPT Codex backend at `https://chatgpt.com/backend-api/codex`.
- That backend should not be thought of as the model itself. It is better understood as the service layer or gateway that receives those requests and handles account, workspace, orchestration, and routing concerns before or around model execution.
- Public OpenAI API documentation exists for the public API surfaces such as the Responses API and related endpoints.
- Useful public starting points are:
  - Codex docs: `https://developers.openai.com/codex`
  - Responses API docs: `https://platform.openai.com/docs/api-reference/responses`
  - Audio transcription docs are part of the OpenAI platform API documentation.
- I did not find a public formal API reference in this repo for the ChatGPT-specific backend routes such as `https://chatgpt.com/backend-api/codex/...` or `/wham/...`.
- For this README, the safest working assumption is that the public OpenAI docs describe the public API at `api.openai.com`, while the ChatGPT backend routes seen here behave like product/backend integration endpoints rather than a separately documented public API.
This section clusters the built-in tools that operate on the local machine or local workspace during a session. These are separate from connectors. The exact set available in a session depends on configuration, feature flags, and tool mode.
- `list_dir` - lists entries in a local directory with numbered output and simple type labels
- `read_file` - reads a local file with line numbers and can expand around indentation-aware code blocks
- `grep_files` - searches local file contents for a regex pattern and returns matching file paths
- `view_image` - loads a local image file by filesystem path for model inspection
- `shell` - runs a local shell command on the current machine
- `shell_command` - runs a shell script string in the user's default shell
- `local_shell` - compatibility alias for local shell execution
- `exec_command` - unified exec-style local command runner with richer session control
- `write_stdin` - sends stdin to a running unified exec session
- `apply_patch` - edits local files by applying structured patches
- `js_repl` - runs local JavaScript in the bundled runtime
- `artifacts` - runs local JavaScript against the preinstalled artifact runtime for generated files such as presentations or spreadsheets
- Local computer tools act directly on the current machine or local workspace.
- Connectors are app/integration identities associated with ChatGPT-side app surfaces and connector-backed tools.
- A request such as "list a local directory" maps to the local tool surface, not to the connectors directory.
- `list_dir` is the clearest example of a built-in local tool rather than a connector-backed capability.
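As a toy illustration of what a `list_dir`-style local tool does, here is a minimal handler that produces numbered entries with simple type labels. The output format here is this README's own invention; the real tool's formatting may differ.

```python
import os

# Toy sketch of a list_dir-style local tool handler.
def list_dir(path: str) -> str:
    lines = []
    for i, name in enumerate(sorted(os.listdir(path)), start=1):
        kind = "dir" if os.path.isdir(os.path.join(path, name)) else "file"
        lines.append(f"{i}. [{kind}] {name}")
    return "\n".join(lines)
```

Note that this runs entirely on the local machine; the model only ever sees the returned text, which is exactly the local-versus-connector distinction drawn above.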
This repo does not appear to use a separate "tool catalog dispatch" API call. Instead, the available tool menu is attached to the normal model request for the turn.
- Step 1 - the CLI builds the available tool set locally. The tool router assembles built-in tools, MCP tools, connector-backed app tools, and dynamic tools, then keeps the subset that is model-visible for the turn.
- Step 2 - the turn prompt includes that tool set. The prompt carries the model-visible tools as part of the turn request state.
- Step 3 - the tool specs are serialized into Responses API JSON. The local tool definitions are converted into the `tools` array sent to the model-facing backend.
- Step 4 - the normal model request sends the tools to the server. The standard `/responses` request, or websocket equivalent, carries `tools`, `tool_choice`, and `parallel_tool_calls`.
- Step 5 - the model decides whether to call a tool. This repo uses `tool_choice = "auto"` in the normal Responses request path.
- Step 6 - the CLI receives the tool call and dispatches it locally. The streamed response item is parsed into a local tool invocation, matched to a registered handler, and executed on the client side.
- Step 7 - the CLI sends the tool result back into the conversation. The result is returned as a `function_call_output`-style item and fed into the next step of the turn.
In plain terms:
- The server sees the tool schema.
- The model chooses whether to call a tool.
- The local CLI owns the actual execution of local tools.
- Tool output is fed back to the model as structured conversation state, not as an out-of-band side channel.
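Steps 6 and 7 in particular can be sketched as a small dispatch function: a streamed `function_call` item is matched to a registered handler, executed locally, and turned into a `function_call_output` item. The handler registry and item shapes here are simplified placeholders, not the repo's actual types.

```python
import json

# Toy handler registry; the real router holds built-in, MCP, connector-backed,
# and dynamic tools. The "shell" handler below is a stub for illustration.
HANDLERS = {"shell": lambda args: f"$ {args['command']}\nok"}

def dispatch(item: dict) -> dict:
    """Execute a function_call item locally, return the result item."""
    handler = HANDLERS[item["name"]]                 # match registered handler
    output = handler(json.loads(item["arguments"]))  # execute on the client side
    return {"type": "function_call_output",          # structured result item
            "call_id": item["call_id"],
            "output": output}
```

The returned item is appended to the conversation state and carried into the next model request of the same turn, which is what "fed back as structured conversation state" means in practice.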
This section is a repo-local documentation cluster for the broader ChatGPT-side backend surfaces used by this codebase. It is descriptive, not official.
- Broad ChatGPT-side base in this repo: `https://chatgpt.com/backend-api/`
- Narrow Codex model-serving base used for model-facing requests: `https://chatgpt.com/backend-api/codex`
- Scope of this section: the broader `ChatGPT backend ecosystem used by Codex`, not only the narrow `ChatGPT Codex backend`
- Common auth shape: `Authorization: Bearer <token>`
- Common ChatGPT account header: `ChatGPT-Account-ID`
These are the ChatGPT-side Codex model-serving routes. They are the closest analogue to the direct `api.openai.com/v1` model API.
- `/backend-api/codex/responses` - the ChatGPT-side counterpart of `POST /v1/responses`
- `/backend-api/codex/models` - the ChatGPT-side counterpart of `GET /v1/models`
- `/backend-api/codex/responses/compact` - the ChatGPT-side counterpart of `POST /v1/responses/compact`
- `/backend-api/codex/...` websocket-derived `responses` transport - websocket transport for responses when the provider is the ChatGPT Codex backend
These routes appear to cover product/backend features around tasks, usage, requirements, and environments.
- `/backend-api/wham/usage` - usage and rate-limit style information
- `/backend-api/wham/tasks/list` - list tasks
- `/backend-api/wham/tasks/{id}` - fetch a specific task
- `/backend-api/wham/tasks/{id}/turns/{turn_id}/sibling_turns` - fetch sibling attempts or related turn variants
- `/backend-api/wham/tasks` - create tasks
- `/backend-api/wham/config/requirements` - fetch managed requirements or config requirements
- `/backend-api/wham/environments` - list environments
- `/backend-api/wham/environments/by-repo/{provider}/{owner}/{repo}` - map a repository to candidate environments
- `/backend-api/wham/apps` - legacy apps MCP gateway path
These routes are used to discover connector/app metadata.
- `/backend-api/connectors/directory/list` - list discoverable connector directory entries
- `/backend-api/connectors/directory/list_workspace` - list workspace-specific connector entries
These routes are used for remote plugin status synchronization.
- `/backend-api/plugins/list` - fetch remote plugin installation or enabled-state summaries
These routes are used when voice transcription goes through ChatGPT-authenticated mode instead of the public OpenAI audio API.
- `/backend-api/transcribe` - audio transcription endpoint used in ChatGPT-authenticated mode
- I do not see a formal public API reference for these ChatGPT backend routes in this repo.
- The public OpenAI documentation appears to document the public API surface, not these product/backend routes.
- For that reason, this section should be treated as a reverse-engineered route map from the codebase.
Primary OpenAI API paths used by this codebase:
- `POST /v1/responses`
- `GET /v1/models`
- `POST /v1/responses/compact`
- WebSocket transport for `responses`
- Realtime WebSocket at `/v1/realtime`
- `POST /v1/audio/transcriptions`
- Connectors MCP gateway at `/v1/connectors/gateways/flat/mcp`
OpenAI-owned backend services used by the system:
- `https://api.openai.com/v1`
- `https://auth.openai.com`
- `https://chatgpt.com/backend-api/codex`
- `https://chatgpt.com/backend-api/wham/...`
- `https://chatgpt.com/backend-api/connectors/directory/...`
ChatGPT-backed routes used in the repo include:
- `/wham/usage`
- `/wham/tasks/list`
- `/wham/tasks/{id}`
- `/wham/tasks/{id}/turns/{turn_id}/sibling_turns`
- `/wham/config/requirements`
- `/wham/tasks`
- `/connectors/directory/list`
- `/connectors/directory/list_workspace`
- `/wham/apps`
Auth and request behavior:
- Standard auth header: `Authorization: Bearer <token>`
- ChatGPT-backed requests also use `ChatGPT-Account-ID`
- The built-in OpenAI provider can send `OpenAI-Organization` from `OPENAI_ORGANIZATION`
- The built-in OpenAI provider can send `OpenAI-Project` from `OPENAI_PROJECT`
- Default clients attach an `originator` header
- Realtime requests include `x-session-id`
- Subagent traffic can include `x-openai-subagent`
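Putting the header rules together, a request header set can be sketched as below. The helper name is ours, the `originator` value is a placeholder, and which optional headers appear depends on auth mode and environment.

```python
# Hypothetical header builder reflecting the auth behavior listed above.
def request_headers(token, account_id=None, organization=None, project=None):
    headers = {
        "Authorization": f"Bearer {token}",   # standard auth header
        "originator": "codex-cli",            # placeholder originator value
    }
    if account_id:                            # ChatGPT-backed requests
        headers["ChatGPT-Account-ID"] = account_id
    if organization:                          # from OPENAI_ORGANIZATION
        headers["OpenAI-Organization"] = organization
    if project:                               # from OPENAI_PROJECT
        headers["OpenAI-Project"] = project
    return headers
```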
OAuth and login services:
- OAuth issuer defaults to `https://auth.openai.com`
- Browser login uses `/oauth/authorize`
- Token exchange uses `/oauth/token`
- Refresh-token handling also uses `https://auth.openai.com/oauth/token`
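The browser-login URL shape follows the standard OAuth 2.0 authorization-code pattern. The PKCE parameters shown below are typical for CLI logins but are an assumption here, and the client id and redirect URI are placeholders, not the CLI's real values.

```python
from urllib.parse import urlencode

# Sketch of the /oauth/authorize URL built for browser login.
def authorize_url(issuer, client_id, redirect_uri, code_challenge, state):
    params = {
        "response_type": "code",              # authorization-code flow
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "code_challenge": code_challenge,     # PKCE (assumed, not confirmed here)
        "code_challenge_method": "S256",
        "state": state,
    }
    return f"{issuer}/oauth/authorize?{urlencode(params)}"
```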
Notably absent from the runtime code paths inspected here:
- No runtime use of `chat/completions`
- No runtime use of the Assistants API
- No runtime use of Embeddings
- No runtime use of Moderations
- No runtime use of image generation endpoints
- No runtime use of fine-tuning endpoints
Use the bundled catalog:
sed -n '1,120p' codex-rs/core/models.jsonList only model slugs:
rg -o '"slug":\s*"[^"]+"' codex-rs/core/models.json | sed 's/.*"slug":\s*"//; s/"$//'If you want the application to use a local catalog without contacting the server, point config at a JSON file through model_catalog_json.
- Model traffic - the network requests that the CLI sends to the model-serving backend to get model behavior. In practice, this means requests such as `POST /v1/responses`, `GET /v1/models`, websocket connections for `responses`, realtime websocket sessions, and similar calls that directly support model selection, inference, streaming output, or model metadata. When I said model traffic can be routed through the ChatGPT Codex backend, I meant those model-related requests may go to `https://chatgpt.com/backend-api/codex` instead of directly to `https://api.openai.com/v1`, depending on how the user is authenticated.
- Model-facing requests - another way to describe model traffic. These are the requests that are sent toward the service layer that fronts model execution and model metadata. They are "facing" the model layer because they ask for model output, model streaming, model selection, model metadata, or related context-management behavior.
- Model request stream for the conversation - the sequence of model-facing requests that together represent an ongoing CLI conversation. Instead of thinking of the chat as one giant opaque blob, it is more accurate to think of it as a stream of requests and responses over time: user turn submission, streaming response output, tool-related follow-up, model lookup, compaction, and sometimes realtime session traffic.
- Session - the longer-lived conversation or thread identity that persists across multiple user interactions. In the code, this is closely related to the session-scoped `ModelClient` and the `conversation_id` or thread identity carried through the system.
- Session-scoped - state or objects that are intended to live across multiple turns within one conversation or thread. In this repo, `ModelClient` is explicitly documented as session-scoped, while the thread identity also appears as `conversation_id` or `thread_id` depending on the code path.
- Turn - one unit of user work within a session. A turn may include more than one model request because tool calls, tool outputs, reconnects, retries, or follow-up requests can all occur before that turn finishes. In the code, this idea appears in `TurnContext`, `TurnState`, and `ActiveTurn`.
- Turn-scoped session - a per-turn streaming or model client context. In this repo, `ModelClientSession` is turn-scoped: it can reuse websocket and sticky-routing state within one turn, but a session as a whole can contain many turns.
- ChatGPT Codex backend - the backend service that receives the CLI's conversation-related model-facing requests before those requests are fulfilled by actual model execution. In this repo, that backend is represented by `https://chatgpt.com/backend-api/codex`.
- ChatGPT backend ecosystem used by Codex - a broader term than `ChatGPT Codex backend`. This wider bucket includes the ChatGPT Codex backend itself plus adjacent ChatGPT-side backend families such as `wham`, connectors directory routes, plugin routes, and transcription routes.
- Connectors - ChatGPT-side integration surfaces that let the system discover and work with external tools or apps through directory-style endpoints such as `/backend-api/connectors/directory/list` and `/backend-api/connectors/directory/list_workspace`. In this repo, connectors are part of the broader ChatGPT backend ecosystem used by Codex rather than part of the narrow `codex` model-serving family.