benzntech commented Dec 11, 2025

TLDR

This PR adds dynamic multi-model support to Qwen Code, allowing users to fetch and switch between models from OpenAI-compatible API endpoints at runtime, without restarting. It removes the hardcoded model configuration and adds a new /model command that dynamically discovers available models from any OpenAI-compatible service (LocalAI, Ollama, LM Studio, OpenRouter, Azure OpenAI, etc.). The selected model is persisted to settings.json, so the user's most recent choice is remembered across sessions.

Dive Deeper

Problem:
Previously, users could only use a single hardcoded model, set via the OPENAI_MODEL environment variable. This prevented:

  • Runtime model switching without restarting
  • Discovery of available models from the API
  • Leveraging model aggregators (like OpenRouter) that provide 100+ models
  • Seamless switching between different models for experimentation
  • Remembering the selected model across sessions

Solution:
Implemented a new ModelsService that fetches models from OpenAI-compatible endpoints and integrated it with the CLI's model command system. Users can now:

  1. Configure the API endpoint via the /auth command (no model selection needed)
  2. Run /model to fetch the models the endpoint currently offers
  3. Interactively select a model
  4. Switch models at runtime without restarting
  5. Have the selection persisted to settings.json and restored automatically in later CLI sessions

Implementation Details:

New Service: ModelsService (packages/cli/src/services/ModelsService.ts)

  • Fetches models from the standard {OPENAI_BASE_URL}/v1/models endpoint
  • Timeout protection: 5 seconds by default, so an unresponsive endpoint cannot hang the CLI
  • Comprehensive error handling for network failures
  • Returns models in the standard OpenAI API format
  • Caches results to avoid repeated API calls (see the sketch after this list)
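
For concreteness, here is a minimal sketch of what such a service can look like. The real implementation in packages/cli/src/services/ModelsService.ts ships with this PR and may differ; class, field, and method names here are illustrative assumptions.

// Illustrative sketch only; the actual ModelsService may be shaped differently.
interface OpenAIModel {
  id: string;
  object: string;
  owned_by?: string;
}

export class ModelsService {
  private cache: OpenAIModel[] | null = null;

  constructor(
    private readonly baseUrl: string,
    private readonly apiKey: string,
    private readonly timeoutMs = 5000, // 5-second default, per the bullets above
  ) {}

  async listModels(): Promise<OpenAIModel[]> {
    if (this.cache) return this.cache; // cached to avoid repeated API calls

    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), this.timeoutMs);
    try {
      const res = await fetch(`${this.baseUrl}/v1/models`, {
        headers: { Authorization: `Bearer ${this.apiKey}` },
        signal: controller.signal, // aborts the request on timeout
      });
      if (!res.ok) {
        throw new Error(`Fetching models failed with HTTP ${res.status}`);
      }
      // Standard OpenAI list response: { object: 'list', data: [...] }
      const body = (await res.json()) as { data: OpenAIModel[] };
      this.cache = body.data;
      return this.cache;
    } finally {
      clearTimeout(timer);
    }
  }
}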

Updated Components:

  1. modelCommand.ts - Enhanced to dynamically fetch and display available models (a sketch follows this list)
  2. ModelDialog.tsx - Added loading states, error handling, and model persistence to settings.json
  3. OpenAIKeyPrompt.tsx - Simplified to remove hardcoded model field
  4. useAuth.ts - Updated auth flow to work without model selection
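
As a rough, assumption-heavy sketch of how the command side can hook into the service (the actual modelCommand.ts and the slash-command types live in the qwen-code codebase and may be shaped differently):

// Hypothetical shape of the /model slash command; the object layout and
// the dialog return value are assumptions, not the PR's verbatim code.
export const modelCommand = {
  name: 'model',
  description: 'Fetch available models and switch the active one',
  action: async () => ({
    // Defer to ModelDialog, which calls ModelsService, shows a loading
    // indicator while the request is in flight, and surfaces errors inline.
    type: 'dialog' as const,
    dialog: 'model' as const,
  }),
};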

Model Persistence:
When a user selects a model via the /model command (see the sketch after this list):

  • The model is updated in-memory via config.setModel(model)
  • The selection is persisted to ~/.qwen/settings.json via settings.setValue('model.name', model)
  • On the next CLI startup, the last selected model is loaded from settings automatically
  • This matches how other settings (theme, approval mode, editor) are persisted
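
A minimal sketch of that selection handler, assuming a React useCallback handler inside ModelDialog.tsx and the two-argument settings.setValue call quoted above (the actual hook wiring may differ):

import { useCallback } from 'react';

// Inside ModelDialog: config, settings, and onClose come from props/hooks.
const handleSelect = useCallback(
  (model: string) => {
    config.setModel(model); // in-memory switch for the current session
    try {
      // Persist to ~/.qwen/settings.json (User scope, like theme/editor).
      settings.setValue('model.name', model);
    } catch (err) {
      // A failed save must not block the dialog (see the commit notes below).
      console.warn(`Failed to persist model selection: ${err}`);
    }
    onClose();
  },
  [config, settings, onClose],
);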

Workflow Example:

# Setup (no model needed, just endpoint and key)
export OPENAI_BASE_URL='http://localhost:8317'
export OPENAI_API_KEY='your-key'

# Run qwen
qwen

# At runtime, switch models
/model
# Shows: Loading models...
# Then displays: [1] gpt-4, [2] gpt-3.5-turbo, [3] claude-3-opus, etc.
# User selects one, model is now active for the session AND saved to settings.json

# Close and restart qwen
qwen
# The previously selected model is loaded automatically from settings.json

Supported APIs:

  • ✅ LocalAI (local server)
  • ✅ Ollama (local model runner)
  • ✅ LM Studio (local model server)
  • ✅ OpenRouter (cloud model aggregator)
  • ✅ Azure OpenAI (Microsoft's implementation)
  • ✅ Any OpenAI-compatible service
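
For quick testing, these are the default base URLs I believe the local and cloud services above use (verify against your installation; the service is described as appending /v1/models itself, so the base URL should not include /v1):

# Assumed defaults; adjust for your setup.
export OPENAI_BASE_URL='http://localhost:11434'    # Ollama
export OPENAI_BASE_URL='http://localhost:1234'     # LM Studio
export OPENAI_BASE_URL='https://openrouter.ai/api' # OpenRouter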

Reviewer Test Plan

  1. Build and test:

    npm run build
    npm run test --workspace=packages/cli
    npm run test --workspace=packages/core

    Expected: All tests pass ✅

  2. Test with local OpenAI-compatible service:

    # Start a local model server (e.g., Ollama, LocalAI, or LM Studio)
    export OPENAI_BASE_URL='http://localhost:8000'
    export OPENAI_API_KEY='dummy-key'
    
    qwen
    # Type: /model
    # Verify models are fetched and displayed
    # Select a model and verify it's now active
  3. Test model persistence across sessions:

    # Session 1: Select a model
    qwen
    /model
    # Select 'gpt-4' (or any model)
    # Quit with /quit or Ctrl+C
    
    # Session 2: Verify persisted model
    qwen
    # Check status bar - should show the previously selected 'gpt-4'
    # Verify /model still shows the last selected model as active
  4. Test error handling:

    • Set OPENAI_BASE_URL to invalid endpoint
    • Run /model
    • Expected: Clear error message instead of hanging
  5. Test timeout protection:

    • Set OPENAI_BASE_URL to slow/unresponsive endpoint
    • Run /model
    • Expected: Times out after 5 seconds with helpful message
  6. Test backward compatibility:

    • Setups with an existing OPENAI_MODEL environment variable should still work
    • Test with OpenRouter, Azure OpenAI, standard OpenAI API
    • Expected: All continue to work without issues
  7. Test UI/UX:

    • Launch CLI with /auth command
    • Verify model field is no longer requested
    • Verify /model command appears in help (/help)
    • Verify loading indicator appears while fetching models
    • Verify model persistence for the current session and across restarts

Testing Matrix

|          | 🍏  | 🪟  | 🐧  |
| -------- | --- | --- | --- |
| npm run  |     |     |     |
| npx      |     |     |     |
| Docker   |     |     |     |
| Podman   | -   | -   |     |
| Seatbelt |     | -   | -   |

- Add ModelsService to fetch available models from OpenAI-compatible endpoints
- Update modelCommand to dynamically fetch and display available models
- Enhance ModelDialog with loading states and error handling
- Simplify OpenAIKeyPrompt to remove hardcoded model field
- Support runtime model switching without restarting
- Compatible with LocalAI, Ollama, LM Studio, OpenRouter, Azure OpenAI
- Add comprehensive error handling and timeout protection
- Resolves: QwenLM#1206
- Add useSettings hook to ModelDialog to access settings persistence
- Update handleSelect callback to save selected model to settings via settings.setValue()
- Store model selection in 'model.name' setting scope (User, not Workspace)
- Model selection is now remembered across CLI sessions
- Add error handling to prevent dialog blocking if settings save fails
- Resolves issue where selected model was lost after CLI restart
- Import useSettings from SettingsContext instead of non-existent hook file
- Resolves TypeScript compilation error
@tanzhenxin tanzhenxin requested a review from Mingholy December 15, 2025 09:46
@Mingholy Mingholy self-assigned this Dec 15, 2025