feat: strategic GLM model family detection for LM Studio and OpenAI-compatible providers #11092
Conversation
feat: strategic GLM model family detection for LM Studio and OpenAI-compatible providers

This PR addresses Issue #11071 by implementing a comprehensive GLM model detection system:

1. Created `glm-model-detection.ts` utility that:
   - Detects GLM family models (GLM-4.5, 4.6, 4.7 and variants)
   - Supports various model ID formats (standard, MLX, GGUF, ChatGLM)
   - Identifies version (4.5, 4.6, 4.7) and variant (base, air, flash, v, etc.)
   - Returns appropriate configuration for each model
2. Updated `LmStudioHandler` to:
   - Detect GLM models and log detection results to the console
   - Use `convertToZAiFormat` with `mergeToolResultText` for GLM models
   - Disable `parallel_tool_calls` for GLM models
   - Handle `reasoning_content` for GLM-4.7 models
3. Updated `BaseOpenAiCompatibleProvider` similarly
4. Added 33 comprehensive tests for the detection utility

The detection uses flexible regex patterns to match model IDs like:
- `mlx-community/GLM-4.5-4bit`
- `GLM-4.5-UD-Q8_K_XL-00001-of-00008.gguf`
- `glm-4.5`, `glm-4.7-flash`, etc.
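As a rough illustration, the detection described above might look something like the following sketch. This is hypothetical TypeScript: the function name, the config shape, and the exact regex are assumptions for illustration, not the PR's actual code.

```typescript
// Hypothetical sketch of GLM family detection; the real glm-model-detection.ts
// in the PR may differ in names, patterns, and returned fields.
interface GlmModelConfig {
	isGlmModel: boolean
	version?: "4.5" | "4.6" | "4.7"
	variant?: string
	disableParallelToolCalls?: boolean
	supportsThinking?: boolean
}

function detectGlmModel(modelId: string): GlmModelConfig {
	// Normalize: lowercase and strip org prefixes like "mlx-community/".
	const id = modelId.toLowerCase().split("/").pop() ?? ""

	// Match glm-4.5 / glm-4.6 / glm-4.7 with an optional known variant suffix.
	const match = id.match(/(?:chat)?glm[-_]?(4\.[567])(?:[-_](air|flash|v))?/)
	if (!match) {
		// Legacy ChatGLM IDs (chatglm-6b, chatglm3-6b) are still GLM family.
		if (/^chatglm\d?[-_]/.test(id)) {
			return { isGlmModel: true, disableParallelToolCalls: true }
		}
		return { isGlmModel: false }
	}

	const version = match[1] as "4.5" | "4.6" | "4.7"
	return {
		isGlmModel: true,
		version,
		variant: match[2] ?? "base",
		disableParallelToolCalls: true,
		// Only GLM-4.7 is described as having a thinking mode.
		supportsThinking: version === "4.7",
	}
}
```

GGUF filenames such as `GLM-4.5-UD-Q8_K_XL-00001-of-00008.gguf` fall through to the same regex because only the `glm-4.x` fragment needs to match.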
Review complete. Found 2 issues.
```typescript
// Determine parallel_tool_calls setting for GLM models
const parallelToolCalls =
	this.glmConfig?.isGlmModel && this.glmConfig.disableParallelToolCalls ? false : true
```
Dead code: `parallelToolCalls` is computed here but never used. The `params` object below does not include `parallel_tool_calls`. Either add it to `params` or remove this computation.
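One way the suggested fix could look is sketched below. This is hypothetical code: the `glmConfig` shape and the `buildParams` helper are assumptions based on the reviewed snippet, not the handler's actual structure.

```typescript
// Hypothetical sketch: wire the computed flag into the request params so it is
// no longer dead code. The glmConfig shape is assumed from the PR description.
interface GlmConfig {
	isGlmModel: boolean
	disableParallelToolCalls?: boolean
}

function buildParams(
	glmConfig: GlmConfig | undefined,
	base: Record<string, unknown>,
): Record<string, unknown> {
	// Determine parallel_tool_calls setting for GLM models.
	const parallelToolCalls =
		glmConfig?.isGlmModel && glmConfig.disableParallelToolCalls ? false : true
	// Actually include it in the params object (the reviewed code omitted this).
	return { ...base, parallel_tool_calls: parallelToolCalls }
}
```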
```diff
 	tool_choice: metadata?.tool_choice,
-	parallel_tool_calls: metadata?.parallelToolCalls ?? true,
+	parallel_tool_calls: parallelToolCalls,
 }
```
Missing `thinking` parameter for GLM-4.7 models. Unlike `BaseOpenAiCompatibleProvider`, this handler does not add the `thinking` parameter to requests when `this.glmConfig.supportsThinking` is true. While the code correctly handles `reasoning_content` in responses (lines 161-167), without sending the `thinking` parameter, GLM-4.7's thinking mode won't be activated. Consider adding the same logic used in `BaseOpenAiCompatibleProvider` (lines 138-145) here.
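A rough sketch of what that suggested change might look like follows. Names are hypothetical, and the exact shape of the `thinking` field expected by Z.AI-style endpoints is an assumption here, not taken from the PR.

```typescript
// Hypothetical sketch: merge a `thinking` request field when the detected GLM
// config reports thinking support (GLM-4.7). Field shape is an assumption.
interface GlmConfig {
	isGlmModel: boolean
	supportsThinking?: boolean
}

function withThinking(
	params: Record<string, unknown>,
	glmConfig?: GlmConfig,
): Record<string, unknown> {
	if (glmConfig?.isGlmModel && glmConfig.supportsThinking) {
		// Activate thinking mode only for models that support it.
		return { ...params, thinking: { type: "enabled" } }
	}
	return params
}
```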
Summary
This PR addresses Issue #11071 by implementing a comprehensive GLM model family detection system, incorporating feedback from @mark-ucalgary.
Problem
GLM-4.5 models accessed via LM Studio and OpenAI-compatible endpoints were getting stuck in repeated file-read loops, and previous attempts to fix this did not fully resolve the problem.
Solution
This PR implements a strategic GLM model family detection system that:
1. GLM Model Detection Utility (`glm-model-detection.ts`)

Supports various model ID formats:

- `glm-4.5`, `glm-4.7-flash`
- `mlx-community/GLM-4.5-4bit`
- `GLM-4.5-UD-Q8_K_XL-00001-of-00008.gguf`
- `chatglm-6b`, `chatglm3-6b`

2. Console Logging
Added `logGlmDetection()` function that outputs detection results to the console.

3. Provider Updates
Updated `LmStudioHandler` and `BaseOpenAiCompatibleProvider` to:

- Use `convertToZAiFormat` with `mergeToolResultText: true` for GLM models (prevents conversation flow disruption)
- Disable `parallel_tool_calls` for GLM models (they may not support it)
- Add `thinking` parameter support for GLM-4.7 models
- Handle `reasoning_content` in stream responses

4. Comprehensive Tests
Added 33 test cases covering the supported model ID formats and version/variant detection.
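The kind of coverage described could be sketched as a table-driven check like the one below. This is hypothetical: `detect` here is a simplified stand-in, not the utility's real export, and the real `glm-model-detection.spec.ts` uses the project's test framework.

```typescript
// Hypothetical table-driven sketch of the ID categories the PR's tests cover;
// `detect` is a deliberately simplified stand-in for the real utility.
type Expected = { isGlm: boolean; version?: string }

function detect(id: string): Expected {
	const normalized = id.toLowerCase()
	const m = normalized.match(/(?:chat)?glm[-_]?(4\.[567])/)
	if (m) return { isGlm: true, version: m[1] }
	// Legacy ChatGLM IDs carry no 4.x version but are still GLM family.
	return { isGlm: /chatglm/.test(normalized) }
}

const cases: Array<[string, Expected]> = [
	["glm-4.5", { isGlm: true, version: "4.5" }],
	["glm-4.7-flash", { isGlm: true, version: "4.7" }],
	["mlx-community/GLM-4.5-4bit", { isGlm: true, version: "4.5" }],
	["GLM-4.5-UD-Q8_K_XL-00001-of-00008.gguf", { isGlm: true, version: "4.5" }],
	["chatglm-6b", { isGlm: true }],
	["gpt-4o-mini", { isGlm: false }],
]

// Collect any case where detection disagrees with the expectation.
const failures = cases.filter(([id, exp]) => {
	const got = detect(id)
	return got.isGlm !== exp.isGlm || got.version !== exp.version
})
```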
Testing
All tests pass.
How to Test
- Check the developer console for `[LM Studio]` or `[GLM Detection]` log messages.

If detection is working, you should see logs indicating the model was detected and which optimizations were applied.
Feedback Welcome
This is an attempt to address the issue. Feedback and guidance are welcome!
Important
Introduces strategic GLM model family detection for LM Studio and OpenAI-compatible providers, with utility, provider updates, logging, and comprehensive tests.
- GLM Model Detection Utility (`glm-model-detection.ts`): detects GLM family models from IDs like `glm-4.5`, `mlx-community/GLM-4.5-4bit`, and `chatglm-6b`.
- Provider Updates: `BaseOpenAiCompatibleProvider` and `LmStudioHandler` detect GLM models on construction, log results, and adjust settings like `mergeToolResultText` and `parallel_tool_calls`; the `thinking` parameter is added for GLM-4.7 models.
- Logging: `logGlmDetection()` logs detection results to the console.
- Tests: comprehensive tests in `glm-model-detection.spec.ts`.