
Comparing changes

base repository: abetlen/llama-cpp-python (base: main)
head repository: kitaekatt/llama-cpp-python (compare: main)
  • 2 commits
  • 4 files changed
  • 2 contributors

Commits on Nov 18, 2025

  1. Support latest llama.cpp with nemotron_h architecture and graceful deprecated symbol handling
    
    - Update vendor/llama.cpp to latest main branch for nemotron_h architecture support
    - Disable mtmd build in CMakeLists.txt: latest llama.cpp has CMake compatibility issues with mtmd module that prevent build completion. mtmd is not required for nemotron_h.
    - Add graceful deprecated symbol handling in _ctypes_extensions.py: Wrap getattr() in try/except to handle missing C symbols from deprecated functions removed in latest llama.cpp. Returns stub functions instead of hard failures, allowing import to succeed.
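    The graceful-fallback approach described above can be sketched as follows. This is a minimal illustration of the pattern (wrapping `getattr()` on the loaded shared library in `try`/`except` and substituting a stub), not the fork's exact code in `_ctypes_extensions.py`; the function and symbol names here are hypothetical.

    ```python
    import ctypes


    def safe_bind(lib, name: str):
        """Return the C symbol from the loaded library if it exists;
        otherwise return a stub that raises only when actually called,
        so that module import itself never fails on removed symbols."""
        try:
            return getattr(lib, name)
        except AttributeError:
            def _stub(*args, **kwargs):
                raise NotImplementedError(
                    f"{name} was removed from the loaded llama.cpp build"
                )
            return _stub
    ```

    Binding every function through a helper like this means a llama.cpp update that drops a deprecated symbol degrades to a runtime error at the call site instead of an `AttributeError` at import time.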
    
    Result: nemotron-nano-12b-gguf now loads and benchmarks successfully
    - Model architecture: nemotron_h (Mamba-2 hybrid)
    - Benchmark speed: 18.9 tokens/sec
    - Test status: PASS (5/5 prompts validated)
    
    🤖 Generated with Claude Code
    
    Co-Authored-By: Claude <noreply@anthropic.com>
    kitaekatt and claude committed Nov 18, 2025
    Commit: 7f691e2

Commits on Nov 19, 2025

  1. docs: Add CLAUDE.md for fork setup and RTX 5090 optimization guidance

    - Document fork relationship with abetlen/llama-cpp-python upstream
    - Add build instructions with CMAKE_CUDA_ARCHITECTURES=120 for SM 12.0
    - Explain integration with llm-dev project
    - Include common tasks and troubleshooting steps
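    The build flag mentioned above can be applied like this. A sketch, assuming llama-cpp-python's documented `CMAKE_ARGS` convention for passing CMake options through pip, a CUDA toolchain recent enough for SM 12.0, and that `-DGGML_CUDA=on` is the CUDA enable flag in the vendored llama.cpp; the install command is shown commented rather than run.

    ```shell
    # Target SM 12.0 (RTX 5090) explicitly instead of relying on autodetection
    export CMAKE_ARGS="-DGGML_CUDA=on -DCMAKE_CUDA_ARCHITECTURES=120"
    # then, from the repository root:
    # pip install -e . --no-cache-dir
    echo "$CMAKE_ARGS"
    ```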
    kitaekatt committed Nov 19, 2025
    Commit: a16ebac