Tags: MyselfTry/llama.cpp

b5306

sync : ggml

ggml-ci

b5303

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
llama : deci : support ffn-free with attention (ggml-org#13296)

b5302

Verified

common : Add a warning when we can't match samplers from a string or char. (ggml-org#13330)

b5301

Verified

cuda : remove nrows_x in mul_mat_q_process_tile (ggml-org#13325)

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

b5300

Verified

examples : remove infill (ggml-org#13283)

ggml-ci

b5299

Verified

llama : support tie embedding for chatglm models (ggml-org#13328)

gguf-v0.16.3

Version 0.16.3 release

b5298

Verified

CUDA: mix virt/real CUDA archs for GGML_NATIVE=OFF (ggml-org#13135)

b5297

Verified

clip : refactor graph builder (ggml-org#13321)

* mtmd : refactor graph builder

* fix qwen2vl

* clean up siglip cgraph

* pixtral migrated

* move minicpmv to a dedicated build function

* move max_feature_layer to build_llava

* use build_attn for minicpm resampler

* fix windows build

* add comment for batch_size

* also support tinygemma3 test model

* qwen2vl does not use RMS norm

* fix qwen2vl norm (2)

b5296

Verified

sampling : make top_n_sigma no-op at <=0 or a single candidate (ggml-org#13345)