Tags: Patater/llama.cpp
cann: Fix ggml_cann_im2col for 1D im2col (ggml-org#8819)
* fix ggml_cann_im2col for 1D im2col
* fix build warning

ggml-cuda: Adding support for unified memory (ggml-org#8035)
* Adding support for unified memory
* adding again the documentation about unified memory
* refactoring: Moved the unified memory code in the correct location.
* Fixed compilation error when using hipblas
* cleaning up the documentation
* Updating the documentation
* adding one more case where the PR should not be enabled
Co-authored-by: matteo serva <matteo.serva@gmail.com>
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

cuda : fix dmmv cols requirement to 2*GGML_CUDA_DMMV_X (ggml-org#8800)
* cuda : fix dmmv cols requirement to 2*GGML_CUDA_DMMV_X
* update asserts
* only use dmmv for supported types
* add test

server : update llama-server embedding flag documentation (ggml-org#8779)
Fixes ggml-org#8763

Build: Fix potential race condition (ggml-org#8781)
* Fix potential race condition as pointed out by @fairydreaming in ggml-org#8776
* Reference the .o rather than rebuilding every time.
* Adding in CXXFLAGS and LDFLAGS
* Removing unnecessary linker flags.

Adding Gemma 2 2B configs (ggml-org#8784)
* Adding Gemma 2 2B configs. Updates to Q scaling and Gemma 2 model sizes to match v2 2B model.
* Update src/llama.cpp
Co-authored-by: slaren <slarengh@gmail.com>