ollama/convert at 1044b0419a5e9f2508e3268ce3af8d83292d62fb - ollama - Computersurge

github-starred/ollama

mirror of https://github.com/ollama/ollama.git synced 2026-03-09 07:16:38 -05:00


Jeffrey Morgan 1044b0419a model: add MLA absorption for glm4moelite (#13810)

* model: add MLA absorption for glm4moelite

Split the combined KV_B tensor into separate K_B and V_B tensors
during conversion, enabling MLA (Multi-head Latent Attention)
absorption which compresses the KV cache for improved efficiency.

* ggml: enable MLA flash attention for GLM-4.7-flash

Add support for gqa_ratio 4 in MLA flash attention kernels. GLM-4.7-flash
uses head size 576 with gqa_ratio 4; head size 576 was previously supported
only with gqa_ratio 16 (DeepSeek).

Metal changes:
- Enable head size 576 for flash attention
- Increase simdgroups to 8 for large heads (>=512)
- Add case 8 kernel dispatch for 8 simdgroups

CUDA changes:
- Add gqa_ratio 4 support for head 576/512
- Add tile configs for (576, 512, 4) and (576, 512, 8)
- Add MMA config cases for ncols 4
- Add template instances for ncols2=4

* model: add compatibility validation for glm4moelite architecture

2026-01-23 14:47:42 -08:00

..

chore(all): replace instances of interface with any (#10067)

2025-04-02 09:44:27 -07:00

convert: import support for command-r models from safetensors (#6063)

2025-01-15 16:31:22 -08:00

convert_bert.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_commandr.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_deepseek2.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_deepseekocr.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_gemma2_adapter.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_gemma2.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_gemma3.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_gemma3n.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_gemma.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_glm4moelite.go

model: add MLA absorption for glm4moelite (#13810)

2026-01-23 14:47:42 -08:00

convert_gptoss.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_lfm2.go

model: add lfm2 architecture and LFM2.5-1.2B-Thinking support (#13792)

2026-01-20 12:20:53 -08:00

convert_llama4.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_llama_adapter.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_llama.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_mistral_causal.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_mistral.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_mixtral.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_mllama.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_nomicbert.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_olmo.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_phi3.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_qwen2.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_qwen3.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_qwen3vl.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_qwen25vl.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert_test.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00

convert.go

model: add lfm2 architecture and LFM2.5-1.2B-Thinking support (#13792)

2026-01-20 12:20:53 -08:00

reader_safetensors.go

deepseekocr

2025-11-18 16:11:37 -08:00

reader_test.go

convert: convert bf16 vision weights to fp16 (#12324)

2025-09-17 17:43:17 -07:00

reader_torch.go

llama4

2025-04-25 16:59:20 -07:00

reader.go

model: add lfm2 architecture and LFM2.5-1.2B-Thinking support (#13792)

2026-01-20 12:20:53 -08:00

sentencepiece_model.proto

all: fix typos in documentation, code, and comments (#7021)

2024-12-10 12:58:06 -08:00

tensor_test.go

fix tensor merge (#13053)

2025-11-13 15:32:34 -08:00

tensor.go

fix tensor merge (#13053)

2025-11-13 15:32:34 -08:00

tokenizer_spm.go

parsers/renderers: functiongemma (#13521)

2025-12-18 07:55:37 -08:00

tokenizer_test.go

model: handle multiple eos tokens (#10577)

2025-05-16 13:40:23 -07:00

tokenizer.go

s#x/exp/maps#maps# (#11506)

2025-07-23 13:23:32 -07:00