[PR #12777] feat: Set params.embeddings according to the pooling_type in GGUF #13947

Open
opened 2026-04-13 00:40:58 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/12777
Author: @mitmul
Created: 10/25/2025
Status: 🔄 Open

Base: main ← Head: mitmul/set-embeddings-using-pooling-type


📝 Commits (1)

  • 7d498f9 Set params.embeddings according to the pooling_type in GGUF

📊 Changes

1 file changed (+5 additions, -1 deletions)


📝 llama/llama.go (+5 -1)

📄 Description

This resolves #12689 in a different way from #12761. Alongside this, I submitted a PR to llama.cpp that addresses the issue upstream: https://github.com/ggml-org/llama.cpp/pull/16766. However, an alternative approach is possible. One could argue that the issue occurs because params.embeddings is hardcoded to true in
ad6f6a1d29/llama/llama.go (L120)
As a result, even when the loaded model is not an embedding model (i.e., when the GGUF specifies pooling_type = 0), build_pooling() is always called, although it performs nothing because it exits early in this section:
ad6f6a1d29/llama/llama.cpp/src/llama-graph.cpp (L1899-L1902)
So I think it is fine to set params.embeddings to false when pooling_type <= 0.
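The idea can be sketched in Go as follows. This is a minimal illustration, not the actual diff to llama/llama.go: the constant values mirror llama.cpp's llama_pooling_type enum as I understand it, and the helper function name is hypothetical.

```go
package main

import "fmt"

// Pooling type values, following llama.cpp's llama_pooling_type enum
// (assumed values for illustration).
const (
	PoolingTypeUnspecified = -1
	PoolingTypeNone        = 0
	PoolingTypeMean        = 1
	PoolingTypeCLS         = 2
)

// shouldEnableEmbeddings captures the PR's rule: only request embedding
// output when the GGUF metadata declares an actual pooling type (> 0),
// instead of hardcoding params.embeddings to true.
func shouldEnableEmbeddings(poolingType int) bool {
	return poolingType > 0
}

func main() {
	for _, pt := range []int{PoolingTypeUnspecified, PoolingTypeNone, PoolingTypeMean, PoolingTypeCLS} {
		fmt.Printf("pooling_type=%d -> embeddings=%v\n", pt, shouldEnableEmbeddings(pt))
	}
}
```

With this rule, a model whose GGUF reports pooling_type = 0 (or leaves it unspecified) never takes the build_pooling() path, while embedding models with mean or CLS pooling behave as before.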


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-13 00:40:58 -05:00

Reference: github-starred/ollama#13947