[GH-ISSUE #1755] [enhancement] use bert.cpp for /api/embeddings #26767

Closed
opened 2026-04-22 03:20:09 -05:00 by GiteaMirror · 2 comments

Originally created by @fakezeta on GitHub (Jan 1, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/1755

The Llama 2 and Mistral base models produce quite poor embeddings compared to sentence-transformer models like BERT.

Why not integrate [bert.cpp](https://github.com/skeskinen/bert.cpp) or [sentence-transformers](https://sbert.net/) for the `api/embeddings` endpoint so we can have the best of both architectures?

@easp commented on GitHub (Jan 1, 2024):

It looks like there is slow movement on adding BERT support to llama.cpp:
https://github.com/ggerganov/llama.cpp/issues/2872


@BruceMacD commented on GitHub (Jan 2, 2024):

Hi @fakezeta, thanks for opening the issue. This is a commonly requested feature that we are definitely looking at.

Closing this issue for now to consolidate it with #327, just to keep things organized.

Reference: github-starred/ollama#26767