[GH-ISSUE #6908] How to use embedding models from huggingface hub? #4370

Open
opened 2026-04-12 15:18:33 -05:00 by GiteaMirror · 5 comments

Originally created by @fzyzcjy on GitHub (Sep 22, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6908

Hi, thanks for the lib! I want to use some embedding models (the architecture is BERT) from the Hugging Face hub. I have tried GGUF, but the converter says the bert architecture cannot be converted. I have also tried importing the safetensors directly with a Modelfile, but it fails with `Error: unsupported content type: unknown`.
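For reference, a Modelfile-based safetensors import is usually just a FROM line pointing at the downloaded model directory (the paths and model name below are illustrative); with a BERT-architecture checkpoint, this is the step that fails with the error above:

```
# Modelfile -- path to the downloaded Hugging Face model directory (illustrative)
FROM /path/to/hf-bert-embedding-model
```

```sh
# Build the ollama model from the Modelfile above; for a BERT checkpoint this
# is where "Error: unsupported content type: unknown" appears
ollama create my-embedder -f Modelfile
```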

GiteaMirror added the feature request label 2026-04-12 15:18:33 -05:00

@neuwcodebox commented on GitHub (Oct 14, 2024):

We need documentation on how to import embedding models.


@E218PQ commented on GitHub (Oct 22, 2024):

I have had the same experience and have the same need. I hope there will be a better solution, and that ollama adds support for RAG models and TTS models.


@EntropyYue commented on GitHub (Oct 29, 2024):

Convert it to GGUF with llama.cpp, then import it like a regular model.

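A rough sketch of that path, assuming a recent llama.cpp checkout with its Python requirements installed (model paths, the output file name, and the f16 output type below are illustrative):

```sh
# 1. Convert the Hugging Face model directory to GGUF with llama.cpp
python llama.cpp/convert_hf_to_gguf.py /path/to/hf-bert-embedding-model \
  --outfile bert-embed.gguf --outtype f16

# 2. Write a Modelfile that points at the resulting GGUF file
cat > Modelfile <<'EOF'
FROM ./bert-embed.gguf
EOF

# 3. Register the model with ollama and request embeddings from it
ollama create my-embedder -f Modelfile
curl http://localhost:11434/api/embeddings \
  -d '{"model": "my-embedder", "prompt": "hello world"}'
```

Whether step 1 succeeds depends on the architecture: the llama.cpp converter only handles model classes it explicitly supports, which is consistent with the "unsupported types" errors mentioned below.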

@E218PQ commented on GitHub (Oct 30, 2024):

> Convert it to GGUF with llama.cpp, then import it like a regular model.

Thank you for your response.
Unfortunately, I tried many times but could not complete the conversion; it kept reporting unsupported types, even though I set up a separate llama.cpp environment for this. In the end I had no choice but to run an additional xinference environment and serve RAG and TTS separately. On top of that, some frameworks such as dify do not support adding TTS or RAG models through their preset ollama integration. I hope all of this can eventually run in a single ollama environment, which would be simple, stable, and very easy to maintain.


@AlgorithmicKing737 commented on GitHub (Jan 18, 2025):

Any solution yet?

Reference: github-starred/ollama#4370