[GH-ISSUE #2471] Adding Bert-Embeddings server / how to add torch #47956

Closed
opened 2026-04-28 06:10:35 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @michaelfeil on GitHub (Feb 13, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2471

I am building https://github.com/michaelfeil/infinity and would love to contribute to Ollama. It is compatible with CUDA, CPU, and MPS, with the option to run ONNX models. Torch + ROCm support is also in beta.

Either torch or onnx is required; I would recommend going with torch for speed (flash-attention integration).

```python
import asyncio
from infinity_emb import AsyncEmbeddingEngine

engine = AsyncEmbeddingEngine(model_name_or_path="BAAI/bge-small-en-v1.5", engine="torch")

async def main(sentences=["Embed this sentence via Infinity.", "Paris is in France."]):
    # invoke this function however you can from Go
    async with engine:  # entering the context calls engine.astart()
        embeddings, usage = await engine.embed(sentences=sentences)

asyncio.run(main())
```

PS: Had a chat at the SF meetup earlier today thanks for hosting, a lot of interesting people here. Great work @mchiang0610

GiteaMirror added the feature request label 2026-04-28 06:10:36 -05:00
Author
Owner

@jmorganca commented on GitHub (Mar 13, 2024):

@michaelfeil so great to see you there! Bert models should now be possible to run with Ollama (e.g. see https://ollama.com/library/all-minilm or https://ollama.com/library/nomic-embed-text). Let me know if this helps!
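For anyone landing here, a minimal sketch of calling Ollama's `/api/embeddings` endpoint from Python, assuming a local Ollama server on the default port 11434 with the model already pulled (the helper name is mine, not part of Ollama's API):

```python
import json
from urllib import request

def build_embeddings_request(model, prompt, host="http://localhost:11434"):
    """Build a POST request for Ollama's /api/embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return request.Request(
        f"{host}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Actually sending it requires a running server, e.g.:
# with request.urlopen(build_embeddings_request("all-minilm", "Paris is in France.")) as resp:
#     embedding = json.loads(resp.read())["embedding"]
```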

Author
Owner

@michaelfeil commented on GitHub (Mar 13, 2024):

Great - thanks for closing the issue!

Author
Owner

@spike-xiong commented on GitHub (Feb 19, 2025):

> @michaelfeil so great to see you there! Bert models should now be possible to run with Ollama (e.g. see https://ollama.com/library/all-minilm or https://ollama.com/library/nomic-embed-text). Let me know if this helps!

Hello @jmorganca! Is there any guide for importing a fine-tuned BERT-architecture model into Ollama?
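A hedged sketch of one possible flow, assuming the fine-tuned weights have already been converted to GGUF format (the filename below is a placeholder, not from this thread):

```
# Modelfile — point Ollama at the local GGUF weights
FROM ./my-finetuned-bert.gguf
```

The model can then be registered with `ollama create my-bert -f Modelfile`; Ollama's import documentation covers the details.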

Reference: github-starred/ollama#47956