[GH-ISSUE #1882] Embedding generation is slow #63116

Closed
opened 2026-05-03 12:12:30 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @jmorganca on GitHub (Jan 10, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/1882

When using `/api/embeddings`, large documents can take up to a second to process.

GiteaMirror added the embeddings, performance labels 2026-05-03 12:12:31 -05:00

@iamashwin99 commented on GitHub (Feb 21, 2024):

I have the same issue, I am not limited by the CPU or the memory. Not sure what the issue is.


@BruceMacD commented on GitHub (Mar 4, 2024):

@jmorganca can this be resolved now that bert models are supported? Moving forward, bert models should be used for generating embeddings rather than llama-family models.


@jmorganca commented on GitHub (Jun 25, 2024):

Closing as `bert` models are now supported, such as https://ollama.com/library/all-minilm
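
For readers landing on this mirrored thread, here is a minimal sketch of generating embeddings the way the closing comment suggests: calling Ollama's `/api/embeddings` endpoint with a bert-family model such as `all-minilm`. It assumes an Ollama server running on the default `localhost:11434` and that the model has already been pulled (`ollama pull all-minilm`); it uses only the Python standard library.

```python
# Sketch: embed a prompt with a bert-family model via /api/embeddings.
# Assumptions: Ollama server at localhost:11434, model "all-minilm" pulled.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embeddings"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build the POST request; /api/embeddings takes a JSON body with
    "model" and "prompt" fields."""
    payload = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(model: str, prompt: str) -> list[float]:
    """Return the embedding vector from the response's "embedding" field."""
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["embedding"]

if __name__ == "__main__":
    vector = embed("all-minilm", "The sky is blue because of Rayleigh scattering.")
    print(len(vector))
```

Splitting request construction from the network call keeps the payload shape testable without a running server; bert-family models like `all-minilm` are much cheaper per token than llama-family models, which is what resolved the slowness reported here.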


Reference: github-starred/ollama#63116