[GH-ISSUE #6214] Embedding model performance improvements #3883

Open
opened 2026-04-12 14:43:23 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @jmorganca on GitHub (Aug 7, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6214

What is the issue?

  1. Embedding models should disable the KV cache size setting (e.g. num_ctx), as the cache may not be used
  2. Embedding models should default to higher parallelization (10+) so that batches are processed faster
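
For illustration only, here is a minimal client-side sketch of what item 2 is getting at: splitting a list of texts into batches and sending them concurrently. It assumes the `/api/embed` endpoint accepts a list under `input` and returns vectors under `embeddings`, that the server listens on the default local address, and that `all-minilm` stands in for whichever embedding model is installed. The helper names (`chunk`, `embed_batch`, `embed_all`) and the batch size are hypothetical; the worker count of 10 simply mirrors the "10+" figure above. This is not how the server itself would implement the change.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumed default local server address; adjust for your setup.
OLLAMA_URL = "http://localhost:11434/api/embed"


def chunk(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def embed_batch(texts, model="all-minilm"):
    """Send one batch to the embed endpoint and return its vectors.

    The model name is a placeholder for any installed embedding model.
    """
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["embeddings"]


def embed_all(texts, batch_size=16, workers=10, embed_fn=embed_batch):
    """Embed texts in parallel batches, preserving input order.

    `embed_fn` is injectable so the batching logic can be exercised
    without a running server.
    """
    batches = chunk(texts, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order the batches were submitted.
        results = pool.map(embed_fn, batches)
    return [vec for batch in results for vec in batch]
```

Client-side concurrency only helps if the server is able to process requests in parallel, which is why the request above asks for a higher default on the server side.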

OS

No response

GPU

No response

CPU

No response

Ollama version

No response

GiteaMirror added the feature request label 2026-04-12 14:43:23 -05:00
Reference: github-starred/ollama#3883