Embedding model performance improvements #3886

Open
opened 2025-11-12 11:58:42 -06:00 by GiteaMirror · 0 comments

Originally created by @jmorganca on GitHub (Aug 7, 2024).

What is the issue?

  1. Embedding models should skip KV cache sizing (e.g. num_ctx), as the cache may not be used
  2. Embedding models should default to higher parallelization (10+) so that batches are processed faster
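
To illustrate the second point, here is a minimal client-side sketch of what "higher parallelization for batches" could look like. It is not Ollama's implementation. `embed_parallel`, `chunk`, and `fake_embed_batch` are hypothetical names introduced for illustration, and `embed_batch` stands in for whatever call actually produces the embeddings. The default of 10 workers mirrors the "10+" suggested above, and the batch size of 32 is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Vector = List[float]


def chunk(items: Sequence[str], size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be >= 1")
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def embed_parallel(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[Vector]],  # hypothetical backend call
    batch_size: int = 32,   # assumption: not a value from the issue
    parallel: int = 10,     # mirrors the "10+" suggested in the issue
) -> List[Vector]:
    """Embed texts by running up to `parallel` batches concurrently."""
    batches = chunk(texts, batch_size)
    if not batches:
        return []
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        # pool.map yields results in input order, so vectors line up with texts.
        results = pool.map(embed_batch, batches)
        return [vector for batch in results for vector in batch]


def fake_embed_batch(batch: List[str]) -> List[Vector]:
    """Stand-in embedder for demonstration: one-dimensional 'length' vectors."""
    return [[float(len(text))] for text in batch]
```

In real use, `fake_embed_batch` would be replaced by a function that sends one batch to the embedding server. Whether the server actually processes those requests concurrently depends on its own parallelism setting, which is the default this issue proposes to raise.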

OS

No response

GPU

No response

CPU

No response

Ollama version

No response

GiteaMirror added the feature request label 2025-11-12 11:58:42 -06:00
Reference: github-starred/ollama-ollama#3886