[GH-ISSUE #13340] Memory issues since v0.13.1 at embedding on Windows #34571

Open
opened 2026-04-22 18:15:57 -05:00 by GiteaMirror · 3 comments

Originally created by @negedng on GitHub (Dec 5, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13340

Originally assigned to: @npardal on GitHub.

### What is the issue?

Hi,

I used to embed a markdown file (3000+ chunks) with `nomic-embed-text` in batches of 32, with a maximum chunk length of 2000 characters. Since the new update, I have had to reduce that limit to 1900 characters. Reducing the batch size didn't solve the memory(?) issue.

Not a big problem for me, but I'll copy the logs here for future reference.
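
For context, a minimal sketch of the kind of batched loop described above, assuming the standard `/api/embed` REST endpoint that appears in the logs; the model name and batch size of 32 mirror the report, while `embedBatch` and the chunk contents are illustrative.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// embedRequest mirrors the JSON body accepted by Ollama's /api/embed endpoint.
type embedRequest struct {
	Model string   `json:"model"`
	Input []string `json:"input"`
}

// embedResponse holds one embedding vector per input string.
type embedResponse struct {
	Embeddings [][]float32 `json:"embeddings"`
}

// embedBatch sends a single batch of chunks to the local Ollama server.
func embedBatch(chunks []string) ([][]float32, error) {
	body, err := json.Marshal(embedRequest{Model: "nomic-embed-text", Input: chunks})
	if err != nil {
		return nil, err
	}
	resp, err := http.Post("http://127.0.0.1:11434/api/embed", "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		// The 500 responses in the server log surface here on the client side.
		return nil, fmt.Errorf("embed request failed: %s", resp.Status)
	}
	var out embedResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	return out.Embeddings, nil
}

func main() {
	chunks := []string{ /* 3000+ markdown chunks, each capped at ~2000 chars */ }
	for start := 0; start < len(chunks); start += 32 { // batches of 32, as described
		end := min(start+32, len(chunks))
		vectors, err := embedBatch(chunks[start:end])
		if err != nil {
			panic(err)
		}
		_ = vectors // store or index the vectors here
	}
}
```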

### Relevant log output

```shell
## On the server:

[GIN] 2025/12/05 - 10:21:32 | 200 |    654.4864ms |       127.0.0.1 | POST     "/api/embed"
panic: caching disabled but unable to fit entire input in a batch

goroutine 50 [running]:
github.com/ollama/ollama/runner/ollamarunner.(*Server).computeBatch(0xc00020eb40, {0x142e, {0x7ff64d4f4c00, 0xc0002cc8c0}, {0x7ff64d501468, 0xc000009530}, {0xc000ea5308, 0x200, 0x25f}, {{0x7ff64d501468, ...}, ...}, ...})
        github.com/ollama/ollama/runner/ollamarunner/runner.go:707 +0x1ac5
github.com/ollama/ollama/runner/ollamarunner.(*Server).run(0xc00020eb40, {0x7ff64d4e9840, 0xc0003920f0})
        github.com/ollama/ollama/runner/ollamarunner/runner.go:460 +0x30b
created by github.com/ollama/ollama/runner/ollamarunner.Execute in goroutine 1
        github.com/ollama/ollama/runner/ollamarunner/runner.go:1411 +0x4c9
[GIN] 2025/12/05 - 10:21:32 | 500 |    358.1718ms |       127.0.0.1 | POST     "/api/embed"

## In the client:
do embedding request: Post "http://127.0.0.1:65153/embedding": read tcp 127.0.0.1:55845->127.0.0.1:65153: wsarecv: An existing connection was forcibly closed by the remote host.
```

### OS

Windows

### GPU

Nvidia

### CPU

Intel

### Ollama version

0.13.1

GiteaMirror added the bug label 2026-04-22 18:15:57 -05:00

@jessegross commented on GitHub (Dec 5, 2025):

This is caused by the batch size being too small: rather than reducing the batch size, you should set it to be the same size as the context length (or max input).
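
If that reading is right, the panic message (`caching disabled but unable to fit entire input in a batch`) means that with caching disabled for embeddings, each input must fit in a single runner batch, so a chunk whose token count exceeds the batch size fails no matter how few chunks are sent per request. A minimal sketch, assuming the advice refers to the `num_batch` and `num_ctx` options in the request's `options` field (the 2048 values are illustrative, not taken from the report):

```go
package main

import (
	"bytes"
	"encoding/json"
	"net/http"
)

func main() {
	// Sketch: set the batch size (num_batch) equal to the context length
	// (num_ctx) so a full-length input always fits in one batch.
	body, _ := json.Marshal(map[string]any{
		"model": "nomic-embed-text",
		"input": []string{"a markdown chunk of up to 2000 characters ..."},
		"options": map[string]any{
			"num_ctx":   2048, // context length
			"num_batch": 2048, // batch size matched to the context length
		},
	})
	resp, err := http.Post("http://127.0.0.1:11434/api/embed", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	resp.Body.Close()
}
```

This would also explain the exchange below: sending fewer chunks per request does not shrink the longest single chunk, which is what has to fit in the batch.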


@negedng commented on GitHub (Dec 6, 2025):

I don't think so; reducing the batch size to 1 did not solve the issue. The problem was with the length of the input.


@jessegross commented on GitHub (Dec 8, 2025):

Yes, you should _increase_ the batch size, not decrease it.


Reference: github-starred/ollama#34571