[GH-ISSUE #8741] Longer context length leads to half power usage? #67727

Closed
opened 2026-05-04 11:29:09 -05:00 by GiteaMirror · 2 comments

Originally created by @kungfu-eric on GitHub (Jan 31, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/8741

What is the issue?

It seems that raising the context length from the default 4096 to 24000 on deepseek-r1 32b (qwen 2.5) and 70b (llama3.3) causes GPU power usage to drop by half, with a corresponding drop in throughput.

There's no OOM on VRAM. Others suggested this is due to paging to RAM or CPU activity. Glances says CPU is operating at ~22% and there's no disk IO wait reported.

Is there some inefficiency in Ollama?

OS

Linux

GPU

Nvidia

CPU

AMD

Ollama version

0.3.12

GiteaMirror added the bug label 2026-05-04 11:29:09 -05:00

@rick-github commented on GitHub (Jan 31, 2025):

Server logs (https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) would give some insight.

@kungfu-eric commented on GitHub (Feb 1, 2025):

After some reasoning, I'm pretty sure it's memory-bandwidth bound, unfortunately. Closing.
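
For readers who want to check the memory-bandwidth argument themselves, here is a rough back-of-envelope sketch in Python. It is not taken from Ollama and makes simplifying assumptions: each generated token streams all model weights plus the whole filled KV cache from VRAM once, and nothing else limits speed. The layer count, KV head count, head size, weight size, and bandwidth figure are hypothetical placeholders, not values reported in this thread; substitute the numbers for your own model and GPU.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Approximate KV cache size in bytes for one sequence of context_len tokens."""
    # Two tensors (K and V) per layer, one head_dim vector per KV head per token.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def decode_tokens_per_second(weight_bytes, kv_bytes, bandwidth_bytes_per_s):
    """Upper bound on decode speed when each token reads all weights and the KV cache once."""
    return bandwidth_bytes_per_s / (weight_bytes + kv_bytes)


# Hypothetical values for illustration only (assumptions, not from the issue).
N_LAYERS, N_KV_HEADS, HEAD_DIM = 64, 8, 128
WEIGHT_BYTES = 20e9   # assumed size of the quantized weights
BANDWIDTH = 1000e9    # assumed VRAM bandwidth in bytes per second

for ctx in (4096, 24000):
    kv = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, ctx)
    tps = decode_tokens_per_second(WEIGHT_BYTES, kv, BANDWIDTH)
    print(f"context={ctx:>6}  kv_cache={kv / 1e9:.2f} GB  upper bound={tps:.1f} tok/s")
```

The estimate only shows the direction of the effect: a fuller, larger KV cache means more bytes read per token and a lower ceiling on throughput. Whether that accounts for a full halving depends on the actual model, quantization, and hardware, which the thread does not state, and the server logs would still be needed to rule out layers being offloaded to the CPU.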

Reference: github-starred/ollama#67727