[GH-ISSUE #4742] VRAM allocation error when loading different models with different OLLAMA_VRAM_MAX configurations #80666

Closed
opened 2026-05-09 09:17:22 -05:00 by GiteaMirror · 3 comments

Originally created by @hamkido on GitHub (May 31, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4742

What is the issue?

I have two AMD 7900 XTX 24 GB GPUs. When using Ollama, I get memory allocation errors followed by an exit, and which models fail depends on the configuration.

  1. Without OLLAMA_MAX_VRAM set
    The large model deepseek-llm:67b-chat loads correctly.
    Anything bigger, such as qwen:72b or command-r-plus, fails with a VRAM allocation error and Ollama exits.
  2. With OLLAMA_MAX_VRAM set
    Models larger than 67b-q4, such as qwen:72b and command-r-plus, load correctly.
    However, smaller models such as deepseek-llm:67b-chat fail to load: a memory allocation error is reported and Ollama exits.

There might be an error in how the VRAM configuration is handled.

OS

Linux

GPU

AMD

CPU

AMD

Ollama version

0.1.39

GiteaMirror added the bug label 2026-05-09 09:17:22 -05:00

@hamkido commented on GitHub (May 31, 2024):

This is my Ollama configuration:

cat /etc/ollama.conf | grep -v '#'
OLLAMA_NUM_PARALLEL=2
OLLAMA_MAX_LOADED_MODELS=2
HSA_OVERRIDE_GFX_VERSION=11.0.0
ROCR_VISIBLE_DEVICES=0,1
HIP_VISIBLE_DEVICES=0,1
HOME=/var/lib/ollama
GIN_MODE=debug
OLLAMA_DEBUG=1
AMD_SERIALIZE_KERNEL=3
OLLAMA_LLM_LIBRARY=rocm_v60002
OLLAMA_ORIGINS=*
OLLAMA_HOST=0.0.0.0

The other configuration has one more line:
OLLAMA_MAX_VRAM=49392123904

If you need more information, such as systemd logs or further tests, I can upload them.


@hamkido commented on GitHub (May 31, 2024):

This seems to be caused by something else.


@hamkido commented on GitHub (Jun 5, 2024):

Just as a reference for anyone who runs into this error: I worked around it by customizing the totalMemory variable in gpu/amd_linux.go and rebuilding the package. I was short on time and this isn't a very elegant approach, but it works.
With this change, OLLAMA_MAX_VRAM no longer needs to be set in /etc/ollama.conf.


Reference: github-starred/ollama#80666