[GH-ISSUE #6678] OLLAMA_LOAD_TIMEOUT env variable not being applied #29964

Closed
opened 2026-04-22 09:19:48 -05:00 by GiteaMirror · 7 comments
Owner

Originally created by @YetheSamartaka on GitHub (Sep 6, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6678

What is the issue?

The OLLAMA_LOAD_TIMEOUT env variable is not applied at all. When I specify it using `docker -e OLLAMA_LOAD_TIMEOUT=60` and then inspect the logs, the variable is missing from them completely. Other variables might be missing as well.

Here is the text from the logs:

```
routes.go:1125: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:2562047h47m16.854775807s OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR: OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
```

You can see that I set OLLAMA_KEEP_ALIVE to indefinite, so setting env variables works in general, but this one does not seem to be applied: model loading fails after 5 minutes, which is the default value, and the variable is neither applied nor shown here.

OS

Windows, WSL2

GPU

Nvidia

CPU

AMD

Ollama version

0.3.9

GiteaMirror added the bug label 2026-04-22 09:19:48 -05:00
Author
Owner

@rick-github commented on GitHub (Sep 6, 2024):

Support for OLLAMA_LOAD_TIMEOUT was only added to the codebase [17 hours ago](https://github.com/ollama/ollama/commit/67190976491af3535c7587e08db05a6f2ff2d7ea); there hasn't been a [release](https://github.com/ollama/ollama/releases) with it yet. You can clone the repo and [build](https://github.com/ollama/ollama/blob/main/docs/development.md) a local version of ollama while you wait for an official release.

Author
Owner

@dhiltgen commented on GitHub (Sep 6, 2024):

0.3.10 should be out soon, which will include this capability.

Author
Owner

@YetheSamartaka commented on GitHub (Sep 12, 2024):

Hello @dhiltgen. When I manually hardcoded values into the code and rebuilt my own image, it worked, but the env variable is not working in 0.3.10. It shows the default value of 1m0s instead of my supplied value.

Author
Owner

@dhiltgen commented on GitHub (Sep 12, 2024):

@YetheSamartaka can you share what string you're setting it to, and the first log line of the server that looks like `INFO server config env=...`?

Author
Owner

@YetheSamartaka commented on GitHub (Sep 13, 2024):

> @YetheSamartaka can you share what string you're setting it to, and the first log line of the server that looks like `INFO server config env=...`?

@dhiltgen
```
routes.go:1125: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:2562047h47m16.854775807s OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:1m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR: OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
```

Author
Owner

@dhiltgen commented on GitHub (Sep 13, 2024):

@YetheSamartaka I'm not able to reproduce. In 0.3.10 on Windows the default I see is `OLLAMA_LOAD_TIMEOUT:5m0s`. When I set a value in the environment, it is reflected in the first log output (and in the behavior of the system). For example:

```
> $env:OLLAMA_LOAD_TIMEOUT="30m"
> .\ollama.exe serve
2024/09/13 12:36:24 routes.go:1125: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://localhost:1234 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:30m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\\Users\\Daniel\\.ollama\\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\\Users\\Daniel\\tmp\\ollama-windows-amd64\\lib\\ollama\\runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
...
```

I also tried setting it to a very short value (`100ms`) and confirmed it aborts model loads with a timeout.

I think you might not be setting the environment variable correctly, or you may still have local code changes hard coding the value. Can you share how you're setting the variable, and what the value is in the environment?
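
As a quick sanity check of the string itself: assuming the server ultimately parses these values with Go's standard `time.ParseDuration` (ollama is written in Go, but treat the assumption as such), unit-suffixed strings like `30m` and `100ms` parse fine, while a bare number does not. A standalone sketch of that stdlib behavior, not ollama code:

```go
// Standalone sketch: how Go's stdlib time.ParseDuration treats the
// strings seen in this thread. Plain stdlib behavior, not ollama code.
package main

import (
	"fmt"
	"time"
)

func main() {
	for _, s := range []string{"30m", "100ms", "60", "60m"} {
		d, err := time.ParseDuration(s)
		fmt.Printf("%q -> %v (err: %v)\n", s, d, err)
	}
	// "30m"   -> 30m0s  (err: <nil>)
	// "100ms" -> 100ms  (err: <nil>)
	// "60"    -> 0s     (err: time: missing unit in duration "60")
	// "60m"   -> 1h0m0s (err: <nil>)
}
```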

Author
Owner

@YetheSamartaka commented on GitHub (Sep 13, 2024):

@dhiltgen The mistake is on my side. As I mentioned in the initial message, I was setting `-e OLLAMA_LOAD_TIMEOUT=60` instead of `-e OLLAMA_LOAD_TIMEOUT="60m"`. With "60m" it works as expected. Thank you very much for pointing it out.
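
This also explains the `1m0s` seen earlier: it was not the default (`5m0s` is), and it is consistent with the server falling back to treating a bare integer as seconds when `time.ParseDuration` rejects it, so `60` became sixty seconds. A minimal sketch of that assumed fallback; `durationFromEnv` is a hypothetical helper for illustration, not ollama's actual envconfig code:

```go
// Hypothetical sketch of the assumed fallback, NOT ollama's actual
// envconfig code: read a duration-valued env var, treating a bare
// integer as seconds when it lacks a unit suffix.
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// durationFromEnv is a made-up helper for illustration only.
func durationFromEnv(key string, def time.Duration) time.Duration {
	s := os.Getenv(key)
	if s == "" {
		return def
	}
	if d, err := time.ParseDuration(s); err == nil {
		return d
	}
	// ParseDuration rejects unit-less strings like "60"; assume a bare
	// integer means seconds, which matches the 1m0s seen in the log.
	if n, err := strconv.ParseInt(s, 10, 64); err == nil {
		return time.Duration(n) * time.Second
	}
	return def
}

func main() {
	os.Setenv("OLLAMA_LOAD_TIMEOUT", "60")
	fmt.Println(durationFromEnv("OLLAMA_LOAD_TIMEOUT", 5*time.Minute)) // 1m0s
	os.Setenv("OLLAMA_LOAD_TIMEOUT", "60m")
	fmt.Println(durationFromEnv("OLLAMA_LOAD_TIMEOUT", 5*time.Minute)) // 1h0m0s
}
```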
