[GH-ISSUE #4361] Feature request: to document all the environmental variables which are used as configuration parameters for ollama #64760

Closed
opened 2026-05-03 18:43:52 -05:00 by GiteaMirror · 10 comments
Owner

Originally created by @kha84 on GitHub (May 11, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4361

Hello everyone. As the days go by, ollama grows older and more mature, and I see a lot of features being added that rely on some environment variable being set in order to control the behavior and benefit from it. Previously I knew of / used just a couple of such environment variables:

  • CUDA_VISIBLE_DEVICES - I was setting it to an empty string to force ollama to use CPU inference instead of the GPU (I had my own reasons)
  • OLLAMA_HOST - to my limited understanding, this defines which IP address ollama binds to when it starts its API server. By default I think it binds to 127.0.0.1, which is safe in terms of unwanted intruders but makes it impossible to use from other machines on the network; if you want to share your ollama instance you usually set it to 0.0.0.0, so it binds to all the IP addresses known to localhost
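For illustration, a minimal configuration sketch of the two settings described above (the address and port are hypothetical, and ollama must be installed for these commands to do anything):

```bash
# Force CPU inference by hiding all CUDA devices from ollama.
CUDA_VISIBLE_DEVICES="" ollama serve

# Server side: bind the API server to all interfaces instead of 127.0.0.1.
OLLAMA_HOST=0.0.0.0 ollama serve

# Client side: point the CLI at a remote ollama server.
OLLAMA_HOST=http://192.168.1.50:11434 ollama list
```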

With the recent release that introduced parallel inference, I can see a few more environment variables are now considered:

  • OLLAMA_NUM_PARALLEL - the number of parallel workers that process simultaneous requests (1 by default)
  • OLLAMA_MAX_LOADED_MODELS - the maximum number of different models that can be loaded simultaneously (1 by default)
  • OLLAMA_MAX_QUEUE - the queue length, i.e. the number of requests that can sit waiting to be picked up (512 by default)
  • OLLAMA_MAX_VRAM - the maximum VRAM that ollama may use, in case the user wants to limit usage for some reason (unlimited by default)
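Taken together, these could be set for a manual run like this (the values are illustrative, not recommendations):

```bash
# Start the server with 4 parallel workers, up to 2 loaded models,
# and a request queue capped at 256 entries.
OLLAMA_NUM_PARALLEL=4 OLLAMA_MAX_LOADED_MODELS=2 OLLAMA_MAX_QUEUE=256 ollama serve
```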

I can see some effort is being made to gather all the work with such env variables into a single place (https://github.com/ollama/ollama/blob/main/server/envconfig/config.go). I just wanted to mention that all ollama users would benefit a lot from having all these env parameters clearly documented somewhere on a start page.

GiteaMirror added the feature request label 2026-05-03 18:43:52 -05:00

@wgong commented on GitHub (May 11, 2024):

On Ubuntu, ollama is running as a service.
If I add the following to my .bashrc:

```bash
export OLLAMA_NUM_PARALLEL=3
export OLLAMA_MAX_LOADED_MODELS=2
```

will they be picked up? Which user is the ollama service running as?
Thanks


@kha84 commented on GitHub (May 12, 2024):

@wgong If your ollama is running as a systemd service, you'll need to inject those variables into the unit file.
If you're starting it manually with "ollama serve", then yes - whatever env variables you have in your environment at that moment should be picked up.
In both cases it's easy to check for yourself: find the PID of the ollama process and inspect the contents of /proc/<pid>/environ
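This check can be demonstrated without ollama at all, since it works for any Linux process: /proc/<pid>/environ is a NUL-separated snapshot of the environment the process was started with, so spawning a child with a variable set makes it visible there. A small sketch, using sleep as a stand-in for ollama:

```bash
# Spawn a child process with the variable set in its environment.
OLLAMA_NUM_PARALLEL=3 sleep 5 &
pid=$!

# environ entries are NUL-separated; translate to newlines, then filter.
tr '\0' '\n' < "/proc/$pid/environ" | grep '^OLLAMA_NUM_PARALLEL='

kill "$pid"
```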


@wgong commented on GitHub (May 12, 2024):

Thank you @kha84, I was able to follow your suggestion and verify that it works. Thanks also to this YouTube video - https://youtu.be/QKejFJoYC50?si=eBFBsRLxPyobK3X6

How to inject environment variables into a systemd unit file:

```bash
## create /etc/ollama/env
##========================
cd /etc
sudo mkdir ollama
cd ollama
sudo vi env  # add the following 2 env vars
OLLAMA_NUM_PARALLEL=3
OLLAMA_MAX_LOADED_MODELS=2

sudo chmod 640 env

## update the systemd ollama unit file
##========================

sudo vi /etc/systemd/system/ollama.service  # add the following line in the [Service] section
EnvironmentFile=/etc/ollama/env

sudo systemctl daemon-reload
sudo systemctl restart ollama

## verify
##========================

pgrep ollama  # get the PID
sudo cat /proc/<PID>/environ  # see the above 2 env vars

## run 3 clients against 2 models
##========================
```
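A sketch of an alternative approach, assuming a systemd version with drop-in override support: `systemctl edit` creates an override file under /etc/systemd/system/ollama.service.d/ without touching the installed unit, so package upgrades won't clobber the change.

```bash
# Create a drop-in override for the ollama service (opens an editor).
sudo systemctl edit ollama.service
# In the editor, add the following lines, then save and exit:
#   [Service]
#   Environment="OLLAMA_NUM_PARALLEL=3"
#   Environment="OLLAMA_MAX_LOADED_MODELS=2"

# Apply the override.
sudo systemctl restart ollama
```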

@lengrongfu commented on GitHub (May 14, 2024):

Should we simplify the environment variables by adding startup parameters?


@kha84 commented on GitHub (May 14, 2024):

To be honest, I don't mind ollama keeping env variables as its main source of configuration, if that was a deliberate decision - just have all of them listed, documented, and clearly visible to users. Right now this precious info is scattered throughout the ollama code itself.


@testniubi1 commented on GitHub (May 15, 2024):

I agree with this - with a clear list of env variables we can set up ollama better.


@kha84 commented on GitHub (May 19, 2024):

Just a few greps:

```
$ grep -r EnvironmentVar ./*
./cmd/cmd.go:type EnvironmentVar struct {
./cmd/cmd.go:func appendEnvDocs(cmd *cobra.Command, envs []EnvironmentVar) {
./cmd/cmd.go:   ollamaHostEnv := EnvironmentVar{"OLLAMA_HOST", "The host:port or base URL of the Ollama server (e.g. http://localhost:11434)"}
./cmd/cmd.go:   ollamaNoHistoryEnv := EnvironmentVar{"OLLAMA_NOHISTORY", "Disable readline history"}
./cmd/cmd.go:   envs := []EnvironmentVar{ollamaHostEnv}
./cmd/cmd.go:                   appendEnvDocs(cmd, []EnvironmentVar{ollamaHostEnv, ollamaNoHistoryEnv})
$ grep -r os.Getenv ./*
./api/client.go:        hostVar := os.Getenv("OLLAMA_HOST")
./app/lifecycle/paths.go:               localAppData := os.Getenv("LOCALAPPDATA")
./app/lifecycle/paths.go:               paths := strings.Split(os.Getenv("PATH"), ";")
./app/store/store_darwin.go:    home := os.Getenv("HOME")
./app/store/store_linux.go:     home := os.Getenv("HOME")
./app/store/store_windows.go:   localAppData := os.Getenv("LOCALAPPDATA")
./cmd/cmd.go:           WordWrap:    os.Getenv("TERM") == "xterm-256color",
./cmd/interactive.go:   if os.Getenv("OLLAMA_NOHISTORY") != "" {
./cmd/start_windows.go:         localAppData := os.Getenv("LOCALAPPDATA")
./gpu/amd_common.go:    hipPath := os.Getenv("HIP_PATH")
./gpu/amd_common.go:    paths := os.Getenv(pathEnv)
./gpu/amd_linux.go:     hipVD := os.Getenv("HIP_VISIBLE_DEVICES")   // zero based index only
./gpu/amd_linux.go:     rocrVD := os.Getenv("ROCR_VISIBLE_DEVICES") // zero based index or UUID, but consumer cards seem to not support UUID
./gpu/amd_linux.go:     gpuDO := os.Getenv("GPU_DEVICE_ORDINAL")    // zero based index
./gpu/amd_linux.go:     gfxOverride := os.Getenv("HSA_OVERRIDE_GFX_VERSION")
./gpu/amd_windows.go:   gfxOverride := os.Getenv("HSA_OVERRIDE_GFX_VERSION")
./gpu/amd_windows.go:   localAppData := os.Getenv("LOCALAPPDATA")
./gpu/assets.go:                pathComponents := strings.Split(os.Getenv("PATH"), ";")
./gpu/gpu.go:var CudaTegra string = os.Getenv("JETSON_JETPACK")
./gpu/gpu.go:           localAppData := os.Getenv("LOCALAPPDATA")
./gpu/gpu.go:           ldPaths = strings.Split(os.Getenv("PATH"), ";")
./gpu/gpu.go:           ldPaths = strings.Split(os.Getenv("LD_LIBRARY_PATH"), ":")
./integration/basic_test.go:    if os.Getenv("OLLAMA_TEST_EXISTING") != "" {
./integration/basic_test.go:    oldModelsDir := os.Getenv("OLLAMA_MODELS")
./integration/concurrency_test.go:      vram := os.Getenv("OLLAMA_MAX_VRAM")
./integration/max_queue_test.go:        if os.Getenv("OLLAMA_TEST_EXISTING") != "" {
./integration/max_queue_test.go:        mq := os.Getenv("OLLAMA_MAX_QUEUE")
./integration/utils_test.go:    ollamaHost := os.Getenv("OLLAMA_HOST")
./integration/utils_test.go:    if os.Getenv("OLLAMA_TEST_EXISTING") == "" && port == defaultPort {
./integration/utils_test.go:    if tmp := os.Getenv("OLLAMA_HOST"); tmp != ollamaHost {
./integration/utils_test.go:    if os.Getenv("OLLAMA_TEST_EXISTING") == "" {
./integration/utils_test.go:            if os.Getenv("OLLAMA_TEST_EXISTING") == "" {
./server/envconfig/config.go:   return strings.Trim(os.Getenv(key), "\"' ")
./server/envconfig/config.go:   if onp := os.Getenv("OLLAMA_MAX_QUEUE"); onp != "" {
$ grep -r CUDA_VISIBLE_DEVICES ./
./docs/gpu.md:a subset, you can set `CUDA_VISIBLE_DEVICES` to a comma separated list of GPUs.
./gpu/cuda_common.go:   return "CUDA_VISIBLE_DEVICES", strings.Join(ids, ",")
```

Looks like there's no single pattern inside ollama for how it works with env variables.


@kha84 commented on GitHub (May 19, 2024):

Anyway, I'm happy to help and prepare a pull request for the documentation, if you don't mind.


@pdevine commented on GitHub (May 24, 2024):

You should now be able to get online help for each of the environment variables, e.g. `ollama serve -h`. There are several environment variables which are purposely _not_ exposed, just because they are probably going to go away at some point. We can probably document _some_ of those in the future, but they'll be tagged with something like _experimental_.


@cenkerozkan commented on GitHub (Jan 11, 2026):

@wgong You have no idea how happy you made me with this <3 thank you


Reference: github-starred/ollama#64760