[GH-ISSUE #6195] When I start the container with http_proxy and https_proxy configured, the ollama service will not start properly #3868

Closed
opened 2026-04-12 14:42:19 -05:00 by GiteaMirror · 4 comments

Originally created by @0sengseng0 on GitHub (Aug 6, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6195

What is the issue?

Start command

```
docker run -d --gpus=all -v ollama:/root/.ollama -p 31434:11434 -e "OLLAMA_DEBUG=1" -e "CUDA_VISIBLE_DEVICES=0" -e "http_proxy=http://192.168.*.*:11080" -e "https_proxy=http://192.168.*.*:11080" --name ollama ollama/ollama
```

Commands run inside the container

```
[root@main ~]# docker exec -it ollama bash
root@5673350a5bbd:/# ollama list
Error: something went wrong, please see the ollama server logs for details
root@5673350a5bbd:/# ollama --version
Warning: could not connect to a running Ollama instance
Warning: client version is 0.3.3
```

Container logs

```
[root@main git]# docker logs -f ollama
2024/08/06 07:30:27 routes.go:1108: INFO server config env="map[CUDA_VISIBLE_DEVICES:0 GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR: OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-08-06T07:30:27.744Z level=INFO source=images.go:781 msg="total blobs: 0"
time=2024-08-06T07:30:27.744Z level=INFO source=images.go:788 msg="total unused blobs removed: 0"
time=2024-08-06T07:30:27.745Z level=INFO source=routes.go:1155 msg="Listening on [::]:11434 (version 0.3.3)"
time=2024-08-06T07:30:27.746Z level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama2311459276/runners
time=2024-08-06T07:30:27.746Z level=DEBUG source=payload.go:182 msg=extracting variant=cpu file=build/linux/x86_64/cpu/bin/ollama_llama_server.gz
time=2024-08-06T07:30:27.746Z level=DEBUG source=payload.go:182 msg=extracting variant=cpu_avx file=build/linux/x86_64/cpu_avx/bin/ollama_llama_server.gz
time=2024-08-06T07:30:27.746Z level=DEBUG source=payload.go:182 msg=extracting variant=cpu_avx2 file=build/linux/x86_64/cpu_avx2/bin/ollama_llama_server.gz
time=2024-08-06T07:30:27.746Z level=DEBUG source=payload.go:182 msg=extracting variant=cuda_v11 file=build/linux/x86_64/cuda_v11/bin/libcublas.so.11.gz
time=2024-08-06T07:30:27.746Z level=DEBUG source=payload.go:182 msg=extracting variant=cuda_v11 file=build/linux/x86_64/cuda_v11/bin/libcublasLt.so.11.gz
time=2024-08-06T07:30:27.746Z level=DEBUG source=payload.go:182 msg=extracting variant=cuda_v11 file=build/linux/x86_64/cuda_v11/bin/libcudart.so.11.0.gz
time=2024-08-06T07:30:27.746Z level=DEBUG source=payload.go:182 msg=extracting variant=cuda_v11 file=build/linux/x86_64/cuda_v11/bin/ollama_llama_server.gz
time=2024-08-06T07:30:27.746Z level=DEBUG source=payload.go:182 msg=extracting variant=rocm_v60102 file=build/linux/x86_64/rocm_v60102/bin/deps.txt.gz
time=2024-08-06T07:30:27.746Z level=DEBUG source=payload.go:182 msg=extracting variant=rocm_v60102 file=build/linux/x86_64/rocm_v60102/bin/ollama_llama_server.gz
time=2024-08-06T07:30:35.271Z level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama2311459276/runners/cpu/ollama_llama_server
time=2024-08-06T07:30:35.271Z level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama2311459276/runners/cpu_avx/ollama_llama_server
time=2024-08-06T07:30:35.271Z level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama2311459276/runners/cpu_avx2/ollama_llama_server
time=2024-08-06T07:30:35.271Z level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama2311459276/runners/cuda_v11/ollama_llama_server
time=2024-08-06T07:30:35.271Z level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama2311459276/runners/rocm_v60102/ollama_llama_server
time=2024-08-06T07:30:35.271Z level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 cuda_v11 rocm_v60102]"
time=2024-08-06T07:30:35.271Z level=DEBUG source=payload.go:45 msg="Override detection logic by setting OLLAMA_LLM_LIBRARY"
time=2024-08-06T07:30:35.271Z level=DEBUG source=sched.go:105 msg="starting llm scheduler"
time=2024-08-06T07:30:35.271Z level=INFO source=gpu.go:205 msg="looking for compatible GPUs"
time=2024-08-06T07:30:35.272Z level=DEBUG source=gpu.go:91 msg="searching for GPU discovery libraries for NVIDIA"
time=2024-08-06T07:30:35.272Z level=DEBUG source=gpu.go:468 msg="Searching for GPU library" name=libcuda.so*
time=2024-08-06T07:30:35.272Z level=DEBUG source=gpu.go:487 msg="gpu library search" globs="[/usr/local/nvidia/lib/libcuda.so** /usr/local/nvidia/lib64/libcuda.so** /usr/local/cuda*/targets/*/lib/libcuda.so* /usr/lib/*-linux-gnu/nvidia/current/libcuda.so* /usr/lib/*-linux-gnu/libcuda.so* /usr/lib/wsl/lib/libcuda.so* /usr/lib/wsl/drivers/*/libcuda.so* /opt/cuda/lib*/libcuda.so* /usr/local/cuda/lib*/libcuda.so* /usr/lib*/libcuda.so* /usr/local/lib*/libcuda.so*]"
time=2024-08-06T07:30:35.273Z level=DEBUG source=gpu.go:521 msg="discovered GPU libraries" paths=[/usr/lib/x86_64-linux-gnu/libcuda.so.535.154.05]
CUDA driver version: 12.2
time=2024-08-06T07:30:35.287Z level=DEBUG source=gpu.go:124 msg="detected GPUs" count=1 library=/usr/lib/x86_64-linux-gnu/libcuda.so.535.154.05
[GPU-dfcbde6c-a5c4-f8b6-e56d-5b0d63559daf] CUDA totalMem 24217 mb
[GPU-dfcbde6c-a5c4-f8b6-e56d-5b0d63559daf] CUDA freeMem 5622 mb
[GPU-dfcbde6c-a5c4-f8b6-e56d-5b0d63559daf] Compute Capability 8.9
time=2024-08-06T07:30:35.467Z level=DEBUG source=amd_linux.go:371 msg="amdgpu driver not detected /sys/module/amdgpu"
releasing cuda driver library
time=2024-08-06T07:30:35.467Z level=INFO source=types.go:105 msg="inference compute" id=GPU-dfcbde6c-a5c4-f8b6-e56d-5b0d63559daf library=cuda compute=8.9 driver=12.2 name="NVIDIA GeForce RTX 4090" total="23.6 GiB" available="5.5 GiB"
```

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.3.3

GiteaMirror added the bug label 2026-04-12 14:42:19 -05:00

@rick-github commented on GitHub (Aug 6, 2024):

ollama operates with a client/server model. The server will use your HTTP proxy, but you don't want the client to do that, so you need to unset the proxy environment variables when using the client.

```
[root@main ~]# docker exec -it ollama bash
root@5673350a5bbd:/# unset http_proxy https_proxy
root@5673350a5bbd:/# ollama --version
```
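
An alternative to unsetting the variables in every shell is to exempt loopback traffic from the proxy when starting the container. This is only a sketch: it assumes the ollama client honors the conventional `no_proxy`/`NO_PROXY` variables (as Go's default HTTP transport does), and it reuses the masked proxy address from the original command.

```
# Keep the proxy for the server's outbound requests (e.g. model pulls),
# but exempt loopback so the CLI inside the container can reach the local server.
docker run -d --gpus=all -v ollama:/root/.ollama -p 31434:11434 \
  -e "OLLAMA_DEBUG=1" \
  -e "CUDA_VISIBLE_DEVICES=0" \
  -e "http_proxy=http://192.168.*.*:11080" \
  -e "https_proxy=http://192.168.*.*:11080" \
  -e "no_proxy=localhost,127.0.0.1" \
  --name ollama ollama/ollama
```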

@0sengseng0 commented on GitHub (Aug 8, 2024):

Does the lack of a `python` command in the image mean I can't convert a Hugging Face Safetensors model to a GGUF model inside the image?

@rick-github commented on GitHub (Aug 8, 2024):

https://github.com/ollama/ollama/blob/main/docs/import.md#import-safetensors

@mxyng commented on GitHub (Aug 23, 2024):

Setting `http_proxy` for the container is not recommended if you're reusing the container for the ollama CLI, since the client's requests will then be routed through the proxy, which has no connectivity back into the container to reach the ollama server. I'd suggest omitting it completely.

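If the proxy variables are left out of the container entirely, as suggested, the CLI can also be run from the host against the published port instead of through `docker exec`. A small sketch, assuming the 31434:11434 mapping from the original `docker run`:

```
# Point the host-side client at the container's published port.
OLLAMA_HOST=127.0.0.1:31434 ollama list
OLLAMA_HOST=127.0.0.1:31434 ollama run llama3   # model name is illustrative
```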

Reference: github-starred/ollama#3868