[GH-ISSUE #10258] Ollama is not working on Kubernetes after 0.5.0 #68789

Closed
opened 2026-05-04 15:11:38 -05:00 by GiteaMirror · 1 comment

Originally created by @1995parham on GitHub (Apr 13, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10258

We are deploying Ollama on Kubernetes using the [helm chart](https://github.com/otwld/ollama-helm/), but upgrading to any 0.5.x or 0.6.x version breaks it: the pod starts correctly, yet the service returns connection refused or EOF. Here is the debug log:

```
ollama-platform-genai-79f6b79848-vdbqb ollama 2025/04/13 21:01:42 routes.go:1231: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:2048 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:168h0m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:5 OLLAMA_MAX_QUEUE:1024 OLLAMA_MODELS:/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:4 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
ollama-platform-genai-79f6b79848-vdbqb ollama time=2025-04-13T21:01:42.927Z level=INFO source=images.go:458 msg="total blobs: 18"
ollama-platform-genai-79f6b79848-vdbqb ollama time=2025-04-13T21:01:42.927Z level=INFO source=images.go:465 msg="total unused blobs removed: 0"
ollama-platform-genai-79f6b79848-vdbqb ollama time=2025-04-13T21:01:42.927Z level=INFO source=routes.go:1298 msg="Listening on [::]:11434 (version 0.6.5)"
ollama-platform-genai-79f6b79848-vdbqb ollama time=2025-04-13T21:01:42.928Z level=DEBUG source=sched.go:107 msg="starting llm scheduler"
ollama-platform-genai-79f6b79848-vdbqb ollama time=2025-04-13T21:01:42.928Z level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
ollama-platform-genai-79f6b79848-vdbqb ollama time=2025-04-13T21:01:42.929Z level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
ollama-platform-genai-79f6b79848-vdbqb ollama time=2025-04-13T21:01:42.929Z level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=libcuda.so*
ollama-platform-genai-79f6b79848-vdbqb ollama time=2025-04-13T21:01:42.929Z level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[/usr/lib/ollama/libcuda.so* /usr/local/nvidia/lib/libcuda.so* /usr/local/nvidia/lib64/libcuda.so* /usr/local/cuda*/targets/*/lib/libcuda.so* /usr/lib/*-linux-gnu/nvidia/current/libcuda.so* /usr/lib/*-linux-gnu/libcuda.so* /usr/lib/wsl/lib/libcuda.so* /usr/lib/wsl/drivers/*/libcuda.so* /opt/cuda/lib*/libcuda.so* /usr/local/cuda/lib*/libcuda.so* /usr/lib*/libcuda.so* /usr/local/lib*/libcuda.so*]"
ollama-platform-genai-79f6b79848-vdbqb ollama time=2025-04-13T21:01:42.930Z level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths=[/usr/lib/x86_64-linux-gnu/libcuda.so.560.35.03]
ollama-platform-genai-79f6b79848-vdbqb ollama initializing /usr/lib/x86_64-linux-gnu/libcuda.so.560.35.03
ollama-platform-genai-79f6b79848-vdbqb ollama dlsym: cuInit - 0x7f9e2d8be7f0
ollama-platform-genai-79f6b79848-vdbqb ollama dlsym: cuDriverGetVersion - 0x7f9e2d8be810
ollama-platform-genai-79f6b79848-vdbqb ollama dlsym: cuDeviceGetCount - 0x7f9e2d8be850
ollama-platform-genai-79f6b79848-vdbqb ollama dlsym: cuDeviceGet - 0x7f9e2d8be830
ollama-platform-genai-79f6b79848-vdbqb ollama dlsym: cuDeviceGetAttribute - 0x7f9e2d8be930
ollama-platform-genai-79f6b79848-vdbqb ollama dlsym: cuDeviceGetUuid - 0x7f9e2d8be890
ollama-platform-genai-79f6b79848-vdbqb ollama dlsym: cuDeviceGetName - 0x7f9e2d8be870
ollama-platform-genai-79f6b79848-vdbqb ollama dlsym: cuCtxCreate_v3 - 0x7f9e2d8c9060
ollama-platform-genai-79f6b79848-vdbqb ollama dlsym: cuMemGetInfo_v2 - 0x7f9e2d8d4520
ollama-platform-genai-79f6b79848-vdbqb ollama dlsym: cuCtxDestroy - 0x7f9e2d92f380
ollama-platform-genai-79f6b79848-vdbqb ollama calling cuInit
ollama-platform-genai-79f6b79848-vdbqb ollama calling cuDriverGetVersion
ollama-platform-genai-79f6b79848-vdbqb ollama raw version 0x2f1c
ollama-platform-genai-79f6b79848-vdbqb ollama CUDA driver version: 12.6
ollama-platform-genai-79f6b79848-vdbqb ollama calling cuDeviceGetCount
ollama-platform-genai-79f6b79848-vdbqb ollama device count 1
ollama-platform-genai-79f6b79848-vdbqb ollama time=2025-04-13T21:01:42.934Z level=DEBUG source=gpu.go:125 msg="detected GPUs" count=1 library=/usr/lib/x86_64-linux-gnu/libcuda.so.560.35.03
ollama-platform-genai-79f6b79848-vdbqb ollama [GPU-c3b33e5a-6207-f3d7-34ce-04c6a4416df7] CUDA totalMem 45709 mb
ollama-platform-genai-79f6b79848-vdbqb ollama [GPU-c3b33e5a-6207-f3d7-34ce-04c6a4416df7] CUDA freeMem 45278 mb
ollama-platform-genai-79f6b79848-vdbqb ollama [GPU-c3b33e5a-6207-f3d7-34ce-04c6a4416df7] Compute Capability 8.9
ollama-platform-genai-79f6b79848-vdbqb ollama time=2025-04-13T21:01:43.066Z level=DEBUG source=amd_linux.go:419 msg="amdgpu driver not detected /sys/module/amdgpu"
ollama-platform-genai-79f6b79848-vdbqb ollama releasing cuda driver library
ollama-platform-genai-79f6b79848-vdbqb ollama time=2025-04-13T21:01:43.066Z level=INFO source=types.go:130 msg="inference compute" id=GPU-c3b33e5a-6207-f3d7-34ce-04c6a4416df7 library=cuda variant=v12 compute=8.9 driver=12.6 name="NVIDIA L40S" total="44.6 GiB" available="44.2 GiB"
```
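For reference, the `OLLAMA_DEBUG:true` entry in the server config line above is what produces this level of output. A minimal sketch of enabling it through the container environment, assuming your chart forwards extra environment variables to the Ollama container (the exact values key varies by chart version):

```yaml
# Hypothetical container env fragment; adapt to wherever your chart
# injects environment variables into the Ollama container.
env:
  - name: OLLAMA_DEBUG
    value: "true"
```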

We are using an NVIDIA GPU, as the logs show, and our Kubernetes version is:

```
Client Version: 4.18.7
Kustomize Version: v5.4.2
Server Version: 4.10.0-0.okd-2022-03-07-131213
Kubernetes Version: v1.23.3-2003+e419edff267ffa-dirty
```

@1995parham commented on GitHub (Apr 16, 2025):

The issue was related to the kernel version and the Go version: Go recently enabled MPTCP by default, which our cluster's kernel does not yet support. You can fix it by setting the following environment variable:

```yaml
- name: GODEBUG
  value: multipathtcp=0
```
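If you deploy through the otwld/ollama-helm chart, one way to inject this is via the chart's extra environment list; a minimal sketch, assuming the chart exposes an `extraEnv` key in values.yaml (verify against the chart version you use):

```yaml
# values.yaml sketch for the otwld/ollama-helm chart.
# 'extraEnv' is assumed here; check your chart's values for the exact key.
extraEnv:
  - name: GODEBUG
    value: multipathtcp=0
```

`GODEBUG=multipathtcp=0` tells the Go runtime not to attempt Multipath TCP, so the server listens on plain TCP sockets that older kernels handle.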
