[GH-ISSUE #6650] ollama serve does not finish after a long wait #4186

Closed
opened 2026-04-12 15:07:06 -05:00 by GiteaMirror · 1 comment

Originally created by @lifelongeeek on GitHub (Sep 5, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6650

What is the issue?

I tried `ollama serve` in a container, but it never completes, even after waiting a very long time. Could anyone suggest a solution for this?

Here is the log.

```
root@d39fcb3d6754: # ollama serve
Couldn't find '/root/.ollama/id_ed25519'. Generating new private key.
Your new public key is: 

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPaNPlXFSbC0urQ1ESCmmOMdA/yq1Dem4qWNwPtKDbuc

2024/09/04 12:54:30 routes.go:1125: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR: OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-09-05T03:54:30.399Z level=INFO source=images.go:753 msg="total blobs: 0"
time=2024-09-05T03:54:30.399Z level=INFO source=images.go:760 msg="total unused blobs removed: 0"
time=2024-09-05T03:54:30.399Z level=INFO source=routes.go:1172 msg="Listening on 127.0.0.1:11434 (version 0.3.9)"
time=2024-09-05T03:54:30.408Z level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama844009318/runners
time=2024-09-05T03:54:41.802Z level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 cuda_v11 cuda_v12 rocm_v60102]"
time=2024-09-05T03:54:41.802Z level=INFO source=gpu.go:200 msg="looking for compatible GPUs"
time=2024-09-05T03:54:42.152Z level=INFO source=types.go:107 msg="inference compute" id=GPU-0c8e7b28-0c88-e46b-0269-e472c7044e62 library=cuda variant=v12 compute=8.0 driver=12.3 name="NVIDIA A100 80GB PCIe" total="79.2 GiB" available="78.7 GiB"
```
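
The tail of the log shows a normal startup: the runner payloads are extracted, the A100 is detected, and the server is listening on 127.0.0.1:11434. Since `ollama serve` runs in the foreground by design, one way to keep the same container shell usable is to background it, roughly like this (a sketch; the log path and wait time are arbitrary choices):

```
# Start the server in the background so the shell stays usable (a sketch):
ollama serve > /tmp/ollama.log 2>&1 &   # log path is an arbitrary choice
sleep 3                                 # crude wait for the port to bind
ollama list                             # client commands now reach 127.0.0.1:11434
```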

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.3.9

GiteaMirror added the bug label 2026-04-12 15:07:06 -05:00

@jmorganca commented on GitHub (Sep 5, 2024):

Hi @lifelongeeek, thanks for creating an issue! I think this is working as expected: `ollama serve` serves Ollama's REST API (https://github.com/ollama/ollama/blob/main/docs/api.md). As mentioned in the logs, the server will be available on port 11434. Let me know if you still have trouble (or feel free to email me – email is in my GH profile).
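
From a second shell in the same container, the running server can be checked with something like the following (a minimal sketch; `/api/tags` is documented in the linked API reference, and the root path returns a plain liveness string):

```
# Quick liveness checks against the address shown in the log:
curl http://127.0.0.1:11434/          # responds with "Ollama is running"
curl http://127.0.0.1:11434/api/tags  # JSON listing of locally pulled models
```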
