[GH-ISSUE #13590] Internal Server Error on Model Load (EOF) #55460

Closed
opened 2026-04-29 09:15:48 -05:00 by GiteaMirror · 1 comment

Originally created by @ApprenticeofEnder on GitHub (Dec 31, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13590

What is the issue?

Attempted to run gpt-oss:20b on my local hardware and got an internal server error (HTTP 500) that mentioned an EOF.

Was on 0.12.11; I'm switching to 0.13.5 (I run it via Nix and home-manager) and will update if that changes anything.
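
For anyone trying to reproduce this outside the CLI, here is a minimal sketch that sends the same kind of request the log shows failing (a POST to /api/generate) and returns the status and body even when the server answers 500. It assumes Ollama is listening on its default address, 127.0.0.1:11434; the helper names `build_request` and `try_generate` are made up for illustration and are not part of any Ollama client.

```python
import json
import urllib.error
import urllib.request

# Assumption: Ollama is listening on its default address and port.
OLLAMA_URL = "http://127.0.0.1:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST to /api/generate (hypothetical helper)."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def try_generate(model: str, prompt: str = "hello") -> tuple[int, str]:
    """Return (HTTP status, response body), including for error responses."""
    request = build_request(model, prompt)
    try:
        with urllib.request.urlopen(request, timeout=300) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # A 500 response lands here; read the body to see the server's error text.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Calling `try_generate("gpt-oss:20b")` with the server running should return either a 200 with the generated text or the error status and message; if the server is not reachable at all, `urllib` raises `URLError` instead.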

Relevant log output

From systemctl --user status ollama, since that's the only thing that currently gives me anything resembling logs.

ollama.service - Server for local large language models
     Loaded: loaded (/home/ender/.config/systemd/user/ollama.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2025-12-30 22:04:39 EST; 38min ago
   Main PID: 16508 (.ollama-wrapped)
      Tasks: 22 (limit: 38009)
     Memory: 11.8G
        CPU: 1min 43.892s
     CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/ollama.service
             └─16508 /nix/store/6bxrsdj05ykag2dr9hwp8z9bhah2a0gy-ollama-0.12.11/bin/ollama serve

Dec 30 22:31:47 ender-hornet ollama[16508]: gs     0x0
Dec 30 22:31:47 ender-hornet ollama[16508]: time=2025-12-30T22:31:47.373-05:00 level=INFO source=device.go:240 msg="model weights" device=CUDA0 size="11.8 GiB"
Dec 30 22:31:47 ender-hornet ollama[16508]: time=2025-12-30T22:31:47.374-05:00 level=INFO source=device.go:245 msg="model weights" device=CPU size="1.1 GiB"
Dec 30 22:31:47 ender-hornet ollama[16508]: time=2025-12-30T22:31:47.374-05:00 level=INFO source=device.go:251 msg="kv cache" device=CUDA0 size="858.0 MiB"
Dec 30 22:31:47 ender-hornet ollama[16508]: time=2025-12-30T22:31:47.374-05:00 level=INFO source=device.go:262 msg="compute graph" device=CUDA0 size="222.8 MiB"
Dec 30 22:31:47 ender-hornet ollama[16508]: time=2025-12-30T22:31:47.374-05:00 level=INFO source=device.go:267 msg="compute graph" device=CPU size="5.6 MiB"
Dec 30 22:31:47 ender-hornet ollama[16508]: time=2025-12-30T22:31:47.374-05:00 level=INFO source=device.go:272 msg="total memory" size="13.9 GiB"
Dec 30 22:31:47 ender-hornet ollama[16508]: time=2025-12-30T22:31:47.374-05:00 level=INFO source=sched.go:470 msg="Load failed" model=/home/ender/.ollama/models/blobs/sha256-e7b273f9636059a689e3ddcab3716e4f65abe0143ac978e46673ad0e52d09efb error="do load request: Post \"http://127.0.0.1:43297/load\": EOF"
Dec 30 22:31:47 ender-hornet ollama[16508]: time=2025-12-30T22:31:47.418-05:00 level=ERROR source=server.go:265 msg="llama runner terminated" error="exit status 2"
Dec 30 22:31:47 ender-hornet ollama[16508]: [GIN] 2025/12/30 - 22:31:47 | 500 |  1.823233695s |       127.0.0.1 | POST     "/api/generate"
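
As a quick sanity check on the memory figures in the log, the sketch below sums the per-device sizes. The five lines are copied from the log above with the journald prefix trimmed; the regular expression and the function name `mib_per_device` are assumptions made for this example, not anything Ollama provides.

```python
import re
from collections import defaultdict

# Memory lines copied from the log above (journald prefix trimmed).
LOG = '''
msg="model weights" device=CUDA0 size="11.8 GiB"
msg="model weights" device=CPU size="1.1 GiB"
msg="kv cache" device=CUDA0 size="858.0 MiB"
msg="compute graph" device=CUDA0 size="222.8 MiB"
msg="compute graph" device=CPU size="5.6 MiB"
'''

LINE = re.compile(r'device=(\S+) size="([\d.]+) (GiB|MiB)"')
MIB_PER_UNIT = {"GiB": 1024.0, "MiB": 1.0}


def mib_per_device(log: str) -> dict[str, float]:
    """Sum the logged sizes per device, in MiB."""
    totals: dict[str, float] = defaultdict(float)
    for device, value, unit in LINE.findall(log):
        totals[device] += float(value) * MIB_PER_UNIT[unit]
    return dict(totals)


totals = mib_per_device(LOG)
for device, mib in totals.items():
    print(f"{device}: {mib / 1024:.2f} GiB")
print(f"total: {sum(totals.values()) / 1024:.2f} GiB")
```

This gives about 12.86 GiB for CUDA0, 1.11 GiB for CPU, and 13.96 GiB overall, which is close to the logged "total memory" of 13.9 GiB; the small gap is presumably rounding in the logged sizes.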

nvidia-smi:

Tue Dec 30 22:46:19 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 980 Ti      Off |   00000000:04:00.0 Off |                  N/A |
|  0%   36C    P8             11W /  250W |       8MiB /   6144MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3070 Ti     Off |   00000000:0A:00.0  On |                  N/A |
|  0%   54C    P5             41W /  290W |     857MiB /   8192MiB |     32%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            1680      G   /usr/lib/xorg/Xorg                        3MiB |
|    1   N/A  N/A            1680      G   /usr/lib/xorg/Xorg                      434MiB |
|    1   N/A  N/A            2686      G   cinnamon                                 16MiB |
|    1   N/A  N/A            2994      G   /opt/1Password/1password                 23MiB |
|    1   N/A  N/A            3150      G   .../ender/.nix-profile/bin/kitty         13MiB |
|    1   N/A  N/A           18347      G   /usr/lib/firefox/firefox                236MiB |
+-----------------------------------------------------------------------------------------+
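
To put the table next to the log, this sketch pulls the used/total memory out of the two GPU rows and prints what was free on each card. The rows are copied from the nvidia-smi output above; the regular expression and the name `free_mib` are assumptions for illustration. Two caveats: this snapshot was taken at 22:46, about 15 minutes after the failed load at 22:31, and nvidia-smi's GPU numbering may not match Ollama's CUDA0.

```python
import re

# The two memory rows from the nvidia-smi table above.
SMI_ROWS = '''
|  0%   36C    P8             11W /  250W |       8MiB /   6144MiB |      0%      Default |
|  0%   54C    P5             41W /  290W |     857MiB /   8192MiB |     32%      Default |
'''

MEM = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB")


def free_mib(rows: str) -> list[int]:
    """Return total minus used MiB for each GPU row, in table order."""
    return [int(total) - int(used) for used, total in MEM.findall(rows)]


free = free_mib(SMI_ROWS)
for index, mib in enumerate(free):
    print(f"GPU {index}: {mib} MiB free ({mib / 1024:.2f} GiB)")
```

That works out to 6136 MiB free on the GTX 980 Ti and 7335 MiB on the RTX 3070 Ti. The CUDA0 figures in the log (11.8 GiB + 858.0 MiB + 222.8 MiB, about 12.9 GiB) are larger than either card's total memory, though nothing in the log establishes that this is what made the runner exit.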

OS

Linux

GPU

Nvidia

CPU

AMD

Ollama version

0.12.11

GiteaMirror added the bug label 2026-04-29 09:15:48 -05:00

@ApprenticeofEnder commented on GitHub (Dec 31, 2025):

Updating to 0.13.5 seems to have resolved it. Strange.

Reference: github-starred/ollama#55460