[GH-ISSUE #15107] 10 minutes http client timeout is too low when image generation is done on CPU #9678

Open
opened 2026-04-12 22:33:50 -05:00 by GiteaMirror · 2 comments

Originally created by @yurivict on GitHub (Mar 27, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15107

What is the issue?

This line (https://github.com/ollama/ollama/blob/main/x/imagegen/server.go#L58) sets a 10-minute timeout regardless of the hardware.

This may be fine on a GPU, but it is certainly too low for a CPU.
The timeout should be at least an hour when a CPU is used.

Relevant log output


OS

Linux

GPU

No response

CPU

AMD

Ollama version

0.18.3

GiteaMirror added the bug label 2026-04-12 22:33:50 -05:00

@rick-github commented on GitHub (Mar 27, 2026):

Do you have logs showing a failed generation?


@yurivict commented on GitHub (Mar 28, 2026):

Image gen works without this timeout, but with the timeout it fails with this server log:

$ ollama-serve 
time=2026-03-28T02:39:41.185-07:00 level=INFO source=routes.go:1740 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:65536 OLLAMA_DEBUG:INFO OLLAMA_DEBUG_LOG_REQUESTS:false OLLAMA_EDITOR: OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/yuri/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_CLOUD:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2026-03-28T02:39:41.185-07:00 level=INFO source=routes.go:1742 msg="Ollama cloud disabled: false"
time=2026-03-28T02:39:41.219-07:00 level=INFO source=images.go:477 msg="total blobs: 2339"
time=2026-03-28T02:39:41.222-07:00 level=INFO source=images.go:484 msg="total unused blobs removed: 0"
[GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.

[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
 - using env:   export GIN_MODE=release
 - using code:  gin.SetMode(gin.ReleaseMode)

[GIN-debug] HEAD   /                         --> github.com/yurivict/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
[GIN-debug] GET    /                         --> github.com/yurivict/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
[GIN-debug] HEAD   /api/version              --> github.com/yurivict/ollama/server.(*Server).GenerateRoutes.func3 (5 handlers)
[GIN-debug] GET    /api/version              --> github.com/yurivict/ollama/server.(*Server).GenerateRoutes.func4 (5 handlers)
[GIN-debug] GET    /api/status               --> github.com/yurivict/ollama/server.(*Server).StatusHandler-fm (5 handlers)
[GIN-debug] POST   /api/pull                 --> github.com/yurivict/ollama/server.(*Server).PullHandler-fm (5 handlers)
[GIN-debug] POST   /api/push                 --> github.com/yurivict/ollama/server.(*Server).PushHandler-fm (5 handlers)
[GIN-debug] HEAD   /api/tags                 --> github.com/yurivict/ollama/server.(*Server).ListHandler-fm (5 handlers)
[GIN-debug] GET    /api/tags                 --> github.com/yurivict/ollama/server.(*Server).ListHandler-fm (5 handlers)
[GIN-debug] POST   /api/show                 --> github.com/yurivict/ollama/server.(*Server).ShowHandler-fm (5 handlers)
[GIN-debug] DELETE /api/delete               --> github.com/yurivict/ollama/server.(*Server).DeleteHandler-fm (5 handlers)
[GIN-debug] POST   /api/me                   --> github.com/yurivict/ollama/server.(*Server).WhoamiHandler-fm (5 handlers)
[GIN-debug] POST   /api/signout              --> github.com/yurivict/ollama/server.(*Server).SignoutHandler-fm (5 handlers)
[GIN-debug] DELETE /api/user/keys/:encodedKey --> github.com/yurivict/ollama/server.(*Server).SignoutHandler-fm (5 handlers)
[GIN-debug] POST   /api/create               --> github.com/yurivict/ollama/server.(*Server).CreateHandler-fm (5 handlers)
[GIN-debug] POST   /api/blobs/:digest        --> github.com/yurivict/ollama/server.(*Server).CreateBlobHandler-fm (5 handlers)
[GIN-debug] HEAD   /api/blobs/:digest        --> github.com/yurivict/ollama/server.(*Server).HeadBlobHandler-fm (5 handlers)
[GIN-debug] POST   /api/copy                 --> github.com/yurivict/ollama/server.(*Server).CopyHandler-fm (5 handlers)
[GIN-debug] POST   /api/experimental/web_search --> github.com/yurivict/ollama/server.(*Server).WebSearchExperimentalHandler-fm (5 handlers)
[GIN-debug] POST   /api/experimental/web_fetch --> github.com/yurivict/ollama/server.(*Server).WebFetchExperimentalHandler-fm (5 handlers)
[GIN-debug] GET    /api/ps                   --> github.com/yurivict/ollama/server.(*Server).PsHandler-fm (5 handlers)
[GIN-debug] POST   /api/generate             --> github.com/yurivict/ollama/server.(*Server).GenerateHandler-fm (5 handlers)
[GIN-debug] POST   /api/chat                 --> github.com/yurivict/ollama/server.(*Server).ChatHandler-fm (5 handlers)
[GIN-debug] POST   /api/embed                --> github.com/yurivict/ollama/server.(*Server).EmbedHandler-fm (5 handlers)
[GIN-debug] POST   /api/embeddings           --> github.com/yurivict/ollama/server.(*Server).EmbeddingsHandler-fm (5 handlers)
[GIN-debug] POST   /v1/chat/completions      --> github.com/yurivict/ollama/server.(*Server).ChatHandler-fm (7 handlers)
[GIN-debug] POST   /v1/completions           --> github.com/yurivict/ollama/server.(*Server).GenerateHandler-fm (7 handlers)
[GIN-debug] POST   /v1/embeddings            --> github.com/yurivict/ollama/server.(*Server).EmbedHandler-fm (7 handlers)
[GIN-debug] GET    /v1/models                --> github.com/yurivict/ollama/server.(*Server).ListHandler-fm (6 handlers)
[GIN-debug] GET    /v1/models/:model         --> github.com/yurivict/ollama/server.(*Server).ShowHandler-fm (7 handlers)
[GIN-debug] POST   /v1/responses             --> github.com/yurivict/ollama/server.(*Server).ChatHandler-fm (7 handlers)
[GIN-debug] POST   /v1/images/generations    --> github.com/yurivict/ollama/server.(*Server).GenerateHandler-fm (7 handlers)
[GIN-debug] POST   /v1/images/edits          --> github.com/yurivict/ollama/server.(*Server).GenerateHandler-fm (7 handlers)
[GIN-debug] POST   /v1/messages              --> github.com/yurivict/ollama/server.(*Server).ChatHandler-fm (7 handlers)
time=2026-03-28T02:39:41.222-07:00 level=INFO source=routes.go:1798 msg="Listening on 127.0.0.1:11434 (version 0.18.3)"
time=2026-03-28T02:39:41.222-07:00 level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2026-03-28T02:39:41.223-07:00 level=INFO source=server.go:432 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 24959"
time=2026-03-28T02:39:41.243-07:00 level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="61.5 GiB" available="21.8 GiB"
time=2026-03-28T02:39:41.243-07:00 level=INFO source=routes.go:1848 msg="vram-based default context" total_vram="0 B" default_num_ctx=4096
[GIN] 2026/03/28 - 02:40:20 | 200 |     128.171µs |       127.0.0.1 | HEAD     "/"
[GIN] 2026/03/28 - 02:40:20 | 200 |   40.320913ms |       127.0.0.1 | POST     "/api/show"
time=2026-03-28T02:40:20.353-07:00 level=INFO source=sched.go:484 msg="system memory" total="61.5 GiB" free="21.7 GiB" free_swap="66.0 GiB"
time=2026-03-28T02:40:20.355-07:00 level=INFO source=server.go:173 msg="starting mlx runner subprocess" model=x/z-image-turbo:latest port=46122
time=2026-03-28T02:40:20.358-07:00 level=INFO source=sched.go:561 msg="loaded runners" count=1
time=2026-03-28T02:40:20.368-07:00 level=WARN source=server.go:166 msg=mlx-runner msg="time=2026-03-28T02:40:20.368-07:00 level=INFO msg=\"starting mlx runner\" model=x/z-image-turbo:latest port=46122 mode=imagegen"
time=2026-03-28T02:40:20.368-07:00 level=WARN source=server.go:166 msg=mlx-runner msg="time=2026-03-28T02:40:20.368-07:00 level=INFO msg=\"MLX library initialized\""
time=2026-03-28T02:40:20.372-07:00 level=WARN source=server.go:166 msg=mlx-runner msg="time=2026-03-28T02:40:20.372-07:00 level=INFO msg=\"detected image model type\" type=ZImagePipeline"
time=2026-03-28T02:40:20.372-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="Loading Z-Image model from manifest: x/z-image-turbo:latest..."
time=2026-03-28T02:40:20.568-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Loading tokenizer... ✓"
time=2026-03-28T02:40:29.401-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Loading text encoder... ✓"
time=2026-03-28T02:40:45.960-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  (11.3 GB, peak 12.9 GB)"
time=2026-03-28T02:41:03.228-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Loading transformer... ✓"
time=2026-03-28T02:41:45.435-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  (41.4 GB, peak 42.9 GB)"
time=2026-03-28T02:41:45.788-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Loading conv_in... ✓"
time=2026-03-28T02:41:45.788-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Loading mid block... ✓"
time=2026-03-28T02:41:45.788-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Loading up blocks... ✓ [4 blocks]"
time=2026-03-28T02:41:45.788-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Loading conv_norm_out... ✓"
time=2026-03-28T02:41:45.788-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Loading conv_out... ✓"
time=2026-03-28T02:41:45.789-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  (41.5 GB, peak 42.9 GB)"
time=2026-03-28T02:41:45.789-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Loaded in 85.42s (41.5 GB VRAM)"
time=2026-03-28T02:41:45.790-07:00 level=WARN source=server.go:166 msg=mlx-runner msg="time=2026-03-28T02:41:45.790-07:00 level=INFO msg=\"mlx runner listening\" addr=127.0.0.1:46122"
time=2026-03-28T02:41:45.862-07:00 level=INFO source=server.go:234 msg="mlx runner is ready" port=46122
time=2026-03-28T02:42:10.261-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  TeaCache enabled: threshold=0.15"
time=2026-03-28T02:51:38.117-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Step 1/9: t=1.0000 (567.86s) [41.5 GB active, 45.8 GB peak]"
time=2026-03-28T02:51:38.117-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="    [TeaCache: reusing cached output]"
time=2026-03-28T02:51:38.118-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Step 2/9: t=0.9619 (0.00s) [41.5 GB active, 45.8 GB peak]"
time=2026-03-28T02:51:38.118-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="    [TeaCache: reusing cached output]"
time=2026-03-28T02:51:38.118-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Step 3/9: t=0.9170 (0.00s) [41.5 GB active, 45.8 GB peak]"
time=2026-03-28T03:00:13.237-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Step 4/9: t=0.8633 (515.12s) [41.5 GB active, 45.8 GB peak]"
time=2026-03-28T03:00:13.237-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="    [TeaCache: reusing cached output]"
time=2026-03-28T03:00:13.237-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Step 5/9: t=0.7979 (0.00s) [41.5 GB active, 45.8 GB peak]"
time=2026-03-28T03:09:18.034-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Step 6/9: t=0.7164 (544.80s) [41.5 GB active, 45.8 GB peak]"
time=2026-03-28T03:09:18.034-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="    [TeaCache: reusing cached output]"
time=2026-03-28T03:09:18.035-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Step 7/9: t=0.6123 (0.00s) [41.5 GB active, 45.8 GB peak]"
time=2026-03-28T03:16:58.111-07:00 level=INFO source=server.go:159 msg=mlx-runner msg="  Step 8/9: t=0.4743 (460.08s) [41.5 GB active, 45.8 GB peak]"
time=2026-03-28T03:23:41.316-07:00 level=ERROR source=server.go:346 msg="mlx scanner error" error="unexpected EOF"
[GIN] 2026/03/28 - 03:23:41 | 200 |        43m21s |       127.0.0.1 | POST     "/api/generate"
time=2026-03-28T03:28:41.647-07:00 level=INFO source=server.go:365 msg="stopping mlx runner subprocess" pid=19769

The client exited with exit code 0 (success) without printing any message, which is also a problem.

Reference: github-starred/ollama#9678