[GH-ISSUE #14600] Stream Error with Codex: stream closed before response.complete #35224

Closed
opened 2026-04-22 19:36:30 -05:00 by GiteaMirror · 8 comments

Originally created by @chigkim on GitHub (Mar 3, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14600

What is the issue?

When I try to use Codex with Ollama, I often get this error:
stream error: stream disconnected before completion: stream closed before response.complete
My codex-cli version is 0.104.0.
I have also set stream_idle_timeout_ms = 600000 in Codex.
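
The error text suggests the client saw the event stream end without a terminal event. A minimal sketch for checking a captured stream, assuming the body uses standard server-sent-event framing with `event:` lines and that the terminal event is named `response.completed` (both are assumptions, as are the helper names; adjust the constant to whatever the client actually waits for):

```python
from typing import Iterable, Optional

# Assumption: name of the event that marks a finished response.
TERMINAL_EVENT = "response.completed"


def last_event_type(sse_lines: Iterable[str]) -> Optional[str]:
    """Return the value of the last `event:` field in the stream, or None."""
    last = None
    for raw in sse_lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            last = line[len("event:"):].strip()
    return last


def stream_completed(sse_lines: Iterable[str], terminal: str = TERMINAL_EVENT) -> bool:
    """True only if the final event seen in the stream is the terminal one."""
    return last_event_type(sse_lines) == terminal


if __name__ == "__main__":
    # Hypothetical captured stream that stops before the terminal event.
    truncated = [
        "event: response.created",
        "data: {}",
        "",
        "event: response.output_item.added",
        "data: {}",
    ]
    print("completed:", stream_completed(truncated))
```

Running this over the raw body of a failing request would show whether the server closed the stream early or the client misread a complete one.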

Relevant log output

[GIN] 2026/03/03 - 18:37:21 | 200 | 24.921095417s |  192.168.99.177 | POST     "/v1/responses"
time=2026-03-03T18:37:21.878-05:00 level=DEBUG source=sched.go:587 msg="context for request finished"
time=2026-03-03T18:37:21.878-05:00 level=DEBUG source=sched.go:338 msg="runner with non-zero duration has gone idle, adding timer" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 duration=5m0s
time=2026-03-03T18:37:21.879-05:00 level=DEBUG source=sched.go:356 msg="after processing request finished event" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 refCount=0
time=2026-03-03T18:37:22.021-05:00 level=DEBUG source=sched.go:736 msg="evaluating already loaded" model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0
time=2026-03-03T18:37:22.029-05:00 level=DEBUG source=server.go:1537 msg="completion request" images=0 prompt=38778 format=""
time=2026-03-03T18:37:22.037-05:00 level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=8751 prompt=8861 used=8604 remaining=257
time=2026-03-03T18:37:25.055-05:00 level=WARN source=qwen3.go:108 msg="qwen3 tool call parsing failed" error="failed to parse JSON: unexpected end of JSON input"
[GIN] 2026/03/03 - 18:37:25 | 200 |  3.143320333s |  192.168.99.177 | POST     "/v1/responses"
time=2026-03-03T18:37:25.055-05:00 level=DEBUG source=sched.go:433 msg="context for request finished" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000
time=2026-03-03T18:37:25.055-05:00 level=DEBUG source=sched.go:338 msg="runner with non-zero duration has gone idle, adding timer" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 duration=5m0s
time=2026-03-03T18:37:25.055-05:00 level=DEBUG source=sched.go:356 msg="after processing request finished event" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 refCount=0
time=2026-03-03T18:37:25.480-05:00 level=DEBUG source=sched.go:736 msg="evaluating already loaded" model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0
time=2026-03-03T18:37:25.488-05:00 level=DEBUG source=server.go:1537 msg="completion request" images=0 prompt=38778 format=""
time=2026-03-03T18:37:25.500-05:00 level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=8940 prompt=8861 used=8604 remaining=257
time=2026-03-03T18:37:28.155-05:00 level=WARN source=qwen3.go:108 msg="qwen3 tool call parsing failed" error="failed to parse JSON: unexpected end of JSON input"
[GIN] 2026/03/03 - 18:37:28 | 200 |  2.839764792s |  192.168.99.177 | POST     "/v1/responses"
time=2026-03-03T18:37:28.155-05:00 level=DEBUG source=sched.go:433 msg="context for request finished" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000
time=2026-03-03T18:37:28.156-05:00 level=DEBUG source=sched.go:338 msg="runner with non-zero duration has gone idle, adding timer" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 duration=5m0s
time=2026-03-03T18:37:28.156-05:00 level=DEBUG source=sched.go:356 msg="after processing request finished event" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 refCount=0
time=2026-03-03T18:37:28.944-05:00 level=DEBUG source=sched.go:736 msg="evaluating already loaded" model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0
time=2026-03-03T18:37:28.952-05:00 level=DEBUG source=server.go:1537 msg="completion request" images=0 prompt=38778 format=""
time=2026-03-03T18:37:28.960-05:00 level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=8929 prompt=8861 used=8604 remaining=257
time=2026-03-03T18:37:31.713-05:00 level=WARN source=qwen3.go:108 msg="qwen3 tool call parsing failed" error="failed to parse JSON: invalid character ']' after object key:value pair"
[GIN] 2026/03/03 - 18:37:31 | 200 |  2.917049959s |  192.168.99.177 | POST     "/v1/responses"
time=2026-03-03T18:37:31.713-05:00 level=DEBUG source=sched.go:433 msg="context for request finished" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000
time=2026-03-03T18:37:31.713-05:00 level=DEBUG source=sched.go:338 msg="runner with non-zero duration has gone idle, adding timer" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 duration=5m0s
time=2026-03-03T18:37:31.713-05:00 level=DEBUG source=sched.go:356 msg="after processing request finished event" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 refCount=0
[GIN] 2026/03/03 - 18:37:31 | 200 |        36.5µs |       127.0.0.1 | HEAD     "/"
[GIN] 2026/03/03 - 18:37:31 | 200 |      42.792µs |       127.0.0.1 | GET      "/api/ps"
time=2026-03-03T18:37:32.668-05:00 level=DEBUG source=sched.go:736 msg="evaluating already loaded" model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0
time=2026-03-03T18:37:32.677-05:00 level=DEBUG source=server.go:1537 msg="completion request" images=0 prompt=38778 format=""
time=2026-03-03T18:37:32.685-05:00 level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=8933 prompt=8861 used=8604 remaining=257
time=2026-03-03T18:37:35.804-05:00 level=WARN source=qwen3.go:108 msg="qwen3 tool call parsing failed" error="failed to parse JSON: unexpected end of JSON input"
[GIN] 2026/03/03 - 18:37:35 | 200 |     3.286773s |  192.168.99.177 | POST     "/v1/responses"
time=2026-03-03T18:37:35.804-05:00 level=DEBUG source=sched.go:433 msg="context for request finished" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000
time=2026-03-03T18:37:35.804-05:00 level=DEBUG source=sched.go:338 msg="runner with non-zero duration has gone idle, adding timer" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 duration=5m0s
time=2026-03-03T18:37:35.804-05:00 level=DEBUG source=sched.go:356 msg="after processing request finished event" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 refCount=0
time=2026-03-03T18:37:37.547-05:00 level=DEBUG source=sched.go:736 msg="evaluating already loaded" model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0
time=2026-03-03T18:37:37.556-05:00 level=DEBUG source=server.go:1537 msg="completion request" images=0 prompt=38778 format=""
time=2026-03-03T18:37:37.564-05:00 level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=8943 prompt=8861 used=8604 remaining=257
time=2026-03-03T18:37:40.452-05:00 level=WARN source=qwen3.go:108 msg="qwen3 tool call parsing failed" error="failed to parse JSON: unexpected end of JSON input"
[GIN] 2026/03/03 - 18:37:40 | 200 |  3.054015084s |  192.168.99.177 | POST     "/v1/responses"
time=2026-03-03T18:37:40.453-05:00 level=DEBUG source=sched.go:433 msg="context for request finished" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000
time=2026-03-03T18:37:40.453-05:00 level=DEBUG source=sched.go:338 msg="runner with non-zero duration has gone idle, adding timer" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 duration=5m0s
time=2026-03-03T18:37:40.453-05:00 level=DEBUG source=sched.go:356 msg="after processing request finished event" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 refCount=0
time=2026-03-03T18:37:44.217-05:00 level=DEBUG source=sched.go:736 msg="evaluating already loaded" model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0
time=2026-03-03T18:37:44.226-05:00 level=DEBUG source=server.go:1537 msg="completion request" images=0 prompt=38778 format=""
time=2026-03-03T18:37:44.234-05:00 level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=8936 prompt=8861 used=8604 remaining=257
time=2026-03-03T18:37:47.252-05:00 level=WARN source=qwen3.go:108 msg="qwen3 tool call parsing failed" error="failed to parse JSON: unexpected end of JSON input"
[GIN] 2026/03/03 - 18:37:47 | 200 |  3.195688125s |  192.168.99.177 | POST     "/v1/responses"
time=2026-03-03T18:37:47.252-05:00 level=DEBUG source=sched.go:433 msg="context for request finished" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000
time=2026-03-03T18:37:47.252-05:00 level=DEBUG source=sched.go:338 msg="runner with non-zero duration has gone idle, adding timer" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 duration=5m0s
time=2026-03-03T18:37:47.252-05:00 level=DEBUG source=sched.go:356 msg="after processing request finished event" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 refCount=0
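
To see how often the parse warning lines up with a request, the log can be summarized with a short script. This is only a sketch: it assumes the log format shown above, treats a warning as belonging to the next `POST "/v1/responses"` line, and the function and key names are made up for illustration.

```python
import re
from collections import Counter

# Matches the tool-call parse warning and captures its error text.
WARN_RE = re.compile(r'msg="qwen3 tool call parsing failed" error="(?P<error>[^"]*)"')
# Matches GIN access-log lines for POST /v1/responses only.
GIN_RE = re.compile(r'^\[GIN\].*\|\s*POST\s+"/v1/responses"')


def summarize(log_lines):
    """Count /v1/responses requests and those preceded by a parse warning."""
    failures = Counter()   # error text -> number of warnings
    requests = 0           # POST /v1/responses access-log lines
    flagged = 0            # requests that follow at least one warning
    pending_warn = False
    for line in log_lines:
        match = WARN_RE.search(line)
        if match:
            failures[match.group("error")] += 1
            pending_warn = True
            continue
        if GIN_RE.search(line):
            requests += 1
            if pending_warn:
                flagged += 1
            pending_warn = False
    return {"requests": requests, "flagged": flagged, "failures": failures}


if __name__ == "__main__":
    import sys

    # Usage: python summarize_log.py < server.log
    result = summarize(sys.stdin)
    print(result["requests"], "requests,", result["flagged"], "after a parse warning")
    for error, count in result["failures"].most_common():
        print(f"{count:3d}  {error}")
```

On the excerpt above this should count 7 requests to /v1/responses, 6 of them following a warning (5 "unexpected end of JSON input", 1 "invalid character ']'"). All of them are logged with status 200, which is consistent with, though not proof of, the server treating the request as finished while the client never receives a completion event.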

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.17.5

GiteaMirror added the bug label 2026-04-22 19:36:30 -05:00

@rick-github commented on GitHub (Mar 3, 2026):

Server logs (https://docs.ollama.com/troubleshooting) will aid in debugging.

@chigkim commented on GitHub (Mar 3, 2026):

Sorry @rick-github, here it is.
I'm not sure, but maybe it has something to do with the "failed to parse JSON" warnings?

[GIN] 2026/03/03 - 18:37:21 | 200 | 24.921095417s |  192.168.99.177 | POST     "/v1/responses"
time=2026-03-03T18:37:21.878-05:00 level=DEBUG source=sched.go:587 msg="context for request finished"
time=2026-03-03T18:37:21.878-05:00 level=DEBUG source=sched.go:338 msg="runner with non-zero duration has gone idle, adding timer" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 duration=5m0s
time=2026-03-03T18:37:21.879-05:00 level=DEBUG source=sched.go:356 msg="after processing request finished event" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 refCount=0
time=2026-03-03T18:37:22.021-05:00 level=DEBUG source=sched.go:736 msg="evaluating already loaded" model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0
time=2026-03-03T18:37:22.029-05:00 level=DEBUG source=server.go:1537 msg="completion request" images=0 prompt=38778 format=""
time=2026-03-03T18:37:22.037-05:00 level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=8751 prompt=8861 used=8604 remaining=257
time=2026-03-03T18:37:25.055-05:00 level=WARN source=qwen3.go:108 msg="qwen3 tool call parsing failed" error="failed to parse JSON: unexpected end of JSON input"
[GIN] 2026/03/03 - 18:37:25 | 200 |  3.143320333s |  192.168.99.177 | POST     "/v1/responses"
time=2026-03-03T18:37:25.055-05:00 level=DEBUG source=sched.go:433 msg="context for request finished" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000
time=2026-03-03T18:37:25.055-05:00 level=DEBUG source=sched.go:338 msg="runner with non-zero duration has gone idle, adding timer" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 duration=5m0s
time=2026-03-03T18:37:25.055-05:00 level=DEBUG source=sched.go:356 msg="after processing request finished event" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 refCount=0
time=2026-03-03T18:37:25.480-05:00 level=DEBUG source=sched.go:736 msg="evaluating already loaded" model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0
time=2026-03-03T18:37:25.488-05:00 level=DEBUG source=server.go:1537 msg="completion request" images=0 prompt=38778 format=""
time=2026-03-03T18:37:25.500-05:00 level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=8940 prompt=8861 used=8604 remaining=257
time=2026-03-03T18:37:28.155-05:00 level=WARN source=qwen3.go:108 msg="qwen3 tool call parsing failed" error="failed to parse JSON: unexpected end of JSON input"
[GIN] 2026/03/03 - 18:37:28 | 200 |  2.839764792s |  192.168.99.177 | POST     "/v1/responses"
time=2026-03-03T18:37:28.155-05:00 level=DEBUG source=sched.go:433 msg="context for request finished" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000
time=2026-03-03T18:37:28.156-05:00 level=DEBUG source=sched.go:338 msg="runner with non-zero duration has gone idle, adding timer" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 duration=5m0s
time=2026-03-03T18:37:28.156-05:00 level=DEBUG source=sched.go:356 msg="after processing request finished event" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 refCount=0
time=2026-03-03T18:37:28.944-05:00 level=DEBUG source=sched.go:736 msg="evaluating already loaded" model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0
time=2026-03-03T18:37:28.952-05:00 level=DEBUG source=server.go:1537 msg="completion request" images=0 prompt=38778 format=""
time=2026-03-03T18:37:28.960-05:00 level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=8929 prompt=8861 used=8604 remaining=257
time=2026-03-03T18:37:31.713-05:00 level=WARN source=qwen3.go:108 msg="qwen3 tool call parsing failed" error="failed to parse JSON: invalid character ']' after object key:value pair"
[GIN] 2026/03/03 - 18:37:31 | 200 |  2.917049959s |  192.168.99.177 | POST     "/v1/responses"
time=2026-03-03T18:37:31.713-05:00 level=DEBUG source=sched.go:433 msg="context for request finished" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000
time=2026-03-03T18:37:31.713-05:00 level=DEBUG source=sched.go:338 msg="runner with non-zero duration has gone idle, adding timer" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 duration=5m0s
time=2026-03-03T18:37:31.713-05:00 level=DEBUG source=sched.go:356 msg="after processing request finished event" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 refCount=0
[GIN] 2026/03/03 - 18:37:31 | 200 |        36.5µs |       127.0.0.1 | HEAD     "/"
[GIN] 2026/03/03 - 18:37:31 | 200 |      42.792µs |       127.0.0.1 | GET      "/api/ps"
time=2026-03-03T18:37:32.668-05:00 level=DEBUG source=sched.go:736 msg="evaluating already loaded" model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0
time=2026-03-03T18:37:32.677-05:00 level=DEBUG source=server.go:1537 msg="completion request" images=0 prompt=38778 format=""
time=2026-03-03T18:37:32.685-05:00 level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=8933 prompt=8861 used=8604 remaining=257
time=2026-03-03T18:37:35.804-05:00 level=WARN source=qwen3.go:108 msg="qwen3 tool call parsing failed" error="failed to parse JSON: unexpected end of JSON input"
[GIN] 2026/03/03 - 18:37:35 | 200 |     3.286773s |  192.168.99.177 | POST     "/v1/responses"
time=2026-03-03T18:37:35.804-05:00 level=DEBUG source=sched.go:433 msg="context for request finished" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000
time=2026-03-03T18:37:35.804-05:00 level=DEBUG source=sched.go:338 msg="runner with non-zero duration has gone idle, adding timer" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 duration=5m0s
time=2026-03-03T18:37:35.804-05:00 level=DEBUG source=sched.go:356 msg="after processing request finished event" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 refCount=0
time=2026-03-03T18:37:37.547-05:00 level=DEBUG source=sched.go:736 msg="evaluating already loaded" model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0
time=2026-03-03T18:37:37.556-05:00 level=DEBUG source=server.go:1537 msg="completion request" images=0 prompt=38778 format=""
time=2026-03-03T18:37:37.564-05:00 level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=8943 prompt=8861 used=8604 remaining=257
time=2026-03-03T18:37:40.452-05:00 level=WARN source=qwen3.go:108 msg="qwen3 tool call parsing failed" error="failed to parse JSON: unexpected end of JSON input"
[GIN] 2026/03/03 - 18:37:40 | 200 |  3.054015084s |  192.168.99.177 | POST     "/v1/responses"
time=2026-03-03T18:37:40.453-05:00 level=DEBUG source=sched.go:433 msg="context for request finished" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000
time=2026-03-03T18:37:40.453-05:00 level=DEBUG source=sched.go:338 msg="runner with non-zero duration has gone idle, adding timer" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 duration=5m0s
time=2026-03-03T18:37:40.453-05:00 level=DEBUG source=sched.go:356 msg="after processing request finished event" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 refCount=0
time=2026-03-03T18:37:44.217-05:00 level=DEBUG source=sched.go:736 msg="evaluating already loaded" model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0
time=2026-03-03T18:37:44.226-05:00 level=DEBUG source=server.go:1537 msg="completion request" images=0 prompt=38778 format=""
time=2026-03-03T18:37:44.234-05:00 level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=8936 prompt=8861 used=8604 remaining=257
time=2026-03-03T18:37:47.252-05:00 level=WARN source=qwen3.go:108 msg="qwen3 tool call parsing failed" error="failed to parse JSON: unexpected end of JSON input"
[GIN] 2026/03/03 - 18:37:47 | 200 |  3.195688125s |  192.168.99.177 | POST     "/v1/responses"
time=2026-03-03T18:37:47.252-05:00 level=DEBUG source=sched.go:433 msg="context for request finished" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000
time=2026-03-03T18:37:47.252-05:00 level=DEBUG source=sched.go:338 msg="runner with non-zero duration has gone idle, adding timer" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 duration=5m0s
time=2026-03-03T18:37:47.252-05:00 level=DEBUG source=sched.go:356 msg="after processing request finished event" runner.name=registry.ollama.ai/library/qwen3.5:35b-a3b-q8_0 runner.inference="[{ID:0 Library:Metal}]" runner.size="39.8 GiB" runner.vram="39.8 GiB" runner.parallel=1 runner.pid=38713 runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 refCount=0

@ParthSareen commented on GitHub (Mar 4, 2026):

Seems like you're running with a low context length (8k?). Models will often fail when you don't allocate enough context.

`time=2026-03-03T18:37:22.029-05:00 level=DEBUG source=server.go:1537 msg="completion request" images=0 prompt=38778 format=""`

`time=2026-03-03T18:37:22.037-05:00 level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=8751 prompt=8861 used=8604 remaining=257`

The initial prompt itself is ~40 000 tokens. I'd recommend bumping up context length if possible or switching to a cloud model.


@chigkim commented on GitHub (Mar 4, 2026):

No, I'm running at 64k.
`runner.model=/Users/cgk/.ollama/models/blobs/sha256-acd3c29c18f07df11b02809f1787803dbf0ba97abcd16c26e38b75168fce79e0 runner.num_ctx=64000 duration=5m0s`


@rick-github commented on GitHub (Mar 4, 2026):

Context is 64 000. The initial prompt is ~40 000 characters, which is converted to 8861 tokens. The cache has 8604 tokens from the previous generation so only 257 new tokens are added to the cache slot.
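
The arithmetic above can be checked directly against the log line. The following is a minimal Python sketch, not part of Ollama: the helper name `parse_cache_slot` is hypothetical, and reading `prompt - used` as the number of newly evaluated tokens is an assumption based on the explanation in this comment.

```python
import re

# Log line copied from this thread. The field names (cache, prompt, used,
# remaining) come from the log; their interpretation is an assumption.
LINE = (
    'time=2026-03-03T18:37:22.037-05:00 level=DEBUG source=cache.go:151 '
    'msg="loading cache slot" id=0 cache=8751 prompt=8861 used=8604 remaining=257'
)


def parse_cache_slot(line: str) -> dict[str, int]:
    """Extract the integer key=value fields from a "loading cache slot" line."""
    pairs = re.findall(r"\b(id|cache|prompt|used|remaining)=(\d+)", line)
    return {key: int(value) for key, value in pairs}


slot = parse_cache_slot(LINE)

# Tokens that still have to be evaluated: prompt tokens minus reused cache tokens.
new_tokens = slot["prompt"] - slot["used"]

print(slot)
print("new tokens to evaluate:", new_tokens)
print("matches 'remaining' field:", new_tokens == slot["remaining"])
```

With the values from the log, 8861 prompt tokens minus 8604 reused tokens leaves 257, which is the `remaining` value and far below the 64000-token context.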

The failure is that the tool call generated by the model is incomplete. There are no entries about shifting, so it seems like the model is just generating poor output, maybe influenced by the instructions from codex. Would it be possible to try a different integration to see if the problem persists?
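
To illustrate what an incomplete tool call looks like to a JSON parser, here is a small Python sketch. It is not Ollama's qwen3 parser (that code is Go, and its error messages differ from Python's); the function `try_parse_tool_call`, the tool name, and the arguments are all invented for illustration.

```python
import json


def try_parse_tool_call(raw: str):
    """Return (parsed, None) on success or (None, error message) on failure."""
    try:
        return json.loads(raw), None
    except json.JSONDecodeError as exc:
        return None, f"failed to parse JSON: {exc.msg}"


# Hypothetical tool-call payloads, not taken from the actual model output.
samples = {
    "complete": '{"name": "read_file", "arguments": {"path": "main.go"}}',
    # Generation stopped before the closing braces were emitted.
    "truncated": '{"name": "read_file", "arguments": {"path": "main.go"',
    # A stray ']' where '}' was expected.
    "mismatched": '{"name": "read_file", "arguments": {"path": "main.go"]}',
}

results = {label: try_parse_tool_call(raw) for label, raw in samples.items()}

for label, (parsed, err) in results.items():
    print(label, "->", "ok" if err is None else err)
```

The truncated and mismatched payloads correspond loosely to the two kinds of warning in the log above ("unexpected end of JSON input" and "invalid character ']'"); in both cases the server has no valid tool call to return.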


@ParthSareen commented on GitHub (Mar 4, 2026):

We also did make updates to the qwen3 parser – wondering if it is that. Have you noticed a certain tool call causing this? Edits? Read file? Etc.


@chigkim commented on GitHub (Mar 4, 2026):

It seems to be specific to qwen3.5. I just swapped in gpt-oss-20b with the exact same setup, and it works fine. I don't see any errors.


@chigkim commented on GitHub (Mar 4, 2026):

Since updating to v0.17.6, I haven't seen the error.
Maybe #14605 fixed it.


Reference: github-starred/ollama#35224