[GH-ISSUE #12064] Tool call parsing errors #70071

Open
opened 2026-05-04 20:15:57 -05:00 by GiteaMirror · 27 comments
Owner

Originally created by @lefoulkrod on GitHub (Aug 25, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12064

Originally assigned to: @drifkin, @jmorganca, @ParthSareen on GitHub.

What is the issue?

ollama 0.11.6
gpt-oss:120b

I get frequent errors from Ollama: `ollama._types.ResponseError: error parsing tool call`. These seem to occur mostly when a tool call is being generated for a `write_file` tool that I have.

Here is a truncated console log from my application:

    ERROR: agents.ollama.sdk.tool_loop: Unhandled exception in tool loop
      Message: error parsing tool call
      Raw: {"path":"src/pipe.py","content":"...[truncated]...","encoding":"utf-8"}
      Error: invalid character ']' after object key:value pair
      Status Code: 500
    Traceback:
      File "/home/larry/repos/computron_9000/agents/ollama/sdk/tool_loop.py", line 80, in run_tool_call_loop
        response = await client.chat(...)
      File "/home/larry/repos/computron_9000/.venv/lib/python3.12/site-packages/ollama/_client.py", line 854, in chat
        return await self._request(...)
      File "/home/larry/repos/computron_9000/.venv/lib/python3.12/site-packages/ollama/_client.py", line 692, in _request
        return cls(**(await self._request_raw(...)).json())
      File "/home/larry/repos/computron_9000/.venv/lib/python3.12/site-packages/ollama/_client.py", line 636, in _request_raw
        raise ResponseError(e.response.text, e.response.status_code)
    ollama._types.ResponseError: error parsing tool call: invalid character ']' after object key:value pair (status code: 500)

This is the non-truncated portion of the error that shows the invalid JSON syntax being returned for the tool call (note the stray `]` between the closing quote of `content` and `"encoding"`):

    return self.pipes\n\n def draw(self, surface: pygame.Surface) -> None:\n """Draw all active pipes onto the supplied surface.\n\n Parameters\n ----------\n surface : pygame.Surface\n The surface to render the pipes onto.\n """\n\n for pipe in self.pipes:\n pipe.draw(surface)\n"], "encoding":"utf-8"}', err=invalid character ']' after object key:value pair (status code: 500)
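Until the parser is fixed server-side, callers can at least contain this failure. A minimal sketch of a retry guard around the Python client, assuming only the public `ollama` API (the retry policy itself is an illustration, not part of this report):

```python
from ollama import AsyncClient, ResponseError


async def chat_with_retry(client: AsyncClient, attempts: int = 3, **chat_kwargs):
    """Retry client.chat() when Ollama 500s on a malformed model tool call."""
    for _ in range(attempts):
        try:
            return await client.chat(**chat_kwargs)
        except ResponseError as e:
            # The malformed tool call surfaces as a 500 with this message prefix.
            if e.status_code == 500 and "error parsing tool call" in str(e):
                continue  # re-ask; resampling often yields parseable JSON
            raise
    raise RuntimeError("model kept emitting unparseable tool calls")
```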

Relevant log output


OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.11.6

GiteaMirror added the bug label 2026-05-04 20:15:57 -05:00
Author
Owner

@onestardao commented on GitHub (Aug 25, 2025):

this error usually means the tool-call JSON coming back isn’t valid, so the parser chokes. it’s not your code, it’s the model drifting and adding extra text around the JSON. i’ve got a checklist for this kind of parsing/formatting failure (ProblemMap No.2). do you want me to share the link?

Author
Owner

@rick-github commented on GitHub (Aug 25, 2025):

> This is the non-truncated portion of the error that shows the invalid json syntax being returned for the tool call.

Can you add the bit with the invalid JSON syntax?

Author
Owner

@lefoulkrod commented on GitHub (Aug 26, 2025):

Cleaned up the error log so it's easier to see the actual error; the failing text is not truncated:

    ERROR:agents.ollama.sdk.tool_loop:Unhandled exception in tool loop: error parsing tool call: raw='{"path":"src/pipe.py","content":"\"\"\"Pipe entity and manager for the Flappy Bird style game.\n\nThis module defines two classes:\n\n* :class:`Pipe` – Represents a pair of top and bottom pipe sections with a\n configurable vertical gap.\n* :class:`PipeManager` – Handles spawning, updating and rendering of multiple\n :class:`Pipe` instances.\n\nAll behaviour is driven by constants defined in :mod:`src.config`.\n\"\"\"\n\nfrom __future__ import annotations\n\nimport random\nfrom typing import List\n\nimport pygame\n\nfrom src.config import (\n PIPE_COLOR,\n PIPE_GAP_MAX,\n PIPE_GAP_MIN,\n PIPE_SPEED,\n PIPE_SPAWN_INTERVAL,\n PIPE_WIDTH,\n SCREEN_HEIGHT,\n SCREEN_WIDTH,\n)\n\n\nclass Pipe:\n \"\"\"A pair of pipe rectangles with a vertical gap.\n\n Parameters\n ----------\n x : int\n The horizontal position (left edge) where the pipe pair is created.\n \"\"\"\n\n def __init__(self, x: int) -> None:\n # Store the x coordinate as a float for sub‑pixel movement.\n self._x: float = float(x)\n\n # Choose a random gap size between the configured minimum and maximum.\n gap_size: int = random.randint(PIPE_GAP_MIN, PIPE_GAP_MAX)\n # Ensure the gap fits inside the screen vertically.\n max_gap_top: int = SCREEN_HEIGHT - gap_size\n gap_top: int = random.randint(0, max_gap_top)\n\n # Create the top and bottom rectangles.\n self.top_rect: pygame.Rect = pygame.Rect(\n int(self._x), 0, PIPE_WIDTH, gap_top\n )\n self.bottom_rect: pygame.Rect = pygame.Rect(\n int(self._x),\n gap_top + gap_size,\n PIPE_WIDTH,\n SCREEN_HEIGHT - (gap_top + gap_size),\n )\n\n # ---------------------------------------------------------------------\n # Public API\n # ---------------------------------------------------------------------\n @property\n def rects(self) -> List[pygame.Rect]:\n \"\"\"Return a list containing the top and bottom rectangles.\n\n The list order is ``[top_rect, bottom_rect]`` which matches the\n expectation of collision‑checking code.\n \"\"\"\n\n return [self.top_rect, self.bottom_rect]\n\n def update(self, dt: float) -> None:\n \"\"\"Move the pipe leftward according to the configured speed.\n\n Parameters\n ----------\n dt : float\n Time step in seconds since the last update.\n \"\"\"\n\n # Update the stored x coordinate with sub‑pixel precision.\n self._x -= PIPE_SPEED * dt\n # Sync the pygame.Rect objects (they store integer coordinates).\n int_x = int(self._x)\n self.top_rect.x = int_x\n self.bottom_rect.x = int_x\n\n def draw(self, surface: pygame.Surface) -> None:\n \"\"\"Render both pipe sections onto the given surface.\n\n Parameters\n ----------\n surface : pygame.Surface\n The surface to draw the pipe on.\n \"\"\"\n\n pygame.draw.rect(surface, PIPE_COLOR, self.top_rect)\n pygame.draw.rect(surface, PIPE_COLOR, self.bottom_rect)\n\n\nclass PipeManager:\n \"\"\"Manage a collection of :class:`Pipe` objects.\n\n The manager is responsible for spawning new pipes at regular intervals,\n updating existing pipes, removing those that have moved off‑screen and\n delegating drawing to each pipe.\n \"\"\"\n\n def __init__(self) -> None:\n self.pipes: List[Pipe] = []\n self._time_since_last_spawn: float = 0.0\n\n def update(self, dt: float) -> List[Pipe]:\n \"\"\"Update all managed pipes and handle spawning/removal.\n\n Parameters\n ----------\n dt : float\n Time step in seconds since the last update.\n\n Returns\n -------\n List[Pipe]\n The current list of active pipes after cleanup.\n \"\"\"\n\n # Spawn new pipe if the interval has elapsed.\n self._time_since_last_spawn += dt\n while self._time_since_last_spawn >= PIPE_SPAWN_INTERVAL:\n self.pipes.append(Pipe(SCREEN_WIDTH))\n self._time_since_last_spawn -= PIPE_SPAWN_INTERVAL\n\n # Update existing pipes.\n for pipe in self.pipes:\n pipe.update(dt)\n\n # Remove pipes that have completely moved off the left side of the screen.\n self.pipes = [p for p in self.pipes if p.top_rect.right >= 0]\n return self.pipes\n\n def draw(self, surface: pygame.Surface) -> None:\n \"\"\"Draw all active pipes onto the supplied surface.\n\n Parameters\n ----------\n surface : pygame.Surface\n The surface to render the pipes onto.\n \"\"\"\n\n for pipe in self.pipes:\n pipe.draw(surface)\n"], "encoding":"utf-8"}', err=invalid character ']' after object key:value pair (status code: 500)
    Traceback (most recent call last):
      File "/home/larry/repos/computron_9000/agents/ollama/sdk/tool_loop.py", line 80, in run_tool_call_loop
        response = await client.chat(
                   ^^^^^^^^^^^^^^^^^^
      File "/home/larry/repos/computron_9000/.venv/lib/python3.12/site-packages/ollama/_client.py", line 854, in chat
        return await self._request(
               ^^^^^^^^^^^^^^^^^^^^
      File "/home/larry/repos/computron_9000/.venv/lib/python3.12/site-packages/ollama/_client.py", line 692, in _request
        return cls(**(await self._request_raw(*args, **kwargs)).json())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/larry/repos/computron_9000/.venv/lib/python3.12/site-packages/ollama/_client.py", line 636, in _request_raw
        raise ResponseError(e.response.text, e.response.status_code) from None
    ollama._types.ResponseError: error parsing tool call: raw='{"path":"src/pipe.py","content":"\"\"\"Pipe entity and manager for the Flappy Bird style game. ... [same payload as above] ..."], "encoding":"utf-8"}', err=invalid character ']' after object key:value pair (status code: 500)

Author
Owner

@onestardao commented on GitHub (Aug 26, 2025):

@lefoulkrod You posted the log, so I think you want the solution. It's generated by AI with my own WFGY method, check it

===

quick read of your new logs: this is a classic “tool-call JSON is not valid” case. the model is returning either a stringified `arguments` blob or extra prose around the JSON, so the parser chokes. this lines up with our ProblemMap No.10 (schema/format drift). sometimes No.2 (interpretation collapse) is involved if the system prompt is too wordy.

**fast fix checklist**

1. **force JSON only**
   - OpenAI-compat: set `response_format: {"type":"json_object"}` or `format: "json"` in the request.
   - keep the tool schema minimal and put the example JSON on a single line.
2. **ensure native object in `arguments`**
   - do not pass a stringified blob. remove backticks, code fences, comments. the tool wrapper should hand a plain JSON object to the runtime.
3. **pre-parse guard**
   - validate with a JSON parser before tool dispatch (a minimal sketch follows this list). if invalid, re-ask the model with a short “return only valid JSON, no prose” nudge and resend.
4. **reduce prompt noise**
   - short system prompt, no extra explanations. start with a tiny example, then scale.
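A minimal sketch of the pre-parse guard in step 3, assuming the tool arguments arrive as a string (the nudge wording and helper name are illustrative, not from this thread):

```python
import json

NUDGE = "Your last tool call was not valid JSON. Return only valid JSON, no prose."


def parse_tool_args(raw_args: str) -> dict | None:
    """Validate tool-call arguments before dispatch; None means re-ask the model."""
    try:
        args = json.loads(raw_args)
    except json.JSONDecodeError:
        return None
    # Tool arguments must be a JSON object, not a bare string or array.
    return args if isinstance(args, dict) else None
```

If this returns `None`, append the nudge as a user message and resend the request instead of dispatching the tool.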

reference + full checklist is here:
**[WFGY ProblemMap / README — No.10 schema & format drift](https://github.com/onestardao/WFGY/blob/main/ProblemMap/README.md)**

if you want, i can paste the exact No.10 step-by-step block tailored to your payload snippet. just drop one failing tool-call JSON (redact secrets) and i’ll map it line by line.

Author
Owner

@mdlmarkham commented on GitHub (Aug 31, 2025):

I'm using n8n connected to the Ollama Turbo API... keep getting "Invalid tool usage: mismatch between tool calls and tool results" even with very simple (one search tool) setup... any way around this?

Author
Owner

@onestardao commented on GitHub (Sep 1, 2025):

@mdlmarkham

quick note, hope it helps. your “mismatch between tool calls and tool results” is almost always tool-I/O schema drift. the model mixes prose with the call or returns a different count/shape than the results array.

map it to Problem Map No.6 (logic collapse & recovery) and No.2 (interpretation / format collapse). fix is simple: force strict JSON only, one tool per turn, pre-validate the JSON before dispatch, and hard-reset the turn if the shape does not match.

it is a semantic firewall you can copy into your wrapper without changing your stack. check the link above
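A minimal sketch of that shape check before sending the next request (the message layout follows the Ollama chat API; the helper itself is an assumption, not from the thread):

```python
def calls_match_results(assistant_msg: dict, followup_msgs: list[dict]) -> bool:
    """True when every tool call in the assistant turn has exactly one tool result."""
    calls = assistant_msg.get("tool_calls") or []
    results = [m for m in followup_msgs if m.get("role") == "tool"]
    return len(calls) == len(results)
```

If it returns `False`, drop the turn and re-ask rather than sending the mismatched history back to the model.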

Author
Owner

@mcr-ksh commented on GitHub (Sep 1, 2025):

@onestardao some example using your TXT OS would be great in this given context.

Author
Owner

@onestardao commented on GitHub (Sep 2, 2025):

you don’t have to over-debug this one.
fastest way: just grab my plain-text OS (TXTOS), load it in your model, then literally ask:

use WFGY to solve the tool call parsing bug in this screenshot

and paste the screenshot of your error.

the OS is written for the AI itself so it knows which problem map number you’re hitting

once you try it, the model will point you to the exact guardrail page inside the Problem Map with the step-by-step fix.

Author
Owner

@lefoulkrod commented on GitHub (Sep 6, 2025):

Possibly related to https://github.com/ollama/ollama/issues/12203

Author
Owner

@socram8888 commented on GitHub (Sep 30, 2025):

I've just stumbled upon this error. The exact request, done manually against the chat API:

{
   "model":"huihui_ai/gpt-oss-abliterated:20b",
   "messages":[
      {
         "role":"user",
         "content":"Calculate the length in words of this chunk of text. Use the count_words tool.\n\nHow tool parsing works in Ollama\nBackground\nWe\u2019ve built a new parser that focuses on understanding the structure of a tool call rather than simply looking for JSON.\n\nPreviously, when tools were passed into the model, the system had to wait until the entire output was generated and then parse it as JSON to determine whether it contained a tool call or normal content. Users had to wait for the complete generation before seeing any streamed token. This approach was reliable against malformed output, but blocked streaming because a tool call might occur at any point in the text.\n\nOllama supports a wide range of models, some trained with tool-specific tokens and some without. The parsing logic would needs to stream user content while being able to detect, suppress, and parse the tool call tokens.\n\nIncremental Parser\nThe new parser directly references each model\u2019s template to understand the prefix of the tool call. This is necessary for Ollama to understand and separate the tool calls and content.\n\nWhen a model is not directly trained on tool usage directly (trained with a prefix/tool token), it may still be able to output valid tool calls based on the sheer amount of knowledge it has. In this case, the parser is able to handle the partial prefixes output by the model and correctly separate tool calls and content.\n\nSome models also elect to output a tool call without a prefix, even though they were trained on using a prefix for calling tools. Empirically, this behavior happens at the start of a model output only. To address this, the parser can fallback to parsing JSON as a tool call when it recognizes the start of a JSON. If the JSON does not match the tool call format for the model, the JSON will be returned."
      }
   ],
   "options":{
      "num_ctx":8192
   },
   "think":true,
   "stream":true,
   "tools":[
      {
         "type":"function",
         "function":{
            "name":"count_words",
            "description":"Calculate the word count for a given text",
            "parameters":{
               "type":"object",
               "properties":{
                  "text":{
                     "type":"string",
                     "description":"The text to calculate the length of"
                  }
               },
               "required":[
                  "text"
               ]
            }
         }
      }
   ]
}
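To reproduce, the body above can be replayed against a local server. A sketch assuming a default install on port 11434 and the JSON saved as `request.json` (both are my assumptions):

```python
import json

import httpx  # any HTTP client works; httpx is used here for line streaming

with open("request.json") as f:
    body = json.load(f)

# Stream /api/chat; the final line carries either done=True or the parse error.
with httpx.stream("POST", "http://localhost:11434/api/chat", json=body, timeout=None) as r:
    for line in r.iter_lines():
        print(json.loads(line))
```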

The streaming response:

{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:48.8421465Z', 'message': {'role': 'assistant', 'content': '', 'thinking': 'We'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:48.8918969Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' need'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:48.943321Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' to'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:48.9943184Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' call'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.0466919Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' the'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.0975886Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' count'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.1477533Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '_words'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.1993845Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' tool'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.250604Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' with'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.3023189Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' the'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.3541248Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' given'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.4060203Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' text'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.4569036Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '.'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.5083889Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' The'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.5593626Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' text'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.6104598Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' is'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.6617102Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' the'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.7136654Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' chunk'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.7657388Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' starting'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.8168979Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' with'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.8693789Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' "'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.9213727Z', 'message': {'role': 'assistant', 'content': '', 'thinking': 'Calculate'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:49.972173Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' the'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.0243279Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' length'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.0762522Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' in'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.1288321Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' words'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.1823557Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' of'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.2347046Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' this'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.2891573Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' chunk'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.3417142Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' of'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.3937481Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' text'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.446075Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '.'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.4985808Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' Use'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.5516047Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' the'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.6038104Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' count'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.6554842Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '_words'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.7086585Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' tool'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.7607187Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '."'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.812525Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' Then'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.8644452Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' the'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.9170759Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' rest'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.9702597Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' of'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.0235476Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' the'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.0763046Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' chunk'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.1307694Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '.'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.1836988Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' We'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.2365574Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' must'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.2888922Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' ensure'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.3413114Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' the'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.3944374Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' text'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.447811Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' is'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.5013433Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' passed'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.5560489Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' correctly'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.6101454Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '.'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.6635471Z', 'message': {'role': 'assistant', 'content': '', 'thinking': " We'll"}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.7176046Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' produce'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.7717075Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' JSON'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.8244351Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' with'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.8785793Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' key'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.9315896Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' "'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.9912788Z', 'message': {'role': 'assistant', 'content': '', 'thinking': 'text'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.0451975Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '".'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.0992761Z', 'message': {'role': 'assistant', 'content': '', 'thinking': " Let's"}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.152886Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' count'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.2061728Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' words'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.259895Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '.'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.313304Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' We'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.3667153Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' can'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.4205601Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' do'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.4744018Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' it'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.5306094Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' manually'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.5842288Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' quickly'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.6387154Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' or'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.6917305Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' rely'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.7451629Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' on'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.7980629Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' tool'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.8552752Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '.'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.9079159Z', 'message': {'role': 'assistant', 'content': '', 'thinking': " We'll"}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.9607948Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' call'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:53.0129743Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' the'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:53.0657066Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' tool'}, 'done': False}
{'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:53.1191227Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '.'}, 'done': False}
{'error': 'error parsing tool call: raw=\'{"text":"Calculate the length in words of this chunk of text. Use the count_words tool.\\n\\nHow tool parsing works in Ollama\\nBackground\\nWe’ve built a new parser that focuses on understanding the structure of a tool call rather than simply looking for JSON.\\n\\nPreviously, when tools were passed into the model, the system had to wait until the entire output was generated and then parse it as JSON to determine whether it contained a tool call or normal content. Users had to wait for the complete generation before seeing any streamed token. This approach was reliable against malformed output, but blocked streaming because a tool call might occur at any point in the text.\\n\\nOllama supports a wide range of models, some trained with tool-specific tokens and some without. The parsing logic would needs to stream user content while being able to detect, suppress, and parse the tool call tokens.\\n\\nIncremental Parser\\nThe new parser directly references each model’s template to understand the prefix of the tool call. This is necessary for Ollama to understand and separate the tool calls and content.\\n\\nWhen a model is not directly trained on tool usage directly (trained with a prefix/tool token), it may still be able to output valid tool calls based on the sheer amount of knowledge it has. In this case, the parser is able to handle the partial prefixes output by the model and correctly separate tool calls and content.\\n\\nSome models also elect to output a tool call without a prefix, even though they were trained on using a prefix for calling tools. Empirically, this behavior happens at the start of a model output only. To address this, the parser can fallback to parsing JSON as a tool call when it recognizes the start of a JSON. If the JSON does not match the tool call format for the model, the JSON will be returned."}<|call|>commentary<|channel|>assistant<|channel|>commentary to=functions.count_words<|channel|>analysis <|constrain|>json<|message|>{"text":"Calculate the length in words of this chunk of text. Use the count_words tool.\\n\\nHow tool parsing works in Ollama\\nBackground\\nWe’ve built a new parser that focuses on understanding the structure of a tool call rather than simply looking for JSON.\\n\\nPreviously, when tools were passed into the model, the system had to wait until the entire output was generated and then parse it as JSON to determine whether it contained a tool call or normal content. Users had to wait for the complete generation before seeing any streamed token. This approach was reliable against malformed output, but blocked streaming because a tool call might occur at any point in the text.\\n\\nOllama supports a wide range of models, some trained with tool-specific tokens and some without. The parsing logic would needs to stream user content while being able to detect, suppress, and parse the tool call tokens.\\n\\nIncremental Parser\\nThe new parser directly references each model’s template to understand the prefix of the tool call. This is necessary for Ollama to understand and separate the tool calls and content.\\n\\nWhen a model is not directly trained on tool usage directly (trained with a prefix/tool token), it may still be able to output valid tool calls based on the sheer amount of knowledge it has. 
In this case, the parser is able to handle the partial prefixes output by the model and correctly separate tool calls and content.\\n\\nSome models also elect to output a tool call without a prefix, even though they were trained on using a prefix for calling tools. Empirically, this behavior happens at the start of a model output only. To address this, the parser can fallback to parsing JSON as a tool call when it recognizes the start of a JSON. If the JSON does not match the tool call format for the model, the JSON will be returned."}<|call|>commentary<|channel|>assistant<|channel|>final<|message|>**Word count:** 241 words.\', err=invalid character \'<\' after top-level value'}
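This failure mode differs from the OP's: here the arguments object itself is valid JSON, but the model leaks gpt-oss channel tokens (`<|call|>commentary<|channel|>…`) after it, and the strict parse trips on the trailing `<`. A sketch of a client-side salvage for such a payload, assuming you have extracted the `raw=` string from the error text (that extraction step is itself an assumption):

```python
import json


def salvage_leading_json(raw: str) -> dict:
    """Parse the first JSON value and ignore anything after it."""
    obj, end = json.JSONDecoder().raw_decode(raw)
    # raw[end:] holds the leaked "<|call|>..." stream that broke strict parsing.
    return obj
```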
'."'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.812525Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' Then'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.8644452Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' the'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.9170759Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' rest'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:50.9702597Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' of'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.0235476Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' the'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.0763046Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' chunk'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.1307694Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '.'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.1836988Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' We'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.2365574Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' must'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.2888922Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' ensure'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.3413114Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' the'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.3944374Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' text'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.447811Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' is'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.5013433Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' passed'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.5560489Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' correctly'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.6101454Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '.'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.6635471Z', 'message': {'role': 'assistant', 'content': '', 'thinking': " We'll"}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.7176046Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' produce'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.7717075Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' JSON'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.8244351Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' 
with'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.8785793Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' key'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.9315896Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' "'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:51.9912788Z', 'message': {'role': 'assistant', 'content': '', 'thinking': 'text'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.0451975Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '".'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.0992761Z', 'message': {'role': 'assistant', 'content': '', 'thinking': " Let's"}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.152886Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' count'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.2061728Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' words'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.259895Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '.'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.313304Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' We'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.3667153Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' can'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.4205601Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' do'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.4744018Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' it'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.5306094Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' manually'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.5842288Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' quickly'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.6387154Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' or'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.6917305Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' rely'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.7451629Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' on'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.7980629Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' tool'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.8552752Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '.'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.9079159Z', 'message': {'role': 'assistant', 'content': '', 'thinking': " We'll"}, 
'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:52.9607948Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' call'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:53.0129743Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' the'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:53.0657066Z', 'message': {'role': 'assistant', 'content': '', 'thinking': ' tool'}, 'done': False} {'model': 'huihui_ai/gpt-oss-abliterated:20b', 'created_at': '2025-09-30T22:41:53.1191227Z', 'message': {'role': 'assistant', 'content': '', 'thinking': '.'}, 'done': False} {'error': 'error parsing tool call: raw=\'{"text":"Calculate the length in words of this chunk of text. Use the count_words tool.\\n\\nHow tool parsing works in Ollama\\nBackground\\nWe’ve built a new parser that focuses on understanding the structure of a tool call rather than simply looking for JSON.\\n\\nPreviously, when tools were passed into the model, the system had to wait until the entire output was generated and then parse it as JSON to determine whether it contained a tool call or normal content. Users had to wait for the complete generation before seeing any streamed token. This approach was reliable against malformed output, but blocked streaming because a tool call might occur at any point in the text.\\n\\nOllama supports a wide range of models, some trained with tool-specific tokens and some without. The parsing logic would needs to stream user content while being able to detect, suppress, and parse the tool call tokens.\\n\\nIncremental Parser\\nThe new parser directly references each model’s template to understand the prefix of the tool call. This is necessary for Ollama to understand and separate the tool calls and content.\\n\\nWhen a model is not directly trained on tool usage directly (trained with a prefix/tool token), it may still be able to output valid tool calls based on the sheer amount of knowledge it has. In this case, the parser is able to handle the partial prefixes output by the model and correctly separate tool calls and content.\\n\\nSome models also elect to output a tool call without a prefix, even though they were trained on using a prefix for calling tools. Empirically, this behavior happens at the start of a model output only. To address this, the parser can fallback to parsing JSON as a tool call when it recognizes the start of a JSON. If the JSON does not match the tool call format for the model, the JSON will be returned."}<|call|>commentary<|channel|>assistant<|channel|>commentary to=functions.count_words<|channel|>analysis <|constrain|>json<|message|>{"text":"Calculate the length in words of this chunk of text. Use the count_words tool.\\n\\nHow tool parsing works in Ollama\\nBackground\\nWe’ve built a new parser that focuses on understanding the structure of a tool call rather than simply looking for JSON.\\n\\nPreviously, when tools were passed into the model, the system had to wait until the entire output was generated and then parse it as JSON to determine whether it contained a tool call or normal content. Users had to wait for the complete generation before seeing any streamed token. This approach was reliable against malformed output, but blocked streaming because a tool call might occur at any point in the text.\\n\\nOllama supports a wide range of models, some trained with tool-specific tokens and some without. 
The parsing logic would needs to stream user content while being able to detect, suppress, and parse the tool call tokens.\\n\\nIncremental Parser\\nThe new parser directly references each model’s template to understand the prefix of the tool call. This is necessary for Ollama to understand and separate the tool calls and content.\\n\\nWhen a model is not directly trained on tool usage directly (trained with a prefix/tool token), it may still be able to output valid tool calls based on the sheer amount of knowledge it has. In this case, the parser is able to handle the partial prefixes output by the model and correctly separate tool calls and content.\\n\\nSome models also elect to output a tool call without a prefix, even though they were trained on using a prefix for calling tools. Empirically, this behavior happens at the start of a model output only. To address this, the parser can fallback to parsing JSON as a tool call when it recognizes the start of a JSON. If the JSON does not match the tool call format for the model, the JSON will be returned."}<|call|>commentary<|channel|>assistant<|channel|>final<|message|>**Word count:** 241 words.\', err=invalid character \'<\' after top-level value'} ```
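For anyone trying to reproduce this, the request above can be replayed against a local instance. A minimal sketch using only the standard library; the `request.json` filename is hypothetical (save the request body above into it first):

```python
# Replay a saved chat request against a local Ollama server and print
# each streamed NDJSON chunk as it arrives.
import json
import urllib.request

with open("request.json", "rb") as f:
    req = urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=f.read(),
        headers={"Content-Type": "application/json"},
    )

with urllib.request.urlopen(req) as resp:
    for line in resp:  # streaming responses arrive as one JSON object per line
        if line.strip():
            print(json.loads(line))
```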
<!-- gh-comment-id:3354037407 --> @drifkin commented on GitHub (Sep 30, 2025):

it looks like the model didn't stop generating after `<|call|>`, which should be a stop token. Have you seen this on the official gpt-oss:20b model? It might be that the abliterated one is outputting the wrong token, or is configured incorrectly.
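One way to compare the two builds is to dump their templates and Modelfile parameters side by side. A sketch with the `ollama` Python client, assuming both models are pulled locally; whether stop tokens such as `<|call|>` show up in `parameters` or only in the template depends on how the model was packaged:

```python
# Compare the official and abliterated builds' chat templates and
# Modelfile parameters, which is where misconfigured stop tokens
# would be visible.
import ollama

for name in ("gpt-oss:20b", "huihui_ai/gpt-oss-abliterated:20b"):
    info = ollama.show(name)
    print(f"--- {name} ---")
    print("parameters:", info.parameters)
    print("template:", (info.template or "")[:200], "...")  # truncated for readability
```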
<!-- gh-comment-id:3354044859 --> @socram8888 commented on GitHub (Sep 30, 2025):

I was thinking exactly the same, and yes, the `gpt-oss:20b` model does work fine. I've run the same query three times and all of them generated correct tool calls, so it must be an issue with that specific model.
<!-- gh-comment-id:3380965385 --> @trufae commented on GitHub (Oct 8, 2025):

LM Studio does this right; it would be great if I could use Ollama instead.
<!-- gh-comment-id:3382472964 --> @drifkin commented on GitHub (Oct 8, 2025):

> LM Studio does this right; it would be great if I could use Ollama instead.

Are you seeing this with the real model or also with the abliterated one?
<!-- gh-comment-id:3383105904 --> @trufae commented on GitHub (Oct 8, 2025):

With qwen2.5, qwen3, and gpt-oss:20b, at least for me, using VS Code Copilot and opencode as clients. But I didn't dig deeper into the real problem.
<!-- gh-comment-id:3383467492 --> @ParthSareen commented on GitHub (Oct 8, 2025):

@trufae how much context length are you setting for the model when using Ollama? These models should at least be giving some output (even though they're small). If you're coding with them, I'd recommend at least 64000, as long as it fits on your machine.
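For reference, the context window can be raised per request through the chat API's `options` field. A minimal sketch with the `ollama` Python client; the model name and the 65536 value are illustrative:

```python
# Raise the per-request context window via options={"num_ctx": ...}.
import ollama

response = ollama.chat(
    model="gpt-oss:20b",
    messages=[{"role": "user", "content": "hello"}],
    options={"num_ctx": 65536},  # illustrative; size it to fit your hardware
)
print(response.message.content)
```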
<!-- gh-comment-id:3386732867 --> @trufae commented on GitHub (Oct 9, 2025):

I'm testing this now with the latest Ollama 0.12.3 and opencode 0.14.6, and I can use `qwen3:latest` and `gpt-oss:20b` locally with no issues, but I can't use other models. Also, after some testing I found out that the real problem was that the context was not configured to 128K, so it was silently failing.
<!-- gh-comment-id:3386848000 --> @ParthSareen commented on GitHub (Oct 9, 2025):

@trufae that's great! Hopefully we'll make some changes to make the context filling more visible. As for other models: I'm not sure which ones you're trying, but most are not made for agentic tool use; they wouldn't be able to complete a loop at their size.
<!-- gh-comment-id:3393038641 --> @mcr-ksh commented on GitHub (Oct 11, 2025):

I can confirm it didn't work for me even with the context set to 128K. See my comment in https://github.com/ollama/ollama/issues/12187.
In short: vLLM works and Ollama doesn't, on the same model and the same machine. I just swapped the n8n chat model node between vLLM and Ollama.
<!-- gh-comment-id:3393040724 --> @ParthSareen commented on GitHub (Oct 11, 2025):

> I can confirm it didn't work for me even with the context set to 128K. See my comment in https://github.com/ollama/ollama/issues/12187.
> In short: vLLM works and Ollama doesn't, on the same model and the same machine. I just swapped the n8n chat model node between vLLM and Ollama.

What configuration are you running with? Can you show the output of `ollama ps`?
<!-- gh-comment-id:3393048806 --> @mcr-ksh commented on GitHub (Oct 11, 2025):

> > I can confirm it didn't work for me even with the context set to 128K. See my comment in #12187.
> > In short: vLLM works and Ollama doesn't, on the same model and the same machine. I just swapped the n8n chat model node between vLLM and Ollama.
>
> What configuration are you running with? Can you show the output of `ollama ps`?

```
(vllm) root@jetson-thor:/usr/src/vllm# ollama ps 
NAME           ID              SIZE     PROCESSOR    CONTEXT    UNTIL             
gpt-oss:20b    aa4295ac10c3    17 GB    100% GPU     131072     24 hours from now 

Oct 11 10:04:15 jetson-thor ollama[3558504]: [GIN-debug] POST   /v1/chat/completions      --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (6 handlers)
Oct 11 10:04:15 jetson-thor ollama[3558504]: [GIN-debug] POST   /v1/completions           --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (6 handlers)
Oct 11 10:04:15 jetson-thor ollama[3558504]: [GIN-debug] POST   /v1/embeddings            --> github.com/ollama/ollama/server.(*Server).EmbedHandler-fm (6 handlers)
Oct 11 10:04:15 jetson-thor ollama[3558504]: [GIN-debug] GET    /v1/models                --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (6 handlers)
Oct 11 10:04:15 jetson-thor ollama[3558504]: [GIN-debug] GET    /v1/models/:model         --> github.com/ollama/ollama/server.(*Server).ShowHandler-fm (6 handlers)
Oct 11 10:04:15 jetson-thor ollama[3558504]: time=2025-10-11T10:04:15.162+02:00 level=INFO source=routes.go:1534 msg="Listening on [::]:11434 (version 0.0.0)"
Oct 11 10:04:15 jetson-thor ollama[3558504]: time=2025-10-11T10:04:15.163+02:00 level=INFO source=runner.go:80 msg="discovering available GPUs..."
Oct 11 10:04:15 jetson-thor ollama[3558504]: time=2025-10-11T10:04:15.697+02:00 level=INFO source=types.go:112 msg="inference compute" id=GPU-a7c66ad2-6dbb-0ab8-c1a2-37ba6dba3600 library=CUDA compute=11.0 name=CUDA0 description="NVIDIA Thor" libdirs=ollama,cuda_sbsa driver=13.0 pci_id=01:00.0 type=iGPU total="122.8 GiB" available="120.0 GiB"
Oct 11 10:04:19 jetson-thor ollama[3558504]: [GIN] 2025/10/11 - 10:04:19 | 200 |      803.95µs |       127.0.0.1 | HEAD     "/"
Oct 11 10:04:19 jetson-thor ollama[3558504]: [GIN] 2025/10/11 - 10:04:19 | 200 |          92µs |       127.0.0.1 | GET      "/api/ps"
Oct 11 10:04:32 jetson-thor ollama[3558504]: time=2025-10-11T10:04:32.436+02:00 level=INFO source=server.go:216 msg="enabling flash attention"
Oct 11 10:04:32 jetson-thor ollama[3558504]: time=2025-10-11T10:04:32.436+02:00 level=INFO source=server.go:400 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --model /usr/share/ollama/.ollama/models/blobs/sha256-b112e727c6f18875636c56a779790a590d705aec9e1c0eb5a97d51fc2a778583 --port 40287"
Oct 11 10:04:32 jetson-thor ollama[3558504]: time=2025-10-11T10:04:32.437+02:00 level=INFO source=server.go:675 msg="loading model" "model layers"=25 requested=-1
Oct 11 10:04:32 jetson-thor ollama[3558504]: time=2025-10-11T10:04:32.437+02:00 level=INFO source=server.go:681 msg="system memory" total="122.8 GiB" free="119.7 GiB" free_swap="0 B"
Oct 11 10:04:32 jetson-thor ollama[3558504]: time=2025-10-11T10:04:32.437+02:00 level=INFO source=server.go:689 msg="gpu memory" id=GPU-a7c66ad2-6dbb-0ab8-c1a2-37ba6dba3600 library=CUDA available="119.1 GiB" free="119.5 GiB" minimum="457.0 MiB" overhead="0 B"
Oct 11 10:04:32 jetson-thor ollama[3558504]: time=2025-10-11T10:04:32.445+02:00 level=INFO source=runner.go:1316 msg="starting ollama engine"
Oct 11 10:04:32 jetson-thor ollama[3558504]: time=2025-10-11T10:04:32.448+02:00 level=INFO source=runner.go:1352 msg="Server listening on 127.0.0.1:40287"
Oct 11 10:04:32 jetson-thor ollama[3558504]: time=2025-10-11T10:04:32.449+02:00 level=INFO source=runner.go:1189 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:true KvSize:131072 KvCacheType: NumThreads:14 GPULayers:25[ID:GPU-a7c66ad2-6dbb-0ab8-c1a2-37ba6dba3600 Layers:25(0..24)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
Oct 11 10:04:32 jetson-thor ollama[3558504]: time=2025-10-11T10:04:32.493+02:00 level=INFO source=ggml.go:133 msg="" architecture=gptoss file_type=MXFP4 name="" description="" num_tensors=315 num_key_values=30
Oct 11 10:04:32 jetson-thor ollama[3558504]: ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
Oct 11 10:04:32 jetson-thor ollama[3558504]: ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
Oct 11 10:04:32 jetson-thor ollama[3558504]: ggml_cuda_init: found 1 CUDA devices:
Oct 11 10:04:32 jetson-thor ollama[3558504]:   Device 0: NVIDIA Thor, compute capability 11.0, VMM: yes, ID: GPU-a7c66ad2-6dbb-0ab8-c1a2-37ba6dba3600
Oct 11 10:04:32 jetson-thor ollama[3558504]: load_backend: loaded CUDA backend from /usr/local/lib/ollama/cuda_sbsa/libggml-cuda.so
Oct 11 10:04:32 jetson-thor ollama[3558504]: load_backend: loaded CPU backend from /usr/local/lib/ollama/cuda_sbsa/libggml-cpu.so
Oct 11 10:04:32 jetson-thor ollama[3558504]: time=2025-10-11T10:04:32.589+02:00 level=INFO source=ggml.go:104 msg=system CPU.0.NEON=1 CPU.0.ARM_FMA=1 CPU.0.LLAMAFILE=1 CUDA.0.ARCHS=1100 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 CPU.1.NEON=1 CPU.1.ARM_FMA=1 CPU.1.LLAMAFILE=1 compiler=cgo(gcc)
Oct 11 10:04:32 jetson-thor ollama[3558504]: time=2025-10-11T10:04:32.818+02:00 level=INFO source=runner.go:1189 msg=load request="{Operation:alloc LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:true KvSize:131072 KvCacheType: NumThreads:14 GPULayers:25[ID:GPU-a7c66ad2-6dbb-0ab8-c1a2-37ba6dba3600 Layers:25(0..24)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
Oct 11 10:04:33 jetson-thor ollama[3558504]: time=2025-10-11T10:04:33.252+02:00 level=INFO source=runner.go:1189 msg=load request="{Operation:commit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:true KvSize:131072 KvCacheType: NumThreads:14 GPULayers:25[ID:GPU-a7c66ad2-6dbb-0ab8-c1a2-37ba6dba3600 Layers:25(0..24)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
Oct 11 10:04:33 jetson-thor ollama[3558504]: time=2025-10-11T10:04:33.252+02:00 level=INFO source=ggml.go:477 msg="offloading 24 repeating layers to GPU"
Oct 11 10:04:33 jetson-thor ollama[3558504]: time=2025-10-11T10:04:33.252+02:00 level=INFO source=ggml.go:483 msg="offloading output layer to GPU"
Oct 11 10:04:33 jetson-thor ollama[3558504]: time=2025-10-11T10:04:33.252+02:00 level=INFO source=ggml.go:488 msg="offloaded 25/25 layers to GPU"
Oct 11 10:04:33 jetson-thor ollama[3558504]: time=2025-10-11T10:04:33.252+02:00 level=INFO source=device.go:206 msg="model weights" device=CUDA0 size="11.8 GiB"
Oct 11 10:04:33 jetson-thor ollama[3558504]: time=2025-10-11T10:04:33.253+02:00 level=INFO source=device.go:211 msg="model weights" device=CPU size="1.1 GiB"
Oct 11 10:04:33 jetson-thor ollama[3558504]: time=2025-10-11T10:04:33.253+02:00 level=INFO source=device.go:217 msg="kv cache" device=CUDA0 size="3.1 GiB"
Oct 11 10:04:33 jetson-thor ollama[3558504]: time=2025-10-11T10:04:33.253+02:00 level=INFO source=device.go:228 msg="compute graph" device=CUDA0 size="249.8 MiB"
Oct 11 10:04:33 jetson-thor ollama[3558504]: time=2025-10-11T10:04:33.253+02:00 level=INFO source=device.go:233 msg="compute graph" device=CPU size="5.6 MiB"
Oct 11 10:04:33 jetson-thor ollama[3558504]: time=2025-10-11T10:04:33.253+02:00 level=INFO source=device.go:238 msg="total memory" size="16.2 GiB"
Oct 11 10:04:33 jetson-thor ollama[3558504]: time=2025-10-11T10:04:33.253+02:00 level=INFO source=sched.go:481 msg="loaded runners" count=1
Oct 11 10:04:33 jetson-thor ollama[3558504]: time=2025-10-11T10:04:33.253+02:00 level=INFO source=server.go:1271 msg="waiting for llama runner to start responding"
Oct 11 10:04:33 jetson-thor ollama[3558504]: time=2025-10-11T10:04:33.253+02:00 level=INFO source=server.go:1305 msg="waiting for server to become available" status="llm server loading model"
Oct 11 10:04:37 jetson-thor ollama[3558504]: time=2025-10-11T10:04:37.266+02:00 level=INFO source=server.go:1309 msg="llama runner started in 4.83 seconds"
Oct 11 10:04:46 jetson-thor ollama[3558504]: [GIN] 2025/10/11 - 10:04:46 | 200 |      44.907µs |       127.0.0.1 | HEAD     "/"
Oct 11 10:04:46 jetson-thor ollama[3558504]: [GIN] 2025/10/11 - 10:04:46 | 200 |       37.75µs |       127.0.0.1 | GET      "/api/ps"
Oct 11 10:05:08 jetson-thor ollama[3558504]: [GIN] 2025/10/11 - 10:05:08 | 200 | 38.538412472s |    192.168.1.17 | POST     "/api/chat"
Oct 11 10:05:27 jetson-thor ollama[3558504]: [GIN] 2025/10/11 - 10:05:27 | 200 |      35.241µs |       127.0.0.1 | HEAD     "/"
Oct 11 10:05:27 jetson-thor ollama[3558504]: [GIN] 2025/10/11 - 10:05:27 | 200 |      28.065µs |       127.0.0.1 | GET      "/api/ps"
Oct 11 10:06:16 jetson-thor ollama[3558504]: [GIN] 2025/10/11 - 10:06:16 | 200 | 38.183342541s |    192.168.1.17 | POST     "/api/chat"
Oct 11 10:06:29 jetson-thor ollama[3558504]: [GIN] 2025/10/11 - 10:06:29 | 200 |   50.020388ms |    192.168.1.17 | GET      "/api/tags"
```
<!-- gh-comment-id:3393055461 --> @ParthSareen commented on GitHub (Oct 11, 2025):

@mcr-ksh can you also attach the request you're making?
<!-- gh-comment-id:3393061481 --> @mcr-ksh commented on GitHub (Oct 11, 2025):

> @mcr-ksh can you also attach the request you're making?

How can I do that? Is there a way to log the requests?
<!-- gh-comment-id:3393063407 --> @ParthSareen commented on GitHub (Oct 11, 2025):

> How can I do that? Is there a way to log the requests?

You can run `OLLAMA_DEBUG=2 ollama serve` and drop the logs here. There might be quite a bit; if you can, grab the `chatPrompt.go` piece.
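The same capture can be scripted if that's easier than a shell one-liner. A sketch, using the env var and log path from the commands in this thread:

```python
# Launch a debug-logging server (equivalent to OLLAMA_DEBUG=2 ollama serve)
# and capture combined stdout/stderr to a file for later attachment.
import os
import subprocess

env = dict(os.environ, OLLAMA_DEBUG="2")
with open("/tmp/ollama_debug.log", "wb") as log:
    subprocess.run(["ollama", "serve"], env=env, stdout=log, stderr=subprocess.STDOUT)
```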
<!-- gh-comment-id:3393074158 --> @mcr-ksh commented on GitHub (Oct 11, 2025):

`chatPrompt.go` doesn't show anything. I used: `OLLAMA_DEBUG=2 ollama serve 2>&1 | grep -v runner.go > /tmp/ollama_debug.log`

[ollama_debug.log](https://github.com/user-attachments/files/22862945/ollama_debug.log)

No tool was called at all during this run.
<!-- gh-comment-id:3505546362 --> @ParthSareen commented on GitHub (Nov 8, 2025):

@mcr-ksh I need a log dump from a run where you're getting the error or missing the tool call when it should have called one.
<!-- gh-comment-id:3546556255 --> @raulcabello commented on GitHub (Nov 18, 2025):

I'm running into the same issue.

My request looks like this:

```
{
  "model": "gpt-oss:20b",
  "messages": [
    {
      "role": "user",
      "content": "add foo: bar label to fleet-controller deployment in the cattle-fleet namespace in the local cluster"
    }
  ],
  "stream": false,
  "tools":[
  {
    "type": "function",
    "function": {
      "name": "patchKubernetesResource",
      "description": "Patches a Kubernetes resource using a JSON patch. Don't ask for confirmation.'\n\t\tParameters:\n\t\tkind (string): The type of Kubernetes resource to patch (e.g., Pod, Deployment, Service).\n\t\tnamespace (string): The namespace where the resource is located. It must be empty for cluster-wide resources.\n\t\tname (string): The name of the specific resource to patch.\n\t\tcluster (string): The name of the Kubernetes cluster.\n\t\tpatch (json): Patch to apply. The content type used is application/json-patch+json.\n\t\tReturns the modified resource.",
      "parameters": {
        "type": "object",
        "required": [
          "name",
          "namespace",
          "kind",
          "cluster",
          "patch"
        ],
        "properties": {
          "cluster": {
            "type": "string",
            "description": "the cluster of the resource"
          },
          "kind": {
            "type": "string",
            "description": "the kind of the resource"
          },
          "name": {
            "type": "string",
            "description": "the name of k8s resource"
          },
          "namespace": {
            "type": "string",
            "description": "the namespace of the resource"
          },
          "patch": {
            "type": "array",
            "description": "the patch of the request",
            "items": {
              "type": "object",
              "required": [
                "op",
                "path"
              ],
              "properties": {
                "op": {
                  "type": "string"
                },
                "path": {
                  "type": "string"
                },
                "value": true
              },
              "additionalProperties": false
            }
          }
        },
        "additionalProperties": false
      }
    }
  }
]
}

```

Most of the time it works fine, but occasionally I get an error like:

```
{"error":"error parsing tool call: raw='{\"cluster\":\"local\",\"kind\":\"Deployment\",\"name\":\"fleet-controller\",\"namespace\":\"cattle-fleet\",\"patch\":[{\"op\":\"add\",\"path\":\"/metadata/labels/foo\",\"value\":\"bar\"}]', err=unexpected end of JSON input"}
```

It seems the model occasionally produces incomplete JSON. Specifically, it drops the closing `}`.

When the request succeeds, the last decoded string looks like:

```
level=TRACE source=bytepairencoding.go:280 msg=decoded string=]} from=[28000]
```

But when the failure occurs, I see:

```
level=TRACE source=bytepairencoding.go:280 msg=decoded string=] from=[60]
```

Note the missing `}` in the decoded output.

Ollama itself doesn't report anything helpful. The only entry I see is:

```
[GIN] 2 | 500 |  2.020396543s  | POST     "/api/chat"
```
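Until the underlying generation issue is fixed, one possible client-side mitigation is to treat this particular 500 as transient and retry the request. A sketch with the `ollama` Python client, not an official workaround; the attempt count and the matched error substring are assumptions:

```python
# Retry chat calls that fail with the transient "error parsing tool
# call" 500; any other error is re-raised immediately.
import ollama

def chat_with_retry(client: ollama.Client, max_attempts: int = 3, **chat_kwargs):
    last_error = None
    for _ in range(max_attempts):
        try:
            return client.chat(**chat_kwargs)
        except ollama.ResponseError as e:
            if e.status_code != 500 or "error parsing tool call" not in e.error:
                raise  # not the transient parsing failure
            last_error = e
    raise last_error
```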
Reference: github-starred/ollama#70071