[GH-ISSUE #11938] [Bug/Feature Request] JSON Parse Failure via Reverse Tunnel with Chunked HTTP Responses #69984

Open
opened 2026-05-04 19:59:05 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @ademirtug on GitHub (Aug 16, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11938

What is the issue?

Description:
I encountered an issue when running Ollama behind a reverse tunnel: JSON responses fail to parse once Ollama switches from Content-Length responses to HTTP chunked transfer encoding.

Setup:

  • Remote Server: Runs the Open Web UI (port 9000) but does not have Ollama installed.
  • Local PC: Runs Ollama (localhost:11434) and is behind a CGNAT.
  • Tunnel: A reverse tunnel is used (either via a custom C# relay or ssh -R 11434:localhost:11434) to connect the server to the local Ollama instance.

Steps to Reproduce:

  1. Run Ollama locally on your PC.
  2. Connect the server’s Open Web UI to Ollama via the reverse tunnel.
  3. Request the list of models — this works correctly.
  4. Send a query, e.g., "hey".
  5. Ollama switches to HTTP chunked transfer encoding, and the Open Web UI fails with a JSON parse error.

Observed Behavior:

  • Content-Length responses are correctly parsed.
  • Chunked transfer responses break JSON parsing through the tunnel.
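For context, here is a small sketch (not from the original report) of how chunked transfer encoding frames a body: each chunk is `<hex length>\r\n<bytes>\r\n`, terminated by `0\r\n\r\n`. A single JSON object can span chunk boundaries, so a relay that tries to parse each chunk as standalone JSON will fail, while de-chunking the reassembled byte stream succeeds. The field names below are illustrative, not Ollama's exact response schema:

```python
import json

# A JSON body deliberately split across two chunks: neither chunk's
# data is valid JSON on its own.
raw = b'12\r\n{"model": "llama3"\r\nf\r\n, "done": true}\r\n0\r\n\r\n'

def dechunk(stream: bytes) -> bytes:
    """Reassemble the body of a chunked-encoded byte stream."""
    body, i = b"", 0
    while True:
        j = stream.index(b"\r\n", i)          # end of the hex size line
        size = int(stream[i:j], 16)
        if size == 0:                          # "0\r\n\r\n" terminator
            return body
        body += stream[j + 2:j + 2 + size]     # chunk payload
        i = j + 2 + size + 2                   # skip payload + trailing CRLF

print(json.loads(dechunk(raw)))
```

This is why a byte-for-byte forwarder works while anything that inspects or re-frames the stream per chunk can corrupt the JSON.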

Expected Behavior:

  • JSON responses from Ollama should be correctly parsed regardless of transfer encoding.

Additional Notes:

  • This issue occurs with both my C# relay implementation and a standard SSH reverse tunnel (ssh -R).
  • The problem seems related to the handling of chunked HTTP responses across a TCP relay.

Request:

  • A feature or flag to force Content-Length
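A possible workaround for the request above (assuming the standard Ollama HTTP API): passing "stream": false in the request body makes endpoints such as /api/generate return one complete JSON object rather than a chunked stream of partial responses, which typically avoids chunked framing entirely. The model name below is an illustrative placeholder:

```python
import json

# Hypothetical request body for Ollama's /api/generate endpoint.
# With "stream": false the server replies with a single JSON object
# instead of a chunked stream of partial responses.
payload = json.dumps({"model": "llama3", "prompt": "hey", "stream": False})
print(payload)
# Send with e.g.:
#   curl http://localhost:11434/api/generate -d "$PAYLOAD"
```

Whether the single response carries a Content-Length header depends on the server, but a non-streamed body is far easier for an intermediary to relay intact.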

Relevant log output


OS

Windows

GPU

AMD

CPU

AMD

Ollama version

0.11.5

GiteaMirror added the bug label 2026-05-04 19:59:05 -05:00