[GH-ISSUE #13755] CLI multiline prompt shows no response until follow-up message is sent #71076

Closed
opened 2026-05-04 23:55:45 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @timlucastech on GitHub (Jan 17, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/13755

What is the issue?

When entering a multiline prompt in interactive mode (using the ... continuation lines), the model processes the request but no response is displayed. The response only appears after sending a follow-up message (e.g., "Hello?"), at which point the model's response references the original prompt, proving it was processed.

OS

Linux (Ubuntu)

GPU

CPU-only (also has AMD GPU with ROCm installed)

CPU

x86_64

Ollama version

0.14.2

Model

danielsheep/Qwen3-Coder-30B-A3B-Instruct-1M-Unsloth:UD-Q5_K_XL

Steps to reproduce

  1. Run ollama run <model> (tested with danielsheep/Qwen3-Coder-30B-A3B-Instruct-1M-Unsloth:UD-Q5_K_XL)
  2. Enter a multiline prompt that spans multiple lines:

>>> Generate a comprehensive inline HTML interface that uses tailwind CDNs and fetches the latest price of bitcoin and prints it to the interface; make the interface nice and modern and smoothly update in realtime by refreshing the GUI without flickers or visual glitches; make
... the interface modern and aesthetically nice UX

  3. Press Enter - cursor returns to >>> prompt with no response displayed
  4. Type any follow-up message like "Hello?"
  5. The model now responds, referencing the original request (e.g., "I can see you're interested in the Bitcoin price tracker HTML file I generated for you...")

Expected behavior

Response should stream/display immediately after the multiline prompt is submitted.

Actual behavior

No response is displayed. The response only appears after a subsequent message is sent.

Additional context

  • Single-line prompts in interactive mode work correctly
  • Piped input works correctly: echo "prompt" | ollama run <model>
  • API calls work correctly: curl http://localhost:11434/api/generate -d '...'
  • This suggests the issue is specific to the readline/terminal handling for multiline interactive input
  • Terminal: xterm-256color
  • Issue persists after upgrading from 0.14.1 to 0.14.2
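Since piped input works, the bug can be sidestepped (and isolated to the interactive readline path) by building the same multiline prompt non-interactively. A minimal sketch of that workaround; the final pipe into ollama is commented out because it needs a running server and installed model:

```shell
# Build the multiline prompt in a variable instead of using the
# interactive "..." continuation lines that trigger the bug.
PROMPT=$(cat <<'EOF'
Generate a comprehensive inline HTML interface that uses tailwind CDNs
and fetches the latest price of bitcoin and prints it to the interface;
make the interface modern and aesthetically nice UX
EOF
)

# Show what will be sent.
printf '%s\n' "$PROMPT"

# To actually send it (requires a running ollama server; <model> is a placeholder):
# printf '%s\n' "$PROMPT" | ollama run <model>
```

If the model responds normally through this path, that further supports the readline/terminal handling hypothesis above.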

Logs

Server logs show successful HTTP 200 responses for the requests, confirming the model processed them:
[GIN] 2026/01/17 - 14:09:42 | 200 | 796.41043ms | 127.0.0.1 | POST "/api/chat"
[GIN] 2026/01/17 - 14:10:54 | 200 | 5.754584434s | 127.0.0.1 | POST "/api/chat"

GiteaMirror added the bug, needs more info labels 2026-05-04 23:55:45 -05:00
Author
Owner

@rick-github commented on GitHub (Jan 17, 2026):

Unable to reproduce. The short response time (796ms) implies that the model received the prompt but ignored it. On an AMD Ryzen 9 7900X system the model took 88s to generate the response. Try setting `OLLAMA_DEBUG=2` and post the log, this will show the prompt and might provide insight into what's going on.
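One way to follow the suggestion above, sketched for a manually started server (if ollama runs under systemd, stop that service first; the log filename here is just an example):

```shell
# Enable verbose logging for the server process, as suggested above.
export OLLAMA_DEBUG=2
echo "OLLAMA_DEBUG=$OLLAMA_DEBUG"

# Then start the server and capture the log to attach to this issue:
# ollama serve 2>&1 | tee ollama-debug.log
```

With debug logging on, the server log should include the full prompt it received, which would show whether the multiline input reached the model intact.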

Reference: github-starred/ollama#71076