[GH-ISSUE #13258] Incorrect Content type header by API #70824

Closed
opened 2026-05-04 23:07:14 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @SKJoy on GitHub (Nov 27, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13258

What is the issue?

I am checking out the Cloud models with a Docker instance of OLLaMA and looks like the /chat and /generate endpoints are returning with incorrect text/plain; charset=utf-8 value for Content-Type response header. While the /tags endpoint comes up with the correct application/json; charset=utf-8 value for that header. The actual response content is valid JSON without doubt. The incorrect header is causing issues with automatic parsing though!

Content-Type response header

  • /api/tags = application/json; charset=utf-8
  • /api/chat = text/plain; charset=utf-8
  • /api/generate = text/plain; charset=utf-8

Below is the complete cURL diagnostic;

{
	"Response": {
		"Status": {
			"Code": 200,
			"Message": "OK"
		},
		"Header": {
			"Parsed": {
				"DATE": "Thu, 27 Nov 2025 02:43:56 GMT",
				"CONTENT-TYPE": "text\/plain; charset=utf-8",
				"TRANSFER-ENCODING": "chunked"
			},
			"Raw": "HTTP\/1.1 200 OK\r\nDate: Thu, 27 Nov 2025 02:43:56 GMT\r\nContent-Type: text\/plain; charset=utf-8\r\nTransfer-Encoding: chunked"
		},
		"Data": {
			"Translated": null,
			"Raw": "{\"model\":\"gpt-oss:120b\",\"remote_model\":\"gpt-oss:120b\",\"remote_host\":\"https:\/\/ollama.com:443\",\"created_at\":\"2025-11-27T02:43:56.638582213Z\",\"message\":{\"role\":\"assistant\",\"content\":\"A stack is a linear data structure that follows Last\u2011In\u2011First\u2011Out (LIFO): the most recently added element is removed first. Operations typically include push to add and pop to remove. A queue follows First\u2011In\u2011First\u2011Out (FIFO): elements leave in the order they arrived. Enqueue adds to the rear, dequeue removes from the front. Stacks are useful for recursion, backtracking, and expression evaluation, while queues model waiting lines, breadth\u2011first search, and asynchronous processing. Both structures can be implemented using arrays or linked lists efficiently.\",\"thinking\":\"Need to produce explanation in exactly 80 words? The prompt: \\\"Explain the difference between a stack and a queue in 80 words.\\\" Likely they want exactly 80 words. We must count words. Let's craft concise explanation of 80 words. Count manually.\\n\\nDraft:\\n\\n\\\"A stack is a linear data structure that follows Last\u2011In\u2011First\u2011Out (LIFO): the most recently added element is removed first. Operations typically include push to add and pop to remove. A queue follows First\u2011In\u2011First\u2011Out (FIFO): elements leave in the order they arrived. Enqueue adds to the rear, dequeue removes from the front. Stacks are useful for recursion, backtracking, and expression evaluation, while queues model waiting lines, breadth\u2011first search, and asynchronous processing.\\\"\\n\\nNow count words.\\n\\nCount:\\n\\nA(1) stack(2) is(3) a(4) linear(5) data(6) structure(7) that(8) follows(9) Last\u2011In\u2011First\u2011Out(10) (LIFO): the(11) most(12) recently(13) added(14) element(15) is(16) removed(17) first.(18) Operations(19) typically(20) include(21) push(22) to(23) add(24) and(25) pop(26) to(27) remove.(28) A(29) queue(30) follows(31) First\u2011In\u2011First\u2011Out(32) (FIFO): (33) elements(34) leave(35) in(36) the(37) order(38) they(39) arrived.(40) Enqueue(41) adds(42) to(43) the(44) rear,(45) dequeue(46) removes(47) from(48) the(49) front.(50) Stacks(51) are(52) useful(53) for(54) recursion,(55) backtracking,(56) and(57) expression(58) evaluation,(59) while(60) queues(61) model(62) waiting(63) lines,(64) breadth\u2011first(65) search,(66) and(67) asynchronous(68) processing.(69)\\n\\nWe have 69 words. Need 11 more words to reach 80. Add a sentence with 11 words.\\n\\nAdd: \\\"Both structures can be implemented using arrays or linked lists efficiently.\\\" Count words.\\n\\nBoth(1) structures(2) can(3) be(4) implemented(5) using(6) arrays(7) or(8) linked(9) lists(10) efficiently(11). That's 11 words.\\n\\nNow total 69+11 = 80 exactly.\\n\\nCombine after previous sentence. Provide final answer.\"},\"done\":true,\"done_reason\":\"stop\",\"total_duration\":4420428574,\"prompt_eval_count\":81,\"eval_count\":700}\n"
		}
	},
	"Request": {
		"URL": "http:\/\/172.31.199.3:11434\/api\/chat",
		"Method": "POST",
		"Header": [
			"Content-Type: application\/json",
			"Accept: application\/json"
		],
		"UserAgent": "Mozilla\/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/124.0.0.0 Safari\/537.36 Edg\/124.0.0.0",
		"Data": "{\"model\":\"gpt-oss:120b\",\"messages\":[{\"role\":\"user\",\"content\":\"Explain the difference between a stack and a queue in 80 words.\"}],\"stream\":false}"
	}
}

ollama version is 0.12.11

Relevant log output


OS

Docker

GPU

Other

CPU

Intel

Ollama version

0.12.11

Originally created by @SKJoy on GitHub (Nov 27, 2025). Original GitHub issue: https://github.com/ollama/ollama/issues/13258 ### What is the issue? I am checking out the **Cloud** models with a **Docker** instance of **OLLaMA** and looks like the `/chat` and `/generate` endpoints are returning with incorrect `text/plain; charset=utf-8` value for `Content-Type` response header. While the `/tags` endpoint comes up with the correct `application/json; charset=utf-8` value for that header. The actual response **content** is valid `JSON` without doubt. The **incorrect** header is causing issues with automatic parsing though! ## `Content-Type` response header - `/api/tags` = `application/json; charset=utf-8` - `/api/chat` = `text/plain; charset=utf-8` - `/api/generate` = `text/plain; charset=utf-8` Below is the complete **cURL** diagnostic; ``` { "Response": { "Status": { "Code": 200, "Message": "OK" }, "Header": { "Parsed": { "DATE": "Thu, 27 Nov 2025 02:43:56 GMT", "CONTENT-TYPE": "text\/plain; charset=utf-8", "TRANSFER-ENCODING": "chunked" }, "Raw": "HTTP\/1.1 200 OK\r\nDate: Thu, 27 Nov 2025 02:43:56 GMT\r\nContent-Type: text\/plain; charset=utf-8\r\nTransfer-Encoding: chunked" }, "Data": { "Translated": null, "Raw": "{\"model\":\"gpt-oss:120b\",\"remote_model\":\"gpt-oss:120b\",\"remote_host\":\"https:\/\/ollama.com:443\",\"created_at\":\"2025-11-27T02:43:56.638582213Z\",\"message\":{\"role\":\"assistant\",\"content\":\"A stack is a linear data structure that follows Last\u2011In\u2011First\u2011Out (LIFO): the most recently added element is removed first. Operations typically include push to add and pop to remove. A queue follows First\u2011In\u2011First\u2011Out (FIFO): elements leave in the order they arrived. Enqueue adds to the rear, dequeue removes from the front. Stacks are useful for recursion, backtracking, and expression evaluation, while queues model waiting lines, breadth\u2011first search, and asynchronous processing. Both structures can be implemented using arrays or linked lists efficiently.\",\"thinking\":\"Need to produce explanation in exactly 80 words? The prompt: \\\"Explain the difference between a stack and a queue in 80 words.\\\" Likely they want exactly 80 words. We must count words. Let's craft concise explanation of 80 words. Count manually.\\n\\nDraft:\\n\\n\\\"A stack is a linear data structure that follows Last\u2011In\u2011First\u2011Out (LIFO): the most recently added element is removed first. Operations typically include push to add and pop to remove. A queue follows First\u2011In\u2011First\u2011Out (FIFO): elements leave in the order they arrived. Enqueue adds to the rear, dequeue removes from the front. Stacks are useful for recursion, backtracking, and expression evaluation, while queues model waiting lines, breadth\u2011first search, and asynchronous processing.\\\"\\n\\nNow count words.\\n\\nCount:\\n\\nA(1) stack(2) is(3) a(4) linear(5) data(6) structure(7) that(8) follows(9) Last\u2011In\u2011First\u2011Out(10) (LIFO): the(11) most(12) recently(13) added(14) element(15) is(16) removed(17) first.(18) Operations(19) typically(20) include(21) push(22) to(23) add(24) and(25) pop(26) to(27) remove.(28) A(29) queue(30) follows(31) First\u2011In\u2011First\u2011Out(32) (FIFO): (33) elements(34) leave(35) in(36) the(37) order(38) they(39) arrived.(40) Enqueue(41) adds(42) to(43) the(44) rear,(45) dequeue(46) removes(47) from(48) the(49) front.(50) Stacks(51) are(52) useful(53) for(54) recursion,(55) backtracking,(56) and(57) expression(58) evaluation,(59) while(60) queues(61) model(62) waiting(63) lines,(64) breadth\u2011first(65) search,(66) and(67) asynchronous(68) processing.(69)\\n\\nWe have 69 words. Need 11 more words to reach 80. Add a sentence with 11 words.\\n\\nAdd: \\\"Both structures can be implemented using arrays or linked lists efficiently.\\\" Count words.\\n\\nBoth(1) structures(2) can(3) be(4) implemented(5) using(6) arrays(7) or(8) linked(9) lists(10) efficiently(11). That's 11 words.\\n\\nNow total 69+11 = 80 exactly.\\n\\nCombine after previous sentence. Provide final answer.\"},\"done\":true,\"done_reason\":\"stop\",\"total_duration\":4420428574,\"prompt_eval_count\":81,\"eval_count\":700}\n" } }, "Request": { "URL": "http:\/\/172.31.199.3:11434\/api\/chat", "Method": "POST", "Header": [ "Content-Type: application\/json", "Accept: application\/json" ], "UserAgent": "Mozilla\/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/124.0.0.0 Safari\/537.36 Edg\/124.0.0.0", "Data": "{\"model\":\"gpt-oss:120b\",\"messages\":[{\"role\":\"user\",\"content\":\"Explain the difference between a stack and a queue in 80 words.\"}],\"stream\":false}" } } ``` **ollama** version is `0.12.11` ### Relevant log output ```shell ``` ### OS Docker ### GPU Other ### CPU Intel ### Ollama version 0.12.11
GiteaMirror added the bug label 2026-05-04 23:07:14 -05:00
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/ollama#70824