[GH-ISSUE #12362] JSON reply schema is ignored by Cloud model #54726

Open
opened 2026-04-29 07:06:07 -05:00 by GiteaMirror · 9 comments

Originally created by @scriptdealer on GitHub (Sep 21, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12362

Originally assigned to: @ParthSareen on GitHub.

What is the issue?

I'm using `github.com/ollama/ollama/api` with the `github.com/invopop/jsonschema` marshaller, which works fine with the local `qwen3-coder:30b`. Now I'm trying `qwen3-coder:480b-cloud`, and it returns a JSON string that answers the prompt but doesn't follow the reply schema.

Looks like a bug to me

Relevant log output


OS

Windows

GPU

No response

CPU

Intel

Ollama version

0.12.0

GiteaMirror added the cloud and bug labels 2026-04-29 07:06:08 -05:00

@scriptdealer commented on GitHub (Sep 21, 2025):

This is not direct API access; I'm using localhost in both cases. Only the model name was changed.


@hen-corix commented on GitHub (Sep 23, 2025):

Today, I tried hard to use cloud models (directly via ollama.com/api/chat) with the "format" parameter set to a JSON schema. GPT-OSS and DeepSeek both responded with JSON, but not according to my schema, and said in their thinking output "the user wants output in json, but no explicit schema was given".

Also, the test request from the docs (https://github.com/ollama/ollama/blob/main/docs/api.md#request-3) doesn't seem to work: the output comes back with the field "availability" (as written in the prompt) rather than the requested field "available".

edit: this seems to be an issue with all models. I got it working by including the JSON schema directly in the prompt, but not via the "format" parameter.
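
For reference, a minimal sketch of that prompt-embedding workaround against the cloud endpoint mentioned above. The schema, prompt, and token are placeholders, not an official recipe:

```python
# Hedged sketch: embed the JSON schema in the prompt instead of "format".
import json
import requests

schema = {
    "type": "object",
    "properties": {"available": {"type": "boolean"}},
    "required": ["available"],
}

resp = requests.post(
    "https://ollama.com/api/chat",
    headers={"Authorization": "Bearer <your-api-key>"},  # placeholder key
    json={
        "model": "gpt-oss:120b",
        "stream": False,
        "messages": [
            {"role": "system",
             "content": "Respond only with JSON matching this schema: "
                        + json.dumps(schema)},
            {"role": "user", "content": "Is the library available on Sundays?"},
        ],
    },
)
print(resp.json()["message"]["content"])  # hopefully schema-shaped JSON
```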


@scriptdealer commented on GitHub (Oct 2, 2025):

Here is the code I'm using:

```go
schemaMaker := jsonschema.Reflector{
	AllowAdditionalProperties: false,
	DoNotReference:            true,
}
schema := schemaMaker.Reflect(my.ReplyStruct) // my.ReplyStruct is the desired reply type
schemaBytes, _ := schema.MarshalJSON()
req := &api.ChatRequest{
	Format: json.RawMessage(schemaBytes),
}
```

Also, please add official code examples for both the `json_schema` and the older `json_object` response types (i.e. what DeepSeek uses); a sketch of both follows below.
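
Until official examples land, here is a hedged sketch of the two modes, shown with the Python client for brevity (model name and prompt are placeholders; in the Go client, `Format` takes the same two values as `json.RawMessage`):

```python
# Hedged sketch: the two "format" values the Ollama chat API accepts.
from ollama import chat

messages = [{"role": "user", "content": "Is the shop open? Reply as JSON."}]

# Older json_object-style mode: any valid JSON, no schema enforced.
resp = chat(model="qwen3-coder:30b", messages=messages, format="json")

# Structured outputs: pass a JSON schema and decoding is constrained to it.
schema = {
    "type": "object",
    "properties": {"open": {"type": "boolean"}},
    "required": ["open"],
}
resp = chat(model="qwen3-coder:30b", messages=messages, format=schema)
```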


@mawiswiss commented on GitHub (Oct 24, 2025):

I also have this issue. When running the model locally with structured outputs, I get an object; when using the cloud model, I get a string, which is wrong. It can easily be reproduced with the example snippet in the docs.

```shell
curl -X POST http://localhost:11434/api/chat -H "Content-Type: application/json" -d '{
  "model": "gpt-oss:120b-cloud",
  "messages": [{"role": "user", "content": "Tell me about Canada."}],
  "stream": false,
  "format": {
    "type": "object",
    "properties": {
      "name": {
        "type": "string"
      },
      "capital": {
        "type": "string"
      },
      "languages": {
        "type": "array",
        "items": {
          "type": "string"
        }
      }
    },
    "required": [
      "name",
      "capital",
      "languages"
    ]
  }
}'
```

Returns:

{"model":"gpt-oss:120b-cloud","remote_model":"gpt-oss:120b","remote_host":"https://ollama.com:443","created_at":"2025-10-24T23:31:44.151178025Z","message":{"role":"assistant","content":"**Canada – A Quick Overview**\n\n| Category | Highlights |\n|----------|------------|\n| **Location** | North‑America, stretching from the Atlantic Ocean in the east to the Pacific Ocean in the west, and northward into the Arctic Ocean. |\n| **Size** | 2nd largest country by total area (≈ 9.98 million km²). 10 provinces and 3 territories. |\n| **Population** | About 40 million (2024 estimate). Major urban centres: Toronto, Montréal, Vancouver, Calgary, Edmonton, Ottawa (capital). |\n| ..."},"done":true,"done_reason":"stop","total_duration":9630030893,"prompt_eval_count":77,"eval_count":1167}

Running it with the local model gives me this response (an object):

{"model":"gpt-oss:20b","created_at":"2025-10-24T23:42:28.75745Z","message":{"role":"assistant","content":"{\"name\":\"Canada\",\"capital\":\"Ottawa\",\"languages\":[\"English\",\"French\",\"many Indigenous languages\"]}","thinking":"User wants \"Tell me about Canada.\" Likely wants general overview: geography, culture, politics, economy, demographics, history, etc. Should be concise but comprehensive. We'll provide key facts: location, provinces/territories, capital, population, languages, history, economy sectors, culture, climate, major cities, tourism, politics, education, etc. Probably want up-to-date as of 2025. We'll deliver an informative piece. We'll keep language accessible, maybe some bullet points, and a summary."},"done":true,"done_reason":"stop","total_duration":60240011291,"load_duration":56774027041,"prompt_eval_count":184,"prompt_eval_duration":26297875,"eval_count":22,"eval_duration":341659459}%

@JazzzzX commented on GitHub (Nov 7, 2025):

`structured_llm = llm.with_structured_output(schema, include_raw=True, method="function_calling")`

For me, when using Ollama cloud I use the `function_calling` method for structured output, and with local Ollama I use the `json_schema` method. I hope it's just a bug.
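
For context, a minimal sketch of that switch, assuming a recent `langchain-ollama` and a made-up Pydantic schema (`Country` is hypothetical; the `method` values are LangChain's):

```python
# Hedged sketch: pick the structured-output method per backend.
from langchain_ollama import ChatOllama
from pydantic import BaseModel

class Country(BaseModel):  # hypothetical schema for illustration
    name: str
    capital: str

# Cloud models currently ignore the "format" schema, so route the schema
# through tool calling instead:
cloud = ChatOllama(model="gpt-oss:120b-cloud").with_structured_output(
    Country, method="function_calling")

# Local models honor the schema, so json_schema works as expected:
local = ChatOllama(model="gpt-oss:20b").with_structured_output(
    Country, method="json_schema")

print(local.invoke("Tell me about Canada."))
```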


@guilhermecxe commented on GitHub (Nov 8, 2025):

I am having the same issue, but I'm using this as an alternative:

```python
from ollama import Client

client = Client(
    host="https://ollama.com",
    headers={'Authorization': 'Bearer ...'}
)

# Section is a Pydantic model whose JSON schema describes the desired reply.
response = client.chat(
    model="qwen3-vl:235b-cloud",
    messages=[
        {"role": "system", "content": f"Use this format to respond: {Section.model_json_schema()}\n\nNo further explanations or formatting needed."},
        {"role": "user", "content": "Answer with synthetic data"}
    ],
    # format=Section.model_json_schema()  # ignored by cloud models today
)
```

Big models tend to respect the format, but it's still fingers crossed when using large schemas or large prompts.


@ParthSareen commented on GitHub (Dec 10, 2025):

Hey folks - we don't support structured outputs on the cloud just yet. Coming soon!


@tnorlund commented on GitHub (Jan 14, 2026):

Any update on this? Would really appreciate this!


@pavelai commented on GitHub (Jan 28, 2026):

For those who are still facing the issue: a workaround is to provide a single tool together with an instruction to use that particular tool. The tool's parameter should be a single object matching the format you want for structured output. It also works with several tools at once; see the sketch below.

For models that use the OpenAI Harmony output format this is even more efficient, because Harmony describes tools in a DSL that is closer to TypeScript and far less verbose than a full JSON schema. I'm not sure how to check Qwen's formatting right now, though.
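
A minimal sketch of that workaround with the Python client; the tool name `submit_reply` and its schema are made up for illustration:

```python
# Hedged sketch: smuggle the reply schema through a single tool definition.
from ollama import Client

client = Client(host="http://localhost:11434")

reply_tool = {
    "type": "function",
    "function": {
        "name": "submit_reply",  # hypothetical tool name
        "description": "Return the final answer in the required structure.",
        "parameters": {  # this is the schema you wanted in "format"
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "capital": {"type": "string"},
                "languages": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["name", "capital", "languages"],
        },
    },
}

response = client.chat(
    model="gpt-oss:120b-cloud",
    messages=[
        {"role": "system", "content": "Always answer by calling submit_reply."},
        {"role": "user", "content": "Tell me about Canada."},
    ],
    tools=[reply_tool],
)

# The structured payload arrives as tool-call arguments, not message content.
for call in response.message.tool_calls or []:
    print(call.function.arguments)
```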

Refs

- https://cookbook.openai.com/articles/openai-harmony#function-calling
- https://cookbook.openai.com/articles/openai-harmony#structured-output

Reference: github-starred/ollama#54726