[PR #7900] [MERGED] Structured Outputs - Chat Endpoint #43800

Closed
opened 2026-04-24 23:23:13 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/7900
Author: @ParthSareen
Created: 12/1/2024
Status: Merged
Merged: 12/5/2024
Merged by: @ParthSareen

Base: main ← Head: parth/structured-outputs


📝 Commits (10+)

📊 Changes

10 files changed (+180 additions, -25 deletions)


📝 api/types.go (+1 -1)
📝 cmd/cmd.go (+2 -1)
📝 llama/llama.go (+33 -0)
📝 llama/llama_test.go (+69 -0)
📝 llama/sampling_ext.cpp (+27 -2)
📝 llama/sampling_ext.h (+2 -0)
📝 llm/server.go (+17 -10)
📝 openai/openai.go (+21 -4)
📝 openai/openai_test.go (+7 -6)
📝 server/routes.go (+1 -1)

📄 Description

Structured outputs

A longtime ask from the community - we now support passing in a JSON schema, translating it to a grammar, and using that grammar for sampling.

Why not full grammar support

We gave this a lot of thought, and there are three main reasons:

  1. Inherent complexity of grammars - Writing a grammar by hand is not a great experience for the average user and should be abstracted away from them. Digging into the code, the API layer also needs some TLC, which would mean changing some interfaces on Ollama's end while maintaining a consistent UX.
  2. Sampling performance - there are many new papers and methodologies for grammar-constrained sampling (Outlines, XGrammar, etc.). We want to keep grammar generation and sampling coupled so we can improve sampling performance down the road.
  3. Parity with existing experiences - other client SDKs (e.g. OpenAI's) already support structured outputs, and it's important that we keep the experience simple on our end while also supporting those clients.

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

📝 Commits (10)

091e860 WIP
eedbdf1 Improve passing of format, better separation of concerns
43452ac Add tests
90419d8 Improve tests and general cleanup
5db6e7a rename schema_to_grammar
5109a23 Address comments + updates
9788561 Make json schema more flexible
6017a9d Address comments
f3a649e Addressing comments
00803b3 Update openai/openai_test.go
GiteaMirror added the pull-request label 2026-04-24 23:23:13 -05:00

Reference: github-starred/ollama#43800