[PR #12054] [CLOSED] GBNF format and grammar parameter #13698

Closed
opened 2026-04-13 00:33:08 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/12054
Author: @sjsone
Created: 8/23/2025
Status: Closed

Base: main ← Head: 11911-gbnf


📝 Commits (10+)

  • 2fa029f openai: allow for content and tool calls in the same message
  • 329f40a docs: update the faq (#11760)
  • c499fb7 openai: when converting role=tool messages, propagate the tool name
  • a3803ca openai: always provide reasoning
  • b743aac server: Reduce gpt-oss context length for small VRAM GPUs
  • 67ff21b tests: add integration coverage for oss-gpt (#11696)
  • 2123f26 ggml: Use GGML's typedef'ed pointer types
  • 5ea9c72 ggml: Support closing backends
  • 5539a6c ggml: No-alloc mode
  • 3683edd CONTRIBUTING: Explicitly note docs:... as a good example (#11755)

📊 Changes

297 files changed (+161197 additions, -49801 deletions)

View changed files

📝 CMakeLists.txt (+2 -1)
📝 CONTRIBUTING.md (+1 -0)
📝 Dockerfile (+2 -0)
📝 Makefile.sync (+17 -8)
📝 README.md (+4 -0)
📝 api/types.go (+48 -14)
📝 api/types_test.go (+47 -0)
📝 cmd/cmd.go (+25 -1)
📝 convert/convert_gptoss.go (+66 -34)
📝 convert/reader.go (+1 -0)
📝 convert/reader_safetensors.go (+26 -13)
➕ convert/reader_test.go (+232 -0)
📝 discover/amd_linux.go (+34 -31)
📝 discover/gpu.go (+2 -0)
📝 discover/gpu_info_cudart.c (+3 -6)
📝 discover/gpu_info_cudart.h (+2 -5)
📝 discover/types.go (+2 -1)
📝 docs/faq.md (+11 -7)
📝 docs/linux.md (+5 -1)
📝 docs/turbo.md (+1 -1)

...and 80 more files

📄 Description

This change introduces a new request format, GBNF, so that users can provide their own grammar to ollama.

Alongside the new format, the user must also provide the grammar itself: either as a file (--grammarfile) when running ollama run, or as a new grammar parameter when issuing a server request.

CLI

ollama run --format GBNF --grammarfile test.gbnf qwen3:1.7b

Request

curl -X "POST" "http://localhost:11434/api/generate" \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{
  "model": "qwen3:1.7b",
  "prompt": "Hi",
  "grammar": "root ::= think-block \\"\\\\n\\\\n\\" answer-block\\n\\nthink-block ::= \\"<think>\\\\n\\" think-content \\"\\\\n</think>\\"\\n\\nanswer-block ::= \\"<answer>\\\\n\\" answer-content \\"\\\\n</answer>\\"\\nanswer-content ::= [^\\\\n]+ (\\"\\\\n\\" [^\\\\n]+)*\\n\\nthink-content ::= [^\\\\n]+ (\\"\\\\n\\" [^\\\\n]+)*",
  "format": "GBNF"
}'
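
The hand-escaped grammar string in the request above is hard to read and easy to get wrong. As a minimal sketch, assuming the grammar and format fields behave as this PR describes (the PR was closed, so they may not exist in any released ollama build), the same body can be produced in Python by letting json.dumps do the escaping. GRAMMAR is the grammar from the request written out unescaped; build_payload is a hypothetical helper name, not part of ollama.

```python
import json

# The grammar from the curl request above, written out unescaped.
# Raw string: each \n below is a literal backslash followed by "n",
# which GBNF itself interprets as a newline escape.
GRAMMAR = r'''root ::= think-block "\n\n" answer-block

think-block ::= "<think>\n" think-content "\n</think>"

answer-block ::= "<answer>\n" answer-content "\n</answer>"
answer-content ::= [^\n]+ ("\n" [^\n]+)*

think-content ::= [^\n]+ ("\n" [^\n]+)*'''


def build_payload(model: str, prompt: str, grammar: str) -> str:
    """Serialize a generate request body (hypothetical helper).

    json.dumps escapes quotes, backslashes and real newlines, so the
    grammar never has to be escaped by hand.
    """
    body = {
        "model": model,
        "prompt": prompt,
        "grammar": grammar,  # field proposed by this PR (assumption)
        "format": "GBNF",    # format value proposed by this PR (assumption)
    }
    return json.dumps(body, indent=2)


payload = build_payload("qwen3:1.7b", "Hi", GRAMMAR)
print(payload)
```

The printed JSON could be saved to a file and sent with curl -d @file to /api/generate on a build that includes this change. The same unescaped GRAMMAR text, saved as test.gbnf, would be the kind of file the CLI example expects.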

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-13 00:33:08 -05:00

Reference: github-starred/ollama#13698