[PR #565] [CLOSED] Add support for GBNF grammar definitions #56921

Closed
opened 2026-04-29 11:30:58 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/565
Author: @SyrupThinker
Created: 9/21/2023
Status: Closed

Base: main ← Head: st/grammar


📝 Commits (2)

  • 6150872 Add support for GBNF grammar definitions
  • 9dda9cc Add an example GBNF JSON model

📊 Changes

4 files changed (+37 additions, -0 deletions)

View changed files

📝 api/types.go (+1 -0)
📝 docs/modelfile.md (+1 -0)
examples/json/Modelfile (+33 -0)
📝 llm/ext_server.go (+2 -0)

📄 Description

This PR exposes the llama.cpp grammar parameter in the generate API.

It allows the user to provide a GBNF grammar to constrain the output of an LLM.

This can be used to, for example, reliably generate structured data like JSON:

>>> Generate a list of 5 random mock users that contain a firstname, lastname, birthday, created_at and email field. The created_at field should be RFC3339 and lie in the range of 2000 to 2020. Emails should use multiple subdomains under the example.com domain. The result should be in a JSON object with a users key.
{
    "users": [
      {
        "firstname": "Emma",
        "lastname": "Brown",
        "birthday": "1993-08-12T00:00:00Z",
        "created_at": "2017-04-15T13:00:00Z",
        "email": "emma.brown@example.co.uk"
      },
      {
        "firstname": "Olivia",
        "lastname": "Jones",
        "birthday": "1996-03-25T00:00:00Z",
        "created_at": "2018-02-17T14:00:00Z",
        "email": "olivia.jones@example.edu"
      },
      {
        "firstname": "Ava",
        "lastname": "Smith",
        "birthday": "1997-08-24T00:00:00Z",
        "created_at": "2019-05-12T15:00:00Z",
        "email": "ava.smith@example.net"
      },
      {
        "firstname": "Sophia",
        "lastname": "Johnson",
        "birthday": "1998-04-26T00:00:00Z",
        "created_at": "2020-03-07T16:00:00Z",
        "email": "sophia.johnson@example.org"
      },
      {
        "firstname": "Mia",
        "lastname": "Williams",
        "birthday": "1999-07-23T00:00:00Z",
        "created_at": "2020-12-08T17:00:00Z",
        "email": "mia.williams@example.com"
      }
    ]
  }

>>> Create a JSON object that contains the latest birthday and the earliest created_at date, omit the time.
{
"latest_birthday": "1999-07-23",
"earliest_created_at": "2000-01-01"
}

Generated with the examples/json Modelfile, first attempt, not cherry-picked

A note for potential users
The generated documents are valid JSON on the first try, without extra tuning.
However, note that the LLM used different TLDs rather than subdomains. The earliest_created_at value is also not as intended; the instruction is ambiguous.
The grammar only guarantees that the output follows the specified syntax; the semantics might still be wrong.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-29 11:30:58 -05:00
Reference: github-starred/ollama#56921