[PR #4525] Exposing grammar as a request parameter in completion/chat with go-side grammar validation #11514

Closed
opened 2026-04-12 23:31:35 -05:00 by GiteaMirror · 0 comments

Original Pull Request: https://github.com/ollama/ollama/pull/4525

State: closed
Merged: No


Why is passing down grammars needed?

Relying on the prompt alone to dictate output structure is unreliable, because the result depends on the model and on sampling randomness, and the instructions take up context space. Grammars are a well-proven way to constrain generated output. In fact, format="JSON" already depends on one, but format="JSON" offers no reliable way to specify large, complex structures and can even be tricked by prompt attacks.

image

Why grammar and not JSON schema?

While JSON schema would make a nice future addition, there's interest in data structures outside of JSON (simple enum values, programming languages, etc.). Also, JSON schema generators fundamentally rely on grammars, so a grammar generated from a JSON schema would benefit from grammar checking as well.

Why not just pass along the grammar to llama.cpp?

I looked into the complexities of passing the grammar straight through to the llama.cpp server. There are a few challenges:

  • The llama.cpp server doesn't return an error when a bad grammar is passed to it with streaming mode on (https://github.com/ggerganov/llama.cpp/issues/7391). It gives an incomprehensible "unexpected EOF".
    image
  • The in-memory model is reused as long as the grammar is valid, even when the grammar changes between requests. However, the in-memory model appears to get reloaded if you send a bad grammar and then follow up with a good one.
    image
  • Reusing the in-memory model appears to work perfectly when only completely valid grammars are passed along (even a variety of different valid grammars).

My conclusion from this, given the advice of the community, is that we do indeed have to do our own GBNF grammar validation on the Go server side, to do our best to prevent bad grammars from being passed down.
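To make the idea of Go-side validation concrete, here is a minimal sketch of a pre-flight check. It is not the validator from this PR: it assumes one rule per line, and it only verifies that every line has the form `name ::= body`, that quoted literals and character classes are terminated, that a `root` rule exists, and that every referenced rule is defined. All function and type names are hypothetical. Like the PR, it avoids regular expressions.

```go
package main

import (
	"fmt"
	"strings"
)

type rule struct {
	line       int
	name, body string
}

func isNameChar(c byte) bool {
	return c == '-' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')
}

// referencedRules returns the rule names used in body, skipping quoted
// literals, character classes, {m,n} repetitions, and trailing comments.
func referencedRules(body string) ([]string, error) {
	closers := map[byte]byte{'"': '"', '[': ']', '{': '}'}
	var refs []string
	for i := 0; i < len(body); {
		c := body[i]
		if closer, ok := closers[c]; ok {
			j := i + 1
			for j < len(body) && body[j] != closer {
				if body[j] == '\\' {
					j++ // skip the escaped character
				}
				j++
			}
			if j >= len(body) {
				return nil, fmt.Errorf("unterminated %q", c)
			}
			i = j + 1
			continue
		}
		if c == '#' {
			return refs, nil // the rest of the line is a comment
		}
		if isNameChar(c) {
			j := i
			for j < len(body) && isNameChar(body[j]) {
				j++
			}
			refs = append(refs, body[i:j])
			i = j
			continue
		}
		i++ // operators, parentheses, whitespace
	}
	return refs, nil
}

// validateGrammar performs the structural checks described above.
func validateGrammar(grammar string) error {
	var rules []rule
	defined := map[string]bool{}
	for n, line := range strings.Split(grammar, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		name, body, ok := strings.Cut(line, "::=")
		name = strings.TrimSpace(name)
		if !ok || name == "" {
			return fmt.Errorf("line %d: expected \"name ::= body\"", n+1)
		}
		for k := 0; k < len(name); k++ {
			if !isNameChar(name[k]) {
				return fmt.Errorf("line %d: invalid rule name %q", n+1, name)
			}
		}
		defined[name] = true
		rules = append(rules, rule{n + 1, name, body})
	}
	if !defined["root"] {
		return fmt.Errorf("grammar has no \"root\" rule")
	}
	for _, r := range rules {
		refs, err := referencedRules(r.body)
		if err != nil {
			return fmt.Errorf("line %d: rule %q: %w", r.line, r.name, err)
		}
		for _, ref := range refs {
			if !defined[ref] {
				return fmt.Errorf("line %d: rule %q references undefined rule %q", r.line, r.name, ref)
			}
		}
	}
	return nil
}

func main() {
	fmt.Println(validateGrammar("root ::= answer\nanswer ::= \"yes\" | \"no\""))
	fmt.Println(validateGrammar("root ::= answer"))
}
```

Rejecting a grammar at this stage lets the server return a readable error to the client instead of forwarding the grammar and surfacing the "unexpected EOF" described above.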


In this PR I've created:

  • the functionality to pass along grammar in chat and completion mode
  • documentation in the readme for the new property
  • prevention of using the grammar and json parameters at the same time
  • validation code for grammars
  • an extensive set of 30+ tests for grammars, covering character classes, strings, internationalization, comments, etc.
  • tests of every known grammar in llama.cpp, plus individual unit tests
  • no use of regular expressions, to keep the parsing clear and understandable

Edge cases:

  • I've probably not implemented everything that's possible in character classes, but I support a limited subset compatible with the grammars listed in llama.cpp. My assumption is that most people's grammars will be less complex than these.
  • There might be some valid grammars I don't currently support, but to the best of my knowledge we support all the major publicly available ones, including ones as complex as the C programming language. I chose not to use a full-on Go parser library because I wanted the cognitive load of this code to be approachable initially, rather than requiring every reader of this code to learn a new library. If we want to replace it with a more formal technology in the future, we can, and the tests can be reused.

Examples of success:

Screenshot 2024-05-19 at 1 29 10 PM Screenshot 2024-05-19 at 1 44 22 PM

Example of failure:

Screenshot 2024-05-19 at 1 28 22 PM

I believe this PR satisfies https://github.com/ollama/ollama/issues/4074 with an acceptable amount of protection from sending invalid GBNF grammars with useful error messages.

GiteaMirror added the pull-request label 2026-04-12 23:31:35 -05:00
Reference: github-starred/ollama#11514