[GH-ISSUE #3616] Richer grammars #64267

Closed
opened 2026-05-03 16:50:40 -05:00 by GiteaMirror · 9 comments
Owner

Originally created by @tezlm on GitHub (Apr 12, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3616

What are you trying to do?

Being able to specify grammars is great, but it seems a bit underutilized at the moment. This is mostly a thought dump on how it could be improved, based on experimentation.

How should we solve this?

  • Using llama.cpp grammars directly would be powerful and nice to have.
  • Supporting JSON Schema for JSON output: llama.cpp's JSON output is usually forced into a specific key order, and Ollama's JSON output isn't constrained by a schema at all.
  • Changing the format on the fly is useful, but it would also be nice to have a way to specify a grammar in the Modelfile.

What is the impact of not solving this?

Not having either of the first two ideas is annoying, since there's no way to guarantee that a model generates a response in the format I want. The third idea allows one to build an "LLM API", where a model generates the same response shape every time (imagine bundling a "summary llm" that always responds with {"summary":"..."} as a Modelfile).
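To illustrate the third idea, here's a sketch of what such a "summary llm" Modelfile might look like. This is hypothetical: stock Ollama has no `grammar` parameter; the syntax assumes something like the patch proposed in the PRs discussed below.

```
# Hypothetical Modelfile for a "summary llm" that always emits {"summary":"..."}
# NOTE: the `grammar` parameter does not exist in stock Ollama; it assumes a
# patched build that forwards a GBNF grammar to llama.cpp.
FROM llama3:8b
SYSTEM You summarize the user's message in one sentence.
PARAMETER grammar "root ::= \"{\\\"summary\\\":\\\"\" [^\"]* \"\\\"}\""
```

Every request to this model would then be constrained to the `{"summary":"..."}` shape regardless of the prompt.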

Anything else?

No response


@ravenscroftj commented on GitHub (Apr 14, 2024):

There are a few open PRs for this behaviour, the most recent being https://github.com/ollama/ollama/pull/3618; it would be amazing to get it merged. It's a two-line change that exposes the llama.cpp GBNF functionality via Modelfile parameters. It's not my patch, but I've compiled it and used it locally and it works really well.


@rhohndorf commented on GitHub (Apr 21, 2024):

Specifying grammars in the Modelfile is one thing, but it would be much more useful to be able to send a grammar string in the request, similar to the llama.cpp server.
Is that possible with Ollama now? There's nothing about grammars in the API docs.


@ravenscroftj commented on GitHub (Apr 21, 2024):

Yes, with the patch I linked above applied, you can send the grammar as an option when you submit a request. It just isn't documented!

Here's an example:

POST http://localhost:11434/api/chat

{
  "model":"llama3:8b",
  "stream": false,
  "messages":[
    {"role":"user", "content": "The sky is blue, true or false?"}
  ],
  "options":{
    "grammar": "root ::= (\"true\" | \"false\")"
  }
}

Response:

{
  "model": "llama3:8b",
  "created_at": "2024-04-21T20:53:15.212659393Z",
  "message": {
    "role": "assistant",
    "content": "true"
  },
  "done": true,
  "total_duration": 545867966,
  "load_duration": 5912270,
  "prompt_eval_duration": 213384000,
  "eval_count": 2,
  "eval_duration": 202538000
}
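The same call can be sketched in Python using only the standard library. This assumes a patched build of Ollama that accepts the `grammar` option (it is not part of the stock API); the payload builder is kept separate from the network call so it can be inspected without a running server.

```python
import json
import urllib.request

def build_chat_request(model, prompt, grammar):
    """Build a /api/chat payload carrying the patch-only `grammar` option."""
    return {
        "model": model,
        "stream": False,
        "messages": [{"role": "user", "content": prompt}],
        "options": {"grammar": grammar},
    }

def send_chat(payload, host="http://localhost:11434"):
    """POST the payload to a local Ollama server and decode the JSON reply."""
    req = urllib.request.Request(
        host + "/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Usage (requires a running server built with the grammar patch):
# payload = build_chat_request("llama3:8b",
#                              "The sky is blue, true or false?",
#                              'root ::= ("true" | "false")')
# print(send_chat(payload)["message"]["content"])
```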

@markcda commented on GitHub (Apr 22, 2024):

It just isn't documented!

So, I can add docs about it in #3618.

UPD: done.


@rhohndorf commented on GitHub (Apr 25, 2024):

That's really cool, though as far as I can see it's not merged into main yet.


@Nidvogr commented on GitHub (Apr 28, 2024):

This would be so nice to have: one step closer to ditching my custom Python bindings for llama.cpp and using Ollama, now that it has also started supporting concurrent models. Sending a grammar together with the request is a great feature to support.


@mitar commented on GitHub (Jun 27, 2024):

I added support for this and JSON schema in #5348.


@Kinglord commented on GitHub (Aug 7, 2024):

Hey all, I know there's an automated ping here, but to better align everyone, please check out and comment on my new call to the Ollama team for clarity. As always, please be civil and stay on topic! 😄 - https://github.com/ollama/ollama/issues/6237


@ParthSareen commented on GitHub (Dec 5, 2024):

Hey! Thanks for raising this! Going to close this out, as we're supporting structured outputs through https://github.com/ollama/ollama/pull/7900.

Left a comment with some background as well: https://github.com/ollama/ollama/issues/6237#issuecomment-2518836758
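For reference, the structured-outputs feature that closed this issue lets a request pass a JSON Schema object in the `format` field of `/api/chat` (instead of the literal string `"json"`). A minimal sketch of building such a payload, under that assumption:

```python
import json

def build_structured_request(model, prompt, schema):
    """Build a /api/chat payload whose `format` field carries a JSON Schema,
    per the structured-outputs feature (instead of the literal "json")."""
    return {
        "model": model,
        "stream": False,
        "messages": [{"role": "user", "content": prompt}],
        "format": schema,
    }

# Schema for the "summary llm" shape from the original issue.
summary_schema = {
    "type": "object",
    "properties": {"summary": {"type": "string"}},
    "required": ["summary"],
}

payload = build_structured_request(
    "llama3:8b", "Summarize: the sky is blue.", summary_schema
)
print(json.dumps(payload, indent=2))
```

With a schema in `format`, the server constrains generation so the response body is valid against it, which covers the "summary llm" use case without GBNF.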


Reference: github-starred/ollama#64267