[GH-ISSUE #808] Grammar-guided generation support #26151

Closed
opened 2026-04-22 02:11:59 -05:00 by GiteaMirror · 15 comments
Owner

Originally created by @tmc on GitHub (Oct 16, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/808

Thoughts on introducing a straightforward way for a Modelfile to point to a grammar and thread that through to sampling/inference?
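To make the request concrete, here is a sketch of what such a Modelfile directive could look like. Note that `GRAMMAR` is a hypothetical keyword, not an actual Modelfile instruction, and `json.gbnf` is a placeholder path:

```
# Hypothetical sketch only: GRAMMAR is not a real Modelfile
# instruction; it illustrates the directive this issue asks for.
FROM llama2
GRAMMAR ./json.gbnf
```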

GiteaMirror added the feedback wanted and feature request labels 2026-04-22 02:12:00 -05:00
Author
Owner

@Lwrless commented on GitHub (Oct 17, 2023):

I'd like that feature too.
I see that PR #565 adds grammar support, but it has not been merged yet.

And I have submitted pull request #830 to add some JSON schema support. So for anyone looking for that feature, you might want to check it out. :)

<!-- gh-comment-id:1765490996 -->
Author
Owner

@briancleland commented on GitHub (Oct 17, 2023):

Another vote for this feature to be merged.

<!-- gh-comment-id:1766775015 -->
Author
Owner

@igorschlum commented on GitHub (Oct 17, 2023):

I asked bing for more explanations and I agree.

Okay, I will try to translate what I just said in English, including the description of a Modelfile. Here is what I get:

A Modelfile is a file that contains the parameters of a machine learning model, as I explained to you earlier. A machine learning model is an algorithm that learns from data and that can perform tasks such as classification, regression, generation, etc. A Modelfile allows you to save and load a machine learning model, in order to reuse or share it.

A grammar is a set of rules that define the structure and meaning of a language, such as French or English. Sampling is a method that allows you to generate text from a machine learning model, by randomly choosing the words or characters to add to the text. Inference is a method that allows you to generate text from a machine learning model, by choosing the words or characters that maximize the probability of the text.

What this user would like is that there is a simple way for a Modelfile to indicate which grammar it refers to, and that this grammar is taken into account during sampling or inference. Thus, the generated text would be more coherent and more compliant with the rules of the language.

For example, if the Modelfile corresponds to a model that generates French, it could point to a grammar that specifies the rules of conjugation, agreement, punctuation, etc. of French. During sampling or inference, the model would use this grammar to generate correct text in French.

<!-- gh-comment-id:1766865170 -->
Author
Owner

@zabirauf commented on GitHub (Nov 4, 2023):

+1 to adding this in ollama

<!-- gh-comment-id:1793343078 -->
Author
Owner

@donedgardo commented on GitHub (Nov 5, 2023):

+1 would love to see this added to support memgpt integration

<!-- gh-comment-id:1793870450 -->
Author
Owner

@orkutmuratyilmaz commented on GitHub (Nov 6, 2023):

+1 to this feature with MemGPT integration:)

<!-- gh-comment-id:1794588270 -->
Author
Owner

@ziontee113 commented on GitHub (Nov 6, 2023):

+1 to this feature for MemGPT integration & similar projects.

<!-- gh-comment-id:1795006131 -->
Author
Owner

@adambuttrick commented on GitHub (Nov 23, 2023):

+1 to this feature for my use cases. I get the odd, incorrectly formatted response using JSON mode and the output format in the prompt, so I think this would greatly improve reliability.
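The reliability gap described here is the usual motivation for grammar-constrained decoding: without it, clients end up wrapping model calls in retry-and-parse loops. A minimal sketch of such a loop, where `generate` is a hypothetical stand-in for any LLM call:

```python
import json

def generate_until_valid(generate, prompt, max_retries=3):
    """Retry a model call until its output parses as JSON.

    `generate` is a placeholder for any text-generation call.
    Grammar-constrained decoding would make this loop unnecessary,
    since every sampled token sequence would already be valid JSON.
    """
    for _ in range(max_retries):
        text = generate(prompt)
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            continue
    raise ValueError("no valid JSON after retries")
```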

<!-- gh-comment-id:1824999819 -->
Author
Owner

@tezlm commented on GitHub (Nov 26, 2023):

It would also be nice to be able to change the grammar on the fly, instead of having it baked into a Modelfile.

<!-- gh-comment-id:1826463366 -->
Author
Owner

@technovangelist commented on GitHub (Dec 4, 2023):

The original issue asked for grammar-guided generation. This has been added recently with `format: json`, available at the API and the CLI. You can output well-formed JSON and specify the schema to be used for that output. So I will go ahead and close the issue now. If you think there is anything we left out, reopen and we can address it. Thanks for being part of this great community.
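For reference, a request using the `format: json` option mentioned above would look roughly like this. The model name is an example, and this sketch only builds the request body for Ollama's `/api/generate` endpoint rather than calling a live server:

```python
import json

# Request body for Ollama's /api/generate endpoint with JSON mode.
# "format": "json" asks the server to constrain decoding to valid JSON.
request_body = {
    "model": "llama2",  # example model name
    "prompt": "List three primary colors as a JSON array of strings.",
    "format": "json",
    "stream": False,
}

payload = json.dumps(request_body)
print(payload)
```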

<!-- gh-comment-id:1839438738 -->
Author
Owner

@hobofan commented on GitHub (Dec 8, 2023):

@technovangelist I think this issue should be reopened. `format: json` only allows for specifying structure for JSON output, but does not give access to the underlying `grammar` parameter of llama.cpp to allow for grammar-guided generation according to different GBNF grammars (like implemented in this PR: https://github.com/jmorganca/ollama/pull/565).

E.g. one might want to use a custom GBNF grammar to generate answers in different data formats (e.g. YAML or something more esoteric).
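To illustrate, llama.cpp's GBNF notation can constrain output to formats other than JSON. This toy grammar (rule names are illustrative, not from any shipped grammar file) forces output into `key: value` lines, a tiny YAML-like subset:

```
root  ::= line+
line  ::= key ": " value "\n"
key   ::= [a-z]+
value ::= [a-zA-Z0-9 ]+
```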

<!-- gh-comment-id:1846931832 -->
Author
Owner

@technovangelist commented on GitHub (Dec 9, 2023):

@hobofan ok, that makes sense. Since the original issue wasn't about GBNF specifically and your comment is, would you mind opening a new issue making the case for a lower level GBNF feature? I could make it, but then it may be harder for you to follow up on it.

Thanks so much for clarifying and adding this info.

<!-- gh-comment-id:1847990444 -->
Author
Owner

@James4Ever0 commented on GitHub (Dec 20, 2023):

Grammar support would be awesome, and logits probability response could be a good starting point.

<!-- gh-comment-id:1864781798 -->
Author
Owner

@ggregoire commented on GitHub (Jul 15, 2024):

> The original issue asked for grammar guided generation. This has been added recently with `format: json` available at the API and the CLI. **You can output well-formed JSON and specify the schema to be used for that output.** So I will go ahead and close the issue now. If you think there is anything we left out, reopen and we can address. Thanks for being part of this great community.

Hi @technovangelist ! Maybe I missed it in the docs, but how do you specify the schema? I can't find the option in https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-completion. Also, it seems like the pull request that adds support for the JSON schema has never been merged: https://github.com/ollama/ollama/pull/830 (maybe another one was merged instead?). Thanks!

<!-- gh-comment-id:2229235448 -->
Author
Owner

@Kinglord commented on GitHub (Aug 7, 2024):

Hey all, I know there's an automated ping here but just to better align everyone please check out and comment on my new call to the Ollama team for clarity here. As always please be civil and stay on topic! 😄 - https://github.com/ollama/ollama/issues/6237

<!-- gh-comment-id:2273914325 -->
Reference: github-starred/ollama#26151