[GH-ISSUE #6322] Why must the role be "system", "user", or "assistant"? How can I add a custom role like "tool"? #3969

Closed
opened 2026-04-12 14:50:33 -05:00 by GiteaMirror · 14 comments

Originally created by @zhangsheng377 on GitHub (Aug 12, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6322

https://github.com/ollama/ollama/blob/15c2d8fe149ba2b58aadbab615a6955f8821c7a9/parser/parser.go#L294


@Cephra commented on GitHub (Aug 12, 2024):

You might want to take a look at these:

- https://ollama.com/blog/tool-support
- https://github.com/ollama/ollama-python/blob/main/examples/tools/main.py

To see how tool calling works with ollama.

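For reference, a minimal sketch of the tool-calling flow those links describe, assuming the `ollama` Python package and a tool-capable model such as llama3.1 (named later in this thread); `get_current_weather` is a hypothetical stand-in, and dict-style response access follows the linked example:

```python
# A minimal sketch (not from the thread): tool calling with the ollama
# Python package. get_current_weather() is a hypothetical tool.
import json
import ollama

def get_current_weather(city):
    # Hypothetical stand-in for a real tool.
    return json.dumps({"city": city, "temperature_c": 21})

messages = [{"role": "user", "content": "What is the weather in Paris?"}]

response = ollama.chat(
    model="llama3.1",
    messages=messages,
    tools=[{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)
messages.append(response["message"])

# If the model requested a tool call, run it and feed the result back
# with role "tool" -- the role this issue is asking about.
for call in response["message"].get("tool_calls") or []:
    result = get_current_weather(**call["function"]["arguments"])
    messages.append({"role": "tool", "content": result})

final = ollama.chat(model="llama3.1", messages=messages)
print(final["message"]["content"])
```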

@zhangsheng377 commented on GitHub (Aug 12, 2024):

> You might want to take a look at these:
>
> - https://ollama.com/blog/tool-support
> - https://github.com/ollama/ollama-python/blob/main/examples/tools/main.py
>
> To see how tool calling works with ollama.

I know this, but I still don't think it needs to check the role.
For example, if I want to build a multi-agent setup, I'll assign a different role to each LLM, and there are many more scenarios. In conclusion, I feel that it is necessary to allow users to customize roles.


@rick-github commented on GitHub (Aug 13, 2024):

How does the model know about your customized role? If I create a role called "pink_elephant", how does the model process it?


@zhangsheng377 commented on GitHub (Aug 13, 2024):

> How does the model know about your customized role? If I create a role called "pink_elephant", how does the model process it?

Some roles that involve new knowledge may need fine-tuning, but most can actually be given to the model directly as input. Rest assured, its reasoning ability is enough for it to understand.


@rick-github commented on GitHub (Aug 13, 2024):

I think you overestimate the logical ability of the current set of models.

What problem are you trying to solve?


@zhangsheng377 commented on GitHub (Aug 13, 2024):

> I think you overestimate the logical ability of the current set of models.
>
> What problem are you trying to solve?

I use huggingface (transformers) to run dialogues locally. There are some custom roles in them, and it runs very well locally. But once I switch to ollama, I get an error saying the role is invalid.

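For context on why transformers accepts such messages: a minimal sketch, assuming the `transformers` package and Qwen/Qwen2-7B-Instruct (Qwen2 comes up later in this thread). Many chat templates are Jinja strings that interpolate `message['role']` verbatim, so an unknown role passes straight through into the prompt text:

```python
# A minimal sketch, assuming a model whose Jinja chat template
# interpolates the role string verbatim (Qwen2's does; some
# other templates raise an exception on unknown roles instead).
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")
messages = [
    {"role": "system", "content": "you answer with brevity"},
    {"role": "pink_elephant", "content": "hi, what do you like to do?"},
]
# No role validation happens here: the template just substitutes
# message['role'] into the <|im_start|>...<|im_end|> markup.
print(tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```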

@rick-github commented on GitHub (Aug 13, 2024):

The message roles of "system", "user" and "assistant" are well defined and used anywhere transformer inference is done, because that's how the models are trained. For example, I can't send a message role of "pink_elephant" to OpenAI. Even huggingface models use "system" and "user" in their model cards. If you use roles that the models don't know, then the results may not be as good as they could be. Typically for a multi-agent setup the approach is to create base-level model instances and then layer role-specific information on top:

```
#!/usr/bin/env python3

import ollama

class agent():
  def __init__(self, role, model="llama3.1"):
    self.role = role
    self.model = model
  def generate(self, prompt):
    return ollama.generate(prompt=prompt, system=self.role, model=self.model, stream=False)["response"]

pink_elephant = agent(role="you are a pink elephant - you are imaginary, brightly coloured, and inebriated. you answer questions with brevity")
einstein = agent(role="you are albert einstein, super smart with a fondness for playing violin. you answer questions with brevity")

print(pink_elephant.generate("hi, what do you like to do?"))
print(einstein.generate("hi, what do you like to do?"))
```

```
PARTY! *hiccup* Drink fruity cocktails! Dance on trumpets! Wear sparkly tutus! *slurp*
Play violin. Relaxes the mind. Allows focus on theory.
```


@zhangsheng377 commented on GitHub (Aug 13, 2024):

> The message roles of "system", "user" and "assistant" are well defined and used anywhere transformer inference is done, because that's how the models are trained. For example, I can't send a message role of "pink_elephant" to OpenAI. Even huggingface models use "system" and "user" in their model cards. If you use roles that the models don't know, then the results may not be as good as they could be. Typically for a multi-agent setup the approach is to create base-level model instances and then layer role-specific information on top:
>
> ```
> #!/usr/bin/env python3
>
> import ollama
>
> class agent():
>   def __init__(self, role, model="llama3.1"):
>     self.role = role
>     self.model = model
>   def generate(self, prompt):
>     return ollama.generate(prompt=prompt, system=self.role, model=self.model, stream=False)["response"]
>
> pink_elephant = agent(role="you are a pink elephant - you are imaginary, brightly coloured, and inebriated. you answer questions with brevity")
> einstein = agent(role="you are albert einstein, super smart with a fondness for playing violin. you answer questions with brevity")
>
> print(pink_elephant.generate("hi, what do you like to do?"))
> print(einstein.generate("hi, what do you like to do?"))
> ```
>
> ```
> PARTY! *hiccup* Drink fruity cocktails! Dance on trumpets! Wear sparkly tutus! *slurp*
> Play violin. Relaxes the mind. Allows focus on theory.
> ```

👍~

Well, to be honest, I actually want to write my own [ai_agent process](https://github.com/BZ-coding/ai_agent/blob/main/utils/ai_agent.py) and fine-tune the model myself.
In fact, you are right. For my scenario, I guess I can just use the tool-calling support you suggested (although I haven't tried it yet).

However, I still want to know why you are so resistant to opening up custom roles? It stands to reason that it is just a matter of adding a configuration option to the Modelfile.


@rick-github commented on GitHub (Aug 13, 2024):

It's not a matter of being resistant, it's the way that the models work.

Take a look at the template file for a model. I see qwen2:7b referenced in chatbot.py, so we'll use that:

```
$ ollama show --template qwen2:7b
{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
{{ .Response }}<|im_end|>
```

Conveniently it's a simple template. It's the equivalent of the `chat_template` found in the [tokenizer_config.json](https://huggingface.co/Qwen/Qwen2-7B/blob/main/tokenizer_config.json) of the source model. The purpose of the template is to format the query sent via the API into a form that can be processed into a token stream and fed to the model. The added text (`<|im_start|>`, `system`, `user`, etc.) consists of specific strings that are converted to tokens the model is trained to recognize (you can see these strings mapped to tokens in [tokenizer.json](https://huggingface.co/Qwen/Qwen2-7B/blob/main/tokenizer.json) - warning, large file). These tokens are instrumental in guiding the probabilistic token generation during the output phase. If the model is not trained to recognize a string as a special token, then that string is just a set of characters that the model processes to generate output.

```
$ jq . tokenizer.json | egrep '\s"(user|assistant|system|<\|im_(start|end)\|>)"'
      "content": "<|im_start|>",
      "content": "<|im_end|>",
      "user": 872,
      "system": 8948,
      "assistant": 77091,
```

```
$ jq . tokenizer.json | egrep pink_elephant
```

So, for the pink_elephant agent from the sample python script earlier, the call to the model uses the character stream:

```
<|im_start|>system
you are a pink elephant - you are imaginary, brightly coloured, and inebriated. you answer questions with brevity<|im_end|>
<|im_start|>user
hi, what do you like to do?<|im_end|>
<|im_start|>assistant
<|im_end|>
```

This is tokenized and then fed into the model to generate the response: a sequence of tokens that is de-tokenized into plain text, which is then returned in the API response.

You can in fact skip this templating step and inject the stream directly into the tokenizer by using [raw mode](https://github.com/ollama/ollama/blob/main/docs/api.md#request-raw-mode). Note that each model has a different tokenizer, so the raw query will vary across models.

```
$ curl -s localhost:11434/api/generate -d '{"model":"qwen2:7b","prompt":"<|im_start|>system\nyou are a pink elephant - you are imaginary, brightly coloured, and inebriated. you answer questions with brevity<|im_end|>\n<|im_start|>user\nhi, what do you like to do?<|im_end|>\n<|im_start|>assistant\n","stream":false,"raw":true}' | jq -r .response
Hi there! I like to roam around, play with my rainbow trunk, and munch on cotton candy clouds. Fun times! 🦄✨
```

So it's technically feasible to substitute your own roles into a query, but because the model has no intrinsic understanding of the role name you are using, the results may vary. For example, if I change the `user` role to `tiger`, the model incorporates that into its own worldview rather than treating it as an attribute of the questioner:

```
$ curl -s localhost:11434/api/generate -d '{"model":"qwen2:7b","prompt":"<|im_start|>system\nyou are a pink elephant - you are imaginary, brightly coloured, and inebriated. you answer questions with brevity<|im_end|>\n<|im_start|>tiger\nhi, what do you like to do?<|im_end|>\n<|im_start|>assistant\n","stream":false,"raw":true}' | jq -r .response
Roar loudly and chase prey while lounging in the sun!
```
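For readers who prefer Python over curl, a minimal sketch of the same raw-mode call, assuming only the `requests` package and ollama's default localhost:11434 endpoint; the payload is the one from the curl example above:

```python
# A minimal sketch of the raw-mode call above, using requests.
import requests

prompt = (
    "<|im_start|>system\n"
    "you are a pink elephant - you are imaginary, brightly coloured, and "
    "inebriated. you answer questions with brevity<|im_end|>\n"
    "<|im_start|>user\nhi, what do you like to do?<|im_end|>\n"
    "<|im_start|>assistant\n"
)

# raw=True bypasses the model's template, so the role markers above
# are handed to the tokenizer exactly as written.
r = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "qwen2:7b", "prompt": prompt, "stream": False, "raw": True},
)
print(r.json()["response"])
```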

@zhangsheng377 commented on GitHub (Aug 13, 2024):

That's true, but I will fine-tune the model myself, convert it to GGUF, and load it in ollama.

https://github.com/BZ-coding/ai_agent/blob/54e8cbeb9cfff59f96d6772135725c6b62418d33/utils/chatbot.py#L7
Just like the chinese-llama3 model, it is fine-tuned from llama3.

https://github.com/BZ-coding/ai_agent
You can open a translation and read the project introduction. I do plan to train a model that can use tools.


@zhangsheng377 commented on GitHub (Aug 13, 2024):

To put it another way, I think it's up to the Modelfile to tell Ollama what roles the model supports (whether it's been fine-tuned or not), rather than Ollama just assuming them.


@jmorganca commented on GitHub (Sep 4, 2024):

Hi @zhangsheng377, you can define a custom `TEMPLATE` (see https://github.com/ollama/ollama/blob/main/docs/template.md) and try passing custom role names to the `/api/chat` endpoint. That may work. Otherwise, you could also try using `raw` mode with `/api/generate`, which would allow you to use any prompt format you'd like. Hope this helps – happy to shed more light on this.

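To make that suggestion concrete: a minimal sketch of a Modelfile whose template loops over all messages and interpolates each role verbatim, in the ChatML style shown earlier in this thread. The base model is illustrative, and whether `/api/chat` passes arbitrary roles through to such a template depends on the ollama version:

```
FROM qwen2:7b

# Render every message with its role string verbatim, ChatML-style,
# instead of hard-coding system/user/assistant. Illustrative only.
TEMPLATE """{{- range .Messages }}<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
```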

@zhangsheng377 commented on GitHub (Sep 4, 2024):

> Hi @zhangsheng377, you can define a custom `TEMPLATE` (see https://github.com/ollama/ollama/blob/main/docs/template.md) and try passing custom role names to the `/api/chat` endpoint. That may work. Otherwise, you could also try using `raw` mode with `/api/generate`, which would allow you to use any prompt format you'd like. Hope this helps – happy to shed more light on this.

But my requirement is to use the "tool" role type through an OpenAI-compatible interface.

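For what it's worth, ollama's OpenAI-compatible endpoint does accept `tool` messages in later releases. A minimal sketch, assuming the `openai` Python package, ollama's documented `/v1` base URL, and a tool-capable model; the call id and arguments are illustrative, and acceptance of the `tool` role is version-dependent:

```python
# A minimal sketch, assuming the openai package pointed at ollama's
# OpenAI-compatible endpoint; whether role "tool" is accepted depends
# on the ollama version and the model.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

resp = client.chat.completions.create(
    model="llama3.1",
    messages=[
        {"role": "user", "content": "What is the weather in Paris?"},
        # A previous assistant turn that requested a tool call
        # (id and arguments here are illustrative).
        {"role": "assistant", "tool_calls": [{
            "id": "call_0",
            "type": "function",
            "function": {"name": "get_current_weather",
                         "arguments": '{"city": "Paris"}'},
        }]},
        # The tool result is fed back with the "tool" role.
        {"role": "tool", "tool_call_id": "call_0",
         "content": '{"city": "Paris", "temperature_c": 21}'},
    ],
)
print(resp.choices[0].message.content)
```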

@xylobol commented on GitHub (Aug 15, 2025):

https://ollama.com/library/granite3.3

Granite 3.3 models require a message with the role 'control' to enable thinking. Since I'm trying to access them from an Open-WebUI "model", I can't use `/api/generate`.

For anyone else trying to solve this particular problem, you can use this: https://openwebui.com/f/adamoutler/granite_thinking_filter


Reference: github-starred/ollama#3969