[GH-ISSUE #5736] bug: Open WebUI RAG Malfunction with Ollama Versions Post 0.2.1 #3571

New Issue

GiteaMirror · 2026-04-12T14:18:03-05:00

GiteaMirror commented

2026-04-12 14:18:03 -05:00

Originally created by @silentoplayz on GitHub (Jul 17, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5736

What is the issue?

Summary:

Retrieval-Augmented Generation (RAG) functionality within Open WebUI breaks when using Ollama versions later than 0.2.1 for local models. While external models (e.g., GroqCloud's LLama 3 8B) function correctly with RAG, local models fail to utilize the selected document, returning irrelevant or fabricated information. This issue occurs with both SentenceTransformers and Ollama RAG embedding models.

Affected Versions:

Ollama: 0.2.2, 0.2.3, 0.2.4, 0.2.5, 0.2.6, 0.2.7, 0.2.8
Open WebUI: Latest dev and main branches

Unaffected Versions:

Ollama: versions prior to 0.2.1
Open WebUI: ?

Steps to Reproduce:

Clean Slate:
- Downgrade Ollama to version 0.2.0 (ollama --version).
- In Open WebUI, clear all documents from the Workspace > Documents tab.
- Navigate to Admin Panel > Settings > Documents and click Reset Upload Directory and Reset Vector Storage.
Successful RAG Test (Ollama 0.2.0 & 0.2.1):
- Add a .txt document to the Open WebUI Documents workspace.
- Start a new chat and select the document using the # key.
- Input a query related to the document content.
- Verify that both local and external LLMs respond accurately, incorporating information from the selected document.
- Repeat steps 1 & 2 for Ollama version 0.2.1 after upgrading (ollama --version).
Failing RAG Test (Ollama 0.2.2 onwards):
- Upgrade Ollama to version 0.2.2 (ollama --version).
- Start a new chat, select the same document from step 2 using the # key, and input the same query.
- Observe that local LLMs fail to utilize the document content, providing irrelevant or fabricated responses.
- Verify that external LLMs still function correctly with RAG.
- Repeat step 3 for Ollama versions 0.2.3-0.3.0, observing the same behavior.

Expected Behavior:
Local LLMs should successfully utilize the selected document for RAG, providing accurate and relevant responses based on its content, regardless of the Ollama version used.

Actual Behavior:
Local LLMs fail to perform RAG accurately when using Ollama versions 0.2.2 and later, while external models remain unaffected. This occurs despite successful document loading and embedding generation (confirmed by testing with both SentenceTransformers and Ollama embedding models).

Additional Notes:

The issue persists across multiple attempts, regenerations, and message edits.
The problem is not specific to a particular document or query, as it consistently occurs with different types of documents.
Resetting the Open WebUI upload directory and vector storage, as well as re-uploading documents, does not resolve the issue.
The issue is not related to Tika document extraction for RAG within Open WebUI, as confirmed through testing.
Downgrading Ollama to version 0.2.0 completely resolves the RAG malfunction within Open WebUI.

Conclusion:
A regression appears to have been introduced in Ollama versions after 0.2.1, specifically impacting the interaction between Ollama and Open WebUI for local model RAG functionality. This issue necessitates investigation and resolution to ensure the proper functioning of RAG across all supported Ollama versions.

The maintainer of Open WebUI has also confirmed this bug on the latest version of Open WebUI in combination with the latest version of Ollama:

Latest Open WebUI RAG + Ollama v0.2.2 (failures 100% of the time with local models it seems):

OS

Windows, Docker

GPU

AMD RX 6800 XT

CPU

Intel i7-12700K

Ollama version

0.2.2, 0.2.3, 0.2.4, 0.2.5, 0.2.6, 0.2.7, 0.2.8, 0.2.9, 0.3.0 (latest)

Originally created by @silentoplayz on GitHub (Jul 17, 2024). Original GitHub issue: https://github.com/ollama/ollama/issues/5736 ### What is the issue? **Summary:** Retrieval-Augmented Generation (RAG) functionality within Open WebUI breaks when using Ollama versions later than 0.2.1 for local models. While external models (e.g., GroqCloud's LLama 3 8B) function correctly with RAG, local models fail to utilize the selected document, returning irrelevant or fabricated information. This issue occurs with both `SentenceTransformers` and `Ollama` RAG embedding models. **Affected Versions:** * Ollama: 0.2.2, 0.2.3, 0.2.4, 0.2.5, 0.2.6, 0.2.7, 0.2.8 * Open WebUI: Latest `dev` and `main` branches **Unaffected Versions:** * Ollama: versions prior to 0.2.1 * Open WebUI: ? **Steps to Reproduce:** 1. **Clean Slate:** * Downgrade Ollama to version 0.2.0 (`ollama --version`). * In Open WebUI, clear all documents from the `Workspace` > `Documents` tab. * Navigate to `Admin Panel` > `Settings` > `Documents` and click `Reset Upload Directory` and `Reset Vector Storage`. 2. **Successful RAG Test (Ollama 0.2.0 & 0.2.1):** * Add a `.txt` document to the Open WebUI `Documents` workspace. * Start a new chat and select the document using the `#` key. * Input a query related to the document content. * Verify that both local and external LLMs respond accurately, incorporating information from the selected document. * Repeat steps 1 & 2 for Ollama version 0.2.1 after upgrading (`ollama --version`). 3. **Failing RAG Test (Ollama 0.2.2 onwards):** * Upgrade Ollama to version 0.2.2 (`ollama --version`). * Start a new chat, select the same document from step 2 using the `#` key, and input the same query. * Observe that local LLMs fail to utilize the document content, providing irrelevant or fabricated responses. * Verify that external LLMs still function correctly with RAG. * Repeat step 3 for Ollama versions 0.2.3-0.3.0, observing the same behavior. **Expected Behavior:** Local LLMs should successfully utilize the selected document for RAG, providing accurate and relevant responses based on its content, regardless of the Ollama version used. **Actual Behavior:** Local LLMs fail to perform RAG accurately when using Ollama versions 0.2.2 and later, while external models remain unaffected. This occurs despite successful document loading and embedding generation (confirmed by testing with both `SentenceTransformers` and `Ollama` embedding models). **Additional Notes:** * The issue persists across multiple attempts, regenerations, and message edits. * The problem is not specific to a particular document or query, as it consistently occurs with different types of documents. * Resetting the Open WebUI upload directory and vector storage, as well as re-uploading documents, does not resolve the issue. * The issue is not related to `Tika` document extraction for RAG within Open WebUI, as confirmed through testing. * Downgrading Ollama to version 0.2.0 completely resolves the RAG malfunction within Open WebUI. **Conclusion:** A regression appears to have been introduced in Ollama versions after 0.2.1, specifically impacting the interaction between Ollama and Open WebUI for local model RAG functionality. This issue necessitates investigation and resolution to ensure the proper functioning of RAG across all supported Ollama versions. ### Related issue on the Open WebUI repo: https://github.com/open-webui/open-webui/discussions/3907 The maintainer of Open WebUI has also confirmed this bug on the latest version of Open WebUI in combination with the latest version of Ollama: ![Screenshot 2024-07-16 185540](https://github.com/user-attachments/assets/206d0c37-237a-476d-87d4-1c44fa4ebe7b) ![Screenshot 2024-07-16 185651](https://github.com/user-attachments/assets/8874ef89-bfa9-42e7-abd3-fb76a093e279) Latest Open WebUI RAG + Ollama v0.2.2 (failures 100% of the time with local models it seems): ![image](https://github.com/user-attachments/assets/9b4872a9-2912-4820-bd6e-b4f567aecc00) ### OS Windows, Docker ### GPU AMD RX 6800 XT ### CPU Intel i7-12700K ### Ollama version 0.2.2, 0.2.3, 0.2.4, 0.2.5, 0.2.6, 0.2.7, 0.2.8, 0.2.9, 0.3.0 (latest)

GiteaMirror added the bug label 2026-04-12 14:18:03 -05:00

GiteaMirror closed this issue

2026-04-12 14:18:04 -05:00

GiteaMirror commented

2026-04-12 14:18:05 -05:00

@Qualzz commented on GitHub (Jul 17, 2024):

Confirming the issue on my side too with the last update.

Nvidia (4090)
AMD Ryzen 7950 X

@Qualzz commented on GitHub (Jul 17, 2024): Confirming the issue on my side too with the last update. Nvidia (4090) AMD Ryzen 7950 X

GiteaMirror commented

2026-04-12 14:18:05 -05:00

@EncodedBird commented on GitHub (Jul 17, 2024):

To help with variables, I also have the issue on arch linux with a non-docker install.

AMD RX 6900 XT
CPU AMD Ryzen 5950X

@EncodedBird commented on GitHub (Jul 17, 2024): To help with variables, I also have the issue on arch linux with a non-docker install. AMD RX 6900 XT CPU AMD Ryzen 5950X

GiteaMirror commented

2026-04-12 14:18:05 -05:00

@zouzeTG commented on GitHub (Jul 17, 2024):

Same issue for me !
I use GPU NVIDIA A100 and Intel CPU with Windows server 2022

@zouzeTG commented on GitHub (Jul 17, 2024): Same issue for me ! I use GPU NVIDIA A100 and Intel CPU with Windows server 2022

GiteaMirror commented

2026-04-12 14:18:06 -05:00

@zouzeTG commented on GitHub (Jul 17, 2024):

I'm going to do more test, because for me I have also encontered bug wtih the version 2.1. with docx and xls files.
But I'm not sure

@zouzeTG commented on GitHub (Jul 17, 2024): I'm going to do more test, because for me I have also encontered bug wtih the version 2.1. with docx and xls files. But I'm not sure

GiteaMirror commented

2026-04-12 14:18:06 -05:00

@silentoplayz commented on GitHub (Jul 18, 2024):

Update/Bump:

After thorough testing, it has been determined that setting the Top K value within Open WebUI's Documents settings to a value of 1 resolves compatibility issues with RAG when using Ollama versions 0.2.1 and later.

Additionally, configuring the context length for your RAG model to a higher number, such as 8192, has been found to maintain functionality with these specific Ollama versions.

These observations are based on empirical data collected by the maintainer of Open WebUI and myself following rigorous testing. The provided screenshots serve as visual evidence supporting these findings:

Setting to adjust (Change this value to 1):

Open WebUI maintainer's findings:

RAG working with context length set to 8192 (tested on Ollama v0.2.5 with latest dev commit of Open WebUI):

RAG working with Top K set to 1 within Open WebUI's Documents settings (tested on Ollama v0.2.5 with latest dev commit of Open WebUI):

We believe the information presented is crucial for addressing the bug report at hand and ensures that future installations and updates of Ollama maintains optimal compatibility with Open WebUI's RAG functionalities.

@silentoplayz commented on GitHub (Jul 18, 2024): **Update/Bump**: After thorough testing, it has been determined that setting the `Top K` value within Open WebUI's `Documents` settings to a value of `1` resolves compatibility issues with RAG when using Ollama versions 0.2.1 and later. Additionally, configuring the context length for your RAG model to a higher number, such as `8192`, has been found to maintain functionality with these specific Ollama versions. These observations are based on empirical data collected by the maintainer of Open WebUI and myself following rigorous testing. The provided screenshots serve as visual evidence supporting these findings: Setting to adjust (Change this value to `1`): ![openweb1](https://github.com/user-attachments/assets/c299f84f-000a-4aaa-a06c-d9a33971dca5) Open WebUI maintainer's findings: ![openweb](https://github.com/user-attachments/assets/f6fa1891-c7e9-4ee7-84f8-aa894b74aa7b) RAG working with context length set to `8192` (**tested on Ollama v0.2.5 with latest dev commit of Open WebUI**): ![ragworkedhere](https://github.com/user-attachments/assets/8391422d-0c9c-4777-a3cf-81629845edf3) RAG working with `Top K` set to `1` within Open WebUI's `Documents` settings (**tested on Ollama v0.2.5 with latest dev commit of Open WebUI**): ![ragworkedheretoo](https://github.com/user-attachments/assets/3ab440b9-a802-4b13-9752-8a646be931db) We believe the information presented is crucial for addressing the bug report at hand and ensures that future installations and updates of Ollama maintains optimal compatibility with Open WebUI's RAG functionalities.

GiteaMirror commented

2026-04-12 14:18:07 -05:00

@Qualzz commented on GitHub (Jul 18, 2024):

On my end, I have a very simple prompt, total token count of the document is around 2k token, slighty more than the 2048 default. However, even if the model is set to 4096, or if in the "current chat settings" it's set to 4096, the issue is still there. Using 8112 works. But 4096 should be way more than enough for my query.

@silentoplayz it seems that using the context length of the "current chat settings" on the sidebar, doesn't have any effect. I can set it to 1, 5, 654654654165, the model will answer the ragquery only depending on the settings I've set in the model settings ( workspace>model>edit model>advanced settings.

@Qualzz commented on GitHub (Jul 18, 2024): On my end, I have a very simple prompt, total token count of the document is around 2k token, slighty more than the 2048 default. However, even if the model is set to 4096, or if in the "current chat settings" it's set to 4096, the issue is still there. Using 8112 works. But 4096 should be way more than enough for my query. @silentoplayz it seems that using the context length of the "current chat settings" on the sidebar, doesn't have any effect. I can set it to 1, 5, 654654654165, the model will answer the ragquery only depending on the settings I've set in the model settings ( workspace>model>edit model>advanced settings.

GiteaMirror commented

2026-04-12 14:18:07 -05:00

@silentoplayz commented on GitHub (Jul 18, 2024):

On my end, I have a very simple prompt, total token count of the document is around 2k token, slighty more than the 2048 default. However, even if the model is set to 4096, or if in the "current chat settings" it's set to 4096, the issue is still there. Using 8112 works. But 4096 should be way more than enough for my query.

@silentoplayz it seems that using the context length of the "current chat settings" on the sidebar, doesn't have any effect. I can set it to 1, 5, 654654654165, the model will answer the rag query only depending on the settings I've set in the model settings ( workspace>model>edit model>advanced settings.

It is odd that adjusting the Context Length value within the Chat Controls on the right-hand sidebar doesn't have any effect for you. I have noticed it works to "fix" RAG for me within Open WebUI, even just by raising the context length to 4096 in some cases.

I also came to report that the same set of issues I've reported here appear to be present in recently released versions of Ollama: v0.2.6 and v0.2.7.

@silentoplayz commented on GitHub (Jul 18, 2024): > On my end, I have a very simple prompt, total token count of the document is around 2k token, slighty more than the 2048 default. However, even if the model is set to 4096, or if in the "current chat settings" it's set to 4096, the issue is still there. Using 8112 works. But 4096 should be way more than enough for my query. > > @silentoplayz it seems that using the context length of the "current chat settings" on the sidebar, doesn't have any effect. I can set it to 1, 5, 654654654165, the model will answer the rag query only depending on the settings I've set in the model settings ( workspace>model>edit model>advanced settings. It is odd that adjusting the `Context Length` value within the Chat Controls on the right-hand sidebar doesn't have any effect for you. I have noticed it works to "fix" RAG for me within Open WebUI, even just by raising the context length to 4096 in some cases. I also came to report that the same set of issues I've reported here appear to be present in recently released versions of Ollama: v0.2.6 and v0.2.7.

GiteaMirror commented

2026-04-12 14:18:08 -05:00

@Qualzz commented on GitHub (Jul 23, 2024):

any update ?

@Qualzz commented on GitHub (Jul 23, 2024): any update ?

GiteaMirror commented

2026-04-12 14:18:08 -05:00

@zouzeTG commented on GitHub (Jul 23, 2024):

Hi, any news with this problem ?

@zouzeTG commented on GitHub (Jul 23, 2024): Hi, any news with this problem ?

GiteaMirror commented

2026-04-12 14:18:09 -05:00

@ToeiRei commented on GitHub (Jul 25, 2024):

Looks like 0.3.0 still has that problem

@ToeiRei commented on GitHub (Jul 25, 2024): Looks like 0.3.0 still has that problem

GiteaMirror commented

2026-04-12 14:18:09 -05:00

@Qualzz commented on GitHub (Jul 25, 2024):

is this a openweb UI issue then ?

@Qualzz commented on GitHub (Jul 25, 2024): is this a openweb UI issue then ?

GiteaMirror commented

2026-04-12 14:18:10 -05:00

@silentoplayz commented on GitHub (Jul 25, 2024):

is this a openweb UI issue then ?

I honestly couldn't tell you. The maintainer of Open WebUI has been busy IRL lately and won't be available to check things out until at least the 1st of August. I've not heard anything from the Ollama team about this bug report.

But if I had to choose, it does appear to be an Ollama issue, as RAG works perfectly fine within the latest version of Open WebUI when paired with Ollama v0.2.1 or Ollama v0.2.0 and any versions before these two versions.

@silentoplayz commented on GitHub (Jul 25, 2024): > is this a openweb UI issue then ? I honestly couldn't tell you. The maintainer of Open WebUI has been busy IRL lately and won't be available to check things out until at least the 1st of August. I've not heard anything from the Ollama team about this bug report. But if I had to choose, it does appear to be an Ollama issue, as RAG works perfectly fine within the latest version of Open WebUI when paired with Ollama v0.2.1 or Ollama v0.2.0 and any versions before these two versions.

GiteaMirror commented

2026-04-12 14:18:11 -05:00

@ToeiRei commented on GitHub (Jul 26, 2024):

I have tried it on multiple systems now: The jetson container (arm64) wasn't affected. Looks like it's x86_64 that's acting up.

@ToeiRei commented on GitHub (Jul 26, 2024): I have tried it on multiple systems now: The jetson container (arm64) wasn't affected. Looks like it's x86_64 that's acting up.

GiteaMirror commented

2026-04-12 14:18:12 -05:00

@silentoplayz commented on GitHub (Jul 27, 2024):

Upon conducting additional testing on the dev branch of Open WebUI and the latest version of Ollama today, I've come to the conclusion that I need to adjust the context length of models within the recently added Chat Controls right-hand sidebar within Open WebUI or within the modelfile for a model in the Models section of the Workspace tab for RAG to tell me about documents. This is what I've come to find. It's still a bug to me, as RAG functioned effectively previously, without any need for alterations; and I've talked with Open WebUI contributors enough to find out that this is not normal behavior.

The solution involves setting the context length in the model file to an appropriate value such as 4096 or 8192 within Open WebUI. RAG should just work optimally again (hopefully), without further complications.

C:\Users\G30>ollama --version
ollama version is 0.3.0

@silentoplayz commented on GitHub (Jul 27, 2024): Upon conducting additional testing on the dev branch of Open WebUI and the latest version of Ollama today, I've come to the conclusion that I ***need*** to adjust the context length of models within the recently added `Chat Controls` right-hand sidebar within Open WebUI or within the modelfile for a model in the `Models` section of the `Workspace` tab for RAG to tell me about documents. This is what I've come to find. It's still a bug to me, as RAG functioned effectively previously, without any need for alterations; and I've talked with Open WebUI contributors enough to find out that this is not normal behavior. The solution involves setting the context length in the model file to an appropriate value such as `4096` or `8192` within Open WebUI. RAG should just work optimally again (hopefully), without further complications. ``` C:\Users\G30>ollama --version ollama version is 0.3.0 ```

GiteaMirror commented

2026-04-12 14:18:12 -05:00

@silentoplayz commented on GitHub (Jul 27, 2024):

Update: A workaround fix has been applied in the latest dev branch of Open WebUI, resurrecting RAG functionality back to the way it worked before; and that is for it to essentially work out of the box. It's a bit sneaky/hacky of a fix, but it DOES work!

So what changed? Basically, instead of using the system prompt to provide context to the LLM, Open WebUI now injects the context to the user prompt. We specifically rely on system prompts to provide context, so this fix isn't an ideal fix, but it's been applied as a workaround fix for the time being.

Commit: 1aaa2e8219

I will be closing this issue, lest someone else comes along to report that they have issues beyond this workaround fix.

@silentoplayz commented on GitHub (Jul 27, 2024): **Update**: A workaround fix has been applied in the latest dev branch of Open WebUI, resurrecting RAG functionality back to the way it worked before; and that is for it to essentially work out of the box. It's a bit sneaky/hacky of a fix, but it DOES work! So what changed? Basically, instead of using the system prompt to provide context to the LLM, Open WebUI now injects the context to the user prompt. We specifically rely on system prompts to provide context, so this fix isn't an ideal fix, but it's been applied as a workaround fix for the time being. Commit: https://github.com/open-webui/open-webui/commit/1aaa2e8219b5213725e137c65424f6cacab89b6b I will be closing this issue, lest someone else comes along to report that they have issues beyond this workaround fix.

GiteaMirror commented

2026-04-12 14:18:13 -05:00

@windowshopr commented on GitHub (Dec 8, 2024):

I will say that I was having issues with this too, ollama LLMs not reading the provided documents at all and generating bad answers. I tried providing the context as a user prompt instead of a system prompt, but the issue persisted so I don't think it's a prompt issue, BUT changing the context length in the advanced options helped. So looking into the max context length of the LLM you're using and setting that option to a multiple of that helps. Not sure what the "magic" amount is, but that helped me.

@windowshopr commented on GitHub (Dec 8, 2024): I will say that I was having issues with this too, ollama LLMs not reading the provided documents at all and generating bad answers. I tried providing the context as a user prompt instead of a system prompt, but the issue persisted so I don't think it's a prompt issue, BUT changing the context length in the advanced options helped. So looking into the max context length of the LLM you're using and setting that option to a multiple of that helps. Not sure what the "magic" amount is, but that helped me.

GiteaMirror referenced this issue

2026-04-22 05:37:54 -05:00

[GH-ISSUE #3571] Let ollama's locally started api server support cross-domain!! #27963

GiteaMirror referenced this issue

2026-04-22 06:10:21 -05:00

[GH-ISSUE #4001] CORS configuration error blocking authorization in Ollama's OpenAI compatible endpoint #28238

GiteaMirror referenced this issue

2026-04-28 09:07:48 -05:00

[GH-ISSUE #3571] Let ollama's locally started api server support cross-domain!! #48715

GiteaMirror referenced this issue

2026-04-28 10:33:40 -05:00

[GH-ISSUE #4001] CORS configuration error blocking authorization in Ollama's OpenAI compatible endpoint #48990

GiteaMirror referenced this issue

2026-05-03 16:43:45 -05:00

[GH-ISSUE #3571] Let ollama's locally started api server support cross-domain!! #64241

GiteaMirror referenced this issue

2026-05-03 17:57:13 -05:00

[GH-ISSUE #4001] CORS configuration error blocking authorization in Ollama's OpenAI compatible endpoint #64516

GiteaMirror referenced this issue

2026-05-09 07:07:15 -05:00

[GH-ISSUE #3571] Let ollama's locally started api server support cross-domain!! #79882

GiteaMirror referenced this issue

2026-05-09 08:20:23 -05:00

[GH-ISSUE #4001] CORS configuration error blocking authorization in Ollama's OpenAI compatible endpoint #80159

Sign in to join this conversation.

Branches Tags

main

parth-remove-ollama-agent-command

parth-agent-harness-skills-synthetic-tool

hoyyeva/fix-anthropic-text-before-thinking

parth-agent-cli-markdown-rendering

mxyng/docs-cloud

parth-update-hermes-launch

hoyyeva/vscode-extension-docs-update

parth-gemma4-chat-template-renderer

parth-api-status-context-length

hoyyeva/wire-up-context-length

hoyyeva/claude-code-context-doc

jmorganca/investigate-issue-17046

hoyyeva/hermes-docs

jmorganca/agent-loop-style

hoyyeva/openclaw

parth-agent-loop

hoyyeva/ollama-vscode-extension

brucemacd/cache-metrics

brucemacd/hermes-desktop

hoyyeva/docs-vscode

parth-input-style-experiment

brucemacd/docs-glm52

hoyyeva/poc-docs

Parth/mlx-launch-recommendations

parth-first-time-app-cli-experience

test/darwin-xcode-pin

improve-cloud-model-recommendations

hoyyeva/goose-docs

jmorganca/context-limit-fixes

hoyyeva/qwen-doc

hoyyeva/vscode-docs

jmorganca/remove-mlx-imagegen-code

parth-copilot-token-length-defaults

hoyyeva/poolside-windows

laguna-support

jmorganca/harden-markdown-rendering

laguna-renderer-parser

laguna-llamacpp

codex/make-integration-hidden-and-lunchable

brucemacd/omp-docs

pdevine/gguf-mtp-oldstyle

hoyyeva/migrate-pi

hoyyeva/anthropic-local-image-path

parth-launch-codex-app

hoyyeva/anthropic-reference-images-path

parth-anthropic-reference-images-path

brucemacd/download-before-remove

hoyyeva/editor-config-repair

parth-mlx-decode-checkpoints

parth/hide-claude-desktop-till-release

parth-add-claude-code-autoinstall

release_v0.22.0

pdevine/manifest-list

codex/fix-codex-model-metadata-warning

pdevine/addressable-manifest

brucemacd/launch-fetch-reccomended

jmorganca/llama-compat

launch-copilot-cli

release_v0.20.7

parth-auto-save-backup

parth-test

jmorganca/gemma4-audio-replacements

fix-manifest-digest-on-pull

hoyyeva/vscode-improve

brucemacd/install-server-wait

parth/update-claude-docs

brucemac/start-ap-install

pdevine/mlx-update

pdevine/qwen35_vision

drifkin/api-show-fallback

mintlify/image-generation-1773352582

hoyyeva/server-context-length-local-config

jmorganca/faster-reptition-penalties

jmorganca/convert-nemotron

parth-pi-thinking

pdevine/sampling-penalties

jmorganca/fix-create-quantization-memory

dongchen/resumable_transfer_fix

pdevine/sampling-cache-error

jessegross/mlx-usage

hoyyeva/openclaw-config

hoyyeva/app-html

pdevine/qwen3next

brucemacd/sign-sh-install

brucemacd/tui-update

brucemacd/usage-api

jmorganca/launch-empty

fix-app-dist-embed

mxyng/mlx-compile

mxyng/mlx-quant

mxyng/mlx-glm4.7

mxyng/mlx

brucemacd/simplify-model-picker

jmorganca/qwen3-concurrent

fix-glm-4.7-flash-mla-config

drifkin/qwen3-coder-opening-tag

brucemacd/usage-cli

fix-cuda12-fattn-shmem

ollama-imagegen-docs

parth/fix-multiline-inputs

brucemacd/config-docs

mxyng/model-files

mxyng/simple-execute

fix-imagegen-ollama-models

mxyng/async-upload

jmorganca/lazy-no-dtype-changes

imagegen-auto-detect-create

parth/decrease-concurrent-download-hf

fix-mlx-quantize-init

jmorganca/x-cleanup

usage

imagegen-readme

jmorganca/glm-image

mlx-gpu-cd

jmorganca/imagegen-modelfile

parth/agent-skills

parth/agent-allowlist

parth/signed-in-offline

parth/agents

parth/fix-context-chopping

improve-cloud-flow

parth/add-models-websearch

parth/prompt-renderer-mcp

jmorganca/native-settings

jmorganca/download-stream-hash

jmorganca/client2-rebased

brucemacd/oai-chat-req-multipart

jessegross/multi_chunk_reserve

grace/additional-omit-empty

grace/mistral-3-large

mxyng/tokenizer2

mxyng/tokenizer

jessegross/flash

hoyyeva/windows-nacked-app

mxyng/cleanup-attention

grace/deepseek-parser

hoyyeva/remember-unsent-prompt

parth/add-lfs-pointer-error-conversion

parth/olmo2-test2

hoyyeva/ollama-launchagent-plist

nicole/olmo-model

parth/olmo-test

mxyng/remove-embedded

parth/render-template

jmorganca/intellect-3

parth/remove-prealloc-linter

jmorganca/cmd-eval

nicole/nomic-embed-text-fix

mxyng/lint-2

hoyyeva/add-gemini-3-pro-preview

hoyyeva/load-model-list

mxyng/expand-path

mxyng/environ-2

hoyyeva/deeplink-json-encoding

parth/improve-tool-calling-tests

hoyyeva/conversation

hoyyeva/assistant-edit-response

hoyyeva/thinking

origin/brucemacd/invalid-char-i-err

parth/improve-tool-calling

jmorganca/required-omitempty

grace/qwen3-vl-tests

mxyng/iter-client

parth/docs-readme

nicole/embed-test

pdevine/integration-benchstat

parth/remove-generate-cmd

parth/add-toolcall-id

mxyng/server-tests

jmorganca/glm-4.6

jmorganca/gin-h-compat

drifkin/stable-tool-args

pdevine/qwen3-more-thinking

parth/add-websearch-client

nicole/websearch_local

jmorganca/qwen3-coder-updates

grace/deepseek-v3-migration-tests

mxyng/fix-create

jmorganca/cloud-errors

pdevine/parser-tidy

revert-12233-parth/simplify-entrypoints-runner

parth/enable-so-gpt-oss

brucemacd/qwen3vl

jmorganca/readme-simplify

parth/gpt-oss-structured-outputs

revert-12039-jmorganca/tools-braces

mxyng/embeddings

mxyng/gguf

mxyng/benchmark

mxyng/types-null

parth/move-parsing

mxyng/gemma2

jmorganca/docs

mxyng/16-bit

mxyng/create-stdin

pdevine/authorizedkeys

mxyng/quant

parth/opt-in-error-context-window

brucemacd/cache-models

brucemacd/runner-completion

jmorganca/llama-update-6

brucemacd/benchmark-list

brucemacd/partial-read-caps

parth/deepseek-r1-tools

mxyng/omit-array

parth/tool-prefix-temp

brucemacd/runner-test

jmorganca/qwen25vl

brucemacd/model-forward-test-ext

parth/python-function-parsing

jmorganca/cuda-compression-none

drifkin/num-parallel

drifkin/chat-truncation-fix

jmorganca/sync

parth/python-tools-calling

drifkin/array-head-count

brucemacd/create-no-loop

parth/server-enable-content-stream-with-tools

qwen25omni

mxyng/v3

brucemacd/ropeconfig

jmorganca/silence-tokenizer

parth/sample-so-test

parth/sampling-structured-outputs

brucemacd/doc-go-engine

parth/constrained-sampling-json

jmorganca/mistral-wip

brucemacd/mistral-small-convert

parth/sample-unmarshal-json-for-params

brucemacd/jomorganca/mistral

pdevine/bfloat16

jmorganca/mistral

brucemacd/mistral

pdevine/logging

parth/sample-correctness-fix

parth/sample-fix-sorting

jmorgan/sample-fix-sorting-extras

jmorganca/temp-0-images

brucemacd/parallel-embed-models

brucemacd/shim-grammar

jmorganca/fix-gguf-error

bmizerany/nameswork

jmorganca/faster-releases

bmizerany/validatenames

brucemacd/err-no-vocab

brucemacd/rope-config

brucemacd/err-hint

brucemacd/qwen2_5

brucemacd/logprobs

brucemacd/new_runner_graph_bench

progress-flicker

brucemacd/forward-test

brucemacd/go_qwen2

pdevine/gemma2

jmorganca/add-missing-symlink-eval

mxyng/next-debug

parth/set-context-size-openai

brucemacd/next-bpe-bench

brucemacd/next-bpe-test

brucemacd/new_runner_e2e

brucemacd/new_runner_qwen2

pdevine/convert-cohere2

brucemacd/convert-cli

parth/log-probs

mxyng/next-mlx

mxyng/cmd-history

parth/templating

parth/tokenize-detokenize

brucemacd/check-key-register

bmizerany/grammar

jmorganca/vendor-081b29bd

mxyng/func-checks

jmorganca/fix-null-format

parth/fix-default-to-warn-json

jmorganca/qwen2vl

jmorganca/no-concat

parth/cmd-cleanup-SO

brucemacd/check-key-register-structured-err

parth/openai-stream-usage

parth/fix-referencing-so

stream-tools-stop

jmorganca/degin-1

brucemacd/install-path-clean

brucemacd/push-name-validation

brucemacd/browser-key-register

jmorganca/openai-fix-first-message

jmorganca/fix-proxy

jessegross/sample

parth/disallow-streaming-tools

dhiltgen/remove_submodule

jmorganca/ga

jmorganca/mllama

pdevine/newlines

pdevine/geems-2b

jmorganca/llama-bump

mxyng/modelname-7

mxyng/gin-slog

mxyng/modelname-6

jyan/convert-prog

jyan/quant5

paligemma-support

pdevine/import-docs

jmorganca/openai-context

jyan/paligemma

jyan/p2

jyan/palitest

bmizerany/embedspeedup

jmorganca/llama-vit

brucemacd/allow-ollama

royh/ep-methods

royh/whisper

mxyng/api-models

mxyng/fix-memory

jyan/q4_4/8

jyan/ollama-v

royh/stream-tools

roy-embed-parallel

bmizerany/hrm

revert-5963-revert-5924-mxyng/llama3.1-rope

royh/embed-viz

jyan/local2

jyan/auth

jyan/local

jyan/parse-temp

jmorganca/template-mistral

jyan/reord-g

royh-openai-suffixdocs

royh-imgembed

royh-embed-parallel

jyan/quant4

royh-precision

jyan/progress

pdevine/fix-template

jyan/quant3

pdevine/ggla

mxyng/update-registry-domain

jmorganca/ggml-static

mxyng/create-context

jyan/v0.146

mxyng/layers-from-files

build_dist

bmizerany/noseek

royh-ls

royh-name

timeout

mxyng/server-timestamp

bmizerany/nosillyggufslurps

royh-params

jmorganca/llama-cpp-7c26775

royh-openai-delete

royh-show-rigid

jmorganca/enable-fa

jmorganca/no-error-template

jyan/format

royh-testdelete

bmizerany/fastverify

language_support

pdevine/ps-glitches

brucemacd/tokenize

bruce/iq-quants

bmizerany/filepathwithcoloninhost

mxyng/split-bin

bmizerany/client-registry

jmorganca/if-none-match

native

jmorganca/native

jmorganca/batch-embeddings

jmorganca/initcmake

jmorganca/mm

pdevine/showggmlinfo

modenameenforcealphanum

bmizerany/modenameenforcealphanum

jmorganca/done-reason

jmorganca/llama-cpp-8960fe8

ollama.com

bmizerany/filepathnobuild

bmizerany/types/model/defaultfix

rmdisplaylong

nogogen

bmizerany/x

modelfile-readme

bmizerany/replacecolon

jmorganca/limit

jmorganca/execstack

jmorganca/replace-assets

mxyng/tune-concurrency

jmorganca/testing

whitespace-detection

jmorganca/options

upgrade-all

scratch

cuda-search

mattw/airenamer

mattw/allmodelsonhuggingface

mattw/quantcontext

mattw/whatneedstorun

brucemacd/llama-mem-calc

mattw/faq-context

mattw/communitylinks

mattw/noprune

mattw/python-functioncalling

rename

mxyng/install

pulse

remove-first

editor

mattw/selfqueryingretrieval

cgo

mattw/howtoquant

api

matt/streamingapi

format-config

mxyng/extra-args

shell

update-nous-hermes

cp-model

upload-progress

fix-unknown-model

fix-model-names

delete-fix

insecure-registry

ls

deletemodels

progressbar

readme-updates

license-layers

skip-list

list-models

modelpath

matt/examplemodelfiles

distribution

go-opts

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: github-starred/ollama#3571