[GH-ISSUE #1531] New Gemma:7B crashes ollama server when using open-webui #12540

Closed
opened 2026-04-19 19:27:43 -05:00 by GiteaMirror · 1 comment

Originally created by @dewrama on GitHub (Apr 13, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/1531

Bug Report

Description

Gemma:7b used to work with Ollama, but after a recent update, running a query with gemma:7b crashes the Ollama server. Running gemma:7b directly in Ollama works fine; it only crashes when run through Open WebUI, and only with the Gemma model (other models are fine).

Bug Summary:
Running a query with gemma:7b through Open WebUI crashes Ollama, and I have to restart the entire Ollama server.

Error on open-webui
Gemma:7b
Uh-oh! There was an issue connecting to Ollama.

Steps to Reproduce:
Start Open WebUI with Ollama, select the gemma:7b model, and run a chat.

Expected Behavior:

Actual Behavior:
Ollama crashes when running a chat with gemma:7b.

Environment

Windows 11
ollama version is 0.1.31
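
The crash log below ends in a CUDA out-of-memory error, so one possible mitigation (a sketch, not from the original report) is to shrink the KV cache by lowering the context window via a custom Modelfile. The model name `gemma7b-small` is a hypothetical example:

```
# Modelfile (hypothetical example): derive a variant of gemma:7b
# with a smaller context window to reduce KV-cache VRAM usage.
FROM gemma:7b
PARAMETER num_ctx 1024
```

Then build and select it in Open WebUI: `ollama create gemma7b-small -f Modelfile`. Whether this avoids the crash depends on how much VRAM the GPU has free.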

Reproduction Details

Confirmation:

  • I have read and followed all the instructions provided in the README.md.
  • I am on the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.

Logs and Screenshots

Ollama log
time=2024-04-12T21:31:40.000-07:00 level=WARN source=server.go:113 msg="server crash 12 - exit code 3221226505 - respawning"

{"function":"initialize","level":"INFO","line":444,"msg":"initializing slots","n_slots":1,"tid":"37388","timestamp":1712982626}
{"function":"initialize","level":"INFO","line":456,"msg":"new slot","n_ctx_slot":2048,"slot_id":0,"tid":"37388","timestamp":1712982626}
time=2024-04-12T21:30:26.332-07:00 level=INFO source=dyn_ext_server.go:159 msg="Starting llama main loop"
{"function":"update_slots","level":"INFO","line":1574,"msg":"all slots are idle and system prompt is empty, clear the KV cache","tid":"42304","timestamp":1712982626}
{"function":"launch_slot_with_data","level":"INFO","line":829,"msg":"slot is processing task","slot_id":0,"task_id":0,"tid":"42304","timestamp":1712982626}
{"function":"update_slots","ga_i":0,"level":"INFO","line":1812,"msg":"slot progression","n_past":0,"n_past_se":0,"n_prompt_tokens_processed":44,"slot_id":0,"task_id":0,"tid":"42304","timestamp":1712982626}
{"function":"update_slots","level":"INFO","line":1836,"msg":"kv cache rm [p0, end)","p0":0,"slot_id":0,"task_id":0,"tid":"42304","timestamp":1712982626}
CUDA error: out of memory
current device: 0, in function alloc at C:\a\ollama\ollama\llm\llama.cpp\ggml-cuda.cu:532
cuMemSetAccess(pool_addr + pool_size, reserve_size, &access, 1)
GGML_ASSERT: C:\a\ollama\ollama\llm\llama.cpp\ggml-cuda.cu:193: !"CUDA error"
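
As an aside (not from the original report): the `exit code 3221226505` in the crash line above is a Windows NTSTATUS value, which a quick conversion identifies:

```python
# The ollama log reports "exit code 3221226505" for the crashed
# llama.cpp runner. On Windows, process exit codes are NTSTATUS
# values; converting to hex identifies the failure class.
exit_code = 3221226505
print(hex(exit_code))  # 0xc0000409
# 0xC0000409 is STATUS_STACK_BUFFER_OVERRUN, which Windows also uses
# for fail-fast aborts -- consistent with the GGML_ASSERT abort that
# follows the CUDA out-of-memory error.
```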

Browser Console Logs:
[Include relevant browser console logs, if applicable]

Docker Container Logs:
[Include relevant Docker container logs, if applicable]

Screenshots (if applicable):
[Attach any relevant screenshots to help illustrate the issue]

Installation Method

[Describe the method you used to install the project, e.g., manual installation, Docker, package manager, etc.]

Additional Information

[Include any additional details that may help in understanding and reproducing the issue. This could include specific configurations, error messages, or anything else relevant to the bug.]

Note

If the bug report is incomplete or does not follow the provided instructions, it may not be addressed. Please ensure that you have followed the steps outlined in the README.md and troubleshooting.md documents, and provide all necessary information for us to reproduce and address the issue. Thank you!


@justinh-rahb commented on GitHub (Apr 13, 2024):

Hello @dewrama, for the best help you should file this issue with the [Ollama project](https://github.com/ollama/ollama/issues).

<!-- gh-comment-id:2053639419 -->

Reference: github-starred/open-webui#12540