[PR #6844] [CLOSED] runner.go: Don't panic when processing sequences #12241

Closed
opened 2026-04-12 23:52:44 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/6844
Author: @jessegross
Created: 9/17/2024
Status: Closed

Base: jmorganca/llama
Head: jessegross/no_panic


📝 Commits (4)

  • ac003ca runner.go: Update TODOs
  • 1547609 runner.go: Don't panic when processing sequences
  • 5e021a4 runner.go: Remove stop tokens from cache
  • 94a48ab runner.go: Allocate batches for all sequences during init

📊 Changes

3 files changed (+98 additions, -62 deletions)


📝 llama/runner/cache.go (+11 -7)
📝 llama/runner/cache_test.go (+4 -2)
📝 llama/runner/runner.go (+83 -53)

📄 Description

If there is an error processing a sequence, we should simply return
an HTTP error and abort that sequence rather than panicking the whole
runner. This will make us more resilient to transient failures.

Panics can still occur during startup, as there is no way to serve
requests if initialization fails.

Based on some code that was originally part of the vision work.

Co-authored-by: jmorganca <jmorganca@gmail.com>


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-12 23:52:44 -05:00

Reference: github-starred/ollama#12241