[PR #14773] runner: replace panics with graceful error handling in sample/decode #77122

Open
opened 2026-05-05 09:49:08 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/14773
Author: @alvinttang
Created: 3/11/2026
Status: 🔄 Open

Base: mainHead: fix/graceful-sample-error


📝 Commits (1)

  • 14f1247 runner: replace panics with graceful error handling in sample/decode

📊 Changes

2 files changed (+10 additions, -2 deletions)

View changed files

📝 llm/server.go (+4 -0)
📝 runner/ollamarunner/runner.go (+6 -2)

📄 Description

Summary

Fixes #14718

Replace panic("failed to sample token") and panic("failed to decode token") in runner/ollamarunner/runner.go with graceful error handling that terminates only the failing sequence instead of crashing the entire runner process.

Problem

When seq.sampler.Sample(logits) or Decode([]int32{token}) returns an error, the current code calls panic(...), which kills the entire runner process — including all other active sequences being served concurrently. This is unnecessarily destructive since the error is scoped to a single sequence.

Solution

  • Log the error with slog.Error including the sequence ID and error details
  • Call s.removeSequence(i, llm.DoneReasonError) to cleanly terminate just the failing sequence
  • continue to process remaining sequences in the batch

This matches the existing error handling pattern used throughout the same function, e.g.:

// EOS handling (line 780)
s.removeSequence(i, llm.DoneReasonStop)
continue

// Length limit (line 799)
s.removeSequence(i, llm.DoneReasonLength)
continue

// Connection closed (line 852)
s.removeSequence(i, llm.DoneReasonConnectionClosed)
continue

A new DoneReasonError constant is added to llm.DoneReason to distinguish error terminations from normal stop reasons.

Test plan

  • Verify go vet ./llm/... passes (confirmed locally)
  • Verify existing tests pass
  • Manual: trigger a sample error and confirm the runner continues serving other sequences

🤖 Generated with Claude Code


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

## 📋 Pull Request Information **Original PR:** https://github.com/ollama/ollama/pull/14773 **Author:** [@alvinttang](https://github.com/alvinttang) **Created:** 3/11/2026 **Status:** 🔄 Open **Base:** `main` ← **Head:** `fix/graceful-sample-error` --- ### 📝 Commits (1) - [`14f1247`](https://github.com/ollama/ollama/commit/14f1247e337bf2035d1b5c930265bec8f7d326fb) runner: replace panics with graceful error handling in sample/decode ### 📊 Changes **2 files changed** (+10 additions, -2 deletions) <details> <summary>View changed files</summary> 📝 `llm/server.go` (+4 -0) 📝 `runner/ollamarunner/runner.go` (+6 -2) </details> ### 📄 Description ## Summary Fixes #14718 Replace `panic("failed to sample token")` and `panic("failed to decode token")` in `runner/ollamarunner/runner.go` with graceful error handling that terminates only the failing sequence instead of crashing the entire runner process. ## Problem When `seq.sampler.Sample(logits)` or `Decode([]int32{token})` returns an error, the current code calls `panic(...)`, which kills the entire runner process — including all other active sequences being served concurrently. This is unnecessarily destructive since the error is scoped to a single sequence. ## Solution - Log the error with `slog.Error` including the sequence ID and error details - Call `s.removeSequence(i, llm.DoneReasonError)` to cleanly terminate just the failing sequence - `continue` to process remaining sequences in the batch This matches the existing error handling pattern used throughout the same function, e.g.: ```go // EOS handling (line 780) s.removeSequence(i, llm.DoneReasonStop) continue // Length limit (line 799) s.removeSequence(i, llm.DoneReasonLength) continue // Connection closed (line 852) s.removeSequence(i, llm.DoneReasonConnectionClosed) continue ``` A new `DoneReasonError` constant is added to `llm.DoneReason` to distinguish error terminations from normal stop reasons. ## Test plan - [ ] Verify `go vet ./llm/...` passes (confirmed locally) - [ ] Verify existing tests pass - [ ] Manual: trigger a sample error and confirm the runner continues serving other sequences 🤖 Generated with [Claude Code](https://claude.com/claude-code) --- <sub>🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.</sub>
GiteaMirror added the pull-request label 2026-05-05 09:49:08 -05:00
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/ollama#77122