[GH-ISSUE #14839] runner: allocModel silently swallows error from prompt graph reservation #71634

Closed
opened 2026-05-05 02:15:46 -05:00 by GiteaMirror · 0 comments

Originally created by @mango766 on GitHub (Mar 14, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14839

What is the issue?

I was reading through the model loading code and noticed that allocModel() in runner/ollamarunner/runner.go has a bug where an error from reserveWorstCaseGraph(true) is silently discarded:

err = s.reserveWorstCaseGraph(true)
if err != nil {
    return nil  // <-- should be `return err`
}

return s.reserveWorstCaseGraph(false)

The function signature is func (s *Server) allocModel(...) (panicErr error), so returning nil when an error occurs means the caller thinks model allocation succeeded when it actually didn't.
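To see why the named return makes this easy to miss, here is a minimal stand-alone sketch of the same control flow. Everything in it (errNoMem, the simplified reserveWorstCaseGraph stub, the stripped-down allocModel) is hypothetical and only mirrors the shape of the real code:

package main

import (
	"errors"
	"fmt"
)

var errNoMem = errors.New("insufficient memory for prompt graph")

// Stub: the prompt-sized reservation (prompt == true) fails,
// the generation-sized one (prompt == false) would succeed.
func reserveWorstCaseGraph(prompt bool) error {
	if prompt {
		return errNoMem
	}
	return nil
}

// allocModel mirrors the buggy branch from the issue.
func allocModel() (panicErr error) {
	err := reserveWorstCaseGraph(true)
	if err != nil {
		return nil // bug: the reservation error is discarded here
	}
	return reserveWorstCaseGraph(false)
}

func main() {
	// Prints "alloc ok: true" even though the prompt graph
	// reservation failed above.
	fmt.Println("alloc ok:", allocModel() == nil)
}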

This could cause issues when the prompt-sized graph reservation fails (e.g. memory pressure), because:

  1. allocModel reports success to its caller
  2. The model appears to load correctly
  3. But the worst-case prompt graph was never reserved (and the generation-graph reservation is skipped entirely by the early return)
  4. Subsequent inference may fail in unexpected ways

The second call reserveWorstCaseGraph(false) (for the generation graph) correctly returns its error, so only the prompt graph reservation has this problem.

How to reproduce

This would manifest when memory is tight enough that reserveWorstCaseGraph(true) fails. The buggy branch then returns nil immediately, without even attempting the second, generation-sized reservation, so the model appears to load without proper graph allocation for prompt processing.
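Under the hypothetical stub above (prompt reservation fails, generation reservation would succeed), a regression test in the same package can pin the desired behavior; none of these names come from the ollama codebase:

package main

import "testing"

// Fails against the buggy branch (allocModel reports success despite
// the reservation failure) and passes once the error is propagated.
func TestPromptReservationErrorPropagates(t *testing.T) {
	if err := allocModel(); err == nil {
		t.Fatal("allocModel returned nil despite prompt graph reservation failure")
	}
}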

Suggested fix

Change return nil to return err on line 1234.
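Applied to the snippet above, the corrected branch would read:

err = s.reserveWorstCaseGraph(true)
if err != nil {
    return err
}

return s.reserveWorstCaseGraph(false)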

OS

Linux

GPU

N/A (code-level bug)

CPU

N/A

Ollama version

main branch (current HEAD)

GiteaMirror added the bug label 2026-05-05 02:15:46 -05:00