[PR #6521] [MERGED] Go Server Fixes #12134

Closed
opened 2026-04-12 23:50:23 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/6521
Author: @jessegross
Created: 8/27/2024
Status: Merged
Merged: 8/27/2024
Merged by: @jessegross

Base: jmorganca/llama ← Head: jessegross/goserver-fixes


📝 Commits (6)

  • 55e6683 runner.go: Scale batches to be processed by numParallel
  • 020f5b7 runner.go: Hold mutex for entire time when processing batch
  • ecf9206 runner.go: Separate KV cache and context sizes
  • 0b92754 runner.go: Fix resource leaks when removing sequences
  • 8a25d6f runner.go: Fix deadlock if a connection is closed during decoding
  • aa0a499 runner.go: Move pieces[] into sequence

📊 Changes

2 files changed (+156 additions, -140 deletions)

📝 llama/llama.go (+3 -1)
📝 llama/runner/runner.go (+153 -139)

📄 Description

Fixes for the Go server branch, primarily around concurrency and resource allocation.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-12 23:50:23 -05:00
Reference: github-starred/ollama#12134