[PR #14892] [CLOSED] server: thread pre-loaded model through scheduleRunner #20172

Closed
opened 2026-04-16 07:29:11 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/14892
Author: @TimRots
Created: 3/17/2026
Status: Closed

Base: main ← Head: server-thread-model-schedulerunner


📝 Commits (1)

  • 9f11756 server: thread pre-loaded model through scheduleRunner

📊 Changes

4 files changed (+390 additions, -17 deletions)


📝 server/images.go (+21 -0)
➕ server/images_bench_test.go (+264 -0)
➕ server/modelcache.go (+81 -0)
📝 server/routes.go (+24 -17)

📄 Description

Summary

  • Change scheduleRunner to accept a pre-loaded *Model instead of a model name string, eliminating a redundant GetModel call on every generate, chat, embed, and image-generation request
  • GenerateHandler and ChatHandler pass their already-loaded m; EmbedHandler and EmbeddingHandler hoist GetModel to the call site; handleImageGenerate accepts *Model directly
  • A nil guard replaces the empty-string check; capability errors use model.Name so the messages stay backwards-compatible
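
The shape of this change can be sketched with a toy example. This is a minimal illustration, not the code from the PR: `getModel`, `scheduleByName`, `scheduleByModel`, `checkCapability`, `oldFlow`, `newFlow`, the `lookups` counter, and the simplified `Model` struct are all hypothetical stand-ins, and the error strings are invented for the demo. It only shows why passing the already-loaded `*Model` saves one lookup per request and how a nil guard can take the place of an empty-string check.

```go
package main

import (
	"errors"
	"fmt"
)

// Model is a simplified stand-in for the server's model type
// (assumption: only the fields needed for this illustration).
type Model struct {
	Name         string
	Capabilities []string
}

// lookups counts how often the hypothetical loader runs.
var lookups int

// getModel is a toy substitute for GetModel; it only bumps a counter.
func getModel(name string) (*Model, error) {
	lookups++
	if name == "" {
		return nil, errors.New("model is required")
	}
	return &Model{Name: name, Capabilities: []string{"completion"}}, nil
}

// checkCapability reports an error that names the model via m.Name.
func checkCapability(m *Model, want string) error {
	for _, c := range m.Capabilities {
		if c == want {
			return nil
		}
	}
	return fmt.Errorf("%q does not support %s", m.Name, want)
}

// scheduleByName mirrors the old shape: it takes a name and loads the model again.
func scheduleByName(name, want string) error {
	if name == "" { // empty-string check
		return errors.New("model is required")
	}
	m, err := getModel(name)
	if err != nil {
		return err
	}
	return checkCapability(m, want)
}

// scheduleByModel mirrors the new shape: the caller hands over its loaded model.
func scheduleByModel(m *Model, want string) error {
	if m == nil { // nil guard
		return errors.New("model is required")
	}
	return checkCapability(m, want)
}

// oldFlow returns the number of lookups one request costs with the name-based shape.
func oldFlow(name string) int {
	lookups = 0
	m, _ := getModel(name) // the handler loads the model for its own use
	_ = scheduleByName(m.Name, "completion")
	return lookups
}

// newFlow returns the number of lookups with the pointer-based shape.
func newFlow(name string) int {
	lookups = 0
	m, _ := getModel(name)
	_ = scheduleByModel(m, "completion")
	return lookups
}

func main() {
	fmt.Println("lookups, name-based:", oldFlow("demo"))    // lookups, name-based: 2
	fmt.Println("lookups, pointer-based:", newFlow("demo")) // lookups, pointer-based: 1
}
```

In this sketch the handler still performs exactly one lookup; only the second, hidden one inside the scheduling step disappears.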

Benchmark

GetModelDoubleCall/Minimal, 10 runs, Intel Ultra 7 155H:

| Metric | Before | After | Reduction |
|--------|--------|-------|-----------|
| ns/op | ~116,162 | ~2.0 | 98% |
| B/op | 93,875 | 1,664 | 98% |
| allocs/op | 922 | 6 | 99% |
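
As a quick sanity check on the Reduction column, the percentages can be recomputed from the Before and After values. The sketch below does this for the two rows with exact figures (B/op and allocs/op); the ns/op row is marked approximate in the source and is left out. `reduction` is a helper name introduced here for illustration only.

```go
package main

import "fmt"

// reduction returns the percentage drop from before to after.
func reduction(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	fmt.Printf("B/op:      %.1f%%\n", reduction(93875, 1664)) // B/op:      98.2%
	fmt.Printf("allocs/op: %.1f%%\n", reduction(922, 6))      // allocs/op: 99.3%
}
```

Both values round to the 98% and 99% reported in the table.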

Test plan

  • go vet ./server/
  • go test ./server/
  • go test -bench=BenchmarkGetModel -benchmem -count=10 ./server/

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-16 07:29:11 -05:00
Reference: github-starred/ollama#20172