[PR #7144] [CLOSED] Better handle small models in scheduler #38201

Closed
opened 2026-04-22 22:52:40 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/7144
Author: @dhiltgen
Created: 10/8/2024
Status: Closed

Base: main ← Head: embed_race


📝 Commits (1)

  • 6581301 Better handle small models in scheduler

📊 Changes

2 files changed (+7 additions, -3 deletions)

View changed files

📝 llm/ggml.go (+3 -0)
📝 server/sched.go (+4 -3)

📄 Description

Our memory prediction for small models tends to overestimate actual VRAM usage, which causes the scheduler to wait too long for VRAM recovery.

Fixes #7130
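
To illustrate the failure mode described above, here is a minimal sketch of a recovery check that compares the VRAM actually freed against a fraction of the predicted usage. It is not the code from this PR or from server/sched.go: the function names, the polling shape, and the 0.75 fraction are all hypothetical, chosen only to show why an inflated estimate keeps such a check from ever succeeding.

```go
package main

// recovered reports whether enough VRAM has come back to treat recovery
// as complete. All values are in bytes. The fraction parameter is a
// hypothetical tuning knob, not a value taken from the ollama source.
func recovered(freeBefore, freeNow, estimated uint64, fraction float64) bool {
	if freeNow <= freeBefore {
		return false
	}
	return float64(freeNow-freeBefore) >= float64(estimated)*fraction
}

// pollsUntilRecovered calls probe up to maxPolls times and returns how many
// polls were made, plus whether the recovery check ever passed. A real
// scheduler would sleep between polls; that is omitted to keep the sketch
// deterministic.
func pollsUntilRecovered(probe func() uint64, freeBefore, estimated uint64, fraction float64, maxPolls int) (int, bool) {
	for i := 1; i <= maxPolls; i++ {
		if recovered(freeBefore, probe(), estimated, fraction) {
			return i, true
		}
	}
	return maxPolls, false
}
```

In this sketch, a small model that really held 300 MiB but was predicted at 1 GiB never frees 75% of its estimate, so the loop runs until the poll limit is exhausted. With an estimate that matches the real usage, the same probe passes on the first poll.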


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-22 22:52:40 -05:00
Reference: github-starred/ollama#38201