[PR #12378] Add per-model parallel request configuration using OLLAMA_NUM_PARALLEL_RULES environment variable #19076

opened 2026-04-16 06:56:08 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/12378
Author: @R-omk
Created: 9/23/2025
Status: 🔄 Open

Base: main ← Head: num_parallel_rules


📝 Commits (1)

  • 47b5c10 Add per-model parallel request configuration using OLLAMA_NUM_PARALLEL_RULES environment variable

📊 Changes

5 files changed (+192 additions, -21 deletions)


📝 cmd/cmd.go (+1 -0)
📝 docs/faq.md (+5 -0)
📝 envconfig/config.go (+108 -20)
➕ envconfig/parallel_rules_test.go (+77 -0)
📝 server/sched.go (+1 -1)

📄 Description

close: #4894

This PR introduces per‑model parallel request configuration using the OLLAMA_NUM_PARALLEL_RULES environment variable.

What’s new

  • New environment variable OLLAMA_NUM_PARALLEL_RULES allowing YAML‑defined parallel limits per model.
  • Extended documentation in docs/faq.md with usage examples.
  • Added parsing logic in envconfig/config.go with validation and regex support.
  • Added comprehensive tests in envconfig/parallel_rules_test.go.
  • Adjusted scheduler to respect per‑model limits.
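
The description does not show the rule schema or the code in envconfig/config.go, so the following is only a minimal sketch of how per-model limits with regex support and validation could be resolved. Everything in it is an assumption made for illustration: the `rawRule` and `parallelRule` types, the `compileRules` and `resolveParallel` helpers, the first-match-wins ordering, and the full-name anchoring are hypothetical and are not taken from the PR. YAML decoding is left out; the sketch starts from rules that have already been decoded into pattern/limit pairs.

```go
package main

import (
	"fmt"
	"regexp"
)

// rawRule is a hypothetical decoded rule: a regex over model names
// and the number of parallel requests allowed for matching models.
type rawRule struct {
	Pattern  string
	Parallel int
}

// parallelRule is a validated rule with its pattern compiled.
type parallelRule struct {
	pattern *regexp.Regexp
	limit   int
}

// compileRules validates the raw rules in order. It rejects limits
// below 1 and patterns that are not valid regular expressions.
func compileRules(raw []rawRule) ([]parallelRule, error) {
	rules := make([]parallelRule, 0, len(raw))
	for i, r := range raw {
		if r.Parallel < 1 {
			return nil, fmt.Errorf("rule %d: parallel must be >= 1, got %d", i, r.Parallel)
		}
		// Assumption: a pattern must match the whole model name.
		re, err := regexp.Compile("^(?:" + r.Pattern + ")$")
		if err != nil {
			return nil, fmt.Errorf("rule %d: invalid pattern %q: %w", i, r.Pattern, err)
		}
		rules = append(rules, parallelRule{pattern: re, limit: r.Parallel})
	}
	return rules, nil
}

// resolveParallel returns the limit of the first rule whose pattern
// matches the model name, or fallback when no rule matches.
func resolveParallel(rules []parallelRule, model string, fallback int) int {
	for _, r := range rules {
		if r.pattern.MatchString(model) {
			return r.limit
		}
	}
	return fallback
}
```

In this sketch the fallback plays the role of a global default such as the existing OLLAMA_NUM_PARALLEL value. A scheduler could call `resolveParallel` with the name of the model being loaded. Whether the PR orders, anchors, or falls back in the same way is not stated in the description.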

Why

Provides fine‑grained control over request concurrency per model, improving resource utilization and allowing different parallelism settings for different models.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-16 06:56:08 -05:00
Reference: github-starred/ollama#19076