[PR #11906] test: improve scheduler/concurrency stress tests #13650

Closed
opened 2026-04-13 00:31:59 -05:00 by GiteaMirror · 0 comments

Original Pull Request: https://github.com/ollama/ollama/pull/11906

State: closed
Merged: Yes


The scheduler test previously relied on approximate memory figures and would often overshoot or undershoot a system's capacity, leading to flaky test results. This change should improve the reliability of this scenario by using ps output to determine exactly how many models it takes to trigger thrashing.
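A minimal sketch of the idea, not the code from this PR: load models one at a time and, after each load, check what is reported as running; the first load that leaves fewer models running than were loaded marks the thrashing point. The `server` interface, `modelsToThrash`, and `fakeServer` are hypothetical names used only for illustration, and the fake's oldest-first eviction is an assumption.

```go
package main

import "fmt"

// server is a hypothetical stand-in for the system under test.
// Running plays the role of the ps output mentioned above.
type server interface {
	Load(model string) error
	Running() ([]string, error)
}

// modelsToThrash loads models in order and returns how many loads it took
// before an earlier model was evicted. It returns 0 if every model fits.
func modelsToThrash(s server, models []string) (int, error) {
	for i, m := range models {
		if err := s.Load(m); err != nil {
			return 0, fmt.Errorf("load %s: %w", m, err)
		}
		running, err := s.Running()
		if err != nil {
			return 0, fmt.Errorf("list running: %w", err)
		}
		// Fewer running models than loads so far means something was evicted.
		if len(running) < i+1 {
			return i + 1, nil
		}
	}
	return 0, nil
}

// fakeServer is an illustrative in-memory server that holds at most
// capacity models and evicts the oldest one when full (an assumption).
type fakeServer struct {
	capacity int
	loaded   []string
}

func (f *fakeServer) Load(m string) error {
	if len(f.loaded) == f.capacity {
		f.loaded = f.loaded[1:] // evict the oldest model
	}
	f.loaded = append(f.loaded, m)
	return nil
}

func (f *fakeServer) Running() ([]string, error) {
	return f.loaded, nil
}
```

With a fake that holds two models, calling `modelsToThrash(&fakeServer{capacity: 2}, []string{"a", "b", "c", "d"})` reports that the third load causes the first eviction. Measuring this at run time avoids hard-coding memory estimates that differ between machines.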

The concurrency test is also refined to target num_parallel + 1 and handle timeouts better.
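As a rough illustration of targeting num_parallel + 1, the sketch below issues one more concurrent request than there are parallel slots, so at least one request has to wait, and gives each request its own timeout. It is a sketch under stated assumptions rather than the test in this PR: `requestFunc`, `runOverParallel`, `fakeSlots`, and `demoOverParallel` are hypothetical names, and the fake only simulates slot contention.

```go
package main

import (
	"context"
	"sync"
	"time"
)

// requestFunc is a hypothetical stand-in for a single generate or chat call.
type requestFunc func(ctx context.Context, id int) error

// runOverParallel fires numParallel+1 requests at once, each bounded by its
// own timeout, and returns one error slot per request (nil on success).
func runOverParallel(numParallel int, timeout time.Duration, do requestFunc) []error {
	n := numParallel + 1
	errs := make([]error, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			ctx, cancel := context.WithTimeout(context.Background(), timeout)
			defer cancel()
			errs[id] = do(ctx, id)
		}(i)
	}
	wg.Wait()
	return errs
}

// fakeSlots simulates a server with a fixed number of parallel slots.
// A request waits for a free slot, then "works" for the given duration,
// and gives up with the context error if its timeout expires first.
func fakeSlots(slots int, work time.Duration) requestFunc {
	sem := make(chan struct{}, slots)
	return func(ctx context.Context, id int) error {
		select {
		case sem <- struct{}{}:
		case <-ctx.Done():
			return ctx.Err()
		}
		defer func() { <-sem }()
		select {
		case <-time.After(work):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// demoOverParallel runs three requests against two simulated slots and
// reports how many requests were made and how many failed.
func demoOverParallel() (total, failed int) {
	errs := runOverParallel(2, 2*time.Second, fakeSlots(2, 10*time.Millisecond))
	for _, err := range errs {
		if err != nil {
			failed++
		}
	}
	return len(errs), failed
}
```

Collecting a per-request error instead of failing on the first one makes it possible to distinguish a queued request that eventually succeeded from one that hit its timeout.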

With these refinements, TestMultiModelConcurrency was redundant.

Also added a new TestGenerateWithHistory that exercises parallel requests with history context to verify cache behavior.
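The pattern of parallel requests that carry history can be sketched as follows: each worker keeps its own conversation, appends every prompt and reply, and resends the full history on each turn, which is the situation where a prompt cache would be expected to help. This is an illustrative sketch only; `message`, `chatFunc`, `converse`, `converseParallel`, and `echoChat` are hypothetical names and do not reflect the actual test or client API.

```go
package main

import (
	"fmt"
	"sync"
)

// message is a hypothetical chat turn.
type message struct {
	Role    string
	Content string
}

// chatFunc is a hypothetical stand-in for a chat call that receives the
// whole conversation so far and returns the assistant's reply.
type chatFunc func(history []message) (message, error)

// converse sends each prompt together with all earlier turns and returns
// the accumulated history.
func converse(chat chatFunc, prompts []string) ([]message, error) {
	var history []message
	for _, p := range prompts {
		history = append(history, message{Role: "user", Content: p})
		reply, err := chat(history)
		if err != nil {
			return history, err
		}
		history = append(history, reply)
	}
	return history, nil
}

// converseParallel runs the same multi-turn conversation in several
// goroutines at once and returns one history per worker.
func converseParallel(chat chatFunc, workers int, prompts []string) ([][]message, error) {
	results := make([][]message, workers)
	errs := make([]error, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			results[w], errs[w] = converse(chat, prompts)
		}(w)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return results, err
		}
	}
	return results, nil
}

// echoChat is an illustrative fake that reports how much history it saw.
func echoChat(history []message) (message, error) {
	reply := fmt.Sprintf("seen %d messages", len(history))
	return message{Role: "assistant", Content: reply}, nil
}
```

For example, `converseParallel(echoChat, 3, []string{"a", "b"})` yields three independent histories of four messages each, and the last reply in each shows that the second request carried the earlier turns along with the new prompt.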

Focus embeddings tests on embedding models.

GiteaMirror added the pull-request label 2026-04-13 00:31:59 -05:00
Reference: github-starred/ollama#13650