[PR #11301] model: precompute special tokens in NewVocabulary to avoid repeated alloc #13502

Closed
opened 2026-04-13 00:29:00 -05:00 by GiteaMirror · 0 comments

Original Pull Request: https://github.com/ollama/ollama/pull/11301

State: closed
Merged: No


This moves creation of the special token list into the constructor, so that `SpecialVocabulary()` no longer has to call `sync.Once.Do()` on every invocation. Building the list once per model load avoids the excessive memory allocations seen during chat requests when `sync.Once.Do()` is called with a closure (anonymous function).

Fixes #11299
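
A minimal sketch of the idea, not the actual diff from this PR: the struct fields, the `tokenTypeControl` constant, and the `NewVocabulary` signature below are simplified assumptions for illustration. The special token list is built once in the constructor, so the accessor only returns a stored slice and needs no `sync.Once` or closure on the request path.

```go
package main

import "fmt"

// Hypothetical token type markers; the real values in the code base may differ.
const (
	tokenTypeNormal  int32 = 1
	tokenTypeControl int32 = 3
)

// Vocabulary is a simplified stand-in for the model vocabulary.
type Vocabulary struct {
	Values []string
	Types  []int32

	// special is filled once by NewVocabulary and treated as read-only afterwards.
	special []string
}

// NewVocabulary builds the vocabulary and precomputes the special token list,
// so the work happens once per model load instead of being guarded on every call.
func NewVocabulary(values []string, types []int32) *Vocabulary {
	v := &Vocabulary{Values: values, Types: types}
	for i, t := range types {
		if i < len(values) && t == tokenTypeControl {
			v.special = append(v.special, values[i])
		}
	}
	return v
}

// SpecialVocabulary returns the precomputed list; no sync.Once and no closure
// are involved here.
func (v *Vocabulary) SpecialVocabulary() []string {
	return v.special
}

func main() {
	v := NewVocabulary(
		[]string{"<s>", "hello", "</s>"},
		[]int32{tokenTypeControl, tokenTypeNormal, tokenTypeControl},
	)
	fmt.Println(v.SpecialVocabulary()) // prints [<s> </s>]
}
```

Because the slice is written only during construction and read afterwards, concurrent callers can share it without extra synchronization, provided none of them mutates the returned slice.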

GiteaMirror added the pull-request label 2026-04-13 00:29:00 -05:00

Reference: github-starred/ollama#13502