[PR #6714] [MERGED] catch when model vocab size is set correctly #58901

Closed
opened 2026-04-29 13:47:13 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/6714
Author: @pdevine
Created: 9/9/2024
Status: Merged
Merged: 9/10/2024
Merged by: @pdevine

Base: main ← Head: pdevine/largevocab


📝 Commits (1)

  • e3d43e3 catch when model vocab size is set correctly

📊 Changes

1 file changed (+7 additions, -3 deletions)


📝 convert/convert.go (+7 -3)

📄 Description

This check catches cases where the tokenizer contains more tokens than the number specified in the vocab_size field of config.json. This typically happens when the added_tokens array in tokenizer.json ends up with too many tokens.

Right now this results in the backend barfing during inference instead of the problem being caught during ollama create.
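
The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code from convert/convert.go: the helper name `checkVocabSize`, its parameters, and the error wording are all hypothetical, and it assumes the caller has already read vocab_size from config.json and collected the full token list (including added_tokens) from tokenizer.json.

```go
package main

import "fmt"

// checkVocabSize is a hypothetical helper, not the function from the PR.
// expected is the vocab_size value read from config.json; tokens is the
// complete token list from the tokenizer, including any added_tokens.
// It returns an error when the tokenizer holds more tokens than the
// model config declares, so the mismatch surfaces at conversion time.
func checkVocabSize(expected int, tokens []string) error {
	if len(tokens) > expected {
		return fmt.Errorf(
			"vocabulary is larger than expected: %d tokens, vocab_size is %d",
			len(tokens), expected,
		)
	}
	// Equal or smaller token counts are not treated as an error in this sketch.
	return nil
}
```

Returning an error from a check like this during conversion is what would let `ollama create` fail early, rather than leaving the mismatch to surface at inference time.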


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-29 13:47:13 -05:00
Reference: github-starred/ollama#58901