[PR #7369] [MERGED] Fix deepseek deseret regex #38273

Closed
opened 2026-04-22 22:56:51 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/7369
Author: @dhiltgen
Created: 10/25/2024
Status: Merged
Merged: 10/26/2024
Merged by: @dhiltgen

Base: main ← Head: win_unicode


📝 Commits (1)

  • ffc12dc (https://github.com/ollama/ollama/commit/ffc12dcae9f0250e74da3adaab11ee9bc5b39184) Fix deepseek deseret regex

📊 Changes

3 files changed (+88 additions, -1 deletions)

📝 llama/llama-vocab.cpp (+1 -1)
➕ llama/patches/0012-fix-deepseek-deseret-regex.patch (+66 -0)
📝 llama/unicode.cpp (+21 -0)

📄 Description

On Windows, when compiled with GCC, the C++ regex library fails to handle the Deseret characters in the DeepSeek regex.

Without any changes, loading the model in the Go server crashes with:

llama_model_load: error loading model: error loading model vocabulary: wstring_convert::from_bytes
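
For context on why a Deseret letter is awkward for wide-character conversion: on Windows `wchar_t` is 16 bits, so a code point above U+FFFF cannot fit in a single `wchar_t` and needs a UTF-16 surrogate pair. The sketch below is not the code from this PR. `decode_utf8` and `to_surrogates` are hypothetical helpers that decode a UTF-8 string by hand and show the surrogate pair for the first Deseret code point, U+10400. Whether this is the exact cause of the `wstring_convert::from_bytes` failure is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the patch): decode UTF-8 into code points.
// Validation is minimal: bad lead bytes, truncated sequences and bad
// continuation bytes are rejected; overlong forms are not checked.
std::vector<char32_t> decode_utf8(const std::string & s) {
    std::vector<char32_t> out;
    for (std::size_t i = 0; i < s.size();) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        char32_t cp = 0;
        if (b < 0x80)                { len = 1; cp = b; }
        else if ((b & 0xE0) == 0xC0) { len = 2; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + len > s.size()) {
            throw std::invalid_argument("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}

// Hypothetical helper: split a code point above U+FFFF into the UTF-16
// surrogate pair that a 16-bit wchar_t string would have to carry.
std::pair<std::uint16_t, std::uint16_t> to_surrogates(char32_t cp) {
    if (cp < 0x10000 || cp > 0x10FFFF) {
        throw std::invalid_argument("code point has no surrogate pair");
    }
    const std::uint32_t v = static_cast<std::uint32_t>(cp) - 0x10000;
    return { static_cast<std::uint16_t>(0xD800 + (v >> 10)),
             static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF)) };
}
```

Calling `decode_utf8("\xF0\x90\x90\x80")` yields the single code point U+10400, and `to_surrogates` maps it to the pair `0xD801`, `0xDC00`.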

The patch for unicode.cpp gets past the wide-character conversion problem, but then hits:

Regex error: Invalid range in bracket expression.

Switching to the \U<8hexchars> escape syntax to express the range for Deseret resolves the regex problem.
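
As an illustration of the `\U<8hexchars>` form: in C++ a `\U` escape followed by eight hex digits names a code point directly, so the range endpoints no longer depend on how the source file's bytes are read. The sketch below assumes the range in question is the Unicode Deseret block, U+10400 to U+1044F. The constant and function names are hypothetical and are not taken from the patch, and the PR description does not say how the escape is embedded in the actual regex string.

```cpp
#include <string>

// Assumed endpoints: the Deseret block in Unicode spans U+10400..U+1044F.
constexpr char32_t kDeseretFirst = U'\U00010400';
constexpr char32_t kDeseretLast  = U'\U0001044F';

static_assert(kDeseretFirst == 0x10400, "\\U escape names the code point");
static_assert(kDeseretLast  == 0x1044F, "\\U escape names the code point");

// Hypothetical helper: true if cp falls inside the assumed Deseret range.
constexpr bool in_deseret_range(char32_t cp) {
    return cp >= kDeseretFirst && cp <= kDeseretLast;
}

// Hypothetical helper: encode one code point as UTF-8.
// No validation: surrogates and values above U+10FFFF are not rejected.
std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Both endpoints encode to four UTF-8 bytes, for example `encode_utf8(kDeseretFirst)` gives `F0 90 90 80`, which is why they fall outside what a single 16-bit code unit can represent.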

Fixes #7311


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

Reference: github-starred/ollama#38273