[PR #8673] [CLOSED] test: byte pair encoding #38619

Closed
opened 2026-04-22 23:17:33 -05:00 by GiteaMirror · 0 comments


## 📋 Pull Request Information

**Original PR:** https://github.com/ollama/ollama/pull/8673
**Author:** [@BruceMacD](https://github.com/BruceMacD)
**Created:** 1/29/2025
**Status:** ❌ Closed

**Base:** `mxyng/next` ← **Head:** `brucemacd/next-bpe-test`

---

### 📝 Commits (3)

- [`6a41201`](https://github.com/ollama/ollama/commit/6a4120143f87239c6c4a40bc582a7f7669f14bfe) next
- [`b21482e`](https://github.com/ollama/ollama/commit/b21482e4a98f15e2cd1505b38a92fe25411a2b5f) fix linter
- [`60b2d49`](https://github.com/ollama/ollama/commit/60b2d494bcf797a936647eed75b81e366a2b6e3e) model: test byte pair encoding

### 📊 Changes

**58 files changed** (+4077 additions, -474 deletions)

<details>
<summary>View changed files</summary>

➕ `cache/cache.go` (+63 -0)
📝 `convert/convert.go` (+16 -16)
📝 `convert/convert_bert.go` (+5 -5)
📝 `convert/convert_commandr.go` (+5 -5)
📝 `convert/convert_gemma.go` (+5 -5)
📝 `convert/convert_gemma2.go` (+2 -4)
📝 `convert/convert_gemma2_adapter.go` (+5 -5)
📝 `convert/convert_llama.go` (+6 -6)
📝 `convert/convert_llama_adapter.go` (+5 -5)
📝 `convert/convert_mixtral.go` (+5 -5)
📝 `convert/convert_phi3.go` (+7 -7)
📝 `convert/convert_qwen2.go` (+5 -5)
📝 `convert/convert_test.go` (+6 -6)
📝 `fs/ggml/ggml.go` (+111 -95)
📝 `fs/ggml/gguf.go` (+6 -7)
📝 `fs/ggml/type.go` (+2 -7)
📝 `fs/util/bufioutil/buffer_seeker.go` (+0 -0)
📝 `fs/util/bufioutil/buffer_seeker_test.go` (+0 -0)
➖ `llm/ggla.go` (+0 -149)
➖ `llm/ggml_test.go` (+0 -1)

_...and 38 more files_

</details>

### 📄 Description

Adding a basic unit test for the BPE tokenizer.

Tests on the `next` branch are failing, so this PR's test pipeline will fail as well, but the tokenizer tests specifically pass.

---

<sub>🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.</sub>
GiteaMirror added the pull-request label 2026-04-22 23:17:33 -05:00

Reference: github-starred/ollama#38619