[GH-ISSUE #4026] Llama 3 BPE tokenization needs improvement #49008

Closed
opened 2026-04-28 10:35:54 -05:00 by GiteaMirror · 3 comments

Originally created by @coder543 on GitHub (Apr 29, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4026

What is the issue?

[This PR](https://github.com/ggerganov/llama.cpp/pull/6920) was just merged into llama.cpp. It contains important improvements to how tokenization works for Llama 3 and other models. An example of the issue is [noted here](https://github.com/ggerganov/llama.cpp/issues/6914).

Hopefully ollama can update to the latest llama.cpp quickly and make a new release.

OS

Linux

GPU

Nvidia

CPU

AMD

Ollama version

all versions up to this point
GiteaMirror added the bug label 2026-04-28 10:35:54 -05:00

@MoonRide303 commented on GitHub (Apr 29, 2024):

You might want to wait for https://github.com/ggerganov/llama.cpp/pull/6965 to be merged, too (should happen soon).


@coder543 commented on GitHub (May 11, 2024):

https://github.com/ggerganov/llama.cpp/pull/6965 has been merged now. I'm not sure when this was fixed in ollama, but I just tested with 0.1.35, and I can't reproduce it anymore. Closing.


@dpublic commented on GitHub (May 13, 2024):

The llama.cpp commit that ollama links to is dated 4/30, but https://github.com/ggerganov/llama.cpp/pull/6965 was merged into llama.cpp on 5/9.
So it doesn't look like this merge was included in the latest ollama release, 0.1.37.
Does that mean ollama was changed to handle the previous llama.cpp behavior, and that a future llama.cpp sync in ollama will change behavior?


Reference: github-starred/ollama#49008