[GH-ISSUE #4100] Error: do encode request: Post "http://127.0.0.1:39207/tokenize": EOF #28307

Open
opened 2026-04-22 06:20:20 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @j2l on GitHub (May 2, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4100

What is the issue?

Hello,
I downloaded the Q4_K_M model from https://huggingface.co/LiteLLMs/French-Alpaca-Llama3-8B-Instruct-v1.0-GGUF/tree/main/Q4_K_M
and renamed it locally to French-Alpaca-Llama3-8B-Instruct-v1.gguf
Modelfile:
FROM "./French-Alpaca-Llama3-8B-Instruct-v1.gguf"
ollama create frll3 -f ./Modelfile

transferring model data 
creating model layer 
using already created layer sha256:08941f7a82566ca0116881e211330eae5838c20146132ed8fb9de46b6f5ea54b 
writing layer sha256:9f194159c3b80adee4448e1a1d380df743363881d417b5e9841e9611f884c155 
writing manifest 
success 
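(Editor's aside, not part of the original report: ollama stores the imported GGUF as a blob named by its SHA-256 digest, so the model-data layer digest printed above should match a hash of the local file. Hashing the download and comparing it against that digest is one way to rule out a truncated file — the path below is the reporter's, used here hypothetically.)

```python
import hashlib

def file_sha256(path: str, chunk: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so multi-GB GGUFs don't need to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

# Hypothetical usage -- compare against the layer digest from `ollama create`,
# e.g. 08941f7a82566ca0116881e211330eae5838c20146132ed8fb9de46b6f5ea54b:
# print(file_sha256("./French-Alpaca-Llama3-8B-Instruct-v1.gguf"))
```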

ollama run frll3
>>> hi
Error: do encode request: Post "http://127.0.0.1:39207/tokenize": EOF

But ollama run llama3 works fine; I can chat with it with no errors. I have a 3060 (12 GB VRAM).

Is it because of the model? The renaming? The Modelfile?
Thanks
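(Editor's aside on what the error above means, not a statement from the thread: ollama's client talks to a runner subprocess over local HTTP, and "Post ...: EOF" is Go's net/http reporting that the server side closed the connection without sending a response — i.e. the runner most likely crashed while loading or serving this model. A minimal Python sketch of the same failure mode, with a toy server standing in for the crashed runner:)

```python
import http.client
import socket
import threading

# Toy server: accept one connection, read the request, then close without
# replying -- mimicking a runner subprocess that died mid-request.
# This is an illustration, not ollama's actual code.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]

def accept_and_drop():
    conn, _ = srv.accept()
    conn.recv(65536)   # consume the request bytes
    conn.close()       # close without a response -> client sees EOF

threading.Thread(target=accept_and_drop, daemon=True).start()

client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
try:
    # bytes body so headers+body go out in a single send
    client.request("POST", "/tokenize", body=b'{"content": "hi"}')
    client.getresponse()
    result = "response received"
except (http.client.RemoteDisconnected, ConnectionResetError):
    # Python's names for what Go's net/http reports as "EOF"
    result = "EOF: server closed the connection without a response"

print(result)
```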

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.1.32

GiteaMirror added the bug label 2026-04-22 06:20:20 -05:00
Author
Owner

@Bonbon-Chan commented on GitHub (May 18, 2024):

Hi,

Ollama works fine with all the "standard" models I can access through the ollama command line.

But I got the same problem with the same model:
huggingface-cli download LiteLLMs/French-Alpaca-Llama3-8B-Instruct-v1.0-GGUF Q4_K_M/Q4_K_M-00001-of-00001.gguf --local-dir V:\ollama\downloads\French-Alpaca --local-dir-use-symlinks False

ollama create FrenchAlpaca -f modelfile

transferring model data
using existing layer sha256:08941f7a82566ca0116881e211330eae5838c20146132ed8fb9de46b6f5ea54b
creating new layer sha256:db8fbfd0cb288a053f83ac9014ca9bac2558b1bbcd80b5c408a548e7acba8a24
creating new layer sha256:1e7410908fd3e45ef2cd89cebf07322e0571fc4cda9e42ca7af366aa8d676f28
writing manifest
success

ollama run FrenchAlpaca
>>> Salut !
Error: do encode request: Post "http://127.0.0.1:52278/tokenize": read tcp 127.0.0.1:52284->127.0.0.1:52278: wsarecv: Une connexion existante a dû être fermée par l'hôte distant.
(French Windows error text: "An existing connection was forcibly closed by the remote host.")

With the modelfile:

FROM V:\ollama\downloads\French-Alpaca\Q4_K_M\Q4_K_M-00001-of-00001.gguf
PARAMETER num_ctx 32768
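(Editor's aside, a hedged guess rather than a finding from the thread: Llama 3 base models ship with an 8192-token context, and `PARAMETER num_ctx 32768` quadruples the KV-cache allocation, which can push the runner past what a 12 GB GPU holds and crash it. A Modelfile without the override — same path as above — would isolate that variable:)

```
FROM V:\ollama\downloads\French-Alpaca\Q4_K_M\Q4_K_M-00001-of-00001.gguf
PARAMETER num_ctx 8192
```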

My configuration is Windows 10 on an Intel CPU with an Nvidia RTX 3060.

I have tried the Q4_K_S model with the same result. I don't know whether the problem was introduced when the model was created or whether it is an ollama problem, but the Hugging Face page for the model claims it is compatible with ollama.

Reference: github-starred/ollama#28307