[GH-ISSUE #7901] Error: max retries exceeded: unexpected EOF #30815

Closed
opened 2026-04-22 10:45:10 -05:00 by GiteaMirror · 6 comments
Owner

Originally created by @szzhh on GitHub (Dec 1, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7901

What is the issue?

When I pull the llama3.1:405b model, the error `Error: max retries exceeded: unexpected EOF` often appears. Usually I can resume the download rather than starting over, but yesterday I hit a problem: after downloading more than 200 GB of the model, the error occurred and re-running the pull command started the download from scratch, and the model's ID and size had also changed. I have run into this before, as shown in the screenshots below. I don't know why this happens. Is there any way to resume the download?
![3](https://github.com/user-attachments/assets/d60bce03-ebbf-4741-8fcc-f803892e71a6)
![1](https://github.com/user-attachments/assets/f13af8bc-d382-46c6-833f-fe03ff0ed514)
![2](https://github.com/user-attachments/assets/a364694f-90dc-4e1c-ab06-0388ad2d219c)

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.4.6

GiteaMirror added the bug label 2026-04-22 10:45:10 -05:00
Author
Owner

@rick-github commented on GitHub (Dec 1, 2024):

The default quant has been changed from q4_0 to q4_K_M. You might be able to resume the download by pulling llama3.1:405b-instruct-q4_0.
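If the partially downloaded blobs from the old default tag are still in the local model store, pulling the explicit q4_0 tag suggested above should pick them up. A minimal sketch (resuming depends on the partial blobs still being present on disk):

```shell
# Pull the explicit q4_0 tag; matching partially-downloaded
# blobs are reused instead of starting from zero.
ollama pull llama3.1:405b-instruct-q4_0

# The bare tag now resolves to the new q4_K_M default,
# which is why its ID and size changed:
ollama pull llama3.1:405b
```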

Author
Owner

@szzhh commented on GitHub (Dec 1, 2024):

> The default quant has been changed from q4_0 to q4_K_M. You might be able to resume the download by pulling llama3.1:405b-instruct-q4_0.

Thank you for your reply, I can continue downloading now. By the way, may I ask whether there are any significant differences between the new default model and the original one?

Author
Owner

@rick-github commented on GitHub (Dec 1, 2024):

https://github.com/ollama/ollama/issues/5425

Author
Owner

@szzhh commented on GitHub (Dec 1, 2024):

Thank you.
Author
Owner

@szzhh commented on GitHub (Dec 2, 2024):

> #5425

Hello, I would like to ask why the GPU memory usage keeps showing 0MB during model inference.
![4](https://github.com/user-attachments/assets/efc3b19e-96ed-4c62-8040-0c3b88bcbffd)
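As a first check before digging into logs, `ollama ps` reports whether a loaded model is split between CPU and GPU, and `nvidia-smi` shows per-process GPU memory. A sketch, assuming a working NVIDIA driver install:

```shell
# Show loaded models and the CPU/GPU processor split
ollama ps

# Show GPU utilization and per-process memory usage
nvidia-smi
```

If `ollama ps` reports "100% CPU", the model was not offloaded to the GPU, which would explain the 0 MB reading.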

Author
Owner

@rick-github commented on GitHub (Dec 2, 2024):

[Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) would aid in debugging.
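Per the troubleshooting doc linked above, on a systemd-based Linux install the server logs can be read with journalctl; a minimal sketch:

```shell
# View the most recent ollama server log lines
journalctl -u ollama --no-pager -n 200

# When running the server by hand, enable verbose logging
# and read stdout/stderr directly:
# OLLAMA_DEBUG=1 ollama serve
```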

Reference: github-starred/ollama#30815