[GH-ISSUE #2487] Server error: msg="failed to encode prompt" err="exception server shutting down" #63488

Closed
opened 2026-05-03 13:48:54 -05:00 by GiteaMirror · 1 comment

Originally created by @hyjwei on GitHub (Feb 14, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2487

After the ollama server has been idle for about 5 minutes, it automatically shuts down. When a client wakes it up, it reloads the model and responds to the client.

However, a binary built from the current main branch returns an error instead and causes the client (ollama run) to abort. This error is probably caused by commit 6680761596.

On the server side

First, the ollama server shuts down after 5 minutes of idle time (timestamps: 1707877174 --> 1707877473):

[1707877174] slot 0 released (661 tokens in cache)
[1707877473]
initiating shutdown - draining remaining tasks...
[1707877473]
llama server shutting down
[1707877474] llama server shutdown complete

Then, upon receiving a new prompt from the client, the ollama server reloads the model and fails with an error:

[1707877500] warming up the model with an empty run
[1707877502] Available slots:
[1707877502]  -> Slot 0 - max context: 2048
time=2024-02-13T21:25:02.469-05:00 level=INFO source=dyn_ext_server.go:156 msg="Starting llama main loop"
[1707877502] llama server main loop starting
[1707877502] all slots are idle and system prompt is empty, clear the KV cache
time=2024-02-13T21:25:02.472-05:00 level=ERROR source=prompt.go:86 msg="failed to encode prompt" err="exception server shutting down"
[GIN] 2024/02/13 - 21:25:02 | 400 | 12.223554387s |       127.0.0.1 | POST     "/api/chat"

On the client side

$ ollama run phi
>>> What is the biggest city in France?
 Paris is the largest city in France, both in terms of population and area. It is located
on the Seine River in the north-central part of the country and is known for its iconic
landmarks such as the Eiffel Tower, Louvre Museum, Notre-Dame Cathedral, and many other
historical buildings. Paris has a rich history, vibrant culture, and is one of the most
visited cities in the world.

then wait 5 minutes for the ollama server to shut down

>>> What is the biggest city in France?
Error: exception server shutting down

Investigation

I went through the recent commits and found that reverting commit 6680761596 makes this error go away.


@hyjwei commented on GitHub (Feb 14, 2024):

This seems to have been fixed by ollama#2484.


Reference: github-starred/ollama#63488