[GH-ISSUE #14366] Ollama / Openclaw / Hostinger KVM 8 - Pegs the server #71395

Closed
opened 2026-05-05 01:29:05 -05:00 by GiteaMirror · 3 comments

Originally created by @santonivich on GitHub (Feb 23, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14366

Image

That's me using qwen2.5:3b-instruct on Openclaw hosted on a KVM 2 connecting to my ollama on a KVM 8. The specs of the dedicated ollama server are:

CPU core 8
Memory 32 GB
Disk space 400 GB

Granted, I realize that I need more power in reality...but...no matter what model I am using, the server just gets destroyed. Has anyone dealt with this and got it working, or am I simply dealing with an ollama server that doesn't have the horsepower?

Thanks

GiteaMirror added the question label 2026-05-05 01:29:05 -05:00

@rick-github commented on GitHub (Feb 23, 2026):

Not enough horsepower. You can reduce the amount of CPU that the model consumes by limiting the number of threads it can use:

$ ollama run qwen2.5:3b-instruct-q4_K_M
>>> /set parameter num_thread 4
Set parameter 'num_thread' to '4'
>>> /save qwen2.5:3b-4t             
Created new model 'qwen2.5:3b-4t'
>>> 
$ ollama run qwen2.5:3b-4t

This will run slower but won't hammer your box as much.
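
If the client is a script rather than the interactive prompt, the same thread cap can probably be applied per request through the `options` field of Ollama's `/api/generate` endpoint. The following is a minimal sketch, not a tested configuration: the URL assumes a default local install on port 11434, and the helper names `build_payload` and `generate` are hypothetical, introduced only for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address and port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str, num_thread: int) -> dict:
    """Build a non-streaming /api/generate request body with a thread cap."""
    if num_thread < 1:
        raise ValueError("num_thread must be at least 1")
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON reply instead of a stream
        "options": {"num_thread": num_thread},
    }


def generate(payload: dict, url: str = OLLAMA_URL) -> str:
    """POST the payload to an Ollama server and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    # Same model and thread count as the console session above.
    payload = build_payload("qwen2.5:3b-instruct-q4_K_M", "Say hello.", num_thread=4)
    print(generate(payload))
```

Setting the option per request leaves the stored model untouched, whereas the `/save` approach above bakes the limit into a new model name that any client can select.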


@santonivich commented on GitHub (Feb 23, 2026):

Thanks. That’s what I figured the issue was. I appreciate you confirming.


@rick-github commented on GitHub (Feb 23, 2026):

If available, a GPU will result in a marked increase in performance.


Reference: github-starred/ollama#71395