[GH-ISSUE #1837] Ollama crashes quite often for Fedora 39 with NVIDIA T1200 Laptop GPU #63084

Closed
opened 2026-05-03 11:43:24 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @ilovepumpkin on GitHub (Jan 7, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/1837

Hello,

When I use Ollama with an NVIDIA T1200 Laptop GPU on Fedora 39, it crashes quite often regardless of what model I am running. Is there any way to troubleshoot this issue?

Here is the output of `nvidia-smi`:

```
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06              Driver Version: 545.29.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA T1200 Laptop GPU        Off | 00000000:01:00.0  On |                  N/A |
| N/A   44C    P8               6W /  60W |    303MiB /  4096MiB |      7%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      3280      G   /usr/libexec/Xorg                           115MiB |
|    0   N/A  N/A      4776    C+G   ...seed-version=20240105-201042.648000      177MiB |
+---------------------------------------------------------------------------------------+
```
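To narrow down where the crash comes from, the same code path can be exercised without any client in between by calling the local Ollama HTTP API directly. Below is a minimal Python sketch (the model name and prompt are placeholders; it assumes `ollama serve` is listening on the default port 11434):

```python
import requests

# Minimal request against a locally running Ollama server (default port 11434).
# "llama2" and the prompt are placeholders; use whatever model reproduces the crash.
resp = requests.post(
    "http://127.0.0.1:11434/api/generate",
    json={
        "model": "llama2",
        "prompt": "Say hello in one sentence.",
        "stream": False,  # return a single JSON object instead of a stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```

If a direct call like this also crashes, the client in use can be ruled out.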

Author
Owner

@ilovepumpkin commented on GitHub (Jan 7, 2024):

I got the following "out of memory" error when using ollama v0.1.18.

```
CUDA error 2 at /go/src/github.com/jmorganca/ollama/llm/llama.cpp/ggml-cuda.cu:9132: out of memory
current device: 0
GGML_ASSERT: /go/src/github.com/jmorganca/ollama/llm/llama.cpp/ggml-cuda.cu:9132: !"CUDA error"
```

However, it seems to work well after switching to v0.1.17.
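Since the T1200 has only 4 GiB of VRAM and the desktop already occupies part of it, one way to reduce memory pressure is to offload fewer layers to the GPU. A hedged sketch using the `num_gpu` request option (the model name and the value 20 are illustrative, not recommendations):

```python
import requests

# Ask Ollama to offload fewer layers to the GPU so the model fits in ~4 GiB of VRAM.
# num_gpu is the number of layers sent to the GPU; 20 is just an illustrative value.
resp = requests.post(
    "http://127.0.0.1:11434/api/generate",
    json={
        "model": "llama2",           # placeholder model name
        "prompt": "Summarize GPU offloading in one sentence.",
        "stream": False,
        "options": {"num_gpu": 20},  # lower this further if out-of-memory errors persist
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```

The same setting can also be persisted in a Modelfile with `PARAMETER num_gpu <n>`.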

Author
Owner

@ilovepumpkin commented on GitHub (Jan 7, 2024):

Well, after using it for a while, I am still getting the error `Error: llama runner exited, you may not have enough available memory to run this model`.
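Because the message points at available memory, it can help to check how much VRAM is actually free right before sending a request. A small sketch that shells out to `nvidia-smi` (assumes the NVIDIA driver tools are on the PATH):

```python
import subprocess

# Query free and total VRAM in MiB per GPU using standard nvidia-smi query flags.
out = subprocess.check_output(
    ["nvidia-smi", "--query-gpu=memory.free,memory.total",
     "--format=csv,noheader,nounits"],
    text=True,
)
for idx, line in enumerate(out.strip().splitlines()):
    free_mib, total_mib = (int(x) for x in line.split(","))
    print(f"GPU {idx}: {free_mib} MiB free of {total_mib} MiB")
```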

Author
Owner

@ilovepumpkin commented on GitHub (Jan 8, 2024):

I keep getting the "out of memory" error when using v0.1.17, and even with v0.1.14. Especially when I try to integrate Ollama with AnythingLLM ( https://github.com/Mintplex-Labs/anything-llm ), it crashes quite often.

```
2024/01/08 15:18:14 llama.go:506: llama runner started in 1.401141 seconds

CUDA error 2 at /go/src/github.com/jmorganca/ollama/llm/llama.cpp/gguf/ggml-cuda.cu:5924: out of memory
current device: 0
2024/01/08 15:18:32 llama.go:449: 2 at /go/src/github.com/jmorganca/ollama/llm/llama.cpp/gguf/ggml-cuda.cu:5924: out of memory
current device: 0
2024/01/08 15:18:32 llama.go:523: llama runner stopped successfully
[GIN] 2024/01/08 - 15:18:32 | 200 | 19.310051007s |       127.0.0.1 | POST     "/api/generate"
^C2024/01/08 15:19:16 llama.go:523: llama runner stopped successfully
```
Author
Owner

@ilovepumpkin commented on GitHub (Jan 8, 2024):

It looks like the crash is related to how Ollama is used: when I use it with the VSCode Continue extension, it is stable, but when it is used by AnythingLLM, it crashes very quickly. Does this mean I should report a bug to AnythingLLM?
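One difference worth checking before filing a bug against either project is which request options each client sends, in particular the context window: a larger `num_ctx` needs noticeably more VRAM and can tip a 4 GiB GPU into out-of-memory even when a smaller context works fine. A hedged sketch that sends the same prompt with two context sizes to see which one fails (model name and values are illustrative):

```python
import requests

def generate(num_ctx: int) -> str:
    """Send one prompt with an explicit context size; num_ctx values are illustrative."""
    resp = requests.post(
        "http://127.0.0.1:11434/api/generate",
        json={
            "model": "llama2",  # placeholder model name
            "prompt": "Reply with a single word.",
            "stream": False,
            "options": {"num_ctx": num_ctx},
        },
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]

for ctx in (2048, 4096):
    try:
        print(f"num_ctx={ctx}: ok -> {generate(ctx)!r}")
    except requests.RequestException as err:
        print(f"num_ctx={ctx}: failed ({err})")
```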
