[GH-ISSUE #10739] llama4:scout not working oob #69112

Closed
opened 2026-05-04 17:11:55 -05:00 by GiteaMirror · 10 comments

Originally created by @lemassykoi on GitHub (May 16, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10739

What is the issue?

[ user @ Debian : /home/user ] 17:13 $ ollama run llama4:scout
pulling manifest
pulling 9d507a36062c: 100% ▕██████████████████▏  67 GB
pulling 399a8a5a36db: 100% ▕██████████████████▏ 7.8 KB
pulling 24ca191a372b: 100% ▕██████████████████▏ 6.0 KB
pulling 8a13cf51fd9e: 100% ▕██████████████████▏ 1.1 KB
pulling fc1ffc71ab8e: 100% ▕██████████████████▏ 1.6 KB
pulling bee89e20d457: 100% ▕██████████████████▏   31 B
pulling f7ce8f326f5d: 100% ▕██████████████████▏ 1.1 KB
verifying sha256 digest
writing manifest
success
Error: Post "http://127.0.0.1:11434/api/generate": EOF
[ user @ Debian : /home/user ] 17:24 $ ollama --version
ollama version is 0.7.0

Relevant log output


OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.7.0

GiteaMirror added the bug label 2026-05-04 17:11:55 -05:00

@rick-github commented on GitHub (May 16, 2025):

Server logs (https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) may aid in debugging.

@lemassykoi commented on GitHub (May 16, 2025):

May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.333+02:00 level=WARN source=sched.go:676 msg="gpu VRAM usage didn't recover within timeout" seconds=5.668763082 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5
May 16 17:45:51 Debian ollama[6012]: [GIN] 2025/05/16 - 17:45:51 | 200 |     530.546µs |       127.0.0.1 | HEAD     "/"
May 16 17:45:51 Debian ollama[6012]: [GIN] 2025/05/16 - 17:45:51 | 200 |   60.418543ms |       127.0.0.1 | POST     "/api/generate"
May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.484+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:399a8a5a36db7e6f011306e5720ce0d84dcc6f7b71765a05c6e74e323b6965a2
May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.489+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:24ca191a372b46ea0a07eaa1c5fdb3d983be200f32cc15dcc308fca1421c87d5
May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.494+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:399a8a5a36db7e6f011306e5720ce0d84dcc6f7b71765a05c6e74e323b6965a2
May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.499+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:24ca191a372b46ea0a07eaa1c5fdb3d983be200f32cc15dcc308fca1421c87d5
May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.512+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:399a8a5a36db7e6f011306e5720ce0d84dcc6f7b71765a05c6e74e323b6965a2
May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.516+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:24ca191a372b46ea0a07eaa1c5fdb3d983be200f32cc15dcc308fca1421c87d5
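Editorial sketch, not part of the original thread: the WARN line in the log above carries `runner.size` and `runner.vram` fields, which appear to be the scheduler's estimates of the runner's total size and of the part placed in VRAM. Assuming that reading is right, the difference is roughly what has to live outside the GPU. The helper names `parse_fields` and `gib` are hypothetical, and the regex is only a rough approximation of the `key=value` log format.

```python
import re

# One log line copied from the comment above (journald prefix and the
# trailing runner.model field removed for brevity).
LINE = (
    'time=2025-05-16T17:45:22.333+02:00 level=WARN source=sched.go:676 '
    'msg="gpu VRAM usage didn\'t recover within timeout" seconds=5.668763082 '
    'runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 '
    'runner.pid=6062'
)

# key=value pairs; a value is either a double-quoted string or a bare token.
FIELD = re.compile(r'([\w.]+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_fields(line: str) -> dict[str, str]:
    """Return the key=value fields of one log line, with quotes stripped."""
    fields = {}
    for key, value in FIELD.findall(line):
        if value.startswith('"'):
            value = value[1:-1]
        fields[key] = value
    return fields


def gib(text: str) -> float:
    """Convert a size such as '68.3 GiB' to a float number of GiB."""
    number, unit = text.split()
    if unit != "GiB":
        raise ValueError(f"unexpected unit: {unit}")
    return float(number)


fields = parse_fields(LINE)
# Assumption: size is the total estimate and vram is the GPU-resident part.
outside_vram = gib(fields["runner.size"]) - gib(fields["runner.vram"])
print(f'{fields["level"]}: {fields["msg"]}')
print(f"estimated size not in VRAM: {outside_vram:.1f} GiB")
```

With the values from this log the difference comes to about 57.4 GiB. Whether that much system memory is actually required depends on how the runner loads the model, so treat the number as a rough indicator rather than a diagnosis.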

@rick-github commented on GitHub (May 16, 2025):

Full server log (https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) may aid in debugging.

@lemassykoi commented on GitHub (May 16, 2025):

Is that enough?

May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.476+02:00 level=INFO source=server.go:625 msg="waiting for server to become available" status="llm server loading model"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.476+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.00"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.727+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.01"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.978+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.02"
May 16 17:44:50 Debian ollama[6012]: time=2025-05-16T17:44:50.228+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.03"
May 16 17:44:50 Debian ollama[6012]: time=2025-05-16T17:44:50.479+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.04"
May 16 17:44:50 Debian ollama[6012]: time=2025-05-16T17:44:50.729+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.04"
May 16 17:44:50 Debian ollama[6012]: time=2025-05-16T17:44:50.980+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.05"
May 16 17:44:51 Debian ollama[6012]: time=2025-05-16T17:44:51.231+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.06"
May 16 17:44:51 Debian ollama[6012]: time=2025-05-16T17:44:51.481+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.07"
May 16 17:44:51 Debian ollama[6012]: time=2025-05-16T17:44:51.732+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.08"
May 16 17:44:51 Debian ollama[6012]: time=2025-05-16T17:44:51.983+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.09"
May 16 17:44:52 Debian ollama[6012]: time=2025-05-16T17:44:52.234+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.10"
May 16 17:44:52 Debian ollama[6012]: time=2025-05-16T17:44:52.484+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.11"
May 16 17:44:52 Debian ollama[6012]: time=2025-05-16T17:44:52.735+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.12"
May 16 17:44:52 Debian ollama[6012]: time=2025-05-16T17:44:52.986+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.13"
May 16 17:44:53 Debian ollama[6012]: time=2025-05-16T17:44:53.236+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.14"
May 16 17:44:53 Debian ollama[6012]: time=2025-05-16T17:44:53.487+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.15"
May 16 17:44:53 Debian ollama[6012]: time=2025-05-16T17:44:53.738+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.16"
May 16 17:44:53 Debian ollama[6012]: time=2025-05-16T17:44:53.988+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.17"
May 16 17:44:54 Debian ollama[6012]: time=2025-05-16T17:44:54.239+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.18"
May 16 17:44:54 Debian ollama[6012]: time=2025-05-16T17:44:54.490+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.19"
May 16 17:44:54 Debian ollama[6012]: time=2025-05-16T17:44:54.740+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.20"
May 16 17:44:54 Debian ollama[6012]: time=2025-05-16T17:44:54.991+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.21"
May 16 17:44:55 Debian ollama[6012]: time=2025-05-16T17:44:55.242+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.22"
May 16 17:44:55 Debian ollama[6012]: time=2025-05-16T17:44:55.492+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.23"
May 16 17:44:55 Debian ollama[6012]: time=2025-05-16T17:44:55.743+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.24"
May 16 17:44:55 Debian ollama[6012]: time=2025-05-16T17:44:55.994+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.25"
May 16 17:44:56 Debian ollama[6012]: time=2025-05-16T17:44:56.244+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.26"
May 16 17:44:56 Debian ollama[6012]: time=2025-05-16T17:44:56.495+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.27"
May 16 17:44:56 Debian ollama[6012]: time=2025-05-16T17:44:56.745+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.28"
May 16 17:44:56 Debian ollama[6012]: time=2025-05-16T17:44:56.996+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.29"
May 16 17:44:57 Debian ollama[6012]: time=2025-05-16T17:44:57.247+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.30"
May 16 17:44:57 Debian ollama[6012]: time=2025-05-16T17:44:57.497+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.31"
May 16 17:44:57 Debian ollama[6012]: time=2025-05-16T17:44:57.748+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.32"
May 16 17:44:57 Debian ollama[6012]: time=2025-05-16T17:44:57.999+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.33"
May 16 17:44:58 Debian ollama[6012]: time=2025-05-16T17:44:58.249+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.34"
May 16 17:44:58 Debian ollama[6012]: time=2025-05-16T17:44:58.500+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.35"
May 16 17:44:58 Debian ollama[6012]: time=2025-05-16T17:44:58.751+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.36"
May 16 17:44:59 Debian ollama[6012]: time=2025-05-16T17:44:59.002+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.37"
May 16 17:44:59 Debian ollama[6012]: time=2025-05-16T17:44:59.253+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.38"
May 16 17:44:59 Debian ollama[6012]: time=2025-05-16T17:44:59.505+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.39"
May 16 17:44:59 Debian ollama[6012]: time=2025-05-16T17:44:59.755+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.40"
May 16 17:45:00 Debian ollama[6012]: time=2025-05-16T17:45:00.007+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.42"
May 16 17:45:00 Debian ollama[6012]: time=2025-05-16T17:45:00.257+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.43"
May 16 17:45:00 Debian ollama[6012]: time=2025-05-16T17:45:00.508+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.44"
May 16 17:45:00 Debian ollama[6012]: time=2025-05-16T17:45:00.759+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.45"
May 16 17:45:01 Debian ollama[6012]: time=2025-05-16T17:45:01.009+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.46"
May 16 17:45:01 Debian ollama[6012]: time=2025-05-16T17:45:01.260+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.47"
May 16 17:45:01 Debian ollama[6012]: time=2025-05-16T17:45:01.511+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.48"
May 16 17:45:01 Debian ollama[6012]: time=2025-05-16T17:45:01.761+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.49"
May 16 17:45:02 Debian ollama[6012]: time=2025-05-16T17:45:02.012+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.50"
May 16 17:45:02 Debian ollama[6012]: time=2025-05-16T17:45:02.263+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.51"
May 16 17:45:02 Debian ollama[6012]: time=2025-05-16T17:45:02.513+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.52"
May 16 17:45:02 Debian ollama[6012]: time=2025-05-16T17:45:02.764+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.53"
May 16 17:45:03 Debian ollama[6012]: time=2025-05-16T17:45:03.015+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.54"
May 16 17:45:03 Debian ollama[6012]: time=2025-05-16T17:45:03.265+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.55"
May 16 17:45:03 Debian ollama[6012]: time=2025-05-16T17:45:03.516+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.56"
May 16 17:45:03 Debian ollama[6012]: time=2025-05-16T17:45:03.766+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.57"
May 16 17:45:04 Debian ollama[6012]: time=2025-05-16T17:45:04.017+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.58"
May 16 17:45:04 Debian ollama[6012]: time=2025-05-16T17:45:04.268+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.59"
May 16 17:45:04 Debian ollama[6012]: time=2025-05-16T17:45:04.520+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.60"
May 16 17:45:04 Debian ollama[6012]: time=2025-05-16T17:45:04.771+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.61"
May 16 17:45:05 Debian ollama[6012]: time=2025-05-16T17:45:05.021+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.62"
May 16 17:45:05 Debian ollama[6012]: time=2025-05-16T17:45:05.272+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.63"
May 16 17:45:05 Debian ollama[6012]: time=2025-05-16T17:45:05.523+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.64"
May 16 17:45:05 Debian ollama[6012]: time=2025-05-16T17:45:05.773+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.65"
May 16 17:45:06 Debian ollama[6012]: time=2025-05-16T17:45:06.024+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.66"
May 16 17:45:06 Debian ollama[6012]: time=2025-05-16T17:45:06.275+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.67"
May 16 17:45:06 Debian ollama[6012]: time=2025-05-16T17:45:06.526+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.68"
May 16 17:45:06 Debian ollama[6012]: time=2025-05-16T17:45:06.777+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.70"
May 16 17:45:07 Debian ollama[6012]: time=2025-05-16T17:45:07.027+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.71"
May 16 17:45:07 Debian ollama[6012]: time=2025-05-16T17:45:07.278+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.72"
May 16 17:45:07 Debian ollama[6012]: time=2025-05-16T17:45:07.529+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.73"
May 16 17:45:07 Debian ollama[6012]: time=2025-05-16T17:45:07.779+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.74"
May 16 17:45:08 Debian ollama[6012]: time=2025-05-16T17:45:08.030+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.75"
May 16 17:45:08 Debian ollama[6012]: time=2025-05-16T17:45:08.281+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.76"
May 16 17:45:08 Debian ollama[6012]: time=2025-05-16T17:45:08.532+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.77"
May 16 17:45:08 Debian ollama[6012]: time=2025-05-16T17:45:08.783+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.78"
May 16 17:45:09 Debian ollama[6012]: time=2025-05-16T17:45:09.033+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.79"
May 16 17:45:09 Debian ollama[6012]: time=2025-05-16T17:45:09.284+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.80"
May 16 17:45:09 Debian ollama[6012]: time=2025-05-16T17:45:09.535+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.81"
May 16 17:45:09 Debian ollama[6012]: time=2025-05-16T17:45:09.785+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.82"
May 16 17:45:10 Debian ollama[6012]: time=2025-05-16T17:45:10.036+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.83"
May 16 17:45:10 Debian ollama[6012]: time=2025-05-16T17:45:10.287+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.84"
May 16 17:45:10 Debian ollama[6012]: time=2025-05-16T17:45:10.537+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.85"
May 16 17:45:10 Debian ollama[6012]: time=2025-05-16T17:45:10.788+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.86"
May 16 17:45:11 Debian ollama[6012]: time=2025-05-16T17:45:11.039+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.87"
May 16 17:45:11 Debian ollama[6012]: time=2025-05-16T17:45:11.289+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.88"
May 16 17:45:11 Debian ollama[6012]: time=2025-05-16T17:45:11.540+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.89"
May 16 17:45:11 Debian ollama[6012]: time=2025-05-16T17:45:11.791+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.90"
May 16 17:45:12 Debian ollama[6012]: time=2025-05-16T17:45:12.041+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.91"
May 16 17:45:12 Debian ollama[6012]: time=2025-05-16T17:45:12.292+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.92"
May 16 17:45:12 Debian ollama[6012]: time=2025-05-16T17:45:12.543+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.93"
May 16 17:45:12 Debian ollama[6012]: time=2025-05-16T17:45:12.793+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.94"
May 16 17:45:13 Debian ollama[6012]: time=2025-05-16T17:45:13.044+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.95"
May 16 17:45:13 Debian ollama[6012]: time=2025-05-16T17:45:13.295+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.96"
May 16 17:45:13 Debian ollama[6012]: time=2025-05-16T17:45:13.545+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.97"
May 16 17:45:13 Debian ollama[6012]: time=2025-05-16T17:45:13.796+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.97"
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.047+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.98"
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.297+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.99"
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.548+02:00 level=DEBUG source=server.go:636 msg="model load progress 1.00"
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.732+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+(?i:'s|'t|'re|'ve|'m|'ll|'d)?|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*(?i:'s|'t|'re|'ve|'m|'ll|'d)?|\\p{N}{1,3}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.num_channels default=3
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.max_upscaling_size default=448
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.rope.freq_scale default=1
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.no_rope_interval default=4
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.attention.temperature_tuning default=true
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.attention.scale default=0.10000000149011612
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.attention.floor_scale default=8192
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.800+02:00 level=DEBUG source=server.go:636 msg="model load progress 1.00"
May 16 17:45:15 Debian ollama[6012]: time=2025-05-16T17:45:15.052+02:00 level=DEBUG source=server.go:639 msg="model load completed, waiting for server to become available" status="llm server loading model"
May 16 17:45:15 Debian ollama[6012]: time=2025-05-16T17:45:15.505+02:00 level=INFO source=server.go:625 msg="waiting for server to become available" status="llm server not responding"
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.661+02:00 level=ERROR source=sched.go:478 msg="error loading llama server" error="llama runner process has terminated: signal: killed"
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.662+02:00 level=DEBUG source=sched.go:480 msg="triggering expiration for failed load" runner.name=registry.ollama.ai/library/llama4:scout runner.inference=cuda runner.devices=1 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 runner.num_ctx=24576
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.663+02:00 level=DEBUG source=sched.go:364 msg="runner expired event received" runner.name=registry.ollama.ai/library/llama4:scout runner.inference=cuda runner.devices=1 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 runner.num_ctx=24576
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.663+02:00 level=DEBUG source=sched.go:379 msg="got lock to unload expired event" runner.name=registry.ollama.ai/library/llama4:scout runner.inference=cuda runner.devices=1 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 runner.num_ctx=24576
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.664+02:00 level=DEBUG source=sched.go:391 msg="starting background wait for VRAM recovery" runner.name=registry.ollama.ai/library/llama4:scout runner.inference=cuda runner.devices=1 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 runner.num_ctx=24576
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.665+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.1 GiB" before.free_swap="215.1 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="52.6 MiB"
May 16 17:45:16 Debian ollama[6012]: [GIN] 2025/05/16 - 17:45:16 | 500 | 28.018068622s |       127.0.0.1 | POST     "/api/generate"
May 16 17:45:16 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:16 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:16 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:16 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:16 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:16 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:16 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:16 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:16 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:16 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:16 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:16 Debian ollama[6012]: calling cuInit
May 16 17:45:16 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:16 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:16 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:16 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:16 Debian ollama[6012]: device count 1
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.831+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.4 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:16 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.832+02:00 level=DEBUG source=server.go:1023 msg="stopping llama server" pid=6062
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.832+02:00 level=DEBUG source=sched.go:396 msg="runner terminated and removed from list, blocking for VRAM recovery" runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="52.6 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="63.4 MiB"
May 16 17:45:17 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:17 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:17 Debian ollama[6012]: calling cuInit
May 16 17:45:17 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:17 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:17 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:17 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:17 Debian ollama[6012]: device count 1
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.235+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:17 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.333+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="63.4 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="70.3 MiB"
May 16 17:45:17 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:17 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:17 Debian ollama[6012]: calling cuInit
May 16 17:45:17 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:17 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:17 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:17 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:17 Debian ollama[6012]: device count 1
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.445+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:17 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="70.3 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="71.8 MiB"
May 16 17:45:17 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:17 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:17 Debian ollama[6012]: calling cuInit
May 16 17:45:17 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:17 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:17 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:17 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:17 Debian ollama[6012]: device count 1
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:17 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="71.8 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.2 MiB"
May 16 17:45:17 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:17 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:17 Debian ollama[6012]: calling cuInit
May 16 17:45:17 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:17 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:17 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:17 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:17 Debian ollama[6012]: device count 1
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.925+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:17 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.2 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.5 MiB"
May 16 17:45:18 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:18 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:18 Debian ollama[6012]: calling cuInit
May 16 17:45:18 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:18 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:18 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:18 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:18 Debian ollama[6012]: device count 1
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.182+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:18 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.332+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.5 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.7 MiB"
May 16 17:45:18 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:18 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:18 Debian ollama[6012]: calling cuInit
May 16 17:45:18 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:18 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:18 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:18 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:18 Debian ollama[6012]: device count 1
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.426+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:18 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.7 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.7 MiB"
May 16 17:45:18 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:18 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:18 Debian ollama[6012]: calling cuInit
May 16 17:45:18 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:18 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:18 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:18 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:18 Debian ollama[6012]: device count 1
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:18 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.7 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.9 MiB"
May 16 17:45:18 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:18 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:18 Debian ollama[6012]: calling cuInit
May 16 17:45:18 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:18 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:18 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:18 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:18 Debian ollama[6012]: device count 1
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.930+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:18 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.9 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="72.9 MiB"
May 16 17:45:19 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:19 Debian ollama[6012]: calling cuInit
May 16 17:45:19 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:19 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:19 Debian ollama[6012]: device count 1
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.184+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.332+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="72.9 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:19 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:19 Debian ollama[6012]: calling cuInit
May 16 17:45:19 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:19 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:19 Debian ollama[6012]: device count 1
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.427+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:19 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:19 Debian ollama[6012]: calling cuInit
May 16 17:45:19 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:19 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:19 Debian ollama[6012]: device count 1
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:19 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:19 Debian ollama[6012]: calling cuInit
May 16 17:45:19 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:19 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:19 Debian ollama[6012]: device count 1
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.926+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:20 Debian ollama[6012]: calling cuInit
May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:20 Debian ollama[6012]: device count 1
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.183+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.333+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:20 Debian ollama[6012]: calling cuInit
May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:20 Debian ollama[6012]: device count 1
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.426+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:20 Debian ollama[6012]: calling cuInit
May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:20 Debian ollama[6012]: device count 1
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:20 Debian ollama[6012]: calling cuInit
May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:20 Debian ollama[6012]: device count 1
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.926+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:21 Debian ollama[6012]: calling cuInit
May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:21 Debian ollama[6012]: device count 1
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.182+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.333+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:21 Debian ollama[6012]: calling cuInit
May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:21 Debian ollama[6012]: device count 1
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.426+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:21 Debian ollama[6012]: calling cuInit
May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:21 Debian ollama[6012]: device count 1
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.832+02:00 level=WARN source=sched.go:676 msg="gpu VRAM usage didn't recover within timeout" seconds=5.168159089 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.833+02:00 level=DEBUG source=sched.go:399 msg="sending an unloaded event" runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:21 Debian ollama[6012]: calling cuInit
May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:21 Debian ollama[6012]: device count 1
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.833+02:00 level=DEBUG source=sched.go:312 msg="ignoring unload event with no pending requests"
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.927+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.082+02:00 level=WARN source=sched.go:676 msg="gpu VRAM usage didn't recover within timeout" seconds=5.418638538 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5
May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.1 MiB"
May 16 17:45:22 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:22 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:22 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:22 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:22 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:22 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:22 Debian ollama[6012]: calling cuInit
May 16 17:45:22 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:22 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:22 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:22 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:22 Debian ollama[6012]: device count 1
May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.181+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:22 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.333+02:00 level=WARN source=sched.go:676 msg="gpu VRAM usage didn't recover within timeout" seconds=5.668763082 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5
May 16 17:45:51 Debian ollama[6012]: [GIN] 2025/05/16 - 17:45:51 | 200 |     530.546µs |       127.0.0.1 | HEAD     "/"
May 16 17:45:51 Debian ollama[6012]: [GIN] 2025/05/16 - 17:45:51 | 200 |   60.418543ms |       127.0.0.1 | POST     "/api/generate"
May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.484+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:399a8a5a36db7e6f011306e5720ce0d84dcc6f7b71765a05c6e74e323b6965a2
May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.489+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:24ca191a372b46ea0a07eaa1c5fdb3d983be200f32cc15dcc308fca1421c87d5
May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.494+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:399a8a5a36db7e6f011306e5720ce0d84dcc6f7b71765a05c6e74e323b6965a2
May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.499+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:24ca191a372b46ea0a07eaa1c5fdb3d983be200f32cc15dcc308fca1421c87d5
May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.512+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:399a8a5a36db7e6f011306e5720ce0d84dcc6f7b71765a05c6e74e323b6965a2
May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.516+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:24ca191a372b46ea0a07eaa1c5fdb3d983be200f32cc15dcc308fca1421c87d5
overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:18 Debian ollama[6012]: releasing cuda driver library May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.7 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.7 MiB" May 16 17:45:18 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:18 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:18 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:18 Debian ollama[6012]: calling cuInit May 16 17:45:18 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:18 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:18 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:18 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:18 Debian ollama[6012]: device count 1 May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" 
now.used="262.9 MiB" May 16 17:45:18 Debian ollama[6012]: releasing cuda driver library May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.7 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.9 MiB" May 16 17:45:18 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:18 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:18 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:18 Debian ollama[6012]: calling cuInit May 16 17:45:18 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:18 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:18 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:18 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:18 Debian ollama[6012]: device count 1 May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.930+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:18 Debian ollama[6012]: releasing cuda driver library May 16 17:45:19 Debian 
ollama[6012]: time=2025-05-16T17:45:19.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.9 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="72.9 MiB" May 16 17:45:19 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:19 Debian ollama[6012]: calling cuInit May 16 17:45:19 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:19 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:19 Debian ollama[6012]: device count 1 May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.184+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.332+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory 
data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="72.9 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:19 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:19 Debian ollama[6012]: calling cuInit May 16 17:45:19 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:19 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:19 Debian ollama[6012]: device count 1 May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.427+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" 
now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:19 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:19 Debian ollama[6012]: calling cuInit May 16 17:45:19 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:19 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:19 Debian ollama[6012]: device count 1 May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:19 Debian ollama[6012]: initializing 
/usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:19 Debian ollama[6012]: calling cuInit May 16 17:45:19 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:19 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:19 Debian ollama[6012]: device count 1 May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.926+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 
0x7f32e6974790 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:20 Debian ollama[6012]: calling cuInit May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:20 Debian ollama[6012]: device count 1 May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.183+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.333+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:20 
Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:20 Debian ollama[6012]: calling cuInit May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:20 Debian ollama[6012]: device count 1 May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.426+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:20 Debian ollama[6012]: dlsym: 
cuDeviceGet - 0x7f32e6974910 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:20 Debian ollama[6012]: calling cuInit May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:20 Debian ollama[6012]: device count 1 May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:20 Debian ollama[6012]: calling cuInit May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:20 Debian ollama[6012]: device count 1 May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.926+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:21 Debian 
ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:21 Debian ollama[6012]: calling cuInit May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:21 Debian ollama[6012]: device count 1 May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.182+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.333+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:21 Debian ollama[6012]: dlsym: 
cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:21 Debian ollama[6012]: calling cuInit May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:21 Debian ollama[6012]: device count 1 May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.426+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:21 Debian ollama[6012]: calling cuInit May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:21 Debian ollama[6012]: device count 1 May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.832+02:00 level=WARN source=sched.go:676 msg="gpu VRAM usage didn't recover within timeout" seconds=5.168159089 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.833+02:00 level=DEBUG source=sched.go:399 msg="sending an unloaded event" runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:21 Debian 
ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:21 Debian ollama[6012]: calling cuInit May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:21 Debian ollama[6012]: device count 1 May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.833+02:00 level=DEBUG source=sched.go:312 msg="ignoring unload event with no pending requests" May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.927+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.082+02:00 level=WARN source=sched.go:676 msg="gpu VRAM usage didn't recover within timeout" seconds=5.418638538 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 May 16 17:45:22 Debian ollama[6012]: 
time=2025-05-16T17:45:22.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.1 MiB" May 16 17:45:22 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:22 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:22 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:22 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:22 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:22 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:22 Debian ollama[6012]: calling cuInit May 16 17:45:22 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:22 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:22 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:22 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:22 Debian ollama[6012]: device count 1 May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.181+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:22 Debian ollama[6012]: releasing cuda driver library May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.333+02:00 level=WARN source=sched.go:676 msg="gpu VRAM usage didn't recover within 
timeout" seconds=5.668763082 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 May 16 17:45:51 Debian ollama[6012]: [GIN] 2025/05/16 - 17:45:51 | 200 | 530.546µs | 127.0.0.1 | HEAD "/" May 16 17:45:51 Debian ollama[6012]: [GIN] 2025/05/16 - 17:45:51 | 200 | 60.418543ms | 127.0.0.1 | POST "/api/generate" May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.484+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:399a8a5a36db7e6f011306e5720ce0d84dcc6f7b71765a05c6e74e323b6965a2 May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.489+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:24ca191a372b46ea0a07eaa1c5fdb3d983be200f32cc15dcc308fca1421c87d5 May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.494+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:399a8a5a36db7e6f011306e5720ce0d84dcc6f7b71765a05c6e74e323b6965a2 May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.499+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:24ca191a372b46ea0a07eaa1c5fdb3d983be200f32cc15dcc308fca1421c87d5 May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.512+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:399a8a5a36db7e6f011306e5720ce0d84dcc6f7b71765a05c6e74e323b6965a2 May 16 17:45:51 Debian ollama[6012]: time=2025-05-16T17:45:51.516+02:00 level=DEBUG source=manifest.go:53 msg="layer does not exist" digest=sha256:24ca191a372b46ea0a07eaa1c5fdb3d983be200f32cc15dcc308fca1421c87d5 ```

@rick-github commented on GitHub (May 16, 2025):

May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.661+02:00 level=ERROR source=sched.go:478 msg="error loading llama server" error="llama runner process has terminated: signal: killed"

`signal: killed` likely means that an external actor terminated the runner. This is usually the kernel killing a process that it thinks is going to cause an OOM issue, and there will probably be an entry in the system log. If this were a full log, we would be able to see what memory allocations ollama had made preceding this failure.
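
One way to check for such an entry is to filter the kernel log for OOM-killer messages. This is only a sketch, assuming a systemd-based Linux host where `journalctl -k` shows kernel messages. The `oom_lines` helper name and the time window in the usage comments are illustrative and not part of ollama.

```shell
#!/bin/sh
# Filter log text on stdin for lines that typically accompany a kernel OOM kill
# ("Out of memory: Killed process ...", "... invoked oom-killer", "oom_reaper: ...").
# Reading stdin keeps the helper usable with journalctl, dmesg, or a saved file.
oom_lines() {
    grep -i -E 'out of memory|oom-kill|oom_reaper'
}

# Usage on a systemd machine (the window is an assumption; adjust it so that
# it brackets the time of the "signal: killed" error):
#   sudo journalctl -k --since "2025-05-16 17:44:00" --until "2025-05-16 17:46:00" | oom_lines
# Without systemd:
#   sudo dmesg -T | oom_lines
```

If nothing is printed, the kernel log for that window has no OOM-killer entry, and something else may have sent the kill signal.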


@lemassykoi commented on GitHub (May 17, 2025):

What do you mean by "full log"? I read your link.


@rick-github commented on GitHub (May 17, 2025):

A log that includes events from before May 16 17:44:49.


@lemassykoi commented on GitHub (May 17, 2025):

I deleted the model, rebooted the computer, and downloaded it again. Now there are no more errors, and the prompt is waiting for my input.


@lemassykoi commented on GitHub (May 17, 2025):

Full log from the problem:

sudo journalctl --since "2025-05-16 17:44:47" --until "2025-05-16 17:45:51" -u ollama

May 16 17:44:47 Debian systemd[1]: Stopping ollama.service - Ollama Service...
May 16 17:44:47 Debian systemd[1]: ollama.service: Deactivated successfully.
May 16 17:44:47 Debian systemd[1]: Stopped ollama.service - Ollama Service.
May 16 17:44:47 Debian systemd[1]: ollama.service: Consumed 56.626s CPU time.
May 16 17:44:47 Debian systemd[1]: Started ollama.service - Ollama Service.
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.146+02:00 level=INFO source=routes.go:1205 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:8192 OLLAMA_DEBUG:DEBUG OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/mnt/LLM/Ollama/Models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:3 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.211+02:00 level=INFO source=images.go:463 msg="total blobs: 301"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.221+02:00 level=INFO source=images.go:470 msg="total unused blobs removed: 0"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.225+02:00 level=INFO source=routes.go:1258 msg="Listening on [::]:11434 (version 0.7.0)"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.225+02:00 level=DEBUG source=sched.go:108 msg="starting llm scheduler"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.225+02:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.226+02:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.226+02:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=libcuda.so*
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.226+02:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[/usr/local/lib/ollama/libcuda.so* /libcuda.so* /usr/local/cuda*/targets/*/lib/libcuda.so* /usr/lib/*-linux-gnu/nvidia/current/libcuda.so* /usr/lib/*-linux-gnu/libcuda.so* /usr/lib/wsl/lib/libcuda.so* /usr/lib/wsl/drivers/*/libcuda.so* /opt/cuda/lib*/libcuda.so* /usr/local/cuda/lib*/libcuda.so* /usr/lib*/libcuda.so* /usr/local/lib*/libcuda.so*]"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.231+02:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths=[/usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03]
May 16 17:44:47 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:44:47 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:44:47 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:44:47 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:44:47 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:44:47 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:44:47 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:44:47 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:44:47 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:44:47 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:44:47 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:44:47 Debian ollama[6012]: calling cuInit
May 16 17:44:47 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:44:47 Debian ollama[6012]: raw version 0x2f3a
May 16 17:44:47 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:44:47 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:44:47 Debian ollama[6012]: device count 1
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.304+02:00 level=DEBUG source=gpu.go:125 msg="detected GPUs" count=1 library=/usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:44:47 Debian ollama[6012]: [GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6] CUDA totalMem 11872mb
May 16 17:44:47 Debian ollama[6012]: [GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6] CUDA freeMem 11655mb
May 16 17:44:47 Debian ollama[6012]: [GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6] Compute Capability 8.9
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.407+02:00 level=DEBUG source=amd_linux.go:419 msg="amdgpu driver not detected /sys/module/amdgpu"
May 16 17:44:47 Debian ollama[6012]: releasing cuda driver library
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.407+02:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 library=cuda variant=v12 compute=8.9 driver=12.9 name="NVIDIA GeForce RTX 4070 Ti" total="11.6 GiB" available="11.4 GiB"
May 16 17:44:48 Debian ollama[6012]: [GIN] 2025/05/16 - 17:44:48 | 200 |     630.787µs |       127.0.0.1 | HEAD     "/"
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.620+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.alignment default=32
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.644+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.alignment default=32
May 16 17:44:48 Debian ollama[6012]: [GIN] 2025/05/16 - 17:44:48 | 200 |   92.467358ms |       127.0.0.1 | POST     "/api/show"
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.672+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.alignment default=32
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.672+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="215.1 MiB" now.total="62.6 GiB" now.free="59.1 GiB" now.free_swap="215.1 MiB"
May 16 17:44:48 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:44:48 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:44:48 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:44:48 Debian ollama[6012]: calling cuInit
May 16 17:44:48 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:44:48 Debian ollama[6012]: raw version 0x2f3a
May 16 17:44:48 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:44:48 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:44:48 Debian ollama[6012]: device count 1
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.768+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.4 GiB" now.total="11.6 GiB" now.free="11.4 GiB" now.used="216.9 MiB"
May 16 17:44:48 Debian ollama[6012]: releasing cuda driver library
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.768+02:00 level=DEBUG source=sched.go:185 msg="updating default concurrency" OLLAMA_MAX_LOADED_MODELS=3 gpu_count=1
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.785+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.alignment default=32
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.802+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.alignment default=32
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.802+02:00 level=DEBUG source=sched.go:228 msg="loading first model" model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.802+02:00 level=DEBUG source=memory.go:111 msg=evaluating library=cuda gpu_count=1 available="[11.4 GiB]"
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.803+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.num_channels default=0
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.803+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.1 GiB" before.free_swap="215.1 MiB" now.total="62.6 GiB" now.free="59.1 GiB" now.free_swap="215.1 MiB"
May 16 17:44:48 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:44:48 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:44:48 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:44:48 Debian ollama[6012]: calling cuInit
May 16 17:44:48 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:44:48 Debian ollama[6012]: raw version 0x2f3a
May 16 17:44:48 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:44:48 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:44:48 Debian ollama[6012]: device count 1
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.896+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.4 GiB" now.total="11.6 GiB" now.free="11.4 GiB" now.used="216.9 MiB"
May 16 17:44:48 Debian ollama[6012]: releasing cuda driver library
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.896+02:00 level=DEBUG source=memory.go:111 msg=evaluating library=cuda gpu_count=1 available="[11.4 GiB]"
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.897+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.num_channels default=0
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.897+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.1 GiB" before.free_swap="215.1 MiB" now.total="62.6 GiB" now.free="59.1 GiB" now.free_swap="215.1 MiB"
May 16 17:44:48 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:44:48 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:44:48 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:44:48 Debian ollama[6012]: calling cuInit
May 16 17:44:48 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:44:48 Debian ollama[6012]: raw version 0x2f3a
May 16 17:44:48 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:44:48 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:44:48 Debian ollama[6012]: device count 1
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.994+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.4 GiB" now.total="11.6 GiB" now.free="11.4 GiB" now.used="216.9 MiB"
May 16 17:44:48 Debian ollama[6012]: releasing cuda driver library
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.994+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.1 GiB" before.free_swap="215.1 MiB" now.total="62.6 GiB" now.free="59.1 GiB" now.free_swap="215.1 MiB"
May 16 17:44:48 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:44:48 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:44:48 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:44:48 Debian ollama[6012]: calling cuInit
May 16 17:44:48 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:44:48 Debian ollama[6012]: raw version 0x2f3a
May 16 17:44:48 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:44:48 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:44:48 Debian ollama[6012]: device count 1
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.091+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.4 GiB" now.total="11.6 GiB" now.free="11.4 GiB" now.used="216.9 MiB"
May 16 17:44:49 Debian ollama[6012]: releasing cuda driver library
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.091+02:00 level=INFO source=server.go:135 msg="system memory" total="62.6 GiB" free="59.1 GiB" free_swap="215.1 MiB"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.091+02:00 level=DEBUG source=memory.go:111 msg=evaluating library=cuda gpu_count=1 available="[11.4 GiB]"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.092+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.num_channels default=0
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.092+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.1 GiB" before.free_swap="215.1 MiB" now.total="62.6 GiB" now.free="59.1 GiB" now.free_swap="215.1 MiB"
May 16 17:44:49 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:44:49 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:44:49 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:44:49 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:44:49 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:44:49 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:44:49 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:44:49 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:44:49 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:44:49 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:44:49 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:44:49 Debian ollama[6012]: calling cuInit
May 16 17:44:49 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:44:49 Debian ollama[6012]: raw version 0x2f3a
May 16 17:44:49 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:44:49 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:44:49 Debian ollama[6012]: device count 1
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.187+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.4 GiB" now.total="11.6 GiB" now.free="11.4 GiB" now.used="216.9 MiB"
May 16 17:44:49 Debian ollama[6012]: releasing cuda driver library
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.187+02:00 level=INFO source=server.go:168 msg=offload library=cuda layers.requested=-1 layers.model=49 layers.offload=4 layers.split="" memory.available="[11.4 GiB]" memory.gpu_overhead="0 B" memory.required.full="68.3 GiB" memory.required.partial="10.9 GiB" memory.required.kv="2.2 GiB" memory.required.allocations="[10.9 GiB]" memory.weights.total="60.6 GiB" memory.weights.repeating="59.8 GiB" memory.weights.nonrepeating="809.3 MiB" memory.graph.full="2.0 GiB" memory.graph.partial="2.0 GiB" projector.weights="1.6 GiB" projector.graph="0 B"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.187+02:00 level=INFO source=server.go:211 msg="enabling flash attention"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.187+02:00 level=DEBUG source=server.go:284 msg="compatible gpu libraries" compatible="[cuda_v12 cuda_v11]"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.222+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.alignment default=32
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+(?i:'s|'t|'re|'ve|'m|'ll|'d)?|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*(?i:'s|'t|'re|'ve|'m|'ll|'d)?|\\p{N}{1,3}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.num_channels default=3
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.max_upscaling_size default=448
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.rope.freq_scale default=1
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.no_rope_interval default=4
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.attention.temperature_tuning default=true
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.attention.scale default=0.10000000149011612
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.attention.floor_scale default=8192
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.224+02:00 level=DEBUG source=server.go:360 msg="adding gpu library" path=/usr/local/lib/ollama/cuda_v12
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.224+02:00 level=DEBUG source=server.go:367 msg="adding gpu dependency paths" paths=[/usr/local/lib/ollama/cuda_v12]
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.224+02:00 level=INFO source=server.go:431 msg="starting llama server" cmd="/usr/local/bin/ollama runner --ollama-engine --model /mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 --ctx-size 24576 --batch-size 512 --n-gpu-layers 4 --threads 8 --flash-attn --kv-cache-type q8_0 --no-mmap --parallel 3 --port 39075"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.224+02:00 level=DEBUG source=server.go:432 msg=subprocess PATH=/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/local/games OLLAMA_HOST=0.0.0.0:11434 OLLAMA_FLASH_ATTENTION=1 OLLAMA_MODELS=/mnt/LLM/Ollama/Models OLLAMA_KV_CACHE_TYPE=q8_0 OLLAMA_CONTEXT_LENGTH=8192 OLLAMA_DEBUG=1 GGML_CUDA_ENABLE_UNIFIED_MEMORY=1 OLLAMA_NUM_PARALLEL=3 OLLAMA_MAX_LOADED_MODELS=3 OLLAMA_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/cuda_v12 LD_LIBRARY_PATH=/usr/local/lib/ollama/cuda_v12:/usr/local/lib/ollama/cuda_v12:/usr/local/lib/ollama:/usr/local/lib/ollama CUDA_VISIBLE_DEVICES=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.224+02:00 level=INFO source=sched.go:472 msg="loaded runners" count=1
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.224+02:00 level=INFO source=server.go:591 msg="waiting for llama runner to start responding"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.225+02:00 level=INFO source=server.go:625 msg="waiting for server to become available" status="llm server not responding"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.232+02:00 level=INFO source=runner.go:836 msg="starting ollama engine"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.233+02:00 level=INFO source=runner.go:899 msg="Server listening on 127.0.0.1:39075"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.277+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.alignment default=32
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.278+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.name default=""
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.278+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.description default=""
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.278+02:00 level=INFO source=ggml.go:73 msg="" architecture=llama4 file_type=Q4_K_M name="" description="" num_tensors=1182 num_key_values=45
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.278+02:00 level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama
May 16 17:44:49 Debian ollama[6012]: load_backend: loaded CPU backend from /usr/local/lib/ollama/libggml-cpu-alderlake.so
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.285+02:00 level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama/cuda_v12
May 16 17:44:49 Debian ollama[6012]: ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
May 16 17:44:49 Debian ollama[6012]: ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
May 16 17:44:49 Debian ollama[6012]: ggml_cuda_init: found 1 CUDA devices:
May 16 17:44:49 Debian ollama[6012]:   Device 0: NVIDIA GeForce RTX 4070 Ti, compute capability 8.9, VMM: yes
May 16 17:44:49 Debian ollama[6012]: load_backend: loaded CUDA backend from /usr/local/lib/ollama/cuda_v12/libggml-cuda.so
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.398+02:00 level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX_VNNI=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 CUDA.0.ARCHS=500,600,610,700,750,800,860,870,890,900,1200 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(gcc)
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.467+02:00 level=INFO source=ggml.go:299 msg="model weights" buffer=CPU size="57.5 GiB"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.467+02:00 level=INFO source=ggml.go:299 msg="model weights" buffer=CUDA0 size="5.3 GiB"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.476+02:00 level=INFO source=server.go:625 msg="waiting for server to become available" status="llm server loading model"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.476+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.00"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.727+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.01"
May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.978+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.02"
May 16 17:44:50 Debian ollama[6012]: time=2025-05-16T17:44:50.228+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.03"
May 16 17:44:50 Debian ollama[6012]: time=2025-05-16T17:44:50.479+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.04"
May 16 17:44:50 Debian ollama[6012]: time=2025-05-16T17:44:50.729+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.04"
May 16 17:44:50 Debian ollama[6012]: time=2025-05-16T17:44:50.980+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.05"
May 16 17:44:51 Debian ollama[6012]: time=2025-05-16T17:44:51.231+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.06"
May 16 17:44:51 Debian ollama[6012]: time=2025-05-16T17:44:51.481+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.07"
May 16 17:44:51 Debian ollama[6012]: time=2025-05-16T17:44:51.732+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.08"
May 16 17:44:51 Debian ollama[6012]: time=2025-05-16T17:44:51.983+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.09"
May 16 17:44:52 Debian ollama[6012]: time=2025-05-16T17:44:52.234+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.10"
May 16 17:44:52 Debian ollama[6012]: time=2025-05-16T17:44:52.484+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.11"
May 16 17:44:52 Debian ollama[6012]: time=2025-05-16T17:44:52.735+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.12"
May 16 17:44:52 Debian ollama[6012]: time=2025-05-16T17:44:52.986+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.13"
May 16 17:44:53 Debian ollama[6012]: time=2025-05-16T17:44:53.236+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.14"
May 16 17:44:53 Debian ollama[6012]: time=2025-05-16T17:44:53.487+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.15"
May 16 17:44:53 Debian ollama[6012]: time=2025-05-16T17:44:53.738+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.16"
May 16 17:44:53 Debian ollama[6012]: time=2025-05-16T17:44:53.988+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.17"
May 16 17:44:54 Debian ollama[6012]: time=2025-05-16T17:44:54.239+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.18"
May 16 17:44:54 Debian ollama[6012]: time=2025-05-16T17:44:54.490+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.19"
May 16 17:44:54 Debian ollama[6012]: time=2025-05-16T17:44:54.740+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.20"
May 16 17:44:54 Debian ollama[6012]: time=2025-05-16T17:44:54.991+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.21"
May 16 17:44:55 Debian ollama[6012]: time=2025-05-16T17:44:55.242+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.22"
May 16 17:44:55 Debian ollama[6012]: time=2025-05-16T17:44:55.492+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.23"
May 16 17:44:55 Debian ollama[6012]: time=2025-05-16T17:44:55.743+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.24"
May 16 17:44:55 Debian ollama[6012]: time=2025-05-16T17:44:55.994+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.25"
May 16 17:44:56 Debian ollama[6012]: time=2025-05-16T17:44:56.244+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.26"
May 16 17:44:56 Debian ollama[6012]: time=2025-05-16T17:44:56.495+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.27"
May 16 17:44:56 Debian ollama[6012]: time=2025-05-16T17:44:56.745+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.28"
May 16 17:44:56 Debian ollama[6012]: time=2025-05-16T17:44:56.996+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.29"
May 16 17:44:57 Debian ollama[6012]: time=2025-05-16T17:44:57.247+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.30"
May 16 17:44:57 Debian ollama[6012]: time=2025-05-16T17:44:57.497+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.31"
May 16 17:44:57 Debian ollama[6012]: time=2025-05-16T17:44:57.748+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.32"
May 16 17:44:57 Debian ollama[6012]: time=2025-05-16T17:44:57.999+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.33"
May 16 17:44:58 Debian ollama[6012]: time=2025-05-16T17:44:58.249+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.34"
May 16 17:44:58 Debian ollama[6012]: time=2025-05-16T17:44:58.500+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.35"
May 16 17:44:58 Debian ollama[6012]: time=2025-05-16T17:44:58.751+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.36"
May 16 17:44:59 Debian ollama[6012]: time=2025-05-16T17:44:59.002+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.37"
May 16 17:44:59 Debian ollama[6012]: time=2025-05-16T17:44:59.253+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.38"
May 16 17:44:59 Debian ollama[6012]: time=2025-05-16T17:44:59.505+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.39"
May 16 17:44:59 Debian ollama[6012]: time=2025-05-16T17:44:59.755+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.40"
May 16 17:45:00 Debian ollama[6012]: time=2025-05-16T17:45:00.007+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.42"
May 16 17:45:00 Debian ollama[6012]: time=2025-05-16T17:45:00.257+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.43"
May 16 17:45:00 Debian ollama[6012]: time=2025-05-16T17:45:00.508+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.44"
May 16 17:45:00 Debian ollama[6012]: time=2025-05-16T17:45:00.759+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.45"
May 16 17:45:01 Debian ollama[6012]: time=2025-05-16T17:45:01.009+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.46"
May 16 17:45:01 Debian ollama[6012]: time=2025-05-16T17:45:01.260+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.47"
May 16 17:45:01 Debian ollama[6012]: time=2025-05-16T17:45:01.511+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.48"
May 16 17:45:01 Debian ollama[6012]: time=2025-05-16T17:45:01.761+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.49"
May 16 17:45:02 Debian ollama[6012]: time=2025-05-16T17:45:02.012+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.50"
May 16 17:45:02 Debian ollama[6012]: time=2025-05-16T17:45:02.263+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.51"
May 16 17:45:02 Debian ollama[6012]: time=2025-05-16T17:45:02.513+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.52"
May 16 17:45:02 Debian ollama[6012]: time=2025-05-16T17:45:02.764+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.53"
May 16 17:45:03 Debian ollama[6012]: time=2025-05-16T17:45:03.015+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.54"
May 16 17:45:03 Debian ollama[6012]: time=2025-05-16T17:45:03.265+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.55"
May 16 17:45:03 Debian ollama[6012]: time=2025-05-16T17:45:03.516+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.56"
May 16 17:45:03 Debian ollama[6012]: time=2025-05-16T17:45:03.766+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.57"
May 16 17:45:04 Debian ollama[6012]: time=2025-05-16T17:45:04.017+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.58"
May 16 17:45:04 Debian ollama[6012]: time=2025-05-16T17:45:04.268+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.59"
May 16 17:45:04 Debian ollama[6012]: time=2025-05-16T17:45:04.520+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.60"
May 16 17:45:04 Debian ollama[6012]: time=2025-05-16T17:45:04.771+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.61"
May 16 17:45:05 Debian ollama[6012]: time=2025-05-16T17:45:05.021+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.62"
May 16 17:45:05 Debian ollama[6012]: time=2025-05-16T17:45:05.272+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.63"
May 16 17:45:05 Debian ollama[6012]: time=2025-05-16T17:45:05.523+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.64"
May 16 17:45:05 Debian ollama[6012]: time=2025-05-16T17:45:05.773+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.65"
May 16 17:45:06 Debian ollama[6012]: time=2025-05-16T17:45:06.024+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.66"
May 16 17:45:06 Debian ollama[6012]: time=2025-05-16T17:45:06.275+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.67"
May 16 17:45:06 Debian ollama[6012]: time=2025-05-16T17:45:06.526+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.68"
May 16 17:45:06 Debian ollama[6012]: time=2025-05-16T17:45:06.777+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.70"
May 16 17:45:07 Debian ollama[6012]: time=2025-05-16T17:45:07.027+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.71"
May 16 17:45:07 Debian ollama[6012]: time=2025-05-16T17:45:07.278+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.72"
May 16 17:45:07 Debian ollama[6012]: time=2025-05-16T17:45:07.529+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.73"
May 16 17:45:07 Debian ollama[6012]: time=2025-05-16T17:45:07.779+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.74"
May 16 17:45:08 Debian ollama[6012]: time=2025-05-16T17:45:08.030+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.75"
May 16 17:45:08 Debian ollama[6012]: time=2025-05-16T17:45:08.281+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.76"
May 16 17:45:08 Debian ollama[6012]: time=2025-05-16T17:45:08.532+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.77"
May 16 17:45:08 Debian ollama[6012]: time=2025-05-16T17:45:08.783+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.78"
May 16 17:45:09 Debian ollama[6012]: time=2025-05-16T17:45:09.033+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.79"
May 16 17:45:09 Debian ollama[6012]: time=2025-05-16T17:45:09.284+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.80"
May 16 17:45:09 Debian ollama[6012]: time=2025-05-16T17:45:09.535+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.81"
May 16 17:45:09 Debian ollama[6012]: time=2025-05-16T17:45:09.785+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.82"
May 16 17:45:10 Debian ollama[6012]: time=2025-05-16T17:45:10.036+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.83"
May 16 17:45:10 Debian ollama[6012]: time=2025-05-16T17:45:10.287+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.84"
May 16 17:45:10 Debian ollama[6012]: time=2025-05-16T17:45:10.537+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.85"
May 16 17:45:10 Debian ollama[6012]: time=2025-05-16T17:45:10.788+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.86"
May 16 17:45:11 Debian ollama[6012]: time=2025-05-16T17:45:11.039+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.87"
May 16 17:45:11 Debian ollama[6012]: time=2025-05-16T17:45:11.289+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.88"
May 16 17:45:11 Debian ollama[6012]: time=2025-05-16T17:45:11.540+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.89"
May 16 17:45:11 Debian ollama[6012]: time=2025-05-16T17:45:11.791+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.90"
May 16 17:45:12 Debian ollama[6012]: time=2025-05-16T17:45:12.041+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.91"
May 16 17:45:12 Debian ollama[6012]: time=2025-05-16T17:45:12.292+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.92"
May 16 17:45:12 Debian ollama[6012]: time=2025-05-16T17:45:12.543+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.93"
May 16 17:45:12 Debian ollama[6012]: time=2025-05-16T17:45:12.793+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.94"
May 16 17:45:13 Debian ollama[6012]: time=2025-05-16T17:45:13.044+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.95"
May 16 17:45:13 Debian ollama[6012]: time=2025-05-16T17:45:13.295+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.96"
May 16 17:45:13 Debian ollama[6012]: time=2025-05-16T17:45:13.545+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.97"
May 16 17:45:13 Debian ollama[6012]: time=2025-05-16T17:45:13.796+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.97"
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.047+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.98"
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.297+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.99"
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.548+02:00 level=DEBUG source=server.go:636 msg="model load progress 1.00"
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.732+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+(?i:'s|'t|'re|'ve|'m|'ll|'d)?|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*(?i:'s|'t|'re|'ve|'m|'ll|'d)?|\\p{N}{1,3}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.num_channels default=3
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.max_upscaling_size default=448
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.rope.freq_scale default=1
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.no_rope_interval default=4
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.attention.temperature_tuning default=true
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.attention.scale default=0.10000000149011612
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.attention.floor_scale default=8192
May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.800+02:00 level=DEBUG source=server.go:636 msg="model load progress 1.00"
May 16 17:45:15 Debian ollama[6012]: time=2025-05-16T17:45:15.052+02:00 level=DEBUG source=server.go:639 msg="model load completed, waiting for server to become available" status="llm server loading model"
May 16 17:45:15 Debian ollama[6012]: time=2025-05-16T17:45:15.505+02:00 level=INFO source=server.go:625 msg="waiting for server to become available" status="llm server not responding"
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.661+02:00 level=ERROR source=sched.go:478 msg="error loading llama server" error="llama runner process has terminated: signal: killed"
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.662+02:00 level=DEBUG source=sched.go:480 msg="triggering expiration for failed load" runner.name=registry.ollama.ai/library/llama4:scout runner.inference=cuda runner.devices=1 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 runner.num_ctx=24576
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.663+02:00 level=DEBUG source=sched.go:364 msg="runner expired event received" runner.name=registry.ollama.ai/library/llama4:scout runner.inference=cuda runner.devices=1 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 runner.num_ctx=24576
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.663+02:00 level=DEBUG source=sched.go:379 msg="got lock to unload expired event" runner.name=registry.ollama.ai/library/llama4:scout runner.inference=cuda runner.devices=1 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 runner.num_ctx=24576
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.664+02:00 level=DEBUG source=sched.go:391 msg="starting background wait for VRAM recovery" runner.name=registry.ollama.ai/library/llama4:scout runner.inference=cuda runner.devices=1 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 runner.num_ctx=24576
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.665+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.1 GiB" before.free_swap="215.1 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="52.6 MiB"
May 16 17:45:16 Debian ollama[6012]: [GIN] 2025/05/16 - 17:45:16 | 500 | 28.018068622s |       127.0.0.1 | POST     "/api/generate"
May 16 17:45:16 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:16 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:16 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:16 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:16 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:16 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:16 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:16 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:16 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:16 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:16 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:16 Debian ollama[6012]: calling cuInit
May 16 17:45:16 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:16 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:16 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:16 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:16 Debian ollama[6012]: device count 1
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.831+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.4 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:16 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.832+02:00 level=DEBUG source=server.go:1023 msg="stopping llama server" pid=6062
May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.832+02:00 level=DEBUG source=sched.go:396 msg="runner terminated and removed from list, blocking for VRAM recovery" runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="52.6 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="63.4 MiB"
May 16 17:45:17 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:17 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:17 Debian ollama[6012]: calling cuInit
May 16 17:45:17 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:17 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:17 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:17 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:17 Debian ollama[6012]: device count 1
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.235+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:17 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.333+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="63.4 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="70.3 MiB"
May 16 17:45:17 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:17 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:17 Debian ollama[6012]: calling cuInit
May 16 17:45:17 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:17 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:17 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:17 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:17 Debian ollama[6012]: device count 1
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.445+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:17 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="70.3 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="71.8 MiB"
May 16 17:45:17 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:17 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:17 Debian ollama[6012]: calling cuInit
May 16 17:45:17 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:17 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:17 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:17 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:17 Debian ollama[6012]: device count 1
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:17 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="71.8 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.2 MiB"
May 16 17:45:17 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:17 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:17 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:17 Debian ollama[6012]: calling cuInit
May 16 17:45:17 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:17 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:17 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:17 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:17 Debian ollama[6012]: device count 1
May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.925+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:17 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.2 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.5 MiB"
May 16 17:45:18 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:18 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:18 Debian ollama[6012]: calling cuInit
May 16 17:45:18 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:18 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:18 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:18 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:18 Debian ollama[6012]: device count 1
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.182+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:18 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.332+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.5 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.7 MiB"
May 16 17:45:18 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:18 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:18 Debian ollama[6012]: calling cuInit
May 16 17:45:18 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:18 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:18 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:18 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:18 Debian ollama[6012]: device count 1
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.426+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:18 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.7 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.7 MiB"
May 16 17:45:18 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:18 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:18 Debian ollama[6012]: calling cuInit
May 16 17:45:18 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:18 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:18 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:18 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:18 Debian ollama[6012]: device count 1
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:18 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.7 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.9 MiB"
May 16 17:45:18 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:18 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:18 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:18 Debian ollama[6012]: calling cuInit
May 16 17:45:18 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:18 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:18 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:18 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:18 Debian ollama[6012]: device count 1
May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.930+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:18 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.9 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="72.9 MiB"
May 16 17:45:19 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:19 Debian ollama[6012]: calling cuInit
May 16 17:45:19 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:19 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:19 Debian ollama[6012]: device count 1
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.184+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.332+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="72.9 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:19 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:19 Debian ollama[6012]: calling cuInit
May 16 17:45:19 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:19 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:19 Debian ollama[6012]: device count 1
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.427+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:19 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:19 Debian ollama[6012]: calling cuInit
May 16 17:45:19 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:19 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:19 Debian ollama[6012]: device count 1
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:19 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:19 Debian ollama[6012]: calling cuInit
May 16 17:45:19 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:19 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:19 Debian ollama[6012]: device count 1
May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.926+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:20 Debian ollama[6012]: calling cuInit
May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:20 Debian ollama[6012]: device count 1
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.183+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.333+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:20 Debian ollama[6012]: calling cuInit
May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:20 Debian ollama[6012]: device count 1
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.426+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:20 Debian ollama[6012]: calling cuInit
May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:20 Debian ollama[6012]: device count 1
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:20 Debian ollama[6012]: calling cuInit
May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:20 Debian ollama[6012]: device count 1
May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.926+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:21 Debian ollama[6012]: calling cuInit
May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:21 Debian ollama[6012]: device count 1
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.182+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.333+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:21 Debian ollama[6012]: calling cuInit
May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:21 Debian ollama[6012]: device count 1
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.426+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:21 Debian ollama[6012]: calling cuInit
May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:21 Debian ollama[6012]: device count 1
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.832+02:00 level=WARN source=sched.go:676 msg="gpu VRAM usage didn't recover within timeout" seconds=5.168159089 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.833+02:00 level=DEBUG source=sched.go:399 msg="sending an unloaded event" runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB"
May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:21 Debian ollama[6012]: calling cuInit
May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:21 Debian ollama[6012]: device count 1
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.833+02:00 level=DEBUG source=sched.go:312 msg="ignoring unload event with no pending requests"
May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.927+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.082+02:00 level=WARN source=sched.go:676 msg="gpu VRAM usage didn't recover within timeout" seconds=5.418638538 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5
May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.1 MiB"
May 16 17:45:22 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:45:22 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:45:22 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:45:22 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:45:22 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:45:22 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:45:22 Debian ollama[6012]: calling cuInit
May 16 17:45:22 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:45:22 Debian ollama[6012]: raw version 0x2f3a
May 16 17:45:22 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:45:22 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:45:22 Debian ollama[6012]: device count 1
May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.181+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB"
May 16 17:45:22 Debian ollama[6012]: releasing cuda driver library
May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.333+02:00 level=WARN source=sched.go:676 msg="gpu VRAM usage didn't recover within timeout" seconds=5.668763082 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5
<!-- gh-comment-id:2888460274 --> @lemassykoi commented on GitHub (May 17, 2025):

full log from problem: `sudo journalctl --since "2025-05-16 17:44:47" --until "2025-05-16 17:45:51" -u ollama`

```
May 16 17:44:47 Debian systemd[1]: Stopping ollama.service - Ollama Service...
May 16 17:44:47 Debian systemd[1]: ollama.service: Deactivated successfully.
May 16 17:44:47 Debian systemd[1]: Stopped ollama.service - Ollama Service.
May 16 17:44:47 Debian systemd[1]: ollama.service: Consumed 56.626s CPU time.
May 16 17:44:47 Debian systemd[1]: Started ollama.service - Ollama Service.
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.146+02:00 level=INFO source=routes.go:1205 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:8192 OLLAMA_DEBUG:DEBUG OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/mnt/LLM/Ollama/Models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:3 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.211+02:00 level=INFO source=images.go:463 msg="total blobs: 301"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.221+02:00 level=INFO source=images.go:470 msg="total unused blobs removed: 0"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.225+02:00 level=INFO source=routes.go:1258 msg="Listening on [::]:11434 (version 0.7.0)"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.225+02:00 level=DEBUG source=sched.go:108 msg="starting llm scheduler"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.225+02:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.226+02:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.226+02:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=libcuda.so*
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.226+02:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[/usr/local/lib/ollama/libcuda.so* /libcuda.so* /usr/local/cuda*/targets/*/lib/libcuda.so* /usr/lib/*-linux-gnu/nvidia/current/libcuda.so* /usr/lib/*-linux-gnu/libcuda.so* /usr/lib/wsl/lib/libcuda.so* /usr/lib/wsl/drivers/*/libcuda.so* /opt/cuda/lib*/libcuda.so* /usr/local/cuda/lib*/libcuda.so* /usr/lib*/libcuda.so* /usr/local/lib*/libcuda.so*]"
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.231+02:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths=[/usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03]
May 16 17:44:47 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:44:47 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:44:47 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:44:47 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:44:47 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:44:47 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:44:47 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:44:47 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:44:47 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:44:47 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:44:47 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:44:47 Debian ollama[6012]: calling cuInit
May 16 17:44:47 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:44:47 Debian ollama[6012]: raw version 0x2f3a
May 16 17:44:47 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:44:47 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:44:47 Debian ollama[6012]: device count 1
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.304+02:00 level=DEBUG source=gpu.go:125 msg="detected GPUs" count=1 library=/usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:44:47 Debian ollama[6012]: [GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6] CUDA totalMem 11872mb
May 16 17:44:47 Debian ollama[6012]: [GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6] CUDA freeMem 11655mb
May 16 17:44:47 Debian ollama[6012]: [GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6] Compute Capability 8.9
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.407+02:00 level=DEBUG source=amd_linux.go:419 msg="amdgpu driver not detected /sys/module/amdgpu"
May 16 17:44:47 Debian ollama[6012]: releasing cuda driver library
May 16 17:44:47 Debian ollama[6012]: time=2025-05-16T17:44:47.407+02:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 library=cuda variant=v12 compute=8.9 driver=12.9 name="NVIDIA GeForce RTX 4070 Ti" total="11.6 GiB" available="11.4 GiB"
May 16 17:44:48 Debian ollama[6012]: [GIN] 2025/05/16 - 17:44:48 | 200 | 630.787µs | 127.0.0.1 | HEAD "/"
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.620+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.alignment default=32
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.644+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.alignment default=32
May 16 17:44:48 Debian ollama[6012]: [GIN] 2025/05/16 - 17:44:48 | 200 | 92.467358ms | 127.0.0.1 | POST "/api/show"
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.672+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.alignment default=32
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.672+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="215.1 MiB" now.total="62.6 GiB" now.free="59.1 GiB" now.free_swap="215.1 MiB"
May 16 17:44:48 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:44:48 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90
May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50
May 16 17:44:48 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0
May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710
May 16 17:44:48 Debian ollama[6012]: calling cuInit
May 16 17:44:48 Debian ollama[6012]: calling cuDriverGetVersion
May 16 17:44:48 Debian ollama[6012]: raw version 0x2f3a
May 16 17:44:48 Debian ollama[6012]: CUDA driver version: 12.9
May 16 17:44:48 Debian ollama[6012]: calling cuDeviceGetCount
May 16 17:44:48 Debian ollama[6012]: device count 1
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.768+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.4 GiB" now.total="11.6 GiB" now.free="11.4 GiB" now.used="216.9 MiB"
May 16 17:44:48 Debian ollama[6012]: releasing cuda driver library
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.768+02:00 level=DEBUG source=sched.go:185 msg="updating default concurrency" OLLAMA_MAX_LOADED_MODELS=3 gpu_count=1
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.785+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.alignment default=32
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.802+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.alignment default=32
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.802+02:00 level=DEBUG source=sched.go:228 msg="loading first model" model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.802+02:00 level=DEBUG source=memory.go:111 msg=evaluating library=cuda gpu_count=1 available="[11.4 GiB]"
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.803+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.num_channels default=0
May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.803+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.1 GiB" before.free_swap="215.1 MiB" now.total="62.6 GiB" now.free="59.1 GiB" now.free_swap="215.1 MiB"
May 16 17:44:48 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03
May 16 17:44:48 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910
May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10
May 16 
17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:44:48 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:44:48 Debian ollama[6012]: calling cuInit May 16 17:44:48 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:44:48 Debian ollama[6012]: raw version 0x2f3a May 16 17:44:48 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:44:48 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:44:48 Debian ollama[6012]: device count 1 May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.896+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.4 GiB" now.total="11.6 GiB" now.free="11.4 GiB" now.used="216.9 MiB" May 16 17:44:48 Debian ollama[6012]: releasing cuda driver library May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.896+02:00 level=DEBUG source=memory.go:111 msg=evaluating library=cuda gpu_count=1 available="[11.4 GiB]" May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.897+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.num_channels default=0 May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.897+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.1 GiB" before.free_swap="215.1 MiB" now.total="62.6 GiB" now.free="59.1 GiB" now.free_swap="215.1 MiB" May 16 17:44:48 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:44:48 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:44:48 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:44:48 
Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:44:48 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:44:48 Debian ollama[6012]: calling cuInit May 16 17:44:48 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:44:48 Debian ollama[6012]: raw version 0x2f3a May 16 17:44:48 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:44:48 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:44:48 Debian ollama[6012]: device count 1 May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.994+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.4 GiB" now.total="11.6 GiB" now.free="11.4 GiB" now.used="216.9 MiB" May 16 17:44:48 Debian ollama[6012]: releasing cuda driver library May 16 17:44:48 Debian ollama[6012]: time=2025-05-16T17:44:48.994+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.1 GiB" before.free_swap="215.1 MiB" now.total="62.6 GiB" now.free="59.1 GiB" now.free_swap="215.1 MiB" May 16 17:44:48 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:44:48 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:44:48 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:44:48 Debian ollama[6012]: dlsym: 
cuDeviceGet - 0x7f32e6974910 May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:44:48 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:44:48 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:44:48 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:44:48 Debian ollama[6012]: calling cuInit May 16 17:44:48 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:44:48 Debian ollama[6012]: raw version 0x2f3a May 16 17:44:48 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:44:48 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:44:48 Debian ollama[6012]: device count 1 May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.091+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.4 GiB" now.total="11.6 GiB" now.free="11.4 GiB" now.used="216.9 MiB" May 16 17:44:49 Debian ollama[6012]: releasing cuda driver library May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.091+02:00 level=INFO source=server.go:135 msg="system memory" total="62.6 GiB" free="59.1 GiB" free_swap="215.1 MiB" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.091+02:00 level=DEBUG source=memory.go:111 msg=evaluating library=cuda gpu_count=1 available="[11.4 GiB]" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.092+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.num_channels default=0 May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.092+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.1 GiB" before.free_swap="215.1 MiB" now.total="62.6 GiB" 
now.free="59.1 GiB" now.free_swap="215.1 MiB" May 16 17:44:49 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:44:49 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:44:49 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:44:49 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:44:49 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:44:49 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:44:49 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:44:49 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:44:49 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:44:49 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:44:49 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:44:49 Debian ollama[6012]: calling cuInit May 16 17:44:49 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:44:49 Debian ollama[6012]: raw version 0x2f3a May 16 17:44:49 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:44:49 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:44:49 Debian ollama[6012]: device count 1 May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.187+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.4 GiB" now.total="11.6 GiB" now.free="11.4 GiB" now.used="216.9 MiB" May 16 17:44:49 Debian ollama[6012]: releasing cuda driver library May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.187+02:00 level=INFO source=server.go:168 msg=offload library=cuda layers.requested=-1 layers.model=49 layers.offload=4 layers.split="" memory.available="[11.4 GiB]" memory.gpu_overhead="0 B" memory.required.full="68.3 GiB" memory.required.partial="10.9 GiB" 
memory.required.kv="2.2 GiB" memory.required.allocations="[10.9 GiB]" memory.weights.total="60.6 GiB" memory.weights.repeating="59.8 GiB" memory.weights.nonrepeating="809.3 MiB" memory.graph.full="2.0 GiB" memory.graph.partial="2.0 GiB" projector.weights="1.6 GiB" projector.graph="0 B" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.187+02:00 level=INFO source=server.go:211 msg="enabling flash attention" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.187+02:00 level=DEBUG source=server.go:284 msg="compatible gpu libraries" compatible="[cuda_v12 cuda_v11]" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.222+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.alignment default=32 May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+(?i:'s|'t|'re|'ve|'m|'ll|'d)?|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*(?i:'s|'t|'re|'ve|'m|'ll|'d)?|\\p{N}{1,3}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.num_channels default=3 May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.max_upscaling_size default=448 May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.rope.freq_scale default=1 May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.no_rope_interval default=4 May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not 
found" key=llama4.attention.temperature_tuning default=true May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.attention.scale default=0.10000000149011612 May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.223+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.attention.floor_scale default=8192 May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.224+02:00 level=DEBUG source=server.go:360 msg="adding gpu library" path=/usr/local/lib/ollama/cuda_v12 May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.224+02:00 level=DEBUG source=server.go:367 msg="adding gpu dependency paths" paths=[/usr/local/lib/ollama/cuda_v12] May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.224+02:00 level=INFO source=server.go:431 msg="starting llama server" cmd="/usr/local/bin/ollama runner --ollama-engine --model /mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 --ctx-size 24576 --batch-size 512 --n-gpu-layers 4 --threads 8 --flash-attn --kv-cache-type q8_0 --no-mmap --parallel 3 --port 39075" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.224+02:00 level=DEBUG source=server.go:432 msg=subprocess PATH=/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/local/games OLLAMA_HOST=0.0.0.0:11434 OLLAMA_FLASH_ATTENTION=1 OLLAMA_MODELS=/mnt/LLM/Ollama/Models OLLAMA_KV_CACHE_TYPE=q8_0 OLLAMA_CONTEXT_LENGTH=8192 OLLAMA_DEBUG=1 GGML_CUDA_ENABLE_UNIFIED_MEMORY=1 OLLAMA_NUM_PARALLEL=3 OLLAMA_MAX_LOADED_MODELS=3 OLLAMA_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/cuda_v12 LD_LIBRARY_PATH=/usr/local/lib/ollama/cuda_v12:/usr/local/lib/ollama/cuda_v12:/usr/local/lib/ollama:/usr/local/lib/ollama CUDA_VISIBLE_DEVICES=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.224+02:00 level=INFO source=sched.go:472 msg="loaded runners" count=1 May 16 
17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.224+02:00 level=INFO source=server.go:591 msg="waiting for llama runner to start responding" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.225+02:00 level=INFO source=server.go:625 msg="waiting for server to become available" status="llm server not responding" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.232+02:00 level=INFO source=runner.go:836 msg="starting ollama engine" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.233+02:00 level=INFO source=runner.go:899 msg="Server listening on 127.0.0.1:39075" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.277+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.alignment default=32 May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.278+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.name default="" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.278+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=general.description default="" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.278+02:00 level=INFO source=ggml.go:73 msg="" architecture=llama4 file_type=Q4_K_M name="" description="" num_tensors=1182 num_key_values=45 May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.278+02:00 level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama May 16 17:44:49 Debian ollama[6012]: load_backend: loaded CPU backend from /usr/local/lib/ollama/libggml-cpu-alderlake.so May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.285+02:00 level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama/cuda_v12 May 16 17:44:49 Debian ollama[6012]: ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no May 16 17:44:49 Debian ollama[6012]: ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no May 16 17:44:49 Debian ollama[6012]: ggml_cuda_init: found 1 CUDA devices: May 16 17:44:49 Debian 
ollama[6012]: Device 0: NVIDIA GeForce RTX 4070 Ti, compute capability 8.9, VMM: yes May 16 17:44:49 Debian ollama[6012]: load_backend: loaded CUDA backend from /usr/local/lib/ollama/cuda_v12/libggml-cuda.so May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.398+02:00 level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX_VNNI=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 CUDA.0.ARCHS=500,600,610,700,750,800,860,870,890,900,1200 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(gcc) May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.467+02:00 level=INFO source=ggml.go:299 msg="model weights" buffer=CPU size="57.5 GiB" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.467+02:00 level=INFO source=ggml.go:299 msg="model weights" buffer=CUDA0 size="5.3 GiB" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.476+02:00 level=INFO source=server.go:625 msg="waiting for server to become available" status="llm server loading model" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.476+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.00" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.727+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.01" May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.978+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.02" May 16 17:44:50 Debian ollama[6012]: time=2025-05-16T17:44:50.228+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.03" May 16 17:44:50 Debian ollama[6012]: time=2025-05-16T17:44:50.479+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.04" May 16 17:44:50 Debian ollama[6012]: time=2025-05-16T17:44:50.729+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.04" May 16 17:44:50 Debian ollama[6012]: time=2025-05-16T17:44:50.980+02:00 level=DEBUG source=server.go:636 
msg="model load progress 0.05" May 16 17:44:51 Debian ollama[6012]: time=2025-05-16T17:44:51.231+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.06" May 16 17:44:51 Debian ollama[6012]: time=2025-05-16T17:44:51.481+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.07" May 16 17:44:51 Debian ollama[6012]: time=2025-05-16T17:44:51.732+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.08" May 16 17:44:51 Debian ollama[6012]: time=2025-05-16T17:44:51.983+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.09" May 16 17:44:52 Debian ollama[6012]: time=2025-05-16T17:44:52.234+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.10" May 16 17:44:52 Debian ollama[6012]: time=2025-05-16T17:44:52.484+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.11" May 16 17:44:52 Debian ollama[6012]: time=2025-05-16T17:44:52.735+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.12" May 16 17:44:52 Debian ollama[6012]: time=2025-05-16T17:44:52.986+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.13" May 16 17:44:53 Debian ollama[6012]: time=2025-05-16T17:44:53.236+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.14" May 16 17:44:53 Debian ollama[6012]: time=2025-05-16T17:44:53.487+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.15" May 16 17:44:53 Debian ollama[6012]: time=2025-05-16T17:44:53.738+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.16" May 16 17:44:53 Debian ollama[6012]: time=2025-05-16T17:44:53.988+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.17" May 16 17:44:54 Debian ollama[6012]: time=2025-05-16T17:44:54.239+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.18" May 16 17:44:54 Debian ollama[6012]: time=2025-05-16T17:44:54.490+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.19" May 16 17:44:54 Debian ollama[6012]: 
time=2025-05-16T17:44:54.740+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.20" May 16 17:44:54 Debian ollama[6012]: time=2025-05-16T17:44:54.991+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.21" May 16 17:44:55 Debian ollama[6012]: time=2025-05-16T17:44:55.242+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.22" May 16 17:44:55 Debian ollama[6012]: time=2025-05-16T17:44:55.492+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.23" May 16 17:44:55 Debian ollama[6012]: time=2025-05-16T17:44:55.743+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.24" May 16 17:44:55 Debian ollama[6012]: time=2025-05-16T17:44:55.994+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.25" May 16 17:44:56 Debian ollama[6012]: time=2025-05-16T17:44:56.244+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.26" May 16 17:44:56 Debian ollama[6012]: time=2025-05-16T17:44:56.495+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.27" May 16 17:44:56 Debian ollama[6012]: time=2025-05-16T17:44:56.745+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.28" May 16 17:44:56 Debian ollama[6012]: time=2025-05-16T17:44:56.996+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.29" May 16 17:44:57 Debian ollama[6012]: time=2025-05-16T17:44:57.247+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.30" May 16 17:44:57 Debian ollama[6012]: time=2025-05-16T17:44:57.497+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.31" May 16 17:44:57 Debian ollama[6012]: time=2025-05-16T17:44:57.748+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.32" May 16 17:44:57 Debian ollama[6012]: time=2025-05-16T17:44:57.999+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.33" May 16 17:44:58 Debian ollama[6012]: time=2025-05-16T17:44:58.249+02:00 level=DEBUG source=server.go:636 msg="model load progress 
0.34" May 16 17:44:58 Debian ollama[6012]: time=2025-05-16T17:44:58.500+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.35" May 16 17:44:58 Debian ollama[6012]: time=2025-05-16T17:44:58.751+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.36" May 16 17:44:59 Debian ollama[6012]: time=2025-05-16T17:44:59.002+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.37" May 16 17:44:59 Debian ollama[6012]: time=2025-05-16T17:44:59.253+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.38" May 16 17:44:59 Debian ollama[6012]: time=2025-05-16T17:44:59.505+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.39" May 16 17:44:59 Debian ollama[6012]: time=2025-05-16T17:44:59.755+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.40" May 16 17:45:00 Debian ollama[6012]: time=2025-05-16T17:45:00.007+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.42" May 16 17:45:00 Debian ollama[6012]: time=2025-05-16T17:45:00.257+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.43" May 16 17:45:00 Debian ollama[6012]: time=2025-05-16T17:45:00.508+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.44" May 16 17:45:00 Debian ollama[6012]: time=2025-05-16T17:45:00.759+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.45" May 16 17:45:01 Debian ollama[6012]: time=2025-05-16T17:45:01.009+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.46" May 16 17:45:01 Debian ollama[6012]: time=2025-05-16T17:45:01.260+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.47" May 16 17:45:01 Debian ollama[6012]: time=2025-05-16T17:45:01.511+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.48" May 16 17:45:01 Debian ollama[6012]: time=2025-05-16T17:45:01.761+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.49" May 16 17:45:02 Debian ollama[6012]: time=2025-05-16T17:45:02.012+02:00 level=DEBUG 
source=server.go:636 msg="model load progress 0.50" May 16 17:45:02 Debian ollama[6012]: time=2025-05-16T17:45:02.263+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.51" May 16 17:45:02 Debian ollama[6012]: time=2025-05-16T17:45:02.513+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.52" May 16 17:45:02 Debian ollama[6012]: time=2025-05-16T17:45:02.764+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.53" May 16 17:45:03 Debian ollama[6012]: time=2025-05-16T17:45:03.015+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.54" May 16 17:45:03 Debian ollama[6012]: time=2025-05-16T17:45:03.265+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.55" May 16 17:45:03 Debian ollama[6012]: time=2025-05-16T17:45:03.516+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.56" May 16 17:45:03 Debian ollama[6012]: time=2025-05-16T17:45:03.766+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.57" May 16 17:45:04 Debian ollama[6012]: time=2025-05-16T17:45:04.017+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.58" May 16 17:45:04 Debian ollama[6012]: time=2025-05-16T17:45:04.268+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.59" May 16 17:45:04 Debian ollama[6012]: time=2025-05-16T17:45:04.520+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.60" May 16 17:45:04 Debian ollama[6012]: time=2025-05-16T17:45:04.771+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.61" May 16 17:45:05 Debian ollama[6012]: time=2025-05-16T17:45:05.021+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.62" May 16 17:45:05 Debian ollama[6012]: time=2025-05-16T17:45:05.272+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.63" May 16 17:45:05 Debian ollama[6012]: time=2025-05-16T17:45:05.523+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.64" May 16 17:45:05 Debian ollama[6012]: 
time=2025-05-16T17:45:05.773+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.65" May 16 17:45:06 Debian ollama[6012]: time=2025-05-16T17:45:06.024+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.66" May 16 17:45:06 Debian ollama[6012]: time=2025-05-16T17:45:06.275+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.67" May 16 17:45:06 Debian ollama[6012]: time=2025-05-16T17:45:06.526+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.68" May 16 17:45:06 Debian ollama[6012]: time=2025-05-16T17:45:06.777+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.70" May 16 17:45:07 Debian ollama[6012]: time=2025-05-16T17:45:07.027+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.71" May 16 17:45:07 Debian ollama[6012]: time=2025-05-16T17:45:07.278+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.72" May 16 17:45:07 Debian ollama[6012]: time=2025-05-16T17:45:07.529+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.73" May 16 17:45:07 Debian ollama[6012]: time=2025-05-16T17:45:07.779+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.74" May 16 17:45:08 Debian ollama[6012]: time=2025-05-16T17:45:08.030+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.75" May 16 17:45:08 Debian ollama[6012]: time=2025-05-16T17:45:08.281+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.76" May 16 17:45:08 Debian ollama[6012]: time=2025-05-16T17:45:08.532+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.77" May 16 17:45:08 Debian ollama[6012]: time=2025-05-16T17:45:08.783+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.78" May 16 17:45:09 Debian ollama[6012]: time=2025-05-16T17:45:09.033+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.79" May 16 17:45:09 Debian ollama[6012]: time=2025-05-16T17:45:09.284+02:00 level=DEBUG source=server.go:636 msg="model load progress 
0.80" May 16 17:45:09 Debian ollama[6012]: time=2025-05-16T17:45:09.535+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.81" May 16 17:45:09 Debian ollama[6012]: time=2025-05-16T17:45:09.785+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.82" May 16 17:45:10 Debian ollama[6012]: time=2025-05-16T17:45:10.036+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.83" May 16 17:45:10 Debian ollama[6012]: time=2025-05-16T17:45:10.287+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.84" May 16 17:45:10 Debian ollama[6012]: time=2025-05-16T17:45:10.537+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.85" May 16 17:45:10 Debian ollama[6012]: time=2025-05-16T17:45:10.788+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.86" May 16 17:45:11 Debian ollama[6012]: time=2025-05-16T17:45:11.039+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.87" May 16 17:45:11 Debian ollama[6012]: time=2025-05-16T17:45:11.289+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.88" May 16 17:45:11 Debian ollama[6012]: time=2025-05-16T17:45:11.540+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.89" May 16 17:45:11 Debian ollama[6012]: time=2025-05-16T17:45:11.791+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.90" May 16 17:45:12 Debian ollama[6012]: time=2025-05-16T17:45:12.041+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.91" May 16 17:45:12 Debian ollama[6012]: time=2025-05-16T17:45:12.292+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.92" May 16 17:45:12 Debian ollama[6012]: time=2025-05-16T17:45:12.543+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.93" May 16 17:45:12 Debian ollama[6012]: time=2025-05-16T17:45:12.793+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.94" May 16 17:45:13 Debian ollama[6012]: time=2025-05-16T17:45:13.044+02:00 level=DEBUG 
source=server.go:636 msg="model load progress 0.95" May 16 17:45:13 Debian ollama[6012]: time=2025-05-16T17:45:13.295+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.96" May 16 17:45:13 Debian ollama[6012]: time=2025-05-16T17:45:13.545+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.97" May 16 17:45:13 Debian ollama[6012]: time=2025-05-16T17:45:13.796+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.97" May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.047+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.98" May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.297+02:00 level=DEBUG source=server.go:636 msg="model load progress 0.99" May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.548+02:00 level=DEBUG source=server.go:636 msg="model load progress 1.00" May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.732+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+(?i:'s|'t|'re|'ve|'m|'ll|'d)?|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*(?i:'s|'t|'re|'ve|'m|'ll|'d)?|\\p{N}{1,3}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+" May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.num_channels default=3 May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.vision.max_upscaling_size default=448 May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.rope.freq_scale default=1 May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.no_rope_interval 
default=4 May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.attention.temperature_tuning default=true May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.attention.scale default=0.10000000149011612 May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.734+02:00 level=DEBUG source=ggml.go:154 msg="key not found" key=llama4.attention.floor_scale default=8192 May 16 17:45:14 Debian ollama[6012]: time=2025-05-16T17:45:14.800+02:00 level=DEBUG source=server.go:636 msg="model load progress 1.00" May 16 17:45:15 Debian ollama[6012]: time=2025-05-16T17:45:15.052+02:00 level=DEBUG source=server.go:639 msg="model load completed, waiting for server to become available" status="llm server loading model" May 16 17:45:15 Debian ollama[6012]: time=2025-05-16T17:45:15.505+02:00 level=INFO source=server.go:625 msg="waiting for server to become available" status="llm server not responding" May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.661+02:00 level=ERROR source=sched.go:478 msg="error loading llama server" error="llama runner process has terminated: signal: killed" May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.662+02:00 level=DEBUG source=sched.go:480 msg="triggering expiration for failed load" runner.name=registry.ollama.ai/library/llama4:scout runner.inference=cuda runner.devices=1 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 runner.num_ctx=24576 May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.663+02:00 level=DEBUG source=sched.go:364 msg="runner expired event received" runner.name=registry.ollama.ai/library/llama4:scout runner.inference=cuda runner.devices=1 runner.size="68.3 GiB" runner.vram="10.9 GiB" 
runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 runner.num_ctx=24576 May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.663+02:00 level=DEBUG source=sched.go:379 msg="got lock to unload expired event" runner.name=registry.ollama.ai/library/llama4:scout runner.inference=cuda runner.devices=1 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 runner.num_ctx=24576 May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.664+02:00 level=DEBUG source=sched.go:391 msg="starting background wait for VRAM recovery" runner.name=registry.ollama.ai/library/llama4:scout runner.inference=cuda runner.devices=1 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 runner.num_ctx=24576 May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.665+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.1 GiB" before.free_swap="215.1 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="52.6 MiB" May 16 17:45:16 Debian ollama[6012]: [GIN] 2025/05/16 - 17:45:16 | 500 | 28.018068622s | 127.0.0.1 | POST "/api/generate" May 16 17:45:16 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:16 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:16 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:16 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:16 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:16 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:16 Debian ollama[6012]: dlsym: 
cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:16 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:16 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:16 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:16 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:16 Debian ollama[6012]: calling cuInit May 16 17:45:16 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:16 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:16 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:16 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:16 Debian ollama[6012]: device count 1 May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.831+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.4 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:16 Debian ollama[6012]: releasing cuda driver library May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.832+02:00 level=DEBUG source=server.go:1023 msg="stopping llama server" pid=6062 May 16 17:45:16 Debian ollama[6012]: time=2025-05-16T17:45:16.832+02:00 level=DEBUG source=sched.go:396 msg="runner terminated and removed from list, blocking for VRAM recovery" runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="52.6 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="63.4 MiB" May 16 17:45:17 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:17 
Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:17 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:17 Debian ollama[6012]: calling cuInit May 16 17:45:17 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:17 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:17 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:17 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:17 Debian ollama[6012]: device count 1 May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.235+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:17 Debian ollama[6012]: releasing cuda driver library May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.333+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="63.4 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="70.3 MiB" May 16 17:45:17 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:17 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:17 Debian ollama[6012]: dlsym: 
cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:17 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:17 Debian ollama[6012]: calling cuInit May 16 17:45:17 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:17 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:17 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:17 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:17 Debian ollama[6012]: device count 1 May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.445+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:17 Debian ollama[6012]: releasing cuda driver library May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="70.3 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="71.8 MiB" May 16 17:45:17 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:17 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetCount - 
0x7f32e69749d0 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:17 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:17 Debian ollama[6012]: calling cuInit May 16 17:45:17 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:17 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:17 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:17 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:17 Debian ollama[6012]: device count 1 May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:17 Debian ollama[6012]: releasing cuda driver library May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="71.8 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.2 MiB" May 16 17:45:17 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:17 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:17 Debian 
ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:17 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:17 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:17 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:17 Debian ollama[6012]: calling cuInit May 16 17:45:17 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:17 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:17 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:17 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:17 Debian ollama[6012]: device count 1 May 16 17:45:17 Debian ollama[6012]: time=2025-05-16T17:45:17.925+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:17 Debian ollama[6012]: releasing cuda driver library May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.2 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.5 MiB" May 16 17:45:18 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:18 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:18 Debian ollama[6012]: dlsym: 
cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:18 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:18 Debian ollama[6012]: calling cuInit May 16 17:45:18 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:18 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:18 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:18 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:18 Debian ollama[6012]: device count 1 May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.182+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:18 Debian ollama[6012]: releasing cuda driver library May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.332+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.5 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.7 MiB" May 16 17:45:18 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:18 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 
May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:18 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:18 Debian ollama[6012]: calling cuInit May 16 17:45:18 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:18 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:18 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:18 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:18 Debian ollama[6012]: device count 1 May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.426+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:18 Debian ollama[6012]: releasing cuda driver library May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.7 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.7 MiB" May 16 17:45:18 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:18 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:18 Debian 
ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:18 Debian ollama[6012]: calling cuInit May 16 17:45:18 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:18 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:18 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:18 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:18 Debian ollama[6012]: device count 1 May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:18 Debian ollama[6012]: releasing cuda driver library May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.7 MiB" now.total="62.6 GiB" now.free="59.3 GiB" now.free_swap="72.9 MiB" May 16 17:45:18 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:18 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:18 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:18 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:18 Debian ollama[6012]: dlsym: cuCtxDestroy 
- 0x7f32e69da710 May 16 17:45:18 Debian ollama[6012]: calling cuInit May 16 17:45:18 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:18 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:18 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:18 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:18 Debian ollama[6012]: device count 1 May 16 17:45:18 Debian ollama[6012]: time=2025-05-16T17:45:18.930+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:18 Debian ollama[6012]: releasing cuda driver library May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.3 GiB" before.free_swap="72.9 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="72.9 MiB" May 16 17:45:19 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:19 Debian ollama[6012]: calling cuInit May 16 17:45:19 Debian ollama[6012]: 
calling cuDriverGetVersion May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:19 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:19 Debian ollama[6012]: device count 1 May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.184+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.332+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="72.9 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:19 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:19 Debian ollama[6012]: calling cuInit May 16 17:45:19 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:19 Debian 
ollama[6012]: CUDA driver version: 12.9 May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:19 Debian ollama[6012]: device count 1 May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.427+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:19 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:19 Debian ollama[6012]: calling cuInit May 16 17:45:19 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:19 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount May 16 
17:45:19 Debian ollama[6012]: device count 1 May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library May 16 17:45:19 Debian ollama[6012]: time=2025-05-16T17:45:19.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:19 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:19 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:19 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:19 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:19 Debian ollama[6012]: calling cuInit May 16 17:45:19 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:19 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:19 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:19 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:19 Debian ollama[6012]: device count 1 May 16 17:45:19 Debian ollama[6012]: 
time=2025-05-16T17:45:19.926+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:19 Debian ollama[6012]: releasing cuda driver library May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:20 Debian ollama[6012]: calling cuInit May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:20 Debian ollama[6012]: device count 1 May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.183+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" 
gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.333+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:20 Debian ollama[6012]: calling cuInit May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:20 Debian ollama[6012]: device count 1 May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.426+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" 
before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:20 Debian ollama[6012]: calling cuInit May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:20 Debian ollama[6012]: device count 1 May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" 
May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:20 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:20 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:20 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:20 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:20 Debian ollama[6012]: calling cuInit May 16 17:45:20 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:20 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:20 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:20 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:20 Debian ollama[6012]: device count 1 May 16 17:45:20 Debian ollama[6012]: time=2025-05-16T17:45:20.926+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:20 Debian ollama[6012]: releasing cuda driver library May 16 17:45:21 Debian ollama[6012]: 
time=2025-05-16T17:45:21.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:21 Debian ollama[6012]: calling cuInit May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:21 Debian ollama[6012]: device count 1 May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.182+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.333+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" 
before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:21 Debian ollama[6012]: calling cuInit May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:21 Debian ollama[6012]: device count 1 May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.426+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.583+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 
GiB" now.free_swap="73.0 MiB" May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:21 Debian ollama[6012]: calling cuInit May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:21 Debian ollama[6012]: device count 1 May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.683+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.832+02:00 level=WARN source=sched.go:676 msg="gpu VRAM usage didn't recover within timeout" seconds=5.168159089 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 
runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.833+02:00 level=DEBUG source=sched.go:399 msg="sending an unloaded event" runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.833+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.0 MiB" May 16 17:45:21 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:21 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:21 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:21 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:21 Debian ollama[6012]: dlsym: cuCtxDestroy - 0x7f32e69da710 May 16 17:45:21 Debian ollama[6012]: calling cuInit May 16 17:45:21 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:21 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:21 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:21 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:21 Debian ollama[6012]: device count 1 May 16 17:45:21 Debian ollama[6012]: 
time=2025-05-16T17:45:21.833+02:00 level=DEBUG source=sched.go:312 msg="ignoring unload event with no pending requests" May 16 17:45:21 Debian ollama[6012]: time=2025-05-16T17:45:21.927+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:21 Debian ollama[6012]: releasing cuda driver library May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.082+02:00 level=WARN source=sched.go:676 msg="gpu VRAM usage didn't recover within timeout" seconds=5.418638538 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.083+02:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.6 GiB" before.free="59.2 GiB" before.free_swap="73.0 MiB" now.total="62.6 GiB" now.free="59.2 GiB" now.free_swap="73.1 MiB" May 16 17:45:22 Debian ollama[6012]: initializing /usr/lib/x86_64-linux-gnu/libcuda.so.575.51.03 May 16 17:45:22 Debian ollama[6012]: dlsym: cuInit - 0x7f32e6974790 May 16 17:45:22 Debian ollama[6012]: dlsym: cuDriverGetVersion - 0x7f32e6974850 May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetCount - 0x7f32e69749d0 May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGet - 0x7f32e6974910 May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetAttribute - 0x7f32e6974f10 May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetUuid - 0x7f32e6974b50 May 16 17:45:22 Debian ollama[6012]: dlsym: cuDeviceGetName - 0x7f32e6974a90 May 16 17:45:22 Debian ollama[6012]: dlsym: cuCtxCreate_v3 - 0x7f32e6975a50 May 16 17:45:22 Debian ollama[6012]: dlsym: cuMemGetInfo_v2 - 0x7f32e69786f0 May 16 17:45:22 Debian ollama[6012]: dlsym: 
cuCtxDestroy - 0x7f32e69da710 May 16 17:45:22 Debian ollama[6012]: calling cuInit May 16 17:45:22 Debian ollama[6012]: calling cuDriverGetVersion May 16 17:45:22 Debian ollama[6012]: raw version 0x2f3a May 16 17:45:22 Debian ollama[6012]: CUDA driver version: 12.9 May 16 17:45:22 Debian ollama[6012]: calling cuDeviceGetCount May 16 17:45:22 Debian ollama[6012]: device count 1 May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.181+02:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-b03f2475-42ef-39d7-4f43-1e4d2c9e81f6 name="NVIDIA GeForce RTX 4070 Ti" overhead="0 B" before.total="11.6 GiB" before.free="11.3 GiB" now.total="11.6 GiB" now.free="11.3 GiB" now.used="262.9 MiB" May 16 17:45:22 Debian ollama[6012]: releasing cuda driver library May 16 17:45:22 Debian ollama[6012]: time=2025-05-16T17:45:22.333+02:00 level=WARN source=sched.go:676 msg="gpu VRAM usage didn't recover within timeout" seconds=5.668763082 runner.size="68.3 GiB" runner.vram="10.9 GiB" runner.parallel=3 runner.pid=6062 runner.model=/mnt/LLM/Ollama/Models/blobs/sha256-9d507a36062c2845dd3bb3e93364e9abc1607118acd8650727a700f72fb126e5 ```

@rick-github commented on GitHub (May 17, 2025):

May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.187+02:00 level=INFO source=server.go:168 msg=offload
 library=cuda layers.requested=-1 layers.model=49 layers.offload=4 layers.split="" memory.available="[11.4 GiB]"
 memory.gpu_overhead="0 B" memory.required.full="68.3 GiB" memory.required.partial="10.9 GiB" memory.required.kv="2.2 GiB"
 memory.required.allocations="[10.9 GiB]" memory.weights.total="60.6 GiB" memory.weights.repeating="59.8 GiB"
 memory.weights.nonrepeating="809.3 MiB" memory.graph.full="2.0 GiB" memory.graph.partial="2.0 GiB"
 projector.weights="1.6 GiB" projector.graph="0 B"

ollama estimated that it needed 68.3G to load the model, and allocated 10.9G of the available 11.4G of VRAM to offload 4 of 49 layers. The remaining 45 layers, about 57.4G (68.3G - 10.9G), were loaded into system RAM.
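The split described above can be reproduced from the log line itself. The following is a minimal sketch, not ollama code: `parse_fields` and `gib` are hypothetical helpers written for this example, and the log line is trimmed to the fields that matter.

```python
import re

# The offload line quoted above, trimmed to the fields used here.
OFFLOAD_LINE = (
    'msg=offload library=cuda layers.requested=-1 layers.model=49 '
    'layers.offload=4 memory.available="[11.4 GiB]" '
    'memory.required.full="68.3 GiB" memory.required.partial="10.9 GiB"'
)


def parse_fields(line: str) -> dict[str, str]:
    """Split a key=value log line into a dict, dropping quotes and brackets."""
    pairs = re.findall(r'([\w.]+)=("[^"]*"|\S+)', line)
    return {key: value.strip('"[]') for key, value in pairs}


def gib(text: str) -> float:
    """Convert a string such as '68.3 GiB' into a float number of GiB."""
    number, unit = text.split()
    if unit != "GiB":
        raise ValueError(f"unexpected unit: {unit}")
    return float(number)


fields = parse_fields(OFFLOAD_LINE)

# Layers that did not fit in VRAM stay on the CPU side.
cpu_layers = int(fields["layers.model"]) - int(fields["layers.offload"])

# Whatever is not placed in VRAM has to live in system RAM.
ram_gib = round(
    gib(fields["memory.required.full"]) - gib(fields["memory.required.partial"]), 1
)

print(f"{cpu_layers} layers in system RAM, about {ram_gib} GiB")
```

The same helpers can be pointed at any other `msg=offload` line to see how a different model or GPU would be split.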

May 16 17:44:49 Debian ollama[6012]: time=2025-05-16T17:44:49.091+02:00 level=INFO source=server.go:135
 msg="system memory" total="62.6 GiB" free="59.1 GiB" free_swap="215.1 MiB"

Your system had 59.1G of system RAM free, plus 215.1M of free swap, so about 59.3G available. With 57.4G being used by the ollama runner, that left only about 1.9G for other processes. It could be that at the time the runner was killed, another process started that needed more than the 1.9G available, so the kernel looked around, saw that the ollama runner was being a hog, and killed it.
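The headroom estimate above is simple arithmetic on the logged values. As a rough sketch (the function name `headroom_gib` is made up for this example, and the kernel's real OOM accounting is more involved than a single subtraction):

```python
MIB_PER_GIB = 1024


def headroom_gib(free_ram_gib: float, free_swap_mib: float, runner_ram_gib: float) -> float:
    """Memory left for everything else once the runner has taken its share."""
    available_gib = free_ram_gib + free_swap_mib / MIB_PER_GIB
    return available_gib - runner_ram_gib


# Values from the "system memory" log line and the offload estimate.
left_over = headroom_gib(free_ram_gib=59.1, free_swap_mib=215.1, runner_ram_gib=57.4)
print(f"about {left_over:.2f} GiB left for other processes")
```

With 1024 MiB per GiB this comes to roughly 1.91 GiB, which is very little slack on a machine that is also running a desktop or other services.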

> I deleted model, rebooted computer, downloaded again, and now, there is no more errors, prompt is waiting for my input

Rebooting the computer will have killed off a bunch of competing processes and freed up RAM and swap, allowing the runner process to take most of the RAM without the kernel getting trigger-happy. If you want to reduce the chance of being killed because of OOM again, you could increase the available swap.
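Before loading a large model it can help to check how much RAM and swap are actually free. A minimal sketch, assuming a Linux system that exposes `/proc/meminfo`; `parse_meminfo` is a hypothetical helper, and the sample text below is illustrative, not taken from the reporter's machine:

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[key.strip()] = int(parts[0])
    return values


# Illustrative sample only; on a real system use:
#   info = parse_meminfo(open("/proc/meminfo").read())
SAMPLE = """MemTotal:       65641000 kB
MemAvailable:   61970000 kB
SwapTotal:        999420 kB
SwapFree:         220262 kB
"""

info = parse_meminfo(SAMPLE)
swap_free_mib = info["SwapFree"] / 1024
mem_available_gib = info["MemAvailable"] / 1024 / 1024

print(f"available RAM: {mem_available_gib:.1f} GiB, free swap: {swap_free_mib:.1f} MiB")
```

If free swap is only a few hundred MiB, as in the log above, adding more swap gives the kernel somewhere to push idle pages instead of killing the runner.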

Reference: github-starred/ollama#69112