Mirror of https://github.com/ollama/ollama.git (synced 2026-05-07 00:22:43 -05:00)
[GH-ISSUE #14044] huihui_ai abliterated models crash with CUDA error: out of memory #71236
Closed · opened 2026-05-05 00:50:33 -05:00 by GiteaMirror · 6 comments
Originally created by @Johnreidsilver on GitHub (Feb 3, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14044
What is the issue?
ollama version is 0.15.5-rc1
I tried huihui_ai/glm-4.7-flash-abliterated:latest (which is larger than my VRAM) and huihui_ai/deepseek-r1-abliterated:latest (which is smaller than my VRAM); both crash with: Error: an error was encountered while running the model: CUDA error: out of memory
Running, for example, glm-4.7-flash:latest, which is larger than these models, or the even larger (24GB) nemotron-3-nano:latest, works fine.
Relevant log output
OS
Linux
GPU
Nvidia
CPU
AMD
Ollama version
0.15.5-rc1
@rick-github commented on GitHub (Feb 3, 2026):
Looks like -rc1 is a bit more conservative in how much memory it allocates.
See here for some ways to mitigate OOMs while this is looked at.
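For a systemd-managed install like the reporter's Ubuntu setup, OOM mitigations of this kind usually come down to setting server environment variables on the `ollama` service. A minimal sketch, assuming the stock `ollama.service` unit; the specific variable values are illustrative, not a recommended configuration:

```shell
# Open (or create) a drop-in override for the ollama service:
sudo systemctl edit ollama

# In the editor that opens, add a [Service] section with the desired
# environment variables, e.g. a smaller default context window:
#   [Service]
#   Environment="OLLAMA_CONTEXT_LENGTH=2048"
#   Environment="OLLAMA_FLASH_ATTENTION=1"

# Reload systemd and restart the service so the new environment takes effect:
sudo systemctl daemon-reload
sudo systemctl restart ollama
```

This persists the settings across reboots, unlike prefixing them on an `ollama run` invocation, which only affects the client process rather than the server that actually loads the model.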
@Johnreidsilver commented on GitHub (Feb 3, 2026):
Thanks for the quick feedback and mitigations. Apparently, even slightly more complex questions also crash the other models with the current rc1.
GGML_CUDA_ENABLE_UNIFIED_MEMORY=1 OLLAMA_FLASH_ATTENTION=1 OLLAMA_CONTEXT_LENGTH=2048 ollama run glm-4.7-flash:latest
It OOMed on a simple "how to set ollama server environment variables, running ubuntu 24.04 ?"
VRAM usage appeared to climb while the answer was being generated until it OOMed, even though system RAM was still available.
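One way to confirm the climbing-VRAM observation is to poll GPU memory while reproducing the crash; a sketch using standard `nvidia-smi` query flags (run it in a second terminal alongside `ollama run`):

```shell
# Print timestamp, used and total GPU memory as CSV, once per second,
# until interrupted with Ctrl-C:
nvidia-smi --query-gpu=timestamp,memory.used,memory.total --format=csv -l 1
```

If `memory.used` rises steadily during token generation and hits `memory.total` right before the crash, that corroborates an allocation made during decoding rather than at model load.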
journalctl -u ollama --no-pager --follow --pager-end | grep flash
Feb 03 15:14:25 LAPTOP ollama[2997]: llama_context: flash_attn = auto
Feb 03 15:15:47 LAPTOP ollama[2997]: time=2026-02-03T15:15:47.913Z level=INFO source=server.go:246 msg="enabling flash attention"
Full journal
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc0004887e0 sp=0xc0004887c8 pc=0x569e293e11e5 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0004887e8 sp=0xc0004887e0 pc=0x569e2943b461 Feb 03 15:09:15 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 35 gp=0xc0004821c0 m=nil [GC worker (idle)]: Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x7241bacae0c?, 0x0?, 0x0?, 0x0?, 0x0?) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000488f38 sp=0xc000488f18 pc=0x569e294335ce Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc000111730) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc000488fc8 sp=0xc000488f38 pc=0x569e293e1309 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc000488fe0 sp=0xc000488fc8 pc=0x569e293e11e5 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000488fe8 sp=0xc000488fe0 pc=0x569e2943b461 Feb 03 15:09:15 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 36 gp=0xc000482380 m=nil [GC worker (idle)]: Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x7241ba78781?, 0x3?, 0x28?, 0xcc?, 0x0?) 
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000489738 sp=0xc000489718 pc=0x569e294335ce Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc000111730) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc0004897c8 sp=0xc000489738 pc=0x569e293e1309 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc0004897e0 sp=0xc0004897c8 pc=0x569e293e11e5 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0004897e8 sp=0xc0004897e0 pc=0x569e2943b461 Feb 03 15:09:15 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 37 gp=0xc000482540 m=nil [GC worker (idle)]: Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x722d7add63d?, 0x0?, 0x0?, 0x0?, 0x0?) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000489f38 sp=0xc000489f18 pc=0x569e294335ce Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc000111730) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc000489fc8 sp=0xc000489f38 pc=0x569e293e1309 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc000489fe0 sp=0xc000489fc8 pc=0x569e293e11e5 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000489fe8 sp=0xc000489fe0 pc=0x569e2943b461 Feb 03 15:09:15 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 38 gp=0xc000482700 m=nil [GC worker (idle)]: Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x7241bab7036?, 0x3?, 0xce?, 0xe6?, 
0x0?) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc00048a738 sp=0xc00048a718 pc=0x569e294335ce Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc000111730) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc00048a7c8 sp=0xc00048a738 pc=0x569e293e1309 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc00048a7e0 sp=0xc00048a7c8 pc=0x569e293e11e5 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00048a7e8 sp=0xc00048a7e0 pc=0x569e2943b461 Feb 03 15:09:15 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 39 gp=0xc0004828c0 m=nil [GC worker (idle)]: Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x7241ba7adf9?, 0x1?, 0xe1?, 0xbf?, 0x0?) 
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc00048af38 sp=0xc00048af18 pc=0x569e294335ce Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc000111730) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc00048afc8 sp=0xc00048af38 pc=0x569e293e1309 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc00048afe0 sp=0xc00048afc8 pc=0x569e293e11e5 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00048afe8 sp=0xc00048afe0 pc=0x569e2943b461 Feb 03 15:09:15 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 40 gp=0xc000482a80 m=nil [GC worker (idle)]: Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x7241ba80475?, 0x1?, 0xfc?, 0xf3?, 0x0?) 
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc00048b738 sp=0xc00048b718 pc=0x569e294335ce Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc000111730) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc00048b7c8 sp=0xc00048b738 pc=0x569e293e1309 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc00048b7e0 sp=0xc00048b7c8 pc=0x569e293e11e5 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00048b7e8 sp=0xc00048b7e0 pc=0x569e2943b461 Feb 03 15:09:15 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 5 gp=0xc000003a40 m=nil [GC worker (idle)]: Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x7241ba7adf9?, 0x3?, 0x79?, 0x4c?, 0x0?) 
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000086738 sp=0xc000086718 pc=0x569e294335ce Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc000111730) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc0000867c8 sp=0xc000086738 pc=0x569e293e1309 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc0000867e0 sp=0xc0000867c8 pc=0x569e293e11e5 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000867e8 sp=0xc0000867e0 pc=0x569e2943b461 Feb 03 15:09:15 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 22 gp=0xc000103880 m=nil [GC worker (idle)]: Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x7241ba809ea?, 0x1?, 0x2?, 0xec?, 0x0?) 
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000081f38 sp=0xc000081f18 pc=0x569e294335ce Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc000111730) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc000081fc8 sp=0xc000081f38 pc=0x569e293e1309 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc000081fe0 sp=0xc000081fc8 pc=0x569e293e11e5 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000081fe8 sp=0xc000081fe0 pc=0x569e2943b461 Feb 03 15:09:15 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 41 gp=0xc000482c40 m=nil [GC worker (idle)]: Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x569e2b591f00?, 0x1?, 0x7a?, 0xcc?, 0x0?) 
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc00048bf38 sp=0xc00048bf18 pc=0x569e294335ce Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc000111730) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc00048bfc8 sp=0xc00048bf38 pc=0x569e293e1309 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc00048bfe0 sp=0xc00048bfc8 pc=0x569e293e11e5 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00048bfe8 sp=0xc00048bfe0 pc=0x569e2943b461 Feb 03 15:09:15 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 6 gp=0xc000003c00 m=nil [GC worker (idle)]: Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x7241ba80475?, 0x1?, 0x6?, 0x41?, 0x0?) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000086f38 sp=0xc000086f18 pc=0x569e294335ce Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc000111730) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc000086fc8 sp=0xc000086f38 pc=0x569e293e1309 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc000086fe0 sp=0xc000086fc8 pc=0x569e293e11e5 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000086fe8 sp=0xc000086fe0 pc=0x569e2943b461 Feb 03 15:09:15 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 23 gp=0xc000103a40 m=nil [GC worker (idle)]: Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x7241bb24afd?, 0x1?, 0xd6?, 0x5c?, 
0x0?) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000082738 sp=0xc000082718 pc=0x569e294335ce Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc000111730) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc0000827c8 sp=0xc000082738 pc=0x569e293e1309 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc0000827e0 sp=0xc0000827c8 pc=0x569e293e11e5 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000827e8 sp=0xc0000827e0 pc=0x569e2943b461 Feb 03 15:09:15 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 42 gp=0xc000482e00 m=nil [GC worker (idle)]: Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x569e2b591f00?, 0x1?, 0xc4?, 0x10?, 0x0?) 
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000484738 sp=0xc000484718 pc=0x569e294335ce Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc000111730) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc0004847c8 sp=0xc000484738 pc=0x569e293e1309 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc0004847e0 sp=0xc0004847c8 pc=0x569e293e11e5 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0004847e8 sp=0xc0004847e0 pc=0x569e2943b461 Feb 03 15:09:15 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 7 gp=0xc000003dc0 m=nil [GC worker (idle)]: Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x569e2b591f00?, 0x1?, 0x8c?, 0x56?, 0x0?) 
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000087738 sp=0xc000087718 pc=0x569e294335ce Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc000111730) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc0000877c8 sp=0xc000087738 pc=0x569e293e1309 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc0000877e0 sp=0xc0000877c8 pc=0x569e293e11e5 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000877e8 sp=0xc0000877e0 pc=0x569e2943b461 Feb 03 15:09:15 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 8 gp=0xc000582fc0 m=nil [chan receive]: Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x30?, 0x569e2aac0c00?, 0x1?, 0x0?, 0xc00126d798?) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc00126d750 sp=0xc00126d730 pc=0x569e294335ce Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.chanrecv(0xc000f16460, 0x0, 0x1) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/chan.go:664 +0x445 fp=0xc00126d7c8 sp=0xc00126d750 pc=0x569e293cf8e5 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.chanrecv1(0x569e2a62bec9?, 0x29?) 
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/chan.go:506 +0x12 fp=0xc00126d7f0 sp=0xc00126d7c8 pc=0x569e293cf472 Feb 03 15:09:15 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.(*Server).forwardBatch(_, {0xb3, {0x569e2ab7fe50, 0xc00012f9c0}, {0x569e2ab8c008, 0xc000fbafc0}, {0xc00150a010, 0x1, 0x1}, {{0x569e2ab8c008, ...}, ...}, ...}) Feb 03 15:09:15 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:475 +0xfa fp=0xc00126db58 sp=0xc00126d7f0 pc=0x569e299b783a Feb 03 15:09:15 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.(*Server).run(0xc0002350e0, {0x569e2ab74390, 0xc00062d720}) Feb 03 15:09:15 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:452 +0x18c fp=0xc00126dfb8 sp=0xc00126db58 pc=0x569e299b74ec Feb 03 15:09:15 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.Execute.gowrap1() Feb 03 15:09:15 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:1418 +0x28 fp=0xc00126dfe0 sp=0xc00126dfb8 pc=0x569e299c0b68 Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00126dfe8 sp=0xc00126dfe0 pc=0x569e2943b461 Feb 03 15:09:15 LAPTOP ollama[2997]: created by github.com/ollama/ollama/runner/ollamarunner.Execute in goroutine 1 Feb 03 15:09:15 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:1418 +0x4c9 Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 9 gp=0xc000583180 m=nil [select]: Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0xc000feba08?, 0x2?, 0x4?, 0x0?, 0xc000feb86c?) 
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000feb698 sp=0xc000feb678 pc=0x569e294335ce
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.selectgo(0xc000feba08, 0xc000feb868, 0xc0000ea080?, 0x0, 0x1?, 0x1)
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/select.go:351 +0x837 fp=0xc000feb7d0 sp=0xc000feb698 pc=0x569e29412477
Feb 03 15:09:15 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.(*Server).completion(0xc0002350e0, {0x569e2ab71ec0, 0xc000d602a0}, 0xc0000fe8c0)
Feb 03 15:09:15 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:950 +0xc4e fp=0xc000febac0 sp=0xc000feb7d0 pc=0x569e299bbc0e
Feb 03 15:09:15 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.(*Server).completion-fm({0x569e2ab71ec0?, 0xc000d602a0?}, 0xc001095b40?)
Feb 03 15:09:15 LAPTOP ollama[2997]: :1 +0x36 fp=0xc000febaf0 sp=0xc000febac0 pc=0x569e299c1056
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http.HandlerFunc.ServeHTTP(0xc000037c80?, {0x569e2ab71ec0?, 0xc000d602a0?}, 0xc001095b60?)
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http/server.go:2294 +0x29 fp=0xc000febb18 sp=0xc000febaf0 pc=0x569e29733989
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http.(*ServeMux).ServeHTTP(0x569e293d8325?, {0x569e2ab71ec0, 0xc000d602a0}, 0xc0000fe8c0)
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http/server.go:2822 +0x1c4 fp=0xc000febb68 sp=0xc000febb18 pc=0x569e29735884
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http.serverHandler.ServeHTTP({0x569e2ab6e2f0?}, {0x569e2ab71ec0?, 0xc000d602a0?}, 0x1?)
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http/server.go:3301 +0x8e fp=0xc000febb98 sp=0xc000febb68 pc=0x569e2975330e
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http.(*conn).serve(0xc000000480, {0x569e2ab74358, 0xc0000fd050})
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http/server.go:2102 +0x625 fp=0xc000febfb8 sp=0xc000febb98 pc=0x569e29731e85
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http.(*Server).Serve.gowrap3()
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http/server.go:3454 +0x28 fp=0xc000febfe0 sp=0xc000febfb8 pc=0x569e29737748
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000febfe8 sp=0xc000febfe0 pc=0x569e2943b461
Feb 03 15:09:15 LAPTOP ollama[2997]: created by net/http.(*Server).Serve in goroutine 1
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http/server.go:3454 +0x485
Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 1363 gp=0xc000583340 m=nil [chan receive]:
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x30?, 0x569e2aac0c00?, 0x1?, 0x48?, 0xc002595b20?)
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc002595ad8 sp=0xc002595ab8 pc=0x569e294335ce
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.chanrecv(0xc000f8e3f0, 0x0, 0x1)
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/chan.go:664 +0x445 fp=0xc002595b50 sp=0xc002595ad8 pc=0x569e293cf8e5
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.chanrecv1(0x569e2a62f735?, 0x2c?)
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/chan.go:506 +0x12 fp=0xc002595b78 sp=0xc002595b50 pc=0x569e293cf472
Feb 03 15:09:15 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.(*Server).computeBatch(0xc0002350e0, {0xb3, {0x569e2ab7fe50, 0xc00012f9c0}, {0x569e2ab8c008, 0xc000fbafc0}, {0xc00150a010, 0x1, 0x1}, {{0x569e2ab8c008, ...}, ...}, ...})
Feb 03 15:09:15 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:651 +0x185 fp=0xc002595ef0 sp=0xc002595b78 pc=0x569e299b9425
Feb 03 15:09:15 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.(*Server).run.gowrap1()
Feb 03 15:09:15 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:458 +0x58 fp=0xc002595fe0 sp=0xc002595ef0 pc=0x569e299b7718
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc002595fe8 sp=0xc002595fe0 pc=0x569e2943b461
Feb 03 15:09:15 LAPTOP ollama[2997]: created by github.com/ollama/ollama/runner/ollamarunner.(*Server).run in goroutine 8
Feb 03 15:09:15 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:458 +0x2cd
Feb 03 15:09:15 LAPTOP ollama[2997]: goroutine 974 gp=0xc0005836c0 m=nil [IO wait]:
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.gopark(0x100000001?, 0x100000001?, 0x1?, 0x0?, 0xb?)
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000a835d8 sp=0xc000a835b8 pc=0x569e294335ce
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.netpollblock(0x569e29456d98?, 0x293ccd06?, 0x9e?)
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/netpoll.go:575 +0xf7 fp=0xc000a83610 sp=0xc000a835d8 pc=0x569e293f88f7
Feb 03 15:09:15 LAPTOP ollama[2997]: internal/poll.runtime_pollWait(0x7f6abc266cc8, 0x72)
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/netpoll.go:351 +0x85 fp=0xc000a83630 sp=0xc000a83610 pc=0x569e294327e5
Feb 03 15:09:15 LAPTOP ollama[2997]: internal/poll.(*pollDesc).wait(0xc0000ea880?, 0xc0000fd151?, 0x0)
Feb 03 15:09:15 LAPTOP ollama[2997]: internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000a83658 sp=0xc000a83630 pc=0x569e294ba967
Feb 03 15:09:15 LAPTOP ollama[2997]: internal/poll.(*pollDesc).waitRead(...)
Feb 03 15:09:15 LAPTOP ollama[2997]: internal/poll/fd_poll_runtime.go:89
Feb 03 15:09:15 LAPTOP ollama[2997]: internal/poll.(*FD).Read(0xc0000ea880, {0xc0000fd151, 0x1, 0x1})
Feb 03 15:09:15 LAPTOP ollama[2997]: internal/poll/fd_unix.go:165 +0x27a fp=0xc000a836f0 sp=0xc000a83658 pc=0x569e294bbc5a
Feb 03 15:09:15 LAPTOP ollama[2997]: net.(*netFD).Read(0xc0000ea880, {0xc0000fd151?, 0xc000629818?, 0xc000a83770?})
Feb 03 15:09:15 LAPTOP ollama[2997]: net/fd_posix.go:55 +0x25 fp=0xc000a83738 sp=0xc000a836f0 pc=0x569e29530e45
Feb 03 15:09:15 LAPTOP ollama[2997]: net.(*conn).Read(0xc00011c9c8, {0xc0000fd151?, 0xc000f40180?, 0x569e2979f140?})
Feb 03 15:09:15 LAPTOP ollama[2997]: net/net.go:194 +0x45 fp=0xc000a83780 sp=0xc000a83738 pc=0x569e2953f205
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http.(*connReader).backgroundRead(0xc0000fd140)
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http/server.go:690 +0x37 fp=0xc000a837c8 sp=0xc000a83780 pc=0x569e2972bd57
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http.(*connReader).startBackgroundRead.gowrap2()
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http/server.go:686 +0x25 fp=0xc000a837e0 sp=0xc000a837c8 pc=0x569e2972bc85
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:09:15 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000a837e8 sp=0xc000a837e0 pc=0x569e2943b461
Feb 03 15:09:15 LAPTOP ollama[2997]: created by net/http.(*connReader).startBackgroundRead in goroutine 9
Feb 03 15:09:15 LAPTOP ollama[2997]: net/http/server.go:686 +0xb6
Feb 03 15:09:15 LAPTOP ollama[2997]: rax 0x0
Feb 03 15:09:15 LAPTOP ollama[2997]: rbx 0x27beb
Feb 03 15:09:15 LAPTOP ollama[2997]: rcx 0x7f6b03c9eb2c
Feb 03 15:09:15 LAPTOP ollama[2997]: rdx 0x6
Feb 03 15:09:15 LAPTOP ollama[2997]: rdi 0x27bd7
Feb 03 15:09:15 LAPTOP ollama[2997]: rsi 0x27beb
Feb 03 15:09:15 LAPTOP ollama[2997]: rbp 0x7f6a2bffdf40
Feb 03 15:09:15 LAPTOP ollama[2997]: rsp 0x7f6a2bffdf00
Feb 03 15:09:15 LAPTOP ollama[2997]: r8 0x0
Feb 03 15:09:15 LAPTOP ollama[2997]: r9 0x7
Feb 03 15:09:15 LAPTOP ollama[2997]: r10 0x8
Feb 03 15:09:15 LAPTOP ollama[2997]: r11 0x246
Feb 03 15:09:15 LAPTOP ollama[2997]: r12 0x6
Feb 03 15:09:15 LAPTOP ollama[2997]: r13 0x7f6a6f58ed20
Feb 03 15:09:15 LAPTOP ollama[2997]: r14 0x16
Feb 03 15:09:15 LAPTOP ollama[2997]: r15 0x0
Feb 03 15:09:15 LAPTOP ollama[2997]: rip 0x7f6b03c9eb2c
Feb 03 15:09:15 LAPTOP ollama[2997]: rflags 0x246
Feb 03 15:09:15 LAPTOP ollama[2997]: cs 0x33
Feb 03 15:09:15 LAPTOP ollama[2997]: fs 0x0
Feb 03 15:09:15 LAPTOP ollama[2997]: gs 0x0
Feb 03 15:09:16 LAPTOP ollama[2997]: time=2026-02-03T15:09:16.891Z level=ERROR source=server.go:303 msg="llama runner terminated" error="exit status 2"
Feb 03 15:09:16 LAPTOP ollama[2997]: [GIN] 2026/02/03 - 15:09:16 | 200 | 14.575684097s | 127.0.0.1 | POST "/api/chat"
Feb 03 15:13:00 LAPTOP ollama[2997]: [GIN] 2026/02/03 - 15:13:00 | 200 | 35.829µs | 127.0.0.1 | GET "/api/version"
Feb 03 15:14:10 LAPTOP ollama[2997]: [GIN] 2026/02/03 - 15:14:10 | 200 | 35.27µs | 127.0.0.1 | GET "/api/version"
Feb 03 15:14:14 LAPTOP ollama[2997]: [GIN] 2026/02/03 - 15:14:14 | 200 | 24.026µs | 127.0.0.1 | HEAD "/"
Feb 03 15:14:14 LAPTOP ollama[2997]: [GIN] 2026/02/03 - 15:14:14 | 200 | 37.486016ms | 127.0.0.1 | GET "/api/tags"
Feb 03 15:14:16 LAPTOP ollama[2997]: time=2026-02-03T15:14:16.903Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 40111"
Feb 03 15:14:17 LAPTOP ollama[2997]: time=2026-02-03T15:14:17.293Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 33023"
Feb 03 15:14:17 LAPTOP ollama[2997]: time=2026-02-03T15:14:17.542Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 41771"
Feb 03 15:14:17 LAPTOP ollama[2997]: time=2026-02-03T15:14:17.792Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 33019"
Feb 03 15:14:18 LAPTOP ollama[2997]: time=2026-02-03T15:14:18.043Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 35235"
Feb 03 15:14:18 LAPTOP ollama[2997]: time=2026-02-03T15:14:18.292Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 39085"
Feb 03 15:14:18 LAPTOP ollama[2997]: time=2026-02-03T15:14:18.542Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 42927"
Feb 03 15:14:18 LAPTOP ollama[2997]: time=2026-02-03T15:14:18.793Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 38317"
Feb 03 15:14:19 LAPTOP ollama[2997]: time=2026-02-03T15:14:19.043Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 35063"
Feb 03 15:14:19 LAPTOP ollama[2997]: time=2026-02-03T15:14:19.293Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 41141"
Feb 03 15:14:19 LAPTOP ollama[2997]: time=2026-02-03T15:14:19.543Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 33299"
Feb 03 15:14:19 LAPTOP ollama[2997]: time=2026-02-03T15:14:19.793Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 42535"
Feb 03 15:14:20 LAPTOP ollama[2997]: time=2026-02-03T15:14:20.043Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 42815"
Feb 03 15:14:20 LAPTOP ollama[2997]: time=2026-02-03T15:14:20.292Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 40685"
Feb 03 15:14:20 LAPTOP ollama[2997]: time=2026-02-03T15:14:20.542Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 42087"
Feb 03 15:14:20 LAPTOP ollama[2997]: time=2026-02-03T15:14:20.793Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 46807"
Feb 03 15:14:21 LAPTOP ollama[2997]: time=2026-02-03T15:14:21.042Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 34391"
Feb 03 15:14:21 LAPTOP ollama[2997]: time=2026-02-03T15:14:21.293Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 40753"
Feb 03 15:14:21 LAPTOP ollama[2997]: time=2026-02-03T15:14:21.543Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 43361"
Feb 03 15:14:21 LAPTOP ollama[2997]: [GIN] 2026/02/03 - 15:14:21 | 200 | 22.419µs | 127.0.0.1 | HEAD "/"
Feb 03 15:14:21 LAPTOP ollama[2997]: [GIN] 2026/02/03 - 15:14:21 | 200 | 122.959633ms | 127.0.0.1 | POST "/api/show"
Feb 03 15:14:21 LAPTOP ollama[2997]: time=2026-02-03T15:14:21.792Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 34169"
Feb 03 15:14:21 LAPTOP ollama[2997]: [GIN] 2026/02/03 - 15:14:21 | 200 | 109.148238ms | 127.0.0.1 | POST "/api/show"
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.024Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 43215"
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: loaded meta data with 29 key-value pairs and 292 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-667b0c1932bc6ffc593ed1d03f895bf2dc8dc6df21db3042284a6f4416b06a29 (version GGUF V3 (latest))
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 0: general.architecture str = llama
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 1: general.type str = model
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 2: general.name str = Meta Llama 3.1 8B Instruct
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 3: general.finetune str = Instruct
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 4: general.basename str = Meta-Llama-3.1
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 5: general.size_label str = 8B
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 6: general.license str = llama3.1
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 7: general.tags arr[str,6] = ["facebook", "meta", "pytorch", "llam...
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 8: general.languages arr[str,8] = ["en", "de", "fr", "it", "pt", "hi", ...
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 9: llama.block_count u32 = 32
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 10: llama.context_length u32 = 131072
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 11: llama.embedding_length u32 = 4096
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 12: llama.feed_forward_length u32 = 14336
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 13: llama.attention.head_count u32 = 32
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 14: llama.attention.head_count_kv u32 = 8
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 15: llama.rope.freq_base f32 = 500000.000000
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 16: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 17: general.file_type u32 = 15
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 18: llama.vocab_size u32 = 128256
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 19: llama.rope.dimension_count u32 = 128
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 20: tokenizer.ggml.model str = gpt2
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 21: tokenizer.ggml.pre str = llama-bpe
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 22: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ...
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 23: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 24: tokenizer.ggml.merges arr[str,280147] = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 25: tokenizer.ggml.bos_token_id u32 = 128000
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 26: tokenizer.ggml.eos_token_id u32 = 128009
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 27: tokenizer.chat_template str = {{- bos_token }}\n{%- if custom_tools ...
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 28: general.quantization_version u32 = 2
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - type f32: 66 tensors
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - type q4_K: 193 tensors
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - type q6_K: 33 tensors
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: file format = GGUF V3 (latest)
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: file type = Q4_K - Medium
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: file size = 4.58 GiB (4.89 BPW)
Feb 03 15:14:22 LAPTOP ollama[2997]: load: 0 unused tokens
Feb 03 15:14:22 LAPTOP ollama[2997]: load: printing all EOG tokens:
Feb 03 15:14:22 LAPTOP ollama[2997]: load: - 128001 ('<|end_of_text|>')
Feb 03 15:14:22 LAPTOP ollama[2997]: load: - 128008 ('<|eom_id|>')
Feb 03 15:14:22 LAPTOP ollama[2997]: load: - 128009 ('<|eot_id|>')
Feb 03 15:14:22 LAPTOP ollama[2997]: load: special tokens cache size = 256
Feb 03 15:14:22 LAPTOP ollama[2997]: load: token to piece cache size = 0.7999 MB
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: arch = llama
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: vocab_only = 1
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: no_alloc = 0
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: model type = ?B
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: model params = 8.03 B
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: general.name = Meta Llama 3.1 8B Instruct
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: vocab type = BPE
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_vocab = 128256
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_merges = 280147
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: BOS token = 128000 '<|begin_of_text|>'
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: EOS token = 128009 '<|eot_id|>'
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: EOT token = 128009 '<|eot_id|>'
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: EOM token = 128008 '<|eom_id|>'
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: LF token = 198 'Ċ'
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: EOG token = 128001 '<|end_of_text|>'
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: EOG token = 128008 '<|eom_id|>'
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: EOG token = 128009 '<|eot_id|>'
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: max token length = 256
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_load: vocab only - skipping tensors
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.532Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --model /usr/share/ollama/.ollama/models/blobs/sha256-667b0c1932bc6ffc593ed1d03f895bf2dc8dc6df21db3042284a6f4416b06a29 --port 36445"
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.532Z level=INFO source=sched.go:463 msg="system memory" total="27.3 GiB" free="22.0 GiB" free_swap="16.0 GiB"
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.532Z level=INFO source=sched.go:470 msg="gpu memory" id=GPU-ccc527a2-1a5a-f3c1-f540-717a73381106 library=CUDA available="5.0 GiB" free="5.5 GiB" minimum="457.0 MiB" overhead="0 B"
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.532Z level=INFO source=server.go:497 msg="loading model" "model layers"=33 requested=-1
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.532Z level=INFO source=device.go:240 msg="model weights" device=CUDA0 size="3.6 GiB"
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.532Z level=INFO source=device.go:245 msg="model weights" device=CPU size="676.0 MiB"
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.532Z level=INFO source=device.go:251 msg="kv cache" device=CUDA0 size="480.0 MiB"
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.532Z level=INFO source=device.go:256 msg="kv cache" device=CPU size="32.0 MiB"
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.532Z level=INFO source=device.go:262 msg="compute graph" device=CUDA0 size="677.5 MiB"
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.532Z level=INFO source=device.go:272 msg="total memory" size="5.5 GiB"
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.544Z level=INFO source=runner.go:965 msg="starting go runner"
Feb 03 15:14:22 LAPTOP ollama[2997]: load_backend: loaded CPU backend from /usr/local/lib/ollama/libggml-cpu-haswell.so
Feb 03 15:14:22 LAPTOP ollama[2997]: ggml_cuda_init: found 1 CUDA devices:
Feb 03 15:14:22 LAPTOP ollama[2997]: Device 0: NVIDIA GeForce RTX 3060 Laptop GPU, compute capability 8.6, VMM: yes, ID: GPU-ccc527a2-1a5a-f3c1-f540-717a73381106
Feb 03 15:14:22 LAPTOP ollama[2997]: load_backend: loaded CUDA backend from /usr/local/lib/ollama/cuda_v13/libggml-cuda.so
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.655Z level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 CUDA.0.ARCHS=750,800,860,890,900,1000,1030,1100,1200,1210 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(gcc)
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.656Z level=INFO source=runner.go:1001 msg="Server listening on 127.0.0.1:36445"
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.660Z level=INFO source=runner.go:895 msg=load request="{Operation:commit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Auto KvSize:4096 KvCacheType: NumThreads:8 GPULayers:30[ID:GPU-ccc527a2-1a5a-f3c1-f540-717a73381106 Layers:30(2..31)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:true}"
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.660Z level=INFO source=server.go:1349 msg="waiting for llama runner to start responding"
Feb 03 15:14:22 LAPTOP ollama[2997]: time=2026-02-03T15:14:22.660Z level=INFO source=server.go:1383 msg="waiting for server to become available" status="llm server loading model"
Feb 03 15:14:22 LAPTOP ollama[2997]: ggml_backend_cuda_device_get_memory device GPU-ccc527a2-1a5a-f3c1-f540-717a73381106 utilizing NVML memory reporting free: 5888147456 total: 6442450944
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_load_from_file_impl: using device CUDA0 (NVIDIA GeForce RTX 3060 Laptop GPU) (0000:01:00.0) - 5615 MiB free
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: direct I/O is enabled, disabling mmap
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: loaded meta data with 29 key-value pairs and 292 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-667b0c1932bc6ffc593ed1d03f895bf2dc8dc6df21db3042284a6f4416b06a29 (version GGUF V3 (latest))
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 0: general.architecture str = llama
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 1: general.type str = model
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 2: general.name str = Meta Llama 3.1 8B Instruct
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 3: general.finetune str = Instruct
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 4: general.basename str = Meta-Llama-3.1
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 5: general.size_label str = 8B
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 6: general.license str = llama3.1
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 7: general.tags arr[str,6] = ["facebook", "meta", "pytorch", "llam...
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 8: general.languages arr[str,8] = ["en", "de", "fr", "it", "pt", "hi", ...
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 9: llama.block_count u32 = 32
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 10: llama.context_length u32 = 131072
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 11: llama.embedding_length u32 = 4096
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 12: llama.feed_forward_length u32 = 14336
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 13: llama.attention.head_count u32 = 32
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 14: llama.attention.head_count_kv u32 = 8
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 15: llama.rope.freq_base f32 = 500000.000000
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 16: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 17: general.file_type u32 = 15
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 18: llama.vocab_size u32 = 128256
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 19: llama.rope.dimension_count u32 = 128
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 20: tokenizer.ggml.model str = gpt2
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 21: tokenizer.ggml.pre str = llama-bpe
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 22: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ...
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 23: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 24: tokenizer.ggml.merges arr[str,280147] = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 25: tokenizer.ggml.bos_token_id u32 = 128000
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 26: tokenizer.ggml.eos_token_id u32 = 128009
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 27: tokenizer.chat_template str = {{- bos_token }}\n{%- if custom_tools ...
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - kv 28: general.quantization_version u32 = 2
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - type f32: 66 tensors
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - type q4_K: 193 tensors
Feb 03 15:14:22 LAPTOP ollama[2997]: llama_model_loader: - type q6_K: 33 tensors
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: file format = GGUF V3 (latest)
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: file type = Q4_K - Medium
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: file size = 4.58 GiB (4.89 BPW)
Feb 03 15:14:22 LAPTOP ollama[2997]: load: 0 unused tokens
Feb 03 15:14:22 LAPTOP ollama[2997]: load: printing all EOG tokens:
Feb 03 15:14:22 LAPTOP ollama[2997]: load: - 128001 ('<|end_of_text|>')
Feb 03 15:14:22 LAPTOP ollama[2997]: load: - 128008 ('<|eom_id|>')
Feb 03 15:14:22 LAPTOP ollama[2997]: load: - 128009 ('<|eot_id|>')
Feb 03 15:14:22 LAPTOP ollama[2997]: load: special tokens cache size = 256
Feb 03 15:14:22 LAPTOP ollama[2997]: load: token to piece cache size = 0.7999 MB
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: arch = llama
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: vocab_only = 0
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: no_alloc = 0
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_ctx_train = 131072
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_embd = 4096
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_embd_inp = 4096
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_layer = 32
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_head = 32
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_head_kv = 8
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_rot = 128
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_swa = 0
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: is_swa_any = 0
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_embd_head_k = 128
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_embd_head_v = 128
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_gqa = 4
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_embd_k_gqa = 1024
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_embd_v_gqa = 1024
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: f_norm_eps = 0.0e+00
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: f_norm_rms_eps = 1.0e-05
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: f_clamp_kqv = 0.0e+00
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: f_max_alibi_bias = 0.0e+00
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: f_logit_scale = 0.0e+00
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: f_attn_scale = 0.0e+00
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_ff = 14336
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_expert = 0
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_expert_used = 0
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_expert_groups = 0
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_group_used = 0
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: causal attn = 1
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: pooling type = 0
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: rope type = 0
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: rope scaling = linear
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: freq_base_train = 500000.0
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: freq_scale_train = 1
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_ctx_orig_yarn = 131072
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: rope_yarn_log_mul = 0.0000
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: rope_finetuned = unknown
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: model type = 8B
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: model params = 8.03 B
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: general.name = Meta Llama 3.1 8B Instruct
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: vocab type = BPE
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_vocab = 128256
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: n_merges = 280147
Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: BOS token = 128000 '<|begin_of_text|>' Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: EOS token = 128009 '<|eot_id|>' Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: EOT token = 128009 '<|eot_id|>' Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: EOM token = 128008 '<|eom_id|>' Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: LF token = 198 'Ċ' Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: EOG token = 128001 '<|end_of_text|>' Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: EOG token = 128008 '<|eom_id|>' Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: EOG token = 128009 '<|eot_id|>' Feb 03 15:14:22 LAPTOP ollama[2997]: print_info: max token length = 256 Feb 03 15:14:22 LAPTOP ollama[2997]: load_tensors: loading model tensors, this can take a while... (mmap = false, direct_io = true) Feb 03 15:14:23 LAPTOP ollama[2997]: load_tensors: offloading output layer to GPU Feb 03 15:14:23 LAPTOP ollama[2997]: load_tensors: offloading 29 repeating layers to GPU Feb 03 15:14:23 LAPTOP ollama[2997]: load_tensors: offloaded 30/33 layers to GPU Feb 03 15:14:23 LAPTOP ollama[2997]: load_tensors: CPU model buffer size = 281.81 MiB Feb 03 15:14:23 LAPTOP ollama[2997]: load_tensors: CUDA0 model buffer size = 4005.99 MiB Feb 03 15:14:23 LAPTOP ollama[2997]: load_tensors: CUDA_Host model buffer size = 397.50 MiB Feb 03 15:14:25 LAPTOP ollama[2997]: llama_context: constructing llama_context Feb 03 15:14:25 LAPTOP ollama[2997]: llama_context: n_seq_max = 1 Feb 03 15:14:25 LAPTOP ollama[2997]: llama_context: n_ctx = 4096 Feb 03 15:14:25 LAPTOP ollama[2997]: llama_context: n_ctx_seq = 4096 Feb 03 15:14:25 LAPTOP ollama[2997]: llama_context: n_batch = 512 Feb 03 15:14:25 LAPTOP ollama[2997]: llama_context: n_ubatch = 512 Feb 03 15:14:25 LAPTOP ollama[2997]: llama_context: causal_attn = 1 Feb 03 15:14:25 LAPTOP ollama[2997]: llama_context: flash_attn = auto Feb 03 15:14:25 LAPTOP ollama[2997]: llama_context: kv_unified = false Feb 03 
15:14:25 LAPTOP ollama[2997]: llama_context: freq_base = 500000.0 Feb 03 15:14:25 LAPTOP ollama[2997]: llama_context: freq_scale = 1 Feb 03 15:14:25 LAPTOP ollama[2997]: llama_context: n_ctx_seq (4096) < n_ctx_train (131072) -- the full capacity of the model will not be utilized Feb 03 15:14:25 LAPTOP ollama[2997]: llama_context: CUDA_Host output buffer size = 0.50 MiB Feb 03 15:14:25 LAPTOP ollama[2997]: llama_kv_cache: CPU KV buffer size = 48.00 MiB Feb 03 15:14:25 LAPTOP ollama[2997]: llama_kv_cache: CUDA0 KV buffer size = 464.00 MiB Feb 03 15:14:25 LAPTOP ollama[2997]: llama_kv_cache: size = 512.00 MiB ( 4096 cells, 32 layers, 1/1 seqs), K (f16): 256.00 MiB, V (f16): 256.00 MiB Feb 03 15:14:25 LAPTOP ollama[2997]: sched_reserve: reserving ... Feb 03 15:14:25 LAPTOP ollama[2997]: sched_reserve: Flash Attention was auto, set to enabled Feb 03 15:14:25 LAPTOP ollama[2997]: sched_reserve: CUDA0 compute buffer size = 258.50 MiB Feb 03 15:14:25 LAPTOP ollama[2997]: sched_reserve: CUDA_Host compute buffer size = 24.01 MiB Feb 03 15:14:25 LAPTOP ollama[2997]: sched_reserve: graph nodes = 999 Feb 03 15:14:25 LAPTOP ollama[2997]: sched_reserve: graph splits = 39 (with bs=512), 2 (with bs=1) Feb 03 15:14:25 LAPTOP ollama[2997]: sched_reserve: reserve took 181.09 ms, sched copies = 1 Feb 03 15:14:25 LAPTOP ollama[2997]: time=2026-02-03T15:14:25.670Z level=INFO source=server.go:1387 msg="llama runner started in 3.14 seconds" Feb 03 15:14:25 LAPTOP ollama[2997]: time=2026-02-03T15:14:25.670Z level=INFO source=sched.go:537 msg="loaded runners" count=1 Feb 03 15:14:25 LAPTOP ollama[2997]: time=2026-02-03T15:14:25.670Z level=INFO source=server.go:1349 msg="waiting for llama runner to start responding" Feb 03 15:14:25 LAPTOP ollama[2997]: time=2026-02-03T15:14:25.670Z level=INFO source=server.go:1387 msg="llama runner started in 3.14 seconds" Feb 03 15:14:25 LAPTOP ollama[2997]: [GIN] 2026/02/03 - 15:14:25 | 200 | 3.779642619s | 127.0.0.1 | POST "/api/generate" Feb 03 15:15:09 
LAPTOP ollama[2997]: [GIN] 2026/02/03 - 15:15:09 | 200 | 18.654346345s | 127.0.0.1 | POST "/api/chat"
Feb 03 15:15:47 LAPTOP ollama[2997]: [GIN] 2026/02/03 - 15:15:47 | 200 | 23.816µs | 127.0.0.1 | HEAD "/"
Feb 03 15:15:47 LAPTOP ollama[2997]: [GIN] 2026/02/03 - 15:15:47 | 200 | 148.15122ms | 127.0.0.1 | POST "/api/show"
Feb 03 15:15:47 LAPTOP ollama[2997]: [GIN] 2026/02/03 - 15:15:47 | 200 | 147.327713ms | 127.0.0.1 | POST "/api/show"
Feb 03 15:15:47 LAPTOP ollama[2997]: time=2026-02-03T15:15:47.671Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 37155"
Feb 03 15:15:47 LAPTOP ollama[2997]: time=2026-02-03T15:15:47.841Z level=INFO source=sched.go:655 msg="updated VRAM based on existing loaded models" gpu=GPU-ccc527a2-1a5a-f3c1-f540-717a73381106 library=CUDA total="6.0 GiB" available="734.2 MiB"
Feb 03 15:15:47 LAPTOP ollama[2997]: time=2026-02-03T15:15:47.913Z level=INFO source=server.go:246 msg="enabling flash attention"
Feb 03 15:15:47 LAPTOP ollama[2997]: time=2026-02-03T15:15:47.913Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --model /usr/share/ollama/.ollama/models/blobs/sha256-9eba2761cf0b88b8bc11a065a7b5b47f1b13ce820e8e492cb1010b450f9ec950 --port 43805"
Feb 03 15:15:47 LAPTOP ollama[2997]: time=2026-02-03T15:15:47.914Z level=INFO source=sched.go:463 msg="system memory" total="27.3 GiB" free="21.0 GiB" free_swap="16.0 GiB"
Feb 03 15:15:47 LAPTOP ollama[2997]: time=2026-02-03T15:15:47.914Z level=INFO source=sched.go:470 msg="gpu memory" id=GPU-ccc527a2-1a5a-f3c1-f540-717a73381106 library=CUDA available="277.2 MiB" free="734.2 MiB" minimum="457.0 MiB" overhead="0 B"
Feb 03 15:15:47 LAPTOP ollama[2997]: time=2026-02-03T15:15:47.914Z level=INFO source=server.go:756 msg="loading model" "model layers"=48 requested=-1
Feb 03 15:15:47 LAPTOP ollama[2997]: time=2026-02-03T15:15:47.925Z level=INFO source=runner.go:1405 msg="starting ollama engine"
Feb 03 15:15:47 LAPTOP ollama[2997]: time=2026-02-03T15:15:47.925Z level=INFO source=runner.go:1440 msg="Server listening on 127.0.0.1:43805"
Feb 03 15:15:47 LAPTOP ollama[2997]: time=2026-02-03T15:15:47.936Z level=INFO source=runner.go:1278 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:8 GPULayers:48[ID:GPU-ccc527a2-1a5a-f3c1-f540-717a73381106 Layers:48(0..47)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
Feb 03 15:15:47 LAPTOP ollama[2997]: time=2026-02-03T15:15:47.974Z level=INFO source=ggml.go:136 msg="" architecture=glm4moelite file_type=Q4_K_M name="" description="" num_tensors=844 num_key_values=39
Feb 03 15:15:47 LAPTOP ollama[2997]: load_backend: loaded CPU backend from /usr/local/lib/ollama/libggml-cpu-haswell.so
Feb 03 15:15:48 LAPTOP ollama[2997]: ggml_cuda_init: found 1 CUDA devices:
Feb 03 15:15:48 LAPTOP ollama[2997]: Device 0: NVIDIA GeForce RTX 3060 Laptop GPU, compute capability 8.6, VMM: yes, ID: GPU-ccc527a2-1a5a-f3c1-f540-717a73381106
Feb 03 15:15:48 LAPTOP ollama[2997]: load_backend: loaded CUDA backend from /usr/local/lib/ollama/cuda_v13/libggml-cuda.so
Feb 03 15:15:48 LAPTOP ollama[2997]: time=2026-02-03T15:15:48.064Z level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 CUDA.0.ARCHS=750,800,860,890,900,1000,1030,1100,1200,1210 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(gcc)
Feb 03 15:15:48 LAPTOP ollama[2997]: time=2026-02-03T15:15:48.521Z level=INFO source=server.go:1028 msg="model requires more gpu memory than is currently available, evicting a model to make space" "loaded layers"=0
Feb 03 15:15:48 LAPTOP ollama[2997]: time=2026-02-03T15:15:48.521Z level=INFO source=runner.go:1278 msg=load request="{Operation:close LoraPath:[] Parallel:0 BatchSize:0 FlashAttention:Disabled KvSize:0 KvCacheType:
NumThreads:0 GPULayers:[] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
Feb 03 15:15:48 LAPTOP ollama[2997]: time=2026-02-03T15:15:48.521Z level=INFO source=device.go:240 msg="model weights" device=CUDA0 size="17.5 GiB"
Feb 03 15:15:48 LAPTOP ollama[2997]: time=2026-02-03T15:15:48.521Z level=INFO source=device.go:245 msg="model weights" device=CPU size="170.2 MiB"
Feb 03 15:15:48 LAPTOP ollama[2997]: time=2026-02-03T15:15:48.521Z level=INFO source=device.go:251 msg="kv cache" device=CUDA0 size="399.5 MiB"
Feb 03 15:15:48 LAPTOP ollama[2997]: time=2026-02-03T15:15:48.521Z level=INFO source=device.go:262 msg="compute graph" device=CUDA0 size="86.0 MiB"
Feb 03 15:15:48 LAPTOP ollama[2997]: time=2026-02-03T15:15:48.521Z level=INFO source=device.go:267 msg="compute graph" device=CPU size="55.0 MiB"
Feb 03 15:15:48 LAPTOP ollama[2997]: time=2026-02-03T15:15:48.521Z level=INFO source=device.go:272 msg="total memory" size="18.2 GiB"
Feb 03 15:15:48 LAPTOP ollama[2997]: time=2026-02-03T15:15:48.521Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 43761"
Feb 03 15:15:48 LAPTOP ollama[2997]: time=2026-02-03T15:15:48.885Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 43811"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.007Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 42277"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.188Z level=INFO source=sched.go:463 msg="system memory" total="27.3 GiB" free="21.6 GiB" free_swap="16.0 GiB"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.188Z level=INFO source=sched.go:470 msg="gpu memory" id=GPU-ccc527a2-1a5a-f3c1-f540-717a73381106 library=CUDA available="4.9 GiB" free="5.3 GiB" minimum="457.0 MiB" overhead="0 B"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.188Z level=INFO source=server.go:756 msg="loading model" "model layers"=48 requested=-1
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.188Z level=INFO source=runner.go:1278 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:8 GPULayers:12[ID:GPU-ccc527a2-1a5a-f3c1-f540-717a73381106 Layers:12(35..46)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.249Z level=INFO source=runner.go:1278 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:8 GPULayers:11[ID:GPU-ccc527a2-1a5a-f3c1-f540-717a73381106 Layers:11(36..46)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.313Z level=INFO source=runner.go:1278 msg=load request="{Operation:alloc LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:8 GPULayers:11[ID:GPU-ccc527a2-1a5a-f3c1-f540-717a73381106 Layers:11(36..46)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.531Z level=INFO source=runner.go:1278 msg=load request="{Operation:commit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:8 GPULayers:11[ID:GPU-ccc527a2-1a5a-f3c1-f540-717a73381106 Layers:11(36..46)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.531Z level=INFO source=ggml.go:482 msg="offloading 11 repeating layers to GPU"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.531Z level=INFO source=ggml.go:486 msg="offloading output layer to CPU"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.531Z level=INFO source=ggml.go:494 msg="offloaded 11/48 layers to GPU"
Feb 03 15:15:49 LAPTOP
ollama[2997]: time=2026-02-03T15:15:49.531Z level=INFO source=device.go:240 msg="model weights" device=CUDA0 size="4.3 GiB"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.531Z level=INFO source=device.go:245 msg="model weights" device=CPU size="13.4 GiB"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.531Z level=INFO source=device.go:251 msg="kv cache" device=CUDA0 size="93.5 MiB"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.531Z level=INFO source=device.go:256 msg="kv cache" device=CPU size="306.0 MiB"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.531Z level=INFO source=device.go:262 msg="compute graph" device=CUDA0 size="219.5 MiB"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.531Z level=INFO source=device.go:267 msg="compute graph" device=CPU size="55.0 MiB"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.531Z level=INFO source=device.go:272 msg="total memory" size="18.4 GiB"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.531Z level=INFO source=sched.go:537 msg="loaded runners" count=1
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.531Z level=INFO source=server.go:1349 msg="waiting for llama runner to start responding"
Feb 03 15:15:49 LAPTOP ollama[2997]: time=2026-02-03T15:15:49.532Z level=INFO source=server.go:1383 msg="waiting for server to become available" status="llm server loading model"
Feb 03 15:15:55 LAPTOP ollama[2997]: time=2026-02-03T15:15:55.068Z level=INFO source=server.go:1387 msg="llama runner started in 7.15 seconds"
Feb 03 15:15:55 LAPTOP ollama[2997]: [GIN] 2026/02/03 - 15:15:55 | 200 | 7.575581079s | 127.0.0.1 | POST "/api/generate"
Feb 03 15:16:49 LAPTOP ollama[2997]: CUDA error: out of memory
Feb 03 15:16:49 LAPTOP ollama[2997]: current device: 0, in function ggml_cuda_graph_evaluate_and_capture at //ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:3858
Feb 03 15:16:49 LAPTOP ollama[2997]: cudaGraphInstantiate(&graph->instance, graph->graph, __null, __null, 0)
Feb 03 15:16:49 LAPTOP ollama[2997]: //ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:98: CUDA error
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185648]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185647]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185646]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185645]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185644]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185643]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185642]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185641]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185640]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185639]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185638]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185637]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185636]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185635]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185634]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185633]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185600]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185599]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185598]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185597]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185596]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185595]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185594]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185593]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185592]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185591]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185590]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185589]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185588]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185587]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185586]
Feb 03 15:16:50 LAPTOP ollama[2997]: [New LWP 185585]
Feb 03 15:16:50 LAPTOP ollama[2997]: [Thread debugging using libthread_db enabled]
Feb 03
15:16:50 LAPTOP ollama[2997]: Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Feb 03 15:16:50 LAPTOP ollama[2997]: 0x000056c5f9fb2263 in ?? ()
Feb 03 15:16:50 LAPTOP ollama[2997]: #0 0x000056c5f9fb2263 in ?? ()
Feb 03 15:16:50 LAPTOP ollama[2997]: #1 0x000056c5f9f6e610 in ?? ()
Feb 03 15:16:50 LAPTOP ollama[2997]: #2 0x000056c5fc03a440 in ?? ()
Feb 03 15:16:50 LAPTOP ollama[2997]: #3 0x0000000000000080 in ?? ()
Feb 03 15:16:50 LAPTOP ollama[2997]: #4 0x0000000000000000 in ?? ()
Feb 03 15:16:50 LAPTOP ollama[2997]: [Inferior 1 (process 185584) detached]
Feb 03 15:16:50 LAPTOP ollama[2997]: SIGABRT: abort
Feb 03 15:16:50 LAPTOP ollama[2997]: PC=0x77c57129eb2c m=18 sigcode=18446744073709551610
Feb 03 15:16:50 LAPTOP ollama[2997]: signal arrived during cgo execution
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 2407 gp=0xc00148f880 m=18 mp=0xc000680808 [syscall]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.cgocall(0x56c5fadcf000, 0xc00288eaa0)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/cgocall.go:167 +0x4b fp=0xc00288ea78 sp=0xc00288ea40 pc=0x56c5f9fa514b
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/ml/backend/ggml._Cfunc_ggml_backend_sched_graph_compute_async(0x77c5140a9950, 0x77c46d8eafd0)
Feb 03 15:16:50 LAPTOP ollama[2997]: _cgo_gotypes.go:977 +0x4a fp=0xc00288eaa0 sp=0xc00288ea78 pc=0x56c5fa436b6a
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/ml/backend/ggml.(*Context).ComputeWithNotify.func2(...)
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/ml/backend/ggml/ggml.go:825
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/ml/backend/ggml.(*Context).ComputeWithNotify(0xc000aa2800, 0xc002b47480?, {0xc000be30a0, 0x1, 0x2?})
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/ml/backend/ggml/ggml.go:825 +0x1b2 fp=0xc00288eb78 sp=0xc00288eaa0 pc=0x56c5fa4446d2
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.(*Server).computeBatch(0xc00023f0e0, {0x2b7, {0x56c5fb6f4e50, 0xc000aa2800}, {0x56c5fb701008, 0xc00115c690}, {0xc002422090, 0x1, 0x1}, {{0x56c5fb701008, ...}, ...}, ...})
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:723 +0x876 fp=0xc00288eef0 sp=0xc00288eb78 pc=0x56c5fa52eb16
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.(*Server).run.gowrap1()
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:458 +0x58 fp=0xc00288efe0 sp=0xc00288eef0 pc=0x56c5fa52c718
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00288efe8 sp=0xc00288efe0 pc=0x56c5f9fb0461
Feb 03 15:16:50 LAPTOP ollama[2997]: created by github.com/ollama/ollama/runner/ollamarunner.(*Server).run in goroutine 12
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:458 +0x2cd
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 1 gp=0xc000002380 m=nil [IO wait, 1 minutes]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc001263790 sp=0xc001263770 pc=0x56c5f9fa85ce
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.netpollblock(0xc0004bd7e0?, 0xf9f41d06?, 0xc5?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/netpoll.go:575 +0xf7 fp=0xc0012637c8 sp=0xc001263790 pc=0x56c5f9f6d8f7
Feb 03 15:16:50 LAPTOP ollama[2997]: internal/poll.runtime_pollWait(0x77c571477eb0, 0x72)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/netpoll.go:351 +0x85 fp=0xc0012637e8 sp=0xc0012637c8 pc=0x56c5f9fa77e5
Feb 03 15:16:50 LAPTOP ollama[2997]: internal/poll.(*pollDesc).wait(0xc00004fa00?, 0x900000036?, 0x0)
Feb 03 15:16:50 LAPTOP ollama[2997]: internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc001263810 sp=0xc0012637e8 pc=0x56c5fa02f967
Feb 03 15:16:50 LAPTOP ollama[2997]: internal/poll.(*pollDesc).waitRead(...)
Feb 03 15:16:50 LAPTOP ollama[2997]: internal/poll/fd_poll_runtime.go:89
Feb 03 15:16:50 LAPTOP ollama[2997]: internal/poll.(*FD).Accept(0xc00004fa00)
Feb 03 15:16:50 LAPTOP ollama[2997]: internal/poll/fd_unix.go:620 +0x295 fp=0xc0012638b8 sp=0xc001263810 pc=0x56c5fa034d35
Feb 03 15:16:50 LAPTOP ollama[2997]: net.(*netFD).accept(0xc00004fa00)
Feb 03 15:16:50 LAPTOP ollama[2997]: net/fd_unix.go:172 +0x29 fp=0xc001263970 sp=0xc0012638b8 pc=0x56c5fa0a7de9
Feb 03 15:16:50 LAPTOP ollama[2997]: net.(*TCPListener).accept(0xc0002fab40)
Feb 03 15:16:50 LAPTOP ollama[2997]: net/tcpsock_posix.go:159 +0x1b fp=0xc0012639c0 sp=0xc001263970 pc=0x56c5fa0bdcfb
Feb 03 15:16:50 LAPTOP ollama[2997]: net.(*TCPListener).Accept(0xc0002fab40)
Feb 03 15:16:50 LAPTOP ollama[2997]: net/tcpsock.go:380 +0x30 fp=0xc0012639f0 sp=0xc0012639c0 pc=0x56c5fa0bcbb0
Feb 03 15:16:50 LAPTOP ollama[2997]: net/http.(*onceCloseListener).Accept(0xc000126480?)
Feb 03 15:16:50 LAPTOP ollama[2997]: :1 +0x24 fp=0xc001263a08 sp=0xc0012639f0 pc=0x56c5fa2d4a84
Feb 03 15:16:50 LAPTOP ollama[2997]: net/http.(*Server).Serve(0xc000517400, {0x56c5fb6e6ce0, 0xc0002fab40})
Feb 03 15:16:50 LAPTOP ollama[2997]: net/http/server.go:3424 +0x30c fp=0xc001263b38 sp=0xc001263a08 pc=0x56c5fa2ac34c
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.Execute({0xc0000340a0, 0x4, 0x4})
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:1441 +0x94e fp=0xc001263d08 sp=0xc001263b38 pc=0x56c5fa5358ee
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner.Execute({0xc000034080?, 0x0?, 0x0?})
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/runner.go:28 +0x118 fp=0xc001263d30 sp=0xc001263d08 pc=0x56c5fa598a78
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/cmd.NewCLI.func3(0xc000517200?, {0x56c5fb1620fd?, 0x4?, 0x56c5fb162101?})
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/cmd/cmd.go:1966 +0x45 fp=0xc001263d58 sp=0xc001263d30 pc=0x56c5fad60e25
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/spf13/cobra.(*Command).execute(0xc000129508, {0xc00018fc70, 0x5, 0x5})
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/spf13/cobra@v1.7.0/command.go:940 +0x85c fp=0xc001263e78 sp=0xc001263d58 pc=0x56c5fa121d7c
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/spf13/cobra.(*Command).ExecuteC(0xc0004d2908)
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5 fp=0xc001263f30 sp=0xc001263e78 pc=0x56c5fa1225c5
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/spf13/cobra.(*Command).Execute(...)
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/spf13/cobra@v1.7.0/command.go:992
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/spf13/cobra.(*Command).ExecuteContext(...)
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/spf13/cobra@v1.7.0/command.go:985
Feb 03 15:16:50 LAPTOP ollama[2997]: main.main()
Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/main.go:12 +0x4d fp=0xc001263f50 sp=0xc001263f30 pc=0x56c5fad6190d
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.main()
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:283 +0x29d fp=0xc001263fe0 sp=0xc001263f50 pc=0x56c5f9f74f7d
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc001263fe8 sp=0xc001263fe0 pc=0x56c5f9fb0461
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 2 gp=0xc000002e00 m=nil [force gc (idle), 1 minutes]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000084fa8 sp=0xc000084f88 pc=0x56c5f9fa85ce
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goparkunlock(...)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:441
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.forcegchelper()
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:348 +0xb8 fp=0xc000084fe0 sp=0xc000084fa8 pc=0x56c5f9f752b8
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000084fe8 sp=0xc000084fe0 pc=0x56c5f9fb0461
Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.init.7 in goroutine 1
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:336 +0x1a
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 3 gp=0xc000003340 m=nil [GC sweep wait]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000085780 sp=0xc000085760 pc=0x56c5f9fa85ce
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goparkunlock(...)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:441
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.bgsweep(0xc0000ac000)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgcsweep.go:316 +0xdf fp=0xc0000857c8 sp=0xc000085780 pc=0x56c5f9f5fa5f
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcenable.gowrap1()
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:204 +0x25 fp=0xc0000857e0 sp=0xc0000857c8 pc=0x56c5f9f53e45
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000857e8 sp=0xc0000857e0 pc=0x56c5f9fb0461
Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcenable in goroutine 1
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:204 +0x66
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 4 gp=0xc000003500 m=nil [GC scavenge wait]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x1808e11?, 0x17c1494?, 0x0?, 0x0?, 0x0?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000085f78 sp=0xc000085f58 pc=0x56c5f9fa85ce
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goparkunlock(...)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:441
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.(*scavengerState).park(0x56c5fc0374e0)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgcscavenge.go:425 +0x49 fp=0xc000085fa8 sp=0xc000085f78 pc=0x56c5f9f5d4a9
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.bgscavenge(0xc0000ac000)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgcscavenge.go:658 +0x59 fp=0xc000085fc8 sp=0xc000085fa8 pc=0x56c5f9f5da39
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcenable.gowrap2()
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:205 +0x25 fp=0xc000085fe0 sp=0xc000085fc8 pc=0x56c5f9f53de5
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000085fe8 sp=0xc000085fe0 pc=0x56c5f9fb0461
Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcenable in goroutine 1
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:205 +0xa5
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 5 gp=0xc000003dc0 m=nil [finalizer wait, 1 minutes]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x1b8?, 0xc000002380?, 0x1?, 0x23?, 0xc000084688?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000084630 sp=0xc000084610 pc=0x56c5f9fa85ce
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.runfinq()
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mfinal.go:196 +0x107 fp=0xc0000847e0 sp=0xc000084630 pc=0x56c5f9f52e07
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000847e8 sp=0xc0000847e0 pc=0x56c5f9fb0461
Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.createfing in goroutine 1
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mfinal.go:166 +0x3d
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 6 gp=0xc0001e88c0 m=nil [chan receive]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0xc00023b720?, 0xc0026ea018?, 0x60?, 0x67?, 0x56c5fa08ea28?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000086718 sp=0xc0000866f8 pc=0x56c5f9fa85ce
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.chanrecv(0xc0000ba310, 0x0, 0x1)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/chan.go:664 +0x445 fp=0xc000086790 sp=0xc000086718 pc=0x56c5f9f448e5
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.chanrecv1(0x0?, 0x0?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/chan.go:506 +0x12 fp=0xc0000867b8 sp=0xc000086790 pc=0x56c5f9f44472
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1796
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1799 +0x2f fp=0xc0000867e0 sp=0xc0000867b8 pc=0x56c5f9f56fef
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000867e8 sp=0xc0000867e0 pc=0x56c5f9fb0461
Feb 03 15:16:50 LAPTOP ollama[2997]: created by unique.runtime_registerUniqueMapCleanup in goroutine 1
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1794 +0x85
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 7 gp=0xc0001e8c40 m=nil [GC worker (idle), 1 minutes]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000086f38 sp=0xc000086f18 pc=0x56c5f9fa85ce
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc000086fc8 sp=0xc000086f38 pc=0x56c5f9f56309
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1()
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc000086fe0 sp=0xc000086fc8 pc=0x56c5f9f561e5
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000086fe8 sp=0xc000086fe0 pc=0x56c5f9fb0461
Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 18 gp=0xc000504000 m=nil [GC worker (idle)]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x78dcb5f9e8d?, 0x1?, 0x5c?, 0x94?, 0x0?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000080738 sp=0xc000080718 pc=0x56c5f9fa85ce
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc0000807c8 sp=0xc000080738 pc=0x56c5f9f56309
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1()
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc0000807e0 sp=0xc0000807c8 pc=0x56c5f9f561e5
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000807e8 sp=0xc0000807e0 pc=0x56c5f9fb0461
Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 34 gp=0xc000102380 m=nil [GC worker (idle)]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x787a7ac6cc6?, 0x1?, 0x7e?, 0xa9?, 0x0?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc00011a738 sp=0xc00011a718 pc=0x56c5f9fa85ce
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc00011a7c8 sp=0xc00011a738 pc=0x56c5f9f56309
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1()
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc00011a7e0 sp=0xc00011a7c8 pc=0x56c5f9f561e5
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00011a7e8 sp=0xc00011a7e0 pc=0x56c5f9fb0461
Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 35 gp=0xc000102540 m=nil [GC worker (idle)]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x78dcb05eca3?, 0x1?, 0xc9?, 0x63?, 0x0?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc00011af38 sp=0xc00011af18 pc=0x56c5f9fa85ce
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc00011afc8 sp=0xc00011af38 pc=0x56c5f9f56309
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1()
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc00011afe0 sp=0xc00011afc8 pc=0x56c5f9f561e5
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00011afe8 sp=0xc00011afe0 pc=0x56c5f9fb0461
Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 8 gp=0xc0001e8e00 m=nil [GC worker (idle)]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x78dcb604f11?, 0x3?, 0xb4?, 0xca?, 0x0?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000087738 sp=0xc000087718 pc=0x56c5f9fa85ce
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc0000877c8 sp=0xc000087738 pc=0x56c5f9f56309
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1()
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc0000877e0 sp=0xc0000877c8 pc=0x56c5f9f561e5
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000877e8 sp=0xc0000877e0 pc=0x56c5f9fb0461
Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 19 gp=0xc0005041c0 m=nil [GC worker (idle)]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x78dcb5a86be?, 0x1?, 0x1b?, 0xed?, 0x0?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000080f38 sp=0xc000080f18 pc=0x56c5f9fa85ce
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc000080fc8 sp=0xc000080f38 pc=0x56c5f9f56309
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1()
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc000080fe0 sp=0xc000080fc8 pc=0x56c5f9f561e5
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000080fe8 sp=0xc000080fe0 pc=0x56c5f9fb0461
Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 36 gp=0xc000102700 m=nil [GC worker (idle)]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x78dcb5fb532?, 0x3?, 0x21?, 0xb2?, 0x0?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc00011b738 sp=0xc00011b718 pc=0x56c5f9fa85ce
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc00011b7c8 sp=0xc00011b738 pc=0x56c5f9f56309
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1()
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc00011b7e0 sp=0xc00011b7c8 pc=0x56c5f9f561e5
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00011b7e8 sp=0xc00011b7e0 pc=0x56c5f9fb0461
Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 9 gp=0xc0001e8fc0 m=nil [GC worker (idle)]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x78dcb05e1ba?, 0x3?, 0x3a?, 0x89?, 0x0?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000087f38 sp=0xc000087f18 pc=0x56c5f9fa85ce
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc000087fc8 sp=0xc000087f38 pc=0x56c5f9f56309
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1()
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc000087fe0 sp=0xc000087fc8 pc=0x56c5f9f561e5
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({})
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000087fe8 sp=0xc000087fe0 pc=0x56c5f9fb0461
Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105
Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 20 gp=0xc000504700 m=nil [GC worker (idle)]:
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x78dcb05eca3?, 0x1?, 0x47?, 0xa3?, 0x0?)
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000081738 sp=0xc000081718 pc=0x56c5f9fa85ce Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc0000817c8 sp=0xc000081738 pc=0x56c5f9f56309 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc0000817e0 sp=0xc0000817c8 pc=0x56c5f9f561e5 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000817e8 sp=0xc0000817e0 pc=0x56c5f9fb0461 Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 21 gp=0xc0005048c0 m=nil [GC worker (idle)]: Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x78dcb05e72f?, 0x1?, 0xe6?, 0xe4?, 0x0?) 
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000081f38 sp=0xc000081f18 pc=0x56c5f9fa85ce Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc000081fc8 sp=0xc000081f38 pc=0x56c5f9f56309 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc000081fe0 sp=0xc000081fc8 pc=0x56c5f9f561e5 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000081fe8 sp=0xc000081fe0 pc=0x56c5f9fb0461 Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 22 gp=0xc000504a80 m=nil [GC worker (idle)]: Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x78becd40b68?, 0x1?, 0xcc?, 0x0?, 0x0?) 
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000082738 sp=0xc000082718 pc=0x56c5f9fa85ce Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc0000827c8 sp=0xc000082738 pc=0x56c5f9f56309 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc0000827e0 sp=0xc0000827c8 pc=0x56c5f9f561e5 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000827e8 sp=0xc0000827e0 pc=0x56c5f9fb0461 Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 23 gp=0xc000504c40 m=nil [GC worker (idle)]: Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x78dcb5a761a?, 0x3?, 0x75?, 0x5?, 0x0?) 
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000082f38 sp=0xc000082f18 pc=0x56c5f9fa85ce Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc000082fc8 sp=0xc000082f38 pc=0x56c5f9f56309 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc000082fe0 sp=0xc000082fc8 pc=0x56c5f9f561e5 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000082fe8 sp=0xc000082fe0 pc=0x56c5f9fb0461 Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 24 gp=0xc000504e00 m=nil [GC worker (idle)]: Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x56c5fc106f00?, 0x1?, 0xef?, 0x87?, 0x0?) 
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000083738 sp=0xc000083718 pc=0x56c5f9fa85ce Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc0000837c8 sp=0xc000083738 pc=0x56c5f9f56309 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc0000837e0 sp=0xc0000837c8 pc=0x56c5f9f561e5 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000837e8 sp=0xc0000837e0 pc=0x56c5f9fb0461 Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 50 gp=0xc000584000 m=nil [GC worker (idle)]: Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x56c5fc106f00?, 0x1?, 0xa4?, 0x1b?, 0x0?) 
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000116738 sp=0xc000116718 pc=0x56c5f9fa85ce Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc0001167c8 sp=0xc000116738 pc=0x56c5f9f56309 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc0001167e0 sp=0xc0001167c8 pc=0x56c5f9f561e5 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0001167e8 sp=0xc0001167e0 pc=0x56c5f9fb0461 Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 10 gp=0xc0001e9180 m=nil [GC worker (idle)]: Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x78dcb5f0c0c?, 0x3?, 0x5d?, 0xd4?, 0x0?) 
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc0004aa738 sp=0xc0004aa718 pc=0x56c5f9fa85ce Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc0004aa7c8 sp=0xc0004aa738 pc=0x56c5f9f56309 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc0004aa7e0 sp=0xc0004aa7c8 pc=0x56c5f9f561e5 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0004aa7e8 sp=0xc0004aa7e0 pc=0x56c5f9fb0461 Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 11 gp=0xc0001e9340 m=nil [GC worker (idle)]: Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x78dcb5f7d44?, 0x1?, 0x89?, 0x9f?, 0x0?) 
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc0004aaf38 sp=0xc0004aaf18 pc=0x56c5f9fa85ce Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkWorker(0xc0000bb730) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1423 +0xe9 fp=0xc0004aafc8 sp=0xc0004aaf38 pc=0x56c5f9f56309 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gcBgMarkStartWorkers.gowrap1() Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x25 fp=0xc0004aafe0 sp=0xc0004aafc8 pc=0x56c5f9f561e5 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0004aafe8 sp=0xc0004aafe0 pc=0x56c5f9fb0461 Feb 03 15:16:50 LAPTOP ollama[2997]: created by runtime.gcBgMarkStartWorkers in goroutine 1 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/mgc.go:1339 +0x105 Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 12 gp=0xc000584e00 m=nil [chan receive]: Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x30?, 0x56c5fb635c00?, 0x1?, 0x0?, 0xc001265798?) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc001265750 sp=0xc001265730 pc=0x56c5f9fa85ce Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.chanrecv(0xc000da0700, 0x0, 0x1) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/chan.go:664 +0x445 fp=0xc0012657c8 sp=0xc001265750 pc=0x56c5f9f448e5 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.chanrecv1(0x56c5fb1a0ec9?, 0x29?) 
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/chan.go:506 +0x12 fp=0xc0012657f0 sp=0xc0012657c8 pc=0x56c5f9f44472 Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.(*Server).forwardBatch(_, {0x2b8, {0x56c5fb6f4e50, 0xc000aa2980}, {0x56c5fb701008, 0xc0012767b0}, {0xc0024220a8, 0x1, 0x1}, {{0x56c5fb701008, ...}, ...}, ...}) Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:475 +0xfa fp=0xc001265b58 sp=0xc0012657f0 pc=0x56c5fa52c83a Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.(*Server).run(0xc00023f0e0, {0x56c5fb6e9390, 0xc00018fd10}) Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:452 +0x18c fp=0xc001265fb8 sp=0xc001265b58 pc=0x56c5fa52c4ec Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.Execute.gowrap1() Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:1418 +0x28 fp=0xc001265fe0 sp=0xc001265fb8 pc=0x56c5fa535b68 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc001265fe8 sp=0xc001265fe0 pc=0x56c5f9fb0461 Feb 03 15:16:50 LAPTOP ollama[2997]: created by github.com/ollama/ollama/runner/ollamarunner.Execute in goroutine 1 Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:1418 +0x4c9 Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 13 gp=0xc000584fc0 m=nil [select]: Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0xc000f89a08?, 0x2?, 0x4?, 0x0?, 0xc000f8986c?) 
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000f89698 sp=0xc000f89678 pc=0x56c5f9fa85ce Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.selectgo(0xc000f89a08, 0xc000f89868, 0xc000cf2280?, 0x0, 0x1?, 0x1) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/select.go:351 +0x837 fp=0xc000f897d0 sp=0xc000f89698 pc=0x56c5f9f87477 Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.(*Server).completion(0xc00023f0e0, {0x56c5fb6e6ec0, 0xc000dfc0e0}, 0xc00049c280) Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:950 +0xc4e fp=0xc000f89ac0 sp=0xc000f897d0 pc=0x56c5fa530c0e Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.(*Server).completion-fm({0x56c5fb6e6ec0?, 0xc000dfc0e0?}, 0xc000f9db40?) Feb 03 15:16:50 LAPTOP ollama[2997]: :1 +0x36 fp=0xc000f89af0 sp=0xc000f89ac0 pc=0x56c5fa536056 Feb 03 15:16:50 LAPTOP ollama[2997]: net/http.HandlerFunc.ServeHTTP(0xc0004c7740?, {0x56c5fb6e6ec0?, 0xc000dfc0e0?}, 0xc000f9db60?) Feb 03 15:16:50 LAPTOP ollama[2997]: net/http/server.go:2294 +0x29 fp=0xc000f89b18 sp=0xc000f89af0 pc=0x56c5fa2a8989 Feb 03 15:16:50 LAPTOP ollama[2997]: net/http.(*ServeMux).ServeHTTP(0x56c5f9f4d325?, {0x56c5fb6e6ec0, 0xc000dfc0e0}, 0xc00049c280) Feb 03 15:16:50 LAPTOP ollama[2997]: net/http/server.go:2822 +0x1c4 fp=0xc000f89b68 sp=0xc000f89b18 pc=0x56c5fa2aa884 Feb 03 15:16:50 LAPTOP ollama[2997]: net/http.serverHandler.ServeHTTP({0x56c5fb6e32f0?}, {0x56c5fb6e6ec0?, 0xc000dfc0e0?}, 0x1?) 
Feb 03 15:16:50 LAPTOP ollama[2997]: net/http/server.go:3301 +0x8e fp=0xc000f89b98 sp=0xc000f89b68 pc=0x56c5fa2c830e Feb 03 15:16:50 LAPTOP ollama[2997]: net/http.(*conn).serve(0xc000126480, {0x56c5fb6e9358, 0xc000114ab0}) Feb 03 15:16:50 LAPTOP ollama[2997]: net/http/server.go:2102 +0x625 fp=0xc000f89fb8 sp=0xc000f89b98 pc=0x56c5fa2a6e85 Feb 03 15:16:50 LAPTOP ollama[2997]: net/http.(*Server).Serve.gowrap3() Feb 03 15:16:50 LAPTOP ollama[2997]: net/http/server.go:3454 +0x28 fp=0xc000f89fe0 sp=0xc000f89fb8 pc=0x56c5fa2ac748 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000f89fe8 sp=0xc000f89fe0 pc=0x56c5f9fb0461 Feb 03 15:16:50 LAPTOP ollama[2997]: created by net/http.(*Server).Serve in goroutine 1 Feb 03 15:16:50 LAPTOP ollama[2997]: net/http/server.go:3454 +0x485 Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 859 gp=0xc000602e00 m=nil [IO wait]: Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x5?, 0x0?, 0x0?, 0x0?, 0xb?) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000b86dd8 sp=0xc000b86db8 pc=0x56c5f9fa85ce Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.netpollblock(0x56c5f9fcbd98?, 0xf9f41d06?, 0xc5?) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/netpoll.go:575 +0xf7 fp=0xc000b86e10 sp=0xc000b86dd8 pc=0x56c5f9f6d8f7 Feb 03 15:16:50 LAPTOP ollama[2997]: internal/poll.runtime_pollWait(0x77c571477d98, 0x72) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/netpoll.go:351 +0x85 fp=0xc000b86e30 sp=0xc000b86e10 pc=0x56c5f9fa77e5 Feb 03 15:16:50 LAPTOP ollama[2997]: internal/poll.(*pollDesc).wait(0xc00004fa80?, 0xc000114bb1?, 0x0) Feb 03 15:16:50 LAPTOP ollama[2997]: internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000b86e58 sp=0xc000b86e30 pc=0x56c5fa02f967 Feb 03 15:16:50 LAPTOP ollama[2997]: internal/poll.(*pollDesc).waitRead(...) 
Feb 03 15:16:50 LAPTOP ollama[2997]: internal/poll/fd_poll_runtime.go:89 Feb 03 15:16:50 LAPTOP ollama[2997]: internal/poll.(*FD).Read(0xc00004fa80, {0xc000114bb1, 0x1, 0x1}) Feb 03 15:16:50 LAPTOP ollama[2997]: internal/poll/fd_unix.go:165 +0x27a fp=0xc000b86ef0 sp=0xc000b86e58 pc=0x56c5fa030c5a Feb 03 15:16:50 LAPTOP ollama[2997]: net.(*netFD).Read(0xc00004fa80, {0xc000114bb1?, 0xc0002fac18?, 0xc000b86f70?}) Feb 03 15:16:50 LAPTOP ollama[2997]: net/fd_posix.go:55 +0x25 fp=0xc000b86f38 sp=0xc000b86ef0 pc=0x56c5fa0a5e45 Feb 03 15:16:50 LAPTOP ollama[2997]: net.(*conn).Read(0xc00007c938, {0xc000114bb1?, 0xc000e98000?, 0x56c5fa314140?}) Feb 03 15:16:50 LAPTOP ollama[2997]: net/net.go:194 +0x45 fp=0xc000b86f80 sp=0xc000b86f38 pc=0x56c5fa0b4205 Feb 03 15:16:50 LAPTOP ollama[2997]: net/http.(*connReader).backgroundRead(0xc000114ba0) Feb 03 15:16:50 LAPTOP ollama[2997]: net/http/server.go:690 +0x37 fp=0xc000b86fc8 sp=0xc000b86f80 pc=0x56c5fa2a0d57 Feb 03 15:16:50 LAPTOP ollama[2997]: net/http.(*connReader).startBackgroundRead.gowrap2() Feb 03 15:16:50 LAPTOP ollama[2997]: net/http/server.go:686 +0x25 fp=0xc000b86fe0 sp=0xc000b86fc8 pc=0x56c5fa2a0c85 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000b86fe8 sp=0xc000b86fe0 pc=0x56c5f9fb0461 Feb 03 15:16:50 LAPTOP ollama[2997]: created by net/http.(*connReader).startBackgroundRead in goroutine 13 Feb 03 15:16:50 LAPTOP ollama[2997]: net/http/server.go:686 +0xb6 Feb 03 15:16:50 LAPTOP ollama[2997]: goroutine 2408 gp=0xc0010a2fc0 m=nil [chan receive]: Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.gopark(0x30?, 0x56c5fb635c00?, 0x1?, 0xe0?, 0xc000b50b20?) 
Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/proc.go:435 +0xce fp=0xc000b50ad8 sp=0xc000b50ab8 pc=0x56c5f9fa85ce Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.chanrecv(0xc000da0690, 0x0, 0x1) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/chan.go:664 +0x445 fp=0xc000b50b50 sp=0xc000b50ad8 pc=0x56c5f9f448e5 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.chanrecv1(0x56c5fb1a4735?, 0x2c?) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/chan.go:506 +0x12 fp=0xc000b50b78 sp=0xc000b50b50 pc=0x56c5f9f44472 Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.(*Server).computeBatch(0xc00023f0e0, {0x2b8, {0x56c5fb6f4e50, 0xc000aa2980}, {0x56c5fb701008, 0xc0012767b0}, {0xc0024220a8, 0x1, 0x1}, {{0x56c5fb701008, ...}, ...}, ...}) Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:651 +0x185 fp=0xc000b50ef0 sp=0xc000b50b78 pc=0x56c5fa52e425 Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner.(*Server).run.gowrap1() Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:458 +0x58 fp=0xc000b50fe0 sp=0xc000b50ef0 pc=0x56c5fa52c718 Feb 03 15:16:50 LAPTOP ollama[2997]: runtime.goexit({}) Feb 03 15:16:50 LAPTOP ollama[2997]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000b50fe8 sp=0xc000b50fe0 pc=0x56c5f9fb0461 Feb 03 15:16:50 LAPTOP ollama[2997]: created by github.com/ollama/ollama/runner/ollamarunner.(*Server).run in goroutine 12 Feb 03 15:16:50 LAPTOP ollama[2997]: github.com/ollama/ollama/runner/ollamarunner/runner.go:458 +0x2cd Feb 03 15:16:50 LAPTOP ollama[2997]: rax 0x0 Feb 03 15:16:50 LAPTOP ollama[2997]: rbx 0x2d524 Feb 03 15:16:50 LAPTOP ollama[2997]: rcx 0x77c57129eb2c Feb 03 15:16:50 LAPTOP ollama[2997]: rdx 0x6 Feb 03 15:16:50 LAPTOP ollama[2997]: rdi 0x2d4f0 Feb 03 15:16:50 LAPTOP ollama[2997]: rsi 0x2d524 Feb 03 15:16:50 LAPTOP ollama[2997]: rbp 0x77c498ffbf40 Feb 03 15:16:50 LAPTOP ollama[2997]: rsp 0x77c498ffbf00 Feb 03 15:16:50 LAPTOP 
ollama[2997]: r8 0x0 Feb 03 15:16:50 LAPTOP ollama[2997]: r9 0x7 Feb 03 15:16:50 LAPTOP ollama[2997]: r10 0x8 Feb 03 15:16:50 LAPTOP ollama[2997]: r11 0x246 Feb 03 15:16:50 LAPTOP ollama[2997]: r12 0x6 Feb 03 15:16:50 LAPTOP ollama[2997]: r13 0x77c4d758ed20 Feb 03 15:16:50 LAPTOP ollama[2997]: r14 0x16 Feb 03 15:16:50 LAPTOP ollama[2997]: r15 0x0 Feb 03 15:16:50 LAPTOP ollama[2997]: rip 0x77c57129eb2c Feb 03 15:16:50 LAPTOP ollama[2997]: rflags 0x246 Feb 03 15:16:50 LAPTOP ollama[2997]: cs 0x33 Feb 03 15:16:50 LAPTOP ollama[2997]: fs 0x0 Feb 03 15:16:50 LAPTOP ollama[2997]: gs 0x0 Feb 03 15:16:51 LAPTOP ollama[2997]: time=2026-02-03T15:16:51.929Z level=ERROR source=server.go:303 msg="llama runner terminated" error="exit status 2" Feb 03 15:16:51 LAPTOP ollama[2997]: [GIN] 2026/02/03 - 15:16:51 | 200 | 54.730861921s | 127.0.0.1 | POST "/api/chat" Feb 03 15:21:51 LAPTOP ollama[2997]: time=2026-02-03T15:21:51.964Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 44013"

@rick-github commented on GitHub (Feb 3, 2026):
Since the crash occurs during generation and not at model load (when the memory allocation is made), this may be either larger-than-expected transient allocations or an actual memory leak. You could try setting OLLAMA_GPU_OVERHEAD, which will protect against large transient allocations but not against memory leaks. #14045 shows there are other problems with -rc1, so this is possibly related, and I expect it will get a lot of attention.

@Johnreidsilver commented on GitHub (Feb 3, 2026):
Thanks again. I just downgraded to v15.4 and everything is working fine: VRAM usage is steady during generation. With huihui_ai/deepseek-r1-abliterated:latest (5GB), VRAM stays put at 5.263GB.
I tried 15.5 RC1 with OLLAMA_GPU_OVERHEAD=536870912 and it still crashed. A memory leak in VRAM?
GGML_CUDA_ENABLE_UNIFIED_MEMORY=1 OLLAMA_FLASH_ATTENTION=1 OLLAMA_CONTEXT_LENGTH=1024 OLLAMA_GPU_OVERHEAD=536870912 OLLAMA_NEW_ENGINE=1 ollama run glm-4.7-flash:latest
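For reference, the OLLAMA_GPU_OVERHEAD value used above is 512 MiB expressed in bytes. Exporting it inline only affects that one `ollama run` invocation; for a systemd-managed server (which the journal output above suggests), the setting has to reach the service process itself. A sketch, assuming the default `ollama.service` unit name from the Linux installer:

```shell
# 512 MiB in bytes -- the value passed as OLLAMA_GPU_OVERHEAD above.
echo $((512 * 1024 * 1024))   # prints 536870912

# Sketch: persist the setting for a systemd-managed ollama server
# (assumes the default unit name "ollama.service").
sudo systemctl edit ollama.service
# In the editor, add:
#   [Service]
#   Environment="OLLAMA_GPU_OVERHEAD=536870912"
# Then restart so the service picks up the new environment:
sudo systemctl restart ollama
```

Note that environment variables set in an interactive shell are not seen by the systemd service, which is a common reason such settings appear to have no effect.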
@Johnreidsilver commented on GitHub (Feb 4, 2026):
Fixed in v15.5 RC2.
@rick-github commented on GitHub (Feb 4, 2026):
0.15.5-rc2 backed out the ggml bump.