mirror of
https://github.com/ollama/ollama.git
synced 2026-05-23 06:01:37 -05:00
Open · opened 2026-05-04 23:29:05 -05:00 by GiteaMirror · 11 comments
Label: model
Reference: github-starred/ollama#70926
Originally created by @pinghe on GitHub (Dec 12, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13433
https://github.com/zai-org/Open-AutoGLM?tab=readme-ov-file
https://huggingface.co/zai-org/AutoGLM-Phone-9B
https://huggingface.co/zai-org/AutoGLM-Phone-9B-Multilingual
@rick-github commented on GitHub (Dec 13, 2025):
This model uses the glm4v architecture, so it will be supported once https://github.com/ggml-org/llama.cpp/pull/16600 is merged and the vendored llama.cpp is synced.
@rick-github commented on GitHub (Jan 1, 2026):
@pinghe commented on GitHub (Jan 5, 2026):
05 12:12:16 arch ollama[927]: warmup: warmup with image size = 1288 x 1288
05 12:12:16 arch ollama[927]: ggml_backend_cuda_buffer_type_alloc_buffer: allocating 515.05 MiB on device 0: cudaMalloc failed: out of memory
05 12:12:16 arch ollama[927]: ggml_gallocr_reserve_n_impl: failed to allocate CUDA0 buffer of size 540070912
05 12:12:16 arch ollama[927]: alloc_compute_meta: CPU compute buffer size = 19.11 MiB
05 12:12:16 arch ollama[927]: alloc_compute_meta: graph splits = 1, nodes = 632
05 12:12:16 arch ollama[927]: warmup: flash attention is enabled
05 12:12:16 arch ollama[927]: time=2026-01-05T12:12:16.324+08:00 level=INFO source=server.go:1376 msg="llama runner started in 2.61 seconds"
05 12:12:16 arch ollama[927]: time=2026-01-05T12:12:16.324+08:00 level=INFO source=sched.go:517 msg="loaded runners" count=1
05 12:12:16 arch ollama[927]: time=2026-01-05T12:12:16.324+08:00 level=INFO source=server.go:1338 msg="waiting for llama runner to start responding"
05 12:12:16 arch ollama[927]: time=2026-01-05T12:12:16.325+08:00 level=INFO source=server.go:1376 msg="llama runner started in 2.61 seconds"
05 12:12:16 arch ollama[927]: add_text: <|begin_of_image|>
05 12:12:16 arch ollama[927]: image_tokens->nx = 85
05 12:12:16 arch ollama[927]: image_tokens->ny = 48
05 12:12:16 arch ollama[927]: batch_f32 size = 1
05 12:12:16 arch ollama[927]: add_text: <|end_of_image|>
05 12:12:16 arch ollama[927]: ggml_backend_cuda_buffer_type_alloc_buffer: allocating 993.11 MiB on device 0: cudaMalloc failed: out of memory
05 12:12:16 arch ollama[927]: ggml_gallocr_reserve_n_impl: failed to allocate CUDA0 buffer of size 1041346560
05 12:12:16 arch ollama[927]: SIGSEGV: segmentation violation
05 12:12:16 arch ollama[927]: PC=0x564e038541cb m=15 sigcode=1 addr=0x0
05 12:12:16 arch ollama[927]: signal arrived during cgo execution
05 12:12:16 arch ollama[927]: goroutine 40 gp=0xc000505c00 m=15 mp=0xc000101008 [syscall]:
05 12:12:16 arch ollama[927]: runtime.cgocall(0x564e03841190, 0xc0000491d8)
05 12:12:16 arch ollama[927]: runtime/cgocall.go:167 +0x4b fp=0xc0000491b0 sp=0xc000049178 pc=0x564e02aef6eb
05 12:12:16 arch ollama[927]: github.com/ollama/ollama/llama._Cfunc_mtmd_encode_chunk(0x7f834008dd90, 0x7f82817bc000)
05 12:12:16 arch ollama[927]: _cgo_gotypes.go:1079 +0x4a fp=0xc0000491d8 sp=0xc0000491b0 pc=0x564e02eaa04a
05 12:12:16 arch ollama[927]: github.com/ollama/ollama/llama.(*MtmdContext).MultimodalTokenize.func11(...)
05 12:12:16 arch ollama[927]: github.com/ollama/ollama/llama/llama.go:595
05 12:12:16 arch ollama[927]: github.com/ollama/ollama/llama.(*MtmdContext).MultimodalTokenize(0xc00034c0b8, 0003a6750, {0xc000a82000, 0x117bcb, 0
00049520?})
@rick-github
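For anyone reading along, the two failed CUDA allocations in the log above can be pulled out and converted from bytes to MiB with a few lines of Python. This is only a rough sketch for reading the log: the helper name `failed_allocations` and the regular expression are assumptions made for illustration and are not part of Ollama or llama.cpp. The sample lines are copied from the log, with the journal prefix dropped.

```python
import re

# Matches ggml lines such as:
#   ggml_gallocr_reserve_n_impl: failed to allocate CUDA0 buffer of size 540070912
# The pattern is an assumption based on the two lines seen in this log.
ALLOC_FAIL = re.compile(r"failed to allocate (\S+) buffer of size (\d+)")


def failed_allocations(log_text):
    """Return (device, size in MiB) for every failed buffer allocation."""
    results = []
    for line in log_text.splitlines():
        match = ALLOC_FAIL.search(line)
        if match:
            device = match.group(1)
            size_bytes = int(match.group(2))
            # 1 MiB = 1024 * 1024 bytes
            results.append((device, round(size_bytes / (1024 * 1024), 2)))
    return results


sample = """\
ggml_gallocr_reserve_n_impl: failed to allocate CUDA0 buffer of size 540070912
alloc_compute_meta: CPU compute buffer size = 19.11 MiB
ggml_gallocr_reserve_n_impl: failed to allocate CUDA0 buffer of size 1041346560
"""

for device, mib in failed_allocations(sample):
    print(f"{device}: {mib} MiB")
```

The byte counts convert to 515.05 MiB and 993.11 MiB, matching the sizes reported on the `ggml_backend_cuda_buffer_type_alloc_buffer` lines just before each failure.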
@rick-github commented on GitHub (Jan 5, 2026):
Full log.
@pinghe commented on GitHub (Jan 5, 2026):
12:12:12 arch ollama[927]: time=2026-01-05T12:12:12.828+08:00 level=INFO source=runner.go:464 msg="failure during GPU discovery" OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v13]" extra_envs=map[] error="failed to finish discovery before timeout"
12:12:12 arch ollama[927]: time=2026-01-05T12:12:12.828+08:00 level=WARN source=runner.go:356 msg="unable to refresh free memory, using old values"
12:12:12 arch ollama[927]: time=2026-01-05T12:12:12.829+08:00 level=INFO source=server.go:429 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 33053"
12:12:13 arch ollama[927]: llama_model_loader: loaded meta data with 33 key-value pairs and 523 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-004fd25079bfce8caa7363df20c20af820432e1dd55d22a2b0e728e79223e77a (version GGUF V3 (latest))
12:12:13 arch ollama[927]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
12:12:13 arch ollama[927]: llama_model_loader: - kv 0: general.architecture str = glm4
12:12:13 arch ollama[927]: llama_model_loader: - kv 1: general.type str = model
12:12:13 arch ollama[927]: llama_model_loader: - kv 2: general.size_label str = 9.4B
12:12:13 arch ollama[927]: llama_model_loader: - kv 3: general.license str = mit
12:12:13 arch ollama[927]: llama_model_loader: - kv 4: general.base_model.count u32 = 1
12:12:13 arch ollama[927]: llama_model_loader: - kv 5: general.base_model.0.name str = GLM 4.1V 9B Base
12:12:13 arch ollama[927]: llama_model_loader: - kv 6: general.base_model.0.organization str = Zai Org
12:12:13 arch ollama[927]: llama_model_loader: - kv 7: general.base_model.0.repo_url str = https://huggingface.co/zai-org/GLM-4....
12:12:13 arch ollama[927]: llama_model_loader: - kv 8: general.tags arr[str,2] = ["agent", "image-text-to-text"]
12:12:13 arch ollama[927]: llama_model_loader: - kv 9: general.languages arr[str,1] = ["zh"]
12:12:13 arch ollama[927]: llama_model_loader: - kv 10: glm4.block_count u32 = 40
12:12:13 arch ollama[927]: llama_model_loader: - kv 11: glm4.context_length u32 = 65536
12:12:13 arch ollama[927]: llama_model_loader: - kv 12: glm4.embedding_length u32 = 4096
12:12:13 arch ollama[927]: llama_model_loader: - kv 13: glm4.feed_forward_length u32 = 13696
12:12:13 arch ollama[927]: llama_model_loader: - kv 14: glm4.attention.head_count u32 = 32
12:12:13 arch ollama[927]: llama_model_loader: - kv 15: glm4.attention.head_count_kv u32 = 2
12:12:13 arch ollama[927]: llama_model_loader: - kv 16: glm4.rope.dimension_sections arr[i32,4] = [8, 12, 12, 0]
12:12:13 arch ollama[927]: llama_model_loader: - kv 17: glm4.rope.freq_base f32 = 10000.000000
12:12:13 arch ollama[927]: llama_model_loader: - kv 18: glm4.attention.layer_norm_rms_epsilon f32 = 0.000010
12:12:13 arch ollama[927]: llama_model_loader: - kv 19: glm4.rope.dimension_count u32 = 64
12:12:13 arch ollama[927]: llama_model_loader: - kv 20: tokenizer.ggml.model str = gpt2
12:12:13 arch ollama[927]: llama_model_loader: - kv 21: tokenizer.ggml.pre str = glm4
12:12:13 arch ollama[927]: llama_model_loader: - kv 22: tokenizer.ggml.tokens arr[str,151552] = ["!", """, "#", "$", "%", "&", "'", ...
12:12:13 arch ollama[927]: llama_model_loader: - kv 23: tokenizer.ggml.token_type arr[i32,151552] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
12:12:13 arch ollama[927]: llama_model_loader: - kv 24: tokenizer.ggml.merges arr[str,318088] = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
12:12:13 arch ollama[927]: llama_model_loader: - kv 25: tokenizer.ggml.eos_token_id u32 = 151329
12:12:13 arch ollama[927]: llama_model_loader: - kv 26: tokenizer.ggml.padding_token_id u32 = 151329
12:12:13 arch ollama[927]: llama_model_loader: - kv 27: tokenizer.ggml.eot_token_id u32 = 151336
12:12:13 arch ollama[927]: llama_model_loader: - kv 28: tokenizer.ggml.unknown_token_id u32 = 151329
12:12:13 arch ollama[927]: llama_model_loader: - kv 29: tokenizer.ggml.bos_token_id u32 = 151329
12:12:13 arch ollama[927]: llama_model_loader: - kv 30: tokenizer.chat_template str = [gMASK]\n{%- for msg in messages ...
12:12:13 arch ollama[927]: llama_model_loader: - kv 31: general.quantization_version u32 = 2
12:12:13 arch ollama[927]: llama_model_loader: - kv 32: general.file_type u32 = 15
12:12:13 arch ollama[927]: llama_model_loader: - type f32: 281 tensors
12:12:13 arch ollama[927]: llama_model_loader: - type q5_0: 20 tensors
12:12:13 arch ollama[927]: llama_model_loader: - type q8_0: 20 tensors
12:12:13 arch ollama[927]: llama_model_loader: - type q4_K: 181 tensors
12:12:13 arch ollama[927]: llama_model_loader: - type q6_K: 21 tensors
12:12:13 arch ollama[927]: print_info: file format = GGUF V3 (latest)
12:12:13 arch ollama[927]: print_info: file type = Q4_K - Medium
12:12:13 arch ollama[927]: print_info: file size = 5.73 GiB (5.24 BPW)
12:12:13 arch ollama[927]: load: special_eot_id is not in special_eog_ids - the tokenizer config may be incorrect
12:12:13 arch ollama[927]: load: printing all EOG tokens:
12:12:13 arch ollama[927]: load: - 151329 ('<|endoftext|>')
12:12:13 arch ollama[927]: load: - 151336 ('<|user|>')
12:12:13 arch ollama[927]: load: special tokens cache size = 23
12:12:13 arch ollama[927]: load: token to piece cache size = 0.9711 MB
12:12:13 arch ollama[927]: print_info: arch = glm4
12:12:13 arch ollama[927]: print_info: vocab_only = 1
12:12:13 arch ollama[927]: print_info: no_alloc = 0
12:12:13 arch ollama[927]: print_info: model type = ?B
12:12:13 arch ollama[927]: print_info: model params = 9.40 B
12:12:13 arch ollama[927]: print_info: general.name = n/a
12:12:13 arch ollama[927]: print_info: vocab type = BPE
12:12:13 arch ollama[927]: print_info: n_vocab = 151552
12:12:13 arch ollama[927]: print_info: n_merges = 318088
12:12:13 arch ollama[927]: print_info: BOS token = 151329 '<|endoftext|>'
12:12:13 arch ollama[927]: print_info: EOS token = 151329 '<|endoftext|>'
12:12:13 arch ollama[927]: print_info: EOT token = 151336 '<|user|>'
12:12:13 arch ollama[927]: print_info: UNK token = 151329 '<|endoftext|>'
12:12:13 arch ollama[927]: print_info: PAD token = 151329 '<|endoftext|>'
12:12:13 arch ollama[927]: print_info: LF token = 198 'Ċ'
12:12:13 arch ollama[927]: print_info: EOG token = 151329 '<|endoftext|>'
12:12:13 arch ollama[927]: print_info: EOG token = 151336 '<|user|>'
12:12:13 arch ollama[927]: print_info: max token length = 1024
12:12:13 arch ollama[927]: llama_model_load: vocab only - skipping tensors
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.712+08:00 level=INFO source=server.go:429 msg="starting runner" cmd="/usr/local/bin/ollama runner --model /usr/share/ollama/.ollama/models/blobs/sha256-004fd25079bfce8caa7363df20c20af820432e1dd55d22a2b0e728e79223e77a --port 33773"
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.712+08:00 level=INFO source=sched.go:443 msg="system memory" total="62.6 GiB" free="41.9 GiB" free_swap="3.2 GiB"
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.712+08:00 level=INFO source=sched.go:450 msg="gpu memory" id=GPU-8215c551-6dde-569b-490d-884f3ab7a437 library=CUDA available="6.6 GiB" free="7.1 GiB" minimum="457.0 MiB" overhead="0 B"
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.712+08:00 level=INFO source=server.go:496 msg="loading model" "model layers"=41 requested=-1
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.713+08:00 level=INFO source=device.go:240 msg="model weights" device=CUDA0 size="4.2 GiB"
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.713+08:00 level=INFO source=device.go:245 msg="model weights" device=CPU size="1.3 GiB"
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.713+08:00 level=INFO source=device.go:251 msg="kv cache" device=CUDA0 size="544.0 MiB"
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.713+08:00 level=INFO source=device.go:256 msg="kv cache" device=CPU size="96.0 MiB"
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.713+08:00 level=INFO source=device.go:262 msg="compute graph" device=CUDA0 size="1.7 GiB"
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.713+08:00 level=INFO source=device.go:272 msg="total memory" size="7.7 GiB"
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.728+08:00 level=INFO source=runner.go:965 msg="starting go runner"
12:12:13 arch ollama[927]: load_backend: loaded CPU backend from /usr/local/lib/ollama/libggml-cpu-haswell.so
12:12:13 arch ollama[927]: ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
12:12:13 arch ollama[927]: ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
12:12:13 arch ollama[927]: ggml_cuda_init: found 1 CUDA devices:
12:12:13 arch ollama[927]: Device 0: NVIDIA GeForce RTX 2060 SUPER, compute capability 7.5, VMM: yes, ID: GPU-8215c551-6dde-569b-490d-884f3ab7a437
12:12:13 arch ollama[927]: load_backend: loaded CUDA backend from /usr/local/lib/ollama/cuda_v13/libggml-cuda.so
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.805+08:00 level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 CUDA.0.ARCHS=750,800,860,870,890,900,1000,1030,1100,1200,1210 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(gcc)
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.805+08:00 level=INFO source=runner.go:1001 msg="Server listening on 127.0.0.1:33773"
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.811+08:00 level=INFO source=runner.go:895 msg=load request="{Operation:commit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Auto KvSize:16384 KvCacheType: NumThreads:12 GPULayers:34[ID:GPU-8215c551-6dde-569b-490d-884f3ab7a437 Layers:34(6..39)] MultiUserCache:false ProjectorPath:/usr/share/ollama/.ollama/models/blobs/sha256-454dd441c925c1a81c984204bb9d54feef0ef07b789c6fe1118099014ba2727d MainGPU:0 UseMmap:true}"
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.811+08:00 level=INFO source=server.go:1338 msg="waiting for llama runner to start responding"
12:12:13 arch ollama[927]: time=2026-01-05T12:12:13.811+08:00 level=INFO source=server.go:1372 msg="waiting for server to become available" status="llm server loading model"
12:12:13 arch ollama[927]: ggml_backend_cuda_device_get_memory device GPU-8215c551-6dde-569b-490d-884f3ab7a437 utilizing NVML memory reporting free: 7585398784 total: 8589934592
12:12:13 arch ollama[927]: llama_model_load_from_file_impl: using device CUDA0 (NVIDIA GeForce RTX 2060 SUPER) (0000:03:00.0) - 7234 MiB free
12:12:13 arch ollama[927]: llama_model_loader: loaded meta data with 33 key-value pairs and 523 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-004fd25079bfce8caa7363df20c20af820432e1dd55d22a2b0e728e79223e77a (version GGUF V3 (latest))
12:12:13 arch ollama[927]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
12:12:13 arch ollama[927]: llama_model_loader: - kv 0: general.architecture str = glm4
12:12:13 arch ollama[927]: llama_model_loader: - kv 1: general.type str = model
12:12:13 arch ollama[927]: llama_model_loader: - kv 2: general.size_label str = 9.4B
12:12:13 arch ollama[927]: llama_model_loader: - kv 3: general.license str = mit
12:12:13 arch ollama[927]: llama_model_loader: - kv 4: general.base_model.count u32 = 1
12:12:13 arch ollama[927]: llama_model_loader: - kv 5: general.base_model.0.name str = GLM 4.1V 9B Base
12:12:13 arch ollama[927]: llama_model_loader: - kv 6: general.base_model.0.organization str = Zai Org
12:12:13 arch ollama[927]: llama_model_loader: - kv 7: general.base_model.0.repo_url str = https://huggingface.co/zai-org/GLM-4....
12:12:13 arch ollama[927]: llama_model_loader: - kv 8: general.tags arr[str,2] = ["agent", "image-text-to-text"]
12:12:13 arch ollama[927]: llama_model_loader: - kv 9: general.languages arr[str,1] = ["zh"]
12:12:13 arch ollama[927]: llama_model_loader: - kv 10: glm4.block_count u32 = 40
12:12:13 arch ollama[927]: llama_model_loader: - kv 11: glm4.context_length u32 = 65536
12:12:13 arch ollama[927]: llama_model_loader: - kv 12: glm4.embedding_length u32 = 4096
12:12:13 arch ollama[927]: llama_model_loader: - kv 13: glm4.feed_forward_length u32 = 13696
12:12:13 arch ollama[927]: llama_model_loader: - kv 14: glm4.attention.head_count u32 = 32
12:12:13 arch ollama[927]: llama_model_loader: - kv 15: glm4.attention.head_count_kv u32 = 2
12:12:13 arch ollama[927]: llama_model_loader: - kv 16: glm4.rope.dimension_sections arr[i32,4] = [8, 12, 12, 0]
12:12:13 arch ollama[927]: llama_model_loader: - kv 17: glm4.rope.freq_base f32 = 10000.000000
12:12:13 arch ollama[927]: llama_model_loader: - kv 18: glm4.attention.layer_norm_rms_epsilon f32 = 0.000010
12:12:13 arch ollama[927]: llama_model_loader: - kv 19: glm4.rope.dimension_count u32 = 64
12:12:13 arch ollama[927]: llama_model_loader: - kv 20: tokenizer.ggml.model str = gpt2
12:12:13 arch ollama[927]: llama_model_loader: - kv 21: tokenizer.ggml.pre str = glm4
12:12:13 arch ollama[927]: llama_model_loader: - kv 22: tokenizer.ggml.tokens arr[str,151552] = ["!", """, "#", "$", "%", "&", "'", ...
12:12:13 arch ollama[927]: llama_model_loader: - kv 23: tokenizer.ggml.token_type arr[i32,151552] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
12:12:14 arch ollama[927]: llama_model_loader: - kv 24: tokenizer.ggml.merges arr[str,318088] = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
12:12:14 arch ollama[927]: llama_model_loader: - kv 25: tokenizer.ggml.eos_token_id u32 = 151329
12:12:14 arch ollama[927]: llama_model_loader: - kv 26: tokenizer.ggml.padding_token_id u32 = 151329
12:12:14 arch ollama[927]: llama_model_loader: - kv 27: tokenizer.ggml.eot_token_id u32 = 151336
12:12:14 arch ollama[927]: llama_model_loader: - kv 28: tokenizer.ggml.unknown_token_id u32 = 151329
12:12:14 arch ollama[927]: llama_model_loader: - kv 29: tokenizer.ggml.bos_token_id u32 = 151329
12:12:14 arch ollama[927]: llama_model_loader: - kv 30: tokenizer.chat_template str = [gMASK]\n{%- for msg in messages ...
12:12:14 arch ollama[927]: llama_model_loader: - kv 31: general.quantization_version u32 = 2
12:12:14 arch ollama[927]: llama_model_loader: - kv 32: general.file_type u32 = 15
12:12:14 arch ollama[927]: llama_model_loader: - type f32: 281 tensors
12:12:14 arch ollama[927]: llama_model_loader: - type q5_0: 20 tensors
12:12:14 arch ollama[927]: llama_model_loader: - type q8_0: 20 tensors
12:12:14 arch ollama[927]: llama_model_loader: - type q4_K: 181 tensors
12:12:14 arch ollama[927]: llama_model_loader: - type q6_K: 21 tensors
12:12:14 arch ollama[927]: print_info: file format = GGUF V3 (latest)
12:12:14 arch ollama[927]: print_info: file type = Q4_K - Medium
12:12:14 arch ollama[927]: print_info: file size = 5.73 GiB (5.24 BPW)
12:12:14 arch ollama[927]: load: special_eot_id is not in special_eog_ids - the tokenizer config may be incorrect
12:12:14 arch ollama[927]: load: printing all EOG tokens:
12:12:14 arch ollama[927]: load: - 151329 ('<|endoftext|>')
12:12:14 arch ollama[927]: load: - 151336 ('<|user|>')
12:12:14 arch ollama[927]: load: special tokens cache size = 23
12:12:14 arch ollama[927]: load: token to piece cache size = 0.9711 MB
12:12:14 arch ollama[927]: print_info: arch = glm4
12:12:14 arch ollama[927]: print_info: vocab_only = 0
12:12:14 arch ollama[927]: print_info: no_alloc = 0
12:12:14 arch ollama[927]: print_info: n_ctx_train = 65536
12:12:14 arch ollama[927]: print_info: n_embd = 4096
12:12:14 arch ollama[927]: print_info: n_embd_inp = 4096
12:12:14 arch ollama[927]: print_info: n_layer = 40
12:12:14 arch ollama[927]: print_info: n_head = 32
12:12:14 arch ollama[927]: print_info: n_head_kv = 2
12:12:14 arch ollama[927]: print_info: n_rot = 64
12:12:14 arch ollama[927]: print_info: n_swa = 0
12:12:14 arch ollama[927]: print_info: is_swa_any = 0
12:12:14 arch ollama[927]: print_info: n_embd_head_k = 128
12:12:14 arch ollama[927]: print_info: n_embd_head_v = 128
12:12:14 arch ollama[927]: print_info: n_gqa = 16
12:12:14 arch ollama[927]: print_info: n_embd_k_gqa = 256
12:12:14 arch ollama[927]: print_info: n_embd_v_gqa = 256
12:12:14 arch ollama[927]: print_info: f_norm_eps = 0.0e+00
12:12:14 arch ollama[927]: print_info: f_norm_rms_eps = 1.0e-05
12:12:14 arch ollama[927]: print_info: f_clamp_kqv = 0.0e+00
12:12:14 arch ollama[927]: print_info: f_max_alibi_bias = 0.0e+00
12:12:14 arch ollama[927]: print_info: f_logit_scale = 0.0e+00
12:12:14 arch ollama[927]: print_info: f_attn_scale = 0.0e+00
12:12:14 arch ollama[927]: print_info: n_ff = 13696
12:12:14 arch ollama[927]: print_info: n_expert = 0
12:12:14 arch ollama[927]: print_info: n_expert_used = 0
12:12:14 arch ollama[927]: print_info: n_expert_groups = 0
12:12:14 arch ollama[927]: print_info: n_group_used = 0
12:12:14 arch ollama[927]: print_info: causal attn = 1
12:12:14 arch ollama[927]: print_info: pooling type = 0
12:12:14 arch ollama[927]: print_info: rope type = 8
12:12:14 arch ollama[927]: print_info: rope scaling = linear
12:12:14 arch ollama[927]: print_info: freq_base_train = 10000.0
12:12:14 arch ollama[927]: print_info: freq_scale_train = 1
12:12:14 arch ollama[927]: print_info: n_ctx_orig_yarn = 65536
12:12:14 arch ollama[927]: print_info: rope_yarn_log_mul= 0.0000
12:12:14 arch ollama[927]: print_info: rope_finetuned = unknown
12:12:14 arch ollama[927]: print_info: mrope sections = [8, 12, 12, 0]
12:12:14 arch ollama[927]: print_info: model type = 9B
12:12:14 arch ollama[927]: print_info: model params = 9.40 B
12:12:14 arch ollama[927]: print_info: general.name = n/a
12:12:14 arch ollama[927]: print_info: vocab type = BPE
12:12:14 arch ollama[927]: print_info: n_vocab = 151552
12:12:14 arch ollama[927]: print_info: n_merges = 318088
12:12:14 arch ollama[927]: print_info: BOS token = 151329 '<|endoftext|>'
12:12:14 arch ollama[927]: print_info: EOS token = 151329 '<|endoftext|>'
12:12:14 arch ollama[927]: print_info: EOT token = 151336 '<|user|>'
12:12:14 arch ollama[927]: print_info: UNK token = 151329 '<|endoftext|>'
12:12:14 arch ollama[927]: print_info: PAD token = 151329 '<|endoftext|>'
12:12:14 arch ollama[927]: print_info: LF token = 198 'Ċ'
12:12:14 arch ollama[927]: print_info: EOG token = 151329 '<|endoftext|>'
12:12:14 arch ollama[927]: print_info: EOG token = 151336 '<|user|>'
12:12:14 arch ollama[927]: print_info: max token length = 1024
12:12:14 arch ollama[927]: load_tensors: loading model tensors, this can take a while... (mmap = true)
12:12:14 arch ollama[927]: load_tensors: offloading 34 repeating layers to GPU
12:12:14 arch ollama[927]: load_tensors: offloaded 34/41 layers to GPU
12:12:14 arch ollama[927]: load_tensors: CPU_Mapped model buffer size = 1617.29 MiB
12:12:14 arch ollama[927]: load_tensors: CUDA0 model buffer size = 4254.72 MiB
12:12:15 arch ollama[927]: llama_context: constructing llama_context
12:12:15 arch ollama[927]: llama_context: n_seq_max = 1
12:12:15 arch ollama[927]: llama_context: n_ctx = 16384
12:12:15 arch ollama[927]: llama_context: n_ctx_seq = 16384
12:12:15 arch ollama[927]: llama_context: n_batch = 512
12:12:15 arch ollama[927]: llama_context: n_ubatch = 512
12:12:15 arch ollama[927]: llama_context: causal_attn = 1
12:12:15 arch ollama[927]: llama_context: flash_attn = auto
12:12:15 arch ollama[927]: llama_context: kv_unified = false
12:12:15 arch ollama[927]: llama_context: freq_base = 10000.0
12:12:15 arch ollama[927]: llama_context: freq_scale = 1
12:12:15 arch ollama[927]: llama_context: n_ctx_seq (16384) < n_ctx_train (65536) -- the full capacity of the model will not be utilized
12:12:15 arch ollama[927]: llama_context: CPU output buffer size = 0.59 MiB
12:12:15 arch ollama[927]: llama_kv_cache: CPU KV buffer size = 96.00 MiB
12:12:15 arch ollama[927]: llama_kv_cache: CUDA0 KV buffer size = 544.00 MiB
12:12:15 arch ollama[927]: llama_kv_cache: size = 640.00 MiB ( 16384 cells, 40 layers, 1/1 seqs), K (f16): 320.00 MiB, V (f16): 320.00 MiB
12:12:15 arch ollama[927]: llama_context: Flash Attention was auto, set to enabled
12:12:15 arch ollama[927]: llama_context: CUDA0 compute buffer size = 789.62 MiB
12:12:15 arch ollama[927]: llama_context: CUDA_Host compute buffer size = 40.02 MiB
12:12:15 arch ollama[927]: llama_context: graph nodes = 1487
12:12:15 arch ollama[927]: llama_context: graph splits = 94 (with bs=512), 3 (with bs=1)
12:12:15 arch ollama[927]: clip_model_loader: model name:
12:12:15 arch ollama[927]: clip_model_loader: description:
12:12:15 arch ollama[927]: clip_model_loader: GGUF version: 3
12:12:15 arch ollama[927]: clip_model_loader: alignment: 32
12:12:15 arch ollama[927]: clip_model_loader: n_tensors: 182
12:12:15 arch ollama[927]: clip_model_loader: n_kv: 25
12:12:15 arch ollama[927]: clip_model_loader: has vision encoder
12:12:15 arch ollama[927]: clip_model_loader: tensor[0]: n_dims = 2, name = v.blk.0.attn_out.weight, tensor_size=2506752, offset=0, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[1]: n_dims = 2, name = v.blk.0.attn_qkv.weight, tensor_size=7520256, offset=2506752, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[2]: n_dims = 2, name = v.blk.0.ffn_down.weight, tensor_size=6684672, offset=10027008, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[3]: n_dims = 2, name = v.blk.0.ffn_gate.weight, tensor_size=6684672, offset=16711680, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[4]: n_dims = 2, name = v.blk.0.ffn_up.weight, tensor_size=6684672, offset=23396352, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[5]: n_dims = 1, name = v.blk.0.ln1.weight, tensor_size=6144, offset=30081024, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[6]: n_dims = 1, name = v.blk.0.ln2.weight, tensor_size=6144, offset=30087168, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[7]: n_dims = 2, name = v.blk.1.attn_out.weight, tensor_size=2506752, offset=30093312, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[8]: n_dims = 2, name = v.blk.1.attn_qkv.weight, tensor_size=7520256, offset=32600064, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[9]: n_dims = 2, name = v.blk.1.ffn_down.weight, tensor_size=6684672, offset=40120320, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[10]: n_dims = 2, name = v.blk.1.ffn_gate.weight, tensor_size=6684672, offset=46804992, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[11]: n_dims = 2, name = v.blk.1.ffn_up.weight, tensor_size=6684672, offset=53489664, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[12]: n_dims = 1, name = v.blk.1.ln1.weight, tensor_size=6144, offset=60174336, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[13]: n_dims = 1, name = v.blk.1.ln2.weight, tensor_size=6144, offset=60180480, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[14]: n_dims = 2, name = v.blk.10.attn_out.weight, tensor_size=2506752, offset=60186624, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[15]: n_dims = 2, name = v.blk.10.attn_qkv.weight, tensor_size=7520256, offset=62693376, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[16]: n_dims = 2, name = v.blk.10.ffn_down.weight, tensor_size=6684672, offset=70213632, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[17]: n_dims = 2, name = v.blk.10.ffn_gate.weight, tensor_size=6684672, offset=76898304, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[18]: n_dims = 2, name = v.blk.10.ffn_up.weight, tensor_size=6684672, offset=83582976, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[19]: n_dims = 1, name = v.blk.10.ln1.weight, tensor_size=6144, offset=90267648, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[20]: n_dims = 1, name = v.blk.10.ln2.weight, tensor_size=6144, offset=90273792, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[21]: n_dims = 2, name = v.blk.11.attn_out.weight, tensor_size=2506752, offset=90279936, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[22]: n_dims = 2, name = v.blk.11.attn_qkv.weight, tensor_size=7520256, offset=92786688, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[23]: n_dims = 2, name = v.blk.11.ffn_down.weight, tensor_size=6684672, offset=100306944, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[24]: n_dims = 2, name = v.blk.11.ffn_gate.weight, tensor_size=6684672, offset=106991616, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[25]: n_dims = 2, name = v.blk.11.ffn_up.weight, tensor_size=6684672, offset=113676288, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[26]: n_dims = 1, name = v.blk.11.ln1.weight, tensor_size=6144, offset=120360960, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[27]: n_dims = 1, name = v.blk.11.ln2.weight, tensor_size=6144, offset=120367104, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[28]: n_dims = 2, name = v.blk.12.attn_out.weight, tensor_size=2506752, offset=120373248, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[29]: n_dims = 2, name = v.blk.12.attn_qkv.weight, tensor_size=7520256, offset=122880000, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[30]: n_dims = 2, name = v.blk.12.ffn_down.weight, tensor_size=6684672, offset=130400256, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[31]: n_dims = 2, name = v.blk.12.ffn_gate.weight, tensor_size=6684672, offset=137084928, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[32]: n_dims = 2, name = v.blk.12.ffn_up.weight, tensor_size=6684672, offset=143769600, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[33]: n_dims = 1, name = v.blk.12.ln1.weight, tensor_size=6144, offset=150454272, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[34]: n_dims = 1, name = v.blk.12.ln2.weight, tensor_size=6144, offset=150460416, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[35]: n_dims = 2, name = v.blk.13.attn_out.weight, tensor_size=2506752, offset=150466560, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[36]: n_dims = 2, name = v.blk.13.attn_qkv.weight, tensor_size=7520256, offset=152973312, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[37]: n_dims = 2, name = v.blk.13.ffn_down.weight, tensor_size=6684672, offset=160493568, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[38]: n_dims = 2, name = v.blk.13.ffn_gate.weight, tensor_size=6684672, offset=167178240, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[39]: n_dims = 2, name = v.blk.13.ffn_up.weight, tensor_size=6684672, offset=173862912, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[40]: n_dims = 1, name = v.blk.13.ln1.weight, tensor_size=6144, offset=180547584, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[41]: n_dims = 1, name = v.blk.13.ln2.weight, tensor_size=6144, offset=180553728, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[42]: n_dims = 2, name = v.blk.14.attn_out.weight, tensor_size=2506752, offset=180559872, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[43]: n_dims = 2, name = v.blk.14.attn_qkv.weight, tensor_size=7520256, offset=183066624, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[44]: n_dims = 2, name = v.blk.14.ffn_down.weight, tensor_size=6684672, offset=190586880, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[45]: n_dims = 2, name = v.blk.14.ffn_gate.weight, tensor_size=6684672, offset=197271552, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[46]: n_dims = 2, name = v.blk.14.ffn_up.weight, tensor_size=6684672, offset=203956224, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[47]: n_dims = 1, name = v.blk.14.ln1.weight, tensor_size=6144, offset=210640896, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[48]: n_dims = 1, name = v.blk.14.ln2.weight, tensor_size=6144, offset=210647040, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[49]: n_dims = 2, name = v.blk.15.attn_out.weight, tensor_size=2506752, offset=210653184, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[50]: n_dims = 2, name = v.blk.15.attn_qkv.weight, tensor_size=7520256, offset=213159936, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[51]: n_dims = 2, name = v.blk.15.ffn_down.weight, tensor_size=6684672, offset=220680192, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[52]: n_dims = 2, name = v.blk.15.ffn_gate.weight, tensor_size=6684672, offset=227364864, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[53]: n_dims = 2, name = v.blk.15.ffn_up.weight, tensor_size=6684672, offset=234049536, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[54]: n_dims = 1, name = v.blk.15.ln1.weight, tensor_size=6144, offset=240734208, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[55]: n_dims = 1, name = v.blk.15.ln2.weight, tensor_size=6144, offset=240740352, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[56]: n_dims = 2, name = v.blk.16.attn_out.weight, tensor_size=2506752, offset=240746496, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[57]: n_dims = 2, name = v.blk.16.attn_qkv.weight, tensor_size=7520256, offset=243253248, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[58]: n_dims = 2, name = v.blk.16.ffn_down.weight, tensor_size=6684672, offset=250773504, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[59]: n_dims = 2, name = v.blk.16.ffn_gate.weight, tensor_size=6684672, offset=257458176, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[60]: n_dims = 2, name = v.blk.16.ffn_up.weight, tensor_size=6684672, offset=264142848, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[61]: n_dims = 1, name = v.blk.16.ln1.weight, tensor_size=6144, offset=270827520, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[62]: n_dims = 1, name = v.blk.16.ln2.weight, tensor_size=6144, offset=270833664, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[63]: n_dims = 2, name = v.blk.17.attn_out.weight, tensor_size=2506752, offset=270839808, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[64]: n_dims = 2, name = v.blk.17.attn_qkv.weight, tensor_size=7520256, offset=273346560, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[65]: n_dims = 2, name = v.blk.17.ffn_down.weight, tensor_size=6684672, offset=280866816, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[66]: n_dims = 2, name = v.blk.17.ffn_gate.weight, tensor_size=6684672, offset=287551488, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[67]: n_dims = 2, name = v.blk.17.ffn_up.weight, tensor_size=6684672, offset=294236160, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[68]: n_dims = 1, name = v.blk.17.ln1.weight, tensor_size=6144, offset=300920832, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[69]: n_dims = 1, name = v.blk.17.ln2.weight, tensor_size=6144, offset=300926976, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[70]: n_dims = 2, name = v.blk.18.attn_out.weight, tensor_size=2506752, offset=300933120, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[71]: n_dims = 2, name = v.blk.18.attn_qkv.weight, tensor_size=7520256, offset=303439872, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[72]: n_dims = 2, name = v.blk.18.ffn_down.weight, tensor_size=6684672, offset=310960128, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[73]: n_dims = 2, name = v.blk.18.ffn_gate.weight, tensor_size=6684672, offset=317644800, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[74]: n_dims = 2, name = v.blk.18.ffn_up.weight, tensor_size=6684672, offset=324329472, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[75]: n_dims = 1, name = v.blk.18.ln1.weight, tensor_size=6144, offset=331014144, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[76]: n_dims = 1, name = v.blk.18.ln2.weight, tensor_size=6144, offset=331020288, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[77]: n_dims = 2, name = v.blk.19.attn_out.weight, tensor_size=2506752, offset=331026432, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[78]: n_dims = 2, name = v.blk.19.attn_qkv.weight, tensor_size=7520256, offset=333533184, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[79]: n_dims = 2, name = v.blk.19.ffn_down.weight, tensor_size=6684672, offset=341053440, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[80]: n_dims = 2, name = v.blk.19.ffn_gate.weight, tensor_size=6684672, offset=347738112, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[81]: n_dims = 2, name = v.blk.19.ffn_up.weight, tensor_size=6684672, offset=354422784, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[82]: n_dims = 1, name = v.blk.19.ln1.weight, tensor_size=6144, offset=361107456, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[83]: n_dims = 1, name = v.blk.19.ln2.weight, tensor_size=6144, offset=361113600, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[84]: n_dims = 2, name = v.blk.2.attn_out.weight, tensor_size=2506752, offset=361119744, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[85]: n_dims = 2, name = v.blk.2.attn_qkv.weight, tensor_size=7520256, offset=363626496, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[86]: n_dims = 2, name = v.blk.2.ffn_down.weight, tensor_size=6684672, offset=371146752, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[87]: n_dims = 2, name = v.blk.2.ffn_gate.weight, tensor_size=6684672, offset=377831424, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[88]: n_dims = 2, name = v.blk.2.ffn_up.weight, tensor_size=6684672, offset=384516096, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[89]: n_dims = 1, name = v.blk.2.ln1.weight, tensor_size=6144, offset=391200768, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[90]: n_dims = 1, name = v.blk.2.ln2.weight, tensor_size=6144, offset=391206912, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[91]: n_dims = 2, name = v.blk.20.attn_out.weight, tensor_size=2506752, offset=391213056, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[92]: n_dims = 2, name = v.blk.20.attn_qkv.weight, tensor_size=7520256, offset=393719808, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[93]: n_dims = 2, name = v.blk.20.ffn_down.weight, tensor_size=6684672, offset=401240064, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[94]: n_dims = 2, name = v.blk.20.ffn_gate.weight, tensor_size=6684672, offset=407924736, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[95]: n_dims = 2, name = v.blk.20.ffn_up.weight, tensor_size=6684672, offset=414609408, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[96]: n_dims = 1, name = v.blk.20.ln1.weight, tensor_size=6144, offset=421294080, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[97]: n_dims = 1, name = v.blk.20.ln2.weight, tensor_size=6144, offset=421300224, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[98]: n_dims = 2, name = v.blk.21.attn_out.weight, tensor_size=2506752, offset=421306368, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[99]: n_dims = 2, name = v.blk.21.attn_qkv.weight, tensor_size=7520256, offset=423813120, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[100]: n_dims = 2, name = v.blk.21.ffn_down.weight, tensor_size=6684672, offset=431333376, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[101]: n_dims = 2, name = v.blk.21.ffn_gate.weight, tensor_size=6684672, offset=438018048, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[102]: n_dims = 2, name = v.blk.21.ffn_up.weight, tensor_size=6684672, offset=444702720, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[103]: n_dims = 1, name = v.blk.21.ln1.weight, tensor_size=6144, offset=451387392, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[104]: n_dims = 1, name = v.blk.21.ln2.weight, tensor_size=6144, offset=451393536, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[105]: n_dims = 2, name = v.blk.22.attn_out.weight, tensor_size=2506752, offset=451399680, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[106]: n_dims = 2, name = v.blk.22.attn_qkv.weight, tensor_size=7520256, offset=453906432, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[107]: n_dims = 2, name = v.blk.22.ffn_down.weight, tensor_size=6684672, offset=461426688, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[108]: n_dims = 2, name = v.blk.22.ffn_gate.weight, tensor_size=6684672, offset=468111360, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[109]: n_dims = 2, name = v.blk.22.ffn_up.weight, tensor_size=6684672, offset=474796032, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[110]: n_dims = 1, name = v.blk.22.ln1.weight, tensor_size=6144, offset=481480704, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[111]: n_dims = 1, name = v.blk.22.ln2.weight, tensor_size=6144, offset=481486848, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[112]: n_dims = 2, name = v.blk.23.attn_out.weight, tensor_size=2506752, offset=481492992, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[113]: n_dims = 2, name = v.blk.23.attn_qkv.weight, tensor_size=7520256, offset=483999744, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[114]: n_dims = 2, name = v.blk.23.ffn_down.weight, tensor_size=6684672, offset=491520000, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[115]: n_dims = 2, name = v.blk.23.ffn_gate.weight, tensor_size=6684672, offset=498204672, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[116]: n_dims = 2, name = v.blk.23.ffn_up.weight, tensor_size=6684672, offset=504889344, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[117]: n_dims = 1, name = v.blk.23.ln1.weight, tensor_size=6144, offset=511574016, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[118]: n_dims = 1, name = v.blk.23.ln2.weight, tensor_size=6144, offset=511580160, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[119]: n_dims = 2, name = v.blk.3.attn_out.weight, tensor_size=2506752, offset=511586304, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[120]: n_dims = 2, name = v.blk.3.attn_qkv.weight, tensor_size=7520256, offset=514093056, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[121]: n_dims = 2, name = v.blk.3.ffn_down.weight, tensor_size=6684672, offset=521613312, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[122]: n_dims = 2, name = v.blk.3.ffn_gate.weight, tensor_size=6684672, offset=528297984, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[123]: n_dims = 2, name = v.blk.3.ffn_up.weight, tensor_size=6684672, offset=534982656, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[124]: n_dims = 1, name = v.blk.3.ln1.weight, tensor_size=6144, offset=541667328, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[125]: n_dims = 1, name = v.blk.3.ln2.weight, tensor_size=6144, offset=541673472, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[126]: n_dims = 2, name = v.blk.4.attn_out.weight, tensor_size=2506752, offset=541679616, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[127]: n_dims = 2, name = v.blk.4.attn_qkv.weight, tensor_size=7520256, offset=544186368, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[128]: n_dims = 2, name = v.blk.4.ffn_down.weight, tensor_size=6684672, offset=551706624, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[129]: n_dims = 2, name = v.blk.4.ffn_gate.weight, tensor_size=6684672, offset=558391296, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[130]: n_dims = 2, name = v.blk.4.ffn_up.weight, tensor_size=6684672, offset=565075968, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[131]: n_dims = 1, name = v.blk.4.ln1.weight, tensor_size=6144, offset=571760640, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[132]: n_dims = 1, name = v.blk.4.ln2.weight, tensor_size=6144, offset=571766784, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[133]: n_dims = 2, name = v.blk.5.attn_out.weight, tensor_size=2506752, offset=571772928, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[134]: n_dims = 2, name = v.blk.5.attn_qkv.weight, tensor_size=7520256, offset=574279680, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[135]: n_dims = 2, name = v.blk.5.ffn_down.weight, tensor_size=6684672, offset=581799936, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[136]: n_dims = 2, name = v.blk.5.ffn_gate.weight, tensor_size=6684672, offset=588484608, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[137]: n_dims = 2, name = v.blk.5.ffn_up.weight, tensor_size=6684672, offset=595169280, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[138]: n_dims = 1, name = v.blk.5.ln1.weight, tensor_size=6144, offset=601853952, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[139]: n_dims = 1, name = v.blk.5.ln2.weight, tensor_size=6144, offset=601860096, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[140]: n_dims = 2, name = v.blk.6.attn_out.weight, tensor_size=2506752, offset=601866240, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[141]: n_dims = 2, name = v.blk.6.attn_qkv.weight, tensor_size=7520256, offset=604372992, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[142]: n_dims = 2, name = v.blk.6.ffn_down.weight, tensor_size=6684672, offset=611893248, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[143]: n_dims = 2, name = v.blk.6.ffn_gate.weight, tensor_size=6684672, offset=618577920, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[144]: n_dims = 2, name = v.blk.6.ffn_up.weight, tensor_size=6684672, offset=625262592, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[145]: n_dims = 1, name = v.blk.6.ln1.weight, tensor_size=6144, offset=631947264, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[146]: n_dims = 1, name = v.blk.6.ln2.weight, tensor_size=6144, offset=631953408, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[147]: n_dims = 2, name = v.blk.7.attn_out.weight, tensor_size=2506752, offset=631959552, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[148]: n_dims = 2, name = v.blk.7.attn_qkv.weight, tensor_size=7520256, offset=634466304, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[149]: n_dims = 2, name = v.blk.7.ffn_down.weight, tensor_size=6684672, offset=641986560, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[150]: n_dims = 2, name = v.blk.7.ffn_gate.weight, tensor_size=6684672, offset=648671232, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[151]: n_dims = 2, name = v.blk.7.ffn_up.weight, tensor_size=6684672, offset=655355904, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[152]: n_dims = 1, name = v.blk.7.ln1.weight, tensor_size=6144, offset=662040576, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[153]: n_dims = 1, name = v.blk.7.ln2.weight, tensor_size=6144, offset=662046720, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[154]: n_dims = 2, name = v.blk.8.attn_out.weight, tensor_size=2506752, offset=662052864, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[155]: n_dims = 2, name = v.blk.8.attn_qkv.weight, tensor_size=7520256, offset=664559616, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[156]: n_dims = 2, name = v.blk.8.ffn_down.weight, tensor_size=6684672, offset=672079872, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[157]: n_dims = 2, name = v.blk.8.ffn_gate.weight, tensor_size=6684672, offset=678764544, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[158]: n_dims = 2, name = v.blk.8.ffn_up.weight, tensor_size=6684672, offset=685449216, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[159]: n_dims = 1, name = v.blk.8.ln1.weight, tensor_size=6144, offset=692133888, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[160]: n_dims = 1, name = v.blk.8.ln2.weight, tensor_size=6144, offset=692140032, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[161]: n_dims = 2, name = v.blk.9.attn_out.weight, tensor_size=2506752, offset=692146176, shape:[1536, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[162]: n_dims = 2, name = v.blk.9.attn_qkv.weight, tensor_size=7520256, offset=694652928, shape:[1536, 4608, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[163]: n_dims = 2, name = v.blk.9.ffn_down.weight, tensor_size=6684672, offset=702173184, shape:[4096, 1536, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[164]: n_dims = 2, name = v.blk.9.ffn_gate.weight, tensor_size=6684672, offset=708857856, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[165]: n_dims = 2, name = v.blk.9.ffn_up.weight, tensor_size=6684672, offset=715542528, shape:[1536, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[166]: n_dims = 1, name = v.blk.9.ln1.weight, tensor_size=6144, offset=722227200, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[167]: n_dims = 1, name = v.blk.9.ln2.weight, tensor_size=6144, offset=722233344, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[168]: n_dims = 1, name = mm.patch_merger.bias, tensor_size=16384, offset=722239488, shape:[4096, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[169]: n_dims = 4, name = mm.patch_merger.weight, tensor_size=50331648, offset=722255872, shape:[2, 2, 1536, 4096], type = f16
12:12:15 arch ollama[927]: clip_model_loader: tensor[170]: n_dims = 2, name = v.position_embd.weight, tensor_size=3538944, offset=772587520, shape:[1536, 576, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[171]: n_dims = 2, name = mm.down.weight, tensor_size=59604992, offset=776126464, shape:[13696, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[172]: n_dims = 2, name = mm.gate.weight, tensor_size=59604992, offset=835731456, shape:[4096, 13696, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[173]: n_dims = 1, name = mm.post_norm.bias, tensor_size=16384, offset=895336448, shape:[4096, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[174]: n_dims = 1, name = mm.post_norm.weight, tensor_size=16384, offset=895352832, shape:[4096, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[175]: n_dims = 2, name = mm.model.fc.weight, tensor_size=17825792, offset=895369216, shape:[4096, 4096, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[176]: n_dims = 2, name = mm.up.weight, tensor_size=59604992, offset=913195008, shape:[4096, 13696, 1, 1], type = q8_0
12:12:15 arch ollama[927]: clip_model_loader: tensor[177]: n_dims = 1, name = v.patch_embd.bias, tensor_size=6144, offset=972800000, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[178]: n_dims = 4, name = v.patch_embd.weight, tensor_size=3612672, offset=972806144, shape:[14, 14, 3, 1536], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[179]: n_dims = 4, name = v.patch_embd.weight.1, tensor_size=3612672, offset=976418816, shape:[14, 14, 3, 1536], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[180]: n_dims = 1, name = v.norm_embd.weight, tensor_size=6144, offset=980031488, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_model_loader: tensor[181]: n_dims = 1, name = v.post_ln.weight, tensor_size=6144, offset=980037632, shape:[1536, 1, 1, 1], type = f32
12:12:15 arch ollama[927]: clip_ctx: CLIP using CUDA0 backend
12:12:15 arch ollama[927]: load_hparams: projector: glm4v
12:12:15 arch ollama[927]: load_hparams: n_embd: 1536
12:12:15 arch ollama[927]: load_hparams: n_head: 12
12:12:15 arch ollama[927]: load_hparams: n_ff: 13696
12:12:15 arch ollama[927]: load_hparams: n_layer: 24
12:12:15 arch ollama[927]: load_hparams: ffn_op: silu
12:12:15 arch ollama[927]: load_hparams: projection_dim: 4096
12:12:15 arch ollama[927]: --- vision hparams ---
12:12:15 arch ollama[927]: load_hparams: image_size: 336
12:12:15 arch ollama[927]: load_hparams: patch_size: 14
12:12:15 arch ollama[927]: load_hparams: has_llava_proj: 0
12:12:15 arch ollama[927]: load_hparams: minicpmv_version: 0
12:12:15 arch ollama[927]: load_hparams: n_merge: 2
12:12:15 arch ollama[927]: load_hparams: n_wa_pattern: 0
12:12:15 arch ollama[927]: load_hparams: image_min_pixels: 6272
12:12:15 arch ollama[927]: load_hparams: image_max_pixels: 3211264
12:12:15 arch ollama[927]: load_hparams: model size: 934.64 MiB
12:12:15 arch ollama[927]: load_hparams: metadata size: 0.06 MiB
12:12:16 arch ollama[927]: load_tensors: loaded 182 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-454dd441c925c1a81c984204bb9d54feef0ef07b789c6fe1118099014ba2727d
12:12:16 arch ollama[927]: warmup: warmup with image size = 1288 x 1288
12:12:16 arch ollama[927]: ggml_backend_cuda_buffer_type_alloc_buffer: allocating 515.05 MiB on device 0: cudaMalloc failed: out of memory
12:12:16 arch ollama[927]: ggml_gallocr_reserve_n_impl: failed to allocate CUDA0 buffer of size 540070912
12:12:16 arch ollama[927]: alloc_compute_meta: CPU compute buffer size = 19.11 MiB
12:12:16 arch ollama[927]: alloc_compute_meta: graph splits = 1, nodes = 632
12:12:16 arch ollama[927]: warmup: flash attention is enabled
12:12:16 arch ollama[927]: time=2026-01-05T12:12:16.324+08:00 level=INFO source=server.go:1376 msg="llama runner started in 2.61 seconds"
12:12:16 arch ollama[927]: time=2026-01-05T12:12:16.324+08:00 level=INFO source=sched.go:517 msg="loaded runners" count=1
12:12:16 arch ollama[927]: time=2026-01-05T12:12:16.324+08:00 level=INFO source=server.go:1338 msg="waiting for llama runner to start responding"
12:12:16 arch ollama[927]: time=2026-01-05T12:12:16.325+08:00 level=INFO source=server.go:1376 msg="llama runner started in 2.61 seconds"
12:12:16 arch ollama[927]: add_text: <|begin_of_image|>
12:12:16 arch ollama[927]: image_tokens->nx = 85
12:12:16 arch ollama[927]: image_tokens->ny = 48
12:12:16 arch ollama[927]: batch_f32 size = 1
12:12:16 arch ollama[927]: add_text: <|end_of_image|>
12:12:16 arch ollama[927]: ggml_backend_cuda_buffer_type_alloc_buffer: allocating 993.11 MiB on device 0: cudaMalloc failed: out of memory
12:12:16 arch ollama[927]: ggml_gallocr_reserve_n_impl: failed to allocate CUDA0 buffer of size 1041346560
12:12:16 arch ollama[927]: SIGSEGV: segmentation violation
12:12:16 arch ollama[927]: PC=0x564e038541cb m=15 sigcode=1 addr=0x0
12:12:16 arch ollama[927]: signal arrived during cgo execution
12:12:16 arch ollama[927]: goroutine 40 gp=0xc000505c00 m=15 mp=0xc000101008 [syscall]:
12:12:16 arch ollama[927]: runtime.cgocall(0x564e03841190, 0xc0000491d8)
12:12:16 arch ollama[927]: runtime/cgocall.go:167 +0x4b fp=0xc0000491b0 sp=0xc000049178 pc=0x564e02aef6eb
12:12:16 arch ollama[927]: github.com/ollama/ollama/llama._Cfunc_mtmd_encode_chunk(0x7f834008dd90, 0x7f82817bc000)
12:12:16 arch ollama[927]: _cgo_gotypes.go:1079 +0x4a fp=0xc0000491d8 sp=0xc0000491b0 pc=0x564e02eaa04a
12:12:16 arch ollama[927]: github.com/ollama/ollama/llama.(*MtmdContext).MultimodalTokenize.func11(...)
12:12:16 arch ollama[927]: github.com/ollama/ollama/llama/llama.go:595
12:12:16 arch ollama[927]: github.com/ollama/ollama/llama.(*MtmdContext).MultimodalTokenize(0xc00034c0b8, 0xc0003a6750, {0xc000a82000, 0x117bcb, 0xc000049520?})
12:12:16 arch ollama[927]: github.com/ollama/ollama/llama/llama.go:595 +0x6c5 fp=0xc000049490 sp=0xc0000491d8 pc=0x564e02eaf085
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner.(*ImageContext).MultimodalTokenize(0xc00063c240, 0xc0003a6750, {0xc000a82000, 0x117bcb, 0x117bcd})
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner/image.go:76 +0x145 fp=0xc000049530 sp=0xc000049490 pc=0x564e02f61365
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner.(*Server).inputs(0xc00013f900, {0xc000042440?, 0x3d?}, {0xc00061c080, 0x1, 0x160?})
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner/runner.go:236 +0x2c6 fp=0xc000049698 sp=0xc000049530 pc=0x564e02f62766
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner.(*Server).NewSequence(0xc00013f900, {0xc000042440, 0x3d}, {0xc00061c080, 0x1, 0x1}, {0x28000, {0xc000044610, 0x1, 0x1}, ...})
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner/runner.go:126 +0x8d fp=0xc000049838 sp=0xc000049698 pc=0x564e02f61d2d
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner.(*Server).completion(0xc00013f900, {0x564e040a5fa0, 0xc0003c21c0}, 0xc0003b2280)
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner/runner.go:659 +0x5f9 fp=0xc000049ac0 sp=0xc000049838 pc=0x564e02f64d99
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner.(*Server).completion-fm({0x564e040a5fa0?, 0xc0003c21c0?}, 0xc000049b40?)
12:12:16 arch ollama[927]: <autogenerated>:1 +0x36 fp=0xc000049af0 sp=0xc000049ac0 pc=0x564e02f687f6
12:12:16 arch ollama[927]: net/http.HandlerFunc.ServeHTTP(0xc000544180?, {0x564e040a5fa0?, 0xc0003c21c0?}, 0xc000049b60?)
12:12:16 arch ollama[927]: net/http/server.go:2294 +0x29 fp=0xc000049b18 sp=0xc000049af0 pc=0x564e02df20e9
12:12:16 arch ollama[927]: net/http.(*ServeMux).ServeHTTP(0x564e02a978c5?, {0x564e040a5fa0, 0xc0003c21c0}, 0xc0003b2280)
12:12:16 arch ollama[927]: net/http/server.go:2822 +0x1c4 fp=0xc000049b68 sp=0xc000049b18 pc=0x564e02df3fe4
12:12:16 arch ollama[927]: net/http.serverHandler.ServeHTTP({0x564e040a2590?}, {0x564e040a5fa0?, 0xc0003c21c0?}, 0x1?)
12:12:16 arch ollama[927]: net/http/server.go:3301 +0x8e fp=0xc000049b98 sp=0xc000049b68 pc=0x564e02e11a6e
12:12:16 arch ollama[927]: net/http.(*conn).serve(0xc0001463f0, {0x564e040a83d8, 0xc000144b10})
12:12:16 arch ollama[927]: net/http/server.go:2102 +0x625 fp=0xc000049fb8 sp=0xc000049b98 pc=0x564e02df05e5
12:12:16 arch ollama[927]: net/http.(*Server).Serve.gowrap3()
12:12:16 arch ollama[927]: net/http/server.go:3454 +0x28 fp=0xc000049fe0 sp=0xc000049fb8 pc=0x564e02df5ea8
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000049fe8 sp=0xc000049fe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by net/http.(*Server).Serve in goroutine 1
12:12:16 arch ollama[927]: net/http/server.go:3454 +0x485
12:12:16 arch ollama[927]: goroutine 1 gp=0xc000002380 m=nil [IO wait]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00018f790 sp=0xc00018f770 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.netpollblock(0xc0005197e0?, 0x2a8c2a6?, 0x4e?)
12:12:16 arch ollama[927]: runtime/netpoll.go:575 +0xf7 fp=0xc00018f7c8 sp=0xc00018f790 pc=0x564e02ab7e97
12:12:16 arch ollama[927]: internal/poll.runtime_pollWait(0x7f83f643aeb0, 0x72)
12:12:16 arch ollama[927]: runtime/netpoll.go:351 +0x85 fp=0xc00018f7e8 sp=0xc00018f7c8 pc=0x564e02af1d85
12:12:16 arch ollama[927]: internal/poll.(*pollDesc).wait(0xc000701480?, 0x900000036?, 0x0)
12:12:16 arch ollama[927]: internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00018f810 sp=0xc00018f7e8 pc=0x564e02b79f07
12:12:16 arch ollama[927]: internal/poll.(*pollDesc).waitRead(...)
12:12:16 arch ollama[927]: internal/poll/fd_poll_runtime.go:89
12:12:16 arch ollama[927]: internal/poll.(*FD).Accept(0xc000701480)
12:12:16 arch ollama[927]: internal/poll/fd_unix.go:620 +0x295 fp=0xc00018f8b8 sp=0xc00018f810 pc=0x564e02b7f2d5
12:12:16 arch ollama[927]: net.(*netFD).accept(0xc000701480)
12:12:16 arch ollama[927]: net/fd_unix.go:172 +0x29 fp=0xc00018f970 sp=0xc00018f8b8 pc=0x564e02bf21a9
12:12:16 arch ollama[927]: net.(*TCPListener).accept(0xc000531200)
12:12:16 arch ollama[927]: net/tcpsock_posix.go:159 +0x1b fp=0xc00018f9c0 sp=0xc00018f970 pc=0x564e02c07b5b
12:12:16 arch ollama[927]: net.(*TCPListener).Accept(0xc000531200)
12:12:16 arch ollama[927]: net/tcpsock.go:380 +0x30 fp=0xc00018f9f0 sp=0xc00018f9c0 pc=0x564e02c06a10
12:12:16 arch ollama[927]: net/http.(*onceCloseListener).Accept(0xc0001463f0?)
12:12:16 arch ollama[927]: <autogenerated>:1 +0x24 fp=0xc00018fa08 sp=0xc00018f9f0 pc=0x564e02e1e1e4
12:12:16 arch ollama[927]: net/http.(*Server).Serve(0xc000213700, {0x564e040a5dc0, 0xc000531200})
12:12:16 arch ollama[927]: net/http/server.go:3424 +0x30c fp=0xc00018fb38 sp=0xc00018fa08 pc=0x564e02df5aac
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner.Execute({0xc000034260, 0x4, 0x4})
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner/runner.go:1002 +0x8f5 fp=0xc00018fd08 sp=0xc00018fb38 pc=0x564e02f68175
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner.Execute({0xc000034250?, 0x0?, 0x0?})
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/runner.go:22 +0xd4 fp=0xc00018fd30 sp=0xc00018fd08 pc=0x564e03013cf4
12:12:16 arch ollama[927]: github.com/ollama/ollama/cmd.NewCLI.func2(0xc000213400?, {0x564e03b880ad?, 0x4?, 0x564e03b880b1?})
12:12:16 arch ollama[927]: github.com/ollama/ollama/cmd/cmd.go:1841 +0x45 fp=0xc00018fd58 sp=0xc00018fd30 pc=0x564e037d0f25
12:12:16 arch ollama[927]: github.com/spf13/cobra.(*Command).execute(0xc000149508, {0xc000531000, 0x4, 0x4})
12:12:16 arch ollama[927]: github.com/spf13/cobra@v1.7.0/command.go:940 +0x85c fp=0xc00018fe78 sp=0xc00018fd58 pc=0x564e02c6b7fc
12:12:16 arch ollama[927]: github.com/spf13/cobra.(*Command).ExecuteC(0xc000126908)
12:12:16 arch ollama[927]: github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5 fp=0xc00018ff30 sp=0xc00018fe78 pc=0x564e02c6c045
12:12:16 arch ollama[927]: github.com/spf13/cobra.(*Command).Execute(...)
12:12:16 arch ollama[927]: github.com/spf13/cobra@v1.7.0/command.go:992
12:12:16 arch ollama[927]: github.com/spf13/cobra.(*Command).ExecuteContext(...)
12:12:16 arch ollama[927]: github.com/spf13/cobra@v1.7.0/command.go:985
12:12:16 arch ollama[927]: main.main()
12:12:16 arch ollama[927]: github.com/ollama/ollama/main.go:12 +0x4d fp=0xc00018ff50 sp=0xc00018ff30 pc=0x564e037d1a0d
12:12:16 arch ollama[927]: runtime.main()
12:12:16 arch ollama[927]: runtime/proc.go:283 +0x29d fp=0xc00018ffe0 sp=0xc00018ff50 pc=0x564e02abf51d
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00018ffe8 sp=0xc00018ffe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: goroutine 2 gp=0xc000002e00 m=nil [force gc (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00008efa8 sp=0xc00008ef88 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.goparkunlock(...)
12:12:16 arch ollama[927]: runtime/proc.go:441
12:12:16 arch ollama[927]: runtime.forcegchelper()
12:12:16 arch ollama[927]: runtime/proc.go:348 +0xb8 fp=0xc00008efe0 sp=0xc00008efa8 pc=0x564e02abf858
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00008efe8 sp=0xc00008efe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.init.7 in goroutine 1
12:12:16 arch ollama[927]: runtime/proc.go:336 +0x1a
12:12:16 arch ollama[927]: goroutine 3 gp=0xc000003340 m=nil [GC sweep wait]:
12:12:16 arch ollama[927]: runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00008f780 sp=0xc00008f760 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.goparkunlock(...)
12:12:16 arch ollama[927]: runtime/proc.go:441
12:12:16 arch ollama[927]: runtime.bgsweep(0xc0000ba000)
12:12:16 arch ollama[927]: runtime/mgcsweep.go:316 +0xdf fp=0xc00008f7c8 sp=0xc00008f780 pc=0x564e02aa9fff
12:12:16 arch ollama[927]: runtime.gcenable.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:204 +0x25 fp=0xc00008f7e0 sp=0xc00008f7c8 pc=0x564e02a9e3e5
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00008f7e8 sp=0xc00008f7e0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcenable in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:204 +0x66
12:12:16 arch ollama[927]: goroutine 4 gp=0xc000003500 m=nil [GC scavenge wait]:
12:12:16 arch ollama[927]: runtime.gopark(0x10000?, 0x564e03d592a0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00008ff78 sp=0xc00008ff58 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.goparkunlock(...)
12:12:16 arch ollama[927]: runtime/proc.go:441
12:12:16 arch ollama[927]: runtime.(*scavengerState).park(0x564e0497c280)
12:12:16 arch ollama[927]: runtime/mgcscavenge.go:425 +0x49 fp=0xc00008ffa8 sp=0xc00008ff78 pc=0x564e02aa7a49
12:12:16 arch ollama[927]: runtime.bgscavenge(0xc0000ba000)
12:12:16 arch ollama[927]: runtime/mgcscavenge.go:658 +0x59 fp=0xc00008ffc8 sp=0xc00008ffa8 pc=0x564e02aa7fd9
12:12:16 arch ollama[927]: runtime.gcenable.gowrap2()
12:12:16 arch ollama[927]: runtime/mgc.go:205 +0x25 fp=0xc00008ffe0 sp=0xc00008ffc8 pc=0x564e02a9e385
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00008ffe8 sp=0xc00008ffe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcenable in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:205 +0xa5
12:12:16 arch ollama[927]: goroutine 5 gp=0xc000003dc0 m=nil [finalizer wait]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x564e04092250?, 0x40?, 0x61?, 0x1000000010?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00008e630 sp=0xc00008e610 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.runfinq()
12:12:16 arch ollama[927]: runtime/mfinal.go:196 +0x107 fp=0xc00008e7e0 sp=0xc00008e630 pc=0x564e02a9d3a7
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00008e7e8 sp=0xc00008e7e0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.createfing in goroutine 1
12:12:16 arch ollama[927]: runtime/mfinal.go:166 +0x3d
12:12:16 arch ollama[927]: goroutine 6 gp=0xc0001f08c0 m=nil [chan receive]:
12:12:16 arch ollama[927]: runtime.gopark(0xc000245720?, 0xc000590018?, 0x60?, 0x7?, 0x564e02bd8de8?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc000090718 sp=0xc0000906f8 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.chanrecv(0xc0000c6310, 0x0, 0x1)
12:12:16 arch ollama[927]: runtime/chan.go:664 +0x445 fp=0xc000090790 sp=0xc000090718 pc=0x564e02a8ee85
12:12:16 arch ollama[927]: runtime.chanrecv1(0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/chan.go:506 +0x12 fp=0xc0000907b8 sp=0xc000090790 pc=0x564e02a8ea12
12:12:16 arch ollama[927]: runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
12:12:16 arch ollama[927]: runtime/mgc.go:1796
12:12:16 arch ollama[927]: runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1799 +0x2f fp=0xc0000907e0 sp=0xc0000907b8 pc=0x564e02aa158f
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000907e8 sp=0xc0000907e0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by unique.runtime_registerUniqueMapCleanup in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1794 +0x85
12:12:16 arch ollama[927]: goroutine 7 gp=0xc0001f0c40 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc000090f38 sp=0xc000090f18 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc000090fc8 sp=0xc000090f38 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc000090fe0 sp=0xc000090fc8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000090fe8 sp=0xc000090fe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 18 gp=0xc000504000 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00008a738 sp=0xc00008a718 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc00008a7c8 sp=0xc00008a738 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc00008a7e0 sp=0xc00008a7c8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00008a7e8 sp=0xc00008a7e0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 19 gp=0xc0005041c0 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00008af38 sp=0xc00008af18 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc00008afc8 sp=0xc00008af38 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc00008afe0 sp=0xc00008afc8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00008afe8 sp=0xc00008afe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 34 gp=0xc000102380 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00011a738 sp=0xc00011a718 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc00011a7c8 sp=0xc00011a738 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc00011a7e0 sp=0xc00011a7c8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00011a7e8 sp=0xc00011a7e0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 35 gp=0xc000102540 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00011af38 sp=0xc00011af18 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc00011afc8 sp=0xc00011af38 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc00011afe0 sp=0xc00011afc8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00011afe8 sp=0xc00011afe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 20 gp=0xc000504380 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00008b738 sp=0xc00008b718 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc00008b7c8 sp=0xc00008b738 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc00008b7e0 sp=0xc00008b7c8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00008b7e8 sp=0xc00008b7e0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 21 gp=0xc000504540 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00008bf38 sp=0xc00008bf18 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc00008bfc8 sp=0xc00008bf38 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc00008bfe0 sp=0xc00008bfc8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00008bfe8 sp=0xc00008bfe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 22 gp=0xc000504700 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00008c738 sp=0xc00008c718 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc00008c7c8 sp=0xc00008c738 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc00008c7e0 sp=0xc00008c7c8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00008c7e8 sp=0xc00008c7e0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 23 gp=0xc0005048c0 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00008cf38 sp=0xc00008cf18 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc00008cfc8 sp=0xc00008cf38 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc00008cfe0 sp=0xc00008cfc8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00008cfe8 sp=0xc00008cfe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 24 gp=0xc000504a80 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00008d738 sp=0xc00008d718 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc00008d7c8 sp=0xc00008d738 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc00008d7e0 sp=0xc00008d7c8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00008d7e8 sp=0xc00008d7e0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 25 gp=0xc000504c40 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00008df38 sp=0xc00008df18 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc00008dfc8 sp=0xc00008df38 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc00008dfe0 sp=0xc00008dfc8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00008dfe8 sp=0xc00008dfe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 8 gp=0xc0001f0e00 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc000091738 sp=0xc000091718 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc0000917c8 sp=0xc000091738 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc0000917e0 sp=0xc0000917c8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000917e8 sp=0xc0000917e0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 36 gp=0xc000102700 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00011b738 sp=0xc00011b718 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc00011b7c8 sp=0xc00011b738 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc00011b7e0 sp=0xc00011b7c8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00011b7e8 sp=0xc00011b7e0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 37 gp=0xc0001028c0 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00011bf38 sp=0xc00011bf18 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc00011bfc8 sp=0xc00011bf38 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc00011bfe0 sp=0xc00011bfc8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00011bfe8 sp=0xc00011bfe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 9 gp=0xc0001f0fc0 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x16e3afe5ca10f?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc000091f38 sp=0xc000091f18 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc000091fc8 sp=0xc000091f38 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc000091fe0 sp=0xc000091fc8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000091fe8 sp=0xc000091fe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 10 gp=0xc0001f1180 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x564e04a4a680?, 0x3?, 0xda?, 0x8f?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc000116738 sp=0xc000116718 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc0001167c8 sp=0xc000116738 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc0001167e0 sp=0xc0001167c8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0001167e8 sp=0xc0001167e0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 11 gp=0xc0001f1340 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x564e04a4a680?, 0x1?, 0x8f?, 0x69?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc000116f38 sp=0xc000116f18 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc000116fc8 sp=0xc000116f38 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc000116fe0 sp=0xc000116fc8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000116fe8 sp=0xc000116fe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 12 gp=0xc0001f1500 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x564e04a4a680?, 0x1?, 0x46?, 0x67?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc000117738 sp=0xc000117718 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc0001177c8 sp=0xc000117738 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc0001177e0 sp=0xc0001177c8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0001177e8 sp=0xc0001177e0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 13 gp=0xc0001f16c0 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x16e3afe5baaf9?, 0x0?, 0x0?, 0x0?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc000117f38 sp=0xc000117f18 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc000117fc8 sp=0xc000117f38 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc000117fe0 sp=0xc000117fc8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000117fe8 sp=0xc000117fe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 14 gp=0xc0001f1880 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x16e3afe5bde06?, 0x1?, 0xd5?, 0x34?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc000118738 sp=0xc000118718 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc0001187c8 sp=0xc000118738 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc0001187e0 sp=0xc0001187c8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0001187e8 sp=0xc0001187e0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 15 gp=0xc0001f1a40 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x564e04a4a680?, 0x1?, 0x78?, 0x39?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc000118f38 sp=0xc000118f18 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc000118fc8 sp=0xc000118f38 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc000118fe0 sp=0xc000118fc8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000118fe8 sp=0xc000118fe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 38 gp=0xc000102a80 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x564e04a4a680?, 0x1?, 0x6a?, 0xc4?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc00011c738 sp=0xc00011c718 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc00011c7c8 sp=0xc00011c738 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc00011c7e0 sp=0xc00011c7c8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc00011c7e8 sp=0xc00011c7e0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 16 gp=0xc0001f1c00 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x16e3afe5bc298?, 0x1?, 0xb0?, 0x10?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc000119738 sp=0xc000119718 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc0001197c8 sp=0xc000119738 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc0001197e0 sp=0xc0001197c8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc0001197e8 sp=0xc0001197e0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 50 gp=0xc0001f1dc0 m=nil [GC worker (idle)]:
12:12:16 arch ollama[927]: runtime.gopark(0x16e3afe5ba5b0?, 0x1?, 0x1d?, 0x9c?, 0x0?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc000119f38 sp=0xc000119f18 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.gcBgMarkWorker(0xc0000c7730)
12:12:16 arch ollama[927]: runtime/mgc.go:1423 +0xe9 fp=0xc000119fc8 sp=0xc000119f38 pc=0x564e02aa08a9
12:12:16 arch ollama[927]: runtime.gcBgMarkStartWorkers.gowrap1()
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x25 fp=0xc000119fe0 sp=0xc000119fc8 pc=0x564e02aa0785
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000119fe8 sp=0xc000119fe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by runtime.gcBgMarkStartWorkers in goroutine 1
12:12:16 arch ollama[927]: runtime/mgc.go:1339 +0x105
12:12:16 arch ollama[927]: goroutine 39 gp=0xc000505a40 m=nil [sync.Cond.Wait]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0xc000165c78?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc000165be8 sp=0xc000165bc8 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.goparkunlock(...)
12:12:16 arch ollama[927]: runtime/proc.go:441
12:12:16 arch ollama[927]: sync.runtime_notifyListWait(0xc0005311d0, 0x0)
12:12:16 arch ollama[927]: runtime/sema.go:597 +0x15a fp=0xc000165c38 sp=0xc000165be8 pc=0x564e02af46ba
12:12:16 arch ollama[927]: sync.(*Cond).Wait(0xc000165cb8?)
12:12:16 arch ollama[927]: sync/cond.go:71 +0x85 fp=0xc000165c70 sp=0xc000165c38 pc=0x564e02b047c5
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner.(*Server).processBatch(0xc00013f900, 0xc00033c140, 0xc00033c190)
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner/runner.go:408 +0x93 fp=0xc000165ee8 sp=0xc000165c70 pc=0x564e02f63213
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc00013f900, {0x564e040a8410, 0xc000623090})
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner/runner.go:387 +0x1d5 fp=0xc000165fb8 sp=0xc000165ee8 pc=0x564e02f63015
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner.Execute.gowrap1()
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner/runner.go:981 +0x28 fp=0xc000165fe0 sp=0xc000165fb8 pc=0x564e02f683e8
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000165fe8 sp=0xc000165fe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
12:12:16 arch ollama[927]: github.com/ollama/ollama/runner/llamarunner/runner.go:981 +0x4c5
12:12:16 arch ollama[927]: goroutine 45 gp=0xc000602fc0 m=nil [IO wait]:
12:12:16 arch ollama[927]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0xb?)
12:12:16 arch ollama[927]: runtime/proc.go:435 +0xce fp=0xc000163dd8 sp=0xc000163db8 pc=0x564e02af2b6e
12:12:16 arch ollama[927]: runtime.netpollblock(0x564e02b16338?, 0x2a8c2a6?, 0x4e?)
12:12:16 arch ollama[927]: runtime/netpoll.go:575 +0xf7 fp=0xc000163e10 sp=0xc000163dd8 pc=0x564e02ab7e97
12:12:16 arch ollama[927]: internal/poll.runtime_pollWait(0x7f83f643ad98, 0x72)
12:12:16 arch ollama[927]: runtime/netpoll.go:351 +0x85 fp=0xc000163e30 sp=0xc000163e10 pc=0x564e02af1d85
12:12:16 arch ollama[927]: internal/poll.(*pollDesc).wait(0xc000701500?, 0xc000144c11?, 0x0)
12:12:16 arch ollama[927]: internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000163e58 sp=0xc000163e30 pc=0x564e02b79f07
12:12:16 arch ollama[927]: internal/poll.(*pollDesc).waitRead(...)
12:12:16 arch ollama[927]: internal/poll/fd_poll_runtime.go:89
12:12:16 arch ollama[927]: internal/poll.(*FD).Read(0xc000701500, {0xc000144c11, 0x1, 0x1})
12:12:16 arch ollama[927]: internal/poll/fd_unix.go:165 +0x27a fp=0xc000163ef0 sp=0xc000163e58 pc=0x564e02b7b1fa
12:12:16 arch ollama[927]: net.(*netFD).Read(0xc000701500, {0xc000144c11?, 0x0?, 0x0?})
12:12:16 arch ollama[927]: net/fd_posix.go:55 +0x25 fp=0xc000163f38 sp=0xc000163ef0 pc=0x564e02bf0205
12:12:16 arch ollama[927]: net.(*conn).Read(0xc000092908, {0xc000144c11?, 0x0?, 0x0?})
12:12:16 arch ollama[927]: net/net.go:194 +0x45 fp=0xc000163f80 sp=0xc000163f38 pc=0x564e02bfe5c5
12:12:16 arch ollama[927]: net/http.(*connReader).backgroundRead(0xc000144c00)
12:12:16 arch ollama[927]: net/http/server.go:690 +0x37 fp=0xc000163fc8 sp=0xc000163f80 pc=0x564e02dea4b7
12:12:16 arch ollama[927]: net/http.(*connReader).startBackgroundRead.gowrap2()
12:12:16 arch ollama[927]: net/http/server.go:686 +0x25 fp=0xc000163fe0 sp=0xc000163fc8 pc=0x564e02dea3e5
12:12:16 arch ollama[927]: runtime.goexit({})
12:12:16 arch ollama[927]: runtime/asm_amd64.s:1700 +0x1 fp=0xc000163fe8 sp=0xc000163fe0 pc=0x564e02afaa01
12:12:16 arch ollama[927]: created by net/http.(*connReader).startBackgroundRead in goroutine 40
12:12:16 arch ollama[927]: net/http/server.go:686 +0xb6
12:12:16 arch ollama[927]: rax 0x7f82d2850890
12:12:16 arch ollama[927]: rbx 0x0
12:12:16 arch ollama[927]: rcx 0x0
12:12:16 arch ollama[927]: rdx 0x0
12:12:16 arch ollama[927]: rdi 0x7f82d25dfff0
12:12:16 arch ollama[927]: rsi 0x7f82c622f310
12:12:16 arch ollama[927]: rbp 0x7f82d25e7e68
12:12:16 arch ollama[927]: rsp 0x7f82d5ffcbc0
12:12:16 arch ollama[927]: r8 0x0
12:12:16 arch ollama[927]: r9 0x7f82736c7040
12:12:16 arch ollama[927]: r10 0x24db000
12:12:16 arch ollama[927]: r11 0x246
12:12:16 arch ollama[927]: r12 0x7f81c445a9e0
12:12:16 arch ollama[927]: r13 0x160
12:12:16 arch ollama[927]: r14 0x7f82c622f030
12:12:16 arch ollama[927]: r15 0x1
12:12:16 arch ollama[927]: rip 0x564e038541cb
12:12:16 arch ollama[927]: rflags 0x10246
12:12:16 arch ollama[927]: cs 0x33
12:12:16 arch ollama[927]: fs 0x0
12:12:16 arch ollama[927]: gs 0x0
12:12:16 arch ollama[927]: time=2026-01-05T12:12:16.610+08:00 level=ERROR source=server.go:1583 msg="post predict" error="Post "http://127.0.0.1:33773/completion": EOF"
@babysource commented on GitHub (Jan 5, 2026):
open-autoglm check_deployment always responds with "</answer>"
@rick-github
@rick-github commented on GitHub (Jan 5, 2026):
What is "open-autoglm check_deployment"?
@rick-github commented on GitHub (Jan 5, 2026):
@pinghe Full log, from the start.
@babysource commented on GitHub (Jan 5, 2026):
Where can I obtain the full log of ollama?
@rick-github
@babysource commented on GitHub (Jan 5, 2026):
Step:
However, if the images in the script <scripts/check_deployment_cn.py> are removed, a normal response is returned.
@rick-github
$ journalctl -u ollama --since "1 hour ago"
Full log
@pinghe commented on GitHub (Jan 5, 2026):
journalctl -u ollama --since "1 hour ago"
Full log