Mirror of https://github.com/ollama/ollama.git (synced 2026-05-07 16:40:08 -05:00)
Closed · opened 2026-04-28 16:15:58 -05:00 by GiteaMirror · 12 comments
Originally created by @turndown on GitHub (Aug 19, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6408
What is the issue?
At first it ran normally, but after a while every request returned 404 and it couldn't run any model.
Can you help me solve it? Thanks.
Installed with: curl -fsSL https://ollama.com/install.sh
Log below:
Aug 19 10:25:57 ecs-lcdsj ollama[1026502]: llm_load_print_meta: LF token = 148848 'ÄĬ'
Aug 19 10:25:57 ecs-lcdsj ollama[1026502]: llm_load_print_meta: EOT token = 151643 '<|endoftext|>'
Aug 19 10:25:57 ecs-lcdsj ollama[1026502]: llm_load_print_meta: max token length = 256
Aug 19 10:25:57 ecs-lcdsj ollama[1026502]: ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
Aug 19 10:25:57 ecs-lcdsj ollama[1026502]: ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
Aug 19 10:25:57 ecs-lcdsj ollama[1026502]: ggml_cuda_init: found 1 CUDA devices:
Aug 19 10:25:57 ecs-lcdsj ollama[1026502]: Device 0: NVIDIA A100-PCIE-40GB, compute capability 8.0, VMM: yes
Aug 19 10:25:57 ecs-lcdsj ollama[1026502]: llm_load_tensors: ggml ctx size = 0.30 MiB
Aug 19 10:25:57 ecs-lcdsj ollama[1026502]: llm_load_tensors: offloading 28 repeating layers to GPU
Aug 19 10:25:57 ecs-lcdsj ollama[1026502]: llm_load_tensors: offloading non-repeating layers to GPU
Aug 19 10:25:57 ecs-lcdsj ollama[1026502]: llm_load_tensors: offloaded 29/29 layers to GPU
Aug 19 10:25:57 ecs-lcdsj ollama[1026502]: llm_load_tensors: CPU buffer size = 292.36 MiB
Aug 19 10:25:57 ecs-lcdsj ollama[1026502]: llm_load_tensors: CUDA0 buffer size = 3928.07 MiB
Aug 19 10:26:01 ecs-lcdsj ollama[1026502]: [GIN] 2024/08/19 - 10:26:01 | 404 | 185.499µs | ::1 | POST "/api/chat"
Aug 19 10:26:02 ecs-lcdsj ollama[1026502]: [GIN] 2024/08/19 - 10:26:02 | 200 | 1.273346ms | 172.17.0.2 | GET "/api/tags"
Aug 19 10:26:02 ecs-lcdsj ollama[1026502]: [GIN] 2024/08/19 - 10:26:02 | 200 | 88.559µs | 172.17.0.2 | GET "/api/vers>
Aug 19 10:26:26 ecs-lcdsj ollama[1026502]: [GIN] 2024/08/19 - 10:26:26 | 200 | 207.009µs | 127.0.0.1 | HEAD "/"
Aug 19 10:26:26 ecs-lcdsj ollama[1026502]: [GIN] 2024/08/19 - 10:26:26 | 200 | 1.100698ms | 127.0.0.1 | GET "/api/tags"
Aug 19 10:26:33 ecs-lcdsj ollama[1026502]: [GIN] 2024/08/19 - 10:26:33 | 200 | 46.933µs | 127.0.0.1 | HEAD "/"
Aug 19 10:26:33 ecs-lcdsj ollama[1026502]: [GIN] 2024/08/19 - 10:26:33 | 200 | 23.522263ms | 127.0.0.1 | POST "/api/show"
Aug 19 10:26:44 ecs-lcdsj ollama[1026502]: time=2024-08-19T10:26:44.502+08:00 level=INFO source=server.go:627 msg="waiting for serve>
Aug 19 10:26:44 ecs-lcdsj ollama[1026502]: time=2024-08-19T10:26:44.780+08:00 level=INFO source=server.go:627 msg="waiting for serve>
Aug 19 10:27:01 ecs-lcdsj ollama[1026502]: [GIN] 2024/08/19 - 10:27:01 | 404 | 7.051455ms | ::1 | POST "/api/chat"
Aug 19 10:28:01 ecs-lcdsj ollama[1026502]: [GIN] 2024/08/19 - 10:28:01 | 404 | 367.924µs | ::1 | POST "/api/chat"
Aug 19 10:28:55 ecs-lcdsj systemd[1]: Stopping Ollama Service...
Aug 19 10:28:55 ecs-lcdsj ollama[1026502]: time=2024-08-19T10:28:55.817+08:00 level=WARN source=server.go:600 msg="client connection>
Aug 19 10:28:55 ecs-lcdsj ollama[1026502]: time=2024-08-19T10:28:55.818+08:00 level=ERROR source=sched.go:451 msg="error loading lla>
Aug 19 10:28:55 ecs-lcdsj ollama[1026502]: [GIN] 2024/08/19 - 10:28:55 | 499 | 3m0s | 172.17.0.2 | POST "/api/chat"
Aug 19 10:28:56 ecs-lcdsj systemd[1]: ollama.service: Succeeded.
Aug 19 10:28:56 ecs-lcdsj systemd[1]: Stopped Ollama Service.
Aug 19 10:28:56 ecs-lcdsj systemd[1]: Started Ollama Service.
Aug 19 10:28:56 ecs-lcdsj ollama[1032507]: 2024/08/19 10:28:56 routes.go:1125: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU>
Aug 19 10:28:56 ecs-lcdsj ollama[1032507]: time=2024-08-19T10:28:56.246+08:00 level=INFO source=images.go:782 msg="total blobs: 15"
Aug 19 10:28:56 ecs-lcdsj ollama[1032507]: time=2024-08-19T10:28:56.249+08:00 level=INFO source=images.go:790 msg="total unused blob>
Aug 19 10:28:56 ecs-lcdsj ollama[1032507]: time=2024-08-19T10:28:56.249+08:00 level=INFO source=routes.go:1172 msg="Listening on [::>
Aug 19 10:28:56 ecs-lcdsj ollama[1032507]: time=2024-08-19T10:28:56.250+08:00 level=INFO source=payload.go:30 msg="extracting embedd>
Aug 19 10:29:01 ecs-lcdsj ollama[1032507]: time=2024-08-19T10:29:01.035+08:00 level=INFO source=payload.go:44 msg="Dynamic LLM libra>
Aug 19 10:29:01 ecs-lcdsj ollama[1032507]: time=2024-08-19T10:29:01.037+08:00 level=INFO source=gpu.go:204 msg="looking for compatib>
Aug 19 10:29:10 ecs-lcdsj ollama[1032507]: time=2024-08-19T10:29:10.605+08:00 level=INFO source=types.go:105 msg="inference compute">
Aug 19 10:29:10 ecs-lcdsj ollama[1032507]: time=2024-08-19T10:29:10.606+08:00 level=INFO source=types.go:105 msg="inference compute">
Aug 19 10:29:10 ecs-lcdsj ollama[1032507]: time=2024-08-19T10:29:10.606+08:00 level=INFO source=types.go:105 msg="inference compute">
Aug 19 10:29:10 ecs-lcdsj ollama[1032507]: time=2024-08-19T10:29:10.606+08:00 level=INFO source=types.go:105 msg="inference compute">
Aug 19 10:29:10 ecs-lcdsj ollama[1032507]: [GIN] 2024/08/19 - 10:29:10 | 404 | 13.419583ms | ::1 | POST "/api/chat"
Aug 19 10:30:01 ecs-lcdsj ollama[1032507]: [GIN] 2024/08/19 - 10:30:01 | 404 | 990.349µs | ::1 | POST "/api/chat"
Aug 19 10:31:01 ecs-lcdsj ollama[1032507]: [GIN] 2024/08/19 - 10:31:01 | 404 | 224.61µs | ::1 | POST "/api/chat"
Aug 19 10:32:01 ecs-lcdsj ollama[1032507]: [GIN] 2024/08/19 - 10:32:01 | 404 | 15.250541ms | ::1 | POST "/api/chat"
Aug 19 10:32:27 ecs-lcdsj ollama[1032507]: [GIN] 2024/08/19 - 10:32:27 | 200 | 46.654µs | 127.0.0.1 | GET "/api/vers>
Aug 19 10:33:01 ecs-lcdsj ollama[1032507]: [GIN] 2024/08/19 - 10:33:01 | 404 | 959.34µs | ::1 | POST "/api/chat"
Aug 19 10:34:02 ecs-lcdsj ollama[1032507]: [GIN] 2024/08/19 - 10:34:02 | 404 | 18.592866ms | ::1 | POST "/api/chat"
Aug 19 10:35:01 ecs-lcdsj ollama[1032507]: [GIN] 2024/08/19 - 10:35:01 | 404 | 284.394µs | ::1 | POST "/api/chat"
OS
Linux
GPU
Nvidia
CPU
Other
Ollama version
0.3.6
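A 404 from POST /api/chat alongside 200s from GET /api/tags (as in the log above) usually means the model name the client sends isn't in the server's local list. A minimal sketch for checking that list with Python's standard library; the helper names and base URL are illustrative assumptions, not part of the original report:

```python
import json
import urllib.request


def model_names(tags_json: str) -> list[str]:
    """Extract model names from a GET /api/tags response body."""
    return [m["name"] for m in json.loads(tags_json).get("models", [])]


def list_local_models(base_url: str = "http://127.0.0.1:11434") -> list[str]:
    """Fetch the names of the models Ollama actually has locally."""
    with urllib.request.urlopen(base_url + "/api/tags") as resp:
        return model_names(resp.read().decode("utf-8"))


# Usage (with a running server): compare list_local_models() against the
# exact model string your client sends to /api/chat.
```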
@turndown commented on GitHub (Aug 19, 2024):
I think the key issue here is that the model cannot be loaded, even though it can be found locally:

0.560+08:00 level=INFO source=server.go:627 msg="waiting for server to become available" status="llm server not responding"
0.855+08:00 level=INFO source=server.go:627 msg="waiting for server to become available" status="llm server loading model"
9:01 | 404 | 8.989131ms | ::1 | POST "/api/chat"
0:01 | 404 | 191.365µs | ::1 | POST "/api/chat"
@turndown commented on GitHub (Aug 19, 2024):
I tried printing more detailed debug logs, but it just reports [GIN] 2024/08/19 - 14:13:01 | 404 | 414.019µs | 127.0.0.1 | POST "/api/chat"
(base) [root@ecs-lcdsj ~]# OLLAMA_DEBUG=1 ollama serve 2>&1 | tee server.log
2024/08/19 14:09:56 routes.go:1123: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://] OLLAMA_RUNNERS_DIR: OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-08-19T14:09:56.520+08:00 level=INFO source=images.go:782 msg="total blobs: 0"
time=2024-08-19T14:09:56.520+08:00 level=INFO source=images.go:790 msg="total unused blobs removed: 0"
time=2024-08-19T14:09:56.520+08:00 level=INFO source=routes.go:1170 msg="Listening on 127.0.0.1:11434 (version 0.3.5)"
time=2024-08-19T14:09:56.522+08:00 level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama565513732/runners
time=2024-08-19T14:09:56.522+08:00 level=DEBUG source=payload.go:182 msg=extracting variant=cpu file=build/linux/x86_64/cpu/bin/ollama_llama_server.gz
time=2024-08-19T14:09:56.522+08:00 level=DEBUG source=payload.go:182 msg=extracting variant=cpu_avx file=build/linux/x86_64/cpu_avx/bin/ollama_llama_server.gz
time=2024-08-19T14:09:56.522+08:00 level=DEBUG source=payload.go:182 msg=extracting variant=cpu_avx2 file=build/linux/x86_64/cpu_avx2/bin/ollama_llama_server.gz
time=2024-08-19T14:09:56.522+08:00 level=DEBUG source=payload.go:182 msg=extracting variant=cuda_v11 file=build/linux/x86_64/cuda_v11/bin/libcublas.so.11.gz
time=2024-08-19T14:09:56.522+08:00 level=DEBUG source=payload.go:182 msg=extracting variant=cuda_v11 file=build/linux/x86_64/cuda_v11/bin/libcublasLt.so.11.gz
time=2024-08-19T14:09:56.522+08:00 level=DEBUG source=payload.go:182 msg=extracting variant=cuda_v11 file=build/linux/x86_64/cuda_v11/bin/libcudart.so.11.0.gz
time=2024-08-19T14:09:56.522+08:00 level=DEBUG source=payload.go:182 msg=extracting variant=cuda_v11 file=build/linux/x86_64/cuda_v11/bin/ollama_llama_server.gz
time=2024-08-19T14:09:56.522+08:00 level=DEBUG source=payload.go:182 msg=extracting variant=rocm_v60102 file=build/linux/x86_64/rocm_v60102/bin/deps.txt.gz
time=2024-08-19T14:09:56.522+08:00 level=DEBUG source=payload.go:182 msg=extracting variant=rocm_v60102 file=build/linux/x86_64/rocm_v60102/bin/ollama_llama_server.gz
time=2024-08-19T14:10:01.179+08:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama565513732/runners/cpu/ollama_llama_server
time=2024-08-19T14:10:01.179+08:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama565513732/runners/cpu_avx/ollama_llama_server
time=2024-08-19T14:10:01.179+08:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama565513732/runners/cpu_avx2/ollama_llama_server
time=2024-08-19T14:10:01.179+08:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama565513732/runners/cuda_v11/ollama_llama_server
time=2024-08-19T14:10:01.179+08:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama565513732/runners/rocm_v60102/ollama_llama_server
time=2024-08-19T14:10:01.179+08:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cuda_v11 rocm_v60102 cpu cpu_avx cpu_avx2]"
time=2024-08-19T14:10:01.179+08:00 level=DEBUG source=payload.go:45 msg="Override detection logic by setting OLLAMA_LLM_LIBRARY"
time=2024-08-19T14:10:01.179+08:00 level=DEBUG source=sched.go:105 msg="starting llm scheduler"
time=2024-08-19T14:10:01.179+08:00 level=INFO source=gpu.go:204 msg="looking for compatible GPUs"
time=2024-08-19T14:10:01.180+08:00 level=DEBUG source=gpu.go:90 msg="searching for GPU discovery libraries for NVIDIA"
time=2024-08-19T14:10:01.180+08:00 level=DEBUG source=gpu.go:472 msg="Searching for GPU library" name=libcuda.so
time=2024-08-19T14:10:01.180+08:00 level=DEBUG source=gpu.go:491 msg="gpu library search" globs="[/usr/local/cuda-12.5/lib64/libcuda.so** /root/libcuda.so** /usr/local/cuda*/targets//lib/libcuda.so /usr/lib/-linux-gnu/nvidia/current/libcuda.so /usr/lib/-linux-gnu/libcuda.so /usr/lib/wsl/lib/libcuda.so* /usr/lib/wsl/drivers//libcuda.so /opt/cuda/lib*/libcuda.so* /usr/local/cuda/lib*/libcuda.so* /usr/lib*/libcuda.so* /usr/local/lib*/libcuda.so*]"
time=2024-08-19T14:10:01.187+08:00 level=DEBUG source=gpu.go:525 msg="discovered GPU libraries" paths="[/usr/lib/libcuda.so.550.90.07 /usr/lib64/libcuda.so.550.90.07]"
library /usr/lib/libcuda.so.550.90.07 load err: /usr/lib/libcuda.so.550.90.07: wrong ELF class: ELFCLASS32
time=2024-08-19T14:10:01.188+08:00 level=DEBUG source=gpu.go:566 msg="skipping 32bit library" library=/usr/lib/libcuda.so.550.90.07
CUDA driver version: 12.4
time=2024-08-19T14:10:01.548+08:00 level=DEBUG source=gpu.go:123 msg="detected GPUs" count=4 library=/usr/lib64/libcuda.so.550.90.07
[GPU-220df675-5d27-88e7-0958-f62f77a1e82a] CUDA totalMem 40326 mb
[GPU-220df675-5d27-88e7-0958-f62f77a1e82a] CUDA freeMem 38836 mb
[GPU-220df675-5d27-88e7-0958-f62f77a1e82a] Compute Capability 8.0
[GPU-0509be8c-c34b-4e94-ccc8-3d06d7a287ff] CUDA totalMem 40326 mb
[GPU-0509be8c-c34b-4e94-ccc8-3d06d7a287ff] CUDA freeMem 39903 mb
[GPU-0509be8c-c34b-4e94-ccc8-3d06d7a287ff] Compute Capability 8.0
[GPU-238d50b9-e2e6-8bf5-cf29-8a98895db3ac] CUDA totalMem 40326 mb
[GPU-238d50b9-e2e6-8bf5-cf29-8a98895db3ac] CUDA freeMem 39903 mb
[GPU-238d50b9-e2e6-8bf5-cf29-8a98895db3ac] Compute Capability 8.0
[GPU-7aabff4a-5756-eee1-b793-880410188e85] CUDA totalMem 40326 mb
[GPU-7aabff4a-5756-eee1-b793-880410188e85] CUDA freeMem 39903 mb
[GPU-7aabff4a-5756-eee1-b793-880410188e85] Compute Capability 8.0
time=2024-08-19T14:10:02.846+08:00 level=DEBUG source=amd_linux.go:371 msg="amdgpu driver not detected /sys/module/amdgpu"
releasing cuda driver library
time=2024-08-19T14:10:02.846+08:00 level=INFO source=types.go:105 msg="inference compute" id=GPU-220df675-5d27-88e7-0958-f62f77a1e82a library=cuda compute=8.0 driver=12.4 name="NVIDIA A100-PCIE-40GB" total="39.4 GiB" available="37.9 GiB"
time=2024-08-19T14:10:02.846+08:00 level=INFO source=types.go:105 msg="inference compute" id=GPU-0509be8c-c34b-4e94-ccc8-3d06d7a287ff library=cuda compute=8.0 driver=12.4 name="NVIDIA A100-PCIE-40GB" total="39.4 GiB" available="39.0 GiB"
time=2024-08-19T14:10:02.846+08:00 level=INFO source=types.go:105 msg="inference compute" id=GPU-238d50b9-e2e6-8bf5-cf29-8a98895db3ac library=cuda compute=8.0 driver=12.4 name="NVIDIA A100-PCIE-40GB" total="39.4 GiB" available="39.0 GiB"
time=2024-08-19T14:10:02.846+08:00 level=INFO source=types.go:105 msg="inference compute" id=GPU-7aabff4a-5756-eee1-b793-880410188e85 library=cuda compute=8.0 driver=12.4 name="NVIDIA A100-PCIE-40GB" total="39.4 GiB" available="39.0 GiB"
[GIN] 2024/08/19 - 14:10:02 | 404 | 2.731353ms | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/08/19 - 14:11:01 | 404 | 205.613µs | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/08/19 - 14:12:02 | 404 | 15.933031ms | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/08/19 - 14:13:01 | 404 | 414.019µs | 127.0.0.1 | POST "/api/chat"
@rick-github commented on GitHub (Aug 19, 2024):
What's in the body of the 404 response that the client receives?
@turndown commented on GitHub (Aug 19, 2024):
Hi, do you mean this below? I use Open WebUI to connect to Ollama, and it got no response.


But sometimes it works fine; it often happens when switching models.
I can't find a pattern in when it happens.
Thx for your reply.
@rick-github commented on GitHub (Aug 19, 2024):
The most likely problem is that the request that is being sent to ollama has a bad model name:
If you can get the contents of the 404 response that ollama sent, it will probably have information about why the request failed, whether a bad model name or some other reason.
The HTTP tracer extension won't help because that's looking at the traffic between the browser and the open-webui port, not between open-webui and ollama.
Run this while using open-webui; when an error occurs you should be able to find the error message in the packet trace:
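The 404 body can also be fetched directly, without a packet capture. A minimal sketch with Python's urllib; the model name and prompt are illustrative, and since urllib raises on 4xx responses, the handler below recovers the error body:

```python
import json
import urllib.error
import urllib.request


def chat_payload(model: str, prompt: str) -> bytes:
    """Build a minimal JSON body for POST /api/chat."""
    return json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")


def post_chat(model: str, prompt: str, base_url: str = "http://127.0.0.1:11434"):
    """POST to /api/chat and return (status, body), including 4xx replies."""
    req = urllib.request.Request(base_url + "/api/chat",
                                 data=chat_payload(model, prompt))
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # urllib raises on 404, but the error object still carries the body.
        return err.code, err.read().decode("utf-8")


# Usage (with a running server):
#   status, body = post_chat("qwen2:72b", "hi")
#   print(status, body)
```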
@turndown commented on GitHub (Aug 19, 2024):
I tried this command and found messages showing the model name "qwen2:72b".

But I haven't used this model since I deleted it. I will pull the model and try again.
tcpdump -X -i lo port 11434
19:41:01.470424 IP6 localhost.11434 > localhost.spremotetablet: Flags [P.], seq 1:194, ack 199, win 512, options [nop,nop,TS val 1269460090 ecr 1269460088], length 193
0x0000: 6002 1aed 00e1 0640 0000 0000 0000 0000 ......@........
0x0010: 0000 0000 0000 0001 0000 0000 0000 0000 ................
0x0020: 0000 0000 0000 0001 2caa b796 97e3 4236 ........,.....B6
0x0030: 8bb0 bc8e 8018 0200 00e9 0000 0101 080a ................
0x0040: 4baa 6c7a 4baa 6c78 4854 5450 2f31 2e31 K.lzK.lxHTTP/1.1
0x0050: 2034 3034 204e 6f74 2046 6f75 6e64 0d0a .404.Not.Found..
0x0060: 436f 6e74 656e 742d 5479 7065 3a20 6170 Content-Type:.ap
0x0070: 706c 6963 6174 696f 6e2f 6a73 6f6e 3b20 plication/json;.
0x0080: 6368 6172 7365 743d 7574 662d 380d 0a44 charset=utf-8..D
0x0090: 6174 653a 204d 6f6e 2c20 3139 2041 7567 ate:.Mon,.19.Aug
0x00a0: 2032 3032 3420 3131 3a34 313a 3031 2047 .2024.11:41:01.G
0x00b0: 4d54 0d0a 436f 6e74 656e 742d 4c65 6e67 MT..Content-Leng
0x00c0: 7468 3a20 3633 0d0a 0d0a 7b22 6572 726f th:.63....{"erro
0x00d0: 7222 3a22 6d6f 6465 6c20 5c22 7177 656e r":"model."qwen
0x00e0: 323a 3732 625c 2220 6e6f 7420 666f 756e 2:72b".not.foun
0x00f0: 642c 2074 7279 2070 756c 6c69 6e67 2069 d,.try.pulling.i
0x0100: 7420 6669 7273 7422 7d t.first"}
But if it's a model name issue, why do I get a 404 error and get stuck when I execute this command on the terminal?
I'm quite confused; thanks for your direction.
@rick-github commented on GitHub (Aug 19, 2024):
I think you have multiple problems. The 404 that you captured with tcpdump is different from the ollama run llama3:latest issue, because the models are not the same. You need to separate out the problems and post server logs that clearly show the issue you are trying to fix.
@turndown commented on GitHub (Aug 20, 2024):
Today I stopped open-webui and tested the docker ollama 0.3.5 image. I just ran docker exec -it ollama ollama run svjack/qwen1_5_14b in one terminal, but the capture in another terminal still shows "model": "qwen2:72b".
I don't know why. Could these model names conflict? For example, if everything starts with "qwen", such as "qwen2" or "qwen:72b", would that cause a problem?
Also, is the packet capture command real-time? Why wasn't the query I executed on the terminal captured, like below?

Thank you very much for your help.
@pdevine commented on GitHub (Aug 30, 2024):
I don't see qwen2:72b in the ollama list output. Can you ollama pull qwen2:72b and then try to use the API with that model again?
@pdevine commented on GitHub (Sep 2, 2024):
I'm going to go ahead and close the issue. I'm pretty certain that you just need to pull the correct model. I'll reopen it if you're still having the issue.
@TheLillin commented on GitHub (Oct 16, 2024):
This worked for me: in WSL I entered "docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once open-webui" and then I could interact with the models on Open WebUI.
@deepanshu-prajapati01 commented on GitHub (Nov 5, 2024):
As for me, I was also encountering the following issue in Open WebUI:
Ollama: 500, message='Internal Server Error', url='http://127.0.0.1:11434/api/chat'
But somehow trying another model (latest) worked for me.
Hope someone finds this helpful!