[GH-ISSUE #2225] Ollama stops generating output and fails to run models after a few minutes #1273
Closed · 54 comments
Originally created by @TheStarAlight on GitHub (Jan 27, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2225
Originally assigned to: @jmorganca on GitHub.
Hi, I'm running ollama on a Debian server and using oterm as the interface.
After a few chats (fewer than 10 normal questions), ollama stops responding, and `ollama run mixtral` no longer succeeds (it just keeps loading). I noted that the same issue happened in #1863. Is there a solution at the moment? Also, I'm not the administrator of the server, and I don't even know how to restart ollama 😂. The serve process seems to run as another user named ollama. Can anyone tell me how to restart it?
To developers: I can provide some debug information if you need it; just tell me how.
Thanks :D
@TheStarAlight commented on GitHub (Jan 27, 2024):
The models I'm running include mixtral:latest and wizard-math:70b.
I have access to an NVIDIA A100 PCIe 80GB; the inputs are all simple sentences (no more than 100 words), and I've confirmed that nobody else is using the GPU (checked with nvitop).
@jmorganca commented on GitHub (Jan 27, 2024):
Hi @TheStarAlight, would it be possible to share which version of Ollama you are running?
`ollama -v` will print this out. Thanks so much, and I'm sorry you hit this issue.
@TheStarAlight commented on GitHub (Jan 27, 2024):
@jmorganca Sure! The ollama version is 0.1.20, just installed three days ago via the shell script. Please tell me if you need more information :)
@jmorganca commented on GitHub (Jan 27, 2024):
Would it be possible to test with the newest version 0.1.22, which should fix this? https://github.com/ollama/ollama/releases/tag/v0.1.22
You can download the latest version of Ollama here: https://ollama.ai/download
Keep me posted!
@glorat commented on GitHub (Jan 30, 2024):
Is this a dupe of #1458?
It happened to me too on 0.1.22 with mistral on macOS. I will post again if I can find a way to reproduce it.
@TheStarAlight commented on GitHub (Jan 30, 2024):
@glorat I think so; it seems this problem happens on all platforms (Linux, macOS, and WSL).
@TheStarAlight commented on GitHub (Jan 30, 2024):
@jmorganca I'm sorry, but I'm not the administrator of the server, and the administrator has not responded to my request 😂. I'll try it on my own computer (though it can only run <4b models; even mistral got very slow after the first evaluation) until the ollama on the server gets updated.
Btw, how can I restart the ollama server process 😂? It is started by the user ollama and I cannot stop it without administrator privileges. The process has been hanging on the server for a few days and I just cannot find a way to stop it.
Thank you!
@iplayfast commented on GitHub (Jan 30, 2024):
@jmorganca I can confirm that my memory issues seem to have gone away in my stress test (https://github.com/ollama/ollama/issues/1691). Other issues have surfaced, but I think ollama version 0.1.22 is a winner.
@tarbard commented on GitHub (Jan 31, 2024):
I'm seeing this behaviour on 0.1.22 too.
After a few interactions (in this case with codellama 70b), the API stops responding to ollama-webui, and `ollama run codellama:70b-instruct-q4_K_M` just shows the loading animation and never starts.
`journalctl -u ollama` doesn't show any errors, just the last successful calls. Is there any way to see more detailed logs?
`systemctl restart ollama` eventually restarts ollama, but it takes quite a while.
@thexclu commented on GitHub (Jan 31, 2024):
I have the same issue running version 0.1.22 with mistral.
@adriancbo commented on GitHub (Feb 4, 2024):
I am experiencing the same issue while running the technovangelist airenamer on version 0.1.23 with any llava. It functions initially but then hangs after a few minutes, causing the CPU usage to reach 100%. Consequently, I am unable to run any models. My system configuration is as follows:
@TheStarAlight commented on GitHub (Feb 7, 2024):
@jmorganca I tried the new version (0.1.22) of ollama and broke it on two separate servers with two identical inputs 😂; the problem still exists. However, I noticed that the problem occurs when the context gets a bit long (~1600 Chinese characters, 7 prompts). Could that be the cause?
@TheStarAlight commented on GitHub (Feb 7, 2024):
I should have explained it more clearly. I'm using ollama-webui and qwen:72b (this time a different model), and I forwarded port 11434 from the remote server for my local webui to access. After the problem happened, I saved the previous chat history and switched to another server, then tried to continue the chat using the same prompt that caused the problem on the previous server, and it got stuck in the middle as well, after just a single evaluation...
@lukebelbina commented on GitHub (Feb 8, 2024):
I am having the same issue with the latest version, 0.1.24. It works for a few minutes, then eventually starts hanging on every request.
@coolrazor007 commented on GitHub (Feb 10, 2024):
I'm seeing this on 0.1.24 as well. How far back should I roll back in the interim? Does anyone know when this was introduced?
@jmorganca commented on GitHub (Feb 11, 2024):
Sorry this is still a problem. What kind of prompt is being sent to the model: is it the same prompt over and over again, or a different one? Thanks!
@lukebelbina commented on GitHub (Feb 12, 2024):
I am sending the same preprompt with a different user message each time, one after another (about every 1-2 seconds), using llama:17b. It crashes 100% of the time within about 10 minutes.
@lips85 commented on GitHub (Feb 13, 2024):
I was working with ollama v0.1.24 on a Mac M3 Max 64GB. I worked with two models, mistral:latest and openhermes:latest, and after performing the same task several times, the CPU usage increased to 99% and generation stopped. I confirmed that it was working on the GPU before the operation stopped.
Before checking the GitHub issues, I thought it was a problem that only occurred on a specific OS (Apple silicon), but it seems to occur regardless of platform.
@TheStarAlight commented on GitHub (Feb 13, 2024):
@jmorganca Hi, thank you for your attention. I was just doing regular chats using ollama-webui (just like using ChatGPT). But now I cannot reproduce my previous chat anymore: I just had a chat with qwen:72b longer than 2000 Chinese characters and the problem seems to have gone away. But one thing is for sure about my previous situation (ollama 0.1.22):
it seemed that this chat was "poisonous" and the next prompt would crash every ollama server (at least my 2 servers) on the first run. I'll comment if I find another similar occasion :D
@timiil commented on GitHub (Feb 19, 2024):
It seems we are facing the same problem on Ubuntu, whether running in a Docker environment or deploying the ollama service directly: after we call the ollama HTTP endpoint several times, the ollama HTTP service hangs.
@TheStarAlight commented on GitHub (Feb 19, 2024):
Is there a reliable way to reproduce the issue? Or is there any way we can save a verbose log?
@mjspeck commented on GitHub (Feb 19, 2024):
I think I'm running into this issue as well.
@Sinan-Karakaya commented on GitHub (Feb 22, 2024):
I am running into the same issue, using mistral with a pre-prompt on a Mac M1 chip. After a couple of generations, the server will not respond until I kill my request.
@bennylam commented on GitHub (Feb 29, 2024):
@gOATiful commented on GitHub (Feb 29, 2024):
We encounter the same problem on Ubuntu 20.04.6 LTS.
@justinwaltrip commented on GitHub (Feb 29, 2024):
I was able to fix this issue by removing the JSON formatting parameter from my /api/generate calls.
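For anyone who wants to try the same thing, a minimal sketch of the two request shapes (the model name and prompt here are placeholders):

```shell
# Reportedly hang-prone for some users: constrained JSON output
curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "List three colors as JSON.",
  "format": "json",
  "stream": false
}'

# The workaround: the same call with the "format" field removed
curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "List three colors as JSON.",
  "stream": false
}'
```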
@antonsapt4 commented on GitHub (Mar 1, 2024):
I have the same problem. I'm on a Mac M2 running ollama desktop version 0.1.27, using gemma:7b-instruct-q6_K.
On first boot it runs, and I do some curl tests just to make sure it works fine. But after it idles, whenever I send curl again and the model boots and offloads to Metal, it hangs and restarts my MacBook. It happens all the time.
Can somebody shed some light? Should I uninstall and download a new Ollama, or is there any setting that can fix this issue?
PS:
@justinwaltrip, on Ollama desktop, how do I remove the JSON formatting?
@koleshjr commented on GitHub (Mar 5, 2024):
I am having the same issue even on the new version, 0.1.28. This happens after 200 iterations on a custom fine-tuned 4-bit mistral on Colab's free-tier T4.
@deadmanoz commented on GitHub (Mar 8, 2024):
We are experiencing the same issue on 0.1.28, using the official Docker image on Ubuntu 22.04 with 8x RTX A4000.
We're running llava:34b with images as part of the request.
It will successfully process infrequent requests, then suddenly hang on some request and remain unresponsive until the container is restarted.
Requests are being made to the /api/generate endpoint across the network, with stream set to false in the request. ollama still responds to health checks.
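For reference, the requests look roughly like this (host and prompt are placeholders; the base64 image payload is elided):

```shell
# Non-streaming llava request with an image attached
curl http://ollama-host:11434/api/generate -d '{
  "model": "llava:34b",
  "prompt": "Describe this image.",
  "images": ["<base64-encoded image data>"],
  "stream": false
}'
```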
@deadmanoz commented on GitHub (Mar 8, 2024):
I was watching it just as it hit the issue.
The CPU of one core hits 100%.
It will just hang there until I restart the container. This is the output of `docker logs ollama`, along with nvtop and htop (screenshots omitted).
Please advise if I can provide any further logs or assist with debugging this issue.
@seanmavley commented on GitHub (Mar 9, 2024):
@antonsapt4 Removing the JSON format param appears to work without issues for me.
The issue may be related to #1910.
@deadmanoz commented on GitHub (Mar 9, 2024):
Note: I don't use "format": "json" and have this issue.
@jmccrosky commented on GitHub (Mar 17, 2024):
I also have this issue on an M3 Max. It seems to be somewhat random, but it tends to happen more quickly with larger models or larger prompts. For example, with one model, including only one image per prompt works consistently, but including two per prompt will trigger this issue after some runs... With a larger model, it happens even with only one image.
@niyogrv commented on GitHub (Mar 20, 2024):
#1863 and this seem to be the same issue.
@deadmanoz commented on GitHub (Mar 28, 2024):
Yes, looks to be the same issue!
🤞 that https://github.com/ollama/ollama/issues/2225 is the resolution!
@Master-Lucas commented on GitHub (Apr 3, 2024):
Even in Ollama version 0.1.30, Japanese text generation stops. I've tried it once with the lightweight mistral and three times with dolphin-mistral (all q4), and in every case, at the point where generation fails, feeding all the input and output Japanese characters into a token counter shows it stops at around 3400-3600 tokens. It feels like it hangs after generating longer texts for about 6-9 turns (Mac Sonoma 14.4.1, 64GB). P.S. I used the terminal for these experiments, but the same issue also occurs with the PageAssist webui.
@Mecil9 commented on GitHub (Apr 7, 2024):
I have the same problem. When I run ollama on an Apple M1 Max, Activity Monitor shows 100% CPU usage and 0% GPU usage, and after running for a while ollama becomes unresponsive. I don't know what caused it.
Once the CPU reaches 100%, ollama stops working. I have tried many methods, to no avail!
@jmorganca commented on GitHub (Apr 15, 2024):
Hi all, this should be fixed in 0.1.31 (the hang when Unicode characters are in the prompt). Further fixes for hanging are also coming in 0.1.32. Stay tuned!
@Destroyer commented on GitHub (Apr 20, 2024):
I'm running 0.1.32 and the issue still persists, using CPU AVX on Debian 12 with the llama3 model. It gets stuck before my proxy times out after 60 seconds; reloading the page and typing the query again fixes it, but it is very annoying.
@NikitaDeveloperAI commented on GitHub (Jun 3, 2024):
Currently running Ollama 0.1.41; sadly, this problem still persists.
@seanmavley commented on GitHub (Jun 3, 2024):
@NikitaDeveloperAI this problem may never get a universal fix, as it's like a whack-a-mole game.
In our case, we stopped using format="json" and explicitly wrote a prompt that outputs the exact JSON structure we want.
For now, that appears to work with some level of predictability and consistency. We're testing it more, but so far that approach doesn't cause hanging issues with Ollama.
@thistlillo commented on GitHub (Mar 24, 2025):
I am using ollama version 0.6.1 on Rocky Linux 8.10 (Green Obsidian), a machine with four H100 80GB GPUs; I configure several Ollama servers, one per GPU. Very often it freezes.
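Roughly, that one-server-per-GPU setup looks like this (ports and GPU indices are illustrative; `OLLAMA_HOST` and `CUDA_VISIBLE_DEVICES` are the standard knobs for pinning each instance):

```shell
# One ollama server per GPU, each pinned to a device and a distinct port
CUDA_VISIBLE_DEVICES=0 OLLAMA_HOST=127.0.0.1:11434 ollama serve &
CUDA_VISIBLE_DEVICES=1 OLLAMA_HOST=127.0.0.1:11435 ollama serve &
CUDA_VISIBLE_DEVICES=2 OLLAMA_HOST=127.0.0.1:11436 ollama serve &
CUDA_VISIBLE_DEVICES=3 OLLAMA_HOST=127.0.0.1:11437 ollama serve &
```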
With the models available today:
@fireblade2534 commented on GitHub (Mar 25, 2025):
Same here with 0.6.2 and Gemma 3 on two A4500 GPUs.
@chm-dev commented on GitHub (Apr 11, 2025):
RTX 3090, Windows 11
Exactly the same thing as @thistlillo.
qwen2.5-coder 32b freezes almost instantly after loading. Sometimes you might be able to have one or two chats before it freezes.
@festivus37 commented on GitHub (Apr 15, 2025):
Same. Ollama 0.6.5.
M3 Ultra, 512 GB.
I first noticed it on DeepSeek R1 671b q4 after a few queries, then switched to Gemma 3 27b q8. It gets through about 7-10 requests and then starts freezing. The model's memory is still in use, the GPU is pegged permanently, and no outputs are being sent via the API. Quitting Ollama fixes it for a short while, then it's back to the same behavior.
@softmarshmallow commented on GitHub (Apr 25, 2025):
Same here. Running on an M1 Max, gemma3:27b freezes 100% of the time after 20-30 minutes. I thought this was related to the screen saver, but it was simply freezing at random. Very annoying; I need to run automation overnight, and there's no way to do that.
@Sven1403 commented on GitHub (Apr 28, 2025):
Same. Ollama 0.6.6 with an A16 vGPU.
At first it works fine, and then it freezes. With `ollama ps` I see the model running or stuck in "Stopping...". The only way to resolve it is to reboot the PC; killing the ollama process doesn't work.
@remidebette commented on GitHub (Jun 4, 2025):
Hi guys,
Same issue with ollama deployed on an A100 40GB
(in a Kubernetes environment: Helm chart 1.19.0 deploying ollama v0.9.0).
@BillShiyaoZhang commented on GitHub (Jul 11, 2025):
Same issue. Ollama v0.9.6 on Mac M4 Pro
@czaku commented on GitHub (Aug 30, 2025):
Same for me: ollama 0.11.8 on the latest macOS. It spins but doesn't generate any output.
@spacetime-labs commented on GitHub (Sep 8, 2025):
Same issue here on version 0.11.10 on macOS. I use the Ollama app, and models stop thinking and/or responding; they remain stuck in "loading", and when I go back, loading seems to have stopped and there is no output. Sometimes adding another prompt or starting a new conversation gets it to answer, but most times it gets stuck.
@Dayal-star commented on GitHub (Oct 24, 2025):
Today, 24-Oct-2025, the same is happening: I am running on Windows 10 and the ollama version is 0.12.6.
@vijaykanade55-sys commented on GitHub (Oct 26, 2025):
I have been trying to run different models on Ollama, like llama3.2 and gemma:2b, for the past few days, and I keep encountering the same issue of the run command stalling. I am using Windows 10 with ollama version 0.12.6. The run command does not proceed and stops almost instantly. PFA image. Can anyone help me resolve this issue?
@JT0719 commented on GitHub (Nov 6, 2025):
Same as @vijaykanade55-sys over here... I tried different models, reinstalled ollama, reset my PC, and tried on different PCs (all Windows 10) with no luck. It downloads the model, then stays on "loading..." in the interface and in cmd. It was working fine a few weeks ago.
ollama version 0.12.9.