Released 2025-10-29 | 144 commits to main since this release

New models
- Qwen3-VL: now available in all parameter sizes, ranging from 2B to 235B
- MiniMax-M2: a 230-billion-parameter model built for coding and agentic workflows, available on Ollama's cloud
Add files and adjust thinking levels in Ollama's new app
Ollama's new app now supports attaching one or more files when prompting a model.

For better responses, thinking levels can now be adjusted for the gpt-oss models.
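Thinking levels can also be set per request through Ollama's API via the `think` field. The sketch below builds an `/api/chat` request body for a gpt-oss model; the accepted level strings (`"low"`, `"medium"`, `"high"`) and the model tag `gpt-oss:20b` are assumptions based on the API documentation, so adjust them for your setup:

```python
import json

# Sketch of an /api/chat request body that sets a thinking level for a
# gpt-oss model. The "think" field taking "low"/"medium"/"high" follows
# Ollama's API docs; verify against your installed version.
def chat_request(prompt: str, level: str = "low") -> str:
    if level not in ("low", "medium", "high"):
        raise ValueError(f"unsupported thinking level: {level}")
    body = {
        "model": "gpt-oss:20b",  # assumed model tag for illustration
        "messages": [{"role": "user", "content": prompt}],
        "think": level,   # thinking effort for gpt-oss models
        "stream": False,
    }
    return json.dumps(body)

payload = chat_request("Why is the sky blue?", level="high")
```

POSTing this body to `http://localhost:11434/api/chat` on a local Ollama server would run the prompt at the chosen thinking level.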
New API documentation
Documentation for Ollama's API is now available at https://docs.ollama.com/api
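For orientation, a minimal request to the documented `/api/generate` endpoint can be built as follows. This is a sketch that only constructs the request (sending it requires a running local server); the model name is a placeholder:

```python
import json
import urllib.request

# Build (but do not send) a request for Ollama's /api/generate endpoint,
# as described at https://docs.ollama.com/api. The default port 11434 is
# Ollama's standard local port.
def generate_request(model: str, prompt: str) -> urllib.request.Request:
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = generate_request("qwen3-vl", "Describe this scene.")
```

Passing the request to `urllib.request.urlopen` against a local Ollama instance returns the generation as JSON.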
What's Changed
- Model load failures now include more information on Windows
- Fixed embedding results being incorrect when running `embeddinggemma`
- Fixed `gemma3n` on the Vulkan backend
- Increased time allocated for ROCm to discover devices
- Fixed truncation error when generating embeddings
- Fixed request status code when running cloud models
- The OpenAI-compatible `/v1/embeddings` endpoint now supports the `encoding_format` parameter
- Ollama will now parse tool calls that don't conform to `{"name": name, "arguments": args}` (thanks @rick-github!)
- Fixed prompt processing reporting in the llama runner
- Increased speed when scheduling models
- Fixed issue where `FROM <model>` would not inherit `RENDERER` or `PARSER` commands
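The tool-call parsing change above means Ollama now tolerates call shapes beyond the canonical `{"name": name, "arguments": args}` form. The sketch below is a hypothetical illustration of that kind of normalization, not Ollama's actual parser; the alternate key names (`function`, `tool`, `parameters`) are assumptions about shapes models commonly emit:

```python
import json

# Hypothetical sketch (not Ollama's implementation): normalize common
# tool-call variants into the canonical {"name": ..., "arguments": ...}
# shape that downstream code expects.
def normalize_tool_call(raw: str) -> dict:
    call = json.loads(raw)
    # Some models nest the call under a "function" key (OpenAI style).
    if isinstance(call.get("function"), dict):
        call = call["function"]
    # Accept a few plausible aliases for the tool name.
    name = call.get("name") or call.get("tool")
    args = call.get("arguments") or call.get("parameters") or {}
    # Arguments sometimes arrive as a JSON-encoded string, not an object.
    if isinstance(args, str):
        args = json.loads(args)
    if name is None:
        raise ValueError("tool call has no recognizable name field")
    return {"name": name, "arguments": args}
```

For example, an OpenAI-style call nested under `"function"` with string-encoded arguments normalizes to the same shape as a conforming call.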
New Contributors
- @npardal made their first contribution in https://github.com/ollama/ollama/pull/12715
Full Changelog: https://github.com/ollama/ollama/compare/v0.12.6...v0.12.7
Downloads