v0.1.47

Released 2024-06-26 23:38:21 -05:00 | 1862 commits to main since this release
Originally published on GitHub: Thu, 27 Jun 2024 06:09:56 GMT

What's Changed
- Added support for Google Gemma 2 models (9B and 27B)
- Fixed issues with `ollama create` when importing from Safetensors
A special thank you to the Google Cloud and DeepMind team members for Gemma 2 support.
Full Changelog: https://github.com/ollama/ollama/compare/v0.1.46...v0.1.47
v0.1.46

Released 2024-06-24 23:47:52 -05:00 | 1864 commits to main since this release
Originally published on GitHub: Sat, 22 Jun 2024 00:44:28 GMT

What's Changed
- Increased model loading speed with `ollama run`, especially when running an already-loaded model
- Improved performance of `/api/show`, including for large models
- Fixed issue where the `--quantize` flag in `ollama create` would lead to an error
- Improved model loading times when models would not completely fit in system memory on Linux
- Fixed issue where certain `Modelfile` parameters would not be parsed correctly
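The `Modelfile` parameters covered by the parsing fix are set with `PARAMETER` directives. A minimal sketch (the model name and values here are illustrative, not from the release notes):

```
FROM llama3
PARAMETER temperature 0.8
PARAMETER num_ctx 4096
PARAMETER stop "<|eot_id|>"
```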
Full Changelog: https://github.com/ollama/ollama/compare/v0.1.45...v0.1.46
v0.1.45

Released 2024-06-20 12:46:24 -05:00 | 1875 commits to main since this release
Originally published on GitHub: Sat, 15 Jun 2024 19:11:17 GMT

New models
- DeepSeek-Coder-V2: A 16B & 236B open-source Mixture-of-Experts code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks.
ollama show

`ollama show` will now show model details such as context length, parameters, embedding size, license and more:

```
% ollama show llama3
  Model
    arch                llama
    parameters          8.0B
    quantization        Q4_0
    context length      8192
    embedding length    4096

  Parameters
    num_keep    24
    stop        "<|start_header_id|>"
    stop        "<|end_header_id|>"
    stop        "<|eot_id|>"

  License
    META LLAMA 3 COMMUNITY LICENSE AGREEMENT
    Meta Llama 3 Version Release Date: April 18, 2024
```

What's Changed

- `ollama show <model>` will now show model information such as context window size
- Model loading on Windows with CUDA GPUs is now faster
- Setting `seed` in the `/v1/chat/completions` OpenAI compatibility endpoint no longer changes `temperature`
- Enhanced GPU discovery and multi-GPU support with concurrency
- The Linux install script will now skip searching for network devices
- Introduced a workaround for AMD Vega RX 56 SDMA support on Linux
- Fixed memory prediction for `deepseek-v2` and `deepseek-coder-v2` models
- The `/api/show` endpoint now returns extensive model metadata
- GPU configuration variables are now reported in `ollama serve`
- Updated Linux ROCm to v6.1.1
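The `seed` fix can be exercised through the OpenAI-compatible endpoint. A hedged sketch, assuming a local `ollama serve` on the default port 11434 (the model name and values are illustrative); `temperature` now keeps its requested value even when `seed` is set:

```shell
# Request body for the OpenAI-compatible endpoint; "seed" no longer
# silently overrides "temperature".
payload='{"model": "llama3", "messages": [{"role": "user", "content": "Say hi"}], "seed": 42, "temperature": 0.7}'

# Guarded call: only reaches the network if a local server is listening.
if curl -sf http://localhost:11434/v1/chat/completions \
     -H "Content-Type: application/json" -d "$payload"; then
  echo "request sent"
else
  echo "no local Ollama server; skipping request"
fi
```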
New Contributors
- @jayson-cloude made their first contribution in https://github.com/ollama/ollama/pull/4972
Full Changelog: https://github.com/ollama/ollama/compare/v0.1.44...v0.1.45
v0.1.44

Released 2024-06-13 15:26:09 -05:00 | 1948 commits to main since this release
Originally published on GitHub: Thu, 13 Jun 2024 19:38:15 GMT

What's Changed
- Fixed issue where unicode characters such as emojis would not be loaded correctly when running `ollama create`
- Fixed certain cases where Nvidia GPUs would not be detected and reported as compute capability 1.0 devices
Full Changelog: https://github.com/ollama/ollama/compare/v0.1.43...v0.1.44
v0.1.43

Released 2024-06-11 18:04:20 -05:00 | 1960 commits to main since this release
Originally published on GitHub: Tue, 11 Jun 2024 02:00:49 GMT

What's Changed
- New `import.md` guide for converting and importing models to Ollama
- Fixed issue where embedding vectors resulting from `/api/embeddings` would not be accurate
- JSON mode responses will no longer include invalid escape characters
- Removing a model will no longer show incorrect `File not found` errors
- Fixed issue where running `ollama create` would result in an error on Windows with certain file formatting
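The `/api/embeddings` accuracy fix applies to Ollama's native embeddings endpoint. A minimal sketch, assuming a local server on port 11434 and an embedding-capable model (the model name is illustrative):

```shell
# Request an embedding vector for a prompt via the native API.
payload='{"model": "all-minilm", "prompt": "The sky is blue"}'

# Guard the call so the example is safe to run without a server.
if curl -sf http://localhost:11434/api/embeddings \
     -H "Content-Type: application/json" -d "$payload"; then
  echo "embedding returned"
else
  echo "no local Ollama server; skipping request"
fi
```

A successful response is a JSON object containing an `embedding` array of floats.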
New Contributors
- @erhant made their first contribution in https://github.com/ollama/ollama/pull/4854
- @nischalj10 made their first contribution in https://github.com/ollama/ollama/pull/4612
- @dcasota made their first contribution in https://github.com/ollama/ollama/pull/4852
- @Napuh made their first contribution in https://github.com/ollama/ollama/pull/4084
- @hughescr made their first contribution in https://github.com/ollama/ollama/pull/3782
- @jimscard made their first contribution in https://github.com/ollama/ollama/pull/3382
Full Changelog: https://github.com/ollama/ollama/compare/v0.1.42...v0.1.43
v0.1.42

Released 2024-06-07 13:07:39 -05:00 | 1983 commits to main since this release
Originally published on GitHub: Fri, 07 Jun 2024 07:06:58 GMT

New models
- Qwen 2: a new series of large language models from Alibaba group
What's Changed
- Fixed issue where `qwen2` would output erroneous text such as `GGG` on Nvidia and AMD GPUs
- `ollama pull` is now faster if it detects a model is already downloaded
- `ollama create` will now automatically detect prompt templates for popular model architectures such as Llama, Gemma, Phi and more
- Ollama can now be accessed from local apps built with Electron and Tauri, as well as from apps developed in local HTML files
- Updated the welcome prompt on Windows to `llama3`
- Fixed issues where `/api/ps` and `/api/tags` would show invalid timestamps in responses
New Contributors
- @shoebham made their first contribution in https://github.com/ollama/ollama/pull/4766
- @kartikm7 made their first contribution in https://github.com/ollama/ollama/pull/4719
- @royjhan made their first contribution in https://github.com/ollama/ollama/pull/4822
Full Changelog: https://github.com/ollama/ollama/compare/v0.1.41...v0.1.42
v0.1.41

Released 2024-06-01 21:24:33 -05:00 | 2015 commits to main since this release
Originally published on GitHub: Sun, 02 Jun 2024 03:30:47 GMT

What's Changed
- Fixed an error that occurred on Windows 10 and 11 machines with Intel CPUs and integrated GPUs
Full Changelog: https://github.com/ollama/ollama/compare/v0.1.40...v0.1.41
v0.1.40

Released 2024-05-31 20:54:21 -05:00 | 2016 commits to main since this release
Originally published on GitHub: Fri, 31 May 2024 05:49:35 GMT

New models
- Codestral: Mistral AI’s first-ever code model, designed for code generation tasks.
- IBM Granite Code: now in 3B and 8B parameter sizes.
- Deepseek V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
What's Changed
- Fixed out of memory and incorrect token issues when running Codestral on 16GB Macs
- Fixed issue where full-width characters (e.g. Japanese, Chinese, Russian) were deleted at the end of the line when using `ollama run`
New Contributors
- @zhewang1-intc made their first contribution in https://github.com/ollama/ollama/pull/3278
Full Changelog: https://github.com/ollama/ollama/compare/v0.1.39...v0.1.40
v0.1.39

Released 2024-05-28 14:04:03 -05:00 | 2053 commits to main since this release
Originally published on GitHub: Wed, 22 May 2024 02:46:48 GMT

New models
- Cohere Aya 23: A new state-of-the-art, multilingual LLM covering 23 different languages.
- Mistral 7B 0.3: A new version of Mistral 7B with initial support for function calling.
- Phi-3 Medium: a 14B-parameter, lightweight, state-of-the-art open model by Microsoft.
- Phi-3 Mini 128K and Phi-3 Medium 128K: versions of the Phi-3 models that support a context window size of 128K
- Granite code: A family of open foundation models by IBM for Code Intelligence
Llama 3 import
It is now possible to import and quantize Llama 3 and its finetunes from Safetensors format to Ollama.
First, clone a Hugging Face repo with a Safetensors model:
```
git clone https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
cd Meta-Llama-3-8B-Instruct
```

Next, create a `Modelfile`:

```
FROM .

TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""

PARAMETER stop <|start_header_id|>
PARAMETER stop <|end_header_id|>
PARAMETER stop <|eot_id|>
```

Then, create and quantize a model:

```
ollama create --quantize q4_0 -f Modelfile my-llama3
ollama run my-llama3
```

What's Changed
- Fixed issues with wide characters in languages such as Chinese, Korean, Japanese and Russian
- Added new `OLLAMA_NOHISTORY=1` environment variable that can be set to disable history when using `ollama run`
- New experimental `OLLAMA_FLASH_ATTENTION=1` flag for `ollama serve` that improves token generation speed on Apple Silicon Macs and NVIDIA graphics cards
- Fixed error that would occur on Windows when running `ollama create -f Modelfile`
- `ollama create` can now create models from I-Quant GGUF files
- Fixed `EOF` errors when resuming downloads via `ollama pull`
- Added a `Ctrl+W` shortcut to `ollama run`
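The two new environment variables are plain process-environment toggles. A minimal sketch of setting them (the `ollama` invocations are shown commented out so the snippet is safe to run without ollama installed):

```shell
# OLLAMA_NOHISTORY=1 disables readline history in `ollama run`;
# OLLAMA_FLASH_ATTENTION=1 enables the experimental flash attention path
# in `ollama serve`.
export OLLAMA_NOHISTORY=1
export OLLAMA_FLASH_ATTENTION=1

# With ollama installed, the variables take effect on the next invocation:
#   ollama serve
#   ollama run llama3
echo "OLLAMA_FLASH_ATTENTION=$OLLAMA_FLASH_ATTENTION"
```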
New Contributors
- @rapmd73 made their first contribution in https://github.com/ollama/ollama/pull/4467
- @sammcj made their first contribution in https://github.com/ollama/ollama/pull/4120
- @likejazz made their first contribution in https://github.com/ollama/ollama/pull/4535
Full Changelog: https://github.com/ollama/ollama/compare/v0.1.38...v0.1.39
v0.1.38

Released 2024-05-15 17:43:16 -05:00 | 2125 commits to main since this release
Originally published on GitHub: Wed, 15 May 2024 00:28:00 GMT

New Models
- Falcon 2: A new 11B-parameter causal decoder-only model built by TII and trained on over 5T tokens.
- Yi 1.5: A new high-performing version of Yi, now licensed as Apache 2.0. Available in 6B, 9B and 34B sizes.
What's Changed
ollama ps

A new command is now available: `ollama ps`. This command displays currently loaded models, their memory footprint, and the processors used (GPU or CPU):

```
% ollama ps
NAME               ID            SIZE    PROCESSOR        UNTIL
mixtral:latest     7708c059a8bb  28 GB   47%/53% CPU/GPU  Forever
llama3:latest      a6990ed6be41  5.5 GB  100% GPU         4 minutes from now
all-minilm:latest  1b226e2802db  585 MB  100% GPU         4 minutes from now
```

/clear

To clear the chat history for a session when running `ollama run`, use `/clear`:

```
>>> /clear
Cleared session context
```

- Fixed issue where switching loaded models on Windows would take several seconds
- Running `/save` will no longer abort the chat session if an incorrect name is provided
- The `/api/tags` API endpoint will now correctly return an empty list `[]` instead of `null` if no models are provided
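The `/api/tags` fix can be checked with a guarded request; a sketch assuming a local server on the default port (the fallback JSON is an illustrative empty-response shape, not server output):

```shell
# List installed models; with this release the model list is [] rather
# than null when nothing is installed.
if tags=$(curl -sf http://localhost:11434/api/tags); then
  echo "$tags"
else
  tags='{"models": []}'   # illustrative shape of an empty response
  echo "no local Ollama server; empty-list shape would be: $tags"
fi
```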
New Contributors
- @fangtaosong made their first contribution in https://github.com/ollama/ollama/pull/4387
- @machimachida made their first contribution in https://github.com/ollama/ollama/pull/4424
Full Changelog: https://github.com/ollama/ollama/compare/v0.1.37...v0.1.38
Mirror of https://github.com/ollama/ollama.git, synced 2025-12-05 18:46:22 -06:00