[GH-ISSUE #15650] Incorrectly calculates available system memory for qwen3.6:35b-a3b #72046

Open
opened 2026-05-05 03:23:55 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @fullheart on GitHub (Apr 17, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15650

What is the issue?

Ollama fails to load qwen3.6:35b-a3b with a memory error, despite sufficient RAM being available. The memory calculation appears to be incorrect, or at least overly conservative.

System specs:

  • OS: Ubuntu 22.04.3 LTS (Jammy Jellyfish)
  • Ollama version: 0.20.7
  • Host RAM: 15 GiB total, 13 GiB available (as reported by free -h)
  • GPU: 16 GB VRAM
  • Docker: No memory limits set (Memory: 0)
  • Container sees correct RAM: MemAvailable: 13829416 kB (~13.8 GiB)

Ollama's claim:

Error: 500 Internal Server Error: model requires more system memory (9.7 GiB) than is available (8.1 GiB)

Problem: Ollama reports only 8.1 GiB available when the container actually has 13.8 GiB free, a discrepancy of ~5.7 GiB.
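The gap is consistent with a pre-flight check that ignores reclaimable page cache. As a rough illustration (this is an assumption, not Ollama's actual code; the sample values below approximate the numbers in this report):

```shell
# Hypothetical /proc/meminfo sample approximating the figures in this report:
cat <<'EOF' > /tmp/meminfo.sample
MemTotal:       15728640 kB
MemFree:         1048576 kB
Buffers:          524288 kB
Cached:         12058624 kB
MemAvailable:   13829416 kB
EOF

# What the kernel considers actually available (~13.2 GiB):
awk '/^MemAvailable/ {printf "MemAvailable: %.1f GiB\n", $2/1024/1024}' /tmp/meminfo.sample

# What a check based on MemFree alone would see (~1.0 GiB):
awk '/^MemFree/ {printf "MemFree:      %.1f GiB\n", $2/1024/1024}' /tmp/meminfo.sample
```

Neither figure matches the reported 8.1 GiB exactly, so the check presumably subtracts a reserve or counts only part of the cache; the point is that MemAvailable (~13.2-13.8 GiB) is the figure the kernel intends memory-hungry loaders to consult.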

Relevant log output

Host memory (outside container):

$ free -h
               total        used        free      shared  buff/cache   available
Mem:            15Gi       1,9Gi       1,0Gi       1,0Mi        12Gi        13Gi
Swap:          4,0Gi       1,3Gi       2,7Gi

Container memory (inside ollama container):

$ docker exec -it ollama-hosting-ollama-1 cat /proc/meminfo | grep MemAvailable
MemAvailable:   13829416 kB  (~13.8 GiB)

Docker memory limits:

$ docker inspect ollama-hosting-ollama-1 | grep -i memory
            Memory: 0,
            MemoryReservation: 0,
            MemorySwap: 0,

Ollama error:

$ docker-compose exec ollama ollama run qwen3.6:35b-a3b
Error: 500 Internal Server Error: model requires more system memory (9.7 GiB) than is available (8.1 GiB)

Steps to reproduce

  1. Run Ollama 0.20.7 in Docker on Ubuntu 22.04.3 LTS without memory limits
  2. Ensure host has >13 GiB available RAM
  3. Execute: ollama run qwen3.6:35b-a3b
  4. Observe memory error despite sufficient RAM

Expected behavior

The model should load successfully since:

  • Available RAM (13.8 GiB) > Required RAM (9.7 GiB)
  • GPU has 16 GB VRAM for offloading

This appears to be related to:

  • #14501 - Similar issue with qwen3.5:35b-a3b and qwen3.5:27b-q4_K_M
  • #14557 - Memory calculation error (595 MB required, 19.7 MB reported available)
  • #14719 - deepseek-r1:70b memory requirements increased unexpectedly

OS

Ubuntu 22.04.3 LTS (Jammy Jellyfish)

GPU

Nvidia

Ollama version

0.20.7

Additional context

  • Swap is enabled (4 GiB total, 2.7 GiB free)
  • buff/cache: 12 GiB (Linux counts reclaimable cache toward MemAvailable; Ollama's check apparently does not)
  • Setting OLLAMA_GPU_LAYERS=999 does not resolve the pre-flight memory check failure
  • Issue persists regardless of Docker memory configuration

Note: This is related to #14501, but for the newer qwen3.6:35b-a3b model on Ollama 0.20.7 with detailed memory analysis.

Author
Owner

@rick-github commented on GitHub (Apr 17, 2026):

#15474

Author
Owner

@PureBlissAK commented on GitHub (Apr 18, 2026):

🤖 Automated Triage & Analysis Report

Issue: #15650
Analyzed: 2026-04-18T18:13:49.932991

Analysis

  • Type: unknown
  • Severity: medium
  • Components: unknown

Implementation Plan

  • Effort: medium
  • Steps:

This issue has been triaged and marked for implementation.

Author
Owner

@fullheart commented on GitHub (Apr 22, 2026):

After upgrading to the latest version (0.21.0), this problem is gone.

@rick-github does it work for you with the latest version, too?

Author
Owner

@markasoftware-tc commented on GitHub (Apr 27, 2026):

This is fixed by https://github.com/ollama/ollama/pull/13782

The issue may appear to go away and come back randomly because it depends on how much memory the container has in the page cache, which can change frequently.
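The page-cache dependence described above is easy to observe directly: merely reading a file grows the kernel's Cached figure, shifting "free" memory without any process allocating anything. A quick Linux-only demo (the file path is arbitrary):

```shell
# Demo: reading a large file grows the page cache, which changes
# apparent free memory even though no process allocated anything.
before=$(awk '/^Cached:/ {print $2}' /proc/meminfo)
dd if=/dev/zero of=/tmp/cachefill bs=1M count=256 2>/dev/null
cat /tmp/cachefill > /dev/null
after=$(awk '/^Cached:/ {print $2}' /proc/meminfo)
echo "Cached before: ${before} kB, after: ${after} kB"
rm -f /tmp/cachefill
```

Since the cache grows and shrinks constantly under normal I/O, a memory check taken at two different moments can return quite different numbers, which matches the intermittent behavior reported here.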

Reference: github-starred/ollama#72046