[GH-ISSUE #15913] [Feature Request] Support weightless RMSNorm (for FlashNorm weight folding trick) #72195

Open
opened 2026-05-05 03:37:15 -05:00 by GiteaMirror · 0 comments

Originally created by @NilsGraf on GitHub (May 1, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15913

Please add support for RMSNorm without normalization weights.

This is to support FlashNorm (https://arxiv.org/abs/2407.09577), a mathematically equivalent variant of RMSNorm that folds the norm weights into the subsequent linear layer. See the explainer video: https://www.youtube.com/watch?v=GEuJv34_XgU
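
For readers unfamiliar with the trick, the folding can be written out as below. This is only a sketch under assumed notation that does not appear in the issue: column vectors, $g$ for the norm weights, $W$ for the weight matrix of the following linear layer, $\epsilon$ for the usual stabilizer, and $W^{*}$ as a name for the folded matrix.

```latex
\mathrm{RMS}(x) = \sqrt{\tfrac{1}{n}\sum_{i=1}^{n} x_i^{2} + \epsilon},
\qquad
\mathrm{RMSNorm}(x) = g \odot \frac{x}{\mathrm{RMS}(x)}

W \left( g \odot \frac{x}{\mathrm{RMS}(x)} \right)
  = W \,\mathrm{diag}(g)\, \frac{x}{\mathrm{RMS}(x)}
  = W^{*} \frac{x}{\mathrm{RMS}(x)},
\qquad
W^{*} = W \,\mathrm{diag}(g)
```

Since $W^{*}$ can be computed once offline, the runtime norm only needs the division by $\mathrm{RMS}(x)$, which is why the folded checkpoints ship without norm weights.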

We have applied this weight folding trick to a few LLMs (Llama, Qwen, SmolLM) here:
https://huggingface.co/models?other=weightless-rmsnorm

Motivation

FlashNorm's removal of norm weights reduces inference overhead at zero accuracy cost, and we'd like to share these optimized models with the broader community.

Possible Implementation

Make the norm weights optional in your RMSNorm implementation. E.g., if no norm weights are provided, just skip the norm weight multiplication.
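
To illustrate the suggestion, here is a minimal pure-Python sketch, not Ollama's actual code. The names `rms_norm`, `matvec`, and `fold_norm_weight`, the `eps` default, and the sample values are all hypothetical. It treats the weight as optional and checks that folding the weights into the next layer gives the same output.

```python
import math
from typing import List, Optional, Sequence


def rms_norm(x: Sequence[float],
             weight: Optional[Sequence[float]] = None,
             eps: float = 1e-6) -> List[float]:
    """RMSNorm over a 1-D vector; weight=None selects the weightless variant."""
    inv_rms = 1.0 / math.sqrt(sum(v * v for v in x) / len(x) + eps)
    normed = [v * inv_rms for v in x]
    if weight is None:
        # No norm weights provided: skip the elementwise multiplication.
        return normed
    return [n * g for n, g in zip(normed, weight)]


def matvec(W: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Plain matrix-vector product standing in for the next linear layer."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def fold_norm_weight(W: Sequence[Sequence[float]],
                     g: Sequence[float]) -> List[List[float]]:
    """Scale column j of W by g[j], i.e. compute W @ diag(g) offline."""
    return [[w * gj for w, gj in zip(row, g)] for row in W]


# Hypothetical sample data, chosen only for the equivalence check.
x = [1.0, -2.0, 3.0]
g = [0.5, 2.0, 1.5]
W = [[1.0, 0.0, 2.0],
     [-1.0, 3.0, 0.5]]

reference = matvec(W, rms_norm(x, g))                    # weighted norm, original W
folded = matvec(fold_norm_weight(W, g), rms_norm(x))     # weightless norm, folded W
```

The two results agree up to floating-point rounding, so a loader that accepts a missing norm weight tensor and takes the `weight is None` branch would be enough to run the folded checkpoints.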

GiteaMirror added the feature request label 2026-05-05 03:37:15 -05:00
Reference: github-starred/ollama#72195