[PR #1444] [MERGED] fix(tinytorch): constant tensor silently zeroed after quantize/dequantize roundtrip #8208

Closed
opened 2026-04-27 17:31:33 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/harvard-edge/cs249r_book/pull/1444
Author: @Shashank-Tripathi-07
Created: 4/22/2026
Status: Merged
Merged: 4/22/2026
Merged by: @profvjreddi

Base: dev ← Head: fix/tinytorch-quantize-constant-tensor-zeroed


📝 Commits (1)

  • 20f812f fix(tinytorch): constant tensor quantized to all-zeros, losing original value

📊 Changes

1 file changed (+22 additions, -3 deletions)


📝 tinytorch/src/15_quantization/15_quantization.py (+22 -3)

📄 Description

What this fixes

quantize_int8() in src/15_quantization/15_quantization.py has a special case for constant tensors (all elements equal, so max == min). The guard exists to avoid division by zero when computing scale. It sets scale=1.0 and zero_point=0, then returns an all-zeros INT8 tensor.
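For context, the pre-fix special case looked roughly like this (a minimal sketch reconstructed from this description; the variable names and surrounding function body are assumptions, not the repository code verbatim):

# Sketch of the pre-fix constant-tensor guard inside quantize_int8() (illustrative names)
min_val, max_val = tensor.data.min(), tensor.data.max()
if max_val == min_val:
    scale = 1.0        # avoids division by zero when computing (max_val - min_val) / 255
    zero_point = 0     # bug: the constant value is never encoded anywhere
    quantized = np.zeros_like(tensor.data, dtype=np.int8)
    return Tensor(quantized), scale, zero_point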

On dequantization the formula is (quantized - zero_point) * scale. With quantized=0, zero_point=0, scale=1.0 that gives 0.0 for every element -- the original constant value is gone.

# Before fix
constant_tensor = Tensor([[5.0, 5.0]])
q, scale, zp = quantize_int8(constant_tensor)  # scale=1.0, zp=0, q=[[0,0]]
restored = (q.data - zp) * scale               # [[0.0, 0.0]]  -- wrong, should be 5.0

Any weight tensor that happens to be uniform (e.g. a bias layer initialised to a constant, or a frozen embedding) is silently zeroed out after quantization. No error is raised.

Root cause

The contributor correctly avoided the division-by-zero when max == min, but forgot that zero_point must encode the constant so dequantization can recover it. The invariant is:

(0 - zero_point) * scale == original_constant
=> zero_point = -original_constant / scale

With scale=1.0 this simplifies to zero_point = round(-min_val), clamped to [-128, 127].
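Worked through for the constant 5.0 from the example above (plain arithmetic, nothing repository-specific):

# Worked example for a constant of 5.0
zero_point = int(np.clip(np.round(-5.0), -128, 127))  # -> -5, no clamping needed
restored = (0 - zero_point) * 1.0                      # -> 5.0, the constant is recovered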

Fix

# After fix
zero_point = int(np.clip(np.round(-min_val), INT8_MIN_VALUE, INT8_MAX_VALUE))

Dequantization now correctly recovers the constant:

(0 - zero_point) * 1.0 == min_val  # holds exactly for integer constants in [-127, 128]; other values are recovered up to rounding and clamping
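For symmetry with the before-fix snippet, the same roundtrip after the fix (expected values are inferred from this description rather than captured output):

# After fix: roundtrip on the earlier example
constant_tensor = Tensor([[5.0, 5.0]])
q, scale, zp = quantize_int8(constant_tensor)  # scale=1.0, zp=-5, q=[[0,0]]
restored = (q.data - zp) * scale               # [[5.0, 5.0]] -- constant preserved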

Test gap closed

The existing unit test only asserted scale_const == 1.0 and never verified the roundtrip, which is why the bug survived undetected. The updated test explicitly dequantizes and asserts value recovery for both positive and negative constants.
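A roundtrip assertion along those lines might look like the following sketch (the actual test name and structure in the repository may differ):

# Sketch of a constant-tensor roundtrip test (illustrative, not the merged test verbatim)
def test_quantize_int8_constant_roundtrip():
    for value in (5.0, -3.0):
        t = Tensor([[value, value]])
        q, scale, zp = quantize_int8(t)
        restored = (q.data - zp) * scale
        assert np.allclose(restored, t.data)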


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-27 17:31:33 -05:00
Reference: github-starred/cs249r_book#8208