mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-21 13:31:55 -05:00
[PR #1784] fix(mlsysim): correct B200/NVL72 TFLOP constants (2x error in FP16/FP8/INT4/FP4) #15735
📋 Pull Request Information
Original PR: https://github.com/harvard-edge/cs249r_book/pull/1784
Author: @Shashank-Tripathi-07
Created: 5/18/2026
Status: 🔄 Open
Base: dev ← Head: fix/mlsysim-formula-audit
📝 Commits (3)
fbb3f3d fix(site): add nav-footer and dropdown-menu dark mode selectors
0b1208c fix(kits): correct hardware specs for Raspberry Pi and Nicla Vision
43ddd99 fix(mlsysim): correct B200 and GB200 NVL72 TFLOP constants by 2x
📊 Changes
6 files changed (+50 additions, -15 deletions)
📝 kits/contents/platforms.qmd (+1 -1)
📝 kits/contents/raspi/raspi.qmd (+3 -2)
📝 kits/contents/raspi/setup/setup.qmd (+1 -1)
📝 mlsysim/mlsysim/core/constants.py (+7 -7)
📝 shared/config/footer-site.yml (+1 -1)
📝 shared/styles/_site-dark.scss (+37 -3)
📄 Description
Summary
B200_FLOPS_FP16_TENSOR was stored as 2250 TFLOPs, half the correct value. The NVIDIA Blackwell datasheet lists 4500 TFLOPs as the dense FP16/BF16 value (9000 with sparsity).
Changes
B200_FLOPS_FP16_TENSOR
B200_FLOPS_FP16_SPARSE
B200_FLOPS_FP8_TENSOR
B200_FLOPS_INT4
NVL72_FLOPS_FP16_TENSOR
NVL72_FLOPS_FP8_TENSOR
NVL72_FLOPS_FP4_TENSOR
Impact
The B200 hardware registry node uses B200_FLOPS_FP16_TENSOR as its peak_flops. Any roofline simulation, throughput estimate, or TCO calculation using Hardware.B200 or Hardware.NVL72 was underestimating peak compute by 2x.
The other key GPU constants (H100, A100, V100, H200) were verified correct against their datasheets.
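To illustrate how a halved peak_flops propagates, here is a minimal roofline sketch. It does not use mlsysim's API: the function name, the constant names, and the 8 TB/s memory bandwidth are assumptions chosen for illustration, and the 2250/4500 TFLOPs figures are the before/after values stated in this PR. The last lines reproduce the NVL72 figure as 72 GPUs times the per-GPU value.

```python
# Illustrative sketch only; these names are hypothetical, not mlsysim's API.
OLD_B200_FP16_TFLOPS = 2250.0  # value stored before this PR
NEW_B200_FP16_TFLOPS = 4500.0  # value proposed by this PR
ASSUMED_HBM_TBPS = 8.0         # assumed memory bandwidth in TB/s (illustration only)
NVL72_GPU_COUNT = 72           # GPUs in one NVL72 rack


def attainable_tflops(peak_tflops, bandwidth_tbps, intensity_flop_per_byte):
    """Roofline ceiling: min(peak compute, bandwidth * arithmetic intensity)."""
    # TB/s * FLOP/byte = TFLOP/s, so both terms share the same unit.
    memory_bound_tflops = bandwidth_tbps * intensity_flop_per_byte
    return min(peak_tflops, memory_bound_tflops)


if __name__ == "__main__":
    # One memory-bound and one compute-bound arithmetic intensity (FLOP/byte).
    for intensity in (100.0, 1000.0):
        old = attainable_tflops(OLD_B200_FP16_TFLOPS, ASSUMED_HBM_TBPS, intensity)
        new = attainable_tflops(NEW_B200_FP16_TFLOPS, ASSUMED_HBM_TBPS, intensity)
        print(f"intensity={intensity:g} old={old:g} new={new:g} ratio={new / old:g}")

    # Rack-level aggregate, converted from TFLOPs to PFLOPs.
    nvl72_pflops = NVL72_GPU_COUNT * NEW_B200_FP16_TFLOPS / 1000.0
    print(f"NVL72 FP16 aggregate: {nvl72_pflops:g} PFLOPs")
```

Under these assumptions the memory-bound case gives the same result with either constant, while the compute-bound case differs by exactly 2x, so the size of the error in any given simulation depends on which regime the workload falls in.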
Test plan
cd mlsysim && python -m pytest tests/ -q: all 424 tests pass (verified locally)
Hardware.B200.peak_flops returns 4500 TFLOPs
Hardware.NVL72.peak_flops returns 324 PFLOPs
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.