- Fix goodput formula: use a steady-state overhead model (checkpoint/interval +
recovery/MTBF) instead of the prob_fail-based formula, which approaches zero
at scale
- Fix speculative decode: draft model uses its own KV cache, not the target's
- Clarify hierarchical AllReduce: document NCCL reduce-scatter design choice
- Add docs/laws-explained.md: plain-English explanation of all 22 walls + Iron Law
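
For reference, the steady-state overhead model named in the goodput bullet can be sketched roughly as follows. This is a minimal illustration rather than the code in this change: the function name, the parameter names, the clamp at zero, and the example numbers are all assumptions, and only the two overhead terms (checkpoint/interval and recovery/MTBF) come from the bullet itself.

```python
def steady_state_goodput(checkpoint_s: float, interval_s: float,
                         recovery_s: float, mtbf_s: float) -> float:
    """Fraction of wall-clock time spent on useful work (hypothetical helper).

    Overhead is modeled as two steady-state terms:
      checkpoint_s / interval_s  -> share of time spent writing checkpoints
      recovery_s / mtbf_s        -> share of time spent recovering from failures
    """
    if interval_s <= 0 or mtbf_s <= 0:
        raise ValueError("interval_s and mtbf_s must be positive")
    overhead = checkpoint_s / interval_s + recovery_s / mtbf_s
    # Clamp at zero (an assumption): overhead above 100% means no useful work.
    return max(0.0, 1.0 - overhead)


# Illustrative numbers only: 60 s checkpoints every 30 min,
# 10 min recovery per failure, cluster-level MTBF of 4 h.
goodput = steady_state_goodput(60, 1800, 600, 4 * 3600)
print(f"{goodput:.3f}")  # prints 0.925
```

In this form goodput degrades in proportion to 1/MTBF as the cluster grows, rather than collapsing toward zero the way the prob_fail-based formula did.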