Refines the explanation of K,V computation savings in the memoization module,
quantifying redundant computations and highlighting the efficiency gain.
The paper and module now specify that generating 100 tokens without caching
requires 5,050 total K,V computations (the 100th triangular number), while
only 100 are necessary, leaving 4,950 redundant calculations.
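The arithmetic behind these figures can be sketched as follows (a standalone illustration with hypothetical function names, not the module's actual API):

```python
def kv_computations(num_tokens: int) -> tuple[int, int]:
    """Count K,V computations with and without a KV cache.

    Without caching, generation step t recomputes K,V for all t tokens
    seen so far, giving 1 + 2 + ... + n = n(n+1)/2 total computations.
    With caching, each token's K,V is computed exactly once.
    """
    without_cache = sum(t for t in range(1, num_tokens + 1))  # n(n+1)/2
    with_cache = num_tokens
    return without_cache, with_cache

total, necessary = kv_computations(100)
print(total, necessary, total - necessary)  # 5050 100 4950
```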
Add a new subsection describing the function decomposition pattern used within
modules. Documents how complex operations (attention, convolution, training)
are split into focused helper functions with individual unit tests before
composition into exported functions. Updates pedagogical justification to
cover both inter-module and intra-module progressive disclosure.
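As a sketch of the decomposition pattern (hypothetical helper names, not the modules' actual functions): a complex operation is split into small, individually testable helpers, then composed into the exported function.

```python
import math

# Focused helpers, each small enough for a targeted unit test.
def scaled_scores(q, keys):
    """Dot-product scores for one query against each key, scaled by sqrt(d)."""
    d = len(q)
    return [sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d) for key in keys]

def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Exported operation is a thin composition of the tested helpers.
def attention(q, keys, values):
    weights = softmax(scaled_scores(q, keys))
    d = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(d)]
```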
Escape unescaped & characters in references.bib (Taylor & Francis,
AI & Machine-Learning) and replace Unicode em-dashes (U+2014) with
LaTeX --- ligatures in paper.tex for T1 font compatibility.
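The two substitutions amount to a mechanical cleanup; a minimal sketch (a standalone helper, not the repo's actual tooling):

```python
import re

def sanitize_bib(text: str) -> str:
    r"""Escape bare & characters as \&, leaving already-escaped \& alone."""
    return re.sub(r"(?<!\\)&", r"\\&", text)

def sanitize_tex(text: str) -> str:
    """Replace Unicode em-dashes (U+2014) with the LaTeX --- ligature."""
    return text.replace("\u2014", "---")
```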
- Name the problem: "algorithm-systems divide"
- Name the approach: "implementation-based systems pedagogy"
- Add concrete systems examples (O(N^2) attention, Adam's 3x memory overhead)
- Include MLPerf-style benchmarking in milestones
- Strengthen citable terminology throughout
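The Adam figure comes from the optimizer keeping two extra tensors per parameter (first- and second-moment estimates) alongside the weights; a back-of-envelope sketch with illustrative numbers:

```python
def adam_memory_bytes(num_params: int, bytes_per_elem: int = 4) -> dict:
    """Estimate fp32 memory for weights vs. Adam optimizer state."""
    weights = num_params * bytes_per_elem
    m = num_params * bytes_per_elem  # first-moment estimate
    v = num_params * bytes_per_elem  # second-moment estimate
    return {"weights": weights, "adam_state": m + v, "total": weights + m + v}

stats = adam_memory_bytes(125_000_000)  # e.g. a ~125M-parameter model
print(stats["total"] / stats["weights"])  # 3.0: Adam triples parameter memory
```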
- Switch from fontspec/TeX Gyre to standard fonts (mathpazo, helvet, courier)
- Replace emoji package with no-op (title is just "TinyTorch")
- Switch from biblatex/biber to natbib/bibtex
- Works with both lualatex (local) and pdflatex (arXiv)
- Acknowledge CS249r students whose feedback shaped the curriculum
- Acknowledge global mlsysbook.ai community
- Expand GenAI statement to cover framework development
- Frame AI assistance as enabling democratization (single author, 20 modules)
- Remove em-dashes throughout for cleaner prose
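The font and bibliography switch might look like the following preamble sketch (illustrative, not the paper's exact preamble):

```latex
% Standard Type 1 fonts: work under both pdflatex (arXiv) and lualatex.
\usepackage{mathpazo}       % Palatino text and math
\usepackage{helvet}         % Helvetica sans-serif
\usepackage{courier}        % Courier monospace
\usepackage[T1]{fontenc}

% No-op replacement for the emoji package.
\newcommand{\emoji}[1]{}

% natbib/bibtex instead of biblatex/biber.
\usepackage{natbib}
\bibliographystyle{plainnat}
\bibliography{references}
```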
Reverts arXiv-specific changes to work with compile_paper.sh:
- Restored fontspec with TeX Gyre fonts
- Restored emoji package
- Restored biblatex with biber backend
- Works with lualatex as expected by compile script
- Replace fontspec/custom fonts with standard LaTeX fonts (mathpazo, helvet, courier)
- Remove emoji package, define empty command for \emoji
- Switch from biblatex/biber to natbib/bibtex for better arXiv support
- Change \printbibliography to \bibliography{references}
- Remove "Tensors to Systems" from subtitle for cleaner title
- Fix @article entries that should be @inproceedings (kannan2022astrasim,
micikevicius2018mixed, strubell2019energy, vaswani2017attention)
- Remove duplicate booktitle field from williams2009roofline
- Standardize year fields across entries
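For example, a conference paper mis-typed as @article would be corrected along these lines (fields abbreviated for illustration):

```bibtex
% Before: @article with a journal field for a NeurIPS paper.
% After: correct entry type with a booktitle field.
@inproceedings{vaswani2017attention,
  title     = {Attention Is All You Need},
  author    = {Vaswani, Ashish and others},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2017}
}
```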
- New title: "TinyTorch: Building Machine Learning Systems from First
Principles: Tensors to Systems"
- Add strategic mentions of AI engineering as an emerging discipline
- Update competency matrix caption to reference AI engineer competencies
- Update conclusion to position TinyTorch as training AI engineers
- Add SEI AI Engineering workshop report reference
- Remove fabricated claim about quantum ML and robotics community forks
that don't actually exist
- Remove "quantum ML" from conclusion's list of future fork variants
- Change "validated through pilot implementations" to "designed for
diverse institutional contexts" since validation is planned future work
- Replace MNIST/ImageNet full dataset calculation with the actual Module 1 exercise
- New text: "Students calculate memory footprints, discovering that a single batch of 32 ImageNet images requires 19 MB, while the full dataset exceeds 670 GB."
- Aligns paper claims with the codebase (01_tensor.py Q1)
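The quoted numbers follow from standard ImageNet assumptions (224x224x3 float32 images, ~1.28M training images); a sketch of the calculation:

```python
def image_batch_bytes(batch, h=224, w=224, c=3, bytes_per_px=4):
    """Memory for a batch of float32 images at ImageNet resolution."""
    return batch * h * w * c * bytes_per_px

batch_mb = image_batch_bytes(32) / 1e6           # one batch of 32 images
dataset_gb = image_batch_bytes(1_281_167) / 1e9  # full ImageNet-1k train set
print(f"{batch_mb:.1f} MB per batch, {dataset_gb:.0f} GB total")
```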
- Rephrase 'optimization opportunities' sentence to avoid dangling word
- Adjust 'roles industry desperately needs' to 'roles that the industry desperately needs'
- Change 'over dense layers' to 'compared to dense layers' to improve line break
- Reorder 'construct mental models gradually' to 'gradually construct mental models'
- Change 'self-paced professional development' to 'independent professional development'
- Add single node focus framing throughout paper
- Update ML Systems Competency Matrix caption to clarify single node scope
- Strengthen distributed systems discussion in Curriculum Evolution
- Remove all em-dashes (25 total), replacing them with colons, commas, or periods
- Switch to lining numbers for technical content
- Add widow/orphan penalties to reduce dangling lines
- Fix single-item bullet list (Intentional Gap section)
- Update author limit to 50 before "et al." truncation
- Fix tinytorch package references to match actual implementation
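The widow/orphan change above corresponds roughly to preamble settings like these (illustrative values):

```latex
% Heavy penalties discourage widows and orphans (dangling lines).
\widowpenalty=10000
\clubpenalty=10000
```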
Trim detailed phase-by-phase validation plan to a concise summary.
Removes specific dates, sample sizes, and instrument names that would
age poorly. Keeps the Open Science Commitment and general validation
approach. Also removes two orphaned references (paas1992training,
sorva2012visual) that were only cited in the removed text.
- Switch from natbib to biblatex for better author truncation control
- Fix package structure references (tinytorch.nn.conv → tinytorch.core.spatial)
- Fix import examples to use actual tinytorch API patterns
- Fix class references (Transformer → GPT, Attention → MultiHeadAttention)
- Correct Adam coefficient from 0.001 to 0.01
- Fix 11 bibliography entries with wrong/corrupted data:
- abelson1996sicp, bruner1960process, hotz2023tinygrad
- tanenbaum1987minix, perkins1992transfer, papert1980mindstorms
- vygotsky1978mind, blank2019nbgrader, roberthalf2024talent
- keller2025ai, pytorch04release, tensorflow20
- Fix organization author names using double braces
- Configure maxbibnames=10 for "et al." truncation in bibliography
All 60 references verified via web search for arXiv submission.
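The biblatex setup and the double-brace fix might look like this (illustrative fragment):

```latex
% Bibliography truncates author lists to "et al." after 10 names.
\usepackage[backend=biber, maxbibnames=10]{biblatex}
\addbibresource{references.bib}
% In references.bib, double braces mark corporate authors as literal names,
% e.g. author = {{Robert Half International}}
```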
Adds 'Building systems creates irreversible understanding' to the
paper's conclusion section, reinforcing the pedagogical thesis with
concrete examples: once you implement autograd, you cannot unsee
the computational graph; once you profile memory, you cannot unknow
the costs.
- Increase main title size (26pt → 28pt) for more impact
- Tighten subtitle line spacing
- Add proper line height to author block
- Create visual hierarchy: name > affiliation > URL
- Make URL more subtle (smaller, lighter gray)
- Close narrative loop: tie Conclusion back to Bitter Lesson framing
- Clarify that students implement enable_autograd() themselves
- Fix terminology: use 'Progressive Disclosure' consistently (not 'Enhancement')
- Fix citation: use mlsysbook2025 consistently for textbook reference
- Rewrite progressive disclosure section (Section 4) to accurately
describe how Module 01 Tensor is clean and Module 06 adds gradient
features via monkey-patching (not dormant features from start)
- Update code listings to match actual implementation
- Update figure from dormant-active to foundation-enhanced
- Remove TA_GUIDE.md references (file does not exist)
- Fix export directive count from 13 modules to all 20 modules
- Update GitHub repo URL to monorepo path (cs249r_book/tinytorch)
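The monkey-patching pattern described above can be sketched as follows (simplified hypothetical classes, not TinyTorch's actual code):

```python
# Module 01: a clean Tensor with no gradient machinery.
class Tensor:
    def __init__(self, data):
        self.data = data

    def __add__(self, other):
        return Tensor(self.data + other.data)

# Module 06: enable_autograd() retrofits gradient features onto the
# existing class by patching its methods (progressive disclosure).
def enable_autograd():
    original_add = Tensor.__add__

    def add_with_grad(self, other):
        out = original_add(self, other)
        out.requires_grad = (getattr(self, "requires_grad", False)
                             or getattr(other, "requires_grad", False))
        out._parents = (self, other)  # record the computational-graph edge
        return out

    Tensor.__add__ = add_with_grad
```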
- Rename milestone directory from 01_1957_perceptron to 01_1958_perceptron
- Update all references to use 1958 (publication year) for consistency
with academic citation format (rosenblatt1958perceptron)
- Changes affect: READMEs, docs, tests, milestone tracker
Rationale: Using 1958 aligns with the publication year and standard
academic citations, while 1957 was the development year.
Cherry-picked from: 28ca41582 (feature/tito-dev-validate)
Add MIT's xv6 teaching OS to the Related Work section alongside
Nachos and Pintos. Its x86-to-RISC-V transition exemplifies the
strip-to-essentials philosophy that TinyTorch follows.
Also link to the research paper from the preface for readers
interested in the pedagogical foundations.
Refactors the module name from "Spatial" to "Convolutions" to better reflect its focus on convolutional neural networks.
This change ensures consistency and clarity across the codebase, documentation, and examples.
- Change mermaid diagram from LR to TB (top-down) layout
- Add module numbers to node labels (01: Tensor, 02: Activations, etc.)
- Color nodes by tier: blue (foundation), purple (architecture), orange (optimization)
- Add zoom:1.5 for larger display on website