Add generate_team.py script that reads .all-contributorsrc and generates
the team.md page automatically. This keeps the website team page in sync
with the README contributors.
- Add tinytorch/site/scripts/generate_team.py
- Update both deploy workflows to run the script before building
- Contributors added via @all-contributors now appear on the website
The script runs during site build, so team.md stays fresh with each deploy.
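A minimal sketch of what such a generator might look like, assuming the standard all-contributors JSON layout (a top-level "contributors" list of objects with login, name, avatar_url, and contributions); the actual script's output format may differ:

```python
import json
from pathlib import Path

def generate_team_md(rc_path: str) -> str:
    """Render a team.md page from an .all-contributorsrc file.

    Assumes the standard all-contributors layout: a top-level
    "contributors" list of {login, name, avatar_url, contributions}.
    """
    data = json.loads(Path(rc_path).read_text())
    lines = ["# Team", ""]
    for person in data.get("contributors", []):
        roles = ", ".join(person.get("contributions", []))
        lines.append(
            f"- [{person['name']}](https://github.com/{person['login']}) ({roles})"
        )
    return "\n".join(lines) + "\n"
```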
The formula sqrt(1/fan_in) is actually LeCun initialization (1998),
not Xavier/Glorot. True Xavier uses sqrt(2/(fan_in+fan_out)).
- Rename XAVIER_SCALE_FACTOR → INIT_SCALE_FACTOR
- Update all comments to say "LeCun-style initialization"
- Add note explaining difference between LeCun, Xavier, and He init
- Keep the simpler formula for pedagogical clarity
Fixes #1161
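For reference, the three initialization scales side by side (standard deviations of the weights; function names here are illustrative, not the project's):

```python
import math

def lecun_scale(fan_in: int) -> float:
    return math.sqrt(1.0 / fan_in)               # LeCun (1998)

def xavier_scale(fan_in: int, fan_out: int) -> float:
    return math.sqrt(2.0 / (fan_in + fan_out))   # Xavier/Glorot (2010)

def he_scale(fan_in: int) -> float:
    return math.sqrt(2.0 / fan_in)               # He (2015), for ReLU nets
```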
AndreaMattiaGaravagn (truncated) and AndreaMattiaGaravagno were listed
as separate contributors. Merged into a single entry with the correct
username, avatar, and combined contributions (code + doc).
Students hit "No module named 'tinytorch.core.tensor'" in notebooks because
the Jupyter kernel used a different Python than where tinytorch was installed.
- setup: install ipykernel + nbdev, register named kernel during tito setup
- health: add Notebook Readiness checks (import, kernel, Python match)
- export: verify exported file exists and has content (fail loudly)
- Windows: add get_venv_bin_dir() helper for cross-platform venv paths
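A minimal sketch of the kind of readiness check involved (function name and messages hypothetical; the real health command covers more cases):

```python
import importlib.util
import sys

def check_notebook_readiness(package: str = "tinytorch") -> list:
    """Return a list of problems that would break notebook imports."""
    problems = []
    # 1. Can this interpreter import the package at all?
    if importlib.util.find_spec(package) is None:
        problems.append(
            f"'{package}' is not importable from {sys.executable}; "
            "the notebook kernel may point at a different Python."
        )
    # 2. Is the interpreter running inside a virtual environment?
    if sys.prefix == sys.base_prefix:
        problems.append("Not running inside a virtual environment.")
    return problems
```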
On PR events, github.ref_name resolves to the merge ref (e.g.
"1159/merge") which doesn't exist on raw.githubusercontent.com,
causing a 404. Use github.head_ref (the actual source branch)
for PRs, falling back to ref_name for push events.
Also adds -f flag to curl so HTTP errors fail immediately with
a clear message instead of silently saving the 404 HTML page.
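The fallback logic mirrors the workflow expression `${{ github.head_ref || github.ref_name }}`; a small Python model of it (the dict shape here is illustrative, not the Actions context):

```python
def resolve_branch(event: dict) -> str:
    """On pull_request events, head_ref is the real source branch;
    ref_name is the synthetic merge ref (e.g. "1159/merge") that
    404s on raw.githubusercontent.com. On push events head_ref is
    empty, so fall back to ref_name."""
    return event.get("head_ref") or event["ref_name"]
```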
Escape unescaped & characters in references.bib (Taylor & Francis,
AI & Machine-Learning) and replace Unicode em-dashes (U+2014) with
LaTeX --- ligatures in paper.tex for T1 font compatibility.
The fresh install test script used / as the sed delimiter when
substituting the branch name, which breaks on any branch containing /
(e.g. feature/foo, fix/bar, or GitHub merge refs like 1156/merge).
Show the exact definition, tanh approximation, and sigmoid approximation
side by side so students understand where 1.702 comes from and why we
chose the sigmoid form. Avoids erf notation in favor of plain-language
description of Φ(x) appropriate for Module 2 students.
Related to harvard-edge/cs249r_book#1154
The hint claimed 1.702 comes from √(2/π) ≈ 0.798, which is incorrect.
The 1.702 constant is empirically fitted so that sigmoid(1.702x) ≈ Φ(x),
the Gaussian CDF. The √(2/π) constant appears in the separate tanh-based
GELU approximation, not the sigmoid approximation used here.
Fixes harvard-edge/cs249r_book#1154
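The three forms side by side, as a sketch (the tanh variant is where √(2/π) ≈ 0.798 actually appears; 1.702 belongs only to the sigmoid form):

```python
import math

def gelu_exact(x: float) -> float:
    # Exact definition: x * Phi(x), Phi = standard Gaussian CDF.
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    # Tanh approximation: this is where sqrt(2/pi) appears.
    c = math.sqrt(2.0 / math.pi)
    return 0.5 * x * (1.0 + math.tanh(c * (x + 0.044715 * x ** 3)))

def gelu_sigmoid(x: float) -> float:
    # Sigmoid approximation: 1.702 is fitted so sigmoid(1.702x) ~ Phi(x).
    return x / (1.0 + math.exp(-1.702 * x))
```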
- Clarify that attention time complexity is O(n²×d), not O(n²), since each
of the n² query-key pairs requires a d-dimensional dot product
- Fix Total Memory column in analyze_attention_memory_overhead() which was
duplicating the Optimizer column instead of summing all components
- Update KEY INSIGHT multiplier from 4x to 7x to match corrected total
Fixes harvard-edge/cs249r_book#1150
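A back-of-envelope sketch of the corrected accounting (fp32, batch size 1; not the module's actual function):

```python
def attention_costs(n: int, d: int, bytes_per_float: int = 4) -> dict:
    """Rough time and memory for one self-attention layer.

    Time: each of the n*n query-key pairs needs a d-dimensional dot
    product, so scores cost O(n^2 * d), not O(n^2). Memory counts the
    n x n attention matrix plus the n x d activations (Q, K, V, output).
    """
    score_flops = 2 * n * n * d      # QK^T: one multiply-add per element
    weighted_flops = 2 * n * n * d   # attention_weights @ V
    attn_matrix_bytes = n * n * bytes_per_float
    qkv_out_bytes = 4 * n * d * bytes_per_float
    return {
        "flops": score_flops + weighted_flops,
        "memory_bytes": attn_matrix_bytes + qkv_out_bytes,
    }
```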
- Add new Team page featuring core staff and contributors
- Vijay Janapa Reddi (Nerdy Professor), Andrea Garavagno (Tech Lead),
Kari Janapareddi (Chief of Staff), Kai Kleinbard (Web Wizard)
- Display community staff side-by-side with fun role titles
- Move Acknowledgments to separate page for MiniTorch/micrograd credits
- Add contributor generator script for future automation
- Update navigation: Team → Ecosystem → Acknowledgments
Replace the slide download card with an inline PDF viewer using PDF.js:
- Change grid layout from 4 cards (2x2) to 3 cards in a row
- Add embedded slide viewer with navigation, zoom, and fullscreen
- Load slides from local _static/slides/ for reliable CORS handling
- Add "· AI-generated" subtitle to match audio card pattern
- Use 🔥 icon consistently across all viewers
Affected: 20 module ABOUT.md files + big-picture.md
Changed Paper link from local PDF download to arXiv external link:
- URL: _static/downloads/TinyTorch-Paper.pdf → arxiv.org/abs/2601.19107
- Icon: ↓ (download) → ↗ (external link)
- Added target=_blank to open in new tab
The nbdev_export CLI was not reading settings.ini correctly in CI,
causing exports to silently fail. Using nbdev.export.nb_export()
Python API directly with explicit lib_path ensures exports work
reliably regardless of environment.
The nbdev_export command wasn't in PATH in CI. Using sys.executable
with -m nbdev.export ensures we use the same Python environment that's
running tito, which is more reliable across different environments.
Fixed in:
- tito/commands/dev/export.py (both _export_all_modules and _run_nbdev_export)
- tito/commands/module/workflow.py (_export_notebook)
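A sketch of the two pieces, the interpreter-pinned invocation and the loud verification (helper names hypothetical; argument handling of `-m nbdev.export` is assumed from the commit, not verified against nbdev's CLI):

```python
import sys
from pathlib import Path

def nbdev_export_cmd(notebook: str) -> list:
    # Pin the export to the interpreter running tito, not whatever
    # nbdev_export happens to be on PATH.
    return [sys.executable, "-m", "nbdev.export", notebook]

def verify_export(path: str) -> Path:
    # Fail loudly if the exported module is missing or empty,
    # instead of letting the export silently produce nothing.
    p = Path(path)
    if not p.exists() or p.stat().st_size == 0:
        raise RuntimeError(f"nbdev export failed for {p}")
    return p
```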
The directory check was looking for tinytorch/tinytorch/__init__.py
when running from the tinytorch directory (as CI does). Fixed to check
for either repo root or tinytorch project directory structure.
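A sketch of the relaxed check (function name illustrative), accepting both layouts:

```python
from pathlib import Path

def is_valid_working_dir(start: Path) -> bool:
    """Accept either the repo root (contains tinytorch/tinytorch/
    __init__.py) or the tinytorch project directory itself (contains
    tinytorch/__init__.py), which is where CI runs from."""
    return (
        (start / "tinytorch" / "tinytorch" / "__init__.py").exists()
        or (start / "tinytorch" / "__init__.py").exists()
    )
```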
Fixed NameError in step_1_profile and step_5_accelerate by adding
Tensor as a parameter. These functions create sample tensors for
profiling/testing but the Tensor class is imported in main(), so
they need it passed as an argument.
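The pattern in miniature (stand-in class; tinytorch's real Tensor is not imported here):

```python
def _make_tensor_cls():
    # Stand-in for `from tinytorch.core.tensor import Tensor`.
    class Tensor:
        def __init__(self, data):
            self.data = data
    return Tensor

def step_1_profile(Tensor):
    # Tensor arrives as a parameter; referencing a global Tensor here
    # would raise NameError, since the import lives inside main().
    sample = Tensor([1.0, 2.0, 3.0])
    return len(sample.data)

def main():
    Tensor = _make_tensor_cls()   # import happens only inside main()
    return step_1_profile(Tensor)
```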
Each challenge function now has visual documentation showing:
- The task input/output example
- The ideal attention weight matrix pattern
- Why the pattern is required for the task
Challenge 3 also explains why a fresh model is needed
(sequential training causes "catastrophic forgetting").
Break the monolithic main() into clean, documented functions:
- CONFIG dict for shared hyperparameters
- build_model() for creating fresh model/optimizer/loss
- challenge_1_reversal() - anti-diagonal attention patterns
- challenge_2_copying() - diagonal attention patterns
- challenge_3_mixed() - prefix-conditioned behavior (fresh model)
- print_final_results() - summary table and messages
This makes the code much easier for students to understand
and clearly shows why challenge 3 needs a fresh model.
The transformer was being trained sequentially on reversal then copying,
which caused it to "forget" reversal before the mixed task. Now we
reinitialize the model before challenge 3 so it learns both tasks
together with proper prefix conditioning.
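A toy sketch of the build_model() pattern (internals hypothetical; the real function returns a transformer, optimizer, and loss): the point is that challenge 3 calls it again rather than reusing trained weights.

```python
import random

CONFIG = {"seed": 0, "hidden_dim": 16}  # shared hyperparameters

def build_model():
    # Returns a fresh model; reinitializing here is what keeps
    # challenge 3 from inheriting weights trained sequentially on
    # challenges 1 and 2 (catastrophic forgetting).
    rng = random.Random(CONFIG["seed"])
    weights = [rng.gauss(0.0, 1.0) for _ in range(CONFIG["hidden_dim"])]
    return {"weights": weights, "step": 0}

def challenge_3_mixed():
    model = build_model()  # fresh model: no carryover from earlier tasks
    return model
```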