Commit Graph

1111 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
56419ea4c2 Standardize milestone naming with numbered sequence and historical anchors
Applied consistent naming pattern: 0X_[figure]_[task].py

M01 (1957 Perceptron):
- forward_pass.py → 01_rosenblatt_forward.py
- perceptron_trained.py → 02_rosenblatt_trained.py

M02 (1969 XOR):
- xor_crisis.py → 01_xor_crisis.py
- xor_solved.py → 02_xor_solved.py

M03 (1986 MLP):
- mlp_digits.py → 01_rumelhart_tinydigits.py
- mlp_mnist.py → 02_rumelhart_mnist.py

M04 (1998 CNN):
- cnn_digits.py → 01_lecun_tinydigits.py
- lecun_cifar10.py → 02_lecun_cifar10.py

M05 (2017 Transformer):
- vaswani_chatgpt.py → 01_vaswani_generation.py
- vaswani_copilot.py → 02_vaswani_dialogue.py
- profile_kv_cache.py → 03_vaswani_profile.py

Benefits:
- Clear execution order (01, 02, 03)
- Historical context (rosenblatt, lecun, vaswani)
- Descriptive purpose (generation, dialogue, profile)
- Consistent structure across all milestones

Updated documentation:
- README.md: Updated all milestone examples
- site/chapters/milestones.md: Updated bash commands
2025-11-11 12:20:36 -05:00
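The naming pattern described in commit 56419ea4c2 can be checked mechanically. The sketch below is one possible validator, not code from the repository: the regular expression and the helper name `follows_pattern` are assumptions derived from the `0X_[figure]_[task].py` description and the file names listed in that commit.

```python
import re

# Assumed reading of the 0X_[figure]_[task].py pattern: a two-digit sequence
# number starting with 0, then two lowercase name parts joined by underscores.
MILESTONE_NAME = re.compile(r"0\d_[a-z0-9]+_[a-z0-9]+\.py")


def follows_pattern(filename: str) -> bool:
    """Return True if filename matches the assumed milestone naming pattern."""
    return MILESTONE_NAME.fullmatch(filename) is not None


# New names taken from the commit message.
renamed = [
    "01_rosenblatt_forward.py",
    "02_rosenblatt_trained.py",
    "01_xor_crisis.py",
    "02_lecun_cifar10.py",
    "03_vaswani_profile.py",
]
# Old names taken from the same commit message.
old = ["forward_pass.py", "mlp_digits.py", "profile_kv_cache.py"]

for name in renamed + old:
    print(f"{name}: {follows_pattern(name)}")
```

A check like this could run in a test to keep future milestone files in the numbered form, assuming the two-part `figure_task` shape holds for every milestone.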
Vijay Janapa Reddi
e456f438e7 Remove redundant review documentation
Removed redundant and superseded review reports:
- Module 15: COMPREHENSIVE_REVIEW_REPORT.md, FINAL_VALIDATION_REPORT.md, REVIEW_SUMMARY.md
- Docs: RESTRUCTURING_VERIFICATION.md, book-development/CLEANUP_SUMMARY.md

Also removed untracked files:
- Module 11: REVIEW_REPORT_FINAL.md (superseded by REVIEW_REPORT.md)
- Module 12: REVIEW_SUMMARY.md (redundant with REVIEW_REPORT.md)
- Module 20: COMPLIANCE_CHECKLIST.md (redundant with REVIEW_REPORT.md)
- Module 6, 8, 14, 18: COMPLIANCE_SUMMARY.md and QUICK_SUMMARY.md files

Retained comprehensive REVIEW_REPORT.md files which contain the most complete QA documentation.
2025-11-11 12:15:36 -05:00
Vijay Janapa Reddi
148326e996 Remove temporary analysis and fix documentation
Removed 31 temporary markdown files that documented completed work:
- Module-specific fix reports (Module 07, 16, 17, 19-20)
- Hasattr audit files (completed audit)
- Module progression review reports (completed)
- Infrastructure analysis reports (completed)
- Renumbering and restructuring summaries (completed)

Retained valuable documentation:
- All REVIEW_REPORT.md files (comprehensive QA documentation)
- All COMPLIANCE_SUMMARY.md files (quick reference)
- COMPREHENSIVE_MODULE_REVIEW_STATUS.md (tracking)
- MODULE_DEPENDENCY_MAP.md and MODULE_PROGRESSION_GUIDE.md (guides)
2025-11-11 12:09:31 -05:00
Vijay Janapa Reddi
9ad2524bf2 Add jupyter-book to site/requirements.txt
- Added jupyter-book>=0.15.0,<1.0.0 dependency for documentation builds
- This dependency is referenced by GitHub Actions workflows
- Required for both HTML and PDF book generation
2025-11-11 11:56:25 -05:00
Vijay Janapa Reddi
f7dcbc8505 Remove temporary analysis files from modules
Cleaned up temporary AI-generated analysis files:
- modules/15_quantization/FIXES_APPLIED.md
- modules/15_quantization/FIXES_TO_APPLY.md
- modules/16_compression/FIXES_REQUIRED.md
- modules/17_memoization/FIXES_APPLIED.md
- Plus other untracked analysis files

These were temporary debugging/review artifacts. Now covered by
.gitignore patterns to prevent future accumulation.
2025-11-10 19:50:43 -05:00
Vijay Janapa Reddi
2da497d727 Update gitignore to exclude temporary analysis files
Added comprehensive patterns to ignore AI-generated temporary reports:
- Module review reports (*_REPORT*.md)
- Analysis summaries (*_SUMMARY.md, *_ANALYSIS.md)
- Fix tracking (*_FIXES*.md, *_CHANGES*.md)
- Verification scripts (VERIFY_*.py)
- Other temporary docs (*_CHECKLIST.md, *_GUIDE.md, etc.)

These files are generated during module reviews and debugging sessions
but are not part of the permanent codebase documentation.
2025-11-10 19:50:26 -05:00
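To see which file names the patterns listed in commit 2da497d727 would catch, the sketch below approximates them with Python's `fnmatch`. This is only an approximation: Git's own ignore matching has additional rules (directory separators, negation, `**`), and ignore rules apply only to untracked files. The helper `matching_patterns` and the file name `VERIFY_exports.py` are hypothetical, and the patterns hidden behind "etc." in the commit message are omitted because they are not known.

```python
from fnmatch import fnmatchcase

# Patterns quoted in the commit message; the "etc." entries are left out.
IGNORE_PATTERNS = [
    "*_REPORT*.md",
    "*_SUMMARY.md",
    "*_ANALYSIS.md",
    "*_FIXES*.md",
    "*_CHANGES*.md",
    "VERIFY_*.py",
    "*_CHECKLIST.md",
    "*_GUIDE.md",
]


def matching_patterns(filename: str) -> list[str]:
    """Return the listed patterns that match a bare file name (no directories)."""
    return [p for p in IGNORE_PATTERNS if fnmatchcase(filename, p)]


# VERIFY_exports.py is a made-up name for illustration.
for name in ["REVIEW_REPORT.md", "QUICK_SUMMARY.md", "VERIFY_exports.py", "README.md"]:
    print(name, "->", matching_patterns(name))
```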
Vijay Janapa Reddi
a14f9fa66a Add module metadata for competition module
Added module.yaml for Module 20 (Competition & Validation):
- Module configuration and learning objectives
- Prerequisites and skill development tracking
- Test coverage and connection documentation

This module brings together all optimization techniques learned
in modules 14-18 for competition preparation.
2025-11-10 19:44:06 -05:00
Vijay Janapa Reddi
832c569cad Add module development files to new structure
Added all module development files to modules/XX_name/ directories:

Module notebooks and scripts:
- 18 modules with .ipynb and .py files (01-20, excluding some gaps)
- Moved from modules/source/ to direct module directories
- Includes tensor, autograd, layers, transformers, optimization modules

Module README files:
- Added README.md for modules with additional documentation
- Complements ABOUT.md files added earlier

This completes the module restructuring:
- Before: modules/source/XX_name/*_dev.{py,ipynb}
- After: modules/XX_name/*_dev.{py,ipynb}

All development happens directly in numbered module directories now.
2025-11-10 19:43:36 -05:00
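The before/after layout in commit 832c569cad amounts to dropping the `source` path component. A minimal sketch of that mapping follows; the function name `new_location` is hypothetical and the repository may have performed the move differently (for example with `git mv`).

```python
from pathlib import PurePosixPath


def new_location(old_path: str) -> str:
    """Map modules/source/XX_name/... to modules/XX_name/...; other paths are unchanged."""
    parts = PurePosixPath(old_path).parts
    if len(parts) >= 2 and parts[0] == "modules" and parts[1] == "source":
        # Drop the "source" component and keep everything else in order.
        parts = (parts[0],) + parts[2:]
    return str(PurePosixPath(*parts))


print(new_location("modules/source/01_tensor/tensor_dev.py"))
print(new_location("tests/README.md"))
```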
Vijay Janapa Reddi
a4e38cb906 Update documentation for site/ migration and restructuring
Documentation updates across the codebase:

Root documentation:
- README.md: Updated references from book/ to site/
- CONTRIBUTING.md: Updated build and workflow instructions
- .shared-ai-rules.md: Updated AI assistant rules for new structure

GitHub configuration:
- Issue templates updated for new module locations
- Workflow references updated from book/ to site/

docs/ updates:
- STUDENT_QUICKSTART.md: New paths and structure
- module-rules.md: Updated module development guidelines
- NBGrader documentation: Updated for module restructuring
- Archive documentation: Updated references

Module documentation:
- modules/17_memoization/README.md: Updated after reordering

All documentation now correctly references:
- site/ instead of book/
- modules/XX_name/ instead of modules/source/
2025-11-10 19:42:48 -05:00
Vijay Janapa Reddi
0af88840b1 Update test suite for module restructuring
Updated test imports and paths after modules/source/ removal:
- Progressive integration tests for modules 03, 06, 08, 13, 14
- Checkpoint integration tests
- Module completion orchestrator
- Optimizer integration tests
- Gradient flow regression tests

Updated test documentation:
- tests/README.md with new module paths
- tests/TEST_STRATEGY.md with restructuring notes

All tests now reference modules/XX_name/ instead of modules/source/.
2025-11-10 19:42:23 -05:00
Vijay Janapa Reddi
41b132f55f Update tinytorch and tito with module exports
Re-exported all modules after restructuring:
- Updated _modidx.py with new module locations
- Removed outdated autogeneration headers
- Updated all core modules (tensor, autograd, layers, etc.)
- Updated optimization modules (quantization, compression, etc.)
- Updated TITO commands for new structure

Changes include:
- 24 tinytorch/ module files
- 24 tito/ command and core files
- Updated references from modules/source/ to modules/

All modules re-exported via nbdev from their new locations.
2025-11-10 19:42:03 -05:00
Vijay Janapa Reddi
9fdfa4317c Remove modules/source/ directory structure
Completed restructuring: modules/source/XX_name/ → modules/XX_name/

All module development files moved to their numbered directories:
- modules/01_tensor/tensor_dev.{py,ipynb}
- modules/02_activations/activations_dev.{py,ipynb}
- ... (modules 03-20)

Removed obsolete source structure:
- modules/source/01_tensor/ through modules/source/20_capstone/
- modules/source/20_competition/ (legacy competition module)
- 43 files total (21 modules × 2 files each + 1 module.yaml)

This simplifies the module structure and makes development files
easier to find alongside their ABOUT.md and README.md files.
2025-11-10 19:41:24 -05:00
Vijay Janapa Reddi
5abd29b7d9 Remove book/ directory and old release documentation
Completed migration from book/ to site/:
- All content moved to site/ structure (committed previously)
- GitHub workflows updated to reference site/
- TITO commands updated to use site/

Removed obsolete documentation:
- DECEMBER_2024_RELEASE.md (outdated release checklist)
- RELEASE_CHECKLIST.md (replaced by milestone-based releases)
- STUDENT_VERSION_TOOLING.md (integrated into docs/)

book/ contained 51 files including:
- Jupyter Book configuration (_config.yml, _toc.yml)
- Static assets (logos, favicons, custom CSS)
- Chapter content (00-20, milestones, etc.)
- Build scripts and requirements

All functionality preserved in site/ directory.
2025-11-10 19:40:50 -05:00
Vijay Janapa Reddi
a5679de141 Update documentation after module reordering
All module references updated to reflect new ordering:
- Module 15: Quantization (was 16)
- Module 16: Compression (was 17)
- Module 17: Memoization (was 15)

Updated by module-developer and website-manager agents:
- Module ABOUT files with correct numbers and prerequisites
- Cross-references and "What's Next" chains
- Website navigation (_toc.yml) and content
- Learning path progression in LEARNING_PATH.md
- Profile milestone completion message (Module 17)

Pedagogical flow now: Profile → Quantize → Prune → Cache → Accelerate
2025-11-10 19:37:41 -05:00
Vijay Janapa Reddi
5f3591a57b Reorder modules for better pedagogical flow
Moved memoization (KV-cache) after compression to align with optimization tier milestones.

Changes:
- Module 15: Quantization (was 16)
- Module 16: Compression (was 17)
- Module 17: Memoization (was 15)

Pedagogical Rationale:
This creates clear alignment with the optimization milestone structure:
  - M06 (Profiling): Module 14
  - M07 (Compression): Modules 15-16 (Quantization + Compression)
  - M08 (Acceleration): Modules 17-18 (Memoization/KV-cache + Acceleration)

Before: Students learned KV-cache before understanding why models are slow
After: Students profile → compress → then optimize with KV-cache

Updated milestone reference in profile_kv_cache.py: Module 15 → Module 17
2025-11-10 19:29:10 -05:00
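The reordering in commit 5f3591a57b is a three-way renumbering, which is easy to get wrong when applied by hand to directory names. The sketch below expresses it as a lookup table; `RENUMBER` and `renumber` are illustrative names, not part of the repository.

```python
# Old module number -> new module number, as listed in the commit message.
RENUMBER = {16: 15, 17: 16, 15: 17}


def renumber(dirname: str) -> str:
    """Apply the reordering to a directory name of the form NN_name."""
    number, _, name = dirname.partition("_")
    old_number = int(number)
    # Modules outside the table keep their number.
    new_number = RENUMBER.get(old_number, old_number)
    return f"{new_number:02d}_{name}"


for old in ["15_memoization", "16_quantization", "17_compression", "14_profiling"]:
    print(old, "->", renumber(old))
```

Using a single table for all three moves avoids the intermediate collision that renaming 16 to 15 would cause if done one step at a time.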
Vijay Janapa Reddi
af12404076 Increase TinyDigits to 1000 samples following Karpathy's philosophy
150 samples proved too small for decent accuracy.
Following Andrej Karpathy's "~1000 samples" educational dataset philosophy.

Results:
- Before (150 samples): 19% test accuracy (too small!)
- After (1000 samples): 79.5% test accuracy (decent!)

Changes:
- Increased training: 150 → 1000 samples (100 per digit class)
- Increased test: 47 → 200 samples (20 per digit class)
- Perfect class balance: 0.00 std deviation
- File size: 51 KB → 310 KB (still tiny for USB stick)
- Training time: ~3-5 sec → ~8-10 sec (still fast)

Updated:
- create_tinydigits.py: Load from sklearn, generate 1K samples
- train.pkl: 258 KB (1000 samples, perfectly balanced)
- test.pkl: 52 KB (200 samples, balanced)
- README.md: Updated all documentation with new sizes
- mlp_digits.py: Updated docstring to reflect 1K dataset

Dataset Philosophy:
"~1000 samples is the sweet spot for educational datasets"
- Small enough: Trains in seconds on CPU
- Large enough: Achieves decent accuracy (~80%)
- Balanced: Perfect stratification across all classes
- Reproducible: Fixed seed=42 for consistency

Still perfect for TinyTorch-on-a-stick vision:
- 310 KB fits on any USB drive
- Works on RasPi0
- No downloads needed
- Offline-first education
2025-11-10 17:20:54 -05:00
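The balanced split described in commit af12404076 (100 training and 20 test samples per digit class, fixed seed 42) can be sketched with the standard library alone. This is not the contents of `create_tinydigits.py`, which the commit says loads from sklearn; the function `balanced_split` and the synthetic label list are assumptions made so the example runs without any dataset.

```python
import random
from collections import Counter, defaultdict


def balanced_split(labels, train_per_class=100, test_per_class=20, seed=42):
    """Pick disjoint, class-balanced train and test index lists from labels."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        if len(indices) < train_per_class + test_per_class:
            raise ValueError(f"class {label} has too few samples")
        rng.shuffle(indices)
        train.extend(indices[:train_per_class])
        test.extend(indices[train_per_class:train_per_class + test_per_class])
    return train, test


# Synthetic labels standing in for a real digits dataset: 10 classes, 150 each.
labels = [digit for digit in range(10) for _ in range(150)]
train_idx, test_idx = balanced_split(labels)
print(len(train_idx), len(test_idx))
print(Counter(labels[i] for i in train_idx))
```

Because every class contributes the same number of indices, the per-class counts have zero standard deviation, which matches the "perfect class balance" claim in the commit message.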
Vijay Janapa Reddi
84568f0bd5 Create TinyDigits educational dataset for self-contained TinyTorch
Replaces sklearn-sourced digits_8x8.npz with TinyTorch-branded dataset.

Changes:
- Created datasets/tinydigits/ (~51KB total)
  - train.pkl: 150 samples (15 per digit class 0-9)
  - test.pkl: 47 samples (balanced across digits)
  - README.md: Full curation documentation
  - LICENSE: BSD 3-Clause with sklearn attribution
  - create_tinydigits.py: Reproducible generation script

- Updated milestones to use TinyDigits:
  - mlp_digits.py: Now loads from datasets/tinydigits/
  - cnn_digits.py: Now loads from datasets/tinydigits/

- Removed old data:
  - datasets/tiny/ (67KB sklearn duplicate)
  - milestones/03_1986_mlp/data/ (67KB old location)

Dataset Strategy:
TinyTorch now ships with only 2 curated datasets:
1. TinyDigits (51KB) - 8x8 digits for MLP/CNN milestones
2. TinyTalks (140KB) - Q&A pairs for transformer milestone

Total: 191KB shipped data (perfect for RasPi0 deployment)

Rationale:
- Self-contained: No downloads, works offline
- Citable: TinyTorch educational infrastructure for white paper
- Portable: Tiny footprint enables edge device deployment
- Fast: <5 sec training enables instant student feedback

Updated .gitignore to allow TinyTorch curated datasets while
still blocking downloaded large datasets.
2025-11-10 16:59:43 -05:00
Vijay Janapa Reddi
0861a49c02 Remove outdated milestone README files
Deleted 5 README/documentation files with stale information:
- 01_1957_perceptron/README.md
- 02_1969_xor/README.md
- 03_1986_mlp/README.md
- 04_1998_cnn/README.md
- 05_2017_transformer/PERFORMANCE_METRICS_DEMO.md

Issues with these files:
- Wrong file names (rosenblatt_perceptron.py, train_mlp.py, train_cnn.py)
- Old paths (examples/datasets/)
- Duplicate content (already in Python file docstrings)
- Could not be kept in sync with code

Documentation now lives exclusively in comprehensive Python docstrings
at the top of each milestone file, ensuring it stays accurate and
students see rich context when running files.
2025-11-10 16:12:26 -05:00
Vijay Janapa Reddi
6973655854 Remove Shakespeare transformer milestone
Deleted vaswani_shakespeare.py and get_shakespeare() from data_manager:
- 45-60 minute training time (too slow for educational demos)
- Required external download from Karpathy's char-rnn repo
- Replaced by faster TinyTalks ChatGPT milestone (3-5 min training)

Primary transformer milestone is now vaswani_chatgpt.py:
- Uses TinyTalks Q&A dataset (already in repo)
- Fast training with clear learning signal (Q&A format)
- Better pedagogical value (students see transformer learn to chat)
2025-11-10 16:12:06 -05:00
Vijay Janapa Reddi
c663b6b86a Update milestone template with simple Rich UI patterns
Replaced dashboard-based template with direct Rich UI examples:
- Removed MilestoneRunner/dashboard imports
- Added simple Rich Console, Panel, Table patterns
- Shows clean milestone structure with educational narrative
- Demonstrates proper separation: ML code vs display code

Template now guides creating self-contained milestones with
comprehensive docstrings instead of relying on external systems.
2025-11-10 16:12:04 -05:00
Vijay Janapa Reddi
0e617b0c2e Remove milestone dashboard system
Removed achievement/gamification system that was unused:
- milestone_dashboard.py (620+ lines, only 1 file used it)
- .milestone_progress.json (progress tracking data)
- perceptron_trained_v2.py (only dashboard user, duplicate of perceptron_trained.py)

Rationale:
- Dashboard was used by only 1 of 15 milestone files
- Milestones are educational stories, not standardized tests
- Achievement badges felt gimmicky for ML systems learning
- Custom Rich UI in each file is clearer and more educational
- Reduces dependencies (removed psutil system monitoring)
2025-11-10 16:12:03 -05:00
Vijay Janapa Reddi
94a7bb3b1b Add milestone dashboard utility
Provides standardized dashboard system for milestone demonstrations with live metrics, progress tracking, and achievement system
2025-11-10 10:38:02 -05:00
Vijay Janapa Reddi
35f8221a62 Clean up gitignore patterns to be more specific
- Remove overly broad patterns (*_ANALYSIS.md, *_AUDIT.md)
- Make report patterns more specific (MODULE_REVIEW_REPORT_*.md)
- Add clear comments explaining why directories are ignored
- Keep dataset ignores (data/, datasets/) as they are downloaded files
2025-11-10 10:24:06 -05:00
Vijay Janapa Reddi
c7f4dbefbd Update gitignore to exclude datasets and temporary reports
Add patterns for data directories and module review reports
2025-11-10 10:19:28 -05:00
Vijay Janapa Reddi
03fe2d1431 Remove AI assistant section from README
Keep README focused on project information for users and developers
2025-11-10 07:46:45 -05:00
Vijay Janapa Reddi
4403040779 Add AI assistant rules reference to main README
Point developers and AI assistants to shared rules file
2025-11-10 07:32:10 -05:00
Vijay Janapa Reddi
05f4974e93 Add shared AI assistant rules for all tools
Create comprehensive guidelines for git commits, code quality, testing, and development workflow that apply to Cursor, Claude, and any other AI assistants
2025-11-10 07:31:18 -05:00
Vijay Janapa Reddi
638c63e418 Fix Module 16 quantization syntax and imports
Fix misplaced triple-quote causing syntax error and add Sequential import
2025-11-10 07:30:40 -05:00
Vijay Janapa Reddi
d73e1e9eed Fix Module 15 memoization: Add optional mask parameter to MockTransformerBlock forward method
2025-11-10 07:26:11 -05:00
Vijay Janapa Reddi
6a409bab19 Fix Module 12 attention: Correct masking logic to use 0 for masked positions instead of negative values
2025-11-10 07:26:09 -05:00
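Commit 6a409bab19 does not show the code it changed, so the following is only one plausible reading of its message: a mask in which 1 means "attend" and 0 means "masked", applied to a row of attention scores before softmax. The function `masked_softmax` and the fill value `-1e9` are assumptions for illustration, not the Module 12 implementation.

```python
import math


def masked_softmax(scores, mask):
    """Softmax over one row of attention scores; mask uses 1 = attend, 0 = masked."""
    # Masked positions get a very large negative score so their weight vanishes.
    adjusted = [s if m == 1 else -1e9 for s, m in zip(scores, mask)]
    peak = max(adjusted)  # subtract the max for numerical stability
    exps = [math.exp(a - peak) for a in adjusted]
    total = sum(exps)
    return [e / total for e in exps]


weights = masked_softmax([2.0, 1.0, 0.5], [1, 1, 0])
print(weights)
```

Under this convention the mask itself never carries negative values; the large negative number appears only in the scores, at positions where the mask is 0.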
Vijay Janapa Reddi
fa1b7ec242 Fix Module 06 optimizers: Use duck typing for Tensor validation and extract grad data properly in AdamW
2025-11-10 07:26:07 -05:00
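Duck typing, as named in commit fa1b7ec242, means checking for the attributes an object exposes rather than its class. The sketch below shows the general idea only; the attribute names `data` and `grad`, the stand-in class `FakeTensor`, and both helper functions are assumptions, not the TinyTorch optimizer code.

```python
class FakeTensor:
    """Stand-in object for the demo; not the TinyTorch Tensor class."""

    def __init__(self, data, grad=None):
        self.data = data
        self.grad = grad


def looks_like_tensor(obj) -> bool:
    """Duck-typed check: accept anything exposing data and grad attributes."""
    return hasattr(obj, "data") and hasattr(obj, "grad")


def grad_values(param):
    """Return the raw gradient values, unwrapping a tensor-like grad if needed."""
    grad = param.grad
    if grad is None:
        return None
    return grad.data if looks_like_tensor(grad) else grad


p = FakeTensor([1.0, 2.0], grad=FakeTensor([0.1, 0.2]))
print(looks_like_tensor(p), grad_values(p))
```

A check of this kind keeps working when the same class is imported through two different module paths, a situation in which an `isinstance` test can fail.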
Vijay Janapa Reddi
6f22110407 Add comprehensive test strategy documentation
- Document two-tier testing approach (inline vs integration)
- Explain purpose and scope of each test type
- Provide test coverage matrix for all 20 modules
- Include testing workflow for students and instructors
- Add best practices and common patterns
- Show current status: 11/15 inline tests passing, all 20 modules have test infrastructure
2025-11-10 06:34:42 -05:00
Vijay Janapa Reddi
4246c7599e Create test directories for modules 16-20
- Add tests/16_quantization with run_all_tests.py and integration test
- Add tests/17_compression with run_all_tests.py and integration test
- Add tests/18_acceleration with run_all_tests.py and integration test
- Add tests/19_benchmarking with run_all_tests.py and integration test
- Add tests/20_capstone with run_all_tests.py and integration test
- All test files marked as pending implementation with TODO markers
- Completes test directory structure for all 20 modules
2025-11-10 06:33:50 -05:00
Vijay Janapa Reddi
ae330dd477 Regenerate tinytorch package from all module exports
- Run tito export --all to update all exported code
- Fix file permissions (chmod u+w) to allow export writes
- Update 12 modified files with latest module code
- Add 3 new files (tinygpt, acceleration, compression)
- All 21 modules successfully exported
2025-11-10 06:23:47 -05:00
Vijay Janapa Reddi
ab809052da Fix pyproject.toml readme reference
- Change README_placeholder.md to README.md
- Resolves invalid file reference in package configuration
2025-11-10 06:21:14 -05:00
Vijay Janapa Reddi
d793882a5f Rename test directories to match restructured modules
- Rename tests/14_kvcaching to tests/14_profiling
- Rename tests/15_profiling to tests/15_memoization
- Aligns test structure with optimization tier reorganization
2025-11-10 06:21:04 -05:00
Vijay Janapa Reddi
1c3bbf3c2c Fix import paths in tinytorch nn module
- Import Module base class from core.layers
- Fix embeddings import path (text.embeddings not core.embeddings)
- Fix attention import (MultiHeadAttention not SelfAttention)
- Fix transformer import path (models.transformer not core.transformers)
- Handle missing functional module gracefully with try/except
- Update __all__ exports to match available components
2025-11-09 17:19:16 -05:00
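The "handle missing functional module gracefully with try/except" item in commit 1c3bbf3c2c follows a common optional-import pattern. A generic sketch, assuming nothing about the real `tinytorch.nn` layout, is shown below; `optional_import` and the placeholder module name are hypothetical.

```python
import importlib


def optional_import(module_name: str):
    """Import a module by name, returning None when it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# "tinytorch_functional_placeholder" is a made-up name used to trigger the fallback.
functional = optional_import("tinytorch_functional_placeholder")
json_module = optional_import("json")
print(functional is None, json_module is not None)
```

Callers then test for `None` before using the optional module, and an `__all__` list can be built from whichever components actually imported.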
Vijay Janapa Reddi
127be85825 Remove obsolete KV cache test file
- Delete test_kv_cache_milestone.py
- Standalone test file no longer needed after module integration
2025-11-09 17:04:03 -05:00
Vijay Janapa Reddi
acafebee8d Remove internal restructuring documentation
- Delete modules/source/14_profiling/RESTRUCTURING_SUMMARY.md
- Internal implementation notes no longer needed after refactoring completion
2025-11-09 17:03:43 -05:00
Vijay Janapa Reddi
5275cb8783 Remove outdated kvcaching module files
- Delete kvcaching_dev.py (superseded by memoization_dev.py)
- Delete kvcaching_dev.ipynb (superseded by memoization_dev.ipynb)
- memoization_dev files are the current versions with complete content
2025-11-09 17:03:31 -05:00
Vijay Janapa Reddi
6e82a0251d Organize book logos into _static/logos directory
- Move logo-tinytorch-grey.png to _static/logos/
- Move logo-tinytorch-simple.png to _static/logos/
- Move logo-tinytorch-white.png to _static/logos/
- Move tensortorch.png to _static/logos/
- Update _config.yml to reference new logo path
- Keeps all logo versions organized in standard static assets location
2025-11-09 17:03:12 -05:00
Vijay Janapa Reddi
0ff9e39573 Update gitignore to exclude all AI assistant folders
- Add .claude/ to ignore Claude AI configs
- Add .cursor/ to ignore Cursor AI configs
- Add .ai/ to ignore any AI assistant folders
2025-11-09 16:57:49 -05:00
Vijay Janapa Reddi
12e4984e53 Update gitignore for backup files and AI configs
- Add *.bak pattern to ignore backup files
- Add *.backup pattern to ignore backup files
- Add .claude/ directory to ignore AI assistant configs
2025-11-09 16:56:56 -05:00
Vijay Janapa Reddi
bb8f4c9f30 Remove old milestone template
- Delete MILESTONE_TEMPLATE.py in favor of MILESTONE_TEMPLATE_V2.py
2025-11-09 16:56:21 -05:00
Vijay Janapa Reddi
ef371d67eb Remove outdated development reports
- Delete MODULE_14_COMPLETION_REPORT.md
- Delete MODULE_14_REVIEW.md
- Delete RESTRUCTURE_COMPLETE.md
- Delete OPTIMIZATION_TIER_RESTRUCTURE_PLAN.md
- Delete PROGRESS_SUMMARY.md
- Delete PROJECT_STATUS.md
- Delete SCAFFOLDING_COMPLIANCE_REPORT.md
- Delete modules/COMPLIANCE_REPORT_FINAL.md
- Delete modules/GOLD_STANDARD_ANALYSIS.md
- Delete modules/MODULES_14-20_AUDIT.md
2025-11-09 16:56:08 -05:00
Vijay Janapa Reddi
acad70f1eb Remove obsolete backup files
- Delete tinytorch/core/training.py.bak
- Delete tinytorch/core/optimizers.py.bak
- Delete modules/source/14_profiling/profiling_dev.py.backup
2025-11-09 16:55:49 -05:00
Vijay Janapa Reddi
4b717b3d82 Update release documentation and advanced modules
- Updated release checklist and December 2024 release notes
- Updated student version tooling documentation
- Modified modules 15-19 (memoization, quantization, compression, benchmarking)
- Added milestone dashboard and progress tracking
- Added compliance reports and module audits
- Added checkpoint tests for modules 15-20
- Added activation script and book configuration
2025-11-09 16:51:55 -05:00
Vijay Janapa Reddi
b2e6bdedc5 Update TOC to use 'Convolutions' for consistency with single-word naming pattern
2025-11-09 15:18:25 -05:00
Vijay Janapa Reddi
0c4cf881df build: add PDF generation infrastructure for book
Add build scripts and GitHub Actions workflow to support PDF generation:
- build_pdf.sh: LaTeX-based PDF build for professional quality
- build_pdf_simple.sh: HTML-to-PDF build without LaTeX requirement
- Makefile: convenient shortcuts for common build tasks
- GitHub Actions workflow: automated PDF builds on demand

Supports multiple output formats:
- HTML website (default, via jupyter-book)
- PDF via HTML-to-PDF (pyppeteer, no LaTeX needed)
- PDF via LaTeX (professional typography, requires LaTeX)

Usage:
  make html        - Build HTML website
  make pdf-simple  - Build PDF without LaTeX
  make pdf         - Build PDF with LaTeX
2025-11-09 14:51:48 -05:00
Vijay Janapa Reddi
9ccc3f0deb feat: add exported packages for benchmarking, competition, and data utilities
- tinytorch/benchmarking/: Benchmark class for Module 19
- tinytorch/competition/: Submission utilities for Module 20
- tinytorch/data/: Data loading utilities
- tinytorch/utils/data/: Additional data helpers

Exported from modules 19-20 and module 08
2025-11-09 14:42:23 -05:00