Cleaned up temporary AI-generated analysis files:
- modules/15_quantization/FIXES_APPLIED.md
- modules/15_quantization/FIXES_TO_APPLY.md
- modules/16_compression/FIXES_REQUIRED.md
- modules/17_memoization/FIXES_APPLIED.md
- Plus other untracked analysis files
These were temporary debugging/review artifacts; matching .gitignore
patterns now prevent them from accumulating again.
Added module.yaml for Module 20 (Competition & Validation):
- Module configuration and learning objectives
- Prerequisites and skill development tracking
- Test coverage and connection documentation
This module brings together all optimization techniques learned
in modules 14-18 for competition preparation.
Added all module development files to modules/XX_name/ directories:
Module notebooks and scripts:
- 18 modules with .ipynb and .py files (numbered 01-20, with some gaps)
- Moved from modules/source/ to direct module directories
- Includes tensor, autograd, layers, transformers, optimization modules
Module README files:
- Added README.md files for modules that need additional documentation
- These complement the ABOUT.md files added earlier
This completes the module restructuring:
- Before: modules/source/XX_name/*_dev.{py,ipynb}
- After: modules/XX_name/*_dev.{py,ipynb}
All development happens directly in numbered module directories now.
Documentation updates across the codebase:
Root documentation:
- README.md: Updated references from book/ to site/
- CONTRIBUTING.md: Updated build and workflow instructions
- .shared-ai-rules.md: Updated AI assistant rules for new structure
GitHub configuration:
- Issue templates updated for new module locations
- Workflow references updated from book/ to site/
docs/ updates:
- STUDENT_QUICKSTART.md: New paths and structure
- module-rules.md: Updated module development guidelines
- NBGrader documentation: Updated for module restructuring
- Archive documentation: Updated references
Module documentation:
- modules/17_memoization/README.md: Updated after reordering
All documentation now correctly references:
- site/ instead of book/
- modules/XX_name/ instead of modules/source/
Completed restructuring: modules/source/XX_name/ → modules/XX_name/
All module development files moved to their numbered directories:
- modules/01_tensor/tensor_dev.{py,ipynb}
- modules/02_activations/activations_dev.{py,ipynb}
- ... (modules 03-20)
Removed obsolete source structure:
- modules/source/01_tensor/ through modules/source/20_capstone/
- modules/source/20_competition/ (legacy competition module)
- 43 files total (21 modules × 2 files each + 1 module.yaml)
This simplifies the module structure and makes development files
easier to find alongside their ABOUT.md and README.md files.
- Delete kvcaching_dev.py (superseded by memoization_dev.py)
- Delete kvcaching_dev.ipynb (superseded by memoization_dev.ipynb)
- memoization_dev files are the current versions with complete content
Cleanup of renamed files:
- Deleted old module source files (14_kvcaching, 15_profiling, 16_acceleration, etc.)
- Deleted old chapter markdown files
- These have been replaced by reorganized versions in previous commits
- Shows O(n²) latency growth in transformer generation
- Demonstrates problem before teaching solution
- Prepares module for reorganization to Module 15
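The O(n²) growth described above can be sketched with a toy work counter (hypothetical; not the module's actual demo code). Without a KV cache, generation step t re-attends over all t previous tokens, so total work across n steps is 1 + 2 + … + n = O(n²):

```python
# Hypothetical sketch: count attention "ops" for naive (cache-free) generation.
def generation_work(n_tokens: int) -> int:
    """Step t attends over t tokens, so total work is sum(1..n) = O(n^2)."""
    return sum(t for t in range(1, n_tokens + 1))

for n in (64, 128, 256):
    print(n, generation_work(n))
# Doubling n roughly quadruples the work: 2080, 8256, 32896
```

A KV cache makes each step O(1) in recomputed attention state, flattening this curve to O(n) total, which is the solution the module teaches next.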
- Add quick_profile() for simplified profiling interface
- Add analyze_weight_distribution() for compression module
- Both functions will be used by modules 15-18
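A minimal sketch of what these two helpers might look like. The signatures and return keys here are assumptions for illustration, not the module's actual API:

```python
import time
import numpy as np

def quick_profile(fn, *args, repeats: int = 10) -> float:
    """Hypothetical sketch: median wall-clock latency of fn in milliseconds."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        times.append((time.perf_counter() - start) * 1000.0)
    return float(np.median(times))

def analyze_weight_distribution(weights: np.ndarray,
                                near_zero: float = 1e-3) -> dict:
    """Hypothetical sketch: stats a compression module might inspect,
    e.g. how many weights are near zero and therefore prunable."""
    flat = np.abs(weights.ravel())
    return {
        "mean_abs": float(flat.mean()),
        "max_abs": float(flat.max()),
        "near_zero_fraction": float((flat < near_zero).mean()),
    }
```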
Complete capstone competition implementation:
- Two division tracks: Closed (optimize) and Open (innovate)
- Baseline CNN model for CIFAR-10
- Validation and submission generation system
- Integration with Module 19 normalized scoring
- Honor code and GitHub repo submission workflow
- Worked examples and student templates
Module 20 is now a pedagogically sound capstone that applies
all Optimization Tier techniques in a fair competition format.
Enhancements to benchmarking module:
- Added calculate_normalized_scores() for fair hardware comparison
- Implemented speedup, compression ratio, accuracy delta metrics
- Added MLPerf principles section to educational content
- Updated module to support competition fairness
These changes enable Module 20 competition to work across different hardware.
- Import calculate_normalized_scores from Module 19 for fair comparison
- Implement validate_submission() with sanity checks for submissions
- Check for reasonable speedup (<50x), compression (<32x), accuracy preservation
- Verify GitHub repo and required fields are present
- Update generate_submission() to use normalized MLPerf-style scoring
- Add division parameter for Closed/Open Division tracking
- Include github_repo and honor_code fields in submission
- Display normalized scores: speedup, compression ratio, accuracy delta
- Guide students to use 'tito submit' for final submission workflow
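The sanity checks above can be sketched as follows. The <50x speedup and <32x compression thresholds come from this commit; the field names and the accuracy-drop cutoff are assumptions for illustration:

```python
def validate_submission(sub: dict) -> list:
    """Hypothetical sketch of submission sanity checks.
    Returns a list of error strings; empty means the submission passes."""
    errors = []
    # Required fields (names assumed for illustration).
    for field in ("github_repo", "honor_code", "speedup",
                  "compression_ratio", "accuracy_delta"):
        if field not in sub:
            errors.append(f"missing field: {field}")
    # Thresholds from the commit: implausible claims get flagged.
    if sub.get("speedup", 0.0) >= 50.0:
        errors.append("speedup >= 50x looks implausible")
    if sub.get("compression_ratio", 0.0) >= 32.0:
        errors.append("compression >= 32x looks implausible")
    # Accuracy-preservation cutoff (assumed value).
    if sub.get("accuracy_delta", 0.0) < -0.05:
        errors.append("accuracy dropped by more than 5 points")
    return errors
```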
- Add Section 4.5: Normalized Metrics - Fair Comparison Across Different Hardware
- Implement calculate_normalized_scores() function for MLPerf-style relative metrics
- Calculate speedup, compression ratio, accuracy delta, and efficiency score
- Add comprehensive unit tests for normalized scoring
- Ensures fairness across different hardware by measuring relative improvements
- Prepares students for Module 20 TinyMLPerf competition submissions
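The idea behind normalized scoring can be sketched like this: every metric is a ratio against the student's own baseline, so absolute hardware speed cancels out. The input keys and the efficiency-score combination (speedup × compression) are assumptions for illustration, not the module's actual formula:

```python
def calculate_normalized_scores(baseline: dict, optimized: dict) -> dict:
    """Hypothetical sketch of MLPerf-style relative metrics.
    Ratios against the same machine's baseline make scores
    comparable across different hardware."""
    speedup = baseline["latency_ms"] / optimized["latency_ms"]
    compression = baseline["model_size_mb"] / optimized["model_size_mb"]
    accuracy_delta = optimized["accuracy"] - baseline["accuracy"]
    return {
        "speedup": speedup,
        "compression_ratio": compression,
        "accuracy_delta": accuracy_delta,
        "efficiency_score": speedup * compression,  # assumed combination
    }
```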
- Updated module title to TorchPerf Olympics Preparation
- Added OlympicEvent enum with 5 competition categories
- Removed meta-analysis sections (532 lines)
- Added section 4.5 on combination strategies and ablation studies
- Updated documentation to explain Olympic events and optimization order
- Module teaches benchmarking principles while preparing students for capstone
- Updated all imports: ProfilerComplete → Profiler
- Updated Module 16: Uses Profiler for acceleration demos
- Updated Module 19: Uses Profiler in Benchmark class
- Updated all comments and docstrings
- Simpler, more professional naming (drops the awkward "Complete" suffix)
- Added import: from tinytorch.profiling.profiler import ProfilerComplete
- Benchmark class now initializes self.profiler = ProfilerComplete()
- run_latency_benchmark() uses profiler.measure_latency()
- run_memory_benchmark() uses profiler.measure_memory() and profiler.count_parameters()
- Updated architecture diagram to show ProfilerComplete as foundation
- Added pedagogical note explaining build-once-reuse-everywhere principle
Benefits:
- Eliminates code duplication between M15 and M19
- Shows proper systems architecture (composition/reuse)
- Students see ProfilerComplete tool evolving and being reused
- Clear separation: Profiler=measure, Benchmark=compare
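The build-once-reuse-everywhere composition can be sketched as below. Both classes are minimal stand-ins with assumed method bodies, showing only the structural relationship (Benchmark holds a ProfilerComplete rather than re-implementing measurement):

```python
import time

class ProfilerComplete:
    """Minimal stand-in for the Module 15 profiler (hypothetical body)."""
    def measure_latency(self, fn, *args) -> float:
        start = time.perf_counter()
        fn(*args)
        return (time.perf_counter() - start) * 1000.0  # milliseconds

class Benchmark:
    """Module 19 composes the profiler instead of duplicating its code."""
    def __init__(self):
        self.profiler = ProfilerComplete()  # build once, reuse everywhere
    def run_latency_benchmark(self, fn, *args) -> float:
        # Measurement is delegated; Benchmark only compares results.
        return self.profiler.measure_latency(fn, *args)
```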
- Removed SimpleOptimizer class (unused after mixed precision removal)
- Replaced trainer.train_step() test with simple forward pass test
- Test now validates accelerated operations without mixed precision
- Checks numerical correctness and reasonable output values
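A forward-pass test of this shape might look like the following sketch. The layer is a stand-in (plain matmul + ReLU in NumPy), and the value-range threshold is an assumption, but it illustrates the two checks the commit names: numerical correctness and reasonable output values:

```python
import numpy as np

def test_accelerated_forward_pass():
    """Hypothetical sketch: a plain forward pass (no trainer, no mixed
    precision) checked for numerical sanity."""
    rng = np.random.default_rng(0)
    x = rng.standard_normal((4, 8)).astype(np.float32)
    w = rng.standard_normal((8, 2)).astype(np.float32)
    out = np.maximum(x @ w, 0.0)  # linear layer + ReLU stand-in
    assert out.shape == (4, 2)
    assert np.isfinite(out).all()    # no NaNs or Infs
    assert float(out.max()) < 100.0  # values in a reasonable range (assumed bound)

test_accelerated_forward_pass()
```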