TinyTorch

mirror of https://github.com/MLSysBook/TinyTorch.git synced 2026-07-20 07:58:34 -05:00

Author	SHA1	Message	Date
Vijay Janapa Reddi	4aeb3c9c69	Merge main into dev, resolving conflicts with dev's version	2025-12-03 07:26:43 -08:00
Vijay Janapa Reddi	dde470a4e5	Fix all stale imports from models.transformer to core.transformer	2025-12-03 00:28:37 -08:00
Vijay Janapa Reddi	a622e2c200	Fix regression tests for current API - Update TransformerBlock to use mlp_ratio instead of hidden_dim - Update PositionalEncoding argument order - Fix MultiHeadAttention to use self-attention API - Add missing MultiHeadAttention import	2025-12-02 22:30:42 -08:00
Vijay Janapa Reddi	1e155fb4da	Remove legacy broken tests with outdated API imports - tests/performance/: Referenced non-existent modules/ directory - tests/system/: Required tinytorch.nn.functional which does not exist - tests/regression/test_conv_linear_dimensions.py: Same issue - These tests predated the API consolidation	2025-12-02 22:30:37 -08:00
Vijay Janapa Reddi	0af88840b1	Update test suite for module restructuring Updated test imports and paths after modules/source/ removal: - Progressive integration tests for modules 03, 06, 08, 13, 14 - Checkpoint integration tests - Module completion orchestrator - Optimizer integration tests - Gradient flow regression tests Updated test documentation: - tests/README.md with new module paths - tests/TEST_STRATEGY.md with restructuring notes All tests now reference modules/XX_name/ instead of modules/source/.	2025-11-10 19:42:23 -05:00
Vijay Janapa Reddi	90581b23c0	Update test suite for module restructuring Updated test imports and paths after modules/source/ removal: - Progressive integration tests for modules 03, 06, 08, 13, 14 - Checkpoint integration tests - Module completion orchestrator - Optimizer integration tests - Gradient flow regression tests Updated test documentation: - tests/README.md with new module paths - tests/TEST_STRATEGY.md with restructuring notes All tests now reference modules/XX_name/ instead of modules/source/.	2025-11-10 19:42:23 -05:00
Vijay Janapa Reddi	788cd5aa34	chore: Remove temporary documentation files from tests/ Removed files created during debugging: - tests/regression/GRADIENT_FLOW_TEST_SUMMARY.md (info now in test docstrings) - tests/debug_posenc.py (temporary debug script) Test organization is clean: - Module tests: tests/XX_modulename/ - Integration tests: tests/integration/ - Regression tests: tests/regression/ (gradient flow tests) - Milestone tests: tests/milestones/ - System tests: tests/system/ All actual test files remain and pass.	2025-10-28 08:40:31 -04:00
Vijay Janapa Reddi	58a04c45ad	chore: Remove temporary documentation files from tests/ Removed files created during debugging: - tests/regression/GRADIENT_FLOW_TEST_SUMMARY.md (info now in test docstrings) - tests/debug_posenc.py (temporary debug script) Test organization is clean: - Module tests: tests/XX_modulename/ - Integration tests: tests/integration/ - Regression tests: tests/regression/ (gradient flow tests) - Milestone tests: tests/milestones/ - System tests: tests/system/ All actual test files remain and pass.	2025-10-28 08:40:31 -04:00
Vijay Janapa Reddi	9044d0ae61	docs: Add gradient flow test suite summary Summary of comprehensive test coverage: - 18 tests total (9 regression + 9 NLP component) - All tests pass ✅ - Covers modules 01, 02, 03, 05, 10, 11, 12, 13 - Verifies all 37 GPT parameters receive gradients - Documents test execution and results	2025-10-28 08:35:56 -04:00
Vijay Janapa Reddi	6cf8dedc14	docs: Add gradient flow test suite summary Summary of comprehensive test coverage: - 18 tests total (9 regression + 9 NLP component) - All tests pass ✅ - Covers modules 01, 02, 03, 05, 10, 11, 12, 13 - Verifies all 37 GPT parameters receive gradients - Documents test execution and results	2025-10-28 08:35:56 -04:00
Vijay Janapa Reddi	f36721509c	test: Add comprehensive NLP component gradient flow tests Created exhaustive test suite for all NLP modules: Module 10 - Tokenization: - Verified encode/decode functionality - No gradients needed (preprocessing) Module 11 - Embeddings: - ✅ Embedding lookup preserves requires_grad - ✅ EmbeddingBackward correctly accumulates gradients - ✅ Sparse gradient updates (only used indices) - ✅ PositionalEncoding adds positional info - ✅ Gradients flow through addition Module 12 - Attention: - ✅ Scaled dot-product attention: Q, K, V all receive gradients - ✅ Works with and without causal masking - ✅ Multi-head attention: ALL projections (Q, K, V, out) receive gradients - ✅ Reshape and permute operations preserve gradients - ✅ Batched attention computation works correctly Module 13 - Transformer: - ✅ LayerNorm: gamma and beta receive gradients - ✅ MLP: both linear layers receive gradients - ✅ TransformerBlock: ALL 10 parameters receive gradients - Both LayerNorms (ln1, ln2) - All attention projections - Both MLP layers - Residual connections don't break flow Full GPT Model: - ✅ End-to-end gradient flow verified - ✅ ALL 37 parameters receive gradients - ✅ Token + position embeddings - ✅ All transformer blocks - ✅ Final LayerNorm + LM head Results: 9/9 tests PASS ✅ All NLP components have correct gradient flow!	2025-10-28 08:35:20 -04:00
Vijay Janapa Reddi	2531aa164e	test: Add comprehensive NLP component gradient flow tests Created exhaustive test suite for all NLP modules: Module 10 - Tokenization: - Verified encode/decode functionality - No gradients needed (preprocessing) Module 11 - Embeddings: - ✅ Embedding lookup preserves requires_grad - ✅ EmbeddingBackward correctly accumulates gradients - ✅ Sparse gradient updates (only used indices) - ✅ PositionalEncoding adds positional info - ✅ Gradients flow through addition Module 12 - Attention: - ✅ Scaled dot-product attention: Q, K, V all receive gradients - ✅ Works with and without causal masking - ✅ Multi-head attention: ALL projections (Q, K, V, out) receive gradients - ✅ Reshape and permute operations preserve gradients - ✅ Batched attention computation works correctly Module 13 - Transformer: - ✅ LayerNorm: gamma and beta receive gradients - ✅ MLP: both linear layers receive gradients - ✅ TransformerBlock: ALL 10 parameters receive gradients - Both LayerNorms (ln1, ln2) - All attention projections - Both MLP layers - Residual connections don't break flow Full GPT Model: - ✅ End-to-end gradient flow verified - ✅ ALL 37 parameters receive gradients - ✅ Token + position embeddings - ✅ All transformer blocks - ✅ Final LayerNorm + LM head Results: 9/9 tests PASS ✅ All NLP components have correct gradient flow!	2025-10-28 08:35:20 -04:00
Vijay Janapa Reddi	f1ec8e81e0	fix(module-05): Add TransposeBackward and fix MatmulBackward for batched ops TransposeBackward: - New backward function for transpose operation - Patch Tensor.transpose() to track gradients - Critical for attention (Q @ K.T) gradient flow MatmulBackward batched fix: - Change np.dot to np.matmul for batched 3D+ tensors - Use np.swapaxes instead of .T for proper batched transpose - Fixes gradient shapes in attention mechanisms Tests added: - tests/05_autograd/test_batched_matmul_backward.py (3 tests) - Updated tests/regression/test_gradient_flow_fixes.py (9 tests total) All gradient flow issues for transformer training are now resolved!	2025-10-27 20:35:06 -04:00
Vijay Janapa Reddi	87d5a7e381	fix(module-05): Add TransposeBackward and fix MatmulBackward for batched ops TransposeBackward: - New backward function for transpose operation - Patch Tensor.transpose() to track gradients - Critical for attention (Q @ K.T) gradient flow MatmulBackward batched fix: - Change np.dot to np.matmul for batched 3D+ tensors - Use np.swapaxes instead of .T for proper batched transpose - Fixes gradient shapes in attention mechanisms Tests added: - tests/05_autograd/test_batched_matmul_backward.py (3 tests) - Updated tests/regression/test_gradient_flow_fixes.py (9 tests total) All gradient flow issues for transformer training are now resolved!	2025-10-27 20:35:06 -04:00
Vijay Janapa Reddi	d6314ccec1	fix(module-01): Fix batched matmul and transpose grad preservation - Change np.dot to np.matmul for proper batched 3D tensor multiplication - Add requires_grad preservation in transpose() operation - Fixes attention mechanism gradient flow issues Regression tests added in tests/regression/test_gradient_flow_fixes.py	2025-10-27 20:28:53 -04:00
Vijay Janapa Reddi	fb753882ec	fix(module-01): Fix batched matmul and transpose grad preservation - Change np.dot to np.matmul for proper batched 3D tensor multiplication - Add requires_grad preservation in transpose() operation - Fixes attention mechanism gradient flow issues Regression tests added in tests/regression/test_gradient_flow_fixes.py	2025-10-27 20:28:53 -04:00
Vijay Janapa Reddi	73e7f5b67a	FOUNDATION: Establish AI Engineering as a discipline through TinyTorch 🎯 NORTH STAR VISION DOCUMENTED: 'Don't Just Import It, Build It' - Training AI Engineers, not just ML users AI Engineering emerges as a foundational discipline like Computer Engineering, bridging algorithms and systems to build the AI infrastructure of the future. 🧪 ROBUST TESTING FRAMEWORK ESTABLISHED: - Created tests/regression/ for sandbox integrity tests - Implemented test-driven bug prevention workflow - Clear separation: student tests (pedagogical) vs system tests (robustness) - Every bug becomes a test to prevent recurrence ✅ KEY IMPLEMENTATIONS: - NORTH_STAR.md: Vision for AI Engineering discipline - Testing best practices: Focus on robust student sandbox - Git workflow standards: Professional development practices - Regression test suite: Prevent infrastructure issues - Conv->Linear dimension tests (found CNN bug) - Transformer reshaping tests (found GPT bug) 🏗️ SANDBOX INTEGRITY: Students need a solid, predictable environment where they focus on ML concepts, not debugging framework issues. The framework must be invisible. 📚 EDUCATIONAL PHILOSOPHY: TinyTorch isn't just teaching a framework - it's founding the AI Engineering discipline by training engineers who understand how to BUILD ML systems. This establishes the foundation for training the first generation of true AI Engineers who will define this emerging discipline.	2025-09-25 11:16:28 -04:00
Vijay Janapa Reddi	56f374efa3	FOUNDATION: Establish AI Engineering as a discipline through TinyTorch 🎯 NORTH STAR VISION DOCUMENTED: 'Don't Just Import It, Build It' - Training AI Engineers, not just ML users AI Engineering emerges as a foundational discipline like Computer Engineering, bridging algorithms and systems to build the AI infrastructure of the future. 🧪 ROBUST TESTING FRAMEWORK ESTABLISHED: - Created tests/regression/ for sandbox integrity tests - Implemented test-driven bug prevention workflow - Clear separation: student tests (pedagogical) vs system tests (robustness) - Every bug becomes a test to prevent recurrence ✅ KEY IMPLEMENTATIONS: - NORTH_STAR.md: Vision for AI Engineering discipline - Testing best practices: Focus on robust student sandbox - Git workflow standards: Professional development practices - Regression test suite: Prevent infrastructure issues - Conv->Linear dimension tests (found CNN bug) - Transformer reshaping tests (found GPT bug) 🏗️ SANDBOX INTEGRITY: Students need a solid, predictable environment where they focus on ML concepts, not debugging framework issues. The framework must be invisible. 📚 EDUCATIONAL PHILOSOPHY: TinyTorch isn't just teaching a framework - it's founding the AI Engineering discipline by training engineers who understand how to BUILD ML systems. This establishes the foundation for training the first generation of true AI Engineers who will define this emerging discipline.	2025-09-25 11:16:28 -04:00
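
Note on the batched matmul fixes (commits f1ec8e81e0 / 87d5a7e381 and d6314ccec1 / fb753882ec): the messages above describe switching from np.dot to np.matmul and from .T to np.swapaxes for 3D+ tensors. The following is a minimal NumPy sketch of why that matters, not TinyTorch's actual code. The function name matmul_backward and the tensor shapes are assumptions chosen for illustration only.

```python
import numpy as np

def matmul_backward(a, b, grad_out):
    """Gradients of out = a @ b, assuming a and b share the same leading batch dims.

    Hypothetical helper for illustration; not taken from the TinyTorch source.
    """
    # Swap only the last two axes. Using .T here would reverse every axis,
    # including the batch axis, and produce gradients with the wrong shape.
    grad_a = np.matmul(grad_out, np.swapaxes(b, -1, -2))
    grad_b = np.matmul(np.swapaxes(a, -1, -2), grad_out)
    return grad_a, grad_b

rng = np.random.default_rng(0)
q = rng.standard_normal((2, 3, 4))    # (batch, seq, dim)
k_t = rng.standard_normal((2, 4, 3))  # (batch, dim, seq), i.e. K already transposed

# np.matmul treats the leading axis as a batch; np.dot does not.
out = np.matmul(q, k_t)
matmul_shape = out.shape              # (2, 3, 3)
dot_shape = np.dot(q, k_t).shape      # (2, 3, 2, 3): batches are cross-multiplied

# .T reverses all axes, whereas swapaxes touches only the matrix axes.
full_transpose_shape = q.T.shape                    # (4, 3, 2)
batched_transpose_shape = np.swapaxes(q, -1, -2).shape  # (2, 4, 3)

# Backward pass for loss = out.sum(), so the upstream gradient is all ones.
grad_q, grad_k_t = matmul_backward(q, k_t, np.ones_like(out))
```

With these shapes, each gradient has the same shape as the input it belongs to, which is the property the regression tests named in the commit messages are described as checking. Inputs whose batch dimensions broadcast against each other would additionally need the gradients summed over the broadcast axes, which this sketch does not handle.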