4 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
dde470a4e5 Fix all stale imports from models.transformer to core.transformer 2025-12-03 00:28:37 -08:00
Vijay Janapa Reddi
90581b23c0 Update test suite for module restructuring
Updated test imports and paths after modules/source/ removal:
- Progressive integration tests for modules 03, 06, 08, 13, 14
- Checkpoint integration tests
- Module completion orchestrator
- Optimizer integration tests
- Gradient flow regression tests

Updated test documentation:
- tests/README.md with new module paths
- tests/TEST_STRATEGY.md with restructuring notes

All tests now reference modules/XX_name/ instead of modules/source/.
2025-11-10 19:42:23 -05:00
Vijay Janapa Reddi
87d5a7e381 fix(module-05): Add TransposeBackward and fix MatmulBackward for batched ops
TransposeBackward:
- New backward function for transpose operation
- Patch Tensor.transpose() to track gradients
- Critical for attention (Q @ K.T) gradient flow
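
A minimal sketch of the idea behind a transpose backward pass, not the module's actual code: the gradient of a transpose is the upstream gradient with the same two axes swapped back. The function name `transpose_backward` and its `dim0`/`dim1` parameters are hypothetical.

```python
import numpy as np

def transpose_backward(grad_output, dim0=-2, dim1=-1):
    """Hypothetical helper: undo a transpose on the upstream gradient.

    Swapping two axes is its own inverse, so the backward pass applies
    the same swap to the incoming gradient.
    """
    return np.swapaxes(grad_output, dim0, dim1)

# Attention-style example: K has shape (batch, seq, dim).
K = np.random.default_rng(0).normal(size=(2, 4, 8))
K_T = np.swapaxes(K, -2, -1)           # forward transpose: (2, 8, 4)
grad_K_T = np.ones_like(K_T)           # stand-in upstream gradient
grad_K = transpose_backward(grad_K_T)  # back to K's layout: (2, 4, 8)
assert grad_K.shape == K.shape
```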

MatmulBackward batched fix:
- Change np.dot to np.matmul for batched 3D+ tensors
- Use np.swapaxes instead of .T for proper batched transpose
- Fixes gradient shapes in attention mechanisms
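
The batched fix above can be sketched as follows. This is an illustration under assumptions, not the repository's MatmulBackward: the function name `matmul_backward` is hypothetical, and both operands are assumed to share the same batch dimensions (no broadcasting).

```python
import numpy as np

def matmul_backward(grad_out, a, b):
    """Hypothetical sketch of gradients for c = np.matmul(a, b).

    np.swapaxes(x, -1, -2) transposes only the last two axes, so the
    leading batch axes stay in place. x.T would reverse every axis.
    """
    grad_a = np.matmul(grad_out, np.swapaxes(b, -1, -2))
    grad_b = np.matmul(np.swapaxes(a, -1, -2), grad_out)
    return grad_a, grad_b

rng = np.random.default_rng(0)
a = rng.normal(size=(2, 3, 4))   # (batch, m, k)
b = rng.normal(size=(2, 4, 5))   # (batch, k, n)
grad_out = np.ones((2, 3, 5))    # gradient of sum(a @ b) w.r.t. the output

grad_a, grad_b = matmul_backward(grad_out, a, b)
assert grad_a.shape == a.shape
assert grad_b.shape == b.shape

# Why .T is not a batched transpose: it reverses all three axes.
assert b.T.shape == (5, 4, 2)
assert np.swapaxes(b, -1, -2).shape == (2, 5, 4)
```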

Tests added:
- tests/05_autograd/test_batched_matmul_backward.py (3 tests)
- Updated tests/regression/test_gradient_flow_fixes.py (9 tests total)

All gradient flow issues for transformer training are now resolved!
2025-10-27 20:35:06 -04:00
Vijay Janapa Reddi
fb753882ec fix(module-01): Fix batched matmul and transpose grad preservation
- Change np.dot to np.matmul for proper batched 3D tensor multiplication
- Add requires_grad preservation in transpose() operation
- Fixes attention mechanism gradient flow issues
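
A short illustration of why the switch matters for 3D inputs, using plain NumPy rather than the module's Tensor class: np.dot contracts the last axis of the first array with the second-to-last axis of the second and keeps every remaining axis, while np.matmul treats the leading axis as a batch.

```python
import numpy as np

a = np.ones((2, 3, 4))  # (batch, m, k)
b = np.ones((2, 4, 5))  # (batch, k, n)

dot_shape = np.dot(a, b).shape        # pairs every batch of a with every batch of b
matmul_shape = np.matmul(a, b).shape  # multiplies batch i of a with batch i of b

print(dot_shape)     # (2, 3, 2, 5)
print(matmul_shape)  # (2, 3, 5)
```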

Regression tests added in tests/regression/test_gradient_flow_fixes.py
2025-10-27 20:28:53 -04:00