- Deleted root-level tests/test_gradient_flow.py
- Comprehensive tests now in tests/regression/test_gradient_flow_fixes.py
- Module-specific tests in tests/05_autograd/test_batched_matmul_backward.py
- Better test organization following TinyTorch conventions
Add test_xor_simple.py - validates multi-layer gradient flow
- 100% accuracy on XOR (the 1969 'impossible' problem)
- Hidden layer (2→4) + ReLU + output (4→1) architecture (see the sketch at the end of this note)
- Gradients flow correctly through 2 layers
- Loss decreases smoothly during training
This proves:
✅ Multi-layer networks work
✅ Backprop works through hidden layers
✅ ReLU activation works in training
✅ The XOR problem that helped trigger the first AI Winter is solved!
Historical significance: Minsky and Papert (1969) proved that single-layer
perceptrons cannot solve XOR. Multi-layer networks (what we built here) can!
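
For reference, here is a minimal standalone NumPy sketch of the same 2→4→1 setup, with the backward pass written out by hand to make the two-layer chain rule explicit. It illustrates the gradient flow the test exercises; the learning rate, epoch count, seed, and sigmoid+MSE loss are illustrative choices, not taken from test_xor_simple.py itself (and an unlucky seed can occasionally leave ReLU units dead and stall training):

```python
import numpy as np

# XOR inputs and targets
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # shape (4, 2)
y = np.array([[0], [1], [1], [0]], dtype=float)              # shape (4, 1)

rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 1.0, (2, 4))  # hidden layer weights: 2 -> 4
b1 = np.zeros((1, 4))
W2 = rng.normal(0.0, 1.0, (4, 1))  # output layer weights: 4 -> 1
b2 = np.zeros((1, 1))

lr = 0.5  # illustrative hyperparameters, not from the test
for epoch in range(5000):
    # Forward: linear -> ReLU -> linear -> sigmoid
    z1 = X @ W1 + b1
    h = np.maximum(z1, 0.0)          # ReLU activation
    z2 = h @ W2 + b2
    p = 1.0 / (1.0 + np.exp(-z2))    # sigmoid squashes logits to (0, 1)
    loss = np.mean((p - y) ** 2)     # MSE loss

    # Backward: chain rule through both layers (what autograd automates)
    dz2 = 2.0 * (p - y) * p * (1.0 - p) / len(X)  # dL/dz2
    dW2 = h.T @ dz2
    db2 = dz2.sum(axis=0, keepdims=True)
    dh = dz2 @ W2.T                  # gradient flowing back into the hidden layer
    dz1 = dh * (z1 > 0)              # ReLU passes gradient only where z1 > 0
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0, keepdims=True)

    # SGD step
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

preds = (p > 0.5).astype(float)
print(f"final loss={loss:.4f} accuracy={(preds == y).mean():.0%}")  # typically 100%
```

The backward pass is hand-written here only to show the gradients flowing through both layers; in the actual test, TinyTorch's autograd computes the same gradients.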