- Add CLAUDE.md entry point for Claude AI system
- Fix tito test command to set PYTHONPATH for module imports
- Fix embeddings export directive placement for nbdev
- Fix attention module to export imports properly
- Fix transformers embedding index casting to int
- Update transformers module to match tokenization style with improved ASCII diagrams
- Fix attention module to use proper multi-head interface
- Update transformer era milestone for refined module integration
- Fix import paths and ensure forward() method consistency
- All transformer components now work seamlessly together
PROBLEM:
- nbdev requires #| export directive on EACH cell to export when using # %% markers
- Cell markers inside class definitions split classes across multiple cells
- Only partial classes were being exported to tinytorch package
- Missing matmul, arithmetic operations, and activation classes in exports
SOLUTION:
1. Removed # %% cell markers INSIDE class definitions (kept classes as single units)
2. Added #| export to imports cell at top of each module
3. Added #| export before each exportable class definition in all 20 modules
4. Added __call__ method to Sigmoid for functional usage
5. Fixed numpy import (moved to module level from __init__)
MODULES FIXED:
- 01_tensor: Tensor class with all operations (matmul, arithmetic, shape ops)
- 02_activations: Sigmoid, ReLU, Tanh, GELU, Softmax classes
- 03_layers: Linear, Dropout classes
- 04_losses: MSELoss, CrossEntropyLoss, BinaryCrossEntropyLoss classes
- 05_autograd: Function, AddBackward, MulBackward, MatmulBackward, SumBackward
- 06_optimizers: Optimizer, SGD, Adam, AdamW classes
- 07_training: CosineSchedule, Trainer classes
- 08_dataloader: Dataset, TensorDataset, DataLoader classes
- 09_spatial: Conv2d, MaxPool2d, AvgPool2d, SimpleCNN classes
- 10-20: All exportable classes in remaining modules
TESTING:
- Test functions use 'if __name__ == "__main__"' guards
- Tests run in notebooks but NOT on import
- Rosenblatt Perceptron milestone working perfectly
RESULT:
✅ All 20 modules export correctly
✅ Perceptron (1957) milestone functional
✅ Clean separation: development (modules/source) vs package (tinytorch)
- 12_attention: Export scaled_dot_product_attention, MultiHeadAttention only
- 13_transformers: Export TransformerBlock, GPT only
Continues professional selective export pattern across advanced modules.
Clean public APIs for transformer architecture components.
- Remove circular imports where modules imported from themselves
- Convert tinytorch.core imports to sys.path relative imports
- Only import dependencies that are actually used in each module
- Preserve documentation imports in markdown cells
- Use consistent relative path pattern across all modules
- Remove hardcoded absolute paths in favor of relative imports
Affected modules: 02_activations, 03_layers, 04_losses, 06_optimizers,
07_training, 09_spatial, 12_attention, 17_quantization
Clean, dependency-driven organization:
- Part I (1-5): MLPs for XORNet
- Part II (6-10): CNNs for CIFAR-10
- Part III (11-15): Transformers for TinyGPT
Key improvements:
- Dropped modules 16-17 (regularization/systems) to maintain scope
- Moved normalization to module 13 (Part III where it's needed)
- Created three CIFAR-10 examples: random, MLP, CNN
- Each part introduces ONE major innovation (FC → Conv → Attention)
CIFAR-10 now showcases progression:
- test_random_baseline.py: ~10% (random chance)
- train_mlp.py: ~55% (no convolutions)
- train_cnn.py: ~60%+ (WITH Conv2D - shows why convolutions matter!)
This follows actual ML history and each module is needed for its capstone.