TinyTorch/modules/13_transformers/module.yaml
Vijay Janapa Reddi 4a9131f8c4 Major reorganization: Remove setup module, renumber all modules, add tito setup command and numeric shortcuts
- Removed 01_setup module (archived to archive/setup_module)
- Renumbered all modules: tensor is now 01, activations is 02, etc.
- Added tito setup command for environment setup and package installation
- Added numeric shortcuts: tito 01, tito 02, etc. for quick module access
- Fixed view command to find dev files correctly
- Updated module dependencies and references
- Improved user experience: immediate ML learning instead of boring setup
2025-09-28 07:02:08 -04:00

description: Complete transformer architecture with LayerNorm, transformer blocks,
  and language model implementation
estimated_time: 6-7 hours
exports:
- LayerNorm
- PositionwiseFeedForward
- TransformerBlock
- Transformer
- TransformerProfiler
learning_objectives:
- Implement LayerNorm for stable deep network training
- Build position-wise feed-forward networks for transformer blocks
- Create complete transformer blocks with attention, normalization, and residual connections
- Develop full transformer models with embeddings, multiple layers, and generation
  capability
- Understand transformer scaling characteristics and production deployment considerations
ml_systems_focus: Transformer architecture optimization, memory scaling with depth,
  production deployment strategies
name: Transformers
next_modules:
- Advanced transformer architectures and optimization techniques
number: 13
prerequisites:
- 01_tensor
- 11_embeddings
- 12_attention
systems_concepts:
- Linear memory scaling with transformer depth
- Layer normalization vs batch normalization trade-offs
- Residual connection gradient flow optimization
- Parameter allocation across depth, width, and attention heads
- Training memory vs inference memory requirements