Clean up: Remove old numbered .yml files, CLI uses module.yaml

CLEANUP: Removed duplicate/obsolete configuration files

Removed Files:
- All old numbered .yml files (02_tensor.yml, 03_activations.yml, etc.)
- These were left over from the module reorganization
- Had incorrect dependencies (still referenced 'setup')

Current State:
- CLI correctly uses module.yaml files (19 modules)
- All module.yaml files have correct dependencies
- No more duplicate/conflicting configuration files
- Clean module structure with single source of truth

The CLI was already using module.yaml correctly, so this cleanup removes
the confusing duplicate files without affecting functionality.
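
For reference, a minimal sketch of the kind of loader that resolves module order from module.yaml files. Only the dependencies.prerequisites schema comes from the files shown below; the modules/ directory layout, helper names, and PyYAML usage are illustrative assumptions, not the actual tito CLI implementation.

# Hypothetical sketch (not the real CLI): discover module.yaml files and
# sort modules so that prerequisites always come first.
from pathlib import Path

import yaml  # PyYAML; assumed available


def load_modules(root: Path) -> dict[str, list[str]]:
    """Map each module name to its prerequisite list from module.yaml."""
    graph: dict[str, list[str]] = {}
    for meta_path in sorted(root.glob("*/module.yaml")):
        meta = yaml.safe_load(meta_path.read_text())
        deps = meta.get("dependencies", {}) or {}
        graph[meta["name"]] = deps.get("prerequisites", [])
    return graph


def order_modules(graph: dict[str, list[str]]) -> list[str]:
    """Depth-first topological sort; raises on cycles or unknown modules."""
    ordered: list[str] = []
    visiting: set[str] = set()
    done: set[str] = set()

    def visit(name: str) -> None:
        if name in done:
            return
        if name in visiting:
            raise ValueError(f"dependency cycle involving {name!r}")
        visiting.add(name)
        for dep in graph.get(name, []):
            # A stale prerequisite (like the 'setup' references in the
            # removed .yml files) would fail fast here.
            if dep not in graph:
                raise ValueError(f"{name!r} requires unknown module {dep!r}")
            visit(dep)
        visiting.discard(name)
        done.add(name)
        ordered.append(name)

    for name in graph:
        visit(name)
    return ordered


if __name__ == "__main__":
    print(order_modules(load_modules(Path("modules"))))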
Vijay Janapa Reddi
2025-09-28 08:01:26 -04:00
parent 250de57ff0
commit 95f001a485
19 changed files with 0 additions and 587 deletions

View File

@@ -1,30 +0,0 @@
# TinyTorch Module Metadata
# Essential system information for CLI tools and build systems

name: "tensor"
title: "Tensor"
description: "Core tensor data structure and operations"

# Dependencies - Used by CLI for module ordering and prerequisites
dependencies:
  prerequisites: ["setup"]

# Package Export - What gets built into tinytorch package
exports_to: "tinytorch.core.tensor"

# File Structure - What files exist in this module
files:
  dev_file: "tensor_dev.py"
  readme: "README.md"
  tests: "inline"

# Educational Metadata
difficulty: "⭐⭐"
time_estimate: "4-6 hours"

# Components - What's implemented in this module
components:
  - "Tensor"
  - "tensor_creation"
  - "tensor_operations"
  - "tensor_arithmetic"

View File

@@ -1,30 +0,0 @@
# TinyTorch Module Metadata
# Essential system information for CLI tools and build systems

name: "activations"
title: "Activation Functions"
description: "Neural network activation functions (ReLU, Sigmoid, Tanh, Softmax)"

# Dependencies - Used by CLI for module ordering and prerequisites
dependencies:
  prerequisites: ["tensor"]

# Package Export - What gets built into tinytorch package
exports_to: "tinytorch.core.activations"

# File Structure - What files exist in this module
files:
  dev_file: "activations_dev.py"
  readme: "README.md"
  tests: "inline"

# Educational Metadata
difficulty: "⭐⭐"
time_estimate: "3-4 hours"

# Components - What's implemented in this module
components:
  - "ReLU"
  - "Sigmoid"
  - "Tanh"
  - "Softmax"

View File

@@ -1,29 +0,0 @@
# TinyTorch Module Metadata
# Essential system information for CLI tools and build systems

name: "layers"
title: "Layers"
description: "Neural network layers (Linear, activation layers)"

# Dependencies - Used by CLI for module ordering and prerequisites
dependencies:
  prerequisites: ["setup", "tensor", "activations"]

# Package Export - What gets built into tinytorch package
exports_to: "tinytorch.core.layers"

# File Structure - What files exist in this module
files:
  dev_file: "layers_dev.py"
  readme: "README.md"
  tests: "inline"

# Educational Metadata
difficulty: "⭐⭐"
time_estimate: "4-5 hours"

# Components - What's implemented in this module
components:
  - "Dense"
  - "Linear"
  - "matmul"

View File

@@ -1,21 +0,0 @@
name: "Loss Functions"
number: 5
description: "Essential loss functions for neural network training objectives"
learning_objectives:
- "Implement MSE, CrossEntropy, and BinaryCrossEntropy loss functions"
- "Understand numerical stability in loss computation"
- "Match loss functions to problem types (regression vs classification)"
- "Build production-ready loss functions with batch processing"
prerequisites:
- "02_tensor"
difficulty: "⭐⭐⭐"
time_estimate: "2-3 hours"
exports:
- "MeanSquaredError"
- "CrossEntropyLoss"
- "BinaryCrossEntropyLoss"
key_concepts:
- "Training objectives and optimization"
- "Numerical stability in loss computation"
- "Regression vs classification loss functions"
- "Batch processing for scalable training"

View File

@@ -1,29 +0,0 @@
# TinyTorch Module Metadata
# Essential system information for CLI tools and build systems

name: "autograd"
title: "Autograd"
description: "Automatic differentiation engine for gradient computation"

# Dependencies - Used by CLI for module ordering and prerequisites
dependencies:
  prerequisites: ["setup", "tensor", "activations"]

# Package Export - What gets built into tinytorch package
exports_to: "tinytorch.core.autograd"

# File Structure - What files exist in this module
files:
  dev_file: "autograd_dev.py"
  test_file: "tests/test_autograd.py"
  readme: "README.md"

# Educational Metadata
difficulty: "⭐⭐⭐⭐"
time_estimate: "8-10 hours"

# Components - What's implemented in this module
components:
  - "Variable"
  - "backward"
  - "gradient_computation"

View File

@@ -1,30 +0,0 @@
# TinyTorch Module Metadata
# Essential system information for CLI tools and build systems

name: "optimizers"
title: "Optimizers"
description: "Gradient-based parameter optimization algorithms"

# Dependencies - Used by CLI for module ordering and prerequisites
dependencies:
  prerequisites: ["setup", "tensor", "autograd"]

# Package Export - What gets built into tinytorch package
exports_to: "tinytorch.core.optimizers"

# File Structure - What files exist in this module
files:
  dev_file: "optimizers_dev.py"
  readme: "README.md"
  tests: "inline"

# Educational Metadata
difficulty: "⭐⭐⭐⭐"
time_estimate: "6-8 hours"

# Components - What's implemented in this module
components:
  - "SGD"
  - "Adam"
  - "StepLR"
  - "gradient_descent_step"

View File

@@ -1,31 +0,0 @@
# TinyTorch Module Metadata
# Essential system information for CLI tools and build systems

name: "training"
title: "Training"
description: "Neural network training loops, loss functions, and metrics"

# Dependencies - Used by CLI for module ordering and prerequisites
dependencies:
  prerequisites: ["setup", "tensor", "activations", "layers", "networks", "dataloader", "autograd", "optimizers"]

# Package Export - What gets built into tinytorch package
exports_to: "tinytorch.core.training"

# File Structure - What files exist in this module
files:
  dev_file: "training_dev.py"
  readme: "README.md"
  tests: "inline"

# Educational Metadata
difficulty: "⭐⭐⭐⭐"
time_estimate: "8-10 hours"

# Components - What's implemented in this module
components:
  - "MeanSquaredError"
  - "CrossEntropyLoss"
  - "BinaryCrossEntropyLoss"
  - "Accuracy"
  - "Trainer"

View File

@@ -1,29 +0,0 @@
# TinyTorch Module Metadata
# Essential system information for CLI tools and build systems

name: "spatial"
title: "Spatial Networks"
description: "Convolutional networks for spatial pattern recognition and image processing"

# Dependencies - Used by CLI for module ordering and prerequisites
dependencies:
  prerequisites: ["setup", "tensor", "activations", "layers", "dense"]

# Package Export - What gets built into tinytorch package
exports_to: "tinytorch.core.spatial"

# File Structure - What files exist in this module
files:
  dev_file: "spatial_dev.py"
  readme: "README.md"
  tests: "inline"

# Educational Metadata
difficulty: "⭐⭐⭐"
time_estimate: "6-8 hours"

# Components - What's implemented in this module
components:
  - "conv2d_naive"
  - "Conv2D"
  - "flatten"

View File

@@ -1,29 +0,0 @@
# TinyTorch Module Metadata
# Essential system information for CLI tools and build systems

name: "dataloader"
title: "DataLoader"
description: "Dataset interfaces and data loading pipelines"

# Dependencies - Used by CLI for module ordering and prerequisites
dependencies:
  prerequisites: ["setup", "tensor"]

# Package Export - What gets built into tinytorch package
exports_to: "tinytorch.core.dataloader"

# File Structure - What files exist in this module
files:
  dev_file: "dataloader_dev.py"
  readme: "README.md"
  tests: "inline"

# Educational Metadata
difficulty: "⭐⭐⭐"
time_estimate: "5-6 hours"

# Components - What's implemented in this module
components:
  - "Dataset"
  - "DataLoader"
  - "SimpleDataset"

View File

@@ -1,32 +0,0 @@
name: "Tokenization"
number: 11
description: "Text processing systems that convert raw text into numerical sequences for language models"
learning_objectives:
- "Implement character-level tokenization with special token handling"
- "Build BPE (Byte Pair Encoding) tokenizer for subword units"
- "Understand tokenization trade-offs: vocabulary size vs sequence length"
- "Optimize tokenization performance for production systems"
- "Analyze how tokenization affects model memory and training efficiency"
prerequisites:
- "02_tensor"
exports:
- "CharTokenizer"
- "BPETokenizer"
- "TokenizationProfiler"
- "OptimizedTokenizer"
systems_concepts:
- "Memory efficiency of token representations"
- "Vocabulary size vs model size tradeoffs"
- "Tokenization throughput optimization"
- "String processing performance"
- "Cache-friendly text processing patterns"
ml_systems_focus: "Text processing pipelines, tokenization throughput, memory-efficient vocabulary management"
estimated_time: "4-5 hours"
next_modules:
- "12_embeddings"

View File

@@ -1,33 +0,0 @@
name: "Embeddings"
number: 12
description: "Dense vector representations that convert discrete tokens into continuous semantic spaces"
learning_objectives:
- "Implement embedding layers with efficient lookup operations"
- "Build sinusoidal and learned positional encoding systems"
- "Understand embedding memory scaling and optimization techniques"
- "Analyze how embedding choices affect model capacity and performance"
- "Design embedding systems for production language model deployment"
prerequisites:
- "02_tensor"
- "11_tokenization"
exports:
- "Embedding"
- "PositionalEncoding"
- "LearnedPositionalEmbedding"
- "EmbeddingProfiler"
systems_concepts:
- "Embedding table memory scaling O(vocab_size × embed_dim)"
- "Memory-bandwidth bound lookup operations"
- "Cache-friendly embedding access patterns"
- "Position encoding trade-offs and extrapolation"
- "Distributed embedding table management"
ml_systems_focus: "Memory-efficient embedding lookup, position encoding scalability, large-scale parameter management"
estimated_time: "4-5 hours"
next_modules:
- "13_attention"

View File

@@ -1,33 +0,0 @@
name: "Attention"
number: 13
description: "Scaled dot-product and multi-head attention mechanisms that enable transformer architectures"
learning_objectives:
- "Implement scaled dot-product attention with proper masking and numerical stability"
- "Build multi-head attention with parallel head processing and output projection"
- "Design KV-cache systems for efficient autoregressive generation"
- "Understand attention's O(N²) scaling and memory optimization techniques"
- "Analyze attention performance bottlenecks and production optimization strategies"
prerequisites:
- "02_tensor"
- "12_embeddings"
exports:
- "ScaledDotProductAttention"
- "MultiHeadAttention"
- "KVCache"
- "AttentionProfiler"
systems_concepts:
- "Quadratic memory scaling O(N²) with sequence length"
- "Memory-bandwidth bound attention computation"
- "KV-cache optimization for autoregressive generation"
- "Multi-head parallelization and hardware optimization"
- "Attention masking patterns and causal dependencies"
ml_systems_focus: "Attention memory scaling, generation efficiency optimization, sequence length limitations"
estimated_time: "5-6 hours"
next_modules:
- "14_transformers"

View File

@@ -1,35 +0,0 @@
name: "Transformers"
number: 14
description: "Complete transformer architecture with LayerNorm, transformer blocks, and language model implementation"
learning_objectives:
- "Implement LayerNorm for stable deep network training"
- "Build position-wise feed-forward networks for transformer blocks"
- "Create complete transformer blocks with attention, normalization, and residual connections"
- "Develop full transformer models with embeddings, multiple layers, and generation capability"
- "Understand transformer scaling characteristics and production deployment considerations"
prerequisites:
- "02_tensor"
- "12_embeddings"
- "13_attention"
exports:
- "LayerNorm"
- "PositionwiseFeedForward"
- "TransformerBlock"
- "Transformer"
- "TransformerProfiler"
systems_concepts:
- "Linear memory scaling with transformer depth"
- "Layer normalization vs batch normalization trade-offs"
- "Residual connection gradient flow optimization"
- "Parameter allocation across depth, width, and attention heads"
- "Training memory vs inference memory requirements"
ml_systems_focus: "Transformer architecture optimization, memory scaling with depth, production deployment strategies"
estimated_time: "6-7 hours"
next_modules:
- "Advanced transformer architectures and optimization techniques"

View File

@@ -1,30 +0,0 @@
name: Profiling
number: 15
type: systems
difficulty: advanced
estimated_hours: 8-10

description: |
  Build professional profiling infrastructure to measure and analyze performance.
  Students learn to create timing, memory, and operation profilers that reveal
  bottlenecks and guide optimization decisions. Performance detective work that
  makes optimization exciting through data-driven insights.

learning_objectives:
  - Build accurate timing infrastructure with statistical rigor
  - Implement memory profiling and allocation tracking
  - Create FLOP counting for computational analysis
  - Master profiling methodology for bottleneck identification
  - Connect profiling insights to ML systems optimization decisions

prerequisites:
  - Module 14: Transformers (need models to profile)

skills_developed:
  - Performance measurement
  - Bottleneck identification
  - Profiling tool development
  - Statistical analysis

exports:
  - tinytorch.profiling

View File

@@ -1,38 +0,0 @@
name: "acceleration"
title: "Hardware Acceleration - The Simplest Optimization"
description: "Master the easiest optimization: using better backends! Learn why naive loops are slow, how cache-friendly blocking helps, and why NumPy provides 100x+ speedups."
learning_objectives:
- "Understand CPU cache hierarchy and memory access performance bottlenecks"
- "Implement cache-friendly blocked matrix multiplication algorithms"
- "Build vectorized operations with optimized memory access patterns"
- "Design transparent backend systems for automatic optimization selection"
- "Measure and quantify real performance improvements scientifically"
- "Apply systems thinking to optimization decisions in ML workflows"
prerequisites:
- "Module 2: Tensor operations and NumPy fundamentals"
- "Module 4: Linear layers and matrix multiplication"
- "Understanding of basic algorithmic complexity (O notation)"
estimated_time: "3-4 hours"
difficulty: "Advanced"
tags:
- "performance"
- "optimization"
- "systems"
- "hardware"
- "acceleration"
- "cache"
- "vectorization"
- "backends"
exports:
- "matmul_naive"
- "matmul_blocked"
- "matmul_numpy"
- "OptimizedBackend"
- "matmul"
- "set_backend"
assessment:
- "Understand why naive loops have poor cache performance"
- "Implement cache-friendly blocked matrix multiplication showing 10-50x speedups"
- "Recognize why NumPy provides 100x+ speedups over custom implementations"
- "Build backend system that automatically chooses optimal implementations"
- "Apply the 'free speedup' principle: use better tools, don't write faster code"

View File

@@ -1,29 +0,0 @@
name: Quantization
number: 17
type: optimization
difficulty: advanced
estimated_hours: 6-8

description: |
  Precision optimization through INT8 quantization. Students learn to reduce model size
  and accelerate inference by using lower precision arithmetic while maintaining accuracy.
  Especially powerful for CNN convolutions and edge deployment.

learning_objectives:
  - Understand precision vs performance trade-offs
  - Implement INT8 quantization for neural networks
  - Build calibration-based quantization systems
  - Optimize CNN inference for mobile deployment

prerequisites:
  - Module 09: Spatial (CNNs)
  - Module 16: Acceleration

skills_developed:
  - Quantization techniques and mathematics
  - Post-training optimization strategies
  - Hardware-aware optimization
  - Mobile and edge deployment patterns

exports:
  - tinytorch.quantization

View File

@@ -1,29 +0,0 @@
name: Compression
number: 17
type: optimization
difficulty: advanced
estimated_hours: 8-10

description: |
  Model compression through pruning and sparsity. Students learn to identify and remove
  redundant parameters, achieving 70-80% sparsity while maintaining accuracy. Essential
  for edge deployment and mobile devices.

learning_objectives:
  - Understand sparsity and redundancy in neural networks
  - Implement magnitude-based pruning
  - Build structured and unstructured pruning
  - Measure accuracy vs model size tradeoffs

prerequisites:
  - Module 15: Acceleration
  - Module 16: Quantization

skills_developed:
  - Pruning techniques
  - Sparsity management
  - Model compression
  - Edge deployment optimization

exports:
  - tinytorch.optimizations.compression

View File

@@ -1,29 +0,0 @@
name: Caching
number: 18
type: optimization
difficulty: advanced
estimated_hours: 8-10

description: |
  Memory optimization through KV caching for transformer inference. Students learn to
  transform O(N²) attention complexity into O(N) for autoregressive generation, achieving
  dramatic speedups in transformer inference.

learning_objectives:
  - Understand attention memory complexity
  - Implement KV caching for transformers
  - Build incremental computation patterns
  - Optimize autoregressive generation

prerequisites:
  - Module 14: Transformers
  - Module 17: Compression

skills_developed:
  - KV caching implementation
  - Memory-computation tradeoffs
  - Incremental computation
  - Production inference patterns

exports:
  - tinytorch.optimizations.caching

View File

@@ -1,41 +0,0 @@
# TinyTorch Module Metadata
# Essential system information for CLI tools and build systems

# === CORE IDENTITY ===
name: "capstone"
number: 20
folder_name: "20_capstone"

# === DISPLAY ===
display:
  title: "Torch Olympics"
  subtitle: "MLPerf-Inspired Challenges"
  emoji: "🏆"

# === DEPENDENCIES ===
dependencies:
  prerequisites: ["setup", "tensor", "activations", "layers", "losses", "autograd", "optimizers", "training", "spatial", "dataloader", "tokenization", "embeddings", "attention", "transformers", "profiling", "acceleration", "quantization", "compression", "caching"]

# === BUILD SYSTEM ===
build:
  exports_to: "tinytorch.benchmarking"
  main_file: "capstone_dev.py"

# === EDUCATION ===
education:
  stage: "optimization"
  difficulty: "⭐⭐⭐⭐⭐"
  time_estimate: "6-8 hours"
  description: "TinyMLPerf Olympics - the culmination of your TinyTorch journey! Build a comprehensive benchmarking suite using your profiler from Module 19, then compete on speed, memory, and efficiency. Benchmark the models you built throughout the course to see the impact of all your optimizations."

# === CHECKPOINT ===
checkpoint:
  unlocks: 15
  capability: "Can I build unified ML frameworks across modalities?"

# === COMPONENTS ===
components:
  - "TinyMLPerf"
  - "BenchmarkSuite"
  - "PerformanceReporter"
  - "CompetitionFramework"