Module 09: Spatial Operations - CNNs for Vision

Overview

Time: 3-4 hours | Difficulty: Core Module (complexity 4/5)

Build convolutional neural networks (CNNs) - the foundation of computer vision. Learn how spatial operations enable pattern recognition in images through local connectivity and parameter sharing.

Prerequisites

Required Modules: 01-08 must be completed and tested

  • Module 01 (Tensor): Data structures
  • Module 02 (Activations): ReLU for feature detection
  • Module 03 (Layers): Linear layers foundation
  • Module 04 (Losses): CrossEntropy for classification
  • Module 05 (Autograd): Gradient computation
  • Module 06 (Optimizers): SGD/Adam for training
  • Module 07 (Training): Training loop patterns
  • Module 08 (Data): Efficient data loading

Before starting, verify prerequisites:

pytest modules/01_tensor/test_tensor.py
pytest modules/02_activations/test_activations.py
# ... test all modules 01-08

Learning Objectives

By the end of this module, you will:

Core Concepts

  1. Understand Convolutional Operations

    • Sliding window computation over spatial dimensions
    • Filter/kernel mathematics (cross-correlation)
    • Output size calculation: floor((H - K + 2P)/S) + 1
    • Why convolution works for spatial data
  2. Implement Conv2d Layers

    • Forward pass: applying filters to extract features
    • Backward pass: gradients for filters, inputs, and biases
    • Parameter sharing reduces model size vs fully-connected
    • Local connectivity captures spatial patterns
  3. Master Pooling Operations

    • MaxPool2d: dimensionality reduction while preserving features
    • Stride and kernel size trade-offs
    • Translation invariance for robust recognition
    • When to pool vs when to use strided convolution
  4. Build Spatial Hierarchies

    • Early layers: edges and textures (local patterns)
    • Middle layers: parts and shapes (combinations)
    • Deep layers: objects and scenes (high-level concepts)
    • How receptive fields grow with depth
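The sliding-window computation described above can be sketched in plain NumPy. This is an illustrative sketch only — function name, argument shapes, and signature are assumptions for this example, not TinyTorch's API (the module builds these on its own Tensor class):

```python
import numpy as np

def conv2d_naive(x, w, b, stride=1, pad=0):
    """Naive 2D cross-correlation over one image (illustrative sketch).

    x: (C_in, H, W) input, w: (C_out, C_in, K, K) filters, b: (C_out,) biases.
    """
    c_out, c_in, k, _ = w.shape
    if pad:
        x = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h_out = (x.shape[1] - k) // stride + 1   # floor((H - K + 2P)/S) + 1
    w_out = (x.shape[2] - k) // stride + 1
    out = np.zeros((c_out, h_out, w_out))
    for co in range(c_out):                  # one filter per output channel
        for i in range(h_out):
            for j in range(w_out):
                hs, ws = i * stride, j * stride
                window = x[:, hs:hs + k, ws:ws + k]   # local receptive field
                out[co, i, j] = np.sum(window * w[co]) + b[co]
    return out
```

Note that the same filter w[co] is reused at every spatial position — that reuse is exactly the parameter sharing discussed in item 2.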

Systems Understanding

  1. Computational Complexity

    • FLOPs analysis: O(N²M²K²) for naive convolution
    • Why convolution is expensive (6 nested loops)
    • Memory bottlenecks in spatial operations
    • Cache efficiency and data locality
  2. Optimization Techniques

    • Im2col algorithm: trade memory for speed
    • Vectorization strategies for convolution
    • Why GPUs excel at convolutional operations
    • Batch processing for throughput
  3. Production Considerations

    • Parameter efficiency: CNNs vs MLPs for images
    • Mobile deployment: depthwise-separable convolutions
    • Memory footprint during training (activations + gradients)
    • Inference optimization patterns
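The im2col idea mentioned above fits in a few lines: unroll every receptive field into a column of a matrix, then do the whole convolution as one big matrix multiply. A NumPy-only sketch with illustrative names (not the module's API):

```python
import numpy as np

def im2col(x, k, stride=1):
    """Unroll (C, H, W) into a (C*K*K, L) matrix, one column per receptive field."""
    c, h, w = x.shape
    h_out = (h - k) // stride + 1
    w_out = (w - k) // stride + 1
    cols = np.empty((c * k * k, h_out * w_out))
    idx = 0
    for i in range(h_out):
        for j in range(w_out):
            hs, ws = i * stride, j * stride
            cols[:, idx] = x[:, hs:hs + k, ws:ws + k].ravel()
            idx += 1
    return cols

def conv2d_im2col(x, w, b, stride=1):
    """Convolution as a single matrix multiply over the unrolled patches."""
    c_out, c_in, k, _ = w.shape
    h_out = (x.shape[1] - k) // stride + 1
    w_out = (x.shape[2] - k) // stride + 1
    cols = im2col(x, k, stride)                      # memory blow-up happens here
    out = w.reshape(c_out, -1) @ cols + b[:, None]   # one big GEMM
    return out.reshape(c_out, h_out, w_out)
```

The cols buffer replicates each input value up to K² times — that duplication is the memory cost the matmul speedup is bought with.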

ML Engineering Skills

  1. Architecture Design

    • Choosing filter sizes (1×1, 3×3, 5×5)
    • Balancing depth vs width
    • When to pool and when to stride
    • Building feature extraction pipelines
  2. Debugging Spatial Layers

    • Shape tracking through conv and pool layers
    • Gradient flow verification in deep networks
    • Common errors: dimension mismatches
    • Validating learned filters visually
  3. Performance Profiling

    • Measuring convolution speed vs input size
    • Memory usage scaling with batch size
    • Comparing naive vs optimized implementations
    • Bottleneck identification in CNN pipelines

What You'll Build

Core Components

  1. Conv2d: Convolutional layer with learnable filters
  2. MaxPool2d: Max pooling for dimensionality reduction
  3. Flatten: Reshape spatial features for classification
  4. Helper functions: Shape calculation utilities
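The pooling and flatten components above can also be sketched in plain NumPy — again illustrative names and shapes, not the module's actual classes:

```python
import numpy as np

def maxpool2d_naive(x, k=2, stride=2):
    """Naive max pooling over a (C, H, W) input (illustrative sketch)."""
    c, h, w = x.shape
    h_out = (h - k) // stride + 1
    w_out = (w - k) // stride + 1
    out = np.zeros((c, h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            hs, ws = i * stride, j * stride
            out[:, i, j] = x[:, hs:hs + k, ws:ws + k].max(axis=(1, 2))
    return out

def flatten(x):
    """Collapse (C, H, W) features into a 1-D vector for a linear classifier."""
    return x.reshape(-1)
```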

Complete CNN System

By module end, you'll have all components to build:

  • LeNet-style architectures (1998 - digit recognition)
  • Feature extraction pipelines
  • Spatial hierarchy networks
  • Ready for Milestone 04: LeNet CNN

Module Structure

modules/09_spatial/
├── README.md                 ← You are here
├── spatial_dev.py            ← Main implementation file
├── spatial_dev.ipynb         ← Jupyter notebook version
└── test_spatial.py           ← Validation tests

After This Module

Immediate Next Step

→ Milestone 04: LeNet CNN (1998) Build Yann LeCun's historic convolutional network that revolutionized digit recognition. You now have all components: Conv2d, MaxPool2d, ReLU, and training loops.

Future Modules Will Add

  • Module 10: Normalization (BatchNorm, LayerNorm)
  • Module 11: Modern architectures (ResNets, skip connections)
  • Module 12: Attention mechanisms (transformers)

What Becomes Possible

  • Image classification (MNIST, CIFAR-10)
  • Feature extraction for transfer learning
  • Spatial pattern recognition
  • Building blocks for modern vision models

Key Insights You'll Discover

Why CNNs Work

  1. Parameter Sharing: Same filter applied everywhere → fewer parameters
  2. Local Connectivity: Neurons see small regions → translation equivariance
  3. Hierarchical Features: Stack layers → learn complex patterns
  4. Spatial Structure: Preserve 2D topology → better for images
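Parameter sharing is easy to quantify. The figures below (a 3×3 conv mapping 1→8 channels on a 28×28 input, vs. a dense layer producing the same output volume) are illustrative choices for this example, not numbers from the module:

```python
k, c_in, c_out = 3, 1, 8
h = w = 28                                   # e.g. one MNIST-sized image

# Conv layer: one small filter bank, shared across every spatial position.
conv_params = c_out * c_in * k * k + c_out   # filters + biases

# Dense layer mapping the full input to the same-sized output volume.
dense_params = (c_in * h * w) * (c_out * h * w) + c_out * h * w

print(conv_params, dense_params)
```

Eighty parameters versus several million for the same output shape — that gap is the whole case for convolution on images.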

Performance Realities

  1. Convolution is Expensive: O(N²M²K²) complexity → GPUs essential
  2. Memory Scales Quadratically: Large images → huge activations
  3. Im2col Trade-off: up to K²× more activation memory (≈10× for 3×3) → large matmul speedups
  4. Batch Processing: Amortize overhead → better throughput
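The cost claim above can be made concrete with a one-line multiply-add counter (a sketch; the function name is illustrative):

```python
def conv2d_macs(c_in, c_out, h_out, w_out, k):
    """Multiply-adds for one naive conv layer: each of the
    c_out * h_out * w_out outputs needs c_in * k * k multiply-adds."""
    return c_out * h_out * w_out * c_in * k * k
```

A single mid-network layer — say 32→64 channels with 3×3 filters on a 56×56 output — already needs roughly 58 million multiply-adds per image.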

Architectural Patterns

  1. Gradual Downsampling: Increase channels, decrease spatial size
  2. 3×3 Dominance: Best balance of expressiveness and efficiency
  3. Pooling Alternatives: Strided conv can replace pooling
  4. Depth Matters: More layers → better hierarchies
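How receptive fields grow when stacking stride-1 convolutions can be captured in one line (sketch, illustrative name):

```python
def receptive_field(n_layers, k=3):
    """Receptive field of a stack of n_layers stride-1 k x k convolutions."""
    return 1 + n_layers * (k - 1)
```

Two stacked 3×3 layers see a 5×5 region, three see 7×7 — the usual argument for 3×3 dominance: the same coverage as one large filter, with fewer parameters and more nonlinearities in between.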

Tips for Success

Implementation Strategy

  1. Start Simple: Get 3×3 convolution working first
  2. Test Incrementally: Verify shapes at each step
  3. Profile Early: Measure performance to understand complexity
  4. Visualize Outputs: Check feature maps make sense

Common Pitfalls

  • ⚠️ Shape Mismatches: Track dimensions carefully through conv/pool
  • ⚠️ Memory Errors: Batch size × spatial size can be huge
  • ⚠️ Gradient Issues: Deep networks need careful initialization
  • ⚠️ Performance: Naive implementation will be slow (that's the point!)
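The memory pitfall above is worth sanity-checking before you hit it. A back-of-envelope activation-size helper (illustrative; float32 assumed):

```python
def activation_bytes(batch, c, h, w, dtype_bytes=4):
    """Rough memory for one activation tensor (float32 by default).
    Training roughly doubles this, since gradients are stored too."""
    return batch * c * h * w * dtype_bytes
```

For example, a batch of 64 images with 64 channels at 112×112 is already about 200 MB for a single activation tensor.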

Debugging Techniques

# Always print shapes during development
print(f"Input: {x.shape}")
x = conv1(x)
print(f"After conv1: {x.shape}")
x = pool1(x)
print(f"After pool1: {x.shape}")
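Shape printing catches dimension bugs; for the backward pass, a finite-difference check catches math bugs. A minimal NumPy sketch (the helper name is illustrative):

```python
import numpy as np

def numerical_grad(f, x, eps=1e-5):
    """Central-difference gradient of scalar-valued f at x.
    Compare against your analytic backward pass to verify it."""
    g = np.zeros_like(x)
    fx, fg = x.ravel(), g.ravel()   # flat views into the same memory
    for i in range(fx.size):
        orig = fx[i]
        fx[i] = orig + eps
        fp = f(x)
        fx[i] = orig - eps
        fm = f(x)
        fx[i] = orig                # restore the perturbed entry
        fg[i] = (fp - fm) / (2 * eps)
    return g
```

If the analytic gradients from your backward pass agree with this to a few decimal places, the layer's math is almost certainly right.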

Estimated Timeline

  • Part 1-2: Introduction & Math (30 minutes)
  • Part 3: Conv2d Implementation (90 minutes)
  • Part 4: MaxPool2d & Flatten (45 minutes)
  • Part 5: Systems Analysis (30 minutes)
  • Part 6: Integration & Testing (30 minutes)
  • Total: 3-4 hours with breaks

Learning Approach

This is a Core Module (complexity level 4/5):

  • Full implementation with explicit loops (see the complexity!)
  • Systems analysis reveals performance characteristics
  • Connection to production patterns (im2col, GPU kernels)
  • Immediate testing after each component

Don't rush - understanding spatial operations deeply is crucial for modern ML.

Getting Started

Open spatial_dev.py and begin with Part 1: Introduction to Spatial Operations.

Remember: You're building the foundation of computer vision. Take time to understand how these operations enable hierarchical feature learning in images.


Ready? Let's build CNNs! 🏗️