Files
TinyTorch/modules/source/16_regularization/module.yaml
Vijay Janapa Reddi 1d6fd4b9f7 Restructure TinyTorch into three-part learning journey (17 modules)
- Part I: Foundations (Modules 1-5) - Build MLPs, solve XOR
- Part II: Computer Vision (Modules 6-11) - Build CNNs, classify CIFAR-10
- Part III: Language Models (Modules 12-17) - Build transformers, generate text

Key changes:
- Renamed 05_dense to 05_networks for clarity
- Moved 08_dataloader to 07_dataloader (swap with attention)
- Moved 07_attention to 13_attention (Part III)
- Renamed 12_compression to 16_regularization
- Created placeholder dirs for new language modules (12,14,15,17)
- Moved old modules 13-16 to temp_holding for content migration
- Updated README with three-part structure
- Added comprehensive documentation in docs/three-part-structure.md

This structure gives students three natural exit points with concrete achievements at each level.
2025-09-22 09:50:48 -04:00

127 lines
3.8 KiB
YAML

name: "10_compression"
title: "Compression & Optimization"
description: "Making AI models efficient for real-world deployment"
version: "1.0.0"
author: "TinyTorch Team"
dependencies:
- "00_setup"
- "01_tensor"
- "03_layers"
- "04_networks"
- "09_training"
learning_goals:
- "Understand model size and deployment constraints"
- "Implement magnitude-based pruning for weight reduction"
- "Master quantization for memory efficiency"
- "Build knowledge distillation for compact models"
- "Create structured pruning for architecture optimization"
- "Compare compression techniques and their trade-offs"
components:
- name: "CompressionMetrics"
description: "Model size analysis and parameter counting"
type: "class"
- name: "prune_weights_by_magnitude"
description: "Remove unimportant weights from layers"
type: "function"
- name: "calculate_sparsity"
description: "Calculate fraction of zero weights"
type: "function"
- name: "prune_model_by_magnitude"
description: "Apply pruning to entire models"
type: "function"
- name: "quantize_layer_weights"
description: "Reduce parameter precision for memory savings"
type: "function"
- name: "DistillationLoss"
description: "Train compact models with teacher guidance"
type: "class"
- name: "prune_layer_neurons"
description: "Remove entire neurons/channels"
type: "function"
tests:
- name: "test_compression_metrics_comprehensive"
description: "Test model size analysis functionality"
- name: "test_magnitude_pruning_comprehensive"
description: "Test weight pruning algorithms"
- name: "test_quantization_comprehensive"
description: "Test precision reduction techniques"
- name: "test_distillation_comprehensive"
description: "Test knowledge distillation training"
- name: "test_structured_pruning_comprehensive"
description: "Test neuron/channel removal"
- name: "test_compression_integration_comprehensive"
description: "Test combined compression techniques"
educational_flow:
- step: 1
title: "Understanding Model Size"
description: "Learn to analyze and measure neural network parameters"
- step: 2
title: "Magnitude-Based Pruning"
description: "Remove unimportant weights based on magnitude"
- step: 3
title: "Quantization Experiments"
description: "Reduce precision for memory efficiency"
- step: 4
title: "Knowledge Distillation"
description: "Train compact models with teacher guidance"
- step: 5
title: "Structured Pruning"
description: "Remove entire neurons and channels"
- step: 6
title: "Comprehensive Comparison"
description: "Compare all techniques and combine for maximum benefit"
real_world_applications:
- "Mobile AI deployment (smartphone apps)"
- "Edge computing (IoT devices, sensors)"
- "Real-time inference (autonomous vehicles)"
- "Cost optimization (cloud inference)"
- "Battery efficiency (wearable devices)"
industry_connections:
- "MobileNet: Mobile-optimized architectures"
- "DistilBERT: Compressed language models"
- "TinyML: Microcontroller deployment"
- "Neural Architecture Search: Automated optimization"
assessment_criteria:
- "Implement 4 compression techniques correctly"
- "Understand accuracy vs efficiency trade-offs"
- "Measure compression effectiveness quantitatively"
- "Apply techniques to real neural networks"
- "Compare different compression strategies"
next_steps:
- "Module 11: Kernels - Hardware-aware optimization"
- "Module 12: Benchmarking - Performance measurement"
- "Module 13: MLOps - Production deployment"
# File Structure - What files exist in this module
files:
dev_file: "compression_dev.py"
readme: "README.md"
tests: "inline"
# Educational Metadata
difficulty: "⭐⭐⭐⭐"
time_estimate: "8-10 hours"