Files
TinyTorch/modules/source/10_compression/module.yaml
Vijay Janapa Reddi fc7c00c2e2 Complete compression module with 6 compression techniques
- Added CompressionMetrics for parameter counting and model size analysis
- Implemented magnitude-based pruning with sparsity calculation
- Added quantization for FP32→INT8 conversion with error tracking
- Implemented knowledge distillation with temperature scaling
- Added structured pruning with neuron removal
- Created comprehensive comparison framework
- All 6 tests passing (100% success rate)
- Module follows TinyTorch educational patterns
- Uses standard tito testing framework
- Ready for integration testing
2025-07-14 09:44:04 -04:00

117 lines
3.6 KiB
YAML

name: "10_compression"
title: "Compression & Optimization"
description: "Making AI models efficient for real-world deployment"
version: "1.0.0"
author: "TinyTorch Team"
dependencies:
- "00_setup"
- "01_tensor"
- "03_layers"
- "04_networks"
- "09_training"
learning_goals:
- "Understand model size and deployment constraints"
- "Implement magnitude-based pruning for weight reduction"
- "Master quantization for memory efficiency"
- "Build knowledge distillation for compact models"
- "Create structured pruning for architecture optimization"
- "Compare compression techniques and their trade-offs"
components:
- name: "CompressionMetrics"
description: "Model size analysis and parameter counting"
type: "class"
- name: "prune_weights_by_magnitude"
description: "Remove unimportant weights from layers"
type: "function"
- name: "calculate_sparsity"
description: "Calculate fraction of zero weights"
type: "function"
- name: "prune_model_by_magnitude"
description: "Apply pruning to entire models"
type: "function"
- name: "quantize_layer_weights"
description: "Reduce parameter precision for memory savings"
type: "function"
- name: "DistillationLoss"
description: "Train compact models with teacher guidance"
type: "class"
- name: "prune_layer_neurons"
description: "Remove entire neurons/channels"
type: "function"
tests:
- name: "test_compression_metrics_comprehensive"
description: "Test model size analysis functionality"
- name: "test_magnitude_pruning_comprehensive"
description: "Test weight pruning algorithms"
- name: "test_quantization_comprehensive"
description: "Test precision reduction techniques"
- name: "test_distillation_comprehensive"
description: "Test knowledge distillation training"
- name: "test_structured_pruning_comprehensive"
description: "Test neuron/channel removal"
- name: "test_compression_integration_comprehensive"
description: "Test combined compression techniques"
educational_flow:
- step: 1
title: "Understanding Model Size"
description: "Learn to analyze and measure neural network parameters"
- step: 2
title: "Magnitude-Based Pruning"
description: "Remove unimportant weights based on magnitude"
- step: 3
title: "Quantization Experiments"
description: "Reduce precision for memory efficiency"
- step: 4
title: "Knowledge Distillation"
description: "Train compact models with teacher guidance"
- step: 5
title: "Structured Pruning"
description: "Remove entire neurons and channels"
- step: 6
title: "Comprehensive Comparison"
description: "Compare all techniques and combine for maximum benefit"
real_world_applications:
- "Mobile AI deployment (smartphone apps)"
- "Edge computing (IoT devices, sensors)"
- "Real-time inference (autonomous vehicles)"
- "Cost optimization (cloud inference)"
- "Battery efficiency (wearable devices)"
industry_connections:
- "MobileNet: Mobile-optimized architectures"
- "DistilBERT: Compressed language models"
- "TinyML: Microcontroller deployment"
- "Neural Architecture Search: Automated optimization"
assessment_criteria:
- "Implement 4 compression techniques correctly"
- "Understand accuracy vs efficiency trade-offs"
- "Measure compression effectiveness quantitatively"
- "Apply techniques to real neural networks"
- "Compare different compression strategies"
next_steps:
- "Module 11: Kernels - Hardware-aware optimization"
- "Module 12: Benchmarking - Performance measurement"
- "Module 13: MLOps - Production deployment"