Files
TinyTorch/modules/19_benchmarking/benchmarking_dev.py
Vijay Janapa Reddi c52a5dc789 Improve module-developer guidelines and fix all module issues
- Added progressive complexity guidelines (Foundation/Intermediate/Advanced)
- Added measurement function consolidation to prevent information overload
- Fixed all diagnostic issues in losses_dev.py
- Fixed markdown formatting across all modules
- Consolidated redundant analysis functions in foundation modules
- Fixed syntax errors and unused variables
- Ensured all educational content is in proper markdown cells for Jupyter
2025-09-28 09:42:25 -04:00

1700 lines
72 KiB
Python
Raw Blame History

This file contains invisible Unicode characters
This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# %% [markdown]
"""
# Module 20: TinyMLPerf - The Ultimate ML Systems Competition
## Learning Objectives
By the end of this module, you will be able to:
1. **Build Competition Benchmarking Infrastructure**: Create standardized TinyMLPerf benchmark suite for fair competition
2. **Use Profiling Tools for Systematic Measurement**: Apply Module 15's profiler to measure real performance gains
3. **Compete Across Multiple Categories**: Optimize for speed, memory, model size, and innovation simultaneously
4. **Calculate Relative Performance Improvements**: Show speedup ratios independent of hardware differences
5. **Drive Innovation Through Competition**: Use competitive pressure to discover new optimization techniques
## The TinyMLPerf Vision
**Key Message**: Competition proves optimization mastery by measuring concrete performance improvements across all your TinyTorch implementations!
**The TinyMLPerf Journey:**
1. **Benchmark Suite**: Load standard models (MLP, CNN, Transformer) as competition workloads
2. **Profiling Integration**: Use your Module 15 profiler for rigorous performance measurement
3. **Competition Categories**: Three exciting events - MLP Sprint, CNN Marathon, Transformer Decathlon
4. **Relative Scoring**: Hardware-independent speedup measurements (3x faster = 3.0 score)
5. **Leaderboard Glory**: Track innovations and celebrate optimization achievements
"""
# %%
#| default_exp utils.benchmark
import time
import json
import hashlib
import tracemalloc
from datetime import datetime
from pathlib import Path
from typing import Dict, Any, List, Optional, Tuple, Union, Callable
import numpy as np
import pickle
# Performance measurement constants
WEIGHT_INIT_SCALE = 0.1 # Xavier-style initialization scale for stable training
NUMERICAL_EPSILON = 1e-8 # Prevent division by zero in softmax calculations
DEFAULT_WARMUP_RUNS = 3 # Number of warmup runs to stabilize CPU caches
DEFAULT_TIMING_RUNS = 5 # Minimum runs for statistical reliability
DEFAULT_PROFILER_TIMING_RUNS = 10 # More thorough profiling for detailed analysis
# Model architecture constants (for standardized benchmarks)
MLP_INPUT_SIZE = 784 # Flattened 28x28 MNIST-like images
MLP_HIDDEN1_SIZE = 128 # First hidden layer size
MLP_HIDDEN2_SIZE = 64 # Second hidden layer size
MLP_OUTPUT_SIZE = 10 # Classification output classes
CNN_CONV1_FILTERS = 32 # First convolution layer filters
CNN_CONV2_FILTERS = 64 # Second convolution layer filters
CNN_KERNEL_SIZE = 3 # Convolution kernel size (3x3)
CNN_FC_INPUT_SIZE = 1600 # Flattened conv output size
TRANSFORMER_D_MODEL = 128 # Model embedding dimension
TRANSFORMER_N_HEADS = 8 # Number of attention heads
TRANSFORMER_SEQ_LEN = 64 # Maximum sequence length
TRANSFORMER_FF_RATIO = 4 # Feed-forward expansion ratio
# Competition scoring constants
SPEED_WEIGHT = 0.7 # Weight for speed in composite scoring
INNOVATION_WEIGHT = 0.3 # Weight for innovation in composite scoring
CREATIVITY_BONUS_THRESHOLD = 3 # Minimum techniques for creativity bonus
MAX_INNOVATION_SCORE = 1.0 # Maximum possible innovation score
# Leaderboard formatting templates
LEADERBOARD_HEADER = "{rank:<6} {team:<20} {speedup:<10} {time_ms:<12} {techniques:<25}"
INNOVATION_HEADER = "{rank:<6} {team:<20} {innovation:<12} {techniques:<8} {description:<25}"
COMPOSITE_HEADER = "{rank:<6} {team:<18} {composite:<11} {speed:<9} {innovation:<11} {techniques}"
# Simplified innovation pattern keywords (easier for students to understand)
OPTIMIZATION_KEYWORDS = {
'quantization': ['quantized', 'int8'], # Reduced precision computation
'pruning': ['pruned', 'sparse'], # Removing unnecessary weights
'distillation': ['distilled', 'teacher'], # Knowledge transfer
'custom_kernels': ['custom_kernel', 'cuda', 'vectorized'], # Hardware optimization
'memory_optimization': ['memory_pool', 'in_place'], # Memory efficiency
'compression': ['compressed', 'weight_sharing'] # Model compression
}
# Import TinyTorch profiler from Module 15
def _check_profiler_availability():
"""Check if TinyTorch profiler is available and explain implications."""
try:
from tinytorch.utils.profiler import SimpleProfiler, profile_function
print("PASS TinyTorch profiler loaded - using advanced timing")
return True, SimpleProfiler, profile_function
except ImportError:
print("WARNING TinyTorch profiler not available")
print(" Make sure Module 15 (Profiling) is completed first")
print(" Using basic timing as fallback")
return False, None, None
HAS_PROFILER, SimpleProfiler, profile_function = _check_profiler_availability()
# %% [markdown]
"""
## Part 1: Understanding Benchmarking Fundamentals
Before diving into the full competition, let's understand the core concepts step by step.
"""
# %%
def simple_timing_demo():
"""TARGET Learning Checkpoint 1: Basic Performance Measurement
Understand why we need systematic timing for fair comparison.
"""
print("MAGNIFY Learning Checkpoint 1: Basic Performance Measurement")
print("=" * 60)
# Simple function to time
def slow_matrix_multiply(a, b):
"""Naive matrix multiplication - intentionally slow"""
result = np.zeros((a.shape[0], b.shape[1]))
for i in range(a.shape[0]):
for j in range(b.shape[1]):
for k in range(a.shape[1]):
result[i, j] += a[i, k] * b[k, j]
return result
def fast_matrix_multiply(a, b):
"""Optimized matrix multiplication using NumPy"""
return np.dot(a, b)
# Create test matrices
test_size = 50
matrix_a = np.random.randn(test_size, test_size).astype(np.float32)
matrix_b = np.random.randn(test_size, test_size).astype(np.float32)
print(f"📊 Timing matrix multiplication ({test_size}x{test_size})...")
# Time the slow version
start = time.perf_counter()
slow_result = slow_matrix_multiply(matrix_a, matrix_b)
slow_time = time.perf_counter() - start
# Time the fast version
start = time.perf_counter()
fast_result = fast_matrix_multiply(matrix_a, matrix_b)
fast_time = time.perf_counter() - start
# Calculate speedup
speedup = slow_time / fast_time
print(f" Slow version: {slow_time*1000:.2f} ms")
print(f" Fast version: {fast_time*1000:.2f} ms")
print(f" ROCKET Speedup: {speedup:.2f}x faster")
print(f"\nTIP Key Insight: Optimization can provide dramatic speedups!")
print(f" This is why we need systematic benchmarking to measure improvements.")
return {'slow_time': slow_time, 'fast_time': fast_time, 'speedup': speedup}
def statistical_timing_demo():
"""TARGET Learning Checkpoint 2: Why We Need Multiple Runs
Understand timing variability and the need for statistical reliability.
"""
print("\nMAGNIFY Learning Checkpoint 2: Statistical Timing Reliability")
print("=" * 60)
# Simple operation to time
def simple_operation(x):
return np.sum(x ** 2)
test_data = np.random.randn(10000).astype(np.float32)
print(f"📊 Measuring timing variability with {DEFAULT_TIMING_RUNS} runs...")
# Single timing run
start = time.perf_counter()
_ = simple_operation(test_data)
single_time = time.perf_counter() - start
# Multiple timing runs
times = []
for run in range(DEFAULT_TIMING_RUNS):
start = time.perf_counter()
_ = simple_operation(test_data)
end = time.perf_counter()
times.append(end - start)
mean_time = np.mean(times)
std_time = np.std(times)
min_time = np.min(times)
max_time = np.max(times)
print(f" Single run: {single_time*1000:.2f} ms")
print(f" Mean time: {mean_time*1000:.2f} ± {std_time*1000:.2f} ms")
print(f" Range: {min_time*1000:.2f} - {max_time*1000:.2f} ms")
variability = (std_time / mean_time) * 100
print(f" PROGRESS Variability: {variability:.1f}% coefficient of variation")
print(f"\nTIP Key Insight: Single measurements are unreliable!")
print(f" We need {DEFAULT_TIMING_RUNS}+ runs with warmup for statistical reliability.")
return {'times': times, 'mean': mean_time, 'std': std_time}
def benchmark_model_demo():
"""TARGET Learning Checkpoint 3: Model Benchmarking Basics
Understand how to benchmark ML models specifically.
"""
print("\nMAGNIFY Learning Checkpoint 3: ML Model Benchmarking")
print("=" * 60)
# Simple model for demonstration
class SimpleModel:
def __init__(self, size):
self.weights = np.random.randn(size, size).astype(np.float32) * 0.1
def predict(self, x):
return x @ self.weights
# Create models of different sizes
small_model = SimpleModel(64)
large_model = SimpleModel(256)
# Test data
batch_size = 100
small_data = np.random.randn(batch_size, 64).astype(np.float32)
large_data = np.random.randn(batch_size, 256).astype(np.float32)
print(f"📊 Comparing model sizes...")
# Benchmark small model
times = []
for _ in range(DEFAULT_TIMING_RUNS):
start = time.perf_counter()
_ = small_model.predict(small_data)
times.append(time.perf_counter() - start)
small_time = np.mean(times)
# Benchmark large model
times = []
for _ in range(DEFAULT_TIMING_RUNS):
start = time.perf_counter()
_ = large_model.predict(large_data)
times.append(time.perf_counter() - start)
large_time = np.mean(times)
print(f" Small model (64): {small_time*1000:.2f} ms")
print(f" Large model (256): {large_time*1000:.2f} ms")
print(f" 🔢 Size ratio: {256/64:.0f}x parameters")
print(f" ⏱️ Time ratio: {large_time/small_time:.1f}x slower")
print(f"\nTIP Key Insight: Model complexity directly affects inference time!")
print(f" This is why standardized models are crucial for fair competition.")
return {'small_time': small_time, 'large_time': large_time}
# %%
def run_learning_checkpoints():
"""Run all learning checkpoints to build understanding progressively"""
print("🎓 TinyMLPerf Learning Journey")
print("=" * 80)
print("Building understanding step by step...\n")
# Checkpoint 1: Basic timing
timing_results = simple_timing_demo()
# Checkpoint 2: Statistical reliability
stats_results = statistical_timing_demo()
# Checkpoint 3: Model benchmarking
model_results = benchmark_model_demo()
print("\n" + "=" * 80)
print("CELEBRATE Learning checkpoints complete! Ready for TinyMLPerf competition.")
print("=" * 80)
return {
'timing': timing_results,
'statistics': stats_results,
'models': model_results
}
# %% [markdown]
"""
### Test Learning Checkpoints
Let's run the learning checkpoints to build understanding progressively.
"""
# %%
def test_learning_checkpoints():
"""Test the learning checkpoint system"""
print("Testing learning checkpoints...")
results = run_learning_checkpoints()
print("\nPASS Learning checkpoints test complete!")
return results
# %% [markdown]
"""
## Part 2: TinyMLPerf Benchmark Suite - Standard Competition Models
Now that we understand the fundamentals, let's build the TinyMLPerf benchmark suite with three exciting competition events using standard models.
"""
# Standard benchmark models for TinyMLPerf competition events
class MLPBenchmark:
"""Standard MLP model for TinyMLPerf sprint event.
Simple 3-layer feedforward network optimized for speed competitions.
Students will optimize this architecture for fastest inference.
"""
def __init__(self):
"""Initialize MLP with standard architecture using named constants."""
# Layer 1: Input -> Hidden1 (flattened MNIST-like input)
self.layer1_weights = np.random.randn(MLP_INPUT_SIZE, MLP_HIDDEN1_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
self.layer1_bias = np.random.randn(MLP_HIDDEN1_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
# Layer 2: Hidden1 -> Hidden2
self.layer2_weights = np.random.randn(MLP_HIDDEN1_SIZE, MLP_HIDDEN2_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
self.layer2_bias = np.random.randn(MLP_HIDDEN2_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
# Layer 3: Hidden2 -> Output (classification)
self.layer3_weights = np.random.randn(MLP_HIDDEN2_SIZE, MLP_OUTPUT_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
self.layer3_bias = np.random.randn(MLP_OUTPUT_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
def forward(self, x):
"""Forward pass through 3-layer MLP with ReLU activations."""
# Layer 1: Input -> Hidden1 with ReLU
hidden1 = np.maximum(0, x @ self.layer1_weights + self.layer1_bias)
# Layer 2: Hidden1 -> Hidden2 with ReLU
hidden2 = np.maximum(0, hidden1 @ self.layer2_weights + self.layer2_bias)
# Layer 3: Hidden2 -> Output (no activation)
output = hidden2 @ self.layer3_weights + self.layer3_bias
return output
def predict(self, x):
"""Prediction interface for benchmarking."""
return self.forward(x)
class CNNBenchmark:
"""Standard CNN model for TinyMLPerf marathon event.
Simplified convolutional network for image processing competitions.
Students will optimize convolution operations and memory access patterns.
"""
def __init__(self):
"""Initialize CNN with simplified architecture using named constants."""
# Simplified CNN weights (real CNN would need proper conv operations)
self.conv1_filters = np.random.randn(CNN_KERNEL_SIZE, CNN_KERNEL_SIZE, 1, CNN_CONV1_FILTERS).astype(np.float32) * WEIGHT_INIT_SCALE
self.conv2_filters = np.random.randn(CNN_KERNEL_SIZE, CNN_KERNEL_SIZE, CNN_CONV1_FILTERS, CNN_CONV2_FILTERS).astype(np.float32) * WEIGHT_INIT_SCALE
# Fully connected layer after convolution + pooling
self.fc_weights = np.random.randn(CNN_FC_INPUT_SIZE, MLP_OUTPUT_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
self.fc_bias = np.random.randn(MLP_OUTPUT_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
def forward(self, x):
"""Forward pass through simplified CNN.
Note: This is a simplified version. Students will implement
real convolution operations for optimization.
"""
batch_size = x.shape[0]
# Simulate conv + pooling by flattening and projecting
x_flattened = x.reshape(batch_size, -1)
# Ensure correct input size (pad or truncate as needed)
if x_flattened.shape[1] != CNN_FC_INPUT_SIZE:
if x_flattened.shape[1] > CNN_FC_INPUT_SIZE:
x_flattened = x_flattened[:, :CNN_FC_INPUT_SIZE]
else:
padding = ((0, 0), (0, CNN_FC_INPUT_SIZE - x_flattened.shape[1]))
x_flattened = np.pad(x_flattened, padding, 'constant')
# Final classification layer
output = x_flattened @ self.fc_weights + self.fc_bias
return output
def predict(self, x):
"""Prediction interface for benchmarking."""
return self.forward(x)
class TransformerBenchmark:
"""Standard Transformer model for TinyMLPerf decathlon event.
Simplified attention-based model for sequence processing competitions.
Students will optimize attention mechanisms and memory usage.
"""
def __init__(self, d_model=TRANSFORMER_D_MODEL, n_heads=TRANSFORMER_N_HEADS, seq_len=TRANSFORMER_SEQ_LEN):
"""Initialize Transformer with standard attention architecture using named constants.
Args:
d_model: Model dimension (embedding size) - default from TRANSFORMER_D_MODEL
n_heads: Number of attention heads - default from TRANSFORMER_N_HEADS
seq_len: Maximum sequence length - default from TRANSFORMER_SEQ_LEN
"""
self.d_model = d_model
self.n_heads = n_heads
self.seq_len = seq_len
self.head_dim = d_model // n_heads
# Multi-head attention weights (clearer naming)
self.query_weights = np.random.randn(d_model, d_model).astype(np.float32) * WEIGHT_INIT_SCALE
self.key_weights = np.random.randn(d_model, d_model).astype(np.float32) * WEIGHT_INIT_SCALE
self.value_weights = np.random.randn(d_model, d_model).astype(np.float32) * WEIGHT_INIT_SCALE
self.output_weights = np.random.randn(d_model, d_model).astype(np.float32) * WEIGHT_INIT_SCALE
# Feed forward network weights (using standard 4x expansion ratio)
ff_dim = d_model * TRANSFORMER_FF_RATIO
self.feedforward_layer1 = np.random.randn(d_model, ff_dim).astype(np.float32) * WEIGHT_INIT_SCALE
self.feedforward_layer2 = np.random.randn(ff_dim, d_model).astype(np.float32) * WEIGHT_INIT_SCALE
def forward(self, x):
"""Forward pass through simplified transformer block.
Note: This is a simplified version. Students will implement
real multi-head attention for optimization.
"""
batch_size, seq_len, d_model = x.shape
# Self-attention computation (simplified single-head)
queries = x @ self.query_weights # [batch, seq, d_model]
keys = x @ self.key_weights
values = x @ self.value_weights
# Attention scores with proper scaling
attention_scores = queries @ keys.transpose(0, 2, 1) / np.sqrt(d_model)
# Softmax with numerical stability
exp_scores = np.exp(attention_scores - np.max(attention_scores, axis=-1, keepdims=True))
attention_weights = exp_scores / (np.sum(exp_scores, axis=-1, keepdims=True) + NUMERICAL_EPSILON)
# Apply attention to values
attention_output = attention_weights @ values # [batch, seq, d_model]
# Residual connection + layer norm (simplified)
attention_output = attention_output + x
# Feed forward network
ff_intermediate = np.maximum(0, attention_output @ self.feedforward_layer1) # ReLU
ff_output = ff_intermediate @ self.feedforward_layer2
# Another residual connection
final_output = ff_output + attention_output
# Global average pooling for classification
return np.mean(final_output, axis=1) # [batch, d_model]
def predict(self, x):
"""Prediction interface for benchmarking."""
return self.forward(x)
# %%
class TinyMLPerf:
"""
TinyMLPerf benchmark suite - The Olympics of ML Systems Optimization!
Provides three standard competition events:
- MLP Sprint: Fastest feedforward inference
- CNN Marathon: Efficient convolution operations
- Transformer Decathlon: Complete attention-based model performance
Each event uses standardized models and datasets for fair competition.
"""
def __init__(self, profiler_warmup_runs: int = DEFAULT_WARMUP_RUNS,
profiler_timing_runs: int = DEFAULT_PROFILER_TIMING_RUNS):
"""
Initialize TinyMLPerf benchmark suite.
Args:
profiler_warmup_runs: Number of warmup runs for stable measurements
profiler_timing_runs: Number of timing runs for statistical reliability
"""
self.warmup_runs = profiler_warmup_runs
self.timing_runs = profiler_timing_runs
self.benchmark_models = {}
self.benchmark_datasets = {}
print("🏆 TinyMLPerf Competition Suite Initialized!")
print("TARGET Three Events: MLP Sprint, CNN Marathon, Transformer Decathlon")
# Load standard benchmark models
self._load_benchmark_models()
self._load_benchmark_datasets()
def _load_benchmark_models(self):
"""Load standard benchmark models for each competition event"""
print("📥 Loading TinyMLPerf Benchmark Models...")
# Create instances of the standardized benchmark models
self.benchmark_models = {
'mlp_sprint': MLPBenchmark(),
'cnn_marathon': CNNBenchmark(),
'transformer_decathlon': TransformerBenchmark()
}
print("PASS Benchmark models loaded successfully!")
for event, model in self.benchmark_models.items():
print(f" 📋 {event.replace('_', ' ').title()}: {type(model).__name__}")
def _load_benchmark_datasets(self):
"""Load standard benchmark datasets for each competition event"""
print("📊 Loading TinyMLPerf Benchmark Datasets...")
# MLP Sprint dataset - MNIST-like flattened images
mlp_batch_size = 100
mlp_data = {
'inputs': np.random.randn(mlp_batch_size, MLP_INPUT_SIZE).astype(np.float32), # Batch of samples
'targets': np.eye(MLP_OUTPUT_SIZE)[np.random.randint(0, MLP_OUTPUT_SIZE, mlp_batch_size)], # One-hot labels
'event': 'MLP Sprint',
'description': 'Feedforward inference on flattened 28x28 images'
}
# CNN Marathon dataset - Image-like data
cnn_batch_size = 50
cnn_image_size = 28 # 28x28 standard image size
cnn_data = {
'inputs': np.random.randn(cnn_batch_size, cnn_image_size, cnn_image_size, 1).astype(np.float32), # Batch of images
'targets': np.eye(MLP_OUTPUT_SIZE)[np.random.randint(0, MLP_OUTPUT_SIZE, cnn_batch_size)],
'event': 'CNN Marathon',
'description': 'Convolutional inference on 28x28x1 images'
}
# Transformer Decathlon dataset - Sequence data
transformer_batch_size = 32
transformer_data = {
'inputs': np.random.randn(transformer_batch_size, TRANSFORMER_SEQ_LEN, TRANSFORMER_D_MODEL).astype(np.float32), # Batch of sequences
'targets': np.eye(MLP_OUTPUT_SIZE)[np.random.randint(0, MLP_OUTPUT_SIZE, transformer_batch_size)],
'event': 'Transformer Decathlon',
'description': 'Self-attention inference on 64-token sequences'
}
self.benchmark_datasets = {
'mlp_sprint': mlp_data,
'cnn_marathon': cnn_data,
'transformer_decathlon': transformer_data
}
print("PASS Benchmark datasets loaded successfully!")
for event, data in self.benchmark_datasets.items():
print(f" TARGET {data['event']}: {data['inputs'].shape} -> {data['targets'].shape}")
def load_benchmark(self, event_name: str) -> Tuple[Any, Dict[str, Any]]:
"""
Load a specific benchmark model and dataset.
Args:
event_name: Name of competition event ('mlp_sprint', 'cnn_marathon', 'transformer_decathlon')
Returns:
Tuple of (model, dataset) for the specified event
"""
if event_name not in self.benchmark_models:
available = list(self.benchmark_models.keys())
raise ValueError(f"Event '{event_name}' not found. Available: {available}")
model = self.benchmark_models[event_name]
dataset = self.benchmark_datasets[event_name]
print(f"📋 Loaded benchmark: {dataset['event']}")
print(f" Model: {type(model).__name__}")
print(f" Data: {dataset['description']}")
return model, dataset
def get_available_events(self) -> Dict[str, str]:
"""Get list of available competition events with descriptions"""
return {
'mlp_sprint': 'Fastest feedforward neural network inference',
'cnn_marathon': 'Efficient convolutional neural network processing',
'transformer_decathlon': 'Complete attention mechanism optimization'
}
# %% [markdown]
"""
### Test TinyMLPerf Benchmark Suite
Let's test the benchmark suite to ensure all models and datasets load correctly.
"""
# %%
def test_tinymlperf_benchmark_suite():
"""Test the TinyMLPerf benchmark suite"""
print("Testing TinyMLPerf Benchmark Suite...")
# Initialize benchmark suite
benchmark_suite = TinyMLPerf(profiler_warmup_runs=2, profiler_timing_runs=3)
# Test each event
events = benchmark_suite.get_available_events()
print(f"\n🏆 Available Events: {len(events)}")
for event_name, description in events.items():
print(f"\n📋 Testing {event_name}...")
model, dataset = benchmark_suite.load_benchmark(event_name)
# Test model inference
inputs = dataset['inputs']
outputs = model.predict(inputs)
print(f" PASS Inference successful: {inputs.shape} -> {outputs.shape}")
# Verify output shape makes sense
batch_size = inputs.shape[0]
assert outputs.shape[0] == batch_size, f"Batch size mismatch: {outputs.shape[0]} != {batch_size}"
print(f" PASS Output shape verified")
print(f"\nPASS TinyMLPerf benchmark suite test complete!")
return benchmark_suite
# %% [markdown]
"""
## Part 2: Performance Benchmarking Using Module 15's Profiler
Now let's build the core benchmarking infrastructure that uses the profiler from Module 15 to measure performance.
"""
# %%
class CompetitionProfiler:
"""
Competition profiling infrastructure using TinyTorch's Module 15 profiler.
Provides rigorous performance measurement for fair competition by:
- Using standardized profiling from Module 15
- Multiple timing runs with statistical analysis
- Memory usage tracking and analysis
- Hardware-independent relative scoring
"""
def __init__(self, warmup_runs: int = DEFAULT_WARMUP_RUNS,
timing_runs: int = DEFAULT_PROFILER_TIMING_RUNS):
"""
Initialize competition profiler.
Args:
warmup_runs: Number of warmup runs to stabilize performance
timing_runs: Number of timing runs for statistical reliability
"""
self.warmup_runs = warmup_runs
self.timing_runs = timing_runs
self.has_profiler = HAS_PROFILER
if not self.has_profiler:
print("WARNING Warning: Advanced profiling unavailable, using basic timing")
else:
print("PASS Using TinyTorch Module 15 profiler for advanced metrics")
def benchmark_model(self, model, dataset: Dict[str, Any]) -> Dict[str, Any]:
"""
Benchmark a model using rigorous profiling methodology.
Args:
model: Model to benchmark (must have predict() or forward() method)
dataset: Dataset dictionary with 'inputs' key
Returns:
Comprehensive benchmarking results with performance metrics
"""
print(f"🏁 Benchmarking {dataset.get('event', 'Model')}...")
inputs = dataset['inputs']
results = {
'event': dataset.get('event', 'Unknown'),
'model_type': type(model).__name__,
'input_shape': inputs.shape,
'benchmark_timestamp': datetime.now().isoformat()
}
if self.has_profiler:
# Use advanced profiling from Module 15
results.update(self._profile_with_tinytorch_profiler(model, inputs))
else:
# Fallback to basic timing
results.update(self._profile_basic_timing(model, inputs))
self._print_benchmark_results(results)
return results
def quick_benchmark(self, model, dataset: Dict[str, Any]) -> float:
"""
Simple benchmarking returning just the mean inference time.
This is a simplified interface for students who just want basic timing.
Args:
model: Model to benchmark
dataset: Dataset dictionary with 'inputs' key
Returns:
Mean inference time in seconds
"""
results = self._run_basic_profiling(model, dataset['inputs'])
return results['mean_inference_time']
def compare_models(self, model, baseline_model, dataset: Dict[str, Any]) -> Dict[str, Any]:
"""
Compare two models directly with simplified interface.
Args:
model: Optimized model to test
baseline_model: Baseline model for comparison
dataset: Dataset dictionary with 'inputs' key
Returns:
Comparison results with speedup information
"""
print(f"🏁 Comparing models for {dataset.get('event', 'Model')}...")
# Benchmark both models
baseline_results = self._run_basic_profiling(baseline_model, dataset['inputs'])
model_results = self._run_basic_profiling(model, dataset['inputs'])
# Calculate speedup
speedup = baseline_results['mean_inference_time'] / model_results['mean_inference_time']
comparison = {
'baseline_time': baseline_results['mean_inference_time'],
'optimized_time': model_results['mean_inference_time'],
'speedup': speedup,
'event': dataset.get('event', 'Unknown'),
'baseline_model': type(baseline_model).__name__,
'optimized_model': type(model).__name__
}
print(f"📊 Baseline: {comparison['baseline_time']*1000:.2f} ms")
print(f"📊 Optimized: {comparison['optimized_time']*1000:.2f} ms")
print(f"ROCKET Speedup: {speedup:.2f}x {'faster' if speedup > 1.0 else 'slower'}")
return comparison
def benchmark_with_baseline(self, model, dataset: Dict[str, Any], baseline_time: float) -> Dict[str, Any]:
"""
Benchmark a model against a known baseline time.
Args:
model: Model to benchmark
dataset: Dataset dictionary with 'inputs' key
baseline_time: Baseline time in seconds for speedup calculation
Returns:
Benchmark results with speedup calculation
"""
results = self.benchmark_model(model, dataset)
speedup = baseline_time / results['mean_inference_time']
results['speedup_vs_baseline'] = speedup
print(f"ROCKET Speedup vs baseline: {speedup:.2f}x {'faster' if speedup > 1.0 else 'slower'}")
return results
def _run_basic_profiling(self, model, inputs: np.ndarray) -> Dict[str, Any]:
"""
Run basic profiling without complex options.
This is used by simplified interfaces.
"""
if self.has_profiler:
return self._profile_with_tinytorch_profiler(model, inputs)
else:
return self._profile_basic_timing(model, inputs)
def _profile_with_tinytorch_profiler(self, model, inputs: np.ndarray) -> Dict[str, Any]:
"""Profile using Module 15's advanced profiler"""
profiler = SimpleProfiler(track_memory=True, track_cpu=True)
# Run profiling sessions
profile_results = self._run_profiling_sessions(profiler, model, inputs)
# Calculate statistics
return self._calculate_profiling_statistics(profile_results)
def _run_profiling_sessions(self, profiler, model, inputs: np.ndarray) -> List[Dict[str, Any]]:
"""Run multiple profiling sessions for statistical reliability."""
profile_results = []
for run in range(self.timing_runs):
# Each profiling session includes warmup
result = profiler.profile(
model.predict, inputs,
name=f"inference_run_{run}",
warmup=True # Profiler handles warmup
)
profile_results.append(result)
return profile_results
def _calculate_profiling_statistics(self, profile_results: List[Dict[str, Any]]) -> Dict[str, Any]:
"""Calculate timing and memory statistics from profile results."""
# Extract timing data
wall_times = [r['wall_time'] for r in profile_results]
cpu_times = [r['cpu_time'] for r in profile_results]
# Calculate timing statistics
timing_stats = {
'mean_inference_time': np.mean(wall_times),
'std_inference_time': np.std(wall_times),
'min_inference_time': np.min(wall_times),
'max_inference_time': np.max(wall_times),
'p95_inference_time': np.percentile(wall_times, 95),
'mean_cpu_time': np.mean(cpu_times),
'cpu_efficiency': np.mean([r['cpu_efficiency'] for r in profile_results]),
'profiling_method': 'TinyTorch Module 15 Profiler'
}
# Add memory statistics
memory_stats = self._extract_memory_statistics(profile_results)
timing_stats.update(memory_stats)
return timing_stats
def _extract_memory_statistics(self, profile_results: List[Dict[str, Any]]) -> Dict[str, Any]:
"""Extract memory statistics from profiling results."""
# Use last run as most representative
last_result = profile_results[-1]
memory_stats = {}
if 'memory_delta_mb' in last_result:
memory_stats.update({
'memory_delta_mb': last_result['memory_delta_mb'],
'peak_memory_mb': last_result['peak_memory_mb'],
'result_size_mb': last_result.get('result_size_mb', 0)
})
return memory_stats
def _profile_basic_timing(self, model, inputs: np.ndarray) -> Dict[str, Any]:
"""Fallback basic timing without advanced profiling"""
# Warmup runs
for _ in range(self.warmup_runs):
_ = model.predict(inputs)
# Timing runs
times = []
for _ in range(self.timing_runs):
start = time.perf_counter()
_ = model.predict(inputs)
end = time.perf_counter()
times.append(end - start)
return {
'mean_inference_time': np.mean(times),
'std_inference_time': np.std(times),
'min_inference_time': np.min(times),
'max_inference_time': np.max(times),
'p95_inference_time': np.percentile(times, 95),
'profiling_method': 'Basic Timing'
}
def _print_benchmark_results(self, results: Dict[str, Any]):
"""Print formatted benchmark results"""
print(f"\n📊 {results['event']} Benchmark Results:")
print(f" Model: {results['model_type']}")
print(f" Input: {results['input_shape']}")
print(f" Mean Time: {results['mean_inference_time']*1000:.2f} ± {results['std_inference_time']*1000:.2f} ms")
print(f" Best Time: {results['min_inference_time']*1000:.2f} ms")
print(f" P95 Time: {results['p95_inference_time']*1000:.2f} ms")
if 'speedup_vs_baseline' in results:
print(f" ROCKET Speedup: {results['speedup_vs_baseline']:.2f}x faster")
if 'memory_delta_mb' in results:
print(f" 💾 Memory: {results['memory_delta_mb']:.2f} MB delta, {results['peak_memory_mb']:.2f} MB peak")
print(f" 📏 Method: {results['profiling_method']}")
# %% [markdown]
"""
### Test Competition Profiler
Let's test the competition profiler with TinyMLPerf benchmark models.
"""
# %%
def test_competition_profiler():
"""Test the competition profiler with benchmark models"""
print("Testing Competition Profiler...")
# Initialize TinyMLPerf and profiler
benchmark_suite = TinyMLPerf(profiler_warmup_runs=2, profiler_timing_runs=3)
competition_profiler = CompetitionProfiler(warmup_runs=2, timing_runs=3)
# Test MLP Sprint profiling
mlp_model, mlp_dataset = benchmark_suite.load_benchmark('mlp_sprint')
mlp_results = competition_profiler.benchmark_model(mlp_model, mlp_dataset)
# Test CNN Marathon profiling
cnn_model, cnn_dataset = benchmark_suite.load_benchmark('cnn_marathon')
cnn_results = competition_profiler.benchmark_model(cnn_model, cnn_dataset)
# Test speedup calculation with baseline
print(f"\n🏃 Testing Speedup Calculation...")
cnn_speedup_results = competition_profiler.benchmark_with_baseline(
cnn_model, cnn_dataset,
baseline_time=mlp_results['mean_inference_time'] # Use MLP as baseline
)
print(f"\nPASS Competition profiler test complete!")
return competition_profiler, mlp_results, cnn_results
# %% [markdown]
"""
## Part 3: Simplified Competition Framework - Focused Leaderboards
Let's build a simplified competition framework with focused classes and clear responsibilities.
"""
# %%
class CompetitionSubmission:
"""Handles creation and validation of individual competition submissions."""
def __init__(self, team_name: str, event_name: str, optimized_model,
optimization_description: str = "", github_url: str = ""):
"""Create a competition submission."""
self.team_name = team_name
self.event_name = event_name
self.optimized_model = optimized_model
self.optimization_description = optimization_description
self.github_url = github_url
self.submission_id = self._generate_id()
self.timestamp = datetime.now().isoformat()
def _generate_id(self) -> str:
"""Generate unique submission ID."""
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
team_hash = hashlib.md5(self.team_name.encode()).hexdigest()[:6]
return f"{self.event_name}_{team_hash}_{timestamp}"
def to_dict(self) -> Dict[str, Any]:
"""Convert submission to dictionary for storage."""
return {
'submission_id': self.submission_id,
'timestamp': self.timestamp,
'team_name': self.team_name,
'event_name': self.event_name,
'optimization_description': self.optimization_description,
'github_url': self.github_url
}
class CompetitionStorage:
"""Handles saving and loading competition results."""
def __init__(self, results_dir: str = "tinymlperf_results"):
"""Initialize storage with results directory."""
self.results_dir = Path(results_dir)
self.results_dir.mkdir(exist_ok=True)
def save_submission(self, submission_data: Dict[str, Any]):
"""Save submission to storage."""
filename = f"{submission_data['submission_id']}.json"
filepath = self.results_dir / filename
with open(filepath, 'w') as f:
json.dump(submission_data, f, indent=2, default=str)
print(f"💾 Submission saved: {filepath}")
def load_event_submissions(self, event_name: str) -> List[Dict[str, Any]]:
"""Load all submissions for a specific event."""
submissions = []
for filepath in self.results_dir.glob(f"{event_name}_*.json"):
try:
with open(filepath, 'r') as f:
submission = json.load(f)
submissions.append(submission)
except Exception as e:
print(f"Warning: Could not load {filepath}: {e}")
return submissions
class SimpleInnovationDetector:
"""Simple innovation detection using basic keyword matching."""
def detect_techniques(self, description: str) -> List[str]:
"""Detect optimization techniques using simple keywords."""
description_lower = description.lower()
detected = []
for technique, keywords in OPTIMIZATION_KEYWORDS.items():
for keyword in keywords:
if keyword in description_lower:
detected.append(technique)
break # Only count each technique once
return detected
def calculate_innovation_score(self, detected_techniques: List[str]) -> float:
"""Calculate innovation score based on number of techniques."""
base_score = len(detected_techniques) * 0.2
# Bonus for multiple techniques
if len(detected_techniques) >= 3:
base_score += 0.3
return min(base_score, MAX_INNOVATION_SCORE)
class CompetitionLeaderboard:
"""Focused leaderboard display with configurable sorting."""
def __init__(self, storage: CompetitionStorage):
"""Initialize leaderboard with storage backend."""
self.storage = storage
self.innovation_detector = SimpleInnovationDetector()
def display_leaderboard(self, event_name: str, sort_by: str = 'speed', top_n: int = 10) -> List[Dict[str, Any]]:
"""Display leaderboard with configurable sorting.
Args:
event_name: Event to show leaderboard for
sort_by: 'speed', 'innovation', or 'composite'
top_n: Number of top entries to display
"""
submissions = self.storage.load_event_submissions(event_name)
if not submissions:
print(f"🏆 {event_name.replace('_', ' ').title()} Leaderboard ({sort_by.title()})")
print("No submissions yet! Be the first to compete!")
return []
# Add innovation scores if needed
if sort_by in ['innovation', 'composite']:
self._add_innovation_scores(submissions)
# Sort submissions
sorted_submissions = self._sort_submissions(submissions, sort_by)
top_submissions = sorted_submissions[:top_n]
# Display leaderboard
self._display_formatted_leaderboard(event_name, top_submissions, sort_by)
return top_submissions
def _add_innovation_scores(self, submissions: List[Dict[str, Any]]):
"""Add innovation scores to submissions that don't have them."""
for submission in submissions:
if 'innovation_score' not in submission:
techniques = self.innovation_detector.detect_techniques(
submission.get('optimization_description', '')
)
submission['detected_techniques'] = techniques
submission['innovation_score'] = self.innovation_detector.calculate_innovation_score(techniques)
# Calculate composite score if speedup exists
if 'speedup_score' in submission:
submission['composite_score'] = (
SPEED_WEIGHT * submission['speedup_score'] +
INNOVATION_WEIGHT * submission['innovation_score']
)
def _sort_submissions(self, submissions: List[Dict[str, Any]], sort_by: str) -> List[Dict[str, Any]]:
"""Sort submissions by specified criteria."""
if sort_by == 'speed':
return sorted(submissions, key=lambda s: s.get('speedup_score', 0), reverse=True)
elif sort_by == 'innovation':
return sorted(submissions, key=lambda s: s.get('innovation_score', 0), reverse=True)
elif sort_by == 'composite':
return sorted(submissions, key=lambda s: s.get('composite_score', 0), reverse=True)
else:
raise ValueError(f"Unknown sort type: {sort_by}")
def _display_formatted_leaderboard(self, event_name: str, submissions: List[Dict[str, Any]], sort_by: str):
"""Display formatted leaderboard based on sort type."""
print(f"\n🏆 TINYMLPERF LEADERBOARD - {event_name.replace('_', ' ').title()} ({sort_by.title()})")
print("=" * 80)
if sort_by == 'speed':
self._display_speed_leaderboard(submissions)
elif sort_by == 'innovation':
self._display_innovation_leaderboard(submissions)
elif sort_by == 'composite':
self._display_composite_leaderboard(submissions)
print("-" * 80)
print(f"Showing top {len(submissions)} submissions")
def _display_speed_leaderboard(self, submissions: List[Dict[str, Any]]):
"""Display speed-focused leaderboard."""
print(LEADERBOARD_HEADER.format(
rank="Rank", team="Team", speedup="Speedup", time_ms="Time (ms)", techniques="Techniques"
))
print("-" * 80)
for i, submission in enumerate(submissions):
rank = i + 1
team = submission['team_name'][:19]
speedup = f"{submission.get('speedup_score', 0):.2f}x"
time_ms = f"{submission.get('submission_time_ms', 0):.2f}"
techniques = submission.get('optimization_description', '')[:24]
print(LEADERBOARD_HEADER.format(
rank=rank, team=team, speedup=speedup, time_ms=time_ms, techniques=techniques
))
def _display_innovation_leaderboard(self, submissions: List[Dict[str, Any]]):
"""Display innovation-focused leaderboard."""
print(INNOVATION_HEADER.format(
rank="Rank", team="Team", innovation="Innovation", techniques="Tech#", description="Description"
))
print("-" * 80)
for i, submission in enumerate(submissions):
rank = i + 1
team = submission['team_name'][:19]
innovation = f"{submission.get('innovation_score', 0):.3f}"
num_tech = len(submission.get('detected_techniques', []))
description = submission.get('optimization_description', '')[:24]
print(INNOVATION_HEADER.format(
rank=rank, team=team, innovation=innovation, techniques=num_tech, description=description
))
def _display_composite_leaderboard(self, submissions: List[Dict[str, Any]]):
"""Display composite leaderboard."""
print(COMPOSITE_HEADER.format(
rank="Rank", team="Team", composite="Composite", speed="Speed", innovation="Innovation", techniques="Techniques"
))
print("-" * 80)
for i, submission in enumerate(submissions):
rank = i + 1
team = submission['team_name'][:17]
composite = f"{submission.get('composite_score', 0):.3f}"
speed = f"{submission.get('speedup_score', 0):.2f}x"
innovation = f"{submission.get('innovation_score', 0):.3f}"
techniques = ", ".join(submission.get('detected_techniques', [])[:2])[:15]
print(COMPOSITE_HEADER.format(
rank=rank, team=team, composite=composite, speed=speed, innovation=innovation, techniques=techniques
))
class TinyMLPerfCompetition:
"""
TinyMLPerf Competition Framework - The Olympics of ML Optimization!
Manages three exciting competition events:
- MLP Sprint: Fastest feedforward network
- CNN Marathon: Most efficient convolutions
- Transformer Decathlon: Ultimate attention optimization
Features hardware-independent relative scoring and transparent leaderboards.
"""
def __init__(self, results_dir: str = "tinymlperf_results"):
"""
Initialize TinyMLPerf competition.
Args:
results_dir: Directory to store competition results and leaderboards
"""
self.results_dir = Path(results_dir)
self.results_dir.mkdir(exist_ok=True)
self.tinyperf = TinyMLPerf()
self.profiler = CompetitionProfiler(warmup_runs=DEFAULT_WARMUP_RUNS,
timing_runs=DEFAULT_TIMING_RUNS)
# Initialize storage and leaderboard components
self.storage = CompetitionStorage(results_dir)
self.leaderboard = CompetitionLeaderboard(self.storage)
# Load baseline models for relative scoring
self.baselines = self._establish_baselines()
print("🏆 TinyMLPerf Competition Initialized!")
print("TARGET Three Events Ready for Competition!")
def _establish_baselines(self) -> Dict[str, float]:
"""Establish baseline performance for relative scoring."""
print("📏 Establishing baseline performance for relative scoring...")
baselines = {}
events = ['mlp_sprint', 'cnn_marathon', 'transformer_decathlon']
for event in events:
model, dataset = self.tinyperf.load_benchmark(event)
results = self.profiler.benchmark_model(model, dataset)
baselines[event] = results['mean_inference_time']
print(f" {event}: {baselines[event]*1000:.2f} ms baseline")
return baselines
def submit_entry(self, team_name: str, event_name: str, optimized_model,
optimization_description: str = "", github_url: str = "") -> Dict[str, Any]:
"""Submit an optimized model to TinyMLPerf competition.
Args:
team_name: Name of the competing team
event_name: Competition event ('mlp_sprint', 'cnn_marathon', 'transformer_decathlon')
optimized_model: The optimized model to submit
optimization_description: Description of optimization techniques used
github_url: Link to code repository (for transparency)
Returns:
Submission results with performance metrics and scoring
"""
# Validate event
if event_name not in self.baselines:
available = list(self.baselines.keys())
print(f"FAIL Event '{event_name}' not recognized!")
print("TARGET Available competitions:")
for event in available:
print(f"{event.replace('_', ' ').title()}")
return None
print(f"ROCKET TINYMLPERF SUBMISSION")
print(f"🏆 Event: {event_name.replace('_', ' ').title()}")
print(f"👥 Team: {team_name}")
print("-" * 60)
# Load benchmark dataset for this event
_, dataset = self.tinyperf.load_benchmark(event_name)
# Benchmark the submitted model with baseline comparison
results = self.profiler.benchmark_with_baseline(
optimized_model, dataset,
baseline_time=self.baselines[event_name]
)
# Calculate competition score (relative speedup)
baseline_time = self.baselines[event_name]
submission_time = results['mean_inference_time']
speedup_score = baseline_time / submission_time
# Create submission record
submission = {
'submission_id': self._generate_submission_id(team_name, event_name),
'timestamp': datetime.now().isoformat(),
'team_name': team_name,
'event_name': event_name,
'optimization_description': optimization_description,
'github_url': github_url,
'performance_metrics': results,
'speedup_score': speedup_score,
'baseline_time_ms': baseline_time * 1000,
'submission_time_ms': submission_time * 1000
}
# Save submission to storage
self.storage.save_submission(submission)
# Display submission results
self._display_submission_results(submission)
return submission
def _generate_submission_id(self, team_name: str, event_name: str) -> str:
"""Generate unique submission ID"""
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
team_hash = hashlib.md5(team_name.encode()).hexdigest()[:6]
return f"{event_name}_{team_hash}_{timestamp}"
def _benchmark_submission(self, submission: CompetitionSubmission) -> Dict[str, Any]:
"""Benchmark a submission and calculate scores."""
# Load benchmark dataset
_, dataset = self.tinyperf.load_benchmark(submission.event_name)
# Run profiling
results = self.profiler.benchmark_model(
submission.optimized_model, dataset,
baseline_time=self.baselines[submission.event_name]
)
# Calculate scores
baseline_time = self.baselines[submission.event_name]
submission_time = results['mean_inference_time']
speedup_score = baseline_time / submission_time
# Create submission data
submission_data = submission.to_dict()
submission_data.update({
'performance_metrics': results,
'speedup_score': speedup_score,
'baseline_time_ms': baseline_time * 1000,
'submission_time_ms': submission_time * 1000
})
return submission_data
def _display_submission_results(self, submission: Dict[str, Any]):
"""Display formatted submission results."""
metrics = submission['performance_metrics']
speedup = submission['speedup_score']
print(f"\n🏆 SUBMISSION RESULTS")
print(f"=" * 50)
print(f"Team: {submission['team_name']}")
print(f"Event: {submission['event_name'].replace('_', ' ').title()}")
print(f"\n⏱️ Performance:")
print(f" Your Time: {submission['submission_time_ms']:.2f} ms")
print(f" Baseline: {submission['baseline_time_ms']:.2f} ms")
print(f" ROCKET Speedup: {speedup:.2f}x {'FASTER' if speedup > 1.0 else 'slower'}")
if 'memory_delta_mb' in metrics:
print(f" 💾 Memory: {metrics['memory_delta_mb']:.2f} MB")
# Award celebration for good performance
if speedup >= 3.0:
print(f"\nCELEBRATE AMAZING! 3x+ speedup achieved!")
elif speedup >= 2.0:
print(f"\n🏆 EXCELLENT! 2x+ speedup!")
elif speedup >= 1.5:
print(f"\n⭐ GREAT! 50%+ speedup!")
elif speedup >= 1.1:
print(f"\nPASS Good optimization!")
else:
print(f"\nTHINK Keep optimizing - you can do better!")
if submission['optimization_description']:
print(f"\nTIP Techniques Used:")
print(f" {submission['optimization_description']}")
def display_leaderboard(self, event_name: str, sort_by: str = 'speed', top_n: int = 10) -> List[Dict[str, Any]]:
"""Display leaderboard for specific event with configurable sorting.
Args:
event_name: Event to show leaderboard for
sort_by: 'speed', 'innovation', or 'composite'
top_n: Number of top entries to display
"""
return self.leaderboard.display_leaderboard(event_name, sort_by, top_n)
def display_all_leaderboards(self, sort_by: str = 'speed'):
"""Display leaderboards for all events.
Args:
sort_by: 'speed', 'innovation', or 'composite'
"""
events = ['mlp_sprint', 'cnn_marathon', 'transformer_decathlon']
for event in events:
self.display_leaderboard(event, sort_by=sort_by, top_n=5)
print()
def get_team_progress(self, team_name: str) -> Dict[str, List[Dict[str, Any]]]:
"""Get all submissions from a specific team across all events."""
team_submissions = {'mlp_sprint': [], 'cnn_marathon': [], 'transformer_decathlon': []}
for event in team_submissions.keys():
submissions = self.storage.load_event_submissions(event)
team_submissions[event] = [
s for s in submissions if s['team_name'] == team_name
]
# Sort by timestamp
team_submissions[event].sort(key=lambda s: s['timestamp'])
return team_submissions
# %% [markdown]
"""
### Test TinyMLPerf Competition Framework
Let's test the competition framework with multiple team submissions and leaderboards.
"""
# %%
def test_tinymlperf_competition():
"""Test the TinyMLPerf competition framework"""
print("Testing TinyMLPerf Competition Framework...")
# Initialize competition
competition = TinyMLPerfCompetition()
# Create some test optimized models
class FastMLPModel:
"""Simulated optimized MLP - smaller and faster"""
def __init__(self):
# Smaller model for speed
self.weights1 = np.random.randn(784, 64).astype(np.float32) * 0.1
self.bias1 = np.random.randn(64).astype(np.float32) * 0.1
self.weights2 = np.random.randn(64, 10).astype(np.float32) * 0.1
self.bias2 = np.random.randn(10).astype(np.float32) * 0.1
def predict(self, x):
h1 = np.maximum(0, x @ self.weights1 + self.bias1)
return h1 @ self.weights2 + self.bias2
class EfficientCNNModel:
"""Simulated optimized CNN"""
def __init__(self):
# Optimized weights
self.fc_weights = np.random.randn(1600, 10).astype(np.float32) * 0.05
self.fc_bias = np.random.randn(10).astype(np.float32) * 0.05
def predict(self, x):
batch_size = x.shape[0]
x_flat = x.reshape(batch_size, -1)
if x_flat.shape[1] != 1600:
x_flat = x_flat[:, :1600] if x_flat.shape[1] > 1600 else np.pad(x_flat, ((0, 0), (0, 1600 - x_flat.shape[1])), 'constant')
return x_flat @ self.fc_weights + self.fc_bias
# Submit optimized models to competition
print("\nROCKET Submitting Competition Entries...")
# MLP Sprint submissions
mlp_submission1 = competition.submit_entry(
team_name="Speed Demons",
event_name="mlp_sprint",
optimized_model=FastMLPModel(),
optimization_description="Reduced hidden layer size for 2x speedup",
github_url="https://github.com/speed-demons/fast-mlp"
)
mlp_submission2 = competition.submit_entry(
team_name="Lightning Fast",
event_name="mlp_sprint",
optimized_model=FastMLPModel(),
optimization_description="Quantization + kernel optimization",
github_url="https://github.com/lightning-fast/mlp-opt"
)
# CNN Marathon submission
cnn_submission = competition.submit_entry(
team_name="CNN Champions",
event_name="cnn_marathon",
optimized_model=EfficientCNNModel(),
optimization_description="Custom convolution kernels + memory optimization",
github_url="https://github.com/cnn-champions/efficient-cnn"
)
# Display leaderboards
print("\n📊 Competition Leaderboards:")
competition.display_all_leaderboards()
print("\nPASS TinyMLPerf competition framework test complete!")
return competition
# %% [markdown]
"""
## Part 4: Simplified Competition Testing
Let's test the simplified competition framework with all three leaderboard types.
"""
# %%
def test_simplified_competition_features():
"""Test the simplified competition framework with all leaderboard types."""
print("Testing Simplified Competition Framework with All Leaderboard Types...")
# Initialize competition
competition = TinyMLPerfCompetition()
# Create optimized models with different innovation descriptions
class FastMLPModel:
"""Simulated optimized MLP - smaller and faster"""
def __init__(self):
# Smaller model for speed
self.weights1 = np.random.randn(784, 64).astype(np.float32) * 0.1
self.bias1 = np.random.randn(64).astype(np.float32) * 0.1
self.weights2 = np.random.randn(64, 10).astype(np.float32) * 0.1
self.bias2 = np.random.randn(10).astype(np.float32) * 0.1
def predict(self, x):
h1 = np.maximum(0, x @ self.weights1 + self.bias1)
return h1 @ self.weights2 + self.bias2
class EfficientCNNModel:
"""Simulated optimized CNN"""
def __init__(self):
# Optimized weights
self.fc_weights = np.random.randn(1600, 10).astype(np.float32) * 0.05
self.fc_bias = np.random.randn(10).astype(np.float32) * 0.05
def predict(self, x):
batch_size = x.shape[0]
x_flat = x.reshape(batch_size, -1)
if x_flat.shape[1] != 1600:
x_flat = x_flat[:, :1600] if x_flat.shape[1] > 1600 else np.pad(x_flat, ((0, 0), (0, 1600 - x_flat.shape[1])), 'constant')
return x_flat @ self.fc_weights + self.fc_bias
# Submit entries with different optimization descriptions
print("\nROCKET Submitting Competition Entries...")
# MLP submissions with different techniques
submission1 = competition.submit_entry(
team_name="Speed Demons",
event_name="mlp_sprint",
optimized_model=FastMLPModel(),
optimization_description="Reduced hidden layer size for 2x speedup",
github_url="https://github.com/speed-demons/fast-mlp"
)
submission2 = competition.submit_entry(
team_name="Quantized Team",
event_name="mlp_sprint",
optimized_model=FastMLPModel(),
optimization_description="INT8 quantization with custom kernels",
github_url="https://github.com/quantized-team/mlp-opt"
)
submission3 = competition.submit_entry(
team_name="Pruning Pros",
event_name="cnn_marathon",
optimized_model=EfficientCNNModel(),
optimization_description="Sparse pruned model with distillation",
github_url="https://github.com/pruning-pros/efficient-cnn"
)
# Test all three leaderboard types
print("\n📊 Testing All Leaderboard Types:")
print("\n1. Speed Leaderboard:")
competition.display_leaderboard("mlp_sprint", sort_by="speed", top_n=5)
print("\n2. Innovation Leaderboard:")
competition.display_leaderboard("mlp_sprint", sort_by="innovation", top_n=5)
print("\n3. Composite Leaderboard:")
competition.display_leaderboard("mlp_sprint", sort_by="composite", top_n=5)
print("\nPASS Simplified competition features test complete!")
return competition
# %% [markdown]
"""
## Comprehensive Testing
Let's run a complete TinyMLPerf competition demonstration with simplified features.
"""
def run_complete_tinymlperf_demo():
"""Run comprehensive TinyMLPerf competition demonstration"""
print("🏆 TINYMLPERF - THE ULTIMATE ML SYSTEMS COMPETITION")
print("=" * 80)
print("\n1. 🏗️ Setting up TinyMLPerf Benchmark Suite...")
# Test benchmark suite
benchmark_suite = test_tinymlperf_benchmark_suite()
print("\n2. SPEED Testing Competition Profiling...")
# Test profiling infrastructure
competition_profiler, mlp_results, cnn_results = test_competition_profiler()
print("\n3. ROCKET Running Basic Competition...")
# Test basic competition
basic_competition = test_tinymlperf_competition()
print("\n4. 🔬 Testing Simplified Competition Features...")
# Test simplified competition with all leaderboard types
simplified_competition = test_simplified_competition_features()
print("\n" + "=" * 80)
print("CELEBRATE TINYMLPERF DEMO COMPLETE!")
print("=" * 80)
print("\n🏆 TinyMLPerf Competition Ready:")
print("PASS Three exciting events: MLP Sprint, CNN Marathon, Transformer Decathlon")
print("PASS TinyTorch Module 15 profiler integration for rigorous benchmarking")
print("PASS Hardware-independent relative scoring (speedup ratios)")
print("PASS Transparent leaderboards with evidence requirements")
print("PASS Simplified innovation detection and creativity rewards")
print("PASS Three leaderboard types: speed, innovation, and composite scoring")
print("\nROCKET Competition Features:")
print("• Standardized benchmark models and datasets")
print("• Statistical reliability with multiple timing runs")
print("• Multiple leaderboard categories with simple keyword detection")
print("• GitHub integration for transparency and reproducibility")
print("• Focused classes with single responsibilities")
print("\nTARGET Ready to Compete:")
print("1. Optimize your models using techniques from Modules 16-19")
print("2. Submit to TinyMLPerf events using competition.submit_entry()")
print("3. See your results on speed, innovation, or composite leaderboards")
print("4. Iterate and improve based on performance feedback")
print("5. Prove your ML systems optimization mastery!")
return {
'benchmark_suite': benchmark_suite,
'profiler': competition_profiler,
'basic_competition': basic_competition,
'simplified_competition': simplified_competition
}
# %% [markdown]
"""
## Systems Analysis Summary
This simplified TinyMLPerf competition module demonstrates advanced ML systems engineering through streamlined competitive benchmarking:
### 🏗️ **Simplified Competition Infrastructure**
- **Focused Classes**: Each class has a single responsibility - submission, storage, leaderboard, or innovation detection
- **Clear Separation of Concerns**: CompetitionSubmission, CompetitionStorage, CompetitionLeaderboard, and SimpleInnovationDetector work together
- **Consistent API**: Single parameterized leaderboard method replaces three separate implementations
- **Student-Friendly**: Reduced cognitive load while maintaining all essential functionality
### SPEED **Streamlined Performance Optimization**
- **Single Leaderboard Interface**: One method with sort_by parameter ('speed', 'innovation', 'composite') replaces complex multiple methods
- **Simple Innovation Detection**: Basic keyword matching replaces complex pattern analysis and model introspection
- **Consistent Formatting**: Centralized header templates ensure visual consistency across all leaderboard types
- **Clear Error Messages**: Student-friendly guidance when events are not recognized
### 📊 **Simplified Competition Analysis**
- **TinyTorch Profiler Integration**: Unchanged - still leverages Module 15's profiling infrastructure
- **Progressive Feature Introduction**: Students can focus on speed first, then add innovation scoring
- **Visual Clarity**: Clear section headers and spacing prevent information overload
- **Focused Testing**: Each test function validates one specific capability
### TIP **Educational Improvements**
- **Reduced Complexity**: Eliminated 100+ line classes in favor of focused 20-30 line classes
- **Better Mental Models**: Students understand leaderboard concepts instead of getting lost in implementation details
- **Maintainable Code**: Consistent patterns and centralized formatting make code easier to debug and extend
- **KISS Principle**: Keep It Simple, Stupid - core pedagogical value preserved with implementation complexity reduced
### TARGET **Key Learning Objectives Maintained**
- Competition still accelerates optimization learning through concrete performance measurements
- Hardware-independent scoring ensures fair comparison across different development environments
- Multiple leaderboard types prevent single-metric tunnel vision
- Evidence requirements teach reproducibility and honest performance reporting
### 🏆 **Professional Development**
The simplified framework teaches students that good software engineering means:
- Breaking large classes into focused components
- Choosing clear, consistent APIs over feature proliferation
- Prioritizing readability and maintainability
- Making complex systems accessible without losing functionality
This refactored competition framework proves that educational software can be both pedagogically effective AND well-engineered, setting a positive example for students about professional software development practices.
"""
# %% [markdown]
"""
## Main Execution Block
Run the complete TinyMLPerf competition system when this module is executed directly.
"""
# %%
if __name__ == "__main__":
print("Module 20: TinyMLPerf - The Ultimate ML Systems Competition")
print("=" * 80)
# Run complete TinyMLPerf demonstration
results = run_complete_tinymlperf_demo()
print(f"\nCELEBRATE Module 20 complete!")
print(f"🏆 TinyMLPerf competition infrastructure ready!")
print(f"ROCKET Time to optimize your models and climb the leaderboards!")
# %% [markdown]
"""
## THINK ML Systems Thinking: Interactive Questions
1. **Why is separation of concerns crucial in competition software architecture?** Your refactored TinyMLPerf breaks large classes into focused components: CompetitionSubmission, CompetitionStorage, CompetitionLeaderboard, and SimpleInnovationDetector. Explain why this modular design is essential for educational software and how it teaches students professional software development practices beyond just ML systems concepts.
2. **How does simplifying innovation detection improve student learning outcomes?** You replaced complex pattern matching and model introspection with basic keyword detection. Analyze why reducing implementation complexity while preserving core functionality helps students focus on competition concepts rather than text processing algorithms, and how this reflects real-world engineering trade-offs.
3. **What makes single parameterized methods superior to multiple specialized methods?** Your leaderboard refactor replaced three separate methods (display_leaderboard, display_innovation_leaderboard, display_composite_leaderboard) with one configurable method. Explain why this API design choice reduces cognitive load while maintaining functionality, and how this principle applies to ML systems interfaces in production.
4. **How does consistent formatting contribute to system maintainability and user experience?** Your centralized header templates (LEADERBOARD_HEADER, INNOVATION_HEADER, COMPOSITE_HEADER) ensure visual consistency across all leaderboard displays. Analyze why standardized formatting matters in ML systems dashboards and monitoring tools, and how it prevents the user interface inconsistencies that plague many ML operations platforms.
"""
# %% [markdown]
"""
## TARGET MODULE SUMMARY: TinyMLPerf - Simplified Competition Framework
This refactored module demonstrates the power of the KISS principle in educational software design, proving that complex systems can be both pedagogically effective and professionally engineered.
### 🛤️ **The Simplification Journey**
- **Original Problem**: 600+ lines of complex, intertwined classes causing student cognitive overload
- **Solution Approach**: Break large classes into focused components with single responsibilities
- **Result**: Clean, maintainable code that teaches competition concepts without implementation distractions
### 🏗️ **Architecture Improvements**
- **CompetitionSubmission**: Focused on creating and validating individual submissions
- **CompetitionStorage**: Dedicated to saving and loading competition data
- **CompetitionLeaderboard**: Specialized for ranking and display with configurable sorting
- **SimpleInnovationDetector**: Basic keyword matching replacing complex pattern analysis
- **TinyMLPerfCompetition**: Orchestrates components with clean delegation patterns
### TARGET **Educational Excellence**
Students learn both ML systems concepts AND professional software engineering:
- **Modular Design**: How to break complex problems into manageable components
- **API Consistency**: Why parameterized methods beat specialized implementations
- **Code Maintainability**: How consistent formatting and clear separation of concerns prevent technical debt
- **KISS Principle**: That simplicity is the ultimate sophistication in software design
### 🏆 **Competition Integrity Maintained**
All essential functionality preserved with improved usability:
- Three competition events with standardized benchmarking
- Hardware-independent relative scoring for fair comparison
- Multiple leaderboard types (speed, innovation, composite) preventing tunnel vision
- Evidence requirements ensuring reproducible, honest performance claims
- Simple but effective innovation detection rewarding creative optimization
### TIP **Professional Development**
This refactor teaches students that excellent engineering means:
- Choosing clarity over clever complexity
- Building maintainable systems that others can understand and extend
- Designing APIs that guide users toward correct usage
- Making sophisticated functionality accessible without dumbing it down
**The ultimate lesson**: Great ML systems engineers build tools that make complex concepts simple to use, not simple concepts complex to understand. This competition framework exemplifies how educational software can teach both domain knowledge and engineering excellence simultaneously.
"""