mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-05-02 00:35:50 -05:00
- Added progressive complexity guidelines (Foundation/Intermediate/Advanced) - Added measurement function consolidation to prevent information overload - Fixed all diagnostic issues in losses_dev.py - Fixed markdown formatting across all modules - Consolidated redundant analysis functions in foundation modules - Fixed syntax errors and unused variables - Ensured all educational content is in proper markdown cells for Jupyter
1700 lines
72 KiB
Python
1700 lines
72 KiB
Python
# %% [markdown]
|
||
"""
|
||
# Module 20: TinyMLPerf - The Ultimate ML Systems Competition
|
||
|
||
## Learning Objectives
|
||
By the end of this module, you will be able to:
|
||
|
||
1. **Build Competition Benchmarking Infrastructure**: Create standardized TinyMLPerf benchmark suite for fair competition
|
||
2. **Use Profiling Tools for Systematic Measurement**: Apply Module 15's profiler to measure real performance gains
|
||
3. **Compete Across Multiple Categories**: Optimize for speed, memory, model size, and innovation simultaneously
|
||
4. **Calculate Relative Performance Improvements**: Show speedup ratios independent of hardware differences
|
||
5. **Drive Innovation Through Competition**: Use competitive pressure to discover new optimization techniques
|
||
|
||
## The TinyMLPerf Vision
|
||
|
||
**Key Message**: Competition proves optimization mastery by measuring concrete performance improvements across all your TinyTorch implementations!
|
||
|
||
**The TinyMLPerf Journey:**
|
||
1. **Benchmark Suite**: Load standard models (MLP, CNN, Transformer) as competition workloads
|
||
2. **Profiling Integration**: Use your Module 15 profiler for rigorous performance measurement
|
||
3. **Competition Categories**: Three exciting events - MLP Sprint, CNN Marathon, Transformer Decathlon
|
||
4. **Relative Scoring**: Hardware-independent speedup measurements (3x faster = 3.0 score)
|
||
5. **Leaderboard Glory**: Track innovations and celebrate optimization achievements
|
||
"""
|
||
|
||
# %%
|
||
#| default_exp utils.benchmark
|
||
|
||
import time
|
||
import json
|
||
import hashlib
|
||
import tracemalloc
|
||
from datetime import datetime
|
||
from pathlib import Path
|
||
from typing import Dict, Any, List, Optional, Tuple, Union, Callable
|
||
import numpy as np
|
||
import pickle
|
||
|
||
# Performance measurement constants
|
||
WEIGHT_INIT_SCALE = 0.1 # Xavier-style initialization scale for stable training
|
||
NUMERICAL_EPSILON = 1e-8 # Prevent division by zero in softmax calculations
|
||
DEFAULT_WARMUP_RUNS = 3 # Number of warmup runs to stabilize CPU caches
|
||
DEFAULT_TIMING_RUNS = 5 # Minimum runs for statistical reliability
|
||
DEFAULT_PROFILER_TIMING_RUNS = 10 # More thorough profiling for detailed analysis
|
||
|
||
# Model architecture constants (for standardized benchmarks)
|
||
MLP_INPUT_SIZE = 784 # Flattened 28x28 MNIST-like images
|
||
MLP_HIDDEN1_SIZE = 128 # First hidden layer size
|
||
MLP_HIDDEN2_SIZE = 64 # Second hidden layer size
|
||
MLP_OUTPUT_SIZE = 10 # Classification output classes
|
||
|
||
CNN_CONV1_FILTERS = 32 # First convolution layer filters
|
||
CNN_CONV2_FILTERS = 64 # Second convolution layer filters
|
||
CNN_KERNEL_SIZE = 3 # Convolution kernel size (3x3)
|
||
CNN_FC_INPUT_SIZE = 1600 # Flattened conv output size
|
||
|
||
TRANSFORMER_D_MODEL = 128 # Model embedding dimension
|
||
TRANSFORMER_N_HEADS = 8 # Number of attention heads
|
||
TRANSFORMER_SEQ_LEN = 64 # Maximum sequence length
|
||
TRANSFORMER_FF_RATIO = 4 # Feed-forward expansion ratio
|
||
|
||
# Competition scoring constants
|
||
SPEED_WEIGHT = 0.7 # Weight for speed in composite scoring
|
||
INNOVATION_WEIGHT = 0.3 # Weight for innovation in composite scoring
|
||
CREATIVITY_BONUS_THRESHOLD = 3 # Minimum techniques for creativity bonus
|
||
MAX_INNOVATION_SCORE = 1.0 # Maximum possible innovation score
|
||
|
||
# Leaderboard formatting templates
|
||
LEADERBOARD_HEADER = "{rank:<6} {team:<20} {speedup:<10} {time_ms:<12} {techniques:<25}"
|
||
INNOVATION_HEADER = "{rank:<6} {team:<20} {innovation:<12} {techniques:<8} {description:<25}"
|
||
COMPOSITE_HEADER = "{rank:<6} {team:<18} {composite:<11} {speed:<9} {innovation:<11} {techniques}"
|
||
|
||
# Simplified innovation pattern keywords (easier for students to understand)
|
||
OPTIMIZATION_KEYWORDS = {
|
||
'quantization': ['quantized', 'int8'], # Reduced precision computation
|
||
'pruning': ['pruned', 'sparse'], # Removing unnecessary weights
|
||
'distillation': ['distilled', 'teacher'], # Knowledge transfer
|
||
'custom_kernels': ['custom_kernel', 'cuda', 'vectorized'], # Hardware optimization
|
||
'memory_optimization': ['memory_pool', 'in_place'], # Memory efficiency
|
||
'compression': ['compressed', 'weight_sharing'] # Model compression
|
||
}
|
||
|
||
# Import TinyTorch profiler from Module 15
|
||
def _check_profiler_availability():
|
||
"""Check if TinyTorch profiler is available and explain implications."""
|
||
try:
|
||
from tinytorch.utils.profiler import SimpleProfiler, profile_function
|
||
print("PASS TinyTorch profiler loaded - using advanced timing")
|
||
return True, SimpleProfiler, profile_function
|
||
except ImportError:
|
||
print("WARNING️ TinyTorch profiler not available")
|
||
print(" Make sure Module 15 (Profiling) is completed first")
|
||
print(" Using basic timing as fallback")
|
||
return False, None, None
|
||
|
||
HAS_PROFILER, SimpleProfiler, profile_function = _check_profiler_availability()
|
||
|
||
# %% [markdown]
|
||
"""
|
||
## Part 1: Understanding Benchmarking Fundamentals
|
||
|
||
Before diving into the full competition, let's understand the core concepts step by step.
|
||
"""
|
||
|
||
# %%
|
||
def simple_timing_demo():
|
||
"""TARGET Learning Checkpoint 1: Basic Performance Measurement
|
||
|
||
Understand why we need systematic timing for fair comparison.
|
||
"""
|
||
print("MAGNIFY Learning Checkpoint 1: Basic Performance Measurement")
|
||
print("=" * 60)
|
||
|
||
# Simple function to time
|
||
def slow_matrix_multiply(a, b):
|
||
"""Naive matrix multiplication - intentionally slow"""
|
||
result = np.zeros((a.shape[0], b.shape[1]))
|
||
for i in range(a.shape[0]):
|
||
for j in range(b.shape[1]):
|
||
for k in range(a.shape[1]):
|
||
result[i, j] += a[i, k] * b[k, j]
|
||
return result
|
||
|
||
def fast_matrix_multiply(a, b):
|
||
"""Optimized matrix multiplication using NumPy"""
|
||
return np.dot(a, b)
|
||
|
||
# Create test matrices
|
||
test_size = 50
|
||
matrix_a = np.random.randn(test_size, test_size).astype(np.float32)
|
||
matrix_b = np.random.randn(test_size, test_size).astype(np.float32)
|
||
|
||
print(f"📊 Timing matrix multiplication ({test_size}x{test_size})...")
|
||
|
||
# Time the slow version
|
||
start = time.perf_counter()
|
||
slow_result = slow_matrix_multiply(matrix_a, matrix_b)
|
||
slow_time = time.perf_counter() - start
|
||
|
||
# Time the fast version
|
||
start = time.perf_counter()
|
||
fast_result = fast_matrix_multiply(matrix_a, matrix_b)
|
||
fast_time = time.perf_counter() - start
|
||
|
||
# Calculate speedup
|
||
speedup = slow_time / fast_time
|
||
|
||
print(f" Slow version: {slow_time*1000:.2f} ms")
|
||
print(f" Fast version: {fast_time*1000:.2f} ms")
|
||
print(f" ROCKET Speedup: {speedup:.2f}x faster")
|
||
|
||
print(f"\nTIP Key Insight: Optimization can provide dramatic speedups!")
|
||
print(f" This is why we need systematic benchmarking to measure improvements.")
|
||
|
||
return {'slow_time': slow_time, 'fast_time': fast_time, 'speedup': speedup}
|
||
|
||
def statistical_timing_demo():
|
||
"""TARGET Learning Checkpoint 2: Why We Need Multiple Runs
|
||
|
||
Understand timing variability and the need for statistical reliability.
|
||
"""
|
||
print("\nMAGNIFY Learning Checkpoint 2: Statistical Timing Reliability")
|
||
print("=" * 60)
|
||
|
||
# Simple operation to time
|
||
def simple_operation(x):
|
||
return np.sum(x ** 2)
|
||
|
||
test_data = np.random.randn(10000).astype(np.float32)
|
||
|
||
print(f"📊 Measuring timing variability with {DEFAULT_TIMING_RUNS} runs...")
|
||
|
||
# Single timing run
|
||
start = time.perf_counter()
|
||
_ = simple_operation(test_data)
|
||
single_time = time.perf_counter() - start
|
||
|
||
# Multiple timing runs
|
||
times = []
|
||
for run in range(DEFAULT_TIMING_RUNS):
|
||
start = time.perf_counter()
|
||
_ = simple_operation(test_data)
|
||
end = time.perf_counter()
|
||
times.append(end - start)
|
||
|
||
mean_time = np.mean(times)
|
||
std_time = np.std(times)
|
||
min_time = np.min(times)
|
||
max_time = np.max(times)
|
||
|
||
print(f" Single run: {single_time*1000:.2f} ms")
|
||
print(f" Mean time: {mean_time*1000:.2f} ± {std_time*1000:.2f} ms")
|
||
print(f" Range: {min_time*1000:.2f} - {max_time*1000:.2f} ms")
|
||
|
||
variability = (std_time / mean_time) * 100
|
||
print(f" PROGRESS Variability: {variability:.1f}% coefficient of variation")
|
||
|
||
print(f"\nTIP Key Insight: Single measurements are unreliable!")
|
||
print(f" We need {DEFAULT_TIMING_RUNS}+ runs with warmup for statistical reliability.")
|
||
|
||
return {'times': times, 'mean': mean_time, 'std': std_time}
|
||
|
||
def benchmark_model_demo():
|
||
"""TARGET Learning Checkpoint 3: Model Benchmarking Basics
|
||
|
||
Understand how to benchmark ML models specifically.
|
||
"""
|
||
print("\nMAGNIFY Learning Checkpoint 3: ML Model Benchmarking")
|
||
print("=" * 60)
|
||
|
||
# Simple model for demonstration
|
||
class SimpleModel:
|
||
def __init__(self, size):
|
||
self.weights = np.random.randn(size, size).astype(np.float32) * 0.1
|
||
|
||
def predict(self, x):
|
||
return x @ self.weights
|
||
|
||
# Create models of different sizes
|
||
small_model = SimpleModel(64)
|
||
large_model = SimpleModel(256)
|
||
|
||
# Test data
|
||
batch_size = 100
|
||
small_data = np.random.randn(batch_size, 64).astype(np.float32)
|
||
large_data = np.random.randn(batch_size, 256).astype(np.float32)
|
||
|
||
print(f"📊 Comparing model sizes...")
|
||
|
||
# Benchmark small model
|
||
times = []
|
||
for _ in range(DEFAULT_TIMING_RUNS):
|
||
start = time.perf_counter()
|
||
_ = small_model.predict(small_data)
|
||
times.append(time.perf_counter() - start)
|
||
small_time = np.mean(times)
|
||
|
||
# Benchmark large model
|
||
times = []
|
||
for _ in range(DEFAULT_TIMING_RUNS):
|
||
start = time.perf_counter()
|
||
_ = large_model.predict(large_data)
|
||
times.append(time.perf_counter() - start)
|
||
large_time = np.mean(times)
|
||
|
||
print(f" Small model (64): {small_time*1000:.2f} ms")
|
||
print(f" Large model (256): {large_time*1000:.2f} ms")
|
||
print(f" 🔢 Size ratio: {256/64:.0f}x parameters")
|
||
print(f" ⏱️ Time ratio: {large_time/small_time:.1f}x slower")
|
||
|
||
print(f"\nTIP Key Insight: Model complexity directly affects inference time!")
|
||
print(f" This is why standardized models are crucial for fair competition.")
|
||
|
||
return {'small_time': small_time, 'large_time': large_time}
|
||
|
||
# %%
|
||
def run_learning_checkpoints():
|
||
"""Run all learning checkpoints to build understanding progressively"""
|
||
print("🎓 TinyMLPerf Learning Journey")
|
||
print("=" * 80)
|
||
print("Building understanding step by step...\n")
|
||
|
||
# Checkpoint 1: Basic timing
|
||
timing_results = simple_timing_demo()
|
||
|
||
# Checkpoint 2: Statistical reliability
|
||
stats_results = statistical_timing_demo()
|
||
|
||
# Checkpoint 3: Model benchmarking
|
||
model_results = benchmark_model_demo()
|
||
|
||
print("\n" + "=" * 80)
|
||
print("CELEBRATE Learning checkpoints complete! Ready for TinyMLPerf competition.")
|
||
print("=" * 80)
|
||
|
||
return {
|
||
'timing': timing_results,
|
||
'statistics': stats_results,
|
||
'models': model_results
|
||
}
|
||
|
||
# %% [markdown]
|
||
"""
|
||
### Test Learning Checkpoints
|
||
|
||
Let's run the learning checkpoints to build understanding progressively.
|
||
"""
|
||
|
||
# %%
|
||
def test_learning_checkpoints():
|
||
"""Test the learning checkpoint system"""
|
||
print("Testing learning checkpoints...")
|
||
results = run_learning_checkpoints()
|
||
print("\nPASS Learning checkpoints test complete!")
|
||
return results
|
||
|
||
# %% [markdown]
|
||
"""
|
||
## Part 2: TinyMLPerf Benchmark Suite - Standard Competition Models
|
||
|
||
Now that we understand the fundamentals, let's build the TinyMLPerf benchmark suite with three exciting competition events using standard models.
|
||
"""
|
||
|
||
# Standard benchmark models for TinyMLPerf competition events
|
||
class MLPBenchmark:
|
||
"""Standard MLP model for TinyMLPerf sprint event.
|
||
|
||
Simple 3-layer feedforward network optimized for speed competitions.
|
||
Students will optimize this architecture for fastest inference.
|
||
"""
|
||
|
||
def __init__(self):
|
||
"""Initialize MLP with standard architecture using named constants."""
|
||
# Layer 1: Input -> Hidden1 (flattened MNIST-like input)
|
||
self.layer1_weights = np.random.randn(MLP_INPUT_SIZE, MLP_HIDDEN1_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
self.layer1_bias = np.random.randn(MLP_HIDDEN1_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
|
||
# Layer 2: Hidden1 -> Hidden2
|
||
self.layer2_weights = np.random.randn(MLP_HIDDEN1_SIZE, MLP_HIDDEN2_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
self.layer2_bias = np.random.randn(MLP_HIDDEN2_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
|
||
# Layer 3: Hidden2 -> Output (classification)
|
||
self.layer3_weights = np.random.randn(MLP_HIDDEN2_SIZE, MLP_OUTPUT_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
self.layer3_bias = np.random.randn(MLP_OUTPUT_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
|
||
def forward(self, x):
|
||
"""Forward pass through 3-layer MLP with ReLU activations."""
|
||
# Layer 1: Input -> Hidden1 with ReLU
|
||
hidden1 = np.maximum(0, x @ self.layer1_weights + self.layer1_bias)
|
||
|
||
# Layer 2: Hidden1 -> Hidden2 with ReLU
|
||
hidden2 = np.maximum(0, hidden1 @ self.layer2_weights + self.layer2_bias)
|
||
|
||
# Layer 3: Hidden2 -> Output (no activation)
|
||
output = hidden2 @ self.layer3_weights + self.layer3_bias
|
||
return output
|
||
|
||
def predict(self, x):
|
||
"""Prediction interface for benchmarking."""
|
||
return self.forward(x)
|
||
|
||
|
||
class CNNBenchmark:
|
||
"""Standard CNN model for TinyMLPerf marathon event.
|
||
|
||
Simplified convolutional network for image processing competitions.
|
||
Students will optimize convolution operations and memory access patterns.
|
||
"""
|
||
|
||
def __init__(self):
|
||
"""Initialize CNN with simplified architecture using named constants."""
|
||
# Simplified CNN weights (real CNN would need proper conv operations)
|
||
self.conv1_filters = np.random.randn(CNN_KERNEL_SIZE, CNN_KERNEL_SIZE, 1, CNN_CONV1_FILTERS).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
self.conv2_filters = np.random.randn(CNN_KERNEL_SIZE, CNN_KERNEL_SIZE, CNN_CONV1_FILTERS, CNN_CONV2_FILTERS).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
|
||
# Fully connected layer after convolution + pooling
|
||
self.fc_weights = np.random.randn(CNN_FC_INPUT_SIZE, MLP_OUTPUT_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
self.fc_bias = np.random.randn(MLP_OUTPUT_SIZE).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
|
||
def forward(self, x):
|
||
"""Forward pass through simplified CNN.
|
||
|
||
Note: This is a simplified version. Students will implement
|
||
real convolution operations for optimization.
|
||
"""
|
||
batch_size = x.shape[0]
|
||
|
||
# Simulate conv + pooling by flattening and projecting
|
||
x_flattened = x.reshape(batch_size, -1)
|
||
|
||
# Ensure correct input size (pad or truncate as needed)
|
||
if x_flattened.shape[1] != CNN_FC_INPUT_SIZE:
|
||
if x_flattened.shape[1] > CNN_FC_INPUT_SIZE:
|
||
x_flattened = x_flattened[:, :CNN_FC_INPUT_SIZE]
|
||
else:
|
||
padding = ((0, 0), (0, CNN_FC_INPUT_SIZE - x_flattened.shape[1]))
|
||
x_flattened = np.pad(x_flattened, padding, 'constant')
|
||
|
||
# Final classification layer
|
||
output = x_flattened @ self.fc_weights + self.fc_bias
|
||
return output
|
||
|
||
def predict(self, x):
|
||
"""Prediction interface for benchmarking."""
|
||
return self.forward(x)
|
||
|
||
|
||
class TransformerBenchmark:
|
||
"""Standard Transformer model for TinyMLPerf decathlon event.
|
||
|
||
Simplified attention-based model for sequence processing competitions.
|
||
Students will optimize attention mechanisms and memory usage.
|
||
"""
|
||
|
||
def __init__(self, d_model=TRANSFORMER_D_MODEL, n_heads=TRANSFORMER_N_HEADS, seq_len=TRANSFORMER_SEQ_LEN):
|
||
"""Initialize Transformer with standard attention architecture using named constants.
|
||
|
||
Args:
|
||
d_model: Model dimension (embedding size) - default from TRANSFORMER_D_MODEL
|
||
n_heads: Number of attention heads - default from TRANSFORMER_N_HEADS
|
||
seq_len: Maximum sequence length - default from TRANSFORMER_SEQ_LEN
|
||
"""
|
||
self.d_model = d_model
|
||
self.n_heads = n_heads
|
||
self.seq_len = seq_len
|
||
self.head_dim = d_model // n_heads
|
||
|
||
# Multi-head attention weights (clearer naming)
|
||
self.query_weights = np.random.randn(d_model, d_model).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
self.key_weights = np.random.randn(d_model, d_model).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
self.value_weights = np.random.randn(d_model, d_model).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
self.output_weights = np.random.randn(d_model, d_model).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
|
||
# Feed forward network weights (using standard 4x expansion ratio)
|
||
ff_dim = d_model * TRANSFORMER_FF_RATIO
|
||
self.feedforward_layer1 = np.random.randn(d_model, ff_dim).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
self.feedforward_layer2 = np.random.randn(ff_dim, d_model).astype(np.float32) * WEIGHT_INIT_SCALE
|
||
|
||
def forward(self, x):
|
||
"""Forward pass through simplified transformer block.
|
||
|
||
Note: This is a simplified version. Students will implement
|
||
real multi-head attention for optimization.
|
||
"""
|
||
batch_size, seq_len, d_model = x.shape
|
||
|
||
# Self-attention computation (simplified single-head)
|
||
queries = x @ self.query_weights # [batch, seq, d_model]
|
||
keys = x @ self.key_weights
|
||
values = x @ self.value_weights
|
||
|
||
# Attention scores with proper scaling
|
||
attention_scores = queries @ keys.transpose(0, 2, 1) / np.sqrt(d_model)
|
||
|
||
# Softmax with numerical stability
|
||
exp_scores = np.exp(attention_scores - np.max(attention_scores, axis=-1, keepdims=True))
|
||
attention_weights = exp_scores / (np.sum(exp_scores, axis=-1, keepdims=True) + NUMERICAL_EPSILON)
|
||
|
||
# Apply attention to values
|
||
attention_output = attention_weights @ values # [batch, seq, d_model]
|
||
|
||
# Residual connection + layer norm (simplified)
|
||
attention_output = attention_output + x
|
||
|
||
# Feed forward network
|
||
ff_intermediate = np.maximum(0, attention_output @ self.feedforward_layer1) # ReLU
|
||
ff_output = ff_intermediate @ self.feedforward_layer2
|
||
|
||
# Another residual connection
|
||
final_output = ff_output + attention_output
|
||
|
||
# Global average pooling for classification
|
||
return np.mean(final_output, axis=1) # [batch, d_model]
|
||
|
||
def predict(self, x):
|
||
"""Prediction interface for benchmarking."""
|
||
return self.forward(x)
|
||
|
||
# %%
|
||
class TinyMLPerf:
|
||
"""
|
||
TinyMLPerf benchmark suite - The Olympics of ML Systems Optimization!
|
||
|
||
Provides three standard competition events:
|
||
- MLP Sprint: Fastest feedforward inference
|
||
- CNN Marathon: Efficient convolution operations
|
||
- Transformer Decathlon: Complete attention-based model performance
|
||
|
||
Each event uses standardized models and datasets for fair competition.
|
||
"""
|
||
|
||
def __init__(self, profiler_warmup_runs: int = DEFAULT_WARMUP_RUNS,
|
||
profiler_timing_runs: int = DEFAULT_PROFILER_TIMING_RUNS):
|
||
"""
|
||
Initialize TinyMLPerf benchmark suite.
|
||
|
||
Args:
|
||
profiler_warmup_runs: Number of warmup runs for stable measurements
|
||
profiler_timing_runs: Number of timing runs for statistical reliability
|
||
"""
|
||
self.warmup_runs = profiler_warmup_runs
|
||
self.timing_runs = profiler_timing_runs
|
||
self.benchmark_models = {}
|
||
self.benchmark_datasets = {}
|
||
|
||
print("🏆 TinyMLPerf Competition Suite Initialized!")
|
||
print("TARGET Three Events: MLP Sprint, CNN Marathon, Transformer Decathlon")
|
||
|
||
# Load standard benchmark models
|
||
self._load_benchmark_models()
|
||
self._load_benchmark_datasets()
|
||
|
||
def _load_benchmark_models(self):
|
||
"""Load standard benchmark models for each competition event"""
|
||
print("📥 Loading TinyMLPerf Benchmark Models...")
|
||
|
||
# Create instances of the standardized benchmark models
|
||
self.benchmark_models = {
|
||
'mlp_sprint': MLPBenchmark(),
|
||
'cnn_marathon': CNNBenchmark(),
|
||
'transformer_decathlon': TransformerBenchmark()
|
||
}
|
||
|
||
print("PASS Benchmark models loaded successfully!")
|
||
for event, model in self.benchmark_models.items():
|
||
print(f" 📋 {event.replace('_', ' ').title()}: {type(model).__name__}")
|
||
|
||
def _load_benchmark_datasets(self):
|
||
"""Load standard benchmark datasets for each competition event"""
|
||
print("📊 Loading TinyMLPerf Benchmark Datasets...")
|
||
|
||
# MLP Sprint dataset - MNIST-like flattened images
|
||
mlp_batch_size = 100
|
||
mlp_data = {
|
||
'inputs': np.random.randn(mlp_batch_size, MLP_INPUT_SIZE).astype(np.float32), # Batch of samples
|
||
'targets': np.eye(MLP_OUTPUT_SIZE)[np.random.randint(0, MLP_OUTPUT_SIZE, mlp_batch_size)], # One-hot labels
|
||
'event': 'MLP Sprint',
|
||
'description': 'Feedforward inference on flattened 28x28 images'
|
||
}
|
||
|
||
# CNN Marathon dataset - Image-like data
|
||
cnn_batch_size = 50
|
||
cnn_image_size = 28 # 28x28 standard image size
|
||
cnn_data = {
|
||
'inputs': np.random.randn(cnn_batch_size, cnn_image_size, cnn_image_size, 1).astype(np.float32), # Batch of images
|
||
'targets': np.eye(MLP_OUTPUT_SIZE)[np.random.randint(0, MLP_OUTPUT_SIZE, cnn_batch_size)],
|
||
'event': 'CNN Marathon',
|
||
'description': 'Convolutional inference on 28x28x1 images'
|
||
}
|
||
|
||
# Transformer Decathlon dataset - Sequence data
|
||
transformer_batch_size = 32
|
||
transformer_data = {
|
||
'inputs': np.random.randn(transformer_batch_size, TRANSFORMER_SEQ_LEN, TRANSFORMER_D_MODEL).astype(np.float32), # Batch of sequences
|
||
'targets': np.eye(MLP_OUTPUT_SIZE)[np.random.randint(0, MLP_OUTPUT_SIZE, transformer_batch_size)],
|
||
'event': 'Transformer Decathlon',
|
||
'description': 'Self-attention inference on 64-token sequences'
|
||
}
|
||
|
||
self.benchmark_datasets = {
|
||
'mlp_sprint': mlp_data,
|
||
'cnn_marathon': cnn_data,
|
||
'transformer_decathlon': transformer_data
|
||
}
|
||
|
||
print("PASS Benchmark datasets loaded successfully!")
|
||
for event, data in self.benchmark_datasets.items():
|
||
print(f" TARGET {data['event']}: {data['inputs'].shape} -> {data['targets'].shape}")
|
||
|
||
def load_benchmark(self, event_name: str) -> Tuple[Any, Dict[str, Any]]:
|
||
"""
|
||
Load a specific benchmark model and dataset.
|
||
|
||
Args:
|
||
event_name: Name of competition event ('mlp_sprint', 'cnn_marathon', 'transformer_decathlon')
|
||
|
||
Returns:
|
||
Tuple of (model, dataset) for the specified event
|
||
"""
|
||
if event_name not in self.benchmark_models:
|
||
available = list(self.benchmark_models.keys())
|
||
raise ValueError(f"Event '{event_name}' not found. Available: {available}")
|
||
|
||
model = self.benchmark_models[event_name]
|
||
dataset = self.benchmark_datasets[event_name]
|
||
|
||
print(f"📋 Loaded benchmark: {dataset['event']}")
|
||
print(f" Model: {type(model).__name__}")
|
||
print(f" Data: {dataset['description']}")
|
||
|
||
return model, dataset
|
||
|
||
def get_available_events(self) -> Dict[str, str]:
|
||
"""Get list of available competition events with descriptions"""
|
||
return {
|
||
'mlp_sprint': 'Fastest feedforward neural network inference',
|
||
'cnn_marathon': 'Efficient convolutional neural network processing',
|
||
'transformer_decathlon': 'Complete attention mechanism optimization'
|
||
}
|
||
|
||
# %% [markdown]
|
||
"""
|
||
### Test TinyMLPerf Benchmark Suite
|
||
|
||
Let's test the benchmark suite to ensure all models and datasets load correctly.
|
||
"""
|
||
|
||
# %%
|
||
def test_tinymlperf_benchmark_suite():
|
||
"""Test the TinyMLPerf benchmark suite"""
|
||
print("Testing TinyMLPerf Benchmark Suite...")
|
||
|
||
# Initialize benchmark suite
|
||
benchmark_suite = TinyMLPerf(profiler_warmup_runs=2, profiler_timing_runs=3)
|
||
|
||
# Test each event
|
||
events = benchmark_suite.get_available_events()
|
||
print(f"\n🏆 Available Events: {len(events)}")
|
||
|
||
for event_name, description in events.items():
|
||
print(f"\n📋 Testing {event_name}...")
|
||
model, dataset = benchmark_suite.load_benchmark(event_name)
|
||
|
||
# Test model inference
|
||
inputs = dataset['inputs']
|
||
outputs = model.predict(inputs)
|
||
|
||
print(f" PASS Inference successful: {inputs.shape} -> {outputs.shape}")
|
||
|
||
# Verify output shape makes sense
|
||
batch_size = inputs.shape[0]
|
||
assert outputs.shape[0] == batch_size, f"Batch size mismatch: {outputs.shape[0]} != {batch_size}"
|
||
print(f" PASS Output shape verified")
|
||
|
||
print(f"\nPASS TinyMLPerf benchmark suite test complete!")
|
||
return benchmark_suite
|
||
|
||
# %% [markdown]
|
||
"""
|
||
## Part 2: Performance Benchmarking Using Module 15's Profiler
|
||
|
||
Now let's build the core benchmarking infrastructure that uses the profiler from Module 15 to measure performance.
|
||
"""
|
||
|
||
# %%
|
||
class CompetitionProfiler:
|
||
"""
|
||
Competition profiling infrastructure using TinyTorch's Module 15 profiler.
|
||
|
||
Provides rigorous performance measurement for fair competition by:
|
||
- Using standardized profiling from Module 15
|
||
- Multiple timing runs with statistical analysis
|
||
- Memory usage tracking and analysis
|
||
- Hardware-independent relative scoring
|
||
"""
|
||
|
||
def __init__(self, warmup_runs: int = DEFAULT_WARMUP_RUNS,
|
||
timing_runs: int = DEFAULT_PROFILER_TIMING_RUNS):
|
||
"""
|
||
Initialize competition profiler.
|
||
|
||
Args:
|
||
warmup_runs: Number of warmup runs to stabilize performance
|
||
timing_runs: Number of timing runs for statistical reliability
|
||
"""
|
||
self.warmup_runs = warmup_runs
|
||
self.timing_runs = timing_runs
|
||
self.has_profiler = HAS_PROFILER
|
||
|
||
if not self.has_profiler:
|
||
print("WARNING️ Warning: Advanced profiling unavailable, using basic timing")
|
||
else:
|
||
print("PASS Using TinyTorch Module 15 profiler for advanced metrics")
|
||
|
||
def benchmark_model(self, model, dataset: Dict[str, Any]) -> Dict[str, Any]:
|
||
"""
|
||
Benchmark a model using rigorous profiling methodology.
|
||
|
||
Args:
|
||
model: Model to benchmark (must have predict() or forward() method)
|
||
dataset: Dataset dictionary with 'inputs' key
|
||
|
||
Returns:
|
||
Comprehensive benchmarking results with performance metrics
|
||
"""
|
||
print(f"🏁 Benchmarking {dataset.get('event', 'Model')}...")
|
||
|
||
inputs = dataset['inputs']
|
||
results = {
|
||
'event': dataset.get('event', 'Unknown'),
|
||
'model_type': type(model).__name__,
|
||
'input_shape': inputs.shape,
|
||
'benchmark_timestamp': datetime.now().isoformat()
|
||
}
|
||
|
||
if self.has_profiler:
|
||
# Use advanced profiling from Module 15
|
||
results.update(self._profile_with_tinytorch_profiler(model, inputs))
|
||
else:
|
||
# Fallback to basic timing
|
||
results.update(self._profile_basic_timing(model, inputs))
|
||
|
||
self._print_benchmark_results(results)
|
||
return results
|
||
|
||
def quick_benchmark(self, model, dataset: Dict[str, Any]) -> float:
|
||
"""
|
||
Simple benchmarking returning just the mean inference time.
|
||
|
||
This is a simplified interface for students who just want basic timing.
|
||
|
||
Args:
|
||
model: Model to benchmark
|
||
dataset: Dataset dictionary with 'inputs' key
|
||
|
||
Returns:
|
||
Mean inference time in seconds
|
||
"""
|
||
results = self._run_basic_profiling(model, dataset['inputs'])
|
||
return results['mean_inference_time']
|
||
|
||
def compare_models(self, model, baseline_model, dataset: Dict[str, Any]) -> Dict[str, Any]:
|
||
"""
|
||
Compare two models directly with simplified interface.
|
||
|
||
Args:
|
||
model: Optimized model to test
|
||
baseline_model: Baseline model for comparison
|
||
dataset: Dataset dictionary with 'inputs' key
|
||
|
||
Returns:
|
||
Comparison results with speedup information
|
||
"""
|
||
print(f"🏁 Comparing models for {dataset.get('event', 'Model')}...")
|
||
|
||
# Benchmark both models
|
||
baseline_results = self._run_basic_profiling(baseline_model, dataset['inputs'])
|
||
model_results = self._run_basic_profiling(model, dataset['inputs'])
|
||
|
||
# Calculate speedup
|
||
speedup = baseline_results['mean_inference_time'] / model_results['mean_inference_time']
|
||
|
||
comparison = {
|
||
'baseline_time': baseline_results['mean_inference_time'],
|
||
'optimized_time': model_results['mean_inference_time'],
|
||
'speedup': speedup,
|
||
'event': dataset.get('event', 'Unknown'),
|
||
'baseline_model': type(baseline_model).__name__,
|
||
'optimized_model': type(model).__name__
|
||
}
|
||
|
||
print(f"📊 Baseline: {comparison['baseline_time']*1000:.2f} ms")
|
||
print(f"📊 Optimized: {comparison['optimized_time']*1000:.2f} ms")
|
||
print(f"ROCKET Speedup: {speedup:.2f}x {'faster' if speedup > 1.0 else 'slower'}")
|
||
|
||
return comparison
|
||
|
||
def benchmark_with_baseline(self, model, dataset: Dict[str, Any], baseline_time: float) -> Dict[str, Any]:
|
||
"""
|
||
Benchmark a model against a known baseline time.
|
||
|
||
Args:
|
||
model: Model to benchmark
|
||
dataset: Dataset dictionary with 'inputs' key
|
||
baseline_time: Baseline time in seconds for speedup calculation
|
||
|
||
Returns:
|
||
Benchmark results with speedup calculation
|
||
"""
|
||
results = self.benchmark_model(model, dataset)
|
||
speedup = baseline_time / results['mean_inference_time']
|
||
results['speedup_vs_baseline'] = speedup
|
||
|
||
print(f"ROCKET Speedup vs baseline: {speedup:.2f}x {'faster' if speedup > 1.0 else 'slower'}")
|
||
return results
|
||
|
||
def _run_basic_profiling(self, model, inputs: np.ndarray) -> Dict[str, Any]:
|
||
"""
|
||
Run basic profiling without complex options.
|
||
|
||
This is used by simplified interfaces.
|
||
"""
|
||
if self.has_profiler:
|
||
return self._profile_with_tinytorch_profiler(model, inputs)
|
||
else:
|
||
return self._profile_basic_timing(model, inputs)
|
||
|
||
def _profile_with_tinytorch_profiler(self, model, inputs: np.ndarray) -> Dict[str, Any]:
|
||
"""Profile using Module 15's advanced profiler"""
|
||
profiler = SimpleProfiler(track_memory=True, track_cpu=True)
|
||
|
||
# Run profiling sessions
|
||
profile_results = self._run_profiling_sessions(profiler, model, inputs)
|
||
|
||
# Calculate statistics
|
||
return self._calculate_profiling_statistics(profile_results)
|
||
|
||
def _run_profiling_sessions(self, profiler, model, inputs: np.ndarray) -> List[Dict[str, Any]]:
|
||
"""Run multiple profiling sessions for statistical reliability."""
|
||
profile_results = []
|
||
|
||
for run in range(self.timing_runs):
|
||
# Each profiling session includes warmup
|
||
result = profiler.profile(
|
||
model.predict, inputs,
|
||
name=f"inference_run_{run}",
|
||
warmup=True # Profiler handles warmup
|
||
)
|
||
profile_results.append(result)
|
||
|
||
return profile_results
|
||
|
||
def _calculate_profiling_statistics(self, profile_results: List[Dict[str, Any]]) -> Dict[str, Any]:
|
||
"""Calculate timing and memory statistics from profile results."""
|
||
# Extract timing data
|
||
wall_times = [r['wall_time'] for r in profile_results]
|
||
cpu_times = [r['cpu_time'] for r in profile_results]
|
||
|
||
# Calculate timing statistics
|
||
timing_stats = {
|
||
'mean_inference_time': np.mean(wall_times),
|
||
'std_inference_time': np.std(wall_times),
|
||
'min_inference_time': np.min(wall_times),
|
||
'max_inference_time': np.max(wall_times),
|
||
'p95_inference_time': np.percentile(wall_times, 95),
|
||
'mean_cpu_time': np.mean(cpu_times),
|
||
'cpu_efficiency': np.mean([r['cpu_efficiency'] for r in profile_results]),
|
||
'profiling_method': 'TinyTorch Module 15 Profiler'
|
||
}
|
||
|
||
# Add memory statistics
|
||
memory_stats = self._extract_memory_statistics(profile_results)
|
||
timing_stats.update(memory_stats)
|
||
|
||
return timing_stats
|
||
|
||
def _extract_memory_statistics(self, profile_results: List[Dict[str, Any]]) -> Dict[str, Any]:
|
||
"""Extract memory statistics from profiling results."""
|
||
# Use last run as most representative
|
||
last_result = profile_results[-1]
|
||
memory_stats = {}
|
||
|
||
if 'memory_delta_mb' in last_result:
|
||
memory_stats.update({
|
||
'memory_delta_mb': last_result['memory_delta_mb'],
|
||
'peak_memory_mb': last_result['peak_memory_mb'],
|
||
'result_size_mb': last_result.get('result_size_mb', 0)
|
||
})
|
||
|
||
return memory_stats
|
||
|
||
def _profile_basic_timing(self, model, inputs: np.ndarray) -> Dict[str, Any]:
|
||
"""Fallback basic timing without advanced profiling"""
|
||
|
||
# Warmup runs
|
||
for _ in range(self.warmup_runs):
|
||
_ = model.predict(inputs)
|
||
|
||
# Timing runs
|
||
times = []
|
||
for _ in range(self.timing_runs):
|
||
start = time.perf_counter()
|
||
_ = model.predict(inputs)
|
||
end = time.perf_counter()
|
||
times.append(end - start)
|
||
|
||
return {
|
||
'mean_inference_time': np.mean(times),
|
||
'std_inference_time': np.std(times),
|
||
'min_inference_time': np.min(times),
|
||
'max_inference_time': np.max(times),
|
||
'p95_inference_time': np.percentile(times, 95),
|
||
'profiling_method': 'Basic Timing'
|
||
}
|
||
|
||
def _print_benchmark_results(self, results: Dict[str, Any]):
|
||
"""Print formatted benchmark results"""
|
||
print(f"\n📊 {results['event']} Benchmark Results:")
|
||
print(f" Model: {results['model_type']}")
|
||
print(f" Input: {results['input_shape']}")
|
||
print(f" Mean Time: {results['mean_inference_time']*1000:.2f} ± {results['std_inference_time']*1000:.2f} ms")
|
||
print(f" Best Time: {results['min_inference_time']*1000:.2f} ms")
|
||
print(f" P95 Time: {results['p95_inference_time']*1000:.2f} ms")
|
||
|
||
if 'speedup_vs_baseline' in results:
|
||
print(f" ROCKET Speedup: {results['speedup_vs_baseline']:.2f}x faster")
|
||
|
||
if 'memory_delta_mb' in results:
|
||
print(f" 💾 Memory: {results['memory_delta_mb']:.2f} MB delta, {results['peak_memory_mb']:.2f} MB peak")
|
||
|
||
print(f" 📏 Method: {results['profiling_method']}")
|
||
|
||
# %% [markdown]
|
||
"""
|
||
### Test Competition Profiler
|
||
|
||
Let's test the competition profiler with TinyMLPerf benchmark models.
|
||
"""
|
||
|
||
# %%
|
||
def test_competition_profiler():
|
||
"""Test the competition profiler with benchmark models"""
|
||
print("Testing Competition Profiler...")
|
||
|
||
# Initialize TinyMLPerf and profiler
|
||
benchmark_suite = TinyMLPerf(profiler_warmup_runs=2, profiler_timing_runs=3)
|
||
competition_profiler = CompetitionProfiler(warmup_runs=2, timing_runs=3)
|
||
|
||
# Test MLP Sprint profiling
|
||
mlp_model, mlp_dataset = benchmark_suite.load_benchmark('mlp_sprint')
|
||
mlp_results = competition_profiler.benchmark_model(mlp_model, mlp_dataset)
|
||
|
||
# Test CNN Marathon profiling
|
||
cnn_model, cnn_dataset = benchmark_suite.load_benchmark('cnn_marathon')
|
||
cnn_results = competition_profiler.benchmark_model(cnn_model, cnn_dataset)
|
||
|
||
# Test speedup calculation with baseline
|
||
print(f"\n🏃 Testing Speedup Calculation...")
|
||
cnn_speedup_results = competition_profiler.benchmark_with_baseline(
|
||
cnn_model, cnn_dataset,
|
||
baseline_time=mlp_results['mean_inference_time'] # Use MLP as baseline
|
||
)
|
||
|
||
print(f"\nPASS Competition profiler test complete!")
|
||
return competition_profiler, mlp_results, cnn_results
|
||
|
||
# %% [markdown]
|
||
"""
|
||
## Part 3: Simplified Competition Framework - Focused Leaderboards
|
||
|
||
Let's build a simplified competition framework with focused classes and clear responsibilities.
|
||
"""
|
||
|
||
# %%
|
||
class CompetitionSubmission:
|
||
"""Handles creation and validation of individual competition submissions."""
|
||
|
||
def __init__(self, team_name: str, event_name: str, optimized_model,
|
||
optimization_description: str = "", github_url: str = ""):
|
||
"""Create a competition submission."""
|
||
self.team_name = team_name
|
||
self.event_name = event_name
|
||
self.optimized_model = optimized_model
|
||
self.optimization_description = optimization_description
|
||
self.github_url = github_url
|
||
self.submission_id = self._generate_id()
|
||
self.timestamp = datetime.now().isoformat()
|
||
|
||
def _generate_id(self) -> str:
|
||
"""Generate unique submission ID."""
|
||
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
|
||
team_hash = hashlib.md5(self.team_name.encode()).hexdigest()[:6]
|
||
return f"{self.event_name}_{team_hash}_{timestamp}"
|
||
|
||
def to_dict(self) -> Dict[str, Any]:
|
||
"""Convert submission to dictionary for storage."""
|
||
return {
|
||
'submission_id': self.submission_id,
|
||
'timestamp': self.timestamp,
|
||
'team_name': self.team_name,
|
||
'event_name': self.event_name,
|
||
'optimization_description': self.optimization_description,
|
||
'github_url': self.github_url
|
||
}
|
||
|
||
class CompetitionStorage:
|
||
"""Handles saving and loading competition results."""
|
||
|
||
def __init__(self, results_dir: str = "tinymlperf_results"):
|
||
"""Initialize storage with results directory."""
|
||
self.results_dir = Path(results_dir)
|
||
self.results_dir.mkdir(exist_ok=True)
|
||
|
||
def save_submission(self, submission_data: Dict[str, Any]):
|
||
"""Save submission to storage."""
|
||
filename = f"{submission_data['submission_id']}.json"
|
||
filepath = self.results_dir / filename
|
||
|
||
with open(filepath, 'w') as f:
|
||
json.dump(submission_data, f, indent=2, default=str)
|
||
|
||
print(f"💾 Submission saved: {filepath}")
|
||
|
||
def load_event_submissions(self, event_name: str) -> List[Dict[str, Any]]:
|
||
"""Load all submissions for a specific event."""
|
||
submissions = []
|
||
|
||
for filepath in self.results_dir.glob(f"{event_name}_*.json"):
|
||
try:
|
||
with open(filepath, 'r') as f:
|
||
submission = json.load(f)
|
||
submissions.append(submission)
|
||
except Exception as e:
|
||
print(f"Warning: Could not load {filepath}: {e}")
|
||
|
||
return submissions
|
||
|
||
class SimpleInnovationDetector:
|
||
"""Simple innovation detection using basic keyword matching."""
|
||
|
||
def detect_techniques(self, description: str) -> List[str]:
|
||
"""Detect optimization techniques using simple keywords."""
|
||
description_lower = description.lower()
|
||
detected = []
|
||
|
||
for technique, keywords in OPTIMIZATION_KEYWORDS.items():
|
||
for keyword in keywords:
|
||
if keyword in description_lower:
|
||
detected.append(technique)
|
||
break # Only count each technique once
|
||
|
||
return detected
|
||
|
||
def calculate_innovation_score(self, detected_techniques: List[str]) -> float:
|
||
"""Calculate innovation score based on number of techniques."""
|
||
base_score = len(detected_techniques) * 0.2
|
||
# Bonus for multiple techniques
|
||
if len(detected_techniques) >= 3:
|
||
base_score += 0.3
|
||
return min(base_score, MAX_INNOVATION_SCORE)
|
||
|
||
class CompetitionLeaderboard:
|
||
"""Focused leaderboard display with configurable sorting."""
|
||
|
||
def __init__(self, storage: CompetitionStorage):
|
||
"""Initialize leaderboard with storage backend."""
|
||
self.storage = storage
|
||
self.innovation_detector = SimpleInnovationDetector()
|
||
|
||
def display_leaderboard(self, event_name: str, sort_by: str = 'speed', top_n: int = 10) -> List[Dict[str, Any]]:
|
||
"""Display leaderboard with configurable sorting.
|
||
|
||
Args:
|
||
event_name: Event to show leaderboard for
|
||
sort_by: 'speed', 'innovation', or 'composite'
|
||
top_n: Number of top entries to display
|
||
"""
|
||
submissions = self.storage.load_event_submissions(event_name)
|
||
|
||
if not submissions:
|
||
print(f"🏆 {event_name.replace('_', ' ').title()} Leaderboard ({sort_by.title()})")
|
||
print("No submissions yet! Be the first to compete!")
|
||
return []
|
||
|
||
# Add innovation scores if needed
|
||
if sort_by in ['innovation', 'composite']:
|
||
self._add_innovation_scores(submissions)
|
||
|
||
# Sort submissions
|
||
sorted_submissions = self._sort_submissions(submissions, sort_by)
|
||
top_submissions = sorted_submissions[:top_n]
|
||
|
||
# Display leaderboard
|
||
self._display_formatted_leaderboard(event_name, top_submissions, sort_by)
|
||
|
||
return top_submissions
|
||
|
||
def _add_innovation_scores(self, submissions: List[Dict[str, Any]]):
|
||
"""Add innovation scores to submissions that don't have them."""
|
||
for submission in submissions:
|
||
if 'innovation_score' not in submission:
|
||
techniques = self.innovation_detector.detect_techniques(
|
||
submission.get('optimization_description', '')
|
||
)
|
||
submission['detected_techniques'] = techniques
|
||
submission['innovation_score'] = self.innovation_detector.calculate_innovation_score(techniques)
|
||
|
||
# Calculate composite score if speedup exists
|
||
if 'speedup_score' in submission:
|
||
submission['composite_score'] = (
|
||
SPEED_WEIGHT * submission['speedup_score'] +
|
||
INNOVATION_WEIGHT * submission['innovation_score']
|
||
)
|
||
|
||
def _sort_submissions(self, submissions: List[Dict[str, Any]], sort_by: str) -> List[Dict[str, Any]]:
|
||
"""Sort submissions by specified criteria."""
|
||
if sort_by == 'speed':
|
||
return sorted(submissions, key=lambda s: s.get('speedup_score', 0), reverse=True)
|
||
elif sort_by == 'innovation':
|
||
return sorted(submissions, key=lambda s: s.get('innovation_score', 0), reverse=True)
|
||
elif sort_by == 'composite':
|
||
return sorted(submissions, key=lambda s: s.get('composite_score', 0), reverse=True)
|
||
else:
|
||
raise ValueError(f"Unknown sort type: {sort_by}")
|
||
|
||
def _display_formatted_leaderboard(self, event_name: str, submissions: List[Dict[str, Any]], sort_by: str):
|
||
"""Display formatted leaderboard based on sort type."""
|
||
print(f"\n🏆 TINYMLPERF LEADERBOARD - {event_name.replace('_', ' ').title()} ({sort_by.title()})")
|
||
print("=" * 80)
|
||
|
||
if sort_by == 'speed':
|
||
self._display_speed_leaderboard(submissions)
|
||
elif sort_by == 'innovation':
|
||
self._display_innovation_leaderboard(submissions)
|
||
elif sort_by == 'composite':
|
||
self._display_composite_leaderboard(submissions)
|
||
|
||
print("-" * 80)
|
||
print(f"Showing top {len(submissions)} submissions")
|
||
|
||
def _display_speed_leaderboard(self, submissions: List[Dict[str, Any]]):
|
||
"""Display speed-focused leaderboard."""
|
||
print(LEADERBOARD_HEADER.format(
|
||
rank="Rank", team="Team", speedup="Speedup", time_ms="Time (ms)", techniques="Techniques"
|
||
))
|
||
print("-" * 80)
|
||
|
||
for i, submission in enumerate(submissions):
|
||
rank = i + 1
|
||
team = submission['team_name'][:19]
|
||
speedup = f"{submission.get('speedup_score', 0):.2f}x"
|
||
time_ms = f"{submission.get('submission_time_ms', 0):.2f}"
|
||
techniques = submission.get('optimization_description', '')[:24]
|
||
|
||
print(LEADERBOARD_HEADER.format(
|
||
rank=rank, team=team, speedup=speedup, time_ms=time_ms, techniques=techniques
|
||
))
|
||
|
||
def _display_innovation_leaderboard(self, submissions: List[Dict[str, Any]]):
|
||
"""Display innovation-focused leaderboard."""
|
||
print(INNOVATION_HEADER.format(
|
||
rank="Rank", team="Team", innovation="Innovation", techniques="Tech#", description="Description"
|
||
))
|
||
print("-" * 80)
|
||
|
||
for i, submission in enumerate(submissions):
|
||
rank = i + 1
|
||
team = submission['team_name'][:19]
|
||
innovation = f"{submission.get('innovation_score', 0):.3f}"
|
||
num_tech = len(submission.get('detected_techniques', []))
|
||
description = submission.get('optimization_description', '')[:24]
|
||
|
||
print(INNOVATION_HEADER.format(
|
||
rank=rank, team=team, innovation=innovation, techniques=num_tech, description=description
|
||
))
|
||
|
||
def _display_composite_leaderboard(self, submissions: List[Dict[str, Any]]):
|
||
"""Display composite leaderboard."""
|
||
print(COMPOSITE_HEADER.format(
|
||
rank="Rank", team="Team", composite="Composite", speed="Speed", innovation="Innovation", techniques="Techniques"
|
||
))
|
||
print("-" * 80)
|
||
|
||
for i, submission in enumerate(submissions):
|
||
rank = i + 1
|
||
team = submission['team_name'][:17]
|
||
composite = f"{submission.get('composite_score', 0):.3f}"
|
||
speed = f"{submission.get('speedup_score', 0):.2f}x"
|
||
innovation = f"{submission.get('innovation_score', 0):.3f}"
|
||
techniques = ", ".join(submission.get('detected_techniques', [])[:2])[:15]
|
||
|
||
print(COMPOSITE_HEADER.format(
|
||
rank=rank, team=team, composite=composite, speed=speed, innovation=innovation, techniques=techniques
|
||
))
|
||
|
||
class TinyMLPerfCompetition:
|
||
"""
|
||
TinyMLPerf Competition Framework - The Olympics of ML Optimization!
|
||
|
||
Manages three exciting competition events:
|
||
- MLP Sprint: Fastest feedforward network
|
||
- CNN Marathon: Most efficient convolutions
|
||
- Transformer Decathlon: Ultimate attention optimization
|
||
|
||
Features hardware-independent relative scoring and transparent leaderboards.
|
||
"""
|
||
|
||
def __init__(self, results_dir: str = "tinymlperf_results"):
|
||
"""
|
||
Initialize TinyMLPerf competition.
|
||
|
||
Args:
|
||
results_dir: Directory to store competition results and leaderboards
|
||
"""
|
||
self.results_dir = Path(results_dir)
|
||
self.results_dir.mkdir(exist_ok=True)
|
||
|
||
self.tinyperf = TinyMLPerf()
|
||
self.profiler = CompetitionProfiler(warmup_runs=DEFAULT_WARMUP_RUNS,
|
||
timing_runs=DEFAULT_TIMING_RUNS)
|
||
|
||
# Initialize storage and leaderboard components
|
||
self.storage = CompetitionStorage(results_dir)
|
||
self.leaderboard = CompetitionLeaderboard(self.storage)
|
||
|
||
# Load baseline models for relative scoring
|
||
self.baselines = self._establish_baselines()
|
||
|
||
print("🏆 TinyMLPerf Competition Initialized!")
|
||
print("TARGET Three Events Ready for Competition!")
|
||
|
||
def _establish_baselines(self) -> Dict[str, float]:
|
||
"""Establish baseline performance for relative scoring."""
|
||
print("📏 Establishing baseline performance for relative scoring...")
|
||
|
||
baselines = {}
|
||
events = ['mlp_sprint', 'cnn_marathon', 'transformer_decathlon']
|
||
|
||
for event in events:
|
||
model, dataset = self.tinyperf.load_benchmark(event)
|
||
results = self.profiler.benchmark_model(model, dataset)
|
||
baselines[event] = results['mean_inference_time']
|
||
print(f" {event}: {baselines[event]*1000:.2f} ms baseline")
|
||
|
||
return baselines
|
||
|
||
def submit_entry(self, team_name: str, event_name: str, optimized_model,
|
||
optimization_description: str = "", github_url: str = "") -> Dict[str, Any]:
|
||
"""Submit an optimized model to TinyMLPerf competition.
|
||
|
||
Args:
|
||
team_name: Name of the competing team
|
||
event_name: Competition event ('mlp_sprint', 'cnn_marathon', 'transformer_decathlon')
|
||
optimized_model: The optimized model to submit
|
||
optimization_description: Description of optimization techniques used
|
||
github_url: Link to code repository (for transparency)
|
||
|
||
Returns:
|
||
Submission results with performance metrics and scoring
|
||
"""
|
||
# Validate event
|
||
if event_name not in self.baselines:
|
||
available = list(self.baselines.keys())
|
||
print(f"FAIL Event '{event_name}' not recognized!")
|
||
print("TARGET Available competitions:")
|
||
for event in available:
|
||
print(f" • {event.replace('_', ' ').title()}")
|
||
return None
|
||
|
||
print(f"ROCKET TINYMLPERF SUBMISSION")
|
||
print(f"🏆 Event: {event_name.replace('_', ' ').title()}")
|
||
print(f"👥 Team: {team_name}")
|
||
print("-" * 60)
|
||
|
||
# Load benchmark dataset for this event
|
||
_, dataset = self.tinyperf.load_benchmark(event_name)
|
||
|
||
# Benchmark the submitted model with baseline comparison
|
||
results = self.profiler.benchmark_with_baseline(
|
||
optimized_model, dataset,
|
||
baseline_time=self.baselines[event_name]
|
||
)
|
||
|
||
# Calculate competition score (relative speedup)
|
||
baseline_time = self.baselines[event_name]
|
||
submission_time = results['mean_inference_time']
|
||
speedup_score = baseline_time / submission_time
|
||
|
||
# Create submission record
|
||
submission = {
|
||
'submission_id': self._generate_submission_id(team_name, event_name),
|
||
'timestamp': datetime.now().isoformat(),
|
||
'team_name': team_name,
|
||
'event_name': event_name,
|
||
'optimization_description': optimization_description,
|
||
'github_url': github_url,
|
||
'performance_metrics': results,
|
||
'speedup_score': speedup_score,
|
||
'baseline_time_ms': baseline_time * 1000,
|
||
'submission_time_ms': submission_time * 1000
|
||
}
|
||
|
||
# Save submission to storage
|
||
self.storage.save_submission(submission)
|
||
|
||
# Display submission results
|
||
self._display_submission_results(submission)
|
||
|
||
return submission
|
||
|
||
def _generate_submission_id(self, team_name: str, event_name: str) -> str:
|
||
"""Generate unique submission ID"""
|
||
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
|
||
team_hash = hashlib.md5(team_name.encode()).hexdigest()[:6]
|
||
return f"{event_name}_{team_hash}_{timestamp}"
|
||
|
||
def _benchmark_submission(self, submission: CompetitionSubmission) -> Dict[str, Any]:
|
||
"""Benchmark a submission and calculate scores."""
|
||
# Load benchmark dataset
|
||
_, dataset = self.tinyperf.load_benchmark(submission.event_name)
|
||
|
||
# Run profiling
|
||
results = self.profiler.benchmark_model(
|
||
submission.optimized_model, dataset,
|
||
baseline_time=self.baselines[submission.event_name]
|
||
)
|
||
|
||
# Calculate scores
|
||
baseline_time = self.baselines[submission.event_name]
|
||
submission_time = results['mean_inference_time']
|
||
speedup_score = baseline_time / submission_time
|
||
|
||
# Create submission data
|
||
submission_data = submission.to_dict()
|
||
submission_data.update({
|
||
'performance_metrics': results,
|
||
'speedup_score': speedup_score,
|
||
'baseline_time_ms': baseline_time * 1000,
|
||
'submission_time_ms': submission_time * 1000
|
||
})
|
||
|
||
return submission_data
|
||
|
||
def _display_submission_results(self, submission: Dict[str, Any]):
|
||
"""Display formatted submission results."""
|
||
metrics = submission['performance_metrics']
|
||
speedup = submission['speedup_score']
|
||
|
||
print(f"\n🏆 SUBMISSION RESULTS")
|
||
print(f"=" * 50)
|
||
print(f"Team: {submission['team_name']}")
|
||
print(f"Event: {submission['event_name'].replace('_', ' ').title()}")
|
||
|
||
print(f"\n⏱️ Performance:")
|
||
print(f" Your Time: {submission['submission_time_ms']:.2f} ms")
|
||
print(f" Baseline: {submission['baseline_time_ms']:.2f} ms")
|
||
print(f" ROCKET Speedup: {speedup:.2f}x {'FASTER' if speedup > 1.0 else 'slower'}")
|
||
|
||
if 'memory_delta_mb' in metrics:
|
||
print(f" 💾 Memory: {metrics['memory_delta_mb']:.2f} MB")
|
||
|
||
# Award celebration for good performance
|
||
if speedup >= 3.0:
|
||
print(f"\nCELEBRATE AMAZING! 3x+ speedup achieved!")
|
||
elif speedup >= 2.0:
|
||
print(f"\n🏆 EXCELLENT! 2x+ speedup!")
|
||
elif speedup >= 1.5:
|
||
print(f"\n⭐ GREAT! 50%+ speedup!")
|
||
elif speedup >= 1.1:
|
||
print(f"\nPASS Good optimization!")
|
||
else:
|
||
print(f"\nTHINK Keep optimizing - you can do better!")
|
||
|
||
if submission['optimization_description']:
|
||
print(f"\nTIP Techniques Used:")
|
||
print(f" {submission['optimization_description']}")
|
||
|
||
def display_leaderboard(self, event_name: str, sort_by: str = 'speed', top_n: int = 10) -> List[Dict[str, Any]]:
|
||
"""Display leaderboard for specific event with configurable sorting.
|
||
|
||
Args:
|
||
event_name: Event to show leaderboard for
|
||
sort_by: 'speed', 'innovation', or 'composite'
|
||
top_n: Number of top entries to display
|
||
"""
|
||
return self.leaderboard.display_leaderboard(event_name, sort_by, top_n)
|
||
|
||
def display_all_leaderboards(self, sort_by: str = 'speed'):
|
||
"""Display leaderboards for all events.
|
||
|
||
Args:
|
||
sort_by: 'speed', 'innovation', or 'composite'
|
||
"""
|
||
events = ['mlp_sprint', 'cnn_marathon', 'transformer_decathlon']
|
||
|
||
for event in events:
|
||
self.display_leaderboard(event, sort_by=sort_by, top_n=5)
|
||
print()
|
||
|
||
def get_team_progress(self, team_name: str) -> Dict[str, List[Dict[str, Any]]]:
|
||
"""Get all submissions from a specific team across all events."""
|
||
team_submissions = {'mlp_sprint': [], 'cnn_marathon': [], 'transformer_decathlon': []}
|
||
|
||
for event in team_submissions.keys():
|
||
submissions = self.storage.load_event_submissions(event)
|
||
team_submissions[event] = [
|
||
s for s in submissions if s['team_name'] == team_name
|
||
]
|
||
# Sort by timestamp
|
||
team_submissions[event].sort(key=lambda s: s['timestamp'])
|
||
|
||
return team_submissions
|
||
|
||
# %% [markdown]
|
||
"""
|
||
### Test TinyMLPerf Competition Framework
|
||
|
||
Let's test the competition framework with multiple team submissions and leaderboards.
|
||
"""
|
||
|
||
# %%
|
||
def test_tinymlperf_competition():
|
||
"""Test the TinyMLPerf competition framework"""
|
||
print("Testing TinyMLPerf Competition Framework...")
|
||
|
||
# Initialize competition
|
||
competition = TinyMLPerfCompetition()
|
||
|
||
# Create some test optimized models
|
||
class FastMLPModel:
|
||
"""Simulated optimized MLP - smaller and faster"""
|
||
def __init__(self):
|
||
# Smaller model for speed
|
||
self.weights1 = np.random.randn(784, 64).astype(np.float32) * 0.1
|
||
self.bias1 = np.random.randn(64).astype(np.float32) * 0.1
|
||
self.weights2 = np.random.randn(64, 10).astype(np.float32) * 0.1
|
||
self.bias2 = np.random.randn(10).astype(np.float32) * 0.1
|
||
|
||
def predict(self, x):
|
||
h1 = np.maximum(0, x @ self.weights1 + self.bias1)
|
||
return h1 @ self.weights2 + self.bias2
|
||
|
||
class EfficientCNNModel:
|
||
"""Simulated optimized CNN"""
|
||
def __init__(self):
|
||
# Optimized weights
|
||
self.fc_weights = np.random.randn(1600, 10).astype(np.float32) * 0.05
|
||
self.fc_bias = np.random.randn(10).astype(np.float32) * 0.05
|
||
|
||
def predict(self, x):
|
||
batch_size = x.shape[0]
|
||
x_flat = x.reshape(batch_size, -1)
|
||
if x_flat.shape[1] != 1600:
|
||
x_flat = x_flat[:, :1600] if x_flat.shape[1] > 1600 else np.pad(x_flat, ((0, 0), (0, 1600 - x_flat.shape[1])), 'constant')
|
||
return x_flat @ self.fc_weights + self.fc_bias
|
||
|
||
# Submit optimized models to competition
|
||
print("\nROCKET Submitting Competition Entries...")
|
||
|
||
# MLP Sprint submissions
|
||
mlp_submission1 = competition.submit_entry(
|
||
team_name="Speed Demons",
|
||
event_name="mlp_sprint",
|
||
optimized_model=FastMLPModel(),
|
||
optimization_description="Reduced hidden layer size for 2x speedup",
|
||
github_url="https://github.com/speed-demons/fast-mlp"
|
||
)
|
||
|
||
mlp_submission2 = competition.submit_entry(
|
||
team_name="Lightning Fast",
|
||
event_name="mlp_sprint",
|
||
optimized_model=FastMLPModel(),
|
||
optimization_description="Quantization + kernel optimization",
|
||
github_url="https://github.com/lightning-fast/mlp-opt"
|
||
)
|
||
|
||
# CNN Marathon submission
|
||
cnn_submission = competition.submit_entry(
|
||
team_name="CNN Champions",
|
||
event_name="cnn_marathon",
|
||
optimized_model=EfficientCNNModel(),
|
||
optimization_description="Custom convolution kernels + memory optimization",
|
||
github_url="https://github.com/cnn-champions/efficient-cnn"
|
||
)
|
||
|
||
# Display leaderboards
|
||
print("\n📊 Competition Leaderboards:")
|
||
competition.display_all_leaderboards()
|
||
|
||
print("\nPASS TinyMLPerf competition framework test complete!")
|
||
return competition
|
||
|
||
# %% [markdown]
|
||
"""
|
||
## Part 4: Simplified Competition Testing
|
||
|
||
Let's test the simplified competition framework with all three leaderboard types.
|
||
"""
|
||
|
||
# %%
|
||
def test_simplified_competition_features():
|
||
"""Test the simplified competition framework with all leaderboard types."""
|
||
print("Testing Simplified Competition Framework with All Leaderboard Types...")
|
||
|
||
# Initialize competition
|
||
competition = TinyMLPerfCompetition()
|
||
|
||
# Create optimized models with different innovation descriptions
|
||
class FastMLPModel:
|
||
"""Simulated optimized MLP - smaller and faster"""
|
||
def __init__(self):
|
||
# Smaller model for speed
|
||
self.weights1 = np.random.randn(784, 64).astype(np.float32) * 0.1
|
||
self.bias1 = np.random.randn(64).astype(np.float32) * 0.1
|
||
self.weights2 = np.random.randn(64, 10).astype(np.float32) * 0.1
|
||
self.bias2 = np.random.randn(10).astype(np.float32) * 0.1
|
||
|
||
def predict(self, x):
|
||
h1 = np.maximum(0, x @ self.weights1 + self.bias1)
|
||
return h1 @ self.weights2 + self.bias2
|
||
|
||
class EfficientCNNModel:
|
||
"""Simulated optimized CNN"""
|
||
def __init__(self):
|
||
# Optimized weights
|
||
self.fc_weights = np.random.randn(1600, 10).astype(np.float32) * 0.05
|
||
self.fc_bias = np.random.randn(10).astype(np.float32) * 0.05
|
||
|
||
def predict(self, x):
|
||
batch_size = x.shape[0]
|
||
x_flat = x.reshape(batch_size, -1)
|
||
if x_flat.shape[1] != 1600:
|
||
x_flat = x_flat[:, :1600] if x_flat.shape[1] > 1600 else np.pad(x_flat, ((0, 0), (0, 1600 - x_flat.shape[1])), 'constant')
|
||
return x_flat @ self.fc_weights + self.fc_bias
|
||
|
||
# Submit entries with different optimization descriptions
|
||
print("\nROCKET Submitting Competition Entries...")
|
||
|
||
# MLP submissions with different techniques
|
||
submission1 = competition.submit_entry(
|
||
team_name="Speed Demons",
|
||
event_name="mlp_sprint",
|
||
optimized_model=FastMLPModel(),
|
||
optimization_description="Reduced hidden layer size for 2x speedup",
|
||
github_url="https://github.com/speed-demons/fast-mlp"
|
||
)
|
||
|
||
submission2 = competition.submit_entry(
|
||
team_name="Quantized Team",
|
||
event_name="mlp_sprint",
|
||
optimized_model=FastMLPModel(),
|
||
optimization_description="INT8 quantization with custom kernels",
|
||
github_url="https://github.com/quantized-team/mlp-opt"
|
||
)
|
||
|
||
submission3 = competition.submit_entry(
|
||
team_name="Pruning Pros",
|
||
event_name="cnn_marathon",
|
||
optimized_model=EfficientCNNModel(),
|
||
optimization_description="Sparse pruned model with distillation",
|
||
github_url="https://github.com/pruning-pros/efficient-cnn"
|
||
)
|
||
|
||
# Test all three leaderboard types
|
||
print("\n📊 Testing All Leaderboard Types:")
|
||
|
||
print("\n1. Speed Leaderboard:")
|
||
competition.display_leaderboard("mlp_sprint", sort_by="speed", top_n=5)
|
||
|
||
print("\n2. Innovation Leaderboard:")
|
||
competition.display_leaderboard("mlp_sprint", sort_by="innovation", top_n=5)
|
||
|
||
print("\n3. Composite Leaderboard:")
|
||
competition.display_leaderboard("mlp_sprint", sort_by="composite", top_n=5)
|
||
|
||
print("\nPASS Simplified competition features test complete!")
|
||
return competition
|
||
|
||
# %% [markdown]
|
||
"""
|
||
## Comprehensive Testing
|
||
|
||
Let's run a complete TinyMLPerf competition demonstration with simplified features.
|
||
"""
|
||
|
||
def run_complete_tinymlperf_demo():
|
||
"""Run comprehensive TinyMLPerf competition demonstration"""
|
||
print("🏆 TINYMLPERF - THE ULTIMATE ML SYSTEMS COMPETITION")
|
||
print("=" * 80)
|
||
|
||
print("\n1. 🏗️ Setting up TinyMLPerf Benchmark Suite...")
|
||
# Test benchmark suite
|
||
benchmark_suite = test_tinymlperf_benchmark_suite()
|
||
|
||
print("\n2. SPEED Testing Competition Profiling...")
|
||
# Test profiling infrastructure
|
||
competition_profiler, mlp_results, cnn_results = test_competition_profiler()
|
||
|
||
print("\n3. ROCKET Running Basic Competition...")
|
||
# Test basic competition
|
||
basic_competition = test_tinymlperf_competition()
|
||
|
||
print("\n4. 🔬 Testing Simplified Competition Features...")
|
||
# Test simplified competition with all leaderboard types
|
||
simplified_competition = test_simplified_competition_features()
|
||
|
||
print("\n" + "=" * 80)
|
||
print("CELEBRATE TINYMLPERF DEMO COMPLETE!")
|
||
print("=" * 80)
|
||
|
||
print("\n🏆 TinyMLPerf Competition Ready:")
|
||
print("PASS Three exciting events: MLP Sprint, CNN Marathon, Transformer Decathlon")
|
||
print("PASS TinyTorch Module 15 profiler integration for rigorous benchmarking")
|
||
print("PASS Hardware-independent relative scoring (speedup ratios)")
|
||
print("PASS Transparent leaderboards with evidence requirements")
|
||
print("PASS Simplified innovation detection and creativity rewards")
|
||
print("PASS Three leaderboard types: speed, innovation, and composite scoring")
|
||
|
||
print("\nROCKET Competition Features:")
|
||
print("• Standardized benchmark models and datasets")
|
||
print("• Statistical reliability with multiple timing runs")
|
||
print("• Multiple leaderboard categories with simple keyword detection")
|
||
print("• GitHub integration for transparency and reproducibility")
|
||
print("• Focused classes with single responsibilities")
|
||
|
||
print("\nTARGET Ready to Compete:")
|
||
print("1. Optimize your models using techniques from Modules 16-19")
|
||
print("2. Submit to TinyMLPerf events using competition.submit_entry()")
|
||
print("3. See your results on speed, innovation, or composite leaderboards")
|
||
print("4. Iterate and improve based on performance feedback")
|
||
print("5. Prove your ML systems optimization mastery!")
|
||
|
||
return {
|
||
'benchmark_suite': benchmark_suite,
|
||
'profiler': competition_profiler,
|
||
'basic_competition': basic_competition,
|
||
'simplified_competition': simplified_competition
|
||
}
|
||
|
||
# %% [markdown]
|
||
"""
|
||
## Systems Analysis Summary
|
||
|
||
This simplified TinyMLPerf competition module demonstrates advanced ML systems engineering through streamlined competitive benchmarking:
|
||
|
||
### 🏗️ **Simplified Competition Infrastructure**
|
||
- **Focused Classes**: Each class has a single responsibility - submission, storage, leaderboard, or innovation detection
|
||
- **Clear Separation of Concerns**: CompetitionSubmission, CompetitionStorage, CompetitionLeaderboard, and SimpleInnovationDetector work together
|
||
- **Consistent API**: Single parameterized leaderboard method replaces three separate implementations
|
||
- **Student-Friendly**: Reduced cognitive load while maintaining all essential functionality
|
||
|
||
### SPEED **Streamlined Performance Optimization**
|
||
- **Single Leaderboard Interface**: One method with sort_by parameter ('speed', 'innovation', 'composite') replaces complex multiple methods
|
||
- **Simple Innovation Detection**: Basic keyword matching replaces complex pattern analysis and model introspection
|
||
- **Consistent Formatting**: Centralized header templates ensure visual consistency across all leaderboard types
|
||
- **Clear Error Messages**: Student-friendly guidance when events are not recognized
|
||
|
||
### 📊 **Simplified Competition Analysis**
|
||
- **TinyTorch Profiler Integration**: Unchanged - still leverages Module 15's profiling infrastructure
|
||
- **Progressive Feature Introduction**: Students can focus on speed first, then add innovation scoring
|
||
- **Visual Clarity**: Clear section headers and spacing prevent information overload
|
||
- **Focused Testing**: Each test function validates one specific capability
|
||
|
||
### TIP **Educational Improvements**
|
||
- **Reduced Complexity**: Eliminated 100+ line classes in favor of focused 20-30 line classes
|
||
- **Better Mental Models**: Students understand leaderboard concepts instead of getting lost in implementation details
|
||
- **Maintainable Code**: Consistent patterns and centralized formatting make code easier to debug and extend
|
||
- **KISS Principle**: Keep It Simple, Stupid - core pedagogical value preserved with implementation complexity reduced
|
||
|
||
### TARGET **Key Learning Objectives Maintained**
|
||
- Competition still accelerates optimization learning through concrete performance measurements
|
||
- Hardware-independent scoring ensures fair comparison across different development environments
|
||
- Multiple leaderboard types prevent single-metric tunnel vision
|
||
- Evidence requirements teach reproducibility and honest performance reporting
|
||
|
||
### 🏆 **Professional Development**
|
||
The simplified framework teaches students that good software engineering means:
|
||
- Breaking large classes into focused components
|
||
- Choosing clear, consistent APIs over feature proliferation
|
||
- Prioritizing readability and maintainability
|
||
- Making complex systems accessible without losing functionality
|
||
|
||
This refactored competition framework proves that educational software can be both pedagogically effective AND well-engineered, setting a positive example for students about professional software development practices.
|
||
"""
|
||
|
||
# %% [markdown]
|
||
"""
|
||
## Main Execution Block
|
||
|
||
Run the complete TinyMLPerf competition system when this module is executed directly.
|
||
"""
|
||
|
||
# %%
|
||
if __name__ == "__main__":
|
||
print("Module 20: TinyMLPerf - The Ultimate ML Systems Competition")
|
||
print("=" * 80)
|
||
|
||
# Run complete TinyMLPerf demonstration
|
||
results = run_complete_tinymlperf_demo()
|
||
|
||
print(f"\nCELEBRATE Module 20 complete!")
|
||
print(f"🏆 TinyMLPerf competition infrastructure ready!")
|
||
print(f"ROCKET Time to optimize your models and climb the leaderboards!")
|
||
|
||
# %% [markdown]
|
||
"""
|
||
## THINK ML Systems Thinking: Interactive Questions
|
||
|
||
1. **Why is separation of concerns crucial in competition software architecture?** Your refactored TinyMLPerf breaks large classes into focused components: CompetitionSubmission, CompetitionStorage, CompetitionLeaderboard, and SimpleInnovationDetector. Explain why this modular design is essential for educational software and how it teaches students professional software development practices beyond just ML systems concepts.
|
||
|
||
2. **How does simplifying innovation detection improve student learning outcomes?** You replaced complex pattern matching and model introspection with basic keyword detection. Analyze why reducing implementation complexity while preserving core functionality helps students focus on competition concepts rather than text processing algorithms, and how this reflects real-world engineering trade-offs.
|
||
|
||
3. **What makes single parameterized methods superior to multiple specialized methods?** Your leaderboard refactor replaced three separate methods (display_leaderboard, display_innovation_leaderboard, display_composite_leaderboard) with one configurable method. Explain why this API design choice reduces cognitive load while maintaining functionality, and how this principle applies to ML systems interfaces in production.
|
||
|
||
4. **How does consistent formatting contribute to system maintainability and user experience?** Your centralized header templates (LEADERBOARD_HEADER, INNOVATION_HEADER, COMPOSITE_HEADER) ensure visual consistency across all leaderboard displays. Analyze why standardized formatting matters in ML systems dashboards and monitoring tools, and how it prevents the user interface inconsistencies that plague many ML operations platforms.
|
||
"""
|
||
|
||
# %% [markdown]
|
||
"""
|
||
## TARGET MODULE SUMMARY: TinyMLPerf - Simplified Competition Framework
|
||
|
||
This refactored module demonstrates the power of the KISS principle in educational software design, proving that complex systems can be both pedagogically effective and professionally engineered.
|
||
|
||
### 🛤️ **The Simplification Journey**
|
||
- **Original Problem**: 600+ lines of complex, intertwined classes causing student cognitive overload
|
||
- **Solution Approach**: Break large classes into focused components with single responsibilities
|
||
- **Result**: Clean, maintainable code that teaches competition concepts without implementation distractions
|
||
|
||
### 🏗️ **Architecture Improvements**
|
||
- **CompetitionSubmission**: Focused on creating and validating individual submissions
|
||
- **CompetitionStorage**: Dedicated to saving and loading competition data
|
||
- **CompetitionLeaderboard**: Specialized for ranking and display with configurable sorting
|
||
- **SimpleInnovationDetector**: Basic keyword matching replacing complex pattern analysis
|
||
- **TinyMLPerfCompetition**: Orchestrates components with clean delegation patterns
|
||
|
||
### TARGET **Educational Excellence**
|
||
Students learn both ML systems concepts AND professional software engineering:
|
||
- **Modular Design**: How to break complex problems into manageable components
|
||
- **API Consistency**: Why parameterized methods beat specialized implementations
|
||
- **Code Maintainability**: How consistent formatting and clear separation of concerns prevent technical debt
|
||
- **KISS Principle**: That simplicity is the ultimate sophistication in software design
|
||
|
||
### 🏆 **Competition Integrity Maintained**
|
||
All essential functionality preserved with improved usability:
|
||
- Three competition events with standardized benchmarking
|
||
- Hardware-independent relative scoring for fair comparison
|
||
- Multiple leaderboard types (speed, innovation, composite) preventing tunnel vision
|
||
- Evidence requirements ensuring reproducible, honest performance claims
|
||
- Simple but effective innovation detection rewarding creative optimization
|
||
|
||
### TIP **Professional Development**
|
||
This refactor teaches students that excellent engineering means:
|
||
- Choosing clarity over clever complexity
|
||
- Building maintainable systems that others can understand and extend
|
||
- Designing APIs that guide users toward correct usage
|
||
- Making sophisticated functionality accessible without dumbing it down
|
||
|
||
**The ultimate lesson**: Great ML systems engineers build tools that make complex concepts simple to use, not simple concepts complex to understand. This competition framework exemplifies how educational software can teach both domain knowledge and engineering excellence simultaneously.
|
||
"""
|