Standardize all modules to follow NBGrader style guide

- Updated 7 non-compliant modules for consistency
- Module 01_setup: Added EXAMPLE USAGE sections with code examples
- Module 02_tensor: Added STEP-BY-STEP IMPLEMENTATION and LEARNING CONNECTIONS
- Module 05_dense: Added LEARNING CONNECTIONS to all functions
- Module 06_spatial: Added STEP-BY-STEP and LEARNING CONNECTIONS
- Module 08_dataloader: Added LEARNING CONNECTIONS sections
- Module 11_training: Added STEP-BY-STEP and LEARNING CONNECTIONS
- Module 14_benchmarking: Added STEP-BY-STEP and LEARNING CONNECTIONS
- All modules now follow consistent format per NBGRADER_STYLE_GUIDE.md
- Preserved all existing solution blocks and functionality
Vijay Janapa Reddi
2025-09-16 16:48:14 -04:00
parent 0a0197b72c
commit 6349c218d2
7 changed files with 402 additions and 39 deletions

@@ -249,7 +249,7 @@ class BenchmarkScenarios:
TODO: Implement the three benchmark scenarios following MLPerf patterns.
-UNDERSTANDING THE SCENARIOS:
+STEP-BY-STEP IMPLEMENTATION:
1. Single-Stream: Send queries one at a time, measure latency
2. Server: Send queries following Poisson distribution, measure QPS
3. Offline: Send all queries at once, measure total throughput
@@ -260,6 +260,12 @@ class BenchmarkScenarios:
3. Calculate appropriate metrics for each scenario
4. Return BenchmarkResult with all measurements
+LEARNING CONNECTIONS:
+- **MLPerf Standards**: Industry-standard benchmarking methodology used by Google, NVIDIA, etc.
+- **Performance Scenarios**: Different deployment patterns require different measurement approaches
+- **Production Validation**: Benchmarking validates model performance before deployment
+- **Resource Planning**: Results guide infrastructure scaling and capacity planning
EXAMPLE USAGE:
scenarios = BenchmarkScenarios()
result = scenarios.single_stream(model, dataset, num_queries=1000)
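
The three scenarios differ mainly in which quantity they report. As a purely illustrative sketch of the kind of record each scenario could return (the field names below are assumptions, not the module's actual BenchmarkResult definition):

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ResultSketch:
    """Illustrative stand-in for a per-scenario benchmark record."""
    scenario: str                                   # "single_stream", "server", or "offline"
    latencies_ms: List[float] = field(default_factory=list)      # single-stream focus
    achieved_qps: Optional[float] = None            # server focus
    throughput_samples_per_s: Optional[float] = None              # offline focus
    accuracy: Optional[float] = None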
@@ -275,7 +281,7 @@ class BenchmarkScenarios:
TODO: Implement single-stream benchmarking.
-STEP-BY-STEP:
+STEP-BY-STEP IMPLEMENTATION:
1. Initialize empty list for latencies
2. For each query (up to num_queries):
a. Get next sample from dataset (cycle if needed)
@@ -288,6 +294,12 @@ class BenchmarkScenarios:
4. Calculate accuracy if possible
5. Return BenchmarkResult with SINGLE_STREAM scenario
+LEARNING CONNECTIONS:
+- **Mobile/Edge Deployment**: Single-stream simulates user-facing applications
+- **Tail Latency**: 90th/95th percentiles matter more than averages for user experience
+- **Interactive Systems**: Chatbots, recommendation engines use single-stream patterns
+- **SLA Validation**: Ensures models meet response time requirements
HINTS:
- Use time.perf_counter() for precise timing
- Use dataset[i % len(dataset)] to cycle through samples
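
A minimal sketch of that single-stream loop, assuming `model` is a callable and `dataset[i]` yields a (sample, label) pair; it follows the hints above but is not the module's actual implementation:

import time
import numpy as np

def single_stream_sketch(model, dataset, num_queries=1000):
    latencies_ms = []
    for i in range(num_queries):
        sample, _ = dataset[i % len(dataset)]            # cycle through samples
        start = time.perf_counter()
        model(sample)                                    # one query at a time
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    latencies_ms = np.asarray(latencies_ms)
    # Report tail latency, not just the mean -- p90/p95 drive user experience.
    return {
        "mean_ms": float(latencies_ms.mean()),
        "p90_ms": float(np.percentile(latencies_ms, 90)),
        "p95_ms": float(np.percentile(latencies_ms, 95)),
    }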
@@ -337,7 +349,7 @@ class BenchmarkScenarios:
TODO: Implement server benchmarking.
-STEP-BY-STEP:
+STEP-BY-STEP IMPLEMENTATION:
1. Calculate inter-arrival time = 1.0 / target_qps
2. Run for specified duration:
a. Wait for next query arrival (Poisson distribution)
@@ -348,6 +360,12 @@ class BenchmarkScenarios:
3. Calculate actual QPS = total_queries / duration
4. Return results
+LEARNING CONNECTIONS:
+- **Web Services**: Server scenario simulates API endpoints handling concurrent requests
+- **Load Testing**: Validates system behavior under realistic traffic patterns
+- **Scalability Analysis**: Tests how well models handle increasing load
+- **Production Deployment**: Critical for microservices and web-scale applications
HINTS:
- Use np.random.exponential(inter_arrival_time) for Poisson
- Track both query arrival times and completion times
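
A sketch of the Poisson arrival loop, again with assumed names rather than the module's API; drawing inter-arrival gaps from an exponential distribution is what makes the aggregate arrival process Poisson:

import time
import numpy as np

def server_sketch(model, dataset, target_qps=10.0, duration_s=5.0):
    inter_arrival_s = 1.0 / target_qps
    completed, i = 0, 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration_s:
        time.sleep(np.random.exponential(inter_arrival_s))   # Poisson arrivals
        sample, _ = dataset[i % len(dataset)]
        model(sample)
        completed += 1
        i += 1
    elapsed = time.perf_counter() - start
    return {"achieved_qps": completed / elapsed, "target_qps": target_qps}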
@@ -400,7 +418,7 @@ class BenchmarkScenarios:
TODO: Implement offline benchmarking.
-STEP-BY-STEP:
+STEP-BY-STEP IMPLEMENTATION:
1. Group dataset into batches of batch_size
2. For each batch:
a. Record start time
@@ -410,6 +428,12 @@ class BenchmarkScenarios:
3. Calculate total throughput = total_samples / total_time
4. Return results
+LEARNING CONNECTIONS:
+- **Batch Processing**: Offline scenario simulates data pipeline and ETL workloads
+- **Throughput Optimization**: Maximizes processing efficiency for large datasets
+- **Data Center Workloads**: Common in recommendation systems and analytics pipelines
+- **Cost Optimization**: High throughput reduces compute costs per sample
HINTS:
- Process data in batches for efficiency
- Measure total time for all batches
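
A sketch of the offline throughput measurement, assuming for illustration that the model can take a list of samples as one batch:

import time

def offline_sketch(model, dataset, batch_size=32):
    samples = [dataset[i][0] for i in range(len(dataset))]
    start = time.perf_counter()
    for b in range(0, len(samples), batch_size):
        model(samples[b:b + batch_size])                 # one batch per call
    total_time = time.perf_counter() - start
    # Throughput = samples processed per second of wall-clock time.
    return {"throughput_samples_per_s": len(samples) / total_time}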
@@ -521,7 +545,7 @@ class StatisticalValidator:
TODO: Implement statistical validation for benchmark results.
-UNDERSTANDING STATISTICAL TESTING:
+STEP-BY-STEP IMPLEMENTATION:
1. Null hypothesis: No difference between models
2. T-test: Compare means of two groups
3. P-value: Probability of seeing this difference by chance
@@ -534,6 +558,12 @@ class StatisticalValidator:
3. Calculate effect size (Cohen's d)
4. Calculate confidence interval
5. Provide clear recommendation
+LEARNING CONNECTIONS:
+- **Scientific Rigor**: Ensures performance claims are statistically valid
+- **A/B Testing**: Foundation for production model comparison and rollout decisions
+- **Research Validation**: Required for academic papers and technical reports
+- **Business Decisions**: Statistical significance guides investment in new models
"""
def __init__(self, confidence_level: float = 0.95):
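
The statistics listed above map directly onto SciPy; the following is a generic two-sample comparison sketch, not the StatisticalValidator's actual code:

import numpy as np
from scipy import stats

def compare_sketch(latencies_a, latencies_b, confidence=0.95):
    a, b = np.asarray(latencies_a), np.asarray(latencies_b)
    t_stat, p_value = stats.ttest_ind(a, b)              # two-sample t-test
    pooled_std = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    cohens_d = (a.mean() - b.mean()) / pooled_std        # effect size
    diff = a.mean() - b.mean()
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    margin = stats.t.ppf((1 + confidence) / 2, df=len(a) + len(b) - 2) * se
    return {
        "p_value": float(p_value),
        "cohens_d": float(cohens_d),
        "ci_of_difference": (float(diff - margin), float(diff + margin)),
        "significant": bool(p_value < (1 - confidence)),
    }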
@@ -733,7 +763,7 @@ class TinyTorchPerf:
TODO: Implement the complete benchmarking framework.
-UNDERSTANDING THE FRAMEWORK:
+STEP-BY-STEP IMPLEMENTATION:
1. Combines all benchmark scenarios
2. Integrates statistical validation
3. Provides easy-to-use API
@@ -744,6 +774,12 @@ class TinyTorchPerf:
2. Provide methods for each scenario
3. Include statistical validation
4. Generate comprehensive reports
+LEARNING CONNECTIONS:
+- **MLPerf Integration**: Follows industry-standard benchmarking patterns
+- **Production Deployment**: Validates models before production rollout
+- **Performance Engineering**: Identifies bottlenecks and optimization opportunities
+- **Framework Design**: Demonstrates how to build reusable ML tools
"""
def __init__(self):
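
How the pieces could compose, reusing the sketch helpers above; class and method names here are illustrative, not the actual TinyTorchPerf API:

class PerfHarnessSketch:
    """Illustrative wrapper: run all three scenarios, then validate statistically."""

    def __init__(self, model, dataset):
        self.model, self.dataset = model, dataset

    def run_all(self):
        return {
            "single_stream": single_stream_sketch(self.model, self.dataset),
            "server": server_sketch(self.model, self.dataset),
            "offline": offline_sketch(self.model, self.dataset),
        }

    def compare_to_baseline(self, latencies, baseline_latencies):
        return compare_sketch(latencies, baseline_latencies)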
@@ -1376,13 +1412,19 @@ class ProductionBenchmarkingProfiler:
TODO: Implement production-grade profiling capabilities.
-UNDERSTANDING PRODUCTION PROFILING:
+STEP-BY-STEP IMPLEMENTATION:
1. End-to-end pipeline analysis (not just model inference)
2. Resource utilization monitoring (CPU, memory, bandwidth)
3. Statistical A/B testing frameworks
4. Production monitoring and alerting integration
5. Performance regression detection
6. Load testing and capacity planning
+LEARNING CONNECTIONS:
+- **Production ML Systems**: Real-world profiling for deployment optimization
+- **Performance Engineering**: Systematic approach to identifying and fixing bottlenecks
+- **A/B Testing**: Statistical frameworks for safe model rollouts
+- **Cost Optimization**: Understanding resource usage for efficient cloud deployment
"""
def __init__(self, enable_monitoring: bool = True):
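
One piece of the production story, resource-utilization monitoring, can be sketched with psutil (an assumption; the module may use different tooling). A background thread samples process CPU and memory while the workload runs:

import threading
import time
import psutil

def profile_resources_sketch(fn, *args, interval_s=0.05, **kwargs):
    """Run fn while sampling this process's CPU and memory in a background thread."""
    proc = psutil.Process()
    samples = {"cpu_percent": [], "rss_mb": []}
    stop = threading.Event()

    def sampler():
        while not stop.is_set():
            samples["cpu_percent"].append(proc.cpu_percent(interval=None))
            samples["rss_mb"].append(proc.memory_info().rss / 1e6)
            time.sleep(interval_s)

    thread = threading.Thread(target=sampler, daemon=True)
    thread.start()
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    stop.set()
    thread.join()
    return result, {"wall_time_s": elapsed, **samples}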