# Capstone Integration Tests - Module 20

Comprehensive integration tests that validate the **entire** TinyTorch learning journey.
## Overview

The capstone tests verify that all 19 previous modules work together to build production-ready ML systems.
## Test Coverage

### Priority 1: Complete ML Pipeline

- `test_complete_ml_pipeline_end_to_end`: full data → model → training → evaluation flow
- Validates: integration of Modules 01-08
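As a rough illustration of what this end-to-end test exercises, the same flow can be sketched in plain NumPy, with logistic regression standing in for the real layers and optimizers (a hypothetical stand-in, not TinyTorch's API):

```python
import numpy as np

# Hypothetical NumPy stand-in for the data -> model -> training -> evaluation
# flow; TinyTorch's actual tensors, layers, and optimizers are assumed elsewhere.
rng = np.random.default_rng(0)

# Data: a linearly separable binary classification set, split train/test
X = rng.normal(size=(240, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]

# Model: logistic regression, the smallest trainable "architecture"
w, b = np.zeros(2), 0.0
predict = lambda X: 1.0 / (1.0 + np.exp(-(X @ w + b)))

# Training: gradient descent on binary cross-entropy
losses = []
for _ in range(200):
    p = predict(X_train)
    losses.append(float(-np.mean(
        y_train * np.log(p + 1e-9) + (1 - y_train) * np.log(1 - p + 1e-9))))
    err = p - y_train                      # dL/dlogits for sigmoid + BCE
    w -= 0.1 * X_train.T @ err / len(err)
    b -= 0.1 * float(err.mean())

# Evaluation: held-out accuracy, the kind of metric the real test asserts on
accuracy = float(np.mean((predict(X_test) > 0.5) == y_test))
```

The actual test drives the modules' own components; only the shape of the workflow (data, model, training loop, metric) is the same.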
### Priority 2: Model Architecture

- `test_mlp_architecture_integration`: multi-layer perceptron
- `test_cnn_architecture_integration`: CNN with `Conv2d`, pooling, and flatten
- `test_transformer_architecture_integration`: attention, embeddings, positional encoding
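For intuition, the core computation the transformer test wires together can be sketched as single-head scaled dot-product attention in NumPy; the dimensions and weight initialization here are illustrative assumptions, not the module's actual classes:

```python
import numpy as np

# Hypothetical sketch of single-head scaled dot-product attention over a toy
# sequence; sizes are arbitrary, chosen only to make the shapes visible.
rng = np.random.default_rng(0)
seq_len, d_model = 6, 8
x = rng.normal(size=(seq_len, d_model))

# Query/key/value projections (illustrative random weights)
Wq, Wk, Wv = (rng.normal(scale=d_model ** -0.5, size=(d_model, d_model))
              for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv

scores = Q @ K.T / np.sqrt(d_model)              # (seq_len, seq_len)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
out = weights @ V                                # (seq_len, d_model)
```

The integration test additionally checks that embeddings and positional encodings feed this computation with correctly shaped inputs.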
### Priority 3: Training Convergence

- `test_xor_convergence`: the classic XOR problem
- `test_binary_classification_convergence`: real binary classification
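XOR is the canonical check that a framework can learn a non-linear function. A minimal sketch of the idea, with manual backprop standing in for TinyTorch's autograd (architecture and hyperparameters are illustrative):

```python
import numpy as np

# Hypothetical sketch of the XOR convergence check: a 2 -> 8 -> 1 MLP with a
# tanh hidden layer, sigmoid output, and cross-entropy loss, trained full-batch.
rng = np.random.default_rng(42)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)

losses = []
for _ in range(3000):
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    losses.append(float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))))
    dz = (p - y) / len(y)           # BCE + sigmoid gradient w.r.t. logits
    dW2, db2 = h.T @ dz, dz.sum(0)
    dh = dz @ W2.T * (1 - h ** 2)   # back through tanh
    dW1, db1 = X.T @ dh, dh.sum(0)
    W1 -= 1.0 * dW1; b1 -= 1.0 * db1
    W2 -= 1.0 * dW2; b2 -= 1.0 * db2
```

The real test asserts the same property: the loss must fall to near zero, not merely that training runs without errors.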
### Priority 4: Optimization & Deployment

- `test_quantization_pipeline`: INT8 quantization
- `test_pruning_pipeline`: weight pruning
- `test_combined_optimization_deployment`: quantization and pruning together
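A minimal sketch of the two deployment transforms, assuming global magnitude pruning and symmetric per-tensor INT8 quantization (the module's actual scheme and API may differ):

```python
import numpy as np

# Hypothetical sketch of pruning + quantization on one weight tensor.
rng = np.random.default_rng(0)
weights = rng.normal(size=(64, 64)).astype(np.float32)

# Magnitude pruning: zero out the ~50% smallest-magnitude weights
threshold = np.quantile(np.abs(weights), 0.5)
pruned = np.where(np.abs(weights) >= threshold, weights, 0.0)

# Symmetric INT8 quantization: map [-max|w|, max|w|] onto [-127, 127]
scale = float(np.abs(pruned).max()) / 127.0
q = np.clip(np.round(pruned / scale), -127, 127).astype(np.int8)
dequant = q.astype(np.float32) * scale

sparsity = float((pruned == 0).mean())              # ~0.5 after pruning
max_err = float(np.abs(dequant - pruned).max())     # bounded by scale / 2
```

The combined test checks that the two transforms compose: a pruned-then-quantized model must still produce usable predictions.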
### Priority 5: Gradient Flow & Performance

- `test_deep_network_gradient_flow`: gradients flow through all layer types
- `test_memory_efficiency`: memory usage stays within reasonable bounds
- `test_training_performance`: training speed meets expectations
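The gradient-flow test reduces to a simple question: does the analytic gradient survive a deep stack of layers and agree with a numerical estimate? A finite-difference sketch over a toy five-layer tanh network (not the module's real layers):

```python
import numpy as np

# Hypothetical gradient-flow check: backprop through 5 tanh layers, then
# compare one gradient entry against a central finite difference.
rng = np.random.default_rng(1)
Ws = [rng.normal(scale=0.5, size=(4, 4)) for _ in range(5)]
x = rng.normal(size=4)

def loss(Ws):
    h = x
    for W in Ws:
        h = np.tanh(W @ h)
    return float(np.sum(h ** 2))

# Forward pass, caching activations for backprop
hs = [x]
for W in Ws:
    hs.append(np.tanh(W @ hs[-1]))

# Backward pass: dL/dh_last = 2 * h_last, then chain through each layer
g = 2 * hs[-1]
for W, h_in, h_out in zip(reversed(Ws), reversed(hs[:-1]), reversed(hs[1:])):
    g_pre = g * (1 - h_out ** 2)            # through tanh
    if W is Ws[0]:
        grad_W0 = np.outer(g_pre, h_in)     # gradient of the FIRST layer
    g = W.T @ g_pre                         # through the linear map

# Central finite difference on one entry of the first layer's weights
eps = 1e-5
Ws[0][0, 0] += eps; up = loss(Ws)
Ws[0][0, 0] -= 2 * eps; down = loss(Ws)
Ws[0][0, 0] += eps
fd = (up - down) / (2 * eps)
```

A gradient that reaches the first layer and matches the numerical estimate is evidence that nothing in the stack silently drops or corrupts gradients; the real test runs this idea through every layer type.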
## Running Tests

```bash
# Run all capstone tests
pytest tests/20_capstone/ -v

# Run a specific test class
pytest tests/20_capstone/test_capstone_core.py::TestCompleteMLPipeline -v
```
## Test Philosophy

Tests follow production ML workflow patterns:

- Data Creation → representative datasets
- Model Building → real architectures (MLP, CNN, Transformer)
- Training → actual convergence (loss decreases, accuracy improves)
- Evaluation → real metrics
- Optimization → production techniques (quantization, pruning)
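A hypothetical pytest-style skeleton showing how these stages turn into assertions; the data, model, and thresholds are illustrative, not the capstone's actual code:

```python
import numpy as np

def test_pipeline_follows_production_workflow():
    rng = np.random.default_rng(0)

    # 1. Data creation: a representative (linearly separable) dataset
    X = rng.normal(size=(100, 3))
    y = (X.sum(axis=1) > 0).astype(float)

    # 2. Model building: a linear classifier stands in for MLP/CNN/Transformer
    w = np.zeros(3)

    # 3. Training: check actual convergence, not just "it runs"
    first_loss = last_loss = None
    for _ in range(100):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        last_loss = float(np.mean((p - y) ** 2))
        first_loss = last_loss if first_loss is None else first_loss
        # Gradient of MSE through the sigmoid
        w -= (2.0 / len(y)) * X.T @ ((p - y) * p * (1 - p))

    # 4. Evaluation: a real metric on the learned model
    accuracy = float(np.mean((1.0 / (1.0 + np.exp(-(X @ w))) > 0.5) == y))

    assert last_loss < first_loss   # loss decreases
    assert accuracy > 0.8           # the model actually learns
```

The convergence and accuracy assertions are the pattern that distinguishes these tests from pure correctness checks.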
## Success Criteria

For the capstone tests to pass, students must have:

- Built all 19 modules correctly
- Integrated the modules properly
- Implemented autograd correctly (gradients flow everywhere)
- Created working optimizers
- Validated on real tasks (models actually learn)
## What This Tests That Unit Tests Don't
| Aspect | Unit Tests | Capstone Tests |
|---|---|---|
| Scope | Single module | All 19 modules together |
| Integration | Module isolation | Cross-module integration |
| Real workflows | Synthetic checks | Production ML pipelines |
| Learning | Correctness only | Models must converge |
| Deployment | Not tested | Quantization, pruning tested |