# Capstone Integration Tests - Module 20

Comprehensive integration tests that validate the **entire** TinyTorch learning journey.

## Overview

The capstone tests verify that all 19 previous modules work together to build production-ready ML systems.

## Test Coverage

### Priority 1: Complete ML Pipeline

- **test_complete_ml_pipeline_end_to_end**: Full data → model → training → evaluation (see the sketch below)
  - Validates: Modules 01-08 integration
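
To make the shape of this test concrete, here is a minimal, self-contained sketch of the data → model → training → evaluation flow it exercises. This is plain NumPy with illustrative names, not the actual TinyTorch API or the real test body:

```python
# Hypothetical pipeline sketch (NumPy stand-in, not the TinyTorch API).
# Trains logistic regression on two separable blobs and asserts that the
# loss drops and the evaluation accuracy is high.
import numpy as np

def test_pipeline_end_to_end_sketch():
    rng = np.random.default_rng(0)
    # Data creation: two Gaussian blobs labeled 0 and 1
    X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
    y = np.array([0] * 50 + [1] * 50)

    # Model building: a single linear layer with a sigmoid output
    w, b = np.zeros(2), 0.0
    sigmoid = lambda z: 1 / (1 + np.exp(-z))

    def loss_fn():
        p = sigmoid(X @ w + b)
        return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

    # Training: 200 full-batch gradient descent steps
    initial_loss = loss_fn()
    for _ in range(200):
        p = sigmoid(X @ w + b)
        w -= 0.1 * X.T @ (p - y) / len(y)
        b -= 0.1 * np.mean(p - y)

    # Evaluation: the model must have learned something real
    assert loss_fn() < initial_loss
    accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == y)
    assert accuracy > 0.9
```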

### Priority 2: Model Architecture

- **test_mlp_architecture_integration**: Multi-layer perceptron
- **test_cnn_architecture_integration**: CNN with Conv2d, pooling, flatten (see the shape sketch below)
- **test_transformer_architecture_integration**: Attention, embeddings, positional encoding
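
As a rough illustration of what "architecture integration" checks here, the sketch below (NumPy only; hypothetical names, not TinyTorch's interface) verifies that shapes stay consistent as a CNN-style feature map is pooled, flattened, and projected to class logits:

```python
# Hypothetical shape-propagation check (NumPy stand-in, not the TinyTorch API).
import numpy as np

def test_cnn_shapes_sketch():
    batch, channels, h, w, classes = 4, 8, 16, 16, 10
    feature_map = np.zeros((batch, channels, h, w))

    # 2x2 max pooling halves both spatial dimensions
    pooled = feature_map.reshape(batch, channels, h // 2, 2, w // 2, 2).max(axis=(3, 5))
    assert pooled.shape == (batch, channels, h // 2, w // 2)

    # Flatten everything except the batch dimension
    flat = pooled.reshape(batch, -1)
    assert flat.shape == (batch, channels * (h // 2) * (w // 2))

    # Dense projection down to class logits
    W = np.zeros((flat.shape[1], classes))
    logits = flat @ W
    assert logits.shape == (batch, classes)
```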

### Priority 3: Training Convergence

- **test_xor_convergence**: Classic XOR problem (sketched below)
- **test_binary_classification_convergence**: Real binary classification
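
As a hedged illustration of what "convergence" means for this test, the NumPy sketch below trains a tiny MLP on XOR, which no linear model can fit; the architecture, hyperparameters, and threshold are assumptions for illustration, not the real test's values:

```python
# Hypothetical convergence sketch (NumPy stand-in, not the TinyTorch API).
import numpy as np

def test_xor_convergence_sketch():
    rng = np.random.default_rng(1)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0.0], [1.0], [1.0], [0.0]])

    # Tiny 2-8-1 MLP: tanh hidden layer, sigmoid output
    W1, b1 = rng.normal(0.0, 1.0, (2, 8)), np.zeros(8)
    W2, b2 = rng.normal(0.0, 1.0, (8, 1)), np.zeros(1)
    sigmoid = lambda z: 1 / (1 + np.exp(-z))

    def forward():
        h = np.tanh(X @ W1 + b1)
        return h, sigmoid(h @ W2 + b2)

    def bce(p):
        return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

    initial_loss = bce(forward()[1])
    for _ in range(3000):
        h, p = forward()
        dz = (p - y) / len(y)          # BCE gradient w.r.t. the output logits
        dW2, db2 = h.T @ dz, dz.sum(axis=0)
        dh = dz @ W2.T * (1 - h ** 2)  # back through tanh
        dW1, db1 = X.T @ dh, dh.sum(axis=0)
        for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
            param -= 0.5 * grad        # plain SGD step

    # Loss must drop substantially; the real test presumably also checks
    # that all four XOR points end up classified correctly.
    assert bce(forward()[1]) < 0.5 * initial_loss
```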

### Priority 4: Optimization & Deployment

- **test_quantization_pipeline**: INT8 quantization (sketched below)
- **test_pruning_pipeline**: Weight pruning
- **test_combined_optimization_deployment**: Quantization + pruning together
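
A minimal sketch of an INT8 round-trip check, assuming symmetric per-tensor quantization (the scheme and tolerance are illustrative; TinyTorch's actual quantizer may differ):

```python
# Hypothetical INT8 quantization round-trip (NumPy stand-in, not the TinyTorch API).
import numpy as np

def test_quantization_roundtrip_sketch():
    rng = np.random.default_rng(2)
    weights = rng.normal(0, 0.1, 256).astype(np.float32)

    # Symmetric scale: map the largest magnitude onto the int8 range
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    dequantized = q.astype(np.float32) * scale

    assert q.dtype == np.int8  # stored 4x smaller than float32
    # Rounding error is bounded by half a quantization step per element
    assert np.max(np.abs(dequantized - weights)) <= scale / 2 + 1e-6
```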

### Priority 5: Gradient Flow & Performance

- **test_deep_network_gradient_flow**: Gradients through all layer types (sketched below)
- **test_memory_efficiency**: Reasonable memory usage
- **test_training_performance**: Training speed meets expectations
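
For intuition, here is a self-contained NumPy sketch of a gradient-flow check: backpropagate a loss through a deep tanh stack and assert that every layer's gradient is finite and nonzero. The depth, width, and thresholds are assumptions for illustration only:

```python
# Hypothetical gradient-flow check (NumPy stand-in, not the TinyTorch API).
import numpy as np

def test_deep_gradient_flow_sketch():
    rng = np.random.default_rng(3)
    depth, width = 10, 16
    Ws = [rng.normal(0, 1 / np.sqrt(width), (width, width)) for _ in range(depth)]

    # Forward pass, caching every activation for the backward pass
    activations = [rng.normal(0, 1, (8, width))]
    for W in Ws:
        activations.append(np.tanh(activations[-1] @ W))

    # Backward pass: chain rule through tanh and each dense layer;
    # the loss is simply the sum of the final activations
    delta = np.ones_like(activations[-1])
    grads = [None] * depth
    for i in reversed(range(depth)):
        delta = delta * (1 - activations[i + 1] ** 2)  # tanh derivative
        grads[i] = activations[i].T @ delta
        delta = delta @ Ws[i].T

    for g in grads:
        assert np.all(np.isfinite(g))    # not exploding / NaN
        assert np.linalg.norm(g) > 1e-8  # gradient actually reached this layer
```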

## Running Tests

```bash
# Run all capstone tests
pytest tests/20_capstone/ -v

# Run a specific test class
pytest tests/20_capstone/test_capstone_core.py::TestCompleteMLPipeline -v
```

## Test Philosophy

Tests follow production ML workflow patterns:

1. **Data Creation** → Representative datasets
2. **Model Building** → Real architectures (MLP, CNN, Transformer)
3. **Training** → Actual convergence (loss decreases, accuracy improves)
4. **Evaluation** → Real metrics
5. **Optimization** → Production techniques (quantization, pruning)

## Success Criteria

For the capstone tests to pass, students must have:

1. Built all 19 modules correctly
2. Integrated the modules properly
3. Implemented autograd correctly (gradients flow everywhere)
4. Created working optimizers
5. Validated on real tasks (models actually learn)

## What This Tests That Unit Tests Don't

| Aspect | Unit Tests | Capstone Tests |
|--------|------------|----------------|
| Scope | Single module | All 19 modules together |
| Integration | Module isolation | Cross-module integration |
| Real workflows | Synthetic checks | Production ML pipelines |
| Learning | Correctness only | Models must converge |
| Deployment | Not tested | Quantization, pruning tested |