# 📖 TinyTorch Module Development Guide

**Complete methodology for creating educational modules with real-world ML engineering practices.**

## 🎯 Philosophy

**"Build → Use → Understand → Repeat"** with real data and immediate feedback.

Write complete, working implementations from which student exercise versions are generated automatically, while keeping the exported package code production quality.

## 🔑 Core Principles

### **Real Data, Real Systems**
- **Use production datasets**: No mock/fake data - students work with CIFAR-10, not synthetic data
- **Show progress feedback**: Downloads and training runs need visual progress indicators
- **Cache for efficiency**: Download once, use repeatedly
- **Real-world scale**: Use actual dataset sizes, not toy examples

### **Immediate Visual Feedback**
- **Visual confirmation**: Students see their code working (images, plots, results)
- **Development vs. Export separation**: Rich feedback in `_dev.py`, clean exports to package
- **Progress indicators**: Status messages, progress bars for long operations
- **Real-time validation**: Students can verify each step immediately

### **Educational Excellence**
- **Progressive complexity**: Easy → Medium → Hard with clear difficulty indicators
- **Comprehensive guidance**: TODO sections with approach, examples, hints, systems thinking
- **Real-world connections**: Connect every concept to production ML engineering
- **Immediate testing**: Test each component with real inputs as you build

## 🏗️ Development Workflow

### Step 1: Choose the Learning Pattern
- **Select engagement pattern**: Reflect, Analyze, or Optimize?
- **Use the Pattern Selection Guide** from [Pedagogical Principles](../pedagogy/pedagogical-principles.md):
  - **Build → Use → Reflect**: Early modules, design decisions, systems thinking
  - **Build → Use → Analyze**: Middle modules, technical depth, performance
  - **Build → Use → Optimize**: Advanced modules, iteration, production focus
- **Document your choice** with clear rationale

### Step 2: Plan the Learning Journey
- **Define learning objectives**: What should students implement vs. receive?
- **Choose real data**: What production dataset will they use?
- **Design progression**: How does complexity build through the module?
- **Map to production**: How does this connect to real ML systems?
- **Design pattern-specific activities**: Questions, exercises, or challenges

### Step 3: Write Complete Implementation
Create `modules/{module}/{module}_dev.py` with NBDev structure:

```python
# ---
# jupyter:
#   jupytext:
#     text_representation:
#       extension: .py
#       format_name: percent
#       format_version: '1.3'
#       jupytext_version: 1.17.1
# ---

# %% [markdown]
"""
# Module: {Title} - {Purpose}

## 🎯 Learning Pattern: Build → Use → [Pattern]

**Pattern Choice**: [Reflect/Analyze/Optimize]
**Rationale**: [Why this pattern fits the learning objectives]
**Key Activities**:
- [Pattern-specific activity 1]
- [Pattern-specific activity 2]
- [Pattern-specific activity 3]

## Learning Objectives
- ✅ Build {core_concept} from scratch
- ✅ Use it with real data ({dataset_name})
- ✅ [Engage] through {pattern_specific_activities}
- ✅ Connect to production ML systems

## What You'll Build
{description_of_what_students_build}
"""

# %%
#| default_exp core.{module}

import numpy as np
import matplotlib.pyplot as plt
from typing import Union, List, Optional

# %%
#| export
class MainClass:
    """
    {Description of the class}

    TODO: {What students need to implement}

    APPROACH:
    1. {Step 1 with specific guidance}
    2. {Step 2 with specific guidance}
    3. {Step 3 with specific guidance}

    EXAMPLE:
    Input: {concrete_example}
    Expected: {expected_output}

    HINTS:
    - {Helpful hint about approach}
    - {Systems thinking hint}
    - {Real-world connection}
    """
    def __init__(self, params):
        raise NotImplementedError("Student implementation required")

# %%
#| hide
#| export
class MainClass:
    """Complete implementation (hidden from students)."""
    def __init__(self, params):
        # Actual working implementation
        pass

# %% [markdown]
"""
## 🧪 Test Your Implementation
"""

# %%
# Test with real data
# (`real_data_example` and `real_data` are placeholders for the module's dataset)
try:
    # Test student implementation
    result = MainClass(real_data_example)
    print(f"✅ Success: {result}")
except NotImplementedError:
    print("⚠️ Implement the class above first!")

# Visual feedback (development only - not exported)
def show_results(data):
    """Show visual confirmation of working code."""
    plt.figure(figsize=(10, 6))
    # Visualization code
    plt.show()

# `_should_show_plots()` is not defined in this template; it is assumed to be
# a helper the module provides to skip plotting outside interactive sessions.
if _should_show_plots():
    show_results(real_data)
```

### Step 4: Create Tests with Real Data
Create `modules/{module}/tests/test_{module}.py`:

```python
import pytest
import numpy as np
from {module}_dev import MainClass

# Placeholders to fill in per module: `load_real_dataset`,
# `expected_real_shape`, and `expected_real_dtype`.

def test_with_real_data():
    """Test with actual production data."""
    # Use real datasets, not mocks
    real_data = load_real_dataset()

    instance = MainClass(real_data)
    result = instance.process()

    # Test real properties
    assert result.shape == expected_real_shape
    assert result.dtype == expected_real_dtype
    # Test with actual data characteristics
```

### Step 5: Convert and Export
```bash
# Convert to notebook (using Jupytext)
tito module notebooks --module {module}

# Export to package
python bin/tito.py sync --module {module}

# Test everything
python bin/tito.py test --module {module}
```

## 🏷️ NBDev Directives

### Core Directives
- `#| default_exp core.{module}` - Sets export destination
- `#| export` - Marks code for export to package
- `#| hide` + `#| export` - Hidden implementation (instructor solution)
- `# %% [markdown]` - Markdown cells for explanations (Jupytext cell marker)
- `# %%` - Code cells (Jupytext cell marker)

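
To make the roles of the directives listed above concrete, here is a minimal sketch that splits a percent-format `_dev.py` source into cells and labels each one by its directives. It is an illustration only, not how nbdev or Jupytext parse files; the function name `classify_cells` and the label strings are hypothetical.

```python
from typing import List, Tuple


def classify_cells(source: str) -> List[Tuple[str, str]]:
    """Split percent-format source into cells and label each one.

    Labels (hypothetical, for illustration only):
      "markdown"        - cell opened with `# %% [markdown]`
      "export_target"   - code cell carrying `#| default_exp ...`
      "hidden_solution" - code cell carrying both `#| hide` and `#| export`
      "exported"        - code cell carrying `#| export` only
      "dev_only"        - any other code cell (stays out of the package)
    Returns a list of (label, cell_text) pairs in file order.
    """
    cells: List[Tuple[str, List[str]]] = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("# %%"):
            # A cell marker starts a new cell; anything before the first
            # marker (such as the Jupytext header) is ignored.
            kind = "markdown" if "[markdown]" in stripped else "code"
            cells.append((kind, []))
        elif cells:
            cells[-1][1].append(line)

    labelled: List[Tuple[str, str]] = []
    for kind, lines in cells:
        directives = {ln.strip() for ln in lines if ln.strip().startswith("#|")}
        if kind == "markdown":
            label = "markdown"
        elif any(d.startswith("#| default_exp") for d in directives):
            label = "export_target"
        elif "#| hide" in directives and "#| export" in directives:
            label = "hidden_solution"
        elif "#| export" in directives:
            label = "exported"
        else:
            label = "dev_only"
        labelled.append((label, "\n".join(lines)))
    return labelled


# Usage: a tiny stand-in for a module file.
demo = '''# %% [markdown]
"""
## Title
"""

# %%
#| default_exp core.tensor
import numpy as np

# %%
#| export
class Tensor:
    pass

# %%
#| hide
#| export
class Tensor:
    pass

# %%
print("dev only")
'''
labels = [label for label, _ in classify_cells(demo)]
print(labels)
```

A check like this can be handy before export, for example to confirm that every student-facing `#| export` stub has a matching `#| hide` + `#| export` solution cell.
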
### Educational Structure
- **Concept explanation** → **Implementation guidance** → **Hidden solution** → **Testing** → **Visual feedback**

## 🎨 Difficulty System
- **🟢 Easy (5-10 min)**: Constructor, properties, basic operations
- **🟡 Medium (10-20 min)**: Conditional logic, data processing, error handling
- **🔴 Hard (20+ min)**: Complex algorithms, system integration, optimization

## 📋 Implementation Guidelines

### Students Implement (Core Learning)
- **Main functionality**: Core algorithms and data structures
- **Data processing**: Loading, preprocessing, batching
- **Error handling**: Input validation, type checking
- **Basic operations**: Mathematical operations, transformations

### Students Receive (Focus on Learning Goals)
- **Complex setup**: Download progress bars, caching systems
- **Utility functions**: Visualization, debugging helpers
- **Advanced features**: Optimization, GPU support
- **Infrastructure**: Test frameworks, import management

### TODO Guidance Quality
```python
"""
TODO: {Clear, specific task}

APPROACH:
1. {Concrete first step}
2. {Concrete second step}
3. {Concrete third step}

EXAMPLE:
Input: {actual_data_example}
Expected: {concrete_expected_output}

HINTS:
- {Helpful guidance without giving code}
- {Systems thinking consideration}
- {Real-world connection}

SYSTEMS THINKING:
- {Performance consideration}
- {Scalability question}
- {User experience aspect}
"""
```

## 🗂️ Module Structure

```
modules/{module}/
├── {module}_dev.py          # 🔧 Complete implementation
├── {module}_dev.ipynb       # 📓 Generated notebook
├── tests/
│   └── test_{module}.py     # 🧪 Real data tests
├── README.md                # 📖 Module guide
└── data/                    # 📊 Cached datasets (if needed)
```

## ✅ Quality Standards

### Before Release
- [ ] Uses real data, not synthetic/mock data
- [ ] Includes progress feedback for long operations
- [ ] Visual feedback functions (development only)
- [ ] Tests use actual datasets at realistic scales
- [ ] TODO guidance includes systems thinking
- [ ] Clean separation between development and exports
- [ ] Follows "Build → Use → Understand" progression

### Integration Requirements
- [ ] Exports correctly to `tinytorch.core.{module}`
- [ ] No circular dependencies
- [ ] Consistent with existing module patterns
- [ ] Compatible with TinyTorch CLI tools

## 💡 Best Practices

### Development Process
1. **Start with real data**: Choose production dataset first
2. **Write complete implementation**: Get it working before adding markers
3. **Add rich feedback**: Visual confirmation, progress indicators
4. **Test the student path**: Follow your own TODO guidance
5. **Optimize user experience**: Consider performance, caching, error messages

### Systems Thinking
- **Performance**: How does this scale with larger datasets?
- **Caching**: How do we avoid repeated expensive operations?
- **User Experience**: How do students know the code is working?
- **Production Relevance**: How does this connect to real ML systems?

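
The caching question above ("How do we avoid repeated expensive operations?") can be sketched as a small load-or-build helper. This is a minimal illustration under stated assumptions, not TinyTorch's actual caching code: the name `load_or_build`, the use of `pickle`, and the `demo.pkl` path are all hypothetical, and the "expensive" step is a stand-in for a real download or preprocessing pass.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Callable, List


def load_or_build(cache_file: Path, build: Callable[[], Any]) -> Any:
    """Return the cached object if present; otherwise build, cache, and return it."""
    if cache_file.exists():
        print(f"Using cached copy: {cache_file.name}")
        with cache_file.open("rb") as fh:
            return pickle.load(fh)

    print(f"Cache miss, building {cache_file.name} ...")
    result = build()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    # Write under a temporary name first so an interrupted run never leaves
    # a half-written cache file that a later run would try to load.
    tmp_file = cache_file.with_name(cache_file.name + ".tmp")
    with tmp_file.open("wb") as fh:
        pickle.dump(result, fh)
    tmp_file.replace(cache_file)
    return result


# Usage: the expensive step runs once; the second call reads from disk.
calls: List[int] = []


def expensive_step() -> List[int]:
    calls.append(1)  # stand-in for a download or preprocessing pass
    return list(range(5))


with tempfile.TemporaryDirectory() as tmp_dir:
    cache = Path(tmp_dir) / "data" / "demo.pkl"
    first = load_or_build(cache, expensive_step)
    second = load_or_build(cache, expensive_step)
```

The status messages double as the progress feedback the guide asks for: students can see whether a run hit the cache or rebuilt the data.
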
### Educational Design
- **Immediate gratification**: Students see results quickly
- **Progressive complexity**: Build understanding step by step
- **Real-world connections**: Connect every concept to production
- **Visual confirmation**: Students see their code working

## 🔄 Continuous Improvement

After teaching with a module:
1. **Monitor student experience**: Where do they get stuck?
2. **Improve guidance**: Better TODO instructions, clearer hints
3. **Enhance feedback**: More visual confirmation, better progress indicators
4. **Optimize performance**: Faster data loading, better caching
5. **Update documentation**: Share learnings with other developers

## 🎯 Success Metrics

**Students should be able to:**
- Explain what they built in simple terms
- Modify code to solve related problems
- Connect module concepts to real ML systems
- Debug issues by understanding the system

**Modules should achieve:**
- High student engagement and completion rates
- Smooth progression to next modules
- Real-world relevance and production quality
- Consistent patterns across the curriculum

---

**Remember**: We're teaching ML systems engineering, not just algorithms. Every module should reflect real-world practices and challenges while maintaining the "Build → Use → Understand" educational cycle.