📚 Break Complex Modules into Digestible Sub-Components While Maintaining Module Unity #3

Open
opened 2025-11-02 00:09:34 -05:00 by GiteaMirror · 6 comments

Originally created by @profvjreddi on GitHub (Jul 18, 2025).

📚 Educational Problem

Several TinyTorch modules have grown to roughly 900 to 1,700 lines, making them difficult for students to navigate, understand, and debug. While the modules work well as cohesive educational units, the individual development files can be overwhelming.

Current Complex Modules:

  • 02_tensor/tensor_dev.py: 1,578 lines
  • 15_mlops/mlops_dev.py: 1,667 lines
  • 13_kernels/kernels_dev.py: 1,381 lines
  • 05_dense/dense_dev.py: 907 lines

🎯 Proposed Solution

Break each complex module into smaller, focused sub-components while maintaining the module structure and educational flow. Think "bite-sized pieces that still work as a whole."

Example: Breaking Down 02_tensor Module

Current Structure:

modules/source/02_tensor/
├── tensor_dev.py          # 1,578 lines - everything in one file
├── module.yaml
└── README.md

Proposed Structure:

modules/source/02_tensor/
├── parts/
│   ├── 01_foundations.py     # Mathematical foundations & tensor theory
│   ├── 02_creation.py        # Tensor creation & initialization
│   ├── 03_operations.py      # Core arithmetic operations  
│   ├── 04_broadcasting.py    # Broadcasting & shape manipulation
│   ├── 05_advanced.py        # Advanced operations & edge cases
│   └── 06_integration.py     # Integration tests & complete examples
├── tensor_dev.py             # Main orchestrator that imports all parts
├── module.yaml
└── README.md

Example: Breaking Down 15_mlops Module

Current Structure:

modules/source/15_mlops/
├── mlops_dev.py          # 1,667 lines - entire MLOps pipeline
├── module.yaml  
└── README.md

Proposed Structure:

modules/source/15_mlops/
├── parts/
│   ├── 01_monitoring.py      # Model and data drift detection
│   ├── 02_deployment.py      # Model serving & API endpoints
│   ├── 03_pipeline.py        # Continuous learning workflows
│   ├── 04_registry.py        # Model versioning & registry
│   ├── 05_alerting.py        # Alert systems & notifications
│   └── 06_integration.py     # Full MLOps pipeline integration
├── mlops_dev.py              # Main orchestrator 
├── module.yaml
└── README.md

🏗️ Implementation Strategy

1. Maintain Module Unity

  • Keep the main {module}_dev.py file as the primary entry point
  • Use imports to bring all sub-components together
  • Ensure the module still "feels like one cohesive lesson"

2. Logical Decomposition

  • Break modules by conceptual boundaries, not arbitrary line counts
  • Each sub-component should be self-contained but integrate seamlessly
  • Maintain the Build → Use → Optimize educational flow across parts

3. Educational Benefits

  • Easier navigation: Students can focus on specific concepts
  • Better debugging: Smaller files are easier to troubleshoot
  • Clearer progression: Natural learning checkpoints within modules
  • Maintained cohesion: Everything still works together as intended

4. Technical Implementation

# Main module file (e.g., tensor_dev.py)
"""
TinyTorch Tensor Module - Complete Implementation
Students work through parts/ directory, then see integration here.
"""

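# Import all sub-components.
# Note: names such as 01_foundations are not valid Python identifiers, so these
# imports assume each part is importable under its unprefixed name (otherwise
# the numbered files have to be loaded with importlib).
from .parts.foundations import *
from .parts.creation import *
from .parts.operations import *
from .parts.broadcasting import *
from .parts.advanced import *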

# Integration and final examples
from .parts.integration import run_complete_tensor_demo

# Expose the complete Tensor class
__all__ = ['Tensor', 'run_complete_tensor_demo']
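The proposed layout names the parts `01_foundations.py`, `02_creation.py`, and so on, and a module name that starts with a digit cannot appear in an `import` statement. One possible way for the orchestrator to load the numbered files anyway is sketched below. `load_parts` is a hypothetical helper, not part of TinyTorch or `tito`; it assumes each part is a plain `.py` file named `NN_name.py`, and it loads each part independently, so a part that needs names from an earlier part would still have to import them itself.

```python
import importlib.util
from pathlib import Path


def load_parts(parts_dir):
    """Run every NN_name.py file in parts_dir in numeric order and
    return one merged dict of the public names they define."""
    namespace = {}
    for path in sorted(Path(parts_dir).glob("[0-9][0-9]_*.py")):
        # Drop the "NN_" prefix so the module gets a valid identifier as its name.
        name = path.stem.split("_", 1)[1]
        spec = importlib.util.spec_from_file_location(f"parts_{name}", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # Respect __all__ when a part defines it; otherwise take non-underscore names.
        public = getattr(module, "__all__", None)
        if public is None:
            public = [n for n in vars(module) if not n.startswith("_")]
        for n in public:
            namespace[n] = getattr(module, n)
    return namespace
```

In `tensor_dev.py` this could be used as `globals().update(load_parts(Path(__file__).parent / "parts"))`, which has roughly the effect of the star imports above. Renaming the files without numeric prefixes and keeping ordinary imports would be the simpler alternative if the ordering does not need to be visible in the file names.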

🎓 Educational Advantages

  1. Bite-sized Learning: Students can master one concept at a time
  2. Natural Progression: Clear path through complex topics
  3. Better Testing: Each part can have focused inline tests
  4. Easier Review: Instructors can review specific components
  5. Maintained Flow: Module still tells one coherent story

🔧 Implementation Notes

  • This is an architectural improvement, not a feature addition
  • Maintains all existing functionality and educational goals
  • Backward compatible: Current workflows continue to work
  • Each module can be migrated independently
  • Priority should be given to largest/most complex modules first

📋 Success Criteria

  • No single sub-component exceeds ~300 lines
  • Each part has a clear educational purpose
  • Main module file remains a functional entry point
  • All inline tests continue to pass
  • Students report improved navigation and understanding
  • Module still "feels like one lesson" despite internal structure
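The first criterion is easy to check mechanically. The sketch below is one way to do it, assuming the `parts/` layout proposed above; `oversized_parts` and `MAX_LINES` are hypothetical names, and the 300-line limit is taken from the "~300 lines" target rather than from any existing TinyTorch setting.

```python
from pathlib import Path

MAX_LINES = 300  # the "~300 lines" target from the success criteria


def oversized_parts(module_dir, max_lines=MAX_LINES):
    """Return (file name, line count) for each parts/*.py file above max_lines."""
    results = []
    for path in sorted(Path(module_dir, "parts").glob("*.py")):
        with path.open(encoding="utf-8") as handle:
            count = sum(1 for _ in handle)
        if count > max_lines:
            results.append((path.name, count))
    return results
```

Calling `oversized_parts("modules/source/02_tensor")` from the repository root would list any part that still needs splitting; an empty list means the criterion is met for that module.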

🎯 Priority Modules for Migration

  1. 02_tensor (1,578 lines) - Foundation module, affects all others
  2. 15_mlops (1,667 lines) - Complex capstone module
  3. 13_kernels (1,381 lines) - Performance engineering module
  4. 11_training (estimated 1,000+ lines) - Core training pipeline

This enhancement will make TinyTorch more student-friendly while maintaining its educational integrity and systematic learning progression.

Labels: enhancement, education, architecture, modules

GiteaMirror added the enhancement, documentation, good first issue labels 2025-11-02 00:09:34 -05:00

@HasarinduPerera commented on GitHub (Sep 19, 2025):

Do you still need help with this? I'd love to collaborate on this project in some way.


@profvjreddi commented on GitHub (Sep 19, 2025):

Hey @HasarinduPerera, Thanks so much for dropping me a note. I've been working on this feverishly over the past couple of days, so let me get that completed to a stable state, and I will release. Been cranking on a version of a tiny GPT engine. But it's not really working yet, so I need to close it off a bit. So give me a few days and I will drop you a note. Sound good?


@Zappandy commented on GitHub (Oct 12, 2025):

@profvjreddi first of all thank you for your work! I recently started going over the modules and was refactoring the tensor module myself based on the structure you provided. I know you were already working on this, but let me know if you'd be interested in either contributions to the tensor module or otherwise.

I do wanna mention that to properly contribute to module refactoring, I also went over tito/commands/export.py, so when I make a contribution I can export the module and test it. I noticed the common mappings in the dev branch are still pointing at the paths defined in main instead of the ones in dev (line ~306):

        # Common mappings
        source_mappings = {
            ('core', 'tensor'): 'modules/source/02_tensor/tensor_dev.py',
            ('core', 'activations'): 'modules/source/03_activations/activations_dev.py', 
            ('core', 'layers'): 'modules/source/04_layers/layers_dev.py',
            ('core', 'dense'): 'modules/source/05_dense/dense_dev.py',
            ('core', 'spatial'): 'modules/source/06_spatial/spatial_dev.py',
            ('core', 'attention'): 'modules/source/07_attention/attention_dev.py',
            ('core', 'dataloader'): 'modules/source/08_dataloader/dataloader_dev.py',
            ('core', 'autograd'): 'modules/source/09_autograd/autograd_dev.py',
            ('core', 'optimizers'): 'modules/source/10_optimizers/optimizers_dev.py',
            ('core', 'training'): 'modules/source/11_training/training_dev.py',
            ('core', 'compression'): 'modules/source/12_compression/compression_dev.py',
            ('core', 'kernels'): 'modules/source/13_kernels/kernels_dev.py',
            ('core', 'benchmarking'): 'modules/source/14_benchmarking/benchmarking_dev.py',
            ('core', 'networks'): 'modules/source/16_tinygpt/tinygpt_dev.ipynb',
        }

Mind you, MODULE_TO_CHECKPOINT is properly defined; this is just something else that stood out to me. I'm wondering whether the mappings should be updated in the dev branch too, so that as we make contributions we can properly test them while adhering to tito's existing functionality.
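A quick way to spot stale entries like these is to check every mapped path against the working tree. The following is a minimal sketch, not something that exists in `tito`: `missing_sources` is a hypothetical helper, and it assumes the dictionary has the same shape as `source_mappings` above and that paths are relative to the repository root.

```python
from pathlib import Path


def missing_sources(source_mappings, repo_root="."):
    """Return the mapping entries whose source file does not exist under repo_root."""
    root = Path(repo_root)
    return {
        key: rel_path
        for key, rel_path in source_mappings.items()
        if not (root / rel_path).is_file()
    }
```

Running it on the dev branch with the dictionary copied from `tito/commands/export.py` would return exactly the entries that still point at paths from main.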

Side note: with regard to refactoring complex modules, I reckon the setup module will be removed in the long term, correct?


@profvjreddi commented on GitHub (Oct 12, 2025):

Hi there! This project is still a work in progress, and I have quite a few
local commits that haven’t been pushed yet. Thanks so much for trying it
out! Give me another month or two, and I should have the first Alpha
release ready. But in the meantime totally welcome feedback and thoughts!
Thanks 😊

Vijay Janapa Reddi, Ph.D.
Gordon McKay Professor of Electrical Engineering
John A. Paulson School of Engineering and Applied Sciences
Harvard University

📍 150 Western Ave, Room 5.305,
Boston, MA 02134, USA (Google Maps
https://maps.app.goo.gl/ixd9WAgDbCGBwymC6)

🌐 Homepage https://vijay.seas.harvard.edu | 📚 Research
https://profvjreddi.github.io/homepage/research | 📕 Teaching
https://mlsysbook.org/ | 💻 GitHub https://github.com/profvjreddi | 📅
Calendar https://profvjreddi.github.io/homepage/contact | 🧑‍💼 Assistant
https://profvjreddi.github.io/homepage/contact#contact-methods



@Zappandy commented on GitHub (Oct 18, 2025):

Hi @profvjreddi, thank you so much for your reply! In the meantime, I'll keep going through the modules of the current publicly available dev version and write down things if they have not been addressed by the alpha release!

Feel free to block the MR I made as well, because I don't think multiple virtual env management should be an option until TinyTorch is in alpha.


@profvjreddi commented on GitHub (Oct 20, 2025):

👍 I just pushed updates to the website which should reflect more of what I am putting together.

Reference: github-starred/TinyTorch#3