TinyGPT Core Components
This directory contains the core components for TinyGPT, an educational implementation of a GPT-style language model built on TinyTorch foundations.
Components
tokenizer.py - Character-Level Tokenization
- CharTokenizer: Character-level tokenizer for text processing
- Key Features:
- Simple character-to-token mapping
- Vocabulary size limiting for computational efficiency
- Special tokens support (<UNK>, <PAD>)
- Batch encoding with padding/truncation
- Comprehensive text analysis capabilities
Usage:
from core.tokenizer import CharTokenizer
tokenizer = CharTokenizer(vocab_size=100)
tokenizer.fit(training_text)
tokens = tokenizer.encode("Hello, world!")
text = tokenizer.decode(tokens)
training.py - Language Model Training Infrastructure
- LanguageModelTrainer: Complete training pipeline for language models
- LanguageModelLoss: Cross-entropy loss with next-token prediction
- LanguageModelAccuracy: Accuracy metrics for language modeling
Key Features:
- Text-to-sequence data preparation
- Next-token prediction training
- Autoregressive text generation
- Training/validation splitting
- Comprehensive evaluation metrics
Usage:
from core.training import LanguageModelTrainer
from core.models import TinyGPT
model = TinyGPT(vocab_size=50, d_model=128)
trainer = LanguageModelTrainer(model, tokenizer)
history = trainer.fit(text, epochs=5, seq_length=64)
generated = trainer.generate_text("Hello", max_length=50)
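The fit call above hides the data preparation step: language model training pairs each position in a sequence with the token that follows it. A minimal NumPy sketch of that next-token windowing (standalone, not the exact LanguageModelTrainer internals):
import numpy as np

def make_next_token_batches(token_ids, seq_length):
    # Slide a window over the token stream: inputs are tokens [i, i + seq_length),
    # targets are the same window shifted one position to the right.
    inputs, targets = [], []
    for i in range(0, len(token_ids) - seq_length):
        inputs.append(token_ids[i:i + seq_length])
        targets.append(token_ids[i + 1:i + seq_length + 1])
    return np.array(inputs), np.array(targets)

# Example with a 5-token stream and seq_length=3
x, y = make_next_token_batches([7, 4, 11, 11, 14], seq_length=3)
# x[0] = [7, 4, 11] predicts y[0] = [4, 11, 11]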
attention.py - Attention Mechanisms
- MultiHeadAttention: Multi-head self-attention implementation
- SelfAttention: Simplified single-head attention
- PositionalEncoding: Sinusoidal positional embeddings
- create_causal_mask: Causal masking for autoregressive models
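The causal mask is what keeps generation autoregressive: position t may attend only to positions at or before t. A minimal NumPy sketch of scaled dot-product attention with such a mask (independent of the class signatures above, which may differ):
import numpy as np

def causal_mask(seq_len):
    # True where attention is allowed: lower triangle, including the diagonal.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (seq_len, d_k) for a single head.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores = np.where(causal_mask(q.shape[0]), scores, -1e9)   # mask out future positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)              # row-wise softmax
    return weights @ v

q = k = v = np.random.randn(4, 8)
out = scaled_dot_product_attention(q, k, v)   # (4, 8); row 0 attends only to position 0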
models.py - Transformer Models
- TinyGPT: Complete GPT-style transformer model
- TransformerBlock: Individual transformer layer
- LayerNorm: Layer normalization implementation
- SimpleLM: Simplified language model for comparison
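LayerNorm normalizes each token's feature vector to zero mean and unit variance before applying a learned scale and shift; that is the behavior the LayerNorm component above provides inside each transformer block. A minimal NumPy sketch of the computation (the TinyGPT class interface may differ):
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    # Normalize over the feature dimension (last axis), one token at a time.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.random.randn(4, 128)                     # (seq_len, d_model)
y = layer_norm(x, np.ones(128), np.zeros(128))  # each row now has mean ~0 and std ~1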
Integration with TinyTorch
TinyGPT is designed to maximize reuse of existing TinyTorch components:
Reused Components (70%+):
- Dense layers for all linear transformations
- Activation functions (ReLU, Softmax)
- Loss functions (CrossEntropyLoss)
- Optimizers (Adam)
- Training infrastructure patterns
- Tensor operations
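One concrete example of this reuse: the attention layer's query, key, and value projections are ordinary dense (linear) transformations, the same operation that powers the vision models. A standalone NumPy sketch of the idea (illustrative only; the actual TinyTorch Dense API may differ):
import numpy as np

d_model = 64
rng = np.random.default_rng(0)

# Each projection is just a weight matrix, exactly what a Dense layer provides.
w_q, w_k, w_v = (rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(3))

x = rng.standard_normal((10, d_model))  # (seq_len, d_model) token embeddings
q, k, v = x @ w_q, x @ w_k, x @ w_v     # three Dense-style projections of the same input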
New Components for NLP:
- Multi-head attention mechanisms
- Positional encoding
- Layer normalization
- Causal masking
- Text tokenization
- Autoregressive generation
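Of these, positional encoding is the easiest to see in isolation: attention itself is order-agnostic, so each position gets a fixed sinusoidal vector added to its embedding. A minimal NumPy sketch of the standard sinusoidal formulation (the PositionalEncoding class above may organize this differently):
import numpy as np

def sinusoidal_encoding(seq_len, d_model):
    # pe[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    # pe[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    pos = np.arange(seq_len)[:, None]
    i = np.arange(0, d_model, 2)[None, :]
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = sinusoidal_encoding(seq_len=64, d_model=128)  # added to token embeddings before the first block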
Educational Benefits
- Character-Level Simplicity: Tokenization that is easy to understand, without complex subword algorithms
- Transparent Architecture: All components implemented with clear educational comments
- Component Reuse: Demonstrates how ML foundations generalize across domains
- Progressive Complexity: From simple tokenizer to full transformer model
- Mock Implementations: Works with or without TinyTorch for standalone learning
Example: Shakespeare Demo
The examples/shakespeare_demo.py script demonstrates the complete pipeline:
- Character tokenization of Shakespeare text
- TinyGPT model creation and training
- Text generation at different temperatures
- Performance analysis and comparison with vision models
This shows how the same mathematical foundations (linear layers, attention, optimization) power both computer vision and natural language processing.
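Temperature controls how the demo samples from the model's output distribution: logits are divided by the temperature before the softmax, so low values make generation greedier and high values make it more varied. A minimal NumPy sketch of temperature sampling over a vector of logits (the trainer's generate_text may implement this differently):
import numpy as np

def sample_with_temperature(logits, temperature=1.0):
    # Lower temperature sharpens the distribution; higher temperature flattens it.
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return np.random.choice(len(logits), p=probs)

logits = np.array([2.0, 1.0, 0.2])
next_token = sample_with_temperature(logits, temperature=0.8)  # index of the sampled token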
File Dependencies
core/
├── tokenizer.py # Standalone, only requires numpy
├── attention.py # Uses TinyTorch Tensor and Dense (with mocks)
├── models.py # Uses attention.py and TinyTorch layers
├── training.py # Uses tokenizer.py and TinyTorch components
└── README.md # This file
Design Philosophy
TinyGPT follows the same educational philosophy as TinyTorch:
- Build → Use → Understand: Implement each component before using it
- Educational Clarity: Clear code with extensive documentation
- Minimal Dependencies: NumPy + educational implementations
- Real-World Relevance: Patterns used in production frameworks
- Component Modularity: Each piece can be understood independently
The goal is to demystify how language models work while showing how they share foundational concepts with computer vision models.