mirror of
https://github.com/MLSysBook/TinyTorch.git
synced 2026-03-12 00:33:34 -05:00
Restructure .claude directory with comprehensive guidelines
- Created organized guidelines/ directory with focused documentation:
  - DESIGN_PHILOSOPHY.md: KISS principle and simplicity focus
  - MODULE_DEVELOPMENT.md: how to build modules with a systems focus
  - TESTING_STANDARDS.md: immediate testing patterns
  - PERFORMANCE_CLAIMS.md: honest reporting based on CIFAR-10 lessons
  - AGENT_COORDINATION.md: how agents work together effectively
  - GIT_WORKFLOW.md: moved from root; branching standards
- Added .claude/README.md as central navigation
- Updated CLAUDE.md to reference guideline files
- Created CLAUDE_SIMPLE.md as a streamlined entry point

All learnings from recent work captured in the appropriate guidelines.
.claude/README.md (new file, 138 lines)
@@ -0,0 +1,138 @@
# TinyTorch .claude Directory Structure

This directory contains all guidelines, standards, and agent definitions for the TinyTorch project.

## 📁 Directory Structure

```
.claude/
├── README.md                        # This file
├── guidelines/                      # Development standards and principles
│   ├── DESIGN_PHILOSOPHY.md         # KISS principle and simplicity guidelines
│   ├── GIT_WORKFLOW.md              # Git branching and commit standards
│   ├── MODULE_DEVELOPMENT.md        # How to develop TinyTorch modules
│   ├── TESTING_STANDARDS.md         # Testing patterns and requirements
│   ├── PERFORMANCE_CLAIMS.md        # How to make honest performance claims
│   └── AGENT_COORDINATION.md        # How AI agents work together
├── agents/                          # AI agent definitions
│   ├── technical-program-manager.md
│   ├── education-architect.md
│   ├── module-developer.md
│   ├── package-manager.md
│   ├── quality-assurance.md
│   ├── documentation-publisher.md
│   ├── workflow-coordinator.md
│   ├── devops-engineer.md
│   └── tito-cli-developer.md
└── [legacy files to review]
```

## 🎯 Quick Start for New Development

1. **Read Core Principles First**
   - `guidelines/DESIGN_PHILOSOPHY.md` - Understand the KISS principle
   - `guidelines/GIT_WORKFLOW.md` - Learn branching requirements

2. **For Module Development**
   - `guidelines/MODULE_DEVELOPMENT.md` - Module structure and patterns
   - `guidelines/TESTING_STANDARDS.md` - How to write tests
   - `guidelines/PERFORMANCE_CLAIMS.md` - How to report results

3. **For Agent Coordination**
   - `guidelines/AGENT_COORDINATION.md` - How agents work together
   - Start with the Technical Program Manager (TPM) for all requests

## 📋 Key Principles Summary

### 1. Keep It Simple, Stupid (KISS)
- One file, one purpose
- Clear over clever
- Verified over theoretical
- Direct over abstract

### 2. Git Workflow
- ALWAYS work on feature branches
- NEVER commit directly to main/dev
- Test before committing
- No automated attribution in commits

### 3. Module Development
- Edit .py files only (never .ipynb)
- Test immediately after implementation
- Include systems analysis (memory, performance)
- Follow the exact structure pattern

### 4. Testing Standards
- Test immediately, not at the end
- Simple assertions over complex frameworks
- Tests should educate, not just verify
- Always compare against a baseline

### 5. Performance Claims
- Only claim what you've measured
- Include all relevant metrics
- Report failures honestly
- Reproducibility is key

### 6. Agent Coordination
- The TPM is the primary interface
- Sequential workflow with clear handoffs
- QA testing is MANDATORY
- Package integration is MANDATORY

## 🚀 Common Workflows

### Starting New Module Development
```
1. Create a feature branch
2. Request TPM agent assistance
3. Follow the MODULE_DEVELOPMENT.md structure
4. Test with TESTING_STANDARDS.md patterns
5. Verify performance per PERFORMANCE_CLAIMS.md
6. Merge following GIT_WORKFLOW.md
```

### Making Performance Claims
```
1. Run baseline measurements
2. Run actual measurements
3. Calculate real improvements
4. Document with all metrics
5. No unverified claims
```

### Working with Agents
```
1. Always start with the TPM agent
2. Let the TPM coordinate other agents
3. Wait for QA approval before proceeding
4. Wait for Package Manager integration
5. Only then commit
```

## 📝 Important Notes

- **Virtual Environment**: Always activate .venv before development
- **Honesty**: Report actual results, not aspirations
- **Simplicity**: When in doubt, choose the simpler option
- **Education First**: We're teaching, not impressing

## 🔗 Quick Links

- Main Instructions: `/CLAUDE.md`
- Module Source: `/modules/source/`
- Examples: `/examples/`
- Tests: `/tests/`

## 📌 Remember

> "If students can't understand it, we've failed."

Every decision should be filtered through:
1. Is it simple?
2. Is it honest?
3. Is it educational?
4. Is it verified?

If any answer is "no", reconsider.
.claude/guidelines/AGENT_COORDINATION.md (new file, 204 lines)
@@ -0,0 +1,204 @@
# TinyTorch Agent Coordination Guidelines

## 🎯 Core Principle

**Agents work in sequence with clear handoffs, not in isolation.**

## 🤖 The Agent Team

### Primary Interface: Technical Program Manager (TPM)

The TPM is your SINGLE point of communication for all development.

```
User Request → TPM → Coordinates Agents → Reports Back
```

**The TPM knows when to invoke:**
- Education Architect - learning design
- Module Developer - implementation
- Package Manager - integration
- Quality Assurance - testing
- Documentation Publisher - content
- Workflow Coordinator - process
- DevOps Engineer - infrastructure
- Tito CLI Developer - CLI features

## 📋 Standard Development Workflow

### The Sequential Pattern

**For EVERY module development:**

```
1. Planning (Workflow Coordinator + Education Architect)
        ↓
2. Implementation (Module Developer)
        ↓
3. Testing (Quality Assurance) ← MANDATORY
        ↓
4. Integration (Package Manager) ← MANDATORY
        ↓
5. Documentation (Documentation Publisher)
        ↓
6. Review (Workflow Coordinator)
```

### Critical Handoff Points

**Module Developer → QA Agent**
```
# Module Developer completes implementation:
"Implementation complete. Ready for QA testing.
 Files modified: 02_tensor_dev.py
 Key changes: Added reshape operation with broadcasting"

# QA MUST test before proceeding.
```

**QA Agent → Package Manager**
```
# QA completes testing:
"All tests passed.
 - Module imports correctly
 - All functions work as expected
 - Performance benchmarks met
 Ready for package integration"

# Package Manager MUST verify integration.
```

## 🚫 Blocking Rules

### QA Agent Can Block Progress

**If tests fail, STOP everything:**
- No commits allowed
- No integration permitted
- Must fix and re-test

### Package Manager Can Block Release

**Integration has failed if:**
- The module doesn't export correctly
- It breaks other modules
- The package won't build

## 📝 Agent Communication Protocol

### Structured Handoffs

Every handoff must include:
1. **What was completed**
2. **What needs to be done next**
3. **Any issues found**
4. **Test results (if applicable)**
5. **Recommendations**

**Example:**
```
From: Module Developer
To: QA Agent

Completed:
- Implemented attention mechanism in 07_attention_dev.py
- Added scaled dot-product attention
- Included positional encoding

Needs Testing:
- Attention score computation
- Mask application
- Memory usage with large sequences

Known Issues:
- Performance degrades with sequences >1000 tokens

Recommendations:
- Focus testing on edge cases with padding
```
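
Nothing in the repo requires handoffs to be machine-readable, but the five required fields above map naturally onto a small data structure. A minimal sketch — the `Handoff` class and all of its field names are hypothetical illustrations, not part of TinyTorch:

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """Structured handoff message between agents (hypothetical helper)."""
    sender: str
    receiver: str
    completed: list        # what was done
    next_steps: list       # what the receiving agent should do
    issues: list = field(default_factory=list)
    test_results: str = ""
    recommendations: list = field(default_factory=list)

    def is_complete(self) -> bool:
        # A handoff is actionable only if it says what was done AND what comes next.
        return bool(self.completed) and bool(self.next_steps)


msg = Handoff(
    sender="Module Developer",
    receiver="QA Agent",
    completed=["Implemented attention mechanism in 07_attention_dev.py"],
    next_steps=["Test attention score computation", "Test mask application"],
    issues=["Performance degrades with sequences >1000 tokens"],
)
print(msg.is_complete())  # -> True; a bare "it's done" with no next steps would fail this check
```

The check encodes the "Poor Communication" failure mode below: a handoff with an empty `completed` or `next_steps` list is the structured equivalent of "it's done" with no details.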

## 🔄 Parallel vs Sequential Work

### Can Work in Parallel

✅ Different modules by different developers
✅ Documentation while code is being tested
✅ Planning next modules while current ones build

### Must Be Sequential

❌ Implementation → Testing (MUST test after implementation)
❌ Testing → Integration (MUST pass tests first)
❌ Integration → Commit (MUST integrate successfully)

## 🎯 The Checkpoint Success Story

**How agents successfully implemented the 16-checkpoint system:**

1. **Education Architect** designed the capability progression
2. **Workflow Coordinator** orchestrated the implementation
3. **Module Developer** built the checkpoint tests + CLI
4. **QA Agent** validated that all 16 checkpoints work
5. **Package Manager** ensured integration with the modules
6. **Documentation Publisher** updated all docs

**Result:** A complete working system with proper handoffs.

## ⚠️ Common Coordination Failures

### Working in Isolation
❌ Module Developer implements without QA testing
❌ Documentation written before code works
❌ Integration attempted before tests pass

### Skipping Handoffs
❌ Direct commit without QA approval
❌ Missing Package Manager validation
❌ No Workflow Coordinator review

### Poor Communication
❌ "It's done" (no details)
❌ No test results provided
❌ Issues discovered but not reported

## 📋 Agent Checklist

### Before Module Developer Starts
- [ ] Education Architect defined learning objectives
- [ ] Workflow Coordinator approved the plan
- [ ] Clear specifications provided

### Before QA Testing
- [ ] Module Developer completed ALL implementation
- [ ] Code follows standards
- [ ] Basic self-testing done

### Before Package Integration
- [ ] QA Agent ran comprehensive tests
- [ ] All tests PASSED
- [ ] Performance acceptable

### Before Commit
- [ ] Package Manager verified integration
- [ ] Documentation complete
- [ ] Workflow Coordinator approved

## 🔧 Conflict Resolution

**If agents disagree:**

1. **QA has veto on quality** - if tests fail, stop
2. **Education Architect owns learning objectives**
3. **Workflow Coordinator resolves other disputes**
4. **The user has final override**

## 📌 Remember

> Agents amplify capabilities when coordinated, and create chaos when isolated.

**Key Success Factors:**
- Clear handoffs between agents
- Mandatory testing and integration
- Structured communication
- Sequential workflow where needed
- Parallel work where possible
.claude/guidelines/DESIGN_PHILOSOPHY.md (new file, 212 lines)
@@ -0,0 +1,212 @@
# TinyTorch Design Philosophy

## 🎯 Core Principle: Keep It Simple, Stupid (KISS)

**Simplicity is the soul of TinyTorch. We are building an educational framework where clarity beats cleverness every time.**

## 📚 Why Simplicity Matters

TinyTorch is for students learning ML systems engineering. If they can't understand it, we've failed our mission. Every design decision should prioritize:

1. **Readability** over performance
2. **Clarity** over cleverness
3. **Directness** over abstraction
4. **Honesty** over aspiration

## 🚀 KISS Guidelines

### Code Simplicity

**✅ DO:**
- Write code that reads like a textbook
- Use descriptive variable names (`gradient`, not `g`)
- Implement one concept per file
- Show the direct path from input to output
- Keep functions short and focused

**❌ DON'T:**
- Use clever one-liners that require decoding
- Create unnecessary abstractions
- Optimize prematurely
- Hide complexity behind magic

**Example:**
```python
# ✅ GOOD: Clear and direct
def forward(self, x):
    h1 = self.relu(self.fc1(x))
    h2 = self.relu(self.fc2(h1))
    return self.fc3(h2)

# ❌ BAD: Clever but unclear
def forward(self, x):
    return reduce(lambda h, l: self.relu(l(h)) if l != self.layers[-1] else l(h),
                  self.layers, x)
```
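
The "good" version above assumes `fc1`/`fc2`/`fc3` are defined elsewhere on the class. A self-contained NumPy sketch of the same readable style — the `TinyMLP` name and layer sizes are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(0)


class TinyMLP:
    def __init__(self):
        # Three small dense layers; each weight matrix maps in -> out features.
        self.w1, self.b1 = rng.standard_normal((4, 8)) * 0.1, np.zeros(8)
        self.w2, self.b2 = rng.standard_normal((8, 8)) * 0.1, np.zeros(8)
        self.w3, self.b3 = rng.standard_normal((8, 2)) * 0.1, np.zeros(2)

    def relu(self, x):
        return np.maximum(0, x)

    def forward(self, x):
        # Same shape as the "good" example: one clear step per line.
        h1 = self.relu(x @ self.w1 + self.b1)
        h2 = self.relu(h1 @ self.w2 + self.b2)
        return h2 @ self.w3 + self.b3


out = TinyMLP().forward(np.ones((5, 4)))
print(out.shape)  # -> (5, 2): batch of 5 inputs, 2 outputs each
```

A student can trace every line of `forward` from input to output, which is exactly what the one-liner version obscures.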

### File Organization

**✅ DO:**
- One purpose per file
- Clear, descriptive filenames
- Minimal file count

**❌ DON'T:**
- Create multiple versions of the same thing
- Split related code unnecessarily
- Create deep directory hierarchies

**Example:**
```
✅ GOOD:
examples/cifar10/
├── random_baseline.py   # Shows untrained performance
├── train.py             # Training script
└── README.md            # Simple documentation

❌ BAD:
examples/cifar10/
├── train_basic.py
├── train_optimized.py
├── train_advanced.py
├── train_experimental.py
├── train_with_ui.py
└── ... (20 more variations)
```

### Documentation Simplicity

**✅ DO:**
- State what it does clearly
- Give one good example
- Report verified results only
- Keep README files short

**❌ DON'T:**
- Write novels in docstrings
- Promise theoretical performance
- Add complex diagrams for simple concepts
- Create documentation that's longer than the code

**Example:**
```python
# ✅ GOOD: Clear and concise
"""
Train a neural network on CIFAR-10 images.
Achieves 55% accuracy in 2 minutes.
"""

# ❌ BAD: Over-documented
"""
This advanced training framework implements state-of-the-art optimization
techniques including adaptive learning rate scheduling, progressive data
augmentation, and sophisticated regularization strategies to push the
boundaries of what's possible with MLPs on CIFAR-10, potentially achieving
60-70% accuracy with proper hyperparameter tuning...
[continues for 500 more words]
"""
```

### Performance Claims

**✅ DO:**
- Report what you actually measured
- Include training time
- Be honest about limitations
- Compare against clear baselines

**❌ DON'T:**
- Claim unverified performance
- Hide negative results
- Exaggerate improvements
- Make theoretical claims

**Example:**
```markdown
✅ GOOD:
- Random baseline: 10% (measured)
- Trained model: 55% (measured)
- Training time: 2 minutes

❌ BAD:
- Can achieve 60-70% with optimization (unverified)
- State-of-the-art MLP performance (vague)
- Approaches CNN-level accuracy (misleading)
```

## 🎓 Educational Simplicity

### Learning Progression

**✅ DO:**
- Build concepts incrementally
- Show before explaining
- Test immediately after implementing
- Keep examples minimal but complete

**❌ DON'T:**
- Jump to complex examples
- Hide important details
- Add unnecessary features
- Overwhelm with options

### Error Messages

**✅ DO:**
- Make errors educational
- Suggest fixes
- Show what went wrong clearly

**❌ DON'T:**
- Hide errors
- Use cryptic messages
- Dump raw stack traces without context

## 🔍 Decision Framework

When making any design decision, ask:

1. **Can a student understand this in 30 seconds?**
   - If no → simplify
2. **Is there a simpler way that still works?**
   - If yes → use it
3. **Does this add essential value?**
   - If no → remove it
4. **Would I want to debug this at 2 AM?**
   - If no → rewrite it

## 📝 Examples of KISS in Action

### Recent CIFAR-10 Cleanup
**Before:** 20+ experimental files with complex optimizations
**After:** 2 files (random_baseline.py, train.py)
**Result:** A clearer story with the same educational value

### Module Structure
**Before:** Complex inheritance hierarchies
**After:** Direct implementations students can trace
**Result:** Students understand what's happening

### Testing
**Before:** Complex test frameworks
**After:** Simple assertions after each implementation
**Result:** Immediate feedback and understanding

## 🚨 When Complexity is OK

Sometimes complexity is necessary, but it must be:

1. **Essential** to the learning objective
2. **Well-documented** with clear explanations
3. **Isolated** from simpler concepts
4. **Justified** by significant educational value

Example: Autograd is complex, but it's the core learning objective of that module.

## 📌 Remember

> "Perfection is achieved not when there is nothing more to add, but when there is nothing left to take away." - Antoine de Saint-Exupéry

**Every line of code, every file, every feature should justify its existence. When in doubt, leave it out.**
.claude/guidelines/MODULE_DEVELOPMENT.md (new file, 299 lines)
@@ -0,0 +1,299 @@
# TinyTorch Module Development Standards

## 🎯 Core Principle

**Modules teach ML systems engineering through building, not just ML algorithms through reading.**

## 📁 File Structure

### One Module = One .py File

```
modules/source/XX_modulename/
├── modulename_dev.py      # The ONLY file you edit
├── modulename_dev.ipynb   # Auto-generated from .py (DO NOT EDIT)
└── README.md              # Module overview
```

**Critical Rules:**
- ✅ ALWAYS edit `.py` files only
- ❌ NEVER edit `.ipynb` notebooks directly
- ✅ Use jupytext to sync .py → .ipynb

## 📚 Module Structure Pattern

Every module MUST follow this exact structure:

```python
# %% [markdown]
"""
# Module XX: [Name]

**Learning Objectives:**
- Build [component] from scratch
- Understand [systems concept]
- Analyze performance implications
"""

# %% [markdown]
"""
## Part 1: Mathematical Foundations
[Theory and complexity analysis]
"""

# %% [code]
# Implementation

# %% [markdown]
"""
### Testing [Component]
Let's verify our implementation works correctly.
"""

# %% [code]
# Immediate test

# %% [markdown]
"""
## Part 2: Systems Analysis
### Memory Profiling
Let's understand the memory implications.
"""

# %% [code]
# Memory profiling code

# %% [markdown]
"""
## Part 3: Production Context
In real ML systems like PyTorch...
"""

# ... continue the pattern ...

# %% [code]
if __name__ == "__main__":
    run_all_tests()

# %% [markdown]
"""
## 🤔 ML Systems Thinking
[Interactive questions analyzing the implementation]
"""

# %% [markdown]
"""
## 🎯 Module Summary
[What was learned - ALWAYS LAST]
"""
```

## 🧪 Implementation → Test Pattern

**MANDATORY**: Every implementation must be immediately followed by a test.

```python
# ✅ CORRECT pattern:

# %% [markdown]
"""
## Building the Dense Layer
"""

# %% [code]
import numpy as np

class Dense:
    def __init__(self, in_features, out_features):
        self.weights = np.random.randn(in_features, out_features) * 0.1
        self.bias = np.zeros(out_features)

    def forward(self, x):
        return x @ self.weights + self.bias

# %% [markdown]
"""
### Testing Dense Layer
Let's verify our dense layer handles shapes correctly.
"""

# %% [code]
def test_dense_layer():
    layer = Dense(10, 5)
    x = np.random.randn(32, 10)  # Batch of 32, 10 features each
    output = layer.forward(x)
    assert output.shape == (32, 5), f"Expected (32, 5), got {output.shape}"
    print("✅ Dense layer forward pass works!")

test_dense_layer()
```

## 🔬 ML Systems Focus

### MANDATORY Systems Analysis Sections

Every module MUST include:

1. **Complexity Analysis**
   ```python
   # %% [markdown]
   """
   ### Computational Complexity
   - Matrix multiply: O(batch × in_features × out_features)
   - Memory usage: O(in_features × out_features) for weights
   - This becomes the bottleneck when...
   """
   ```

2. **Memory Profiling**
   ```python
   # %% [code]
   def profile_memory():
       import tracemalloc
       import numpy as np

       tracemalloc.start()

       layer = Dense(1000, 1000)
       x = np.random.randn(128, 1000)
       output = layer.forward(x)

       current, peak = tracemalloc.get_traced_memory()
       tracemalloc.stop()
       print(f"Peak memory: {peak / 1024 / 1024:.2f} MB")
       print("This shows why large models need GPUs!")
   ```

3. **Production Context**
   ```python
   # %% [markdown]
   """
   ### In Production Systems
   PyTorch's nn.Linear does the same thing but with:
   - GPU acceleration via CUDA kernels
   - Automatic differentiation support
   - Optimized BLAS operations
   - Memory pooling for efficiency
   """
   ```
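
The complexity claims in item 1 can be checked with plain arithmetic before any profiling. A sketch — the layer sizes are arbitrary, the 2× factor counts one multiply plus one add per weight, and 8 bytes assumes float64:

```python
batch, in_features, out_features = 64, 1000, 1000

# Matrix multiply cost: one multiply-add per (batch row, input, output) triple.
flops = 2 * batch * in_features * out_features

# Weight memory: one float64 (8 bytes) per (input, output) pair.
weight_bytes = in_features * out_features * 8

print(f"Forward pass: {flops / 1e6:.0f} MFLOPs")          # -> 128 MFLOPs
print(f"Weights: {weight_bytes / 1024 / 1024:.1f} MB")    # -> 7.6 MB
```

Running these numbers before profiling gives students a prediction to compare against the `tracemalloc` measurement above.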

## 📝 NBGrader Integration

### Cell Metadata Structure

```python
# %% [code] {"nbgrader": {"grade": false, "locked": false, "solution": true, "grade_id": "dense_implementation"}}
### BEGIN SOLUTION
class Dense:
    # Full implementation for instructors
    ...
### END SOLUTION

### BEGIN HIDDEN TESTS
# Instructor-only tests
...
### END HIDDEN TESTS
```

### Critical NBGrader Rules

1. **Every cell needs a unique grade_id**
2. **Scaffolding stays OUTSIDE solution blocks**
3. **Hidden tests validate student work**
4. **Points should reflect complexity**

## 🎓 Educational Patterns

### The "Build → Measure → Understand" Pattern

```python
import time
import numpy as np

# 1. BUILD
class LayerNorm:
    def forward(self, x):
        mean = np.mean(x, axis=-1, keepdims=True)
        var = np.var(x, axis=-1, keepdims=True)
        return (x - mean) / np.sqrt(var + 1e-5)

# 2. MEASURE
def measure_performance():
    layer = LayerNorm()
    x = np.random.randn(1000, 512)

    start = time.time()
    for _ in range(100):
        output = layer.forward(x)
    elapsed = time.time() - start

    print(f"Time per forward pass: {elapsed/100*1000:.2f}ms")
    print(f"Throughput: {100*1000/elapsed:.0f} tokens/sec")  # 1000 tokens per pass, 100 passes

# 3. UNDERSTAND
"""
With 512 dimensions, normalization adds ~2ms of overhead per pass.
This is why large models use fused kernels!
"""
```

### Progressive Complexity

Start simple, build up:

```python
# Step 1: Simplest possible version
def relu_v1(x):
    return np.maximum(0, x)

# Step 2: Add complexity (illustrative - plain ndarrays don't accept new
# attributes; a Tensor wrapper class is assumed here)
def relu_v2(x):
    output = np.maximum(0, x)
    output.grad_fn = lambda grad: grad * (x > 0)  # Record how to backpropagate
    return output

# Step 3: Production version
class ReLU:
    def forward(self, x):
        self.input = x  # Save for backward
        return np.maximum(0, x)

    def backward(self, grad):
        return grad * (self.input > 0)
```
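
A quick check of the production version, restating the class so the snippet runs standalone (the input values are arbitrary):

```python
import numpy as np


class ReLU:
    def forward(self, x):
        self.input = x  # Save for backward
        return np.maximum(0, x)

    def backward(self, grad):
        return grad * (self.input > 0)


relu = ReLU()
x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
out = relu.forward(x)            # negatives (and zero) clamp to 0
grad = relu.backward(np.ones_like(x))  # gradient flows only where input was positive
print(out, grad)
```

The saved `self.input` is what makes `backward` possible without recomputing the forward pass — the same "cache activations" trade-off production frameworks make.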

## ⚠️ Common Pitfalls

1. **Too Much Theory**
   - Students want to BUILD, not read
   - Show through code, not exposition

2. **Missing Systems Analysis**
   - Not just algorithms, but engineering
   - Always discuss memory and performance

3. **Tests at the End**
   - Loses the educational flow
   - Test immediately after implementation

4. **No Production Context**
   - Students need to see real-world relevance
   - Compare with PyTorch/TensorFlow

## 📌 Module Checklist

Before considering a module complete:

- [ ] All code in the .py file (not the notebook)
- [ ] Follows the exact structure pattern
- [ ] Every implementation has an immediate test
- [ ] Includes memory profiling
- [ ] Includes complexity analysis
- [ ] Shows production context
- [ ] NBGrader metadata correct
- [ ] ML systems thinking questions included
- [ ] Summary is the LAST section
- [ ] Tests run when the module is executed

## 🎯 Remember

> We're teaching ML systems engineering, not just ML algorithms.

Every module should help students understand:
- How to BUILD ML systems
- Why performance matters
- Where bottlenecks occur
- How production systems work
.claude/guidelines/PERFORMANCE_CLAIMS.md (new file, 245 lines)
@@ -0,0 +1,245 @@
# TinyTorch Performance Claims Guidelines

## 🎯 Core Principle

**Only claim what you have measured and verified. Honesty builds trust.**

## ✅ Verified Performance Standards

### The Three-Step Verification

1. **Measure the baseline**
   ```python
   # Random/untrained performance
   random_model = create_untrained_model()
   baseline_accuracy = evaluate(random_model, test_data)
   print(f"Baseline: {baseline_accuracy:.1%}")  # Measured: 10%
   ```

2. **Measure actual performance**
   ```python
   # Trained model performance
   trained_model = train_model(epochs=15)
   actual_accuracy = evaluate(trained_model, test_data)
   print(f"Actual: {actual_accuracy:.1%}")  # Measured: 55%
   ```

3. **Calculate the real improvement**
   ```python
   improvement = actual_accuracy / baseline_accuracy
   print(f"Improvement: {improvement:.1f}×")  # Measured: 5.5×
   ```
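
Put together, the three steps form one script. A self-contained sketch with stub models — `evaluate`, the lambdas, and the toy data below are stand-ins, not TinyTorch APIs; a real run would use the actual model and test set:

```python
import random


def evaluate(model, test_data):
    # Stub: fraction of examples the model labels correctly.
    return sum(model(x) == y for x, y in test_data) / len(test_data)


random.seed(0)
test_data = [(i, i % 10) for i in range(1000)]  # 10-class toy labels

untrained = lambda x: random.randrange(10)      # Step 1: random guessing, ~10%
trained = lambda x: x % 10 if x % 5 else random.randrange(10)  # Step 2: mostly correct

baseline = evaluate(untrained, test_data)
actual = evaluate(trained, test_data)
improvement = actual / baseline                 # Step 3: a real, measured ratio

print(f"Baseline: {baseline:.1%}  Actual: {actual:.1%}  Improvement: {improvement:.1f}x")
```

The point of the stub is the shape of the claim: every number printed came from a measurement in the same script, so the improvement figure is reproducible rather than aspirational.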
|
||||
|
||||
### Reporting Requirements
|
||||
|
||||
**ALWAYS include:**
|
||||
- Exact accuracy percentage
|
||||
- Training time
|
||||
- Hardware used
|
||||
- Number of epochs
|
||||
- Dataset size
|
||||
|
||||
**Example:**
|
||||
```markdown
|
||||
✅ GOOD:
|
||||
- Accuracy: 55% on CIFAR-10 test set
|
||||
- Training time: 2 minutes on M1 MacBook
|
||||
- Epochs: 15
|
||||
- Batch size: 64
|
||||
|
||||
❌ BAD:
|
||||
- "State-of-the-art performance"
|
||||
- "Can achieve 60-70% with optimization"
|
||||
- "Approaches CNN-level accuracy"
|
||||
```
|
||||
|
||||
## 📊 The CIFAR-10 Lesson
|
||||
|
||||
### What We Claimed vs Reality
|
||||
|
||||
**Initial Claims (unverified):**
|
||||
- "60-70% accuracy achievable with optimization"
|
||||
- "Advanced techniques push beyond baseline"
|
||||
- "Sophisticated MLPs rival simple CNNs"
|
||||
|
||||
**Actual Results (verified):**
|
||||
- Baseline: 51-55% consistently
|
||||
- With optimization attempts: Still ~55%
|
||||
- Deep networks: Too slow, no improvement
|
||||
- **Honest conclusion: MLPs achieve 55% reliably**
|
||||
|
||||
### The Right Response
|
||||
|
||||
When results don't match expectations:
|
||||
|
||||
✅ **CORRECT Approach:**
|
||||
- Test thoroughly
|
||||
- Report actual results
|
||||
- Update documentation
|
||||
- Explain limitations
|
||||
|
||||
❌ **WRONG Approach:**
|
||||
- Keep unverified claims
|
||||
- Hide negative results
|
||||
- Blame implementation
|
||||
- Make excuses
|
||||
|
||||
## 🔬 Performance Testing Protocol
|
||||
|
||||
### Minimum Testing Requirements
|
||||
|
||||
```python
|
||||
def verify_performance_claim():
|
||||
"""
|
||||
Every performance claim must pass this verification.
|
||||
"""
|
||||
results = []
|
||||
|
||||
# Run multiple trials
|
||||
for trial in range(3):
|
||||
model = create_model()
|
||||
accuracy = train_and_evaluate(model)
|
||||
results.append(accuracy)
|
||||
|
||||
mean_acc = np.mean(results)
|
||||
std_acc = np.std(results)
|
||||
|
||||
# Report with confidence intervals
|
||||
print(f"Performance: {mean_acc:.1%} ± {std_acc:.1%}")
|
||||
|
||||
# Only claim if consistent
|
||||
if std_acc > 0.02: # >2% variance
|
||||
print("⚠️ High variance - need more testing")
|
||||
return False
|
||||
|
||||
return True
|
||||
```
|
||||
|
||||
### Time Complexity Reporting
|
||||
|
||||
```python
|
||||
# ✅ GOOD: Measured complexity
|
||||
def measure_scalability():
|
||||
sizes = [100, 1000, 10000]
|
||||
times = []
|
||||
|
||||
for size in sizes:
|
||||
data = create_data(size)
|
||||
start = time.time()
|
||||
process(data)
|
||||
times.append(time.time() - start)
|
||||
|
||||
# Analyze scaling
|
||||
print("Scaling behavior:")
|
||||
for size, time in zip(sizes, times):
|
||||
print(f" n={size}: {time:.2f}s")
|
||||
|
||||
# Determine complexity
|
||||
if times[2] / times[1] > 90: # 10x data → 100x time
|
||||
print("Complexity: O(n²)")
|
||||
|
||||
# ❌ BAD: Theoretical claims
|
||||
def theoretical_complexity():
|
||||
print("Should be O(n log n)") # Not measured
|
||||
```

## 📝 Documentation Standards

### Performance Tables

```markdown
✅ GOOD Table:

| Model | Dataset | Accuracy | Time | Hardware |
|-------|---------|----------|------|----------|
| MLP-4-layer | CIFAR-10 | 55% | 2 min | M1 CPU |
| Random baseline | CIFAR-10 | 10% | 0 sec | N/A |
| MLP-4-layer | MNIST | 98% | 30 sec | M1 CPU |

❌ BAD Table:

| Model | Performance |
|-------|------------|
| Our MLP | State-of-the-art |
| With optimization | Up to 70% |
| Best case | Rivals CNNs |
```

### Comparison Claims

```markdown
✅ GOOD Comparisons:
- "5.5× better than random baseline (10% → 55%)"
- "Matches typical educational MLP benchmarks"
- "20% below simple CNN performance"

❌ BAD Comparisons:
- "Competitive with modern architectures"
- "Approaching state-of-the-art"
- "Best-in-class for educational frameworks"
```

## ⚠️ Red Flags to Avoid

### Weasel Words
- "Can achieve..." (but didn't)
- "Up to..." (theoretical maximum)
- "Potentially..." (unverified)
- "Should be able to..." (untested)
- "With proper tuning..." (hand-waving)

### Unverified Optimizations
- "With these 10 techniques..." (didn't implement)
- "Research shows..." (not our research)
- "In theory..." (not in practice)
- "Could reach..." (but didn't)

### Vague Metrics
- "Good performance"
- "Impressive results"
- "Significant improvement"
- "Fast training"

## 🎯 The Integrity Test

Before making any performance claim, ask:

1. **Did I measure this myself?**
   - If no → Don't claim it

2. **Can someone reproduce this?**
   - If no → Don't publish it

3. **Is this the typical case?**
   - If no → Note it's exceptional

4. **Would I bet money on this?**
   - If no → Reconsider the claim

## 📌 Remember

> "It's better to under-promise and over-deliver than the opposite."

**Trust is earned through:**
- Honest reporting
- Reproducible results
- Clear limitations
- Verified claims

**Trust is lost through:**
- Exaggerated claims
- Unverified results
- Hidden failures
- Theoretical promises

## 🏆 Good Examples from TinyTorch

### CIFAR-10 Cleanup
**Before:** "60-70% achievable with optimization"
**After:** "55% verified performance"
**Result:** Honest, trustworthy documentation

### XOR Network
**Claim:** "100% accuracy on XOR"
**Verified:** Yes, consistently achieves 100%
**Result:** Credible claim that builds trust
228	.claude/guidelines/TESTING_STANDARDS.md	Normal file
@@ -0,0 +1,228 @@
# TinyTorch Testing Standards

## 🎯 Core Testing Philosophy

**Test immediately, test simply, test educationally.**

Testing in TinyTorch serves two purposes:
1. **Verification**: Ensure the code works
2. **Education**: Help students understand what they built

## 📋 Testing Patterns

### The Immediate Testing Pattern

**MANDATORY**: Test immediately after each implementation, not at the end.

```python
# ✅ CORRECT: Implementation followed by immediate test
class Tensor:
    def __init__(self, data):
        self.data = data

# Test Tensor creation immediately
def test_tensor_creation():
    t = Tensor([1, 2, 3])
    assert t.data == [1, 2, 3], "Tensor should store data"
    print("✅ Tensor creation works")

test_tensor_creation()

# ❌ WRONG: All tests grouped at the end
# [100 lines of implementations]
# [Then all tests at the bottom]
```

### Simple Assertion Testing

**Use simple assertions, not complex frameworks.**

```python
# ✅ GOOD: Simple and clear
def test_forward_pass():
    model = SimpleMLP()
    x = Tensor(np.random.randn(32, 784))
    output = model.forward(x)
    assert output.shape == (32, 10), f"Expected (32, 10), got {output.shape}"
    print("✅ Forward pass shapes correct")

# ❌ BAD: Over-engineered
class TestMLPForwardPass(unittest.TestCase):
    def setUp(self):
        self.model = SimpleMLP()

    def test_forward_pass_shape_validation_with_mock_data(self):
        # ... 50 lines of test setup
```

### Educational Test Messages

**Tests should teach, not just verify.**

```python
# ✅ GOOD: Educational
def test_backpropagation():
    # Create simple network: 2 inputs → 2 hidden → 1 output
    net = TwoLayerNet(2, 2, 1)

    # Forward pass with XOR data
    x = Tensor([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = Tensor([[0], [1], [1], [0]])

    output = net.forward(x)
    loss = mse_loss(output, y)

    print(f"Initial loss: {loss.data:.4f}")
    print("This high loss shows the network hasn't learned XOR yet")

    # Backward pass
    loss.backward()

    # Check gradients exist
    assert net.w1.grad is not None, "Gradients should be computed"
    print("✅ Backpropagation computed gradients")
    print("The network can now learn from its mistakes!")

# ❌ BAD: Just verification
def test_backprop():
    net = TwoLayerNet(2, 2, 1)
    # ... minimal test
    assert net.w1.grad is not None
    # No educational value
```

## 🧪 Performance Testing

### Baseline Comparisons

**Always test against a clear baseline.**

```python
def test_model_performance():
    # 1. Test random baseline
    random_model = create_random_network()
    random_acc = evaluate(random_model, test_data)
    print(f"Random network accuracy: {random_acc:.1%}")

    # 2. Test trained model
    trained_model = load_trained_model()
    trained_acc = evaluate(trained_model, test_data)
    print(f"Trained network accuracy: {trained_acc:.1%}")

    # 3. Show improvement
    improvement = trained_acc / random_acc
    print(f"Improvement: {improvement:.1f}× better than random")

    assert trained_acc > random_acc * 2, "Should be at least 2× better than random"
```
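The helpers above (`create_random_network`, `evaluate`, `test_data`) are placeholders. A fully runnable miniature of the same baseline-comparison pattern, substituting synthetic predictors and a toy 10-class dataset for real models:

```python
import random

random.seed(0)  # fixed seed so the baseline number is reproducible

def evaluate(predict, data):
    """Fraction of (input, label) pairs the predictor gets right."""
    return sum(predict(x) == y for x, y in data) / len(data)

# Toy 10-class test set: the label simply equals the input token
test_data = [(i % 10, i % 10) for i in range(1000)]

random_acc = evaluate(lambda x: random.randrange(10), test_data)  # ~10%
trained_acc = evaluate(lambda x: x, test_data)  # perfect on this toy task

print(f"Random baseline: {random_acc:.1%}")
print(f"'Trained' model: {trained_acc:.1%}")
print(f"Improvement: {trained_acc / random_acc:.1f}× better than random")
assert trained_acc > random_acc * 2, "Should be at least 2× better than random"
```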

### Honest Performance Reporting

```python
# ✅ GOOD: Report actual measurements
def test_training_performance():
    start_time = time.time()
    accuracy = train_model(epochs=10)
    train_time = time.time() - start_time

    print(f"Achieved accuracy: {accuracy:.1%}")
    print(f"Training time: {train_time:.1f} seconds")
    print(f"Status: {'✅ PASS' if accuracy > 0.5 else '❌ FAIL'}")

# ❌ BAD: Theoretical claims
def test_training():
    # ... training code
    print("Can achieve 60-70% with proper tuning")  # Unverified claim
```

## 🔍 Test Organization

### Test Placement

```python
# Module structure with immediate tests
# module_name.py

# Part 1: Core implementation
class Tensor:
    ...

# Immediate test
test_tensor_creation()

# Part 2: Operations
def add(a, b):
    ...

# Immediate test
test_addition()

# Part 3: Advanced features
def backward():
    ...

# Immediate test
test_backward()

# At the end: Run all tests when executed directly
if __name__ == "__main__":
    print("Running all tests...")
    test_tensor_creation()
    test_addition()
    test_backward()
    print("✅ All tests passed!")
```
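The skeleton above elides the bodies. A runnable miniature of the same layout, with a deliberately tiny stand-in `Tensor` (not the real module) so the implementation→test rhythm is visible end to end:

```python
# Part 1: Core implementation
class Tensor:
    def __init__(self, data):
        self.data = list(data)

# Immediate test
def test_tensor_creation():
    t = Tensor([1, 2, 3])
    assert t.data == [1, 2, 3], "Tensor should store data"
    print("✅ Tensor creation works")

test_tensor_creation()

# Part 2: Operations
def add(a, b):
    # Elementwise addition of two tensors
    return Tensor(x + y for x, y in zip(a.data, b.data))

# Immediate test
def test_addition():
    t = add(Tensor([1, 2]), Tensor([3, 4]))
    assert t.data == [4, 6], "Elementwise add should give [4, 6]"
    print("✅ Addition works")

test_addition()

if __name__ == "__main__":
    print("✅ All tests passed!")
```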

## ⚠️ Common Testing Mistakes

1. **Grouping all tests at the end**
   - Loses educational flow
   - Students don't see immediate verification

2. **Over-complicated test frameworks**
   - Obscures what's being tested
   - Adds unnecessary complexity

3. **Testing without teaching**
   - Missing opportunity to reinforce concepts
   - No educational value

4. **Unverified performance claims**
   - Damages credibility
   - Misleads students

## 📝 Test Documentation

```python
def test_attention_mechanism():
    """
    Test that attention correctly weighs different positions.

    This test demonstrates the key insight of attention:
    the model learns what to focus on.
    """
    # Create simple sequence
    sequence = Tensor([[1, 0, 0],   # Position 0: important
                       [0, 0, 0],   # Position 1: padding
                       [0, 0, 1]])  # Position 2: important

    attention_weights = compute_attention(sequence)

    # Check that important positions get more weight
    assert attention_weights[0] > attention_weights[1]
    assert attention_weights[2] > attention_weights[1]

    print("✅ Attention focuses on important positions")
    print(f"Weights: {attention_weights}")
    print("Notice how padding (position 1) gets less attention")
```
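`compute_attention` is left undefined above. A minimal stand-in that makes the test runnable — it scores each position by its activation magnitude and softmaxes the scores, which is a simplified illustration, not the learned query/key attention of a real transformer:

```python
import math

def compute_attention(sequence):
    """Toy attention: softmax over each position's squared magnitude."""
    scores = [sum(v * v for v in position) for position in sequence]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

sequence = [[1, 0, 0],   # position 0: important
            [0, 0, 0],   # position 1: padding
            [0, 0, 1]]   # position 2: important
weights = compute_attention(sequence)
print([f"{w:.3f}" for w in weights])
assert weights[0] > weights[1] and weights[2] > weights[1]
print("✅ Attention focuses on important positions")
```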

## 🎯 Remember

> Tests are teaching tools, not just verification tools.

Every test should help a student understand:
- What the code does
- Why it matters
- How to verify it works
- What success looks like
39	CLAUDE.md
@@ -1,7 +1,20 @@
# Claude Code Instructions for TinyTorch

## ⚡ **MANDATORY: Read Git Policies First**
**Before any development work, you MUST read and follow the Git Workflow Standards section below.**
## 📚 **MANDATORY: Read Guidelines First**

**All development standards are documented in the `.claude/` directory.**

### Required Reading Order:
1. `.claude/guidelines/DESIGN_PHILOSOPHY.md` - KISS principle and core values
2. `.claude/guidelines/GIT_WORKFLOW.md` - Git policies and branching standards
3. `.claude/guidelines/MODULE_DEVELOPMENT.md` - How to build modules
4. `.claude/guidelines/TESTING_STANDARDS.md` - Testing requirements
5. `.claude/guidelines/PERFORMANCE_CLAIMS.md` - Honest reporting standards
6. `.claude/guidelines/AGENT_COORDINATION.md` - How to work with AI agents

**Start with `.claude/README.md` for a complete overview.**

## ⚡ **CRITICAL: Core Policies**

**CRITICAL POLICIES - NO EXCEPTIONS:**
- ✅ Always use virtual environment (`.venv`)
@@ -15,28 +28,6 @@

---

## 💡 **CORE PRINCIPLE: Keep It Simple, Stupid (KISS)**

**Simplicity is a fundamental principle of TinyTorch. Always prefer simple, clear solutions over complex ones.**

**KISS Guidelines:**
- **One file, one purpose** - Don't create multiple versions doing the same thing
- **Clear over clever** - Code should be readable by students learning ML
- **Minimal dependencies** - Avoid unnecessary libraries or complex UI
- **Direct implementation** - Show the core concepts without abstraction layers
- **Honest performance** - Report what actually works, not theoretical possibilities

**Examples:**
- ✅ `random_baseline.py` and `train.py` - two files, clear story
- ❌ Multiple optimization scripts with unverified claims
- ✅ Simple console output showing progress
- ❌ Complex dashboards with ASCII plots that don't add educational value
- ✅ "Achieves 55% accuracy" (verified)
- ❌ "Can achieve 60-70% with optimization" (unverified)

**When in doubt, choose the simpler option. If students can't understand it, we've failed.**

---

## 🚨 **CRITICAL: Think First, Don't Just Agree**
81	CLAUDE_SIMPLE.md	Normal file
@@ -0,0 +1,81 @@
# Claude Code Instructions for TinyTorch

## 📚 **START HERE: Read the Guidelines**

All development standards, principles, and workflows are documented in the `.claude/` directory.

### Quick Start
```bash
# First, read the overview
cat .claude/README.md

# Then read core guidelines in order:
cat .claude/guidelines/DESIGN_PHILOSOPHY.md   # KISS principle
cat .claude/guidelines/GIT_WORKFLOW.md        # Git standards
cat .claude/guidelines/MODULE_DEVELOPMENT.md  # Building modules
cat .claude/guidelines/TESTING_STANDARDS.md   # Testing patterns
```

## 🎯 Core Mission

**Build an educational ML framework where students learn ML systems engineering by implementing everything from scratch.**

Key principles:
- **KISS**: Keep It Simple, Stupid
- **Build to Learn**: Implementation teaches more than reading
- **Systems Focus**: Not just algorithms, but engineering
- **Honest Claims**: Only report verified performance

## ⚡ Critical Policies

1. **ALWAYS use virtual environment** (`.venv`)
2. **ALWAYS work on feature branches** (never main/dev directly)
3. **ALWAYS test before committing**
4. **NEVER add automated attribution** to commits
5. **NEVER edit .ipynb files directly** (edit .py only)

## 🤖 Working with AI Agents

**Always start with the Technical Program Manager (TPM)**:
- TPM coordinates all other agents
- Don't invoke agents directly
- Follow the workflow in `.claude/guidelines/AGENT_COORDINATION.md`

## 📁 Key Directories

```
.claude/guidelines/   # All development standards
.claude/agents/       # AI agent definitions
modules/source/       # Module implementations (.py files)
examples/             # Working examples (keep simple)
tests/                # Test suites
```

## 🚨 Think Critically

**Don't just agree with suggestions. Always:**
1. Evaluate if it makes pedagogical sense
2. Check if there's a simpler way
3. Verify it actually works
4. Consider student perspective

## 📋 Before Any Work

1. **Read guidelines**: Start with `.claude/README.md`
2. **Create branch**: Follow `.claude/guidelines/GIT_WORKFLOW.md`
3. **Activate venv**: `source .venv/bin/activate`
4. **Use TPM agent**: For coordinated development

## 🎓 Remember

> "If students can't understand it, we've failed."

Every decision should be:
- Simple
- Verified
- Educational
- Honest

---

**For detailed instructions on any topic, see the appropriate file in `.claude/guidelines/`**