diff --git a/book/docs/VOLUME_SPLIT_ROADMAP.md b/book/docs/VOLUME_SPLIT_ROADMAP.md
deleted file mode 100644
index 7ead5c517..000000000
--- a/book/docs/VOLUME_SPLIT_ROADMAP.md
+++ /dev/null
@@ -1,394 +0,0 @@
-# Machine Learning Systems: Two-Volume Split - Master Roadmap
-
-**Project Start Date**: December 2024
-**Target Completion**: June 2025 (6 months)
-**Goal**: Create flagship two-volume ML Systems textbook series for MIT Press
-
----
-
-## 📋 Project Documents
-
-### Core Planning Documents
-1. **`MIT_PRESS_PROPOSAL.md`** - Official proposal for MIT Press (ready to submit)
-2. **`VOLUME_SPLIT_SURGICAL_PLAN.md`** - Section-by-section surgery instructions (50+ pages)
-3. **`VOLUME_SPLIT_ROADMAP.md`** - This document: master project plan and progress tracker
-4. **`VOLUME_STRUCTURE_PROPOSAL.md`** - Original analysis and proposal
-
-### Supporting Documents
-- `VOLUME_SPLIT_ANALYSIS.md` - Deep analysis of pedagogical issues
-- `VOLUME_SPLIT_EXECUTIVE_SUMMARY.md` - Quick reference summary
-- `DISTRIBUTED_CONTENT_ADDITIONS.md` - Distributed awareness additions for V1
-
----
-
-## 🎯 Project Vision
-
-**Volume I: Introduction to ML Systems** (~1,150-1,200 pages)
-- Complete single-system ML engineering
-- Includes distributed awareness (not implementation)
-- Target: Undergrads, bootcamps, ML engineers entering the field
-
-**Volume II: Advanced ML Systems** (~1,100-1,150 pages)
-- Distributed systems, production scale, responsibility
-- Built on timeless principles
-- Target: Graduate students, ML infrastructure engineers, senior practitioners
-
-**The One-Liner**:
-> "Volume I teaches you to build ML systems that work; Volume II teaches you to build ML systems that scale."
-
----
-
-## 📊 Current State (December 2024)
-
-### Existing Content
-- **Total pages**: 2,172 pages across 22 chapters
-- **Status**: Complete draft of comprehensive single-volume book
-- **Quality**: Refined through extensive review feedback
-
-### Content Distribution
-- **Chapters 1-14**: Form basis of Volume 1 (with surgery)
-- **Chapters 15-21**: Move to Volume 2
-- **New content needed**: 325-375 pages (8 new chapters for V2)
-
----
-
-## 🗓️ Six-Month Timeline
-
-### Phase 1: Chapter Surgery (Months 1-2)
-**Goal**: Extract distributed content from 7 chapters, create clean V1
-
-#### Month 1: Content Extraction
-- **Week 1-2**: Extract distributed content from Chapters 6, 7, 8
-  - [ ] Chapter 6 (Data Engineering): Extract 40 pages of distributed content
-  - [ ] Chapter 7 (Frameworks): Extract 50 pages of distributed execution
-  - [ ] Chapter 8 (Training): Extract 60 pages of distributed training
-
-- **Week 3-4**: Extract from Chapters 10, 11, 12, 13
-  - [ ] Chapter 10 (Optimizations): Extract 80 pages (NAS, AutoML)
-  - [ ] Chapter 11 (Hardware): Extract 50 pages (multi-chip)
-  - [ ] Chapter 12 (Benchmarking): Extract 40 pages (distributed benchmarking)
-  - [ ] Chapter 13 (MLOps): Extract 30 pages (production scale)
-
-#### Month 2: Bridging and Polish
-- **Week 5-6**: Create V1 transitions
-  - [ ] Add "See Volume 2" callout boxes in V1
-  - [ ] Write brief distributed awareness sections
-  - [ ] Ensure V1 chapters remain coherent
-
-- **Week 7-8**: Organize extracted content
-  - [ ] Create V2 chapter structure
-  - [ ] Place extracted content in appropriate V2 chapters
-  - [ ] Identify gaps in extracted content
-
-### Phase 2: New Content Development (Months 3-5)
-
-#### Month 3: Priority 1 Chapters (Essential Infrastructure)
-- **Week 9-10**: Memory & Storage
-  - [ ] V2 Ch1: Memory Hierarchies for ML (45 pages)
-  - [ ] V2 Ch2: Storage Systems for ML (40 pages)
-
-- **Week 11-12**: Communication & Distributed Training
-  - [ ] V2 Ch3: Communication & Collective Operations (45 pages)
-  - [ ] V2 Ch4: Distributed Training Systems (50 pages) - integrate extracted content
-
-#### Month 4: Priority 2 Chapters (Production Requirements)
-- **Week 13-14**: Fault Tolerance & Inference
-  - [ ] V2 Ch5: Fault Tolerance & Resilience (40 pages)
-  - [ ] V2 Ch6: Inference at Scale (45 pages)
-
-- **Week 15-16**: Integration
-  - [ ] Integrate all extracted content into new chapters
-  - [ ] Write chapter introductions and conclusions
-  - [ ] Create cross-references
-
-#### Month 5: Priority 3 Chapters (Specialized Topics)
-- **Week 17-18**: Edge Systems
-  - [ ] V2 Ch8: Edge Intelligence Systems (50 pages)
-  - [ ] Integrate extracted edge content from Ch2
-
-- **Week 19-20**: Final new chapters
-  - [ ] V2 Ch1: Bridge Chapter (30 pages) - From Single to Distributed
-  - [ ] Update existing V2 chapters (15-20) with introductions
-
-### Phase 3: Integration and Polish (Month 6)
-
-#### Month 6: Final Integration
-- **Week 21-22**: Cross-References and Consistency
-  - [ ] Update all V1→V2 cross-references
-  - [ ] Update all V2→V1 prerequisite references
-  - [ ] Ensure consistent notation across volumes
-  - [ ] Verify all figure references work
-
-- **Week 23**: Narrative Flow
-  - [ ] Review V1 narrative arc (Foundations → Building → Optimizing → Impact)
-  - [ ] Review V2 narrative arc (Scale → Production → Responsibility)
-  - [ ] Polish chapter transitions
-  - [ ] Write volume prefaces
-
-- **Week 24**: Final Quality Checks
-  - [ ] Technical accuracy review
-  - [ ] Page count verification
-  - [ ] Exercise and quiz consistency
-  - [ ] Final copyedit pass
-  - [ ] Prepare camera-ready manuscripts
-
----
-
-## 📈 Progress Tracking
-
-### Volume 1 Progress (Target: 1,150-1,200 pages)
-
-| Chapter | Current | Target | Surgery Status | Notes |
-|---------|---------|--------|----------------|-------|
-| 1. Introduction | 90 | 60 | ⬜ Not Started | Compress history section |
-| 2. ML Systems | 70 | 70 | ⬜ Not Started | Extract hybrid architectures |
-| 3. DL Primer | 110 | 100 | ⬜ Not Started | No surgery (keep as-is) |
-| 4. DNN Architectures | 82 | 100 | ⬜ Not Started | No surgery (keep as-is) |
-| 5. Workflow | 51 | 40 | ⬜ Not Started | Minor compression |
-| 6. Data Engineering | 138 | 80 | ⬜ Not Started | Extract distributed storage |
-| 7. Frameworks | 121 | 100 | ⬜ Not Started | Extract distributed execution |
-| 8. Training | 157 | 100 | ⬜ Not Started | Extract distributed training |
-| 9. Efficient AI | 52 | 60 | ⬜ Not Started | No surgery (keep as-is) |
-| 10. Optimizations | 160 | 120 | ⬜ Not Started | Extract NAS, AutoML |
-| 11. Hardware | 181 | 90 | ⬜ Not Started | Extract multi-chip |
-| 12. Benchmarking | 124 | 80 | ⬜ Not Started | Extract distributed benchmarking |
-| 13. MLOps | 126 | 50 | ⬜ Not Started | Extract production scale |
-| 14. AI for Good | 84 | 50 | ⬜ Not Started | Minor compression |
-| **TOTAL** | **1,546** | **1,100** | **0%** | **V1 baseline complete** |
-
-Progress Key: ⬜ Not Started | 🟨 In Progress | ✅ Complete
-
-### Volume 2 Progress (Target: 1,100-1,150 pages)
-
-| Chapter | Source | Target | Status | Notes |
-|---------|--------|--------|--------|-------|
-| 1. Bridge: Single to Distributed | NEW | 30 | ⬜ Not Started | Write from scratch |
-| 2. Memory Hierarchies | NEW + Ch11 | 45 | ⬜ Not Started | New content + extracts |
-| 3. Storage Systems | NEW + Ch6 | 40 | ⬜ Not Started | New content + extracts |
-| 4. Communication & Collectives | NEW + Ch8 | 45 | ⬜ Not Started | New content + extracts |
-| 5. Distributed Training | Ch8 + Ch10 | 50 | ⬜ Not Started | Consolidate extracts |
-| 6. Fault Tolerance | NEW + Ch13 | 40 | ⬜ Not Started | New content + extracts |
-| 7. Inference at Scale | NEW + Ch2 + Ch13 | 45 | ⬜ Not Started | New content + extracts |
-| 8. Edge Intelligence | NEW + Ch2 | 50 | ⬜ Not Started | New content + extracts |
-| 9. On-Device Learning | Ch14 (existing) | 127 | ⬜ Not Started | Move from V1 |
-| 10. Privacy Systems | Ch15 (split) | 65 | ⬜ Not Started | Split Privacy/Security |
-| 11. Security Systems | Ch15 (split) | 68 | ⬜ Not Started | Split Privacy/Security |
-| 12. Robust AI | Ch16 (existing) | 137 | ⬜ Not Started | Move from V1 |
-| 13. Responsible AI | Ch17 (existing) | 135 | ⬜ Not Started | Move from V1 |
-| 14. Sustainable AI | Ch18 (existing) | 46 | ⬜ Not Started | Move from V1 |
-| 15. Frontiers & AGI | Ch19+20 (merge) | 78 | ⬜ Not Started | Merge two chapters |
-| **TOTAL** | **Mixed** | **1,001** | **0%** | **Need ~100 more pages** |
-
-### New Content Writing Progress (325-375 pages needed)
-
-| Chapter | Pages | Draft | Review | Final | Notes |
-|---------|-------|-------|--------|-------|-------|
-| Bridge Chapter | 30 | ⬜ | ⬜ | ⬜ | Priority 1 |
-| Memory Hierarchies | 45 | ⬜ | ⬜ | ⬜ | Priority 1 |
-| Storage Systems | 40 | ⬜ | ⬜ | ⬜ | Priority 1 |
-| Communication | 45 | ⬜ | ⬜ | ⬜ | Priority 2 |
-| Distributed Training | 50 | ⬜ | ⬜ | ⬜ | Priority 1 (+ extracts) |
-| Fault Tolerance | 40 | ⬜ | ⬜ | ⬜ | Priority 2 |
-| Inference at Scale | 45 | ⬜ | ⬜ | ⬜ | Priority 1 |
-| Edge Intelligence | 50 | ⬜ | ⬜ | ⬜ | Priority 3 |
-| **TOTAL NEW** | **345** | **0%** | **0%** | **0%** | 8 new chapters |
-
----
-
-## 📝 Weekly Progress Log
-
-### Week 1 (Dec 2-8, 2024)
-- [ ] Created comprehensive surgical plan
-- [ ] Created MIT Press proposal
-- [ ] Created master roadmap
-- [ ] **Next**: Begin Chapter 6 surgery
-
-### Week 2 (Dec 9-15, 2024)
-- [ ] Progress notes...
-
-### Week 3 (Dec 16-22, 2024)
-- [ ] Progress notes...
-
-*(Continue weekly log throughout 6-month period)*
-
----
-
-## 🎯 Key Milestones
-
-- [ ] **End Month 1**: All distributed content extracted from V1 chapters
-- [ ] **End Month 2**: V1 chapters coherent and complete
-- [ ] **End Month 3**: Priority 1 new chapters drafted (160 pages)
-- [ ] **End Month 4**: Priority 2 new chapters drafted (85 pages)
-- [ ] **End Month 5**: All new content drafted (345 pages)
-- [ ] **End Month 6**: Camera-ready manuscripts for both volumes
-
----
-
-## ⚠️ Risks and Mitigation
-
-### Risk 1: Page Count Imbalance
-**Risk**: V1 or V2 ends up significantly larger/smaller than target
-**Mitigation**:
-- Monitor page counts weekly
-- Adjust compression/expansion as needed
-- Have flexibility targets (±100 pages)
-
-### Risk 2: Missing Dependencies
-**Risk**: V2 assumes V1 knowledge not actually covered
-**Mitigation**:
-- Create prerequisite matrix
-- Add recap sections to V2 chapters
-- Review cross-references monthly
-
-### Risk 3: Timeline Slippage
-**Risk**: New chapter writing takes longer than estimated
-**Mitigation**:
-- Prioritize essential chapters first
-- Have backup plan to defer Priority 3 chapters
-- Build 2-week buffer into timeline
-
-### Risk 4: Content Duplication
-**Risk**: Same concept explained in both volumes
-**Mitigation**:
-- Clear "basic vs. advanced" delineation
-- V2 references V1 explicitly
-- Review for overlap in Month 6
-
----
-
-## 📚 Reference Materials
-
-### Pedagogical Framework
-- **V1 Narrative**: Foundations → Building → Optimizing → Impact
-- **V2 Narrative**: Scale → Production → Responsibility
-- **Connection**: V1 ends with inspiration, V2 begins with bridge
-
-### Chapter Surgery Guidelines
-- **Single-machine boundary**: Keep in V1
-- **Distributed systems**: Move to V2
-- **Production scale**: Move to V2
-- **Advanced optimization**: Move to V2
-
-### Writing Standards
-- Timeless principles over current tech
-- Every chapter has: Purpose, Learning Outcomes, Summary
-- Concrete examples throughout
-- "Fallacies and Pitfalls" section
-
----
-
-## 🔧 Tools and Workflow
-
-### Version Control
-- [ ] Create `volume-split` branch in Git
-- [ ] Track all changes in branch
-- [ ] Regular commits with clear messages
-
-### Organization
-- `book/volume1/` - Volume 1 chapters
-- `book/volume2/` - Volume 2 chapters
-- `book/docs/` - All planning documents
-- `book/extracted/` - Content extracted from V1
-
-### Quality Checks
-- [ ] Weekly page count tracking
-- [ ] Monthly cross-reference review
-- [ ] Technical accuracy spot checks
-- [ ] Pedagogical flow reviews
-
----
-
-## 📞 Stakeholder Communication
-
-### MIT Press Updates
-- **Monthly**: Progress report with page counts
-- **Major milestones**: Notify when phases complete
-- **Issues**: Immediate communication of risks
-
-### Community/Reviewers
-- **End Month 2**: Share V1 draft for review
-- **End Month 4**: Share V2 draft chapters for review
-- **End Month 5**: Full review cycle
-
----
-
-## ✅ Final Checklist (Month 6)
-
-### Volume 1 Completion
-- [ ] All 14 chapters present and coherent
-- [ ] Page count: 1,150-1,250 pages
-- [ ] All cross-references to V2 marked clearly
-- [ ] Exercises and quizzes updated
-- [ ] Figures and tables numbered correctly
-- [ ] Bibliography complete
-- [ ] Index prepared
-
-### Volume 2 Completion
-- [ ] All 15 chapters present and coherent
-- [ ] Page count: 1,100-1,200 pages
-- [ ] Bridge chapter effective
-- [ ] New chapters integrate extracted content
-- [ ] Exercises and quizzes complete
-- [ ] Figures and tables numbered correctly
-- [ ] Bibliography complete
-- [ ] Index prepared
-
-### Both Volumes
-- [ ] Consistent notation across volumes
-- [ ] No content duplication
-- [ ] Clear prerequisite chain
-- [ ] Professional copyedit complete
-- [ ] Ready for MIT Press submission
-
----
-
-## 📊 Success Metrics
-
-### Quantitative
-- Volume 1: 1,150-1,250 pages ✓/✗
-- Volume 2: 1,100-1,200 pages ✓/✗
-- New content: 325-375 pages ✓/✗
-- Timeline: 6 months ✓/✗
-
-### Qualitative
-- Each volume independently valuable ✓/✗
-- Clear pedagogical progression ✓/✗
-- MIT Press approval ✓/✗
-- Reviewer feedback positive ✓/✗
-
----
-
-## 🎓 Post-Completion
-
-### Publication Process
-- [ ] Submit to MIT Press
-- [ ] Incorporate editorial feedback
-- [ ] Final production review
-- [ ] Marketing materials
-- [ ] Course adoption outreach
-
-### Maintenance
-- [ ] Errata tracking system
-- [ ] Annual review cycle
-- [ ] Community feedback integration
-- [ ] Future edition planning
-
----
-
-## 📝 Notes and Decisions
-
-### December 2024 - Project Launch
-- Decision: Committed to 6-month timeline
-- Decision: Will do full surgery, not quick split
-- Decision: Flagship quality is priority over speed
-- Next decision needed: [track decisions here]
-
----
-
-**Last Updated**: December 7, 2024
-**Status**: Planning Complete - Ready to Begin Execution
-**Next Action**: Begin Chapter 6 (Data Engineering) surgery - Week 1
-
----
-
-*This roadmap is the master coordination document for the two-volume split project. Update weekly with progress, decisions, and course corrections.*
diff --git a/book/docs/VOLUME_SPLIT_SURGICAL_PLAN.md b/book/docs/VOLUME_SPLIT_SURGICAL_PLAN.md
deleted file mode 100644
index 59b8fc6f0..000000000
--- a/book/docs/VOLUME_SPLIT_SURGICAL_PLAN.md
+++ /dev/null
@@ -1,1433 +0,0 @@
-# Machine Learning Systems: Comprehensive Volume Split Surgical Plan
-
-**Document Version**: December 2024
-**Purpose**: Detailed section-by-section surgery roadmap for splitting ML Systems textbook into two volumes
-**Timeline**: 2-month execution phase
-
----
-
-## Executive Summary
-
-This surgical plan provides precise instructions for splitting each of the 22 chapters between Volume 1 (Introduction to ML Systems) and Volume 2 (Advanced ML Systems). Every section and subsection has been analyzed and assigned a specific action.
-
-### Decision Legend
-- **KEEP_V1**: Retain in Volume 1 as-is
-- **MOVE_V2**: Move entirely to Volume 2
-- **SPLIT**: Divide content between volumes (specify what goes where)
-- **MODIFY**: Rewrite/restructure for target volume
-- **REMOVE**: Delete as redundant or out of scope
-- **BRIDGE**: Add summary/recap content for cross-volume references
-
-### Volume Targets
-- **Volume 1**: 14 chapters, ~800 pages (focus: single-system ML)
-- **Volume 2**: 14 chapters, ~800 pages (focus: distributed systems and advanced topics)
-
----
-
-## PART I: FOUNDATIONAL CHAPTERS (1-4)
-*These establish core concepts needed by both volumes*
-
-## CHAPTER 1: Introduction (Current: ~90 pages)
-**V1 Target: 60 pages | V2 Target: 0 pages | Remove: 30 pages**
-
-### ## Purpose (2 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Essential framing for the textbook
-
-### ## The Engineering Revolution in Artificial Intelligence (4 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Sets context for entire book series
-
-### ## From Artificial Intelligence Vision to Machine Learning Practice (6 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Historical context valuable for all readers
-
-### ## Defining ML Systems (20 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Core definitional content
-
-### ## How ML Systems Differ from Traditional Software (3 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Fundamental distinction needed early
-
-### ## The Bitter Lesson: Why Systems Engineering Matters (6 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Central thesis of the book
-
-### ## Historical Evolution of AI Paradigms (30 pages)
-**DECISION**: MODIFY
-**RATIONALE**: Too long for intro chapter, compress to 15 pages
-**ACTION**: Condense each era from 5 pages to 2-3 pages
-
-#### ### Symbolic AI Era (5 pages)
-**DECISION**: MODIFY - Compress to 2 pages
-
-#### ### Expert Systems Era (4 pages)
-**DECISION**: MODIFY - Compress to 2 pages
-
-#### ### Statistical Learning Era (8 pages)
-**DECISION**: MODIFY - Compress to 3 pages
-
-#### ### Shallow Learning Era (6 pages)
-**DECISION**: MODIFY - Compress to 3 pages
-
-#### ### Deep Learning Era (7 pages)
-**DECISION**: KEEP_V1 - Keep at current length
-
-### ## Understanding ML System Lifecycle and Deployment (8 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Essential lifecycle overview
-
-#### ### The ML Development Lifecycle (3 pages)
-**DECISION**: KEEP_V1
-
-#### ### The Deployment Spectrum (2 pages)
-**DECISION**: KEEP_V1
-
-#### ### How Deployment Shapes the Lifecycle (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Case Studies in Real-World ML Systems (6 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Concrete examples ground abstract concepts
-
-#### ### Case Study: Autonomous Vehicles (4 pages)
-**DECISION**: KEEP_V1
-
-#### ### Contrasting Deployment Scenarios (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Core Engineering Challenges in ML Systems (8 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Basic challenges (4 pages)
-**V2_CONTENT**: Scale-related challenges move to V2 intro
-
-#### ### Data Challenges (2 pages)
-**DECISION**: KEEP_V1
-
-#### ### Model Challenges (2 pages)
-**DECISION**: KEEP_V1
-
-#### ### System Challenges (2 pages)
-**DECISION**: SPLIT - Basic in V1, distributed in V2
-
-#### ### Ethical Considerations (1 page)
-**DECISION**: KEEP_V1 - Brief mention, detail in V2
-
-#### ### Understanding Challenge Interconnections (1 page)
-**DECISION**: KEEP_V1
-
-### ## Defining AI Engineering (4 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Core professional identity content
-
-### ## Organizing ML Systems Engineering: The Five-Pillar Framework (8 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Structural framework for book
-
-#### ### The Five Engineering Disciplines (3 pages)
-**DECISION**: KEEP_V1
-
-#### ### Connecting Components, Lifecycle, and Disciplines (2 pages)
-**DECISION**: KEEP_V1
-
-#### ### Future Directions in ML Systems Engineering (1 page)
-**DECISION**: REMOVE - Save for V2
-
-#### ### The Nature of Systems Knowledge (1 page)
-**DECISION**: KEEP_V1
-
-#### ### How to Use This Textbook (1 page)
-**DECISION**: MODIFY - Update for two-volume structure
-
----
-
-## CHAPTER 2: ML Systems (Current: ~70 pages)
-**V1 Target: 70 pages | V2 Target: 0 pages**
-
-### ## Purpose (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Deployment Paradigm Framework (3 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Essential taxonomy for understanding ML systems
-
-### ## The Deployment Spectrum (20 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Core conceptual framework
-
-#### ### Deployment Paradigm Foundations (15 pages)
-**DECISION**: KEEP_V1
-
-### ## Cloud ML: Maximizing Computational Power (10 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Basic cloud concepts (5 pages)
-**V2_CONTENT**: Large-scale distributed training details → V2 Ch2
-
-#### ### Cloud Infrastructure and Scale (3 pages)
-**DECISION**: KEEP_V1 - Basic concepts only
-
-#### ### Cloud ML Trade-offs and Constraints (2 pages)
-**DECISION**: KEEP_V1
-
-#### ### Large-Scale Training and Inference (5 pages)
-**DECISION**: MOVE_V2 - Goes to "Distributed Training" chapter
-**V2_DESTINATION**: Volume 2, Chapter 4: Distributed Training
-
-### ## Edge ML: Reducing Latency and Privacy Risk (10 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Single-device edge important for V1
-
-#### ### Distributed Processing Architecture (3 pages)
-**DECISION**: MOVE_V2 - Multi-device edge is advanced
-**V2_DESTINATION**: Volume 2, Chapter 8: Edge Deployment
-
-#### ### Edge ML Benefits and Deployment Challenges (4 pages)
-**DECISION**: KEEP_V1
-
-#### ### Real-Time Industrial and IoT Systems (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Mobile ML: Personal and Offline Intelligence (10 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Single-device mobile essential for V1
-
-#### ### Battery and Thermal Constraints (3 pages)
-**DECISION**: KEEP_V1
-
-#### ### Mobile ML Benefits and Resource Constraints (4 pages)
-**DECISION**: KEEP_V1
-
-#### ### Personal Assistant and Media Processing (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Tiny ML: Ubiquitous Sensing at Scale (10 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Important for complete deployment spectrum
-
-#### ### Extreme Resource Constraints (3 pages)
-**DECISION**: KEEP_V1
-
-#### ### TinyML Advantages and Operational Trade-offs (3 pages)
-**DECISION**: KEEP_V1
-
-#### ### Environmental and Health Monitoring (4 pages)
-**DECISION**: KEEP_V1
-
-### ## Hybrid Architectures: Combining Paradigms (8 pages)
-**DECISION**: MOVE_V2
-**RATIONALE**: Multi-tier systems are advanced topic
-**V2_DESTINATION**: Volume 2, Chapter 6: Inference Systems
-
-#### ### Multi-Tier Integration Patterns (4 pages)
-**DECISION**: MOVE_V2
-
-#### ### Production System Case Studies (4 pages)
-**DECISION**: MOVE_V2
-
-### ## Shared Principles Across Deployment Paradigms (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Comparative Analysis and Selection Framework (8 pages)
-**DECISION**: KEEP_V1
-
-### ## Decision Framework for Deployment Selection (8 pages)
-**DECISION**: KEEP_V1
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Summary (2 pages)
-**DECISION**: KEEP_V1
-
----
-
-## CHAPTER 3: Deep Learning Primer (Current: ~100 pages)
-**V1 Target: 100 pages | V2 Target: 0 pages**
-
-*ALL SECTIONS: KEEP_V1*
-**RATIONALE**: Entire chapter is foundational knowledge needed before any advanced topics
-
-### ## Purpose (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Deep Learning Systems Engineering Foundation (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Evolution of ML Paradigms (20 pages)
-**DECISION**: KEEP_V1
-
-### ## From Biology to Silicon (15 pages)
-**DECISION**: KEEP_V1
-
-### ## Neural Network Fundamentals (30 pages)
-**DECISION**: KEEP_V1
-
-### ## Learning Process (20 pages)
-**DECISION**: KEEP_V1
-
-### ## Inference Pipeline (8 pages)
-**DECISION**: KEEP_V1
-
-### ## Case Study: USPS Digit Recognition (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Deep Learning and the AI Triangle (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Summary (2 pages)
-**DECISION**: KEEP_V1
-
----
-
-## CHAPTER 4: DNN Architectures (Current: ~100 pages)
-**V1 Target: 100 pages | V2 Target: 0 pages**
-
-*ALL SECTIONS: KEEP_V1*
-**RATIONALE**: Core architectures needed for understanding all subsequent content
-
-### ## Purpose (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Architectural Principles and Engineering Trade-offs (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Multi-Layer Perceptrons: Dense Pattern Processing (20 pages)
-**DECISION**: KEEP_V1
-
-### ## CNNs: Spatial Pattern Processing (25 pages)
-**DECISION**: KEEP_V1
-
-### ## RNNs: Sequential Pattern Processing (15 pages)
-**DECISION**: KEEP_V1
-
-### ## Attention Mechanisms: Dynamic Pattern Processing (30 pages)
-**DECISION**: KEEP_V1
-
-### ## Architectural Building Blocks (10 pages)
-**DECISION**: KEEP_V1
-
-### ## System-Level Building Blocks (15 pages)
-**DECISION**: KEEP_V1
-
-### ## Architecture Selection Framework (10 pages)
-**DECISION**: KEEP_V1
-
-### ## Unified Framework: Inductive Biases (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Summary (2 pages)
-**DECISION**: KEEP_V1
-
----
-
-## PART II: DESIGN PRINCIPLES (5-8)
-*Building ML systems end-to-end*
-
-## CHAPTER 5: Workflow (Current: ~40 pages)
-**V1 Target: 40 pages | V2 Target: 0 pages**
-
-*ALL SECTIONS: KEEP_V1*
-**RATIONALE**: Workflow fundamentals apply to all scales
-
-### ## Purpose (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Systematic Framework for ML Development (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Understanding the ML Lifecycle (5 pages)
-**DECISION**: KEEP_V1
-
-### ## ML vs Traditional Software Development (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Six Core Lifecycle Stages (20 pages total)
-**DECISION**: KEEP_V1 (all subsections)
-
-### ## Integrating Systems Thinking Principles (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Summary (2 pages)
-**DECISION**: KEEP_V1
-
----
-
-## CHAPTER 6: Data Engineering (Current: ~120 pages)
-**V1 Target: 80 pages | V2 Target: 40 pages**
-
-### ## Purpose (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Data Engineering as a Systems Discipline (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Four Pillars Framework (20 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Core framework applies at all scales
-
-### ## Data Cascades and Systematic Foundations (10 pages)
-**DECISION**: KEEP_V1
-
-### ## Data Pipeline Architecture (15 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Single-machine pipelines (10 pages)
-**V2_CONTENT**: Distributed pipelines (5 pages) → V2 Ch2
-
-#### ### Quality Through Validation and Monitoring (4 pages)
-**DECISION**: KEEP_V1
-
-#### ### Reliability Through Graceful Degradation (3 pages)
-**DECISION**: KEEP_V1
-
-#### ### Scalability Patterns (4 pages)
-**DECISION**: MOVE_V2 → Storage Systems chapter
-**V2_DESTINATION**: Volume 2, Chapter 2: Storage Systems for ML
-
-#### ### Governance Through Observability (4 pages)
-**DECISION**: SPLIT - Basic in V1, distributed in V2
-
-### ## Strategic Data Acquisition (20 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Single-source acquisition (15 pages)
-**V2_CONTENT**: Multi-source federation (5 pages) → V2 Ch2
-
-### ## Data Ingestion (15 pages)
-**DECISION**: SPLIT
-
-#### ### Batch vs. Streaming Ingestion Patterns (8 pages)
-**DECISION**: SPLIT - Batch in V1, streaming to V2
-**V2_DESTINATION**: Volume 2, Chapter 2: Storage Systems
-
-#### ### ETL and ELT Comparison (4 pages)
-**DECISION**: KEEP_V1
-
-#### ### Multi-Source Integration Strategies (3 pages)
-**DECISION**: MOVE_V2
-**V2_DESTINATION**: Volume 2, Chapter 2: Storage Systems
-
-### ## Systematic Data Processing (15 pages)
-**DECISION**: SPLIT
-
-#### ### Ensuring Training-Serving Consistency (3 pages)
-**DECISION**: KEEP_V1
-
-#### ### Building Idempotent Data Transformations (3 pages)
-**DECISION**: KEEP_V1
-
-#### ### Scaling Through Distributed Processing (3 pages)
-**DECISION**: MOVE_V2
-**V2_DESTINATION**: Volume 2, Chapter 2: Storage Systems
-
-#### ### Tracking Data Transformation Lineage (3 pages)
-**DECISION**: KEEP_V1
-
-#### ### End-to-End Processing Pipeline Design (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Data Labeling (15 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Core capability needed at all scales
-
-### ## Strategic Storage Architecture (10 pages)
-**DECISION**: SPLIT
-
-#### ### ML Storage Systems Architecture Options (3 pages)
-**DECISION**: KEEP_V1 - Basic options
-
-#### ### ML Storage Requirements and Performance (3 pages)
-**DECISION**: SPLIT - Basic in V1, distributed in V2
-
-#### ### Storage Across the ML Lifecycle (2 pages)
-**DECISION**: KEEP_V1
-
-#### ### Feature Stores: Bridging Training and Serving (2 pages)
-**DECISION**: MOVE_V2
-**V2_DESTINATION**: Volume 2, Chapter 2: Storage Systems
-
-### ## Data Governance (20 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Basic governance (10 pages)
-**V2_CONTENT**: Enterprise governance (10 pages) → V2 Ch9
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Summary (2 pages)
-**DECISION**: KEEP_V1
-
----
-
-## CHAPTER 7: Frameworks (Current: ~150 pages)
-**V1 Target: 100 pages | V2 Target: 50 pages**
-
-### ## Purpose (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Framework Abstraction and Necessity (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Historical Development Trajectory (10 pages)
-**DECISION**: KEEP_V1
-
-### ## Fundamental Concepts (80 pages)
-**DECISION**: SPLIT
-
-#### ### Computational Graphs (20 pages)
-**DECISION**: KEEP_V1
-
-#### ### Automatic Differentiation (30 pages)
-**DECISION**: KEEP_V1
-
-#### ### Data Structures (20 pages)
-**DECISION**: KEEP_V1
-
-#### ### Programming and Execution Models (30 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Single-device execution (20 pages)
-**V2_CONTENT**: Distributed execution (10 pages) → V2 Ch4
-
-#### ### Core Operations (20 pages)
-**DECISION**: KEEP_V1
-
-### ## Framework Architecture (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Framework Ecosystem (5 pages)
-**DECISION**: KEEP_V1
-
-### ## System Integration (8 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Local integration (4 pages)
-**V2_CONTENT**: Distributed integration (4 pages) → V2 Ch4
-
-### ## Major Framework Platform Analysis (30 pages)
-**DECISION**: SPLIT
-
-#### ### TensorFlow Ecosystem (10 pages)
-**DECISION**: SPLIT - Core in V1, distributed in V2
-
-#### ### PyTorch (10 pages)
-**DECISION**: SPLIT - Core in V1, distributed in V2
-
-#### ### JAX (5 pages)
-**DECISION**: KEEP_V1
-
-#### ### Framework Design Philosophy (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Deployment Environment-Specific Frameworks (20 pages)
-**DECISION**: SPLIT
-
-#### ### Distributed Computing Platform Optimization (5 pages)
-**DECISION**: MOVE_V2
-**V2_DESTINATION**: Volume 2, Chapter 4: Distributed Training
-
-#### ### Local Processing and Low-Latency Optimization (5 pages)
-**DECISION**: KEEP_V1
-
-#### ### Resource-Constrained Device Optimization (5 pages)
-**DECISION**: KEEP_V1
-
-#### ### Microcontroller and Embedded System Implementation (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Systematic Framework Selection Methodology (10 pages)
-**DECISION**: KEEP_V1
-
-### ## Systematic Framework Performance Assessment (8 pages)
-**DECISION**: KEEP_V1
-
-### ## Common Framework Selection Misconceptions (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Summary (2 pages)
-**DECISION**: KEEP_V1
-
----
-
-## CHAPTER 8: Training (Current: ~160 pages)
-**V1 Target: 100 pages | V2 Target: 60 pages**
-
-### ## Purpose (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Training Systems Evolution and Architecture (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Training Systems (10 pages)
-**DECISION**: KEEP_V1
-
-### ## Mathematical Foundations (30 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Core math needed for all training
-
-### ## Pipeline Architecture (25 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Single-machine pipeline fundamentals
-
-### ## Pipeline Optimizations (40 pages)
-**DECISION**: SPLIT
-
-#### ### Systematic Optimization Framework (3 pages)
-**DECISION**: KEEP_V1
-
-#### ### Production Optimization Decision Framework (2 pages)
-**DECISION**: KEEP_V1
-
-#### ### Data Prefetching and Pipeline Overlapping (10 pages)
-**DECISION**: KEEP_V1
-
-#### ### Mixed-Precision Training (8 pages)
-**DECISION**: KEEP_V1
-
-#### ### Gradient Accumulation and Checkpointing (10 pages)
-**DECISION**: KEEP_V1
-
-#### ### Optimization Technique Comparison (3 pages)
-**DECISION**: KEEP_V1
-
-#### ### Multi-Machine Scaling Fundamentals (4 pages)
-**DECISION**: MOVE_V2
-**V2_DESTINATION**: Volume 2, Chapter 4: Distributed Training
-
-### ## Distributed Systems (60 pages)
-**DECISION**: MOVE_V2
-**RATIONALE**: Entire section about multi-machine training
-**V2_DESTINATION**: Volume 2, Chapter 4: Distributed Training
-
-#### ### Distributed Training Efficiency Metrics
-**DECISION**: MOVE_V2
-
-#### ### Data Parallelism
-**DECISION**: MOVE_V2
-
-#### ### Model Parallelism
-**DECISION**: MOVE_V2
-
-#### ### Hybrid Parallelism
-**DECISION**: MOVE_V2
-
-#### ### Parallelism Strategy Comparison
-**DECISION**: MOVE_V2
-
-#### ### Framework Integration
-**DECISION**: MOVE_V2
-
-### ## Performance Optimization (5 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Single-machine optimization (3 pages)
-**V2_CONTENT**: Distributed optimization (2 pages) → V2 Ch4
-
-### ## Hardware Acceleration (8 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Single-accelerator (5 pages)
-**V2_CONTENT**: Multi-accelerator (3 pages) → V2 Ch4
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Summary (2 pages)
-**DECISION**: KEEP_V1
-
----
-
-## PART III: PERFORMANCE ENGINEERING (9-12)
-*Making ML systems efficient*
-
-## CHAPTER 9: Efficient AI (Current: ~60 pages)
-**V1 Target: 60 pages | V2 Target: 0 pages**
-
-*ALL SECTIONS: KEEP_V1*
-**RATIONALE**: Efficiency principles apply at all scales
-
-### ## Purpose (2 pages)
-**DECISION**: KEEP_V1
-
-### ## The Efficiency Imperative (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Defining System Efficiency (3 pages)
-**DECISION**: KEEP_V1
-
-### ## AI Scaling Laws (20 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Fundamental principles needed for understanding efficiency
-
-### ## The Efficiency Framework (25 pages)
-**DECISION**: KEEP_V1
-
-### ## Real-World Efficiency Strategies (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Efficiency Trade-offs and Challenges (10 pages)
-**DECISION**: KEEP_V1
-
-### ## Engineering Principles for Efficient AI (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Societal and Ethical Implications (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Summary (2 pages)
-**DECISION**: KEEP_V1
-
----
-
-## CHAPTER 10: Optimizations (Current: ~200 pages)
-**V1 Target: 120 pages | V2 Target: 80 pages**
-
-### ## Purpose (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Model Optimization Fundamentals (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Optimization Framework (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Deployment Context (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Framework Application and Navigation (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Optimization Dimensions (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Structural Model Optimization Methods (80 pages)
-**DECISION**: SPLIT
-
-#### ### Pruning (40 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Basic pruning techniques (25 pages)
-**V2_CONTENT**: Advanced/structured pruning (15 pages) → V2 Ch4
-
-#### ### Knowledge Distillation (15 pages)
-**DECISION**: KEEP_V1
-
-#### ### Structured Approximations (20 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Basic approximations (10 pages)
-**V2_CONTENT**: Advanced approximations (10 pages) → V2 Ch4
-
-#### ### Neural Architecture Search (5 pages)
-**DECISION**: MOVE_V2
-**RATIONALE**: NAS requires significant compute
-**V2_DESTINATION**: Volume 2, Chapter 4: Distributed Training
-
-### ## Quantization and Precision Optimization (50 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Basic quantization (30 pages)
-**V2_CONTENT**: Extreme quantization (20 pages) → V2 Ch8
-
-### ## Architectural Efficiency Techniques (30 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Basic techniques (20 pages)
-**V2_CONTENT**: Advanced techniques (10 pages) → V2 Ch8
-
-### ## Implementation Strategy and Evaluation (5 pages)
-**DECISION**: KEEP_V1
-
-### ## AutoML and Automated Optimization Strategies (5 pages)
-**DECISION**: MOVE_V2
-**V2_DESTINATION**: Volume 2, Chapter 4: Distributed Training
-
-### ## Implementation Tools and Software Frameworks (10 pages)
-**DECISION**: KEEP_V1
-
-### ## Technique Comparison (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Summary (2 pages)
-**DECISION**: KEEP_V1
-
----
-
-## CHAPTER 11: Hardware Acceleration (Current: ~140 pages)
-**V1 Target: 90 pages | V2 Target: 50 pages**
-
-### ## Purpose (2 pages)
-**DECISION**: KEEP_V1
-
-### ## AI Hardware Acceleration Fundamentals (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Evolution of Hardware Specialization (15 pages)
-**DECISION**: KEEP_V1
-
-### ## AI Compute Primitives (20 pages)
-**DECISION**: KEEP_V1
-
-### ## AI Memory Systems (15 pages)
-**DECISION**: KEEP_V1
-
-### ## Hardware Mapping Fundamentals for Neural Networks (10 pages)
-**DECISION**: KEEP_V1
-
-### ## Dataflow Optimization Strategies (20 pages)
-**DECISION**: KEEP_V1
-
-### ## Compiler Support (10 pages)
-**DECISION**: KEEP_V1
-
-### ## Runtime Support (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Multi-Chip AI Acceleration (20 pages)
-**DECISION**: MOVE_V2
-**RATIONALE**: Multi-chip is distributed computing
-**V2_DESTINATION**: Volume 2, Chapter 1: Memory Hierarchies
-
-#### ### Chiplet-Based Architectures
-**DECISION**: MOVE_V2
-
-#### ### Multi-GPU Systems
-**DECISION**: MOVE_V2
-
-#### ### TPU Pods
-**DECISION**: MOVE_V2
-
-#### ### Wafer-Scale AI
-**DECISION**: MOVE_V2
-
-### ## Heterogeneous SoC AI Acceleration (8 pages)
-**DECISION**: KEEP_V1
-**RATIONALE**: Single-chip heterogeneity important for edge
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Summary (2 pages)
-**DECISION**: KEEP_V1
-
----
-
-## CHAPTER 12: Benchmarking (Current: ~120 pages)
-**V1 Target: 80 pages | V2 Target: 40 pages**
-
-### ## Purpose (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Machine Learning Benchmarking Framework (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Historical Context (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Machine Learning Benchmarks (15 pages)
-**DECISION**: KEEP_V1
-
-### ## Benchmarking Granularity (10 pages)
-**DECISION**: KEEP_V1
-
-### ## Benchmark Components (20 pages)
-**DECISION**: KEEP_V1
-
-### ## Training vs. Inference Evaluation (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Training Benchmarks (20 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Single-system benchmarks (10 pages)
-**V2_CONTENT**: Distributed benchmarks (10 pages) → V2 Ch4
-
-### ## Inference Benchmarks (20 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Single-system benchmarks (10 pages)
-**V2_CONTENT**: Distributed benchmarks (10 pages) → V2 Ch6
-
-### ## Power Measurement Techniques (20 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Device-level measurement (10 pages)
-**V2_CONTENT**: Datacenter measurement (10 pages) → V2 Ch13
-
-### ## Benchmarking Limitations and Best Practices (20 pages)
-**DECISION**: KEEP_V1
-
-### ## Model and Data Benchmarking (15 pages)
-**DECISION**: KEEP_V1
-
-### ## Production Environment Evaluation (5 pages)
-**DECISION**: MOVE_V2
-**V2_DESTINATION**: Volume 2, Chapter 6: Inference Systems
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Summary (2 pages)
-**DECISION**: KEEP_V1
-
----
-
-## PART IV: PRACTICE & IMPACT (13-14 for V1)
-
-## CHAPTER 13: ML Operations (Current: ~80 pages)
-**V1 Target: 50 pages | V2 Target: 30 pages**
-
-### ## Purpose (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Introduction to Machine Learning Operations (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Historical Context (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Technical Debt and System Complexity (20 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Basic technical debt (10 pages)
-**V2_CONTENT**: Distributed system debt (10 pages) → V2 Ch5
-
-### ## Development Infrastructure and Automation (30 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Single-system CI/CD (20 pages)
-**V2_CONTENT**: Distributed CI/CD (10 pages) → V2 Ch5
-
-### ## Production Operations (30 pages)
-**DECISION**: SPLIT
-
-#### ### Model Deployment and Serving (10 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Single-model serving (5 pages)
-**V2_CONTENT**: Multi-model orchestration (5 pages) → V2 Ch6
-
-#### ### Resource Management and Performance Monitoring (8 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Single-system monitoring (4 pages)
-**V2_CONTENT**: Distributed monitoring (4 pages) → V2 Ch5
-
-#### ### Model Governance and Team Coordination (8 pages)
-**DECISION**: KEEP_V1
-
-#### ### Managing Hidden Technical Debt (4 pages)
-**DECISION**: KEEP_V1
-
-### ## Case Studies (15 pages)
-**DECISION**: SPLIT
-**V1_CONTENT**: Single-system deployments (8 pages)
-**V2_CONTENT**: Large-scale deployments (7 pages) → V2 Ch5
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Summary (2 pages)
-**DECISION**: KEEP_V1
-
----
-
-## CHAPTER 14: AI for Good (Current: ~50 pages)
-**V1 Target: 50 pages | V2 Target: 0 pages**
-
-*ALL SECTIONS: KEEP_V1*
-**RATIONALE**: Positive conclusion for Volume 1, inspiring students
-
-### ## Purpose (2 pages)
-**DECISION**: KEEP_V1
-
-### ## Trustworthy AI Under Extreme Constraints (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Societal Challenges and AI Opportunities (3 pages)
-**DECISION**: KEEP_V1
-
-### ## Real-World Deployment Paradigms (8 pages)
-**DECISION**: KEEP_V1
-
-### ## Sustainable Development Goals Framework (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Resource Constraints and Engineering Challenges (10 pages)
-**DECISION**: KEEP_V1
-
-### ## Design Pattern Framework (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Design Patterns Implementation (25 pages)
-**DECISION**: KEEP_V1
-
-### ## Theoretical Foundations for Constrained Learning (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Common Deployment Failures and Sociotechnical Pitfalls (5 pages)
-**DECISION**: KEEP_V1
-
-### ## Summary (2 pages)
-**DECISION**: KEEP_V1
-
----
-
-## CHAPTERS MOVING TO VOLUME 2 (15-21)
-
-## CHAPTER 15: On-Device Learning → V2 Chapter 7
-**V1 Target: 0 pages | V2 Target: 80 pages**
-
-*ALL SECTIONS: MOVE_V2*
-**RATIONALE**: Advanced topic requiring distributed coordination
-
-### ## Purpose (2 pages)
-**DECISION**: MOVE_V2
-
-### ## Distributed Learning Paradigm Shift (4 pages)
-**DECISION**: MOVE_V2
-
-### ## Motivations and Benefits (20 pages)
-**DECISION**: MOVE_V2
-
-### ## Design Constraints (20 pages)
-**DECISION**: MOVE_V2
-
-### ## Model Adaptation (20 pages)
-**DECISION**: MOVE_V2
-
-### ## Data Efficiency (10 pages)
-**DECISION**: MOVE_V2
-
-### ## Federated Learning (25 pages)
-**DECISION**: MOVE_V2
-
-### ## Production Integration (10 pages)
-**DECISION**: MOVE_V2
-
-### ## Systems Integration for Production Deployment (5 pages)
-**DECISION**: MOVE_V2
-
-### ## Persistent Technical and Operational Challenges (15 pages)
-**DECISION**: MOVE_V2
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: MOVE_V2
-
-### ## Summary (2 pages)
-**DECISION**: MOVE_V2
-
----
-
-## CHAPTER 16: Privacy & Security → V2 Chapter 9-10
-**V1 Target: 0 pages | V2 Target: 100 pages**
-
-*Note: Split into two chapters in V2*
-
-### Privacy Content → V2 Chapter 9: Privacy in ML Systems (50 pages)
-
-### ## Purpose (2 pages)
-**DECISION**: MOVE_V2
-
-### ## Foundational Concepts and Definitions (10 pages)
-**DECISION**: MOVE_V2
-
-### ## Privacy-Preserving Data Techniques (30 pages)
-**DECISION**: MOVE_V2
-
-### ## Federated Learning Privacy (included from Ch15)
-**DECISION**: MOVE_V2
-
-### Security Content → V2 Chapter 10: Security in ML Systems (50 pages)
-
-### ## Learning from Security Breaches (20 pages)
-**DECISION**: MOVE_V2
-
-### ## Model-Specific Attack Vectors (20 pages)
-**DECISION**: MOVE_V2
-
-### ## Hardware-Level Security Vulnerabilities (15 pages)
-**DECISION**: MOVE_V2
-
-### ## Comprehensive Defense Architectures (30 pages)
-**DECISION**: MOVE_V2
-
-### ## Practical Implementation Roadmap (8 pages)
-**DECISION**: MOVE_V2
-
----
-
-## CHAPTER 17: Robust AI → V2 Chapter 11
-**V1 Target: 0 pages | V2 Target: 100 pages**
-
-*ALL SECTIONS: MOVE_V2*
-**RATIONALE**: Production robustness at scale
-
-### ## Purpose (2 pages)
-**DECISION**: MOVE_V2
-
-### ## Introduction to Robust AI Systems (5 pages)
-**DECISION**: MOVE_V2
-
-### ## Real-World Robustness Failures (15 pages)
-**DECISION**: MOVE_V2
-
-### ## A Unified Framework for Robust AI (10 pages)
-**DECISION**: MOVE_V2
-
-### ## Hardware Faults (35 pages)
-**DECISION**: MOVE_V2
-
-### ## Intentional Input Manipulation (10 pages)
-**DECISION**: MOVE_V2
-
-### ## Environmental Shifts (5 pages)
-**DECISION**: MOVE_V2
-
-### ## Input-Level Attacks and Model Robustness (35 pages)
-**DECISION**: MOVE_V2
-
-### ## Software Faults (20 pages)
-**DECISION**: MOVE_V2
-
-### ## Fault Injection Tools and Frameworks (10 pages)
-**DECISION**: MOVE_V2
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: MOVE_V2
-
-### ## Summary (2 pages)
-**DECISION**: MOVE_V2
-
----
-
-## CHAPTER 18: Responsible AI → V2 Chapter 12
-**V1 Target: 0 pages | V2 Target: 80 pages**
-
-*ALL SECTIONS: MOVE_V2*
-**RATIONALE**: Scale changes responsibility challenges
-
-### ## Purpose (2 pages)
-**DECISION**: MOVE_V2
-
-### ## Introduction to Responsible AI (5 pages)
-**DECISION**: MOVE_V2
-
-### ## Core Principles (5 pages)
-**DECISION**: MOVE_V2
-
-### ## Integrating Principles Across the ML Lifecycle (20 pages)
-**DECISION**: MOVE_V2
-
-### ## Responsible AI Across Deployment Environments (15 pages)
-**DECISION**: MOVE_V2
-
-### ## Technical Foundations (30 pages)
-**DECISION**: MOVE_V2
-
-### ## Sociotechnical Dynamics (10 pages)
-**DECISION**: MOVE_V2
-
-### ## Implementation Challenges (15 pages)
-**DECISION**: MOVE_V2
-
-### ## AI Safety and Value Alignment (8 pages)
-**DECISION**: MOVE_V2
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: MOVE_V2
-
-### ## Summary (2 pages)
-**DECISION**: MOVE_V2
-
----
-
-## CHAPTER 19: Sustainable AI → V2 Chapter 13
-**V1 Target: 0 pages | V2 Target: 80 pages**
-
-*ALL SECTIONS: MOVE_V2*
-**RATIONALE**: Datacenter-scale sustainability
-
-### ## Purpose (2 pages)
-**DECISION**: MOVE_V2
-
-### ## Sustainable AI as an Engineering Discipline (3 pages)
-**DECISION**: MOVE_V2
-
-### ## The Sustainability Crisis in AI (3 pages)
-**DECISION**: MOVE_V2
-
-### ## Part I: Environmental Impact and Ethical Foundations (8 pages)
-**DECISION**: MOVE_V2
-
-### ## Part II: Measurement and Assessment (40 pages)
-**DECISION**: MOVE_V2
-
-### ## Hardware Lifecycle Environmental Assessment (10 pages)
-**DECISION**: MOVE_V2
-
-### ## Part III: Implementation and Solutions (15 pages)
-**DECISION**: MOVE_V2
-
-### ## Embedded AI and E-Waste (10 pages)
-**DECISION**: MOVE_V2
-
-### ## Policy and Regulation (8 pages)
-**DECISION**: MOVE_V2
-
-### ## Public Engagement (8 pages)
-**DECISION**: MOVE_V2
-
-### ## Future Challenges (5 pages)
-**DECISION**: MOVE_V2
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: MOVE_V2
-
-### ## Summary (2 pages)
-**DECISION**: MOVE_V2
-
----
-
-## CHAPTER 20: Frontiers → V2 Chapter 14
-**V1 Target: 0 pages | V2 Target: 80 pages**
-
-*ALL SECTIONS: MOVE_V2*
-**RATIONALE**: Advanced future directions
-
-### ## Purpose (2 pages)
-**DECISION**: MOVE_V2
-
-### ## From Specialized AI to General Intelligence (3 pages)
-**DECISION**: MOVE_V2
-
-### ## Defining AGI: Intelligence as a Systems Problem (8 pages)
-**DECISION**: MOVE_V2
-
-### ## The Compound AI Systems Framework (3 pages)
-**DECISION**: MOVE_V2
-
-### ## Building Blocks for Compound Intelligence (20 pages)
-**DECISION**: MOVE_V2
-
-### ## Alternative Architectures for AGI (8 pages)
-**DECISION**: MOVE_V2
-
-### ## Training Methodologies for Compound Systems (20 pages)
-**DECISION**: MOVE_V2
-
-### ## Production Deployment of Compound AI Systems (15 pages)
-**DECISION**: MOVE_V2
-
-### ## Remaining Technical Barriers (10 pages)
-**DECISION**: MOVE_V2
-
-### ## Emergent Intelligence Through Multi-Agent Coordination (5 pages)
-**DECISION**: MOVE_V2
-
-### ## Engineering Pathways to AGI (5 pages)
-**DECISION**: MOVE_V2
-
-### ## Implications for ML Systems Engineers (5 pages)
-**DECISION**: MOVE_V2
-
-### ## Core Design Principles for AGI Systems (2 pages)
-**DECISION**: MOVE_V2
-
-### ## Fallacies and Pitfalls (2 pages)
-**DECISION**: MOVE_V2
-
-### ## Summary (2 pages)
-**DECISION**: MOVE_V2
-
----
-
-## CHAPTER 21: AGI Systems (REMOVE - Content merged into Frontiers)
-**V1 Target: 0 pages | V2 Target: 0 pages**
-
-**DECISION**: REMOVE
-**RATIONALE**: Content consolidated into expanded Frontiers chapter
-
----
-
-## CHAPTER 22: Conclusion
-**V1 Target: 10 pages | V2 Target: 10 pages**
-
-### Volume 1 Conclusion (NEW - 10 pages)
-**DECISION**: CREATE NEW
-**CONTENT**:
-- Synthesize single-system ML engineering
-- Bridge to Volume 2 concepts
-- Inspire continued learning
-- Celebrate accomplishments
-
-### Volume 2 Conclusion (MODIFY from existing - 10 pages)
-**DECISION**: MODIFY
-**CONTENT**:
-- Synthesize distributed systems principles
-- Future of ML systems at scale
-- Call to action for responsible development
-
----
-
-## NEW VOLUME 2 CHAPTERS NEEDED
-
-### V2 Chapter 1: Memory Hierarchies for ML (NEW - 60 pages)
-**SOURCE**: Extract from distributed sections of Ch11 + new content
-**TOPICS**:
-- GPU memory management
-- HBM architecture
-- Activation checkpointing
-- Multi-chip memory systems
-
-### V2 Chapter 2: Storage Systems for ML (NEW - 60 pages)
-**SOURCE**: Extract from Ch6 distributed sections + new content
-**TOPICS**:
-- Distributed file systems
-- Checkpoint I/O
-- Feature stores
-- Data lakes
-
-### V2 Chapter 3: Communication & Collective Operations (NEW - 60 pages)
-**SOURCE**: Extract from Ch8 distributed sections + new content
-**TOPICS**:
-- AllReduce algorithms
-- Network topology
-- Gradient compression
-- RDMA
-
-### V2 Chapter 4: Distributed Training (NEW - 80 pages)
-**SOURCE**: Consolidate from Ch8 + Ch10 distributed sections
-**TOPICS**:
-- Data/model/pipeline parallelism
-- Synchronization strategies
-- Load balancing
-
-### V2 Chapter 5: Fault Tolerance & Recovery (NEW - 60 pages)
-**SOURCE**: Extract from Ch13 + new content
-**TOPICS**:
-- Checkpointing strategies
-- Elastic training
-- Failure handling
-
-### V2 Chapter 6: Inference Systems (NEW - 60 pages)
-**SOURCE**: Extract from Ch2 hybrid + Ch13 serving + new content
-**TOPICS**:
-- Batching strategies
-- Model serving patterns
-- Autoscaling
-
-### V2 Chapter 8: Edge Deployment (NEW - 60 pages)
-**SOURCE**: Extract from Ch2 edge sections + new content
-**TOPICS**:
-- Model compilation
-- Runtime optimization
-- Real-time constraints
-
----
-
-## EXECUTION TIMELINE
-
-### Month 1: Content Extraction and Migration
-**Week 1-2**: Extract and migrate V2 content from existing chapters
-- Pull distributed systems content from Ch6, Ch7, Ch8
-- Extract multi-chip content from Ch11
-- Move advanced chapters (15-20) to V2
-
-**Week 3-4**: Create bridging content
-- Write V1→V2 transitions
-- Add recaps to V2 chapters
-- Update cross-references
-
-### Month 2: New Chapter Development
-**Week 5-6**: Draft new V2 chapters 1-3
-- Memory Hierarchies
-- Storage Systems
-- Communication & Collectives
-
-**Week 7-8**: Draft new V2 chapters 4-6, 8
-- Distributed Training
-- Fault Tolerance
-- Inference Systems
-- Edge Deployment
-
----
-
-## CRITICAL DEPENDENCIES TO ADDRESS
-
-### Cross-Volume References
-1. **V2 depends on V1 concepts**: Add 2-page recaps at start of V2 chapters
-2. **V1 mentions advanced topics**: Add "See Volume 2" callout boxes
-3. **Shared examples**: Maintain consistency in running examples
-
-### Content Gaps to Fill
-1. **V1 needs**: Brief sustainability mention in Ch12 or Ch13
-2. **V2 needs**: Stronger introduction chapter setting distributed context
-3. **Both need**: Updated prefaces explaining two-volume structure
-
-### Risk Mitigation
-1. **Page count imbalance**: Monitor during extraction phase
-2. **Dependency cycles**: Review after initial split
-3. **Missing topics**: Keep running list during surgery
-
----
-
-## SUCCESS METRICS
-
-### Volume 1 Success Criteria
-- Complete single-system ML lifecycle coverage
-- No hard dependencies on V2 content
-- Positive, inspiring conclusion
-- 750-850 pages total
-
-### Volume 2 Success Criteria
-- Complete distributed systems coverage
-- Clear value beyond V1
-- Timeless principles focus
-- 750-850 pages total
-
-### Overall Success
-- Each volume adoptable independently
-- Together form comprehensive curriculum
-- Minimal content duplication
-- Clear progression path
-
----
-
-## APPENDIX: QUICK REFERENCE
-
-### Chapters Staying in V1 (with modifications)
-1. Introduction (compressed)
-2. ML Systems (remove distributed)
-3. DL Primer (complete)
-4. DNN Architectures (complete)
-5. Workflow (complete)
-6. Data Engineering (remove distributed)
-7. Frameworks (remove distributed)
-8. Training (remove distributed)
-9. Efficient AI (complete)
-10. Optimizations (remove advanced)
-11. Hardware Acceleration (remove multi-chip)
-12. Benchmarking (remove distributed)
-13. ML Operations (basic only)
-14. AI for Good (complete)
-
-### Chapters Moving to V2
-- On-Device Learning → V2 Ch7
-- Privacy & Security → V2 Ch9-10 (split)
-- Robust AI → V2 Ch11
-- Responsible AI → V2 Ch12
-- Sustainable AI → V2 Ch13
-- Frontiers → V2 Ch14
-
-### New V2 Chapters
-1. Memory Hierarchies for ML
-2. Storage Systems for ML
-3. Communication & Collective Operations
-4. Distributed Training
-5. Fault Tolerance & Recovery
-6. Inference Systems
-8. Edge Deployment
-
----
-
-*End of Surgical Plan Document*
diff --git a/book/docs/VOLUME_STRUCTURE_PROPOSAL.md b/book/docs/VOLUME_STRUCTURE_PROPOSAL.md
deleted file mode 100644
index 53abdcbe0..000000000
--- a/book/docs/VOLUME_STRUCTURE_PROPOSAL.md
+++ /dev/null
@@ -1,216 +0,0 @@
-# Machine Learning Systems: Two-Volume Structure
-
-**Proposal for MIT Press**
-*Draft: December 2024*
-
----
-
-## Executive Summary
-
-The *Machine Learning Systems* textbook will be published as two complementary volumes of 14 chapters each:
-
-| Volume | Title | Focus | Chapters |
-|--------|-------|-------|----------|
-| **Volume 1** | Introduction to Machine Learning Systems | Complete ML lifecycle, single-system focus | 14 (all existing) |
-| **Volume 2** | Advanced Machine Learning Systems | Principles of scale, distribution, and production | 14 (6 existing, 8 new) |
-
-**Guiding Philosophy:**
-- **Volume 1**: Everything you need to build ML systems on a single machine, ending on a positive note with societal impact
-- **Volume 2**: Timeless principles for operating ML systems at scale, grounded in physics and mathematics rather than current technologies
-
----
-
-## Volume 1: Introduction to Machine Learning Systems
-
-*The complete ML lifecycle: understand it, build it, optimize it, deploy it, use it for good.*
-
-| Part | Chapter | Description |
-|------|---------|-------------|
-| **Part I: Systems Foundations** | | *What are ML systems?* |
-| | 1. Introduction | Motivation and scope |
-| | 2. ML Systems | System-level view of machine learning |
-| | 3. Deep Learning Primer | Neural network fundamentals |
-| | 4. DNN Architectures | Modern architecture patterns |
-| **Part II: Design Principles** | | *How do you build ML systems?* |
-| | 5. Workflow | End-to-end ML pipeline design |
-| | 6. Data Engineering | Data collection, processing, validation |
-| | 7. Frameworks | PyTorch, TensorFlow, JAX ecosystem |
-| | 8. Training | Training loops, hyperparameters, convergence |
-| **Part III: Performance Engineering** | | *How do you make ML systems fast?* |
-| | 9. Efficient AI | Efficiency principles and metrics |
-| | 10. Optimizations | Quantization, pruning, distillation |
-| | 11. Hardware Acceleration | GPUs, TPUs, custom accelerators |
-| | 12. Benchmarking | Measurement, MLPerf, evaluation methodology |
-| **Part IV: Practice & Impact** | | *How do you deploy and use ML systems responsibly?* |
-| | 13. ML Operations | Deployment, monitoring, CI/CD for ML |
-| | 14. AI for Good | Positive societal applications |
-
-**Total: 14 chapters across 4 parts (all existing content)**
-
-*Early awareness:* include a short Sustainable AI note in Benchmarking or ML Operations to flag energy and carbon impacts without adding another chapter.
-
-### Volume 1 Narrative Arc
-
-The book progresses from understanding → building → optimizing → deploying → impact:
-
-1. **Foundations** establish what ML systems are and why they matter
-2. **Design** teaches how to construct complete pipelines
-3. **Performance** shows how to make systems efficient
-4. **Practice & Impact** completes the lifecycle and ends on an inspirational note
-
-Ending on "AI for Good" leaves students with a positive vision of what they can build.
-
----
-
-## Volume 2: Advanced Machine Learning Systems
-
-*Timeless principles for building and operating ML systems at scale.*
-
-| Part | Chapter | Status | Description |
-|------|---------|--------|-------------|
-| **Part I: Data Movement & Memory** | | | *Moving data is the bottleneck* |
-| | 1. Memory Hierarchies for ML | 🆕 NEW | GPU memory, HBM, activation checkpointing |
-| | 2. Storage Systems for ML | 🆕 NEW | Distributed storage, checkpointing, feature stores |
-| | 3. Communication & Collective Operations | 🆕 NEW | AllReduce, gradient compression, network topology |
-| **Part II: Parallelism & Coordination** | | | *Decomposing computation across machines* |
-| | 4. Distributed Training | 🆕 NEW | Data/model/pipeline/tensor parallelism |
-| | 5. Fault Tolerance & Recovery | 🆕 NEW | Checkpointing, elastic training, failure handling |
-| | 6. Inference Systems | 🆕 NEW | Batching, serving architectures, autoscaling |
-| **Part III: Constrained Environments** | | | *Doing more with less* |
-| | 7. On-device Learning | Existing | Training and adaptation on edge devices |
-| | 8. Edge Deployment | 🆕 NEW | Compilation, runtime optimization, real-time |
-| **Part IV: Adversarial Environments** | | | *Systems under attack and uncertainty* |
-| | 9. Privacy in ML Systems | Existing | Differential privacy, federated learning, secure aggregation |
-| | 10. Security in ML Systems | 🆕 NEW | Supply chain, API security, multi-tenant isolation |
-| | 11. Robust AI | Existing | Adversarial robustness, distribution shift, monitoring |
-| **Part V: Stewardship** | | | *Building systems that serve humanity* |
-| | 12. Responsible AI | Existing | Fairness, accountability, transparency at scale |
-| | 13. Sustainable AI | Existing | Energy efficiency, carbon footprint, environmental impact |
-| | 14. Frontiers & Future Directions | Existing | Emerging paradigms, open problems, conclusion |
-
-**Total: 14 chapters across 5 parts (6 existing, 8 new)**
-
----
-
-## New Content for Volume 2
-
-### Part I: Data Movement & Memory
-*The physics of data movement is the fundamental constraint in modern ML.*
-
-| Chapter | Key Topics | Timeless Principle |
-|---------|------------|-------------------|
-| **Memory Hierarchies for ML** | GPU memory management, HBM architecture, caching strategies, activation checkpointing, memory-efficient attention | Memory bandwidth limits compute utilization |
-| **Storage Systems for ML** | Distributed file systems, checkpoint I/O, feature stores, data lakes, prefetching, I/O scheduling | Storage throughput gates training speed |
-| **Communication & Collective Operations** | AllReduce algorithms, ring/tree topologies, gradient compression, RDMA fundamentals, network topology design | Communication overhead limits scaling |
-
-### Part II: Parallelism & Coordination
-*The mathematics of decomposing work across machines.*
-
-| Chapter | Key Topics | Timeless Principle |
-|---------|------------|-------------------|
-| **Distributed Training** | Data parallelism, model parallelism (tensor, pipeline, expert), hybrid strategies, synchronization, load balancing | Parallelism has fundamental trade-offs |
-| **Fault Tolerance & Recovery** | Checkpoint strategies, async checkpointing, elastic training, failure detection, graceful degradation | Large systems fail; recovery must be designed in |
-| **Inference Systems** | Batching strategies, continuous batching, KV cache management, model serving patterns, autoscaling, SLO management | Serving has different constraints than training |
-
-### Part III: Constrained Environments
-*Operating under resource limitations.*
-
-| Chapter | Key Topics | Timeless Principle |
-|---------|------------|-------------------|
-| **Edge Deployment** | Model compilation, runtime optimization, heterogeneous hardware, real-time constraints, power management | Constraints force creativity |
-
-### Part IV: Adversarial Environments
-*Systems facing attacks, privacy requirements, and uncertainty.*
-
-| Chapter | Key Topics | Timeless Principle |
-|---------|------------|-------------------|
-| **Security in ML Systems** | Model provenance, supply chain security, API protection, multi-tenant isolation, access control | Production systems face adversaries |
-
----
-
-## Design Principles
-
-### Why This Structure Works
-
-**Volume 1 (Single System)**
-- Teaches the complete lifecycle
-- Everything can be learned and practiced on one machine
-- Ends positively with societal impact
-
-**Volume 2 (Distributed Systems)**
-- Builds on Volume 1 foundations
-- Addresses what changes at scale
-- Organized around timeless constraints, not current technologies
-
-### What Makes Volume 2 Timeless
-
-Each part addresses constraints rooted in physics, mathematics, or human nature:
-
-| Part | Eternal Constraint | Foundation |
-|------|-------------------|------------|
-| Data Movement & Memory | Moving data costs more than compute | Physics: speed of light, memory bandwidth |
-| Parallelism & Coordination | Work must be decomposed and synchronized | Mathematics of parallel computation |
-| Constrained Environments | Resources are always finite | Economics and physics |
-| Adversarial Environments | Attackers and uncertainty exist | Human nature, statistics |
-| Stewardship | Technology must serve humanity | Ethics, sustainability |
-
-Chapters use current examples (LLMs, transformers, specific hardware) but frame them as instances of these enduring principles.
-
----
-
-## Content Migration Summary
-
-| Chapter | Volume 1 | Volume 2 | Rationale |
-|---------|----------|----------|-----------|
-| Introduction through Benchmarking | ✓ | | Core technical content |
-| ML Operations | ✓ | | Completes the lifecycle |
-| AI for Good | ✓ | | Positive conclusion |
-| On-device Learning | | ✓ | Edge/constrained is advanced |
-| Privacy & Security | | ✓ | Production security is advanced |
-| Robust AI | | ✓ | Production robustness is advanced |
-| Responsible AI | | ✓ | Scale changes the challenges |
-| Sustainable AI | | ✓ | Datacenter scale is advanced |
-| Frontiers | | ✓ | Conclusion for advanced volume |
-
----
-
-## Audience
-
-| Volume | Primary Audience | Use Cases |
-|--------|-----------------|-----------|
-| Volume 1 | All ML practitioners, undergraduates, bootcamp students | First course in ML systems, self-study |
-| Volume 2 | Infrastructure engineers, graduate students, researchers | Advanced course, reference for practitioners at scale |
-
----
-
-## Collaboration Model
-
-Volume 2's new chapters are candidates for collaborative authorship:
-
-| Topic Area | Ideal Collaborator Profile |
-|------------|---------------------------|
-| Memory & Storage | Datacenter architects, MLPerf Storage contributors |
-| Networking & Communication | Distributed systems researchers, framework developers |
-| Distributed Training | PyTorch/JAX distributed teams, hyperscaler engineers |
-| Fault Tolerance | Site reliability engineers, systems researchers |
-| Inference Systems | ML serving infrastructure engineers |
-| Edge Deployment | Embedded ML practitioners, compiler engineers |
-| Security | ML security researchers, production security engineers |
-
----
-
-## Summary Statistics
-
-| Metric | Volume 1 | Volume 2 |
-|--------|----------|----------|
-| Chapters | 14 | 14 |
-| Parts | 4 | 5 |
-| Existing content | 14 | 6 |
-| New content | 0 | 8 |
-| Focus | Single system | Distributed systems |
-| Prerequisite | None | Volume 1 |
-
----
-
-*Document Version: December 2024*
-*For discussion with MIT Press and potential collaborators*