Adds several missing citations to the frontiers.bib file.
Expands upon the discussion of AGI, including scaling
hypotheses, neurosymbolic systems, embodied intelligence,
and multi-agent systems.
Clarifies limitations of current models and explores
potential future directions in AI research.
Removes obsolete reports related to DOI verification and definition placement audits.
These reports are no longer needed as the tasks are completed and documented elsewhere.
Addresses #1016
Changes:
- Changed PDF/EPUB navbar links from absolute URLs (mlsysbook.ai) to
relative URLs (/pdf, /epub) so they work on both main and dev sites
- Updated deploy-preview workflow to download PDF and EPUB artifacts
in addition to HTML artifact
- Added step to copy PDF and EPUB files to assets/downloads directory
- Added _redirects file to dev deployment for proper routing
This ensures the dev preview site serves its own PDF/EPUB versions rather
than redirecting users to the main production site.
Addresses #1034
Fixed 47 instances across 20 quiz files where MCQ answer explanations
incorrectly referenced the correct option as one of the incorrect options.
Changes:
1. Fixed all quiz JSON files with incorrect option references
- Fixed patterns like 'Options A, C, and D' when A is correct
- Fixed patterns like 'Option C is incorrect' when C is correct
- Fixed patterns like 'Option A describes...' when A is correct
2. Created fix_mcq_answer_explanations.py script
- Automatically detects and fixes incorrect option references
- Handles plural and singular patterns
- Can be run on all quiz files or specific files
3. Enhanced quizzes.py with validation and opt-in redistribution
- Added validate_mcq_option_references() function
- Validation runs during quiz generation to catch LLM errors
- MCQ redistribution now requires --redistribute-mcq flag (opt-in)
- Prevents the bug from being reintroduced during answer shuffling
All 445 MCQ questions validated across 35 quiz files.
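The option-reference check described above can be sketched roughly as follows. This is a minimal illustration, not the code in quizzes.py or fix_mcq_answer_explanations.py: the function name find_bad_option_references, its signature, and both regular expressions are hypothetical, and only the plural ('Options A, C, and D') and singular ('Option C is incorrect') patterns are covered.

```python
import re

# Hypothetical patterns for illustration; the real scripts may differ.
# Plural form: "Options A, C, and D" (a list the explanation treats as wrong).
PLURAL_RE = re.compile(
    r"\bOptions\s+([A-D]\b(?:(?:\s*,\s*(?:and\s+)?|\s+and\s+)[A-D]\b)*)"
)
# Singular form: "Option C is incorrect".
SINGULAR_RE = re.compile(r"\bOption\s+([A-D])\s+is\s+incorrect")


def find_bad_option_references(explanation: str, correct: str) -> list[str]:
    """Return phrases that list the correct option among the incorrect ones."""
    problems = []
    for match in PLURAL_RE.finditer(explanation):
        letters = re.findall(r"[A-D]", match.group(1))
        if correct in letters:
            problems.append(match.group(0))
    for match in SINGULAR_RE.finditer(explanation):
        if match.group(1) == correct:
            problems.append(match.group(0))
    return problems


if __name__ == "__main__":
    text = "The correct answer is A. Options A, C, and D are incorrect."
    print(find_bad_option_references(text, correct="A"))
```

An empty list means no suspicious reference was found. A check of this kind would need to run after any answer shuffling, since shuffling changes which letter is correct.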
Change the default build method from 'both' to 'container' to improve
efficiency and avoid unnecessary baremetal builds. The 'baremetal' and 'both'
options remain available for manual workflow dispatch when needed.
Changes:
- Set default build_method to 'container'
- Auto-triggered builds now use container only
- Manual dispatch still supports all three options
- Updated summary messages to reflect container default
The missing_definitions_analysis.md working document has served its purpose.
All Tier 1 and Tier 2 recommendations have been successfully implemented:
- 6 definitions added (Tensor, Overfitting, Transfer Learning, Distributed
Training, Quantization, Batch Processing)
- Plus 2 additional critical definitions (Gradient Descent, Backpropagation)
The definition_placement_audit.md remains as the permanent quality
documentation showing all 47 definitions are optimally placed and ready
for publication.
Complete audit of all 47 definitions confirms 100% optimal placement quality.
All definitions follow textbook best practices:
- Positioned after motivating context
- Placed before substantive usage
- Located in dedicated sections
- Supporting optimal pedagogical flow
Only one repositioning required (Overfitting), now fixed.
All definitions ready for academic review.
Moved Overfitting definition from after its first mention (line 2044) to the
start of 'Convergence and Stability Considerations' section (line 2036). This
follows textbook best practice: definitions should appear BEFORE the concept
is used extensively, not after.
Placement now follows optimal pattern:
- Section header and brief introduction
- Formal definition callout
- Detailed explanation and usage
All 8 new definitions are now placed at section starts, after motivating
context and before substantive usage.
Added 6 critical definitions identified in missing_definitions_analysis.md:
- Tensor (frameworks chapter)
- Overfitting (dl_primer chapter)
- Transfer Learning (workflow chapter)
- Distributed Training (training chapter)
- Quantization (optimizations chapter)
- Batch Processing (training chapter)
All definitions follow the canonical format: single sentence, 3-6 strategic
italics, no enumeration, no leading articles. These foundational concepts
are now formally defined to match university textbook standards.
Addresses missing_definitions_analysis.md Tier 1 and Tier 2 items.
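The canonical format above (single sentence, 3-6 italics, no enumeration, no leading articles) could be checked mechanically along these lines. This is a rough sketch under stated assumptions: definitions are plain Markdown strings with *asterisk* italics, the function name check_definition is hypothetical, and the sentence and enumeration heuristics are deliberately naive (abbreviations such as "e.g." would be miscounted).

```python
import re


def check_definition(text: str) -> list[str]:
    """Return the canonical-format rules that a definition appears to violate."""
    issues = []
    body = text.strip()

    # No leading article, even when the first word is italicized.
    if re.match(r"\*?(a|an|the)\b", body, flags=re.IGNORECASE):
        issues.append("leading article")

    # Single sentence: naive split on terminal punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", body) if s]
    if len(sentences) != 1:
        issues.append("not a single sentence")

    # 3-6 italic spans: *like this*, ignoring **bold** markers.
    italics = re.findall(r"(?<!\*)\*(?!\*)(.+?)(?<!\*)\*(?!\*)", body)
    if not 3 <= len(italics) <= 6:
        issues.append("italics count outside 3-6")

    # No enumeration markers such as "(1)", "1." or "1)".
    if re.search(r"\(\d+\)|(?:^|\s)\d+[.)]\s", body):
        issues.append("enumeration")

    return issues


if __name__ == "__main__":
    # Made-up example text, not a definition taken from the book.
    sample = "The tensor is an array. It has (1) rank and (2) shape."
    print(check_definition(sample))
```

A checker like this only flags candidates for review; whether the italics are "strategic" still needs a human reader.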
Changed all definition titles from inconsistent formats to simple term names:
- 'Definition of X' → 'X'
- 'X Definition' → 'X'
- 'Definition of the X' → 'X'
Affected 39 definitions across 18 files for consistent callout-definition presentation.
Examples:
- 'Definition of Deep Learning' → 'Deep Learning'
- 'Security Definition' → 'Security'
- 'Definition of the Machine Learning Lifecycle' → 'Machine Learning Lifecycle'
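The three renaming rules above amount to stripping a fixed prefix or suffix. A minimal sketch, assuming titles are handled as plain strings (the function name simplify_title is hypothetical, and this is not necessarily how the change was actually applied):

```python
import re


def simplify_title(title: str) -> str:
    """Reduce a definition callout title to the bare term name."""
    term = title.strip()
    # 'Definition of the X' and 'Definition of X' -> 'X'
    term = re.sub(r"^Definition of (?:the )?", "", term)
    # 'X Definition' -> 'X'
    term = re.sub(r" Definition$", "", term)
    return term


if __name__ == "__main__":
    for old in (
        "Definition of Deep Learning",
        "Security Definition",
        "Definition of the Machine Learning Lifecycle",
    ):
        print(f"{old!r} -> {simplify_title(old)!r}")
```

Titles that are already bare term names pass through unchanged, so the function is safe to run repeatedly.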
Changed 'dramatically fewer' to 'substantially fewer' for maximum academic formality.
Rubio's academic tone audit identified this as the only instance requiring refinement
across all 39 definitions. All definitions now maintain consistent formal academic voice.
Phase 8 - Benchmarking (6):
- ML Benchmarking: Removed problematic phrasing, reduced to 1 sentence
- ML Algorithmic Benchmarks: Reduced from 3 to 1 sentence, cleaner focus
- ML System Benchmarks: Reduced from 3 to 1 sentence, clear infrastructure focus
- ML Data Benchmarks: Reduced from 3 to 1 sentence, emphasized quality assessment
- ML Training Benchmarks: Reduced from 3 to 1 sentence, focused on training phase
- ML Inference Benchmarks: Reduced from 4 to 1 sentence, deployment focus clear
Phase 9 - Forward-Looking (1):
- AGI: Changed 'refers to' to 'represents', reduced from 4 to 1 sentence
All follow canonical standard. Phases 8-9 complete. IMPLEMENTATION: 39/39 definitions (100%).
- Responsible AI: Reduced from 3 to 1 sentence, focused on transformation process
- Sustainable AI: Reduced from 3 to 1 sentence, emphasized first-class constraint
- Resilient AI: Changed 'refers to ability' to 'describes systems', reduced to 1 sentence
- Security: Removed 'in machine learning systems', reduced from 3 to 1 sentence
- Privacy: Removed 'in machine learning systems', reduced from 3 to 1 sentence
- AI for Good: Changed 'refers to' to 'is', reduced from 2 to 1 sentence
All follow canonical standard. Phase 7 (6/6) complete. Progress: 32/39 definitions (82%).
- ML Accelerators: Removed article, reduced from 3 to 1 sentence, focused on efficiency
- Mapping: Reduced from 3 to 1 sentence, emphasized three core dimensions
- Model Optimization: Reduced from 4 to 1 sentence, highlighted efficiency vs performance trade-off
- Pruning: Reduced from 2 to 1 sentence, cleaner redundancy removal statement
- ML System Efficiency: Reduced from 3 to 1 sentence, three optimization dimensions clear
All follow canonical standard. Phase 5 (5/5) complete. Progress: 19/39 definitions.
- ML Lifecycle: Removed 'The', reduced from 3 to 1 sentence, focused on iterative nature
- MLOps: Reduced from 3 to 1 sentence, highlighted unique ML challenges
- Training Systems: Reduced from 4 to 1 sentence, emphasized iterative optimization
- ML Frameworks: Removed article, reduced from 2 to 1 sentence, focused on bridging role
- Data Engineering: Added 'systematic', reduced from 2 to 1 sentence, clearer transformation
All follow canonical standard. Phase 4 (5/5) complete. Progress: 14/39 definitions.
- MLPs: Reduced from 4 sentences to 1, focused on fully-connected nature and trade-offs
- CNNs: Reduced from 3 sentences to 1, emphasized spatial structure exploitation
- RNNs: Reduced from 4 sentences to 1, highlighted sequential processing trade-off
- Attention: Reduced from 3 sentences to 1, focused on content-dependent relationships
- Transformers: Reduced from 3 sentences to 1, emphasized parallelization advantage
All definitions now follow canonical standard. Phase 3 (6/6) complete.
- AI & ML: Reduced to single sentences, removed 'goal' language, clearer distinction
- Machine Learning System: From 2 sentences to 1, removed numbered list, emphasizes interdependency
- AI Engineering: From 2 sentences to 1, focused on systems-level integration vs enumeration
All definitions now follow canonical standard: single sentence, 3-6 strategic italics,
no articles, technically precise. Phase 2 of 39-definition standardization complete.
Explains the limitations of model adaptation during deployment.
Highlights that deployed models apply fixed learned distributions,
so adapting to data drift requires retraining rather than
runtime modification.
Discusses the challenges of data drift and distribution shift in ML systems, emphasizing the limitations of fixed model parameters and the need for retraining.
- Ran betterbib update on all bibliography files (synced 443/460 entries)
- Removed 2 papers with invalid/fabricated DOIs:
* Kannan2023chiplet (placeholder DOI 10.1109/MM.2022.1234567)
* chen2019edge (unverifiable DOI 10.1109/SEC.2019.00035)
- Fixed DOI typo in taylor2022 (changed -01331-1 to -01331-9)
- Removed corresponding citations from .qmd files
- Created verification reports documenting all findings
17 DOIs returned 404 errors: 2 removed, 1 fixed, 14 need further investigation.
Well-known papers (ZeRO, Eyeriss, MYCIN) are confirmed to exist, but their recorded DOIs are incorrect.
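A 404 check of this kind can be reproduced with a short script. The sketch below is an assumption about one way to do it, not the procedure used for the reports above: it resolves each DOI through https://doi.org/ with the standard library, and the helper names http_status and find_unresolved are hypothetical. Some publishers reject automated requests, so a non-404 error code does not by itself mean the DOI is wrong.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable


def http_status(doi: str, timeout: float = 10.0) -> int:
    """Resolve a DOI through doi.org and return the resulting HTTP status code."""
    request = urllib.request.Request(f"https://doi.org/{doi}", method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the code is what we want.
        return err.code
    # Network failures (urllib.error.URLError) are left to propagate.


def find_unresolved(
    dois: Iterable[str], fetch: Callable[[str], int] = http_status
) -> list[str]:
    """Return the DOIs whose lookup comes back as 404."""
    return [doi for doi in dois if fetch(doi) == 404]
```

Passing the lookup function as a parameter keeps the filter testable offline: a stub that returns canned status codes can stand in for the network call.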
Release notes are now managed directly on GitHub releases.
All releases have been updated with actual release dates in their
descriptions. Future releases will be handled by publish-live workflow.
Move release notes files from root to docs/releases/ for better
organization and discoverability. These files serve as local backups
of GitHub release notes.