Commit Graph

1481 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
5677633b4c Update symlinks to point to vol1 build config after Vol1 build run 2026-02-21 10:51:05 -05:00
Vijay Janapa Reddi
edb2dd17b0 Fix Vol1 standalone PDF build errors across 4 chapters
- hw_acceleration: escape % in callout title 'The Five-Percent Utilization Mystery'
  (LaTeX treats % as comment char in div attribute titles, truncating the box)
- data_selection: escape % in callout title 'The Ninety-Nine Percent Sparsity Trap'
  (same \fbxSimple runaway argument error)
- model_compression: remove 28-line orphaned stale class body (merge artifact);
  add missing mat_dim=4096 to LowRankFactorization class parameters
- model_serving: move littles-law-calc code cell before the prose that references
  its exported variables (serving_qps_str etc. used before they were defined)
2026-02-21 10:48:12 -05:00
Vijay Janapa Reddi
35cc915041 fix: ensure all 36 Vol2 chapters build as standalone PDFs
Fixes a series of LaTeX/Pandoc compilation errors across Vol2 so every
chapter builds cleanly with `binder build pdf <chapter> --vol2 -v`.

Key fixes applied:

- Citations removed from fig-caps, table cells, and footnote definitions
  (Quarto 1.8 `marginCitePlaceholderInlineWithProtection` bug with
  `citation-location: margin`); citations restored to surrounding prose
- TikZ nodes with `\\` line breaks given `align=center/left` to exit
  LR mode (robust_ai, sustainable_ai)
- `\argmax` → `\operatorname{arg\,max}` (undefined in amsmath)
- `\texorpdfstring` wrapping for math in section headers (notation)
- Multi-line `{python}` inline expressions in grid tables converted to
  pipe tables (appendix_communication)
- Math expressions split across grid table row boundaries converted to
  pipe tables to avoid `\{\beta\}\$` rendering corruption
- Stale class references (`ImageNetBottleneck`, `PrefetchBuffer`,
  `CheckpointStorage`) fixed → `StorageEconomics.*` (data_storage)
- Missing `batch_per_gpu` factor in aggregate bandwidth formula (data_storage)
- Duplicate `xytext` keyword in `ax.annotate()` call (edge_intelligence)
- `&lt;` HTML entity mixed with unescaped `$` in table cells fixed (security_privacy)
- Incorrect `check()` invariant corrected (appendix_fleet)
2026-02-21 09:44:46 -05:00
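As an illustration of the `\argmax` fix listed above, the replacement could appear in a chapter's math roughly as follows. This is a minimal sketch: the symbols `\theta` and `\mathcal{L}` are placeholders chosen for the example and are not taken from the book.

```latex
% amsmath does not define \argmax, so the operator is spelled out inline.
\[
  \theta^{*} = \operatorname{arg\,max}_{\theta} \; \mathcal{L}(\theta)
\]

% Alternative (an assumption, not what the commit describes): declare the
% operator once in the preamble so the subscript sits underneath it in
% display math.
% \DeclareMathOperator*{\argmax}{arg\,max}
```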
Vijay Janapa Reddi
fc093ab8de Merge pull request #1194 from harvard-edge/fix/ch5
Update figure in chapter 5
2026-02-21 09:30:09 -05:00
Zeljko Hrcek
678a218372 Update figure in chapter 5 2026-02-21 15:20:35 +01:00
Vijay Janapa Reddi
62b98edee1 Updates book content and configuration
Refines book abstracts, table of contents, and diagram configurations for improved clarity and structure.

This commit enhances the descriptions of both Volume I and Volume II, emphasizing their respective focuses. It also introduces a framework decision tree to guide the selection of parallel training strategies and inference frameworks, along with diagrams for visualizing hardware constraints.
2026-02-21 08:19:01 -05:00
Vijay Janapa Reddi
0614676798 Adds PDF config for Volume II of the book
Creates a YAML configuration file specifically for generating the PDF version of Volume II: Machine Learning Systems at Scale.

This configuration defines the project structure, book metadata (title, author, abstract), chapter organization, and PDF-specific settings like cover page design, table of contents depth, and inclusion of LaTeX files for custom styling.

This allows for independent building and customization of the PDF output for Volume II.
2026-02-21 08:17:13 -05:00
Vijay Janapa Reddi
9e35563d00 Merge remote-tracking branch 'origin/feature/book-volumes' into feature/book-volumes 2026-02-21 08:16:50 -05:00
Vijay Janapa Reddi
c68ca02d9e Enhances data pipeline debugging flowchart
Improves the data pipeline debugging flowchart by adding visual cues.

These cues help to highlight the type of data issue being investigated
and make the flowchart easier to understand.
2026-02-21 08:15:29 -05:00
Vijay Janapa Reddi
87ffaf288d Refines content for Volume 1 conclusion
Enhances the conclusion of Volume 1, improving clarity and flow by:

- Refining wording and structure for better readability
- Clarifying the connection between theoretical invariants and practical applications
- Adding information for clarity and context
2026-02-21 07:59:34 -05:00
Vijay Janapa Reddi
718f867039 Vol1: improve book abstracts and chapter content
- Config: academic, standalone abstracts for PDF/EPUB/copyedit
- Chapters: ml_systems, nn_architectures, nn_computation, training
2026-02-21 06:58:22 -05:00
Zeljko Hrcek
ae2ef83dd3 Merge branch 'feature/book-volumes' into fix/training 2026-02-21 09:54:24 +01:00
Zeljko Hrcek
403994ea2e Update training chapter and add missing color definition 2026-02-21 09:28:04 +01:00
Vijay Janapa Reddi
09602445de chore: update book content, config, appendices, and tooling
- Vol1: chapter updates across backmatter, benchmarking, data, frameworks, etc.
- Vol2: content updates, new appendices (assumptions, communication, fleet, reliability)
- Quarto: config, styles, formulas, constants
- Add SEMINAL_PAPERS_V2.md, learning_objectives_bolding_parallel.sh
- VSCode extension: package.json, chapterNavigatorProvider
- Landing page and docs updates
2026-02-20 18:55:24 -05:00
Vijay Janapa Reddi
b5a9e590db Standardizes P.I.C.O. code blocks and consolidates specifications across Volume 2
Audits and refactors Volume 2 chapters to ensure all Python calculation cells adhere to the P.I.C.O. (Parameters, Invariants, Calculation, Outputs) standard.

- Consolidates storage specifications and economics into StorageSetup and StorageEconomics classes in data_storage.qmd.
- Refactors collective communication math into the AllReduceCost class in collective_communication.qmd.
- Standardizes infrastructure and performance engineering setups in compute_infrastructure.qmd and performance_engineering.qmd.
- Corrects NameErrors and missing imports in benchmarking and platform ROI calculations.
- Ensures all prose variables are correctly exported and scoped within Safe Class Namespaces to prevent global pollution and ensure mathematical consistency across the fleet-scale narrative.
2026-02-20 16:53:20 -05:00
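The P.I.C.O. layout and the "Safe Class Namespace" idea named in this commit might look something like the sketch below. The log does not show the book's actual cells, so the class name, variable names, and numbers here are all hypothetical; only the four-part ordering and the `_str` suffix for prose-facing values come from the commit messages.

```python
class LittlesLawExample:
    """Hypothetical P.I.C.O.-style calculation cell (illustrative names only)."""

    # Parameters: raw inputs to the scenario (assumed values).
    arrival_rate_qps = 200.0   # requests per second
    mean_latency_s = 0.05      # seconds spent in the system per request

    # Invariants: fail fast if the inputs make no physical sense.
    assert arrival_rate_qps > 0, "arrival rate must be positive"
    assert mean_latency_s > 0, "latency must be positive"

    # Calculation: Little's law, L = lambda * W.
    in_flight = arrival_rate_qps * mean_latency_s

    # Outputs: formatted strings that prose can reference.
    arrival_rate_str = f"{arrival_rate_qps:,.0f}"
    in_flight_str = f"{in_flight:.0f}"


# Everything lives on the class, so nothing leaks into the global namespace.
print(LittlesLawExample.in_flight_str)
```

Keeping the values as class attributes means two cells can both define a name such as `in_flight` without overwriting each other, which appears to be the "global pollution" concern the commit refers to.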
Vijay Janapa Reddi
abc7ef01d8 Fixes broken P.I.C.O. code blocks and missing imports across Volume 1
Audits all Volume 1 chapters to identify and repair structural errors in Python calculation cells introduced during the P.I.C.O. refactor.

- Consolidates redundant memory calculations and fixes missing imports in nn_computation.qmd.
- Refactors AttentionMemory in nn_architectures.qmd to resolve NameErrors and duplicated blocks.
- Cleans up QuantizationSpeedup and restores MobileNetCompressionAnchor in model_compression.qmd.
- Resolves missing Models and Hardware imports in benchmarking.qmd.
- Updates LighthouseModels in ml_systems.qmd with missing variables for MobileNet and KWS.
- Corrects indentation and structural integrity across all Volume 1 calculation scenarios to ensure valid rendering and mathematical consistency.
2026-02-20 15:43:42 -05:00
Vijay Janapa Reddi
b6b2c94988 Refactors Volume II content and structure
Restructures Volume II to improve narrative flow and address scale impediments, including reordering of sections and addition of introductory material.

Introduces "Master Map" to guide readers through the volume's layered progression.

Adds callout notes to bridge concepts between sections.

Moves references.qmd to backmatter and adjusts chapter organization for clarity.

Updates hardware parameterization and network performance modeling within code blocks.
2026-02-19 14:39:54 -05:00
Vijay Janapa Reddi
45f46ad70d Reinforces key concepts with concrete examples
Deepens understanding of abstract principles by adding concrete examples and numerical anchors.

These additions provide tangible context and illustrate the practical implications of the discussed concepts, which aids comprehension and application. They also add context to constraints, economics, and performance.
2026-02-19 14:35:48 -05:00
Vijay Janapa Reddi
13b29eb0ea Refactors concept maps for volume 1 chapters
Updates concept map YAML files for various chapters in volume 1, including introduction, benchmarking, data engineering, data selection, frameworks, hardware acceleration, ML systems, MLOps, ML workflow, model serving, NN architectures, NN computation, optimizations, responsible engineering, and training.

Replaces the old YAML structure with a new one organized around primary and secondary concepts, technical terms, methodologies, and formulas. The change emphasizes the core concepts and their relationships within each chapter. The generated dates are updated to reflect a future date.
2026-02-19 13:49:04 -05:00
Vijay Janapa Reddi
e11ad3d44c Strengthen Vol1 intellectual spine with nine micro-insertions across 12 chapters
Insert thesis declarations, spine reconnections, and evidence elevations
that make the book's central claim explicit: ML systems engineering is a
distinct discipline governed by permanent physical laws. No restructuring
or deletions; insertions only, matching the surrounding rhetorical register.
2026-02-19 13:03:05 -05:00
Vijay Janapa Reddi
717dcebc31 Consolidate bib files and rename responsible_engineering to responsible_ai
- Remove 17 empty per-chapter .bib files (all contained only newlines)
- Consolidate HTML and EPUB configs to use central backmatter/references.bib
  (matching the pattern already used by PDF configs and Vol1)
- Rename responsible_engineering/ to responsible_ai/ for consistency with
  robust_ai/ and sustainable_ai/ in Part IV: The Responsible Fleet
- Update all 4 Quarto config files with new path
2026-02-19 09:39:07 -05:00
Vijay Janapa Reddi
3c40d1288b Restructures Vol2 from 5-part/19-chapter to 4-part/16-chapter Fleet Stack architecture
Major structural reorganization of Volume II:
- New 4-part structure: The Fleet, Distributed ML, Deployment at Scale, The Responsible Fleet
- Fleet Stack framework (Infrastructure/Distribution/Serving/Governance) replaces Systems Sandwich
- Renamed and reorganized 8 chapter directories to match new structure
- Absorbed ai_good/ into responsible_engineering and emerging_challenges/ into introduction
- Wrote/expanded 6 new chapters (collective_communication, compute_infrastructure,
  fleet_orchestration, network_fabrics, data_systems, performance_engineering)
- Fixed 116+ broken @sec- cross-references across all 16 chapters and glossary
- Updated all 4 Quarto config files, part-openers, and summaries.yml
- Added \mlfleetstack LaTeX command for PDF rendering
- Removed old 5-part HTML artifacts and macOS resource fork files
- Converted grid tables to pipe tables in fleet_orchestration
- Fixed inline Python in display math blocks in collective_communication
- Resolved duplicate tbl-tco-comparison label and stale part key reference
2026-02-19 09:35:37 -05:00
Vijay Janapa Reddi
739b48622f Add war story callout with proper icon formats and supporting files
- Add war story callout definition in custom-numbered-blocks.yml
- Create war story icon in all three formats (SVG, PNG, PDF) matching
  the 64x64 stroke-only style used by all other callout icons
- Add war story bibliography and PDF config entry
- Add first war story ("The Quadratic Wall") in nn_architectures
- Include icon conversion utility script
2026-02-19 07:38:16 -05:00
Vijay Janapa Reddi
15f7e139a7 Clean up Vol1 manuscript: fix code conventions, cross-refs, and formatting
- Fix quiz filenames in 4 chapters (footnote_context -> chapter-specific)
- Replace raw _value variables in prose with formatted _str versions
- Add missing engine: jupyter in ml_ops and responsible_engr YAML
- Remove debug comments in nn_computation, translate Serbian TikZ comments
- Fix fairness figure label mismatch (96% -> 91%) in responsible_engr
- Replace hardcoded placeholder strings with computed values in ml_systems
- Fix cross-ref issues: "DL Primer" -> @sec- ref, "Part II" -> "Part I"
- Fix duplicate latency-budget-calc label, "dotted" -> "dashed" arrow
- Normalize magnitude references (seven orders), fix table spacing
- Clean up truncated comment blocks in data_selection
- Remove internal authoring HTML comment in hw_acceleration
2026-02-18 17:28:31 -05:00
Vijay Janapa Reddi
5c45391557 Refactors Distributed Training chapter for Systems Sandwich alignment
Aligns the Distributed Training chapter with the Volume 2 'Systems Sandwich'
framework, establishing it as the 'Operational Layer' of the Machine Learning Fleet.

Key changes:
- Refactors 'Purpose' and 'Learning Objectives' to use rhetorical pivots and
  focus on the 'Physics of the Fleet'.
- Updates Python setup cell to use the 'Safe Class Namespace' pattern (P.I.C.O.)
  and adds Archetype A (GPT-4) constants.
- Rewrites 'Multi-Machine Scaling Fundamentals' to center on the
  'Communication-Computation Ratio' and the 'Law of Distributed Efficiency'.
- Cross-references the Volume 2 Introduction definitions to create a cohesive narrative.
2026-02-18 15:25:00 -05:00
Vijay Janapa Reddi
ede45403c3 Refactors Volume 2 Introduction for alignment with Volume 1 Gold Standard
Aligns the rhetorical style and quantitative rigor of the Volume 2
Introduction with the established Volume 1 standards.

Introduces the "Machine Learning Fleet" narrative as the central
engineering challenge of Volume 2, shifting from single-node
optimization to cluster-scale orchestration.

Key changes:
- Establishes the "Law of Distributed Efficiency" and "CI Ratio"
  (Communication Intensity) as new quantitative frameworks.
- Defines the "Reliability Gap" to address statistical failure
  certainty in massive clusters.
- Refactors all TikZ diagrams (Systems Sandwich, Roadmap, AI Triad)
  to use project-standard colors and Helvetica font.
- Updates the "Lighthouse Archetypes" to focus on throughput,
  latency, and resource-bound fleet challenges.
- Implements P.I.C.O. math patterns for fleet-scale calculations.
2026-02-18 15:20:33 -05:00
Vijay Janapa Reddi
85c60b5935 Refactors framework evolution section
Renames and restructures the framework evolution section to
"The Ladder of Abstraction," emphasizing the problem-solving
nature of each abstraction layer.

Clarifies the role of each layer (BLAS/LAPACK, NumPy,
Deep Learning Frameworks) in solving specific problems related
to performance, usability, and differentiation, respectively.

Highlights the trade-offs between productivity and transparency
as we move up the abstraction ladder.
2026-02-17 18:49:49 -05:00
Vijay Janapa Reddi
67b4d48470 Clarifies notation for units and precision
Refines the notation section to explicitly state the use
of decimal SI prefixes for data, memory, and compute.

Updates wording for clarity and consistency, specifically
addressing units, storage, and compute contexts.

Ensures that the book uses only decimal SI prefixes and
specifies the formatting of numbers and units.
2026-02-16 10:39:14 -05:00
Vijay Janapa Reddi
dcc8afb4cc Parallel builds: add HTML/EPUB format picker, clearer reporting
- debugCommands: prompt for format (PDF, HTML, EPUB) in Build All Chapters (Parallel)
- parallelDebug: clearer success/fail messages, Open Reports Folder, REPORT.md header
- README: document volume + format selection for parallel builds
2026-02-15 17:41:10 -05:00
Vijay Janapa Reddi
e6051001b1 Parallel builds: track callout PDFs, fix extension errors, remove copy step
- .gitignore: allow book/quarto/assets/images/icons/callouts/*.pdf
- Add missing callout PDFs (takeaways, pitfall, fallacy) so worktrees have them
- Extension: runtime repo-root check and clearer errors for parallel debug
- Extension: remove copyCalloutPdfsToWorktree (PDFs now in repo by default)
- README: document parallel build requirements; drop callout copy note
2026-02-15 16:50:25 -05:00
Vijay Janapa Reddi
37d26f6def Book volumes: content, VS Code ext, CLI debug, and build updates 2026-02-15 16:02:54 -05:00
Vijay Janapa Reddi
73a956a09b chore(volumes,vscode-ext): batch volume updates and tooling improvements
Checkpoint the branch-wide content/config revisions together with workbench enhancements so chapter rendering and developer workflows stay aligned. This captures the current validation-driven formatting and parallel build/debug improvements in one commit.
2026-02-15 14:03:27 -05:00
Vijay Janapa Reddi
cac84290df feat(vscode-ext): auto-open PDF in editor after build completes
Use a file watcher to detect when the PDF is created/modified during
build, then automatically open it in VS Code. Build still runs in the
visible terminal so users see progress. Also fix LaTeX comma-in-title
bug in foldbox.tex by bracing the title argument inside tcolorbox options.
2026-02-15 12:13:04 -05:00
Vijay Janapa Reddi
1db2dacfe7 style(vol2): fix lowercase x multiplication notation across all chapters
Convert all remaining lowercase 'x' used as multiplication (e.g.,
"1000x faster") to $\times$ across 17 vol2 chapters. These were
flagged by the new lowercase_x_multiplication validator check.

Simplifies the validator regex from a fragile word-list approach to a
broader pattern matching digit-x-lowercase (e.g., \dx\s+[a-z]) which
naturally excludes hardware counts (8x A100) and hex literals (0x61).
Includes the conversion script in _archive.
2026-02-15 11:53:51 -05:00
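The digit-x-lowercase pattern quoted in this commit can be tried out with a short sketch. The regex is the one given in the message; the function name and the sample strings are assumptions made for the example, and the real check in `validate.py` may be structured differently.

```python
import re

# Pattern quoted in the commit message: a digit, a lowercase "x",
# whitespace, then a lowercase letter.
LOWERCASE_X_MULT = re.compile(r"\dx\s+[a-z]")


def flags_lowercase_x(text: str) -> bool:
    """Return True if text appears to use lowercase x as multiplication."""
    return LOWERCASE_X_MULT.search(text) is not None


samples = ["1000x faster", "8x A100", "0x61", r"1000$\times$ faster"]
for s in samples:
    print(f"{s!r}: {flags_lowercase_x(s)}")
```

Only the first sample is flagged. The hardware count is skipped because the letter after the space is uppercase, and the hex literal is skipped because no whitespace follows the `x`.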
Vijay Janapa Reddi
5d68f0a2e0 style: standardize multiplication notation to $\times$ across all chapters
Convert all Unicode × (U+00D7) to LaTeX $\times$ in prose, tables, and
math contexts across both volumes. Unicode × is preserved only inside
fig-alt text for accessibility screen readers. One instance inside a
plain markdown backtick code span (frameworks.qmd) was reverted to
Unicode × since LaTeX doesn't render in code spans.

Updates validate.py with a new lowercase-x-as-multiplication check and
refines the latex_adjacent warning to distinguish _str variables (safe)
from raw inline Python. Updates validate_inline_refs.py comments to
reflect the new convention. Includes the conversion script in _archive.
2026-02-15 11:43:45 -05:00
Vijay Janapa Reddi
f65e68222a fix(vol1): correct stale backward references across 3 chapters
Audited all 52 backward-looking prose references ("recall", "as we saw",
"introduced earlier") across all 16 Vol I chapters. Found 46 valid and
6 with issues; fixed the 4 actionable ones:

- benchmarking: fix dual attribution for energy-movement claim
- hw_acceleration: fix imprecise "100x" energy gap to "orders-of-magnitude"
- hw_acceleration: change "introduced in" to "mentioned in" for HBM ref
- conclusion: correct invariant attribution from data_engineering to Part I

Audit report: .claude/_reviews/2026-02-15_backward-reference-audit.yaml
2026-02-15 09:52:32 -05:00
Vijay Janapa Reddi
577b5d3cc7 feat(vscode-ext): add health status bar and validation checks
Add a persistent health indicator to the extension: a status bar item
that shows pass/warn/error at a glance, plus a health summary node at
the top of the Pre-commit tree view. Fast in-process TypeScript checks
run on file save, editor switch, and startup (<100ms per file).

Checks: duplicate labels, unclosed div fences, missing figure alt-text,
and unresolved in-file cross-references.

- Add src/validation/qmdChecks.ts with four pure check functions
- Add src/validation/healthManager.ts with central status tracker
- Wire HealthManager into extension.ts with status bar and event hooks
- Add expandable health summary node to PrecommitTreeProvider
- Register showHealthDetails command in package.json
2026-02-14 16:18:16 -05:00
Vijay Janapa Reddi
97118ba0d8 style(vol1): fix remaining multiplication notation violations
Second pass catching ~37 additional instances missed in the initial
cleanup, including prose in frameworks, glossary definitions, footnotes,
fig-caps, fig-alts, table cells, and callout content.

All remaining `Nx` patterns are now exclusively inside Python code
blocks (comments, docstrings, f-strings) or are mathematical variable
expressions (e.g., derivative = 2x), which are correct as-is.
2026-02-14 15:46:57 -05:00
Vijay Janapa Reddi
c9d21b768b feat(binder): add render plots command for matplotlib figure gallery
Integrate figure rendering into the binder CLI so plots can be previewed
without a full Quarto build. Extracts Python code blocks with fig-* labels
from QMD files, renders them to PNG, and outputs a browsable gallery at
_output/plots/<chapter>/. Also fixes the package import chain so `binder`
works correctly as an installed entry point.

- Add book/cli/commands/render.py with RenderCommand class
- Wire into main.py with help table entry and command dispatch
- Add matplotlib>=3.7.0 to pyproject.toml dependencies
- Add book/quarto/_output/ to .gitignore
- Archive standalone render_figures.py to _archive/
2026-02-14 12:43:23 -05:00
Vijay Janapa Reddi
a03ce064ec style(vol1): standardize multiplication notation across all chapters
Replace ~115 inconsistent multiplication symbols with the codified
standard from book-prose.md:

- Multiplier/ratio: N× (Unicode ×, no space) — e.g. "4× speedup"
- Dimensions: M × N (Unicode ×, spaces) — e.g. "224 × 224"
- Math mode: $\times$ only inside LaTeX expressions

Fixes applied across 20 files:
- Lowercase x → Unicode × in prose (~80 instances)
- Standalone $\times$ → Unicode × for multipliers (~32 instances)
- Dimension spacing corrections (3 instances)
- Hyphen → en dash in ranges (e.g. "2-4×" → "2–4×")

Protected content (code blocks, TikZ, display math) left untouched.
2026-02-14 12:09:32 -05:00
Vijay Janapa Reddi
40fcce9955 content(vol1): add thesis-driven titles to all Key Takeaways callouts
Every callout-takeaways block across Vol 1 now has a title attribute
that captures the chapter's core insight rather than repeating the
chapter name. Titles are drawn from each chapter's purpose question
or central thesis, answering "what should a student remember six
months from now?" Examples: "Constraints Drive Architecture" (Intro),
"Perfectly Available, Perfectly Wrong" (ML Ops), "Architecture Is
Infrastructure" (Network Architectures).
2026-02-14 11:23:29 -05:00
Vijay Janapa Reddi
6bca8ef9b0 fix(vscode-ext): fix typed reference colors and simplify settings
Typed references (@tbl-, @fig-, @sec-, @lst-, @eq-) and label
definitions ({#tbl-...}, {#fig-...}) were all rendering in generic
blue because overlapping decorations overrode the typed colors.
Fix by collecting typed matches first and excluding them from
generic/structural buckets. Also fix label-definition regexes to
match labels with trailing attributes (e.g. {#fig-foo fig-env=...}).

Change footnote colors from invisible slate-gray to distinct pink/rose
for clear visual separation from other reference types.

Remove all 27 individual color-override settings (mlsysbook.color*)
since only the preset picker (subtle/balanced/vivid) is needed.
2026-02-13 18:21:25 -05:00
Vijay Janapa Reddi
f4391ce26f refactor(vscode-ext): remove diagnostics system and clean up highlighter
Remove QmdDiagnosticsManager and WorkspaceLabelIndex which caused
false-positive blue squiggles during workspace index loading. Pre-commit
hooks already validate cross-references and inline Python at commit time.

Also remove div fence marker highlighting (Quarto handles natively),
add !inFence guards to label line checks, remove broken reference
decoration, and change footnote colors from gold to muted slate-gray
to convey their marginal/supplementary nature.
2026-02-13 16:54:42 -05:00
Vijay Janapa Reddi
e3cc9f7af3 refactor: rename ml_ml_workflow files, consolidate CLI, and clean up scripts
Remove redundant ml_ prefix from ml_workflow chapter files and update all
Quarto config references. Consolidate custom scripts into native binder
subcommands and archive obsolete tooling.
2026-02-13 11:06:28 -05:00
Vijay Janapa Reddi
acd571095a fix(binder): include all label types by default, enable pattern checks
- Default label types now include Equation (was missing from default set)
- --check-patterns now defaults to True for inline-refs
- Removed redundant --all-types from VSCode extension command

All five label types (Figure, Table, Section, Equation, Listing) are
now always checked unless explicitly filtered with --figures/--tables/etc.
2026-02-12 23:46:58 -05:00
Vijay Janapa Reddi
e41c2af2b7 fix(binder): update fix command prog name in argparse 2026-02-12 23:44:55 -05:00
Vijay Janapa Reddi
a0a7f7c658 feat(binder): restructure CLI into check/fix/format hierarchy
Reorganize binder commands into a clean three-verb quality system:

  check   — grouped validation (refs, labels, headers, footnotes,
            figures, rendering) with --scope for granularity
  fix     — content management (headers, footnotes, glossary, images)
  format  — auto-formatters (blanks, python, lists, divs, tables)

Key changes:
- validate → check (with backward-compat alias)
- maintain → fix (with backward-compat alias)
- 17 flat checks grouped into 6 semantic categories
- --scope flag narrows to individual checks within a group
- New FormatCommand with native blanks/lists + script delegation
- Updated pre-commit hooks, VSCode extension, and help output
2026-02-12 23:37:56 -05:00
Vijay Janapa Reddi
8caeac9cc7 refactor(binder): rename validate/maintain subcommands for clarity
Rename verbose compound names to clean, noun-based names:
- section-ids → headers
- forbidden-footnotes → footnote-placement
- footnotes → footnote-refs
- figure-completeness → figures
- figure-placement → float-flow
- index-placement → indexes
- render-patterns → rendering
- dropcap → dropcaps
- part-keys → parts
- image-refs → images

Updated in: validate.py, maintenance.py, pre-commit hooks, VSCode extension.
2026-02-12 23:26:17 -05:00
Vijay Janapa Reddi
755e4cc6a6 feat(binder): consolidate 18 custom scripts into native binder subcommands
Port all custom validation and maintenance scripts into the binder CLI
as native subcommands, eliminating the need for standalone scripts.

New `binder validate` subcommands (10):
- section-ids: verify all headers have {#sec-...} IDs
- forbidden-footnotes: check footnotes in tables/captions/divs
- footnotes: validate footnote refs/defs (undefined, unused, duplicate)
- figure-completeness: check figures have captions and alt-text
- figure-placement: audit figure/table proximity to first reference
- index-placement: check LaTeX \index{} placement
- render-patterns: detect problematic rendering patterns
- dropcap: validate drop cap compatibility
- part-keys: validate \part{key:...} against summaries.yml
- image-refs: validate image references exist on disk

New `binder maintain` subcommands (2):
- section-ids (add/repair/list/remove): full section ID lifecycle
- footnotes (cleanup/reorganize/remove): footnote management

Updated 11 pre-commit hooks to use binder commands instead of scripts.
Updated VSCode extension commands to use binder CLI.
All validators verified against original script output (parity confirmed).
2026-02-12 23:20:54 -05:00
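A check like `section-ids` (verifying that every header carries a `{#sec-...}` ID) could be approximated as below. This is a rough sketch under stated assumptions, not the binder implementation: the function name, the regexes, and the handling of fenced code are all invented for illustration.

```python
import re

# Markdown ATX header: one to six '#' characters, whitespace, then the title.
HEADER = re.compile(r"^(#{1,6})\s+(.*)$")
# Trailing attribute block whose first entry is a #sec- identifier.
SEC_ID = re.compile(r"\{#sec-[^\s}]+[^}]*\}\s*$")


def headers_missing_ids(qmd_text: str) -> list[tuple[int, str]]:
    """Return (line number, header text) for headers without a {#sec-...} ID."""
    missing = []
    in_code = False
    for lineno, line in enumerate(qmd_text.splitlines(), start=1):
        # Toggle on fence lines so '# comment' lines in code are not
        # mistaken for headers.
        if line.lstrip().startswith("```"):
            in_code = not in_code
            continue
        if in_code:
            continue
        match = HEADER.match(line)
        if match and not SEC_ID.search(match.group(2)):
            missing.append((lineno, match.group(2).strip()))
    return missing


sample = (
    "# Intro {#sec-intro}\n\n"
    "## Background\n\n"
    "```python\n# a comment\n```\n\n"
    "### Details {#sec-details .unnumbered}\n"
)
print(headers_missing_ids(sample))
```

The sketch reports only the `## Background` header. A production check would likely also need to handle nested fences and headers that are deliberately left without IDs.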
Vijay Janapa Reddi
6b4af17b8f feat(vscode-ext): workspace-wide cross-reference validation and section ID commands
- Add WorkspaceLabelIndex: scans all .qmd files on activation, updates
  incrementally on save, provides hasLabel() for cross-file validation
- Extend QmdDiagnosticsManager to validate references against workspace
  index (not just current file); triggered on save only, not keystrokes
- Add broken reference decoration (red wavy underline) in chunk
  highlighter for refs that don't resolve to any label in the workspace
- Add commands: Add Missing Section IDs, Verify Section IDs,
  Validate Cross-References (command palette)
- Enable diagnostics by default (save-triggered, not noisy)
- Support YAML-style label definitions (#| label:, #| fig-label:, etc.)
2026-02-12 22:53:50 -05:00