mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-03 16:18:49 -05:00
- Create vol1/backmatter/glossary with 462 Vol1-only terms
- Create vol2/backmatter/glossary with 250 Vol2-only terms
- Remove combined glossary (each volume is now self-contained)
- Update build_global_glossary.py to generate per-volume JSONs
- Update generate_glossary.py to create per-volume QMD files
- Update Quarto sidebar to link to volume-specific glossaries
- Remove obsolete data/ folder (glossary data now in backmatter)
- Update glossary documentation (README.md, ORGANIZATION.md)
Note: Vol2 glossary has some broken refs from pre-existing data issues in edge_intelligence_glossary.json (references ondevice_learning chapter)
Glossary Management Scripts
Scripts for managing the ML Systems textbook glossary system.
Quick Commands
Full Rebuild (when chapters change)
cd /Users/VJ/GitHub/MLSysBook
python3 book/tools/scripts/glossary/build_global_glossary.py
python3 book/tools/scripts/glossary/generate_glossary.py
Generate Specific Volume
python3 book/tools/scripts/glossary/generate_glossary.py --volume vol1
python3 book/tools/scripts/glossary/generate_glossary.py --volume vol2
Data Flow
Chapter QMDs → Agent → Individual JSONs → build_global_glossary.py → Volume JSONs → generate_glossary.py → glossary.qmd
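The aggregation step in this flow (individual JSONs → volume JSON) can be pictured with a minimal sketch. This is not the code in build_global_glossary.py; the function name `merge_chapter_terms` and the entry schema (a list of dicts with `term` and `definition` keys) are assumptions made for illustration, and the real chapter JSON layout may differ.

```python
def merge_chapter_terms(chapter_glossaries):
    """Merge per-chapter term lists into one sorted, de-duplicated list.

    Assumed (hypothetical) schema: each chapter glossary is a list of
    {"term": str, "definition": str} dicts. The real files may differ.
    """
    merged = {}
    for terms in chapter_glossaries:
        for entry in terms:
            # Compare terms case-insensitively so "Tensor" and "tensor" collide.
            key = entry["term"].strip().lower()
            # Keep the first definition seen; later duplicates are skipped.
            merged.setdefault(key, entry)
    # Return entries ordered alphabetically by normalized term.
    return [merged[key] for key in sorted(merged)]
```

Keeping the first definition is only one possible policy; the consolidation scripts listed under Scripts presumably apply more careful rules.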
Scripts
- build_global_glossary.py - Main aggregation script (chapter JSONs → volume JSONs)
- generate_glossary.py - Page generator (volume JSONs → volume glossary.qmd files)
- clean_master_glossary.py - Legacy cleanup script
- smart_consolidation.py - Advanced term consolidation
- rule_based_consolidation.py - Rule-based term consolidation
Source Files
- Vol1 chapter glossaries: quarto/contents/vol1/*/<chapter>_glossary.json
- Vol2 chapter glossaries: quarto/contents/vol2/*/<chapter>_glossary.json
Individual chapter glossaries are the source of truth. Edit those, then rebuild.
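To see which chapter files a rebuild would pick up, the path pattern above can be expressed as a glob. The helper below is a hypothetical sketch, not part of the repository's scripts; `contents_root` stands for whatever directory holds the vol1 and vol2 folders (quarto/contents in the paths above).

```python
from pathlib import Path


def find_chapter_glossaries(contents_root, volume):
    """Return sorted chapter glossary JSON paths for one volume.

    Mirrors the pattern <contents_root>/<volume>/*/<chapter>_glossary.json.
    The generated volume JSON sits one level deeper (backmatter/glossary/),
    so this one-level glob does not match it.
    """
    volume_dir = Path(contents_root) / volume
    return sorted(volume_dir.glob("*/*_glossary.json"))
```

For example, `find_chapter_glossaries("quarto/contents", "vol1")` would list the Vol1 source files when run from the directory that contains quarto/.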
Output Files
- Volume 1 JSON: quarto/contents/vol1/backmatter/glossary/vol1_glossary.json
- Volume 1 page: quarto/contents/vol1/backmatter/glossary/glossary.qmd
- Volume 2 JSON: quarto/contents/vol2/backmatter/glossary/vol2_glossary.json
- Volume 2 page: quarto/contents/vol2/backmatter/glossary/glossary.qmd
Each volume has its own self-contained glossary.
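After a rebuild, a quick check that both outputs exist for each volume can catch a step that was skipped. The sketch below is illustrative only: the function names `expected_outputs` and `missing_outputs` are hypothetical, and the paths are built from the output layout listed above.

```python
from pathlib import Path


def expected_outputs(contents_root, volume):
    """Build the two output paths (volume JSON and glossary page) for a volume."""
    glossary_dir = Path(contents_root) / volume / "backmatter" / "glossary"
    return [glossary_dir / f"{volume}_glossary.json", glossary_dir / "glossary.qmd"]


def missing_outputs(contents_root, volumes=("vol1", "vol2")):
    """Return every expected output path that is not present on disk."""
    return [
        path
        for volume in volumes
        for path in expected_outputs(contents_root, volume)
        if not path.is_file()
    ]
```

An empty list from `missing_outputs("quarto/contents")` would mean all four files are present; it says nothing about whether their contents are up to date.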