mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-06 17:49:07 -05:00
[PR #1406] [MERGED] PR-3: Scripts, audits, cleanup (build stamp, PDF dropdown, 404s, mirror guard, dedup, RELEASE-PREP) #6526
📋 Pull Request Information
Original PR: https://github.com/harvard-edge/cs249r_book/pull/1406
Author: @profvjreddi
Created: 4/19/2026
Status: ✅ Merged
Merged: 4/19/2026
Merged by: @profvjreddi
Base: dev ← Head: release-prep/scripts-audits-cleanup
📝 Commits (6)
ae4159f feat(footer): build-time "last updated" stamp
86fdc91 feat(navbar): expose paper.pdf for TinyTorch / MLSys·im / StaffML
0fc4a56 feat(404): per-site 404 pages for slides / instructors / unified site
3ab7d99 ci(precommit): block subsite-mirror drift on shared assets
0d565a2 refactor(audit): duplicate-file finder + clean up obvious leftover
ab10dca docs(release-prep): handoff notes covering all five PR groupings
📊 Changes
14 files changed (+792 additions, -120 deletions)
📝 .github/workflows/kits-publish-live.yml (+3 -0)
📝 .gitignore (+4 -0)
📝 .pre-commit-config.yaml (+17 -0)
➕ RELEASE-PREP.md (+328 -0)
📝 book/quarto/config/shared/html/footer-common.yml (+2 -1)
➕ instructors/404.qmd (+31 -0)
📝 shared/config/footer-site.yml (+2 -1)
📝 shared/config/navbar-common.yml (+19 -0)
📝 shared/config/site-head.html (+17 -0)
➕ shared/scripts/find-duplicates.py (+235 -0)
➕ shared/scripts/inject-build-stamp.sh (+71 -0)
➕ site/404.qmd (+34 -0)
➕ slides/404.qmd (+29 -0)
➖ tinytorch/scripts/cleanup_repo_history.sh (+0 -118)
📄 Description
Summary
Quality-of-life and hygiene improvements identified during the
release-prep review. Independent of PR-1 / PR-2 — can merge in any
order. Includes the comprehensive handoff document covering all five
release-prep PRs.
What's in this PR
Build-time "last updated" footer stamp. New
shared/scripts/inject-build-stamp.sh finds the placeholder
<!-- MLSB_BUILD_STAMP --> in any built HTML page and replaces it with
<span class="mlsb-build-stamp">Last updated YYYY-MM-DD · <SiteLabel> · <CommitSHA></span>.
Style block (small, dark-mode aware) inlined into
shared/config/site-head.html. Placeholder already added to
book/quarto/config/shared/html/footer-common.yml and
shared/config/footer-site.yml. First wired into kits-publish-live.yml
as a reference implementation; other publish workflows can adopt the
same step in followup.
Paper.pdf links surfaced in the navbar. Added direct entries
under the Build dropdown for TinyTorch and MLSys·im (where each
property generates a paper.pdf), and under Prepare for StaffML.
Single source of truth in shared/config/navbar-common.yml; all
HTML builds inherit. Closes the gap where the rendered papers had no
discovery surface.
Per-site 404 pages. Quarto/Next subsites that lacked a
maintained 404 page now have one tailored to their context with
relevant navigation back into the ecosystem:
slides/404.qmd
instructors/404.qmd
site/404.qmd (unified landing, broadest cross-property nav)
Pre-commit guard against shared-mirror drift. Quarto's
resource-copy step preserves symlinks instead of dereferencing them,
so we keep real-file copies of certain shared assets (subscribe modal
JS in particular) per subsite. Without a guard, the canonical and the
mirrors silently diverge — most common symptom is the wrong subscribe
modal rendering on one subsite. New
check-shared-mirrors hook runs bash shared/scripts/sync-mirrors.sh --check on every commit
(always_run: true because mirrors can drift via deletion of the
canonical, not just by editing the canonical itself).
Duplicate-file audit script + initial cleanup.
shared/scripts/find-duplicates.py walks the chosen subsite roots,
hashes files, groups by hash, and reports unintended duplicates
(known mirrors are excluded via an allowlist; symlinks are skipped).
First run found
tinytorch/scripts/cleanup_repo_history.sh == tinytorch/tools/maintenance/cleanup_history.sh
byte-for-byte; the tinytorch/scripts/ copy is removed in this commit.
.gitignore updated to exclude .audit/ and _audit/ output dirs.
RELEASE-PREP.md handoff document. Single document organizing
all 19 release-prep commits into the five logical PR groupings
(safety net, visual polish, scripts/audits/cleanup, TinyTorch prep,
cutover skeletons), with per-PR rationale, deferred items, and
local verification notes. Living document — will be updated as PRs
land and merge.
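The duplicate-file audit described above (walk the roots, hash files, group by hash, skip symlinks, exclude allowlisted mirrors) can be sketched in a few lines of Python. This is only an illustration of the technique, not the contents of shared/scripts/find-duplicates.py: the function name find_duplicates, the choice of SHA-256, and the allowlist being a set of POSIX path strings are all assumptions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(roots, allowlist=frozenset()):
    """Return groups of byte-identical regular files found under `roots`.

    `allowlist` (hypothetical shape): POSIX path strings of known mirrors
    that should never be reported.
    """
    by_hash = defaultdict(list)
    for root in roots:
        for path in sorted(Path(root).rglob("*")):
            # Symlinks are skipped, as are directories and other non-files.
            if path.is_symlink() or not path.is_file():
                continue
            if path.as_posix() in allowlist:
                continue
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    # Only hashes shared by two or more files count as duplicates.
    return [group for group in by_hash.values() if len(group) > 1]
```

Run against a tree holding two byte-identical scripts, a function like this returns one group containing both paths, which is the kind of finding reported for cleanup_repo_history.sh.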
Risk surface
inject-build-stamp.sh: if the placeholder isn't present (which is the
case for any property that hasn't adopted it), the script is a no-op
(safe-by-default).
check-shared-mirrors hook refuses commits when mirrors are drifted.
Worst case for an accidentally drifted file: developer runs
bash shared/scripts/sync-mirrors.sh to re-sync, then commits.
Hook itself is idempotent.
was the script removed here).
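The safe-by-default behaviour of the build stamp can be illustrated with a small Python sketch. The real step is a bash script whose internals are not shown here, so everything below is an assumption except the placeholder and the output format quoted in the description: the function name inject_build_stamp, its parameters, and the sample SHA abc1234 are hypothetical.

```python
from datetime import date
from html import escape

PLACEHOLDER = "<!-- MLSB_BUILD_STAMP -->"


def inject_build_stamp(page, site_label, commit_sha, today=None):
    """Replace the placeholder in one HTML page; return the page unchanged if absent."""
    if PLACEHOLDER not in page:
        # Properties that have not adopted the placeholder are left untouched.
        return page
    day = (today or date.today()).isoformat()  # YYYY-MM-DD
    stamp = (
        '<span class="mlsb-build-stamp">'
        f"Last updated {day} · {escape(site_label)} · {escape(commit_sha)}"
        "</span>"
    )
    return page.replace(PLACEHOLDER, stamp)


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    print(inject_build_stamp("<footer><!-- MLSB_BUILD_STAMP --></footer>", "Kits", "abc1234"))
```

Because a second run finds no placeholder left to replace, applying the function twice gives the same result as applying it once.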
Test plan
(no schema regressions).
pre-commit run check-shared-mirrors --all-files exits 0 on a clean checkout.
python3 shared/scripts/find-duplicates.py produces a report with no flagged duplicates.
bash shared/scripts/inject-build-stamp.sh kits/_site Kits rewrites the placeholder.
Paper / StaffML Paper entries on a built site.
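The mirror check exercised in the test plan amounts to "exit 0 when every mirror matches its canonical file, non-zero otherwise". The sketch below shows that idea in Python. It is not sync-mirrors.sh, whose implementation is not included in this description: the names drifted_mirrors and check_exit_code, and the single canonical-to-many-mirrors layout, are assumptions.

```python
import filecmp
from pathlib import Path


def drifted_mirrors(canonical, mirrors):
    """Return the mirrors that are missing or differ byte-for-byte from `canonical`.

    If the canonical file itself is gone, every mirror counts as drifted,
    which covers drift caused by deleting the canonical.
    """
    canonical = Path(canonical)
    drifted = []
    for mirror in map(Path, mirrors):
        if not canonical.is_file() or not mirror.is_file():
            drifted.append(mirror)
        elif not filecmp.cmp(canonical, mirror, shallow=False):
            # shallow=False forces a content comparison, not just a stat check.
            drifted.append(mirror)
    return drifted


def check_exit_code(canonical, mirrors):
    """0 on a clean tree, 1 when any mirror has drifted (hook-style exit code)."""
    return 1 if drifted_mirrors(canonical, mirrors) else 0
```

Handling a missing canonical file explicitly reflects the stated reason for always_run: true, since a check triggered only by edits to the canonical would never see its deletion.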
Followup
Wire inject-build-stamp.sh into the other *-publish-live.yml
workflows (TinyTorch, labs, slides, instructors, mlsysim, site,
staffml, book). Currently only kits is wired as a reference.
Add a find-duplicates.py workflow (cron, weekly) so the audit doesn't bit-rot.
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.