[PR #1346] [MERGED] feat(ci): tooling to catch fork-PR variable leaks and marimo dataflow bugs #5143

Closed
opened 2026-04-19 12:50:09 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/harvard-edge/cs249r_book/pull/1346
Author: @profvjreddi
Created: 4/16/2026
Status: Merged
Merged: 4/16/2026
Merged by: @profvjreddi

Base: devHead: feat/ci-sanity-tooling


📝 Commits (1)

  • 31d2f5e feat(ci): tooling to catch fork-PR variable leaks and marimo dataflow bugs

📊 Changes

7 files changed (+454 additions, -9 deletions)

View changed files

.github/scripts/check_workflow_fork_safety.py (+223 -0)
.github/workflows/ci-sanity.yml (+57 -0)
📝 .github/workflows/kits-validate-dev.yml (+12 -3)
📝 .github/workflows/labs-validate-dev.yml (+12 -3)
📝 .github/workflows/mlsysim-validate-dev.yml (+12 -3)
📝 .pre-commit-config.yaml (+14 -0)
📝 labs/tests/test_static.py (+124 -0)

📄 Description

adds static checks for two bug classes we hit in the last week, each of which produced silent, long-lived failures:

bug classes

1. workflow fork-safety (the #1344 incident). any pull_request-triggered workflow that references ${{ vars.* }} or non-GITHUB_TOKEN ${{ secrets.* }} breaks silently on fork PRs, because repository variables and secrets are not exposed in that context by GitHub's security default. the token resolves to empty string, the working directory is wrong, and the final validation step looks for a file at the wrong path. one week of broken fork CI on #1306, #1331, #1339 before anyone noticed.

2. marimo widget-in-gated-cell (the lab_01 incident, fixed by #1339). when an @app.cell function contains mo.stop() AND defines mo.ui.* widgets that appear in its return tuple, those widgets don't exist until the gate unblocks. every cell that depends on them cascades to undefined state.

what's in this PR

file purpose
.github/scripts/check_workflow_fork_safety.py standalone python + PyYAML. parses each workflow, identifies pull_request triggers, flags unsafe vars/secrets references with file:line:token pointers and fix hints. exempts secrets.GITHUB_TOKEN which is always available.
.github/workflows/ci-sanity.yml new workflow on .github/** changes. runs the fork-safety check so contributors without pre-commit still get caught.
.pre-commit-config.yaml new hook under a SECTION 3.5: CI SANITY for local fast feedback.
labs/tests/test_static.py new TestMarimoDataflow class with test_no_widget_defined_in_gated_cell. AST-based. marked xfail because 32 of 33 labs currently have the pattern; a systematic refactor is separate scope. once labs are converted (see vol2/lab_05_dist_train for the proper pattern), remove the xfail.
labs-validate-dev.yml, kits-validate-dev.yml, mlsysim-validate-dev.yml "Validate build output" step now echoes the resolved env var and fails loudly if empty, with a message explaining the fork-PR var issue. would have diagnosed #1344 in 30 seconds instead of a week.

verification

  • python3 .github/scripts/check_workflow_fork_safety.py — ok, scanned 47 workflows, 8 trigger on pull_request, 0 violations
  • marimo check labs/vol{1,2}/*.py — clean across all 33 labs (after #1345)
  • pytest labs/tests/test_static.py — 656 passed, 4 skipped, 33 xfailed, 1 xpassed
  • the xpassed lab is vol2/lab_05_dist_train — it's the canonical reference for the proper pattern

not in this PR

  • full refactor of the 33 labs to remove the widget-in-gated-cell pattern. that's a separate tracking issue; the xfail in the new test will enforce strict checking once the refactor lands.

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

## 📋 Pull Request Information **Original PR:** https://github.com/harvard-edge/cs249r_book/pull/1346 **Author:** [@profvjreddi](https://github.com/profvjreddi) **Created:** 4/16/2026 **Status:** ✅ Merged **Merged:** 4/16/2026 **Merged by:** [@profvjreddi](https://github.com/profvjreddi) **Base:** `dev` ← **Head:** `feat/ci-sanity-tooling` --- ### 📝 Commits (1) - [`31d2f5e`](https://github.com/harvard-edge/cs249r_book/commit/31d2f5eb3bb484e9b0d2825f31d15e07a1c2a4a0) feat(ci): tooling to catch fork-PR variable leaks and marimo dataflow bugs ### 📊 Changes **7 files changed** (+454 additions, -9 deletions) <details> <summary>View changed files</summary> ➕ `.github/scripts/check_workflow_fork_safety.py` (+223 -0) ➕ `.github/workflows/ci-sanity.yml` (+57 -0) 📝 `.github/workflows/kits-validate-dev.yml` (+12 -3) 📝 `.github/workflows/labs-validate-dev.yml` (+12 -3) 📝 `.github/workflows/mlsysim-validate-dev.yml` (+12 -3) 📝 `.pre-commit-config.yaml` (+14 -0) 📝 `labs/tests/test_static.py` (+124 -0) </details> ### 📄 Description adds static checks for two bug classes we hit in the last week, each of which produced silent, long-lived failures: ## bug classes **1. workflow fork-safety (the #1344 incident).** any `pull_request`-triggered workflow that references `${{ vars.* }}` or non-`GITHUB_TOKEN` `${{ secrets.* }}` breaks silently on fork PRs, because repository variables and secrets are not exposed in that context by GitHub's security default. the token resolves to empty string, the working directory is wrong, and the final validation step looks for a file at the wrong path. one week of broken fork CI on #1306, #1331, #1339 before anyone noticed. **2. marimo widget-in-gated-cell (the lab_01 incident, fixed by #1339).** when an `@app.cell` function contains `mo.stop()` AND defines `mo.ui.*` widgets that appear in its return tuple, those widgets don't exist until the gate unblocks. every cell that depends on them cascades to undefined state. ## what's in this PR | file | purpose | |------|---------| | `.github/scripts/check_workflow_fork_safety.py` | standalone python + PyYAML. parses each workflow, identifies pull_request triggers, flags unsafe vars/secrets references with file:line:token pointers and fix hints. exempts `secrets.GITHUB_TOKEN` which is always available. | | `.github/workflows/ci-sanity.yml` | new workflow on `.github/**` changes. runs the fork-safety check so contributors without pre-commit still get caught. | | `.pre-commit-config.yaml` | new hook under a SECTION 3.5: CI SANITY for local fast feedback. | | `labs/tests/test_static.py` | new `TestMarimoDataflow` class with `test_no_widget_defined_in_gated_cell`. AST-based. marked `xfail` because 32 of 33 labs currently have the pattern; a systematic refactor is separate scope. once labs are converted (see `vol2/lab_05_dist_train` for the proper pattern), remove the xfail. | | `labs-validate-dev.yml`, `kits-validate-dev.yml`, `mlsysim-validate-dev.yml` | "Validate build output" step now echoes the resolved env var and fails loudly if empty, with a message explaining the fork-PR var issue. would have diagnosed #1344 in 30 seconds instead of a week. | ## verification - `python3 .github/scripts/check_workflow_fork_safety.py` — ok, scanned 47 workflows, 8 trigger on pull_request, 0 violations - `marimo check labs/vol{1,2}/*.py` — clean across all 33 labs (after #1345) - `pytest labs/tests/test_static.py` — 656 passed, 4 skipped, 33 xfailed, 1 xpassed - the xpassed lab is `vol2/lab_05_dist_train` — it's the canonical reference for the proper pattern ## not in this PR - full refactor of the 33 labs to remove the widget-in-gated-cell pattern. that's a separate tracking issue; the xfail in the new test will enforce strict checking once the refactor lands. --- <sub>🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.</sub>
GiteaMirror added the pull-request label 2026-04-19 12:50:09 -05:00
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/cs249r_book#5143