Mirror of https://github.com/harvard-edge/cs249r_book.git (synced 2026-05-06 01:28:35 -05:00)
fix(ci): give staffml-validate-{dev,vault} distinct concurrency groups
Both reusable workflows used `group: ${{ github.workflow }}-...`, but
when GitHub runs a workflow via `workflow_call`, github.workflow resolves
to the CALLER'S workflow name. So when staffml-preview-dev calls both
staffml-validate-dev and staffml-validate-vault via `uses:` from the
same parent run, the two reusable workflows collapsed into the same
concurrency group (parent-name + parent-run-id). With
`cancel-in-progress: true`, whichever queued first got cancelled by the
later one.
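For illustration, the caller side might look roughly like the sketch
below. Only the two reusable workflow file names come from this commit;
the parent's `name:`, its trigger, and the job ids are assumptions made
up for the example.

```yaml
# Hypothetical caller: .github/workflows/staffml-preview-dev.yml
# (workflow name, trigger, and job ids are assumed, not taken from the repo)
name: StaffML Preview (Dev)

on:
  push:
    branches: [dev]

jobs:
  validate-dev:      # assumed job id
    uses: ./.github/workflows/staffml-validate-dev.yml
  validate-vault:    # assumed job id
    uses: ./.github/workflows/staffml-validate-vault.yml

# Inside BOTH callees, the old key
#   group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
# sees the caller's github.workflow and the caller's run_id, so on a push
# both evaluate to "StaffML Preview (Dev)-<parent run_id>": one shared group.
```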
Concretely, on every push run since 6ddb82a71b (2026-05-02):
- Validate (Vault) jobs queue at parent+~3s with no runner assigned
- Validate (Dev) jobs queue at parent+~5s
- Vault jobs cancel ~1s later (cancel-in-progress fires when the
second occupant of the shared group enters)
Net effect: vault validation never ran, and the overall StaffML
preview-dev run reported 'cancelled', flipping the README badge red even
though the build and Validate (Dev) were green. Nine consecutive push
runs were affected.
Fix: replace ${{ github.workflow }} with a literal workflow-identifying
string in each group key so the two reusable workflows live in disjoint
groups regardless of caller. The fallback to head_ref/run_id is kept,
so PR cancel-on-amend and standalone-vs-uses uniqueness still work.
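As a rough sketch of how the new keys resolve, under stated assumptions:
the run ids and the branch name below are made-up placeholders, and the
mapping is purely illustrative, not a file in the repo.

```yaml
# Illustrative only: resolved concurrency group per scenario.
push_via_parent_run_1111111111:      # head_ref is empty, so run_id is used
  staffml-validate-dev:   "staffml-validate-dev-1111111111"
  staffml-validate-vault: "staffml-validate-vault-1111111111"   # different literal prefix, different group

push_standalone_run_2222222222:      # separate run of the same workflow, separate run_id
  staffml-validate-dev:   "staffml-validate-dev-2222222222"

pull_request_from_branch_fix-typo:   # head_ref is the PR source branch, stable across amends
  staffml-validate-dev:   "staffml-validate-dev-fix-typo"
```

The literal prefix separates the two workflows from each other, and the
`head_ref || run_id` suffix separates runs of the same workflow.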
Tested by dispatching staffml-validate-vault standalone before this
commit (run 25351824595): both jobs ran cleanly to success, confirming
the failure was purely the concurrency interaction between the two
reusable workflows in the same parent, not anything in the validation
logic itself.
.github/workflows/staffml-validate-dev.yml (vendored, 21 lines changed)
@@ -52,15 +52,18 @@ permissions:
   contents: read
 
 concurrency:
-  # `head_ref || run_id` preserves PR cancel-on-amend (head_ref is the PR
-  # source branch and is stable across PR commits) while making push and
-  # workflow_call runs unique per-run (head_ref is empty for non-PR events,
-  # so the group falls back to run_id). Without the per-run fallback, a
-  # push to dev would trigger BOTH this workflow standalone AND Preview's
-  # `uses:` call into it; the two would share the same group and one would
-  # cancel the other — same class of badge-flicker bug the CLAUDE.md note
-  # describes for manual dispatch.
-  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
+  # Group key uses a literal workflow-identifying string instead of
+  # ${{ github.workflow }} because the latter resolves to the CALLER's
+  # workflow name when this runs via `workflow_call`. Without the
+  # literal, both staffml-validate-dev and staffml-validate-vault — when
+  # called from the same staffml-preview-dev parent — collapse to the
+  # same group (parent-name + parent-run-id) and cancel each other.
+  # `head_ref || run_id` still preserves PR cancel-on-amend (head_ref
+  # set on PRs, stable across amends) and per-run uniqueness for push/
+  # dispatch (head_ref empty → run_id fallback) so a push to dev that
+  # triggers BOTH this workflow standalone AND Preview's `uses:` call
+  # into it doesn't share a group either.
+  group: staffml-validate-dev-${{ github.head_ref || github.run_id }}
   cancel-in-progress: true
 
 env:
.github/workflows/staffml-validate-vault.yml (vendored, 20 lines changed)
@@ -43,12 +43,20 @@ on:
       - 'interviews/staffml-vault-worker/**'
 
 concurrency:
-  # `head_ref || run_id` keeps PR cancel-on-amend behavior while making
-  # push and workflow_call runs unique per-run, so a push to dev that
-  # triggers both this workflow standalone AND Preview's `uses:` call
-  # doesn't collide on a shared group. See staffml-validate-dev.yml for
-  # the long-form rationale.
-  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
+  # Group key uses a literal workflow-identifying string instead of
+  # ${{ github.workflow }} because the latter resolves to the CALLER's
+  # workflow name when this runs via `workflow_call`. Without the
+  # literal, both staffml-validate-dev and staffml-validate-vault — when
+  # called from the same staffml-preview-dev parent — produce the same
+  # group (parent-name + parent-run-id), and `cancel-in-progress: true`
+  # silently cancels whichever queued earlier. Observed on every push
+  # run since 2026-05-02 (introduced by 6ddb82a71b): vault jobs queue
+  # ~3s before Validate (Dev) jobs and get cancelled by them — turning
+  # the StaffML README badge red despite all real validation passing.
+  # `head_ref || run_id` still preserves PR cancel-on-amend (head_ref
+  # set on PRs) and per-run uniqueness for push/dispatch (head_ref empty
+  # → run_id fallback).
+  group: staffml-validate-vault-${{ github.head_ref || github.run_id }}
   cancel-in-progress: true
 
 # Read-only token: jobs only checkout, install, lint, and test.