Adds the "Primary chains only / All" filter dropdown that was deferred
from Phase 2.3 (ed2ddb51d) so the user could review the larger UI
surface first.
Implementation:
- new selectedTier state, default "primary"
- filteredQuestions filter: when "primary", drop questions whose
chain memberships are *all* secondary (questions not in any chain
pass through unchanged — they're tier-irrelevant).
- Tier FilterSelect dropdown next to the existing Level filter.
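The filter rule above can be pinned down with a small sketch. The real filter lives in the TypeScript explore page; this Python version only illustrates the predicate, and the names `passes_tier_filter` and the `"all"` value are assumptions, not the shipped identifiers.

```python
from typing import Dict, TypedDict


class Question(TypedDict, total=False):
    id: str
    chain_tiers: Dict[str, str]  # chain id -> "primary" | "secondary"


def passes_tier_filter(q: Question, selected_tier: str) -> bool:
    """Return True if the question stays visible under the selected tier."""
    if selected_tier != "primary":
        return True  # "all" (assumed value): nothing is hidden
    tiers = list(q.get("chain_tiers", {}).values())
    if not tiers:
        return True  # not in any chain, so the tier filter does not apply
    # Dropped only when *every* membership is secondary.
    return any(t != "secondary" for t in tiers)


questions = [
    {"id": "q1"},
    {"id": "q2", "chain_tiers": {"c1": "secondary", "c2": "primary"}},
    {"id": "q3", "chain_tiers": {"c1": "secondary"}},
]
visible_ids = [q["id"] for q in questions if passes_tier_filter(q, "primary")]
```

With the default "primary" selection only `q3` is hidden; any other selection keeps all three.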
Default behaviour intentionally hides secondary-only questions —
matches the rest of the Phase 2 surfaces (practice prefers primary,
ChainBadge shows "alt path" pill on secondary, explore picks primary
chains for the related panel). Users opt into seeing the lenient-pass
questions by switching to "All chains".
Tests: new playwright case test8_explore_tier_filter:
- tier filter dropdown rendered
- switching to "All chains" keeps page interactive (no crash on
re-filter)
Smoke suite: 19/19 pass.
Pull in the dev work that landed since yaml-audit was last synced:
- --legacy-json renamed to --local-json (2b381bb949); the matching
script/doc updates still need to land later in this branch
- CI workflow refactor (validate-dev / validate-vault now reusable)
- all-contributors automation, gitignore tightening, codespell list
- PR #1622 navbar URL rewrite for dev preview
- PR #1619 clone-size refactor, #1618 milestone3 xor fix, #1617
perceptron seed, #1616 tito status M3
- Chapter 9 PDF layout refinement
- assorted staffml/practice fixes (pickRandom deps, GitHub star gate)
This merges the canonical dev state into yaml-audit so subsequent
work continues on top of the freshest base. Conflicts in
practice/page.tsx + corpus.ts + ARCHITECTURE.md resolved to keep both
sides' additive changes (Phase 2 tier work + dev's later refactors).
Carries the primary/secondary chain tier (from Phase 1) through the
build pipeline into the practice + explore surfaces, so primary chains
are the unmarked default and secondary chains are an opt-in alternative
path the user can deep-link into via ?chain=<id>.
Backend (2.1):
- legacy_export.py emits chain_tiers per question alongside chain_ids
and chain_positions; missing chain-tier defaults to "primary".
- vault build re-run: 2953 chained questions, all carry chain_tiers
(releaseHash unchanged — new field is additive, doesn't perturb the
manifest hash inputs).
- Existing legacy_export tests were stale (asserted on the v1.0 YAML
chains: field path; v1.1 made chains.json the sidecar source).
Rewrote them to write chains.json fixtures into tmp_path and added
chain_tiers assertions, plus a focused
test_chain_tiers_emitted_per_membership case.
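A minimal sketch of the per-membership emission, assuming a sidecar shape that is not confirmed by this commit: a list of chains, each with an `id`, an optional `tier`, and an ordered `questions` list. The function name `chain_fields`, the 0-based positions, and the dict shape of `chain_positions` are illustrative assumptions, not the actual legacy_export.py code.

```python
from collections import defaultdict


def chain_fields(chains):
    """Build chain_ids / chain_positions / chain_tiers for each question.

    `chains` is an assumed shape:
    [{"id": str, "tier": str (optional), "questions": [qid, ...]}, ...]
    """
    out = defaultdict(
        lambda: {"chain_ids": [], "chain_positions": {}, "chain_tiers": {}}
    )
    for chain in chains:
        tier = chain.get("tier", "primary")  # missing tier defaults to "primary"
        for pos, qid in enumerate(chain["questions"]):
            rec = out[qid]
            rec["chain_ids"].append(chain["id"])
            rec["chain_positions"][chain["id"]] = pos  # 0-based (assumption)
            rec["chain_tiers"][chain["id"]] = tier
    return dict(out)


fields = chain_fields([
    {"id": "c1", "questions": ["q1", "q2"]},                       # no tier given
    {"id": "c2", "tier": "secondary", "questions": ["q2"]},
])
```

Every chain membership gets its own tier entry, so a question that sits in both a primary and a secondary chain carries both.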
TypeScript (2.2):
- Question.chain_tiers? (Record<string, "primary"|"secondary">)
- ChainTier export, ChainInfo.tier required.
- getChainForQuestion / getAllChainsForQuestion populate tier;
getAllChains... sorts primary first.
- New getPrimaryChainForQuestion(qid) helper for default surfaces.
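The ordering and the default-surface helper can be sketched as follows. This is a Python illustration of the TypeScript behavior described above, not the real implementation; the function names mirror the TypeScript ones, and returning `None` when a question has no primary chain is an assumption.

```python
def all_chains_for_question(q):
    """List the question's chains with their tier, primary chains first."""
    tiers = q.get("chain_tiers", {})
    chains = [
        {"id": cid, "tier": tiers.get(cid, "primary")}
        for cid in q.get("chain_ids", [])
    ]
    # sorted() is stable, so chains keep their original order within a tier.
    return sorted(chains, key=lambda c: c["tier"] != "primary")


def primary_chain_for_question(q):
    """Return the first primary chain, or None if there is none (assumption)."""
    for chain in all_chains_for_question(q):
        if chain["tier"] == "primary":
            return chain
    return None


q = {
    "chain_ids": ["c2", "c1"],
    "chain_tiers": {"c2": "secondary", "c1": "primary"},
}
ordered = [c["id"] for c in all_chains_for_question(q)]
```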
UI (2.3):
- practice page reads ?chain=<id> URL param; defaults to
getPrimaryChainForQuestion when unset.
- ChainBadge gains an inline "alt path" pill when tier=secondary
(always visible — no click needed).
- ChainStrip mirrors that pill in the progress row for users who
expand the strip.
- Explore page prefers the first non-secondary chain when picking
activeChainId for the related-questions panel.
- Deferred to a follow-up commit (intentional, scoped via Progress Log):
explore-page "Primary only / All" filter; daily/mock routing.
Tests (2.4):
- test7_tier_aware_chain_routing in chain-and-vault-smoke.mjs:
secondary reachable via ?chain=, alt-path badge visible on
secondary, primary regression, alt-path badge ABSENT on primary.
- Full smoke suite: 17/17 pass (was 13/13).
Validation:
- vault check --strict: 10,701 loaded, 0 failures
- vault build --legacy-json: 9438 published, chainCount=879
- pytest interviews/vault-cli/tests: 74/74
- npx tsc --noEmit: 0 errors
- playwright chain-and-vault-smoke: 17/17
Phase 2 complete. Next: Phase 3 (gap-driven authoring; 407-gap backlog).
Replace the daily-FREE_LIMIT modal with a single mission-aligned ask
shown once after 5 lifetime reveals. The gate now retires forever on
star, honor-confirm, or dismiss — no daily cap, no username verify.
- Live stargazer count fetched from the GitHub API (24h cache).
- Copy borrows site/about: "Our only ask. Every star tells universities,
publishers, and funders that AI engineering education matters."
- Wires the same gate into the gauntlet revealAnswer path so Mock
Interview no longer bypasses the ask.
- Adds a Playwright smoke covering practice + gauntlet + dismiss
persistence across reloads.
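One way to read the gate rules, sketched in Python rather than the site's TypeScript. `GateState`, `on_reveal`, and `resolve` are hypothetical names, and showing the ask on the fifth reveal (rather than the sixth) is an assumption about what "after 5 lifetime reveals" means.

```python
from dataclasses import dataclass

REVEAL_THRESHOLD = 5  # lifetime reveals before the ask appears
RETIRING_ACTIONS = {"star", "honor-confirm", "dismiss"}


@dataclass
class GateState:
    lifetime_reveals: int = 0
    retired: bool = False  # once True, the ask never shows again


def on_reveal(state: GateState) -> bool:
    """Count a reveal and report whether the ask should be shown now."""
    state.lifetime_reveals += 1
    return not state.retired and state.lifetime_reveals >= REVEAL_THRESHOLD


def resolve(state: GateState, action: str) -> None:
    """Retire the gate permanently on any of the three user actions."""
    if action not in RETIRING_ACTIONS:
        raise ValueError(f"unknown gate action: {action}")
    state.retired = True


state = GateState()
shown = [on_reveal(state) for _ in range(5)]
resolve(state, "dismiss")
shown_after_dismiss = on_reveal(state)
```

Both the practice and gauntlet reveal paths would call the same `on_reveal`, which is what keeps Mock Interview from bypassing the ask.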
13 checks covering:
- landing page + vault area rendering
- topic drilldown question card preview text (regression for the '...' bug)
- practice page loads + renders chain members
- chain indicator surfaces on chain-member questions
- hierarchical layout doesn't break runtime: practice loads
cloud-0000, edge-0001, mobile-0000, tinyml-0000
All 13 pass against the current build. Run via:
    cd interviews/staffml && npm run dev    # dev server, in one shell
    node tests/chain-and-vault-smoke.mjs    # smoke suite, in a second shell (from interviews/staffml)
EcosystemBar's sticky wrapper was in normal flow, so the 7 left dropdowns
+ 6 right icons (~1253px combined) propagated to body.scrollWidth and
triggered horizontal page scroll on iPad Mini landscape (1024px) and
iPad Pro landscape (1194px).
Quarto sites avoid the same overflow because their navbar lives inside
.fixed-top (position: fixed), which is removed from normal flow. We
match that behavior with overflow-x: clip on the sticky wrapper —
critically NOT overflow-x: hidden, which would force overflow-y: auto
and clip the dropdown menus that extend below the bar.
Also reverts the nav-lg breakpoint workaround (1200 → 992) so StaffML
collapses to hamburger at the same viewport as every other ecosystem
subsite (Bootstrap lg, matches shared/config/navbar-common.yml).
Copy update: "Backed by a 600-page open textbook" → "Backed by the
[Machine Learning Systems](https://mlsysbook.ai) textbook" on
welcome and about, removing the page-count claim.
Verified: tests/responsive-audit.mjs across 13 routes × 8 viewports
(WebKit) — 104/104 pass, zero horizontal scroll.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the cleanup arc (A.1–A.10 in RESUME_PLAN_RELEASE.md). Every
gate is now green: vault check --strict, vault lint, vault doctor,
vault codegen --check, staffml validate-vault, Playwright (9/9), tsc.
A.1 mobile-1962.svg: renamed `Edge` → `RegEdge` in graphviz source
(DOT keywords are case-insensitive, so `Edge` collides with the
reserved `edge` keyword); SVG now renders cleanly. Also fixed
tinyml-1570.py (missing `import numpy as np`) which the new failure
log surfaced.
A.2 render_visuals.py: structured per-ID failure log written to
`_validation_results/render_failures.json` on every run; non-zero
exit on any per-item crash; new `--fail-fast` and `--failure-log`
CLI options. Replaces the prior silent-failure mode.
A.3 LinkML visual schema: typed as a structured sub-schema. New
`VisualKind` enum (svg only — `mermaid` was reserved but never
shipped, dropped to keep the enum honest). Path regex tightened
to `^[a-z0-9-]+\.svg$`. Alt minimum length 10, caption required
minimum length 5. TypeScript Visual interface + Question.visual
field added to staffml-vault-types/index.ts.
A.4 Pydantic Visual + Question validators:
- Visual.kind hard-rejects anything but `svg`
- Visual.path enforces the new regex
- Visual.alt min 10 chars, caption required min 5 chars
- Question.model_validator: visual.path MUST resolve to a real
file under interviews/vault/visuals/<track>/. Skipped in
production deploys where the working tree is absent.
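The constraints from A.3 and A.4 can be illustrated with a stdlib-only stand-in. This is not the Pydantic model: `validate_visual` and its `track_dir` parameter are hypothetical, and only the regex, the `svg`-only kind, and the minimum lengths come from the description above.

```python
import re
from pathlib import Path

VISUAL_PATH_RE = re.compile(r"^[a-z0-9-]+\.svg$")


def validate_visual(visual, track_dir=None):
    """Return a list of error strings; an empty list means the visual is valid.

    `track_dir` stands in for interviews/vault/visuals/<track>/. Passing None
    skips the file-existence check, as in deploys without a working tree.
    """
    errors = []
    if visual.get("kind") != "svg":
        errors.append("kind must be 'svg'")
    path = visual.get("path", "")
    path_ok = VISUAL_PATH_RE.fullmatch(path) is not None
    if not path_ok:
        errors.append("path must match ^[a-z0-9-]+\\.svg$")
    if len(visual.get("alt", "")) < 10:
        errors.append("alt must be at least 10 characters")
    if len(visual.get("caption", "")) < 5:
        errors.append("caption must be at least 5 characters")
    if track_dir is not None and path_ok and not (Path(track_dir) / path).is_file():
        errors.append(f"{path} not found under {track_dir}")
    return errors


good = {
    "kind": "svg",
    "path": "mobile-1962.svg",
    "alt": "Graph of the pipeline stages",
    "caption": "Pipeline stages",
}
bad = {"kind": "mermaid", "path": "Bad Name.SVG", "alt": "short", "caption": ""}
```

`re.fullmatch` is used so that a trailing newline in the path cannot slip past the `$` anchor.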
A.5 Registry repair + doctor split:
- tools: repair_registry.py appended 5,269 missing IDs
(the rename refactor at 8a5c3ff3c left the append-only registry
unsynced; this brings disk-coverage to 100%). Header block in
id-registry.yaml documents the rebuild rationale.
- doctor.py: split symmetric `registry-integrity` check into
`disk-coverage` (HARD FAIL if any disk YAML id is unregistered)
and `registry-history` (INFO ONLY for retired ids — the registry
is by design an audit log, retired ids are normal). Pre-existing
`_check_schema_version` bug (`versions == {1}` vs string `"1.0"`)
fixed.
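The asymmetry between the two checks is easy to see as two set differences. This is a sketch only: the function names and return shapes are assumptions, not the doctor.py API.

```python
def disk_coverage(disk_ids, registry_ids):
    """HARD FAIL if any id present on disk is missing from the registry."""
    missing = sorted(set(disk_ids) - set(registry_ids))
    return ("fail" if missing else "ok", missing)


def registry_history(disk_ids, registry_ids):
    """INFO ONLY: ids that are registered but no longer on disk (retired)."""
    retired = sorted(set(registry_ids) - set(disk_ids))
    return ("info", retired)


disk = {"cloud-0001", "cloud-0002"}
registry = {"cloud-0001", "cloud-retired"}

coverage = disk_coverage(disk, registry)
history = registry_history(disk, registry)
```

Here `cloud-0002` fails disk-coverage because it was never registered, while `cloud-retired` is only reported, since an append-only registry is expected to outlive the files it once described.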
A.6 Lint calibration via 4-expert consensus + bloom-canonical
reclassification:
- Spawned 4 experts (Vijay Reddi, Chip Huyen, Jeff Dean,
education-reviewer) on 42 disputed (zone, level) pairs;
consensus-builder aggregated to 15 valid / 19 invalid / 8
borderline.
- User arbitrated 8 borderlines: 7 widen / 1 reclassify.
- Built ZONE_BLOOM_AFFINITY matrix (Education-Reviewer's idea):
every zone admits its dominant Bloom verb + adjacent verbs,
rejects clear hierarchy violations.
- reclassify_zone_bloom_mismatch.py applied 576 deterministic
zone fixes via BLOOM_CANONICAL_ZONE mapping (e.g. fluency+analyze
→ analyze, recall+analyze → analyze, evaluation+apply → implement).
- Question.model_validator(_zone_bloom_compatible): hard-rejects
future zone-bloom mismatches at write time. Generated drafts
can no longer ship a self-contradicting classification.
- ZONE_LEVEL_AFFINITY widened per consensus + arbitration +
post-reclassification adjustments. Lint warnings: 1,308 → 0.
A.7 Chain integrity:
- repair_chains.py: drops chain refs when a chain has <2 published
members (chain ceases to exist), renumbers all members of any
chain whose positions are non-sequential / duplicated /
non-monotonic-by-level. Sort key: level ascending, then old
position, then qid (deterministic).
- validate-vault.py: relaxed sequential check to unique-positions
check. Position gaps from mid-chain deletions are normal; what
matters is uniqueness + bloom-monotonicity (vault check --strict
enforces both from YAML source-of-truth).
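The renumbering order and the relaxed uniqueness check can be sketched like this. It is an illustration under assumptions: members are plain dicts, `level` is an integer, and new positions start at 0; none of these are confirmed details of repair_chains.py or validate-vault.py.

```python
def renumber_chain(members):
    """Reassign sequential positions, ordered by level, old position, then qid."""
    ordered = sorted(
        members, key=lambda m: (m["level"], m["position"], m["qid"])
    )
    return [{**m, "position": i} for i, m in enumerate(ordered)]


def positions_unique(members):
    """Relaxed validator check: gaps are fine, duplicate positions are not."""
    positions = [m["position"] for m in members]
    return len(set(positions)) == len(positions)


chain = [
    {"qid": "cloud-0002", "level": 3, "position": 2},
    {"qid": "cloud-0001", "level": 1, "position": 2},  # duplicate position
    {"qid": "cloud-0003", "level": 1, "position": 0},
]
repaired = renumber_chain(chain)
repaired_order = [(m["qid"], m["position"]) for m in repaired]
```

Using qid as the final tie-breaker makes the result deterministic even when two members share both level and old position.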
A.8 Practice page visual + zoom modal:
- QuestionVisual.tsx: wraps the `<img>` in `<Zoom>` from
react-medium-image-zoom (4 KB). Click image → fullscreen
`<dialog data-rmiz-modal>`; ESC closes. Added test-id
`question-visual-img` for stable selector.
- New Playwright test: 9th in the suite, deep-links cloud-4492,
asserts the dialog opens on click and closes on ESC.
- TypeScript: removed `mermaid` from local Visual types in
corpus.ts and corpus-vault.ts; tsc clean.
A.9 All gates green:
- vault check --strict: 0 errors / 0 invariant failures
- vault lint: 0 errors / 0 warnings (was 1,308 warnings)
- vault codegen --check: artifacts in sync (hash baseline updated)
- vault doctor: 0 fails (registry-history reports info only; git-state
warns only because the working tree was uncommitted before this commit)
- staffml validate-vault: 0 errors / 0 warnings, deployment-ready
- Playwright: 9/9 pass (was 8; +zoom modal test)
- render_visuals: 0 errors (was 2 silent failures pre-A.2)
- tsc: clean
Distribution after reclassification: 9,544 published unchanged;
576 items moved zone via bloom-canonical mapping (full per-item
report at /tmp/reclassify_changes.csv). Chain count 879 → 850
after orphan-singleton drops. release_hash updated.
Carry-forward to next session (Phase B):
- Priority gap closure for parallelism cells + global L4-L6+
(the run that produced this corpus did not close the targeted
cells; B.3 needs specialized prompts per cell-class)
- 120 NEEDS_FIX items from coverage_loop/20260425_150712/ still
carry judge fix_suggestions; spawn fix-agent in Phase C
Audit followed by execution. Three findings, one big move, three minor
cleanups documented for follow-up.
Audit (interviews/vault/audit/2026-04-25-schema-folder-audit.md):
1. Folder structure is correct — flat <track>/<id>.yaml. ARCHITECTURE.md
§3.3 documents that the v0.1 deeper-hierarchy attempt dropped 86
questions and was reverted in v1.0 with sound reasoning. No change.
2. Schema is solid. Required fields populate at 100%; optional fields
populate where they make sense. Three small fixes worth making
later: tighter id regex, drop dead details.question, strip cohort
tags at promotion.
3. The 86 questions dropped on April 18 were ALREADY restored on
April 21 — set-difference of pre-v0.1 vs today's published returns
zero. Nothing to recover.
Rename:
- 4,754 cohort-tagged YAMLs (cloud-fill-*, cloud-cell-*, cloud-r2-*,
cloud-sus-*, cloud-crit-*, cloud-top-*, cloud-new-*, edge-exp-*,
*-balance-*, *-portfolio-*, *-pilot-*, ...) renamed to clean
<track>-NNNN form continuing each track's monotonic sequence.
- Per-track ranges minted:
cloud: cloud-2866..cloud-4486 (1,621 renamed)
edge: edge-0986..edge-2264 (1,279 renamed)
mobile: mobile-0841..mobile-1870 (1,030 renamed)
tinyml: tinyml-0830..tinyml-1541 (712 renamed)
global: global-320..global-431 (112 renamed)
- Bundle rebuilt: 9,224 published (unchanged).
- vault check --strict: 0 load errors, 0 invariant failures.
Chain-breakage analysis (the original concern):
- ZERO of the 3,066 chain question references used cohort-tagged IDs.
All chain refs were already in clean form. The rename has no chain
impact at all — the breakage cost we discounted was zero.
External-link preservation:
- interviews/vault/docs/id-renames-2026-04-25.yaml records every
old→new mapping for forensic lookup.
- interviews/staffml/src/data/id-redirects.json mirrors the map for
the website.
- The practice page now consults this map when ?q=<id> resolves to
nothing — preserves shareable links to the 4,428 published renames.
(326 redirects target draft items and legitimately fall back to the
not-found banner.)
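The lookup order described above (direct hit, then redirect map, then not-found) can be sketched as a small resolver. The real code is in the TypeScript practice page; `resolve_question` and its tuple return are hypothetical, and the placeholder ids below are made up for illustration. Only the cloud-cell-10000 → cloud-2878 pair comes from this commit.

```python
def resolve_question(qid, published, redirects):
    """Return (question, not_found_id); exactly one of the two is None."""
    if qid in published:
        return published[qid], None
    target = redirects.get(qid)
    if target in published:  # a None target is never a key, so this is safe
        return published[target], None
    return None, qid  # caller shows the not-found banner for this id


published = {"cloud-2878": {"id": "cloud-2878"}}
redirects = {
    "cloud-cell-10000": "cloud-2878",        # fixture used by the new test
    "legacy-draft-example": "draft-example",  # hypothetical: target is unpublished
}

hit = resolve_question("cloud-cell-10000", published, redirects)
draft = resolve_question("legacy-draft-example", published, redirects)
typo = resolve_question("no-such-id", published, redirects)
```

A redirect whose target is still a draft falls through to the same not-found path as a typo, which matches the 326 draft-targeted redirects noted above.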
Tests:
- All 7 existing Playwright smoke tests still pass.
- New test added: ?q=<legacy-cohort-id> resolves through the redirect
map (using cloud-cell-10000 → cloud-2878 as the fixture).
- 8 / 8 pass.
The practice page used to silently fall through to a random question when
the ?q=<id> deep-link resolved to nothing in the published bundle. That
hid the failure and broke shared deep-links to drafts or archived items.
Now: when ?q=X has no match, set notFoundQ=X, render a small alert banner
above the question pane that names the bad id and explains the likely
causes (draft awaiting review, archived duplicate, typo). The default
pool is still shown so the page stays usable. The banner is dismissible.
Plus four new Playwright smoke tests:
- visual filter at L5 returns a non-empty pool with an inline SVG
- chained-only filter reduces the pool but stays non-empty
- ?q=cloud-2847 deep-link surfaces the queueing-hockey-stick visual
- ?q=<unknown-id> shows the not-found banner
All 7 smoke tests now pass (3 existing + 4 new) in 12.9s.
Verifies the 68f2ca466 restructure actually works in a browser.
Previously committed without runtime validation; this closes that gap.
Smoke test coverage (tests/practice-smoke.spec.ts):
1. Layout landmarks render without console/pageerror — sticky
Your-task callout, scenario prose, textarea in LEFT column,
Reveal button directly below, Stuck-nudge, Tools panel header.
2. Submit-gradient safeguard fires on low-effort reveal
(<50 chars typed, clicked immediately). Verifies the
Think-longer? modal appears with Keep-thinking + Reveal-anyway
buttons, and that Keep-thinking returns to pre-reveal.
3. Substantive answer (>80 chars) bypasses the guard and
transitions straight to post-reveal with Model Answer visible
and self-assessment buttons ready.
Runtime fix:
- analytics.ts: add 'think_guard_triggered' to AnalyticsEvent union.
The restructure commit fired this event but the type union didn't
carry it, so tsc --noEmit failed. No behavior change beyond the
compile fix — existing consumers (Cloudflare analytics worker)
ignore unknown types gracefully.
Build hygiene:
- staffml/.gitignore: ignore test-results/ and playwright-report/
(per-run artifacts — screenshots, videos, traces on failure).
The tests/ directory itself IS committed.
- playwright.config.ts: baseURL http://localhost:3000, single-worker
serial execution, no retries, traces/screenshots/video on failure.
How to run:
    npx next dev                                # dev server, in one shell
    npx playwright test tests/practice-smoke    # tests, in another shell
All three tests pass against the dev server (chromium, 4.7s total).