12 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
bddac127bc feat(staffml/explore): Phase 2.3 deferred — Primary/All tier filter
Adds the "Primary chains only / All" filter dropdown that was deferred from
Phase 2.3 (ed2ddb51d) so the user could review the bigger UI surface.

Implementation:
  - new selectedTier state, default "primary"
  - filteredQuestions filter: when "primary", drop questions whose
    chain memberships are *all* secondary (questions not in any chain
    pass through unchanged — they're tier-irrelevant).
  - Tier FilterSelect dropdown next to the existing Level filter.
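
The filter rule above can be spelled out as a small predicate. This is a
sketch in Python for brevity; the actual filter lives in the TypeScript
explore page. `keep_question`, the "all" value, and the dictionary shape are
hypothetical, with chain_tiers assumed to map each chain id to "primary" or
"secondary".

```python
from typing import Optional


def keep_question(chain_tiers: Optional[dict], selected_tier: str = "primary") -> bool:
    """Return True if a question survives the tier filter.

    chain_tiers maps chain id -> "primary" | "secondary"; None or {} means
    the question belongs to no chain (assumed shape, for illustration only).
    """
    if selected_tier != "primary":
        return True  # "All chains": nothing is hidden
    if not chain_tiers:
        return True  # unchained questions are tier-irrelevant
    # Drop only when every membership is secondary.
    return any(tier != "secondary" for tier in chain_tiers.values())


questions = {
    "q-unchained": None,
    "q-primary": {"chain-a": "primary"},
    "q-mixed": {"chain-a": "primary", "chain-b": "secondary"},
    "q-secondary-only": {"chain-b": "secondary"},
}
visible = [qid for qid, tiers in questions.items() if keep_question(tiers)]
print(visible)
```

A question with mixed memberships stays visible under the default, since at
least one of its chains is primary.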

Default behaviour intentionally hides secondary-only questions —
matches the rest of the Phase 2 surfaces (practice prefers primary,
ChainBadge shows "alt path" pill on secondary, explore picks primary
chains for the related panel). Users opt into seeing the lenient-pass
questions by switching to "All chains".

Tests: new Playwright case test8_explore_tier_filter:
  - tier filter dropdown rendered
  - switching to "All chains" keeps page interactive (no crash on
    re-filter)
Smoke suite: 19/19 pass.
2026-05-01 17:29:34 -04:00
Vijay Janapa Reddi
202397f594 Merge origin/dev into yaml-audit
Pull in the dev work that landed since yaml-audit was last synced:
  - --legacy-json renamed to --local-json (2b381bb949) — script/doc
    updates needed below in this branch
  - CI workflow refactor (validate-dev / validate-vault now reusable)
  - all-contributors automation, gitignore tightening, codespell list
  - PR #1622 navbar URL rewrite for dev preview
  - PR #1619 clone-size refactor, #1618 milestone3 xor fix, #1617
    perceptron seed, #1616 tito status M3
  - Chapter 9 PDF layout refinement
  - assorted staffml/practice fixes (pickRandom deps, GitHub star gate)

This merges the canonical dev state into yaml-audit so subsequent
work continues on top of the freshest base. Conflicts in
practice/page.tsx + corpus.ts + ARCHITECTURE.md resolved to keep both
sides' additive changes (Phase 2 tier work + dev's later refactors).
2026-05-01 17:11:31 -04:00
Vijay Janapa Reddi
9680e8e9fd feat(vault+staffml): Phase 2 — tier surfacing, schema → TS → UI
Carries the primary/secondary chain tier (from Phase 1) through the
build pipeline into the practice + explore surfaces, so primary chains
are the unmarked default and secondary chains are an opt-in alternative
path the user can deep-link into via ?chain=<id>.

Backend (2.1):
  - legacy_export.py emits chain_tiers per question alongside chain_ids
    and chain_positions; missing chain-tier defaults to "primary".
  - vault build re-run: 2953 chained questions, all carry chain_tiers
    (releaseHash unchanged — new field is additive, doesn't perturb the
    manifest hash inputs).
  - Existing legacy_export tests were stale (asserted on the v1.0 YAML
    chains: field path; v1.1 made chains.json the sidecar source).
    Rewrote them to write chains.json fixtures into tmp_path and added
    chain_tiers assertions, plus a focused
    test_chain_tiers_emitted_per_membership case.
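
A minimal sketch of the per-question emission described above, not the
legacy_export.py code itself. The chains.json layout, the function name
`export_chain_fields`, and the dict shape of chain_positions are assumptions
made for illustration; only the "primary" default comes from this commit.

```python
from __future__ import annotations


def export_chain_fields(qid: str, chains: list[dict]) -> dict:
    """Collect chain_ids, chain_positions and chain_tiers for one question.

    `chains` mimics a parsed chains.json; its layout here is an assumption:
    [{"id": ..., "tier": ..., "members": [{"qid": ..., "position": ...}]}].
    """
    chain_ids: list[str] = []
    positions: dict[str, int] = {}
    tiers: dict[str, str] = {}
    for chain in chains:
        for member in chain["members"]:
            if member["qid"] != qid:
                continue
            chain_ids.append(chain["id"])
            positions[chain["id"]] = member["position"]
            # A chain record without a tier is treated as primary.
            tiers[chain["id"]] = chain.get("tier", "primary")
    return {"chain_ids": chain_ids, "chain_positions": positions, "chain_tiers": tiers}


chains = [
    {"id": "chain-a", "members": [{"qid": "cloud-0001", "position": 0}]},
    {"id": "chain-b", "tier": "secondary",
     "members": [{"qid": "cloud-0001", "position": 2}]},
]
fields = export_chain_fields("cloud-0001", chains)
print(fields["chain_tiers"])
```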

TypeScript (2.2):
  - Question.chain_tiers? (Record<string, "primary"|"secondary">)
  - ChainTier export, ChainInfo.tier required.
  - getChainForQuestion / getAllChainsForQuestion populate tier;
    getAllChains... sorts primary first.
  - New getPrimaryChainForQuestion(qid) helper for default surfaces.

UI (2.3):
  - practice page reads ?chain=<id> URL param; defaults to
    getPrimaryChainForQuestion when unset.
  - ChainBadge gains an inline "alt path" pill when tier=secondary
    (always visible — no click needed).
  - ChainStrip mirrors that pill in the progress row for users who
    expand the strip.
  - Explore page prefers the first non-secondary chain when picking
    activeChainId for the related-questions panel.
  - Deferred to a follow-up commit (intentional, scoped via Progress Log):
    explore-page "Primary only / All" filter; daily/mock routing.

Tests (2.4):
  - test7_tier_aware_chain_routing in chain-and-vault-smoke.mjs:
    secondary reachable via ?chain=, alt-path badge visible on
    secondary, primary regression, alt-path badge ABSENT on primary.
  - Full smoke suite: 17/17 pass (was 13/13).

Validation:
  - vault check --strict: 10,701 loaded, 0 failures
  - vault build --legacy-json: 9438 published, chainCount=879
  - pytest interviews/vault-cli/tests: 74/74
  - npx tsc --noEmit: 0 errors
  - playwright chain-and-vault-smoke: 17/17

Phase 2 complete. Next: Phase 3 (gap-driven authoring; 407-gap backlog).
2026-04-30 20:22:54 -04:00
Vijay Janapa Reddi
5eec8692b3 feat(staffml): make GitHub star the only ask, gated on revealed answers
Replace the daily-FREE_LIMIT modal with a single mission-aligned ask
shown once after 5 lifetime reveals. The gate now retires forever on
star, honor-confirm, or dismiss — no daily cap, no username verify.

- Live stargazer count fetched from the GitHub API (24h cache).
- Copy borrows site/about: "Our only ask. Every star tells universities,
  publishers, and funders that AI engineering education matters."
- Wires the same gate into the gauntlet revealAnswer path so Mock
  Interview no longer bypasses the ask.
- Adds a Playwright smoke covering practice + gauntlet + dismiss
  persistence across reloads.
2026-04-30 09:29:46 -04:00
Vijay Janapa Reddi
e85416931b test(staffml): playwright smoke + chain integration suite
13 checks covering:
  - landing page + vault area rendering
  - topic drilldown question card preview text (regression for the '...' bug)
  - practice page loads + renders chain members
  - chain indicator surfaces on chain-member questions
  - hierarchical layout doesn't break runtime: practice loads
    cloud-0000, edge-0001, mobile-0000, tinyml-0000

All 13 pass against the current build. Run via:
  cd interviews/staffml && npm run dev    # dev server in one shell
  node tests/chain-and-vault-smoke.mjs    # smoke suite in a second shell
2026-04-29 19:06:15 -04:00
Vijay Janapa Reddi
ad8f207b88 Merge origin/dev into yaml-audit — sync with latest dev
# Conflicts:
#	interviews/staffml/src/data/vault-manifest.json
2026-04-29 18:44:07 -04:00
Vijay Janapa Reddi
a237ff2b2f Final Mind-Blowing Release: Expert-refined corpus, 10,701 questions, 0 load errors, and polished UI.
2026-04-29 07:57:00 -04:00
Vijay Janapa Reddi
cae9e40e30 fix(staffml): contain navbar overflow at iPad landscape via overflow-x: clip
EcosystemBar's sticky wrapper was in normal flow, so the 7 left dropdowns
+ 6 right icons (~1253px combined) propagated to body.scrollWidth and
triggered horizontal page scroll on iPad Mini landscape (1024px) and
iPad Pro landscape (1194px).

Quarto sites avoid the same overflow because their navbar lives inside
.fixed-top (position: fixed), which is removed from normal flow. We
match that behavior with overflow-x: clip on the sticky wrapper —
critically NOT overflow-x: hidden, which would force overflow-y: auto
and clip the dropdown menus that extend below the bar.

Also reverts the nav-lg breakpoint workaround (1200 → 992) so StaffML
collapses to hamburger at the same viewport as every other ecosystem
subsite (Bootstrap lg, matches shared/config/navbar-common.yml).

Copy update: "Backed by a 600-page open textbook" → "Backed by the
[Machine Learning Systems](https://mlsysbook.ai) textbook" on
welcome and about, removing the page-count claim.

Verified: tests/responsive-audit.mjs across 13 routes × 8 viewports
(WebKit) — 104/104 pass, zero horizontal scroll.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 19:22:45 -04:00
Vijay Janapa Reddi
542aaf95d2 cleanup(vault): release-ready Phase A — schema hardening + lint calibration + chain repair
Closes the cleanup arc (A.1–A.10 in RESUME_PLAN_RELEASE.md). Every
gate is now green: vault check --strict, vault lint, vault doctor,
vault codegen --check, staffml validate-vault, Playwright (9/9), tsc.

A.1 mobile-1962.svg: renamed `Edge` → `RegEdge` in graphviz source
    (`Edge` is a reserved keyword); SVG renders cleanly. Also fixed
    tinyml-1570.py (missing `import numpy as np`) which the new failure
    log surfaced.

A.2 render_visuals.py: structured per-ID failure log written to
    `_validation_results/render_failures.json` on every run; non-zero
    exit on any per-item crash; new `--fail-fast` and `--failure-log`
    CLI options. Replaces the prior silent-failure mode.

A.3 LinkML visual schema: typed as a structured sub-schema. New
    `VisualKind` enum (svg only — `mermaid` was reserved but never
    shipped, dropped to keep the enum honest). Path regex tightened
    to `^[a-z0-9-]+\.svg$`. Alt text needs at least 10 characters; the
    caption is required, with at least 5. TypeScript Visual interface + Question.visual
    field added to staffml-vault-types/index.ts.

A.4 Pydantic Visual + Question validators:
    - Visual.kind hard-rejects anything but `svg`
    - Visual.path enforces the new regex
    - Visual.alt min 10 chars, caption required min 5 chars
    - Question.model_validator: visual.path MUST resolve to a real
      file under interviews/vault/visuals/<track>/. Skipped in
      production deploys where the working tree is absent.

A.5 Registry repair + doctor split:
    - tools: repair_registry.py appended 5,269 missing IDs
      (the rename refactor at 8a5c3ff3c left the append-only registry
      unsynced; this brings disk-coverage to 100%). Header block in
      id-registry.yaml documents the rebuild rationale.
    - doctor.py: split symmetric `registry-integrity` check into
      `disk-coverage` (HARD FAIL if any disk YAML id is unregistered)
      and `registry-history` (INFO ONLY for retired ids — the registry
      is by design an audit log, retired ids are normal). Pre-existing
      `_check_schema_version` bug (`versions == {1}` vs string `"1.0"`)
      fixed.
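
The `_check_schema_version` bug is the kind of failure a cross-type set
comparison produces. A minimal illustration, assuming the versions are stored
as the string "1.0"; the variable names and the corrected comparison are
hypothetical, not the doctor.py code.

```python
# Distinct schema_version values collected from disk (assumed string form).
versions = {"1.0"}

buggy_ok = versions == {1}       # int 1 never equals the string "1.0"
fixed_ok = versions == {"1.0"}   # compare against the stored string form

print(buggy_ok, fixed_ok)  # prints: False True
```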

A.6 Lint calibration via 4-expert consensus + bloom-canonical
    reclassification:
    - Spawned 4 experts (Vijay Reddi, Chip Huyen, Jeff Dean,
      education-reviewer) on 42 disputed (zone, level) pairs;
      consensus-builder aggregated to 15 valid / 19 invalid / 8
      borderline.
    - User arbitrated 8 borderlines: 7 widen / 1 reclassify.
    - Built ZONE_BLOOM_AFFINITY matrix (Education-Reviewer's idea):
      every zone admits its dominant Bloom verb + adjacent verbs,
      rejects clear hierarchy violations.
    - reclassify_zone_bloom_mismatch.py applied 576 deterministic
      zone fixes via BLOOM_CANONICAL_ZONE mapping (e.g. fluency+analyze
      → analyze, recall+analyze → analyze, evaluation+apply → implement).
    - Question.model_validator(_zone_bloom_compatible): hard-rejects
      future zone-bloom mismatches at write time. Generated drafts
      can no longer ship a self-contradicting classification.
    - ZONE_LEVEL_AFFINITY widened per consensus + arbitration +
      post-reclassification adjustments. Lint warnings: 1,308 → 0.

A.7 Chain integrity:
    - repair_chains.py: drops chain refs when a chain has <2 published
      members (chain ceases to exist), renumbers all members of any
      chain whose positions are non-sequential / duplicated /
      non-monotonic-by-level. Sort key: level ascending, then old
      position, then qid (deterministic).
    - validate-vault.py: relaxed sequential check to unique-positions
      check. Position gaps from mid-chain deletions are normal; what
      matters is uniqueness + bloom-monotonicity (vault check --strict
      enforces both from YAML source-of-truth).
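
The renumbering order and the relaxed uniqueness check can be sketched as
follows. The function names and record shape are hypothetical, `level` is
assumed to be numeric (e.g. 3 for L3), and the starting position `start` is an
assumption, since the first position used by repair_chains.py is not stated
here.

```python
from __future__ import annotations


def renumber_chain(members: list[dict], start: int = 0) -> list[dict]:
    """Reorder by (level, old position, qid) and assign sequential positions."""
    ordered = sorted(members, key=lambda m: (m["level"], m["position"], m["qid"]))
    return [{**m, "position": start + i} for i, m in enumerate(ordered)]


def positions_unique(members: list[dict]) -> bool:
    """Relaxed check: gaps are fine, duplicates are not."""
    positions = [m["position"] for m in members]
    return len(positions) == len(set(positions))


chain = [
    {"qid": "cloud-0003", "level": 4, "position": 1},
    {"qid": "cloud-0001", "level": 2, "position": 1},  # duplicated position
    {"qid": "cloud-0002", "level": 2, "position": 1},
]
repaired = renumber_chain(chain)
print([(m["qid"], m["position"]) for m in repaired])
print(positions_unique(chain), positions_unique(repaired))
```

Using qid as the final tie-breaker keeps the result deterministic when level
and old position are equal, which is the case the sort key above calls out.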

A.8 Practice page visual + zoom modal:
    - QuestionVisual.tsx: wraps the `<img>` in `<Zoom>` from
      react-medium-image-zoom (4 KB). Click image → fullscreen
      `<dialog data-rmiz-modal>`; ESC closes. Added test-id
      `question-visual-img` for stable selector.
    - New Playwright test: 9th in the suite, deep-links cloud-4492,
      asserts the dialog opens on click and closes on ESC.
    - TypeScript: removed `mermaid` from local Visual types in
      corpus.ts and corpus-vault.ts; tsc clean.

A.9 All gates green:
    - vault check --strict: 0 errors / 0 invariant failures
    - vault lint: 0 errors / 0 warnings (was 1,308 warnings)
    - vault codegen --check: artifacts in sync (hash baseline updated)
    - vault doctor: 0 fails (registry-history is info-only; git-state
      warns only about the uncommitted state preceding this commit)
    - staffml validate-vault: 0 errors / 0 warnings, deployment-ready
    - Playwright: 9/9 pass (was 8; +zoom modal test)
    - render_visuals: 0 errors (was 2 silent failures pre-A.2)
    - tsc: clean

Distribution after reclassification: 9,544 published unchanged;
576 items moved zone via bloom-canonical mapping (full per-item
report at /tmp/reclassify_changes.csv). Chain count 879 → 850
after orphan-singleton drops. release_hash updated.

Carry-forward to next session (Phase B):
- Priority gap closure for parallelism cells + global L4-L6+
  (the run that produced this corpus did not close the targeted
  cells; B.3 needs specialized prompts per cell-class)
- 120 NEEDS_FIX items from coverage_loop/20260425_150712/ still
  carry judge fix_suggestions; spawn fix-agent in Phase C
2026-04-25 15:12:51 -04:00
Vijay Janapa Reddi
8a5c3ff3c5 refactor(vault): rename 4,754 cohort-tagged IDs to clean <track>-NNNN form
Audit followed by execution. Three findings, one big move, three minor
cleanups documented for follow-up.

Audit (interviews/vault/audit/2026-04-25-schema-folder-audit.md):
1. Folder structure is correct — flat <track>/<id>.yaml. ARCHITECTURE.md
   §3.3 documents that the v0.1 deeper-hierarchy attempt dropped 86
   questions and was reverted in v1.0 with sound reasoning. No change.
2. Schema is solid. Required fields populate at 100%; optional fields
   populate where they make sense. Three small fixes worth making
   later: tighter id regex, drop dead details.question, strip cohort
   tags at promotion.
3. The 86 questions dropped on April 18 were ALREADY restored on
   April 21 — set-difference of pre-v0.1 vs today's published returns
   zero. Nothing to recover.

Rename:
- 4,754 cohort-tagged YAMLs (cloud-fill-*, cloud-cell-*, cloud-r2-*,
  cloud-sus-*, cloud-crit-*, cloud-top-*, cloud-new-*, edge-exp-*,
  *-balance-*, *-portfolio-*, *-pilot-*, ...) renamed to clean
  <track>-NNNN form continuing each track's monotonic sequence.
- Per-track ranges minted:
    cloud:  cloud-2866..cloud-4486     (1,621 renamed)
    edge:   edge-0986..edge-2264       (1,279 renamed)
    mobile: mobile-0841..mobile-1870   (1,030 renamed)
    tinyml: tinyml-0830..tinyml-1541   (712 renamed)
    global: global-320..global-431     (112 renamed)
- Bundle rebuilt: 9,224 published (unchanged).
- vault check --strict: 0 load errors, 0 invariant failures.

Chain-breakage analysis (the original concern):
- ZERO of the 3,066 chain question references used cohort-tagged IDs.
  All chain refs were already in clean form. The rename has no chain
  impact at all — the breakage cost we discounted was zero.

External-link preservation:
- interviews/vault/docs/id-renames-2026-04-25.yaml records every
  old→new mapping for forensic lookup.
- interviews/staffml/src/data/id-redirects.json mirrors the map for
  the website.
- The practice page now consults this map when ?q=<id> resolves to
  nothing — preserves shareable links to the 4,428 published renames.
  (326 redirects target draft items and legitimately fall back to the
  not-found banner.)

Tests:
- All 7 existing Playwright smoke tests still pass.
- New test added: ?q=<legacy-cohort-id> resolves through the redirect
  map (using cloud-cell-10000 → cloud-2878 as the fixture).
- 8 / 8 pass.
2026-04-25 10:32:20 -04:00
Vijay Janapa Reddi
0cf416fa1b fix(staffml): not-found banner for ?q=<unknown> + Playwright tests for filters
The practice page used to silently fall through to a random question when
the ?q=<id> deep-link resolved to nothing in the published bundle. That
hid the failure and broke shared deep-links to drafts or archived items.

Now: when ?q=X has no match, set notFoundQ=X, render a small alert banner
above the question pane that names the bad id and explains the likely
causes (draft awaiting review, archived duplicate, typo). The default
pool is still shown so the page stays usable. The banner is dismissible.

Plus four new Playwright smoke tests:
- visual filter at L5 returns a non-empty pool with an inline SVG
- chained-only filter reduces the pool but stays non-empty
- ?q=cloud-2847 deep-link surfaces the queueing-hockey-stick visual
- ?q=<unknown-id> shows the not-found banner

All 7 smoke tests now pass (3 existing + 4 new) in 12.9s.
2026-04-25 09:47:20 -04:00
Vijay Janapa Reddi
f5e95ef34a test(staffml): Playwright smoke tests for restructured practice page + fix missing analytics event type
Verifies the 68f2ca466 restructure actually works in a browser.
That restructure was committed without runtime validation; this closes that gap.

Smoke test coverage (tests/practice-smoke.spec.ts):
  1. Layout landmarks render without console errors or pageerror events: sticky
     Your-task callout, scenario prose, textarea in LEFT column,
     Reveal button directly below, Stuck-nudge, Tools panel header.
  2. Submit-gradient safeguard fires on low-effort reveal
     (<50 chars typed, clicked immediately). Verifies the
     Think-longer? modal appears with Keep-thinking + Reveal-anyway
     buttons, and that Keep-thinking returns to pre-reveal.
  3. Substantive answer (>80 chars) bypasses the guard and
     transitions straight to post-reveal with Model Answer visible
     and self-assessment buttons ready.

Runtime fix:
- analytics.ts: add 'think_guard_triggered' to AnalyticsEvent union.
  The restructure commit fired this event but the type union didn't
  carry it, so tsc --noEmit failed. No behavior change beyond the
  compile fix — existing consumers (Cloudflare analytics worker)
  ignore unknown types gracefully.

Build hygiene:
- staffml/.gitignore: ignore test-results/ and playwright-report/
  (per-run artifacts — screenshots, videos, traces on failure).
  The tests/ directory itself IS committed.
- playwright.config.ts: baseURL http://localhost:3000, single-worker
  serial execution, no retries, traces/screenshots/video on failure.

How to run:
  npx next dev &                                # dev server in one shell
  npx playwright test tests/practice-smoke      # tests in another

All three tests pass against the dev server (chromium, 4.7s total).
2026-04-24 16:21:38 -04:00