cs249r_book

mirror of https://github.com/harvard-edge/cs249r_book.git synced 2026-05-07 02:03:55 -05:00

Author	SHA1	Message	Date
Vijay Janapa Reddi	bddac127bc	feat(staffml/explore): Phase 2.3 deferred — Primary/All tier filter The "Primary chains only / All" filter dropdown that was punted from Phase 2.3 (`ed2ddb51d`) so the user could review the bigger UI surface. Implementation: - new selectedTier state, default "primary" - filteredQuestions filter: when "primary", drop questions whose chain memberships are all secondary (questions not in any chain pass through unchanged — they're tier-irrelevant). - Tier FilterSelect dropdown next to the existing Level filter. Default behaviour intentionally hides secondary-only questions — matches the rest of the Phase 2 surfaces (practice prefers primary, ChainBadge shows "alt path" pill on secondary, explore picks primary chains for the related panel). Users opt into seeing the lenient-pass questions by switching to "All chains". Tests: new playwright case test8_explore_tier_filter: - tier filter dropdown rendered - switching to "All chains" keeps page interactive (no crash on re-filter) Smoke suite: 19/19 pass.	2026-05-01 17:29:34 -04:00
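
Note on `bddac127bc`: the filter rule described above can be sketched as follows. This is a Python illustration only; the real filter lives in the TypeScript explore page. The `Question` shape, the sample ids, and the function name `filter_by_tier` are assumptions, and only the `chain_tiers` field name comes from the log.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Question:
    # Hypothetical minimal shape; only chain_tiers mirrors a field named in the log.
    id: str
    chain_tiers: dict[str, str] = field(default_factory=dict)  # chain id -> tier


def filter_by_tier(questions: list[Question], selected_tier: str = "primary") -> list[Question]:
    """Return all questions for "all"; for "primary", drop secondary-only ones."""
    if selected_tier != "primary":
        return list(questions)
    kept = []
    for q in questions:
        # Questions in no chain are tier-irrelevant and pass through unchanged.
        if not q.chain_tiers:
            kept.append(q)
        # Keep a chained question if at least one membership is not secondary.
        elif any(tier != "secondary" for tier in q.chain_tiers.values()):
            kept.append(q)
    return kept


# Illustrative data (ids are made up).
questions = [
    Question("q-unchained"),
    Question("q-primary", {"chain-a": "primary"}),
    Question("q-mixed", {"chain-a": "primary", "chain-b": "secondary"}),
    Question("q-secondary-only", {"chain-b": "secondary"}),
]
primary_ids = [q.id for q in filter_by_tier(questions, "primary")]
all_ids = [q.id for q in filter_by_tier(questions, "all")]
```

With the default tier, only `q-secondary-only` is hidden; switching to "all" restores it.
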
Vijay Janapa Reddi	202397f594	Merge origin/dev into yaml-audit Pull in the dev work that landed since yaml-audit was last synced: - --legacy-json renamed to --local-json (`2b381bb949`) — script/doc updates needed below in this branch - CI workflow refactor (validate-dev / validate-vault now reusable) - all-contributors automation, gitignore tightening, codespell list - PR #1622 navbar URL rewrite for dev preview - PR #1619 clone-size refactor, #1618 milestone3 xor fix, #1617 perceptron seed, #1616 tito status M3 - Chapter 9 PDF layout refinement - assorted staffml/practice fixes (pickRandom deps, GitHub star gate) This merges the canonical dev state into yaml-audit so subsequent work continues on top of the freshest base. Conflicts in practice/page.tsx + corpus.ts + ARCHITECTURE.md resolved to keep both sides' additive changes (Phase 2 tier work + dev's later refactors).	2026-05-01 17:11:31 -04:00
Vijay Janapa Reddi	9680e8e9fd	feat(vault+staffml): Phase 2 — tier surfacing, schema → TS → UI Carries the primary/secondary chain tier (from Phase 1) through the build pipeline into the practice + explore surfaces, so primary chains are the unmarked default and secondary chains are an opt-in alternative path the user can deep-link into via ?chain=<id>. Backend (2.1): - legacy_export.py emits chain_tiers per question alongside chain_ids and chain_positions; missing chain-tier defaults to "primary". - vault build re-run: 2953 chained questions, all carry chain_tiers (releaseHash unchanged — new field is additive, doesn't perturb the manifest hash inputs). - Existing legacy_export tests were stale (asserted on the v1.0 YAML chains: field path; v1.1 made chains.json the sidecar source). Rewrote them to write chains.json fixtures into tmp_path and added chain_tiers assertions, plus a focused test_chain_tiers_emitted_per_membership case. TypeScript (2.2): - Question.chain_tiers? (Record<string, "primary"|"secondary">) - ChainTier export, ChainInfo.tier required. - getChainForQuestion / getAllChainsForQuestion populate tier; getAllChains... sorts primary first. - New getPrimaryChainForQuestion(qid) helper for default surfaces. UI (2.3): - practice page reads ?chain=<id> URL param; defaults to getPrimaryChainForQuestion when unset. - ChainBadge gains an inline "alt path" pill when tier=secondary (always visible — no click needed). - ChainStrip mirrors that pill in the progress row for users who expand the strip. - Explore page prefers the first non-secondary chain when picking activeChainId for the related-questions panel. - Deferred to a follow-up commit (intentional, scoped via Progress Log): explore-page "Primary only / All" filter; daily/mock routing. Tests (2.4): - test7_tier_aware_chain_routing in chain-and-vault-smoke.mjs: secondary reachable via ?chain=, alt-path badge visible on secondary, primary regression, alt-path badge ABSENT on primary. - Full smoke suite: 17/17 pass (was 13/13). Validation: - vault check --strict: 10,701 loaded, 0 failures - vault build --legacy-json: 9438 published, chainCount=879 - pytest interviews/vault-cli/tests: 74/74 - npx tsc --noEmit: 0 errors - playwright chain-and-vault-smoke: 17/17 Phase 2 complete. Next: Phase 3 (gap-driven authoring; 407-gap backlog).	2026-04-30 20:22:54 -04:00
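
Note on `9680e8e9fd`: a minimal sketch of the backend step (2.1), assuming a hypothetical chains.json layout. The record keys `chain_id`, `tier`, `questions`, `id`, and `position`, the helper name `chain_fields_for`, and the sample ids are illustrative assumptions, not the actual legacy_export.py code. What comes from the log is the three emitted field names and the default of "primary" when a chain carries no tier.

```python
def chain_fields_for(qid, chains):
    """Collect chain_ids, chain_positions and chain_tiers for one question."""
    chain_ids = []
    chain_positions = {}
    chain_tiers = {}
    for chain in chains:
        for member in chain["questions"]:
            if member["id"] != qid:
                continue
            cid = chain["chain_id"]
            chain_ids.append(cid)
            chain_positions[cid] = member["position"]
            # A chain without an explicit tier is treated as primary.
            chain_tiers[cid] = chain.get("tier", "primary")
    return {
        "chain_ids": chain_ids,
        "chain_positions": chain_positions,
        "chain_tiers": chain_tiers,
    }


# Hypothetical sidecar content: chain-a has no tier key, so it defaults to primary.
chains = [
    {"chain_id": "chain-b", "tier": "secondary",
     "questions": [{"id": "q-1", "position": 0}, {"id": "q-2", "position": 1}]},
    {"chain_id": "chain-a",
     "questions": [{"id": "q-1", "position": 2}]},
]
fields = chain_fields_for("q-1", chains)

# Stable sort that lists primary chains first, as the TypeScript helpers are said to do.
primary_first = sorted(
    fields["chain_ids"],
    key=lambda cid: fields["chain_tiers"][cid] != "primary",
)
```

Because the new field is emitted next to the existing ones, consumers that ignore `chain_tiers` keep working unchanged.
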
Vijay Janapa Reddi	5eec8692b3	feat(staffml): make GitHub star the only ask, gated on revealed answers Replace the daily-FREE_LIMIT modal with a single mission-aligned ask shown once after 5 lifetime reveals. The gate now retires forever on star, honor-confirm, or dismiss — no daily cap, no username verify. - Live stargazer count fetched from the GitHub API (24h cache). - Copy borrows site/about: "Our only ask. Every star tells universities, publishers, and funders that AI engineering education matters." - Wires the same gate into the gauntlet revealAnswer path so Mock Interview no longer bypasses the ask. - Adds a Playwright smoke covering practice + gauntlet + dismiss persistence across reloads.	2026-04-30 09:29:46 -04:00
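
Note on `5eec8692b3`: the gate behaviour described above can be modelled as a small state machine. This is a sketch under assumptions: the class and method names are hypothetical, the real gate is part of the TypeScript app, and the choice to show the ask on the fifth reveal itself (rather than on the next one) is a guess at what "after 5 lifetime reveals" means.

```python
from dataclasses import dataclass

REVEALS_BEFORE_ASK = 5  # threshold quoted in the commit message
RETIRING_ACTIONS = {"star", "honor-confirm", "dismiss"}


@dataclass
class StarGate:
    lifetime_reveals: int = 0
    retired: bool = False

    def record_reveal(self) -> bool:
        """Count one revealed answer; return True if the ask should be shown now."""
        self.lifetime_reveals += 1
        return (not self.retired) and self.lifetime_reveals >= REVEALS_BEFORE_ASK

    def resolve(self, action: str) -> None:
        """Any of the three outcomes retires the gate permanently."""
        if action not in RETIRING_ACTIONS:
            raise ValueError(f"unknown action: {action}")
        self.retired = True


gate = StarGate()
shown = [gate.record_reveal() for _ in range(5)]
gate.resolve("dismiss")
shown_after_dismiss = gate.record_reveal()
```

Routing both the practice page and the gauntlet through one `record_reveal` call is what keeps Mock Interview from bypassing the ask.
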
Vijay Janapa Reddi	e85416931b	test(staffml): playwright smoke + chain integration suite 13 checks covering: - landing page + vault area rendering - topic drilldown question card preview text (regression for the '...' bug) - practice page loads + renders chain members - chain indicator surfaces on chain-member questions - hierarchical layout doesn't break runtime: practice loads cloud-0000, edge-0001, mobile-0000, tinyml-0000 All 13 pass against current build. Run via: cd interviews/staffml && npm run dev node tests/chain-and-vault-smoke.mjs	2026-04-29 19:06:15 -04:00
Vijay Janapa Reddi	ad8f207b88	Merge origin/dev into yaml-audit — sync with latest dev # Conflicts: # interviews/staffml/src/data/vault-manifest.json	2026-04-29 18:44:07 -04:00
Vijay Janapa Reddi	a237ff2b2f	Final Mind-Blowing Release: Expert-refined corpus, 10,701 questions, 0 load errors, and polished UI.	2026-04-29 07:57:00 -04:00
Vijay Janapa Reddi	cae9e40e30	fix(staffml): contain navbar overflow at iPad landscape via overflow-x: clip EcosystemBar's sticky wrapper was in normal flow, so the 7 left dropdowns + 6 right icons (~1253px combined) propagated to body.scrollWidth and triggered horizontal page scroll on iPad Mini landscape (1024px) and iPad Pro landscape (1194px). Quarto sites avoid the same overflow because their navbar lives inside .fixed-top (position: fixed), which is removed from normal flow. We match that behavior with overflow-x: clip on the sticky wrapper — critically NOT overflow-x: hidden, which would force overflow-y: auto and clip the dropdown menus that extend below the bar. Also reverts the nav-lg breakpoint workaround (1200 → 992) so StaffML collapses to hamburger at the same viewport as every other ecosystem subsite (Bootstrap lg, matches shared/config/navbar-common.yml). Copy update: "Backed by a 600-page open textbook" → "Backed by the [Machine Learning Systems](https://mlsysbook.ai) textbook" on welcome and about, removing the page-count claim. Verified: tests/responsive-audit.mjs across 13 routes × 8 viewports (WebKit) — 104/104 pass, zero horizontal scroll. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 19:22:45 -04:00
Vijay Janapa Reddi	542aaf95d2	cleanup(vault): release-ready Phase A — schema hardening + lint calibration + chain repair Closes the cleanup arc (A.1–A.10 in RESUME_PLAN_RELEASE.md). Every gate is now green: vault check --strict, vault lint, vault doctor, vault codegen --check, staffml validate-vault, Playwright (9/9), tsc. A.1 mobile-1962.svg: renamed `Edge` → `RegEdge` in graphviz source (`Edge` is a reserved keyword); SVG renders cleanly. Also fixed tinyml-1570.py (missing `import numpy as np`) which the new failure log surfaced. A.2 render_visuals.py: structured per-ID failure log written to `_validation_results/render_failures.json` on every run; non-zero exit on any per-item crash; new `--fail-fast` and `--failure-log` CLI options. Replaces the prior silent-failure mode. A.3 LinkML visual schema: typed as a structured sub-schema. New `VisualKind` enum (svg only — `mermaid` was reserved but never shipped, dropped to keep the enum honest). Path regex tightened to `^[a-z0-9-]+\.svg$`. Alt minimum length 10, caption required minimum length 5. TypeScript Visual interface + Question.visual field added to staffml-vault-types/index.ts. A.4 Pydantic Visual + Question validators: - Visual.kind hard-rejects anything but `svg` - Visual.path enforces the new regex - Visual.alt min 10 chars, caption required min 5 chars - Question.model_validator: visual.path MUST resolve to a real file under interviews/vault/visuals/<track>/. Skipped in production deploys where the working tree is absent. A.5 Registry repair + doctor split: - tools: repair_registry.py appended 5,269 missing IDs (the rename refactor at `8a5c3ff3c` left the append-only registry unsynced; this brings disk-coverage to 100%). Header block in id-registry.yaml documents the rebuild rationale. - doctor.py: split symmetric `registry-integrity` check into `disk-coverage` (HARD FAIL if any disk YAML id is unregistered) and `registry-history` (INFO ONLY for retired ids — the registry is by design an audit log, retired ids are normal). Pre-existing `_check_schema_version` bug (`versions == {1}` vs string `"1.0"`) fixed. A.6 Lint calibration via 4-expert consensus + bloom-canonical reclassification: - Spawned 4 experts (Vijay Reddi, Chip Huyen, Jeff Dean, education-reviewer) on 42 disputed (zone, level) pairs; consensus-builder aggregated to 15 valid / 19 invalid / 8 borderline. - User arbitrated 8 borderlines: 7 widen / 1 reclassify. - Built ZONE_BLOOM_AFFINITY matrix (Education-Reviewer's idea): every zone admits its dominant Bloom verb + adjacent verbs, rejects clear hierarchy violations. - reclassify_zone_bloom_mismatch.py applied 576 deterministic zone fixes via BLOOM_CANONICAL_ZONE mapping (e.g. fluency+analyze → analyze, recall+analyze → analyze, evaluation+apply → implement). - Question.model_validator(_zone_bloom_compatible): hard-rejects future zone-bloom mismatches at write time. Generated drafts can no longer ship a self-contradicting classification. - ZONE_LEVEL_AFFINITY widened per consensus + arbitration + post-reclassification adjustments. Lint warnings: 1,308 → 0. A.7 Chain integrity: - repair_chains.py: drops chain refs when a chain has <2 published members (chain ceases to exist), renumbers all members of any chain whose positions are non-sequential / duplicated / non-monotonic-by-level. Sort key: level ascending, then old position, then qid (deterministic). - validate-vault.py: relaxed sequential check to unique-positions check. Position gaps from mid-chain deletions are normal; what matters is uniqueness + bloom-monotonicity (vault check --strict enforces both from YAML source-of-truth). A.8 Practice page visual + zoom modal: - QuestionVisual.tsx: wraps the `<img>` in `<Zoom>` from react-medium-image-zoom (4 KB). Click image → fullscreen `<dialog data-rmiz-modal>`; ESC closes. Added test-id `question-visual-img` for stable selector. - New Playwright test: 9th in the suite, deep-links cloud-4492, asserts the dialog opens on click and closes on ESC. - TypeScript: removed `mermaid` from local Visual types in corpus.ts and corpus-vault.ts; tsc clean. A.9 All gates green: - vault check --strict: 0 errors / 0 invariant failures - vault lint: 0 errors / 0 warnings (was 1,308 warnings) - vault codegen --check: artifacts in sync (hash baseline updated) - vault doctor: 0 fails (registry-history info, git-state warn on uncommitted state-pre-this-commit) - staffml validate-vault: 0 errors / 0 warnings, deployment-ready - Playwright: 9/9 pass (was 8; +zoom modal test) - render_visuals: 0 errors (was 2 silent failures pre-A.2) - tsc: clean Distribution after reclassification: 9,544 published unchanged; 576 items moved zone via bloom-canonical mapping (full per-item report at /tmp/reclassify_changes.csv). Chain count 879 → 850 after orphan-singleton drops. release_hash updated. Carry-forward to next session (Phase B): - Priority gap closure for parallelism cells + global L4-L6+ (the run that produced this corpus did not close the targeted cells; B.3 needs specialized prompts per cell-class) - 120 NEEDS_FIX items from coverage_loop/20260425_150712/ still carry judge fix_suggestions; spawn fix-agent in Phase C	2026-04-25 15:12:51 -04:00
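
Note on `542aaf95d2`, item A.7: the repair rule can be sketched as below. This is not the actual repair_chains.py. The member record keys (`qid`, `level`, `position`, `published`), integer levels, and 0-based positions are assumptions made for illustration. The drop rule and the sort key (level, then old position, then qid) are taken from the commit message.

```python
def needs_renumber(members):
    """True if positions are not 0..n-1 or levels decrease along the chain."""
    by_pos = sorted(members, key=lambda m: m["position"])
    positions = [m["position"] for m in by_pos]
    levels = [m["level"] for m in by_pos]
    sequential = positions == list(range(len(members)))  # also catches duplicates
    monotonic = all(a <= b for a, b in zip(levels, levels[1:]))
    return not (sequential and monotonic)


def repair_chain(members):
    """Return the repaired member list, or None when the chain ceases to exist."""
    published = [m for m in members if m["published"]]
    if len(published) < 2:
        return None  # fewer than two published members: drop the chain refs
    if not needs_renumber(published):
        return published
    # Deterministic order: level ascending, then old position, then qid.
    ordered = sorted(published, key=lambda m: (m["level"], m["position"], m["qid"]))
    return [{**m, "position": i} for i, m in enumerate(ordered)]


# Illustrative chain with a duplicated position, a gap, and one unpublished member.
chain = [
    {"qid": "q-b", "level": 3, "position": 1, "published": True},
    {"qid": "q-a", "level": 3, "position": 1, "published": True},
    {"qid": "q-c", "level": 2, "position": 4, "published": True},
    {"qid": "q-d", "level": 5, "position": 0, "published": False},
]
repaired = repair_chain(chain)
repaired_order = [(m["qid"], m["position"]) for m in repaired]

singleton = repair_chain([
    {"qid": "q-x", "level": 1, "position": 0, "published": True},
    {"qid": "q-y", "level": 2, "position": 1, "published": False},
])
```

Using qid as the final tie-breaker is what makes the renumbering reproducible across runs when two members share a level and an old position.
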
Vijay Janapa Reddi	8a5c3ff3c5	refactor(vault): rename 4,754 cohort-tagged IDs to clean <track>-NNNN form Audit followed by execution. Three findings, one big move, three minor cleanups documented for follow-up. Audit (interviews/vault/audit/2026-04-25-schema-folder-audit.md): 1. Folder structure is correct — flat <track>/<id>.yaml. ARCHITECTURE.md §3.3 documents that the v0.1 deeper-hierarchy attempt dropped 86 questions and was reverted in v1.0 with sound reasoning. No change. 2. Schema is solid. Required fields populate at 100%; optional fields populate where they make sense. Three small fixes worth making later: tighter id regex, drop dead details.question, strip cohort tags at promotion. 3. The 86 questions dropped on April 18 were ALREADY restored on April 21 — set-difference of pre-v0.1 vs today's published returns zero. Nothing to recover. Rename: - 4,754 cohort-tagged YAMLs (cloud-fill-, cloud-cell-, cloud-r2-, cloud-sus-, cloud-crit-, cloud-top-, cloud-new-, edge-exp-, -balance-, -portfolio-, -pilot-, ...) renamed to clean <track>-NNNN form continuing each track's monotonic sequence. - Per-track ranges minted: cloud: cloud-2866..cloud-4486 (1,621 renamed) edge: edge-0986..edge-2264 (1,279 renamed) mobile: mobile-0841..mobile-1870 (1,030 renamed) tinyml: tinyml-0830..tinyml-1541 (712 renamed) global: global-320..global-431 (112 renamed) - Bundle rebuilt: 9,224 published (unchanged). - vault check --strict: 0 load errors, 0 invariant failures. Chain-breakage analysis (the original concern): - ZERO of the 3,066 chain question references used cohort-tagged IDs. All chain refs were already in clean form. The rename has no chain impact at all — the breakage cost we discounted was zero. External-link preservation: - interviews/vault/docs/id-renames-2026-04-25.yaml records every old→new mapping for forensic lookup. - interviews/staffml/src/data/id-redirects.json mirrors the map for the website. - The practice page now consults this map when ?q=<id> resolves to nothing — preserves shareable links to the 4,428 published renames. (326 redirects target draft items and legitimately fall back to the not-found banner.) Tests: - All 7 existing Playwright smoke tests still pass. - New test added: ?q=<legacy-cohort-id> resolves through the redirect map (using cloud-cell-10000 → cloud-2878 as the fixture). - 8 / 8 pass.	2026-04-25 10:32:20 -04:00
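
Note on `8a5c3ff3c5`: the deep-link resolution order described above, as a Python sketch. The real lookup is in the TypeScript practice page. The function name `resolve_question_id`, the return shape, and the `cloud-pilot-0001` / `cloud-9999` pair are hypothetical. The `cloud-cell-10000` → `cloud-2878` pair is the fixture named in the log.

```python
def resolve_question_id(requested, published_ids, redirects):
    """Return (resolved_id, not_found_id); exactly one of the two is None."""
    # 1. A direct hit in the published bundle wins.
    if requested in published_ids:
        return requested, None
    # 2. Otherwise consult the old -> new rename map.
    target = redirects.get(requested)
    if target in published_ids:
        return target, None
    # 3. Unknown id, or a redirect that points at an unpublished draft:
    #    report the requested id so the page can show the not-found banner.
    return None, requested


published_ids = {"cloud-2847", "cloud-2878"}
redirects = {
    "cloud-cell-10000": "cloud-2878",  # fixture from the commit message
    "cloud-pilot-0001": "cloud-9999",  # made-up redirect to a draft item
}

direct = resolve_question_id("cloud-2847", published_ids, redirects)
renamed = resolve_question_id("cloud-cell-10000", published_ids, redirects)
draft = resolve_question_id("cloud-pilot-0001", published_ids, redirects)
```

Checking the published set before the redirect map keeps current ids on the fast path and means a stale map entry can never shadow a live question.
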
Vijay Janapa Reddi	0cf416fa1b	fix(staffml): not-found banner for ?q=<unknown> + Playwright tests for filters The practice page used to silently fall through to a random question when the ?q=<id> deep-link resolved to nothing in the published bundle. That hid the failure and broke shared deep-links to drafts or archived items. Now: when ?q=X has no match, set notFoundQ=X, render a small alert banner above the question pane that names the bad id and explains the likely causes (draft awaiting review, archived duplicate, typo). The default pool is still shown so the page stays usable. The banner is dismissible. Plus four new Playwright smoke tests: - visual filter at L5 returns a non-empty pool with an inline SVG - chained-only filter reduces the pool but stays non-empty - ?q=cloud-2847 deep-link surfaces the queueing-hockey-stick visual - ?q=<unknown-id> shows the not-found banner All 7 smoke tests now pass (3 existing + 4 new) in 12.9s.	2026-04-25 09:47:20 -04:00
Vijay Janapa Reddi	f5e95ef34a	test(staffml): Playwright smoke tests for restructured practice page + fix missing analytics event type Verifies the `68f2ca466` restructure actually works in a browser. Previously committed without runtime validation; this closes that gap. Smoke test coverage (tests/practice-smoke.spec.ts): 1. Layout landmarks render without console/pageerror — sticky Your-task callout, scenario prose, textarea in LEFT column, Reveal button directly below, Stuck-nudge, Tools panel header. 2. Submit-gradient safeguard fires on low-effort reveal (<50 chars typed, clicked immediately). Verifies the Think-longer? modal appears with Keep-thinking + Reveal-anyway buttons, and that Keep-thinking returns to pre-reveal. 3. Substantive answer (>80 chars) bypasses the guard and transitions straight to post-reveal with Model Answer visible and self-assessment buttons ready. Runtime fix: - analytics.ts: add 'think_guard_triggered' to AnalyticsEvent union. The restructure commit fired this event but the type union didn't carry it, so tsc --noEmit failed. No behavior change beyond the compile fix — existing consumers (Cloudflare analytics worker) ignore unknown types gracefully. Build hygiene: - staffml/.gitignore: ignore test-results/ and playwright-report/ (per-run artifacts — screenshots, videos, traces on failure). The tests/ directory itself IS committed. - playwright.config.ts: baseURL http://localhost:3000, single-worker serial execution, no retries, traces/screenshots/video on failure. How to run: npx next dev & # dev server in one shell npx playwright test tests/practice-smoke # tests in another All three tests pass against the dev server (chromium, 4.7s total).	2026-04-24 16:21:38 -04:00

12 Commits