cs249r_book

mirror of https://github.com/harvard-edge/cs249r_book.git synced 2026-07-19 01:14:07 -05:00

Author	SHA1	Message	Date
Vijay Janapa Reddi	2b381bb949	refactor(vault-cli): rename --legacy-json to --local-json The flag is the StaffML frontend's local-dev fallback (read corpus.json from disk via NEXT_PUBLIC_VAULT_FALLBACK=static), not a deprecated path. "Legacy" implied "soon to be removed"; "local-json" describes its actual role and reads correctly in scripts and docs. - vault-cli: rename CLI flag, parameter, result key, and help text. - CI workflows + pre-commit config: invoke the new flag name. - All scripts that print the command (suggest_exemplars, pre_commit_corpus_guard, promote_validated, rename_legacy_ids, export_to_staffml, the paper analyze_corpus/generate_*) updated. - Comments and docs (ARCHITECTURE, CHANGELOG, REVIEWS, TESTING, MASSIVE_BUILD_RUNBOOK, DEPRECATED, AUTHORING, plus frontend comments and .env.example / .gitignore) updated. The "legacy_json" sentinel string in corpus_stats.json._meta.source is intentionally NOT renamed — it is a stable artifact format read by downstream paper-generation tooling.	2026-04-30 09:30:28 -04:00
Vijay Janapa Reddi	c824ac6ed1	refactor(staffml): retire prod static-fallback; opt-in dev-only (#1598) The bundled corpus.json was serving as a prod safety net behind the Cloudflare Worker. Post-cutover the Worker has been the real data source, and the static path was silently degrading rather than helping (corpus.json is a generated artifact whose prose `details` are blank in corpus-summary.json). This change: - Stops emitting corpus.json in the publish-live workflow - Removes the Worker-error fallback in getQuestionFullDetail — errors now propagate to useFullQuestion and the UI shows a "details unavailable" banner instead of silently filling blanks - Drops the localhost auto-trigger in shouldUseStaticDetails — the static path now requires explicit NEXT_PUBLIC_VAULT_FALLBACK=static - Switches taxonomy.ts to corpus-summary.json (was corpus.json) - Rewrites the publish-live smoke tests against corpus-summary.json - Collapses validate-vault.py to sparse-only (per-question deep validation lives in `vault check --strict`) Static-fallback remains as an OPT-IN local-dev affordance: set NEXT_PUBLIC_VAULT_FALLBACK=static and run `vault build --legacy-json` to materialize corpus.json. The Function-constructor dynamic import keeps Turbopack from requiring corpus.json at build time. useFullQuestion hook signature changed from `Question | undefined` to `{ question, status }`. Callers updated: practice and plans pages (both render an amber "details unavailable" banner when status is 'error'). Deleted dead cutover scaffolding: corpus-source.ts (router with no UI consumers), corpus-vault.ts (worker-only mirror, never wired up), useVaultQuestion.ts (unused migration hook), vault-fallback.ts (only consumer was corpus-source.ts). Deleted stale docs: staffml/scripts/DEPRECATED.md, vault-cli/docs/CUTOVER_QA.md, three vault/docs/RESUME_PLAN_*.md. Verified locally: tsc clean, vitest 37/37, next build produces all 15 static routes. 
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 18:47:03 -04:00
Vijay Janapa Reddi	643b1a51aa	docs(vault): RESUME_PLAN_PHASE_D.md — handoff for parallelism gap closure Captures the next push (Phase D/E/F) for a fresh session. Three tracks designed to interleave for parallelism: - Phase D — close the parallelism + global L4-L6+ priority gaps via hand-built topic targets + a parallelism-specific prompt variant (the analyzer's recommended_plan picks topic-level cells by priority, missing the area-level parallelism aggregation). - Phase E — three generator-leverage improvements: retry-on-validation-fail (saves ~50% API calls), auto-update vault-manifest (eliminates the stale-manifest pre-commit failures), analyzer --include-areas flag (so future runs don't need D.1's hand-build). - Phase F — residual cleanup of C.3's leftovers: 25 unjudged + 13 still-NEEDS_FIX + 20 DROP. Plus the original B.6's 20-item spot-read that we only got 5 items into. Parallelism map embedded in the plan: D.3 (parallelism loop) runs concurrent with F.2 (fix-agent on different IDs); D.5 (global loop) concurrent with F.1 (re-judge) + F.3 (spot-read). No two generation loops concurrent (next-id race). Total ~12-15 hr work, ~8 hr wall clock with parallelism, ~30-40 API calls. Three review checkpoints (D.2' prompt review, D.4 spot-read, final). Companion to RESUME_PLAN_2026-04-25.md (Phase 1-7) and RESUME_PLAN_RELEASE.md (Phase A).	2026-04-25 17:39:11 -04:00
Vijay Janapa Reddi	5350a1f9db	docs(vault): RESUME_PLAN_RELEASE.md — handoff for stable-dev push Captures the cleanup → balanced generation → final stable state plan for the next session. Locks in five user decisions (expert-driven lint calibration, fix chain data, react-medium-image-zoom modal, Claude-drafted prompts with user review, single stable commit not release tags). Three review checkpoints (A.6.3 expert consensus, B.3' prompt drafts, D.2 final state). Companion to RESUME_PLAN_2026-04-25.md (this branch's history).	2026-04-25 13:58:20 -04:00
Vijay Janapa Reddi	fdd42753fb	docs(vault): RESUME_PLAN_2026-04-25.md — handoff for next session	2026-04-25 11:01:21 -04:00
Vijay Janapa Reddi	24d3269c77	feat(vault): Phase 0 — competency_area cleanup + closed-enum hardening Pre-flight cleanup before the day's massive question-generation build. Three changes, all preventing recurrence of the Gemini-generated drift that surfaced in the GUI's area filter: 1. fix_competency_areas.py — remap script with table covering 39 observed malformed values (topic-name-as-area, zone-name-as-area, '<track> / <topic>' slash-form). Applied: 41 files fixed. 2. LinkML schema — added CompetencyArea closed enum with the 13 canonical values (deployment, parallelism, networking, latency, memory, compute, data, power, precision, reliability, optimization, architecture, cross-cutting). competency_area field now references the enum. Future drafts that try to use a topic name fail validation. 3. Pydantic validator — _area() field_validator on Question rejects any value outside VALID_COMPETENCY_AREAS. Catches drift at YAML load before vault build can include the bad row. Plus generator default batch_size bumped from 12 → 30 cells per Gemini call. The 250-call/day cap rewards larger batches. Plus MASSIVE_BUILD_RUNBOOK.md — the full day's methodology committed as a runbook so future generation sessions follow the same shape.	2026-04-25 10:59:43 -04:00
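
The closed-enum check described in commit 24d3269c77 can be sketched with the standard library alone. The 13 area names and the `VALID_COMPETENCY_AREAS` identifier come from the commit message; the `validate_area` helper and the use of `enum.Enum` in place of the Pydantic `field_validator` and LinkML enum are assumptions made for illustration, not the repository's actual code.

```python
from enum import Enum


class CompetencyArea(str, Enum):
    """The 13 canonical competency areas named in the commit message."""

    DEPLOYMENT = "deployment"
    PARALLELISM = "parallelism"
    NETWORKING = "networking"
    LATENCY = "latency"
    MEMORY = "memory"
    COMPUTE = "compute"
    DATA = "data"
    POWER = "power"
    PRECISION = "precision"
    RELIABILITY = "reliability"
    OPTIMIZATION = "optimization"
    ARCHITECTURE = "architecture"
    CROSS_CUTTING = "cross-cutting"


VALID_COMPETENCY_AREAS = frozenset(area.value for area in CompetencyArea)


def validate_area(value: str) -> str:
    """Hypothetical stand-in for the validator: reject non-canonical areas."""
    if value not in VALID_COMPETENCY_AREAS:
        raise ValueError(
            f"competency_area {value!r} is not one of: "
            + ", ".join(sorted(VALID_COMPETENCY_AREAS))
        )
    return value
```

Under this sketch, a slash-form value such as `"cloud / kv-cache"` (an illustrative example of the drift the commit describes) raises `ValueError` at load time instead of reaching the build.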
Vijay Janapa Reddi	8a5c3ff3c5	refactor(vault): rename 4,754 cohort-tagged IDs to clean <track>-NNNN form Audit followed by execution. Three findings, one big move, three minor cleanups documented for follow-up. Audit (interviews/vault/audit/2026-04-25-schema-folder-audit.md): 1. Folder structure is correct — flat <track>/<id>.yaml. ARCHITECTURE.md §3.3 documents that the v0.1 deeper-hierarchy attempt dropped 86 questions and was reverted in v1.0 with sound reasoning. No change. 2. Schema is solid. Required fields populate at 100%; optional fields populate where they make sense. Three small fixes worth making later: tighter id regex, drop dead details.question, strip cohort tags at promotion. 3. The 86 questions dropped on April 18 were ALREADY restored on April 21 — set-difference of pre-v0.1 vs today's published returns zero. Nothing to recover. Rename: - 4,754 cohort-tagged YAMLs (cloud-fill-, cloud-cell-, cloud-r2-, cloud-sus-, cloud-crit-, cloud-top-, cloud-new-, edge-exp-, -balance-, -portfolio-, -pilot-, ...) renamed to clean <track>-NNNN form continuing each track's monotonic sequence. - Per-track ranges minted: cloud: cloud-2866..cloud-4486 (1,621 renamed) edge: edge-0986..edge-2264 (1,279 renamed) mobile: mobile-0841..mobile-1870 (1,030 renamed) tinyml: tinyml-0830..tinyml-1541 (712 renamed) global: global-320..global-431 (112 renamed) - Bundle rebuilt: 9,224 published (unchanged). - vault check --strict: 0 load errors, 0 invariant failures. Chain-breakage analysis (the original concern): - ZERO of the 3,066 chain question references used cohort-tagged IDs. All chain refs were already in clean form. The rename has no chain impact at all — the breakage cost we discounted was zero. External-link preservation: - interviews/vault/docs/id-renames-2026-04-25.yaml records every old→new mapping for forensic lookup. - interviews/staffml/src/data/id-redirects.json mirrors the map for the website. 
- The practice page now consults this map when ?q=<id> resolves to nothing — preserves shareable links to the 4,428 published renames. (326 redirects target draft items and legitimately fall back to the not-found banner.) Tests: - All 7 existing Playwright smoke tests still pass. - New test added: ?q=<legacy-cohort-id> resolves through the redirect map (using cloud-cell-10000 → cloud-2878 as the fixture). - 8 / 8 pass.	2026-04-25 10:32:20 -04:00
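
The redirect fallback described in commit 8a5c3ff3c5 amounts to a two-step lookup. The sketch below is in Python for consistency with the vault CLI, although the practice page itself is part of the frontend; the function name, its parameters, and the `cloud-fill-draft-example` ID are hypothetical. Only the `cloud-cell-10000` → `cloud-2878` mapping is taken from the commit message.

```python
from typing import Optional


def resolve_question_id(
    requested: str,
    published_ids: set[str],
    redirects: dict[str, str],
) -> Optional[str]:
    """Resolve a ?q=<id> value, consulting the rename map only on a miss.

    Returns None when neither the requested ID nor its redirect target is
    published, which corresponds to showing the not-found banner.
    """
    if requested in published_ids:
        return requested
    target = redirects.get(requested)
    if target in published_ids:
        return target
    return None


# Illustrative data: one published rename and one redirect to a draft item.
published = {"cloud-2878"}
redirect_map = {
    "cloud-cell-10000": "cloud-2878",          # fixture named in the commit
    "cloud-fill-draft-example": "cloud-9999",  # hypothetical draft target
}
```

Checking the published set first keeps current IDs on the fast path, and a redirect whose target is still a draft falls through to `None`, matching the behaviour the commit describes for the 326 draft-targeting redirects.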
Vijay Janapa Reddi	db7489d7f2	docs: refresh paper + ID_SCHEMES for new published count (9,224) Paper artifacts regenerated against the post-promotion vault state: - macros.tex: 9,224 published, refreshed track distribution and chain counts - All four data figures rebuilt - corpus_stats.json synced - paper.tex zone-count prose updated (diagnosis 1,575 → 1,583, evaluation 1,110 → 1,113; fluency unchanged at 1,227) - Paper rebuilds clean at 25 pages ID_SCHEMES.md rewritten: - Retired the 2026-04-21 yyyymm-4hex proposal as unreadable + incompatible with legacy IDs - New scheme: plain <track>-NNNN, monotonic per-track sequence - Rationale derived from arXiv / CVE / RFC ID conventions - Migration policy unchanged: NO rename of legacy IDs Plus 0.10.0 release directory: vault.db copy + release.json with published_count=9224. `latest` symlink now -> 0.10.0.	2026-04-25 09:54:44 -04:00
Vijay Janapa Reddi	d7a4745838	docs(vault): define StaffML item quality and pilot ID policy	2026-04-24 18:09:36 -04:00
Vijay Janapa Reddi	7e8444cdf2	feat(vault): ID scheme v2 + vault ls/show/chain browse commands Adds durable ID conventions for new questions and chains, plus three CLI commands that solve the 'know what's there without opening 10k YAMLs' workflow. Legacy IDs are preserved — this is purely additive. Documentation interviews/vault/docs/ID_SCHEMES.md - Question ID: <track>-<yyyymm>-<4hex> e.g. cloud-202604-a3f2. Content-addressed 4-hex from sha256(title + '\n' + topic). Collision-on-create bumps the hex. - Chain ID: chain-<track>-<topic-slug>-<yyyymm>[-suffix] e.g. chain-cloud-kv-cache-management-202604. Self-describing; topic invariant per chain enforced by 'vault lint'. - Durability principle: IDs encode only immutable axes (track + creation month + content hash). Level/zone/topic/status live in the query layer, not the name. - Migration policy: legacy IDs + chain IDs preserved forever; 'vault new' starts using the v2 scheme as of this commit. Authoring (interviews/vault-cli/src/vault_cli/commands/authoring.py) - _new_question_id() generates the v2 question ID with collision handling (increments 4-hex suffix within the same track/yyyymm bucket). - _new_chain_id() generates the v2 chain ID with suffix disambiguation. - 'vault new' now emits v2 IDs instead of the old <track>-<topic-slug>-<short-hash>-<seq> scheme. CLI commands (interviews/vault-cli/src/vault_cli/commands/) vault ls [filters] Browse questions with axis filters (--track, --level, --zone, --topic, --status, --in-chains, --limit, --plain). Columns: id | track | level | zone | topic | #chains | title. --plain emits TSV for grep/awk piping. vault show <question-id> Full classification + validation lineage (LLM, math, human-review) + scenario preview + every chain the question belongs to, with prev/next walk for each. vault chain ls [--track --topic] Lists all chains with member count + level span (e.g. 'L1/L3/L5'). 
vault chain show <chain-id> Walks the chain end-to-end, one row per member in position order. Warns if the chain spans multiple topics (likely mis-linked) or if the level sequence is not monotonically non-decreasing. Immediately surfaced a real finding: cloud-chain-432 spans both latency-decomposition + roofline-analysis; flagged for review. Tests passed locally for every flag combination above.	2026-04-21 19:24:15 -04:00
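
The v2 question ID scheme introduced in commit 7e8444cdf2 (and later retired in db7489d7f2) can be sketched as follows. The `<track>-<yyyymm>-<4hex>` shape, the `sha256(title + '\n' + topic)` input, and the increment-on-collision rule come from the commit message. Taking the first four hex digits of the digest, wrapping at `0xffff`, and the function signature are assumptions; the real `_new_question_id()` may differ.

```python
import hashlib
from datetime import date


def new_question_id(
    track: str,
    title: str,
    topic: str,
    created: date,
    existing_ids: set[str],
) -> str:
    """Sketch of a content-addressed ID of the form <track>-<yyyymm>-<4hex>."""
    digest = hashlib.sha256((title + "\n" + topic).encode("utf-8")).hexdigest()
    suffix = int(digest[:4], 16)  # assumption: leading 4 hex digits
    prefix = f"{track}-{created:%Y%m}-"
    # On collision, bump the 4-hex suffix within the same track/yyyymm bucket.
    for _ in range(0x10000):
        candidate = f"{prefix}{suffix:04x}"
        if candidate not in existing_ids:
            return candidate
        suffix = (suffix + 1) % 0x10000
    raise RuntimeError(f"no free 4-hex suffix left under {prefix}")
```

Because the suffix depends only on title and topic, calling the function twice with the same inputs and an empty `existing_ids` set yields the same ID, and a second question with identical title and topic in the same month receives the next hex value.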

10 Commits