10 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
2b381bb949 refactor(vault-cli): rename --legacy-json to --local-json
The flag is the StaffML frontend's local-dev fallback (read corpus.json
from disk via NEXT_PUBLIC_VAULT_FALLBACK=static), not a deprecated path.
"Legacy" implied "soon to be removed"; "local-json" describes its actual
role and reads correctly in scripts and docs.

- vault-cli: rename CLI flag, parameter, result key, and help text.
- CI workflows + pre-commit config: invoke the new flag name.
- All scripts that print the command (suggest_exemplars,
  pre_commit_corpus_guard, promote_validated, rename_legacy_ids,
  export_to_staffml, the paper analyze_corpus/generate_*) updated.
- Comments and docs (ARCHITECTURE, CHANGELOG, REVIEWS, TESTING,
  MASSIVE_BUILD_RUNBOOK, DEPRECATED, AUTHORING, plus frontend
  comments and .env.example / .gitignore) updated.

The "legacy_json" sentinel string in corpus_stats.json._meta.source
is intentionally NOT renamed — it is a stable artifact format read
by downstream paper-generation tooling.
2026-04-30 09:30:28 -04:00
Vijay Janapa Reddi
c824ac6ed1 refactor(staffml): retire prod static-fallback; opt-in dev-only (#1598)
The bundled corpus.json was serving as a prod safety net behind the
Cloudflare Worker. Post-cutover the Worker has been the real data
source, and the static path was silently degrading rather than helping
(corpus.json is a generated artifact whose prose `details` are blank
in corpus-summary.json). This change:

- Stops emitting corpus.json in the publish-live workflow
- Removes the Worker-error fallback in getQuestionFullDetail — errors
  now propagate to useFullQuestion and the UI shows a "details
  unavailable" banner instead of silently filling blanks
- Drops the localhost auto-trigger in shouldUseStaticDetails — the
  static path now requires explicit NEXT_PUBLIC_VAULT_FALLBACK=static
- Switches taxonomy.ts to corpus-summary.json (was corpus.json)
- Rewrites the publish-live smoke tests against corpus-summary.json
- Collapses validate-vault.py to sparse-only (per-question deep
  validation lives in `vault check --strict`)

Static-fallback remains as an OPT-IN local-dev affordance: set
NEXT_PUBLIC_VAULT_FALLBACK=static and run `vault build --legacy-json`
to materialize corpus.json. The Function-constructor dynamic import
keeps Turbopack from requiring corpus.json at build time.

useFullQuestion hook signature changed from `Question | undefined` to
`{ question, status }`. Callers updated: practice and plans pages
(both render an amber "details unavailable" banner when status
is 'error').

Deleted dead cutover scaffolding: corpus-source.ts (router with no UI
consumers), corpus-vault.ts (worker-only mirror, never wired up),
useVaultQuestion.ts (unused migration hook), vault-fallback.ts (only
consumer was corpus-source.ts).

Deleted stale docs: staffml/scripts/DEPRECATED.md, vault-cli/docs/
CUTOVER_QA.md, three vault/docs/RESUME_PLAN_*.md.

Verified locally: tsc clean, vitest 37/37, next build produces all
15 static routes.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 18:47:03 -04:00
Vijay Janapa Reddi
643b1a51aa docs(vault): RESUME_PLAN_PHASE_D.md — handoff for parallelism gap closure
Captures the next push (Phase D/E/F) for a fresh session. Three
tracks designed to interleave for parallelism:

- Phase D — close the parallelism + global L4-L6+ priority gaps
  via hand-built topic targets + a parallelism-specific prompt
  variant (the analyzer's recommended_plan picks topic-level cells
  by priority, missing the area-level parallelism aggregation).
- Phase E — three generator-leverage improvements: retry-on-
  validation-fail (saves ~50% API calls), auto-update vault-manifest
  (eliminates the stale-manifest pre-commit failures), analyzer
  --include-areas flag (so future runs don't need D.1's hand-build).
- Phase F — residual cleanup of C.3's leftovers: 25 unjudged + 13
  still-NEEDS_FIX + 20 DROP. Plus the original B.6's 20-item spot-
  read that we only got 5 items into.

Parallelism map embedded in the plan: D.3 (parallelism loop) runs
concurrent with F.2 (fix-agent on different IDs); D.5 (global loop)
concurrent with F.1 (re-judge) + F.3 (spot-read). No two generation
loops concurrent (next-id race). Total ~12-15 hr work, ~8 hr wall
clock with parallelism, ~30-40 API calls.

Three review checkpoints (D.2' prompt review, D.4 spot-read, final).

Companion to RESUME_PLAN_2026-04-25.md (Phase 1-7) and
RESUME_PLAN_RELEASE.md (Phase A).
2026-04-25 17:39:11 -04:00
Vijay Janapa Reddi
5350a1f9db docs(vault): RESUME_PLAN_RELEASE.md — handoff for stable-dev push
Captures the cleanup → balanced generation → final stable state plan
for the next session. Locks in five user decisions (expert-driven lint
calibration, fix chain data, react-medium-image-zoom modal, Claude-
drafted prompts with user review, single stable commit not release
tags). Three review checkpoints (A.6.3 expert consensus, B.3' prompt
drafts, D.2 final state).

Companion to RESUME_PLAN_2026-04-25.md (this branch's history).
2026-04-25 13:58:20 -04:00
Vijay Janapa Reddi
fdd42753fb docs(vault): RESUME_PLAN_2026-04-25.md — handoff for next session
2026-04-25 11:01:21 -04:00
Vijay Janapa Reddi
24d3269c77 feat(vault): Phase 0 — competency_area cleanup + closed-enum hardening
Pre-flight cleanup before the day's massive question-generation build.

Three changes, all preventing recurrence of the Gemini-generated drift
that surfaced in the GUI's area filter:

1. fix_competency_areas.py — remap script with table covering 39
   observed malformed values (topic-name-as-area, zone-name-as-area,
   '<track> / <topic>' slash-form). Applied: 41 files fixed.

2. LinkML schema — added CompetencyArea closed enum with the 13
   canonical values (deployment, parallelism, networking, latency,
   memory, compute, data, power, precision, reliability, optimization,
   architecture, cross-cutting). competency_area field now references
   the enum. Future drafts that try to use a topic name fail validation.

3. Pydantic validator — _area() field_validator on Question rejects
   any value outside VALID_COMPETENCY_AREAS. Catches drift at YAML
   load before vault build can include the bad row.
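
A minimal sketch of the closed-enum check described in items 2 and 3, written against the standard library only rather than Pydantic or LinkML. `validate_area` is a hypothetical name for illustration; the 13 values are the ones listed above.

```python
# The 13 canonical competency areas named in the commit message.
VALID_COMPETENCY_AREAS = frozenset({
    "deployment", "parallelism", "networking", "latency", "memory",
    "compute", "data", "power", "precision", "reliability",
    "optimization", "architecture", "cross-cutting",
})


def validate_area(value: str) -> str:
    """Return the value unchanged if canonical, otherwise raise ValueError."""
    if value not in VALID_COMPETENCY_AREAS:
        allowed = ", ".join(sorted(VALID_COMPETENCY_AREAS))
        raise ValueError(
            f"competency_area {value!r} is not one of: {allowed}"
        )
    return value
```

With a check like this, a topic name such as `kv-cache-management` used as an area is rejected at load time instead of reaching the build.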

Plus generator default batch_size bumped from 12 to 30 cells per Gemini
call. The 250-call/day cap rewards larger batches.
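
A rough upper bound on what the batch-size change buys, assuming every call carries a full batch and all 250 daily calls are spent on generation (both are assumptions, not figures from this commit):

```python
DAILY_CALL_CAP = 250  # per-day call cap stated above


def max_cells_per_day(batch_size: int, calls: int = DAILY_CALL_CAP) -> int:
    """Upper bound on cells generated per day at a given batch size."""
    return batch_size * calls


old_ceiling = max_cells_per_day(12)  # previous default
new_ceiling = max_cells_per_day(30)  # new default
```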

Plus MASSIVE_BUILD_RUNBOOK.md — the full day's methodology committed as
a runbook so future generation sessions follow the same shape.
2026-04-25 10:59:43 -04:00
Vijay Janapa Reddi
8a5c3ff3c5 refactor(vault): rename 4,754 cohort-tagged IDs to clean <track>-NNNN form
Audit followed by execution. Three findings, one big move, three minor
cleanups documented for follow-up.

Audit (interviews/vault/audit/2026-04-25-schema-folder-audit.md):
1. Folder structure is correct — flat <track>/<id>.yaml. ARCHITECTURE.md
   §3.3 documents that the v0.1 deeper-hierarchy attempt dropped 86
   questions and was reverted in v1.0 with sound reasoning. No change.
2. Schema is solid. Required fields populate at 100%; optional fields
   populate where they make sense. Three small fixes worth making
   later: tighter id regex, drop dead details.question, strip cohort
   tags at promotion.
3. The 86 questions dropped on April 18 were ALREADY restored on
   April 21 — set-difference of pre-v0.1 vs today's published returns
   zero. Nothing to recover.

Rename:
- 4,754 cohort-tagged YAMLs (cloud-fill-*, cloud-cell-*, cloud-r2-*,
  cloud-sus-*, cloud-crit-*, cloud-top-*, cloud-new-*, edge-exp-*,
  *-balance-*, *-portfolio-*, *-pilot-*, ...) renamed to clean
  <track>-NNNN form continuing each track's monotonic sequence.
- Per-track ranges minted:
    cloud:  cloud-2866..cloud-4486     (1,621 renamed)
    edge:   edge-0986..edge-2264       (1,279 renamed)
    mobile: mobile-0841..mobile-1870   (1,030 renamed)
    tinyml: tinyml-0830..tinyml-1541   (712 renamed)
    global: global-320..global-431     (112 renamed)
- Bundle rebuilt: 9,224 published (unchanged).
- vault check --strict: 0 load errors, 0 invariant failures.
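
The per-track ranges above can be cross-checked against the headline count with a short sketch. The numbers are copied from this message; `mint_ids` is a hypothetical helper showing one way to continue a monotonic sequence, not the script that was actually run.

```python
# Inclusive (first, last) sequence numbers minted per track, from the list above.
RANGES = {
    "cloud": (2866, 4486),
    "edge": (986, 2264),
    "mobile": (841, 1870),
    "tinyml": (830, 1541),
    "global": (320, 431),
}

# Size of each inclusive range, and the overall total.
counts = {track: last - first + 1 for track, (first, last) in RANGES.items()}
total_renamed = sum(counts.values())


def mint_ids(track, last_used, old_ids, width=4):
    """Map each old ID to the next free number in the track's sequence."""
    return {
        old: f"{track}-{number:0{width}d}"
        for number, old in enumerate(old_ids, start=last_used + 1)
    }
```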

Chain-breakage analysis (the original concern):
- ZERO of the 3,066 chain question references used cohort-tagged IDs.
  All chain refs were already in clean form. The rename has no chain
  impact at all — the breakage cost we discounted was zero.

External-link preservation:
- interviews/vault/docs/id-renames-2026-04-25.yaml records every
  old→new mapping for forensic lookup.
- interviews/staffml/src/data/id-redirects.json mirrors the map for
  the website.
- The practice page now consults this map when ?q=<id> resolves to
  nothing — preserves shareable links to the 4,428 published renames.
  (326 redirects target draft items and legitimately fall back to the
  not-found banner.)
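
A sketch of the lookup order described above, in Python for illustration only; the real logic lives in the frontend practice page. `resolve_question` and the dictionary shapes are assumptions. The `cloud-cell-10000` to `cloud-2878` pair is the fixture named under Tests.

```python
def resolve_question(qid, published, redirects):
    """Resolve a ?q=<id> value, consulting the rename map only on a miss.

    published: maps current IDs to question records.
    redirects: maps old cohort-tagged IDs to new IDs.
    Returns the record, or None when the not-found banner should show.
    """
    if qid in published:
        return published[qid]
    new_id = redirects.get(qid)
    if new_id is None:
        return None
    # A redirect whose target is still a draft is absent from `published`,
    # so this falls through to None as well.
    return published.get(new_id)
```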

Tests:
- All 7 existing Playwright smoke tests still pass.
- New test added: ?q=<legacy-cohort-id> resolves through the redirect
  map (using cloud-cell-10000 → cloud-2878 as the fixture).
- 8 / 8 pass.
2026-04-25 10:32:20 -04:00
Vijay Janapa Reddi
db7489d7f2 docs: refresh paper + ID_SCHEMES for new published count (9,224)
Paper artifacts regenerated against the post-promotion vault state:
- macros.tex: 9,224 published, refreshed track distribution and chain counts
- All four data figures rebuilt
- corpus_stats.json synced
- paper.tex zone-count prose updated (diagnosis 1,575 → 1,583, evaluation
  1,110 → 1,113; fluency unchanged at 1,227)
- Paper rebuilds clean at 25 pages

ID_SCHEMES.md rewritten:
- Retired the 2026-04-21 yyyymm-4hex proposal as unreadable + incompatible
  with legacy IDs
- New scheme: plain <track>-NNNN, monotonic per-track sequence
- Rationale derived from arXiv / CVE / RFC ID conventions
- Migration policy unchanged: NO rename of legacy IDs

Plus 0.10.0 release directory: vault.db copy + release.json with
published_count=9224. `latest` symlink now -> 0.10.0.
2026-04-25 09:54:44 -04:00
Vijay Janapa Reddi
d7a4745838 docs(vault): define StaffML item quality and pilot ID policy
2026-04-24 18:09:36 -04:00
Vijay Janapa Reddi
7e8444cdf2 feat(vault): ID scheme v2 + vault ls/show/chain browse commands
Adds durable ID conventions for new questions and chains, plus three
CLI commands that solve the 'know what's there without opening 10k
YAMLs' workflow. Legacy IDs are preserved — this is purely additive.

Documentation
  interviews/vault/docs/ID_SCHEMES.md
    - Question ID: <track>-<yyyymm>-<4hex>
      e.g. cloud-202604-a3f2. Content-addressed 4-hex from
      sha256(title + '\n' + topic). Collision-on-create bumps the hex.
    - Chain ID: chain-<track>-<topic-slug>-<yyyymm>[-suffix]
      e.g. chain-cloud-kv-cache-management-202604. Self-describing;
      topic invariant per chain enforced by 'vault lint'.
    - Durability principle: IDs encode only immutable axes (track +
      creation month + content hash). Level/zone/topic/status live in
      the query layer, not the name.
    - Migration policy: legacy IDs + chain IDs preserved forever;
      'vault new' starts using the v2 scheme as of this commit.
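
A minimal sketch of the v2 question ID scheme described above. It assumes the 4-hex part is the first four hex characters of the digest and that a collision bumps the value by one, wrapping within the bucket; neither detail is confirmed here. `new_question_id` is a hypothetical stand-in, not the `_new_question_id()` in authoring.py.

```python
import hashlib


def new_question_id(track, yyyymm, title, topic, existing):
    """Build <track>-<yyyymm>-<4hex>, bumping the hex on collision.

    existing: set of IDs already taken.
    """
    digest = hashlib.sha256(f"{title}\n{topic}".encode("utf-8")).hexdigest()
    value = int(digest[:4], 16)  # assumption: leading 4 hex chars
    for _ in range(0x10000):
        candidate = f"{track}-{yyyymm}-{value:04x}"
        if candidate not in existing:
            return candidate
        value = (value + 1) % 0x10000  # assumption: wrap around on overflow
    raise RuntimeError(f"no free 4-hex ID left in {track}-{yyyymm}")
```

The same title and topic always hash to the same starting value, so re-creating an identical question in the same month lands on the next free hex rather than silently reusing an ID.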

Authoring (interviews/vault-cli/src/vault_cli/commands/authoring.py)
  - _new_question_id() generates the v2 question ID with collision
    handling (increments 4-hex suffix within the same track/yyyymm
    bucket).
  - _new_chain_id() generates the v2 chain ID with suffix disambiguation.
  - 'vault new' now emits v2 IDs instead of the old
    <track>-<topic-slug>-<short-hash>-<seq> scheme.

CLI commands (interviews/vault-cli/src/vault_cli/commands/)
  vault ls [filters]
    Browse questions with axis filters (--track, --level, --zone,
    --topic, --status, --in-chains, --limit, --plain). Columns:
    id | track | level | zone | topic | #chains | title.
    --plain emits TSV for grep/awk piping.

  vault show <question-id>
    Full classification + validation lineage (LLM, math, human-review)
    + scenario preview + every chain the question belongs to, with
    prev/next walk for each.

  vault chain ls [--track --topic]
    Lists all chains with member count + level span (e.g. 'L1/L3/L5').

  vault chain show <chain-id>
    Walks the chain end-to-end, one row per member in position order.
    Warns if the chain spans multiple topics (likely mis-linked) or
    if the level sequence is not monotonically non-decreasing.
    Immediately surfaced a real finding: cloud-chain-432 spans both
    latency-decomposition + roofline-analysis; flagged for review.
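
The two warnings can be sketched as follows. This is an illustration under assumed data shapes (levels as integers, one topic string per member), not the code behind `vault chain show`; `chain_warnings` is a hypothetical name.

```python
def chain_warnings(members):
    """Check a chain given as (level, topic) pairs in position order."""
    warnings = []

    # A chain should stay on one topic; more than one suggests a mis-link.
    topics = sorted({topic for _, topic in members})
    if len(topics) > 1:
        warnings.append("chain spans multiple topics: " + ", ".join(topics))

    # Levels may repeat or rise along the chain, but must never drop.
    levels = [level for level, _ in members]
    if any(later < earlier for earlier, later in zip(levels, levels[1:])):
        warnings.append("level sequence is not monotonically non-decreasing")

    return warnings
```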

Tests passed locally for every flag combination above.
2026-04-21 19:24:15 -04:00