The Phase 0 cleanup removed 18 scripts as deprecated, but 6 of them have
unique-capability patterns not yet covered by the modern tooling. Restoring
them as reference patterns, not active scripts.
What's restored and why:

- `gemini_backfill_question.py`: Idempotent corpus-walk + Gemini batch + thread-pool + JSON ↔ YAML round-trip. The "fix one field across thousands of YAMLs" pattern. To be mined in CORPUS_HARDENING_PLAN.md Phase 5.
- `gpt_backfill_question.py`: OpenAI variant of the above. Cross-provider template.
- `gemini_cli_generate_questions.py` (35K): BATCHED generation: 12 cells per call with balanced track × area × zone × level round-robin. `vault generate` does NOT batch — it calls once per question. This script's batching pattern is what we want when generating > 100 questions in bulk.
- `generate.py` (30K): Coverage-survey-driven generation engine: surveys the corpus, finds empty cells, generates to fill the emptiest first, stops when saturated. `vault generate` lacks this auto-balance loop.
- `gemini_fix_errors.py`: Batch error-fixer with hardware-reference grounding (V100 / A100 / H100 / B200 / T4 specs as ground-truth context). To be mined for `audit_corpus_batched.py --propose-fixes` in Phase 5.
- `deep_verify.py`: Claude Opus + extended thinking; SHOWS ITS WORK on every napkin-math claim. Useful as a tiebreaker on borderline math findings from the lightweight audit.
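The backfill pattern called out above (idempotent walk + batched LLM calls + thread pool) can be sketched roughly as follows. All names here are illustrative, not the scripts' actual API; the YAML read/write and the real Gemini/OpenAI calls are elided behind stand-ins:

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 12  # illustrative; the real scripts choose their own size

def needs_backfill(record: dict) -> bool:
    # Idempotency lives here: records that already carry the field are
    # skipped, so a re-run only touches what the previous run missed.
    return not record.get("question")

def batched(items: list, size: int):
    for i in range(0, len(items), size):
        yield items[i:i + size]

def backfill_batch(batch: list, generate) -> list:
    # One LLM call per batch; `generate` is the provider adapter
    # (Gemini in one script, OpenAI in the other).
    answers = generate([r["id"] for r in batch])
    for record, answer in zip(batch, answers):
        record["question"] = answer
    return batch

def run_backfill(records: list, generate, workers: int = 4) -> int:
    todo = [r for r in records if needs_backfill(r)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in pool.map(lambda b: backfill_batch(b, generate),
                          batched(todo, BATCH_SIZE)):
            pass  # the real scripts round-trip each record back to its YAML here
    return len(todo)
```

Run it twice and the second pass is a no-op, which is what makes "fix one field across thousands of YAMLs" safe to resume after a crash or quota exhaustion.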
Each restored file has a 5-line STATUS comment block at the top
documenting what to adapt before running. DEPRECATED.md is restructured
to make the three categories explicit (removed / preserved-for-adaptation
/ active-migration) and now includes an adaptation checklist that applies
to all preserved scripts (replace corpus.json loading, verify SDK pins,
update output paths, re-validate prompts, sample first).
Validation:

- `vault check --strict` — 10,711 loaded, 0 invariant failures
- `pytest` — 74/74
- `ruff` — clean
# Legacy scripts — `interviews/vault/scripts/`
Many scripts in this directory pre-date the YAML-as-source-of-truth
migration (ARCHITECTURE.md v2.x, Phase 1). YAML at
`../questions/**/*.yaml` is now authoritative; the legacy scripts ran
against the monolithic `../corpus.json`, which itself is now a
generated artifact (emitted by `vault build --local-json`).

Do not run the remaining scripts in this directory without understanding
what they were for. New contributors should reach for the `vault`
CLI first; reach into this directory only when adapting one of the
preserved patterns below.
The directory has three classes of script:
- **Removed** — replaced by `vault` CLI subcommands or by
  `vault-cli/scripts/` equivalents. Findable via
  `git log --diff-filter=D -- interviews/vault/scripts/`.
- **Preserved for adaptation** — unique-capability scripts kept as
  reference patterns. Each has a `STATUS:` comment block at the top
  explaining the adaptation needed before running.
- **Active migration / one-shot** — everything else in the directory.
  These are mostly post-v1.0 cleanups that haven't been retired yet.
  Triage individually; some may move to `vault-cli/` or be removed in
  a future cleanup pass.
## Removed in 2026-05

The following 12 scripts were deleted as unambiguously dead. The mapping
is preserved here for git archaeology — find them via
`git log --diff-filter=D -- interviews/vault/scripts/` if you need to
read the historical implementation.
| Removed script | Purpose (pre-migration) | Replacement |
|---|---|---|
| `build_corpus.py` | Assembled `corpus.json` from track/zone data | `vault build` (walks YAML, emits `vault.db`) |
| `export_to_staffml.py` | Copied `corpus.json` → `staffml/src/data/` with field massaging | `vault build --local-json` (writes site-compatible JSON) |
| `extract_taxonomy.py` | Extracted topic graph from `corpus.json` | `vault/taxonomy.yaml` is the source now; see `vault/schema/EVOLUTION.md` |
| `gemini_cli_llm_judge.py` | Legacy LLM-as-judge over `corpus.json` | `vault-cli/scripts/audit_chains_with_gemini.py` (chain audit), `validate_drafts.py` (per-draft gates), and the upcoming `audit_corpus_batched.py` (CORPUS_HARDENING_PLAN.md Phase 3) |
| `gemini_cli_math_review.py` | CLI-flag variant of `gemini_math_review.py` | `vault-cli/scripts/audit_math.py` (active per-question math gate) |
| `gate.py`, `archive/expand_tracks.py`, `archive/fill_zone_gaps.py`, `archive/fill_gaps.sh`, `archive/final_balance.sh`, `archive/README.md` | Pre-launch / pre-v1.0 hierarchy-workaround one-shots | Obsolete after schema v1.0 (taxonomy lives in YAML, not in the path); YAML files authored directly via `vault new` |
## Preserved for adaptation

These 6 scripts have unique capabilities not yet covered by the modern
tooling. They are kept as reference patterns; each has a `STATUS:`
comment block at the top documenting what to adapt before running.
| Preserved script | Why kept | When to mine it |
|---|---|---|
| `gemini_backfill_question.py` | Idempotent corpus-walk + Gemini batch + thread-pool + JSON ↔ YAML round-trip. The "fix one field across thousands of YAMLs" pattern. | CORPUS_HARDENING_PLAN.md Phase 5 — reuse the batching + idempotency pattern when applying Gemini-proposed format-marker corrections at scale |
| `gpt_backfill_question.py` | OpenAI/GPT variant of `gemini_backfill_question.py`. Cross-provider template. | When Gemini quota is exhausted, or for A/B comparison of LLM provider quality on the same task |
| `gemini_cli_generate_questions.py` | BATCHED generation: 12 cells per call with balanced track × area × zone × level round-robin. `vault generate` does NOT batch — it calls once per question. | When generating > 100 questions in bulk (the 1-call-per-question shape of `vault generate` is fine for tens, wasteful for hundreds) |
| `generate.py` | Coverage-survey-driven generation engine: surveys the corpus, finds empty cells, generates to fill the emptiest first, stops when saturated. `vault generate` does targeted per-cell generation but lacks the auto-balance loop. | When you want "fill all the gaps until the corpus is X-questions-per-cell," not "give me 5 questions about Y" |
| `gemini_fix_errors.py` | Batch error-fixer with hardware-reference grounding (V100 / A100 / H100 / B200 / T4 specs as JSON-encoded ground truth in the prompt). | CORPUS_HARDENING_PLAN.md Phase 5 — `audit_corpus_batched.py --propose-fixes` should embed the same hardware-reference table when proposing math/coherence corrections |
| `deep_verify.py` | Claude Opus + extended thinking; asks the model to SHOW ITS WORK on every napkin-math claim, step by step. Deeper than `audit_math.py`'s lightweight check. | Tiebreaker on borderline math findings from `audit_corpus_batched.py` — when the lightweight Gemini judge says "fail" but the prose looks reasonable, run `deep_verify` on the suspect IDs |
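The auto-balance loop that `generate.py` has and `vault generate` lacks amounts to a min-heap over cell occupancy: survey, generate for the emptiest cell, stop once the emptiest cell is saturated. A minimal sketch with illustrative names (`generate_one` stands in for the real per-cell generation call; this is not the script's actual API):

```python
import heapq
from collections import Counter

def fill_to_target(cells, counts: Counter, target: int, generate_one) -> int:
    """Generate for the emptiest cell until every cell holds `target` questions."""
    heap = [(counts[cell], cell) for cell in cells]
    heapq.heapify(heap)
    generated = 0
    while heap:
        n, cell = heapq.heappop(heap)
        if n >= target:
            break  # the emptiest remaining cell is saturated, so all are
        generate_one(cell)  # one question for the currently emptiest cell
        generated += 1
        heapq.heappush(heap, (n + 1, cell))
    return generated
```

Because the loop always pops the minimum, coverage stays balanced throughout the run rather than only at the end, and a crash partway through still leaves the corpus more even than it started.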
## Adaptation checklist (applies to all preserved scripts)

Before running any preserved script:

- **Replace corpus loading.** Most preserved scripts read
  `interviews/vault/corpus.json`. That file no longer exists in git; it's
  a build artifact. Adapt to walk `interviews/vault/questions/**/*.yaml`
  directly. Use `vault_cli.loader.load_corpus()` or the same
  `yaml.safe_load` loop the active scripts use (e.g.
  `audit_chains_with_gemini.py:load_corpus`).
- **Verify the LLM API surface.** The Gemini CLI version, Anthropic SDK
  version, and OpenAI client version may have moved on since the script
  was written. Check `pyproject.toml` for current pins.
- **Update output paths.** Many preserved scripts wrote to
  `interviews/vault/scripts/_validation_results/<UTC>/`. The current
  convention is `interviews/vault/_pipeline/runs/<UTC>/` (see
  `interviews/vault/README.md` § "Pipeline artifacts").
- **Re-validate the prompts.** The schema has evolved (zone × bloom
  affinity, closed `competency_area` enum, format-marker conventions in
  `common_mistake` / `napkin_math`). Regenerate the prompt-side schema
  summary against the current LinkML
  (`interviews/vault/schema/question_schema.yaml`).
- **Run on a sample first.** All these scripts are batch-mode and can
  touch hundreds or thousands of YAMLs in one run. Always run with
  `--limit 5` (or equivalent) first; verify the diff on a couple of
  files; then widen.
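The first and last checklist items combine naturally: walk the YAML tree instead of `corpus.json`, and cap the walk behind a `--limit` flag so the first run touches only a handful of files. A minimal skeleton, under the assumption that the script takes `--root`/`--limit` flags (names are illustrative; `vault_cli.loader.load_corpus()` is the preferred route when it fits):

```python
import argparse
from pathlib import Path

def iter_question_files(root: Path, limit=None):
    """Yield question YAML paths in sorted (deterministic) order, honoring limit."""
    for i, path in enumerate(sorted(root.glob("questions/**/*.yaml"))):
        if limit is not None and i >= limit:
            return
        yield path

def main(argv=None):
    parser = argparse.ArgumentParser(description="batch-script skeleton")
    parser.add_argument("--root", type=Path, default=Path("interviews/vault"))
    parser.add_argument("--limit", type=int, default=None,
                        help="process at most N files (start with --limit 5)")
    args = parser.parse_args(argv)
    for path in iter_question_files(args.root, args.limit):
        # A real adaptation would yaml.safe_load(path.read_text()) here,
        # edit the record, and write it back; print-only keeps the dry run safe.
        print(path)
```

Sorting the glob makes `--limit 5` deterministic, so the sample you diffed is exactly the sample the widened run revisits first.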
## Active migration / one-shot scripts (not in scope of this doc)

The directory still contains several scripts not classified above
(e.g. `analyze_coverage_gaps.py`, `audit_applicability_matrix.py`,
`audit_question_backfill_balance.py`, `audit_visual_questions.py`,
`fix_competency_areas.py`, `iterate_coverage_loop.py`,
`migrate_to_*`, `plan_gap_improvements.py`, `portfolio_balance_loop.py`,
`promote_validated.py`, `reclassify_zone_bloom_mismatch.py`,
`rename_legacy_ids.py`, `repair_chains.py`, `repair_registry.py`,
`render_visuals.py`, `scorecard.py`, `validate_generation_gates.py`,
`validate_questions.py`, `vault_fill.py`, plus the shell wrappers
`review_math.sh`, `run_parallel.sh`, `run_reviews.sh`).

Several of these are referenced from active docs:

- `MASSIVE_BUILD_RUNBOOK.md` cites `analyze_coverage_gaps.py`,
  `iterate_coverage_loop.py`, `promote_validated.py`.
- `vault/visuals/ARCHITECTURE.md` cites `render_visuals.py` as the
  single entry point for figure rendering.

Triage of those scripts is out of scope for the 2026-05 deprecation
pass. They will be classified individually when each owner's workstream
next touches them.
## Commands that are live today

```
vault build --local-json     # regenerate corpus.json
vault publish <version>      # end-to-end release
vault export-paper <version> # paper macros + stats
vault verify <version>       # academic-citability check
vault check --strict         # 26 invariants
vault generate --topic X --zone Y --track Z --level Lz --count N  # generate new drafts
```

See `../../vault-cli/README.md` for the full 22-subcommand reference.
For Gemini-driven audit + correction at corpus scale, see
`../../vault-cli/docs/CORPUS_HARDENING_PLAN.md` (the active workplan).