The Phase 0 cleanup removed 18 scripts as deprecated, but 6 of them have
unique-capability patterns not yet covered by the modern tooling. Restoring
them as reference patterns, not active scripts.
What's restored and why:

- `gemini_backfill_question.py`: Idempotent corpus-walk + Gemini batch + thread-pool + JSON ↔ YAML round-trip. The "fix one field across thousands of YAMLs" pattern. To be mined in CORPUS_HARDENING_PLAN.md Phase 5.
- `gpt_backfill_question.py`: OpenAI variant of the above. Cross-provider template.
- `gemini_cli_generate_questions.py` (35K): BATCHED generation: 12 cells per call with balanced track × area × zone × level round-robin. `vault generate` does NOT batch — it calls once per question. This script's batching pattern is what we want when generating > 100 questions in bulk.
- `generate.py` (30K): Coverage-survey-driven generation engine: surveys the corpus, finds empty cells, generates to fill the emptiest first, stops when saturated. `vault generate` lacks this auto-balance loop.
- `gemini_fix_errors.py`: Batch error-fixer with hardware-reference grounding (V100 / A100 / H100 / B200 / T4 specs as ground-truth context). To be mined for `audit_corpus_batched.py --propose-fixes` in Phase 5.
- `deep_verify.py`: Claude Opus + extended thinking; SHOWS ITS WORK on every napkin-math claim. Useful as a tiebreaker on borderline math findings from the lightweight audit.
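The backfill pattern called out above (idempotent walk + batched LLM calls + thread pool) can be sketched roughly as follows. All names here are illustrative, not the scripts' actual API; the YAML read/write and the real Gemini/OpenAI calls are elided behind stand-ins:

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 12  # illustrative; the real scripts choose their own size

def needs_backfill(record: dict) -> bool:
    # Idempotency lives here: records that already carry the field are
    # skipped, so a re-run only touches what the previous run missed.
    return not record.get("question")

def batched(items: list, size: int):
    for i in range(0, len(items), size):
        yield items[i:i + size]

def backfill_batch(batch: list, generate) -> list:
    # One LLM call per batch; `generate` is the provider adapter
    # (Gemini in one script, OpenAI in the other).
    answers = generate([r["id"] for r in batch])
    for record, answer in zip(batch, answers):
        record["question"] = answer
    return batch

def run_backfill(records: list, generate, workers: int = 4) -> int:
    todo = [r for r in records if needs_backfill(r)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in pool.map(lambda b: backfill_batch(b, generate),
                          batched(todo, BATCH_SIZE)):
            pass  # the real scripts round-trip each record back to its YAML here
    return len(todo)
```

Run it twice and the second pass is a no-op, which is what makes "fix one field across thousands of YAMLs" safe to resume after a crash or quota exhaustion.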
Each restored file has a 5-line STATUS comment block at the top
documenting what to adapt before running. DEPRECATED.md is restructured
to make the three categories explicit (removed / preserved-for-adaptation
/ active-migration) and now includes an adaptation checklist that applies
to all preserved scripts (replace corpus.json loading, verify SDK pins,
update output paths, re-validate prompts, sample first).
Validation:

- `vault check --strict` — 10,711 loaded, 0 invariant failures
- `pytest` — 74/74
- `ruff` — clean
# Legacy scripts — `interviews/vault/scripts/`
Many scripts in this directory pre-date the YAML-as-source-of-truth
migration (ARCHITECTURE.md v2.x, Phase 1). YAML at
`../questions/**/*.yaml` is now authoritative; the legacy scripts ran
against the monolithic `../corpus.json`, which itself is now a
generated artifact (emitted by `vault build --local-json`).

Do not run the remaining scripts in this directory without understanding
what they were for. New contributors should reach for the `vault`
CLI first; reach into this directory only when adapting one of the
preserved patterns below.
The directory has three classes of script:
- **Removed** — replaced by `vault` CLI subcommands or by
  `vault-cli/scripts/` equivalents. Findable via
  `git log --diff-filter=D -- interviews/vault/scripts/`.
- **Preserved for adaptation** — unique-capability scripts kept as
  reference patterns. Each has a `STATUS:` comment block at the top
  explaining the adaptation needed before running.
- **Active migration / one-shot** — everything else in the directory.
  These are mostly post-v1.0 cleanups that haven't been retired yet.
  Triage individually; some may move to `vault-cli/` or be removed in
  a future cleanup pass.
## Removed in 2026-05

The following 12 scripts were deleted as unambiguously dead. The mapping
is preserved here for git archaeology — find them via
`git log --diff-filter=D -- interviews/vault/scripts/` if you need to
read the historical implementation.
| Removed script | Purpose (pre-migration) | Replacement |
|---|---|---|
| `build_corpus.py` | Assembled `corpus.json` from track/zone data | `vault build` (walks YAML, emits `vault.db`) |
| `export_to_staffml.py` | Copied `corpus.json` → `staffml/src/data/` with field massaging | `vault build --local-json` (writes site-compatible JSON) |
| `extract_taxonomy.py` | Extracted topic graph from `corpus.json` | `vault/taxonomy.yaml` is the source now; see `vault/schema/EVOLUTION.md` |
| `gemini_cli_llm_judge.py` | Legacy LLM-as-judge over `corpus.json` | `vault-cli/scripts/audit_chains_with_gemini.py` (chain audit), `validate_drafts.py` (per-draft gates), and the upcoming `audit_corpus_batched.py` (CORPUS_HARDENING_PLAN.md Phase 3) |
| `gemini_cli_math_review.py` | CLI-flag variant of `gemini_math_review.py` | `vault-cli/scripts/audit_math.py` (active per-question math gate) |
| `gate.py`, `archive/expand_tracks.py`, `archive/fill_zone_gaps.py`, `archive/fill_gaps.sh`, `archive/final_balance.sh`, `archive/README.md` | Pre-launch / pre-v1.0 hierarchy-workaround one-shots | Obsolete after schema v1.0 (taxonomy lives in YAML, not in the path); YAML files authored directly via `vault new` |
## Preserved for adaptation

These 6 scripts have unique capabilities not yet covered by the modern
tooling. They are kept as reference patterns; each has a `STATUS:`
comment block at the top documenting what to adapt before running.
| Preserved script | Why kept | When to mine it |
|---|---|---|
| `gemini_backfill_question.py` | Idempotent corpus-walk + Gemini batch + thread-pool + JSON ↔ YAML round-trip. The "fix one field across thousands of YAMLs" pattern. | CORPUS_HARDENING_PLAN.md Phase 5 — reuse the batching + idempotency pattern when applying Gemini-proposed format-marker corrections at scale |
| `gpt_backfill_question.py` | OpenAI/GPT variant of `gemini_backfill_question.py`. Cross-provider template. | When Gemini quota is exhausted, or for A/B comparison of LLM provider quality on the same task |
| `gemini_cli_generate_questions.py` | BATCHED generation: 12 cells per call with balanced track × area × zone × level round-robin. `vault generate` does NOT batch — it calls once per question. | When generating > 100 questions in bulk (the 1-call-per-question shape of `vault generate` is fine for tens, wasteful for hundreds) |
| `generate.py` | Coverage-survey-driven generation engine: surveys the corpus, finds empty cells, generates to fill the emptiest first, stops when saturated. `vault generate` does targeted per-cell generation but lacks the auto-balance loop. | When you want "fill all the gaps until the corpus is X-questions-per-cell," not "give me 5 questions about Y" |
| `gemini_fix_errors.py` | Batch error-fixer with hardware-reference grounding (V100 / A100 / H100 / B200 / T4 specs as JSON-encoded ground truth in the prompt). | CORPUS_HARDENING_PLAN.md Phase 5 — `audit_corpus_batched.py --propose-fixes` should embed the same hardware-reference table when proposing math/coherence corrections |
| `deep_verify.py` | Claude Opus + extended thinking; asks the model to SHOW ITS WORK on every napkin-math claim, step by step. Deeper than `audit_math.py`'s lightweight check. | Tiebreaker on borderline math findings from `audit_corpus_batched.py` — when the lightweight Gemini judge says "fail" but the prose looks reasonable, run `deep_verify` on the suspect IDs |
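The auto-balance loop that `generate.py` has and `vault generate` lacks amounts to a min-heap over cell occupancy: survey, generate for the emptiest cell, stop once the emptiest cell is saturated. A minimal sketch with illustrative names (`generate_one` stands in for the real per-cell generation call; this is not the script's actual API):

```python
import heapq
from collections import Counter

def fill_to_target(cells, counts: Counter, target: int, generate_one) -> int:
    """Generate for the emptiest cell until every cell holds `target` questions."""
    heap = [(counts[cell], cell) for cell in cells]
    heapq.heapify(heap)
    generated = 0
    while heap:
        n, cell = heapq.heappop(heap)
        if n >= target:
            break  # the emptiest remaining cell is saturated, so all are
        generate_one(cell)  # one question for the currently emptiest cell
        generated += 1
        heapq.heappush(heap, (n + 1, cell))
    return generated
```

Because the loop always pops the minimum, coverage stays balanced throughout the run rather than only at the end, and a crash partway through still leaves the corpus more even than it started.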
## Adaptation checklist (applies to all preserved scripts)

Before running any preserved script:

- **Replace corpus loading.** Most preserved scripts read
  `interviews/vault/corpus.json`. That file no longer exists in git; it's
  a build artifact. Adapt to walk `interviews/vault/questions/**/*.yaml`
  directly. Use `vault_cli.loader.load_corpus()` or the same
  `yaml.safe_load` loop the active scripts use (e.g.
  `audit_chains_with_gemini.py:load_corpus`).
- **Verify the LLM API surface.** The Gemini CLI version, Anthropic SDK
  version, and OpenAI client version may have moved on since the script
  was written. Check `pyproject.toml` for current pins.
- **Update output paths.** Many preserved scripts wrote to
  `interviews/vault/scripts/_validation_results/<UTC>/`. The current
  convention is `interviews/vault/_pipeline/runs/<UTC>/` (see
  `interviews/vault/README.md` § "Pipeline artifacts").
- **Re-validate the prompts.** The schema has evolved (zone × bloom
  affinity, closed `competency_area` enum, format-marker conventions in
  `common_mistake` / `napkin_math`). Regenerate the prompt-side schema
  summary against the current LinkML
  (`interviews/vault/schema/question_schema.yaml`).
- **Run on a sample first.** All these scripts are batch-mode and can
  touch hundreds or thousands of YAMLs in one run. Always run with
  `--limit 5` (or equivalent) first; verify the diff on a couple of
  files; then widen.
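The first and last checklist items combine naturally: walk the YAML tree instead of `corpus.json`, and cap the walk behind a `--limit` flag so the first run touches only a handful of files. A minimal skeleton, under the assumption that the script takes `--root`/`--limit` flags (names are illustrative; `vault_cli.loader.load_corpus()` is the preferred route when it fits):

```python
import argparse
from pathlib import Path

def iter_question_files(root: Path, limit=None):
    """Yield question YAML paths in sorted (deterministic) order, honoring limit."""
    for i, path in enumerate(sorted(root.glob("questions/**/*.yaml"))):
        if limit is not None and i >= limit:
            return
        yield path

def main(argv=None):
    parser = argparse.ArgumentParser(description="batch-script skeleton")
    parser.add_argument("--root", type=Path, default=Path("interviews/vault"))
    parser.add_argument("--limit", type=int, default=None,
                        help="process at most N files (start with --limit 5)")
    args = parser.parse_args(argv)
    for path in iter_question_files(args.root, args.limit):
        # A real adaptation would yaml.safe_load(path.read_text()) here,
        # edit the record, and write it back; print-only keeps the dry run safe.
        print(path)
```

Sorting the glob makes `--limit 5` deterministic, so the sample you diffed is exactly the sample the widened run revisits first.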
## Active migration / one-shot scripts (not in scope of this doc)

The directory still contains several scripts not classified above
(e.g. `analyze_coverage_gaps.py`, `audit_applicability_matrix.py`,
`audit_question_backfill_balance.py`, `audit_visual_questions.py`,
`fix_competency_areas.py`, `iterate_coverage_loop.py`,
`migrate_to_*`, `plan_gap_improvements.py`, `portfolio_balance_loop.py`,
`promote_validated.py`, `reclassify_zone_bloom_mismatch.py`,
`rename_legacy_ids.py`, `repair_chains.py`, `repair_registry.py`,
`render_visuals.py`, `scorecard.py`, `validate_generation_gates.py`,
`validate_questions.py`, `vault_fill.py`, plus the shell wrappers
`review_math.sh`, `run_parallel.sh`, `run_reviews.sh`).

Several of these are referenced from active docs:

- `MASSIVE_BUILD_RUNBOOK.md` cites `analyze_coverage_gaps.py`,
  `iterate_coverage_loop.py`, `promote_validated.py`.
- `vault/visuals/ARCHITECTURE.md` cites `render_visuals.py` as the
  single entry point for figure rendering.

Triage of those scripts is out of scope for the 2026-05 deprecation
pass. They will be classified individually when each owner's workstream
next touches them.
## Commands that are live today

```
vault build --local-json     # regenerate corpus.json
vault publish <version>      # end-to-end release
vault export-paper <version> # paper macros + stats
vault verify <version>       # academic-citability check
vault check --strict         # 26 invariants
vault generate --topic X --zone Y --track Z --level Lz --count N  # generate new drafts
```

See `../../vault-cli/README.md` for the full 22-subcommand reference.
For Gemini-driven audit + correction at corpus scale, see
`../../vault-cli/docs/CORPUS_HARDENING_PLAN.md` (the active workplan).