2 Commits

Author SHA1 Message Date
Vijay Janapa Reddi
5225059754 fix(vault-cli): clear ruff violations flagged by --all-files sweep
Auto-fix removed extraneous f-string prefixes, unused imports
(re, sys, textwrap, defaultdict), and an unused local (qids), and
converted datetime.now(timezone.utc) to datetime.now(UTC) (UP017).
Manual fixes split colon/semicolon one-liners onto separate lines
(E701/E702), renamed unused loop vars (cid, chain_id) with leading
underscores (B007), replaced bare except with except Exception (E722),
and renamed loop var L to level to satisfy N806.
2026-05-02 09:17:15 -04:00
Vijay Janapa Reddi
4b880ebb1a feat(vault-cli): Phase 3.a + 3.b — gap-driven authoring tooling
Two new scripts that together close the loop from a gap entry to a
reviewable candidate question with a multi-gate scorecard.

generate_question_for_gap.py (3.a):
  - Reads a gap entry, loads between-questions + same-bucket exemplars,
    prompts gemini-3.1-pro-preview, runs Pydantic Question validation,
    and writes <track>/<area>/<id>.yaml.draft. The .draft suffix keeps
    drafts out of vault check / vault build until promotion.
  - ID allocator scans corpus + existing drafts so a batch run gets
    distinct fresh IDs without touching id-registry.yaml.
  - Modes: --gap-index, --gaps-from + --limit, --dry-run.
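
A minimal sketch of how such an ID allocator could work, assuming IDs of
the form <track>-<number> (as in cloud-4579) and files named <id>.yaml or
<id>.yaml.draft somewhere under the vault root. The function name
allocate_ids, its parameters, and the "highest number plus one" policy
are assumptions for illustration, not the script's actual code.

```python
import re
from pathlib import Path


def allocate_ids(vault_root: Path, track: str, count: int) -> list[str]:
    """Return `count` unused IDs for `track` (hypothetical helper).

    Scans published questions (*.yaml) and pending drafts (*.yaml.draft)
    so that a batch never reuses a number, without reading or writing
    id-registry.yaml.
    """
    pattern = re.compile(rf"^{re.escape(track)}-(\d+)\.yaml(\.draft)?$")
    highest = 0
    for path in vault_root.rglob(f"{track}-*"):
        match = pattern.match(path.name)
        if match:
            highest = max(highest, int(match.group(1)))
    # Assumed policy: continue numbering after the highest ID seen.
    return [f"{track}-{highest + offset}" for offset in range(1, count + 1)]
```

Because drafts are included in the scan, a second run started before any
draft is promoted would still continue from the highest number on disk.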

validate_drafts.py (3.b):
  - Five gates per draft: schema (Pydantic), originality (top cosine
    similarity to in-bucket neighbours, embedded with
    BAAI/bge-small-en-v1.5, the model behind the corpus embeddings.npz,
    so values are comparable; cutoff 0.92), level_fit (Gemini-judge
    against same-level exemplars), coherence (Gemini-judge:
    scenario/question/solution consistency), and bridge (Gemini-judge:
    chain-fit between the gap's two anchors).
  - Final verdict is pass iff every non-skipped gate passes.
  - Skips: --no-originality, --no-llm-judge.
  - Output: interviews/vault/draft-validation-scorecard.json.
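
The originality gate and the final verdict rule described above can be
sketched as follows. This is an illustration under stated assumptions:
the embedding vectors are taken as plain lists of floats (the real script
would obtain them from the embedding model), the gate result is a small
dict with a "status" field, and a similarity exactly equal to the cutoff
is treated as a failure. None of these names or shapes are confirmed by
the commit.

```python
import math

ORIGINALITY_CUTOFF = 0.92  # cutoff stated in the commit message


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def originality_gate(draft_vec, neighbour_vecs, cutoff=ORIGINALITY_CUTOFF):
    """Fail when the closest in-bucket neighbour reaches the cutoff."""
    top = max((cosine(draft_vec, v) for v in neighbour_vecs), default=0.0)
    status = "pass" if top < cutoff else "fail"  # assumed: equality fails
    return {"gate": "originality", "status": status, "top_cosine": top}


def final_verdict(gates: list[dict]) -> str:
    """Pass iff every non-skipped gate passed (vacuously true if all skipped)."""
    ran = [g for g in gates if g["status"] != "skipped"]
    return "pass" if all(g["status"] == "pass" for g in ran) else "fail"
```

With --no-llm-judge, the three Gemini-judge gates would be recorded as
skipped in this sketch, so the verdict would rest on schema and
originality alone.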

Smoke checks:
  - 3.a --dry-run --gap-index 0: resolves gap, builds prompt, allocates
    cloud-4579. Synthetic Gemini response Pydantic-validates clean.
  - 3.b on a synthetic /tmp draft: schema + originality pass (top
    neighbour cosine 0.73 vs 0.92 threshold).

Phase 3.c (pilot run on 30 gaps) deferred: it generates new YAML
question content that needs human review before promotion. The
tooling ships ready; running it is a user-supervised step.

CHAIN_ROADMAP.md Progress Log + Phase 3 status updated.
2026-05-01 11:31:06 -04:00