github-starred/cs249r_book

mirror of https://github.com/harvard-edge/cs249r_book.git synced 2026-05-10 15:49:25 -05:00

Commit Graph

Author	SHA1	Message	Date

Vijay Janapa Reddi

2b3cf5e1da

chore(vault): consolidate AI pipeline artifacts under _pipeline/

Establishes one ignored subdirectory for ALL intermediate outputs of
LLM-driven tooling (chain proposals, gap detection, draft scorecards,
audit traces). Single gitignore rule: /_pipeline/.

Convention is documented in interviews/vault/README.md under "Pipeline
artifacts" — it's a real project layout convention, not AI-specific
config.

Path migration:
  interviews/vault/chains.proposed*.json
                  → _pipeline/chains.proposed*.json
  interviews/vault/gaps.proposed*.json
                  → _pipeline/gaps.proposed*.json
  interviews/vault/draft-validation-scorecard.json
                  → _pipeline/draft-validation-scorecard.json
  interviews/vault/audit-runs/
                  → _pipeline/runs/

8 scripts updated to define a PIPELINE_DIR constant and route default
outputs through it: build_chains_with_gemini.py,
apply_proposed_chains.py, merge_chain_passes.py, validate_drafts.py,
audit_chains_with_gemini.py, generate_question_for_gap.py,
summarize_proposed_chains.py, promote_drafts.py.
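
A minimal sketch of what "define a PIPELINE_DIR constant and route default outputs through it" could look like in one script. The relative location of `_pipeline`, the helper `default_output`, and the `--out` flag are assumptions made for illustration; the commit message does not show the actual code.

```python
import argparse
from pathlib import Path

# Assumption: the real scripts may anchor this path differently
# (for example, relative to the vault directory).
PIPELINE_DIR = Path("_pipeline")


def default_output(name: str, base: Path = PIPELINE_DIR) -> Path:
    """Return the default path for an intermediate artifact under base."""
    return base / name


def parse_args(argv=None):
    """Parse CLI arguments; --out is a hypothetical flag name."""
    parser = argparse.ArgumentParser(description="illustrative script skeleton")
    parser.add_argument(
        "--out",
        type=Path,
        default=default_output("chains.proposed.json"),
        help="where to write proposed chains (defaults under _pipeline/)",
    )
    return parser.parse_args(argv)
```

With this shape, a caller who passes no flag writes into the ignored directory, while an explicit `--out some/other/path.json` still overrides the default.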

Forward-looking docs (README.md chain-pipeline section + CHAIN_ROADMAP.md
resume instructions + state snapshot) updated to reference the new
paths. Historical Progress Log entries left as-is — they accurately
describe what was committed at the time.

Drive-by .gitignore fixes (both used full repo-relative paths under
package-local .gitignore files, which never matched):
  interviews/vault-cli/.gitignore: scripts/.calibration_cache/
  interviews/vault/.gitignore:     /embeddings.npz

Validation:
  - vault check --strict: 10,705 loaded, 0 invariant failures
  - pytest interviews/vault-cli/tests/: 74/74
  - audit --dry-run: paths resolve correctly to _pipeline/runs/<ts>/

No durable corpus content moves. chains.json (live registry),
id-registry.yaml, questions/, etc. all stay where they were.

2026-05-02 09:04:55 -04:00

Vijay Janapa Reddi

83fe0f7193

feat(vault): Phase 1 — second-pass chain coverage build (373 → 879)

Diagnoses uncovered (track, topic) buckets and runs a relaxed Gemini
sweep targeting them. New chains tier="secondary"; pre-existing chains
backfilled tier="primary".

Tools (Phases 1.1, 1.2/1.3, 1.5):
  - diagnose_chain_coverage.py: surface buckets with no chains
    (committed earlier on yaml-audit)
  - build_chains_with_gemini.py: --mode lenient adds Δ ∈ {0,1,2,3}
    (committed earlier on yaml-audit)
  - merge_chain_passes.py: merges primary + secondary, enforces the
    multi-membership cap (max 2 chains/qid; non-L1/L2 capped at 1)
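
The multi-membership cap can be sketched as follows. This is one possible reading, not the logic of merge_chain_passes.py itself: it assumes each question has a level label such as "L1", that L1/L2 questions may appear in at most 2 chains and all others in at most 1, and that a secondary chain violating the cap is dropped whole. The function names and data shapes are hypothetical.

```python
from collections import Counter


def membership_cap(level: str) -> int:
    """Assumed cap: L1/L2 questions may sit in 2 chains, all others in 1."""
    return 2 if level in {"L1", "L2"} else 1


def merge_with_cap(primary, secondary, levels):
    """Keep every primary chain, then admit secondary chains that respect the cap.

    primary, secondary: lists of chains, each a list of question ids.
    levels: dict mapping question id -> level label such as "L1".
    """
    counts = Counter(qid for chain in primary for qid in chain)
    merged = [list(chain) for chain in primary]
    for chain in secondary:
        # Admit the chain only if no member would exceed its cap.
        if all(counts[qid] + 1 <= membership_cap(levels[qid]) for qid in chain):
            merged.append(list(chain))
            counts.update(chain)
    return merged


levels = {"a": "L1", "b": "L3", "c": "L2", "d": "L4"}
merged = merge_with_cap([["a", "b"]], [["a", "c"], ["b", "d"], ["a", "d"]], levels)
print(merged)  # [['a', 'b'], ['a', 'c']]
```

Here `["b", "d"]` is rejected because `b` is not L1/L2 and already belongs to a chain, and `["a", "d"]` is rejected because `a` has reached its limit of two.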

Sweep (Phase 1.4):
  - 17 Gemini-3.1-pro-preview calls, ~22 min wall time, 211 buckets
  - 506 chains accepted (above the 200-400 estimate), 269 new gaps
  - validator caught a few cross-bucket and Δ=4 hallucinations inline
  - Δ distribution: Δ=1 69.1%, Δ=2 21.1%, Δ=3 4.6%, Δ=0 5.2%
    (10.9% of chains contain at least one Δ=0 — within target band)
  - random spot-check of 5 Δ=0 chains: all share scenario threads
    (DMA, CMSIS-NN, on-device routing, PB-scale pipelines)
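
The inline validation mentioned above (rejecting cross-bucket chains and Δ=4 links) might be sketched like this. The commit message does not define Δ; this sketch assumes it is the level difference between consecutive questions in a chain, and the dict keys `id`, `track`, `topic`, and `level` are hypothetical.

```python
LENIENT_DELTAS = {0, 1, 2, 3}  # from "--mode lenient adds Δ ∈ {0,1,2,3}"


def level_number(level: str) -> int:
    """Convert a label like 'L3' to 3 (assumed label format)."""
    return int(level.lstrip("L"))


def validate_chain(chain, allowed=LENIENT_DELTAS):
    """Return a list of problems; an empty list means the chain passes."""
    problems = []
    # All questions in a chain are assumed to share one (track, topic) bucket.
    buckets = {(q["track"], q["topic"]) for q in chain}
    if len(buckets) > 1:
        problems.append("cross-bucket")
    # Check the level step between each pair of consecutive questions.
    for prev, nxt in zip(chain, chain[1:]):
        delta = level_number(nxt["level"]) - level_number(prev["level"])
        if delta not in allowed:
            problems.append(f"delta={delta} between {prev['id']} and {nxt['id']}")
    return problems


def q(qid, level, track="edge", topic="dma"):
    """Build a toy question record for the examples below."""
    return {"id": qid, "track": track, "topic": topic, "level": level}


print(validate_chain([q("q1", "L1"), q("q2", "L2"), q("q3", "L2")]))  # []
print(validate_chain([q("q1", "L1"), q("q2", "L5")]))
# ['delta=4 between q1 and q2']
```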

Coverage gains (chains/topic before → after):
  - cloud   2.95 → 4.37   (242 + 116 secondary)
  - edge    0.64 → 2.59   ( 49 + 148 secondary)
  - mobile  0.74 → 2.56   ( 46 + 113 secondary)
  - tinyml  0.80 → 2.64   ( 36 +  83 secondary)
  - global  0.00 → 0.96   (  0 +  46 secondary)
  Buckets with ≥1 chain: 102 / 313 (33%) → 285 / 313 (91%).
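
As a quick consistency check, the per-track counts above can be summed to confirm the headline figures (373 primary, 506 accepted secondary, 879 total) and the bucket percentages. The numbers are copied from the list above; the variable names are illustrative only.

```python
# (primary, secondary) chain counts per track, copied from the list above.
counts = {
    "cloud": (242, 116),
    "edge": (49, 148),
    "mobile": (46, 113),
    "tinyml": (36, 83),
    "global": (0, 46),
}

primary_total = sum(p for p, _ in counts.values())
secondary_total = sum(s for _, s in counts.values())
total = primary_total + secondary_total


def pct(covered: int, buckets: int = 313) -> int:
    """Share of buckets with at least one chain, rounded to a whole percent."""
    return round(100 * covered / buckets)


print(primary_total, secondary_total, total)  # 373 506 879
print(pct(102), pct(285))  # 33 91
```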

Validation:
  - apply_proposed_chains.py --dry-run: validation clean (879 chains)
  - vault check --strict: 10,701 loaded, 0 invariant failures
  - vault build --legacy-json: chainCount 373 → 879, release_hash
    rolled to 04ee8a23…
  - playwright chain-and-vault-smoke.mjs: 13/13 pass

Phase 1 complete. Next: Phase 2 (tier surfacing in staffml UI).

2026-04-30 20:12:27 -04:00

2 Commits