mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-22 14:03:46 -05:00
feat(vault): rebuild chains.json via Gemini 3.1 Pro Preview — 373 curated chains
Replaced the 726 author-curated chains with 373 LLM-curated chains
generated bucket-by-bucket within (track, topic). Gemini was prompted
with the strict-progression + multi-chain constraints we agreed on:
- Δ ∈ {1, 2} between consecutive members (prefer +1)
- A question may belong to at most 2 chains, and only if it is an L1/L2 anchor
- Single-topic, 2-6 members, no Δ=0 same-level pairs
- Validated structurally on apply — vault check --strict passes
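The constraints above can be sketched as a small structural check. This is an illustration only, not the code behind vault check --strict: the chain shape (a list of dicts with "id", "topic", and "level") and the function names are hypothetical, and it assumes "L1/L2 anchors" means questions at level 1 or 2.

```python
from collections import Counter

ALLOWED_DELTAS = {1, 2}       # level step between consecutive members
MIN_SIZE, MAX_SIZE = 2, 6     # chain length bounds
ANCHOR_LEVELS = {1, 2}        # assumption: "L1/L2 anchors" are levels 1 and 2
MAX_CHAINS_PER_ANCHOR = 2


def chain_errors(chain):
    """Return rule violations for one chain (hypothetical data shape)."""
    errors = []
    if not MIN_SIZE <= len(chain) <= MAX_SIZE:
        errors.append(f"size {len(chain)} outside {MIN_SIZE}-{MAX_SIZE}")
    if len({q["topic"] for q in chain}) > 1:
        errors.append("chain spans more than one topic")
    # Every consecutive pair must step up by exactly 1 or 2 levels,
    # which also rules out same-level (delta 0) pairs.
    for prev, cur in zip(chain, chain[1:]):
        delta = cur["level"] - prev["level"]
        if delta not in ALLOWED_DELTAS:
            errors.append(f"{prev['id']} -> {cur['id']}: delta {delta}")
    return errors


def membership_errors(chains):
    """Check the multi-chain rule across a whole set of chains."""
    counts = Counter()
    levels = {}
    for chain in chains:
        for q in chain:
            counts[q["id"]] += 1
            levels[q["id"]] = q["level"]
    errors = []
    for qid, n in counts.items():
        limit = MAX_CHAINS_PER_ANCHOR if levels[qid] in ANCHOR_LEVELS else 1
        if n > limit:
            errors.append(f"{qid} appears in {n} chains (limit {limit})")
    return errors
```

Returning a list of messages rather than raising on the first failure makes it easier to report every problem in a bucket at once.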
Sweep stats:
- 44 calls to gemini-3.1-pro-preview (well under 250/day cap)
- 313 (track, topic) buckets processed in ~80 minutes
- 373 chains accepted (51% of legacy count, much higher per-chain
quality after strict filter)
- Level-Δ distribution: 949 strict +1 (93%), 73 +2 (7%); none at Δ=0 or Δ≥3
- Chain sizes: 26 size-2, 141 size-3, 128 size-4, 60 size-5, 18 size-6
- 1,395 questions in chains (15% of corpus, vs ~20% before)
- 54 of ~87 topics have at least 1 chain
- 138 corpus gaps identified (gaps.proposed.json) — missing-rung
questions that would complete chains; feeds future authoring pass
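The sweep stats are internally consistent, which can be checked from the chain-size histogram alone. The sketch below recomputes the totals; it assumes the "15% of corpus" figure is taken against the questionCount of 9438 in the manifest and that the 1,395 figure counts chain memberships.

```python
# Chain-size histogram as reported above: {chain size: number of chains}
size_counts = {2: 26, 3: 141, 4: 128, 5: 60, 6: 18}

chains = sum(size_counts.values())
# Each chain of size n contributes n memberships and n - 1 consecutive pairs.
memberships = sum(size * n for size, n in size_counts.items())
edges = sum((size - 1) * n for size, n in size_counts.items())

legacy_chains = 726
corpus_size = 9438  # assumption: questionCount from the manifest

print(chains)                                   # 373
print(memberships)                              # 1395
print(edges)                                    # 1022, i.e. 949 + 73
print(round(100 * chains / legacy_chains))      # 51
print(round(100 * 949 / edges))                 # 93
print(round(100 * memberships / corpus_size))   # 15
```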
Why fewer chains than before is fine:
- Old chains had a long tail of pairs with cosine similarity below 0.65
(worse than random same-bucket pairs). LLM curation rejects those.
- We trade quantity for pedagogical coherence.
- The 138 gaps capture what the old chains left implicit by pairing
questions that shouldn't have been paired; we make it explicit.
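For reference, a minimal sketch of how weak links in a chain could be flagged, reading the 0.65 figure as cosine similarity between embeddings of consecutive questions. That reading, the embedding vectors, and the function names are assumptions for illustration; the commit does not say how the similarity was computed.

```python
import math

THRESHOLD = 0.65  # cutoff quoted in the commit message


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def weak_links(vectors, threshold=THRESHOLD):
    """Return each index i where the pair (i, i + 1) falls below the threshold."""
    pairs = zip(vectors, vectors[1:])
    return [i for i, (u, v) in enumerate(pairs) if cosine(u, v) < threshold]
```

A chain whose consecutive members all clear the threshold returns an empty list; any returned index marks a pair that a stricter curation pass would likely split.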
Files:
- chains.json — applied (was backed up to chains.json.bak by
apply_proposed_chains.py)
- chains.proposed.json — kept for review/audit
- gaps.proposed.json — authoring backlog
- vault-manifest.json + corpus-summary.json — regenerated
- corpus.json — gitignored (CI regenerates)
Validation: vault check --strict 0 failures, vault build clean,
playwright UI suite 13/13 pass.
File diff suppressed because one or more lines are too long
@@ -1,11 +1,11 @@
 {
   "releaseId": "dev",
-  "releaseHash": "ba3a0a7458d65073afc3f6381d86455fd526bd531106f258b108119bd79f8ca1",
+  "releaseHash": "937fb2c9db19d8a311a25f4fe6657371a18d920ce4d666784055b988e96bee41",
   "schemaVersion": "1",
   "policyVersion": "1",
-  "buildDate": "2026-04-30T12:45:57Z",
+  "buildDate": "2026-04-30T19:14:49Z",
   "questionCount": 9438,
-  "chainCount": 726,
+  "chainCount": 373,
   "conceptCount": 87,
   "trackDistribution": {
     "cloud": 4028,
File diff suppressed because it is too large
14243
interviews/vault/chains.proposed.json
Normal file
File diff suppressed because it is too large
1406
interviews/vault/gaps.proposed.json
Normal file
File diff suppressed because it is too large