mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-22 14:03:46 -05:00
docs(vault): document v1.1 sidecar + hierarchy + tier model

Phase 4.8 of CHAIN_ROADMAP.md.

ARCHITECTURE.md gains a new §3.6 capturing the three deltas that landed
during the chain workstream — additive to v1, not replacements:

- hierarchical question layout (`<track>/<area>/<id>.yaml`)
- sidecar chain architecture (chains.json authoritative; YAML chains:
  field retired)
- chain tier model (primary/secondary, default-primary on read)

README.md updates:

- status line: v1.1, points at CHAIN_ROADMAP.md and ARCHITECTURE.md §3.6
- new "Chain build pipeline" section with the diagnose / build /
  apply / merge invocations
- layout listing reflects scripts/ and the actual src/ contents
  (was stuck on Phase 0 scaffolding shape)

No code changes. The v1 release-pipeline invariants absorb the v1.1
deltas without modification (chains.json is a Merkle leaf; tier flows
into that leaf transparently).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -2,8 +2,11 @@
 
 Authoring, building, and releasing the StaffML question vault.
 
-> **Status**: Phase 0 scaffolding. Subcommands land in Phase 1+ per
-> [`ARCHITECTURE.md`](../vault/ARCHITECTURE.md) §14.
+> **Status**: v1.1 — sidecar chain architecture + tier-aware UI in place.
+> Chain corpus growth tracked in
+> [`docs/CHAIN_ROADMAP.md`](docs/CHAIN_ROADMAP.md); design baseline is
+> [`../vault/ARCHITECTURE.md`](../vault/ARCHITECTURE.md) (§3.6 captures
+> the v1.1 deltas).
 
 ## Install (local editable)
 
@@ -69,6 +72,41 @@ All commands support `--json` for machine-readable output per
 [`docs/JSON_OUTPUT.md`](docs/JSON_OUTPUT.md). Exit codes are stable per
 [`docs/EXIT_CODES.md`](docs/EXIT_CODES.md).
 
+## Chain build pipeline (v1.1+)
+
+Chains are pedagogical progressions through Bloom levels (L1→L6+) within
+one (track, topic) bucket. `interviews/vault/chains.json` is the
+authoritative registry; YAMLs no longer carry a `chains:` field. The
+build tooling lives in `scripts/`:
+
+```bash
+# 1. Surface (track, topic) buckets that need chains. Writes
+#    interviews/vault/chain-coverage.json (gitignored — regeneratable).
+python3 scripts/diagnose_chain_coverage.py
+
+# 2. Strict pass: Δ ∈ {1, 2}, primary chains. Default mode.
+python3 scripts/build_chains_with_gemini.py --all \
+    --output ../vault/chains.proposed.json
+
+# 3. Lenient pass: Δ ∈ {0, 1, 2, 3}, secondary chains.
+#    Use --buckets-from to scope the run to uncovered buckets only.
+python3 scripts/build_chains_with_gemini.py --mode lenient \
+    --buckets-from ../vault/chain-coverage.json \
+    --output ../vault/chains.proposed.lenient.json
+
+# 4. Apply a single proposed file (replaces chains.json after validation).
+python3 scripts/apply_proposed_chains.py --proposed ../vault/chains.proposed.json
+
+# 5. Merge primary + secondary into chains.json with cap enforcement
+#    (each qid in ≤ 2 chains; non-L1/L2 qids capped at 1 membership).
+python3 scripts/merge_chain_passes.py
+```
+
+Both `apply_proposed_chains.py` and the validator tolerate a missing
+`tier` field on chain entries (defaulting to "primary"); chains
+produced by `--mode lenient` are tagged `tier: "secondary"`. After any
+change, run `vault check --strict` and `vault build --legacy-json`.
+
 ## Run tests
 
 ```bash
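The strict and lenient passes in the hunk above differ in which Bloom-level steps Δ they accept. The sketch below illustrates that rule under two assumptions: Δ is the level difference between consecutive questions in a chain, and levels are plain integers. `ALLOWED_DELTAS` and `is_valid_progression` are hypothetical names, not taken from `build_chains_with_gemini.py`.

```python
# Hypothetical helper, not the repository's implementation.
ALLOWED_DELTAS = {
    "strict": {1, 2},         # primary chains
    "lenient": {0, 1, 2, 3},  # secondary chains
}


def is_valid_progression(levels: list[int], mode: str = "strict") -> bool:
    """Return True if every consecutive Bloom-level step is allowed in `mode`."""
    allowed = ALLOWED_DELTAS[mode]
    return all(b - a in allowed for a, b in zip(levels, levels[1:]))


print(is_valid_progression([1, 2, 4], "strict"))   # True: steps are 1 and 2
print(is_valid_progression([1, 1, 4], "strict"))   # False: steps are 0 and 3
print(is_valid_progression([1, 1, 4], "lenient"))  # True
```

A chain that fails the strict check but passes the lenient one is the kind of candidate the second pass would tag as secondary.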
@@ -81,18 +119,27 @@ pytest interviews/vault-cli/tests/
 ```
 vault-cli/
 ├── pyproject.toml
-├── README.md                # this file
+├── README.md                        # this file
 ├── docs/
-│   ├── EXIT_CODES.md        # stable exit-code taxonomy
-│   ├── JSON_OUTPUT.md       # per-command --json schemas
-│   └── CUTOVER_QA.md        # manual cutover QA checklist
-├── src/vault_cli/
+│   ├── CHAIN_ROADMAP.md             # resumable chain-coverage workstream
+│   ├── EXIT_CODES.md                # stable exit-code taxonomy
+│   ├── JSON_OUTPUT.md               # per-command --json schemas
+│   └── CUTOVER_QA.md                # manual cutover QA checklist
+├── src/vault_cli/                   # Typer app + library
 │   ├── __init__.py
 │   ├── _version.py
 │   ├── exit_codes.py
-│   └── main.py              # Typer app entry
-└── tests/
-    └── test_smoke.py        # Phase 0 smoke tests
+│   ├── compiler.py / loader.py / yaml_io.py
+│   ├── legacy_export.py             # corpus.json + chain_tiers emitter
+│   ├── policy.py                    # release-policy filter
+│   ├── validator.py                 # fast / structural / slow tiers
+│   └── main.py                      # Typer app entry
+├── scripts/                         # ops + Gemini-powered tools
+│   ├── diagnose_chain_coverage.py   # surface uncovered buckets
+│   ├── build_chains_with_gemini.py  # --mode {strict,lenient}
+│   ├── apply_proposed_chains.py     # gate proposed chains.json
+│   ├── merge_chain_passes.py        # primary + secondary, cap-enforced
+│   ├── summarize_proposed_chains.py # quick-read review
+│   └── ...                          # auditing, calibration, D1 emit, etc.
+└── tests/                           # pytest suite (74 tests today)
 ```
 
 ## Architecture
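The README describes `merge_chain_passes.py` as cap-enforced: each qid may sit in at most two chains, and qids outside L1/L2 in at most one. The sketch below shows one way to detect violations of those caps. The chain shape (`{"id", "questions"}`), the `levels` mapping, and the name `cap_violations` are all assumptions for illustration, and the real script may resolve conflicts rather than merely report them.

```python
from collections import Counter


def cap_violations(chains: list[dict], levels: dict[str, int]) -> dict[str, int]:
    """Map each over-cap question id to its chain-membership count.

    Assumed (hypothetical) shapes: each chain is {"id": str, "questions": [qid, ...]}
    and `levels` maps qid -> Bloom level as an integer.
    """
    counts = Counter(qid for chain in chains for qid in chain["questions"])
    violations = {}
    for qid, n in counts.items():
        cap = 2 if levels[qid] in (1, 2) else 1  # L1/L2 may appear twice
        if n > cap:
            violations[qid] = n
    return violations


chains = [
    {"id": "c1", "tier": "primary", "questions": ["q1", "q2", "q3"]},
    {"id": "c2", "tier": "secondary", "questions": ["q1", "q3", "q4"]},
]
levels = {"q1": 1, "q2": 2, "q3": 3, "q4": 5}
print(cap_violations(chains, levels))  # {'q3': 2}
```

Here `q1` is an L1 question and may appear in both chains, while `q3` is L3 and exceeds its cap of one.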
@@ -348,6 +348,55 @@ release_hash = sha256(b"\n".join(f"{id}:{h}".encode() for id, h in leaves))
 
 ---
 
+### 3.6 v1.1 architecture updates (post-Phase-1/2 — chain build)
+
+After the v1.0 design doc above was written, three deltas landed during
+the corpus growth workstream tracked in
+[`vault-cli/docs/CHAIN_ROADMAP.md`](../vault-cli/docs/CHAIN_ROADMAP.md).
+They are additive to the v1 invariants, not replacements:
+
+**1. Hierarchical question layout.** Questions live at
+`interviews/vault/questions/<track>/<area>/<id>.yaml` (the v1 design
+above sketched `<track>/L<N>/<zone>/`; the actual landed layout is
+`<track>/<area>/`). The hierarchy is a build-time concern — `corpus.json`
+is path-agnostic, so site/runtime code is unaffected. `vault check
+--strict` enforces a path-vs-body invariant: the directory shards
+(track/area) must match the YAML body's `track`/`competency_area` fields.
+
+**2. Sidecar chain architecture.** `interviews/vault/chains.json` is the
+authoritative chain registry. Per-question YAMLs no longer carry a
+`chains:` field — that field was retired in v1.1 to keep chain rebuilds
+to a single-file edit instead of touching 2k+ YAMLs. The exporter
+(`vault_cli.legacy_export._build_chain_index`) joins YAML + chains.json
+to produce per-question `chain_ids` / `chain_positions` in the runtime
+`corpus.json`. §3.3's chain-reference rules still hold for the
+**runtime artifact** (multi-chain membership, position monotonic, etc.);
+they no longer apply to YAML source.
+
+**3. Chain tier model.** Each entry in `chains.json` carries a
+`tier: "primary" | "secondary"` field:
+- **primary** — the strict Bloom-progression sweep (Δ ∈ {1, 2}).
+  Rendered by default in practice/explore.
+- **secondary** — the lenient second-pass coverage sweep
+  (Δ ∈ {0, 1, 2, 3}; Δ=0 only for shared-scenario pairs). Reachable
+  via `?chain=<id>` URL deep-links and the "more paths" UI; the
+  `ChainBadge` shows an inline "alt path" pill when rendering one.
+- The legacy exporter emits `chain_tiers` per question alongside
+  `chain_positions`. Missing tier defaults to "primary" everywhere
+  on read (validator + TS runtime + UI), which keeps the v1.0
+  chains.json shape forward-compatible.
+
+Tooling that produced these: `diagnose_chain_coverage.py`,
+`build_chains_with_gemini.py` (with `--mode {strict,lenient}`),
+`merge_chain_passes.py`. See the README's "Chain build pipeline"
+section for invocation, and CHAIN_ROADMAP.md for the running log.
+
+The v1 release-pipeline invariants (§3.5 hashing, §5 validators)
+absorb these without modification — `chains.json` is a Merkle leaf,
+and the new `tier` field flows into that leaf transparently.
+
+---
+
 ## 4. CLI Specification (v2)
 
 Framework: **Typer** (declarative, type-hint-driven). Output: **Rich** (tables, progress, panels).
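The path-vs-body invariant in §3.6 item 1 can be illustrated with a short check. This is only a sketch: it assumes the YAML body has already been parsed into a dict, the function name `path_matches_body` is hypothetical, and the track and area values in the example are made up. It is not the logic `vault check --strict` actually runs.

```python
from pathlib import PurePosixPath


def path_matches_body(path: PurePosixPath, body: dict) -> bool:
    """Check that .../<track>/<area>/<id>.yaml agrees with the parsed YAML body."""
    track, area = path.parts[-3], path.parts[-2]
    return track == body.get("track") and area == body.get("competency_area")


# Illustrative values only; real track and area names may differ.
question_path = PurePosixPath(
    "interviews/vault/questions/cloud/networking/q-0001.yaml"
)
print(path_matches_body(question_path, {"track": "cloud", "competency_area": "networking"}))  # True
print(path_matches_body(question_path, {"track": "edge", "competency_area": "networking"}))   # False
```

Because `corpus.json` is path-agnostic, a check of this kind only needs to run at build time.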
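Items 2 and 3 of §3.6 describe a join from the sidecar registry to per-question fields, with a missing `tier` read as "primary". The sketch below shows the general shape of such a join. It is not the real `_build_chain_index`: the chain entry shape, the zero-based positions, and the use of dicts keyed by chain id for `chain_positions` and `chain_tiers` are all assumptions.

```python
def sketch_chain_index(chains: list[dict]) -> dict[str, dict]:
    """Build per-question chain_ids / chain_positions / chain_tiers from chain entries.

    Assumed (hypothetical) entry shape:
    {"id": str, "tier": optional str, "questions": [qid, ...]}
    """
    index: dict[str, dict] = {}
    for chain in chains:
        tier = chain.get("tier", "primary")  # missing tier reads as primary
        for position, qid in enumerate(chain["questions"]):
            entry = index.setdefault(
                qid, {"chain_ids": [], "chain_positions": {}, "chain_tiers": {}}
            )
            entry["chain_ids"].append(chain["id"])
            entry["chain_positions"][chain["id"]] = position
            entry["chain_tiers"][chain["id"]] = tier
    return index


registry = [
    {"id": "c1", "questions": ["q1", "q2"]},                       # v1.0 shape, no tier
    {"id": "c2", "tier": "secondary", "questions": ["q2", "q3"]},
]
index = sketch_chain_index(registry)
print(index["q2"]["chain_ids"])    # ['c1', 'c2']
print(index["q2"]["chain_tiers"])  # {'c1': 'primary', 'c2': 'secondary'}
```

Defaulting at read time is what lets a v1.0-shaped entry such as `c1` pass through unchanged.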