mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-07 02:03:55 -05:00
dev
2 Commits
Commit 20ea20005c
feat(vault): release-readiness final pass — E.2 + E.3 + F.4/F.5 + CHANGELOG
Closes the release-readiness push. All 8 gates green: vault check,
lint, doctor, codegen, validate-vault, render, tsc, Playwright.
Bundle: 9,775 → 9,781 published.
E.2 — Auto-emit vault-manifest.json from `vault build --legacy-json`:
Added `emit_manifest()` to `legacy_export.py` and wired it into
`commands/build.py` after the legacy corpus emission. The manifest
is now derived deterministically from the same `loaded` set that
produced corpus.json — track + level distributions, contentHash,
counts. Eliminates the recurring stale-manifest pre-commit failure
that had to be patched by hand twice during this push.
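A minimal sketch of the idea behind E.2, not the actual `emit_manifest()` in `legacy_export.py`: derive the manifest purely from the in-memory item set, so that rebuilding from the same items always yields the same file. The item schema, the `build_manifest` name, and the use of SHA-256 truncated to 12 hex characters are all assumptions made for illustration.

```python
import hashlib
import json
from collections import Counter


def build_manifest(items):
    """Derive a manifest from already-loaded items (hypothetical schema)."""
    # Sort by id so the result does not depend on load order.
    ordered = sorted(items, key=lambda it: it["id"])
    # Canonical JSON: sorted keys and fixed separators give a stable byte string.
    canonical = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {
        "count": len(ordered),
        "tracks": dict(sorted(Counter(it["track"] for it in ordered).items())),
        "levels": dict(sorted(Counter(it["level"] for it in ordered).items())),
        # Assumption: a short hash prefix, chosen here only for readability.
        "contentHash": digest[:12],
    }


# Hypothetical items standing in for the `loaded` set.
loaded = [
    {"id": "tinyml-0002", "track": "tinyml", "level": "L5"},
    {"id": "edge-0001", "track": "edge", "level": "L4"},
    {"id": "tinyml-0001", "track": "tinyml", "level": "L4"},
]
manifest = build_manifest(loaded)
print(json.dumps(manifest, indent=2))
```

Because the counts and the hash come from the same sorted list, the manifest cannot drift from the corpus it describes, which is the property the stale-manifest failure was missing.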
E.3 — `--include-areas` flag in analyze_coverage_gaps.py:
Injects forced area-targeted cells into the recommended_plan for
each listed competency_area (parallelism, networking, etc.). For
each (track, area) where area is in the include list, adds 1 cell
per (canonical-topic × {L4, L5, L6+}) zone. Closes the structural
mismatch where topic-priority ranking misses area-level gaps.
Tested with `--include-areas parallelism`: plan now includes 21
parallelism-topic cells (was 0 in stock plan).
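The expansion described for E.3 can be sketched as a cross product. This is an illustration only: `forced_area_cells`, the cell dictionary layout, and the topic names are hypothetical, and the real analyzer may order or deduplicate cells differently.

```python
from itertools import product

LEVELS = ("L4", "L5", "L6+")


def forced_area_cells(tracks, include_areas, topics_by_area):
    """Return one forced cell per (track, area, canonical topic, level)."""
    cells = []
    for track, area in product(tracks, include_areas):
        # Areas with no known topics contribute nothing.
        for topic, level in product(topics_by_area.get(area, ()), LEVELS):
            cells.append(
                {"track": track, "area": area, "topic": topic, "level": level}
            )
    return cells


# Hypothetical topic list for one competency area.
topics_by_area = {"parallelism": ["data-parallelism", "pipeline-parallelism"]}
cells = forced_area_cells(["edge", "tinyml"], ["parallelism"], topics_by_area)
print(len(cells))  # 2 tracks x 2 topics x 3 levels
```

Cells produced this way are added regardless of topic-priority rank, which is how an area that never reaches the top of the ranking still ends up in the plan.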
F.4 — Third-pass fix-agent on 10 residuals (4 NEEDS_FIX + 6 DROP from
F.1). Substantial rewrites; 0 archived. Major math corrections:
- mobile-1948: KV cache reconstructed (96 MB / 2048 = 48 KB/token)
- tinyml-1681: cycle-model with proper register spill (5912 → 7912)
- tinyml-1716: serialization on single-core M4 (12 ms not 10 ms)
- tinyml-1634: Young/Daly hours-conversion (139 s, not 2.31 s)
- tinyml-1723: triple-buffer SRAM (43.5 KB → 19.5 KB)
- edge-2401: log2(18) = 4.17 (was 3.6)
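Two of the corrections above can be checked directly. The snippet assumes 1 MB = 1024 KB and reads the 2048 divisor as a token count; the other items depend on inputs not shown in this message, so they are not reproduced here.

```python
import math

# mobile-1948: KV cache size divided by context length.
kv_cache_kb = 96 * 1024  # 96 MB expressed in KB (binary units assumed)
tokens = 2048
per_token_kb = kv_cache_kb / tokens

# edge-2401: base-2 logarithm of 18.
bits = math.log2(18)

print(per_token_kb, round(bits, 2))
```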
F.5 — Re-judge: 6 PASS / 2 NEEDS_FIX / 2 DROP (60% pass rate). 6 more
promoted. The 2 still-NEEDS_FIX + 2 DROP after THREE rewrite
passes are documented as genuinely-stubborn carry-forwards.
G.1 — Cloud parallelism spot-check: 12 stratified items reviewed,
0 issues. Cloud's 326 parallelism items are still high-quality.
G.2 — CHANGELOG.md updated with comprehensive [0.1.2-dev] entry:
schema changes, new validators, tooling additions, content
additions, three documented lessons (validate-at-data-boundary,
prompt-specificity-beats-budget, topic-priority-misses-area-gaps).
Cumulative recovery rate of NEEDS_FIX/DROP items via layered fix-
agents (Phase C + F.2 + F.4): 63 of 120 = 53%. The remaining 57 split
between DROP (genuinely unrecoverable) and items still in NEEDS_FIX
state (deferred to future passes).
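The percentages quoted above follow from simple ratios. A small check, with `rate` as a hypothetical helper name:

```python
def rate(passed, total):
    """Pass rate as a percentage, rounded to one decimal place."""
    return round(100 * passed / total, 1)


cumulative = rate(63, 120)  # layered fix-agent recovery; quoted as 53% above
rejudge = rate(6, 10)       # F.5 re-judge
remaining = 120 - 63        # items still DROP or NEEDS_FIX

print(cumulative, rejudge, remaining)
```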
Final cumulative state of branch:
- Bundle: 9,224 → 9,781 published (+557 net)
- Lint warnings: 1,308+ → 0
- Doctor fails: 1 → 0
- Pydantic validators: 1 → 4
- Playwright tests: 8 → 9
- Repair scripts: 0 → 5
- Generator features: basic → bloom-aware + topic-area mapping +
parallelism prompt + retry-on-validate-fail + targets-from +
validate-at-write
- Build pipeline: manual manifest → auto-emit
- Analyzer: topic-priority only → topic-priority + area-include flag
- Parallelism gap (the original mission): closed across all tracks
Commit 6b2b3e0542
feat(vault): Phase D + F — parallelism gap closure (+87 PASS items)
Closes the parallelism + global L4-L6+ gaps that have been open across
three prior pushes. All gates green: vault check, lint, doctor, codegen,
validate-vault, render. Bundle: 9,688 → 9,775 published.
PARALLELISM GAP — finally closed:
tinyml/parallelism: 1 → 8
mobile/parallelism: 0 → 6
edge/parallelism: 13 → 18
global/parallelism: 0 → 19
cloud/parallelism: 326 (unchanged; was already dense)
Phase D - parallelism + global generation (64 PASS: 58 in D.3 + 6 in D.5):
D.1 Hand-authored 72 parallelism cells (track × parallelism-topic ×
zone × level for edge/mobile/tinyml at L4-L6+) + 10 global L4-L6+
cells. Bypasses the analyzer's topic-priority ranking which never
surfaced parallelism cells in the top-100. Saved to
tools/phase_d/{parallelism_targets.txt,global_targets.txt}.
D.2 PARALLELISM_RULES prompt variant in gemini_cli_generate_questions.py
+ --prompt-variant {default,parallelism} CLI flag. Adds rules:
- FORBID single-step bandwidth division ("payload / bandwidth")
- REQUIRE concrete interconnect (NVLink/IB/PCIe/RoCE/LoRa/SPI/BLE
appropriate to track)
- REQUIRE quantified synchronization or pipeline-bubble cost
- REQUIRE non-obvious failure mode in common_mistake
- For tinyml: ground in real numbers (Cortex-M4 SPI 5-25 MHz,
LoRa 5-50 kbps)
+ --targets-from <file> CLI flag for hand-authored target lists.
+ parse_target() now sets competency_area from TOPIC_TO_AREA
mapping (was hardcoded to "cross-cutting").
D.3 Generator: 72/72 written, **0 validate-at-write failures**, 3 API
calls (no retries needed). Judge: 58 PASS / 12 NEEDS_FIX / 2 DROP
= **80.6% pass rate** (vs B.5's 51% on standard cells). PARALLELISM
prompt + validate-at-write together drove the rate up by 30pts.
D.4 Spot-read: 16 stratified PASS items (ran out at 16, no cloud since
D.1 skipped that track). 0% rejection rate, all show real topology
+ quantified sync cost + correct math.
D.5 Global generator: 10/10 written, 0 validate failures, 1 API call.
Judge: 6 PASS / 3 NEEDS_FIX / 1 DROP = 60% pass rate. Filled
global cells (global-0432..0441).
D.6 Promote, rebuild bundle, repair registry, update manifest.
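The two CLI flags and the area lookup from D.2 might be wired roughly as follows. This is a sketch under assumptions, not the code in `gemini_cli_generate_questions.py`: the mapping contents, the `area_for` helper, and the fallback to "cross-cutting" for unmapped topics are illustrative.

```python
import argparse

# Hypothetical stand-in for the TOPIC_TO_AREA mapping named in D.2.
TOPIC_TO_AREA = {
    "pipeline-parallelism": "parallelism",
    "kv-cache": "memory",
}


def area_for(topic):
    """Look up the competency area; the fallback value is an assumption."""
    return TOPIC_TO_AREA.get(topic, "cross-cutting")


def build_parser():
    parser = argparse.ArgumentParser(description="generator CLI sketch")
    # argparse rejects any value outside `choices` with a usage error.
    parser.add_argument(
        "--prompt-variant",
        choices=["default", "parallelism"],
        default="default",
    )
    parser.add_argument("--targets-from", metavar="FILE", default=None)
    return parser


args = build_parser().parse_args(["--prompt-variant", "parallelism"])
print(args.prompt_variant, args.targets_from, area_for("pipeline-parallelism"))
```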
Phase E.1 — retry-on-validation-fail in generator:
Single retry with structured error context for validate-at-write
rejections. Cap at 1 retry per batch. NOT triggered in this run
(D.3 + D.5 had 0 failures), but in place for future runs that
might face the iter-1/iter-3 zero-draft pattern from B.5.
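The retry policy in E.1 can be sketched as a bounded loop that feeds the validator's errors back into the next generation call. The function names, the error format, and the fake generator and validator below are hypothetical; only the "one retry, with structured error context" shape comes from the description above.

```python
def generate_with_retry(generate, validate, max_retries=1):
    """Generate a batch, validate it, and retry at most `max_retries` times.

    `generate(errors)` receives None on the first call and the previous
    validation errors on a retry. `validate(batch)` returns a list of
    error records, empty when the batch is acceptable.
    """
    errors = None
    for attempt in range(max_retries + 1):
        batch = generate(errors)
        errors = validate(batch)
        if not errors:
            return batch, attempt  # attempt == number of retries used
    raise ValueError(f"batch still invalid after {max_retries} retry: {errors}")


# Fake generator: omits a field first, fixes it once it sees the errors.
def fake_generate(errors):
    if errors is None:
        return [{"id": "x-1"}]
    return [{"id": "x-1", "answer": "42"}]


def fake_validate(batch):
    return [
        {"id": item["id"], "missing": "answer"}
        for item in batch
        if "answer" not in item
    ]


batch, retries_used = generate_with_retry(fake_generate, fake_validate)
print(retries_used, batch)
```

Capping the loop at one retry keeps the worst-case API cost per batch at two calls.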
Phase F — second-pass NEEDS_FIX/DROP rehab (23 PASS):
F.2 Spawned general-purpose fix-agent on 33 items (13 NEEDS_FIX + 20
DROP from C.3's first re-judge). 33/33 rewritten with deeper
revisions: visual-aligned reframings, math corrections, real
track-specific toolchains (Hailo-8 DFC, TensorRT 8.6 calibrators,
Cortex-X4 NEON SDOT vs Hexagon NPU), unrealistic-premise fixes
(KV cache in NPU SRAM → tiered LPDDR5/TCM scheme).
F.1 Re-judge: 23 PASS / 4 NEEDS_FIX / 6 DROP = **69.7% pass rate** on
items previously rated NEEDS_FIX or DROP. The fix-agent's deeper
rewrites recovered 70% of the carry-forward queue.
F.3 Stratified spot-read of 16 PASS items (parallel-safe with F.1):
0% rejection rate. Standout: tinyml-1817 correctly diagnoses 2x
half-duplex UART penalty by comparing observed to theoretical Ring
AllReduce time.
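The comparison credited to tinyml-1817 rests on the standard Ring AllReduce cost model: 2(N-1) steps, each moving a chunk of size S/N. The sketch below is not the item's actual content; the payload size, node count, and UART rate are hypothetical, and the half-duplex case is modeled simply as each step taking twice as long because a node cannot send and receive at the same time.

```python
def ring_allreduce_seconds(payload_bytes, nodes, link_bytes_per_s, full_duplex=True):
    """Bandwidth-only Ring AllReduce time (latency terms ignored)."""
    steps = 2 * (nodes - 1)        # reduce-scatter plus all-gather
    chunk = payload_bytes / nodes  # bytes moved per node per step
    per_step = chunk / link_bytes_per_s
    if not full_duplex:
        per_step *= 2              # send and receive must take turns
    return steps * per_step


# Hypothetical setup: 4 nodes, 4 KiB of gradients, 115200 baud UART with
# 10 bits on the wire per byte (8N1 framing), i.e. 11520 bytes/s.
theoretical = ring_allreduce_seconds(4096, 4, 11520)
half_duplex = ring_allreduce_seconds(4096, 4, 11520, full_duplex=False)
ratio = half_duplex / theoretical
print(theoretical, half_duplex, ratio)
```

An observed time close to twice the full-duplex prediction is the signature the item attributes to a half-duplex link.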
Cleanup:
- repair_registry.py: appended 87 new IDs (D.3 + D.5 + F.1 outputs).
- vault-manifest.json refreshed: 9,688 → 9,775; track + level
distributions updated; contentHash dccd3073672c.
API budget: 11 calls used of 70 allotted (3 D.3 gen + 3 D.3 judge
+ 1 D.5 gen + 1 D.5 judge + 2 F.1 judge + 1 sample). Far under
budget thanks to validate-at-write driving 0 retry calls.
The corpus is StaffML-day-ready with the parallelism gap genuinely
closed for the first time. The remaining 4 NEEDS_FIX + 6 DROP from
F.1 are deferred to a future cleanup; they don't block release.