mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-08 09:57:21 -05:00
Two new pieces close the generation→validation→saturation feedback loop:

1. gemini_cli_llm_judge.py: multi-criteria validator. For each draft, it judges math correctness, cell-fit (does the draft actually target the declared track/zone/level?), scenario realism, uniqueness vs canonical questions, and visual-asset alignment. Returns PASS/NEEDS_FIX/DROP per item. Batched (default 15 per call) for budget efficiency.

2. iterate_coverage_loop.py: drives the full loop: analyze → plan → generate → render → judge → apply → re-analyze. Self-paced: stops when
   (a) the top priority gap drops below threshold,
   (b) the DROP rate exceeds the saturation/hallucination threshold,
   (c) total API calls exceed budget, or
   (d) the same cell is top priority for two iterations in a row (convergence).
   The user no longer specifies "how many questions"; the loop generates until the corpus reaches a measurable steady state.

Plus 25 round-1 visual questions generated by the new batched generator (5 batched calls × 5 cells each, zero failures).

The loop is the answer to "we need balance, not just volume": every iteration's plan derives from a fresh analysis of where coverage is weakest, so generation can never over-fill an already-saturated cell.
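The four stop conditions (a) to (d) can be sketched as a single check run at the end of each iteration. This is a minimal illustration, not the code in iterate_coverage_loop.py: the `LoopState` fields, the `stop_reason` function, and all threshold defaults are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoopState:
    """Snapshot taken after one analyze -> ... -> re-analyze pass (hypothetical shape)."""
    top_gap: float                 # priority score of the weakest coverage cell
    drop_rate: float               # fraction of judged drafts marked DROP
    api_calls: int                 # total API calls made so far
    top_cell: str                  # cell that is top priority this iteration
    prev_top_cell: Optional[str]   # cell that was top priority last iteration


def stop_reason(
    state: LoopState,
    gap_threshold: float = 0.05,   # assumed value, not from the source
    drop_threshold: float = 0.5,   # assumed value, not from the source
    call_budget: int = 100,        # assumed value, not from the source
) -> Optional[str]:
    """Return the name of the first stop condition that fires, or None to continue."""
    if state.top_gap < gap_threshold:
        return "gap_below_threshold"       # condition (a)
    if state.drop_rate > drop_threshold:
        return "drop_rate_exceeded"        # condition (b)
    if state.api_calls > call_budget:
        return "budget_exceeded"           # condition (c)
    if state.prev_top_cell is not None and state.top_cell == state.prev_top_cell:
        return "converged"                 # condition (d)
    return None


# Example: the same cell tops the priority list twice in a row.
state = LoopState(top_gap=0.3, drop_rate=0.1, api_calls=12,
                  top_cell="cloud/L3", prev_top_cell="cloud/L3")
print(stop_reason(state))
```

Returning a reason string rather than a bare boolean lets the driver log why a run ended, which matters when distinguishing a healthy steady state from a budget cutoff.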
19 lines
799 B
Plaintext
digraph {
  rankdir=LR;
  node [shape=box, style=filled, fontname="Helvetica"];
  subgraph cluster_n1 {
    label="Node 1 (NVLink: 900 GB/s)"; style=dashed;
    G1 [label="GPU 0", fillcolor="#cfe2f3", color="#4a90c4"];
    G2 [label="GPU 1", fillcolor="#cfe2f3", color="#4a90c4"];
    G1 -> G2 [dir=both, label="NVLink", color="#4a90c4"];
  }
  subgraph cluster_n2 {
    label="Node 2 (NVLink: 900 GB/s)"; style=dashed;
    G3 [label="GPU 2", fillcolor="#cfe2f3", color="#4a90c4"];
    G4 [label="GPU 3", fillcolor="#cfe2f3", color="#4a90c4"];
    G3 -> G4 [dir=both, label="NVLink", color="#4a90c4"];
  }
  Net [label="400 Gbps RoCEv2", shape=ellipse, fillcolor="#fdebd0", color="#c87b2a"];
  G1 -> Net [dir=both, color="#c87b2a", style=bold];
  G3 -> Net [dir=both, color="#c87b2a", style=bold];
}
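
The two link labels in the graph use different units (GB/s for NVLink, Gbps for RoCEv2), so the gap between intra-node and inter-node bandwidth is easy to misread. A quick check, assuming decimal units, 8 bits per byte, and no protocol overhead; the function name is hypothetical:

```python
def bandwidth_gap(nvlink_gbytes_per_s: float, net_gbits_per_s: float) -> tuple[float, float]:
    """Convert the network rate to GB/s and return (net_GB_per_s, nvlink_to_net_ratio)."""
    net_gbytes_per_s = net_gbits_per_s / 8.0   # bits -> bytes
    return net_gbytes_per_s, nvlink_gbytes_per_s / net_gbytes_per_s


# Values taken from the graph labels: 900 GB/s NVLink, 400 Gbps RoCEv2.
net_gb, ratio = bandwidth_gap(900.0, 400.0)
print(f"RoCEv2: {net_gb} GB/s, NVLink is {ratio}x faster")
# prints "RoCEv2: 50.0 GB/s, NVLink is 18.0x faster"
```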