mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-08 02:28:25 -05:00
Two new pieces close the generation→validation→saturation feedback loop:

1. gemini_cli_llm_judge.py: multi-criteria validator. For each draft, it judges math correctness, cell-fit (does it actually target the declared track/zone/level?), scenario realism, uniqueness vs canonical questions, and visual-asset alignment. Returns PASS/NEEDS_FIX/DROP per item. Batched (default 15 per call) for budget efficiency.

2. iterate_coverage_loop.py: drives the full loop (analyze → plan → generate → render → judge → apply → re-analyze). Self-paced: stops when (a) the top priority gap drops below threshold, (b) the DROP rate exceeds the saturation/hallucination threshold, (c) total API calls exceed budget, or (d) the same cell is top priority for two iterations in a row (convergence). The user no longer specifies "how many questions"; the loop generates until the corpus reaches a measurable steady state.

Plus 25 round-1 visual questions generated by the new batched generator (5 batched calls × 5 cells each, zero failures).

The loop is the answer to "we need balance, not just volume": every iteration's plan derives from a fresh analysis of where coverage is weakest, so generation can never over-fill an already-saturated cell.
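The four stop conditions (a) to (d) can be sketched as a single predicate evaluated after each iteration. This is a minimal illustration, not the code in iterate_coverage_loop.py: the `IterationStats` type, the `stop_reason` function, the field names, and all default threshold values are hypothetical and chosen only to make the example runnable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IterationStats:
    """Summary of one loop iteration (hypothetical shape, for illustration)."""
    top_cell: str         # identifier of the highest-priority coverage gap
    top_gap: float        # size of that gap; the unit is an assumption
    drop_rate: float      # fraction of judged drafts marked DROP
    api_calls_total: int  # cumulative API calls across all iterations


def stop_reason(
    curr: IterationStats,
    prev: Optional[IterationStats],
    *,
    gap_threshold: float = 0.05,   # assumed value
    drop_threshold: float = 0.5,   # assumed value
    call_budget: int = 100,        # assumed value
) -> Optional[str]:
    """Return the name of the first stop condition that fires, or None to continue."""
    if curr.top_gap < gap_threshold:
        return "gap_below_threshold"   # (a)
    if curr.drop_rate > drop_threshold:
        return "drop_rate_exceeded"    # (b)
    if curr.api_calls_total > call_budget:
        return "budget_exceeded"       # (c)
    if prev is not None and prev.top_cell == curr.top_cell:
        return "converged"             # (d) same top cell two iterations in a row
    return None


first = IterationStats("cell-A", top_gap=0.40, drop_rate=0.10, api_calls_total=5)
second = IterationStats("cell-A", top_gap=0.35, drop_rate=0.10, api_calls_total=10)
print(stop_reason(first, None))     # no condition fires on the first iteration
print(stop_reason(second, first))   # same top cell twice in a row
```

Returning a reason string rather than a bare boolean lets the caller log why the loop ended, which matters when distinguishing a healthy stop (a, d) from a budget or quality stop (b, c).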
23 lines
862 B
Python
import os

import matplotlib.pyplot as plt
import numpy as np

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: total HBM3 capacity next to a stacked breakdown of its allocation.
# Each list holds one value per category; zeros keep a series off the other bar.
cats = ['Total HBM3', 'Allocation']
total = [192, 0]
w = [0, 140]    # weights (GB)
a = [0, 38.4]   # activations (GB)
k = [0, 13.6]   # KV cache (GB)

ax1.bar(cats, total, label='Total Capacity (192GB)', color='lightgray')
ax1.bar(cats, w, label='Weights (140GB)', color='#cfe2f3')
ax1.bar(cats, a, bottom=w, label='Activations (38.4GB)', color='#fdebd0')
ax1.bar(cats, k, bottom=np.add(w, a), label='KV Cache (13.6GB)', color='#d4edda')
ax1.set_ylabel('Memory (GB)')
ax1.set_title('MI300X HBM3 Allocation')
ax1.legend()

# Right panel: number of concurrent requests the KV pool can hold.
reqs = ['Max Concurrent Requests']
ax2.bar(reqs, [5], color='#4a90c4', width=0.4)
ax2.set_ylabel('Requests')
ax2.set_title('KV Pool Capacity (8192 tokens/req)')

plt.tight_layout()

# Output path comes from the environment, falling back to plot.svg.
out = os.environ.get('VISUAL_OUT_PATH', 'plot.svg')
plt.savefig(out, format='svg', bbox_inches='tight')
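The figures hard-coded in the plot can be cross-checked with a few lines of arithmetic. The sketch below confirms that the three allocation segments sum to the 192 GB total, then brackets the per-request KV footprint implied by "5 concurrent requests". It assumes the request count is the floor of pool size divided by per-request footprint; that rule is not stated in the script, and the variable names are illustrative.

```python
import math

# Values copied from the plotting script above (GB).
total_gb = 192.0
weights_gb = 140.0
activations_gb = 38.4
kv_cache_gb = 13.6

# The stacked segments should fill the device exactly.
allocated_gb = weights_gb + activations_gb + kv_cache_gb
assert math.isclose(allocated_gb, total_gb)

# Assumption: max_requests = floor(kv_cache_gb / per_request_gb).
# Five requests fit but six do not, so the per-request footprint lies in
# the half-open interval (kv_cache_gb / 6, kv_cache_gb / 5].
max_requests = 5
tokens_per_request = 8192
upper_gb = kv_cache_gb / max_requests
lower_gb = kv_cache_gb / (max_requests + 1)

# Same bounds per token, using decimal units (1 GB = 1e6 KB).
upper_kb_per_token = upper_gb * 1e6 / tokens_per_request
lower_kb_per_token = lower_gb * 1e6 / tokens_per_request

print(f"per-request KV footprint: ({lower_gb:.3f}, {upper_gb:.3f}] GB")
print(f"per-token KV footprint:   ({lower_kb_per_token:.1f}, {upper_kb_per_token:.1f}] KB")
```

A check of this kind is the sort of math-correctness test the judge step describes: if the segments did not sum to the total, or the request count fell outside the implied range, the visual would contradict its own question.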