mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-08 18:01:20 -05:00
Two new pieces close the generation→validation→saturation feedback loop:

1. gemini_cli_llm_judge.py: multi-criteria validator. For each draft, it judges math correctness, cell-fit (does it actually target the declared track/zone/level?), scenario realism, uniqueness vs canonical questions, and visual-asset alignment. Returns PASS/NEEDS_FIX/DROP per item. Batched (default 15 per call) for budget efficiency.

2. iterate_coverage_loop.py: drives the full loop: analyze → plan → generate → render → judge → apply → re-analyze. Self-paced: stops when (a) top priority gap drops below threshold, (b) DROP rate exceeds the saturation/hallucination threshold, (c) total API calls exceed budget, or (d) the same cell is top priority for two iterations in a row (convergence). The user no longer specifies "how many questions"; the loop generates until the corpus reaches a measurable steady state.

Plus 25 round-1 visual questions generated by the new batched generator (5 batched calls × 5 cells each, zero failures).

The loop is the answer to "we need balance, not just volume": every iteration's plan derives from a fresh analysis of where coverage is weakest, so generation can never over-fill an already-saturated cell.
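The four stopping conditions (a) to (d) can be sketched as a single check run at the end of each iteration. This is a minimal illustration, not the code in iterate_coverage_loop.py: the `IterationStats` fields, the `stop_reason` function, and every threshold value below are hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IterationStats:
    """Hypothetical per-iteration summary; field names are illustrative."""
    top_cell: str         # identifier of the highest-priority gap cell
    top_gap: float        # priority score of that cell
    verdicts: List[str]   # judge outputs: "PASS", "NEEDS_FIX", or "DROP"
    api_calls_total: int  # cumulative API calls so far


def drop_rate(verdicts: List[str]) -> float:
    """Fraction of judged items marked DROP (0.0 when nothing was judged)."""
    return verdicts.count("DROP") / len(verdicts) if verdicts else 0.0


def stop_reason(
    cur: IterationStats,
    prev: Optional[IterationStats],
    gap_threshold: float = 0.1,   # assumed value
    max_drop_rate: float = 0.4,   # assumed value
    call_budget: int = 50,        # assumed value
) -> Optional[str]:
    """Return the name of the first stopping condition met, or None to continue."""
    if cur.top_gap < gap_threshold:                          # (a)
        return "gap_below_threshold"
    if drop_rate(cur.verdicts) > max_drop_rate:              # (b)
        return "drop_rate_exceeded"
    if cur.api_calls_total > call_budget:                    # (c)
        return "budget_exceeded"
    if prev is not None and prev.top_cell == cur.top_cell:   # (d)
        return "converged"
    return None
```

The conditions are checked in the order the commit message lists them, so the reported reason is the first one that applies when several hold at once; a real loop might prefer to report all of them.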
28 lines
1.1 KiB
Python
import os
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Patch

# One state code per minute: 0 = crash, 1 = checkpoint, 2 = compute, 3 = recovery
time = np.arange(0, 60)
state = []
for t in time:
    if 0 <= t < 10: state.append(1)
    elif 10 <= t < 15: state.append(2)
    elif 15 <= t < 25: state.append(1)
    elif 25 <= t < 30: state.append(2)
    elif 30 <= t < 35: state.append(1)
    elif t == 35: state.append(0)
    elif 35 < t <= 45: state.append(3)
    else: state.append(2)

colors = {0: 'red', 1: '#4a90c4', 2: '#3d9e5a', 3: '#c87b2a'}
fig, ax = plt.subplots(figsize=(10, 2))
# Draw the timeline as one unit-wide horizontal bar segment per minute
for t, s in enumerate(state):
    ax.barh(0, 1, left=t, color=colors[s], edgecolor='none')
ax.set_yticks([])
ax.set_xlabel('Time (Minutes)')
ax.set_title('Synchronous Checkpointing Overload (C=10, M=11.25)')
legend_elements = [Patch(facecolor='#3d9e5a', label='Compute'), Patch(facecolor='#4a90c4', label='Checkpoint'), Patch(facecolor='red', label='Crash'), Patch(facecolor='#c87b2a', label='Recovery')]
ax.legend(handles=legend_elements, loc='upper right', bbox_to_anchor=(1.15, 1))
plt.tight_layout()
# Output path comes from VISUAL_OUT_PATH, falling back to out.svg
plt.savefig(os.environ.get('VISUAL_OUT_PATH', 'out.svg'), format='svg', bbox_inches='tight')