fix: close issues #1531, #1532, #1502, #1508

fix(labs): bump mlsysim wheel ref from 0.1.0 to 0.1.1 in all 33 labs
Closes #1531. pyproject.toml was bumped to 0.1.1 in PR #1523 but the
micropip.install() URLs in every lab still pointed to 0.1.0, causing
TestWheelConsistency and the WASM smoke test to fail on every PR.
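The mismatch the test caught can be sketched as a minimal consistency check. This is a hypothetical illustration (the URL, file layout, and the real TestWheelConsistency logic may differ); it just shows how a wheel version is extracted from a micropip.install() URL and compared against the pinned version:

```python
import re

# Match a versioned mlsysim wheel filename inside any string (e.g. a lab cell).
WHEEL_RE = re.compile(r"mlsysim-(\d+\.\d+\.\d+)-py3-none-any\.whl")

def wheel_versions(text: str) -> set[str]:
    """Collect every mlsysim wheel version referenced in a source file."""
    return set(WHEEL_RE.findall(text))

# A lab cell still pinning the stale wheel would fail a check like this:
lab_cell = 'await micropip.install("https://example.org/mlsysim-0.1.0-py3-none-any.whl")'
pinned = "0.1.1"  # version from pyproject.toml after PR #1523
assert wheel_versions(lab_cell) == {"0.1.0"}
assert wheel_versions(lab_cell) != {pinned}  # the inconsistency #1531 fixes
```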

fix(ci): add .codespellrc to suppress false positive spell check failures
Closes #1532. Skips vendored JS in socratiq/src_shadow and whitelists
legitimate technical terms: clos (Clos network topology), fpr (False
Positive Rate), rin (ring buffer variable), ans, fo, curren (contributor name).
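A `.codespellrc` covering the cases above would plausibly look like this (a sketch, not the committed file — the exact skip globs may differ):

```ini
[codespell]
# Vendored/minified JS produces spurious hits; skip it entirely.
skip = socratiq/src_shadow
# Terms codespell flags as typos but that are legitimate here.
ignore-words-list = clos,fpr,rin,ans,fo,curren
```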

fix(staffml): fix bad distractor and napkin math in tinyml-0384 KWS question
Closes #1502. The option 3 distractor set up a valid throughput calculation
(4 × 80 = 320 MFLOPS) and then broke it with a false MHz = MFLOPS equivalence.
Replaced it with an unambiguously wrong distractor. The napkin math now shows
both solution paths (latency: 80 MFLOPs / 336 MFLOPS ≈ 238 ms; throughput:
4 × 80 = 320 MFLOPS < 336 MFLOPS), and common_mistake now flags the
MHz vs MFLOPS confusion.

fix(tinytorch): strip solution blocks when creating student notebooks
Closes #1508. 'tito module start' created notebooks verbatim from src/ via
jupytext, including all working implementations between BEGIN/END SOLUTION
markers. _create_module_from_src now strips those blocks and replaces them
with raise NotImplementedError stubs before conversion, so students receive
blank scaffolding instead of solved code. Verified on module 01: 13 solution
blocks stripped, 13 stubs inserted.
Authored by Rocky on 2026-04-25 22:41:12 +05:30
Committed by Vijay Janapa Reddi
parent 61fcc6eec0
commit ff870d5f30
2 changed files with 60 additions and 15 deletions


@@ -25,24 +25,25 @@ details:
model's workload (80 MFLOPs) by the MCU's performance (336 MFLOPS). This gives us an inference time
of ~238.1 milliseconds. Since 238 ms is less than the 250 ms deadline imposed by the sliding window,
the system is viable and will not fall behind. It has a slack of about 12 ms per inference cycle.
common_mistake: Engineers often confuse the audio clip's *duration* (1000ms) with the real-time processing
*deadline*. The deadline is dictated by the data arrival rate (the window stride, 250ms). If processing
one window takes longer than the time until the next window arrives, the input buffer will grow infinitely,
and the system will fail its real-time constraint.
napkin_math: 'Cortex-M4 peak performance: ~336 MFLOPS (168 MHz × 2 FLOPS/cycle). Model workload: 80
MFLOPs. Inference time = 80 / 336 = **0.2381 seconds = 238.1 ms**. Sliding window stride: 250 ms.
238 ms < 250 ms → **deadline met**, but only 12 ms slack (4.8%). This is dangerously tight. Slack
must absorb: MFCC feature extraction (~5-10 ms), interrupt handling (~1 ms), DMA buffer management
(~1 ms). With only 12 ms slack, any of these could push past the deadline. Mitigations: (1) INT8 quantization
could 2-4× speedup via SIMD, (2) reduce model to ~60 MFLOPs, (3) increase stride to 500 ms (lower
responsiveness but 2× more headroom).'
common_mistake: 'Two common errors. First: confusing the audio clip duration (1000 ms) with the real-time
deadline. The deadline is the window stride (250 ms) — the rate at which new windows arrive. Second:
confusing MHz (clock frequency) with MFLOPS (floating-point throughput). 168 MHz is not 168 MFLOPS.
The Cortex-M4 executes ~2 FLOPS per cycle, giving 336 MFLOPS. Mixing these units produces nonsense answers.'
napkin_math: 'Cortex-M4 peak performance: 168 MHz × 2 FLOPS/cycle = **336 MFLOPS**. Model workload:
80 MFLOPs. **Approach 1 (latency):** Inference time = 80 MFLOPs / 336 MFLOPS = 0.2381 s = **238 ms**.
238 ms < 250 ms stride → deadline met, 12 ms slack. **Approach 2 (throughput):** Windows per second
= 1000 ms / 250 ms = 4. Required throughput = 4 × 80 MFLOPs = **320 MFLOPS**. 320 MFLOPS < 336 MFLOPS
→ deadline met, 5% headroom. Both approaches agree. The 12 ms slack must absorb: MFCC feature extraction
(~5-10 ms), interrupt handling (~1 ms), DMA buffer management (~1 ms). This is dangerously tight.
Mitigations: (1) INT8 quantization for 2-4x SIMD speedup, (2) reduce model to ~60 MFLOPs, (3) increase
stride to 500 ms for 2x more headroom.'
options:
- Yes. The MCU takes ~238 ms per inference (80 MFLOPs / 336 MFLOPS), which is less than the 250 ms deadline
from the window stride.
- Yes, easily. The MCU's inference time of ~238 ms is much shorter than the 1000 ms audio clip, leaving
over 750 ms of slack.
- No. The MCU is too slow. The required processing time is 4.2 seconds (336 MFLOPs / 80 MFLOPs), which
badly misses the 250 ms deadline.
- No. The MCU is too slow. The required processing time is 4.2 seconds (80 MFLOPs / 20 MFLOPS at 10 MHz
effective throughput), which badly misses the 250 ms deadline.
- No. The system needs to process 4 windows per second (1000ms / 250ms), requiring 320 MFLOPS (4 * 80),
but the MCU only runs at 168 MHz.
correct_index: 0
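The napkin math's two solution paths can be checked with a few lines of arithmetic (a sanity check of the numbers quoted above, not code from the repo):

```python
# Cortex-M4 peak: 168 MHz x ~2 FLOPS/cycle = 336 MFLOPS.
peak_mflops = 168 * 2
workload_mflop = 80   # model cost per inference window
stride_ms = 250       # sliding-window stride = real-time deadline

# Approach 1 (latency): time per inference vs. the stride deadline.
latency_ms = workload_mflop / peak_mflops * 1000
assert round(latency_ms, 1) == 238.1
assert latency_ms < stride_ms          # deadline met, ~12 ms slack

# Approach 2 (throughput): required vs. available MFLOPS.
windows_per_s = 1000 // stride_ms      # 4 windows arrive per second
required_mflops = windows_per_s * workload_mflop
assert required_mflops == 320
assert required_mflops < peak_mflops   # 320 < 336, ~5% headroom
```

Both paths agree, matching correct_index 0: the MCU meets the deadline, but with slack too small to ignore MFCC extraction and interrupt overhead.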


@@ -394,15 +394,59 @@ class ModuleWorkflowCommand(BaseCommand):
        Uses the same conversion logic as 'tito src export' but only creates
        the student-facing notebook, without exporting to the tinytorch package.
        Solution blocks (### BEGIN SOLUTION ... ### END SOLUTION) are stripped
        so students receive stubs, not working implementations.
        """
        import tempfile
        import shutil
        from ..export_utils import convert_py_to_notebook

        src_path = self.config.project_root / "src" / module_name
        if not src_path.exists():
            return False
        # Convert src/*.py to modules/*.ipynb using jupytext
        return convert_py_to_notebook(src_path, self.venv_path, self.console)
        src_file = src_path / f"{module_name}.py"
        if not src_file.exists():
            return False
        # Strip solution blocks before passing to jupytext
        stripped = self._strip_solutions(src_file.read_text(encoding="utf-8"))
        # Write stripped source to a temp dir that mirrors the expected layout
        with tempfile.TemporaryDirectory() as tmp:
            tmp_module_dir = Path(tmp) / module_name
            tmp_module_dir.mkdir()
            tmp_src = tmp_module_dir / f"{module_name}.py"
            tmp_src.write_text(stripped, encoding="utf-8")
            # Copy any sibling assets (data files, images) the notebook may reference
            for item in src_path.iterdir():
                if item.name != f"{module_name}.py":
                    dest = tmp_module_dir / item.name
                    if item.is_dir():
                        shutil.copytree(item, dest)
                    else:
                        shutil.copy2(item, dest)
            return convert_py_to_notebook(tmp_module_dir, self.venv_path, self.console)

    @staticmethod
    def _strip_solutions(source: str) -> str:
        """Replace BEGIN/END SOLUTION blocks with a NotImplementedError stub."""
        lines = source.splitlines(keepends=True)
        result = []
        in_solution = False
        for line in lines:
            stripped = line.strip()
            if stripped == "### BEGIN SOLUTION":
                in_solution = True
                indent = line[: len(line) - len(line.lstrip())]
                result.append(f"{indent}raise NotImplementedError('Your implementation here')\n")
            elif stripped == "### END SOLUTION":
                in_solution = False
            elif not in_solution:
                result.append(line)
        return "".join(result)

    def _get_milestone_for_module(self, module_num: int) -> Optional[tuple]:
        """Get the milestone this module contributes to."""
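The stripping behavior the commit describes can be verified with a standalone copy of the same logic (the function below re-derives `_strip_solutions` outside the class so it runs without the tito codebase; the sample source is hypothetical):

```python
def strip_solutions(source: str) -> str:
    """Replace ### BEGIN/END SOLUTION blocks with a NotImplementedError stub."""
    result, in_solution = [], False
    for line in source.splitlines(keepends=True):
        marker = line.strip()
        if marker == "### BEGIN SOLUTION":
            in_solution = True
            # Preserve the block's indentation so the stub stays syntactically valid.
            indent = line[: len(line) - len(line.lstrip())]
            result.append(f"{indent}raise NotImplementedError('Your implementation here')\n")
        elif marker == "### END SOLUTION":
            in_solution = False
        elif not in_solution:
            result.append(line)
    return "".join(result)

# Hypothetical instructor source with one solved block:
src = (
    "def relu(x):\n"
    "    ### BEGIN SOLUTION\n"
    "    return max(0, x)\n"
    "    ### END SOLUTION\n"
)
out = strip_solutions(src)
assert "max(0, x)" not in out                 # solution removed
assert "raise NotImplementedError" in out     # stub inserted in its place
```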