+ 1. Accuracy
If equalized odds is enforced → calibration gap
- ≈
- {_calibration_gap_pct:.1f}%
- (Chouldechova constraint, @sec-responsible-ai-chouldechova)
+
+ {_accuracy*100:.1f}%
-
If calibration is enforced → FPR/FNR gap
- ≈
- {_fpr_gap_pct:.1f}%
- {'✓ ≤ 10% EU AI Act' if _eu_ok else '❌ >10% EU AI Act Art. 10'}
+
+ target: ≥{ACCURACY_TARGET*100:.0f}%
+ {_tick(_accuracy_met)} {_badge(_accuracy_met)}
-
Fairness constraint accuracy cost
- ≈
~{_fairness_acc_penalty_pct:.1f}%
- degradation
+
+ base={_base_acc*100:.1f}% − DP penalty={_dp_acc_penalty*100:.1f}%
+ − adv penalty={_adv_acc_penalty*100:.1f}%
+
+
+
+ 2. P99 Latency
+
+
+ {_p99_latency_ms:.0f}ms
+
+
+ SLO: <{P99_SLO_MS}ms
+ {_tick(_latency_met)} {_badge(_latency_met)}
+
+
+ {_M_B}B params × 2 bytes = {_model_bytes_gb:.0f} GB
+ → {_shards} shard(s)/replica
+
+
+
+
+
+ 3. DP Compliance (HIPAA)
+
+
+ ε = {_eps:.1f}
+
+
+ HIPAA limit: ε ≤ {DP_EPS_LIMIT}
+ {_tick(_dp_met)} {_badge(_dp_met)}
+
+
+ accuracy penalty ≈ {_dp_acc_penalty*100:.1f}%
+
+
+
+
+
+ 4. Adversarial Robustness
+
+
+ {_adv_robustness*100:.0f}%
+
+
+ PGD target: ≥{ADV_ROBUSTNESS_TARGET*100:.0f}%
+ {_tick(_adversarial_met)} {_badge(_adversarial_met)}
+
+
+ adv weight={_adv_w:.2f}
+ → clean acc cost={_adv_acc_penalty*100:.1f}%
+
+
+
+
+
+ 5. Carbon Reduction
+
+
+ {_carbon_reduction*100:.0f}%
+
+
+ target: >{CARBON_REDUCTION_TARGET*100:.0f}% vs. baseline
+ {_tick(_carbon_met)} {_badge(_carbon_met)}
+
+
+ eff CI = {_eff_ci:.0f} g/kWh
+ (flex={_flex_pct*100:.0f}%)
+
+
+
+
+
+ 6. Fault Tolerance
+
+
+ {_uptime_pct*100:.3f}%
+
+
+ uptime target: ≥{UPTIME_TARGET*100:.1f}%
+ {_tick(_fault_tol_met)} {_badge(_fault_tol_met)}
+
+
+ T*={_T_opt_min:.0f}min | your T={_T_min}min
+
+
+
""")
+ _scorecard
+ return (
+ _accuracy,
+ _accuracy_met,
+ _latency_met,
+ _dp_met,
+ _adversarial_met,
+ _carbon_met,
+ _fault_tol_met,
+ _constraints_all_met,
+ _n_met,
+ _budget_ok,
+ _uptime_pct,
+ _p99_latency_ms,
+ _adv_robustness,
+ _carbon_reduction,
+ _eff_ci,
+ _T_opt_min,
+ _shards,
+ _replicas_needed,
+ _dp_acc_penalty,
+ _adv_acc_penalty,
+ )
+
+# ─── ACT II: FAILURE STATES AND SUCCESS STATE ─────────────────────────────────
+@app.cell(hide_code=True)
+def _(
+ ACCURACY_TARGET,
+ ADV_ROBUSTNESS_TARGET,
+ BUDGET_GPUS,
+ CARBON_REDUCTION_TARGET,
+ DP_EPS_LIMIT,
+ P99_SLO_MS,
+ UPTIME_TARGET,
+ _accuracy,
+ _accuracy_met,
+ _adv_robustness,
+ _adversarial_met,
+ _budget_ok,
+ _carbon_met,
+ _carbon_reduction,
+ _constraints_all_met,
+ _dp_met,
+ _eff_ci,
+ _fault_tol_met,
+ _latency_met,
+ _n_met,
+ _p99_latency_ms,
+ _replicas_needed,
+ _T_opt_min,
+ _uptime_pct,
+ checkpoint_interval_min,
+ dp_epsilon,
+ mo,
+):
_banners = []
- if not _eu_ok:
+
+ if not _accuracy_met:
_banners.append(mo.callout(mo.md(
- f"**EU AI Act Violation.** Enforcing calibration with a {_base_rate_gap*100:.0f}% "
- f"base rate gap across jurisdictions produces an FPR/FNR gap of "
- f"**{_fpr_gap_pct:.1f}%** — exceeding the 10% threshold under EU AI Act Article 10. "
- f"The Chouldechova impossibility theorem states this cannot be fixed by "
- f"better training data alone: the incompatibility is mathematical, not empirical. "
- f"Options: jurisdiction-specific models, or accept accuracy cost to enforce "
- f"equalized odds at the expense of calibration."
+ f"**Accuracy below clinical threshold.** Current accuracy: "
+ f"**{_accuracy*100:.1f}%** (required: {ACCURACY_TARGET*100:.0f}%). "
+ f"DP noise (eps={dp_epsilon.value:.1f}) and adversarial training together "
+ f"impose accuracy penalties that compound. "
+ f"Increase model size OR reduce adversarial weight OR raise eps (if HIPAA allows). "
+ f"Note: DP and adversarial training pull accuracy in the SAME downward direction "
+ f"— both add noise/randomization that smooth decision boundaries."
), kind="danger"))
- if _chouldechova_active and _base_rate_gap > 0.10:
+ if not _latency_met:
_banners.append(mo.callout(mo.md(
- f"**Chouldechova Theorem Active.** With a {_base_rate_gap*100:.0f}% base rate gap "
- f"and equalized odds enforcement, calibration error will be approximately "
- f"**{_calibration_gap_pct:.1f}%**. This is not a model quality problem — "
- f"it is a mathematical constraint from @sec-responsible-ai-chouldechova. "
- f"The only architectural solutions are: per-jurisdiction models, "
- f"rejection of the equalized odds criterion in high-gap jurisdictions, "
- f"or explicit transparency to regulators."
+ f"**P99 SLO violated.** Estimated P99 = **{_p99_latency_ms:.0f}ms** "
+ f"(SLO: {P99_SLO_MS}ms). "
+ f"The model's decode rate is bandwidth-bound (arithmetic intensity = 1 op/byte). "
+ f"Reduce model size to lower per-token latency, or add more replicas. "
+ f"You need {_replicas_needed:,} GPU-shards; budget is {BUDGET_GPUS:,}."
+ ), kind="danger"))
+
+ if not _dp_met:
+ _banners.append(mo.callout(mo.md(
+ f"**HIPAA DP violation.** epsilon = **{dp_epsilon.value:.1f}** exceeds "
+ f"the HIPAA-grade limit of eps <= {DP_EPS_LIMIT}. "
+ f"Medical image data under HIPAA requires strong differential privacy. "
+ f"Reduce epsilon — at the cost of increased accuracy penalty."
+ ), kind="danger"))
+
+ if not _adversarial_met:
+ _banners.append(mo.callout(mo.md(
+ f"**Adversarial robustness insufficient.** Current PGD robustness: "
+ f"**{_adv_robustness*100:.0f}%** (target: {ADV_ROBUSTNESS_TARGET*100:.0f}%). "
+ f"Medical AI systems in adversarial environments require adversarial training. "
+ f"Increase adversarial training weight — but note it reduces clean accuracy."
+ ), kind="danger"))
+
+ if not _carbon_met:
+ _banners.append(mo.callout(mo.md(
+ f"**Carbon reduction target missed.** Achieved: "
+ f"**{_carbon_reduction*100:.0f}%** reduction "
+ f"(effective CI: {_eff_ci:.0f} g CO2/kWh). "
+ f"Target: {CARBON_REDUCTION_TARGET*100:.0f}% reduction vs. baseline. "
+ f"Switch to carbon-optimized context OR increase flexible job percentage. "
+ f"Jevons Paradox warning: efficiency gains alone may be insufficient "
+ f"if fleet scale grows faster than carbon intensity falls."
+ ), kind="danger"))
+
+ if not _fault_tol_met:
+ _banners.append(mo.callout(mo.md(
+ f"**Uptime target missed.** Estimated uptime: "
+ f"**{_uptime_pct*100:.3f}%** (target: {UPTIME_TARGET*100:.1f}%). "
+ f"Young-Daly optimal checkpoint interval is **{_T_opt_min:.0f} min** "
+ f"for this fleet size. Your interval: {checkpoint_interval_min.value} min. "
+ f"Reduce checkpoint interval toward T* to minimize expected waste time."
+ ), kind="danger"))
+
+ if not _budget_ok:
+ _banners.append(mo.callout(mo.md(
+ f"**GPU budget exceeded.** Your configuration requires "
+ f"**{_replicas_needed:,} GPU-shards** but the budget is {BUDGET_GPUS:,} H100s. "
+ f"Reduce model size, increase quantization (which increases shards-per-replica "
+ f"at lower memory), or accept lower replica count with higher latency."
), kind="warn"))
- if _banners:
- mo.vstack([_formula] + _banners)
+ if _constraints_all_met:
+ mo.callout(mo.md(
+ f"**ARCHITECTURE APPROVED: All {_n_met}/6 constraints satisfied. "
+ f"System is deployable.** "
+ f"Accuracy: {_accuracy*100:.1f}% | P99: {_p99_latency_ms:.0f}ms | "
+ f"DP eps: {dp_epsilon.value:.1f} | Robustness: {_adv_robustness*100:.0f}% | "
+ f"Carbon reduction: {_carbon_reduction*100:.0f}% | "
+ f"Uptime: {_uptime_pct*100:.3f}%"
+ ), kind="success")
+ elif _banners:
+ mo.vstack(_banners)
else:
- _formula
- return (
- _criterion,
- _base_rate_gap,
- _eu_ok,
- _fpr_gap_pct,
- _calibration_gap_pct,
- _fairness_acc_penalty_pct,
- _chouldechova_active,
- )
-
-
-# ─── ACT II: CARBON CONSTRAINT ────────────────────────────────────────────────
-@app.cell(hide_code=True)
-def _(mo):
- mo.md("### Carbon Constraint — 2027 Carbon-Neutral Commitment")
+ mo.callout(mo.md(
+ f"**{_n_met}/6 constraints met.** "
+ f"Adjust the sliders above to satisfy all constraints simultaneously."
+ ), kind="info")
return
-@app.cell(hide_code=True)
-def _(
- CARBON_THRESHOLD_G_KWH,
- COLORS,
- EU_GRID_CARBON_G_KWH,
- H100_TDP_W,
- RENEW_CARBON_G_KWH,
- _cluster_power_mw,
- _N,
- _serving_cost_yr_b,
- mo,
-):
- # ── Total system power ─────────────────────────────────────────────────────
- # Training cluster + serving fleet + overhead (PUE 1.2×)
- # Source: @sec-sustainable-ai-data-center-pue
- _PUE = 1.2 # Power Usage Effectiveness; industry average
- _serving_gpus_est = int(_serving_cost_yr_b * 1e9 / (3.50 * 8760)) # rough estimate
-
- _total_gpus = int(_N) + _serving_gpus_est
- _raw_power_mw = _total_gpus * H100_TDP_W / 1e6 # MW
- _total_power_mw = _raw_power_mw * _PUE
-
- # ── Annual energy ─────────────────────────────────────────────────────────
- _annual_energy_gwh = _total_power_mw * 8760 / 1000 # GWh/year
-
- # ── Carbon emissions ──────────────────────────────────────────────────────
- # Assume 3 data center regions: EU, US-CA, US-East
- # Mix: 40% renewable PPA, 60% grid average
- _renew_fraction = 0.4
- _eff_carbon = (
- _renew_fraction * RENEW_CARBON_G_KWH +
- (1 - _renew_fraction) * EU_GRID_CARBON_G_KWH
- )
- _carbon_ok = _eff_carbon <= CARBON_THRESHOLD_G_KWH
-
- _annual_co2_kt = _annual_energy_gwh * _eff_carbon / 1e6 * 1e9 / 1e6 # kilotonnes CO2
-
- # Carbon-neutral path: 100% renewable PPA
- _eff_carbon_100renew = RENEW_CARBON_G_KWH
- _co2_100renew_kt = _annual_energy_gwh * _eff_carbon_100renew / 1e6 * 1e9 / 1e6
-
- # ── Renewable PPA cost premium ─────────────────────────────────────────────
- # Renewable PPA ~$50/MWh vs grid ~$40/MWh → ~25% premium
- # Source: @sec-sustainable-ai-carbon-aware-scheduling
- _ppa_premium_usd_m = _annual_energy_gwh * 1000 * 10.0 / 1e6 # $10/MWh delta × GWh
-
- _carbon_color = COLORS["GreenLine"] if _carbon_ok else COLORS["RedLine"]
- _eff_color = COLORS["GreenLine"] if _eff_carbon <= CARBON_THRESHOLD_G_KWH else (
- COLORS["OrangeLine"] if _eff_carbon <= 150 else COLORS["RedLine"]
- )
-
- _formula = mo.Html(f"""
-
-
- Carbon Physics — Jevons Paradox + Grid Carbon Intensity
-
-
-
Total GPUs (train + serve) ≈ {_total_gpus:,}
-
Raw GPU power = {_total_gpus:,} × {H100_TDP_W}W
- = {_raw_power_mw:.0f} MW
-
-
Total facility power (PUE {_PUE})
- = {_total_power_mw:.0f} MW
-
-
Annual energy = {_total_power_mw:.0f}MW × 8,760 hr
- = {_annual_energy_gwh:.0f} GWh/yr
-
-
Effective carbon intensity ({int(_renew_fraction*100)}% renewable PPA)
- =
- {_eff_carbon:.0f} g CO&sub2;/kWh
- (threshold: {CARBON_THRESHOLD_G_KWH} g/kWh)
- {'✓ carbon-neutral' if _carbon_ok else '❌ above threshold'}
-
-
Annual CO&sub2; emissions
- ≈ {_annual_co2_kt:.0f} kt CO&sub2;
-
-
100% renewable path: {_co2_100renew_kt:.0f} kt CO&sub2;
- | PPA premium: +${_ppa_premium_usd_m:.0f}M/yr
-
-
-
- """)
-
- _banners = []
- if not _carbon_ok:
- _banners.append(mo.callout(mo.md(
- f"**Carbon Target Missed.** With {int(_renew_fraction*100)}% renewable PPA, "
- f"effective carbon intensity is **{_eff_carbon:.0f} g CO2/kWh**, "
- f"exceeding the carbon-neutral threshold of {CARBON_THRESHOLD_G_KWH} g/kWh. "
- f"Your {_total_power_mw:.0f} MW fleet emits ~{_annual_co2_kt:.0f} kt CO2/year. "
- f"To reach carbon-neutral by 2027, increase renewable PPA to ≥90% or "
- f"relocate training workloads to zero-carbon regions (Iceland, Norway, Quebec). "
- f"Note the Jevons Paradox (@sec-sustainable-ai-jevons): "
- f"efficiency improvements alone cannot reach this target if fleet size grows."
- ), kind="danger"))
-
- if _banners:
- mo.vstack([_formula] + _banners)
- else:
- _formula
- return (
- _carbon_ok,
- _total_power_mw,
- _annual_energy_gwh,
- _eff_carbon,
- _annual_co2_kt,
- _total_gpus,
- )
-
-
-# ─── ACT II: SYSTEM FEASIBILITY VERDICT ──────────────────────────────────────
-@app.cell(hide_code=True)
-def _(mo):
- mo.md("### System Feasibility Verdict")
- return
-
-
-@app.cell(hide_code=True)
-def _(
- COLORS,
- _accuracy_penalty_pct,
- _annual_co2_kt,
- _avail_ok,
- _carbon_ok,
- _eu_ok,
- _fairness_acc_penalty_pct,
- _gdpr_ok,
- _oom,
- _overhead_ok,
- _p99_ms,
- _serving_cost_yr_b,
- _slo_ok,
- _total_gpus,
- _train_cost_m,
- go,
- apply_plotly_theme,
- mo,
-):
- # ── Aggregate system validity ──────────────────────────────────────────────
- _constraints = {
- "Training: No OOM": not _oom,
- "Checkpoint: Overhead OK": _overhead_ok,
- "Serving: P99 < 500ms": _slo_ok,
- "Reliability: 99.99% avail": _avail_ok,
- "Privacy: GDPR ε ≤ 1": _gdpr_ok,
- "Fairness: EU AI Act ≤ 10%": _eu_ok,
- "Carbon: Neutral by 2027": _carbon_ok,
- }
-
- _total_pass = sum(_constraints.values())
- _total_checks = len(_constraints)
- _system_valid = _total_pass == _total_checks
-
- # ── Total cost estimate ────────────────────────────────────────────────────
- _total_cost_b = (_train_cost_m * 12 / 1000) + _serving_cost_yr_b # billion $/year
-
- # ── Constraint bar chart ───────────────────────────────────────────────────
- _labels = list(_constraints.keys())
- _pass = [1 if v else 0 for v in _constraints.values()]
- _colors_bar = [
- COLORS["GreenLine"] if v else COLORS["RedLine"]
- for v in _constraints.values()
- ]
-
- _fig = go.Figure(go.Bar(
- x=_labels,
- y=_pass,
- marker_color=_colors_bar,
- text=["PASS" if v else "FAIL" for v in _constraints.values()],
- textposition="outside",
- textfont=dict(size=11, color=COLORS["TextSec"]),
- ))
- _fig.update_layout(
- height=280,
- xaxis=dict(tickangle=-20, tickfont=dict(size=10, color=COLORS["TextSec"])),
- yaxis=dict(visible=False, range=[0, 1.4]),
- margin=dict(t=40, b=80, l=20, r=20),
- title=dict(
- text=f"System Constraint Audit — {_total_pass}/{_total_checks} Passed",
- font=dict(size=13, color=COLORS["Text"]),
- x=0.5,
- ),
- )
- apply_plotly_theme(_fig)
-
- # ── Summary card ──────────────────────────────────────────────────────────
- _verdict_color = COLORS["GreenLine"] if _system_valid else COLORS["RedLine"]
- _verdict_label = "FEASIBLE" if _system_valid else "INFEASIBLE"
-
- _summary = mo.Html(f"""
-
-
-
- System Verdict
-
-
- {_verdict_label}
-
-
- {_total_pass}/{_total_checks} constraints
-
-
-
-
- P99 Latency
-
-
- {_p99_ms:.0f}ms
-
-
SLO: < 500ms
-
-
-
- Annual Cost
-
-
- ${_total_cost_b:.1f}B
-
-
- budget: $10B total
-
-
-
-
- Annual CO2
-
-
- {_annual_co2_kt:.0f}kt
-
-
- {'carbon-neutral' if _carbon_ok else 'above target'}
-
-
-
- """)
-
- mo.vstack([_summary, mo.ui.plotly(_fig)])
- return (
- _constraints,
- _system_valid,
- _total_pass,
- _total_checks,
- _total_cost_b,
- _verdict_label,
- )
-
-
# ─── ACT II: PREDICTION REVEAL ────────────────────────────────────────────────
@app.cell(hide_code=True)
def _(
- COLORS,
- _constraints,
- _system_valid,
- _total_pass,
- _total_checks,
+    _accuracy,
+    _accuracy_met,
+    _adversarial_met,
+    _carbon_met,
+    _constraints_all_met,
+    _dp_met,
+    _fault_tol_met,
+    _latency_met,
+    _n_met,
    act2_pred,
    mo,
):
-    _constraint_names_failing = [k for k, v in _constraints.items() if not v]
+    _failing = []
+    if not _accuracy_met: _failing.append("Accuracy")
+    if not _latency_met: _failing.append("P99 Latency")
+    if not _dp_met: _failing.append("DP Compliance")
+    if not _adversarial_met: _failing.append("Adversarial Robustness")
+    if not _carbon_met: _failing.append("Carbon Reduction")
+    if not _fault_tol_met: _failing.append("Fault Tolerance")
_feedback_map = {
"A": (
- "**Training constraint.** "
- "Memory is real: a 1T parameter model in full training state requires 20 TB. "
- "Without 3D parallelism, no H100 (80 GB) cluster of any size can hold it in "
- "per-GPU memory. The OOM constraint is structural, not solvable by adding GPUs "
- "without also changing the sharding strategy. However, with 3D parallelism "
- "and sufficient sharding, the training constraint *can* be satisfied — it is "
- "not the binding limit at reasonable cluster sizes."
+ "**DP epsilon is genuinely difficult** — at eps=1, accuracy degrades by ~5%. "
+ "For a baseline 95% target model, this leaves no margin for other accuracy "
+ "costs. But this is only correct in a narrow sense. The deeper issue is that "
+ "DP noise and adversarial training *both* degrade accuracy in the same direction: "
+ "both smooth decision boundaries. These two constraints are "
+ "**fundamentally incompatible**, not just difficult to balance simultaneously. "
+ "DP adds noise to make the model's outputs less sensitive to any individual "
+ "training sample. Adversarial training adds noise to make the model robust "
+ "to input perturbations. Both mechanisms reduce model confidence — but for "
+ "orthogonal reasons. This is the mathematical conflict at the heart of Act II."
),
"B": (
- "**Serving constraint.** "
- "P99 < 500ms at 5B users is achievable — but only by combining cloud, edge, "
- "and mobile tiers. Serving all requests at cloud P99 would require enormous "
- "replica counts to keep utilization below the tail-latency cliff. "
- "Edge and mobile offloading (quantized models at INT4/INT2) are the architectural "
- "levers that make the P99 SLO achievable within budget. This is the design "
- "insight: serving is tractable *if* you use the full tier hierarchy."
+ "**Correct.** No single architecture satisfies all six constraints without "
+ "explicit tradeoff negotiation. The key conflicts are: "
+ "(1) DP and adversarial robustness both reduce accuracy — they cannot both "
+ "be maximized without a model large enough to absorb both penalties; "
+ "(2) large models reduce P99 latency; (3) carbon reduction conflicts with "
+ "fleet scale. The feasible region (all-green) requires navigating the "
+ "intersection of these constraints — which is exactly what the Architecture "
+ "Synthesizer reveals. This is the Chouldechova-generalized lesson: "
+ "in multi-constraint systems, you choose which constraint to relax."
),
"C": (
- "**Privacy constraint.** "
- "GDPR-grade DP (ε ≤ 1) does impose accuracy degradation — approximately 5% at ε=1. "
- "This is significant but not fatal. The binding privacy constraint is not "
- "accuracy degradation but *data residency*: EU data cannot be used to train "
- "a centralized model without GDPR compliance, so federated learning with "
- "LoRA adapter updates is architecturally required regardless of ε. "
- "Privacy is a hard structural constraint, not just an accuracy tax."
+ "**Partially correct, but incomplete.** The 10,000 H100 budget *can* "
+ "accommodate the serving load at smaller model sizes. But budget sufficiency "
+ "does not equal constraint satisfaction. Even with 10,000 H100s, "
+ "a 70B model at P99 < 200ms requires more shards-per-replica than available, "
+ "and DP + adversarial training may push accuracy below 95%. "
+ "Hardware budget is necessary but not sufficient."
),
"D": (
- "**All constraints satisfiable.** "
- "You are correct in spirit: with the right architectural choices, all constraints "
- "can be satisfied simultaneously. But 'simultaneously' is doing a lot of work. "
- "Each satisfying configuration requires a specific combination: 3D parallelism "
- "for training, tier-aware serving for P99, Young-Daly optimal checkpointing, "
- "GDPR-compliant federated strategy for EU, per-jurisdiction fairness models, "
- "and ≥90% renewable PPA for carbon. The constraints are navigable — but they "
- "are not independent. Every architectural choice propagates to multiple constraints. "
- "That is the meta-principle."
+ "**Incorrect.** Carbon reduction is NOT independent of other constraints. "
+ "The Jevons Paradox directly links carbon to fleet scale: if you add GPUs "
+ "to satisfy the latency SLO, you increase total power consumption, "
+ "which makes the carbon target harder to hit. Carbon-aware scheduling "
+ "reduces effective CI, but only if deferrable jobs exist to shift. "
+ "Carbon is entangled with every other dimension through fleet size."
),
}
- _chosen = _feedback_map.get(act2_pred.value, _feedback_map["D"])
+ _chosen = _feedback_map.get(act2_pred.value, _feedback_map["B"])
- if _system_valid:
- _status = mo.callout(mo.md(
- f"**Feasible architecture found.** {_total_pass}/{_total_checks} constraints pass. "
- + _chosen
+ if _constraints_all_met:
+ mo.callout(mo.md(
+ f"**{_n_met}/6 constraints satisfied.** " + _chosen
), kind="success")
else:
- _fail_list = ", ".join(_constraint_names_failing)
- _status = mo.callout(mo.md(
- f"**Architecture infeasible.** {_total_pass}/{_total_checks} constraints pass. "
- f"Failing: **{_fail_list}**. " + _chosen
+ _fail_str = ", ".join(_failing) if _failing else "multiple"
+ mo.callout(mo.md(
+ f"**{_n_met}/6 constraints satisfied.** "
+ f"Currently failing: **{_fail_str}**. "
+ + _chosen
), kind="warn")
-
- _status
return
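Feedback D's claim that carbon is entangled with fleet size can be made concrete with a toy sketch; every constant below is an assumption for illustration, not a value from this notebook:

```python
# Jevons-style coupling: GPUs added to fix queueing latency raise power draw,
# and emissions scale linearly with fleet size at fixed carbon intensity.
H100_TDP_W = 700       # assumed per-GPU power draw (watts)
PUE = 1.2              # assumed facility overhead multiplier
CI_G_PER_KWH = 150.0   # assumed effective grid carbon intensity

def annual_co2_tonnes(n_gpus: int) -> float:
    kwh_per_year = n_gpus * H100_TDP_W / 1000.0 * PUE * 8760
    return kwh_per_year * CI_G_PER_KWH / 1e6  # grams -> tonnes

# Quadrupling the fleet to cut tail latency quadruples emissions.
small_fleet = annual_co2_tonnes(1_000)
large_fleet = annual_co2_tonnes(4_000)
```

Carbon-aware scheduling lowers the effective CI term, but the fleet-size term grows with every replica added for latency or availability.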
-# ─── ACT II: MATH PEEK ────────────────────────────────────────────────────────
+# ─── ACT II: REFLECTION ───────────────────────────────────────────────────────
+@app.cell(hide_code=True)
+def _(mo):
+ mo.md("### Reflection")
+ return
+
+
+@app.cell(hide_code=True)
+def _(mo):
+ act2_reflection = mo.ui.radio(
+ options={
+ "A) P99 latency and model accuracy — larger models are slower": "A",
+ "B) DP privacy and adversarial robustness — both require noise/randomization but in opposite directions for model confidence": "B",
+ "C) Carbon reduction and fault tolerance — checkpointing uses more energy": "C",
+ "D) Parallelism efficiency and checkpoint overhead — communication vs. recovery cost": "D",
+ },
+ label="Which two constraints are FUNDAMENTALLY incompatible (not just hard to balance simultaneously)?",
+ )
+ act2_reflection
+ return (act2_reflection,)
+
+
+@app.cell(hide_code=True)
+def _(act2_reflection, mo):
+ if act2_reflection.value is None:
+ mo.callout(
+ mo.md("Select your reflection answer above to continue."),
+ kind="warn",
+ )
+ elif act2_reflection.value == "B":
+ mo.callout(mo.md(
+ "**Correct.** DP privacy and adversarial robustness are fundamentally "
+ "incompatible in the following sense: "
+ "**DP noise makes the model's outputs smoother and less sensitive** "
+ "to individual inputs (including adversarial perturbations). "
+ "**Adversarial training sharpens the model's decision boundaries** "
+ "to resist those same perturbations. "
+ "These two mechanisms push model confidence in opposite directions. "
+ "DP adds isotropic Gaussian noise to gradients during training, which "
+ "diffuses the loss landscape. Adversarial training concentrates the "
+ "loss signal at adversarial examples, sharpening it. "
+ "The result: achieving strong DP (low eps) while simultaneously achieving "
+ "high adversarial robustness requires a model with enough capacity to "
+ "maintain both — but both penalize clean accuracy. "
+ "This is not an engineering challenge. It is an algebraic tension, "
+ "analogous to Chouldechova's impossibility in the fairness domain."
+ ), kind="success")
+ elif act2_reflection.value == "A":
+ mo.callout(mo.md(
+ "**This is a tradeoff, not a fundamental incompatibility.** "
+ "Larger models are slower — true. But you can add replicas, use "
+ "quantization, or select a smaller model that still achieves 95% accuracy. "
+ "Latency and accuracy can both be satisfied with the right design. "
+ "There is no mathematical theorem preventing their simultaneous satisfaction. "
+ "DP and adversarial robustness, by contrast, have mechanistic interference."
+ ), kind="warn")
+ elif act2_reflection.value == "C":
+ mo.callout(mo.md(
+ "**This is not fundamentally incompatible.** "
+ "Carbon-aware scheduling and checkpoint frequency operate on different "
+ "timescales and resource dimensions. You can checkpoint frequently "
+ "without increasing power consumption (checkpoints are I/O-bound, "
+ "not compute-bound). Fault tolerance and carbon are independently satisfiable "
+ "with the right architectural choices. They are not mechanistically coupled."
+ ), kind="warn")
+ else:
+ mo.callout(mo.md(
+ "**This is a tradeoff, not a fundamental incompatibility.** "
+ "Communication overhead and checkpoint cost can be jointly minimized "
+ "with asynchronous checkpointing and topology-aware AllReduce. "
+ "They compete for network bandwidth but do not violate any theorem. "
+ "The right system design reduces both independently."
+ ), kind="warn")
+ return
+
+
+# ─── ACT II: MATHPEEK ACCORDION ───────────────────────────────────────────────
@app.cell(hide_code=True)
def _(mo):
mo.accordion({
- "The five governing equations for Act II": mo.md("""
- **Young-Daly (Fault Tolerance):**
- `T* = sqrt(2 × C / λ)`
- where C = checkpoint write time, λ = cluster failure rate = N_GPUs / MTBF_per_GPU
- — Source: @sec-fault-tolerance-young-daly
+ "The fundamental conflict: DP noise vs. adversarial robustness": mo.md("""
+ **Differential Privacy (DP Noise Direction):**
+ During training, DP-SGD clips gradients to sensitivity S, then adds noise:
+ `g_tilde = clip(g, S) + N(0, sigma^2 * S^2 * I)`
+ Effect: gradients from all training examples (including adversarial ones)
+ are *smoothed*. The learned decision boundary becomes flatter near training points.
+ — Source: @sec-security-privacy-dp-sgd
- **Little's Law (Serving):**
- `N_concurrent = λ_arrival × W_latency`
- At saturation: `Throughput_max = N_replicas / (latency × n_shards)`
- P99 ≈ 3× mean for M/M/1 queues at moderate utilization
- — Source: @sec-model-serving-littles-law
+ **Adversarial Training (Robustness Direction):**
+ At each step, adversarial training maximizes the loss over a perturbation ball:
+ `theta* = argmin_theta E[max_{delta: ||delta|| <= eps} L(x + delta, y; theta)]`
+ Effect: the decision boundary is forced *sharp* at adversarial perturbations.
+ The model must distinguish clean from perturbed inputs with high confidence.
+ — Source: @sec-robust-ai-pgd
- **Roofline (Inference Latency):**
- `Tokens/sec = min(TFLOPS × MFU, BW_GBs / bytes_per_param / params)`
- At batch=1 (autoregressive decode), arithmetic intensity = 1 op/byte → bandwidth-bound
- — Source: @sec-hw-acceleration-roofline
+ **The Tension:**
+ DP smooths → lower confidence near any input.
+ Adversarial training sharpens → higher confidence near adversarial inputs.
+ Both *penalize clean accuracy* for different reasons.
+ At low DP epsilon (strong privacy), the noise scale sigma is large,
+ and the gradients from adversarial examples are effectively washed out —
+ the adversarial training signal is attenuated by DP noise.
+ This is not fixable by adding more data or a larger model:
+ it is a consequence of the conflicting objectives.
- **Differential Privacy:**
- `ε ≥ Δf / σ` where σ = noise scale, Δf = L2 sensitivity of query
- GDPR-grade: ε ≤ 1.0; CCPA-grade: ε ≤ 3.0
- Accuracy penalty ≈ k/ε (monotone: stronger privacy = more noise = lower accuracy)
- — Source: @sec-security-privacy-dp
-
- **Chouldechova Impossibility:**
- When base rates differ between groups A and B (p_A ≠ p_B), any classifier
- satisfying calibration AND equalized odds must have FPR_A ≠ FPR_B.
- No ML improvement can resolve this — it is an algebraic identity.
- — Source: @sec-responsible-ai-chouldechova
+ **The Resolution:**
+ The feasible region exists (all-green is achievable in this lab)
+ but requires: (1) a model large enough that both accuracy penalties still
+ leave you above 95%, (2) an epsilon in [0.5, 1.0] that satisfies HIPAA
+ while not destroying the adversarial training signal, and (3) an
+ adversarial weight calibrated to the DP noise level.
""")
})
return
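The two mechanisms in this accordion can be sketched side by side; the clipping bound, noise multiplier, and step size below are hypothetical, and a single FGSM-style step stands in for a full PGD loop:

```python
import math
import random

# DP-SGD side: clip the gradient to L2 norm S, then add Gaussian noise
# scaled by sigma * S (this smooths the learned decision boundary).
def dp_sgd_privatize(grad, clip_s=1.0, sigma=2.0, seed=0):
    rng = random.Random(seed)
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_s / max(norm, 1e-12))
    return [g * scale + rng.gauss(0.0, sigma * clip_s) for g in grad]

# Adversarial side: an FGSM-style step moves the input along the gradient
# sign to find a worst-case perturbation (sharpening the boundary there).
def fgsm_step(x, grad_x, eps=0.1):
    return [xi + eps * (1.0 if gi >= 0 else -1.0) for xi, gi in zip(x, grad_x)]

noisy_grad = dp_sgd_privatize([3.0, 4.0])   # ||g||=5 is clipped to 1 pre-noise
adv_input = fgsm_step([0.5, -0.2], [0.3, -0.7])
```

At large sigma (small epsilon), the noise term dominates the clipped gradient, which is the attenuation of the adversarial training signal described in the accordion.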
# ═══════════════════════════════════════════════════════════════════════════════
-# DESIGN LEDGER SAVE + HUD
+# VOL1 + VOL2 SYNTHESIS TIMELINE
+# ═══════════════════════════════════════════════════════════════════════════════
+
+
+@app.cell(hide_code=True)
+def _(mo):
+ mo.md("""
+ ---
+ ## Curriculum Journey Summary
+ """)
+ return
+
+
+@app.cell(hide_code=True)
+def _(COLORS, _ledger_map, mo):
+ # Build a visual timeline of all 33 labs with constraint_hit indicators
+ _vol1_entries = [
+ (f"V1-{c:02d}", str(c)) for c in range(1, 17)
+ ]
+ _vol2_entries = [
+ (f"V2-{c:02d}", f"v2_{c:02d}") for c in range(1, 18)
+ ]
+ _all_entries = _vol1_entries + _vol2_entries
+
+ def _dot(label, key):
+ _d = _ledger_map.get(key, {})
+ _done = key in _ledger_map
+ _hit = _d.get("constraint_hit", False)
+ if not _done:
+ _bg = "#e2e8f0"
+ _color = "#94a3b8"
+ _sym = ""
+ elif _hit:
+ _bg = COLORS["RedL"]
+ _color = COLORS["RedLine"]
+ _sym = "!"
+ else:
+ _bg = COLORS["GreenL"]
+ _color = COLORS["GreenLine"]
+ _sym = "✓"
+        return (
+            f'<span style="display: inline-block; background: {_bg}; '
+            f'color: {_color}; border-radius: 999px; padding: 3px 8px; '
+            f'margin: 3px; font-size: 10px; font-weight: 600;">'
+            f'{label} {_sym}'
+            f'</span>'
+        )
+
+ _v1_dots = "".join(_dot(lbl, key) for lbl, key in _vol1_entries)
+ _v2_dots = "".join(_dot(lbl, key) for lbl, key in _vol2_entries)
+
+ _total_done = sum(1 for _, k in _all_entries if k in _ledger_map)
+ _total_hit = sum(
+ 1 for _, k in _all_entries
+ if _ledger_map.get(k, {}).get("constraint_hit", False)
+ )
+
+ # Dominant context across all labs
+ _contexts = [
+ _ledger_map[k].get("context", "")
+ for _, k in _all_entries if k in _ledger_map
+ ]
+ from collections import Counter as _Counter
+ _ctx_count = _Counter(_contexts)
+ _dom_ctx = _ctx_count.most_common(1)[0][0] if _ctx_count else "N/A"
+
+ mo.Html(f"""
+
+
+ Lab Journey — All 33 Labs (Vol I + Vol II)
+
+
+ Volume I (Labs V1-01 through V1-16)
+
+
+ {_v1_dots}
+
+
+ Volume II (Labs V2-01 through V2-17)
+
+
+ {_v2_dots}
+
+
+
+ Labs completed:
+ {_total_done}/33
+
+
+ Constraints triggered:
+ {_total_hit}
+
+
+ Dominant context:
+ {_dom_ctx}
+
+
+ Green = completed, no failure |
+ Red = constraint triggered |
+ Grey = not yet completed
+
+
+
+ """)
+ return
+
+
+@app.cell(hide_code=True)
+def _(mo):
+ mo.md("""
+ *You have completed the ML Systems curriculum. The physics doesn't change — the constraints
+ just shift with scale.*
+ """)
+ return
+
+
+# ═══════════════════════════════════════════════════════════════════════════════
+# DESIGN LEDGER SAVE + HUD FOOTER
# ═══════════════════════════════════════════════════════════════════════════════
@app.cell(hide_code=True)
def _(
COLORS,
- _accuracy_penalty_pct,
- _avail_ok,
- _carbon_ok,
- _eu_ok,
- _gdpr_ok,
- _is_federated,
- _N,
- _oom,
- _overhead_ok,
- _slo_ok,
- _system_valid,
- _total_cost_b,
- _total_gpus,
- _total_pass,
- _total_checks,
- _T_min,
- _verdict_label,
+ _accuracy,
+ _accuracy_met,
+ _adversarial_met,
+ _adv_robustness,
+ _carbon_met,
+ _carbon_reduction,
+ _constraints_all_met,
+ _dp_met,
+ _fault_tol_met,
+ _latency_met,
+ _n_met,
+ _p99_latency_ms,
+ _uptime_pct,
act1_pred,
act2_pred,
- d1_parallelism,
- d4_epsilon,
- d5_fairness,
+ adv_train_weight,
+ checkpoint_interval_min,
+ context_toggle,
+ dp_epsilon,
+ flexible_job_pct,
ledger,
+ model_size_b,
mo,
+ parallelism_strategy,
):
- # ── Invariants applied in this lab ────────────────────────────────────────
- _invariants = [
- "Young-Daly Optimal Checkpoint",
- "Amdahl Scale Ceiling",
- "Roofline Bandwidth Bound",
- "Little's Law Serving",
- "Differential Privacy epsilon-delta",
- "Chouldechova Impossibility",
- "Jevons Carbon Paradox",
- "Memory Wall / OOM",
- ]
-
# ── Save to Design Ledger ─────────────────────────────────────────────────
ledger.save(
chapter="v2_17",
design={
- "context": "full_fleet",
- "cluster_gpus": int(_N),
- "parallelism_strategy": d1_parallelism.value,
- "checkpoint_interval_min": float(_T_min),
- "dp_epsilon": float(d4_epsilon.value),
- "fairness_criterion": d5_fairness.value,
- "carbon_compliant": bool(_carbon_ok),
- "p99_slo_met": bool(_slo_ok),
- "total_system_cost_b": float(_total_cost_b),
+ "context": context_toggle.value,
+ "model_size_b": float(model_size_b.value),
+ "dp_epsilon": float(dp_epsilon.value),
+ "adv_train_weight": float(adv_train_weight.value),
+ "parallelism_strategy": parallelism_strategy.value,
+ "checkpoint_interval_min": int(checkpoint_interval_min.value),
+ "flexible_job_pct": float(flexible_job_pct.value),
+ "constraints_all_met": bool(_constraints_all_met),
+ "accuracy_met": bool(_accuracy_met),
+ "latency_met": bool(_latency_met),
+ "dp_met": bool(_dp_met),
+ "adversarial_met": bool(_adversarial_met),
+ "carbon_met": bool(_carbon_met),
+ "fault_tolerance_met": bool(_fault_tol_met),
"act1_prediction": str(act1_pred.value),
- "act1_correct": False, # no single correct answer in Act I
- "act2_result": "feasible" if _system_valid else "infeasible",
- "act2_decision": d1_parallelism.value,
- "constraint_hit": not _system_valid,
- "system_valid": bool(_system_valid),
- "invariants_connected": _invariants,
+ "act1_correct": act1_pred.value == "D",
+ "act2_result": "approved" if _constraints_all_met else "infeasible",
+ "act2_decision": parallelism_strategy.value,
+ "constraint_hit": not _constraints_all_met,
+ "curriculum_complete": True,
}
)
- # ── HUD footer ────────────────────────────────────────────────────────────
- _checks_list = [
- ("Training: No OOM", not _oom),
- ("Checkpoint Overhead OK", _overhead_ok),
- ("P99 < 500ms SLO", _slo_ok),
- ("Availability 99.99%", _avail_ok),
- ("GDPR ε Compliance", _gdpr_ok),
- ("EU AI Act Fairness", _eu_ok),
- ("Carbon Neutral 2027", _carbon_ok),
+ # ── Build constraint status list ─────────────────────────────────────────
+ _checks = [
+ ("Accuracy >= 95%", _accuracy_met),
+ ("P99 < 200ms", _latency_met),
+ ("DP eps <= 1", _dp_met),
+ ("Robustness >= 50%", _adversarial_met),
+ ("Carbon -40%", _carbon_met),
+ ("Uptime 99.9%", _fault_tol_met),
]
_badge_html = "".join([
@@ -2079,9 +1749,12 @@ def _(
font-weight: 600; margin: 3px;">
{'✓' if ok else '❌'} {label}
"""
- for label, ok in _checks_list
+ for label, ok in _checks
])
+ _arch_status = "APPROVED" if _constraints_all_met else f"INFEASIBLE ({_n_met}/6)"
+ _status_color = "#4ade80" if _constraints_all_met else "#f87171"
+
_hud = mo.Html(f"""
- Design Ledger · Chapter v2_17 · Capstone
+ LAB = V2-17 (CAPSTONE) ·
+ CONTEXT = {context_toggle.value.upper()} ·
+ CURRICULUM COMPLETE
- Planet-Scale Architecture: {_verdict_label}
- {_total_pass}/{_total_checks} constraints passed
+ Architecture Status:
+ {_arch_status}
+ — CONSTRAINTS MET: {_n_met}/6
""")
@@ -2123,7 +1812,7 @@ def _(
# ═══════════════════════════════════════════════════════════════════════════════
-# CURRICULUM SYNTHESIS — THE META-PRINCIPLE
+# THE META-PRINCIPLE
# ═══════════════════════════════════════════════════════════════════════════════
@@ -2137,19 +1826,21 @@ def _(mo):
**physical laws create hard ceilings that no amount of engineering can dissolve.**
You cannot wish away the memory wall — HBM bandwidth is determined by signal
- physics and pin count. You cannot wish away Amdahl's Law — coordination cost
- grows with cluster size regardless of how good your scheduler is. You cannot
- wish away Chouldechova's theorem — it follows from the definition of conditional
- probability. You cannot wish away Young-Daly — it follows from the calculus of
- minimization. You cannot wish away Little's Law — it follows from queueing theory
- steady-state.
+ physics and pin count. You cannot wish away Amdahl's Law — the serial fraction
+ of your workload caps speedup regardless of cluster size. You cannot wish away
+ Chouldechova's theorem — it follows from the definition of conditional probability
+ when base rates differ. You cannot wish away Young-Daly — it follows from the
+ calculus of minimization under a Poisson failure process. You cannot wish away
+ Little's Law — it follows from queueing theory steady-state. You cannot make DP
+ and adversarial robustness simultaneously costless — they are mechanistically
+ opposed in the same loss landscape.
But you *can* navigate these constraints. That is the discipline of ML systems:
not finding a way around the physics, but designing systems that respect it.
The skilled ML architect does not ask: "How do I avoid the memory wall?"
They ask: "Which memory-wall-respecting architecture best satisfies my
- throughput, latency, and cost requirements simultaneously?"
+ throughput, latency, cost, and safety requirements simultaneously?"
That is the question this curriculum trained you to ask.
""")
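The Young-Daly bound invoked above can be sketched numerically. This is a minimal illustration, not code from the lab; the 6-hour fleet MTBF and 5-minute checkpoint-write time below are assumed figures chosen only to make the arithmetic visible.

```python
import math


def young_daly_interval_min(mtbf_hours: float, ckpt_write_min: float) -> float:
    """First-order Young-Daly optimum: T* = sqrt(2 * delta * MTBF),
    where delta is the time to write one checkpoint."""
    return math.sqrt(2.0 * ckpt_write_min * mtbf_hours * 60.0)


# Illustrative assumptions: 6 h fleet MTBF, 5 min per checkpoint write
t_star = young_daly_interval_min(mtbf_hours=6.0, ckpt_write_min=5.0)
# sqrt(2 * 5 * 360) = 60.0 min between checkpoints
```

Checkpointing more often than T* wastes time writing state; less often wastes time recomputing lost work after a failure — the minimum falls out of the calculus, not of tuning.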
@@ -2166,12 +1857,12 @@ def _(mo):
which constraint to prioritize when they cannot all be satisfied simultaneously.
The invariants give you the exact tradeoff surface. Read them.
- 2. **The gap between your prediction and reality is your learning.**
- If your Systems Intuition Radar shows a weak domain, that is not a failure —
- it is a calibration report. The researchers who built the fastest training
- systems, the most efficient serving pipelines, and the fairest production
- models were the ones who had internalized which invariants bind when,
- and why. That intuition is now yours to develop.
+ 2. **The bottleneck moves with scale — that is the curriculum.**
+ Memory dominates single-node inference. Communication dominates multi-node
+ training. Privacy and fairness constraints activate at any scale but are
+   invisible until you look across populations. The researchers who built the
+   fastest training systems internalized which invariants bind when, and why.
+ That intuition is now yours to develop.
""")
return
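Point 2's scale ceiling can be made concrete with a minimal Amdahl sketch. The 5% serial fraction below is an assumed figure for illustration, not a measurement from the lab:

```python
def amdahl_speedup(serial_fraction: float, n_workers: int) -> float:
    """Amdahl's Law: speedup = 1 / (s + (1 - s) / n), capped at 1/s."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_workers)


# Illustrative: 5% unparallelizable coordination work
s_1024 = amdahl_speedup(0.05, 1024)  # already near the ceiling
ceiling = 1.0 / 0.05                 # 20x, independent of cluster size
```

Going from 1,024 to a million workers buys almost nothing here — which is why the bottleneck moves from compute to the serial fraction as the cluster grows.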