mirror of https://github.com/harvard-edge/cs249r_book.git synced 2026-05-03 16:18:49 -05:00

Vijay Janapa Reddi 533cfa6e99 fix: pre-commit hooks — all 48 checks now pass

2026-03-01 17:30:24 -05:00


📐 Mission Plan: 13_sec_privacy (Volume 2: Fleet Scale)

1. Chapter Context

Chapter Title: Security & Privacy: The Responsible Fleet.
Core Invariant: The Privacy-Utility-Budget Triangle (Accuracy vs. Privacy $\epsilon$ vs. Cost $O$).
The Struggle: Understanding that "Privacy is a Finite Resource." Students must navigate the trade-off between Model Utility (accuracy) and Privacy Guarantees ($\epsilon$), focusing on the "Defense Tax" (latency and energy overhead) of secure execution environments.
Target Duration: 45 Minutes.

2. The 4-Track Storyboard (Security Missions)

| Track | Persona | Fixed North Star Mission | The "Security" Crisis |
| --- | --- | --- | --- |
| Cloud Titan | LLM Architect | Maximize Llama-3-70B serving. | The PII Leak. Your model is regurgitating PII from the training set. The board demands DP-SGD with $\epsilon < 1$. You must determine the exact 'TFLOPS Penalty' required to regain safety. |
| Edge Guardian | AV Systems Lead | Deterministic 10ms safety loop. | The Adversarial Sticker. A 50-cent sticker causes your AV to misidentify a 'Stop' sign as '80 MPH'. You must implement 'Adversarial Training' without breaking the 10ms safety window. |
| Mobile Nomad | AR Glasses Dev | 60FPS AR translation. | The FaceID Side-Channel. Power analysis of the NPU reveals the user's biometric embeddings. You must move inference to a TEE (TrustZone) while staying under the 2W thermal cap. |
| Tiny Pioneer | Hearable Lead | Neural isolation in <10ms under 1mW. | The Model Extraction. Competitors are dumping your model weights via JTAG. You must implement 'Binary Encryption' and PUFs, adding 2ms to your 10ms 'Echo Window'. |

3. The 3-Part Mission (The KATs)

Part 1: The Privacy-Utility Frontier (Exploration - 15 Mins)

Objective: Quantify the "Accuracy Collapse" as the Privacy Epsilon ($\epsilon$) is tightened.
The "Lock" (Prediction): "If you decrease $\epsilon$ from 8.0 (Weak) to 1.0 (Strong), will your training time double, triple, or stay the same to reach the same accuracy?"
The Workbench:
- Action: Slide the Privacy Budget (\epsilon). Adjust Batch Size (DP-SGD sensitivity).
- Observation: The Privacy-Utility Pareto Plot. Watch the "Utility Cliff" appear as noise overrides the gradient signal.
Reflect: "Patterson asks: 'Why does larger batching help recover the accuracy lost to DP noise?' (Reference the Signal-to-Noise ratio)."
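The signal-to-noise argument behind the reflection question can be sketched in a few lines. This is a minimal illustration, not the lab's implementation: the function names, the clip norm, and the noise multiplier are all assumptions chosen for the example. It shows that, for a fixed noise multiplier, the DP noise in the averaged gradient shrinks as 1/B while the clipped signal stays bounded, so a larger batch raises the signal-to-noise ratio. In a real run the privacy accountant may demand a different noise multiplier when the batch size (and hence the sampling rate) changes, so the gain is not free.

```python
import numpy as np


def dp_sgd_mean_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """One DP-SGD style gradient estimate (hypothetical helper).

    Clips each per-example gradient to L2 norm <= clip_norm, sums them,
    adds Gaussian noise with std noise_multiplier * clip_norm, then averages.
    """
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    noise = rng.normal(0.0, noise_multiplier * clip_norm,
                       size=per_example_grads.shape[1])
    return (clipped.sum(axis=0) + noise) / per_example_grads.shape[0]


def noise_std_per_coordinate(batch_size, clip_norm, noise_multiplier):
    """Std of the DP noise term in each coordinate of the averaged gradient."""
    return noise_multiplier * clip_norm / batch_size


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip_norm, noise_multiplier = 1.0, 1.1  # assumed values for illustration
    for batch_size in (64, 256, 1024):
        grads = rng.normal(0.5, 1.0, size=(batch_size, 8))
        estimate = dp_sgd_mean_gradient(grads, clip_norm, noise_multiplier, rng)
        sigma = noise_std_per_coordinate(batch_size, clip_norm, noise_multiplier)
        print(f"B={batch_size:5d}  noise std per coord={sigma:.5f}  "
              f"|estimate|={np.linalg.norm(estimate):.3f}")
```

Quadrupling the batch size cuts the per-coordinate noise by a factor of four in this model, which is the effect the Pareto plot's "Utility Cliff" should make visible.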

Part 2: Secure Aggregation Tax (Trade-off - 15 Mins)

Objective: Balance communication overhead vs. privacy in a distributed Federated Learning fleet.
The "Lock" (Prediction): "Does adding 'Pairwise Masking' (Secure Aggregation) increase network bandwidth consumption or compute latency more?"
The Workbench:
- Interaction: Toggle Secure Aggregation. Adjust Number of Clients (N) and Secret Key Size.
- Instruments: Privacy-Communication Waterfall (Masking Overhead vs. Math Time).
- The 10-Iteration Rule: Students must find the "Mask Complexity" that satisfies the Privacy Officer without causing the 5G sync to exceed the battery budget.
Reflect: "Jeff Dean observes: 'One slow node (straggler) is holding up the entire secure handshake.' Propose a 'Dropout-Resilient' aggregation strategy."
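Pairwise masking can be illustrated with a toy simulation. The sketch below is an assumption-laden simplification: a single random generator stands in for the key agreement and pseudorandom expansion that a real protocol would use, `MODULUS` and all function names are hypothetical, and client updates are assumed to be quantized to integers. It shows the two facts the workbench trades off: the masks cancel in the sum, and the number of pairwise secrets grows quadratically with the number of clients.

```python
import numpy as np

MODULUS = 2 ** 16  # assumed ring size for quantized updates


def pairwise_masks(num_clients, dim, seed=0):
    """Build one mask per client so that all masks sum to zero mod MODULUS.

    For each pair (i, j) with i < j, client i adds a shared random vector
    and client j subtracts the same vector.
    """
    rng = np.random.default_rng(seed)
    masks = np.zeros((num_clients, dim), dtype=np.int64)
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            shared = rng.integers(0, MODULUS, size=dim)
            masks[i] = (masks[i] + shared) % MODULUS
            masks[j] = (masks[j] - shared) % MODULUS
    return masks


def secure_sum(updates, masks):
    """Server-side view: it only sees masked updates, yet recovers the sum."""
    masked = (updates + masks) % MODULUS
    return masked.sum(axis=0) % MODULUS


def num_pairwise_secrets(num_clients):
    """Number of client pairs that must agree on a shared mask."""
    return num_clients * (num_clients - 1) // 2


if __name__ == "__main__":
    updates = np.arange(20).reshape(5, 4)  # 5 clients, 4 parameters each
    masks = pairwise_masks(5, 4, seed=1)
    print("recovered sum:", secure_sum(updates, masks))
    print("true sum:     ", updates.sum(axis=0) % MODULUS)
    for n in (10, 100, 1000):
        print(f"N={n:5d}  pairwise secrets={num_pairwise_secrets(n)}")
```

If any single client's masked update is missing, its masks no longer cancel and the sum is garbage, which is why a straggler stalls the round and why the reflection asks for a dropout-resilient strategy.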

Part 3: The Defense-in-Depth Audit (Synthesis - 15 Mins)

Objective: Design a "Hardened Architecture" that satisfies the Security Lead within the 10ms budget.
The "Lock" (Prediction): "Will running the model in a TEE (Trusted Execution Environment) be faster or slower than running it in plaintext with input sanitization?"
The Workbench:
- Interaction: TEE Toggle. Input Validator Toggle. Output Smoothing Slider.
- The "Stakeholder" Challenge: The Security Lead rejects any design that doesn't use TEEs. You must prove that using Model Sharding (keeping only sensitive layers in TEE) hits the safety goal with 50% less latency than a full-TEE approach.
Reflect (The Ledger): "Defend your final 'Secure Design.' Did you prioritize 'Model Integrity' or 'User Experience'? Justify how you managed the 'Privacy-Utility-Budget Triangle'."
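The Model Sharding argument in the stakeholder challenge comes down to simple latency bookkeeping. The following is a back-of-the-envelope model only: the layer timings, the 3x TEE slowdown, and the 0.2 ms boundary-crossing cost are invented for illustration and are not measurements of any real TEE. Under those assumptions it compares a full-TEE pipeline with one that keeps only the sensitive layers inside the enclave.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    plaintext_ms: float  # assumed latency outside the TEE
    sensitive: bool      # whether the layer handles protected data


def pipeline_latency_ms(layers, in_tee, tee_slowdown, crossing_ms):
    """Total latency when `in_tee(layer)` decides enclave placement.

    Every transition across the TEE boundary (entry or exit) costs crossing_ms;
    layers inside the TEE run tee_slowdown times slower than in plaintext.
    """
    total = 0.0
    inside_prev = False
    for layer in layers:
        inside = in_tee(layer)
        if inside != inside_prev:
            total += crossing_ms
        total += layer.plaintext_ms * (tee_slowdown if inside else 1.0)
        inside_prev = inside
    if inside_prev:
        total += crossing_ms  # leave the enclave to return the result
    return total


# Hypothetical two-stage model: a large backbone and a small sensitive head.
LAYERS = [
    Layer("backbone", plaintext_ms=4.0, sensitive=False),
    Layer("embedding_head", plaintext_ms=0.5, sensitive=True),
]

if __name__ == "__main__":
    full = pipeline_latency_ms(LAYERS, lambda l: True, 3.0, 0.2)
    sharded = pipeline_latency_ms(LAYERS, lambda l: l.sensitive, 3.0, 0.2)
    print(f"full TEE: {full:.1f} ms, sharded: {sharded:.1f} ms, "
          f"reduction: {100 * (1 - sharded / full):.0f}%")
```

With these made-up numbers the full-TEE design misses a 10 ms budget while the sharded design fits with more than 50% less latency. Whether that holds in practice depends on how much of the model is actually sensitive and on the real overhead of the target enclave.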

4. Visual Layout Specification

Primary: PrivacyUtilityPareto (Accuracy vs. Epsilon).
Secondary: SecurityOverheadWaterfall (Math vs. Encryption vs. Sanitization time).
Math Peek: Toggle for DP-SGD Noise Scale and Secure Aggregation Complexity.
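As a starting point for the Math Peek panel, the two quantities it names can be written out as follows. This is a sketch under common textbook conventions, not the chapter's own notation: $B$ (batch size), $C$ (clipping norm), $\sigma$ (noise multiplier), $N$ (number of clients), $m_{ij}$ (shared pairwise mask), and $R$ (modulus) are symbols assumed here.

```latex
\begin{align*}
% DP-SGD: clip each per-example gradient, add Gaussian noise, average
\bar{g}_i &= g_i \Big/ \max\!\left(1, \frac{\lVert g_i \rVert_2}{C}\right) \\
\tilde{g} &= \frac{1}{B}\left(\sum_{i=1}^{B} \bar{g}_i
             + \mathcal{N}\!\left(0,\ \sigma^2 C^2 I\right)\right) \\
% Noise contribution per coordinate of the averaged gradient
\text{noise std per coordinate} &= \frac{\sigma C}{B} \\[6pt]
% Secure aggregation with pairwise masks: masks cancel in the sum
y_i &= x_i + \sum_{j > i} m_{ij} - \sum_{j < i} m_{ji} \pmod{R} \\
\sum_{i=1}^{N} y_i &\equiv \sum_{i=1}^{N} x_i \pmod{R} \\
% Number of pairwise secrets grows quadratically in N
\#\text{pairs} &= \binom{N}{2} = \frac{N(N-1)}{2}
\end{align*}
```

The first group explains why the noise term shrinks with batch size for a fixed $\sigma$; the second explains why masking overhead climbs quickly as clients are added.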
