mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-07 18:18:42 -05:00
Apply structural and line-level fixes from a 5-reviewer ensemble (senior MLSys, pedagogy, prose stylist, math verifier, Gemini 3.1 Pro).

Structural:
- Delete §Correspondence to MLIR (section, contribution C4, 5 listings, 2 tables): scope pivot away from compiler framing
- Promote three first-order performance bounds to contribution C4
- Combine the two walkthrough sections into one §Walkthroughs with FlashAttention as a subsection and its stages as subsubsections
- Motivation-first rewrite of §Introduction
- Add §Appendix: The 90 Elements, auto-generated from table.yml
- Move appendix after bibliography for clean page layout
- Trim 7 subsection titles to avoid two-column wrapping

Review findings:
- PagedAttention: rename "70B model" to "70B dense-MHA" (Llama uses GQA)
- ZeRO-3 all-gather: apply (N-1)/N ring factor (1.9ms -> 1.7ms NVLink, 35ms -> 31ms InfiniBand, verdict ratio ~2.5x -> ~2.8x)
- TP InfiniBand: reconcile 10 us with end-to-end 9 us latency
- M4 Pro TOPS: clarify as aggregate GPU+NE estimate
- Fix predict_paper.tex self-contradictions (FlashAttention = HBM not SRAM; Continuous Batching = utilization not arithmetic intensity)
- PagedAttention 12.5x -> ~13x arithmetic correction
- Add §4.2 ZeRO-3 intro paragraph (only subsection missing one)

Prose:
- Tighten abstract (grammar, restore concrete numbers, four-filter and dead-end analysis commitments)
- Delete 12 filler sentences
- Replace 4 remaining em-dashes with commas/parens
- Cut 2 decorative italics
- Reformat Rule 1-6 paragraph titles (noun-phrase labels)

Build: 25 pages, 412 KB, zero undefined refs.
125 lines
6.8 KiB
TeX
% Auto-generated from table.yml by make-appendix-elements.sh
% DO NOT EDIT DIRECTLY. Regenerate with: python3 scripts/make-appendix-elements.py

\section*{Appendix: The 90 Elements}
\label{sec:appendix-elements}
\addcontentsline{toc}{section}{Appendix: The 90 Elements}

The main text refers to elements of the Periodic Table by symbol: \texttt{At} for Attention, \texttt{Ti} for Tiling, \texttt{Dr} for DRAM, and so on. \Cref{tab:appendix-elements} is the complete reference index: it lists all 90 elements with their catalog number, symbol, full name, abstraction layer, and information-processing role. The visual layout of the table, which shows how elements cluster by row and column, appears in \Cref{fig:periodic-table}; this appendix is its textual counterpart, organized for lookup rather than for conveying structure. Readers who encounter an unfamiliar symbol in a molecular formula should consult this table first. Four symbols (\texttt{Sp}, \texttt{Sm}, \texttt{En}, and \texttt{Ro}) each appear on two rows, so for those the catalog number identifies the intended element. Catalog numbers 1--69 run layer by layer from Math to Production, 70--78 cover the Data layer, and 79--90 span several layers, so the number is only a rough guide to an element's layer; the Layer column is authoritative.

\begin{table*}[t!]
\centering
\caption{\textbf{The 90 elements of the Periodic Table of Machine Learning Systems.} Each element lists its catalog number, symbol, full name, abstraction layer (row), and information-processing role (column). This table is the complete reference index; the main text refers to elements by symbol (e.g., \texttt{At} for Attention) and occasionally by catalog number.}
\label{tab:appendix-elements}
\footnotesize
\renewcommand{\arraystretch}{1.05}
\begin{minipage}[t]{0.49\textwidth}
\centering
\begin{tabular}{@{}r l l l l@{}}
\toprule
\textbf{\#} & \textbf{Sym} & \textbf{Name} & \textbf{Layer} & \textbf{Role} \\
\midrule
1 & \texttt{Tn} & Tensor & Math & Represent \\
2 & \texttt{Pr} & Probability & Math & Represent \\
3 & \texttt{Op} & Operator & Math & Compute \\
4 & \texttt{Cr} & Chain Rule & Math & Communicate \\
5 & \texttt{Ob} & Objective & Math & Control \\
6 & \texttt{Cs} & Constraint & Math & Control \\
7 & \texttt{Dv} & Divergence & Math & Measure \\
8 & \texttt{Pm} & Parameter & Algorithms & Represent \\
9 & \texttt{Eb} & Embedding & Algorithms & Represent \\
10 & \texttt{Sp} & Sample & Algorithms & Represent \\
11 & \texttt{Dd} & Dense Dot & Algorithms & Compute \\
12 & \texttt{Cv} & Convolution & Algorithms & Compute \\
13 & \texttt{Po} & Pooling & Algorithms & Compute \\
14 & \texttt{Sm} & Sampling & Algorithms & Compute \\
15 & \texttt{Ad} & Autodiff & Algorithms & Communicate \\
16 & \texttt{Tk} & Tokenization & Algorithms & Communicate \\
17 & \texttt{Gd} & Grad Descent & Algorithms & Control \\
18 & \texttt{Rw} & Reward & Algorithms & Control \\
19 & \texttt{Iz} & Initialization & Algorithms & Control \\
20 & \texttt{Lf} & Loss Function & Algorithms & Measure \\
21 & \texttt{Tp} & Topology & Architecture & Represent \\
22 & \texttt{Hs} & Hidden State & Architecture & Represent \\
23 & \texttt{At} & Attention & Architecture & Compute \\
24 & \texttt{Gt} & Gating & Architecture & Compute \\
25 & \texttt{Nm} & Normalization & Architecture & Compute \\
26 & \texttt{Ro} & Routing & Architecture & Compute \\
27 & \texttt{Sk} & Skip/Res & Architecture & Communicate \\
28 & \texttt{Fb} & Feedback & Architecture & Communicate \\
29 & \texttt{Mk} & Masking & Architecture & Control \\
30 & \texttt{Rf} & Receptive Fld & Architecture & Measure \\
31 & \texttt{Fc} & Factorization & Optimization & Represent \\
32 & \texttt{Os} & Optim State & Optimization & Represent \\
33 & \texttt{Qz} & Quantization & Optimization & Compute \\
34 & \texttt{Sp} & Sparsification & Optimization & Compute \\
35 & \texttt{Ws} & Weight Sharing & Optimization & Communicate \\
36 & \texttt{En} & Ensembling & Optimization & Communicate \\
37 & \texttt{Sc} & Scheduling & Optimization & Control \\
38 & \texttt{Rg} & Regularization & Optimization & Control \\
39 & \texttt{Tm} & Termination & Optimization & Control \\
40 & \texttt{Id} & Info Density & Optimization & Measure \\
41 & \texttt{Cc} & Caching & Runtime & Represent \\
42 & \texttt{Cp} & Checkpointing & Runtime & Represent \\
43 & \texttt{Ir} & Int. Rep. & Runtime & Represent \\
44 & \texttt{Fs} & Fusion & Runtime & Compute \\
45 & \texttt{Bt} & Batching & Runtime & Compute \\
\bottomrule
\end{tabular}
\end{minipage}\hfill
\begin{minipage}[t]{0.49\textwidth}
\centering
\begin{tabular}{@{}r l l l l@{}}
\toprule
\textbf{\#} & \textbf{Sym} & \textbf{Name} & \textbf{Layer} & \textbf{Role} \\
\midrule
46 & \texttt{Ti} & Tiling & Runtime & Compute \\
47 & \texttt{Cl} & Compilation & Runtime & Compute \\
48 & \texttt{Pl} & Pipelining & Runtime & Communicate \\
49 & \texttt{Sy} & Sync / Coll & Runtime & Communicate \\
50 & \texttt{Pf} & Prefetching & Runtime & Communicate \\
51 & \texttt{Al} & Allocation & Runtime & Control \\
52 & \texttt{Ut} & Utilization & Runtime & Measure \\
53 & \texttt{Sr} & SRAM & Hardware & Represent \\
54 & \texttt{Dr} & DRAM & Hardware & Represent \\
55 & \texttt{Ma} & MAC Unit & Hardware & Compute \\
56 & \texttt{Vu} & Vector Unit & Hardware & Compute \\
57 & \texttt{Ic} & Interconnect & Hardware & Communicate \\
58 & \texttt{Rt} & HW Router & Hardware & Communicate \\
59 & \texttt{Ar} & Arbiter & Hardware & Control \\
60 & \texttt{Ck} & Clock/Sync & Hardware & Control \\
61 & \texttt{Ew} & Energy & Hardware & Measure \\
62 & \texttt{As} & Artifact Store & Production & Represent \\
63 & \texttt{Ex} & Exec Engine & Production & Compute \\
64 & \texttt{Rp} & RPC Protocol & Production & Communicate \\
65 & \texttt{Mq} & Msg Queue & Production & Communicate \\
66 & \texttt{Ld} & Load Balancer & Production & Control \\
67 & \texttt{Oc} & Orchestrator & Production & Control \\
68 & \texttt{La} & Latency & Production & Measure \\
69 & \texttt{Av} & Availability & Production & Measure \\
70 & \texttt{Rc} & Record & Data & Represent \\
71 & \texttt{Ds} & Dataset & Data & Represent \\
72 & \texttt{Tr} & Transform & Data & Compute \\
73 & \texttt{Ag} & Aggregate & Data & Compute \\
74 & \texttt{Fl} & Flow/Stream & Data & Communicate \\
75 & \texttt{Fm} & Format & Data & Communicate \\
76 & \texttt{Fi} & Filter & Data & Control \\
77 & \texttt{Sm} & Schema & Data & Control \\
78 & \texttt{Vl} & Volume & Data & Measure \\
79 & \texttt{An} & Analog ALU & Hardware & Compute \\
80 & \texttt{En} & Entropy & Data & Measure \\
81 & \texttt{Ix} & Indexing & Architecture & Represent \\
82 & \texttt{Ro} & Routing & Architecture & Control \\
83 & \texttt{Vr} & Virtualization & Runtime & Represent \\
84 & \texttt{Td} & Thermodynamics & Hardware & Measure \\
85 & \texttt{Rs} & Resilience & Production & Control \\
86 & \texttt{Ac} & Activation & Algorithms & Compute \\
87 & \texttt{St} & State & Math & Represent \\
88 & \texttt{Re} & Retrieve & Optimization & Communicate \\
89 & \texttt{Wa} & Weight Avg & Optimization & Compute \\
90 & \texttt{Ct} & Critic & Algorithms & Control \\
\bottomrule
\end{tabular}
\end{minipage}
\end{table*}