Add war story callout with proper icon formats and supporting files

- Add war story callout definition in custom-numbered-blocks.yml
- Create war story icon in all three formats (SVG, PNG, PDF) matching
  the 64x64 stroke-only style used by all other callout icons
- Add war story bibliography and PDF config entry
- Add first war story ("The Quadratic Wall") in nn_architectures
- Include icon conversion utility script
Vijay Janapa Reddi
2026-02-19 07:38:16 -05:00
parent 15f7e139a7
commit 739b48622f
8 changed files with 180 additions and 0 deletions

book/quarto/assets/images/icons/callouts/icon_callout_war_story.pdf

Binary file not shown.

book/quarto/assets/images/icons/callouts/icon_callout_war_story.png

[New 64x64 PNG icon, 3.1 KiB]

book/quarto/assets/images/icons/callouts/icon_callout_war_story.svg

@@ -0,0 +1,4 @@
<svg xmlns="http://www.w3.org/2000/svg" width="64" height="64" viewBox="0 0 64 64" fill="none">
<path d="M32 6 L52 16 L52 34 Q52 50 32 58 Q12 50 12 34 L12 16 Z" stroke="#C53030" stroke-width="2.5" fill="none" stroke-linejoin="round"/>
<path d="M32 18 L28 28 L34 32 L30 44" stroke="#C53030" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round" fill="none"/>
</svg>

[Rendered SVG preview, 381 B]

_quarto.yml

@@ -103,6 +103,7 @@ book:
  bibliography:
    - contents/vol1/backmatter/references.bib
    - contents/vol1/backmatter/war_stories.bib
  citation: true
  license: CC-BY-NC-SA

custom-numbered-blocks.yml

@@ -75,6 +75,10 @@ filter-metadata:
colors: ["FDF2F7", "BE185D"]
collapse: false
numbered: false
war-story:
colors: ["FEF2F2", "C53030"] # Red/Warning theme
collapse: false
numbered: false
theorem:
colors: ["F5F0FA", "6B46C1"]
collapse: false
@@ -131,3 +135,6 @@ filter-metadata:
callout-takeaways:
label: "Key Takeaways"
group: takeaways
callout-war-story:
label: "War Story"
group: war-story

contents/vol1/backmatter/war_stories.bib

@@ -0,0 +1,125 @@
@misc{sec2013knight,
title = {In the Matter of Knight Capital Americas LLC},
author = {{U.S. Securities and Exchange Commission}},
year = {2013},
url = {https://www.sec.gov/litigation/admin/2013/34-70694.pdf},
note = {Exchange Act Release No. 34-70694; Administrative Proceeding File No. 3-15570}
}
@misc{zillow2021letter,
title = {Zillow Group Q3 2021 Shareholder Letter},
author = {{Zillow Group}},
year = {2021},
url = {https://s24.q4cdn.com/723050407/files/doc_financials/2021/q3/Zillow-Group-Q3-21-Shareholder-Letter.pdf},
note = {Announcing the wind-down of Zillow Offers due to algorithmic unpredictability}
}
@inproceedings{covington2016deep,
title = {Deep Neural Networks for YouTube Recommendations},
author = {Covington, Paul and Adams, Jay and Sargin, Emre},
booktitle = {Proceedings of the 10th ACM Conference on Recommender Systems},
pages = {191--198},
year = {2016},
organization = {ACM},
note = {Describes shift from click-based to watch-time-based optimization to avoid clickbait}
}
@article{lapuschkin2019unmasking,
title = {Unmasking Clever Hans predictors and assessing what machines really learn},
author = {Lapuschkin, Sebastian and W{\"a}ldchen, Stephan and Binder, Alexander and Montavon, Gr{\'e}goire and Samek, Wojciech and M{\"u}ller, Klaus-Robert},
journal = {Nature Communications},
volume = {10},
number = {1},
pages = {1096},
year = {2019},
publisher = {Nature Publishing Group}
}
@inproceedings{paszke2019pytorch,
title = {PyTorch: An Imperative Style, High-Performance Deep Learning Library},
author = {Paszke, Adam and Gross, Sam and Massa, Francisco and Lerer, Adam and Bradbury, James and Chanan, Gregory and Killeen, Trevor and Lin, Zeming and Gimelshein, Natalia and Antiga, Luca and others},
booktitle = {Advances in Neural Information Processing Systems},
pages = {8024--8035},
year = {2019}
}
@misc{beazley2010understanding,
title = {Understanding the Python GIL},
author = {Beazley, David},
year = {2010},
url = {http://www.dabeaz.com/python/UnderstandingGIL.pdf},
note = {PyCon 2010 Presentation}
}
@inproceedings{recht2019imagenet,
title = {Do ImageNet Classifiers Generalize to ImageNet?},
author = {Recht, Benjamin and Roelofs, Rebecca and Schmidt, Ludwig and Shankar, Vaishaal},
booktitle = {International Conference on Machine Learning},
pages = {5389--5400},
year = {2019},
organization = {PMLR}
}
@article{hooker2020hardware,
title = {The Hardware Lottery},
author = {Hooker, Sara},
journal = {Communications of the ACM},
volume = {64},
number = {12},
pages = {58--65},
year = {2021},
publisher = {ACM New York, NY, USA}
}
@article{jia2018highly,
title = {Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes},
author = {Jia, Xianyan and Song, Shutao and He, Wei and Wang, Yangzihao and Rong, Haibin and Zhou, Feihu and Xie, Liqiang and Guo, Zhenyu and Yang, Yuanzhou and Yu, Liwei and others},
journal = {arXiv preprint arXiv:1807.11205},
year = {2018}
}
@article{sculley2015hidden,
title = {Hidden Technical Debt in Machine Learning Systems},
author = {Sculley, D. and Holt, Gary and Golovin, Daniel and Davydov, Eugene and Phillips, Todd and Ebner, Dietmar and Chaudhary, Vinay and Young, Michael and Crespo, Jean-Francois and Dennison, Dan},
journal = {Advances in Neural Information Processing Systems},
volume = {28},
pages = {2503--2511},
year = {2015}
}
@article{wolf2017we,
title = {Why We Should Have Seen That Coming: Comments on Microsoft's Tay ``Experiment,'' and Wider Implications},
author = {Wolf, Marty J. and Miller, Keith and Grodzinsky, Frances S.},
journal = {ACM SIGCAS Computers and Society},
volume = {47},
number = {3},
pages = {24--39},
year = {2017},
publisher = {ACM New York, NY, USA}
}
@misc{discord2020rust,
title = {Why Discord is switching from Go to Rust},
author = {Howarth, Jesse},
year = {2020},
url = {https://discord.com/blog/why-discord-is-switching-from-go-to-rust},
note = {Detailed engineering blog post on GC latency spikes}
}
@techreport{ntsb2019uber,
title = {Collision Between Vehicle Controlled by Developmental Automated Driving System and Pedestrian},
author = {{National Transportation Safety Board}},
year = {2019},
institution = {NTSB},
number = {HWY18MH010},
url = {https://www.ntsb.gov/investigations/AccidentReports/Reports/HAR1903.pdf}
}
@inproceedings{crankshaw2017clipper_paper,
title = {Clipper: A Low-Latency Online Prediction Serving System},
author = {Crankshaw, Daniel and Wang, Xin and Zhou, Giulio and Franklin, Michael J. and Gonzalez, Joseph E. and Stoica, Ion},
booktitle = {14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17)},
year = {2017},
note = {Demonstrates RPC overhead in model serving}
}

nn_architectures.qmd

@@ -2444,6 +2444,16 @@ Modern AI scaling is defined by the cost of Attention. Verify your intuition:
- [ ] **Implication**: Can you explain why this $O(N^2)$ cost makes long-context models fundamentally more expensive than short-context ones, regardless of hardware improvements?
:::

::: {.callout-war-story title="The Quadratic Wall"}
**The Context**: When Google released BERT in 2018, it revolutionized NLP. Yet the engineering team strictly limited the input sequence length to 512 tokens, despite users clamoring for longer context to process full documents.

**The Failure**: This wasn't a product decision; it was a physics decision. The self-attention mechanism's memory requirement scales quadratically with sequence length ($O(N^2)$). Doubling the context from 512 to 1,024 tokens would quadruple the memory; increasing it to 4,096 tokens (enough for a short article) would multiply memory usage by $64\times$.

**The Consequence**: Without this hard limit, a single long document would trigger an out-of-memory (OOM) crash, killing the training job. The "Quadratic Wall" forced the entire industry to fragment documents into 512-token chunks for years, until $O(N)$ attention approximations (like Linformer) and IO-aware optimizations (like FlashAttention) were invented.

**The Systems Lesson**: Big-O notation is not just theory; it is infrastructure destiny. A quadratic algorithm is a denial-of-service vulnerability waiting to happen. Production systems must enforce hard limits on any input dimension that triggers super-linear resource consumption [@vaswani2017attention].
:::
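
As a quick sanity check on those numbers (an illustrative sketch, not part of the committed chapter; it assumes fp32 scores, a single attention head, and batch size 1), the score-matrix memory can be computed directly:

```python
# Memory for the N x N attention score matrix alone: fp32, one head, batch 1.
def attn_matrix_bytes(seq_len: int, bytes_per_elem: int = 4) -> int:
    return seq_len * seq_len * bytes_per_elem

for n in (512, 1024, 4096):
    mib = attn_matrix_bytes(n) / 2**20
    print(f"N={n:>5}: {mib:6.1f} MiB ({(n / 512) ** 2:.0f}x the 512-token cost)")

# N=  512:    1.0 MiB (1x the 512-token cost)
# N= 1024:    4.0 MiB (4x the 512-token cost)
# N= 4096:   64.0 MiB (64x the 512-token cost)
```

Multiply by heads, layers, and batch size, and the quadratic term dominates training memory well before 4,096 tokens.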
Despite these costs, attention's ability to connect any position to any other in constant depth is too powerful to use merely as an add-on to recurrent architectures. If attention can bypass sequential processing entirely, why preserve the recurrent structure at all? The answer to that question produced the most consequential architectural shift in modern deep learning.
## Transformers: Parallel Sequence Processing {#sec-network-architectures-transformers-attentiononly-architecture-1b56}

Icon conversion utility script (Python)

@@ -0,0 +1,33 @@
import os

from PIL import Image

icon_dir = "book/quarto/assets/images/icons/callouts/"

# Process all war-story icon PNGs: the v* variants and the main file.
files = [
    f for f in os.listdir(icon_dir)
    if f.startswith("icon_callout_war_story") and f.endswith(".png")
]

for f in files:
    img_path = os.path.join(icon_dir, f)
    pdf_path = os.path.join(icon_dir, f.replace(".png", ".pdf"))
    try:
        image = Image.open(img_path)
        # Flatten transparency onto a white background for the PDF, to avoid
        # alpha-channel issues in some PDF viewers/printers and to match the
        # likely white page background.
        if image.mode in ("RGBA", "LA") or (image.mode == "P" and "transparency" in image.info):
            # Convert to RGBA first to handle transparency correctly.
            image = image.convert("RGBA")
            # Create the white background and composite the icon over it,
            # using the icon's alpha channel as the paste mask.
            background = Image.new("RGB", image.size, (255, 255, 255))
            background.paste(image, mask=image.split()[3])
            image_to_save = background
        else:
            image_to_save = image.convert("RGB")
        # Save as PDF with high resolution (300 DPI).
        image_to_save.save(pdf_path, "PDF", resolution=300.0)
        print(f"Converted {f} to {pdf_path} @ 300 DPI")
    except Exception as e:
        print(f"Failed to convert {f}: {e}")