Volume 1 Seminal Papers Corpus
This document defines the core corpus of papers that should be cited in each chapter, with justification for why each is seminal.
Generated: January 29, 2026
How to Use This Document
For each chapter:
- Check if the paper is already cited
- If not cited but topic is discussed → ADD the citation
- If topic is not discussed → SKIP (don't force citations)
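The decision rule above can be sketched as a small triage helper. This is an illustrative sketch only, not part of the book's tooling; the function name and the chapter state in the example are made up:

```python
# Hypothetical helper implementing the rule above: for each corpus paper,
# report "OK" (already cited), "ADD" (topic discussed but citation missing),
# or "SKIP" (topic not discussed, so don't force the citation).
def triage(corpus, cited, discussed):
    """corpus: iterable of (paper, topic) pairs; cited/discussed: sets of
    papers cited and topics covered in a given chapter."""
    actions = {}
    for paper, topic in corpus:
        if paper in cited:
            actions[paper] = "OK"
        elif topic in discussed:
            actions[paper] = "ADD"
        else:
            actions[paper] = "SKIP"
    return actions

# Example with made-up chapter state:
corpus = [("AlexNet", "deep learning"), ("FedAvg", "federated learning")]
print(triage(corpus, cited={"AlexNet"}, discussed={"deep learning"}))
# → {'AlexNet': 'OK', 'FedAvg': 'SKIP'}
```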
Chapter 1: Introduction
| Paper | Authors | Year | Why Seminal |
|---|---|---|---|
| Computing Machinery and Intelligence | Turing | 1950 | Introduced Turing Test, framed machine intelligence |
| A Proposal for the Dartmouth Summer Research Project | McCarthy et al. | 1955 | Coined "artificial intelligence", launched AI as field |
| The Perceptron | Rosenblatt | 1957 | First learning algorithm that adjusts weights from data |
| Perceptrons: An Introduction to Computational Geometry | Minsky & Papert | 1969 | Proved perceptron limitations, caused first AI winter |
| Learning Representations by Back-Propagating Errors | Rumelhart, Hinton, Williams | 1986 | Popularized backpropagation, enabled deep learning |
| ImageNet Classification with Deep CNNs (AlexNet) | Krizhevsky et al. | 2012 | Sparked deep learning revolution |
| Software 2.0 | Karpathy | 2017 | Framed shift from code to learned models |
| The Bitter Lesson | Sutton | 2019 | Showed computation beats encoded expertise |
| Hidden Technical Debt in ML Systems | Sculley et al. | 2015 | Established ML systems engineering as discipline |
| AI and Compute | Amodei & Hernandez | 2018 | Quantified exponential growth in AI compute |
Chapter 2: ML Systems
| Paper | Authors | Year | Why Seminal |
|---|---|---|---|
| In-Datacenter Performance Analysis of a TPU | Jouppi et al. | 2017 | First TPU disclosure, established domain-specific accelerators |
| Hitting the Memory Wall | Wulf & McKee | 1995 | Coined "memory wall", identified fundamental bottleneck |
| MobileNets | Howard et al. | 2017 | Enabled efficient mobile deployment |
| Communication-Efficient Learning (FedAvg) | McMahan et al. | 2017 | Established federated learning |
| Widening Access to Applied ML with TinyML | Reddi et al. | 2022 | Democratized ML on resource-constrained devices |
| MLPerf Tiny Benchmark | Banbury, Reddi et al. | 2021 | First benchmark for microcontroller ML |
| Deep Learning Recommendation Model (DLRM) | Naumov et al. | 2019 | Industry-standard recommendation architecture |
| Roofline Model | Williams et al. | 2009 | Framework for compute vs memory-bound analysis |
Chapter 3: Neural Computation
| Paper | Authors | Year | Why Seminal |
|---|---|---|---|
| Learning Representations by Back-Propagating Errors | Rumelhart et al. | 1986 | Standard training algorithm |
| Rectified Linear Units Improve RBMs | Nair & Hinton | 2010 | Established ReLU as default activation |
| Adam: A Method for Stochastic Optimization | Kingma & Ba | 2014 | Default optimizer for most applications |
| Dropout: Preventing Overfitting | Srivastava et al. | 2014 | Standard regularization technique |
| Batch Normalization | Ioffe & Szegedy | 2015 | Enables faster, stable training |
| Understanding Difficulty of Training Deep Networks | Glorot & Bengio | 2010 | Xavier/Glorot initialization |
| Deep Learning (Nature) | LeCun, Bengio, Hinton | 2015 | Landmark review marking mainstream acceptance |
| Approximation by Superpositions of a Sigmoidal Function | Cybenko | 1989 | Universal approximation theorem |
| Delving Deep into Rectifiers (He Init) | He et al. | 2015 | Initialization for ReLU networks |
Chapter 4: Network Architectures
| Paper | Authors | Year | Why Seminal |
|---|---|---|---|
| Gradient-based Learning (LeNet) | LeCun et al. | 1998 | First successful CNN |
| ImageNet Classification (AlexNet) | Krizhevsky et al. | 2012 | Deep learning breakthrough |
| Very Deep CNNs (VGGNet) | Simonyan & Zisserman | 2014 | Showed depth improves performance |
| Going Deeper with Convolutions (GoogLeNet) | Szegedy et al. | 2015 | Multi-scale inception modules |
| Deep Residual Learning (ResNet) | He et al. | 2016 | Skip connections enabled 100+ layers |
| Densely Connected CNNs (DenseNet) | Huang et al. | 2017 | Feature reuse through dense connectivity |
| Long Short-Term Memory | Hochreiter & Schmidhuber | 1997 | Gating for long-term dependencies |
| Learning Phrase Representations using RNN Encoder-Decoder (GRU) | Cho et al. | 2014 | Simpler alternative to LSTM |
| Neural Machine Translation (Attention) | Bahdanau et al. | 2014 | Introduced attention mechanism |
| Attention Is All You Need | Vaswani et al. | 2017 | Transformer architecture |
| BERT | Devlin et al. | 2019 | Bidirectional pre-training paradigm |
| GPT | Radford et al. | 2018 | Autoregressive pre-training |
| Vision Transformer (ViT) | Dosovitskiy et al. | 2021 | Transformers for vision |
| Layer Normalization | Ba et al. | 2016 | Essential for transformers |
Chapter 5: ML Frameworks
| Paper | Authors | Year | Why Seminal |
|---|---|---|---|
| TensorFlow | Abadi et al. | 2016 | Static graph execution model |
| PyTorch | Paszke et al. | 2019 | Dynamic graph, define-by-run |
| JAX/Autograd | Frostig et al. / Bradbury et al. | 2018 | Functional transformations |
| Theano | Bergstra et al. | 2010 | First symbolic computation + autodiff |
| Automatic Differentiation Survey | Baydin et al. | 2018 | Definitive autodiff reference |
| cuDNN | Chetlur et al. | 2014 | GPU primitives foundation |
| BLAS | Lawson et al. | 1979 | Linear algebra interface standard |
| Training with Sublinear Memory | Chen et al. | 2016 | Gradient checkpointing |
Chapter 6: Model Training
| Paper | Authors | Year | Why Seminal |
|---|---|---|---|
| Learning Representations by Back-Propagating Errors | Rumelhart et al. | 1986 | Core training algorithm |
| Mixed Precision Training | Micikevicius et al. | 2017 | FP16/FP32 training |
| Training with Sublinear Memory | Chen et al. | 2016 | Gradient checkpointing |
| FlashAttention | Dao et al. | 2022 | IO-aware attention, O(n) memory |
| Accurate, Large Minibatch SGD | Goyal et al. | 2017 | Linear scaling rule for large batches |
| Large Scale Distributed Deep Networks | Dean et al. | 2012 | Parameter server architecture |
| Horovod | Sergeev & Del Balso | 2018 | Ring AllReduce for distributed training |
| SGDR: Warm Restarts | Loshchilov & Hutter | 2016 | Cosine annealing schedule |
Chapter 7: Hardware Acceleration
| Paper | Authors | Year | Why Seminal |
|---|---|---|---|
| Scalable Parallel Programming with CUDA | Nickolls et al. | 2008 | GPU computing model |
| cuDNN | Chetlur et al. | 2014 | GPU deep learning primitives |
| In-Datacenter TPU Analysis | Jouppi et al. | 2017 | TPU architecture |
| Ten Lessons from Three TPU Generations | Jouppi et al. | 2021 | TPU evolution |
| Systolic Arrays for VLSI | Kung & Leiserson | 1979 | Systolic array concept |
| Why Systolic Architectures? | Kung | 1982 | Systolic design principles |
| Eyeriss | Chen et al. | 2016 | Dataflow taxonomy (weight/output/input stationary) |
| TVM | Chen et al. | 2018 | ML compiler with auto-tuning |
| MLIR | Lattner et al. | 2019 | Multi-level IR for ML |
| Roofline Model | Williams et al. | 2009 | Compute vs memory-bound analysis |
| Efficient Processing of DNNs Survey | Sze et al. | 2017 | Comprehensive accelerator survey |
Chapter 8: Model Compression
| Paper | Authors | Year | Why Seminal |
|---|---|---|---|
| Quantization and Training for Efficient Inference | Jacob et al. | 2018 | Standard INT8 quantization |
| Deep Compression | Han et al. | 2015 | Pruning + quantization pipeline |
| Optimal Brain Damage | LeCun et al. | 1989 | First pruning formalization |
| Pruning Filters for Efficient ConvNets | Li et al. | 2017 | Structured pruning |
| Distilling the Knowledge in a Neural Network | Hinton et al. | 2015 | Knowledge distillation |
| Neural Architecture Search with RL | Zoph & Le | 2017 | Automated architecture discovery |
| DARTS | Liu et al. | 2019 | Differentiable NAS |
| MobileNets | Howard et al. | 2017 | Depthwise separable convolutions |
| EfficientNet | Tan & Le | 2019 | Compound scaling |
| Lottery Ticket Hypothesis | Frankle & Carbin | 2019 | Sparse trainable subnetworks |
Chapter 9: Benchmarking
| Paper | Authors | Year | Why Seminal |
|---|---|---|---|
| MLPerf Training Benchmark | Mattson et al. | 2020 | Industry standard training benchmark |
| MLPerf Inference Benchmark | Reddi et al. | 2020 | Standardized inference evaluation |
| MLPerf Tiny Benchmark | Banbury et al. | 2021 | Microcontroller ML benchmark |
| DAWNBench | Coleman et al. | 2017 | Time-to-accuracy evaluation |
| ImageNet | Deng et al. | 2009 | Standard vision benchmark |
| COCO | Lin et al. | 2014 | Detection/segmentation benchmark |
| SQuAD | Rajpurkar et al. | 2016 | Reading comprehension benchmark |
| GLUE | Wang et al. | 2018 | Multi-task NLP benchmark |
Chapter 10: Model Serving
| Paper | Authors | Year | Why Seminal |
|---|---|---|---|
| TensorFlow Serving | Olston et al. | 2017 | Dynamic batching, model serving architecture |
| Clipper | Crankshaw et al. | 2017 | Low-latency prediction serving |
| The Tail at Scale | Dean & Barroso | 2013 | Tail latency in distributed systems |
| Orca | Yu et al. | 2022 | Continuous batching for LLMs |
| vLLM (PagedAttention) | Kwon et al. | 2023 | KV cache memory management |
| FlashAttention | Dao et al. | 2022 | Efficient attention for inference |
| Nexus | Shen et al. | 2019 | GPU cluster for DNN serving |
| Little's Law | Little | 1961 | Queuing theory foundation |
Chapter 11: Data Engineering
| Paper | Authors | Year | Why Seminal |
|---|---|---|---|
| Data Cascades in High-Stakes AI | Sambasivan et al. | 2021 | Data quality as engineering concern |
| Hidden Technical Debt in ML Systems | Sculley et al. | 2015 | Training-serving skew |
| Datasheets for Datasets | Gebru et al. | 2021 | Dataset documentation standard |
| Survey on Concept Drift Adaptation | Gama et al. | 2014 | Drift detection taxonomy |
| Cheap and Fast—But is it Good? | Snow et al. | 2008 | Crowdsourcing quality |
Chapter 12: Data Efficiency
| Paper | Authors | Year | Why Seminal |
|---|---|---|---|
| Scaling Laws for Neural Language Models | Kaplan et al. | 2020 | Power-law scaling relationships |
| Training Compute-Optimal LLMs (Chinchilla) | Hoffmann et al. | 2022 | Optimal data-to-parameter ratios |
| Curriculum Learning | Bengio et al. | 2009 | Easy-to-hard training order |
| Active Learning | Settles | 2009 | Query strategies book |
| FixMatch | Sohn et al. | 2020 | Semi-supervised learning |
| SimCLR | Chen et al. | 2020 | Contrastive self-supervised learning |
| MoCo | He et al. | 2020 | Momentum contrastive learning |
| mixup | Zhang et al. | 2018 | Data augmentation |
Chapter 13: ML Operations
| Paper | Authors | Year | Why Seminal |
|---|---|---|---|
| Hidden Technical Debt in ML Systems | Sculley et al. | 2015 | ML technical debt framework |
| Software Engineering for ML | Amershi et al. | 2019 | ML-specific SE practices |
| ML Test Score | Breck et al. | 2017 | Production readiness rubric |
| TFX | Baylor et al. | 2017 | End-to-end ML platform |
| MLflow | Zaharia et al. | 2018 | Experiment tracking standard |
Chapter 14: Responsible Engineering
| Paper | Authors | Year | Why Seminal |
|---|---|---|---|
| Model Cards for Model Reporting | Mitchell et al. | 2019 | Model documentation standard |
| Datasheets for Datasets | Gebru et al. | 2021 | Dataset documentation |
| "Why Should I Trust You?" (LIME) | Ribeiro et al. | 2016 | Model-agnostic explanations |
| SHAP | Lundberg & Lee | 2017 | Game-theoretic feature attribution |
| Gender Shades | Buolamwini & Gebru | 2018 | Bias audit methodology |
| Equality of Opportunity | Hardt et al. | 2016 | Fairness definitions |
| Inherent Trade-Offs in Fair Risk Scores | Kleinberg et al. | 2016 | Fairness impossibility results |
| Big Data's Disparate Impact | Barocas & Selbst | 2016 | Legal framework for algorithmic discrimination |
Chapter 15: ML Workflow
| Paper | Authors | Year | Why Seminal |
|---|---|---|---|
| From Data Mining to KDD | Fayyad et al. | 1996 | KDD process methodology |
| CRISP-DM | Chapman et al. | 2000 | Industry-standard ML workflow |
| Software Engineering for ML | Amershi et al. | 2019 | ML lifecycle principles |
Summary Statistics
| Chapter | Seminal Papers Listed |
|---|---|
| Introduction | 10 |
| ML Systems | 8 |
| Neural Computation | 9 |
| Network Architectures | 14 |
| ML Frameworks | 8 |
| Model Training | 8 |
| Hardware Acceleration | 11 |
| Model Compression | 10 |
| Benchmarking | 8 |
| Model Serving | 8 |
| Data Engineering | 5 |
| Data Efficiency | 8 |
| ML Operations | 5 |
| Responsible Engr | 8 |
| ML Workflow | 3 |
| TOTAL | 123 listed (~110 unique across chapters) |
Next Steps
- Cross-check each chapter against this corpus
- Add missing citations where topics are discussed
- Remove any citations not justified by this list (to reduce clutter)
This corpus represents the foundational literature for ML systems. Each paper was selected because it introduced a concept, technique, or result that shaped the field.