Standardize section headers to use colons instead of dashes

Vijay Janapa Reddi
2025-12-05 13:03:00 -08:00
parent 3aa6a9b040
commit 42025d34aa
4 changed files with 21 additions and 21 deletions

View File

@@ -165,7 +165,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 1. Introduction - The Memory Wall Problem
+## 1. Introduction: The Memory Wall Problem
Imagine trying to fit a library in your backpack. Neural networks face the same challenge - models are getting huge, but devices have limited memory!
@@ -241,7 +241,7 @@ Today you'll build the production-quality quantization system that makes all thi
# %% [markdown]
"""
-## 2. Foundations - The Mathematics of Compression
+## 2. Foundations: The Mathematics of Compression
### Understanding the Core Challenge
@@ -354,7 +354,7 @@ INT8 gives us 4× memory reduction with <1% accuracy loss - the perfect balance
# %% [markdown]
"""
-## 3. Implementation - Building the Quantization Engine
+## 3. Implementation: Building the Quantization Engine
### Our Implementation Strategy
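(For readers skimming this diff: the hunk header above quotes the module's INT8 claim of 4× memory reduction with <1% accuracy loss. A minimal symmetric INT8 quantize/dequantize sketch in plain NumPy, not the module's own quantization API, which these hunks do not show, looks like the following.)

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor quantization: map float32 weights onto [-127, 127]."""
    scale = float(np.max(np.abs(w))) / 127.0 or 1.0   # guard against an all-zero tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 values from the INT8 codes."""
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
err = np.mean(np.abs(w - dequantize_int8(q, scale)))
print(f"{w.nbytes} B -> {q.nbytes} B (4x smaller), mean abs error {err:.5f}")
```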
@@ -932,7 +932,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 4. Integration - Scaling to Full Neural Networks
+## 4. Integration: Scaling to Full Neural Networks
### The Model Quantization Challenge
@@ -1331,7 +1331,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 5. Verification - Proving Optimization Works
+## 5. Verification: Proving Optimization Works
Before analyzing quantization in production, let's verify that our optimization actually works using real measurements.
"""
@@ -1413,7 +1413,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 6. Systems Analysis - Quantization in Production
+## 6. Systems Analysis: Quantization in Production
Now let's measure the real-world impact of quantization through systematic analysis.
"""

View File

@@ -338,7 +338,7 @@ Reconstruction Error:
# %% [markdown]
"""
-## 3. Sparsity Measurement - Understanding Model Density
+## 3. Sparsity Measurement: Understanding Model Density
Before we can compress models, we need to understand how dense they are. Sparsity measurement tells us what percentage of weights are zero (or effectively zero).
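(The measurement that renamed cell describes reduces to counting zero and near-zero entries. A minimal NumPy sketch, independent of the module's own helpers, which are outside these hunks:)

```python
import numpy as np

def sparsity(weights: np.ndarray, threshold: float = 0.0) -> float:
    """Fraction of weights whose magnitude is at or below `threshold`."""
    return float(np.mean(np.abs(weights) <= threshold))

w = np.random.randn(128, 128)
w[np.abs(w) < 0.5] = 0.0                       # simulate a pruned layer
print(f"exact zeros:      {sparsity(w):.1%}")
print(f"effectively zero: {sparsity(w, threshold=1e-3):.1%}")
```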
@@ -436,7 +436,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 4. Magnitude-Based Pruning - Removing Small Weights
+## 4. Magnitude-Based Pruning: Removing Small Weights
Magnitude pruning is the simplest and most intuitive compression technique. It's based on the observation that weights with small magnitudes contribute little to the model's output.
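(A minimal sketch of the idea in that cell, zeroing the smallest-magnitude weights; illustrative NumPy only, not the module's implementation:)

```python
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float = 0.5) -> np.ndarray:
    """Zero out the `sparsity` fraction of weights with the smallest magnitudes."""
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    threshold = np.sort(np.abs(w), axis=None)[k - 1]   # k-th smallest magnitude
    pruned = w.copy()
    pruned[np.abs(w) <= threshold] = 0.0
    return pruned

w = np.random.randn(64, 64)
print(f"zeros after 70% pruning: {np.mean(magnitude_prune(w, 0.7) == 0):.1%}")
```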
@@ -593,7 +593,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 5. Structured Pruning - Hardware-Friendly Compression
+## 5. Structured Pruning: Hardware-Friendly Compression
While magnitude pruning creates scattered zeros throughout the network, structured pruning removes entire computational units (channels, neurons, heads). This creates sparsity patterns that modern hardware can actually accelerate.
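(A minimal sketch of the contrast drawn in that cell: instead of scattering zeros, drop whole output neurons so the remaining matrix is genuinely smaller and a plain dense matmul gets faster. A row-per-neuron layout is assumed; the module's own convention is not visible in this diff.)

```python
import numpy as np

def prune_neurons(w: np.ndarray, keep_ratio: float = 0.5) -> np.ndarray:
    """Keep only the output neurons (rows) with the largest L2 norms."""
    norms = np.linalg.norm(w, axis=1)                  # one score per output neuron
    n_keep = max(1, int(keep_ratio * w.shape[0]))
    keep = np.sort(np.argsort(norms)[-n_keep:])        # indices of the strongest rows
    return w[keep]

w = np.random.randn(128, 64)                           # 128 output neurons, 64 inputs
small = prune_neurons(w, keep_ratio=0.25)
print(w.shape, "->", small.shape)                      # (128, 64) -> (32, 64)
```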
@@ -766,7 +766,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 6. Low-Rank Approximation - Matrix Compression Through Factorization
+## 6. Low-Rank Approximation: Matrix Compression Through Factorization
Low-rank approximation discovers that large weight matrices often contain redundant information that can be captured with much smaller matrices through mathematical decomposition.
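(A minimal sketch of the factorization that cell alludes to, via truncated SVD: replace W of shape m×n with A of shape m×r times B of shape r×n. The example builds an approximately low-rank matrix to make the point; real layer weights are often, but not always, compressible this way.)

```python
import numpy as np

def low_rank_approx(w: np.ndarray, rank: int):
    """Factor W into A @ B with inner dimension `rank` via truncated SVD."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]          # absorb singular values into the left factor
    b = vt[:rank]
    return a, b

# A matrix with hidden low-rank structure plus a little noise
rng = np.random.default_rng(0)
w = rng.normal(size=(512, 64)) @ rng.normal(size=(64, 512)) + 0.01 * rng.normal(size=(512, 512))

a, b = low_rank_approx(w, rank=64)
rel_err = np.linalg.norm(w - a @ b) / np.linalg.norm(w)
print(f"params: {w.size} -> {a.size + b.size}, relative error {rel_err:.4f}")
```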
@@ -914,7 +914,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 7. Knowledge Distillation - Learning from Teacher Models
+## 7. Knowledge Distillation: Learning from Teacher Models
Knowledge distillation is like having an expert teacher simplify complex concepts for a student. The large "teacher" model shares its knowledge with a smaller "student" model, achieving similar performance with far fewer parameters.
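(A minimal sketch of the standard soft-target objective behind that analogy: the student matches the teacher's temperature-softened probabilities while still fitting the hard labels. The temperature and blending weight below are illustrative defaults, not values taken from this module.)

```python
import numpy as np

def softmax(z: np.ndarray, t: float = 1.0) -> np.ndarray:
    z = z / t
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, t=4.0, alpha=0.5):
    """Blend teacher->student KL on softened outputs with hard-label cross-entropy."""
    p_t = softmax(teacher_logits, t)
    p_s = softmax(student_logits, t)
    soft = np.mean(np.sum(p_t * (np.log(p_t + 1e-9) - np.log(p_s + 1e-9)), axis=-1)) * t * t
    hard = -np.mean(np.log(softmax(student_logits)[np.arange(len(labels)), labels] + 1e-9))
    return alpha * soft + (1 - alpha) * hard

rng = np.random.default_rng(0)
teacher, student = rng.normal(size=(8, 10)), rng.normal(size=(8, 10))
labels = rng.integers(0, 10, size=8)
print(f"distillation loss: {distillation_loss(student, teacher, labels):.3f}")
```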
@@ -1332,7 +1332,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 5. Verification - Proving Pruning Works
+## 5. Verification: Proving Pruning Works
Before analyzing compression in production, let's verify that our pruning actually achieves sparsity using real measurements.
"""
@@ -1403,7 +1403,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 6. Systems Analysis - Compression Techniques
+## 6. Systems Analysis: Compression Techniques
Understanding the real-world effectiveness of different compression techniques through systematic measurement and comparison.

View File

@@ -1367,7 +1367,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 5. Verification - Proving KV Cache Speedup
+## 5. Verification: Proving KV Cache Speedup
Before analyzing KV cache performance, let's verify that caching actually provides the dramatic speedup we expect using real timing measurements.
"""
@@ -1463,7 +1463,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 6. Systems Analysis - KV Cache Performance
+## 6. Systems Analysis: KV Cache Performance
Now let's analyze the performance characteristics and trade-offs of KV caching.
"""

View File

@@ -91,7 +91,7 @@ We'll fix these issues with vectorization and kernel fusion, achieving 2-5× spe
# %% [markdown]
"""
-## 1. Introduction - The Performance Challenge
+## 1. Introduction: The Performance Challenge
Modern neural networks face two fundamental bottlenecks that limit their speed:
@@ -153,7 +153,7 @@ from tinytorch.core.tensor import Tensor
# %% [markdown]
"""
-## 2. Foundations - Vectorization: From Loops to Lightning
+## 2. Foundations: Vectorization: From Loops to Lightning
### The SIMD Revolution
@@ -328,7 +328,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 3. Implementation - Kernel Fusion: Eliminating Memory Bottlenecks
+## 3. Implementation: Kernel Fusion: Eliminating Memory Bottlenecks
### The Memory Bandwidth Crisis
@@ -754,7 +754,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 4. Verification - Proving Vectorization Speedup
+## 4. Verification: Proving Vectorization Speedup
Before analyzing acceleration performance, let's verify that vectorization actually provides significant speedup using real timing measurements.
"""
@@ -849,7 +849,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 5. Systems Analysis - Performance Scaling Patterns
+## 5. Systems Analysis: Performance Scaling Patterns
Let's analyze how our acceleration techniques perform across different scenarios and understand their scaling characteristics.
"""
@@ -1062,7 +1062,7 @@ if __name__ == "__main__":
# %% [markdown]
"""
-## 5. Optimization Insights - Production Acceleration Strategy
+## 5. Optimization Insights: Production Acceleration Strategy
Understanding when and how to apply different acceleration techniques in real-world scenarios.
"""