feat(pdf): align cover layout across volumes, add logo recolor script

- Vol 2: move references/citations to document end; sky-blue logo; adjust cover image position
- Vol 1: match cover image position (bg-image-left 0.175, bg-image-bottom 8)
- Add recolor_cover_logo.py for hue-shift variants of cover logo
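The recolor script itself is not shown in this diff. A minimal sketch of the hue-shift idea the commit message describes, using only the standard-library `colorsys` module (the function name and per-pixel approach are hypothetical, not the actual contents of `recolor_cover_logo.py`):

```python
import colorsys

def hue_shift_rgb(rgb, degrees):
    """Rotate the hue of one (R, G, B) pixel by `degrees`, preserving
    lightness and saturation. Hypothetical sketch of the recolor idea;
    a real script would map this over every pixel of the logo image."""
    r, g, b = (c / 255.0 for c in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    h = (h + degrees / 360.0) % 1.0
    return tuple(round(c * 255) for c in colorsys.hls_to_rgb(h, l, s))

# Example: rotating pure red by 120 degrees yields pure green.
print(hue_shift_rgb((255, 0, 0), 120))  # → (0, 255, 0)
```

Applying this per pixel (e.g. via Pillow's `Image.load()`) produces the "hue-shift variants" of a logo, such as a sky-blue recolor, without touching lightness or transparency.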
Author: Vijay Janapa Reddi
Date:   2026-02-21 16:20:05 -05:00
parent 417f47722a
commit 8f6950a257
4 changed files with 42 additions and 47 deletions

(binary file changed: cover logo image, 1.0 MiB; contents not shown)

@@ -54,58 +54,58 @@ book:
 # ==================================================
 # Volume I Frontmatter
 # ==================================================
-# - contents/vol1/frontmatter/dedication.qmd
-# - contents/vol1/frontmatter/foreword.qmd
-# - contents/vol1/frontmatter/about.qmd
-# - contents/vol1/frontmatter/acknowledgements.qmd
-# - contents/vol1/frontmatter/notation.qmd
+- contents/vol1/frontmatter/dedication.qmd
+- contents/vol1/frontmatter/foreword.qmd
+- contents/vol1/frontmatter/about.qmd
+- contents/vol1/frontmatter/acknowledgements.qmd
+- contents/vol1/frontmatter/notation.qmd
 # ==================================================
 # Part I: Foundations
 # ==================================================
-# - contents/vol1/parts/foundations_principles.qmd
-# - contents/vol1/introduction/introduction.qmd
-# - contents/vol1/ml_systems/ml_systems.qmd
-# - contents/vol1/ml_workflow/ml_workflow.qmd
-# - contents/vol1/data_engineering/data_engineering.qmd
+- contents/vol1/parts/foundations_principles.qmd
+- contents/vol1/introduction/introduction.qmd
+- contents/vol1/ml_systems/ml_systems.qmd
+- contents/vol1/ml_workflow/ml_workflow.qmd
+- contents/vol1/data_engineering/data_engineering.qmd
 # ==================================================
 # Part II: Build
 # ==================================================
-# - contents/vol1/parts/build_principles.qmd
-# - contents/vol1/nn_computation/nn_computation.qmd
-# - contents/vol1/nn_architectures/nn_architectures.qmd
-# - contents/vol1/frameworks/frameworks.qmd
-# - contents/vol1/training/training.qmd
+- contents/vol1/parts/build_principles.qmd
+- contents/vol1/nn_computation/nn_computation.qmd
+- contents/vol1/nn_architectures/nn_architectures.qmd
+- contents/vol1/frameworks/frameworks.qmd
+- contents/vol1/training/training.qmd
 # ==================================================
 # Part III: Optimize
 # ==================================================
-# - contents/vol1/parts/optimize_principles.qmd
-# - contents/vol1/data_selection/data_selection.qmd
-# - contents/vol1/optimizations/model_compression.qmd
-# - contents/vol1/hw_acceleration/hw_acceleration.qmd
+- contents/vol1/parts/optimize_principles.qmd
+- contents/vol1/data_selection/data_selection.qmd
+- contents/vol1/optimizations/model_compression.qmd
+- contents/vol1/hw_acceleration/hw_acceleration.qmd
 - contents/vol1/benchmarking/benchmarking.qmd
 # ==================================================
 # Part IV: Deploy
 # ==================================================
-# - contents/vol1/parts/deploy_principles.qmd
-# - contents/vol1/model_serving/model_serving.qmd
-# - contents/vol1/ml_ops/ml_ops.qmd
-# - contents/vol1/responsible_engr/responsible_engr.qmd
-# - contents/vol1/conclusion/conclusion.qmd
-# - contents/vol1/backmatter/references.qmd
+- contents/vol1/parts/deploy_principles.qmd
+- contents/vol1/model_serving/model_serving.qmd
+- contents/vol1/ml_ops/ml_ops.qmd
+- contents/vol1/responsible_engr/responsible_engr.qmd
+- contents/vol1/conclusion/conclusion.qmd
+- contents/vol1/backmatter/references.qmd
 # ==================================================
 # Appendices (uses Appendix A, B, C... numbering)
 # ==================================================
 # appendices:
-# - contents/vol1/backmatter/appendix_dam.qmd
-# - contents/vol1/backmatter/appendix_machine.qmd
-# - contents/vol1/backmatter/appendix_algorithm.qmd
-# - contents/vol1/backmatter/appendix_data.qmd
-# - contents/vol1/backmatter/glossary/glossary.qmd
+- contents/vol1/backmatter/appendix_dam.qmd
+- contents/vol1/backmatter/appendix_machine.qmd
+- contents/vol1/backmatter/appendix_algorithm.qmd
+- contents/vol1/backmatter/appendix_data.qmd
+- contents/vol1/backmatter/glossary/glossary.qmd
 bibliography:
 - contents/vol1/backmatter/references.bib
@@ -139,8 +139,8 @@ format:
 coverpage-footer: "Introduction to"
 coverpage-theme:
 page-text-align: "left"
-bg-image-left: "0.225\\paperwidth"
-bg-image-bottom: 9
+bg-image-left: "0.175\\paperwidth"
+bg-image-bottom: 8
 bg-image-rotate: 0
 bg-image-opacity: 1.0
 header-style: "none"


@@ -130,30 +130,29 @@ format:
 coverpage-footer: "At Scale"
 coverpage-theme:
 page-text-align: "left"
-bg-image-left: "0.225\\paperwidth"
-bg-image-bottom: 9
+bg-image-left: "0.175\\paperwidth"
+bg-image-bottom: 8
 bg-image-rotate: 0
 bg-image-opacity: 1.0
 header-style: "none"
 date-style: "none"
-footer-fontsize: 25
+footer-fontsize: 22
 footer-left: "0.075\\paperwidth"
-footer-bottom: "0.475\\paperwidth"
+footer-bottom: "0.435\\paperwidth"
 footer-width: "0.9\\paperwidth"
 footer-align: "left"
-title-fontsize: 52
+title-fontsize: 60
 title-left: "0.075\\paperwidth"
-title-bottom: "0.4\\paperwidth"
+title-bottom: "0.37\\paperwidth"
 title-width: "0.9\\paperwidth"
 author-style: "plain"
 author-sep: "newline"
-author-fontsize: 20
+author-fontsize: 25
 author-align: "right"
-author-bottom: "0.225\\paperwidth"
-#author-left: "0.075\\paperwidth"
+author-bottom: "0.1375\\paperwidth"
+author-left: ".925\\paperwidth"
 author-width: 6in
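The `coverpage-theme` values above are LaTeX length expressions (e.g. `0.175\paperwidth` is 17.5% of the page width from the left edge). A minimal sketch of how such a background-image placement might resolve on the title page, assuming a TikZ-style overlay; this is illustrative only, not the internals of the Quarto titlepage extension, and the unit applied to the bare `8` is the theme's choice (millimetres assumed here):

```latex
% Hypothetical sketch: anchor the cover logo 0.175\paperwidth from the
% left edge and 8mm (assumed unit) above the bottom of the page.
\begin{tikzpicture}[remember picture, overlay]
  \node[anchor=south west, inner sep=0pt]
    at ([xshift=0.175\paperwidth, yshift=8mm]current page.south west)
    {\includegraphics[width=0.3\paperwidth]{cover-logo.png}};
\end{tikzpicture}
```

This makes the diff's intent concrete: shrinking `bg-image-left` from `0.225` to `0.175\paperwidth` slides the logo left, and lowering `bg-image-bottom` from 9 to 8 drops it slightly, so both volumes share one cover geometry.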


@@ -1066,10 +1066,8 @@ wm_params_str = WeightMatrixCalc.wm_params_str
 This computation demonstrates the scale of matrix operations in neural networks. Each output neuron (`{python} wm_out_str` total) must process all input features (`{python} wm_in_str` total) for every sample in the batch (32 samples). The weight matrix alone contains `{python} wm_in_str` $\times$ `{python} wm_out_str` = `{python} wm_params_str` parameters that define these transformations, illustrating why efficient matrix multiplication dominates performance considerations.
-Neural networks employ matrix operations across diverse architectural patterns beyond simple linear layers.
-\index{Convolution!matrix multiplication equivalence}
-Matrix operations appear consistently across modern neural architectures. Convolution operations transform into matrix multiplications through the im2col technique\index{im2col!convolution to matrix}[^fn-im2col-hardware], enabling efficient execution on matrix-optimized hardware. @lst-matrix_patterns illustrates these diverse applications.
+Neural networks employ matrix operations across diverse architectural patterns beyond simple linear layers. Matrix operations appear consistently across modern neural architectures. Convolution operations transform into matrix multiplications through the im2col technique\index{im2col!convolution to matrix}[^fn-im2col-hardware], enabling efficient execution on matrix-optimized hardware. @lst-matrix_patterns illustrates these diverse applications.
 [^fn-im2col-hardware]: **Im2col (Image-to-Column)**: A preprocessing technique that converts convolution operations into matrix multiplications by unfolding image patches into column vectors. A 3 $\times$ 3 convolution on a 224 $\times$ 224 image creates a matrix with ~50,000 columns, enabling efficient GEMM execution but increasing memory usage 9 $\times$ due to overlapping patches. This transformation explains why convolutions are actually matrix operations in modern ML accelerators.
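The footnote's arithmetic checks out with a toy im2col. A minimal pure-Python sketch of the unfolding step (not the book's `@lst-matrix_patterns` listing, which is not shown here):

```python
def im2col(image, k):
    """Unfold every k x k patch of a 2D image (a list of rows) into one
    flattened column, so convolution becomes a single matrix multiply."""
    h, w = len(image), len(image[0])
    return [
        [image[i + di][j + dj] for di in range(k) for dj in range(k)]
        for i in range(h - k + 1)
        for j in range(w - k + 1)
    ]

# A 3x3 kernel over a 224x224 image yields (224-3+1)^2 = 49,284 patches,
# the "~50,000 columns" in the footnote; each patch repeats 9 pixels,
# giving the 9x memory blow-up from overlapping windows.
patches = im2col([[r * 4 + c for c in range(4)] for r in range(4)], 3)
print(len(patches), len(patches[0]))  # → 4 9
```

Each flattened patch is one column of the GEMM operand, so the convolution reduces to `matmul(kernel, patches)` as the next hunk's context line shows.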
@@ -1095,12 +1093,10 @@ output = matmul(kernel, patches)
 ```
 :::
-This pervasive pattern of matrix multiplication has direct implications for hardware design: the need for efficient matrix operations drives the development of specialized hardware architectures that can handle these computations at scale.
-#### Matrix Operations Hardware Acceleration {#sec-hardware-acceleration-matrix-operations-hardware-acceleration-514a}
-\index{Matrix Operations!hardware acceleration}
-The computational demands of matrix operations have driven specialized hardware optimizations. @lst-matrix_unit demonstrates how modern processors implement dedicated matrix units that process entire 16 $\times$ 16 blocks simultaneously, achieving 32 $\times$ higher throughput than vector processing alone.
+This pervasive pattern of matrix multiplication has direct implications for hardware design: the need for efficient matrix operations drives the development of specialized hardware architectures that can handle these computations at scale. The computational demands of matrix operations have driven specialized hardware optimizations. @lst-matrix_unit demonstrates how modern processors implement dedicated matrix units that process entire 16 $\times$ 16 blocks simultaneously, achieving 32 $\times$ higher throughput than vector processing alone.
 ::: {#lst-matrix_unit lst-cap="**Matrix Unit Operation**: Enables efficient block-wise matrix multiplication and accumulation in hardware-accelerated systems, demonstrating how specialized units streamline computational tasks for AI/ML operations."}
 ```{.c}