diff --git a/book/quarto/contents/core/dl_primer/dl_primer.qmd b/book/quarto/contents/core/dl_primer/dl_primer.qmd
index 6890e7756..6201711fc 100644
--- a/book/quarto/contents/core/dl_primer/dl_primer.qmd
+++ b/book/quarto/contents/core/dl_primer/dl_primer.qmd
@@ -2707,7 +2707,7 @@ Neural networks transform computational approaches by replacing rule-based progr
 Neural network architecture demonstrates hierarchical processing, where each layer extracts progressively more abstract patterns from raw data. Training adjusts connection weights through iterative optimization to minimize prediction errors, while inference applies learned knowledge to make predictions on new data. This separation between learning and application phases creates distinct system requirements for computational resources, memory usage, and processing latency that shape system design and deployment strategies.
 
-This chapter established mathematics and systems implications through fully-connected architectures. The multilayer perceptrons explored here demonstrate universal function approximation. With enough neurons and appropriate weights, such networks can theoretically learn any continuous function. This mathematical generality comes with computational costs. Consider our MNIST example: a 28×28 pixel image contains 784 input values, and a fully-connected network treats each pixel independently, learning 61,400 weights just in the first layer (784 inputs × 100 neurons). Neighboring pixels are highly correlated while distant pixels rarely interact. Fully-connected architectures expend computational resources learning irrelevant long-range relationships.
+This chapter established mathematics and systems implications through fully-connected architectures. The multilayer perceptrons explored here demonstrate universal function approximation. With enough neurons and appropriate weights, such networks can theoretically learn any continuous function. This mathematical generality comes with computational costs. Consider our MNIST example: a 28×28 pixel image contains 784 input values, and a fully-connected network treats each pixel independently, learning 78,400 weights just in the first layer (784 inputs × 100 neurons). Neighboring pixels are highly correlated while distant pixels rarely interact. Fully-connected architectures expend computational resources learning irrelevant long-range relationships.
 
 ::: {.callout-important title="Key Takeaways"}
 
 * Neural networks replace hand-coded rules with adaptive patterns discovered from data through hierarchical processing architectures
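The corrected figure in the `+` line can be checked directly. A minimal sketch, assuming the layer sizes stated in the chapter text (784 inputs from a 28×28 MNIST image, a first hidden layer of 100 neurons):

```python
# Weight count of a fully-connected first layer, as in the chapter's
# MNIST example: every input pixel connects to every hidden neuron.
inputs = 28 * 28   # 784 pixel values per image
neurons = 100      # hidden-layer width assumed from the text

weights = inputs * neurons
print(weights)     # 78400 — matching the patched value, not 61,400
```

This confirms the patch: 784 × 100 is 78,400, so the original `61,400` was a typo rather than a different layer configuration.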