Mirror of https://github.com/harvard-edge/cs249r_book.git (synced 2026-05-07 10:08:50 -05:00)
fix(dl_primer): correct weight count from 61,400 to 78,400 in summary
784 × 100 = 78,400, not 61,400. All other references in the chapter already use the correct figure. Fixes #1226.
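As a quick sanity check of the arithmetic, here is a minimal sketch: the 784-input, 100-neuron layer mirrors the chapter's MNIST example, but the PyTorch code itself is illustrative, not taken from the book.

```python
import torch.nn as nn

# First fully-connected layer from the chapter's MNIST example:
# 784 inputs (28 x 28 pixels) feeding 100 hidden neurons.
layer = nn.Linear(in_features=784, out_features=100)

num_weights = layer.weight.numel()  # 784 * 100 = 78,400
num_biases = layer.bias.numel()     # 100, not counted in the text's figure

assert num_weights == 78_400        # the corrected value, not 61,400
print(num_weights, num_biases)      # 78400 100
```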
@@ -2707,7 +2707,7 @@ Neural networks transform computational approaches by replacing rule-based progr
Neural network architecture demonstrates hierarchical processing, where each layer extracts progressively more abstract patterns from raw data. Training adjusts connection weights through iterative optimization to minimize prediction errors, while inference applies learned knowledge to make predictions on new data. This separation between learning and application phases creates distinct system requirements for computational resources, memory usage, and processing latency that shape system design and deployment strategies.
- This chapter established mathematics and systems implications through fully-connected architectures. The multilayer perceptrons explored here demonstrate universal function approximation. With enough neurons and appropriate weights, such networks can theoretically learn any continuous function. This mathematical generality comes with computational costs. Consider our MNIST example: a 28×28 pixel image contains 784 input values, and a fully-connected network treats each pixel independently, learning 61,400 weights just in the first layer (784 inputs × 100 neurons). Neighboring pixels are highly correlated while distant pixels rarely interact. Fully-connected architectures expend computational resources learning irrelevant long-range relationships.
+ This chapter established mathematics and systems implications through fully-connected architectures. The multilayer perceptrons explored here demonstrate universal function approximation. With enough neurons and appropriate weights, such networks can theoretically learn any continuous function. This mathematical generality comes with computational costs. Consider our MNIST example: a 28×28 pixel image contains 784 input values, and a fully-connected network treats each pixel independently, learning 78,400 weights just in the first layer (784 inputs × 100 neurons). Neighboring pixels are highly correlated while distant pixels rarely interact. Fully-connected architectures expend computational resources learning irrelevant long-range relationships.
::: {.callout-important title="Key Takeaways"}
* Neural networks replace hand-coded rules with adaptive patterns discovered from data through hierarchical processing architectures
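The paragraph's closing point, that dense layers spend parameters on irrelevant long-range pixel relationships, can be made concrete with a rough comparison. The 3×3 convolution below is an illustrative counterpoint under assumed layer sizes, not code from the chapter.

```python
import torch.nn as nn

# Dense first layer: every input pixel connects to every hidden unit,
# so the weight count scales with image size (784 * 100).
dense = nn.Linear(in_features=784, out_features=100)

# A 3x3 convolution shares its weights across all spatial positions,
# so its weight count is independent of the 28x28 input resolution.
conv = nn.Conv2d(in_channels=1, out_channels=100, kernel_size=3)

print(dense.weight.numel())  # 78400
print(conv.weight.numel())   # 100 * 1 * 3 * 3 = 900
```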