Mlsysbook_chapter18_4_dropout #409

Closed
opened 2026-03-22 15:38:56 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @formlsysbookissue on GitHub (Jul 29, 2025).

Originally assigned to: @profvjreddi on GitHub.

As you mention, dropout is primarily used to prevent overfitting by deactivating neurons during training. While it can highlight areas of high uncertainty, it doesn't directly estimate uncertainty. Perhaps combining dropout with Bayesian or ensemble methods could provide more robustness: Bayesian neural networks and ensemble methods offer more precise uncertainty estimates, though they are computationally expensive. In the meantime, I would suggest adding a brief mention of how the existing overfitting intervention mechanisms work in more detail.
GiteaMirror added the area: booktype: improvement labels 2026-03-22 15:38:56 -05:00
Author
Owner

@profvjreddi commented on GitHub (Jul 29, 2025):

Thanks for the feedback, will do! And by the way, you are welcome to issue a small PR too if you'd like, as I like to get the community to engage that way, and everyone who contributes is automatically [recognised](https://mlsysbook.ai/contents/frontmatter/acknowledgements/acknowledgements.html#contributors). If not, no worries, I will fix it. :-)
Author
Owner

@profvjreddi commented on GitHub (Aug 15, 2025):

Thank you for your feedback 🙏

I updated the text as follows, and I hope that addresses your concern; it is all still a work in progress -- so thank you again for your time and feedback.

Dropout, originally designed as a regularization technique to prevent overfitting during training [@hinton2012improvingneuralnetworkspreventing], works by randomly deactivating a fraction of neurons during each training iteration, forcing the network to avoid over-reliance on specific neurons and improving generalization. This mechanism can be repurposed for uncertainty estimation through Monte Carlo dropout at inference time, where multiple forward passes with different dropout masks approximate the uncertainty distribution. However, this approach provides less precise uncertainty estimates since dropout was not specifically designed for uncertainty quantification but rather for preventing overfitting through enforced redundancy. Hybrid approaches that combine dropout with lightweight ensemble methods or Bayesian approximations can balance computational efficiency with estimation quality, making uncertainty-based detection more practical for real-world deployment.
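The Monte Carlo dropout idea described above can be sketched in a few lines. This is an illustrative NumPy-only toy (random weights, a hypothetical two-layer network, not code from the book): dropout masks stay active at inference, and the spread across repeated stochastic forward passes serves as a rough uncertainty estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny two-layer MLP with fixed random weights (illustrative only).
W1 = rng.normal(size=(4, 16))
W2 = rng.normal(size=(16, 1))

def forward(x, p_drop=0.5):
    """One stochastic forward pass with a fresh dropout mask."""
    h = np.maximum(x @ W1, 0.0)           # ReLU hidden layer
    mask = rng.random(h.shape) > p_drop   # random dropout mask, kept at inference
    h = h * mask / (1.0 - p_drop)         # inverted-dropout scaling
    return h @ W2

def mc_dropout_predict(x, n_samples=100):
    """Monte Carlo dropout: average many stochastic passes; the
    standard deviation across passes approximates predictive uncertainty."""
    preds = np.stack([forward(x) for _ in range(n_samples)])
    return preds.mean(axis=0), preds.std(axis=0)

x = rng.normal(size=(1, 4))
mean, std = mc_dropout_predict(x)
```

In a real framework the same effect is obtained by leaving dropout layers in training mode during inference; the mean plays the role of the prediction and the standard deviation flags inputs the model is less certain about.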

Reference: github-starred/cs249r_book#409