[GH-ISSUE #606] Typos in some files #1509

Closed
opened 2026-04-11 07:52:22 -05:00 by GiteaMirror · 1 comment

Originally created by @BravoBaldo on GitHub (Jan 13, 2025).
Original GitHub issue: https://github.com/harvard-edge/cs249r_book/issues/606

In ai_for_good.qmd and in introduction.qmd
Search "Farmbeats" --> "FarmBeats" (B is uppercase)

In contents/core/training/data_engineering.qmd

  • Search "Kafka": please make it clear that this is an Apache data streaming platform (maybe change it to "Apache Kafka").
  • Search "rrowdsourced" --> "crowdsourced"
  • Search "collection platforms etc" --> append a dot.
  • My opinion: search "ensuring only high-quality samples are used for model training". Removing dirt from training data could introduce some bias: you don't always ask Alexa to turn on the light while standing in complete silence! However, the text could specify that cleaning is part of the pre-processing phase.
GiteaMirror added the area: book label 2026-04-11 07:52:22 -05:00

@profvjreddi commented on GitHub (Jan 13, 2025):

In contents/core/training/data_engineering.qmd

Thanks for the feedback 🙏

  • Search "Kafka": please make it clear that this is an Apache data streaming platform (maybe change it to "Apache Kafka").
  • Search "rrowdsourced" --> "crowdsourced"
  • Search "collection platforms etc" --> append a dot.

Fixed.

  • My opinion: search "ensuring only high-quality samples are used for model training". Removing dirt from training data could introduce some bias: you don't always ask Alexa to turn on the light while standing in complete silence! However, the text could specify that cleaning is part of the pre-processing phase.

Excellent point! I updated the text.


Reference: github-starred/cs249r_book#1509