[GH-ISSUE #606] Typos in some files #1509

GiteaMirror · 2026-04-11T07:52:22-05:00

Originally created by @BravoBaldo on GitHub (Jan 13, 2025).
Original GitHub issue: https://github.com/harvard-edge/cs249r_book/issues/606

In ai_for_good.qmd and in introduction.qmd
Search "Farmbeats" --> "FarmBeats" (B is uppercase)

In contents/core/training/data_engineering.qmd

Search "Kafka": please make it clear that this is the Apache data streaming platform (maybe change it to "Apache Kafka").
Search "rrowdsourced" --> "crowdsourced"
Search "collection platforms etc" --> append a period.
My opinion: search "ensuring only high-quality samples are used for model training". Removing dirt from training data could introduce some sort of bias: you don't always ask Alexa to turn on the light while staying in complete silence! However, the text could specify that cleaning is part of the pre-processing phase.

GiteaMirror added the area: book label 2026-04-11 07:52:22 -05:00

GiteaMirror closed this issue

2026-04-11 07:52:22 -05:00

GiteaMirror commented

2026-04-11 07:52:23 -05:00

@profvjreddi commented on GitHub (Jan 13, 2025):

> In contents/core/training/data_engineering.qmd

Thanks for the feedback 🙏

> Search "Kafka": please make it clear that this is the Apache data streaming platform (maybe change it to "Apache Kafka").

> Search "rrowdsourced" --> "crowdsourced"

> Search "collection platforms etc" --> append a period.

Fixed.

> My opinion: search "ensuring only high-quality samples are used for model training". Removing dirt from training data could introduce some sort of bias: you don't always ask Alexa to turn on the light while staying in complete silence! However, the text could specify that cleaning is part of the pre-processing phase.

Excellent point! I updated the text.
