# Datasets

1. **Google Speech Commands Dataset**
   - Description: A set of one-second .wav audio files, each containing a single spoken English word.
   - [Link to the Dataset](https://ai.googleblog.com/2017/08/launching-speech-commands-dataset.html)

2. **VisualWakeWords Dataset**
   - Description: A dataset tailored for tinyML vision applications, consisting of binary-labeled images that indicate whether a person is present in the image.
   - [Link to the Dataset](https://github.com/tensorflow/models/tree/master/research/slim#preparing-the-visualwakewords-dataset)

3. **EMNIST Dataset**
   - Description: A dataset of 28x28-pixel images of handwritten characters and digits. It extends the MNIST dataset to include letters.
   - [Link to the Dataset](https://www.nist.gov/itl/products-and-services/emnist-dataset)

4. **UCI Machine Learning Repository: Human Activity Recognition Using Smartphones**
   - Description: Recordings of 30 study participants performing activities of daily living (ADL) while carrying a waist-mounted smartphone with embedded inertial sensors.
   - [Link to the Dataset](https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones)

5. **PlantVillage Dataset**
   - Description: A dataset comprising images of healthy and diseased crop leaves, categorized by crop type and disease type, which could be used in a tinyML agricultural project.
   - [Link to the Dataset](https://github.com/spMohanty/PlantVillage-Dataset)

6. **Gesture Recognition using 3D Motion Sensing (3D Gesture Database)**
   - Description: This dataset contains 3D gesture data recorded using a Leap Motion Controller, which might be useful for gesture recognition projects.
   - [Link to the Dataset](https://lttm.dei.unipd.it/downloads/gesture/)

7. **Multilingual Spoken Words Corpus**
   - Description: A dataset containing recordings of common spoken words in various languages, useful for speech recognition projects targeting multiple languages.
   - [Link to the Dataset](https://mlcommons.org/en/multilingual-spoken-words/)

Remember to verify each dataset's license or terms of use to ensure it can be used for your intended purpose.