mirror of
https://github.com/vinta/awesome-python.git
synced 2026-03-22 14:12:18 -05:00
[PR #2615] add GlassFlow #1932
📋 Pull Request Information
Original PR: https://github.com/vinta/awesome-python/pull/2615
Author: @Boburmirzo
Created: 9/20/2024
Status: 🔄 Open
Base: master ← Head: master
📝 Commits (1)
20cef27 add GlassFlow
📊 Changes
1 file changed (+1 additions, -0 deletions)
📝 README.md (+1 -0)
📄 Description
What is this Python project?
GlassFlow is a serverless, Python-centric real-time data transformation solution for end-to-end data pipelines. If you use GlassFlow, you do not need Apache Kafka and Flink. Visit the docs page to learn more: https://docs.glassflow.dev/get-started/introduction
Describe features.
You can:
GlassFlow does:
What's the difference between this Python project and similar ones?
Most real-time data processing tools, including Kafka, are Java-based, while Python has recently become the go-to language for data science and machine learning, especially with the AI hype, because it has a rich set of libraries for data manipulation and analysis, such as Pandas. To bridge this gap, there are now tools for real-time data processing in Python, such as Python wrapper APIs/libraries for JVM-based systems. However, with all Kafka wrappers, you cannot easily simulate a production environment without a complex initial setup, such as creating computing clusters and managing partitions, shards, and worker setups.
Users of these wrappers also need to implement a custom transformation user-defined function (UDF) to convert, say, a transformation written with the popular Pandas library into Java syntax. This translation time can significantly impact the throughput and responsiveness of real-time applications.
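To make the point about Python-native transformations concrete, here is a minimal sketch of a per-event transformation written directly in Python. It uses only the standard library; the function names (`transform`, `run`) and the event fields (`amount_cents`, `ts`) are hypothetical and chosen for illustration. This is not the GlassFlow, Kafka, or PyFlink API.

```python
from datetime import datetime, timezone


def transform(event: dict) -> dict:
    """Enrich one event. Hypothetical schema, not a GlassFlow or PyFlink API."""
    enriched = dict(event)  # copy so the caller's event is not mutated
    # Convert integer cents to a decimal amount.
    enriched["amount"] = event["amount_cents"] / 100
    # Convert epoch seconds to an ISO 8601 timestamp in UTC.
    enriched["ts_iso"] = datetime.fromtimestamp(
        event["ts"], tz=timezone.utc
    ).isoformat()
    return enriched


def run(events):
    """Apply the transformation to each event as it arrives (lazy generator)."""
    for event in events:
        yield transform(event)


if __name__ == "__main__":
    sample = [{"user_id": 1, "amount_cents": 1250, "ts": 0}]
    for result in run(sample):
        print(result)
```

Because the logic stays in plain Python, the same function can be unit-tested locally before it is wired into whichever streaming tool runs it.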
Enumerate comparisons.
Getting a similar PyFlink-based pipeline into production takes 6-12 months and involves several tools. GlassFlow can get your data pipeline up and running in just 15 minutes with a single tool.
--
Anyone who agrees with this pull request could submit an Approve review to it.
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.