[PR #2615] add GlassFlow #1932

Open
opened 2025-11-06 13:26:06 -06:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/vinta/awesome-python/pull/2615
Author: @Boburmirzo
Created: 9/20/2024
Status: 🔄 Open

Base: master ← Head: master


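📝 Commits (1)

  • 20cef27 (https://github.com/vinta/awesome-python/commit/20cef276f6f0a1eb914192dc033c18c8e0f6c412) add GlassFlow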

📊 Changes

1 file changed (+1 additions, -0 deletions)

📝 README.md (+1 -0)

📄 Description

What is this Python project?

GlassFlow is a serverless, Python-centric real-time data transformation solution for end-to-end data pipelines. If you use GlassFlow, you do not need Apache Kafka or Flink. Visit the docs page to learn more: https://docs.glassflow.dev/get-started/introduction

Describe features.

You can:

  • Use GlassFlow out-of-the-box with any existing Python library.
  • Start GlassFlow without a complex initial setup such as creating clusters.
  • Skip the headache of managing partitions, shards, and worker setup.
  • Define your pipeline as code using the GlassFlow CLI.
  • Implement your transformation function using the GlassFlow Python SDK.
  • Run your Python code locally for easy development and debugging.
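
A transformation function of the kind listed above can be sketched as an ordinary Python function that takes one event and returns the transformed event. The `handler(data, log)` name and signature, and the event fields, are assumptions made for illustration and are not confirmed by this PR; check the GlassFlow docs for the actual SDK contract. Because it is plain Python, it can be run and debugged locally:

```python
import logging


# Hypothetical transformation function: the name `handler` and the
# (data, log) signature are assumptions, not taken from this PR.
def handler(data: dict, log: logging.Logger) -> dict:
    """Convert a temperature reading from Celsius to Fahrenheit."""
    log.info("transforming event %s", data.get("id"))
    transformed = dict(data)  # copy so the input event is left untouched
    transformed["temp_f"] = data["temp_c"] * 9 / 5 + 32
    return transformed


if __name__ == "__main__":
    # Local run: no cluster or broker is involved, just a function call.
    logging.basicConfig(level=logging.INFO)
    event = {"id": 1, "temp_c": 100.0}
    print(handler(event, logging.getLogger("local-test")))
```

Any existing Python library could be imported inside such a function, which is the point of the first bullet.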

GlassFlow:

  • Provides a pure-Python, zero-infrastructure environment.
  • Keeps your original data where it is.
  • Connects live data sources.
  • Ingests real-time data continuously.
  • Transforms data in real time.
  • Simulates your production workloads.
  • Deploys your pipeline to production within minutes.
  • Delivers auto-scalable serverless event streaming infrastructure.

What's the difference between this Python project and similar ones?

Most real-time data processing tools, including Kafka, are Java-based, while Python has recently become the go-to language for data science and machine learning, especially with the AI hype, because it has a rich set of libraries for data manipulation and analysis, such as Pandas. To bridge this gap, there are now tools for real-time data processing in Python, such as Python wrapper APIs/libraries for the JVM. However, with all Kafka wrappers, you cannot easily simulate a production environment without a complex initial setup such as creating computing clusters and managing partitions, shards, and workers.

Users of these wrappers also need to implement a custom transformation user-defined function (UDF) to convert, say, a transformation written with Pandas, the most popular library, to Java syntax. This translation time can significantly impact the throughput and responsiveness of real-time applications.
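
For contrast, when the transformation stays in Python, no translation to Java syntax is involved: the function is applied to each event as it arrives. Below is a minimal sketch using only the standard library, computing a rolling average over the last `window` readings. The event shape and the function name are made up for illustration and are not part of GlassFlow or of any Kafka wrapper:

```python
from collections import deque
from typing import Iterable, Iterator


def rolling_average(events: Iterable[dict], window: int = 3) -> Iterator[dict]:
    """Yield each event enriched with the mean of the last `window` values."""
    recent = deque(maxlen=window)  # oldest value is dropped automatically
    for event in events:
        recent.append(event["value"])
        yield {**event, "rolling_avg": sum(recent) / len(recent)}


if __name__ == "__main__":
    # A generator stands in for a live source; events are processed one by one.
    stream = ({"value": v} for v in [10, 20, 30, 40])
    for enriched in rolling_average(stream):
        print(enriched)
```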

Enumerate comparisons.

Getting a similar PyFlink-based pipeline into production takes 6-12 months and involves several tools. GlassFlow can get your data pipeline up and running in just 15 minutes with a single tool.

--

Anyone who agrees with this pull request can submit an Approve review to it.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2025-11-06 13:26:06 -06:00
Reference: github-starred/awesome-python#1932