[PR #1221] [MERGED] Add Apache Airflow #1090

Closed
opened 2025-11-06 13:09:03 -06:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/vinta/awesome-python/pull/1221
Author: @duyet
Created: 1/28/2019
Status: Merged
Merged: 1/28/2019
Merged by: @vinta

Base: master ← Head: patch-1


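📝 Commits (1)

  • 736868b Update README.md (https://github.com/vinta/awesome-python/commit/736868b5caf9b80c65cd8ad378adb8f1775306d6)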

📊 Changes

1 file changed (+2 additions, -0 deletions)

📝 README.md (+2 -0)

📄 Description

What is this Python project?

Apache Airflow: Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The Airflow scheduler executes your tasks on an array of workers while following the specified dependencies. Rich command-line utilities make performing complex surgeries on DAGs a snap. The rich user interface makes it easy to visualize pipelines running in production, monitor progress, and troubleshoot issues when needed.
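
To illustrate what "following the specified dependencies" means, here is a minimal sketch using only Python's standard library (`graphlib`, Python 3.9+). It is not Airflow code and does not reflect how the Airflow scheduler is implemented; the task names and the `run_order` helper are hypothetical, chosen only to show a DAG being resolved into a valid execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"extract"},
    "load": {"transform", "validate"},
    "report": {"load"},
}


def run_order(deps):
    """Return one execution order in which every task follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

Because `transform` and `validate` depend only on `extract`, either may come first, which is also why a scheduler is free to run such tasks in parallel. A dependency cycle would raise `graphlib.CycleError`, which is the "acyclic" requirement in practice.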

What's the difference between this Python project and similar ones?

Airflow vs. Luigi:

Airflow

  • Easy-to-use UI (+)
  • Built-in scheduler (+)
  • Easy testing of DAGs (+)
  • Separates output data and task state (+)
  • Strong and active community (+)

Luigi

  • Creating and testing tasks is difficult (-)
  • The UI is challenging to navigate (-)
  • Not scalable due to tight coupling with cron jobs; the number of worker processes is bounded by the number of cron workers assigned to a job (-)
  • Re-running pipelines is not possible (-)

Airflow vs. Oozie

Airflow

  • Python code for DAGs (+)
  • Has connectors for every major service/cloud provider (+)
  • More versatile (+)
  • Advanced metrics (+)
  • Better UI and API (+)
  • Capable of creating extremely complex workflows (+)
  • Jinja templating (+)
  • Can be parallelized (=)
  • Native connections to HDFS, Hive, Pig, etc. (=)
  • Graph as DAG (=)

Oozie

  • Java or XML for DAGs (---)
  • Hard to build complex pipelines (-)
  • Smaller, less active community (-)
  • Worse web GUI (-)
  • Java API (-)
  • Can be parallelized (=)
  • Native connections to HDFS, Hive, Pig, etc. (=)
  • Graph as DAG (=)

--

Anyone who agrees with this pull request could vote for it by adding a 👍 to it, and usually, the maintainer will merge it when votes reach 20.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2025-11-06 13:09:03 -06:00
Reference: github-starred/awesome-python#1090