[PR #1336] [CLOSED] Add Neuraxle #1179

Closed
opened 2025-11-06 13:11:00 -06:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/vinta/awesome-python/pull/1336
Author: @guillaume-chevalier
Created: 8/26/2019
Status: Closed

Base: master ← Head: patch-1


📝 Commits (1)

  • 11107d0 Add Neuraxle

📊 Changes

1 file changed (+1 additions, -0 deletions)

View changed files

📝 README.md (+1 -0)

📄 Description

What is this Python project?

Neuraxle is a Machine Learning (ML) library for building neat pipelines, providing the right abstractions to ease research, development, and deployment of your ML applications. Features:

  • Better hyperparameter space handling
  • Composite design pattern for steps in a pipeline
  • Streaming pipelines where all data can flow (not just like a block all at once)
  • AutoML algorithms to launch hyperparameter search
  • Meta pipeline steps as meta-optimizers for AutoML
  • (soon) Visualization of hyperparameter correlation
  • (soon) Automatic REST API serving of models
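To illustrate the "streaming pipelines" feature, here is a minimal pure-Python sketch of data flowing through steps one mini-batch at a time rather than as a single block. It does not use Neuraxle's API; `minibatches`, `stream_through`, `scale`, and `shift` are hypothetical names chosen for this example only.

```python
from typing import Callable, Iterable, Iterator, List


def minibatches(data: Iterable[float], batch_size: int) -> Iterator[List[float]]:
    """Yield successive chunks of at most `batch_size` items."""
    batch: List[float] = []
    for item in data:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, chunk
        yield batch


def stream_through(
    batches: Iterable[List[float]],
    steps: List[Callable[[List[float]], List[float]]],
) -> Iterator[List[float]]:
    """Lazily push each batch through every step, one batch at a time."""
    for batch in batches:
        for step in steps:
            batch = step(batch)
        yield batch


def scale(batch: List[float]) -> List[float]:
    return [x * 2.0 for x in batch]


def shift(batch: List[float]) -> List[float]:
    return [x + 1.0 for x in batch]


# Only one batch is held in memory at any point in the stream.
results = list(stream_through(minibatches(range(5), batch_size=2), [scale, shift]))
```

Because both functions are generators, nothing is computed until the results are consumed, which is what lets the whole dataset flow through without being loaded at once.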

What's the difference between this Python project and similar ones?

Production-ready

Most research projects don't ever get to production. However, you want
your project to be production-ready and already adaptable (clean) by the
time you finish it. You also want things to be simple so that you can
get started quickly.

Most existing machine learning pipeline frameworks are either too simple
or too complicated for medium-scale projects. Neuraxle is balanced for
medium-scale projects, providing simple, yet powerful abstractions that
are ready to be used.

Compatibility

Neuraxle is built as a framework that enables you to define your own pipeline steps.

This means that you can use scikit-learn, Keras, TensorFlow, PyTorch and/or any other machine learning library you like within and throughout your Neuraxle pipelines.
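As a rough illustration of what "define your own pipeline steps" can look like with a composite design, the sketch below uses plain Python only. It is not Neuraxle's actual API: `Step`, `MeanCenter`, `Scale`, and `Pipeline` are hypothetical classes written for this example. A custom step could just as well wrap a model from any machine learning library inside its `fit` and `transform` methods.

```python
from typing import List, Sequence


class Step:
    """Hypothetical base class: every step can fit and transform."""

    def fit(self, data: Sequence[float]) -> "Step":
        return self  # stateless by default

    def transform(self, data: Sequence[float]) -> List[float]:
        raise NotImplementedError


class MeanCenter(Step):
    """Custom step with learned state (the mean seen during fit)."""

    def __init__(self) -> None:
        self.mean = 0.0

    def fit(self, data: Sequence[float]) -> "MeanCenter":
        self.mean = sum(data) / len(data)
        return self

    def transform(self, data: Sequence[float]) -> List[float]:
        return [x - self.mean for x in data]


class Scale(Step):
    """Stateless custom step."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def transform(self, data: Sequence[float]) -> List[float]:
        return [x * self.factor for x in data]


class Pipeline(Step):
    """Composite: a pipeline is itself a step, so pipelines can nest."""

    def __init__(self, steps: List[Step]) -> None:
        self.steps = steps

    def fit(self, data: Sequence[float]) -> "Pipeline":
        for step in self.steps:
            # Fit each step on the output of the previous one.
            data = step.fit(data).transform(data)
        return self

    def transform(self, data: Sequence[float]) -> List[float]:
        for step in self.steps:
            data = step.transform(data)
        return list(data)


inner = Pipeline([MeanCenter(), Scale(2.0)])
outer = Pipeline([inner, Scale(0.5)])  # a pipeline used as a step
centered = outer.fit([1.0, 2.0, 3.0]).transform([1.0, 2.0, 3.0])
```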

Parallel Computing

Neuraxle offers multiple parallel processing features using joblib. Most parallel processing in Neuraxle happens in the pipeline and union modules, and as such, Neuraxle can be easily parallelized on a cluster of computers using dask's distributed library as its joblib backend.
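The idea of a union that runs its branches in parallel can be sketched as follows. Note the substitution: this example uses the standard library's `concurrent.futures` as a stand-in so that it runs without extra dependencies, whereas the description above says Neuraxle relies on joblib. `parallel_union`, `squares`, and `negatives` are hypothetical names, not part of any library.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Branch = Callable[[Sequence[float]], List[float]]


def parallel_union(branches: List[Branch], data: Sequence[float]) -> List[List[float]]:
    """Run every branch on the same input concurrently; keep branch order."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = [pool.submit(branch, data) for branch in branches]
        # Collect in submission order so outputs line up with branches.
        return [future.result() for future in futures]


def squares(data: Sequence[float]) -> List[float]:
    return [x * x for x in data]


def negatives(data: Sequence[float]) -> List[float]:
    return [-x for x in data]


features = parallel_union([squares, negatives], [1.0, 2.0, 3.0])
```

Each branch receives the same input and produces its own feature set, so the branches are independent and can be dispatched to separate workers.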

Automatic Machine Learning

One of the core goals of this framework is to enable easy automatic
machine learning, and also meta-learning. It should be easy to train a
meta-optimizer on many different tasks: the optimizer is a model itself
that maps features of datasets and features of the hyperparameter space
to a guessed performance score to predict the best hyperparameters.
Hyperparameter spaces are easily defined with a range, and are only
coupled to their respective pipeline steps, rather than being coupled to
the whole pipeline, which enables class reuse and more modularity.
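A minimal sketch of a hyperparameter space that is defined with a range and attached to a single step, rather than to the whole pipeline, might look like this. It is plain Python under stated assumptions, not Neuraxle's API: `Uniform`, `Multiply`, `random_search`, and `score` are hypothetical, and random search stands in for whichever AutoML algorithm is actually used.

```python
import random
from typing import Callable, Dict, List, Tuple


class Uniform:
    """Hypothetical continuous range [low, high] to sample from."""

    def __init__(self, low: float, high: float) -> None:
        self.low, self.high = low, high

    def sample(self, rng: random.Random) -> float:
        return rng.uniform(self.low, self.high)


class Multiply:
    """A step that owns its hyperparameter space."""

    space: Dict[str, Uniform] = {"factor": Uniform(0.0, 4.0)}

    def __init__(self) -> None:
        self.factor = 1.0

    def transform(self, data: List[float]) -> List[float]:
        return [x * self.factor for x in data]


def random_search(
    step: Multiply,
    score: Callable[[Multiply], float],
    n_trials: int,
    seed: int = 0,
) -> Tuple[Dict[str, float], float]:
    """Sample the step's own space and keep the best-scoring parameters."""
    rng = random.Random(seed)
    best_params: Dict[str, float] = {}
    best_score = float("-inf")
    for _ in range(n_trials):
        params = {name: dist.sample(rng) for name, dist in step.space.items()}
        for name, value in params.items():
            setattr(step, name, value)
        trial_score = score(step)
        if trial_score > best_score:
            best_params, best_score = params, trial_score
    for name, value in best_params.items():
        setattr(step, name, value)  # leave the step set to the best trial
    return best_params, best_score


def score(step: Multiply) -> float:
    """Toy objective: the output should be close to double the input."""
    out = step.transform([1.0, 2.0])
    return -sum((o - t) ** 2 for o, t in zip(out, [2.0, 4.0]))


best_params, best_score = random_search(Multiply(), score, n_trials=50)
```

Since the space lives on the step class, the same step can be reused in another pipeline without redefining what is searchable.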

Comparison to Other Machine Learning Pipeline Frameworks

scikit-learn

Everything that works in sklearn is also usable in Neuraxle. Neuraxle
is built in a way that does not replace what already exists. Therefore,
Neuraxle adds more power to scikit-learn by providing neat abstractions,
and Neuraxle is even backward compatible with sklearn if it ever needs to be
included in an already-existing sklearn pipeline (you can do that by
using .tosklearn() on your Neuraxle pipeline). We believe that
Neuraxle helps scikit-learn, and also scikit-learn will help Neuraxle.
Neuraxle is best used with scikit-learn.

Also, one of the top core developers of scikit-learn, Andreas C. Müller, gave a talk in which
he lists the elements that are yet to be done in scikit-learn. He refers to
building bigger pipelines with automatic machine learning, meta-learning,
and improving the abstractions of the search spaces, and he also
points out that it would be possible to achieve that in another library
which could reuse scikit-learn. Neuraxle is here to solve those problems
that are actually shared by the open-source community in general. Let's
move forward with Neuraxle: join Neuraxle's community.


https://www.youtube.com/embed/Wy6EKjJT79M?start=1361&end=1528

Apache Beam

Apache Beam is a big, multi-language project and hence is complicated.
Neuraxle is pythonic and user-friendly: it's easy to get started.

Also, it seems that Apache Beam has GPL and MPL dependencies, which
means Apache Beam might itself be copyleft (?). Neuraxle doesn't have
such copyleft dependencies.

spaCy

spaCy has copyleft dependencies or may download copyleft content, and it
is built only for Natural Language Processing (NLP) projects. Neuraxle
is open to any kind of machine learning project and isn't an NLP-first
project.

Kubeflow

Kubeflow is cloud-first, uses Kubernetes, and is more oriented towards
DevOps. Neuraxle isn't built as a cloud-first solution and isn't tied to
Kubernetes. Neuraxle instead offers many parallel processing features,
such as the ability to be scaled on many cores of a computer, and even
on a computer cluster (e.g.: in the cloud using any cloud provider) with
joblib, using dask's distributed library as a joblib backend. A Neuraxle
project is best deployed as a microservice within your software
environment, and you can fully control and customize how you deploy your
project (e.g.: coding yourself a pipeline step that does JSON conversion
to accept HTTP requests).
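The JSON conversion step mentioned above could be sketched as a pair of small functions placed around the model. This is only an assumption about how one might code it: `json_to_features`, `predictions_to_json`, `double`, and `handle_request`, as well as the `"inputs"` and `"predictions"` keys, are hypothetical and not part of Neuraxle. The web server that would receive the HTTP request is left out.

```python
import json
from typing import Callable, List


def json_to_features(body: str) -> List[float]:
    """Decode a request body such as '{"inputs": [1, 2.5]}' into features."""
    payload = json.loads(body)
    return [float(x) for x in payload["inputs"]]


def predictions_to_json(predictions: List[float]) -> str:
    """Encode model outputs as a JSON response body."""
    return json.dumps({"predictions": predictions})


def double(features: List[float]) -> List[float]:
    """Placeholder model; a fitted pipeline would go here."""
    return [2.0 * x for x in features]


def handle_request(
    body: str, model: Callable[[List[float]], List[float]] = double
) -> str:
    """Decode, predict, encode: the body of a microservice endpoint."""
    return predictions_to_json(model(json_to_features(body)))


response = handle_request('{"inputs": [1, 2.5]}')
```

Keeping the conversion in its own steps means the model itself stays unaware of the transport format, so the same pipeline can be served or run offline.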

--

Anyone who agrees with this pull request could vote for it by adding a 👍 to it, and usually, the maintainer will merge it when votes reach 20.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2025-11-06 13:11:00 -06:00
Reference: github-starred/awesome-python#1179