Mirror of https://github.com/oxnr/awesome-bigdata.git (synced 2026-04-27 01:28:25 -05:00)
Alphabetized the machine learning section
Added several more libraries with descriptions.
19
README.md
@@ -343,32 +343,40 @@ You can read more about this distinction on Prof. Daniel Abadi's blog: [Distingu
## Machine Learning
* [Apache Mahout](http://mahout.apache.org/) - machine learning library for Hadoop.
* [brain](https://github.com/harthur/brain) - Neural networks in JavaScript.
* [Cloudera Oryx](https://github.com/cloudera/oryx) - real-time large-scale machine learning.
* [Concurrent Pattern](http://www.cascading.org/projects/pattern/) - machine learning library for Cascading.
* [convnetjs](https://github.com/karpathy/convnetjs) - Deep Learning in JavaScript. Train Convolutional Neural Networks (or ordinary ones) in your browser.
* [DataVec](https://github.com/deeplearning4j/DataVec) - A vectorization and data preprocessing library for deep learning in Java and Scala. Part of the Deeplearning4j ecosystem.
* [Deeplearning4j](https://github.com/deeplearning4j) - Fast, open deep learning for the JVM (Java, Scala, Clojure). A neural network configuration layer powered by a C++ library. Uses Spark and Hadoop to train nets on multiple GPUs and CPUs.
* [Decider](https://github.com/danielsdeleo/Decider) - Flexible and Extensible Machine Learning in Ruby.
* [ENCOG](http://www.heatonresearch.com/encog/) - machine learning framework that supports a variety of advanced algorithms, as well as support classes to normalize and process data.
* [etcML](http://www.etcml.com/) - text classification with machine learning.
* [Etsy Conjecture](https://github.com/etsy/Conjecture) - scalable Machine Learning in Scalding.
* [Google Sibyl](https://users.soe.ucsc.edu/~niejiazhong/slides/chandra.pdf) - System for Large Scale Machine Learning at Google.
* [GraphLab Create](https://dato.com/products/create/) - A machine learning platform in Python with a broad collection of ML toolkits, data engineering, and deployment tools.
* [H2O](https://github.com/h2oai/h2o-3/) - statistical, machine learning and math runtime for Hadoop.
* [H2O](https://github.com/h2oai/h2o-3/) - statistical, machine learning and math runtime that works with Hadoop, R and Python.
* [Keras](https://github.com/fchollet/keras) - An intuitive neural net API inspired by Torch that runs atop Theano and TensorFlow.
* [Mahout](http://mahout.apache.org/) - An Apache-backed machine learning library for Hadoop.
* [MLbase](http://www.mlbase.org/) - distributed machine learning libraries for the BDAS stack.
* [MLPNeuralNet](https://github.com/nikolaypavlov/MLPNeuralNet) - Fast multilayer perceptron neural network library for iOS and Mac OS X.
* [MOA](http://moa.cms.waikato.ac.nz) - real-time big data stream mining and large-scale machine learning.
* [MonkeyLearn](http://www.monkeylearn.com/) - Text mining made easy. Extract and classify data from text.
* [ND4J](https://github.com/deeplearning4j/nd4j) - A matrix library for the JVM. Numpy for Java.
* [nupic](https://github.com/numenta/nupic) - Numenta Platform for Intelligent Computing: a brain-inspired machine intelligence platform, and biologically accurate neural network based on cortical learning algorithms.
* [PredictionIO](https://prediction.io/) - machine learning server built on Hadoop, Mahout and Cascading.
* [RL4J](https://github.com/deeplearning4j/rl4j) - Reinforcement learning for Java and Scala. Includes Deep-Q learning and A3C algorithms, and integrates with OpenAI's Gym. Runs in the Deeplearning4j ecosystem.
* [SAMOA](http://samoa.incubator.apache.org/) - distributed streaming machine learning framework.
* [scikit-learn](https://github.com/scikit-learn/scikit-learn) - machine learning in Python.
* [Spark MLlib](http://spark.apache.org/docs/0.9.0/mllib-guide.html) - a Spark implementation of some common machine learning (ML) functionality.
* [Sibyl](https://users.soe.ucsc.edu/~niejiazhong/slides/chandra.pdf) - System for Large Scale Machine Learning at Google.
* [TensorFlow](https://github.com/tensorflow/tensorflow) - Library from Google for machine learning using data flow graphs.
* [Theano](https://github.com/theano) - A Python-focused machine learning library supported by the University of Montreal.
* [Torch](https://github.com/torch) - A deep learning library with a Lua API, supported by NYU and Facebook.
* [Velox](https://github.com/amplab/velox-modelserver) - System for serving machine learning predictions.
* [Vowpal Wabbit](https://github.com/JohnLangford/vowpal_wabbit/wiki) - learning system sponsored by Microsoft and Yahoo!.
* [WEKA](http://www.cs.waikato.ac.nz/ml/weka/) - suite of machine learning software.
* [BIDMach](https://github.com/BIDData/BIDMach) - CPU and GPU-accelerated Machine Learning Library.
* [Velox](https://github.com/amplab/velox-modelserver) - System for serving machine learning predictions.
* [TensorFlow](https://github.com/tensorflow/tensorflow) - Open source software library from Google for machine learning using data flow graphs.
## Benchmarking
@@ -377,6 +385,7 @@ You can read more about this distinction on Prof. Daniel Abadi's blog: [Distingu
* [Intel HiBench](https://github.com/intel-hadoop/HiBench) - a Hadoop benchmark suite.
* [PUMA Benchmarking](https://issues.apache.org/jira/browse/MAPREDUCE-5116) - benchmark suite for MapReduce applications.
* [Yahoo Gridmix3](https://developer.yahoo.com/blogs/hadoop/gridmix3-emulating-production-workload-apache-hadoop-450.html) - Hadoop cluster benchmarking from the Yahoo engineering team.
* [Deeplearning4j Benchmarks](https://github.com/deeplearning4j/dl4j-benchmark)
## Security