This article lists machine learning frameworks, libraries, and software, grouped by programming language.
1. C++
1.1 Computer Vision
- CCV - a C-based, cached, core computer vision library; a modern machine vision library
- OpenCV - provides C++, C, Python, Java, and MATLAB interfaces and supports Windows, Linux, Android, and Mac OS.
1.2 Machine learning
2. Clojure
- Clojure Toolbox - a directory of Clojure libraries and tools
3. Go
3.1 Natural Language Processing
- go-porterstemmer - a native Go clean-room implementation of the Porter stemming algorithm
- paicehusk - a Go implementation of the Paice/Husk stemming algorithm
- snowball - a Go version of the Snowball stemmer
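The stemmers listed above all reduce inflected words to a common stem. As a rough illustration of the idea only, here is a minimal suffix-stripping sketch in Python. It is not the Porter, Paice/Husk, or Snowball algorithm; the suffix list, the `naive_stem` name, and the `min_stem` parameter are assumptions made for this example.

```python
# Toy suffix stripper: an illustration of stemming, NOT the Porter algorithm.
# The suffix list below is a hypothetical, deliberately tiny rule set.
SUFFIXES = ("edly", "ing", "ed", "ly", "es", "s")


def naive_stem(word: str, min_stem: int = 3) -> str:
    """Strip the longest matching suffix, keeping at least min_stem characters."""
    word = word.lower()
    # Try longer suffixes first so "edly" wins over "ly".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


stems = [naive_stem(w) for w in ("jumping", "jumped", "cats", "is")]
```

Real stemmers apply ordered rule sets with conditions on the remaining stem, which is why a dedicated library is preferable to a sketch like this one.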
3.2 Machine Learning
- GoLearn - a machine learning library for Go
- go-pr - a machine learning package for Go.
- bayesian - a naive Bayes classification library for Go.
- go-galib - a genetic algorithm library for Go.
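To give a sense of what a naive Bayes classification library such as the one listed above does, the following is a minimal Python sketch of a multinomial naive Bayes text classifier with add-one (Laplace) smoothing. The class and method names are hypothetical and do not mirror the API of any library in this list.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative sketch)."""

    def __init__(self):
        self.class_counts = Counter()            # documents seen per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()                       # all tokens seen in training

    def train(self, tokens, label):
        self.class_counts[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Work in log space to avoid underflow on long documents.
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


clf = NaiveBayes()
clf.train(["buy", "cheap", "pills"], "spam")
clf.train(["cheap", "offer"], "spam")
clf.train(["meeting", "at", "noon"], "ham")
clf.train(["lunch", "at", "noon"], "ham")
label = clf.predict(["cheap", "pills"])
```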
3.3 Data analysis/Data visualization
- go-graph - a graph library for Go.
- SVGo - an SVG generation library for Go.
4. Java
4.1 Natural Language Processing
- CoreNLP - Stanford CoreNLP provides a set of natural language analysis tools that take raw English text as input and return the base forms of words (several of the Stanford tools listed below are bundled with it).
- Stanford Parser - a natural language parser.
- Stanford POS Tagger - a part-of-speech tagger.
- Stanford Named Entity Recognizer - a named entity recognizer implemented in Java
- Stanford Word Segmenter - a word segmenter, a standard preprocessing step in many NLP tasks.
- Tregex, Tsurgeon and Semgrex - tools for pattern matching in tree data structures, based on tree relationships and regular-expression matches on nodes (Tregex is short for "tree regular expressions").
- Stanford Phrasal - a state-of-the-art statistical phrase-based machine translation system, written in Java
- Stanford TokensRegex - a framework for defining text patterns.
- Stanford Temporal Tagger - SUTime is a library that recognizes and normalizes time expressions.
- Stanford SPIED - iteratively learns entities from unlabeled text, using patterns on a seed set
- Stanford Topic Modeling Toolbox - topic modeling tools for social scientists and others who want to analyze datasets.
- Twitter Text Java - a Java implementation of Twitter's text processing library
- MALLET - a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.
- OpenNLP - a machine learning toolkit for processing natural language text.
- LingPipe - a toolkit for processing text using computational linguistics.
4.2 Machine Learning
- MLlib in Apache Spark - Spark's distributed machine learning library
- Mahout - a distributed machine learning library
- Stanford Classifier - a classifier from Stanford University
- Weka - a collection of machine learning algorithms for data mining.
- Oryx - provides a simple infrastructure for large-scale, real-time machine learning and predictive analytics.
4.3 Data analysis/Data visualization
- Hadoop - a big data analytics platform
- Spark - a fast, general-purpose engine for large-scale data processing.
- Impala - real-time queries for Hadoop
5. JavaScript
5.1 Natural Language Processing
- twitter-text-js - a JavaScript implementation of Twitter's text processing library
- NLP.js - NLP tools written in JavaScript and CoffeeScript
- natural - general NLP tools for Node
- Knwl.js - a natural language processor written in JS
5.2 Data analysis/Data visualization
- D3.js
- Highcharts
- NVD3.js
- dc.js
- Chart.js
- dimple
- amCharts
5.3 Machine Learning
- convnet.js-a JavaScript library that trains deep learning models.
- Clustering.js - clustering algorithms implemented in JavaScript for Node.js and the browser.
- Decision Trees - a Node.js implementation of decision trees using the ID3 algorithm.
- node-fann - Node.js bindings for the Fast Artificial Neural Network (FANN) library.
- Kmeans.js - a simple JavaScript implementation of the k-means algorithm for Node.js and the browser.
- LDA.js - an LDA topic modeling tool for Node.js.
- Learning.js - a JavaScript implementation of logistic regression and the C4.5 decision tree
- Machine Learning - a machine learning library for Node.js.
- node-svm - a support vector machine for Node.js
- Brain - neural networks in JavaScript
- Bayesian-Bandit - a Bayesian bandit algorithm for Node.js and the browser.
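Several entries above implement k-means clustering. For readers unfamiliar with it, here is a minimal Python sketch of the standard assign-then-update loop (Lloyd's algorithm). The function names and the choice to pass initial centroids explicitly are assumptions made for this example, not the interface of any listed library.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of k-means iterations from the given centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [mean_point(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(points, [(0, 0), (10, 10)])
```

Library implementations usually add random or k-means++ initialization and a convergence check instead of a fixed iteration count.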
6. Julia
6.1 Machine Learning
- PGM - a probabilistic graphical model framework in Julia.
- DA - a Julia package for regularized discriminant analysis.
- Regression - algorithms for regression analysis (such as linear regression and logistic regression).
- Local Regression - local regression, very smooth!
- Naive Bayes - a simple Julia implementation of naive Bayes
- Mixed Models - a Julia package for (statistical) mixed-effects models
- Simple MCMC - a basic MCMC sampler implemented in Julia
- Distance - a Julia module for distance evaluation
- Decision Tree - a decision tree classifier and regressor
- Neural - a neural network in Julia
- MCMC - MCMC tools for Julia
- GLM - a Julia package for generalized linear models
- Online Learning
- GLMNet - a Julia wrapper for glmnet, suited to lasso/elastic net models.
- Clustering - basic functions for data clustering: k-means, dp-means, etc.
- SVM - support vector machines for Julia.
- Kernel Density - a kernel density estimator for Julia
- Dimensionality Reduction - dimensionality reduction methods
- NMF - a Julia package for non-negative matrix factorization
- ANN - a neural network in Julia
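As an illustration of what a kernel density estimator like the one listed above computes, here is a minimal Python sketch of a one-dimensional Gaussian KDE. The `gaussian_kde` name and the fixed, user-supplied bandwidth are assumptions made for this example; real packages typically choose the bandwidth automatically.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of 1-D samples.

    Each sample contributes a Gaussian bump of width `bandwidth`;
    the estimate is the average of those bumps.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


density = gaussian_kde([0.0, 1.0, 2.0], bandwidth=0.5)
value_at_one = density(1.0)
```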
6.2 Natural Language Processing
- Topic Models - topic modeling in Julia
- Text Analysis - a text analysis package for Julia
6.3 Data analysis/Data visualization
- Graph Layout - graph layout algorithms in pure Julia.
- Data Frames Meta - metaprogramming tools for DataFrames.
- Julia Data - a Julia library for working with tabular data
- Data Read - reads files from Stata, SAS, and SPSS
- Hypothesis Tests - a hypothesis testing package for Julia
- Gadfly - a crafty statistical plotting system for Julia.
- Stats - a package of statistical tests for Julia
- RDatasets - a Julia package for loading many of the datasets available in R.
- DataFrames - a Julia library for working with tabular data.
- Distributions - a Julia package for probability distributions and associated functions.
- Data Arrays - data structures whose elements may be missing (empty) values.
- Time Series - a time series data toolkit for Julia.
- Sampling - basic sampling algorithms for Julia
6.4 Miscellaneous/Presentations
- DSP - digital signal processing
- JuliaCon Presentations - presentations from JuliaCon
- SignalProcessing - signal processing tools for Julia
- Images - an image library for Julia
7. Lua
7.1 Machine Learning
- Torch7
- cephes - the Cephes mathematical functions library, wrapped for Torch. It provides and wraps more than 180 special mathematical functions from the library developed by Stephen L. Moshier, which is used at the core of SciPy, among many other places.
- graph - a graph package for Torch.
- randomkit - the random number generation package extracted from NumPy, wrapped for Torch.
- signal - a signal processing toolbox for Torch-7, with FFT, DCT, Hilbert, cepstrum, STFT, and other transforms.
- nn - a neural network package for Torch.
- nngraph - provides graphical computation for the nn library.
- nnx - an unstable, experimental package that extends Torch's built-in nn library.
- optim - an optimization library for Torch, including SGD, Adagrad, conjugate gradient, L-BFGS, RProp, and other algorithms.
- unsup - an unsupervised learning package for Torch. It provides modules that are compatible with nn (LinearPsd, ConvPsd, AutoEncoder, ...) and standalone algorithms (k-means, PCA).
- manifold - a package for manipulating manifolds.
- svm - a support vector machine library for Torch.
- lbfgs - an FFI wrapper for liblbfgs.
- vowpalwabbit - an old Vowpal Wabbit interface to Torch.
- OpenGM - OpenGM is a C++ library for graphical modeling and inference. The Lua bindings provide a simple way of describing graphs from Lua and then optimizing them with OpenGM.
- spaghetti - a sparse linear module for Torch7 by Michael Mathieu.
- LuaSHKit - a Lua wrapper around the locality-sensitive hashing library SHKit.
- kernel smoothing - KNN, kernel-weighted average, and local linear regression smoothers
- cutorch - the CUDA backend implementation for Torch
- cunn - the CUDA implementation of Torch's neural network package.
- imgraph - an image/graph library for Torch that provides routines to create graphs from images, segment them, build trees out of them, and convert them back to images
- videograph - a video/graph library for Torch that provides routines to create graphs from videos, segment them, build trees out of them, and convert them back to videos
- saliency - code and tools around integral images, including a way to find interest points based on fast integral histograms.
- stitch - uses Hugin to stitch images and apply the same stitching to a video sequence.
- sfm - a bundle adjustment/structure from motion package
- fex - a feature extraction package for Torch that provides SIFT and dSIFT modules.
- OverFeat - a state-of-the-art generic dense feature extractor.
- Numeric Lua
- Lunatic Python
- SciLua
- Lua - Numerical Algorithms
- Lunum
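Optimization packages such as optim, listed above, provide algorithms like SGD. To show the basic update rule, here is a minimal Python sketch of stochastic gradient descent fitting a line y = w*x + b by least squares. The function name, learning rate, and epoch count are assumptions chosen for this toy example and are unrelated to the Torch API.

```python
def sgd_linear_fit(xs, ys, learning_rate=0.02, epochs=2000):
    """Fit y = w * x + b with per-sample gradient steps on squared error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        # Real SGD shuffles the samples each epoch; a fixed order is used
        # here so the sketch stays deterministic.
        for x, y in zip(xs, ys):
            error = (w * x + b) - y          # prediction minus target
            w -= learning_rate * error * x   # gradient of 0.5 * error**2 w.r.t. w
            b -= learning_rate * error       # gradient of 0.5 * error**2 w.r.t. b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]  # noise-free data from y = 2x + 1
w, b = sgd_linear_fit(xs, ys)
```

On this noise-free data the parameters approach w = 2 and b = 1; with noisy data a constant learning rate would leave them fluctuating around the least-squares solution.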
7.2 Demos and scripts
- Core Torch7 demos repository
- Linear regression, logistic regression
- Face detector (training and detection as separate demos)
- MST-based segmenter
- Train-a-digit-classifier
- Train-autoencoder
- Optical Flow Demo
- Train-on-housenumbers
- Train-on-cifar
- Tracking with deep nets
- Kinect Demo
- Visualization of filtering
- Saliency-networks
- Training a convnet for the Galaxy-zoo Kaggle Challenge (CUDA demo)
- Music Tagging - music tagging scripts for Torch7
- torch-datasets - scripts for loading several popular datasets, including:
  - BSR 500
  - CIFAR-10
  - COIL
  - Street View House Numbers
  - MNIST
  - NORB
- atari2600 - scripts that generate a dataset of static frames from the Arcade Learning Environment emulator.
8. Matlab
8.1 Computer Vision
- Contourlets - MATLAB source code that implements the contourlet transform and its utility functions
- Shearlets - MATLAB source code for the shearlet transform
- Curvelets - MATLAB source code for the curvelet transform (the curvelet transform is a higher-dimensional generalization of the wavelet transform, designed to represent images at different scales).
- Bandlets - MATLAB source code for the bandlet transform
8.2 Natural Language Processing
- NLP - an NLP library for Matlab
8.3 Machine Learning
- Training a deep autoencoder or a classifier on MNIST digits - code for training a deep autoencoder or a classifier on the MNIST digit dataset [deep learning].
- t-Distributed Stochastic Neighbor Embedding - an award-winning dimensionality reduction technique, particularly well suited to visualizing high-dimensional datasets
- Spider - a complete object-oriented environment for machine learning in Matlab.
- LibSVM - a library for support vector machines
- LibLinear - a library for large linear classification
- Machine Learning Module - M. A. Girolami's machine learning course, including PDFs, lecture notes, and code.
- Caffe - a deep learning framework developed with cleanliness, readability, and speed in mind
- Pattern Recognition Toolbox - a complete object-oriented pattern recognition toolkit for Matlab
8.4 Data analysis/Data visualization
- matlab_gbl - MatlabBGL, a Matlab package for working with graphs
- gamic - efficient pure-Matlab implementations of graph algorithms, complementing MatlabBGL's MEX functions.
9. .NET
9.1 Computer Vision
- OpenCVDotNet - a wrapper that lets .NET programs use OpenCV code
- Emgu CV - a cross-platform wrapper that can be compiled on Windows, Linux, Mac OS X, iOS, and Android.
9.2 Natural Language Processing
- Stanford.NLP for .NET - a full port of Stanford University's NLP packages to .NET, also available precompiled as a NuGet package.
9.3 General Machine Learning
- Accord.MachineLearning - support vector machines, decision trees, naive Bayes models, k-means, Gaussian mixture models, and general algorithms for machine learning applications such as random sample consensus (RANSAC), cross-validation, and grid search. This package is part of the Accord.NET framework.
- Vulpes - a deep belief and deep learning implementation written in F# that runs on the GPU with CUDA, using Alea.cuBase.
- Encog - an advanced neural network and machine learning framework. It includes classes for creating a variety of networks, as well as support classes for normalizing and processing the data those networks need. Training uses multithreaded resilient propagation, and a GPU can be used to speed up processing. A graphical interface is provided to help model and train neural networks.
- Neural Network Designer - a database management system and neural network designer. The designer is built with WPF and provides a user interface for designing a neural network, querying it, and creating and configuring chat bots that can ask questions and learn from your feedback. The bots can even collect information from the web to use in their output or for learning.
9.4 Data analysis/Data visualization
- numl - a machine learning library that aims to simplify standard modeling techniques for prediction and clustering.
- Math.NET Numerics - the numerical foundation of the Math.NET project, providing methods and algorithms for numerical computation in science, engineering, and everyday use. Supports .NET 4.0, .NET 3.5, and Mono on Windows, Linux, and Mac; Silverlight 5; WindowsPhone/SL 8; WindowsPhone 8.1; Windows 8 with PCL Portable Profiles 47 and 344; and Android/iOS with Xamarin.
- Sho - an interactive environment for data analysis and scientific computing that lets you seamlessly connect scripts (in IronPython) with compiled code (.NET) for fast, flexible prototyping. The environment includes powerful and efficient libraries for linear algebra and data visualization that can be used from any .NET language, as well as a feature-rich interactive shell for rapid development.
10. Python
10.1 Computer Vision
- SimpleCV - an open source computer vision framework that gives access to several high-performance computer vision libraries, such as OpenCV. It is written in Python and runs on Mac, Windows, and Ubuntu.
10.2 Natural Language Processing
- NLTK - a leading platform for building Python programs that work with human language data
- Pattern - a web mining module for Python, with tools for natural language processing, machine learning, and more.
- TextBlob - provides a consistent API for common natural language processing tasks; it builds on NLTK and Pattern and works well with both.
- jieba - a Chinese word segmentation tool.
- SnowNLP - a Chinese text processing library.
- loso - another Chinese word segmentation library.
- genius - a Chinese word segmentation library based on conditional random fields.
- nut - a natural language understanding toolkit.
10.3 Machine Learning
- Bayesian Methods for Hackers - an ebook on probabilistic programming in Python
- MLlib in Apache Spark - Spark's distributed machine learning library.
- scikit-learn - a machine learning module built on SciPy
- graphlab-create - a library with various machine learning models (regression, clustering, recommender systems, graph analytics, etc.) built on top of a DataFrame that can be stored on disk.
- BigML - a library that connects to external servers.
- Pattern - a web mining module for Python
- NuPIC - Numenta's platform for intelligent computing.
- Pylearn2 - a machine learning library based on Theano.
- hebel - a GPU-accelerated deep learning library written in Python.
- gensim - a topic modeling tool.
- PyBrain - another machine learning library.
- Crab - an extensible, fast recommender engine.
- python-recsys - a Python implementation of a recommender system.
- Thinking Bayes - a book on Bayesian analysis
- Restricted Boltzmann Machines - a Python implementation of restricted Boltzmann machines [deep learning].
- Bolt - an online learning toolkit.
- CoverTree - a Python implementation of cover trees, a convenient alternative to scipy.spatial.kdtree.
- nilearn - a machine learning library for neuroimaging in Python.
- Shogun - a machine learning toolkit.
- Pyevolve - a genetic algorithm framework.
- Caffe - a deep learning framework developed with cleanliness, readability, and speed in mind
- breze - a Theano-based library for deep and recurrent neural networks.
10.4 Data analysis/Data visualization
- SciPy - a Python-based ecosystem of open source software for mathematics, science, and engineering.
- NumPy - the fundamental package for scientific computing with Python.
- Numba - a Python JIT compiler targeting LLVM, aimed at scientific computing, written by the developers of Cython and NumPy
- NetworkX - efficient software for working with complex networks.
- Pandas - a library that provides high-performance, easy-to-use data structures and data analysis tools.
- Open Mining - a business intelligence tool in Python (with a Pandas web interface).
- PyMC - an MCMC sampling toolkit.
- zipline - an algorithmic trading library for Python.
- PyDy - short for Python Dynamics; supports dynamics modeling workflows built on NumPy, SciPy, IPython, and matplotlib.
- SymPy - a Python library for symbolic mathematics.
- statsmodels - a statistical modeling and econometrics library for Python.
- astropy - a community-developed Python library for astronomy
- matplotlib - a 2D plotting library for Python.
- bokeh - an interactive web plotting library for Python.
- plotly - a collaborative web plotting library for Python and matplotlib.
- vincent - translates Python data structures into the Vega visualization grammar.
- d3py - a plotting library for Python, based on D3.js.
- ggplot - provides the same API as ggplot2 for R.
- Kartograph.py - a Python library for rendering beautiful SVG maps.
- pygal - an SVG chart builder for Python.
- pycascading
10.5 Miscellaneous Scripts/IPython Notebooks/Codebases
- Pattern_classification
- Thinking Stats 2
- Hyperopt
- Numpic
- 2012-paper-diginorm
- Ipython-notebooks
- Decision-weights
- Sarah Palin LDA - topic modeling of Sarah Palin's emails.
- Diffusion Segmentation - a collection of image segmentation algorithms based on diffusion methods.
- Scipy Tutorials - SciPy tutorials; outdated, see scipy-lecture-notes instead
- Crab - a recommendation engine library for Python.
- BayesPy - Bayesian inference tools in Python.
- scikit-learn tutorials - a series of notebooks for learning scikit-learn
- sentiment-analyzer - a Twitter sentiment analyzer
- group-lasso - experiments with coordinate descent algorithms applied to the (sparse) group lasso model.
- mne-python-notebooks - IPython notebooks for EEG/MEG data processing with mne-python
- pandas cookbook - recipes for using the Python pandas library.
- climin - an optimization library for machine learning, with Python implementations of gradient descent, LBFGS, RMSProp, Adadelta, and other algorithms.
10.6 Kaggle Competition Source code
- Wiki Challenge - an implementation of Dell Zhang's solution to Kaggle's Wikipedia prediction challenge.
- kaggle insults - code submitted to the Kaggle competition on detecting insults in social media commentary
- kaggle_acquire-valued-shoppers-challenge - code for the Kaggle challenge on predicting repeat buyers
- kaggle-cifar - code for the CIFAR-10 competition on Kaggle, using cuda-convnet
- kaggle-blackbox - code for the Black Box competition on Kaggle, about deep learning.
- kaggle-accelerometer - code for the Kaggle competition on identifying users from accelerometer data
- kaggle-advertised-salaries - code for the Kaggle competition on predicting salaries from job ads
- kaggle amazon - code for the Kaggle competition on predicting an employee's access needs from their role
- kaggle-bestbuy_big - code for the Kaggle competition on predicting which product a BestBuy user will click, given their query (big data version)
- kaggle-bestbuy_small - code for the Kaggle competition on predicting which product a BestBuy user will click, given their query (small data version)
- Kaggle Dogs vs. Cats - code for the Kaggle competition on telling cats from dogs in pictures
- Kaggle Galaxy Challenge - the winning code for the Kaggle competition on classifying the morphology of distant galaxies
- Kaggle Gender - code for the Kaggle competition on identifying gender from handwriting
- Kaggle Merck - code for the Kaggle competition on predicting the activity of drug molecules (sponsored by Merck)
- Kaggle Stackoverflow - code for the Kaggle competition on predicting whether Stack Overflow questions will be closed
- wine-quality - predicts red wine quality.
11. Ruby
11.1 Natural Language Processing
- Treat - Text REtrieval and Annotation Toolkit, the most comprehensive toolkit I have come across for Ruby.
- Ruby Linguistics - a framework for building linguistic utilities for Ruby objects in any language. It includes a generic, language-independent front end, a module for mapping language codes to language names, and a module containing English-language utilities.
- Stemmer - exposes the libstemmer_c interface to Ruby.
- Ruby Wordnet - a Ruby interface library for WordNet.
- Raspel - Ruby bindings for Aspell
- UEA Stemmer - a Ruby port of UEALite Stemmer, a conservative stemmer for search and retrieval
- twitter-text-rb - a library that auto-links and extracts usernames, lists, and hashtags in tweets.
11.2 Machine Learning
- Ruby Machine Learning - some machine learning algorithms implemented in Ruby.
- Machine Learning Ruby
- jRuby Mahout - a gem that unleashes the power of Apache Mahout in the JRuby world.
- CardMagic-Classifier - a general classifier module that supports Bayesian and other kinds of classification.
- Neural Networks and Deep Learning - sample code for the book "Neural Networks and Deep Learning".
11.3 Data analysis/Data visualization
- rsruby - a Ruby-R bridge
- data-visualization-ruby - source code and supporting content for the Ruby Manor presentation on data visualization
- ruby-plot - a gnuplot wrapper for Ruby, particularly suited to plotting ROC curves into SVG files.
- plot-rb - a plotting library for Ruby based on Vega and D3
- scruffy - a beautiful graphing toolkit for Ruby
- SciRuby
- Glean - a data management tool
- Bioruby
- Arel
12. R
12.1 General Machine Learning
- Clever Algorithms for Machine Learning
- Machine Learning for Hackers
- Machine Learning Task View on CRAN - a list of machine learning packages for R, grouped by algorithm type.
- caret - a unified interface to about 150 machine learning algorithms in R
- SuperLearner and subsemble - packages that ensemble a variety of machine learning algorithms
- Introduction to Statistical Learning
12.2 Data analysis/Data visualization
- Learning Statistics Using R
- ggplot2 - a data visualization package based on the grammar of graphics.
13. Scala
13.1 Natural Language Processing
- ScalaNLP - a suite of machine learning and numerical computing libraries
- Breeze - a numerical processing library for Scala
- Chalk - a natural language processing library.
- FACTORIE - a toolkit for deployable probabilistic modeling, implemented as a software library in Scala. It gives users a succinct language for creating relational factor graphs, estimating parameters, and performing inference.
13.2 Data analysis/Data visualization
- MLlib in Apache Spark - Spark's distributed machine learning library
- Scalding - a Scala interface to Cascading
- Summingbird - streaming MapReduce with Scalding and Storm
- Algebird - abstract algebra tools for Scala
- xerial - data management tools for Scala
- simmer - simplifies your data; a Unix filter for algebraic aggregation
- PredictionIO - a machine learning server for software developers and data engineers.
- BIDMat - a CPU- and GPU-accelerated matrix library that supports large-scale exploratory data analysis.
13.3 Machine Learning
- Conjecture - a scalable machine learning framework for Scalding
- brushfire - decision tree tools for Scalding
- ganitha - a Scalding-based machine learning library
- adam - a genomics processing engine and specialized file format built with Apache Avro, Apache Spark, and Parquet; Apache 2 licensed.
- bioscala - bioinformatics libraries for the Scala programming language
- BIDMach - a CPU- and GPU-accelerated machine learning library.