Recommended: Machine Learning Resources Compiled by Programmers Abroad

Source: Internet
Author: User
C++

Computer Vision

ccv - A C-based/Cached/Core computer vision library; a modern machine vision library.

OpenCV - Provides C++, C, Python, Java, and MATLAB interfaces, and supports Windows, Linux, Android, and Mac OS.

General Machine Learning





Clojure

General Machine Learning

Clojure Toolbox - A categorized directory of Clojure libraries and tools.

Go

Natural Language Processing

go-porterstemmer - A native Go clean-room implementation of the Porter stemming algorithm.

paicehusk - A Go implementation of the Paice/Husk stemming algorithm.

snowball - A Go version of the Snowball stemmer.

General Machine Learning

Go Learn - A machine learning library for Go.

go-pr - A machine learning package for Go.

bayesian - A naive Bayes classification library for Go.

go-galib - A genetic algorithm library for Go.
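
To make the naive Bayes classification mentioned above concrete, here is a minimal sketch in Python. It is not the API of the Go library listed here: the `NaiveBayes` class, its `learn` and `classify` methods, and the toy training data are all hypothetical names chosen for illustration. It assumes a multinomial model over whitespace-separated tokens with add-one (Laplace) smoothing.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Illustrative multinomial naive Bayes (hypothetical class, not a library API)."""

    def __init__(self):
        self.class_docs = Counter()              # number of training documents per class
        self.word_counts = defaultdict(Counter)  # per-class token counts
        self.vocab = set()                       # every token seen during training

    def learn(self, text, label):
        tokens = text.lower().split()
        self.class_docs[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def classify(self, text):
        tokens = text.lower().split()
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.class_docs.items():
            counts = self.word_counts[label]
            # Add-one smoothing: every vocabulary word gets one extra count.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)  # log likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


clf = NaiveBayes()
clf.learn("tall handsome rich", "good")
clf.learn("poor smelly ugly", "bad")
print(clf.classify("tall rich"))  # prints "good"
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, which is the usual design choice for this kind of classifier.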

Data Analysis/Data Visualization

go-graph - A graph library for Go.

SVGo - An SVG generation library for Go.

Java

Natural Language Processing

CoreNLP - Stanford University's CoreNLP provides a set of natural language processing tools that take raw English text as input and return the base forms of words (it includes several of the Stanford tools listed below).

Stanford Parser - A natural language parser.

Stanford POS Tagger - A part-of-speech tagger.

Stanford Named Entity Recognizer - A named entity recognizer implemented in Java.

Stanford Word Segmenter - A word segmenter; segmentation is a standard preprocessing step in much NLP work.

Tregex, Tsurgeon and Semgrex - Tools for pattern matching in tree data structures, using regular expressions over tree relationships and node matches (the name is short for "tree regular expressions").

Stanford Phrasal - A state-of-the-art statistical phrase-based machine translation system, written in Java.

Stanford TokensRegex - A framework for defining patterns over text.

Stanford Temporal Tagger - SUTime is a library that recognizes and normalizes time expressions.

Stanford SPIED - Learns entities from unlabeled text iteratively, starting from a seed set and using patterns.

Stanford Topic Modeling Toolbox - Topic modeling tools for social scientists and others who want to analyze datasets.

Twitter Text Java - A Java implementation of Twitter's text processing library.

MALLET - A Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.

OpenNLP - A machine learning toolkit for processing natural language text.

LingPipe - A toolkit for processing text using computational linguistics.

General Machine Learning

MLlib in Apache Spark - The distributed machine learning library in Spark.

Mahout - A distributed machine learning library.

Stanford Classifier - A classifier from Stanford University.

Weka - A collection of machine learning algorithms for data mining.

Oryx - Provides simple, large-scale, real-time infrastructure for machine learning and predictive analytics.

Data Analysis/Data Visualization

Hadoop - A big data analysis platform.

Spark - A fast, general-purpose engine for large-scale data processing.

Impala - Real-time queries for Hadoop.

JavaScript

Natural Language Processing

Twitter-text-js - A JavaScript implementation of Twitter's text processing library.

NLP.js - NLP tools written in JavaScript and CoffeeScript.

natural - General-purpose NLP tools for Node.

Knwl.js - A natural language processor written in JS.

Data Analysis/Data Visualization



General Machine Learning

ConvNetJS - A JavaScript library for training deep learning models.

Clustering.js - Clustering algorithms implemented in JavaScript for Node.js and the browser.

Decision Trees - A Node.js implementation of decision trees using the ID3 algorithm.

Node-fann - Node.js bindings for the Fast Artificial Neural Network (FANN) library.

Kmeans.js - A simple JavaScript implementation of the k-means algorithm for Node.js and the browser.

LDA.js - An LDA topic modeling tool for Node.js.

Learning.js - A JavaScript implementation of logistic regression and C4.5 decision trees.

Machine Learning - A machine learning library for Node.js.

Node-SVM - Support vector machines for Node.js.

Brain - Neural networks implemented in JavaScript.

Bayesian-Bandit - An implementation of the Bayesian bandit algorithm for Node.js and the browser.
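
For readers unfamiliar with the Bayesian bandit idea named in the last entry, the following Python sketch shows one common formulation, Thompson sampling with Beta posteriors over Bernoulli rewards. This is an assumption about the general technique, not a description of how the listed library works; the function name `thompson_sampling`, its parameters, and the reward rates are hypothetical.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Play a Bernoulli bandit with Thompson sampling (illustrative sketch)."""
    rng = random.Random(seed)
    wins = [0] * len(true_rates)    # observed successes per arm
    losses = [0] * len(true_rates)  # observed failures per arm
    for _ in range(rounds):
        # Draw one plausible success rate per arm from its Beta(wins+1, losses+1)
        # posterior (a uniform Beta(1, 1) prior is assumed).
        samples = [rng.betavariate(w + 1, l + 1) for w, l in zip(wins, losses)]
        arm = samples.index(max(samples))  # play the arm with the best draw
        # Simulate the reward; in a real system this comes from the environment.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return wins, losses


wins, losses = thompson_sampling([0.1, 0.5, 0.9], rounds=2000)
pulls = [w + l for w, l in zip(wins, losses)]
print(pulls)  # pull counts per arm; the 0.9 arm should dominate
```

Because arms are chosen by sampling from the posterior rather than by taking its mean, uncertain arms still get explored early on, and exploration fades as evidence accumulates.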

Julia

General Machine Learning

PGM - A probabilistic graphical model framework implemented in Julia.

DA - A Julia package for regularized discriminant analysis.

Regression - A package of regression analysis algorithms (e.g. linear regression and logistic regression).

Local Regression - Local regression, very smooth!

Naive Bayes - A simple Julia implementation of naive Bayes.

Mixed Models - A Julia package for (statistical) mixed-effects models.

Simple MCMC - A basic MCMC sampler implemented in Julia.

Distance - A distance evaluation module implemented in Julia.

Decision Tree - A decision tree classifier and regressor.

Neural - A neural network implemented in Julia.

MCMC - MCMC tools for Julia.

GLM - A generalized linear model package written in Julia.

Online Learning

GLMNet - A Julia wrapper for glmnet, suitable for lasso/elastic-net models.

Clustering - Basic functions for data clustering: k-means, dp-means, etc.

SVM - Support vector machines for Julia.

Kernel Density - A kernel density estimator for Julia.

Dimensionality Reduction - Dimensionality reduction algorithms.

NMF - A non-negative matrix factorization package for Julia.

ANN - A neural network implemented in Julia.
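
Several entries in this list revolve around k-means clustering. As a rough illustration of the technique (not the code of any package above), here is a sketch of Lloyd's algorithm in Python. The `kmeans` function, its `init`, `max_iter`, and `seed` parameters, and the sample points are hypothetical; starting centroids are drawn at random unless `init` is supplied.

```python
import random


def kmeans(points, k, init=None, max_iter=100, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples (sketch)."""
    rng = random.Random(seed)
    centroids = list(init) if init is not None else rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:  # no centroid moved, so the result is stable
            break
        centroids = updated
    return centroids, clusters


points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, clusters = kmeans(points, k=2, init=[(1.0, 1.0), (8.0, 8.0)])
print(clusters[0])  # the three points near (1, 1)
print(clusters[1])  # the three points near (8, 8)
```

The result depends on the starting centroids, so production libraries typically run several restarts or use a smarter seeding scheme; this sketch omits both.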

Natural Language Processing

Topic Models - Topic modeling in Julia.

Text Analysis - A text analysis package for Julia.

Data Analysis/Data Visualization

Graph Layout - Graph layout algorithms implemented in pure Julia.

Data Frames Meta - Metaprogramming tools for DataFrames.

Julia Data - A Julia library for working with tabular data.

Data Read - Reads files from Stata, SAS, and SPSS.

Hypothesis Tests - A hypothesis testing package for Julia.

Gadfly - A crafty statistical plotting system written in Julia.

Stats - A package of statistical test functions written in Julia.

RDatasets - A Julia package for loading many of the datasets available in R.

DataFrames - A Julia library for working with tabular data.

Distributions - A Julia package for probability distributions and related functions.

Data Arrays - Arrays whose element values may be missing.

Time Series - A time series data toolkit for Julia.

Sampling - A package of basic sampling algorithms for Julia.

DSP - Digital signal processing.

JuliaCon Presentations - Presentations from the Julia conference.

SignalProcessing - Signal processing tools for Julia.

Images - An image library for Julia.

Lua

General Machine Learning

Cephes - The Cephes mathematical function library, wrapped for Torch. It provides and wraps more than 180 special mathematical functions from the Cephes library developed by Stephen L. Moshier, which sits at the core of SciPy and is used in many other places.

graph - A graph package for Torch.

randomkit - The random number generator from NumPy, wrapped for Torch.

signal - A signal processing toolbox for Torch-7, with FFT, DCT, Hilbert, cepstrum, and STFT transforms.

nn - A neural network package for Torch.

nngraph - Adds graph-based computation to the nn library.

nnx - An unstable, experimental package that extends Torch's built-in nn library.

optim - An optimization library for Torch, including SGD, Adagrad, conjugate gradient, L-BFGS, and Rprop.

unsup - An unsupervised learning package for Torch. It provides modules that are compatible with nn (LinearPsd, ConvPsd, AutoEncoder, ...) as well as self-contained algorithms (k-means, PCA).

manifold - A package for manipulating manifolds.

svm - A support vector machine library for Torch.

lbfgs - An FFI wrapper for liblbfgs.

vowpalwabbit - An older Torch interface to Vowpal Wabbit.

OpenGM - OpenGM is a C++ library for graphical modeling and inference. The Lua bindings provide a simple way to describe graphs from Lua and then optimize them with OpenGM.

spaghetti - A sparse linear module for Torch7, written by MichaelMathieu.

LuaSHKit - A Lua wrapper around the locality-sensitive hashing library SHKit.

kernel smoothing - KNN, kernel-weighted average, and local linear regression smoothers.

cutorch - The CUDA backend implementation for Torch.

cunn - The CUDA neural network implementation for Torch.

imgraph - An image/graph library for Torch. It provides routines to create graphs from images, segment them, build trees out of them, and convert them back to images.

videograph - A video/graph library for Torch. It provides routines to create graphs from videos, segment them, build trees out of them, and convert them back to videos.

saliency - Code and tools for integral images; finds interest points based on fast integral histograms.

stitch - Uses Hugin to stitch images and apply the same stitching to a video sequence.

sfm - A bundle adjustment/structure-from-motion package.

fex - A feature extraction package for Torch. It provides SIFT and dSIFT modules.

OverFeat - A state-of-the-art generic dense feature extractor.

Numeric Lua

Lunatic Python


Lua - Numerical Algorithms

Demos and Scripts

Core torch7 demos repository - The Torch7 demo library.

Linear regression and logistic regression

Face detection (training and detection are separate demos)

MST-based segmenter



Optical Flow Demo



Tracking with deep nets

Kinect Demo

Filter visualization


Training a convnet for the Galaxy-zoo Kaggle Challenge (CUDA demo)

Music Tagging - Music tagging scripts for Torch7.

torch-datasets - Scripts for loading several popular datasets, including:

BSR 500



Street View House Numbers



atari2600 - Scripts that generate a dataset of static frames from the Arcade Learning Environment emulator.

Matlab

Computer Vision

Contourlets - MATLAB source code that implements the contourlet transform and its utility functions.

Shearlets - MATLAB source code for the shearlet transform.

Curvelets - MATLAB source code for the curvelet transform (the curvelet transform generalizes the wavelet transform to higher dimensions and is used to represent images at different scales).

Bandlets - MATLAB source code for the bandlet transform.

Natural Language Processing

NLP - An NLP library for MATLAB.

General Machine Learning

Training a deep autoencoder or a classifier on MNIST digits - Code for training a deep autoencoder or a classifier on the MNIST digit dataset [deep learning].

t-Distributed Stochastic Neighbor Embedding - An award-winning dimensionality reduction technique, particularly well suited to visualizing high-dimensional datasets.

Spider - A complete object-oriented environment for machine learning in MATLAB.

LibSVM - A library for support vector machines.

LibLinear - A library for large-scale linear classification.

Machine Learning Module - Professor M. A. Girolami's machine learning course, including PDFs, lecture notes, and code.

Caffe - A deep learning framework built with clean code, readability, and speed in mind.

Pattern Recognition Toolbox - A fully object-oriented pattern recognition toolbox for MATLAB.

Data Analysis/Data Visualization

matlab_gbl - A MATLAB package for working with graphs.

gamic - Pure-MATLAB implementations of graph algorithms, complementing the MEX functions of MatlabBGL.

.NET

Computer Vision

OpenCVDotNet - A wrapper that lets .NET programs use OpenCV code.

Emgu CV - A cross-platform wrapper that can be compiled on Windows, Linux, Mac OS X, iOS, and Android.

Natural Language Processing

Stanford.NLP for .NET - A full port of the Stanford University NLP packages to .NET, also available precompiled as a NuGet package.

General Machine Learning

Accord.MachineLearning - Support vector machines, decision trees, naive Bayesian models, k-means, Gaussian mixture models, and general algorithms for machine learning applications, such as RANSAC (random sample consensus), cross-validation, and grid search. This package is part of the Accord.NET Framework.

Vulpes - Deep belief and deep learning implementations written in F#, which run on CUDA GPUs using Alea.cuBase.

Encog - An advanced neural network and machine learning framework. It contains classes for creating a wide variety of networks, as well as support classes for normalizing and processing data for those networks. It trains using multithreaded resilient propagation and can also use a GPU to speed up processing. A graphical workbench is provided to help model and train neural networks.

Neural Network Designer - A database management system and designer for neural networks. The designer is built with WPF and provides a user interface for designing your neural network, querying the network, and creating and configuring chat bots that can ask questions and learn from your feedback. The chat bots can even collect information from the web, both to return in their output and to learn from.

Data Analysis/Data Visualization

numl - A machine learning library whose goal is to simplify the use of standard modeling techniques for prediction and clustering.

Math.NET Numerics - The numerical foundation of the Math.NET project, providing methods and algorithms for numerical computation in science, engineering, and everyday use. Supports .NET 4.0, .NET 3.5, and Mono on Windows, Linux, and Mac; Silverlight 5, WindowsPhone/SL 8, WindowsPhone 8.1, and Windows 8 with PCL Portable Profiles 47 and 344; and Android/iOS with Xamarin.

Sho - An interactive environment for data analysis and scientific computing that lets you seamlessly connect scripts (in IronPython) with compiled code (.NET) for fast, flexible prototyping. The environment includes powerful and efficient libraries for linear algebra and data visualization that can be used from any .NET language, as well as a feature-rich interactive shell for rapid development.

Python

Computer Vision

SimpleCV - An open source computer vision framework that gives access to high-performance vision libraries such as OpenCV. Written in Python; runs on Mac, Windows, and Ubuntu.

Natural Language Processing

NLTK - A leading platform for building Python programs that work with human language data.

Pattern - A web mining module for Python, with tools for natural language processing, machine learning, and more.

TextBlob - Provides a consistent API for common natural language processing tasks; built on NLTK and Pattern and compatible with both.

jieba - A Chinese word segmentation tool.

SnowNLP - A library for processing Chinese text.

loso - Another Chinese word segmenter.

genius - A Chinese word segmenter based on conditional random fields.

nut - A natural language understanding toolkit.

General Machine Learning

Bayesian Methods for Hackers - An ebook on probabilistic programming in Python.

MLlib in Apache Spark - The distributed machine learning library in Spark.

scikit-learn - A machine learning module built on SciPy.

graphlab-create - A library containing a variety of machine learning models (regression, clustering, recommender systems, graph analytics, etc.), built on a DataFrame that can be stored on disk.

BigML - A library that connects to external servers.

Pattern - A web mining module for Python.

NuPIC - Numenta's platform for intelligent computing.

Pylearn2 - A machine learning library based on Theano.

hebel - A GPU-accelerated deep learning library in Python.

gensim - A topic modeling tool.

PyBrain - Another machine learning library.

Crab - A scalable, fast recommendation engine.

python-recsys - A Python implementation of a recommender system.

Thinking Bayes - A book on Bayesian analysis.

Restricted Boltzmann Machines - Restricted Boltzmann machines implemented in Python [deep learning].

Bolt - An online learning toolbox.

CoverTree - A Python implementation of cover trees; a convenient replacement for scipy.spatial.kdtree.

nilearn - A machine learning library for neuroimaging in Python.

Shogun - A machine learning toolbox.

Pyevolve - A genetic algorithm framework.

Caffe - A deep learning framework built with clean code, readability, and speed in mind.

breze - A Theano-based library for deep and recurrent neural networks.

Data Analysis/Data Visualization

SciPy - A Python-based ecosystem of open-source software for mathematics, science, and engineering.

NumPy - The fundamental Python package for scientific computing.

Numba - A Python JIT compiler targeting LLVM, written by developers of Cython and NumPy for scientific computing.

NetworkX - Efficient software for working with complex networks.

pandas - A library providing high-performance, easy-to-use data structures and data analysis tools.

Open Mining - A business intelligence tool in Python (a web interface for pandas).

PyMC - An MCMC sampling toolkit.

zipline - A Python library for algorithmic trading.

PyDy - Short for Python Dynamics; assists with the dynamics modeling workflow, building on NumPy, SciPy, IPython, and matplotlib.

SymPy - A Python library for symbolic mathematics.

statsmodels - A Python library for statistical modeling and econometrics.

astropy - A community-developed Python library for astronomy.

matplotlib - A 2D plotting library for Python.

bokeh - An interactive web plotting library for Python.

plotly - A collaborative web plotting library for Python and matplotlib.

vincent - Converts Python data structures into the Vega visualization grammar.

d3py - A plotting library for Python, based on D3.js.

ggplot - Provides the same API as ggplot2 for R.

Kartograph.py - Renders attractive SVG maps in Python.

pygal - An SVG chart builder for Python.

Miscellaneous Scripts/IPython Notebooks/Codebases


Thinking Stats 2






Sarah Palin LDA - Topic modeling of Sarah Palin's emails.

Diffusion Segmentation - A collection of image segmentation algorithms based on diffusion methods.

SciPy Tutorials - SciPy tutorials. Out of date; see scipy-lecture-notes.

Crab - A recommendation engine library for Python.

BayesPy - Bayesian inference tools in Python.

scikit-learn tutorials - A series of notebooks for learning scikit-learn.

sentiment-analyzer - A sentiment analyzer.

group-lasso - Experiments with the coordinate descent algorithm applied to the (sparse) group lasso model.

mne-python-notebooks - IPython notebooks for EEG/MEG data processing with mne-python.

pandas cookbook - Recipes for using Python's pandas library.

climin - An optimization library for machine learning, with Python implementations of gradient descent, L-BFGS, RMSProp, and Adadelta.

Kaggle Competition Source Code

Wiki Challenge - An implementation of Dell Zhang's solution to Kaggle's earlier wiki prediction challenge.

kaggle insults - Code submitted to Kaggle's competition on detecting insults in social media comments.

kaggle_acquire-valued-shoppers-challenge - Code for Kaggle's challenge on predicting repeat shoppers.

kaggle-cifar - Code for the CIFAR-10 competition on Kaggle, using cuda-convnet.

kaggle-blackbox - Deep learning code for the Blackbox competition on Kaggle.

kaggle-accelerometer - Code for Kaggle's competition on identifying users from accelerometer data.

kaggle-advertised-salaries - Code for Kaggle's competition on predicting salaries from job advertisements.

kaggle amazon - Code for Kaggle's competition on predicting an employee's access needs from their role.

kaggle-bestbuy_big - Code for Kaggle's competition on predicting which product a user will click, given their query on a retailer's site (big data version).

kaggle-bestbuy_small - Code for Kaggle's competition on predicting which product a user will click, given their query on a retailer's site (small data version).

Kaggle Dogs vs. Cats - Code for Kaggle's competition on telling cats from dogs in pictures.

Kaggle Galaxy Challenge - The winning code for Kaggle's competition on classifying distant galaxies.

Kaggle Gender - A Kaggle competition on determining gender from handwriting.

Kaggle Merck - Code for Kaggle's competition on predicting drug molecular activity (sponsored by Merck).

Kaggle Stackoverflow - Code for Kaggle's competition on predicting whether a Stack Overflow question will be closed.

wine-quality - Predicts the quality of red wine.

Ruby

Natural Language Processing

Treat - A text retrieval and annotation toolkit; the most comprehensive Ruby toolkit I have seen.

Ruby Linguistics - A framework for building linguistic utilities for Ruby objects in any language. It includes a generic, language-independent front end, a module that maps language codes to language names, and a module of English-language utilities.

Stemmer - Exposes the libstemmer_c interface to Ruby.

Ruby Wordnet - A Ruby interface to WordNet.

Raspel - Ruby bindings for Aspell.

UEA Stemmer - A Ruby port of UEALite Stemmer, a conservative stemmer for search and indexing.

Twitter-text-rb - A library that auto-links and extracts usernames, lists, and hashtags from tweets.

General Machine Learning

Ruby Machine Learning - Some machine learning algorithms implemented in Ruby.

Machine Learning Ruby

JRuby Mahout - A gem that unleashes the power of Apache Mahout in the JRuby world.

CardMagic-Classifier - A general classifier module that supports Bayesian and other types of classification.

Neural Networks and Deep Learning - Example code for the book "Neural Networks and Deep Learning".

Data Analysis/Data Visualization

rsruby - A Ruby-R bridge.

data-visualization-ruby - Source code and supporting content for the Ruby Manor presentation on data visualization.

ruby-plot - A gnuplot wrapper for Ruby, especially suited to plotting ROC curves into SVG files.

plot-rb - A plotting library for Ruby built on Vega and D3.

scruffy - An excellent graphing toolkit for Ruby.

Glean - A data management tool.

Misc

Big Data For Chimps - A serious yet fun guide to big data processing.

R

General Machine Learning

Clever Algorithms for Machine Learning

Machine Learning for Hackers

Machine Learning Task View on CRAN - A list of machine learning packages for R, grouped by algorithm type.

caret - A unified interface to 150 machine learning algorithms in R.

SuperLearner and subsemble - Packages that combine multiple machine learning algorithms into ensembles.

Introduction to Statistical Learning

Data Analysis/Data Visualization

Learning Statistics Using R

ggplot2 - A data visualization package based on the grammar of graphics.

Scala

Natural Language Processing

ScalaNLP - A suite of machine learning and numerical computing libraries.

Breeze - A numerical processing library for Scala.

Chalk - A natural language processing library.

FACTORIE - A toolkit for deployable probabilistic modeling, implemented as a software library in Scala. It gives users a concise language for creating relational factor graphs, estimating parameters, and performing inference.

Data Analysis/Data Visualization

MLlib in Apache Spark - The distributed machine learning library in Spark.

Scalding - A Scala API for Cascading.

Summingbird - Streaming MapReduce with Scalding and Storm.

Algebird - Abstract algebra tools for Scala.

xerial - Data management tools for Scala.

simmer - A Unix filter that reduces your data through algebraic aggregation.

PredictionIO - A machine learning server for software developers and data engineers.

BIDMat - A CPU- and GPU-accelerated matrix library for large-scale exploratory data analysis.

General Machine Learning

Conjecture - A scalable machine learning framework in Scalding.

brushfire - A decision tree tool for Scalding.

ganitha - A machine learning library based on Scalding.

adam - A genomics processing engine and specialized file format built with Apache Avro, Apache Spark, and Parquet. Apache 2 licensed.

bioscala - Bioinformatics libraries for Scala.

BIDMach - A CPU- and GPU-accelerated machine learning library.
