Machine Learning Resources overview [go]

Source: Internet
Author: User
Tags: mathematical functions, processing text, svm, theano, nltk

This article compiles frameworks, libraries, and software in the machine learning field, sorted by programming language.

C++
Computer Vision
  • CCV - C-based/Cached/Core Computer Vision Library, a modern computer vision library.
  • OpenCV - provides C++, C, Python, Java, and MATLAB interfaces, and supports Windows, Linux, Android, and Mac OS.
General Machine Learning
  • Mlpack
  • Dlib
  • Ecogg
  • Shark
Clojure
General Machine Learning
  • Clojure Toolbox - a categorized directory of Clojure libraries and tools.
Go
Natural Language Processing
  • Go-porterstemmer - a pure Go implementation of the Porter stemming algorithm.
  • Paicehusk - a Go implementation of the Paice/Husk stemming algorithm.
  • Snowball - a Snowball stemmer for Go.
General Machine Learning
  • GoLearn - a machine learning library for Go.
  • Go-pr - a machine learning package for Go.
  • Bayesian - a Naive Bayes classification library for Go.
  • Go-galib - a genetic algorithm library for Go.

Data analysis/Data Visualization
  • Go-graph - a graph library for Go.
  • SVGo - an SVG library for Go.
Java
Natural Language Processing
  • CoreNLP - Stanford CoreNLP provides a set of natural language analysis tools that take raw English text as input and give the base forms of words (the Stanford tools listed below are part of it).
  • Stanford Parser - a natural language parser.
  • Stanford POS Tagger - a part-of-speech tagger.
  • Stanford Named Entity Recognizer - a Java implementation of a named entity recognizer.
  • Stanford Word Segmenter - word segmentation, a standard preprocessing step in many NLP tasks.
  • Tregex, Tsurgeon and Semgrex - Tregex matches patterns in tree data structures, based on tree relationships and regular expression matches on nodes (the name is short for "tree regular expressions").
  • Stanford Phrasal - a state-of-the-art statistical phrase-based machine translation system written in Java.
  • Stanford TokensRegex - a framework for defining text patterns.
  • Stanford Temporal Tagger - SUTime is a library that recognizes and normalizes time expressions.
  • Stanford SPIED - learns entities from unlabeled text, starting from seed sets and applying patterns iteratively.
  • Stanford Topic Modeling Toolbox - a topic modeling tool for social scientists and others who want to analyze datasets.
  • Twitter Text Java - a Java implementation of Twitter's text processing library.
  • MALLET - a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.
  • OpenNLP - a machine learning toolkit for processing natural language text.
  • LingPipe - a toolkit for processing text using computational linguistics.

General Machine Learning
  • MLlib in Apache Spark - the distributed machine learning library in Spark.
  • Mahout - a distributed machine learning library.
  • Stanford Classifier - a classifier from Stanford University.
  • Weka - a collection of machine learning algorithms for data mining.
  • Oryx - a simple infrastructure for large-scale, real-time machine learning and predictive analytics.
Data analysis/Data Visualization
  • Hadoop - a big data analysis platform.
  • Spark - a fast, general engine for large-scale data processing.
  • Impala - real-time queries for Hadoop.
JavaScript
Natural Language Processing
  • Twitter-text-js - a JavaScript implementation of Twitter's text processing library.
  • NLP.js - NLP tools written in JavaScript and CoffeeScript.
  • Natural - general natural language tools for Node.js.
  • Knwl.js - a natural language processor written in JavaScript.
Data analysis/Data Visualization
  • D3.js
  • High Charts
  • NVD3.js
  • dc.js
  • Chartjs
  • Dimple
  • amCharts

General Machine Learning
  • Convnet.js - a JavaScript library for training deep learning models.
  • Clustering.js - clustering algorithms implemented in JavaScript for Node.js and browsers.
  • Decision Trees - a Node.js implementation of a decision tree using the ID3 algorithm.
  • Node-fann - the Fast Artificial Neural Network library for Node.js.
  • Kmeans.js - a simple JavaScript implementation of the k-means algorithm for Node.js and browsers.
  • LDA.js - an LDA topic modeling tool for Node.js.
  • Learning.js - a JavaScript implementation of logistic regression and the C4.5 decision tree.
  • Machine Learning - a machine learning library for Node.js.
  • Node-SVM - a support vector machine for Node.js.
  • Brain - neural networks in JavaScript.
  • Bayesian-Bandit - an implementation of the Bayesian bandit algorithm for Node.js and browsers.
Julia
General Machine Learning
  • PGM - a framework for probabilistic graphical models in Julia.
  • DA - a Julia package for regularized discriminant analysis.
  • Regression - algorithms for regression analysis (such as linear regression and logistic regression).
  • Local Regression - local regression, very smooth!
  • Naive Bayes - a simple Julia implementation of Naive Bayes.
  • Mixed Models - a Julia package for (statistical) mixed-effects models.
  • Simple MCMC - a basic MCMC sampler implemented in Julia.
  • Distance - a Julia module for distance evaluation.
  • Decision Tree - a decision tree classifier and regressor.
  • Neural - a neural network implemented in Julia.
  • MCMC - MCMC tools for Julia.
  • GLM - a generalized linear model package written in Julia.
  • Online Learning
  • GLMNet - a Julia wrapper for glmnet, which fits lasso/elastic net models.
  • Clustering - basic functions for clustering data: k-means, dp-means, etc.
  • SVM - support vector machines for Julia.
  • Kernel Density - kernel density estimators for Julia.
  • Dimensionality Reduction - dimensionality reduction algorithms.
  • NMF - a Julia package for non-negative matrix factorization.
  • ANN - artificial neural networks implemented in Julia.
Natural Language Processing
  • Topic Models - topic modeling in Julia.
  • Text Analysis - a Julia package for text analysis.
Data analysis/Data Visualization
  • Graph Layout - graph layout algorithms implemented in Julia.
  • Data Frames Meta - metaprogramming tools for DataFrames.
  • Julia Data - a Julia library for working with tabular data.
  • Data Read - reads files from Stata, SAS, and SPSS.
  • Hypothesis Tests - a hypothesis testing package for Julia.
  • Gadfly - a clever statistical plotting system written in Julia.
  • Stats - a package of statistical tests for Julia.
  • RDatasets - a Julia package for loading many of the datasets available in R.
  • DataFrames - a Julia library for working with tabular data.
  • Distributions - a Julia package for probability distributions and associated functions.
  • Data Arrays - data structures whose elements may be missing values.
  • Time Series - a time series toolkit for Julia.
  • Sampling - a package of basic sampling algorithms for Julia.
Miscellaneous/presentation
  • DSP - digital signal processing.
  • JuliaCon Presentations - presentations from JuliaCon.
  • SignalProcessing - signal processing tools for Julia.
  • Images - an image library for Julia.
Lua
General Machine Learning
  • Torch7
    • Cephes - the Cephes mathematical function library, wrapped for Torch. It provides and wraps more than 180 special mathematical functions from the Cephes library developed by Stephen L. Moshier, which is at the heart of SciPy and is used in many other places.
    • Graph - a graph package for Torch.
    • Randomkit - NumPy's randomkit random number generator, wrapped for Torch.
    • Signal - a signal processing toolkit for Torch-7 that supports FFT, DCT, Hilbert, cepstrum, and STFT transforms.
    • Nn - a neural network package for Torch.
    • Nngraph - provides graph computation for the nn library.
    • Nnx - an unstable, experimental package that extends Torch's built-in nn library.
    • Optim - an optimization library for Torch, including SGD, Adagrad, conjugate gradient, L-BFGS, and RProp.
    • Unsup - an unsupervised learning package for Torch. It provides modules compatible with nn (LinearPsd, ConvPsd, AutoEncoder, ...) and standalone algorithms (k-means, PCA).
    • Manifold - a package for manipulating manifolds.
    • SVM - a support vector machine library for Torch.
    • Lbfgs - an FFI wrapper for liblbfgs.
    • Vowpalwabbit - an old Torch interface to vowpalwabbit.
    • OpenGM - OpenGM is a graphical modeling and inference library written in C++. The Lua bindings offer a simple way to describe graphs in Lua and then optimize them with OpenGM.
    • Sphagetti - a sparse linear module for torch7, written by MichaelMathieu.
    • LuaSHKit - a Lua wrapper around the locality-sensitive hashing library SHKit.
    • Kernel smoothing - KNN, kernel-weighted average, and local linear regression smoothers.
    • Cutorch - the CUDA backend implementation for Torch.
    • Cunn - the CUDA neural network implementation for Torch.
    • Imgraph - an image/graph library for Torch, with routines to create graphs on images, segment them, build trees from them, and convert them back to images.
    • Videograph - a video/graph library for Torch, with routines to create graphs on videos, segment them, build trees from them, and convert them back to videos.
    • Saliency - code and tools around integral images, used to find interest points based on fast integral histograms.
    • Stitch - stitches images with Hugin and applies the same stitching to a video sequence.
    • SFM - a bundle adjustment/structure-from-motion package.
    • FEX - a feature extraction package for Torch that provides SIFT and dSIFT modules.
    • OverFeat - a state-of-the-art generic dense feature extractor.
  • Numeric Lua
  • Lunatic Python
  • Scilua
  • Lua-Numerical Algorithms
  • Lunum
Demos and scripts
  • Core torch7 demos repository.
    • Linear regression and logistic regression
    • Face detection (training and detection are separate demos)
    • MST-based segmenter
    • Train-a-digit-classifier
    • Train-autoencoder
    • Optical flow demo
    • Train-on-housenumbers
    • Train-on-cifar
    • Tracking with deep nets
    • Kinect demo
    • Filter visualization
    • Saliency-networks
  • Training a convnet for the Galaxy-Zoo Kaggle challenge (CUDA demo)
  • Music Tagging - music tagging scripts for torch7.
  • Torch-datasets - scripts for loading several popular datasets, including:
    • BSR 500
    • CIFAR-10
    • COIL
    • Street View House Numbers
    • MNIST
    • NORB
  • Atari2600 - scripts that generate a dataset from static frames in the Arcade Learning Environment emulator.
MATLAB
Computer Vision
  • Contourlets - MATLAB source code that implements the contourlet transform and its utility functions.
  • Shearlets - MATLAB source code for the shearlet transform.
  • Curvelets - MATLAB source code for the curvelet transform (the curvelet transform is a higher-dimensional generalization of the wavelet transform, used to represent images at different scales).
  • Bandlets - MATLAB source code for the bandlet transform.
Natural Language Processing
  • NLP - an NLP library for MATLAB.
General Machine Learning
  • Training a deep autoencoder or a classifier on MNIST digits - code for training a deep autoencoder or a classifier on the MNIST digit dataset [deep learning].
  • t-Distributed Stochastic Neighbor Embedding - a prize-winning dimensionality reduction technique that is particularly well suited to visualizing high-dimensional datasets.
  • Spider - a complete object-oriented environment for machine learning in MATLAB.
  • LibSVM - a library for support vector machines.
  • LibLinear - a library for large linear classification.
  • Machine Learning Module - Professor M. A. Girolami's machine learning course, including PDFs, handouts, and code.
  • Caffe - a deep learning framework developed with clean code, readability, and speed in mind.
  • Pattern Recognition Toolbox - a fully object-oriented pattern recognition toolkit for MATLAB.
Data analysis/Data Visualization
  • MATLAB package for working with graphs.
  • Gamic - efficient pure-MATLAB implementations of graph algorithms, which complement the MEX functions of MatlabBGL.
.NET
Computer Vision
  • OpenCVDotNet - a wrapper that lets .NET programs use OpenCV code.
  • Emgu CV - a cross-platform wrapper that can be compiled for Windows, Linux, Mac OS X, iOS, and Android.
Natural Language Processing
  • Stanford.NLP for .NET - a full port of the Stanford University NLP packages to .NET, also available precompiled as a NuGet package.
General Machine Learning
  • Accord.MachineLearning - supports support vector machines, decision trees, Naive Bayes models, k-means, Gaussian mixture models, and general algorithms for machine learning applications such as RANSAC (random sample consensus), cross-validation, and grid search. This package is part of the Accord.NET Framework.
  • Vulpes - a deep belief and deep learning package written in F#, which runs on CUDA GPUs using Alea.cuBase.
  • Encog - an advanced neural network and machine learning framework. It contains classes for creating a wide variety of networks, as well as support classes for normalizing and processing the data these networks need. Training uses multithreaded resilient propagation, and a GPU can be used to speed up processing. A graphical workbench is also provided to help model and train neural networks.
  • Neural Network Designer - a database management system and neural network designer. The designer is built with WPF and serves as a user interface in which you can design your neural network, query the network, and create and configure chatbots. The chatbots can ask questions and learn from your feedback, and can even collect information from the internet to use in their output or for learning.
Data analysis/Data Visualization
  • numl - a machine learning library designed to simplify the use of standard modeling techniques for prediction and clustering.
  • Math.NET Numerics - the numerical foundation of the Math.NET project, which provides methods and algorithms for numerical computation in science, engineering, and everyday use. Supports .NET 4.0, .NET 3.5, and Mono on Windows, Linux, and Mac; Silverlight 5, WindowsPhone/SL 8, WindowsPhone 8.1, and Windows 8 with PCL Portable Profiles 47 and 344; and Android/iOS with Xamarin.
  • Sho - an interactive environment for data analysis and scientific computing. It lets you seamlessly connect scripts (IronPython) with compiled code (.NET) for fast, flexible prototyping. The environment includes powerful and efficient libraries for linear algebra and data visualization that can be used from any .NET language, and a feature-rich interactive shell for rapid development.
Python
Computer Vision
  • SimpleCV - an open-source computer vision framework that gives access to high-performance computer vision libraries such as OpenCV. It is written in Python and runs on Mac, Windows, and Ubuntu.
Natural Language Processing
  • NLTK - a leading platform for building Python programs that work with human language data.
  • Pattern - a web mining module for Python, with tools for natural language processing, machine learning, and more.
  • TextBlob - provides a consistent API for common natural language processing tasks. It builds on NLTK and Pattern and works well with both.
  • Jieba - a Chinese word segmentation tool.
  • SnowNLP - a library for processing Chinese text.
  • Loso - another Chinese word segmentation library.
  • Genius - a Chinese word segmenter based on conditional random fields.
  • Nut - a natural language understanding toolkit.
General Machine Learning
  • Bayesian Methods for Hackers - a Python ebook on probabilistic programming.
  • MLlib in Apache Spark - the distributed machine learning library in Spark.
  • Scikit-learn - a machine learning module built on SciPy.
  • Graphlab-create - a library containing a range of machine learning models (regression, clustering, recommender systems, graph analytics, etc.) built on a disk-backed DataFrame.
  • BigML - a library that connects to external servers.
  • Pattern - a web mining module for Python.
  • NuPIC - the Numenta Platform for Intelligent Computing.
  • Pylearn2 - a machine learning library based on Theano.
  • Hebel - a GPU-accelerated deep learning library written in Python.
  • Gensim - a topic modeling tool.
  • PyBrain - another machine learning library.
  • Crab - a scalable, fast recommendation engine.
  • Python-recsys - a recommender system implemented in Python.
  • Thinking Bayes - a book on Bayesian analysis.
  • Restricted Boltzmann Machines - a Python implementation of restricted Boltzmann machines [deep learning].
  • Bolt - an online learning toolbox.
  • CoverTree - a Python implementation of cover trees, a near drop-in replacement for scipy.spatial.kdtree.
  • Nilearn - a machine learning library for neuroimaging in Python.
  • Shogun - a machine learning toolbox.
  • Pyevolve - a genetic algorithm framework.
  • Caffe - a deep learning framework developed with clean code, readability, and speed in mind.
  • Breze - a Theano-based library for deep and recurrent neural networks.
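Naive Bayes classifiers appear in several of the lists in this article. As a rough illustration of the idea, and not the API of any library named here, the following sketch implements a tiny multinomial Naive Bayes text classifier using only the Python standard library. The class name NaiveBayesText and the toy training sentences are assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial Naive Bayes over whitespace-separated tokens (illustrative only)."""

    def __init__(self):
        self.class_counts = Counter()            # number of documents seen per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()                       # all tokens seen during training

    def fit(self, docs, labels):
        for text, label in zip(docs, labels):
            self.class_counts[label] += 1
            for token in text.lower().split():
                self.word_counts[label][token] += 1
                self.vocab.add(token)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.class_counts.items():
            # Log prior plus log likelihoods with add-one (Laplace) smoothing.
            score = math.log(n_docs / total_docs)
            total_tokens = sum(self.word_counts[label].values())
            for token in text.lower().split():
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_tokens + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training data for illustration only.
model = NaiveBayesText().fit(
    ["free prize click now", "win money free", "meeting at noon", "lunch meeting tomorrow"],
    ["spam", "spam", "ham", "ham"],
)
print(model.predict("free money"))
print(model.predict("meeting tomorrow"))
```

Real libraries add proper tokenization, sparse data structures, and numerical safeguards; the sketch only shows the counting and smoothing at the core of the method.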
Data analysis/Data Visualization
  • SciPy - a Python-based ecosystem of open-source software for mathematics, science, and engineering.
  • NumPy - the fundamental package for scientific computing with Python.
  • Numba - a Python JIT compiler targeting LLVM, aimed at scientific computing and written by the developers of Cython and NumPy.
  • NetworkX - high-productivity software for complex networks.
  • Pandas - a library that provides high-performance, easy-to-use data structures and data analysis tools.
  • Open Mining - a web interface to Pandas in Python.
  • PyMC - an MCMC sampling toolkit.
  • Zipline - an algorithmic trading library for Python.
  • PyDy - short for Python Dynamics, it supports dynamics modeling workflows based on NumPy, SciPy, IPython, and matplotlib.
  • SymPy - a Python library for symbolic mathematics.
  • Statsmodels - a Python library for statistical modeling and econometrics.
  • Astropy - a Python astronomy library, written collaboratively by the community.
  • Matplotlib - a Python 2D plotting library.
  • Bokeh - an interactive web plotting library for Python.
  • Plotly - a collaborative web plotting library for Python and matplotlib.
  • Vincent - converts Python data structures to the Vega visualization grammar.
  • d3py - a plotting library for Python, based on D3.js.
  • Ggplot - the same API as ggplot2 in R.
  • Kartograph.py - a library for rendering good-looking SVG maps in Python.
  • Pygal - an SVG chart builder for Python.
  • Pycascading
Miscellaneous scripts/IPython notebooks/code repositories
  • Pattern_classification
  • Thinking stats 2
  • Hyperopt
  • Numpic
  • 2012-paper-diginorm
  • Ipython-notebooks
  • Decision-weights
  • Sarah Palin LDA - topic modeling of Sarah Palin's emails.
  • Diffusion Segmentation - a collection of image segmentation algorithms based on diffusion methods.
  • SciPy Tutorials - SciPy tutorials. These are out of date; please refer to scipy-lecture-notes.
  • Crab - a recommendation engine library for Python.
  • BayesPy - Bayesian inference tools in Python.
  • Scikit-learn tutorials - a series of notebooks for learning scikit-learn.
  • Sentiment-analyzer - a Twitter sentiment analyzer.
  • Group-lasso - experiments with the coordinate descent algorithm applied to the (sparse) group lasso model.
  • MNE-python-notebooks - IPython notebooks for EEG/MEG data processing with MNE-python.
  • Pandas cookbook - recipes for using the Python pandas library.
  • Climin - an optimization library for machine learning. It implements algorithms such as gradient descent, L-BFGS, RMSProp, and Adadelta in Python.
Kaggle competition source code
  • Wiki challenge - an implementation of Dell Zhang's solution to the Wikipedia prediction challenge on Kaggle.
  • Kaggle insults - code submitted to the Kaggle competition on detecting insults in social media comments.
  • Kaggle_acquire-valued-shoppers-challenge - code for the Kaggle challenge on predicting repeat shoppers.
  • Kaggle-cifar - code for the CIFAR-10 competition on Kaggle, using cuda-convnet.
  • Kaggle-blackbox - code for the blackbox competition on Kaggle, about deep learning.
  • Kaggle-accelerometer - code for the Kaggle competition on identifying users from accelerometer data.
  • Kaggle-advertised-salaries - code for the Kaggle competition on predicting salaries from job advertisements.
  • Kaggle Amazon - code for the Kaggle competition on predicting access requirements for a given employee role.
  • Kaggle-bestbuy_big - code for the Kaggle competition on predicting which item a user will click based on Best Buy user queries (big data edition).
  • Kaggle-bestbuy_small - code for the Kaggle competition on predicting which item a user will click based on Best Buy user queries (small data edition).
  • Kaggle Dogs vs. Cats - code for the Kaggle competition on distinguishing cats from dogs in images.
  • Kaggle Galaxy Challenge - the winning code for the Kaggle competition on classifying the morphology of distant galaxies.
  • Kaggle Gender - a Kaggle competition: distinguishing gender from handwriting.
  • Kaggle Merck - code for the Kaggle competition on predicting the activity of drug molecules (sponsored by Merck).
  • Kaggle Stackoverflow - code for the Kaggle competition on predicting whether a Stack Overflow question will be closed.
  • Wine-quality - predicts the quality of red wine.
Ruby
Natural Language Processing
  • Treat - a text retrieval and annotation toolkit, the most comprehensive toolkit I have seen so far for Ruby.
  • Ruby Linguistics - a framework for building linguistic utilities for Ruby objects in any language. It includes a generic, language-independent front end, a module that maps language codes to language names, and a module containing various English-language utilities.
  • Stemmer - exposes libstemmer_c to Ruby.
  • Ruby WordNet - a Ruby interface to WordNet.
  • Raspel - a Ruby binding for Aspell.
  • UEA Stemmer - a Ruby port of UEALite Stemmer, a conservative stemmer for search and indexing.
  • Twitter-text-rb - a library that automatically links and extracts usernames, lists, and hashtags in tweets.
General Machine Learning
  • Ruby Machine Learning - some machine learning algorithms implemented in Ruby.
  • Machine Learning Ruby
  • JRuby Mahout - brings Apache Mahout to the JRuby world.
  • CardMagic-Classifier - a general classifier module that supports Bayesian and other types of classification.
  • Neural Networks and Deep Learning - sample code from "Neural Networks and Deep Learning".
Data analysis/Data Visualization
  • Rsruby - a Ruby-R bridge.
  • Data-visualization-ruby - source code and supporting material for the Ruby Manor presentation on data visualization.
  • Ruby-plot - a gnuplot wrapper for Ruby, especially suited to plotting ROC curves as SVG files.
  • Plot-rb - a plotting library for Ruby built on Vega and D3.
  • Scruffy - an excellent graphing toolkit for Ruby.
  • SciRuby
  • Glean - a data management tool.
  • Bioruby
  • Arel
Miscellaneous
  • Big Data for Chimps - a serious and fun guide to big data processing.
R
General Machine Learning
  • Clever Algorithms for Machine Learning
  • Machine Learning for Hackers
  • Machine Learning Task View on CRAN - a list of machine learning packages for R, grouped by algorithm type.
  • Caret - a unified interface to 150 machine learning algorithms in R.
  • SuperLearner and subsemble - packages that combine multiple machine learning algorithms.
  • Introduction to Statistical Learning
Data analysis/Data Visualization
  • Learning Statistics Using R
  • Ggplot2 - a data visualization package based on the grammar of graphics.
Scala
Natural Language Processing
  • ScalaNLP - a suite of machine learning and numerical computing libraries.
  • Breeze - a numerical processing library for Scala.
  • Chalk - a natural language processing library.
  • FACTORIE - a toolkit for deployable probabilistic modeling, implemented as a Scala software library. It gives you a concise language for creating relational factor graphs, estimating parameters, and performing inference.
Data analysis/Data Visualization
  • MLlib in Apache Spark - the distributed machine learning library in Spark.
  • Scalding - a Scala API for Cascading.
  • Summingbird - streaming MapReduce with Scalding and Storm.
  • Algebird - abstract algebra tools for Scala.
  • Xerial - data management tools for Scala.
  • Simmer - a Unix filter that reduces your data and performs algebraic aggregation.
  • PredictionIO - a machine learning server for software developers and data engineers.
  • BIDMat - a CPU- and GPU-accelerated matrix library that supports large-scale exploratory data analysis.
General Machine Learning
  • Conjecture - a scalable machine learning framework for Scalding.
  • Brushfire - decision tree tools for Scalding.
  • Ganitha - a machine learning library based on Scalding.
  • Adam - a genomics processing engine and dedicated file format built with Apache Avro, Apache Spark, and Parquet, under the Apache 2 software license.
  • Bioscala - a bioinformatics library for the Scala language.
  • BIDMach - a CPU- and GPU-accelerated machine learning library.
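
K-means clustering is another algorithm that recurs across the lists above (for example in the JavaScript, Julia, and .NET sections). The following standard-library Python sketch shows the basic assignment/update loop (Lloyd's algorithm). It is a simplified illustration rather than the implementation used by any library listed here: the function name kmeans, the first-k-points initialization, and the sample points are all assumptions chosen to keep the example short and deterministic.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on a list of equal-length numeric tuples.

    Centroids start at the first k points, an illustrative simplification;
    real libraries typically use better seeding.
    """
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return centroids, assignments


# Made-up 2D points forming two loose groups.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, 2)
print(assignments)
print(centroids)
```

The sketch requires Python 3.8 or later for math.dist. Production code would usually work on arrays and use a smarter initialization.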