Overview of popular machine learning algorithms

Source: Internet
Author: User
Tags: svm


In this article we will outline some popular machine learning algorithms.

There are many machine learning algorithms, and many of them have extensions of their own, so determining the best algorithm for a given problem is difficult.

Let us first classify the algorithms by learning style and by similarity, to give a sense of the whole landscape, and then describe the individual algorithms.

I. Classification of algorithms based on learning patterns

Algorithms can be divided into different types based on how they handle experience, the environment, or whatever data we call the input. Machine learning and AI textbooks usually consider first the learning styles that an algorithm can adopt.

Only a few major learning styles or learning models are discussed here, along with basic examples of each. This way of organizing algorithms is useful because it forces you to think about the role of the input data and the process of preparing the model, and then to choose the algorithm that best suits your problem.

  • Supervised learning: The input data is called the training data and has known results or labels, for example whether an email is spam, or a share price over time. The model makes predictions that are corrected when wrong, and the process continues until the model reaches a certain standard of accuracy on the training data. Example problems include classification and regression; example algorithms include logistic regression and back-propagation neural networks.
  • Unsupervised learning: The input data is not labeled and there are no known results. The model infers the structure present in the data. Example problems include association rule learning and clustering; example algorithms include the Apriori algorithm and K-means.
  • Semi-supervised learning: The input data is a mixture of labeled and unlabeled examples. There is a desired prediction problem, but the model must also learn the structure and composition of the data. Example problems include classification and regression; example algorithms are essentially extensions of unsupervised learning algorithms.
  • Reinforcement learning: The input data is provided as a stimulus from an environment to which the model must respond and react. Feedback comes not only from the training process, as in supervised learning, but also as rewards or punishments from the environment. Example problems include robot control; example algorithms include Q-learning and temporal difference learning.

When crunching data to model business decisions, supervised and unsupervised learning are used most often. A hot topic at the moment is semi-supervised learning, for problems such as image classification where there is a large dataset but only a small number of the images are labeled. Reinforcement learning is used mostly in the development of robot control and other control systems.
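
To make the distinction concrete, here is a minimal sketch (not from the original article) contrasting supervised and unsupervised learning with scikit-learn; the tiny arrays are made-up toy data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Toy data: four points in two dimensions (illustrative only).
X = np.array([[0.1, 1.0], [0.3, 0.9], [2.0, 0.1], [2.2, 0.2]])
y = np.array([0, 0, 1, 1])                   # labels exist -> supervised learning

clf = LogisticRegression().fit(X, y)         # learns from labeled examples
print(clf.predict([[0.2, 0.95]]))            # predicts a class for new input

km = KMeans(n_clusters=2, n_init=10).fit(X)  # no labels -> unsupervised learning
print(km.labels_)                            # cluster assignments inferred from structure
```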

II. Similarity of machine learning algorithms

Algorithms can also be grouped by function or form, for example tree-based methods or neural-network methods. This is a useful way of classifying, but it is not perfect, because many algorithms fit just as easily into more than one category; learning vector quantization, for example, is both a neural-network method and an instance-based method. Just as machine learning has no single perfect model, the classification of algorithms has no perfect scheme either.

III. Popular machine learning algorithms

Regression

Regression (regression analysis) is concerned with modeling the relationships between variables using statistical methods. Example algorithms include the following:

    • Ordinary Least Squares
    • Logistic Regression
    • Stepwise Regression
    • Multivariate Adaptive Regression Splines (MARS)
    • Locally Estimated Scatterplot Smoothing (LOESS)
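
As a rough illustration of the first two entries, here is a minimal scikit-learn sketch of ordinary least squares and logistic regression; the numbers are arbitrary toy values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y_cont = np.array([2.1, 3.9, 6.2, 8.1])     # continuous target for least squares
y_bin = np.array([0, 0, 1, 1])              # binary target for logistic regression

ols = LinearRegression().fit(X, y_cont)     # ordinary least squares fit
print(ols.coef_, ols.intercept_)            # estimated slope and intercept

logit = LogisticRegression().fit(X, y_bin)  # logistic regression fit
print(logit.predict([[2.5]]))               # predicted class for a new point
```
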
Instance-based Methods

Instance-based learning (also called case-based learning) models a decision problem using the instances, or examples, of training data that are deemed important to the model. Such methods build a database of existing data, compare new data to it using a similarity measure, and use the best match to make a prediction. For this reason, instance-based methods are also known as winner-take-all methods and memory-based learning. The focus is on how the stored instances are represented and how similarity between them is measured.

    • K-nearest Neighbour (KNN)
    • Learning Vector Quantization (LVQ)
    • Self-organizing Map (SOM)
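
A minimal sketch of the idea, assuming scikit-learn and toy data: a k-nearest neighbour classifier stores the training examples and lets the most similar ones vote on the prediction.

```python
from sklearn.neighbors import KNeighborsClassifier

X = [[0, 0], [0, 1], [5, 5], [6, 5]]        # stored training instances (toy data)
y = [0, 0, 1, 1]

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)  # "training" just stores the data
print(knn.predict([[4.5, 5.0]]))                     # the 3 nearest neighbours vote
```
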
Regularization Methods

These are extensions of other methods (usually regression methods) that favor simpler models, which tend to generalize better. They are listed here on their own because they are popular and powerful.

    • Ridge Regression
    • Least Absolute Shrinkage and Selection Operator (LASSO)
    • Elastic Net
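
A minimal sketch of regularized regression, assuming scikit-learn; the alpha values below are arbitrary and control how strongly large coefficients are penalized.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=50)   # only feature 0 is relevant

print(Ridge(alpha=1.0).fit(X, y).coef_.round(2))     # shrinks all coefficients toward zero
print(Lasso(alpha=0.1).fit(X, y).coef_.round(2))     # drives irrelevant coefficients to exactly zero
```
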
Decision Tree Learning

Decision tree methods build a model of decisions based on the actual attribute values in the data. Decision trees are used to solve classification and regression problems.

  • Classification and Regression Tree (CART)
  • Iterative Dichotomiser 3 (ID3)
  • C4.5
  • Chi-squared Automatic Interaction Detection (CHAID)
  • Decision Stump
  • Random Forest
  • Multivariate Adaptive Regression Splines (MARS)
  • Gradient Boosting Machines (GBM)
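
A minimal sketch of tree-based learning with scikit-learn, using its built-in iris dataset: a single CART-style tree and a random forest built from many such trees.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)          # a single decision tree
forest = RandomForestClassifier(n_estimators=100).fit(X, y)   # an ensemble of randomized trees

print(tree.score(X, y), forest.score(X, y))   # accuracy on the training data only
```
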
Bayesian

Bayesian methods are those that explicitly apply Bayes' theorem to classification and regression problems.

    • Naive Bayes
    • Averaged One-Dependence Estimators (AODE)
    • Bayesian Belief Network (BBN)
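
A minimal naive Bayes sketch with scikit-learn and toy data: class probabilities come from Bayes' theorem under the simplifying assumption that features are independent given the class.

```python
from sklearn.naive_bayes import GaussianNB

X = [[1.0, 2.0], [1.2, 1.8], [4.0, 5.0], [4.2, 4.8]]   # toy feature vectors
y = [0, 0, 1, 1]

nb = GaussianNB().fit(X, y)
print(nb.predict([[1.1, 2.1]]))          # most probable class
print(nb.predict_proba([[1.1, 2.1]]))    # posterior probability of each class
```
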
Kernel Methods

The best known of the kernel methods is the support vector machine (SVM). These methods map the input data into a higher-dimensional space, where some classification and regression problems become easier to model.

    • Support Vector Machines (SVM)
    • Radial Basis Function (RBF)
    • Linear Discriminant Analysis (LDA)
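
A minimal SVM sketch, assuming scikit-learn and toy data: with an RBF kernel the points are implicitly mapped into a higher-dimensional space, where a linear separator can handle a layout that is not linearly separable in the original space.

```python
from sklearn.svm import SVC

X = [[0, 0], [1, 1], [0, 1], [1, 0]]   # XOR-like layout, not linearly separable
y = [0, 0, 1, 1]

svm = SVC(kernel="rbf", gamma=2.0, C=1.0).fit(X, y)   # RBF kernel mapping
print(svm.predict([[0.9, 0.1]]))                      # query near the (1, 0) training point
```
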
Clustering Methods

Clustering, like regression, describes both a class of problems and a class of methods. Clustering methods are typically organized by their modeling approach. All of them use the inherent structure of the data to organize it into groups of maximum commonality.

    • K-means
    • Expectation Maximization (EM)
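
A minimal clustering sketch with scikit-learn and toy data: K-means assigns each point to its nearest centroid, while a Gaussian mixture fitted by expectation maximization gives soft, probabilistic assignments.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
              [5.0, 5.1], [5.2, 4.9], [4.9, 5.2]])   # two obvious groups (toy data)

print(KMeans(n_clusters=2, n_init=10).fit_predict(X))   # hard cluster labels
print(GaussianMixture(n_components=2, random_state=0)
      .fit(X).predict_proba(X).round(2))                # soft (EM) assignments
```
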
Association Rule Learning

Association rule learning is a method for extracting rules that describe relationships between variables in the data. It can uncover important connections in large, multidimensional datasets, and these connections can be exploited by an organization.

    • Apriori algorithm
    • Eclat algorithm
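
A minimal, hand-rolled sketch of the core Apriori idea (not the full algorithm), using only the Python standard library and made-up transactions: count item pairs and keep those whose support clears a threshold; the real algorithm repeats this pruning for progressively larger itemsets.

```python
from collections import Counter
from itertools import combinations

transactions = [{"bread", "milk"}, {"bread", "butter"},
                {"bread", "milk", "butter"}, {"milk", "butter"}]   # toy shopping baskets
min_support = 0.5

# Count every item pair that occurs in a transaction.
pair_counts = Counter(frozenset(p)
                      for basket in transactions
                      for p in combinations(sorted(basket), 2))

# Keep pairs whose support (fraction of transactions containing them) is high enough.
frequent = {pair: count / len(transactions)
            for pair, count in pair_counts.items()
            if count / len(transactions) >= min_support}
print(frequent)
```
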
Artificial Neural Networks

Artificial neural networks are models inspired by the structure and function of biological neural networks. They are a form of pattern matching, commonly used for regression and classification problems, and the field comprises hundreds of algorithms and variants. Some of the classic and popular ones are listed here (deep learning is treated separately below):

    • Perceptron
    • Back-propagation
    • Hopfield Network
    • Self-organizing Map (SOM)
    • Learning Vector Quantization (LVQ)
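
A minimal from-scratch perceptron sketch on toy, linearly separable data (the logical AND function): the weights are nudged whenever an example is misclassified.

```python
import numpy as np

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 0, 0, 1])            # logical AND, which is linearly separable

w, b, lr = np.zeros(2), 0.0, 0.1      # weights, bias, learning rate
for _ in range(20):                   # a few passes over the training data
    for xi, yi in zip(X, y):
        pred = 1 if xi @ w + b > 0 else 0
        w += lr * (yi - pred) * xi    # update only when the prediction is wrong
        b += lr * (yi - pred)

print([1 if xi @ w + b > 0 else 0 for xi in X])   # expected: [0, 0, 0, 1]
```
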
Deep Learning

Deep learning methods are a modern update of artificial neural networks. Compared with traditional neural networks, they have larger and more complex network structures, and many of the methods are concerned with semi-supervised learning problems, where there is a large amount of data but very little of it is labeled.

    • Restricted Boltzmann Machine (RBM)
    • Deep Belief Networks (DBN)
    • Convolutional Network
    • Stacked Auto-encoders
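
A minimal restricted Boltzmann machine sketch using scikit-learn's BernoulliRBM on made-up binary data: the model learns a compact hidden representation without using any labels.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

# Toy binary data: the first two rows share one pattern, the last two another.
X = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]])

rbm = BernoulliRBM(n_components=2, learning_rate=0.1, n_iter=100, random_state=0)
hidden = rbm.fit_transform(X)        # unsupervised: no labels are involved
print(hidden.round(2))               # two-dimensional hidden activations per row
```
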
Dimensionality Reduction

Dimensionality reduction, like clustering, seeks out and exploits the inherent structure of the data, but here the goal is to summarize or describe the data using less information. This is useful for visualizing data or for simplifying it.

    • Principal Component Analysis (PCA)
    • Partial Least Squares Regression (PLS)
    • Sammon Mapping
    • Multidimensional Scaling (MDS)
    • Projection Pursuit
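
A minimal PCA sketch with scikit-learn and synthetic data: three correlated dimensions are projected onto the two directions of greatest variance.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
t = rng.normal(size=(100, 1))
X = np.hstack([t, 2 * t, -t]) + rng.normal(scale=0.05, size=(100, 3))  # correlated 3-D data

pca = PCA(n_components=2).fit(X)
print(pca.explained_variance_ratio_.round(3))   # the first component explains almost everything
print(pca.transform(X).shape)                   # data summarized as (100, 2)
```
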
Ensemble Methods

Ensemble methods are composed of multiple weaker models that are trained independently and whose predictions are combined to form an overall prediction. Much of the research focuses on which models to combine and how to combine them. This is a very powerful and popular class of techniques.

    • Bootstrapped Aggregation (Bagging)
    • Stacked Generalization (Blending)
    • Random Forest


[Figure: an example of fitting with an ensemble method (from Wikipedia); each individual fitted model is shown in gray, and the combined result in red.]
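
A minimal bagging sketch with scikit-learn and its built-in iris dataset: many decision trees (the default base learner) are trained on bootstrap resamples and their votes are combined into one prediction.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier

X, y = load_iris(return_X_y=True)

# 25 trees, each fitted on a bootstrap sample of the data; predictions are voted.
bag = BaggingClassifier(n_estimators=25, random_state=0).fit(X, y)
print(bag.score(X, y))   # accuracy on the training data only
```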
