14 Common Machine Learning Algorithms

Source: Internet
Author: User

I recently received an email from the company saying that a chatbot program, "** Small Assistant", had gone online (I don't know which department wrote it). Everyone was asked to test it when they had time and get to know the program along the way, with a lucky draw for anyone who chatted with it more than 50 times. I gave it a quick try and was left speechless. What on earth is this thing? It is terrible: it cannot keep up a basic conversation for three sentences and handles little more than "Who are you?" and "How old are you?". Even when I followed its script, the answers were far-fetched. If the goal is for the program to learn, the internet is a big place, yet it still needs people to come and talk to it, and even then the program has to be at least passable. There is not even a working prototype, and still they had the nerve to ask everyone to test it under the fine-sounding name of "letting the program learn".

Machine learning is undoubtedly one of the hottest topics in data analysis today. Many people use machine learning algorithms to some degree in their everyday work. This article summarizes common machine learning algorithms for reference in your work and study.

There are many machine learning algorithms, and many of them are extensions of other algorithms. They are introduced below from two angles: first by learning style, then by algorithmic similarity.

Learning Style

Depending on the type of data, a problem can be modeled in different ways. In machine learning and artificial intelligence, people usually consider an algorithm's learning style first. Classifying algorithms by learning style helps you choose the algorithm best suited to the input data during modeling and algorithm selection, and so obtain the best results.

Supervised learning

In supervised learning, the input data is called "training data", and each training example carries an explicit label or result, for example in an anti-spam system or in handwritten digit recognition. To build a predictive model, supervised learning sets up a learning process that compares the model's predictions with the actual results in the training data and keeps adjusting the model until its predictions reach the expected accuracy. Common application scenarios for supervised learning are classification and regression problems. Common algorithms include:

    • Logistic regression
    • Back-propagation neural network
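
The compare-and-adjust loop described above can be sketched with a tiny logistic regression. This is only an illustration under stated assumptions: the toy "spam score" feature, the names `train_logistic` and `predict`, and the learning rate and epoch count are invented for the example and are not part of the original article.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.4, epochs=5000):
    """Fit weight w and bias b for 1-D inputs by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y   # predicted probability minus true label
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n               # adjust the model using the error
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

# Hypothetical labeled training data: one feature per message, label 1 = spam.
xs = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train_logistic(xs, ys)
predictions = [predict(w, b, x) for x in xs]
```

Comparing `predictions` with `ys` gives the training accuracy that the learning process is trying to raise.
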
Semi-supervised learning

In this learning mode, part of the input data is labeled and part is not. The model can be used for prediction, but it first needs to learn the internal structure of the data so that it can organize the data sensibly before predicting. Application scenarios include classification and regression. The algorithms include extensions of common supervised learning algorithms that first try to model the unlabeled data and then, on that basis, make predictions for the labeled data. Examples include:

    • Graph inference
    • Laplacian support vector machine (Laplacian SVM)
Reinforcement learning

In this mode, the input data serves as feedback to the model. In supervised learning, the input data is used only to check whether the model is right or wrong; in reinforcement learning, the input data is fed back to the model directly, and the model must adjust immediately. Common application scenarios include dynamic systems and robot control. Common algorithms include Q-learning and temporal difference learning.
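
As a rough sketch of how feedback adjusts the model immediately, here is tabular Q-learning on a made-up one-dimensional corridor. The environment (five states, reward 1 at the right end), the function name `q_learning`, and all hyperparameters are assumptions chosen for illustration.

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; reaching the rightmost state ends the episode with reward 1."""
    rng = random.Random(seed)
    moves = (-1, +1)                          # action 0 = left, action 1 = right
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            if rng.random() < epsilon:        # explore occasionally
                a = rng.randrange(2)
            else:                             # otherwise act greedily (ties go right)
                a = 0 if q[s][0] > q[s][1] else 1
            s2 = min(max(s + moves[a], 0), n_states - 1)
            reward = 1.0 if s2 == n_states - 1 else 0.0
            # feedback is applied to the table right after every step
            q[s][a] += alpha * (reward + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

q_table = q_learning()
policy = ["right" if row[1] >= row[0] else "left" for row in q_table[:-1]]
```

After training, `policy` lists the preferred action in each non-terminal state.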

In enterprise data applications, supervised and unsupervised learning models are the most commonly used. In image recognition and similar fields, semi-supervised learning is a hot topic because there is a large amount of unlabeled data and only a small amount of labeled data. Reinforcement learning is used more in robot control and other areas that require system control.

Algorithmic similarity

Algorithms can also be grouped by similarity of function and form, for example tree-based algorithms and neural-network-based algorithms. Of course, machine learning is a very broad field, and some algorithms are hard to assign to a single category. Within some categories, the same kind of algorithm can be applied to different types of problems. Here we try to group the commonly used algorithms in the way that is easiest to understand.

Regression algorithm

Regression algorithms explore the relationship between variables by using a measure of error. They are a powerful tool in statistical machine learning. In machine learning, "regression" sometimes refers to a class of problems and sometimes to a class of algorithms, which often confuses beginners. Common regression algorithms include:

    • Ordinary least squares (OLS)
    • Logistic regression
    • Stepwise regression
    • Multivariate adaptive regression splines (MARS)
    • Locally estimated scatterplot smoothing (LOESS)
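
To make "using a measure of error" concrete, the sketch below fits a straight line by ordinary least squares, which minimizes the sum of squared errors. The function name `ols_fit` and the sample numbers are made up for illustration.

```python
def ols_fit(xs, ys):
    """Closed-form simple linear regression: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical measurements that lie roughly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
slope, intercept = ols_fit(xs, ys)   # approximately 1.97 and 1.09
```
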
Instance-based algorithms

Instance-based algorithms are often used to model decision problems. Such models typically keep a batch of sample data and then compare new data with the samples using some similarity measure to find the best match. For this reason, instance-based algorithms are often called "winner-take-all" learning or "memory-based learning". Common algorithms include:

    • k-nearest neighbors (KNN)
    • Learning vector quantization (LVQ)
    • Self-organizing map (SOM)
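
The "compare new data with stored samples" idea can be sketched with k-nearest neighbors. Euclidean distance, k = 3, and the toy points are assumptions for this example; other similarity measures work the same way.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train is a list of ((x, y), label); return the majority label of the k nearest samples."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical stored samples forming two well-separated groups.
train = [((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
         ((6, 6), "b"), ((7, 6), "b"), ((6, 7), "b")]
label_near_a = knn_predict(train, (1.5, 1.5))
label_near_b = knn_predict(train, (6.5, 6.5))
```

No model is fitted in advance: the stored samples themselves are the "memory" that each query is matched against.
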
Regularization method

Regularization methods are extensions of other algorithms (usually regression algorithms) that adjust a model according to its complexity. They usually reward simple models and penalize complex ones. Common algorithms include:

    • Ridge Regression
    • Least Absolute Shrinkage and Selection Operator (LASSO)
    • Elastic net
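
A minimal way to see the penalty at work is one-variable ridge regression without an intercept, where the minimizer of sum((y - w*x)^2) + lam * w^2 has the closed form w = sum(x*y) / (sum(x*x) + lam). The function name `ridge_slope` and the data are assumptions for the sketch.

```python
def ridge_slope(xs, ys, lam):
    """Slope minimizing sum((y - w*x)**2) + lam * w**2 (no intercept, for brevity)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Hypothetical centered data lying close to y = 2x.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-4.1, -1.9, 0.1, 2.0, 3.9]
unpenalized = ridge_slope(xs, ys, lam=0.0)    # plain least squares, about 1.99
penalized = ridge_slope(xs, ys, lam=10.0)     # shrunk toward zero, about 0.995
```

A larger `lam` pulls the coefficient toward zero, which is how the method favors simpler models.
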
Decision Tree Learning

Decision tree algorithms build a tree-structured decision model from the attributes of the data. Decision tree models are often used to solve classification and regression problems. Common algorithms include classification and regression tree (CART), ID3 (Iterative Dichotomiser 3), C4.5, chi-squared automatic interaction detection (CHAID), decision stump, random forest, multivariate adaptive regression splines (MARS), and gradient boosting machine (GBM).
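
The smallest possible tree is a decision stump: a single split on one attribute. The sketch below searches for the threshold with the fewest training errors; the name `fit_stump`, the 0/1 labels, and the data are assumptions, and real tree learners usually use criteria such as information gain or Gini impurity instead.

```python
def fit_stump(xs, ys):
    """Return (threshold, left_label, right_label) with the fewest training errors.

    Assumes 1-D inputs with at least two distinct values and labels 0 or 1.
    """
    best = None
    values = sorted(set(xs))
    # candidate thresholds are midpoints between consecutive distinct values
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != y for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    return best[1:]

xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
threshold, left_label, right_label = fit_stump(xs, ys)
```

A full decision tree applies the same kind of split recursively to each side.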

Bayesian method

Bayesian methods are algorithms based on Bayes' theorem and are mainly used to solve classification and regression problems. Common algorithms include:

    • Naive Bayes
    • Averaged one-dependence estimators (AODE)
    • Bayesian belief network (BBN)
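
As an illustration of applying Bayes' theorem to classification, here is a small naive Bayes text classifier with add-one (Laplace) smoothing. The four training documents and the names `train_nb` and `predict_nb` are invented for the example.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs is a list of (list_of_words, label); collect the counts needed for prediction."""
    label_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in docs:
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab

def predict_nb(model, words):
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)               # log prior
        for w in words:                                     # log likelihoods, add-one smoothed
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

docs = [("win money now".split(), "spam"),
        ("cheap money offer".split(), "spam"),
        ("meeting at noon".split(), "ham"),
        ("lunch meeting tomorrow".split(), "ham")]
model = train_nb(docs)
first = predict_nb(model, "cheap money".split())
second = predict_nb(model, "meeting tomorrow".split())
```

The "naive" part is the assumption that words are independent given the label, which lets the per-word log probabilities simply be added.
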
Kernel-based algorithms

The best known kernel-based algorithm is the support vector machine (SVM). Kernel-based algorithms map the input data into a higher-dimensional vector space in which some classification or regression problems are easier to solve. Common kernel-based algorithms include:

    • Support vector machine (SVM)
    • Radial basis function (RBF)
    • Linear discriminant analysis (LDA)
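
The mapping to a higher-dimensional space usually stays implicit: a kernel function returns the dot product in that space without constructing it. The sketch below checks this for the degree-2 polynomial kernel on 2-D inputs; the function names and the two sample vectors are assumptions for illustration.

```python
import math

def poly_kernel(a, b):
    """Degree-2 polynomial kernel on 2-D inputs: (a . b) ** 2."""
    return (a[0] * b[0] + a[1] * b[1]) ** 2

def feature_map(a):
    """Explicit 3-D feature map whose dot product equals poly_kernel."""
    return (a[0] ** 2, math.sqrt(2) * a[0] * a[1], a[1] ** 2)

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

a, b = (1.0, 2.0), (3.0, -1.0)
implicit = poly_kernel(a, b)                      # computed in the original 2-D space
explicit = dot(feature_map(a), feature_map(b))    # computed in the mapped 3-D space
```

Both values agree up to floating-point rounding, which is why algorithms such as SVM can work in the larger space at the cost of the smaller one.
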
Clustering algorithm

Clustering, like regression, sometimes describes a class of problems and sometimes a class of algorithms. Clustering algorithms typically group the input data using either a centroid-based or a hierarchical approach. All of them try to find the intrinsic structure of the data so that it can be grouped by what the data points have most in common. Common clustering algorithms include:

    • K-means algorithm
    • Expectation maximization (EM)
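
A centroid-based approach can be sketched with k-means on one-dimensional data: assign each point to its nearest center, then move each center to the mean of its points. The function name `kmeans_1d`, the fixed starting centers, and the data are assumptions; practical implementations choose initial centers more carefully.

```python
def kmeans_1d(points, centers, iterations=10):
    """Alternate assignment and update steps; returns (centers, clusters)."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # keep the old center if a cluster ends up empty
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, clusters = kmeans_1d(points, centers=[0.0, 6.0])
```

No labels are supplied; the two groups emerge from the structure of the data alone.
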
Association Rule Learning

Association rule learning extracts useful association rules from large multivariate datasets by finding the rules that best explain the relationships between variables in the data. Common algorithms include:

    • Apriori algorithm
    • Eclat algorithm
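
The level-wise search behind Apriori can be sketched as follows: only itemsets that are frequent get extended, because any superset of an infrequent itemset is also infrequent. This is a simplified illustration; the shopping-basket data, the name `frequent_itemsets`, and the support threshold are assumptions, and full Apriori adds an explicit subset-pruning step.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset whose support >= min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    result = {}
    while current:
        for s in current:
            result[s] = support(s)
        # join step: merge frequent k-itemsets into (k+1)-item candidates
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == len(a) + 1}
        current = [c for c in candidates if support(c) >= min_support]
    return result

transactions = [frozenset(t) for t in (
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
)]
itemsets = frequent_itemsets(transactions, min_support=0.6)
# confidence of the rule {diapers} -> {beer}
confidence = itemsets[frozenset({"diapers", "beer"})] / itemsets[frozenset({"diapers"})]
```

Support measures how often items occur together; confidence measures how often the rule holds when its left-hand side is present.
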
Artificial neural network

Artificial neural network algorithms are pattern-matching algorithms that simulate biological neural networks. They are typically used to solve classification and regression problems. Artificial neural networks are a huge branch of machine learning with hundreds of different algorithms. (Deep learning is one of them and is discussed separately.) Important artificial neural network algorithms include:

    • Perceptron
    • Back propagation
    • Hopfield network
    • Self-organizing map (SOM)
    • Learning vector quantization (LVQ)
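
The simplest network in the list, a single perceptron, can be sketched in a few lines. Here it learns the logical AND function; the function names, the learning rate of 1, and the epoch count are assumptions for the example.

```python
def train_perceptron(samples, epochs=20, lr=1.0):
    """Perceptron learning rule for two inputs; returns (weights, bias)."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            output = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = target - output
            # weights change only when the prediction is wrong
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b

def perceptron_output(w, b, x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)
outputs = [perceptron_output(weights, bias, x1, x2) for (x1, x2), _ in and_samples]
```

A single perceptron can only separate classes with a straight line, which is one reason multi-layer networks trained with back propagation were developed.
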
Deep learning

Deep learning algorithms are a development of artificial neural networks. They have attracted a great deal of attention recently, and interest in China grew further once Baidu began investing in deep learning. As computing power becomes ever cheaper, deep learning attempts to build much larger and more complex neural networks. Many deep learning algorithms are semi-supervised learning algorithms used to handle large datasets in which only a small part of the data is labeled. Common deep learning algorithms include:

    • Restricted Boltzmann machine (RBM)
    • Deep belief network (DBN)
    • Convolutional network
    • Stacked auto-encoders
Dimensionality reduction algorithms

Like clustering algorithms, dimensionality reduction algorithms try to analyze the intrinsic structure of the data, but they do so in an unsupervised way, attempting to summarize or explain the data using less information. Such algorithms can be used to visualize high-dimensional data or to simplify data for supervised learning. Common algorithms include:

    • Principal component analysis (PCA)
    • Partial least squares regression (PLS)
    • Sammon mapping
    • Multidimensional scaling (MDS)
    • Projection pursuit
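
To show what "summarizing the data with less information" can look like, the sketch below reduces 2-D points to one dimension along their first principal component. It uses power iteration on the covariance matrix to stay within the standard library; the function name, the starting vector, and the data are assumptions, and numerical libraries normally use an eigendecomposition or SVD instead.

```python
import math

def first_principal_component(points, iterations=100):
    """Return (unit direction of greatest variance, 1-D projections) for 2-D points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # entries of the 2x2 sample covariance matrix
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)
    v = (1.0, 0.0)   # starting guess, assumed not orthogonal to the answer
    for _ in range(iterations):
        v = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = math.hypot(v[0], v[1])
        v = (v[0] / norm, v[1] / norm)
    return v, [x * v[0] + y * v[1] for x, y in centered]

# Hypothetical points scattered around the line y = x.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]
axis, projected = first_principal_component(points)
```

Each point is now described by one number in `projected` instead of two coordinates, with most of the variance preserved.
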
Ensemble algorithms

Ensemble algorithms train several relatively weak learning models independently on the same samples and then combine their results into an overall prediction. The main difficulties are deciding which independent weak models to combine and how to combine their results. This is a very powerful and very popular class of algorithms. Common algorithms include:

    • Boosting
    • Bootstrap aggregation (bagging)
    • AdaBoost
    • Stacked generalization (blending)
    • Gradient boosting machine (GBM)
    • Random forest
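
Combining independently trained models can be sketched with bagging: each model sees a bootstrap sample (drawn with replacement) of the same data, and the ensemble takes a majority vote. The base learner here is a 1-nearest-neighbor rule chosen only for brevity; it, the function names, the seed, and the data are assumptions for illustration.

```python
import random
from collections import Counter

def nearest_label(sample, x):
    """Base learner: label of the closest (value, label) pair in the sample."""
    return min(sample, key=lambda item: abs(item[0] - x))[1]

def bagging_predict(data, x, n_models=25, seed=0):
    """Majority vote over base learners, each trained on its own bootstrap sample."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        bootstrap = [rng.choice(data) for _ in data]   # sample with replacement
        votes[nearest_label(bootstrap, x)] += 1
    return votes.most_common(1)[0][0]

# Hypothetical 1-D data with two well-separated classes.
data = [(0.5, "low"), (0.9, "low"), (1.2, "low"), (1.5, "low"),
        (4.5, "high"), (4.8, "high"), (5.1, "high"), (5.5, "high")]
vote_low = bagging_predict(data, 1.0)
vote_high = bagging_predict(data, 5.0)
```

An individual bootstrap sample may miss part of the data, but the vote across many samples smooths out such accidents.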
