A Summary of Common Machine Learning Algorithms


Contents

    • 1. Learning Style
    • 1.1 Supervised learning
    • 1.2 Unsupervised learning
    • 1.3 Semi-supervised learning
    • 1.4 Reinforcement learning
    • 2. Algorithm classification
    • 2.1 Regression algorithms
    • 2.2 Instance-based algorithms
    • 2.3 Regularization methods
    • 2.4 Decision tree learning
    • 2.5 Bayesian methods
    • 2.6 Kernel-based algorithms
    • 2.7 Clustering algorithms
    • 2.8 Association rule learning
    • 2.9 Genetic algorithms
    • 2.10 Artificial neural networks
    • 2.11 Deep learning
    • 2.12 Dimensionality reduction algorithms
    • 2.13 Ensemble algorithms

  Note: This blog post is compiled from http://www.ctocio.com/hotnews/15919.html. The original author is Zhang Meng; credit goes to the original.

Machine learning is undoubtedly a hot topic in data analysis today. Many people use machine learning algorithms to some degree in their everyday work. This article summarizes common machine learning algorithms for reference in your work and study.

There are many machine learning algorithms. What often confuses people is that many algorithms belong to the same family, and some are extensions of others. Here we introduce them from two angles: first by learning style, and second by algorithm type.

Building on the original, the blogger has added an introduction to genetic algorithms (2.9) so that this post covers machine learning algorithms more comprehensively. This post is a summary; to understand the concrete implementation of any one algorithm, you will need to study that algorithm in depth.

1. Learning Style

Depending on the type of data, there are different ways to model a problem. In machine learning and artificial intelligence, the first thing people usually consider is an algorithm's learning style. There are several main learning styles. Classifying algorithms by learning style is useful because, during modeling and algorithm selection, it prompts you to choose the algorithm best suited to the input data and so to get the best result.

1.1 Supervised Learning

In supervised learning, the input data is called "training data", and every training example carries an explicit label or result, such as "spam" and "non-spam" in an anti-spam system, or "1", "2", "3", "4" and so on in handwritten digit recognition. To build a predictive model, supervised learning sets up a learning process that compares the model's predictions with the actual results in the training data and keeps adjusting the model until its predictions reach the expected accuracy. Common application scenarios for supervised learning are classification problems and regression problems. Common algorithms include logistic regression and back-propagation neural networks.
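The compare-and-adjust loop described above can be sketched as follows. This is a minimal illustration, not a production implementation: the toy data, the learning rate, the accuracy target and the epoch cap are all assumptions chosen for the example, and logistic regression trained by gradient descent stands in for the predictive model.

```python
import math

# Toy labeled training data (assumed): one feature per sample, label 0 or 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 0, 1, 1, 1]

def probability(w, b, x):
    """Logistic model: estimated probability that x belongs to class 1."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def predict(w, b, x):
    return 1 if probability(w, b, x) >= 0.5 else 0

def accuracy(w, b):
    hits = sum(predict(w, b, x) == y for x, y in zip(xs, ys))
    return hits / len(xs)

w, b = 0.0, 0.0
learning_rate = 0.1      # assumed value
target_accuracy = 1.0    # assumed stopping criterion
epochs = 0
while accuracy(w, b) < target_accuracy and epochs < 50000:
    # Compare predictions with the known labels and nudge the parameters.
    grad_w = grad_b = 0.0
    for x, y in zip(xs, ys):
        error = probability(w, b, x) - y
        grad_w += error * x
        grad_b += error
    w -= learning_rate * grad_w / len(xs)
    b -= learning_rate * grad_b / len(xs)
    epochs += 1

print(epochs, accuracy(w, b))
```

The loop stops as soon as the training accuracy reaches the target, which mirrors the "adjust until the expected accuracy is reached" description. In practice accuracy would be measured on held-out data rather than on the training set.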

1.2 Unsupervised Learning

In unsupervised learning, the data carries no labels, and the learning model is meant to infer some intrinsic structure of the data. Common application scenarios include association rule learning and clustering. Common algorithms include the Apriori algorithm and the k-means algorithm.

1.3 Semi-supervised Learning

In this learning style, part of the input data is labeled and part is not. The learning model can be used for prediction, but it first needs to learn the internal structure of the data so that it can organize the data sensibly before predicting. Application scenarios include classification and regression. The algorithms include extensions of common supervised learning algorithms that first try to model the unlabeled data and then make predictions on that basis using the labeled data. Examples include graph inference algorithms and the Laplacian support vector machine (Laplacian SVM).
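As a rough illustration of using unlabeled data together with a few labels, here is a sketch of self-training with a nearest-neighbor rule. It is a simpler technique than graph inference or Laplacian SVM and is swapped in only because it fits in a few lines; the points and labels are made up for the example.

```python
# Assumed toy data: two labeled points and six unlabeled points on a line.
labeled = {0.0: "a", 10.0: "b"}
unlabeled = [1.0, 2.5, 4.0, 9.2, 7.9, 6.8]

def self_train(labeled, unlabeled):
    """Repeatedly label the unlabeled point closest to any labeled point."""
    labels = dict(labeled)
    remaining = list(unlabeled)
    while remaining:
        # Smallest (distance, unlabeled point, nearest labeled point) triple.
        _, point, nearest = min(
            (abs(u - k), u, k) for u in remaining for k in labels
        )
        labels[point] = labels[nearest]   # adopt the neighbor's label
        remaining.remove(point)
    return labels

labels = self_train(labeled, unlabeled)
print(labels)
```

Note that the point 4.0 is closer to the labeled point 0.0 than to 10.0 only by a small margin, yet it ends up with label "a" because the chain of newly labeled points (1.0, then 2.5) reaches it first. That is the sense in which the structure of the unlabeled data shapes the prediction.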

1.4 Reinforcement Learning

In this learning style, the input data serves as feedback to the model. In supervised learning the input data is used only to check whether the model is right or wrong; in reinforcement learning the input data is fed back to the model directly, and the model must adjust immediately. Common application scenarios include dynamic systems and robot control. Common algorithms include Q-learning and temporal difference learning.
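The immediate adjust-on-feedback idea can be sketched with tabular Q-learning on a tiny chain of states. The environment, the reward of 1 at the goal, and the values of `ALPHA` and `GAMMA` are assumptions made for this illustration.

```python
import random

N_STATES = 5               # states 0..4; state 4 is the goal (assumed toy environment)
ACTIONS = ["left", "right"]
ALPHA, GAMMA = 0.5, 0.9    # assumed learning rate and discount factor

def step(state, action):
    """Move along the chain; entering the goal state yields reward 1."""
    nxt = max(state - 1, 0) if action == "left" else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

rng = random.Random(0)
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(500):
    state = 0
    while state != N_STATES - 1:
        action = rng.choice(ACTIONS)              # purely random exploration
        nxt, reward = step(state, action)
        best_next = max(Q[(nxt, a)] for a in ACTIONS)
        # Temporal-difference update: adjust right after each feedback signal.
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = nxt

policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

Each update uses only the most recent transition and reward, so the value table changes after every single step rather than after a pass over a labeled dataset.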

In enterprise data applications, supervised and unsupervised learning models are the most commonly used. In fields such as image recognition, semi-supervised learning is a hot topic because there is a large amount of unlabeled data and only a small amount of labeled data. Reinforcement learning is used more in robot control and other areas that require system control.

2. Algorithm Classification

Algorithms can be grouped by similarity of function and form, for example tree-based algorithms, neural-network-based algorithms and so on. Of course, machine learning is a very broad field, and some algorithms are hard to place in a single category. Within some categories, the same kind of algorithm can be applied to different types of problems. Here we try to group the commonly used algorithms in the way that is easiest to understand.

2.1 Regression Algorithms

Regression algorithms explore the relationship between variables by using a measure of error. They are a powerful tool in statistical machine learning. In machine learning, "regression" sometimes refers to a type of problem and sometimes to a type of algorithm, which often confuses beginners. Common regression algorithms include ordinary least squares (OLS), logistic regression, stepwise regression, multivariate adaptive regression splines (MARS) and locally estimated scatterplot smoothing (LOESS).
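As a small illustration of fitting a relationship by minimizing an error measure, the sketch below computes an ordinary least squares line for one input variable in closed form. The data values are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Assumed toy measurements.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
intercept, slope = fit_line(xs, ys)

# The error measure being minimized: the sum of squared residuals.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
print(intercept, slope, sse)
```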

2.2 Instance-based Algorithms

Instance-based algorithms are often used to model decision problems. Such models typically store a batch of sample data and then compare new data against the samples using some similarity measure, looking for the best match. For this reason, instance-based algorithms are often called "winner-take-all" learning or "memory-based learning". Common algorithms include k-nearest neighbors (KNN), learning vector quantization (LVQ) and the self-organizing map (SOM).
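A minimal k-nearest-neighbors sketch shows the store-then-compare idea. The sample points, the labels and the choice of Euclidean distance with k = 3 are assumptions for illustration.

```python
from collections import Counter
import math

# Stored sample data (assumed): ((feature1, feature2), label).
samples = [
    ((1.0, 1.0), "red"), ((1.5, 2.0), "red"), ((2.0, 1.5), "red"),
    ((6.0, 6.0), "blue"), ((6.5, 7.0), "blue"), ((7.0, 6.5), "blue"),
]

def knn_predict(query, samples, k=3):
    """Label a query point by majority vote among its k nearest samples."""
    by_distance = sorted(samples, key=lambda s: math.dist(query, s[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

print(knn_predict((1.2, 1.4), samples))
print(knn_predict((6.8, 6.1), samples))
```

There is no training step: the "model" is simply the stored samples, which is why this family is also described as memory-based.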

2.3 Regularization Methods

Regularization methods are extensions of other algorithms (usually regression algorithms) that adjust the algorithm according to model complexity. They usually reward simple models and penalize complex ones. Common algorithms include ridge regression, the least absolute shrinkage and selection operator (LASSO) and the elastic net.
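To make the penalty concrete, here is a sketch of ridge regression for a single input variable, where the closed-form slope is shrunk toward zero as the penalty weight grows. The data and the penalty values in `lam` are assumptions for the example, and the intercept is left unpenalized.

```python
def fit_ridge_line(xs, ys, lam):
    """Minimize sum((y - a - b*x)^2) + lam * b^2; the intercept a is not penalized."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + lam)          # the penalty enlarges the denominator
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Assumed toy data lying exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slopes = [fit_ridge_line(xs, ys, lam)[1] for lam in (0.0, 5.0, 45.0)]
print(slopes)
```

With `lam = 0.0` the result is the ordinary least squares slope; larger values trade some fit for a smaller coefficient.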

2.4 Decision Tree Learning

Decision tree algorithms build a tree-structured decision model from the attributes of the data. Decision tree models are often used to solve classification and regression problems. Common algorithms include classification and regression trees (CART), ID3 (Iterative Dichotomiser 3), C4.5, chi-squared automatic interaction detection (CHAID), the decision stump, random forest, multivariate adaptive regression splines (MARS) and the gradient boosting machine (GBM).
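The basic building step of a tree, choosing a split on one attribute, can be sketched as a decision stump. The example below picks the threshold with the lowest weighted Gini impurity; the data is invented, and a full tree learner such as CART would apply this step recursively to each side of the split.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(xs, ys):
    """Return (threshold, weighted impurity) of the best split x <= threshold."""
    values = sorted(set(xs))
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0     # candidate: midpoint between neighbors
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best

# Assumed toy data: one attribute and a binary class.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(xs, ys)
print(threshold, impurity)
```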

2.5 Bayesian Methods

Bayesian algorithms are based on Bayes' theorem and are mainly used to solve classification and regression problems. Common algorithms include naive Bayes, averaged one-dependence estimators (AODE) and the Bayesian belief network (BBN).
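A small naive Bayes text classifier illustrates how Bayes' theorem is applied. The four training messages are invented, and Laplace (add-one) smoothing and the rule of ignoring unseen words are choices made for this sketch.

```python
from collections import Counter
import math

# Assumed toy training set.
training = [
    ("win free money", "spam"),
    ("free prize now", "spam"),
    ("meeting at noon", "ham"),
    ("project status meeting", "ham"),
]

word_counts = {"spam": Counter(), "ham": Counter()}
doc_counts = Counter()
for text, label in training:
    doc_counts[label] += 1
    word_counts[label].update(text.split())
vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])

def classify(text):
    """Pick the class with the highest log posterior (Laplace-smoothed)."""
    scores = {}
    for label in word_counts:
        total = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / len(training))    # log prior
        for word in text.split():
            if word in vocabulary:                              # ignore unseen words
                count = word_counts[label][word]
                score += math.log((count + 1) / (total + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("free money now"))
print(classify("status meeting"))
```

The "naive" part is the assumption that words are independent given the class, which is what lets the per-word log probabilities simply be added.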

2.6 Kernel-based Algorithms

The best known kernel-based algorithm is the support vector machine (SVM). Kernel-based algorithms map the input data into a higher-dimensional vector space in which some classification or regression problems are easier to solve. Common kernel-based algorithms include support vector machines (SVM), radial basis functions (RBF) and linear discriminant analysis (LDA).
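The following sketch illustrates the mapping idea on scalars. The feature map `feature_map`, the degree-2 polynomial kernel and the threshold 2.5 are all choices made for this example; it shows only that the mapped data becomes separable and that the kernel equals a dot product in the mapped space, not how an SVM is trained.

```python
def feature_map(x):
    """Explicit map of a scalar into 3-D space: phi(x) = (x^2, sqrt(2)*x, 1)."""
    return (x * x, 2 ** 0.5 * x, 1.0)

def poly_kernel(x, z):
    """Degree-2 polynomial kernel; equals the dot product phi(x) . phi(z)."""
    return (x * z + 1.0) ** 2

# Assumed toy data: inner points are class 0, outer points are class 1.
# No single threshold on x itself separates the two classes.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [1, 0, 0, 1]

# In the mapped space, one threshold on the first coordinate (x^2) is enough.
predictions = [1 if feature_map(x)[0] > 2.5 else 0 for x in xs]

# The kernel gives the mapped-space dot product without building phi explicitly.
explicit = sum(a * b for a, b in zip(feature_map(1.5), feature_map(-0.5)))
implicit = poly_kernel(1.5, -0.5)
print(predictions, explicit, implicit)
```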

2.7 Clustering Algorithms

Clustering, like regression, sometimes describes a type of problem and sometimes a type of algorithm. Clustering algorithms usually group the input data using either a centroid-based or a hierarchical approach. All clustering algorithms try to find the intrinsic structure of the data so that the data can be grouped by what they have most in common. Common clustering algorithms include the k-means algorithm and expectation maximization (EM).
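A centroid-based approach can be sketched with k-means (Lloyd's algorithm) on scalars. The data and the rule of seeding the centers with the first k points are assumptions for this example; real implementations usually use randomized or smarter initialization.

```python
def kmeans_1d(points, k, iterations=100):
    """Lloyd's algorithm on scalars; the first k points seed the centers."""
    centers = list(points[:k])
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i]   # keep the center of an empty cluster
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:                 # converged
            break
        centers = new_centers
    return centers, clusters

# Assumed toy data with two obvious groups.
points = [1.0, 1.5, 2.0, 8.0, 9.0, 10.0]
centers, clusters = kmeans_1d(points, k=2)
print(centers, clusters)
```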

2.8 Association Rule Learning

Association rule learning looks for the rules that best explain the relationships between variables, and in this way finds useful association rules in large multivariate datasets. Common algorithms include the Apriori algorithm and the Eclat algorithm.
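The sketch below shows the first two levels of an Apriori-style search and the confidence of one rule. The transactions and the minimum support count `MIN_COUNT` are invented for illustration.

```python
from itertools import combinations

# Assumed toy shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
MIN_COUNT = 3   # assumed minimum support: 3 of 5 transactions

def count(itemset):
    """Number of transactions that contain every item in the itemset."""
    return sum(1 for t in transactions if itemset <= t)

# Level 1: frequent single items.
items = sorted({item for t in transactions for item in t})
frequent_items = [i for i in items if count({i}) >= MIN_COUNT]

# Level 2: only pairs built from frequent items can be frequent (the Apriori pruning idea).
frequent_pairs = [
    pair for pair in combinations(frequent_items, 2) if count(set(pair)) >= MIN_COUNT
]

# Confidence of the rule "bread -> milk" = support(bread, milk) / support(bread).
confidence = count({"bread", "milk"}) / count({"bread"})
print(frequent_items, frequent_pairs, confidence)
```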

2.9 Genetic Algorithms

Genetic algorithms mimic the mutation and crossover of biological reproduction and Darwinian natural selection (survival of the fittest in each ecological environment). A genetic algorithm encodes each candidate solution to a problem as a vector called an individual; each element of the vector is called a gene. It uses an objective function (corresponding to the natural selection criterion) to evaluate every individual in a population (a collection of individuals), and then selects, crosses over and mutates individuals according to their evaluation value (fitness) to produce a new population. Genetic algorithms suit very complex and difficult environments, for example those with a lot of noise and irrelevant data, where things change constantly, where the goal of the problem cannot be defined clearly and precisely, and where the value of the current behavior can be determined only through a lengthy execution process. Like neural networks, genetic algorithms have developed into an independent branch of artificial intelligence research; their leading figure is J. H. Holland.
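The encode, evaluate, select, cross over and mutate cycle can be sketched as follows. The objective (counting 1-genes, often called OneMax), the population sizes, the mutation rate, tournament selection and elitism are all assumptions chosen to keep the example short.

```python
import random

rng = random.Random(42)
GENES, POP_SIZE, GENERATIONS = 20, 30, 40   # assumed sizes
MUTATION_RATE = 0.02                        # assumed per-gene mutation probability

def fitness(individual):
    """Toy objective (OneMax): count the genes equal to 1."""
    return sum(individual)

def select(population):
    """Tournament selection: the fitter of two random individuals."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b

def crossover(p1, p2):
    """One-point crossover: head of one parent, tail of the other."""
    cut = rng.randrange(1, GENES)
    return p1[:cut] + p2[cut:]

def mutate(individual):
    """Flip each gene with a small probability."""
    return [1 - g if rng.random() < MUTATION_RATE else g for g in individual]

population = [[rng.randint(0, 1) for _ in range(GENES)] for _ in range(POP_SIZE)]
best_history = []
for _ in range(GENERATIONS):
    best = max(population, key=fitness)
    best_history.append(fitness(best))
    # Elitism: carry the best individual over unchanged, then breed the rest.
    children = [mutate(crossover(select(population), select(population)))
                for _ in range(POP_SIZE - 1)]
    population = [best] + children

best = max(population, key=fitness)
print(best_history[0], fitness(best))
```

Because the best individual is always kept, the best fitness never decreases from one generation to the next; without elitism it can fluctuate.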

2.10 Artificial Neural Networks

Artificial neural network algorithms are pattern-matching algorithms modeled on biological neural networks. They are typically used to solve classification and regression problems. Artificial neural networks are a huge branch of machine learning, with hundreds of different algorithms. (Deep learning is one of them and is discussed separately.) Important artificial neural network algorithms include the perceptron, back propagation, the Hopfield network and the self-organizing map (SOM).
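The simplest of these, the perceptron, can be sketched in a few lines. The example learns the logical AND function; the zero initialization, the step size of 1 and the cap of 100 epochs are assumptions for the illustration.

```python
# Truth table of logical AND as labeled training data.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

def predict(weights, bias, x):
    """Step activation: fire if the weighted sum is positive."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0

weights, bias = [0, 0], 0
for epoch in range(100):
    mistakes = 0
    for x, target in data:
        error = target - predict(weights, bias, x)
        if error != 0:
            # Perceptron rule: shift the weights toward (or away from) the sample.
            weights = [w + error * xi for w, xi in zip(weights, x)]
            bias += error
            mistakes += 1
    if mistakes == 0:       # a full pass without errors: training is done
        break

outputs = [predict(weights, bias, x) for x, _ in data]
print(weights, bias, outputs)
```

A single perceptron can only learn linearly separable functions such as AND; functions like XOR need at least one hidden layer.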

2.11 Deep Learning

Deep learning algorithms are a development of artificial neural networks. They have attracted a great deal of attention recently, particularly in China after Baidu began investing in deep learning. Now that computing power is increasingly cheap, deep learning attempts to build much larger and more complex neural networks. Many deep learning algorithms are semi-supervised learning algorithms designed for large datasets in which only a small amount of the data is labeled. Common deep learning algorithms include the restricted Boltzmann machine (RBM), deep belief networks (DBN), convolutional networks and stacked auto-encoders.
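Deep networks are built by stacking layers. As a very small illustration of what an extra layer buys, the sketch below runs a forward pass through a two-layer network that computes XOR, which no single-layer perceptron can represent. The weights are hand-picked for the example rather than learned, and the network is far smaller than anything that would be called deep in practice.

```python
def relu(values):
    """Rectified linear activation applied element by element."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One fully connected layer: weights[j] holds the input weights of unit j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

# Hand-picked (not learned) parameters for a 2-2-1 network.
W1, B1 = [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]
W2, B2 = [[1.0, -2.0]], [0.0]

def network(x):
    hidden = relu(dense(x, W1, B1))     # first layer builds intermediate features
    return dense(hidden, W2, B2)[0]     # second layer combines them

outputs = [network([a, b]) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)
```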

2.12 Dimensionality Reduction Algorithms

Like clustering algorithms, dimensionality reduction algorithms try to analyze the intrinsic structure of the data, but they do so in an unsupervised way with the aim of summarizing or explaining the data using less information. Such algorithms can be used to visualize high-dimensional data or to simplify data for use in supervised learning. Common algorithms include principal component analysis (PCA), partial least squares regression (PLS), Sammon mapping, multidimensional scaling (MDS) and projection pursuit.
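As an illustration, the sketch below reduces 2-D points to one coordinate along the first principal axis. It uses the closed-form angle of the leading eigenvector of a 2x2 covariance matrix, which works only in two dimensions; the data is invented and lies exactly on a line so that the result is easy to check.

```python
import math

def first_principal_axis(points):
    """Mean and unit direction of greatest variance for a set of 2-D points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points)
    syy = sum((p[1] - mean_y) ** 2 for p in points)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points)
    # Orientation of the leading eigenvector of [[sxx, sxy], [sxy, syy]].
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mean_x, mean_y), (math.cos(angle), math.sin(angle))

# Assumed toy data on the line y = 2x.
points = [(-2.0, -4.0), (-1.0, -2.0), (0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
mean, axis = first_principal_axis(points)

# Reduce each 2-D point to a single coordinate along the principal axis.
reduced = [(p[0] - mean[0]) * axis[0] + (p[1] - mean[1]) * axis[1] for p in points]
print(axis, reduced)
```

For data with more than two dimensions, the principal axes are normally obtained from an eigendecomposition or singular value decomposition instead.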

2.13 Ensemble Algorithms

Ensemble algorithms train several relatively weak learning models independently on the same samples and then combine their results into an overall prediction. The main difficulty lies in deciding which independent weak learners to combine and how to combine their results. This is a very powerful and very popular class of algorithms. Common algorithms include boosting, bootstrapped aggregation (bagging), AdaBoost, stacked generalization (blending), the gradient boosting machine (GBM), random forest and the gradient boosting decision tree (GBDT).
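The combining step can be sketched with simple majority voting. In this example the three weak learners are not trained: each just echoes one feature, and the data is constructed so that each one is wrong on a different sample. That setup is an assumption made to show why combining helps, not a description of bagging or boosting themselves.

```python
from collections import Counter

# Assumed toy data: six samples with three binary features, and their true labels.
samples = [(1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1), (1, 1, 1), (1, 1, 1)]
labels = [0, 0, 0, 1, 1, 1]

# Three weak learners: each simply echoes one feature (assumed, not trained).
weak_learners = [lambda s, i=i: s[i] for i in range(3)]

def vote(sample):
    """Combine the weak predictions by majority vote."""
    predictions = [h(sample) for h in weak_learners]
    return Counter(predictions).most_common(1)[0][0]

def accuracy(predict):
    hits = sum(predict(s) == y for s, y in zip(samples, labels))
    return hits / len(samples)

weak_scores = [accuracy(h) for h in weak_learners]
ensemble_score = accuracy(vote)
print(weak_scores, ensemble_score)
```

Each weak learner scores 5 out of 6, while the vote is correct on all six samples, because the individual errors do not overlap. Methods such as bagging and random forest aim to produce weak learners whose errors are as uncorrelated as possible for the same reason.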
