Overview of Popular Machine Learning Algorithms

Last Update: 2014-07-16 Source: Internet

Author: User


This article introduces several of the most popular machine learning algorithms. There are many machine learning algorithms, and the difficulty lies in classifying them. Here we introduce two ways of thinking about and grouping them: the first groups algorithms by learning style, and the second groups them by similarity in form or function.

Learning Style

An algorithm can model a problem in different ways, depending on how it interacts with experience, with the environment, or with whatever we choose to call the input data. The learning style is therefore the first question to consider in machine learning.

Next, let's look at the main learning styles, or learning models, that an algorithm can adopt.

Supervised learning: the input data is called training data and carries a known label or result. A model is prepared through a training process in which it makes predictions and is corrected whenever those predictions are wrong. Training continues until the model reaches the desired accuracy on the training data. Typical problems are classification and regression; example algorithms are logistic regression and the back-propagation (BP) neural network.
Unsupervised learning: the input data is unlabeled and has no known result. A model is derived from the structure present in the input data. Typical problems are association rule learning and clustering; example algorithms include the Apriori algorithm and k-means.
Semi-supervised learning: the input data is a mixture of labeled and unlabeled examples. There is a desired prediction problem, but the model must also learn the structure that organizes the data in order to make predictions. Typical problems are classification and regression.
Reinforcement learning: the input data arrives as a stimulus from an environment, and the model must respond and react to it. Feedback does not come from a teaching process, as in supervised learning, but as rewards and punishments from the environment. Typical problems are system and robot control; example algorithms include Q-learning and temporal difference learning.
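
To make the reward-driven feedback in reinforcement learning concrete, here is a minimal sketch of a single tabular Q-learning update. The two-state table, the function name q_learning_update, and the values chosen for alpha and gamma are assumptions made for illustration only.

```python
def q_learning_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value.

    q is a dict of dicts: q[state][action] -> estimated long-term reward.
    alpha (learning rate) and gamma (discount factor) are illustrative defaults.
    """
    # Best value the agent currently believes it can obtain from the next state.
    best_next = max(q[next_state].values()) if q.get(next_state) else 0.0
    # Move the current estimate part of the way toward the observed target.
    target = reward + gamma * best_next
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Hypothetical two-state environment, used only for illustration.
q_table = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 0.0, "right": 1.0},
}
updated = q_learning_update(q_table, "s0", "right", reward=0.0, next_state="s1")
print(updated)
```

No teacher supplies the correct action here: the estimate for ("s0", "right") rises only because the state it leads to already looks rewarding.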

When crunching data to model business decisions, you will most often use supervised and unsupervised learning methods. A hot topic at the moment is semi-supervised learning in areas such as image classification, where datasets are large but contain very few labeled examples. Reinforcement learning is more likely to turn up in robot control and other control systems.

Algorithm Similarity

Algorithms are often grouped by similarity in function or form, for example tree-based methods and neural-network-inspired methods. This is a useful grouping method, but it is imperfect. Some algorithms fit just as easily into multiple categories, such as learning vector quantization, which is both a neural-network-inspired method and an instance-based method.

There are also categories whose name describes both the problem and the class of algorithm, such as regression and clustering. As with machine learning models themselves, there is no perfect grouping, only one that is good enough.

Below we list some popular machine learning algorithms.

Regression

Regression is concerned with modeling the relationship between variables, and the model is refined iteratively using a measure of the error in its predictions. Regression methods come from statistics and have been absorbed into statistical machine learning. This may be confusing because "regression" can refer both to a class of problem and to a class of algorithm. Really, regression is a process. Some example algorithms are as follows:

Ordinary Least Squares
Logistic Regression
Stepwise Regression
Multivariate adaptive regression splines (MARS)
Locally estimated scatterplot smoothing (LOESS)
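
As a small illustration of fitting a relationship between variables by minimizing prediction error, the sketch below implements ordinary least squares for a single input variable. The function name fit_line and the sample data are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared errors."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Spread of x around its mean, and co-movement of x with y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data lying exactly on the line y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)
```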

Instance-Based Method

Instance-based learning models a decision problem with instances, or examples, of training data that are deemed important to the model. Such methods typically build a database of example data and compare new data against it using a similarity measure to find the best match and make a prediction. For this reason, instance-based methods are also called winner-take-all methods and memory-based learning. The focus is on how stored instances are represented and on the similarity measures used between them.

K-nearest neighbour (KNN)
Learning vector quantization (LVQ)
Self-Organizing Map (SOM)
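
The "compare new data against stored examples" idea can be sketched with a tiny k-nearest-neighbour classifier. This is an illustration under assumptions: Euclidean distance is used as the similarity measure, and the function name knn_predict and the sample points are invented for the example.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for query by majority vote among its k nearest examples.

    train is a list of (point, label) pairs; points are tuples of numbers.
    """
    # The "database" is simply the stored training examples, sorted by distance.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Illustrative stored instances forming two well-separated groups.
examples = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"),
]
print(knn_predict(examples, (0.5, 0.5)))
```

Note that there is no training step at all: the model is the stored data plus the distance function.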

Regularization Method

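Regularization is an extension made to another method (typically a regression method) that penalizes models for their complexity, favoring simpler models that also generalize better. The regularization methods listed below are popular, powerful, and generally simple modifications of other methods.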

Ridge Regression
Least absolute shrinkage and selection operator (LASSO)
Elastic net
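
To show how a complexity penalty changes a fitted model, here is a sketch of ridge regression in its simplest form: one input variable and no intercept, where the penalized least-squares solution has a closed form. The function name ridge_slope and the data are assumptions for illustration.

```python
def ridge_slope(xs, ys, penalty):
    """Slope w minimizing sum((y - w*x)**2) + penalty * w**2.

    Assumes a single feature and no intercept term, so the solution is
    sum(x*y) / (sum(x*x) + penalty).
    """
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + penalty)


xs, ys = [1, 2, 3], [2, 4, 6]
# With no penalty this is ordinary least squares; a larger penalty
# shrinks the slope toward zero.
for penalty in (0, 14, 100):
    print(penalty, ridge_slope(xs, ys, penalty))
```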

Decision Tree Learning

Decision tree methods build a model of decisions based on the actual values of attributes in the data. Decisions fork down the tree structure until a prediction is made for a given record. Decision trees are trained on data for classification and regression problems.

Classification and regression tree (CART)
Iterative Dichotomiser 3 (ID3)
C4.5
Chi-squared automatic interaction detection (CHAID)
Decision stump
Random Forest
Multivariate adaptive regression splines (MARS)
Gradient boosting machines (GBM)
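
The simplest tree in the list above is the decision stump, a tree with a single split. The sketch below fits one on a single numeric attribute by trying every observed value as a threshold and keeping the split with the fewest training errors. The names fit_stump and predict_stump, the error criterion, and the data are assumptions for illustration; full tree learners such as CART use other split criteria and recurse on each branch.

```python
def fit_stump(xs, labels):
    """Return (threshold, left_label, right_label) with the fewest training errors."""
    best = None
    classes = sorted(set(labels))
    for threshold in sorted(set(xs)):
        for left in classes:
            for right in classes:
                # Count records the candidate split would misclassify.
                errors = sum(
                    (left if x <= threshold else right) != y
                    for x, y in zip(xs, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, threshold, left, right)
    return best[1:]


def predict_stump(stump, x):
    """Follow the single decision: go left if x <= threshold, otherwise right."""
    threshold, left, right = stump
    return left if x <= threshold else right


stump = fit_stump([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1])
print(stump, predict_stump(stump, 2), predict_stump(stump, 11))
```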

Bayesian Method

Bayesian methods explicitly apply Bayes' theorem to problems such as classification and regression:

Naive Bayes
Averaged one-dependence estimators (AODE)
Bayesian Belief Network (BBN)
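
As a minimal worked instance of applying Bayes' theorem to classification, the sketch below computes the probability of a class given one piece of evidence in a two-class problem. The function name posterior and the spam-filter numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def posterior(prior, likelihood, likelihood_given_other):
    """P(class | evidence) for a two-class problem, via Bayes' theorem.

    prior                  = P(class)
    likelihood             = P(evidence | class)
    likelihood_given_other = P(evidence | other class)
    """
    # Total probability of seeing the evidence under either class.
    evidence = likelihood * prior + likelihood_given_other * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 20% of messages are spam, and a given word appears
# in 50% of spam messages but only 5% of other messages.
print(posterior(prior=0.2, likelihood=0.5, likelihood_given_other=0.05))
```

Naive Bayes applies the same rule with many pieces of evidence, under the assumption that they are independent given the class.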

Kernel Method

Kernel methods are best known through the popular support vector machine. They are concerned with mapping input data into a higher-dimensional vector space, where some classification or regression problems are easier to model.

Support Vector Machines (SVM)
Radial Basis Function (RBF)
Linear discriminant analysis (LDA)
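
The mapping into a higher-dimensional space does not have to be computed explicitly. The sketch below checks this for a degree-2 polynomial kernel on 2-D inputs: the kernel value equals the dot product of the explicitly mapped 3-D vectors. The function names poly_kernel and feature_map and the sample points are assumptions for illustration.

```python
import math


def poly_kernel(x, y):
    """Degree-2 polynomial kernel for 2-D inputs: (x . y) ** 2."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2


def feature_map(x):
    """Explicit 3-D feature map whose dot product reproduces poly_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


a, b = (1.0, 2.0), (3.0, 4.0)
# Dot product computed in the higher-dimensional space...
explicit = sum(p * q for p, q in zip(feature_map(a), feature_map(b)))
# ...matches the kernel evaluated directly on the original 2-D inputs.
implicit = poly_kernel(a, b)
print(explicit, implicit)
```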

Clustering Method

Clustering, like regression, names both a class of problem and a class of method. Clustering methods are typically organized by modeling approach, such as centroid-based and hierarchical. All of them use the inherent structure in the data to organize it into groups of maximum commonality.

K-means
Expectation maximisation (EM)
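
A centroid-based method can be sketched in a few lines. The k-means loop below alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. Passing the starting centroids in explicitly, the fixed iteration count, and the sample points are simplifying assumptions for this illustration.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Return (centroids, clusters) after a fixed number of k-means iterations."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster,
        # leaving it in place if the cluster is empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


data = [(0, 0), (0, 2), (10, 10), (10, 12)]
centres, groups = kmeans(data, centroids=[(0, 0), (10, 10)])
print(centres)
```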

Association Rule Learning

Association rule learning methods extract rules that best explain the observed relationships between variables in data. These rules can uncover important and commercially useful associations in large multi-dimensional datasets that an organization can exploit.

Apriori algorithm
Eclat Algorithm
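
Algorithms such as Apriori search for rules that score well on measures like support and confidence. The sketch below computes only those two measures for a single candidate rule; it is not an implementation of Apriori or Eclat, and the shopping-basket data and function names are invented for the example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(baskets, {"bread", "milk"}))
print(confidence(baskets, {"bread"}, {"milk"}))
```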

Artificial Neural Network

Artificial neural networks are models inspired by the structure and function of biological neural networks. They are a class of pattern-matching methods commonly used for regression and classification problems, but they really form an enormous subfield comprising hundreds of algorithms and variations for all manner of problem types. Some of the classically popular methods include:

Perceptron
Back-Propagation
Hopfield Network
Self-Organizing Map (SOM)
Learning vector quantization (LVQ)
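
The perceptron, the first method in the list above, is small enough to sketch in full. The version below learns the logical AND function with the classic error-driven weight update. The function names, the learning rate, and the epoch count are assumptions for illustration.

```python
def predict(weights, bias, x):
    """Fire (1) if the weighted sum of the inputs exceeds zero, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_perceptron(samples, epochs=10, rate=1.0):
    """Train on (inputs, target) pairs and return (weights, bias)."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            # The error is 0 when the prediction is right, +1 or -1 when wrong.
            error = target - predict(weights, bias, x)
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
            bias += rate * error
    return weights, bias


# Truth table of logical AND, which is linearly separable.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
print([predict(weights, bias, x) for x, _ in and_gate])
```

A single perceptron can only separate classes with a straight line (or hyperplane); back-propagation extends the same error-correcting idea to multi-layer networks.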

Deep Learning

Deep learning methods are a modern update to artificial neural networks that exploit abundant, cheap computation. They build much larger and more complex neural networks, and many methods address semi-supervised learning problems, where large datasets contain very little labeled data.

Restricted Boltzmann Machine (RBM)
Deep belief networks (DBN)
Convolutional Network
Stacked auto-encoders

Dimensionality Reduction Method

Like clustering methods, dimensionality reduction seeks and exploits the inherent structure in the data, in this case in an unsupervised manner, in order to summarize or describe the data using less information. This can be useful for visualizing high-dimensional data or for simplifying data that is then used by a supervised learning method.

Principal Component Analysis (PCA)
Partial Least Squares Regression (PLS)
Sammon Mapping
Multidimensional scaling (MDS)
Projection Pursuit
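
To illustrate describing data with less information, the sketch below finds the first principal component of a small dataset: the single direction along which the data varies most. It uses power iteration on the covariance matrix instead of a full eigendecomposition, which is a simplification chosen to keep the example free of external libraries. The function name, the seeded random starting vector, and the data are assumptions.

```python
import math
import random


def first_principal_component(points, iterations=100, seed=0):
    """Return a unit vector along the direction of greatest variance."""
    count, dim = len(points), len(points[0])
    means = [sum(p[j] for p in points) / count for j in range(dim)]
    centered = [[p[j] - means[j] for j in range(dim)] for p in points]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(row[i] * row[j] for row in centered) / (count - 1) for j in range(dim)]
        for i in range(dim)
    ]
    # Power iteration: repeatedly multiplying by the covariance matrix pulls
    # a random starting vector toward the dominant eigenvector.
    rng = random.Random(seed)
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(iterations):
        vec = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return vec


# Points lying on the line y = x: one direction captures all the variance.
component = first_principal_component([(0, 0), (1, 1), (2, 2), (3, 3)])
print(component)
```

Projecting each 2-D point onto this one direction would reduce the data to a single number per point. The sign of the returned vector is arbitrary.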

Ensemble Method

Ensemble methods are models composed of multiple weaker models that are trained independently and whose predictions are combined in some way to make the overall prediction. Much effort goes into deciding which types of weak learners to combine and how to combine them. This is a very powerful and popular class of techniques:

Boosting
Bootstrapped aggregation (bagging)
AdaBoost
Stacked generalization (blending)
Gradient boosting machines (GBM)
Random Forest

[Figure: the weak learners are shown in gray and the combined prediction in red; the data plotted is temperature/ozone.]

This article is an English version of an article originally written in Chinese on aliyun.com and is provided for information purposes only.
