In this article we will outline some popular machine learning algorithms.
There are many machine learning algorithms, and many of them have extensions of their own, so determining the best algorithm for a given problem is difficult. Below, we first classify the algorithms by learning style and then by similarity, to give an overall picture, and then describe the popular algorithms themselves.
I. Classification of algorithms based on learning patterns
Algorithms can be divided into different types based on how they handle experience, the environment, or whatever input data they are given. Machine learning and AI textbooks usually begin by considering the learning styles an algorithm can adopt.
Only a few major learning styles, or learning models, are discussed here, each with basic examples. This classification is useful because it forces you to think about the role of the input data and the process of preparing the model, so that you can choose the algorithm best suited to your problem.
- Supervised learning: The input data, called training data, has known results or labels, such as whether an email is spam, or a share price at a point in time. The model makes predictions; when a prediction is wrong it is corrected, and the process continues until the model reaches a required level of accuracy on the training data. Example problems include classification and regression; example algorithms include logistic regression and back-propagation neural networks.
- Unsupervised learning: The input data is unlabeled and has no known results. The model infers the structure present in the data. Example problems include association rule learning and clustering; example algorithms include the Apriori algorithm and the K-means algorithm.
- Semi-supervised learning: The input data is a mixture of labeled and unlabeled examples. There is a prediction problem to solve, but the model must also learn the structure and composition of the data. Example problems include classification and regression; example algorithms are essentially extensions of unsupervised learning algorithms.
- Reinforcement learning: The input data stimulates the model and the model responds to it. Feedback comes not from a supervised training process but from rewards or punishments in the environment. Example problems include robot control; example algorithms include Q-learning and temporal difference learning.
When data is consolidated to drive business decisions, supervised and unsupervised learning are the most common choices. A hot topic at the moment is semi-supervised learning, applied, for example, to image classification problems where a large dataset is available but only a few of the images are labeled. Reinforcement learning is mostly used in robot control and other control systems.
II. Similarity of machine learning algorithms
Algorithms are commonly grouped by function or form, for example tree-based algorithms or neural network algorithms. This is a very useful way of classifying them, but not a perfect one: many algorithms fall just as easily into two categories. Learning Vector Quantization, for instance, is both a neural network algorithm and an instance-based method. Just as there is no single perfect model in machine learning, there is no perfect classification of its algorithms.
III. Popular machine learning algorithms
Regression
Regression (regression analysis) is concerned with the relationships between variables and models them using statistical methods. Examples of regression algorithms include the following:
- Ordinary Least Squares
- Logistic Regression
- Stepwise Regression
- Multivariate Adaptive Regression Splines (MARS)
- Locally Estimated Scatterplot Smoothing (LOESS)
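As an illustration of the simplest of these, ordinary least squares for a single predictor has a closed-form solution. The sketch below (plain Python, names chosen for illustration) fits y = a + b·x:

```python
# Ordinary least squares for a single predictor: fit y = a + b*x by
# minimizing the sum of squared residuals (closed-form solution).
def ols_fit(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance(x, y) divided by variance(x)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return a, b

# Points lying exactly on y = 1 + 2x recover intercept 1 and slope 2.
a, b = ols_fit([0, 1, 2, 3], [1, 3, 5, 7])
```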
Instance-based Methods
Instance-based learning models a decision problem using stored instances or examples that are central to the model. The method builds a database of existing data, compares new data against it using a similarity measure, and finds the best match in order to make a prediction. For this reason it is also called the winner-take-all method or memory-based learning. Current work focuses on how the stored data is represented and how similarity is measured.
- K-nearest Neighbour (KNN)
- Learning Vector Quantization (LVQ)
- Self-organizing Map (SOM)
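The instance-based idea can be shown with a minimal k-nearest neighbour sketch in plain Python (the helper names here are illustrative, not from any library):

```python
from collections import Counter

# k-nearest neighbours: predict the majority label among the k training
# points closest (by squared Euclidean distance) to the query point.
def knn_predict(train, query, k=3):
    # train is a list of (point, label) pairs; point is a tuple of floats.
    dist = lambda p: sum((a - b) ** 2 for a, b in zip(p, query))
    nearest = sorted(train, key=lambda pl: dist(pl[0]))[:k]
    labels = [label for _, label in nearest]
    return Counter(labels).most_common(1)[0][0]

train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
```

Note that the whole training set is the model; prediction is a lookup plus a vote.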
Regularization Methods
These are extensions of other methods (usually regression methods) that penalize model complexity, favoring simpler models that generalize better. They are listed here because they are popular and powerful.
- Ridge Regression
- Least Absolute Shrinkage and Selection Operator (LASSO)
- Elastic Net
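As a minimal sketch of how regularization shrinks a model, the one-predictor, no-intercept form of ridge regression has a closed-form slope; the penalty weight `lam` (an illustrative name) pulls the slope toward zero:

```python
# Ridge regression for a single predictor with no intercept:
# minimizing sum (y - b*x)^2 + lam * b^2 gives b = sum(xy) / (sum(x^2) + lam).
def ridge_slope(xs, ys, lam):
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs, ys = [1, 2, 3], [2, 4, 6]          # exact slope 2 when lam = 0
b0 = ridge_slope(xs, ys, lam=0.0)      # ordinary least squares: 2.0
b1 = ridge_slope(xs, ys, lam=14.0)     # shrunk toward zero: 28 / 28 = 1.0
```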
Decision Tree Learning
The decision tree method builds a model of decisions based on the actual values of attributes in the data. Decision trees are used to solve classification and regression problems.
- Classification and Regression Tree (CART)
- Iterative Dichotomiser 3 (ID3)
- C4.5
- Chi-squared Automatic Interaction Detection (CHAID)
- Decision Stump
- Random Forest
- Multivariate Adaptive Regression Splines (MARS)
- Gradient Boosting Machines (GBM)
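A decision stump, the simplest entry above, can be fit exhaustively; this illustrative sketch tries every threshold on a single feature and keeps the one with the fewest misclassifications:

```python
# A decision stump: a one-level decision tree that picks the single
# threshold on a single feature minimizing training misclassifications.
def fit_stump(xs, ys):
    best = None
    for t in sorted(set(xs)):
        for below, above in ((0, 1), (1, 0)):
            preds = [below if x < t else above for x in xs]
            errors = sum(p != y for p, y in zip(preds, ys))
            if best is None or errors < best[0]:
                best = (errors, t, below, above)
    return best  # (errors, threshold, label_below, label_at_or_above)

# Labels flip at x = 3, so a stump with threshold 3 separates them perfectly.
errors, t, below, above = fit_stump([1, 2, 3, 4], [0, 0, 1, 1])
```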
Bayesian
The Bayesian approach applies Bayes' theorem to classification and regression problems.
- Naive Bayes
- Averaged One-Dependence Estimators (AODE)
- Bayesian Belief Network (BBN)
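A minimal naive Bayes sketch for categorical features (illustrative names, no smoothing) shows the core computation: pick the class maximizing the prior times the product of per-feature likelihoods:

```python
from collections import Counter

# Naive Bayes for categorical features: assume the features are
# independent given the class, and pick the class maximizing
# P(class) * product of P(feature_i | class).
def nb_predict(rows, labels, query):
    classes = Counter(labels)
    best_class, best_score = None, -1.0
    for c, count in classes.items():
        score = count / len(labels)               # prior P(c)
        for i, value in enumerate(query):
            matches = sum(1 for row, lab in zip(rows, labels)
                          if lab == c and row[i] == value)
            score *= matches / count              # likelihood P(x_i | c)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "hot")]
labels = ["no", "no", "yes", "yes"]
```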
Kernel Methods
The most famous kernel method is the support vector machine (SVM). Kernel methods map the input data into a higher-dimensional space in which some classification and regression problems become easier to model.
- Support Vector Machines (SVM)
- Radial Basis Function (RBF)
- Linear Discriminant Analysis (LDA)
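The kernel idea itself can be shown in a few lines: an RBF kernel scores the similarity of two points as if they had been mapped into a higher-dimensional space, without ever computing that mapping:

```python
import math

# RBF (Gaussian) kernel: similarity decays with squared distance, so
# identical points score 1.0 and distant points score near 0.
def rbf_kernel(x, y, gamma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

same = rbf_kernel((1.0, 2.0), (1.0, 2.0))   # identical points -> 1.0
far = rbf_kernel((0.0, 0.0), (3.0, 4.0))    # distance 5 -> exp(-25), near 0
```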
Clustering Methods
Clustering describes both a class of problems and a class of methods. Clustering methods are typically organized by their modeling approach. All of them use the inherent structure of the data to organize it into groups of maximum commonality.
- K-means
- Expectation Maximisation (EM)
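A minimal K-means sketch on one-dimensional data (illustrative, with fixed starting centroids) shows the two alternating steps: assign each point to its nearest centroid, then move each centroid to the mean of its points:

```python
# K-means on 1-D data: alternate assigning points to the nearest centroid
# and moving each centroid to the mean of its assigned points.
def kmeans_1d(points, centroids, iters=10):
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Keep a centroid in place if its cluster happens to be empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Two well-separated groups, around 1 and around 10.
centers = kmeans_1d([0, 1, 2, 9, 10, 11], [0.0, 11.0])
```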
Association Rule Learning
Association rule learning extracts rules that describe relationships between variables in data. These rules can uncover important connections in large, multidimensional datasets that an organization can exploit.
- Apriori algorithm
- Eclat algorithm
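The counting step at the heart of Apriori can be sketched as follows; this is only the pair-counting core (the full algorithm also prunes candidate itemsets using the support of their subsets):

```python
from itertools import combinations
from collections import Counter

# Count item pairs across transactions and keep those whose support
# (fraction of transactions containing the pair) meets a threshold.
def frequent_pairs(transactions, min_support):
    counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair for pair, c in counts.items() if c / n >= min_support}

transactions = [{"bread", "milk"}, {"bread", "milk", "eggs"},
                {"bread", "eggs"}, {"milk", "eggs"}]
pairs = frequent_pairs(transactions, min_support=0.5)
```

Each of the three possible pairs appears in two of the four transactions, so all three meet the 0.5 support threshold here.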
Artificial Neural Networks
Artificial neural networks are models inspired by the structure and function of biological neural networks. They are a form of pattern matching, often used for regression and classification problems, and comprise hundreds of algorithms and variants. Some of the classic and popular ones are listed here (deep learning is treated separately below):
- Perceptron
- Back-propagation
- Hopfield Network
- Self-organizing Map (SOM)
- Learning Vector Quantization (LVQ)
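The perceptron, the first entry above, can be trained in a few lines of plain Python: whenever an example is misclassified, nudge the weights toward it (an illustrative sketch for linearly separable data):

```python
# Perceptron learning rule: for each misclassified example (x, y) with
# y in {-1, +1}, update w <- w + lr*y*x and b <- b + lr*y.
def train_perceptron(data, epochs=20, lr=1.0):
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in data:
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# AND-like data: linearly separable, so the perceptron converges.
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
w, b = train_perceptron(data)
```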
Deep Learning
Deep learning methods are a modern update to artificial neural networks. Compared with traditional neural networks they have larger and more complex architectures, and many of the methods address semi-supervised learning problems, in which large datasets contain very little labeled data.
- Restricted Boltzmann Machine (RBM)
- Deep Belief Networks (DBN)
- Convolutional Network
- Stacked Auto-encoders
Dimensionality Reduction
Like clustering, dimensionality reduction seeks and exploits the inherent structure of the data, but it does so in order to summarize or describe the data using less information. This is useful for visualizing data or for simplifying it.
- Principal Component Analysis (PCA)
- Partial Least Squares Regression (PLS)
- Sammon Mapping
- Multidimensional Scaling (MDS)
- Projection Pursuit
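The core step of PCA can be sketched for 2-D data: center the points, form the covariance matrix, and use power iteration to find the direction of maximum variance (the first principal component). This is an illustrative sketch, not a full PCA:

```python
# First principal component of 2-D data via power iteration on the
# 2x2 covariance matrix.
def first_component(points, iters=100):
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    # Covariance matrix entries
    cxx = sum(x * x for x, _ in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    v = (1.0, 1.0)
    for _ in range(iters):
        v = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = (v[0] ** 2 + v[1] ** 2) ** 0.5
        v = (v[0] / norm, v[1] / norm)
    return v

# Points lying on the line y = x: the first component is (1,1)/sqrt(2).
v = first_component([(0, 0), (1, 1), (2, 2), (3, 3)])
```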
Ensemble Methods
Ensemble methods combine several weaker models that are trained independently and whose predictions are combined to form an overall prediction. Much of the work lies in choosing which models to combine and how to combine them. This is a very powerful and popular class of techniques.
- Bootstrapped Aggregation (Bagging)
- Stacked Generalization (blending)
- Random Forest
This is an example of fitting with an ensemble method (from Wikipedia): each individual fitted model is shown in gray, and the final combined result is shown in red.
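Bagging can be sketched in miniature: train several copies of a simple model on bootstrap resamples and average their predictions. Here each "model" is deliberately trivial; it just predicts the mean of its resample:

```python
import random

# Bootstrapped aggregation (bagging): fit many models on bootstrap
# resamples of the data and aggregate their predictions by averaging.
def bagged_mean(ys, n_models=50, seed=0):
    rng = random.Random(seed)
    preds = []
    for _ in range(n_models):
        sample = [rng.choice(ys) for _ in ys]    # resample with replacement
        preds.append(sum(sample) / len(sample))  # each model's prediction
    return sum(preds) / len(preds)               # aggregate by averaging

estimate = bagged_mean([1, 2, 3, 4, 5])  # close to the true mean of 3
```

In practice the base models are higher-variance learners such as decision trees, which is exactly the combination Random Forest builds on.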
Overview of popular machine learning algorithms