Regularization Algorithms
Ensemble Algorithms
Decision Tree Algorithms
Regression Algorithms
Artificial Neural Networks
Deep Learning
Support Vector Machines
Dimensionality Reduction Algorithms
Clustering Algorithms
Instance-based Algorithms
Bayesian Algorithms
Association Rule Learning Algorithms
Graph Models
1. Regularization Algorithms
Regularization is an extension of another method (usually a regression method) that penalizes models based on their complexity, preferring models that are relatively simple and therefore generalize better.
Examples:
Ridge Regression
Least Absolute Shrinkage and Selection Operator (LASSO)
GLASSO
Elastic Net
Least-Angle Regression
Advantages:
The penalty reduces overfitting
A solution always exists
Disadvantages:
The penalty can cause underfitting
Hard to calibrate
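As a rough sketch of how the ridge penalty works, the snippet below solves the penalized least-squares problem in closed form; the data, true coefficients, and alpha values are purely illustrative:

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge solution: w = (X^T X + alpha*I)^(-1) X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

# Synthetic data with invented true coefficients, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([1.0, 2.0, 0.0, 0.0, -1.0]) + 0.1 * rng.normal(size=50)

w_small = ridge_fit(X, y, alpha=0.01)   # weak penalty: close to plain OLS
w_large = ridge_fit(X, y, alpha=100.0)  # strong penalty: coefficients shrink
```

A larger alpha pulls the coefficients toward zero, trading a little bias for lower variance; this shrinkage is what curbs overfitting, and overdoing it is what causes the underfitting noted above.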
2. Ensemble Algorithms
An ensemble method combines several weaker models into a group: the models can be trained separately, and their predictions are combined in some way to produce an overall prediction.
The main questions are which weaker models to combine and how to combine them. This is a very powerful family of techniques and is therefore very popular.
Examples:
Boosting
Bootstrapped Aggregation(Bagging)
AdaBoost
Stacked Generalization (blending)
Gradient Boosting Machines (GBM)
Gradient Boosted Regression Trees (GBRT)
Random Forest
Advantages:
Ensembling is used in almost all of the most advanced prediction systems; it is usually much more accurate than the prediction of any single model.
Disadvantages:
Requires a lot of maintenance work
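The core bagging recipe (train base models on bootstrap resamples, then average their predictions) can be sketched in a few lines; the base learner, data, and ensemble size here are arbitrary choices for illustration:

```python
import numpy as np

# Invented noisy linear data: y = 2x + noise.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=200)
y = 2.0 * X + rng.normal(scale=0.5, size=200)

def fit_base_model(x, t):
    """Base learner: an ordinary least-squares line fit (slope, intercept)."""
    return np.polyfit(x, t, deg=1)

# Bagging: train each base model on a bootstrap resample of the data.
n_models = 25
models = []
for _ in range(n_models):
    idx = rng.integers(0, len(X), size=len(X))  # sample with replacement
    models.append(fit_base_model(X[idx], y[idx]))

def bagged_predict(x):
    """Overall prediction = average of the base models' predictions."""
    return np.mean([np.polyval(m, x) for m in models], axis=0)

pred = bagged_predict(np.array([0.5]))  # true value is about 2 * 0.5 = 1.0
```

Averaging over bootstrap resamples reduces the variance of the base learner; boosting methods like AdaBoost or GBM instead combine the models sequentially, each one focusing on the previous ones' errors.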
3. Decision Tree Algorithms
Decision tree learning uses a decision tree as a predictive model, mapping observations about an item (represented in the branches) to conclusions about the item's target value (represented in the leaves).
When the target variable takes a finite set of values, the model is called a classification tree; in these tree structures, the leaves represent class labels and the branches represent the conjunctions of features that lead to those class labels.
Examples:
Classification and Regression Tree (CART)
Iterative Dichotomiser 3 (ID3)
C4.5 and C5.0 (two different versions of a powerful method)
Advantages:
Easy to explain
Nonparametric
Disadvantages:
Tends to overfit
May get trapped in a local optimum
No online learning
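A minimal illustration of how a tree node is learned: the function below scans candidate thresholds on a single feature and picks the one that minimizes the weighted Gini impurity of the two branches, as CART does at each node (the feature values and labels are made up):

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(xs, ys):
    """Threshold on a single feature minimizing the weighted Gini
    impurity of the two resulting branches."""
    best = (None, float("inf"))
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

# Invented data where a clean split exists between 4 and 7.
xs = [1, 2, 3, 4, 7, 8, 9]
ys = ["a", "a", "a", "a", "b", "b", "b"]
threshold, impurity = best_split(xs, ys)
```

A full tree learner simply applies this search recursively to each branch until the leaves are pure enough; growing until purity with no pruning is exactly where the overfitting tendency comes from.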
4. Regression Algorithms
Regression is a statistical process for estimating the relationships among variables. It provides many techniques for modeling and analyzing the relationship between a dependent variable and one or more independent variables. Specifically, regression analysis helps us understand how the typical value of the dependent variable changes when any one of the independent variables varies while the others are held fixed. Most commonly, regression analysis estimates the conditional expectation of the dependent variable given the independent variables.
Regression algorithms are a mainstay of statistics and have been incorporated into statistical machine learning.
Examples:
Ordinary Least Squares Regression (OLSR)
Linear Regression
Logistic Regression
Stepwise Regression
Multivariate Adaptive Regression Splines (MARS)
Locally Estimated Scatterplot Smoothing (LOESS)
Advantages:
Direct and fast
Widely used and well understood
Disadvantages:
Strict assumptions
Requires careful handling of outliers
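A quick ordinary-least-squares sketch on synthetic data, using NumPy's least-squares solver; the true coefficients (slope 3, intercept 2) and the noise level are invented for the example:

```python
import numpy as np

# Synthetic data: y = 3x + 2 plus Gaussian noise (coefficients are illustrative).
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 2.0 + rng.normal(scale=0.5, size=100)

# OLS: stack a column of ones so the intercept is fit alongside the slope.
A = np.column_stack([x, np.ones_like(x)])
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
```

The fitted line approximates the conditional expectation E[y | x] described above; the "strict assumptions" (linearity, independent homoscedastic errors) are what justify reading the coefficients this way.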
5. Artificial Neural Networks
Artificial neural networks are algorithmic models inspired by biological neural networks.
They perform pattern matching and are commonly used for regression and classification problems, but the family comprises hundreds of algorithms and variants for all manner of problems.
Examples:
Perceptron
Backpropagation
Hopfield network
Radial Basis Function Network (RBFN)
Advantages:
Excellent performance on speech, language, vision, and various games (such as Go)
Can be adapted quickly to new problems
Disadvantages:
Requires a large amount of training data
Demanding hardware requirements
The model is a "black box", and its internal mechanisms are difficult to interpret
Hyperparameter and network topology selection is difficult
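The perceptron listed above, the simplest unit in this family, can be sketched with its classic error-driven update rule; the AND-gate data, learning rate, and epoch count are illustrative choices:

```python
# Perceptron learning rule on a linearly separable problem (the AND gate).
def perceptron_train(samples, labels, epochs=20, lr=0.1):
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in zip(samples, labels):
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - pred
            # Update the weights only when the prediction is wrong.
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]  # logical AND
w, b = perceptron_train(samples, labels)
preds = [1 if w[0] * x1 + w[1] * x2 + b > 0 else 0 for x1, x2 in samples]
```

A single perceptron can only learn linearly separable functions; stacking layers of such units and training them with backpropagation is what gives multilayer networks their power.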
6. Deep Learning
Deep learning is the latest branch of artificial neural networks and benefits from the rapid development of contemporary hardware.
Much current research focuses on building larger and more complex neural networks, and many methods target semi-supervised learning problems, where large training data sets contain only a few labels.
Examples:
Deep Boltzmann Machine (DBM)
Deep Belief Networks (DBN)
Convolutional Neural Network (CNN)
Stacked Auto-Encoders
Advantages / Disadvantages: see neural network
7. Support Vector Machines (SVM)
Given a set of training examples, each marked as belonging to one of two categories, a support vector machine (SVM) training algorithm builds a non-probabilistic binary linear classifier that assigns new examples to one category or the other.
The SVM model represents the examples as points in space, mapped so that the examples of the two categories are separated by a clear gap that is as wide as possible.
New examples are then mapped into the same space and predicted to belong to a category based on which side of the gap they fall on.
Advantages:
Excellent performance on nonlinearly separable problems (via kernel functions)
Disadvantages:
Very difficult to train
Hard to interpret
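As a simplified sketch of the max-margin idea, the snippet below trains a linear SVM by subgradient descent on the hinge loss; production solvers (SMO and friends) are far more sophisticated, and the blob data and hyperparameters here are arbitrary:

```python
import numpy as np

def svm_train(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM via subgradient descent on the regularized hinge loss
    (a simplified primal solver, for illustration only)."""
    rng = np.random.default_rng(0)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:   # point inside the margin: push it outward
                w += lr * (y[i] * X[i] - lam * w)
                b += lr * y[i]
            else:            # otherwise only apply the regularization shrinkage
                w -= lr * lam * w
    return w, b

# Two invented, well-separated blobs; labels are -1 / +1.
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(-2, 0.5, size=(20, 2)),
               rng.normal(2, 0.5, size=(20, 2))])
y = np.array([-1] * 20 + [1] * 20)
w, b = svm_train(X, y)
preds = np.sign(X @ w + b)
```

The hinge loss penalizes points that fall inside the margin, so minimizing it widens the gap described above; replacing the dot products with a kernel function is what extends this to nonlinearly separable data.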
8. Dimensionality Reduction Algorithms
Like clustering methods, dimensionality reduction seeks and exploits the inherent structure of the data, in this case in order to summarize or describe the data using less information.
These algorithms can be used to visualize high-dimensional data or to simplify data for subsequent supervised learning. Many of them can be adapted for classification and regression.
Examples:
Principal Component Analysis (PCA)
Principal Component Regression (PCR)
Partial Least Squares Regression (PLSR)
Sammon Mapping
Multidimensional Scaling (MDS)
Projection Pursuit
Linear Discriminant Analysis (LDA)
Mixed Discriminant Analysis (MDA)
Quadratic Discriminant Analysis (QDA)
Flexible Discriminant Analysis (FDA)
Advantages:
Can handle large data sets
No need to make assumptions on the data
Disadvantages:
Difficult to handle nonlinear data
Difficult to interpret the meaning of the results
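A bare-bones PCA sketch: center the data, eigendecompose the covariance matrix, and project onto the leading eigenvectors. The synthetic 3-D data, which varies mostly along a single direction, is invented for illustration:

```python
import numpy as np

def pca(X, k):
    """Project X onto its top-k principal components
    (the leading eigenvectors of the covariance matrix)."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    components = eigvecs[:, ::-1][:, :k]    # top-k directions by variance
    return Xc @ components, eigvals[::-1]

# Invented 3-D data that really varies along one direction plus small noise.
rng = np.random.default_rng(7)
t = rng.normal(size=(200, 1))
X = np.hstack([t, 2 * t, 0.5 * t]) + 0.05 * rng.normal(size=(200, 3))
Z, variances = pca(X, k=1)
```

Here one component captures nearly all of the variance, so the three columns can be summarized by a single score per point, which is exactly the "less information" trade described above. The "difficult with nonlinear data" caveat applies because this projection is linear.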
9. Clustering Algorithms
Clustering groups a set of objects so that objects in the same group (a cluster) are more similar to each other, in some sense, than to objects in other groups.
Examples:
K-means
k-Medians algorithm
Expectation Maximization (EM)
Hierarchical Clustering
Advantages:
Make data meaningful
Disadvantages:
Results can be difficult to interpret and may be useless on atypical data sets
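K-means, the first example above, can be sketched as Lloyd's algorithm: alternate between assigning points to the nearest centroid and recomputing each centroid as the mean of its points. The two-blob data and fixed iteration count are illustrative simplifications:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Two invented, well-separated blobs of 30 points each.
rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 0.3, size=(30, 2)),
               rng.normal(5, 0.3, size=(30, 2))])
labels, centers = kmeans(X, k=2)
```

Because the objective is non-convex and the result depends on the random initialization, the partition found on messier data can be exactly the kind of hard-to-interpret result the disadvantage above warns about.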
10. Instance-based Algorithms
Instance-based algorithms (sometimes called memory-based learning) do not perform explicit generalization; instead, they compare new problem instances with instances seen during training, which are stored in memory.
They are called instance-based because they construct their hypotheses directly from the training instances. This means the complexity of the hypothesis can grow with the data: in the worst case, the hypothesis is the list of all training items, and classifying a single new instance has computational complexity O(n).
Examples:
k-Nearest Neighbors (kNN)
Learning Vector Quantization (LVQ)
Self-Organizing Map (SOM)
Locally Weighted Learning (LWL)
Advantages:
Simple algorithm and easy to interpret results
Disadvantages:
Very high memory usage
High computational cost at prediction time
Impractical in high-dimensional feature spaces
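A minimal kNN sketch: classify a query point by majority vote among its k nearest stored training points; the toy coordinates and labels are invented:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)       # distance to every stored point
    nearest = y_train[np.argsort(dists)[:k]]          # labels of the k closest
    values, counts = np.unique(nearest, return_counts=True)
    return values[counts.argmax()]                    # majority vote

# Invented training set: two clusters with string labels.
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y_train = np.array(["red", "red", "red", "blue", "blue", "blue"])
label = knn_predict(X_train, y_train, np.array([0.2, 0.1]))
```

There is no training step at all, which is the memory-based trade-off: every prediction scans the full training set, hence the high memory use and prediction-time cost listed above.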
11. Bayesian Algorithms
Bayesian methods explicitly apply Bayes' theorem to problems such as classification and regression.
Examples:
Naive Bayes
Gaussian Naive Bayes
Multinomial Naive Bayes
Averaged One-Dependence Estimators (AODE)
Bayesian Belief Network (BBN)
Bayesian Network (BN)
Advantages:
Fast and easy to train, and can give good performance with relatively little data
Disadvantages:
Problems arise if the input variables are correlated
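A compact Gaussian Naive Bayes sketch: fit per-class feature means and variances, then score each class by its log prior plus the summed log likelihoods, under the assumption that features are conditionally independent given the class (the data is invented):

```python
import numpy as np

def gnb_fit(X, y):
    """Per-class feature means, variances, and class priors."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), Xc.var(axis=0) + 1e-9, len(Xc) / len(X))
    return params

def gnb_predict(params, x):
    """Pick the class maximizing log prior + sum of log Gaussian likelihoods."""
    best, best_score = None, -np.inf
    for c, (mu, var, prior) in params.items():
        score = np.log(prior) - 0.5 * np.sum(
            np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
        if score > best_score:
            best, best_score = c, score
    return best

# Invented two-class data; the query point sits near class 0.
X = np.array([[1.0, 1.2], [0.9, 1.0], [1.1, 0.8],
              [4.0, 4.2], [4.1, 3.9], [3.8, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])
params = gnb_fit(X, y)
pred = gnb_predict(params, np.array([1.0, 1.0]))
```

The per-feature factorization is what makes training so fast, and it is also why correlated input variables cause trouble: the independence assumption double-counts their shared evidence.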
12. Association Rule Learning Algorithms
Association rule learning methods extract rules that best explain the relationships between variables in data. For example, the rule {onion, potato} => {hamburger} found in a supermarket's sales data indicates that a customer who buys onions and potatoes together is also likely to buy hamburger.
Examples:
Apriori algorithm
Eclat algorithm
FP-growth
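The support and confidence computations that Apriori-style algorithms are built on can be sketched directly on the onion-potato-hamburger example; the transactions and the 0.4 minimum-support threshold are invented:

```python
from itertools import combinations

# Toy transactions echoing the {onion, potato} => {hamburger} rule above.
transactions = [
    {"onion", "potato", "burger"},
    {"onion", "potato", "burger", "milk"},
    {"onion", "potato"},
    {"milk", "bread"},
    {"potato", "burger"},
]

def support(itemset):
    """Fraction of transactions containing every item in the set."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Among transactions with the antecedent, the fraction
    that also contain the consequent."""
    return support(antecedent | consequent) / support(antecedent)

# Frequent pairs at a made-up minimum support of 0.4, Apriori-style.
items = sorted({i for t in transactions for i in t})
frequent_pairs = [set(p) for p in combinations(items, 2)
                  if support(set(p)) >= 0.4]

conf = confidence({"onion", "potato"}, {"burger"})  # strength of the rule
```

Apriori's key optimization is pruning: a set can only be frequent if all of its subsets are, so candidate itemsets are grown level by level from the frequent pairs shown here.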
13. Graph Models
A probabilistic graphical model (PGM) is a probabilistic model in which a graph expresses the conditional dependence structure between random variables.
Examples:
Bayesian network
Markov random field
Chain Graphs
Ancestral graph
Advantage:
The model is clear and can be intuitively understood
Disadvantages:
Determining the topology of its dependencies is difficult and sometimes ambiguous.
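A two-node sketch of the idea: a hypothetical Rain -> WetGrass model, where the graph encodes that WetGrass depends only on Rain, and inference is just marginalization plus Bayes' theorem (all probabilities are made up):

```python
# Hypothetical conditional probability tables for the Rain -> WetGrass edge.
p_rain = 0.2
p_wet_given_rain = 0.9
p_wet_given_dry = 0.1

# Marginalize over the parent node to get P(WetGrass).
p_wet = p_wet_given_rain * p_rain + p_wet_given_dry * (1 - p_rain)

# Invert the edge with Bayes' theorem:
# P(Rain | WetGrass) = P(WetGrass | Rain) * P(Rain) / P(WetGrass)
p_rain_given_wet = p_wet_given_rain * p_rain / p_wet
```

The graph's payoff is that the joint distribution factorizes along its edges, so large networks need far fewer parameters than a full joint table; the hard part, as noted above, is choosing the edges in the first place.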