Machine learning
Machine learning (ML) is an interdisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, computational complexity theory, and other disciplines. It studies how computers can simulate or realize human learning behavior in order to acquire new knowledge or skills and to reorganize existing knowledge structures so that they continuously improve their own performance.
Strict definition: machine learning is the study of how a machine acquires new knowledge and skills and recognizes existing knowledge. The "machine" here refers to a computer: an electronic computer, a neutron computer, a photonic computer, a neural computer, and so on.
Introduction to Machine learning
As shown in the figure, machine learning is divided into four areas: classification, clustering, regression, and dimensionality reduction.
Classification & Regression
To give a simple example:
Given a sample feature x, we want to predict its corresponding property value y. If y is discrete, this is a classification problem; conversely, if y is a continuous real number, this is a regression problem.
Given a set of sample features S = {x ∈ R^d} with no corresponding y, we may instead want to explore how the samples are distributed in the d-dimensional space, for example which samples lie close together and which lie far apart. This is a clustering problem.
If we want to represent the original high-dimensional feature space with a lower-dimensional subspace, this is a dimensionality reduction problem.
Whether the task is classification or regression, the goal is to build a predictive model h that, given an input x, produces an output y:
y = h(x)
The only difference is that y is discrete in a classification problem and continuous in a regression problem, so the learning algorithms for the two kinds of problems are very similar. This is why, in the figure, the learning algorithms used for classification also appear under regression. Common learning algorithms for classification include SVM (support vector machine), SGD (stochastic gradient descent), Bayes (Bayesian estimation), ensembles, and KNN (k-nearest neighbors). Regression problems can likewise use SVR (support vector regression), SGD, ensembles, and other algorithms, as well as linear regression methods.
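To illustrate how similar the two kinds of learners can be, here is a minimal sketch of k-nearest neighbors in plain Python. The function names (`nearest_targets`, `knn_classify`, `knn_regress`) and the toy data are assumptions made for illustration and do not come from the original article. Both predictors share the same neighbor search and differ only in how they combine the neighbors' y values.

```python
from collections import Counter

def nearest_targets(train, x, k):
    """Return the y values of the k training samples closest to x (1-D features)."""
    ranked = sorted(train, key=lambda pair: abs(pair[0] - x))
    return [y for _, y in ranked[:k]]

def knn_classify(train, x, k=3):
    """Classification: y is discrete, so take a majority vote."""
    return Counter(nearest_targets(train, x, k)).most_common(1)[0][0]

def knn_regress(train, x, k=3):
    """Regression: y is continuous, so average the neighbors' values."""
    neighbors = nearest_targets(train, x, k)
    return sum(neighbors) / len(neighbors)

# Hypothetical toy data: (feature, target) pairs.
labeled = [(1.0, "small"), (1.5, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]
valued = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (8.0, 80.0), (9.0, 90.0)]

print(knn_classify(labeled, 1.2))  # a discrete label
print(knn_regress(valued, 2.0))    # a continuous value
```

The shared helper is the point of the sketch: only the final aggregation step changes between the discrete and the continuous case.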
Clustering
Clustering also analyzes the properties of samples and is somewhat similar to classification. The difference is that classification knows the range of y before prediction, that is, how many categories there are, whereas clustering does not know the range of the property in advance. For this reason classification is an example of supervised learning, and clustering is an example of unsupervised learning.
Because clustering does not know the attribute range of the samples beforehand, it can only analyze their properties from how they are distributed in the feature space. This problem is generally more complex. Commonly used algorithms include k-means and GMM (Gaussian mixture model).
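As a rough illustration of analyzing samples purely by their distribution in feature space, the following is a minimal k-means sketch in plain Python. The helper names, the toy points, and the explicitly supplied starting centers are assumptions for illustration; real implementations usually pick the starting centers randomly or with a seeding heuristic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, centers, iterations=10):
    """Alternate between assigning each point to its nearest center and
    moving each center to the mean of the points assigned to it."""
    centers = list(centers)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [mean_point(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical data with two visible groups and no labels.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, clusters = kmeans(points, centers=[(0, 0), (5, 5)])
print(centers)
```

Note that no y values are used anywhere: the grouping comes only from distances between samples.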
Dimensionality Reduction
Dimensionality reduction is another important field of machine learning with many important applications. When the dimension of the features is too high, training becomes more expensive and requires more storage. Dimensionality reduction aims to remove redundancy from the features and represent them with fewer dimensions. The most fundamental dimensionality reduction algorithm is PCA (principal component analysis), and many other algorithms build on it.
Common algorithms for machine learning
There are many algorithms and models involved in machine learning, and some common algorithms are selected here:
- Regularization algorithms
- Ensemble algorithms
- Decision tree algorithms
- Regression
- Artificial neural networks
- Deep learning
- Support vector machines (SVM)
- Dimensionality reduction algorithms
- Clustering algorithms
- Instance-based algorithms
- Bayesian algorithms
- Association rule learning algorithms
- Graphical models
Regularization algorithms
A regularization algorithm is an extension of another method (usually a regression method) that penalizes a model according to its complexity, favoring models that are relatively simple and that generalize better.
With regularization we keep all the feature variables but reduce the magnitude of the parameter values θ_j. This works well when there are many feature variables, each of which contributes a little to the prediction.
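The shrinking effect can be sketched with ridge regression (an L2 penalty) fitted by gradient descent. This is a minimal illustration in plain Python, not the article's own implementation: the function name `ridge_gd`, the step size, the number of steps, and the toy data are all assumptions. The cost being minimized is (1/2n) * [sum of squared errors + lam * sum of θ_j squared], with no intercept term.

```python
def ridge_gd(X, y, lam, learning_rate=0.01, steps=5000):
    """Fit linear weights theta by gradient descent on the ridge cost
    (1 / 2n) * (sum of squared errors + lam * sum of theta_j ** 2)."""
    n, d = len(X), len(X[0])
    theta = [0.0] * d
    for _ in range(steps):
        errors = [sum(t * x for t, x in zip(theta, row)) - target
                  for row, target in zip(X, y)]
        gradient = [
            (sum(e * row[j] for e, row in zip(errors, X)) + lam * theta[j]) / n
            for j in range(d)
        ]
        theta = [t - learning_rate * g for t, g in zip(theta, gradient)]
    return theta

# Hypothetical data generated from y = 1 * x1 + 2 * x2.
X = [[1, 2], [2, 1], [3, 4], [4, 3]]
y = [5, 4, 11, 10]

plain = ridge_gd(X, y, lam=0.0)    # no penalty
shrunk = ridge_gd(X, y, lam=10.0)  # penalized
print(plain, shrunk)
```

With the penalty switched on, both parameters stay in the model but their overall magnitude is smaller than in the unpenalized fit.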
Example algorithms:
- Ridge regression
- Least absolute shrinkage and selection operator (LASSO)
- GLASSO
- Elastic net
- Least-angle regression (LARS)
Ensemble algorithms
An ensemble method combines a group of several weaker models. The models can be trained separately, and their predictions are combined in some way to make an overall prediction. This kind of algorithm is also called a meta-algorithm. The two most common ensemble ideas are bagging and boosting.
Boosting
Boosting builds new classifiers that focus on the samples misclassified by the existing classifiers, and combines them to improve overall classifier performance.
Bagging
Bagging constructs classifiers from random resamples of the data.
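Bagging can be sketched in a few lines of plain Python. In this illustration the weak model is a one-dimensional decision stump, and all names (`train_stump`, `bagging_fit`, `bagging_predict`), the number of models, the seed, and the toy data are assumptions rather than anything taken from the article. Each stump is trained on a bootstrap resample (drawn with replacement), and the ensemble predicts by majority vote.

```python
import random
from collections import Counter

def train_stump(data):
    """Find the threshold and label orientation with the fewest errors.
    data is a list of (x, label) pairs with labels 0 or 1."""
    best = None
    for threshold in sorted({x for x, _ in data}):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= threshold else right) != label
                         for x, label in data)
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    return best[1:]

def predict_stump(stump, x):
    threshold, left, right = stump
    return left if x <= threshold else right

def bagging_fit(data, n_models=25, seed=0):
    """Train each stump on a bootstrap resample of the data."""
    rng = random.Random(seed)
    return [train_stump([rng.choice(data) for _ in data])
            for _ in range(n_models)]

def bagging_predict(models, x):
    """Combine the individual predictions by majority vote."""
    votes = Counter(predict_stump(m, x) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical data: small x values are class 0, large ones class 1.
data = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
models = bagging_fit(data)
print(bagging_predict(models, 1), bagging_predict(models, 9))
```

Boosting would differ in the resampling step: instead of drawing samples uniformly, later models would give more weight to the samples earlier models got wrong.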
Example algorithms:
- Boosting
- Bootstrap aggregation (bagging)
- AdaBoost
- Stacked generalization (blending)
- Gradient boosting machines (GBM)
- Gradient boosted regression trees (GBRT)
- Random forest
Summary: Almost all state-of-the-art predictions today rely on ensembles, which are usually considerably more accurate than predictions from a single model. However, ensembles require a lot of maintenance work.
Decision tree algorithms
Decision tree learning uses a decision tree as a predictive model that maps observations about an item (represented in the branches) to a conclusion about the item's target value (represented in the leaves).
A decision tree classifies an instance by sorting it down the tree from the root node to a leaf node, and the leaf node gives the category to which the instance belongs. Each node in the tree specifies a test of some attribute of the instance, and each branch descending from that node corresponds to one of the possible values of the attribute. To classify an instance, start at the root node, test the attribute specified by that node, and move down the branch that corresponds to the instance's value for that attribute. This process is then repeated for the subtree rooted at the new node.
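The traversal just described can be sketched as follows. The tree here is hand-written rather than learned, and its attributes (`outlook`, `humidity`, `wind`), its values, and the dictionary layout are hypothetical choices made only to show the root-to-leaf walk.

```python
# Internal nodes are dicts that name an attribute to test and hold one
# branch per attribute value; leaves are plain category labels.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "no", "normal": "yes"},
        },
        "overcast": "yes",
        "rain": {
            "attribute": "wind",
            "branches": {"strong": "no", "weak": "yes"},
        },
    },
}

def classify(node, instance):
    """Walk from the root to a leaf, following at each node the branch
    that matches the instance's value for the tested attribute."""
    while isinstance(node, dict):
        value = instance[node["attribute"]]
        node = node["branches"][value]
    return node

example = {"outlook": "sunny", "humidity": "normal", "wind": "weak"}
print(classify(tree, example))
```

Learning algorithms such as ID3 or CART differ in how they build such a tree from data; the classification step stays the same.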
Example algorithms:
- Classification and regression tree (CART)
- Iterative Dichotomiser 3 (ID3)
- C4.5 and C5.0 (two different versions of a powerful method)
Regression algorithms
Regression is a statistical process for estimating the relationships between variables. When used to analyze the relationship between a dependent variable and several independent variables, it offers many techniques for modeling and analyzing multiple variables. In particular, regression analysis helps us understand how the typical value of the dependent variable changes when any one of the independent variables varies while the other independent variables are held fixed. Most commonly, regression analysis estimates the conditional expectation of the dependent variable given the independent variables.
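For a single independent variable, ordinary least squares has a simple closed form, sketched below in plain Python. The function name `fit_line` and the sample data are assumptions for illustration. The fitted line estimates the conditional expectation of y given x under a linear model.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

The slope answers the question raised above: it is the change in the expected value of y for a one-unit change in x.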
Example algorithms:
- Ordinary least squares regression (OLSR)
- Linear regression
- Logistic regression
- Stepwise regression
- Multivariate adaptive regression splines (MARS)
- Locally estimated scatterplot smoothing (LOESS)
Artificial neural networks
An artificial neural network is an algorithmic model inspired by biological neural networks. It is a form of pattern matching that is often used for regression and classification problems, and it forms a large subfield consisting of hundreds of algorithms and variants for all kinds of problems.
Artificial neural networks (ANN) provide a general and practical way to learn real-valued, discrete-valued, or vector-valued functions from examples. An artificial neural network is composed of a set of simple units, each of which takes a number of real-valued inputs and produces a single real-valued output.
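A single such unit can be sketched as a perceptron. The code below is a minimal illustration in plain Python: the function names, the learning rate, the epoch count, and the choice of the logical AND function as training data are assumptions, not part of the article.

```python
def unit_output(weights, bias, inputs):
    """One simple unit: several real-valued inputs, one real-valued output."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 if activation > 0 else 0.0

def train_perceptron(samples, epochs=50, learning_rate=1.0):
    """Perceptron rule: nudge the weights toward the target whenever the
    unit's output is wrong."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - unit_output(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Hypothetical training set: the logical AND of two inputs.
and_samples = [((0, 0), 0.0), ((0, 1), 0.0), ((1, 0), 0.0), ((1, 1), 1.0)]
weights, bias = train_perceptron(and_samples)
print([unit_output(weights, bias, x) for x, _ in and_samples])
```

A single unit can only separate classes with a straight line; networks chain many units together, typically trained with backpropagation, to represent more complex functions.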
Example algorithms:
- Perceptron
- Backpropagation
- Hopfield network
- Radial basis function network (RBFN)
Deep learning
Deep learning is the newest branch of artificial neural networks and benefits from the rapid development of modern hardware.
Many researchers are now focused on building larger and more complex neural networks, and many methods concentrate on semi-supervised learning, where large training datasets contain only a small amount of labeled data.
Example algorithms:
- Deep Boltzmann machine (DBM)
- Deep belief networks (DBN)
- Convolutional neural network (CNN)
- Stacked auto-encoders
Support vector machines
A support vector machine (SVM) is a supervised learning method that is widely used for statistical classification and regression analysis. Support vector machines are generalized linear classifiers and can also be viewed as a special case of Tikhonov regularization. This family of classifiers is characterized by simultaneously minimizing the empirical error and maximizing the geometric margin, so the support vector machine is also called a maximum margin classifier.
Given a set of training examples, each labeled as belonging to one of two categories, the SVM training algorithm builds a model that assigns a new example to one of the two categories, making it a non-probabilistic binary linear classifier.
The SVM model represents the training examples as points in space, mapped so that the examples of the two categories are separated by a clear gap that is as wide as possible.
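One simple way to approximate such a classifier is to minimize the regularized hinge loss with subgradient descent. The sketch below does this in plain Python for a linear SVM; it is an illustrative stand-in rather than the quadratic-programming solvers commonly used in practice, and the function names, hyperparameters (`lam`, `learning_rate`, `epochs`), and toy data are assumptions.

```python
def train_linear_svm(samples, lam=0.01, learning_rate=0.1, epochs=200):
    """Full-batch subgradient descent on
    lam / 2 * ||w||^2 + mean(max(0, 1 - y * (w . x + b))),
    where every label y is +1 or -1."""
    dim = len(samples[0][0])
    n = len(samples)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for x, y in samples:
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1:  # inside the margin or misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad_w)]
        b -= learning_rate * grad_b
    return w, b

def svm_predict(w, b, x):
    """Assign a new example to one of the two categories."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical, clearly separated two-class data.
samples = [((2, 2), 1), ((3, 3), 1), ((3, 2), 1),
           ((-2, -2), -1), ((-3, -3), -1), ((-2, -3), -1)]
w, b = train_linear_svm(samples)
print(svm_predict(w, b, (4, 4)), svm_predict(w, b, (-4, -4)))
```

The hinge term penalizes points that fall inside the gap, while the `lam` term keeps the weights small, which corresponds to keeping the gap wide.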
Dimensionality reduction algorithms
Dimensionality reduction maps the data points from the original high-dimensional space into a lower-dimensional space using some mapping method. In essence, it learns a mapping function f: x -> y, where x is the representation of the original data point, most commonly a vector, and y is the low-dimensional vector representation of the data point after mapping. The dimension of y is usually smaller than that of x (although increasing the dimension is also possible). The function f may be explicit or implicit, linear or nonlinear.
These algorithms can be used to visualize high-dimensional data or to simplify data that is then used for supervised learning. Many of these methods can be adapted for use in classification and regression.
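As an example of an explicit, linear mapping f, the sketch below projects points onto their first principal component, which is the core step of PCA. It is written in plain Python and finds the direction by power iteration on the covariance matrix; the function name, the iteration count, the all-ones starting vector, and the sample points are assumptions for illustration. Power iteration would fail if the starting vector happened to be orthogonal to the leading direction, which is not the case for this data.

```python
import math

def pca_first_component(points, iterations=200):
    """Return the leading principal direction v and the 1-D projection of
    each centered point onto v."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    projected = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, projected

# Hypothetical 2-D points lying close to the line y = x.
points = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.9)]
direction, projected = pca_first_component(points)
print(direction, projected)
```

Here f(x) is the dot product of the centered point with `direction`, so each two-dimensional x is mapped to a single number y.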
Example algorithms:
- Principal component analysis (PCA)
- Principal component regression (PCR)
- Partial least squares regression (PLSR)
- Sammon mapping
- Multidimensional scaling (MDS)
- Projection pursuit
- Linear discriminant analysis (LDA)
- Mixture discriminant analysis (MDA)
- Quadratic discriminant analysis (QDA)
- Flexible discriminant analysis (FDA)
Clustering algorithms
Clustering groups a set of objects so that objects in the same group (that is, a class or cluster) are more similar to each other than to objects in other groups.
Its advantage is that it gives the data meaningful structure. Its disadvantage is that the results can be hard to interpret, and for some datasets the results may be useless.
Example algorithms:
- k-means
- k-medians
- Expectation maximization (EM)
- Hierarchical clustering
Bayesian algorithms
Bayes' theorem is a theorem in probability theory that relates the conditional and marginal probability distributions of random variables. Under some interpretations of probability, Bayes' theorem (Bayesian updating) tells us how to use new evidence to revise an existing belief. Bayesian methods are methods that explicitly apply Bayes' theorem to problems such as classification and regression.
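The update can be written as P(H | E) = P(E | H) * P(H) / P(E), where P(E) = P(E | H) * P(H) + P(E | not H) * P(not H). The short Python sketch below applies it to made-up numbers; the function name `bayes_update` and the probabilities are assumptions chosen only to illustrate the arithmetic.

```python
def bayes_update(prior, likelihood, likelihood_if_false):
    """Return P(H | E) given P(H), P(E | H) and P(E | not H)."""
    evidence = likelihood * prior + likelihood_if_false * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: a hypothesis with a 1% prior, evidence that appears
# 90% of the time when it is true and 5% of the time when it is false.
posterior = bayes_update(prior=0.01, likelihood=0.9, likelihood_if_false=0.05)
print(posterior)

# Seeing the same kind of evidence again revises the belief further.
posterior_again = bayes_update(posterior, 0.9, 0.05)
print(posterior_again)
```

Feeding the posterior back in as the next prior is what "Bayesian updating" refers to: each new piece of evidence revises the current belief.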
Example algorithms:
- Naive Bayes
- Gaussian naive Bayes
- Multinomial naive Bayes
- Averaged one-dependence estimators (AODE)
- Bayesian belief network (BBN)
- Bayesian network (BN)
Association rule learning algorithms
Association rule learning extracts rules that best explain the relationships between variables in the data. For example, if the sales data of a supermarket contains the rule {onion, potato} => {hamburger}, it means that a customer who buys onions and potatoes together is also likely to buy hamburger. It is somewhat similar to association analysis.
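How strongly a rule such as {onion, potato} => {hamburger} holds is usually measured with support and confidence. The sketch below computes both in plain Python over a handful of invented transactions; the transaction data and function names are assumptions for illustration.

```python
# Hypothetical shopping baskets.
transactions = [
    {"onion", "potato", "hamburger"},
    {"onion", "potato", "hamburger", "milk"},
    {"onion", "potato"},
    {"milk", "bread"},
    {"potato", "hamburger"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated probability of the consequent given the antecedent."""
    return (support(antecedent | consequent, transactions)
            / support(antecedent, transactions))

rule_support = support({"onion", "potato", "hamburger"}, transactions)
rule_confidence = confidence({"onion", "potato"}, {"hamburger"}, transactions)
print(rule_support, rule_confidence)
```

Algorithms such as Apriori avoid checking every possible itemset by first discarding itemsets whose support is too low.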
Example algorithms:
- Apriori algorithm
- Eclat algorithm
- FP-growth
Graphical models
Graphical models are a marriage between probability theory and graph theory. They provide a natural tool for dealing with two problems that occur throughout applied mathematics and engineering, uncertainty and complexity, and they play an important role in the analysis and design of machine learning algorithms. The fundamental idea of a graphical model is modularity: a complex system is built by combining simpler parts. Probability theory provides the glue that combines the parts, ensuring that the system as a whole is consistent and providing ways to interface models to data.
A graphical model, or probabilistic graphical model (PGM), is a probabilistic model in which a graph represents the conditional dependence structure between random variables.
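For a directed graphical model (a Bayesian network), the graph says that the joint distribution factorizes into one conditional distribution per node given its parents. The sketch below shows this for a small, invented three-variable network in plain Python; the variables, the graph, and every probability are hypothetical.

```python
from itertools import product

# Hypothetical network: Rain -> Sprinkler, and (Sprinkler, Rain) -> WetGrass.
p_rain = {True: 0.2, False: 0.8}
p_sprinkler_given_rain = {
    True: {True: 0.01, False: 0.99},
    False: {True: 0.4, False: 0.6},
}
# P(wet = True | sprinkler, rain), keyed by (sprinkler, rain).
p_wet_true_given = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}

def joint(rain, sprinkler, wet):
    """P(rain, sprinkler, wet) as a product of the per-node conditionals."""
    p_wet_true = p_wet_true_given[(sprinkler, rain)]
    p_wet = p_wet_true if wet else 1 - p_wet_true
    return p_rain[rain] * p_sprinkler_given_rain[rain][sprinkler] * p_wet

# The factorized joint is a proper distribution, and marginals follow by
# summing out the other variables.
total = sum(joint(r, s, w) for r, s, w in product([True, False], repeat=3))
prob_wet_grass = sum(joint(r, s, True) for r, s in product([True, False], repeat=2))
print(total, prob_wet_grass)
```

This is the modularity described above: three small tables, combined by multiplication, define a consistent distribution over all eight joint states.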
Example algorithms:
- Bayesian network
- Markov random field
- Chain graphs
- Ancestral graph