What are the basics of machine learning algorithms?

Source: Internet
Author: User

Machine learning is undoubtedly an important part of data analysis today, and people who work in IT use machine learning algorithms
in their day-to-day work to a greater or lesser degree.

There are many machine learning algorithms, but they can be grouped in two broad ways: by learning style, and by similarity of form or function.

Learning Style:

Depending on the type of data, there are different ways to model a problem. In machine learning and artificial intelligence, people usually first
consider how an algorithm learns. There are several main learning styles. Classifying algorithms by learning style is a good idea, because it
makes you think about the input data when modeling and selecting an algorithm, so that you can choose the most suitable one and obtain
the best results.

The main learning styles and their typical models are as follows:

Supervised learning: the input data is called training data, and each example has a known label or result, such as spam/non-spam or a
stock price at a given time. The model's parameters are determined through a training process in which the model is required to make predictions
and is corrected when those predictions do not match the known results.
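
The predict-then-correct loop described for supervised learning can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `train_linear`, the learning rate, the epoch count, and the toy data are all invented for the example and do not come from the original article.

```python
def train_linear(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w * x + b by correcting the model after each prediction."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            error = (w * x + b) - y   # prediction minus the known label
            w -= lr * error * x       # nudge the parameters to shrink the error
            b -= lr * error
    return w, b

# Hypothetical labeled training data generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
w, b = train_linear(xs, ys)
print(round(w, 3), round(b, 3))
```

On this noise-free data the learned `w` and `b` should end up very close to 2 and 1. With noisy data or a poorly chosen learning rate the loop may converge slowly or not at all.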

Unsupervised learning: the input data is not labeled and has no known result. A model is built by inferring the structures present in the input
data. Example problems include association rule learning and clustering. Example algorithms include the Apriori algorithm and the k-means algorithm.

Semi-supervised learning: the input data is a mixture of labeled and unlabeled examples. There is a desired prediction problem, but the model
must also learn the structures needed to organize the data. Example problems include classification and regression. Typical algorithms are
extensions of other flexible methods that make assumptions about how to model the unlabeled data.

Reinforcement learning: input data is provided to the model as a stimulus from the environment, and the model must respond. Feedback does not come
from a training process as in supervised learning, but takes the form of punishments or rewards from the environment. Typical problems are system and
robot control. Example algorithms include Q-learning and temporal difference learning.
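
As a rough sketch of how reward-driven feedback works, here is tabular Q-learning on a tiny corridor. Everything about the environment is an assumption made for illustration: five states in a row, a reward of 1 for reaching the rightmost state, and a purely random exploration policy. The hyperparameters `alpha` and `gamma` are likewise arbitrary choices.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (hypothetical environment)
ACTIONS = (-1, +1)    # action index 0 moves left, index 1 moves right

def step(state, action):
    """Move within the corridor; reward 1.0 only when the goal is reached."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(2)                      # explore uniformly at random
            nxt, reward = step(state, ACTIONS[a])
            # Move Q(s, a) toward the reward plus the discounted best future value.
            target = reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q

q = q_learning()
# Greedy policy for the non-terminal states: pick the action with the larger Q-value.
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The greedy policy extracted from the table should choose "move right" (index 1) in every non-terminal state, even though the agent was never told the correct action directly.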

Algorithmic similarity

Algorithms can also be grouped by similarity of function or form, for example tree-based algorithms, neural-network-based algorithms,
and so on. Of course, the scope of machine learning is very large, and some algorithms are difficult to assign to a single category. Within some
categories, the same algorithm can be used for different types of problems. Here we try to group the commonly used algorithms in the way that
is easiest to understand.

Regression analysis

Regression is a modeling method that first defines a measure of a model's prediction error and then uses that measure to iteratively refine the
relationship between variables. Regression methods are a workhorse of statistics and have been adopted into statistical machine learning. This can be
confusing, because "regression" can refer both to a class of problems and to a class of algorithms. In fact, regression is a process. Here are some examples:

Ordinary least squares
Logistic regression
Stepwise regression
Multivariate adaptive regression splines (MARS)
Locally estimated scatterplot smoothing (LOESS)
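
For the first method in the list, ordinary least squares with a single input variable has a closed-form solution, so no iteration is needed. The sketch below assumes one feature and invented sample data; `ols_fit` is a name chosen for the example.

```python
def ols_fit(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx                      # fails if all x values are identical
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical noisy observations of roughly y = 2x.
slope, intercept = ols_fit([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(round(slope, 2), round(intercept, 2))
```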

Instance-based methods

An instance-based learning model treats a decision problem in terms of the training instances that are considered important or that the model
requires. Such methods typically build a database of example data and compare new data against it using a similarity measure
to find the best match and make a prediction. For this reason, instance-based methods are also called "winner-take-all" methods and
memory-based learning. The focus is on the representation of the stored instances and the similarity measure used between instances.

k-nearest neighbors (KNN)
Learning vector quantization (LVQ)
Self-organizing map (SOM)
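
The "store examples, then compare by similarity" idea is easiest to see in k-nearest neighbors. The following is a small sketch that assumes Euclidean distance as the similarity measure and a simple majority vote; the training points and labels are made up. It uses `math.dist`, which requires Python 3.8 or later.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (point, label) pairs. Returns the majority label of the k nearest points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical stored instances forming two well-separated groups.
train = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
         ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B")]
print(knn_predict(train, (1.8, 1.7)))
print(knn_predict(train, (6.2, 6.4)))
```

There is no training step at all: the "model" is just the stored data, which is why the choice of distance function matters so much for this family.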

Regularization methods

This is an extension of another method (usually a regression method) that penalizes models according to their complexity and favors simpler
models, which also tend to generalize better. Some regularization methods are listed here because they are popular, powerful, and usually simple
modifications of other methods.

Ridge regression
Least absolute shrinkage and selection operator (LASSO)
Elastic net
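
To make the penalty concrete, here is a sketch of ridge regression in its simplest form: one feature and no intercept. In that special case the penalized least-squares solution is the ordinary one with the penalty strength added to the denominator. The function name `ridge_slope`, the parameter `lam`, and the data are assumptions for illustration.

```python
def ridge_slope(xs, ys, lam):
    """Minimize sum((y - w*x)^2) + lam * w^2 for a single weight w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]          # exactly y = 2x
for lam in (0.0, 1.0, 14.0):
    print(lam, ridge_slope(xs, ys, lam))
```

With `lam = 0` the result is the plain least-squares slope. Larger values of `lam` shrink the weight toward zero, which is the sense in which the method prefers a simpler model.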

Decision Tree Learning

Decision tree methods build a model of decisions based on the actual values of the attributes in the data. Decisions fork along the tree structure until
a prediction can be made for a given record. Decision trees are trained on data for classification and regression problems.

Classification and regression tree (CART)
Iterative Dichotomiser 3 (ID3)
C4.5 algorithm
Chi-squared automatic interaction detection (CHAID)
Decision stump
Random forest
Multivariate adaptive regression splines (MARS)
Gradient boosting machine (GBM)
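
The smallest possible decision tree is a decision stump: a single fork on one attribute. The sketch below searches for the threshold that makes the fewest mistakes on the training data. It assumes a single numeric feature and uses plain misclassification count as the split criterion, which is a simplification (CART and ID3 use impurity or information measures instead). The names `fit_stump` and `predict_stump` are hypothetical.

```python
def fit_stump(xs, labels):
    """Return (threshold, left_label, right_label) with the fewest training errors."""
    values = sorted(set(xs))
    if len(values) < 2:
        raise ValueError("need at least two distinct feature values to split")
    # Candidate thresholds lie midway between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    classes = sorted(set(labels))
    best = None
    for t in candidates:
        for left in classes:
            for right in classes:
                errors = sum((left if x <= t else right) != y
                             for x, y in zip(xs, labels))
                if best is None or errors < best[0]:
                    best = (errors, t, left, right)
    return best[1:]

def predict_stump(stump, x):
    threshold, left, right = stump
    return left if x <= threshold else right

# Hypothetical data: small values are "low", large values are "high".
xs = [1, 2, 3, 10, 11, 12]
labels = ["low", "low", "low", "high", "high", "high"]
stump = fit_stump(xs, labels)
print(stump, predict_stump(stump, 2.5), predict_stump(stump, 20))
```

A full tree applies the same search recursively to the records that fall on each side of the fork.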

Bayesian algorithm

Bayesian methods are those that explicitly apply Bayes' theorem to classification and regression problems.

Naive Bayes
Averaged one-dependence estimators (AODE)
Bayesian belief network (BBN)
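
As an illustration of applying Bayes' theorem to classification, here is a sketch of a naive Bayes text classifier. It assumes a bag-of-words representation, Laplace (add-one) smoothing, and that words outside the training vocabulary are ignored. The tiny spam/ham corpus and the function names are invented for the example.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (words, label). Returns the counts needed for prediction."""
    label_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for words, label in docs:
        word_counts[label].update(words)
    vocab = {w for words, _ in docs for w in words}
    return label_counts, word_counts, vocab

def predict_nb(model, words):
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)              # log prior
        for w in words:
            if w in vocab:                                 # ignore unseen words
                # Laplace-smoothed log likelihood of the word given the label.
                score += math.log((word_counts[label][w] + 1)
                                  / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

docs = [(["win", "money", "now"], "spam"),
        (["free", "money"], "spam"),
        (["meeting", "at", "noon"], "ham"),
        (["lunch", "meeting", "now"], "ham")]
model = train_nb(docs)
print(predict_nb(model, ["free", "money"]))
print(predict_nb(model, ["lunch", "meeting"]))
```

The "naive" part is the assumption that words are independent given the label, which is what lets the per-word log likelihoods simply be added.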

Kernel methods

The best-known kernel method is the popular support vector machine, which is actually a family of methods. Kernel methods are concerned with
mapping the input data into a higher-dimensional vector space in which some classification or regression problems are easier to solve.

Support Vector Machine (SVM)
Radial basis function (RBF)
Linear discriminant Analysis (LDA)
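
The mapping idea can be shown with a deliberately tiny example. Below, one-dimensional points that no single threshold can separate become separable after mapping each `x` to `(x, x^2)`. The matching kernel function computes the same inner product directly from the original inputs. The feature map, the data, and the names `phi` and `kernel` are assumptions chosen for illustration; they are not the RBF kernel named in the list.

```python
import math

def phi(x):
    """Explicit feature map from 1-D input to 2-D: x -> (x, x^2)."""
    return (x, x * x)

def kernel(a, b):
    """Inner product phi(a) . phi(b), computed without building phi."""
    return a * b + (a * b) ** 2

inner = [-1.0, -0.5, 0.5, 1.0]   # hypothetical class with small |x|
outer = [-3.0, -2.0, 2.0, 3.0]   # hypothetical class with large |x|

# In the mapped space, the second coordinate alone separates the classes.
print([phi(x)[1] for x in inner])
print([phi(x)[1] for x in outer])

# The kernel agrees with the explicit inner product in the mapped space.
a, b = 2.0, -3.0
explicit = sum(p * q for p, q in zip(phi(a), phi(b)))
print(math.isclose(kernel(a, b), explicit))
```

Algorithms such as the support vector machine only ever need these inner products, so they can work in the mapped space without constructing it.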

Clustering method

Like regression, clustering describes both a class of problems and a class of methods. Clustering methods are generally organized by modeling approach,
such as centroid-based or hierarchical. All of them use the inherent structure of the data to organize it into groups whose members have as much
in common as possible.

k-means
Expectation-maximization (EM)
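
A centroid-based method such as k-means alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. The sketch below assumes the caller supplies the initial centroids and runs a fixed number of iterations; real implementations usually pick starting points randomly and stop on convergence. The data is invented, and `math.dist` requires Python 3.8 or later.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm with caller-supplied initial centroids."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
print(centroids)
print(clusters)
```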

Association Rule Learning

Association rule learning is a class of algorithms for extracting rules that best explain the relationships between variables in the observed data.
These rules can reveal important and commercially useful associations in large multidimensional datasets, which an organization can then exploit.

Apriori algorithm
Eclat algorithm
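
The core of the Apriori algorithm is finding itemsets that occur together often enough, growing them one item at a time and discarding any candidate that has an infrequent subset. The sketch below covers only that frequent-itemset step, not rule generation. The shopping baskets, the support threshold, and the function name are assumptions for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(sub) in frequent
                          for sub in combinations(c, k - 1))
                   and support(c) >= min_support}
    return frequent

baskets = [{"bread", "milk"},
           {"bread", "diapers", "beer"},
           {"milk", "diapers", "beer"},
           {"bread", "milk", "diapers"},
           {"bread", "milk", "beer"}]
frequent = apriori(baskets, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

In this toy data each single item is frequent, but bread and milk form the only pair bought together in at least half of the baskets.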

Artificial neural network

Artificial neural networks are algorithms inspired by the structure and/or function of biological neural networks. They are a class of pattern-matching
methods commonly used for regression and classification problems, but this enormous subfield actually contains hundreds of algorithms and variations
for all kinds of problem types. Some of the classic popular methods include (deep learning is listed separately):

Perceptron
Backpropagation
Hopfield network
Self-organizing map (SOM)
Learning Vector Quantization (LVQ)
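
The perceptron, the first method in the list, is a single artificial neuron: it computes a weighted sum of its inputs and adjusts the weights whenever its output is wrong. The sketch below trains one on the logical AND function. The learning rate, the epoch count, and the choice of AND as the task are assumptions for the example.

```python
def train_perceptron(samples, lr=1.0, epochs=20):
    """samples: list of ((x1, x2), target) with targets 0 or 1."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            output = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = target - output          # 0 when the prediction is right
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b

def perceptron_predict(w, b, x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_gate)
print([perceptron_predict(w, b, x1, x2) for (x1, x2), _ in and_gate])
```

A single perceptron can only learn linearly separable functions such as AND. Functions like XOR need multiple layers, which is where backpropagation comes in.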

Deep learning

Deep learning methods are a modern update of artificial neural networks that exploit cheap and abundant computation. They build much larger and
more complex neural networks, and, as mentioned earlier, many of them address semi-supervised learning problems in which large datasets
contain very little labeled data.

Restricted Boltzmann machine (RBM)
Deep belief network (DBN)
Convolutional neural network (CNN)
Stacked autoencoder (SAE)

Dimensionality reduction method

Like clustering methods, dimensionality reduction methods seek and exploit the inherent structure of the data, but they do so in an unsupervised manner
in order to summarize or describe the data using less information. This is useful for visualizing high-dimensional data or simplifying data for subsequent supervised learning.

Principal component Analysis (PCA)
Partial least squares regression (PLS)
Sammon mapping
Multidimensional scaling (MDS)
Projection Pursuit
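
To show what "describing the data using less information" can mean, the sketch below finds the first principal component of a small dataset and replaces each two-dimensional point with a single score along that direction. It uses power iteration on the covariance matrix rather than a full eigendecomposition, which is a simplification. The data and the function name are assumptions, and the code would fail on data with zero variance.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return (direction, scores) for the direction of greatest variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] + [0.0] * (d - 1)              # arbitrary starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]            # repeated multiplication converges
                                             # to the dominant eigenvector
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, scores

# Hypothetical points lying close to the line y = x.
rows = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]
direction, scores = first_principal_component(rows)
print([round(x, 3) for x in direction])
print([round(s, 3) for s in scores])
```

Because the points lie almost on a line, the single score per point preserves nearly all of the variation in the original two coordinates.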

Ensemble methods

An ensemble is composed of several weaker models that are trained independently and whose predictions are combined in some way to produce
the overall prediction. Much effort goes into choosing which types of weak learners to combine and how to combine their results.
This is a very powerful class of techniques and is therefore very popular.

Boosting
Bootstrap aggregation (bagging)
AdaBoost
Stacked generalization (blending)
Gradient boosting machine (GBM)
Random Forest
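
The simplest way to combine predictions is a majority vote. The sketch below uses three hand-written rules in place of trained models, purely to show the combination step: each rule makes a mistake somewhere, but no two rules are wrong on the same input, so the vote is always right. The rules, the target concept ("is n at least 5?"), and the input range are all invented for the example.

```python
from collections import Counter

def majority_vote(models, x):
    """Return the prediction made by the largest number of models."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

def truth(n):
    return n >= 5

# Three hypothetical weak models, each wrong on a different input.
weak_models = [
    lambda n: n >= 4,        # wrong for n = 4
    lambda n: n >= 6,        # wrong for n = 5
    lambda n: 5 <= n < 9,    # wrong for n = 9
]

inputs = range(10)
for i, model in enumerate(weak_models):
    correct = sum(model(n) == truth(n) for n in inputs)
    print("model", i, "correct on", correct, "of", len(inputs))
ensemble_correct = sum(majority_vote(weak_models, n) == truth(n) for n in inputs)
print("ensemble correct on", ensemble_correct, "of", len(inputs))
```

Bagging and random forests obtain this kind of diversity by training each sub-model on a different random sample of the data, while boosting trains each new model to focus on the previous models' mistakes.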
