A summary of each algorithm and its application scenarios, in a nutshell

Source: Internet
Author: User

EM, the expectation-maximization algorithm, is a maximum likelihood estimation method for the parameters of probabilistic models with latent (hidden) variables. It is mainly used for data clustering in machine learning and computer vision.
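To make the EM idea concrete, here is a minimal sketch that fits a mixture of two one-dimensional Gaussians, where the latent variable is the component each point came from. The function names, the toy data, the initialization at the data extremes, and the variance floor are all assumptions made for this illustration, not part of the original text.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_two_gaussians(data, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with EM (illustrative sketch)."""
    # Crude initialization (an assumption of this sketch): means at the extremes.
    means = [min(data), max(data)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for x in data:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate the parameters from the soft assignments.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # floor to avoid a collapsed component
    return weights, means, variances

# Toy data with two obvious clusters, around 1 and around 5.
data = [0.8, 1.0, 1.2, 0.9, 1.1, 4.9, 5.0, 5.2, 5.1, 4.8]
weights, means, variances = em_two_gaussians(data)
print(weights, means, variances)
```

On well-separated data like this the soft assignments quickly become almost hard, so the result resembles a clustering of the points into two groups.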

LR, logistic regression, builds on linear regression: it fits a linear function to the samples and then passes the result through the logistic (sigmoid) function to map it into the interval (0, 1). Despite its name it is generally used for classification, mainly in CTR (click-through rate) estimation, recommender systems, etc.;
SVM, support vector machine, classifies samples by finding a separating hyperplane in the sample space; it can also be used for regression, and is mainly applied in text classification, image recognition and other fields;
NN, neural network, fits the data with a non-linear model, and is mainly used in image processing;
NB, naive Bayes, estimates the joint probability distribution of the samples and then uses Bayes' formula to compute the posterior probability of each class for a sample, which gives the classification; it is mainly used for text classification;
DT, decision tree, builds a tree that splits the samples at each node according to some rule (usually information entropy); in essence it partitions the sample space into blocks. It is mainly used for classification and can also do regression, but more often it serves as a weak classifier inside ensemble models;
RF, random forest, is a forest made up of many decision trees. Each tree's training samples are sampled from the full sample set, and the features considered at each node split are sampled as well, so each tree specializes in its own part of the problem and the forest as a whole generalizes better;
GBDT, gradient-boosted decision trees, is also made up of many trees, but unlike RF each tree is trained on the residual left by the previous trees, which embodies the idea of the gradient; the final result is obtained by combining the outputs of all the trees. It is mainly used in recommendation, relevance, etc.;
KNN, k-nearest neighbors, is probably the simplest ML method: for a sample with an unknown label, look at its k nearest samples (under some distance measure, such as Mahalanobis distance or Euclidean distance) and assign it the label that occurs most often among them.
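The nearest-neighbor rule in the last item is simple enough to sketch in a few lines. The version below uses Euclidean distance and a plain majority vote; the function names and the toy points are made up for this illustration, and ties are broken by whichever label was encountered first.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_points, train_labels, query, k=3):
    """Return the most common label among the k training points nearest to query."""
    # Sort the labelled samples by distance to the query and keep the k closest.
    nearest = sorted(
        zip(train_points, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data: class "a" clusters near (1, 1), class "b" near (5, 5).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, (1.1, 1.0)))  # prints "a"
```

Sorting the whole training set costs O(n log n) per query, which is fine for a sketch; larger data sets usually call for a spatial index instead.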

The naive Bayes method is a classification method based on Bayes' theorem and the assumption that the features are conditionally independent given the class. For a given training data set, it first learns the joint probability distribution of the input and output under this conditional independence assumption. Then, based on this model, for a given input x it uses Bayes' theorem to find the output y with the largest posterior probability. In other words, for an item to be classified, compute the probability of each category given the item's features, and assign the item to the category whose probability is largest.
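As a rough illustration of the procedure described above, the sketch below trains a naive Bayes classifier on categorical features by counting, then picks the class with the largest posterior. The function names and the toy weather data are hypothetical, and the add-one (Laplace) smoothing is an assumption of this sketch that the text does not mention; it keeps unseen feature values from producing a zero probability.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(samples, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (label, feature index) -> value counts
    feature_values = defaultdict(set)      # feature index -> distinct values seen
    for x, y in zip(samples, labels):
        for i, value in enumerate(x):
            feature_counts[(y, i)][value] += 1
            feature_values[i].add(value)
    return class_counts, feature_counts, feature_values

def predict_naive_bayes(model, x):
    """Return the label y that maximizes log P(y) + sum_i log P(x_i | y)."""
    class_counts, feature_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(x):
            # Add-one smoothing (an assumption of this sketch).
            numerator = feature_counts[(label, i)][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: (weather, temperature) -> whether to play outside.
samples = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
           ("rainy", "cool"), ("sunny", "hot"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes", "no", "yes"]
model = train_naive_bayes(samples, labels)
print(predict_naive_bayes(model, ("sunny", "hot")))  # prints "no"
```

Working in log space turns the product of per-feature probabilities into a sum, which avoids numerical underflow when there are many features.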
