Discriminative model and generative model


Original website: http://blog.sciencenet.cn/home.php?mod=space&uid=248173&do=blog&id=227964

[Abstract]
- Generative model: infinite samples → probability density model → generative model → prediction
- Discriminative model: finite samples → discriminant function → prediction model → prediction

[Overview]
Simply put, let o be the observed value and q be the model (class) variable.
If you model P(o | q), you have a generative model. The basic idea is to first build a probability density model of the samples and then use that model for inference and prediction. This requires the known samples to be infinite in number, or at least as numerous as possible, and the approach is generally grounded in statistics and Bayes' theory.
If you model the conditional (posterior) probability P(q | o), you have a discriminative model. The basic idea is to build a discriminant function from a finite sample, directly studying the prediction model without considering how the samples were generated. Its representative theory is statistical learning theory.
Currently there is a great deal of crossover between the two approaches. A minimal sketch contrasting them follows.
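
A minimal sketch (not from the original post; data and parameters are invented for illustration) of both routes on 1-D, two-class data: the generative route fits P(o | q) per class and inverts it with Bayes' rule, while the discriminative route fits P(q | o) directly with logistic regression.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes q in {0, 1}; observations o drawn from class-conditional Gaussians.
o0 = rng.normal(loc=-1.0, scale=1.0, size=500)   # samples with q = 0
o1 = rng.normal(loc=+1.0, scale=1.0, size=500)   # samples with q = 1

# --- Generative route: model P(o | q), then invert with Bayes' rule ---
mu0, s0 = o0.mean(), o0.std()
mu1, s1 = o1.mean(), o1.std()
prior0 = prior1 = 0.5                            # P(q)

def gauss_pdf(x, mu, s):
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

def posterior_generative(x):
    # P(q=1 | o) = P(o | q=1) P(q=1) / sum_q P(o | q) P(q)
    p0 = gauss_pdf(x, mu0, s0) * prior0
    p1 = gauss_pdf(x, mu1, s1) * prior1
    return p1 / (p0 + p1)

# --- Discriminative route: model P(q | o) directly (logistic regression) ---
X = np.concatenate([o0, o1])
y = np.concatenate([np.zeros(500), np.ones(500)])
w, b = 0.0, 0.0
for _ in range(2000):                            # gradient ascent on the log-likelihood
    p = 1.0 / (1.0 + np.exp(-(w * X + b)))
    w += 0.1 * np.mean((y - p) * X)
    b += 0.1 * np.mean(y - p)

# Both routes give similar posteriors on this well-specified toy problem.
print(posterior_generative(0.5), 1.0 / (1.0 + np.exp(-(w * 0.5 + b))))
```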

[Discriminative Model] -- inter-class probabilistic description

Also called a conditional model or conditional probability model. It estimates the conditional probability distribution p(class | context).
Positive and negative examples together with their classification labels are used to fit the model, and the objective function corresponds directly to classification accuracy.

-Main features:
Finds the optimal classification surface between different categories, reflecting the differences between different classes of data.
-Advantages:
The classification boundary is more flexible than with a purely probabilistic or generative model.
Can clearly distinguish between multiple classes, or between one class and all the others.
Gives good results with clustering, viewpoint changes, partial occlusion, and scale variations.
Suitable for discriminating among many categories.
Discriminative models are simpler than generative models and easier to learn.
-Disadvantages:
Does not reflect the characteristics of the training data itself, so its capability is limited: it can tell you whether a sample is class 1 or class 2, but it has no way to describe the whole scene.
Lacks the elegance of the generative approach: priors, structure, and uncertainty are not represented.
Must rely instead on alternative notions such as penalty functions, regularization, and kernel functions.
Black-box operation: the relationships between the variables are unclear and invisible.

-Common examples include (see the sketch after this list):
Logistic regression
SVMs
Traditional neural networks
Nearest neighbor
Conditional random fields (CRFs): a recently popular model that originated in NLP and is spreading to ASR and computer vision.
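
A quick sketch (assuming scikit-learn is available; the toy data are invented) fitting two of the discriminative models listed above to the same data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Toy two-class problem with two informative features.
X, y = make_classification(n_samples=200, n_features=2, n_redundant=0,
                           random_state=0)

# Both models estimate P(class | features) or a discriminant function directly.
for model in (LogisticRegression(), SVC(kernel="rbf")):
    model.fit(X, y)
    print(type(model).__name__, model.score(X, y))  # training accuracy
```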

-Main applications:
Image and document classification
Biosequence analysis
Time series prediction

[Generative Model] -- intra-class probabilistic description

Also known as a production model. It estimates the joint probability distribution P(class, context) = P(class | context) * P(context).

It models how observed values are randomly generated, especially when some hidden parameters are given. In machine learning it is used either to model data directly (modeling observations with a probability density function) or as an intermediate step toward a conditional probability density function; Bayes' rule can then be used to obtain the conditional distribution from the generative model.

If the observed data really were generated by the assumed generative model, then fitting its parameters maximizes the data likelihood. Data rarely come exactly from such a model, however, so the more accurate approach is often to model the conditional density directly, i.e., to use classification or regression analysis. A worked toy example of the Bayes'-rule step follows.
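
A worked toy example (all numbers invented) of recovering the conditional distribution from a generative model via Bayes' rule, using the factorization P(class, context) = P(class | context) * P(context) from above in the reverse direction:

```python
# Generative model: prior P(class) and likelihood P(word | class).
p_class = {"spam": 0.3, "ham": 0.7}
p_word_given_class = {"spam": 0.8, "ham": 0.1}

# Joint distribution: P(class, word) = P(word | class) * P(class).
joint = {c: p_word_given_class[c] * p_class[c] for c in p_class}

# Evidence: P(word) = sum over classes of the joint.
p_word = sum(joint.values())

# Posterior via Bayes' rule: P(class | word) = P(class, word) / P(word).
posterior = {c: joint[c] / p_word for c in joint}
print(posterior)   # {'spam': 0.774..., 'ham': 0.225...}
```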

This differs from a descriptive model, in which all variables are measured directly.

-Main features:
Generally models the probability distribution of the data from a statistical standpoint, reflecting the similarity of data within the same class.
Focuses only on the class itself, without caring where the decision boundary lies.
-Advantages:
In fact, it carries richer information than a discriminative model.
It is more flexible than a discriminative model for single-class problems.
Models can be obtained through incremental learning.
Can be used when data are incomplete (missing data).
Modular construction of composed solutions to complex problems
Prior knowledge can be easily taken into account
Robust to partial occlusion and viewpoint changes
Can tolerate significant intra-class variation of Object Appearance
-Disadvantages:
Tends to produce a significant number of false positives, particularly for object classes that share high visual similarity, such as horses and cows.
Learning and computation are complex.

-Common examples include (see the sketch after this list):
Gaussians, Naive Bayes, mixtures of Multinomials
Mixtures of Gaussians, mixtures of experts, HMMs
Sigmoidal belief networks, Bayesian Networks
Markov Random Fields

The generative models listed above can also be trained discriminatively, e.g., GMMs or HMMs trained with EBW (Extended Baum-Welch) or with the large-margin method recently proposed by Fei Sha.

-Main applications:
NLP:
Traditional rule-based or Boolean logic systems (Dialog and Lexis-Nexis) are giving way to statistical approaches (Markov models and stochastic context-free grammars).
Medical diagnosis:
The QMR knowledge base, initially a heuristic expert system for reasoning about diseases and symptoms, has been augmented with a decision-theoretic formulation.
Genomics and bioinformatics:
Sequences represented as generative HMMs

[Relationship between the two]
A discriminative model can be derived from a generative model, but a generative model cannot be derived from a discriminative one.
Can performance of SVMs be combined elegantly with flexible Bayesian statistics?
Maximum entropy discrimination marries the two methods: it solves for a distribution over parameters (a distribution over solutions).

[Reference websites]
http://prfans.com/forum/viewthread.php?tid=80
http://hi.baidu.com/cat_ng/blog/item/5e59c3cea730270593457e1d.html
http://en.wikipedia.org/wiki/Generative_model
http://blog.csdn.net/yangleecool/archive/2009/04/05/4051029.aspx

============================
Comparison of three models: HMM, MRF, and CRF

http://blog.sina.com.cn/s/blog_4cdaefce010082rm.html

HMM (Hidden Markov Model):
The state sequence cannot be observed directly (it is hidden);
each observation is regarded as a random function of the state sequence;
the state evolves over time according to a state transition probability matrix.
HMMs differ from MRFs in that an MRF contains only the label-field variables, not the observation-field variables. A minimal forward-algorithm sketch for an HMM follows.
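
A minimal forward-algorithm sketch (toy numbers invented for illustration) computing the likelihood of an observation sequence under an HMM with hidden states, a transition matrix, and an emission matrix:

```python
import numpy as np

pi = np.array([0.6, 0.4])            # initial state distribution P(s_0)
A  = np.array([[0.7, 0.3],           # transitions: A[i, j] = P(s_t = j | s_{t-1} = i)
               [0.4, 0.6]])
B  = np.array([[0.9, 0.1],           # emissions: B[i, k] = P(o_t = k | s_t = i)
               [0.2, 0.8]])
obs = [0, 1, 1]                      # an observed symbol sequence

alpha = pi * B[:, obs[0]]            # forward variable at t = 0
for o in obs[1:]:
    alpha = (alpha @ A) * B[:, o]    # recursion: sum out the previous hidden state
print(alpha.sum())                   # total likelihood P(obs) under the model
```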

MRF (Markov Random Field):
Models an image as a grid of random variables,
where each variable depends explicitly only on its nearest neighbors, rather than on all other variables (the Markov property).

CRF (Conditional Random Field), which can be viewed as a Markov random field globally conditioned on the observations:
A conditional probability model for labeling and segmenting sequential data.
Formally, a CRF can be regarded as an undirected graphical model that evaluates the conditional probability of a label sequence given the input sequence. A brute-force toy sketch of this conditional probability follows.
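
A brute-force toy sketch of the CRF idea (all tokens, features, and weights are invented): score every possible label sequence for a given input, then normalize the exponentiated scores over all sequences to get P(labels | input). Real CRFs compute the normalizer with dynamic programming rather than enumeration.

```python
import itertools, math

tokens = ["cold", "beer"]
labels = ["O", "B"]                      # toy tag set

def score(seq, toks):
    # Hand-set feature weights: emission-like and transition-like scores.
    emit = {("cold", "O"): 1.0, ("cold", "B"): 0.2,
            ("beer", "O"): 0.3, ("beer", "B"): 1.5}
    trans = {("O", "O"): 0.5, ("O", "B"): 0.8,
             ("B", "O"): 0.1, ("B", "B"): 0.2}
    s = sum(emit[(t, y)] for t, y in zip(toks, seq))
    s += sum(trans[(a, b)] for a, b in zip(seq, seq[1:]))
    return s

# Partition function: normalize over ALL label sequences for this input.
all_seqs = list(itertools.product(labels, repeat=len(tokens)))
Z = sum(math.exp(score(s, tokens)) for s in all_seqs)
for s in all_seqs:
    print(s, math.exp(score(s, tokens)) / Z)   # P(label sequence | tokens)
```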

Applications in vision problems:
HMM: image denoising, image texture segmentation, blurred image restoration, texture image retrieval, automatic target recognition, etc.
MRF: image restoration, image segmentation, edge detection, texture analysis, target matching and recognition
CRF: target detection, recognition, and segmentation in image sequences

P.S.
The label field is a hidden random field describing the local correlation properties of pixels; the model used should be highly flexible, guided by one's understanding of the image's structure and features.
Prior models for the spatial label field mainly include non-causal Markov models and causal Markov models.
